So this is what I am running:
"./bin/run-example SparkKMeans ~/Documents/2dim2.txt 2 0.001"

And this is the input file:"
┌───[spark2013@SparkOne]──────[~/spark-1.0.0].$
└───#!cat ~/Documents/2dim2.txt
2 1
1 2
3 2
2 3
4 1
5 1
6 1
4 2
6 2
4 3
5 3
6 3
"

This is the final output from spark:
"14/07/10 20:05:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/10 20:05:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 0 ms
14/07/10 20:05:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
14/07/10 20:05:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/10 20:05:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 0 ms
14/07/10 20:05:12 INFO Executor: Serialized size of result for 14 is 1433
14/07/10 20:05:12 INFO Executor: Sending result for 14 directly to driver
14/07/10 20:05:12 INFO Executor: Finished task ID 14
14/07/10 20:05:12 INFO DAGScheduler: Completed ResultTask(6, 0)
14/07/10 20:05:12 INFO TaskSetManager: Finished TID 14 in 5 ms on localhost (progress: 1/2)
14/07/10 20:05:12 INFO Executor: Serialized size of result for 15 is 1433
14/07/10 20:05:12 INFO Executor: Sending result for 15 directly to driver
14/07/10 20:05:12 INFO Executor: Finished task ID 15
14/07/10 20:05:12 INFO DAGScheduler: Completed ResultTask(6, 1)
14/07/10 20:05:12 INFO TaskSetManager: Finished TID 15 in 7 ms on localhost (progress: 2/2)
14/07/10 20:05:12 INFO DAGScheduler: Stage 6 (collectAsMap at SparkKMeans.scala:75) finished in 0.008 s
14/07/10 20:05:12 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool
14/07/10 20:05:12 INFO SparkContext: Job finished: collectAsMap at SparkKMeans.scala:75, took 0.02472681 s
Finished iteration (delta = 0.0)
Final centers:
DenseVector(2.8571428571428568, 2.0)
DenseVector(5.6000000000000005, 2.0)
"

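For comparison, here is a minimal plain-Python sketch of one Lloyd (k-means) iteration on the same 12 points. It is not the SparkKMeans code; the helper names (sq_dist, lloyd_step, cost, is_fixed_point) are made up for illustration. It checks whether the centers I expected, (2,2) and (5,2), and the centers Spark reported are each left unchanged by a further iteration, and it prints the within-cluster sum of squared distances for both.

```python
# Plain-Python sketch of Lloyd's k-means iteration; NOT the SparkKMeans source.
# All helper names below are hypothetical and exist only for this check.
points = [(2, 1), (1, 2), (3, 2), (2, 3), (4, 1), (5, 1),
          (6, 1), (4, 2), (6, 2), (4, 3), (5, 3), (6, 3)]

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def lloyd_step(centers):
    """Assign each point to its nearest center, then return the cluster means.

    Assumes no cluster ends up empty, which holds for the centers used below.
    """
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[nearest].append(p)
    return [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            for c in clusters]

def cost(centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def is_fixed_point(centers, tol=1e-9):
    """True if one more Lloyd iteration leaves the centers unchanged."""
    return all(sq_dist(a, b) < tol for a, b in zip(centers, lloyd_step(centers)))

expected = [(2.0, 2.0), (5.0, 2.0)]   # what I expected
reported = [(20 / 7, 2.0), (5.6, 2.0)]  # what Spark printed (2.857..., 5.6)

for name, centers in (("expected", expected), ("reported", reported)):
    print(name, is_fixed_point(centers), round(cost(centers), 3))
# expected True 16.0
# reported True 18.057
```

Both sets of centers come out as fixed points, so the iteration can stop at either one. The reported centers put the three x=4 points in the left cluster and have the higher cost (about 18.06 versus 16), which suggests a local optimum that depends on the starting centers rather than a wrong computation. How SparkKMeans picks its starting centers is not checked here.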



On Thursday, July 10, 2014 12:02 PM, Bertrand Dechoux <decho...@gmail.com> 
wrote:
 


A picture is worth a thousand... Well, a picture of this dataset, showing what 
you are expecting and what you get, would help answer your initial question.


Bertrand


On Thu, Jul 10, 2014 at 10:44 AM, Wanda Hawk <wanda_haw...@yahoo.com> wrote:

>Can someone please run the standard kMeans code on this input with 2 centers?
>2 1
>1 2
>3 2
>2 3
>4 1
>5 1
>6 1
>4 2
>6 2
>4 3
>5 3
>6 3
>
>
>The obvious result should be (2,2) and (5,2) ... (you can draw them if you 
>don't believe me ...)
>
>
>Thanks, 
>Wanda
