Hi Jeff, first of all, thank you for your response.
Unfortunately, I don't think that is the cause. I checked, and there is only one file, part-m-00000, under the clusteredPoints directory:

$ hadoop fs -ls /bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints
Found 1 items
-rw-r----- 3 bmz dev 24608 2012-08-27 10:27 /bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints/part-m-00000

So, what else could it be?

By the way, since kmeans belongs to supervised learning, is it possible that it takes out some data to construct a training dataset? It's just a guess, and it seems unreasonable to do that.

Thanks

On Tue, Aug 28, 2012 at 12:39 AM, Jeff Eastman <[email protected]> wrote:

> Offhand, I wonder why you are specifying only a single part-m-00000 file
> in your clusterdump step? If there are more than one part file (a usual
> case) then you might be missing some of the clustered points. If so, then
> using the directory instead might help:
>
> --pointsDir
> /group/tbdev/zhimo.bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints
> \
>
> On 8/27/12 2:49 AM, Phoenix Bai wrote:
>
>> --pointsDir
>> /group/tbdev/zhimo.bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints/part-m-00000
>> \
