It seems there is a "/" missing between clusterOutput and
clusteredPoints in the path.
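
To illustrate, here is a minimal sketch in plain Java strings (no Hadoop or Mahout on the classpath). The class name, the join helper, and the local CLUSTERED_POINTS_DIR constant are hypothetical; the constant's value is assumed from the directory listing in your message, so please check it against Cluster.CLUSTERED_POINTS_DIR in your Mahout version.

```java
public class ClusteredPointsPath {
    // Assumption: stands in for Cluster.CLUSTERED_POINTS_DIR, which appears
    // to carry no leading "/" judging by the error message.
    static final String CLUSTERED_POINTS_DIR = "clusteredPoints";

    // Hypothetical helper: joins parent and child with exactly one "/".
    static String join(String parent, String child) {
        if (parent.endsWith("/")) {
            return parent + child;
        }
        return parent + "/" + child;
    }

    public static void main(String[] args) {
        String clusterOutput = "newsClusters/clusters";

        // Plain concatenation, as in the original example code:
        // the two directory names run together.
        String broken = clusterOutput + CLUSTERED_POINTS_DIR + "/part-m-00000";

        // Joining with an explicit separator gives the path that exists on disk.
        String fixed = join(join(clusterOutput, CLUSTERED_POINTS_DIR), "part-m-00000");

        System.out.println(broken);
        System.out.println(fixed);
    }
}
```

In the real program, Hadoop's two-argument constructor new Path(parent, child) should give the same result, since it inserts the separator itself.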

Cheers,

Frank

Sent from a Hungarian keyboard at Sziget festival

On Tue, Aug 9, 2011 at 7:07 PM, eric skinner <[email protected]> wrote:
> Hello,
>
> I am practicing with NewsKMeansClustering.java, an example given in
> chapter 9 of Mahout in Action. I ran the program against a directory of
> sequence files and got the following error:
>
> Exception in thread "main" java.io.FileNotFoundException: File
> newsClusters/clustersclusteredPoints/part-m-00000 does not exist.
>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
>   at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
>   at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>   at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>   at mia.clustering.ch09.NewsKMeansClustering.main(NewsKMeansClustering.java:76)
>
> For reference, the directory structure generated by running the program
> is as follows:
>
> ~/workspaceMahout1/recommender/newsClusters% ls
>  canopy-centroids clusters df-count dictionary.file-0 frequency.file-0
> tfidf-vectors tf-vectors tokenized-documents wordcount
>  ~/workspaceMahout1/recommender/newsClusters/clusters/clusteredPoints% ls
> part-m-00000
>
> Afterwards, I changed the code from the original
>
> new Path(clusterOutput+Cluster.CLUSTERED_POINTS_DIR +"/part-m-00000"), conf);
>
> to
>
> *new Path(clusterOutput+"/clusteredPoints"+"/part-m-00000"), conf);*
>
>
> With this change the program runs without the error above. Is this a bug
> in the original code, or are there other hidden issues?
>
