[ https://issues.apache.org/jira/browse/MAHOUT-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084304#comment-14084304 ]
ASF GitHub Bot commented on MAHOUT-1593:
----------------------------------------

GitHub user roengram opened a pull request:

    https://github.com/apache/mahout/pull/37

    use exact path to clustering results instead of wild-card

    Details are available at https://issues.apache.org/jira/browse/MAHOUT-1593

    In brief, the kmeans clustering example script does not run correctly
    with Hadoop version 2.4.0.2.1.1.0-385. The script passes a wildcard path
    for the clustering result directory, and that path is not resolved to the
    actual directory. I replaced the wildcard path with a simple combination
    of commands that returns the exact path.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/roengram/mahout MAHOUT-1593

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/37.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #37

----
commit 50491a8e43d45e91247fab63b8504a81c41eeabb
Author: roengram <jaehoon13...@samsung.com>
Date:   2014-08-04T04:59:06Z

    use exact path to clustering results instead of wild-card

----

> cluster-reuters.sh does not work complaining java.lang.IllegalStateException
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-1593
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1593
>             Project: Mahout
>          Issue Type: Bug
>          Components: Examples
>    Affects Versions: 0.9
>         Environment: Hadoop version: 2.4.0.2.1.1.0-385
>                      Git hash: 2b65475c3ab682ebd47cffdc6b502698799cd2c8 (trunk)
>            Reporter: jaehoon ko
>            Priority: Minor
>              Labels: patch
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1593.patch
>
>
> When I choose "kmeans clustering" in cluster-reuters.sh, clusterdump
> fails with java.lang.IllegalStateException as follows:
> {code:borderStyle=solid}
> Exception in thread "main" java.lang.IllegalStateException: /tmp/mahout-work-user/reuters-kmeans/clusters-*-final
>       at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterable.iterator(SequenceFileDirValueIterable.java:78)
>       at org.apache.mahout.clustering.evaluation.ClusterEvaluator.loadClusters(ClusterEvaluator.java:93)
>       at org.apache.mahout.clustering.evaluation.ClusterEvaluator.<init>(ClusterEvaluator.java:81)
>       at org.apache.mahout.utils.clustering.ClusterDumper.printClusters(ClusterDumper.java:208)
>       at org.apache.mahout.utils.clustering.ClusterDumper.run(ClusterDumper.java:157)
>       at org.apache.mahout.utils.clustering.ClusterDumper.main(ClusterDumper.java:101)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>       at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
>       at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
>       at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.FileNotFoundException: File /tmp/mahout-work-user/reuters-kmeans/clusters-*-final does not exist.
>       at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
>       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
>       at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1483)
>       at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
>       at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:70)
>       at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterable.iterator(SequenceFileDirValueIterable.java:76)
>       ... 18 more
> {code}
> Other clustering options run well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
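
For readers following the thread: the pull request text says the wildcard path was replaced by "a simple combination of commands that returns the exact path", but the commands themselves are not quoted in this message. The sketch below is one way such a resolution step might look, not the actual patch. The helper name `resolve_exact_path` and the `LIST_CMD` variable are hypothetical, and the sketch assumes that `hadoop fs -ls -d` expands the pattern itself and prints the path as the last field of each line.

```shell
# Hypothetical helper, not the actual patch: resolve a wildcard pattern to the
# single path it matches, so that an exact path can be passed downstream.

# Listing command (assumption): it must expand the pattern itself and print
# one entry per line with the path as the last whitespace-separated field.
LIST_CMD=${LIST_CMD:-"hadoop fs -ls -d"}

resolve_exact_path() {
  local pattern matches count
  pattern=$1
  # Quote the pattern so the local shell does not expand it; the listing
  # command is expected to expand it against its own file system.
  # Limitation: paths containing whitespace are not handled.
  matches=$($LIST_CMD "$pattern" 2>/dev/null \
    | awk '!/^Found / && NF { print $NF }') || true
  # Count the non-empty result lines.
  count=$(printf '%s\n' "$matches" | awk 'NF { n++ } END { print n + 0 }')
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one match for $pattern, got $count" >&2
    return 1
  fi
  printf '%s\n' "$matches"
}
```

A script could then call something like `FINAL_DIR=$(resolve_exact_path '/tmp/mahout-work-user/reuters-kmeans/clusters-*-final')` and hand `$FINAL_DIR` to clusterdump in place of the wildcard. Failing when zero or several directories match is a design choice of this sketch; it keeps an unresolved pattern from reaching a tool that treats it as a literal path.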