[ https://issues.apache.org/jira/browse/MAHOUT-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084304#comment-14084304 ]
ASF GitHub Bot commented on MAHOUT-1593:
----------------------------------------
GitHub user roengram opened a pull request:
https://github.com/apache/mahout/pull/37
use exact path to clustering results instead of wild-card
Details are available at https://issues.apache.org/jira/browse/MAHOUT-1593
Briefly, the k-means clustering example script (cluster-reuters.sh) fails on
Hadoop version 2.4.0.2.1.1.0-385. The script passes a wildcard path for the
clustering result directory, and that path does not resolve to the actual
directory. I replaced the wildcard path with a short combination of commands
that returns the exact path.
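As an illustration of the idea only (this is not the patch itself), the sketch below resolves the final clusters directory to a concrete path before handing it to a tool, so the tool never receives a literal "*". The function name resolve_final_clusters_dir is hypothetical, and the sketch assumes the results sit on the local filesystem; on HDFS the directory listing would have to come from the Hadoop filesystem shell instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the patch: find the single
# "clusters-N-final" directory under a k-means output directory.
# Assumes a local filesystem path.
resolve_final_clusters_dir() {
  local base_dir="$1"
  local matches=()
  local d
  # The shell expands the glob here. If nothing matches, the pattern stays
  # literal and the -d test below rejects it.
  for d in "$base_dir"/clusters-*-final; do
    [ -d "$d" ] && matches+=("$d")
  done
  if [ "${#matches[@]}" -ne 1 ]; then
    echo "expected exactly one clusters-*-final directory under $base_dir, found ${#matches[@]}" >&2
    return 1
  fi
  # Print the exact path for the caller to capture.
  printf '%s\n' "${matches[0]}"
}
```

A caller would capture the result with something like final_dir=$(resolve_final_clusters_dir "$output_dir") and pass "$final_dir" as the input path. Failing when zero or several directories match avoids silently dumping the wrong clustering iteration.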
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/roengram/mahout MAHOUT-1593
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/37.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #37
----
commit 50491a8e43d45e91247fab63b8504a81c41eeabb
Author: roengram <[email protected]>
Date: 2014-08-04T04:59:06Z
use exact path to clustering results instead of wild-card
----
> cluster-reuters.sh does not work complaining java.lang.IllegalStateException
> ----------------------------------------------------------------------------
>
> Key: MAHOUT-1593
> URL: https://issues.apache.org/jira/browse/MAHOUT-1593
> Project: Mahout
> Issue Type: Bug
> Components: Examples
> Affects Versions: 0.9
> Environment: Hadoop version: 2.4.0.2.1.1.0-385
> Git hash: 2b65475c3ab682ebd47cffdc6b502698799cd2c8 (trunk)
> Reporter: jaehoon ko
> Priority: Minor
> Labels: patch
> Fix For: 1.0
>
> Attachments: MAHOUT-1593.patch
>
>
> When I choose "kmeans clustering" in cluster-reuters.sh, clusterdump
> fails with java.lang.IllegalStateException as follows:
> {code:borderStyle=solid}
> Exception in thread "main" java.lang.IllegalStateException: /tmp/mahout-work-user/reuters-kmeans/clusters-*-final
> at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterable.iterator(SequenceFileDirValueIterable.java:78)
> at org.apache.mahout.clustering.evaluation.ClusterEvaluator.loadClusters(ClusterEvaluator.java:93)
> at org.apache.mahout.clustering.evaluation.ClusterEvaluator.<init>(ClusterEvaluator.java:81)
> at org.apache.mahout.utils.clustering.ClusterDumper.printClusters(ClusterDumper.java:208)
> at org.apache.mahout.utils.clustering.ClusterDumper.run(ClusterDumper.java:157)
> at org.apache.mahout.utils.clustering.ClusterDumper.main(ClusterDumper.java:101)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.FileNotFoundException: File /tmp/mahout-work-user/reuters-kmeans/clusters-*-final does not exist.
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
> at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
> at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1483)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
> at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:70)
> at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterable.iterator(SequenceFileDirValueIterable.java:76)
> ... 18 more
> {code}
> The other clustering options run correctly.