alan krumholz created MAHOUT-1327:
-------------------------------------

             Summary: 
org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles is 
using a new instance of the Configuration object to read the file from the Path 
instead of using the Configuration object passed to the method
                 Key: MAHOUT-1327
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1327
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 0.8, 0.7
            Reporter: alan krumholz
            Priority: Critical


When you use KMeansDriver.run with a Configuration object pointing to HDFS:

        Configuration conf = new Configuration();
        conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\core-site.xml"));
        conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\hdfs-site.xml"));

the driver at some point calls 
org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles 
and I get the exception below (there is no problem if you run it with a conf 
object pointing to the local file system):


java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:215)


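I think this is happening because that method is using a new instance of the 
Configuration object to read the file from the Path instead of using the 
Configuration object passed to the method. If the new instance falls back to 
the default local file system, no cluster files would be found there and the 
list read at line 215 would be empty, which matches the exception.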
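
To illustrate the suspected mechanism, here is a minimal sketch in plain Java. 
It does not use Hadoop or Mahout: Conf, FILE_SYSTEMS, listClusters and the two 
read methods are hypothetical stand-ins, and the HDFS address and file name are 
made up. Only the property name fs.default.name mirrors the Hadoop 1.x 
setting. It models the behaviour I am assuming, not the actual Mahout source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfSketch {
    // Hypothetical stand-in for a Hadoop-style Configuration (not the real class).
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        // Assumption: a fresh instance points at the local file system.
        Conf() { props.put("fs.default.name", "file:///"); }
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    // Hypothetical contents of two file systems: only HDFS holds cluster files.
    static final Map<String, List<String>> FILE_SYSTEMS = new HashMap<>();
    static {
        FILE_SYSTEMS.put("file:///", new ArrayList<>());
        List<String> hdfs = new ArrayList<>();
        hdfs.add("clusters-0/part-00000");
        FILE_SYSTEMS.put("hdfs://namenode:8020", hdfs);
    }

    // Lists cluster files on whichever file system the given conf selects.
    static List<String> listClusters(Conf conf) {
        return FILE_SYSTEMS.getOrDefault(conf.get("fs.default.name"), new ArrayList<>());
    }

    // Suspected pattern: the caller's conf is ignored and a new one is created.
    static String readIgnoringConf(Conf conf) {
        List<String> clusters = listClusters(new Conf());
        return clusters.get(0); // throws IndexOutOfBoundsException on an empty list
    }

    // Proposed pattern: the conf passed to the method is used for the lookup.
    static String readUsingConf(Conf conf) {
        List<String> clusters = listClusters(conf);
        return clusters.get(0);
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("fs.default.name", "hdfs://namenode:8020");
        System.out.println("using passed conf: " + readUsingConf(conf));
        try {
            readIgnoringConf(conf);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("using new conf: " + e.getClass().getSimpleName());
        }
    }
}
```

In this model the lookup with the passed conf finds the cluster file, while 
the lookup with a new conf sees an empty list and fails on get(0), the same 
kind of failure as in the stack trace above.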




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira