Yann Moisan created MAHOUT-1068:
-----------------------------------

             Summary: FileDataModel should ignore directories when reloading data
                 Key: MAHOUT-1068
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1068
             Project: Mahout
          Issue Type: Bug
          Components: Collaborative Filtering
    Affects Versions: 0.7
            Reporter: Yann Moisan
            Assignee: Sean Owen


I work with a directory that contains:
- a file test.csv (my data for recommendation)
- a directory test (for other purposes ...)

Surprisingly, I encountered the following error:
java.io.FileNotFoundException: .../test (Is a directory)
        at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_03]
        at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[na:1.7.0_03]
        at org.apache.mahout.common.iterator.FileLineIterator.getFileInputStream(FileLineIterator.java:98) ~[mahout-core-0.7.jar:0.7]
        at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:79) ~[mahout-core-0.7.jar:0.7]
        at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:67) ~[mahout-core-0.7.jar:0.7]
        at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.buildModel(FileDataModel.java:238) [mahout-core-0.7.jar:0.7]
        at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.reload(FileDataModel.java:207) [mahout-core-0.7.jar:0.7]
        at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:193) [mahout-core-0.7.jar:0.7]
        at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:148) [mahout-core-0.7.jar:0.7]


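After looking at the code, I saw that the method findUpdateFilesAfter doesn't
filter out directories, so the directory test is picked up as an update file
alongside test.csv and then opened as if it were a regular file.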

I propose adding a check in the method:
    ...
    for (File updateFile : parentDir.listFiles()) {
+     if (!updateFile.isDirectory()) { 
      String updateFileName = updateFile.getName();
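
To illustrate the idea outside of Mahout, here is a minimal standalone sketch of
the same check. The class UpdateFileFilterSketch and the helper listRegularFiles
are hypothetical names invented for this example; this is not the actual body of
findUpdateFilesAfter, only an assumption about how skipping directories might
look. It uses only java.io.File.listFiles() and File.isDirectory().

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

class UpdateFileFilterSketch {

    // Hypothetical helper (not Mahout code): returns the entries of parentDir
    // that are not directories, so a later open-for-read cannot hit "Is a directory".
    static List<File> listRegularFiles(File parentDir) {
        List<File> result = new ArrayList<>();
        File[] children = parentDir.listFiles();
        if (children == null) {
            // listFiles() returns null if parentDir is not a directory or on I/O error
            return result;
        }
        for (File child : children) {
            if (!child.isDirectory()) {
                result.add(child);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Recreate the layout from the report: a file test.csv next to a directory test
        File dir = Files.createTempDirectory("mahout-1068").toFile();
        new File(dir, "test.csv").createNewFile();
        new File(dir, "test").mkdir();

        for (File f : listRegularFiles(dir)) {
            System.out.println(f.getName());
        }
    }
}
```

With the layout described above, only test.csv is returned and the directory
test is skipped.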


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
