Yann Moisan created MAHOUT-1068:
-----------------------------------
Summary: FileDataModel should ignore directories when reloading data
Key: MAHOUT-1068
URL: https://issues.apache.org/jira/browse/MAHOUT-1068
Project: Mahout
Issue Type: Bug
Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Yann Moisan
Assignee: Sean Owen
I work with a directory that contains:
- a file test.csv (my data for recommendation)
- a directory test (for other purposes ...)
Surprisingly, I encountered the following error:
java.io.FileNotFoundException: .../test (Is a directory)
    at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_03]
    at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[na:1.7.0_03]
    at org.apache.mahout.common.iterator.FileLineIterator.getFileInputStream(FileLineIterator.java:98) ~[mahout-core-0.7.jar:0.7]
    at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:79) ~[mahout-core-0.7.jar:0.7]
    at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:67) ~[mahout-core-0.7.jar:0.7]
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.buildModel(FileDataModel.java:238) [mahout-core-0.7.jar:0.7]
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.reload(FileDataModel.java:207) [mahout-core-0.7.jar:0.7]
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:193) [mahout-core-0.7.jar:0.7]
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:148) [mahout-core-0.7.jar:0.7]
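The root of the trace is plain JDK behavior and can be reproduced without Mahout. The following is a minimal sketch, not code from the project: the class name OpenDirectoryDemo and the helper failsToOpen are made up for illustration. It relies only on the documented fact that java.io.FileInputStream throws FileNotFoundException when the path is a directory.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;

public class OpenDirectoryDemo {

  // Hypothetical helper: returns true if opening the path as a stream
  // fails with FileNotFoundException, false if the stream opens.
  public static boolean failsToOpen(File path) throws IOException {
    try (FileInputStream in = new FileInputStream(path)) {
      return false;
    } catch (FileNotFoundException e) {
      // Thrown for a missing file, and also for a directory.
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    // A fresh temporary directory stands in for the "test" directory.
    File dir = Files.createTempDirectory("mahout-1068-demo").toFile();
    System.out.println(failsToOpen(dir));
  }
}
```

Any code that opens every entry returned by File.listFiles() as a stream hits this as soon as one entry is a directory.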
After looking at the code, I saw that the method findUpdateFilesAfter doesn't
filter out directories.
I propose adding a check in the method:
...
for (File updateFile : parentDir.listFiles()) {
+ if (!updateFile.isDirectory()) {
String updateFileName = updateFile.getName();
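To illustrate the proposed check in isolation, here is a self-contained sketch. It is not the body of findUpdateFilesAfter: the class UpdateFileFilter and the method listRegularFiles are hypothetical names, and the null check on listFiles() is an extra precaution that the snippet above does not contain.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class UpdateFileFilter {

  // Hypothetical helper: collects the entries of parentDir that are not
  // directories, mirroring the !updateFile.isDirectory() check proposed above.
  public static List<File> listRegularFiles(File parentDir) {
    List<File> result = new ArrayList<>();
    File[] children = parentDir.listFiles();
    if (children == null) {
      // listFiles() returns null if parentDir is not a directory
      // or an I/O error occurs.
      return result;
    }
    for (File child : children) {
      if (!child.isDirectory()) {
        result.add(child);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // Recreate the layout from the report: test.csv next to a directory "test".
    File parentDir = Files.createTempDirectory("mahout-1068").toFile();
    new File(parentDir, "test.csv").createNewFile();
    new File(parentDir, "test").mkdir();
    for (File f : listRegularFiles(parentDir)) {
      System.out.println(f.getName());
    }
  }
}
```

With that layout only test.csv is listed, so the directory test would never be handed to a line iterator.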