DIH FileListEntityProcessor fileName filters directory names and stops
recursion
---------------------------------------------------------------------------------
Key: SOLR-1000
URL: https://issues.apache.org/jira/browse/SOLR-1000
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Fergus McMenemie
I have been trying to find out why DIH in FileListEntityProcessor mode did not
appear to be recursing into subdirectories. Going through
FileListEntityProcessor.java I eventually tumbled to the fact that my filename
filter setting from data-config.xml also applied to directory names.
Now, I feel that the fileName filter should be applied only to the files fed
into the parser; it should not be applied to the names of the directories we are
recursing through. I bodged the code so that the "fileName" and "excludes"
attributes of "entity" apply only to file names and not to directory names. It
now recurses through my directory tree, indexing only the appropriate files! I
think the new behavior is more standard.
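
To illustrate the behavior described above, here is a minimal, self-contained
sketch. It is not the DIH source and not the patch; the class name
FileListSketch and the method listFiles are hypothetical, and the use of
Matcher.find() for the regex test is an assumption. The point is only that
directories are always descended into when recursion is on, while the fileName
and excludes patterns are tested against regular files alone.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual FileListEntityProcessor code.
public class FileListSketch {

    /**
     * Returns the files under dir whose names match fileName (if given)
     * and do not match excludes (if given). Either pattern may be null.
     */
    public static List<File> listFiles(File dir, Pattern fileName,
                                       Pattern excludes, boolean recursive) {
        List<File> out = new ArrayList<File>();
        collect(dir, fileName, excludes, recursive, out);
        return out;
    }

    private static void collect(File dir, Pattern fileName, Pattern excludes,
                                boolean recursive, List<File> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // Directory names are never tested against the patterns,
                // so a non-matching directory name cannot stop the recursion.
                if (recursive) {
                    collect(child, fileName, excludes, recursive, out);
                }
                continue;
            }
            String name = child.getName();
            if (fileName != null && !fileName.matcher(name).find()) {
                continue; // file name does not match the include pattern
            }
            if (excludes != null && excludes.matcher(name).find()) {
                continue; // file name matches the exclude pattern
            }
            out.add(child);
        }
    }
}
```

With a pattern such as ".*\.xml$", a subdirectory named "sub" does not match,
yet the XML files inside it are still returned.
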
I will submit a patch once I have constructed one!