[ https://issues.apache.org/jira/browse/SOLR-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fergus McMenemie updated SOLR-1000:
-----------------------------------
Attachment: SOLR-1000.patch
Here is my first attempt at a patch. It seems to work OK; however, the test
case I added, TestFileListEntityProcessor.java, fails. I need somebody who
knows what they are doing to point out what I am doing wrong!
> DIH FileListEntityProcessor fileName filters directory names and stops
> recursion
> ---------------------------------------------------------------------------------
>
> Key: SOLR-1000
> URL: https://issues.apache.org/jira/browse/SOLR-1000
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 1.3
> Reporter: Fergus McMenemie
> Attachments: SOLR-1000.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> I have been trying to find out why DIH in FileListEntityProcessor mode did
> not appear to be recursing into subdirectories. Going through
> FileListEntityProcessor.java, I eventually tumbled to the fact that my
> fileName filter setting from data-config.xml was also being applied to
> directory names. I feel that the fileName filter should be applied only to
> the files fed into the parser; it should not be applied to the names of the
> directories we are recursing through. I bodged the code to adjust the
> behavior so that the "fileName" and "excludes" attributes of "entity" apply
> only to file names and not to directory names. It now recurses through my
> directory tree, indexing only the appropriate files! I think the new
> behavior is more standard.
> I will submit a patch once I have constructed one!
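
To make the proposed behavior concrete, here is a minimal sketch of a
recursive listing in which the name filters are tested against plain files
only. It is not the code from FileListEntityProcessor.java or from the
attached patch: the class FileWalkSketch, the method listFiles and its
parameters are hypothetical names chosen for illustration, and the use of
Matcher.find() for the pattern test is an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual DIH implementation.
final class FileWalkSketch {

    /**
     * Collects the files under dir. The fileName and excludes patterns are
     * tested against plain files only; directories are always entered when
     * recursive is true, whatever their names are. Either pattern may be null.
     */
    static List<File> listFiles(File dir, Pattern fileName, Pattern excludes,
                                boolean recursive) {
        List<File> matches = new ArrayList<>();
        walk(dir, fileName, excludes, recursive, matches);
        return matches;
    }

    private static void walk(File dir, Pattern fileName, Pattern excludes,
                             boolean recursive, List<File> matches) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // dir is not a directory, or it could not be read
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // Directory names are deliberately NOT checked against the
                // filters, so a non-matching name cannot stop the recursion.
                if (recursive) {
                    walk(child, fileName, excludes, true, matches);
                }
                continue;
            }
            String name = child.getName();
            if (fileName != null && !fileName.matcher(name).find()) {
                continue; // file name does not match the include pattern
            }
            if (excludes != null && excludes.matcher(name).find()) {
                continue; // file name matches the exclude pattern
            }
            matches.add(child);
        }
    }
}
```

With a fileName pattern such as `.*\.xml$`, a subdirectory called `sub` does
not match the pattern but is still descended into, so `sub/b.xml` is returned
while `sub/c.txt` is skipped.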