[ https://issues.apache.org/jira/browse/SOLR-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erick Erickson resolved SOLR-798.
---------------------------------
Resolution: Won't Fix
SPRING_CLEANING_2013: we can reopen if necessary.
> FileListEntityProcessor can't handle directories containing lots of files
> -------------------------------------------------------------------------
>
> Key: SOLR-798
> URL: https://issues.apache.org/jira/browse/SOLR-798
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler
> Reporter: Grant Ingersoll
> Priority: Minor
>
> The FileListEntityProcessor currently tries to process all documents in a
> single directory at once and stores the results in a HashMap. On
> directories containing a large number of documents, this quickly causes
> OutOfMemory errors.
> Unfortunately, the typical fix for this is to hack FileFilter to do the work
> for you and always return false from the accept method. It may be possible
> to hook up some type of Producer/Consumer multithreaded FileFilter approach
> whereby the FileFilter blocks until the nextRow() mechanism requests another
> row, thereby avoiding the need to cache everything in the map.
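
For illustration only, the Producer/Consumer idea in the description could be
sketched roughly as follows. This is not the DataImportHandler code: the class
name BlockingFileLister, the queue capacity parameter, and the row keys are
assumptions made up for this sketch. A background thread runs listFiles() with
a FileFilter that blocks on a bounded queue and always returns false, and
nextRow() pulls one file at a time.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper, not part of Solr: streams files to the caller
// one row at a time instead of collecting them all up front.
class BlockingFileLister {
    // Sentinel that marks the end of the listing (compared by identity).
    private static final File DONE = new File("");

    private final BlockingQueue<File> queue;
    private boolean finished = false;

    BlockingFileLister(final File dir, int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                // The filter does the work: put() blocks while the queue is
                // full, i.e. until nextRow() has consumed an earlier file.
                dir.listFiles(new FileFilter() {
                    @Override
                    public boolean accept(File f) {
                        try {
                            if (f.isFile()) {
                                queue.put(f);
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                        // Always reject, so listFiles() builds no result array.
                        return false;
                    }
                });
            } finally {
                try {
                    queue.put(DONE);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    // Returns the next row, or null once the directory is exhausted.
    // The row keys here are illustrative assumptions.
    Map<String, Object> nextRow() throws InterruptedException {
        if (finished) {
            return null;
        }
        File f = queue.take();
        if (f == DONE) {
            finished = true;
            return null;
        }
        Map<String, Object> row = new HashMap<>();
        row.put("fileAbsolutePath", f.getAbsolutePath());
        row.put("fileSize", f.length());
        return row;
    }
}
```

A caller would loop on nextRow() until it returns null. With a capacity of 1,
at most one pending file is buffered at a time. Note that listFiles() may still
read all file names of the directory internally, so this bounds the cached rows
rather than every per-file allocation.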
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira