[ https://issues.apache.org/jira/browse/SOLR-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shalin Shekhar Mangar resolved SOLR-987.
----------------------------------------
Resolution: Duplicate
Duplicate of SOLR-980
> Add a new DataImportHandler EntityProcessor to handle non-XML files
> -------------------------------------------------------------------
>
> Key: SOLR-987
> URL: https://issues.apache.org/jira/browse/SOLR-987
> Project: Solr
> Issue Type: New Feature
> Components: contrib - DataImportHandler
> Reporter: Nathan Adams
>
> We need a way to use the DataImportHandler to index non-XML (i.e. plain text)
> files, fetched either over HTTP or from the file system. This would allow
> the entire contents of a text file to be placed into a single field of a
> document whose other fields are pulled from another DataSource. An
> EntityProcessor looks like the right place for this as it may help us add
> more attributes if needed. We could also consider support for other file
> formats (PDF, office, etc.), which may overlap with some of the
> Extraction/Tika work.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.