[ 
https://issues.apache.org/jira/browse/SOLR-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667672#action_12667672
 ] 

Nathan Adams commented on SOLR-987:
-----------------------------------

Yes it is - I didn't realize you had already created an issue for this.

> Add a new DataImportHandler EntityProcessor to handle non-XML files
> -------------------------------------------------------------------
>
>                 Key: SOLR-987
>                 URL: https://issues.apache.org/jira/browse/SOLR-987
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Nathan Adams
>
> Need a way to use Data Import Handler to index non-XML (i.e. simple text) 
> files (either via HTTP or FileSystem).  This would assist in putting the 
> entire contents of a text file into a single field of a document for which 
> the other fields are being pulled out of another DataSource.  An 
> EntityProcessor looks like the right place for this as it may help us add 
> more attributes if needed.  We could also consider support for other file 
> formats (PDF, office, etc.), which may overlap with some of the 
> Extraction/Tika work.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
