[ https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650634#action_12650634 ]
Grant Ingersoll commented on SOLR-284: -------------------------------------- bq. Tika doesn't need to do this explicitly.... you know all fields coming out of your call to the Tika API will be Tika fields. Solar Cell could map all Tika output fields to tika_* where * is the Tika outputted field name. And with field name mapping this default would be overridden, say tika_title mapped to "title". I can add in an option to have it do this mapping. > Parsing Rich Document Types > --------------------------- > > Key: SOLR-284 > URL: https://issues.apache.org/jira/browse/SOLR-284 > Project: Solr > Issue Type: New Feature > Components: update > Reporter: Eric Pugh > Assignee: Grant Ingersoll > Fix For: 1.4 > > Attachments: libs.zip, rich.patch, rich.patch, rich.patch, > rich.patch, rich.patch, rich.patch, rich.patch, SOLR-284.patch, > SOLR-284.patch, solr-word.pdf, source.zip, test-files.zip, test-files.zip, > test.zip, un-hardcode-id.diff > > > I have developed a RichDocumentRequestHandler based on the CSVRequestHandler > that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into > Solr. > There is a wiki page with information here: > http://wiki.apache.org/solr/UpdateRichDocuments > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.