[jira] Issue Comment Edited: (SOLR-1358) Integration of Tika and DataImportHandler

Akshay K. Ukey (JIRA) Fri, 11 Dec 2009 05:25:44 -0800

    [ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789300#action_12789300 ]


Akshay K. Ukey edited comment on SOLR-1358 at 12/11/09 1:23 PM:
----------------------------------------------------------------

Patch with a test case and with the Tika parser configurable via the parser 
attribute of the entity tag.

      was (Author: akshay):
    Patch with test case, tika parser configurable via parser attribute for 
entity tag.
  
> Integration of Tika and DataImportHandler
> -----------------------------------------
>
>                 Key: SOLR-1358
>                 URL: https://issues.apache.org/jira/browse/SOLR-1358
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Sascha Szott
>            Assignee: Noble Paul
>         Attachments: SOLR-1358.patch, SOLR-1358.patch, SOLR-1358.patch, 
> SOLR-1358.patch
>
>
> At the moment, it's impossible to configure Solr so that it builds 
> documents from data that comes from both PDF documents and database table 
> columns. Currently, to accomplish this, the user has to add a preprocessing 
> step that converts PDF files into plain text files. Therefore, I would like 
> to see an integration of Solr Cell into DIH that makes this preprocessing 
> obsolete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
