[ 
https://issues.apache.org/jira/browse/SOLR-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210033#comment-13210033
 ] 

Lance Norskog commented on SOLR-2332:
-------------------------------------

Unpacking a zip file is a very narrow, focused operation. This could also be 
done with a separate UpdateRequestHandler that does nothing but unpack zip 
files. It would use the basic JDK zip file code, not Tika. You would then 
configure the Tika handler beneath it, so each unpacked file is passed on to it. 

Another use case is a ZIP file full of Solr update XML files, which Tika does 
not know about. To handle this, you want an UpdateRequestHandler stack like this: 
zip unpacker -> XmlUpdateRequestHandler
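
A minimal sketch of the "zip unpacker" stage, using only the JDK's 
java.util.zip classes. EntryHandler and ZipUnpackSketch are hypothetical names 
invented for this illustration; they stand in for whatever handler sits 
beneath the unpacker (Tika, XmlUpdateRequestHandler) and are not Solr APIs. 
How this would be wired into a real request handler chain is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipUnpackSketch {

    /** Hypothetical stand-in for the downstream handler; not a Solr interface. */
    interface EntryHandler {
        void handle(String entryName, byte[] content) throws IOException;
    }

    /** Reads each regular file entry from the zip stream and hands it to the delegate. */
    static void unpack(InputStream zipStream, EntryHandler delegate) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content to index
                }
                // ZipInputStream.read returns -1 at the end of the current entry.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zin.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                delegate.handle(entry.getName(), buf.toByteArray());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip in memory so the example needs no files on disk.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            zout.putNextEntry(new ZipEntry("docs.xml"));
            zout.write("<add/>".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        }
        // The delegate here only prints; a real one would parse or index the bytes.
        unpack(new ByteArrayInputStream(bos.toByteArray()),
               (name, content) -> System.out.println(name + ": " + content.length + " bytes"));
    }
}
```

The unpacker never inspects the file contents, so the same stage can sit above 
either downstream handler.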

                
> TikaEntityProcessor retrieves only File Names from Zip extraction
> -----------------------------------------------------------------
>
>                 Key: SOLR-2332
>                 URL: https://issues.apache.org/jira/browse/SOLR-2332
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>            Reporter: Jayendra Patil
>             Fix For: 3.6, 4.0
>
>         Attachments: SOLR-2332.patch, solr-word.zip
>
>
> Extraction of Zip files using TikaEntityProcessor returns only the names of 
> the files.
> It does not extract the contents of the files in the Zip.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
