[ 
https://issues.apache.org/jira/browse/TIKA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173147#comment-13173147
 ] 

Antoni Mylka commented on TIKA-686:
-----------------------------------

Why keep this issue open?

A PdfParser has appeared in PDFBox (PDFBOX-1132). Keeping both hardly makes 
sense and has already been identified as a problem (TIKA-810). Pushing parsers 
upstream covers Ken's "I'm in favor of anything that helps with avoiding 
dependencies on POI" use case. We agree to keep the dependency from 
tika-parsers to POI (doubts about that were dispelled in 
http://mail-archives.apache.org/mod_mbox/tika-dev/201112.mbox/%3C4EEBA9CA.9030900%40gmail.com%3E).
With this dependency in place, it will be possible to use the Maven exclusion 
construct, exactly as described in my "I like exclusions better" post. So all 
known use cases are covered.
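
As a rough illustration of the exclusion construct mentioned above (a sketch 
only, not copied from the referenced post): a downstream pom.xml could depend 
on tika-parsers and exclude the POI artifacts it does not want. The version 
number and the particular set of POI artifacts listed here are assumptions and 
should be checked against the actual dependency tree.

```xml
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <!-- Assumed version, for illustration only -->
  <version>1.0</version>
  <exclusions>
    <!-- Assumed list of POI artifacts; verify with "mvn dependency:tree" -->
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-scratchpad</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With such exclusions in place, the parsers that rely on the excluded libraries 
would presumably no longer work at runtime, which is the intended trade-off for 
users who want to avoid the POI dependency.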

Since we can't actually remove the PdfParser from Tika now (that would 
definitely be a backward-incompatible change), we should deprecate it, remove 
it from /META-INF/services/org.apache.tika.parser.Parser, and replace the 
implementation with a delegation to the PDFBox version. That work, however, 
falls within the scope of TIKA-810.

Anyway, this can be closed. The discussion can continue in TIKA-810 and in some 
new issue for POI.

WDYT?
                
> Split tika-parsers into separate components
> -------------------------------------------
>
>                 Key: TIKA-686
>                 URL: https://issues.apache.org/jira/browse/TIKA-686
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Christopher Currie
>            Priority: Minor
>
> The email thread [1] from two years ago that led to splitting Tika into 
> separate components also suggested splitting tika-parsers into separate 
> components based on dependencies. This would be extremely useful, especially 
> in cases where a given parser has no dependencies beyond tika-core. Please 
> consider refactoring the parsers into separate components for 1.0.
> [1] http://markmail.org/message/tavirkqhn6r2szrz


        
