[ 
https://issues.apache.org/jira/browse/JCR-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295063#comment-16295063
 ] 

Julian Reschke commented on JCR-1878:
-------------------------------------

trunk: [r887170|http://svn.apache.org/r887170] 
[r884646|http://svn.apache.org/r884646] [r884642|http://svn.apache.org/r884642] 
[r884575|http://svn.apache.org/r884575] [r883777|http://svn.apache.org/r883777] 
[r883382|http://svn.apache.org/r883382] [r816089|http://svn.apache.org/r816089] 
[r815785|http://svn.apache.org/r815785] [r815776|http://svn.apache.org/r815776] 
[r815774|http://svn.apache.org/r815774] [r807139|http://svn.apache.org/r807139] 
[r794633|http://svn.apache.org/r794633] [r778621|http://svn.apache.org/r778621] 
[r763242|http://svn.apache.org/r763242] [r763160|http://svn.apache.org/r763160] 
[r762823|http://svn.apache.org/r762823] [r762821|http://svn.apache.org/r762821] 
[r762818|http://svn.apache.org/r762818] [r762817|http://svn.apache.org/r762817] 
[r762814|http://svn.apache.org/r762814] [r762813|http://svn.apache.org/r762813] 
[r762804|http://svn.apache.org/r762804] [r762802|http://svn.apache.org/r762802]


> Use Apache Tika for text extraction
> -----------------------------------
>
>                 Key: JCR-1878
>                 URL: https://issues.apache.org/jira/browse/JCR-1878
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-text-extractors
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 2.0
>
>
> Once Apache Tika is released with a resolution to TIKA-175 (making Tika 
> available to Java 1.4 projects), we should replace our direct parser library 
> dependencies with Tika parsers. Ideally we'd just use the Tika 
> AutoDetectParser that'll automatically detect the type of a binary and parse 
> it accordingly, solving JCR-728.
> I guess we should keep some level of backwards compatibility with existing 
> textFilterClasses="..." configurations, perhaps by keeping the existing 
> TextExtractor classes as wrappers around respective Tika parsers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
