[ https://issues.apache.org/jira/browse/TIKA-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-518.
--------------------------------

    Resolution: Won't Fix

Resolving as Won't Fix because in most common cases XML attribute values are
not interesting from a text extraction perspective. Document types for which
extracting attribute values makes sense should have their own parser classes.
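
As a rough illustration of what such a type-specific extractor might look
like, here is a minimal sketch that uses only the JDK's SAX API. It is not
Tika's own code: the class names AttributeTextSketch and AttributeTextHandler
and the sample record are hypothetical, and a real Tika parser would instead
implement org.apache.tika.parser.Parser and be registered for the relevant
media type.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects attribute values together with element text.
class AttributeTextSketch {

    /** SAX handler that appends attribute values and character data to one buffer. */
    static class AttributeTextHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            // Emit every attribute value of the element, separated by spaces.
            for (int i = 0; i < attributes.getLength(); i++) {
                text.append(attributes.getValue(i)).append(' ');
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Emit element text as a plain text extractor normally would.
            text.append(ch, start, length).append(' ');
        }

        String getText() {
            return text.toString().trim();
        }
    }

    /** Parses the given XML string and returns attribute values plus element text. */
    static String extract(String xml) throws Exception {
        AttributeTextHandler handler = new AttributeTextHandler();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample record; the extracted text includes "csw-42" and "en".
        String xml = "<record id=\"csw-42\"><title lang=\"en\">Rivers</title></record>";
        System.out.println(extract(xml));
    }
}
```

With a handler along these lines, a full-text query for an attribute value
such as "csw-42" would have something to match, which is the behaviour the
reporter was relying on in Jackrabbit 1.4.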
                
> Attribute values are not indexed
> --------------------------------
>
>                 Key: TIKA-518
>                 URL: https://issues.apache.org/jira/browse/TIKA-518
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 0.6
>            Reporter: Ovidiu  Cilnician
>            Assignee: Jukka Zitting
>
> I just switched from Jackrabbit 1.4.11 to Jackrabbit 2.1.1.
> Some of the test cases that were working in 1.4 fail in 2.1.1.
> These test cases (CSW service related) contain an AnyText filter and they
> are looking for an attribute value. No records are returned in this case.
> It works when an element value is used.
> By looking at the Jackrabbit Content Repository project I found this
> issue (JCR-470 XMLIndexFilter should index the attributes), which was fixed
> for Jackrabbit 1.4.
> Did the switch to Tika (my version of Jackrabbit 2.1.1 uses Tika 0.6) cause
> this problem?
> Thank you.


        
