Tim Allison created TIKA-3062:
---------------------------------

             Summary: Improve attachment alignment in tika-eval
                 Key: TIKA-3062
                 URL: https://issues.apache.org/jira/browse/TIKA-3062
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


Over the last few runs, we noticed that the alignment of attachments in 
tika-eval has room for improvement.  Different extractors or different 
versions can extract different numbers of attachments, with different names, 
in different orders, and sometimes even with different digests.

We should place more trust in digests when they exist, and rely less on 
matching embedded file names. 
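
One way to picture the proposal is a two-pass alignment: pair attachments by 
digest first, then fall back to embedded file names only for whatever is left. 
The sketch below is illustrative only and is not tika-eval's code; the 
function name align_attachments and the (name, digest) tuple layout are 
assumptions made for this example.

```python
from collections import defaultdict, deque


def align_attachments(a, b):
    """Pair attachments from two extraction runs (hypothetical helper).

    a and b are lists of (name, digest) tuples; digest may be None.
    Returns (pairs, unmatched_a, unmatched_b), all expressed as list indices.
    """
    pairs = []
    matched_a, matched_b = set(), set()

    # Pass 1: trust digests. Index run B by digest, keeping order so that
    # duplicate digests are paired one-to-one in the order they appear.
    by_digest = defaultdict(deque)
    for j, (_, digest) in enumerate(b):
        if digest is not None:
            by_digest[digest].append(j)
    for i, (_, digest) in enumerate(a):
        candidates = by_digest.get(digest) if digest is not None else None
        if candidates:
            j = candidates.popleft()
            pairs.append((i, j))
            matched_a.add(i)
            matched_b.add(j)

    # Pass 2: fall back to embedded file names for anything still unmatched,
    # including items whose digests differ between the two runs.
    by_name = defaultdict(deque)
    for j, (name, _) in enumerate(b):
        if j not in matched_b:
            by_name[name].append(j)
    for i, (name, _) in enumerate(a):
        if i in matched_a:
            continue
        candidates = by_name.get(name)
        if candidates:
            j = candidates.popleft()
            pairs.append((i, j))
            matched_a.add(i)
            matched_b.add(j)

    unmatched_a = [i for i in range(len(a)) if i not in matched_a]
    unmatched_b = [j for j in range(len(b)) if j not in matched_b]
    return sorted(pairs), unmatched_a, unmatched_b
```

With this ordering, an attachment that keeps its bytes but is renamed or 
reordered between runs is still paired, while a name match is used only as a 
last resort.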



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
