[ 
https://issues.apache.org/jira/browse/TIKA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-321.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.6
         Assignee: Jukka Zitting

I've made a number of optimizations to the type detection code, and as a 
result it's already over an order of magnitude faster than before. I believe 
there's *still* an order of magnitude of improvement available (check the most 
common types first, short-circuit matching to only subtypes of already 
detected types, etc.), but I've already reached the performance goals I had, 
so I'll mark this as resolved for Tika 0.6. We can follow up with another 
issue in case anyone has stricter performance requirements.
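
To illustrate the "check most common types first" idea mentioned above, here 
is a minimal sketch. It is not Tika's actual detection code: the class name 
OrderedMagicDetector, the addRule/detect methods, and the frequency numbers 
are all hypothetical, chosen only to show rules being tried in order of 
expected frequency with the loop stopping at the first match.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of frequency-ordered magic-byte matching.
// Not Tika's implementation; names and weights are assumptions.
final class OrderedMagicDetector {

    // One rule: a byte prefix, the type it indicates, and an assumed
    // relative frequency used only for ordering the rules.
    static final class Rule {
        final byte[] prefix;
        final String type;
        final int expectedFrequency;

        Rule(byte[] prefix, String type, int expectedFrequency) {
            this.prefix = prefix;
            this.type = type;
            this.expectedFrequency = expectedFrequency;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Registers a rule and keeps the list sorted, most frequent first.
    void addRule(String prefix, String type, int expectedFrequency) {
        byte[] bytes = prefix.getBytes(StandardCharsets.ISO_8859_1);
        rules.add(new Rule(bytes, type, expectedFrequency));
        rules.sort(Comparator
                .comparingInt((Rule r) -> r.expectedFrequency)
                .reversed());
    }

    // Tries rules in frequency order and returns on the first match,
    // so common types pay for only a few comparisons.
    String detect(byte[] head) {
        for (Rule rule : rules) {
            if (startsWith(head, rule.prefix)) {
                return rule.type;
            }
        }
        return "application/octet-stream"; // fallback when nothing matches
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With rules registered for "%PDF-" and "GIF8", a buffer starting with 
"%PDF-1.4" would be matched by whichever of the two rules was given the 
higher weight first. The subtype short-circuiting mentioned above is not 
covered by this sketch.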

> Optimize type detection speed
> -----------------------------
>
>                 Key: TIKA-321
>                 URL: https://issues.apache.org/jira/browse/TIKA-321
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.6
>
>
> It would be good to do some simple benchmarks on the type detection code 
> (Tika.detect) to see if there are obvious performance optimizations we could 
> make. There are some use cases, like attaching file type information to 
> directory listings, where type detection speed is important and not 
> necessarily dwarfed by IO waits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
