[ https://issues.apache.org/jira/browse/TIKA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-321.
--------------------------------

    Resolution: Fixed
    Fix Version/s: 0.6
    Assignee: Jukka Zitting

I've made a number of optimizations to the type detection code, and as a
result it's already over an order of magnitude faster than before. I believe
there's *still* an order of magnitude of improvement available (check the most
common types first, short-circuit matching to only the subtypes of already
detected types, etc.), but I've already reached the performance goals I had,
so I'll mark this as resolved for Tika 0.6. We can follow up with another
issue in case anyone has stricter performance requirements.

> Optimize type detection speed
> -----------------------------
>
>                 Key: TIKA-321
>                 URL: https://issues.apache.org/jira/browse/TIKA-321
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.6
>
>
> It would be good to do some simple benchmarks on the type detection code
> (Tika.detect) to see if there are obvious performance optimizations we could
> make. There are some use cases, like attaching file type information to
> directory listings, where type detection speed is important and not
> necessarily dwarfed by IO waits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
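
For readers following up on the "check the most common types first" idea mentioned in the resolution comment, here is a minimal, self-contained sketch of what that ordering could look like. It is not Tika's implementation: the class name MagicDetectorSketch, the Rule type, the four example rules, and the frequency weights are all hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Tika's code: a prefix-based detector that
// tries the rules for the most frequently seen types first.
final class MagicDetectorSketch {

    // One magic rule: a media type, the byte prefix that identifies it, and
    // an assumed relative frequency used only to order the checks.
    static final class Rule {
        final String type;
        final byte[] magic;
        final int frequency;

        Rule(String type, String magic, int frequency) {
            this.type = type;
            // ISO-8859-1 maps each char 0..255 to the byte of the same value.
            this.magic = magic.getBytes(StandardCharsets.ISO_8859_1);
            this.frequency = frequency;
        }

        // True if the header starts with this rule's magic bytes.
        boolean matches(byte[] header) {
            if (header.length < magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (header[i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private static final List<Rule> RULES = new ArrayList<>();

    static {
        // The frequency values are made-up weights, not measured data.
        RULES.add(new Rule("application/zip", "PK\u0003\u0004", 10));
        RULES.add(new Rule("application/pdf", "%PDF-", 50));
        RULES.add(new Rule("image/png", "\u0089PNG", 30));
        RULES.add(new Rule("image/gif", "GIF8", 20));
        // Sort once, in descending frequency, so that common types are
        // matched after as few comparisons as possible.
        RULES.sort(Comparator.comparingInt((Rule r) -> r.frequency).reversed());
    }

    // Returns the first matching type, or a generic fallback if none match.
    static String detect(byte[] header) {
        for (Rule rule : RULES) {
            if (rule.matches(header)) {
                return rule.type;
            }
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] header = "%PDF-1.4".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(header));
    }
}
```

Because the example prefixes are mutually exclusive, reordering the rules changes only how many comparisons a lookup needs, not its result. With overlapping rules (for example a container format and its subtypes), the ordering would also have to respect specificity, which is where the subtype short-circuit mentioned in the comment would come in.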