[
https://issues.apache.org/jira/browse/TIKA-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276936#comment-14276936
]
Nick Burch commented on TIKA-1509:
----------------------------------
Passing a strategy to CompositeParser, then having it select the parser(s)
and wrap them with decorators as needed, could work.
One additional thing to consider is that CompositeParser will walk its way up
the type hierarchy until it finds a parser for the type. If someone has two
parsers for Microsoft Excel .xls, and one parser for x-tika-msoffice (the OLE2
container that .xls sits in), should they be able to request that all parsers
for parent types also be tried? Or would it just be "go up the type hierarchy
until you find at least one parser, then run all parsers at that level based on
the strategy"?
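To make the two options concrete, here is a rough sketch in plain Java. It does
not use Tika's actual classes: the class name, the PARENT and PARSERS maps, the
parser names, and both method names are all made up for illustration, and the
.xls media type string is filled in by me. It only shows how the candidate list
differs between "stop at the first level that has a parser" and "collect
parsers from every parent level".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HierarchySelectionSketch {
    // Hypothetical stand-in for the media type registry: child type -> parent type.
    static final Map<String, String> PARENT = new LinkedHashMap<>();
    // Hypothetical registrations: media type -> names of parsers that claim it.
    static final Map<String, List<String>> PARSERS = new LinkedHashMap<>();

    static {
        PARENT.put("application/vnd.ms-excel", "application/x-tika-msoffice");
        PARSERS.put("application/vnd.ms-excel", List.of("XlsParserA", "XlsParserB"));
        PARSERS.put("application/x-tika-msoffice", List.of("Ole2Parser"));
    }

    /** Option 1: walk up until one level has at least one parser, then stop. */
    static List<String> firstLevelOnly(String type) {
        for (String t = type; t != null; t = PARENT.get(t)) {
            List<String> found = PARSERS.get(t);
            if (found != null && !found.isEmpty()) {
                return new ArrayList<>(found);
            }
        }
        return new ArrayList<>();
    }

    /** Option 2: walk all the way up, collecting parsers from every level. */
    static List<String> allLevels(String type) {
        List<String> out = new ArrayList<>();
        for (String t = type; t != null; t = PARENT.get(t)) {
            out.addAll(PARSERS.getOrDefault(t, List.of()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(firstLevelOnly("application/vnd.ms-excel"));
        System.out.println(allLevels("application/vnd.ms-excel"));
    }
}
```

In both cases the configured strategy would then decide what to do with the
returned candidates (pick one, try each in turn, and so on); the open question
is only which list it gets handed.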
> Create configurable strategies for composite parsers
> ----------------------------------------------------
>
> Key: TIKA-1509
> URL: https://issues.apache.org/jira/browse/TIKA-1509
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
>
> Several parsers can handle the same mime type, and we are currently ordering
> which parser is chosen (roughly) by the alphabetic order of the parser class
> name.
> Let's allow users to configure strategies for picking parsers.
> ***NOTE: this description is just a placeholder; will edit later.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)