[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187205#comment-15187205 ]
Nick Burch commented on TIKA-1508:
----------------------------------
> I think that's exactly what ParseContext should be for..it should be a
> vehicle for Param passing. We can delineate by property name (FQ) and/or by
> class.
I view {{ParseContext}} as somewhere you configure things on a per-document
basis, not a per-parser basis.
So, need to set where Tesseract lives on your system? That applies to
everything, so it goes on the parser. Need to tell Tesseract to use a German
rather than an English dictionary on this particular JPEG? That applies to just
this one document being parsed, so it goes on the {{ParseContext}}.
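
To make the split concrete, here is a rough sketch of parser-level defaults with a per-document override. It is deliberately self-contained: {{OcrConfig}}, {{Context}} and {{OcrParser}} are hypothetical stand-ins written for this example, not Tika's actual classes, and the {{/usr/bin}} path is an assumption. The only idea borrowed from Tika is a context keyed by class.

```java
import java.util.HashMap;
import java.util.Map;

public class PerDocumentConfigSketch {

    /** Hypothetical OCR settings object (not Tika's real config class). */
    static final class OcrConfig {
        final String tesseractPath; // system-wide: where the binary lives
        final String language;      // may differ from document to document

        OcrConfig(String tesseractPath, String language) {
            this.tesseractPath = tesseractPath;
            this.language = language;
        }

        /** Returns a copy that keeps the path but uses another language. */
        OcrConfig withLanguage(String newLanguage) {
            return new OcrConfig(tesseractPath, newLanguage);
        }
    }

    /** Hypothetical class-keyed context, similar in spirit to ParseContext. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value == null ? defaultValue : key.cast(value);
        }
    }

    /** Hypothetical parser: holds the defaults, lets the context override them. */
    static final class OcrParser {
        private final OcrConfig defaults;

        OcrParser(OcrConfig defaults) {
            this.defaults = defaults;
        }

        /** The per-document config wins if present; otherwise use the defaults. */
        OcrConfig effectiveConfig(Context context) {
            return context.get(OcrConfig.class, defaults);
        }
    }

    public static void main(String[] args) {
        // Configured once on the parser: applies to every document.
        OcrConfig defaults = new OcrConfig("/usr/bin", "eng");
        OcrParser parser = new OcrParser(defaults);

        // An ordinary document: nothing set on the context.
        Context plain = new Context();

        // One German JPEG: override the language for this document only.
        Context german = new Context();
        german.set(OcrConfig.class, defaults.withLanguage("deu"));

        System.out.println(parser.effectiveConfig(plain).language);
        System.out.println(parser.effectiveConfig(german).language);
    }
}
```

The Tesseract path never appears on the context here; only the setting that genuinely varies per document does.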
> Add uniformity to parser parameter configuration
> ------------------------------------------------
>
> Key: TIKA-1508
> URL: https://issues.apache.org/jira/browse/TIKA-1508
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Fix For: 1.13
>
>
> We can currently configure parsers by the following means:
> 1) programmatically by direct calls to the parsers or their config objects
> 2) sending in a config object through the ParseContext
> 3) modifying .properties files for specific parsers (e.g. PDFParser)
> Rather than scattering the landscape with .properties files for each parser,
> it would be great if we could specify parser parameters in the main config
> file, something along the lines of this:
> {noformat}
> <parser class="org.apache.tika.parser.audio.AudioParser">
>   <params>
>     <int name="someparam1">2</int>
>     <str name="someOtherParam2">something or other</str>
>   </params>
>   <mime>audio/basic</mime>
>   <mime>audio/x-aiff</mime>
>   <mime>audio/x-wav</mime>
> </parser>
> {noformat}
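
For the {{<params>}} block proposed in the quoted description, the following is a minimal sketch of how such an element might be read into typed values using only the JDK's DOM parser. The class and method names ({{ParserParamsSketch}}, {{readParams}}) are made up for illustration, only the {{int}} and {{str}} element types from the example are handled, and this is not how Tika's config loader is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ParserParamsSketch {

    /** Reads the children of the first params element into a name-to-value map. */
    static Map<String, Object> readParams(String parserXml) throws Exception {
        Element parser = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(parserXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();

        Map<String, Object> params = new LinkedHashMap<>();
        NodeList paramsNodes = parser.getElementsByTagName("params");
        if (paramsNodes.getLength() == 0) {
            return params; // no params block: parser keeps its defaults
        }

        NodeList children = paramsNodes.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Element param = (Element) node;
            String name = param.getAttribute("name");
            String text = param.getTextContent().trim();
            // The element name carries the type, as in the example config.
            switch (param.getTagName()) {
                case "int":
                    params.put(name, Integer.valueOf(text));
                    break;
                case "str":
                    params.put(name, text);
                    break;
                default:
                    throw new IllegalArgumentException(
                            "Unsupported param type: " + param.getTagName());
            }
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        String xml =
                "<parser class=\"org.apache.tika.parser.audio.AudioParser\">"
              + "<params>"
              + "<int name=\"someparam1\">2</int>"
              + "<str name=\"someOtherParam2\">something or other</str>"
              + "</params>"
              + "<mime>audio/basic</mime>"
              + "</parser>";
        System.out.println(readParams(xml));
    }
}
```

Values read this way are static, parser-level settings, which matches the "applies to everything" case above.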