[
https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187196#comment-15187196
]
Chris A. Mattmann commented on TIKA-1508:
-----------------------------------------
Tim and Thamme:
bq. 1) Let's not use ParseContext as the vehicle for param passing, we will
have collisions with different parsers if anyone uses configure() outside of
the normal course of events...it is simpler to use Map<String,String>. Or, if
we do use the ParseContext, we should specify which parser the params are for,
e.g. {{context.set(PDFParser.class, Map<String,String> params)}}. I do like the
dual use of configure with ParseContext to achieve Nick's recommendation
elegantly.
I think that's exactly what ParseContext should be for: it should be a vehicle
for param passing. We can delineate by fully qualified (FQ) property name
and/or by class.
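A minimal sketch of the two ways to delineate, not Tika code. SketchContext is
a hypothetical stand-in that only mirrors the set(Class<T>, T) / get(Class<T>)
shape of ParseContext; PdfParams, SharedParams, and the org.example.* property
names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class-keyed context; this is NOT Tika's ParseContext.
class SketchContext {
    private final Map<String, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key.getName(), value);
    }

    <T> T get(Class<T> key) {
        // Returns null when nothing was stored under this class.
        return key.cast(entries.get(key.getName()));
    }
}

// Option 1 (by class): a hypothetical per-parser holder. The holder's class is
// the context key, so two parsers can never overwrite each other's params.
class PdfParams {
    final Map<String, String> values = new HashMap<>();
}

// Option 2 (by FQ property name): one hypothetical shared holder whose keys
// carry the owning parser's fully qualified name as a prefix.
class SharedParams {
    final Map<String, String> values = new HashMap<>();
}

class ParamPassingSketch {
    static SketchContext build() {
        SketchContext context = new SketchContext();

        PdfParams pdf = new PdfParams();
        pdf.values.put("sortByPosition", "true");
        context.set(PdfParams.class, pdf);

        SharedParams shared = new SharedParams();
        shared.values.put("org.example.PDFParser.sortByPosition", "true");
        shared.values.put("org.example.AudioParser.someparam1", "2");
        context.set(SharedParams.class, shared);

        return context;
    }

    public static void main(String[] args) {
        SketchContext context = build();
        System.out.println(context.get(PdfParams.class).values);
        System.out.println(context.get(SharedParams.class).values);
    }
}
```

Either way the collision concern is handled by the key: a distinct class per
parser in the first case, a distinct name prefix per parser in the second.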
bq. 4) Let's subclass TikaException for TikaParameterConfigException? I don't
feel strongly about this one.
+1
bq. A) Are we ok with Map<String,String> parameters? Or should we follow, say,
Solr's syntax for type checking?
Yes, I'm OK with Map<String,String>.
bq. B) We could use reflection to get around each parser having to add its own
configuration code. We could create a static configurator that has a
configure(Configurable configurable, Map<String, String> params) method. That
isn't quite right, because we'd have to know the type for each param (see
above), but something along those lines. Too complex?
Maybe not too complex, but not as a start :) Just my 2c.
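For later reference, a rough sketch of what B) might look like, under the
assumption that each parameter maps to a bean-style setter and that the type
is read from the setter's signature. Configurable here is a hypothetical
marker interface, DemoParser and its settings are invented, and
IllegalArgumentException stands in for the proposed
TikaParameterConfigException so the example stays self-contained.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical marker interface, named after the Configurable mentioned in B).
interface Configurable {
}

// Hypothetical parser with two typed settings.
class DemoParser implements Configurable {
    private int maxDepth = 1;
    private boolean sortByPosition = false;

    public void setMaxDepth(int maxDepth) { this.maxDepth = maxDepth; }
    public void setSortByPosition(boolean sortByPosition) { this.sortByPosition = sortByPosition; }
    public int getMaxDepth() { return maxDepth; }
    public boolean isSortByPosition() { return sortByPosition; }
}

class ReflectiveConfigurator {

    // Applies each entry of params by calling the matching one-argument setter.
    static void configure(Configurable target, Map<String, String> params) {
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String name = entry.getKey();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Empty parameter name");
            }
            String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method setter = findSetter(target.getClass(), setterName);
            if (setter == null) {
                throw new IllegalArgumentException("Unknown parameter: " + name);
            }
            // The setter's declared parameter type tells us how to convert the string.
            Object value = convert(entry.getValue(), setter.getParameterTypes()[0]);
            try {
                setter.invoke(target, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot set parameter: " + name, e);
            }
        }
    }

    // Returns the first public one-argument method with the given name, or null.
    private static Method findSetter(Class<?> type, String setterName) {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(setterName) && method.getParameterCount() == 1) {
                return method;
            }
        }
        return null;
    }

    // Only String, int, and boolean are handled in this sketch.
    private static Object convert(String raw, Class<?> type) {
        if (type == String.class) {
            return raw;
        }
        if (type == int.class || type == Integer.class) {
            return Integer.valueOf(raw); // NumberFormatException is an IllegalArgumentException
        }
        if (type == boolean.class || type == Boolean.class) {
            return Boolean.valueOf(raw);
        }
        throw new IllegalArgumentException("Unsupported parameter type: " + type.getName());
    }
}
```

Reading the type from the setter keeps Map<String,String> as the transport
while still giving typed values to the parser, at the cost of requiring a
setter naming convention.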
> Add uniformity to parser parameter configuration
> ------------------------------------------------
>
> Key: TIKA-1508
> URL: https://issues.apache.org/jira/browse/TIKA-1508
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Fix For: 1.13
>
>
> We can currently configure parsers by the following means:
> 1) programmatically by direct calls to the parsers or their config objects
> 2) sending in a config object through the ParseContext
> 3) modifying .properties files for specific parsers (e.g. PDFParser)
> Rather than scattering the landscape with .properties files for each parser,
> it would be great if we could specify parser parameters in the main config
> file, something along the lines of this:
> {noformat}
> <parser class="org.apache.tika.parser.audio.AudioParser">
>   <params>
>     <int name="someparam1">2</int>
>     <str name="someOtherParam2">something or other</str>
>   </params>
>   <mime>audio/basic</mime>
>   <mime>audio/x-aiff</mime>
>   <mime>audio/x-wav</mime>
> </parser>
> {noformat}
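
As an illustration only, the fragment above could be flattened into the
Map<String,String> form discussed in the comment using the JDK's DOM parser.
This sketch assumes a single <parser> element with at most one <params> block
and drops the int/str type tag; ParamsReaderSketch and readParams are
hypothetical names, not part of Tika.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParamsReaderSketch {

    // Reads the children of the first <params> element into name -> text pairs.
    static Map<String, String> readParams(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot read parser config", e);
        }

        Map<String, String> params = new LinkedHashMap<>();
        NodeList blocks = doc.getElementsByTagName("params");
        if (blocks.getLength() == 0) {
            return params; // no <params> block: nothing to configure
        }
        NodeList children = blocks.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Element param = (Element) child;
            params.put(param.getAttribute("name"), param.getTextContent().trim());
        }
        return params;
    }

    public static void main(String[] args) {
        String xml = "<parser class=\"org.apache.tika.parser.audio.AudioParser\">"
                + "<params>"
                + "<int name=\"someparam1\">2</int>"
                + "<str name=\"someOtherParam2\">something or other</str>"
                + "</params>"
                + "<mime>audio/basic</mime>"
                + "</parser>";
        System.out.println(readParams(xml));
    }
}
```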