[
https://issues.apache.org/jira/browse/TIKA-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332209#comment-15332209
]
Tim Allison commented on TIKA-1986:
-----------------------------------
From Nick:
bq. > I think that's exactly what ParseContext should be for..it should be a
vehicle for Param passing. We can delineate by property name (FQ) and/or by
class.
I view ParseContext as somewhere you configure things on a per-document basis,
not a per-parser basis.
So, need to set where Tesseract lives on your system? Applies to everything, so
on the parser. Need to tell Tesseract to use a German not an English dictionary
on this particular jpeg? Applies to just this one document being parsed, so on
the ParseContext.
[link|https://issues.apache.org/jira/browse/TIKA-1508?focusedCommentId=15187205&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15187205]
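The per-parser versus per-document split Nick describes can be sketched roughly as follows. This is only an illustration under stated assumptions: DocContext, OcrOptions and OcrParserSketch are hypothetical stand-ins written for this example, not Tika classes, and the returned string is a made-up description of a run rather than a real Tesseract command line.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-document context: values are looked up by class.
final class DocContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key) {
        // Class.cast passes null through, so a missing entry yields null.
        return key.cast(entries.get(key));
    }
}

// Hypothetical per-document OCR settings (e.g. the dictionary for one jpeg).
final class OcrOptions {
    final String language;

    OcrOptions(String language) {
        this.language = language;
    }
}

// Hypothetical parser: the install path is fixed once, at construction time.
final class OcrParserSketch {
    private final String tesseractPath;
    private final String defaultLanguage;

    OcrParserSketch(String tesseractPath, String defaultLanguage) {
        this.tesseractPath = tesseractPath;
        this.defaultLanguage = defaultLanguage;
    }

    // The language may be overridden for a single document via the context.
    String describeRun(String fileName, DocContext context) {
        OcrOptions perDocument = context.get(OcrOptions.class);
        String language = (perDocument != null) ? perDocument.language : defaultLanguage;
        return tesseractPath + " " + fileName + " -l " + language;
    }
}
```

With one OcrParserSketch instance, a context carrying new OcrOptions("deu") affects only the document parsed with that context; a document parsed with an empty context falls back to the parser-level default.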
My current proposal is to add [[email protected]]'s fantastic beans for the
initialization step, but to go back to what we have for runtime/per-file
(setting parameters programmatically).
If we allow users to use the Param stuff programmatically, they'll have some
nasty Java like so:
{noformat}
Param<Boolean> paramVal = new Param<>("sortByPosition", new Boolean(true));
context.setParam(PDFParser.class, paramVal);
{noformat}
And there are no compile-time guarantees that "sortByPosition" exists for
PDFParser...
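To make the compile-time concern concrete, here is a minimal sketch contrasting a string-keyed parameter bag with a typed setter. StringKeyedParams and PdfConfigSketch are hypothetical stand-ins invented for this example; they are not the proposed Param API or Tika's actual config classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical string-keyed parameter bag, standing in for the Param approach.
final class StringKeyedParams {
    private final Map<String, Object> values = new HashMap<>();

    void setParam(String name, Object value) {
        values.put(name, value);
    }

    boolean getBoolean(String name, boolean fallback) {
        Object value = values.get(name);
        return (value instanceof Boolean) ? (Boolean) value : fallback;
    }
}

// Hypothetical typed config bean, standing in for programmatic per-file setters.
final class PdfConfigSketch {
    private boolean sortByPosition = false;

    void setSortByPosition(boolean sortByPosition) {
        this.sortByPosition = sortByPosition;
    }

    boolean isSortByPosition() {
        return sortByPosition;
    }
}

final class ParamSafetyDemo {
    // What a parser would read after the caller misspells the key.
    static boolean misspelledKeyResult() {
        StringKeyedParams params = new StringKeyedParams();
        // The typo compiles and is stored under the wrong name.
        params.setParam("sortByPostion", Boolean.TRUE);
        // The parser looks up the correct name and silently gets the fallback.
        return params.getBoolean("sortByPosition", false);
    }

    // The same setting through a typed setter.
    static boolean typedSetterResult() {
        PdfConfigSketch config = new PdfConfigSketch();
        // A misspelled method name here would be rejected by the compiler.
        config.setSortByPosition(true);
        return config.isSortByPosition();
    }
}
```

In this sketch the misspelled key is accepted and the setting is silently lost, while the equivalent mistake with the typed setter would not compile.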
> support parser parameters with type (int, double, etc) in configuration XML
> file
> --------------------------------------------------------------------------------
>
> Key: TIKA-1986
> URL: https://issues.apache.org/jira/browse/TIKA-1986
> Project: Tika
> Issue Type: Sub-task
> Components: config
> Reporter: Thamme Gowda
> Fix For: 1.14
>
>
> Tika Configuration should be enhanced to support basic types such as int,
> double, boolean, url, file.