I'm using it like this:
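// With no arguments, TikaConfig reads the file named by the tika.config
// system property or the TIKA_CONFIG environment variable.
TikaConfig tikaConfig = new TikaConfig();
final AutoDetectParser parser = new AutoDetectParser(tikaConfig);
final ParseContext parseContext = new ParseContext();
parseContext.set(AutoDetectParser.class, parser);
// pdfConfig and tessConfig are created earlier (not shown).
parseContext.set(PDFParserConfig.class, pdfConfig);
parseContext.set(TesseractOCRConfig.class, tessConfig);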
-----Original Message-----
From: Tim Allison <[email protected]>
Sent: Monday, February 8, 2021 5:31 PM
To: [email protected]
Subject: Re: Tika-config
How are you using the TikaConfig?
On Mon, Feb 8, 2021 at 4:11 PM Peter Kronenberg <[email protected]>
wrote:
>
> What is wrong with this?
>
> I specified the tika-config env variable. I know it works because if
> I make a syntax error in the tika-config.xml, it complains. So it’s
> finding the file, but it’s not applying the properties.
>
>
>
> I have this tika-config. I tried forward slashes instead of the double
> backslashes. Same result. No errors. It’s just not applying the values.
>
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <properties>
>   <parsers>
>     <parser class="org.apache.tika.parser.DefaultParser">
>     </parser>
>
>     <parser class="org.apache.tika.parser.ocr.TesseractOCRParser">
>       <params>
>         <param name="tesseractPath" type="string">c:\\tesseract_config</param>
>         <param name="tessdataPath" type="string">c:\\tessdata_config</param>
>       </params>
>     </parser>
>   </parsers>
> </properties>
>
>
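A side note on the paths in the quoted config: XML has no backslash escape, so a value written as c:\\tesseract_config reaches the application with both backslashes intact. The sketch below only illustrates that point using the JDK's own DOM parser. It does not use Tika at all, and the class name TikaConfigParams and the method readParams are made up for this example. Whether the doubled backslash matters to Tika or to Windows is a separate question that this does not answer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika: lists the <param> values in a config.
public class TikaConfigParams {

    // Returns name -> text content for every <param> element in the XML.
    static Map<String, String> readParams(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList params = doc.getElementsByTagName("param");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            result.put(param.getAttribute("name"), param.getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Same content as the quoted tika-config.xml; "\\\\" in a Java string
        // literal is two backslash characters, matching the file.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<properties><parsers>\n"
                + "<parser class=\"org.apache.tika.parser.ocr.TesseractOCRParser\"><params>\n"
                + "<param name=\"tesseractPath\" type=\"string\">c:\\\\tesseract_config</param>\n"
                + "<param name=\"tessdataPath\" type=\"string\">c:\\\\tessdata_config</param>\n"
                + "</params></parser>\n"
                + "</parsers></properties>";
        // Each value is printed exactly as the XML parser delivers it.
        readParams(xml).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running main prints each parameter name with the literal string the XML parser hands back, which makes it easy to compare against what the OCR parser ends up using.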