Like this.

        TikaConfig tikaConfig = new TikaConfig();

        final AutoDetectParser parser = new AutoDetectParser(tikaConfig);

        final ParseContext parseContext = new ParseContext();

        parseContext.set(AutoDetectParser.class, parser);
        parseContext.set(PDFParserConfig.class, pdfConfig);
        parseContext.set(TesseractOCRConfig.class, tessConfig);

-----Original Message-----
From: Tim Allison <[email protected]> 
Sent: Monday, February 8, 2021 5:31 PM
To: [email protected]
Subject: Re: Tika-config

How are you using the TikaConfig?

On Mon, Feb 8, 2021 at 4:11 PM Peter Kronenberg <[email protected]> 
wrote:
>
> What is wrong with this?
>
> I specified the tika-config env variable.  I know it works because if 
> I make a syntax error in the tika-config.xml, it complains.  So it’s 
> finding the file, but it’s not applying the properties.
>
>
>
> I have this tika-config.  I tried forward slashes instead of the double 
> backslashes.  Same result.  No errors.  It’s just not applying the values.
>
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <properties>
>     <parsers>
>         <parser class="org.apache.tika.parser.DefaultParser">
>         </parser>
>
>         <parser class="org.apache.tika.parser.ocr.TesseractOCRParser">
>             <params>
>                 <param name="tesseractPath" type="string">c:\\tesseract_config</param>
>                 <param name="tessdataPath" type="string">c:\\tessdata_config</param>
>             </params>
>         </parser>
>     </parsers>
> </properties>
>
>
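
One way to check which values the XML file actually carries, independent of Tika, is to read the <param> elements back with the JDK's own XML parser. The sketch below is only an illustration: the class name TikaConfigParams and the method readParams are made up for this example and are not part of Tika, and it says nothing about how Tika itself applies the values. It does show one thing that is certain: XML has no backslash escaping, so a value written as c:\\tesseract_config is read back with both backslashes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika: lists every <param> in a config file.
final class TikaConfigParams {

    // Parses the XML and returns param name -> text value, in document order.
    static Map<String, String> readParams(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        NodeList params = doc.getElementsByTagName("param");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            // The text is taken literally; backslashes are not unescaped by XML.
            result.put(param.getAttribute("name"), param.getTextContent().trim());
        }
        return result;
    }

    // Convenience overload for XML held in a string.
    static Map<String, String> readParams(String xml) throws Exception {
        return readParams(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Usage: java TikaConfigParams path/to/tika-config.xml
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            readParams(in).forEach((name, value) ->
                    System.out.println(name + " = " + value));
        }
    }
}
```

Running it against the tika-config.xml above should list tesseractPath and tessdataPath with exactly the text that appears between the tags.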
