Kevin Oberlag created TIKA-2290:
-----------------------------------

             Summary: PDFParser 'ocr' properties cannot be set via headers when using Tika JAXRS
                 Key: TIKA-2290
                 URL: https://issues.apache.org/jira/browse/TIKA-2290
             Project: Tika
          Issue Type: Bug
          Components: ocr, parser
    Affects Versions: 1.14, 1.13
            Reporter: Kevin Oberlag


I have posted a Stack Overflow question on this topic 
[here|http://stackoverflow.com/questions/42602834/x-tika-pdfocrstrategy-is-an-invalid-x-tika-ocr-header-error],
but I'll restate the main issue here.

I am trying to use TikaJAXRS and add headers to set PDFParser properties, 
specifically the ocrStrategy property. However, when I send the header 
X-Tika-PDFocrStrategy, I get an error stating that it is an invalid X-Tika-OCR 
header.
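
For reference, a minimal sketch of the kind of request that triggers the 
error. The server URL (http://localhost:9998/tika), the "ocr_only" value, and 
the class and method names are assumptions for illustration; only the header 
name X-Tika-PDFocrStrategy comes from this report. The request is built but 
not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class TikaOcrHeaderRequest {

    // Builds (but does not send) a PUT request that carries the PDF OCR
    // strategy header. serverUrl and strategy are supplied by the caller;
    // the values used in main() are assumptions, not taken from the report.
    public static HttpRequest build(String serverUrl, byte[] pdfBytes, String strategy) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serverUrl))
                .header("Accept", "text/plain")
                .header("X-Tika-PDFocrStrategy", strategy)
                .PUT(HttpRequest.BodyPublishers.ofByteArray(pdfBytes))
                .build();
    }

    public static void main(String[] args) {
        // Empty body used as a placeholder for the PDF bytes.
        HttpRequest req = build("http://localhost:9998/tika", new byte[0], "ocr_only");
        System.out.println(req.method() + " " + req.uri());
        System.out.println(req.headers().map());
    }
}
```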

After looking into the source code, I believe the issue might be with the 
'fillParseContext' method in the TikaResource.java file.

The if statement first looks for a key that starts with the OCR header prefix. 
Since the PDFParser property name contains 'ocr', the code appears to look for 
a property named 'ocrStrategy' in the OCRParser class, where it doesn't exist.
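
To illustrate the suspicion, here is a hypothetical sketch of how the order 
and breadth of the header checks could matter. This is not the code from 
TikaResource.java: the class, the method names, and the loose "contains" test 
are assumptions made up for the example, and X-Tika-OCRLanguage is used only 
as a sample OCR header. It shows a check that runs first and matches too 
broadly capturing a PDF header, next to a stricter variant that tests the 
exact prefixes.

```java
import java.util.Locale;

public class HeaderRoutingSketch {
    static final String OCR_PREFIX = "X-Tika-OCR";
    static final String PDF_PREFIX = "X-Tika-PDF";

    // Hypothetical over-broad routing: the OCR test runs first and accepts
    // any key that mentions "ocr", so PDF headers with 'ocr' in the property
    // name never reach the PDF branch.
    public static String routeLoose(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        if (lower.contains("ocr")) {
            return "ocr";
        }
        if (lower.startsWith(PDF_PREFIX.toLowerCase(Locale.ROOT))) {
            return "pdf";
        }
        return "ignored";
    }

    // Stricter routing: each branch matches only its own prefix
    // (case-insensitively, since HTTP header names are case-insensitive).
    public static String routeStrict(String key) {
        if (key.regionMatches(true, 0, PDF_PREFIX, 0, PDF_PREFIX.length())) {
            return "pdf";
        }
        if (key.regionMatches(true, 0, OCR_PREFIX, 0, OCR_PREFIX.length())) {
            return "ocr";
        }
        return "ignored";
    }

    public static void main(String[] args) {
        String header = "X-Tika-PDFocrStrategy";
        System.out.println("loose:  " + routeLoose(header));
        System.out.println("strict: " + routeStrict(header));
    }
}
```

With exact prefix matching, the two branches cannot overlap, so the order of 
the checks no longer affects which configuration receives the property.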



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
