Kevin Oberlag created TIKA-2290:
-----------------------------------
Summary: PDFParser 'ocr' properties cannot be set via headers when
using Tika JAXRS
Key: TIKA-2290
URL: https://issues.apache.org/jira/browse/TIKA-2290
Project: Tika
Issue Type: Bug
Components: ocr, parser
Affects Versions: 1.14, 1.13
Reporter: Kevin Oberlag
I have posted a Stack Overflow question on this topic
(http://stackoverflow.com/questions/42602834/x-tika-pdfocrstrategy-is-an-invalid-x-tika-ocr-header-error),
but I'll reiterate the main issue here.
I am trying to use TikaJAXRS and set PDFParser properties through request
headers, specifically the ocrStrategy property. However, when I add the
X-Tika-PDFocrStrategy header, I get an error stating that it is an invalid
X-Tika-OCR header.
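
For reference, a minimal sketch of the kind of request that triggers the error. The server URL (http://localhost:9998), the /tika endpoint, the 'ocr_only' value, and the class and method names are assumptions for illustration and are not taken from my actual setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper (not part of Tika): builds a PUT request that asks the
// server to apply a PDFParser OCR strategy via the X-Tika-PDFocrStrategy header.
final class PdfOcrHeaderRequest {

    static HttpRequest build(String serverUrl, byte[] pdfBytes, String ocrStrategy) {
        return HttpRequest.newBuilder(URI.create(serverUrl + "/tika"))
                // The header in question: "X-Tika-PDF" followed by the property name.
                .header("X-Tika-PDFocrStrategy", ocrStrategy)
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(pdfBytes))
                .build();
    }
}
```

Sending the built request with java.net.http.HttpClient against a running server should be enough to reproduce the invalid-header response described above.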
After looking into the source code, I believe the issue might be in the
'fillParseContext' method in the TikaResource.java file.
The if statement there checks first for a key that starts with the OCR header
prefix. Because the PDFParser's property name contains 'ocr', the header seems
to be routed to the OCR branch, which then tries to find a property named
'ocrStrategy' in the OCRParser class, where it doesn't exist.
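
To illustrate the suspected collision, here is a simplified sketch. It is not the actual TikaResource code: the class, the enum, and both routing methods are hypothetical, and the loose 'contains ocr' check is only a stand-in for whatever match lets the PDF header reach the OCR branch.

```java
import java.util.Locale;

// Hypothetical illustration of header routing; not taken from Tika's source.
final class HeaderDispatchSketch {
    enum Target { OCR, PDF, NONE }

    static final String OCR_PREFIX = "X-Tika-OCR";
    static final String PDF_PREFIX = "X-Tika-PDF";

    // Case-insensitive check that the key begins with the given prefix.
    static boolean hasPrefix(String key, String prefix) {
        return key.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    // Assumed faulty behavior: the OCR test runs first and is too permissive,
    // so any key mentioning "ocr" is claimed by the OCR branch.
    static Target routeLoosely(String key) {
        if (key.toLowerCase(Locale.ROOT).contains("ocr")) return Target.OCR;
        if (hasPrefix(key, PDF_PREFIX)) return Target.PDF;
        return Target.NONE;
    }

    // Possible remedy: match each prefix only at the start of the key.
    static Target routeStrictly(String key) {
        if (hasPrefix(key, PDF_PREFIX)) return Target.PDF;
        if (hasPrefix(key, OCR_PREFIX)) return Target.OCR;
        return Target.NONE;
    }

    // The property name is whatever follows the prefix.
    static String propertyName(String key, String prefix) {
        return key.substring(prefix.length());
    }

    public static void main(String[] args) {
        String key = "X-Tika-PDFocrStrategy";
        System.out.println(routeLoosely(key));             // prints "OCR"
        System.out.println(routeStrictly(key));            // prints "PDF"
        System.out.println(propertyName(key, PDF_PREFIX)); // prints "ocrStrategy"
    }
}
```

With anchored prefixes, X-Tika-PDFocrStrategy would reach the PDF branch and resolve to 'ocrStrategy', while a header such as X-Tika-OCRLanguage would still go to the OCR branch.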