[
https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Subhajit Das updated TIKA-3320:
-------------------------------
Description:
It seems that TikaServer 1.25 headers such as “X-Tika-PDFOcrStrategy” are
case-sensitive. The same can be confirmed for the latest main branch version.
This causes a problem in a system where request headers are automatically
lowercased before being passed down to TikaServer.
According to [https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2]
"Field names are case-insensitive"
The issue is due to the following:
First, a case-sensitive startsWith check is made for "X-Tika-PDF" or
"X-Tika-OCR". Then getDeclaredField of the respective config class is called to
get the field, and the setter method is invoked.
The same logic is kept in the newer TikaServer.
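
The failure mode described above can be reproduced with a minimal sketch.
DemoPdfConfig, PDF_PREFIX and resolves are hypothetical names used only for
illustration, not TikaServer code, and lower-casing the first character of the
header suffix to get the field name is an assumption about how the name is
derived.

```java
import java.lang.reflect.Field;

class CaseSensitiveLookupDemo {

    // Hypothetical stand-in for a Tika config class (not the real one).
    static class DemoPdfConfig {
        private String ocrStrategy;
    }

    static final String PDF_PREFIX = "X-Tika-PDF";

    // Exact-case prefix check followed by an exact-case field lookup,
    // as described in this issue.
    static boolean resolves(String headerName) {
        if (!headerName.startsWith(PDF_PREFIX)) {
            return false; // a lower-cased header already fails here
        }
        String suffix = headerName.substring(PDF_PREFIX.length());
        if (suffix.isEmpty()) {
            return false;
        }
        // Assumption: "OcrStrategy" maps to the field "ocrStrategy".
        String fieldName = Character.toLowerCase(suffix.charAt(0)) + suffix.substring(1);
        try {
            Field field = DemoPdfConfig.class.getDeclaredField(fieldName);
            return field != null;
        } catch (NoSuchFieldException e) {
            return false; // getDeclaredField matches names case-sensitively
        }
    }

    public static void main(String[] args) {
        System.out.println(resolves("X-Tika-PDFOcrStrategy"));
        System.out.println(resolves("x-tika-pdfocrstrategy"));
    }
}
```

Fixing only the prefix check would not be enough in this sketch, because
getDeclaredField("ocrstrategy") would still throw NoSuchFieldException.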
Possible solution:
Use a case-insensitive check for startsWith. For getDeclaredField, we can
assume that the config class has only one field for any given name,
irrespective of case, and find that field by comparing names
case-insensitively. Then derive the setter from the actual field name and
invoke it.
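
The proposed approach could look roughly like the sketch below. It is not a
patch against TikaServer: DemoPdfConfig, hasPrefix, findField, setterName and
apply are hypothetical names, only String-typed fields are handled, and a real
fix would also need to convert values for other field types.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Locale;

class CaseInsensitiveHeaderDemo {

    // Hypothetical stand-in for a Tika config class (not the real one).
    static class DemoPdfConfig {
        private String ocrStrategy;
        public void setOcrStrategy(String value) { this.ocrStrategy = value; }
        public String getOcrStrategy() { return ocrStrategy; }
    }

    static final String PDF_PREFIX = "X-Tika-PDF";

    // Case-insensitive replacement for headerName.startsWith(prefix).
    static boolean hasPrefix(String headerName, String prefix) {
        return headerName.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    // Finds the declared field whose name matches ignoring case, or null.
    static Field findField(Class<?> type, String name) {
        for (Field field : type.getDeclaredFields()) {
            if (field.getName().equalsIgnoreCase(name)) {
                return field;
            }
        }
        return null;
    }

    // Derives the setter name from the actual field name, e.g. setOcrStrategy.
    static String setterName(Field field) {
        String name = field.getName();
        return "set" + name.substring(0, 1).toUpperCase(Locale.ROOT) + name.substring(1);
    }

    // Returns true if the header was mapped to a field and its setter invoked.
    static boolean apply(Object config, String headerName, String value) {
        if (!hasPrefix(headerName, PDF_PREFIX)) {
            return false;
        }
        String suffix = headerName.substring(PDF_PREFIX.length());
        Field field = findField(config.getClass(), suffix);
        if (field == null) {
            return false;
        }
        try {
            Method setter = config.getClass().getMethod(setterName(field), field.getType());
            setter.invoke(config, value);
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot set " + field.getName(), e);
        }
    }

    public static void main(String[] args) {
        DemoPdfConfig config = new DemoPdfConfig();
        System.out.println(apply(config, "x-tika-pdfocrstrategy", "ocr_only"));
        System.out.println(config.getOcrStrategy());
    }
}
```

Scanning getDeclaredFields relies on the assumption stated above, that no two
fields of the config class differ only by case.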
was:
It seems that TikaServer 1.25 headers such as “X-Tika-PDFOcrStrategy” are
case-sensitive.
This causes a problem in a system where request headers are automatically
lowercased.
According to [https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2]
"Field names are case-insensitive"
The issue is due to the following:
First, a case-sensitive startsWith check is made for "X-Tika-PDF" or
"X-Tika-OCR". Then getDeclaredField of the respective config class is called to
get the field, and the setter method is invoked.
The same logic is kept in the newer TikaServer.
Possible solution:
Use a case-insensitive check for startsWith. For getDeclaredField, we can
assume that the config class has only one field for any given name,
irrespective of case, and find that field by comparing names
case-insensitively. Then derive the setter from the actual field name and
invoke it.
> TikaServer Header Name is Case-sensitive
> ----------------------------------------
>
> Key: TIKA-3320
> URL: https://issues.apache.org/jira/browse/TIKA-3320
> Project: Tika
> Issue Type: Bug
> Components: core, server
> Affects Versions: 1.25
> Reporter: Subhajit Das
> Priority: Minor
>
> It seems that TikaServer 1.25 headers such as “X-Tika-PDFOcrStrategy” are
> case-sensitive. The same can be confirmed for the latest main branch version.
> This causes a problem in a system where request headers are automatically
> lowercased before being passed down to TikaServer.
>
> According to [https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2]
> "Field names are case-insensitive"
>
> The issue is due to the following:
> First, a case-sensitive startsWith check is made for "X-Tika-PDF" or
> "X-Tika-OCR". Then getDeclaredField of the respective config class is called
> to get the field, and the setter method is invoked.
> The same logic is kept in the newer TikaServer.
>
> Possible solution:
> Use a case-insensitive check for startsWith. For getDeclaredField, we can
> assume that the config class has only one field for any given name,
> irrespective of case, and find that field by comparing names
> case-insensitively. Then derive the setter from the actual field name and
> invoke it.