[ https://issues.apache.org/jira/browse/TIKA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902494#action_12902494 ]
Robert Muir commented on TIKA-498:
----------------------------------
Thanks for taking a look, Chris.
There might be other problems too, but this one was detected easily by an
existing test.
In general, if the value is not going to be displayed but is used for
comparisons and such, it is safest to do the same for any
toUpperCase/toLowerCase calls.
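To illustrate why the default locale matters here, below is a minimal sketch
(not the code from TIKA-498.patch). The class name LocaleSafeCase and the
helper normalize are hypothetical, and the choice of Locale.ROOT is an
assumption; a fixed locale such as Locale.ENGLISH would serve the same purpose.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Tika.
class LocaleSafeCase {

    // Hypothetical helper: lower-cases an identifier (for example an HTML
    // tag name) for comparison, independent of the JVM default locale.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String tag = "TITLE";

        // Under Turkish casing rules, 'I' lower-cases to dotless 'ı' (U+0131),
        // so the result no longer matches the ASCII string "title".
        String localeSensitive = tag.toLowerCase(turkish);
        String localeNeutral = normalize(tag);

        System.out.println("title".equals(localeSensitive)); // false
        System.out.println("title".equals(localeNeutral));   // true
    }
}
```

The no-argument toLowerCase() behaves like the first call whenever the JVM
default locale is Turkish, which is what -Duser.language=tr simulates.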
I found this while trying to debug why SOLR-2088 fails. There is no evidence
that it's Tika's fault, but the integration with Solr has a regression that
appeared after we upgraded to 0.8-snapshot.
> HTML parser fails on turkish locale
> -----------------------------------
>
> Key: TIKA-498
> URL: https://issues.apache.org/jira/browse/TIKA-498
> Project: Tika
> Issue Type: Bug
> Components: parser
> Reporter: Robert Muir
> Assignee: Chris A. Mattmann
> Attachments: TIKA-498.patch
>
>
> To reproduce: mvn test -DargLine=-Duser.language=tr
> This is because it uses toLowerCase with the default Locale.