[ 
https://issues.apache.org/jira/browse/SOLR-12759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681857#comment-16681857
 ] 

Steve Rowe edited comment on SOLR-12759 at 11/9/18 7:18 PM:
------------------------------------------------------------

Looks like the regex needs another adjustment. Maybe relax it from 
"[A-Z]{3,}([+-]\\d\\d(:\\d\\d)?)?" to "[A-Za-z]{3,}([+-]\\d\\d(:\\d\\d)?)?" to 
allow for case-insensitive matching, which would match the currently 
problematic {{ChST}} timezone display name below?

See e.g. [https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Solaris/899]:

{noformat}
Checking out Revision d214f968d765e5c30c8782c5545c38d9aef487fe (refs/remotes/origin/branch_7x)
[...]
[java-info] java version "1.8.0_191"
[java-info] Java(TM) SE Runtime Environment (1.8.0_191-b12, Oracle Corporation)
[java-info] Java HotSpot(TM) 64-Bit Server VM (25.191-b12, Oracle Corporation)
[java-info] Test args: [-XX:-UseCompressedOops -XX:+UseConcMarkSweepGC]
[...]
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=ExtractingRequestHandlerTest -Dtests.seed=B4BB8D072ABBC41E -Dtests.slow=true -Dtests.locale=en-CA -Dtests.timezone=Pacific/Guam -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.00s J0 | ExtractingRequestHandlerTest (suite) <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: Is some other JVM affected?  Or bad regex? TzDisplayName: ChST
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([B4BB8D072ABBC41E]:0)
   [junit4]    >        at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.beforeClass(ExtractingRequestHandlerTest.java:50)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
{noformat}
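
To make the proposed relaxation concrete, here is a minimal standalone sketch comparing the two patterns quoted above against the {{ChST}} name from the failure. The class and method names ({{TzNameRegexDemo}}, {{matchesStrict}}, {{matchesRelaxed}}) and the extra sample inputs are hypothetical, chosen only for illustration; this is not the actual test code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of ExtractingRequestHandlerTest.
final class TzNameRegexDemo {
    // Pattern quoted in the comment as the current one: upper-case letters only.
    static final Pattern STRICT =
        Pattern.compile("[A-Z]{3,}([+-]\\d\\d(:\\d\\d)?)?");

    // Proposed relaxation: upper- and lower-case ASCII letters.
    static final Pattern RELAXED =
        Pattern.compile("[A-Za-z]{3,}([+-]\\d\\d(:\\d\\d)?)?");

    // matches() requires the whole input to match, like String.matches().
    static boolean matchesStrict(String name) {
        return STRICT.matcher(name).matches();
    }

    static boolean matchesRelaxed(String name) {
        return RELAXED.matcher(name).matches();
    }

    public static void main(String[] args) {
        // "ChST" is taken from the failure above; the other two are illustrative.
        for (String name : new String[] {"ChST", "PST", "GMT+10:00"}) {
            System.out.println(name
                + " strict=" + matchesStrict(name)
                + " relaxed=" + matchesRelaxed(name));
        }
    }
}
```

With these inputs, only {{ChST}} differs between the two patterns: the strict one rejects it because of the lower-case {{h}}, and the relaxed one accepts it.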





> Disable ExtractingRequestHandlerTest on JDK 11
> ----------------------------------------------
>
>                 Key: SOLR-12759
>                 URL: https://issues.apache.org/jira/browse/SOLR-12759
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - Solr Cell (Tika extraction)
>         Environment: JDK 11 and Tika 1.x
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>             Fix For: 7.6
>
>
> ExtractingRequestHandlerTest has failed on a JDK 11 RC due to two conspiring 
> problems: (A) Tika 1.x sometimes calls Date.toString() when extracting 
> metadata (unreleased 2.x will fix this), (B) JDK 11 RC has a bug in some 
> locales like Arabic in which a Date.toString() will have a timezone offset 
> using its locale's characters for the digits instead of using EN_US.  
> I'll add an "assume" check so we don't see failures about this.
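
As a rough sketch of the kind of check described above, the snippet below tests whether a timezone display name looks like plain ASCII letters with an optional numeric offset. It is an illustration under assumptions: obtaining the name via {{TimeZone.getDefault().getDisplayName(false, TimeZone.SHORT, Locale.US)}} is a guess at how such a check might read the name, and {{TzDisplayNameCheck}}, {{looksAsExpected}}, and the sample string with Arabic-Indic digits are hypothetical.

```java
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

// Hypothetical helper; the real test may obtain and check the name differently.
final class TzDisplayNameCheck {
    // Relaxed pattern from the comment above. Without UNICODE_CHARACTER_CLASS,
    // \d in java.util.regex matches only the ASCII digits 0-9.
    static final Pattern EXPECTED =
        Pattern.compile("[A-Za-z]{3,}([+-]\\d\\d(:\\d\\d)?)?");

    static boolean looksAsExpected(String tzDisplayName) {
        return EXPECTED.matcher(tzDisplayName).matches();
    }

    public static void main(String[] args) {
        // Assumption: short, non-daylight display name of the default zone.
        String current =
            TimeZone.getDefault().getDisplayName(false, TimeZone.SHORT, Locale.US);
        System.out.println(current + " -> " + looksAsExpected(current));

        // Illustrative offset written with Arabic-Indic digits (U+0660, U+0663).
        String localizedDigits = "GMT+\u0660\u0663:\u0660\u0660";
        System.out.println("localized digits -> " + looksAsExpected(localizedDigits));
    }
}
```

In a JUnit 4 test, a {{false}} result could feed something like {{Assume.assumeTrue(...)}} so that the suite is skipped on an affected JVM rather than failed.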


