[ https://issues.apache.org/jira/browse/LUCENE-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-7540:
---------------------------------------
    Attachment: LUCENE-7540.patch

I attempted to upgrade to ICU 58.1 (see attached patch) and ran
{{ant regenerate}}, but our evil {{checkRandomData}} test is tripping
assertions in ICU's {{RuleBasedBreakIterator.java}}:

{noformat}
   [junit4]   2> ??? 16, 2016 6:56:39 ? com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[Thread-3,5,TGRP-TestICUTokenizer]
   [junit4]   2> java.lang.AssertionError
   [junit4]   2>        at __randomizedtesting.SeedInfo.seed([34D64859D1A7CD98]:0)
   [junit4]   2>        at com.ibm.icu.text.RuleBasedBreakIterator.checkDictionary(RuleBasedBreakIterator.java:544)
   [junit4]   2>        at com.ibm.icu.text.RuleBasedBreakIterator.next(RuleBasedBreakIterator.java:428)
   [junit4]   2>        at org.apache.lucene.analysis.icu.segmentation.BreakIteratorWrapper$RBBIWrapper.next(BreakIteratorWrapper.java:96)
   [junit4]   2>        at org.apache.lucene.analysis.icu.segmentation.CompositeBreakIterator.next(CompositeBreakIterator.java:65)
   [junit4]   2>        at org.apache.lucene.analysis.icu.segmentation.ICUTokenizer.incrementTokenBuffer(ICUTokenizer.java:210)
   [junit4]   2>        at org.apache.lucene.analysis.icu.segmentation.ICUTokenizer.incrementToken(ICUTokenizer.java:104)
   [junit4]   2>        at org.apache.lucene.analysis.icu.ICUNormalizer2Filter.incrementToken(ICUNormalizer2Filter.java:80)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:183)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:301)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:305)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:829)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:628)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:61)
   [junit4]   2>        at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:496)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestICUTokenizer -Dtests.method=testRandomHugeStrings -Dtests.seed=34D64859D1A7CD98 -Dtests.locale=ar-QA -Dtests.timezone=Africa/Bujumbura -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
{noformat}

I had previously built icu4c 58.1 from source and installed it on my dev
box so that its generation tools (e.g. {{gennorm2}}) are available, so maybe I
messed something up in that process, or maybe this is an ICU bug?

> Upgrade ICU to 58.1
> -------------------
>
>                 Key: LUCENE-7540
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7540
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: master (7.0), 6.4
>
>         Attachments: LUCENE-7540.patch
>
>
> ICU is up to 58.1, but our ICU analysis components currently use 56.1, which 
> is ~1 year old by now.


