[ 
https://issues.apache.org/jira/browse/LUCENE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792325#comment-13792325
 ] 

Robert Muir commented on LUCENE-5269:
-------------------------------------

The test needs some improvement. After backporting, I ran the tests about 30 
times and hit this failure:

ant test  -Dtestcase=TestBugInSomething 
-Dtests.method=testUnicodeShinglesAndNgrams -Dtests.seed=1BFA8BADE39EDF70 
-Dtests.slow=true -Dtests.locale=th_TH_TH_#u-nu-thai 
-Dtests.timezone=Europe/Copenhagen -Dtests.file.encoding=US-ASCII

{noformat}
   [junit4] Suite: org.apache.lucene.analysis.core.TestBugInSomething
   [junit4]   2> TEST FAIL: useCharFilter=true text='ike to thank the rap'
   [junit4]   2> ?.?. ??, ???? ?:??:?? ?????????? 
com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
 uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: 
Thread[Thread-2,5,TGRP-TestBugInSomething]
   [junit4]   2> java.lang.OutOfMemoryError: GC overhead limit exceeded
   [junit4]   2>        at 
__randomizedtesting.SeedInfo.seed([1BFA8BADE39EDF70]:0)
   [junit4]   2>        at 
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:269)
   [junit4]   2>        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696)
   [junit4]   2>        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
   [junit4]   2>        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:57)
   [junit4]   2>        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:476)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBugInSomething 
-Dtests.method=testUnicodeShinglesAndNgrams -Dtests.seed=1BFA8BADE39EDF70 
-Dtests.slow=true -Dtests.locale=th_TH_TH_#u-nu-thai 
-Dtests.timezone=Europe/Copenhagen -Dtests.file.encoding=US-ASCII
   [junit4] ERROR   30.6s | TestBugInSomething.testUnicodeShinglesAndNgrams <<<
   [junit4]    > Throwable #1: java.lang.RuntimeException: some thread(s) failed
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:526)
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:428)
   [junit4]    >        at 
org.apache.lucene.analysis.core.TestBugInSomething.testUnicodeShinglesAndNgrams(TestBugInSomething.java:255)
   [junit4]    >        at java.lang.Thread.run(Thread.java:724)Throwable #2: 
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=12, name=Thread-2, state=RUNNABLE, 
group=TGRP-TestBugInSomething]
   [junit4]    > Caused by: java.lang.OutOfMemoryError: GC overhead limit 
exceeded
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([1BFA8BADE39EDF70]:0)
   [junit4]    >        at 
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:269)
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696)
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:57)
   [junit4]    >        at 
org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:476)
   [junit4]   2> NOTE: test params are: 
codec=DummyCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=DUMMY,
 chunkSize=313), 
termVectorsFormat=CompressingTermVectorsFormat(compressionMode=DUMMY, 
chunkSize=313)), sim=RandomSimilarityProvider(queryNorm=true,coord=crazy): {}, 
locale=th_TH_TH_#u-nu-thai, timezone=Europe/Copenhagen
   [junit4]   2> NOTE: Linux 3.5.0-27-generic amd64/Oracle Corporation 1.7.0_25 
(64-bit)/cpus=8,threads=1,free=155107808,total=477233152
   [junit4]   2> NOTE: All tests run in this JVM: [TestBugInSomething]
   [junit4] Completed in 30.92s, 1 test, 1 error <<< FAILURES!
   [junit4] 
   [junit4] 
   [junit4] Tests with failures:
   [junit4]   - 
org.apache.lucene.analysis.core.TestBugInSomething.testUnicodeShinglesAndNgrams
{noformat}

I will see if I can make a less-ridiculous version of the test that still fails 
with the bug.

> TestRandomChains failure
> ------------------------
>
>                 Key: LUCENE-5269
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5269
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>             Fix For: 4.5.1, 4.6, 5.0
>
>         Attachments: LUCENE-5269.patch, LUCENE-5269.patch, LUCENE-5269.patch, 
> LUCENE-5269_test.patch, LUCENE-5269_test.patch, LUCENE-5269_test.patch
>
>
> One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or 
> possibly only the combination of them conspiring together.


