[
https://issues.apache.org/jira/browse/LUCENE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792325#comment-13792325
]
Robert Muir commented on LUCENE-5269:
-------------------------------------
The test needs some improvement. After backporting, I ran the tests about 30
times and hit this failure:
ant test -Dtestcase=TestBugInSomething
-Dtests.method=testUnicodeShinglesAndNgrams -Dtests.seed=1BFA8BADE39EDF70
-Dtests.slow=true -Dtests.locale=th_TH_TH_#u-nu-thai
-Dtests.timezone=Europe/Copenhagen -Dtests.file.encoding=US-ASCII
{noformat}
[junit4] Suite: org.apache.lucene.analysis.core.TestBugInSomething
[junit4] 2> TEST FAIL: useCharFilter=true text='ike to thank the rap'
[junit4] 2> ?.?. ??, ???? ?:??:?? ??????????
com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
uncaughtException
[junit4] 2> WARNING: Uncaught exception in thread:
Thread[Thread-2,5,TGRP-TestBugInSomething]
[junit4] 2> java.lang.OutOfMemoryError: GC overhead limit exceeded
[junit4] 2> at
__randomizedtesting.SeedInfo.seed([1BFA8BADE39EDF70]:0)
[junit4] 2> at
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:269)
[junit4] 2> at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696)
[junit4] 2> at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
[junit4] 2> at
org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:57)
[junit4] 2> at
org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:476)
[junit4] 2>
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestBugInSomething
-Dtests.method=testUnicodeShinglesAndNgrams -Dtests.seed=1BFA8BADE39EDF70
-Dtests.slow=true -Dtests.locale=th_TH_TH_#u-nu-thai
-Dtests.timezone=Europe/Copenhagen -Dtests.file.encoding=US-ASCII
[junit4] ERROR 30.6s | TestBugInSomething.testUnicodeShinglesAndNgrams <<<
[junit4] > Throwable #1: java.lang.RuntimeException: some thread(s) failed
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:526)
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:428)
[junit4] > at
org.apache.lucene.analysis.core.TestBugInSomething.testUnicodeShinglesAndNgrams(TestBugInSomething.java:255)
[junit4] > at java.lang.Thread.run(Thread.java:724)Throwable #2:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught
exception in thread: Thread[id=12, name=Thread-2, state=RUNNABLE,
group=TGRP-TestBugInSomething]
[junit4] > Caused by: java.lang.OutOfMemoryError: GC overhead limit
exceeded
[junit4] > at
__randomizedtesting.SeedInfo.seed([1BFA8BADE39EDF70]:0)
[junit4] > at
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:269)
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696)
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:57)
[junit4] > at
org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:476)
[junit4] 2> NOTE: test params are:
codec=DummyCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=DUMMY,
chunkSize=313),
termVectorsFormat=CompressingTermVectorsFormat(compressionMode=DUMMY,
chunkSize=313)), sim=RandomSimilarityProvider(queryNorm=true,coord=crazy): {},
locale=th_TH_TH_#u-nu-thai, timezone=Europe/Copenhagen
[junit4] 2> NOTE: Linux 3.5.0-27-generic amd64/Oracle Corporation 1.7.0_25
(64-bit)/cpus=8,threads=1,free=155107808,total=477233152
[junit4] 2> NOTE: All tests run in this JVM: [TestBugInSomething]
[junit4] Completed in 30.92s, 1 test, 1 error <<< FAILURES!
[junit4]
[junit4]
[junit4] Tests with failures:
[junit4] -
org.apache.lucene.analysis.core.TestBugInSomething.testUnicodeShinglesAndNgrams
{noformat}
I will see if I can make a less-ridiculous version of the test that still fails
with the bug.
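
For a rough sense of how quickly n-gram output can grow with token length, here
is a minimal sketch in plain Java. It is one plausible reason a test like this
gets heavy, not a diagnosis of the OOM above. NGramGrowth and ngramCount are
hypothetical names, no Lucene classes are used, and the sketch assumes every
gram size from minGram to maxGram is emitted at every start offset.

```java
// Hypothetical helper, not part of Lucene: it only counts character n-grams.
final class NGramGrowth {

    /** Number of n-grams of sizes minGram..maxGram in a token of the given length. */
    static long ngramCount(int tokenLength, int minGram, int maxGram) {
        long total = 0;
        for (int n = minGram; n <= maxGram; n++) {
            // A token of length L contains (L - n + 1) grams of size n, or none if n > L.
            total += Math.max(0, tokenLength - n + 1);
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative lengths and gram sizes, not the values used by the test.
        for (int len : new int[] {10, 100, 400}) {
            System.out.println("length=" + len + " ngrams(1..20)=" + ngramCount(len, 1, 20));
        }
    }
}
```

Long tokens, such as shingles built from several terms, multiply this count, so
a smaller gram range or shorter random text should keep a reduced test cheaper.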
> TestRandomChains failure
> ------------------------
>
> Key: LUCENE-5269
> URL: https://issues.apache.org/jira/browse/LUCENE-5269
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Fix For: 4.5.1, 4.6, 5.0
>
> Attachments: LUCENE-5269.patch, LUCENE-5269.patch, LUCENE-5269.patch,
> LUCENE-5269_test.patch, LUCENE-5269_test.patch, LUCENE-5269_test.patch
>
>
> One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or
> possibly only the combination of them conspiring together.