[ 
https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691682#comment-16691682
 ] 

Mike Sokolov commented on LUCENE-8517:
--------------------------------------

Hmm sadly it does still repro for me even with LUCENE-8564 applied. I haven't 
had a chance to look closely this morning. In the meantime I'll post an update 
to the PR addressing your comments, and for now I'll leave in the exclusion for 
{{FIxedShingleFilter}}, although I hope we'll get to the bottom of it.

> TestRandomChains.testRandomChainsWithLargeStrings failure
> ---------------------------------------------------------
>
>                 Key: LUCENE-8517
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8517
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Steve Rowe
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> From 
> [https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/2828/consoleText], 
> reproduces for me on Java8:
> {noformat}
> Checking out Revision 216f10026b86627750e133fe24ce6a750c470695 
> (refs/remotes/origin/branch_7x)
> [...]
> [java-info] java version "10.0.1"
> [java-info] OpenJDK Runtime Environment (10.0.1+10, Oracle Corporation)
> [java-info] OpenJDK 64-Bit Server VM (10.0.1+10, Oracle Corporation)
> [java-info] Test args: [-XX:-UseCompressedOops -XX:+UseConcMarkSweepGC]
> [...]
>    [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
>    [junit4]   2> Exception from random analyzer: 
>    [junit4]   2> charfilters=
>    [junit4]   2>   
> org.apache.lucene.analysis.charfilter.MappingCharFilter(org.apache.lucene.analysis.charfilter.NormalizeCharMap@3ef95503,
>  java.io.StringReader@70dde633)
>    [junit4]   2>   
> org.apache.lucene.analysis.fa.PersianCharFilter(org.apache.lucene.analysis.charfilter.MappingCharFilter@12423b20)
>    [junit4]   2> tokenizer=
>    [junit4]   2>   org.apache.lucene.analysis.th.ThaiTokenizer()
>    [junit4]   2> filters=
>    [junit4]   2>   
> org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter(ValidatingTokenFilter@7914bba7
>  
> term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,
>  org.apache.lucene.analysis.compound.hyphenation.HyphenationTree@abd7bca)
>    [junit4]   2>   
> Conditional:org.apache.lucene.analysis.MockGraphTokenFilter(java.util.Random@56348091,
>  OneTimeWrapper@aa1c073 
> term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1)
>    [junit4]   2>   
> Conditional:org.apache.lucene.analysis.shingle.FixedShingleFilter(OneTimeWrapper@4cf58fce
>  
> term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,
>  4, <NUM>, <SOUTHEAST_ASIAN>)
>    [junit4]   2>   
> org.apache.lucene.analysis.pt.PortugueseLightStemFilter(ValidatingTokenFilter@3a915324
>  
> term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,keyword=false)
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains 
> -Dtests.method=testRandomChainsWithLargeStrings -Dtests.seed=92344C536D4E00F4 
> -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=en-ZW 
> -Dtests.timezone=Atlantic/Faroe -Dtests.asserts=true 
> -Dtests.file.encoding=US-ASCII
>    [junit4] ERROR   0.46s J2 | 
> TestRandomChains.testRandomChainsWithLargeStrings <<<
>    [junit4]    > Throwable #1: java.lang.IllegalStateException: stage 3: 
> inconsistent startOffset at pos=0: 0 vs 5; token=effort
>    [junit4]    >      at 
> __randomizedtesting.SeedInfo.seed([92344C536D4E00F4:F86FF34234002007]:0)
>    [junit4]    >      at 
> org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:109)
>    [junit4]    >      at 
> org.apache.lucene.analysis.pt.PortugueseLightStemFilter.incrementToken(PortugueseLightStemFilter.java:48)
>    [junit4]    >      at 
> org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:68)
>    [junit4]    >      at 
> org.apache.lucene.analysis.BaseTokenStreamTestCase.checkResetException(BaseTokenStreamTestCase.java:441)
>    [junit4]    >      at 
> org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:546)
>    [junit4]    >      at 
> org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:897)
>    [junit4]    >      at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    [junit4]    >      at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>    [junit4]    >      at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    [junit4]    >      at 
> java.base/java.lang.reflect.Method.invoke(Method.java:564)
>    [junit4]    >      at java.base/java.lang.Thread.run(Thread.java:844)
>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {dummy=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))},
>  docValues:{}, maxPointsInLeafNode=214, maxMBSortInHeap=5.729405811878087, 
> sim=RandomSimilarity(queryNorm=true): {}, locale=en-ZW, 
> timezone=Atlantic/Faroe
>    [junit4]   2> NOTE: Linux 4.15.0-32-generic amd64/Oracle Corporation 
> 10.0.1 (64-bit)/cpus=8,threads=1,free=266844648,total=518979584
>    [junit4]   2> NOTE: All tests run in this JVM: [TestOptionalCondition, 
> TestSerbianNormalizationRegularFilter, TestCommonGramsFilterFactory, 
> TestDoubleEscape, TestDictionaryCompoundWordTokenFilterFactory, 
> TestNorwegianMinimalStemFilter, TestCzechStemmer, 
> TestTurkishLowerCaseFilterFactory, TestAnalyzers, 
> TestScandinavianFoldingFilterFactory, TestReversePathHierarchyTokenizer, 
> TestSimplePatternTokenizer, TestGalicianMinimalStemFilter, MinHashFilterTest, 
> TestPortugueseStemFilterFactory, TestPersianCharFilter, 
> TestPerFieldAnalyzerWrapper, TestRandomChains]
>    [junit4] Completed [73/291 (1!)] on J2 in 3.12s, 2 tests, 1 error <<< 
> FAILURES!
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to