[jira] [Commented] (LUCENE-8376) TestRandomChains.testRandomChains() failure

2018-07-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530010#comment-16530010
 ] 

ASF subversion and git services commented on LUCENE-8376:
---------------------------------------------------------

Commit 3a7ca355fce227bc3194ae32abf263b9152aec63 in lucene-solr's branch 
refs/heads/branch_7x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a7ca35 ]

LUCENE-8376, LUCENE-8371: ConditionalTokenFilter fixes


> TestRandomChains.testRandomChains() failure
> -------------------------------------------
>
> Key: LUCENE-8376
> URL: https://issues.apache.org/jira/browse/LUCENE-8376
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Steve Rowe
> Assignee: Alan Woodward
> Priority: Major
>
> From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/253/]:
> {noformat}
> Checking out Revision 9a395f83ccd83bca568056f178757dd032007140 
> (refs/remotes/origin/branch_7x)
> [...]
>[junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
>[junit4]   2> TEST FAIL: useCharFilter=false text='Protein_Data_Bank|PD'
>[junit4]   2> Exception from random analyzer: 
>[junit4]   2> charfilters=
>[junit4]   2> tokenizer=
>[junit4]   2>   
> org.apache.lucene.analysis.pattern.PatternTokenizer(org.apache.lucene.util.AttributeFactory$DefaultAttributeFactory@2efbbefd,
>  a, -14)
>[junit4]   2> filters=
>[junit4]   2>   
> org.apache.lucene.analysis.miscellaneous.TypeAsSynonymFilter(ValidatingTokenFilter@519cd943
>  term=,bytes=[],startOffset=0,endOffset=0,type=word,positionIncrement=1, 
> )
>[junit4]   2>   
> org.apache.lucene.analysis.ru.RussianLightStemFilter(ValidatingTokenFilter@12e97d93
>  
> term=,bytes=[],startOffset=0,endOffset=0,type=word,positionIncrement=1,keyword=false)
>[junit4]   2>   
> Conditional:org.apache.lucene.analysis.synonym.SynonymGraphFilter(OneTimeWrapper@4264e89
>  
> term=,bytes=[],startOffset=0,endOffset=0,type=word,positionIncrement=1,keyword=false,positionLength=1,
>  org.apache.lucene.analysis.synonym.SynonymMap@3469385d, true)
>[junit4]   2>   
> org.apache.lucene.analysis.miscellaneous.TypeAsSynonymFilter(ValidatingTokenFilter@6084b824
>  
> term=,bytes=[],startOffset=0,endOffset=0,type=word,positionIncrement=1,keyword=false,positionLength=1)
>[junit4]   2> NOTE: download the large Jenkins line-docs file by running 
> 'ant get-jenkins-line-docs' in the lucene directory.
>[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains 
> -Dtests.method=testRandomChains -Dtests.seed=E5D6D73E34CFBE1F 
> -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
>  -Dtests.locale=sl -Dtests.timezone=America/Boise -Dtests.asserts=true 
> -Dtests.file.encoding=US-ASCII
>[junit4] ERROR   93.8s J2 | TestRandomChains.testRandomChains <<<
>[junit4]> Throwable #1: java.lang.IllegalStateException: last stage: 
> inconsistent startOffset at pos=2: 12 vs 15; token=word
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([E5D6D73E34CFBE1F:D837FE5F73DDA3DF]:0)
>[junit4]>  at 
> org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:109)
>[junit4]>  at 
> org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:748)
>[junit4]>  at 
> org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:659)
>[junit4]>  at 
> org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:561)
>[junit4]>  at 
> org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:866)
>[junit4]>  at java.lang.Thread.run(Thread.java:748)
>[junit4]   2> NOTE: leaving temporary files on disk at: 
> /home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/checkout/lucene/build/analysis/common/test/J2/temp/lucene.analysis.core.TestRandomChains_E5D6D73E34CFBE1F-001
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {dummy=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))},
>  docValues:{}, maxPointsInLeafNode=864, maxMBSortInHeap=5.854194972498171, 
> sim=RandomSimilarity(queryNorm=true): {dummy=IB LL-LZ(0.3)}, locale=sl, 
> timezone=America/Boise
>[junit4]   2> NOTE: Linux 4.4.0-112-generic amd64/Oracle Corporation 
> 1.8.0_172 (64-bit)/cpus=4,threads=1,free=132673784,total=233832448
>[junit4]   2> NOTE: All tests run in this JVM: [TestMultiWordSynonyms, 
> TestIndonesianAnalyzer, TestMorphData, TestLimitTokenCountFilter, 
> TestWikipediaTokenizerFactory, TestFlagNum, TestPortugueseAnalyzer, 
> TestGalicianStemFilterFactory, TestCollationKeyAnalyzer, 
> TestStrangeOvergeneration,

[jira] [Commented] (LUCENE-8376) TestRandomChains.testRandomChains() failure

2018-07-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530012#comment-16530012
 ] 

ASF subversion and git services commented on LUCENE-8376:
---------------------------------------------------------

Commit f835d2499778972ad901a6be11ecf6ef308c0bb0 in lucene-solr's branch 
refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f835d24 ]

LUCENE-8376, LUCENE-8371: ConditionalTokenFilter fixes



[jira] [Commented] (LUCENE-8376) TestRandomChains.testRandomChains() failure

2018-07-02 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529791#comment-16529791
 ] 

Alan Woodward commented on LUCENE-8376:
---------------------------------------

There was a bug in ConditionalTokenFilter where it could incorrectly think that 
it had processed the last token, and therefore didn't need to use offsets from 
end().  I'll commit a fix shortly.
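
To make the failure mode concrete, here is a minimal toy model in plain Java. It is not the ConditionalTokenFilter code and does not use Lucene's TokenStream API: ToyStream, reportedFinalOffset and the believesLastTokenProcessed flag are hypothetical names invented for illustration, and how the real filter arrived at the wrong belief is not modelled. It only shows that skipping end() leaves the final offset at the last token's end offset rather than at the true end of the input.

```java
public class EndOffsetSketch {

    /** Hypothetical stand-in for a token stream; not Lucene's TokenStream API. */
    static final class ToyStream {
        private final int[] tokenEndOffsets; // end offset of each token, in order
        private final int finalOffset;       // offset that end() reports
        private int next = 0;
        private int endOffset = 0;

        ToyStream(int[] tokenEndOffsets, int finalOffset) {
            this.tokenEndOffsets = tokenEndOffsets;
            this.finalOffset = finalOffset;
        }

        /** Advances to the next token; returns false once the input is exhausted. */
        boolean incrementToken() {
            if (next == tokenEndOffsets.length) {
                return false;
            }
            endOffset = tokenEndOffsets[next++];
            return true;
        }

        /** Sets the end-of-stream offset, which may lie past the last token. */
        void end() {
            endOffset = finalOffset;
        }

        int endOffset() {
            return endOffset;
        }
    }

    /**
     * Drains the stream and returns the final offset a wrapper would report.
     * A wrapper that believes the last token was already processed skips end()
     * (assumption: this flag stands in for the faulty state described above).
     */
    static int reportedFinalOffset(ToyStream in, boolean believesLastTokenProcessed) {
        while (in.incrementToken()) {
            // Consume tokens; only the offsets matter in this sketch.
        }
        if (!believesLastTokenProcessed) {
            in.end();
        }
        return in.endOffset();
    }

    public static void main(String[] args) {
        // Text "foo bar  " is 9 chars: tokens end at 3 and 7, end() reports 9.
        int[] ends = {3, 7};
        System.out.println(reportedFinalOffset(new ToyStream(ends, 9), false)); // prints 9
        System.out.println(reportedFinalOffset(new ToyStream(ends, 9), true));  // prints 7
    }
}
```

In this toy, the second call under-reports the final offset by the length of the trailing whitespace. A downstream consistency check that compares offsets across stages could then see a mismatch of the kind reported in the stack trace, although the exact 12 vs 15 values there are not reproduced here.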
