Is the character processing here all done by the charfilter, or does it use some encoding methods from the JDK?
When I looked at it, it looked like a JVM bug.

On Sun, Nov 23, 2014 at 1:04 PM, Steve Rowe <[email protected]> wrote:
> This is the same line in the same test that failed on Windows under a
> 1.8.0_20 JVM five days ago
> <http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4439/>, but in a
> different way.
>
> This test's input is the string "�" - HTML character
> references for U+D86D U+E28F - and the expected output is the char sequence
> U+FFFD U+E28F (the Unicode replacement character followed by the second
> input char).
>
> In the Windows failure, the output was U+D86D U+E28F (improperly paired high
> surrogate).
>
> In this Linux failure, the output is U+2B68F (properly paired UTF-16 U+D86D
> U+DE8F).
>
> Very weird.
>
> I'm beasting this suite now on Windows under Oracle JVM 1.8.0_20 to see if I
> can get it to fail. No dice so far after 140 trials.
>
>
> On Sun, Nov 23, 2014 at 6:19 AM, Policeman Jenkins Server
> <[email protected]> wrote:
>>
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11492/
>> Java: 32bit/jdk1.8.0_20 -server -XX:+UseParallelGC (asserts: false)
>>
>> 1 tests failed.
>> FAILED:
>> org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates
>>
>> Error Message:
>> term 0 expected:<[�]> but was:<[𫚏]>
>>
>> Stack Trace:
>> org.junit.ComparisonFailure: term 0 expected:<[�]> but was:<[𫚏]>
>>     at __randomizedtesting.SeedInfo.seed([CF8F65E969B602B9:93CFDF3CEB58ED83]:0)
>>     at org.junit.Assert.assertEquals(Assert.java:125)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:180)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:295)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:299)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:303)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:353)
>>     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:362)
>>     at org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates(HTMLStripCharFilterTest.java:600)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:483)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
>>     at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>     at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>>     at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>>     at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
>>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
>>     at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
>>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>     at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>     at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>>     at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>     at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
>>     at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>     at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>>     at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Build Log:
>> [...truncated 5753 lines...]
>> [junit4] Suite: org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest
>> [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=HTMLStripCharFilterTest -Dtests.method=testUTF16Surrogates -Dtests.seed=CF8F65E969B602B9 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=th_TH -Dtests.timezone=PLT -Dtests.asserts=false -Dtests.file.encoding=UTF-8
>> [junit4] FAILURE 0.07s J0 | HTMLStripCharFilterTest.testUTF16Surrogates <<<
>> [junit4] > Throwable #1: org.junit.ComparisonFailure: term 0 expected:<[�]> but was:<[𫚏]>
>> [junit4] > at __randomizedtesting.SeedInfo.seed([CF8F65E969B602B9:93CFDF3CEB58ED83]:0)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:180)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:295)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:299)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:303)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:353)
>> [junit4] > at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:362)
>> [junit4] > at org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates(HTMLStripCharFilterTest.java:600)
>> [junit4] > at java.lang.Thread.run(Thread.java:745)
>> [junit4] 2> NOTE: test params are: codec=Asserting(Lucene50): {dummy=BlockTreeOrds(blocksize=128)}, docValues:{}, sim=DefaultSimilarity, locale=th_TH, timezone=PLT
>> [junit4] 2> NOTE: Linux 3.13.0-39-generic i386/Oracle Corporation 1.8.0_20 (32-bit)/cpus=8,threads=1,free=88329216,total=222035968
>> [junit4] 2> NOTE: All tests run in this JVM:
>> [TestPatternReplaceCharFilter, TestArabicNormalizationFilter,
>> TestPatternReplaceCharFilterFactory, TestWikipediaTokenizerFactory,
>> TestCondition2, TestIrishLowerCaseFilterFactory, TestGalicianStemFilter,
>> TestWordlistLoader, TestElisionFilterFactory, TestLengthFilter,
>> TestGermanLightStemFilterFactory, EdgeNGramTokenFilterTest,
>> TestSerbianNormalizationFilterFactory, TestPortugueseLightStemFilter,
>> TestSwedishLightStemFilterFactory, TestPatternReplaceFilterFactory,
>> TestElision, TestCzechStemFilterFactory, TestSpanishLightStemFilter,
>> TestSingleTokenTokenFilter, TestHindiStemmer, TestKeepWordFilter,
>> TestLimitTokenCountFilter, TestShingleFilterFactory, TestTrimFilter,
>> TestCapitalizationFilterFactory, TestFactories,
>> TestGalicianMinimalStemFilterFactory, TestFlagLong, TestIgnore,
>> TestGermanMinimalStemFilterFactory, TestUAX29URLEmailTokenizerFactory,
>> TestPatternCaptureGroupTokenFilter, TestAlternateCasing, TestCzechAnalyzer,
>> TestOnlyInCompound, TestPersianNormalizationFilter,
>> TestGermanNormalizationFilterFactory, WikipediaTokenizerTest,
>> TestMultiWordSynonyms, TestTruncateTokenFilter, TestPersianAnalyzer,
>> TestArabicAnalyzer, TestRemoveDuplicatesTokenFilter,
>> TestSoraniStemFilterFactory, TestPorterStemFilterFactory,
>> TestCodepointCountFilterFactory, TokenTypeSinkTokenizerTest,
>> TestSoraniAnalyzer, TestApostropheFilter, QueryAutoStopWordAnalyzerTest,
>> TestTwoSuffixes, TestScandinavianFoldingFilterFactory, TestArmenianAnalyzer,
>> TestFinnishAnalyzer, TestFlagNum, TestIndonesianStemmer,
>> TestLimitTokenCountAnalyzer, TestScandinavianNormalizationFilterFactory,
>> TestReversePathHierarchyTokenizer, TestGalicianMinimalStemFilter,
>> TestPersianNormalizationFilterFactory, TestNeedAffix,
>> TestGermanLightStemFilter, TestLimitTokenPositionFilterFactory,
>> TestStopFilterFactory, TestMappingCharFilter, HTMLStripCharFilterTest]
>> [junit4] Completed on J0 in 2.12s, 31 tests, 1 failure <<< FAILURES!
>>
>> [...truncated 403 lines...]
>> BUILD FAILED
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:525: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:473: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:61: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/extra-targets.xml:39: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/build.xml:452: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:2141: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/analysis/build.xml:106: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/analysis/build.xml:38: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/module-build.xml:58: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:1359: The following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:966: There were test failures: 270 suites, 1408 tests, 1 failure, 1 ignored
>>
>> Total time: 30 minutes 5 seconds
>> Build step 'Invoke Ant' marked build as failure
>> [description-setter] Description set: Java: 32bit/jdk1.8.0_20 -server -XX:+UseParallelGC (asserts: false)
>> Archiving artifacts
>> Recording test results
>> Email was triggered for: Failure - Any
>> Sending email for trigger: Failure - Any
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
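
For anyone following along, the code-unit arithmetic behind the outcomes quoted above can be checked with plain JDK calls. This is only a sketch: the class name SurrogateCheck and its describe method are made up for illustration, and it is not how HTMLStripCharFilter is implemented. It assumes the rule described in the thread, namely that a surrogate without a valid partner becomes U+FFFD while other chars pass through unchanged.

```java
public class SurrogateCheck {
    // Hypothetical helper: reports how two UTF-16 code units would be read.
    // A valid high/low pair is combined into one supplementary code point;
    // otherwise any lone surrogate is shown as U+FFFD (assumed rule, see above).
    public static String describe(char first, char second) {
        if (Character.isSurrogatePair(first, second)) {
            int codePoint = Character.toCodePoint(first, second);
            return String.format("U+%04X", codePoint);
        }
        char a = Character.isSurrogate(first) ? (char) 0xFFFD : first;
        char b = Character.isSurrogate(second) ? (char) 0xFFFD : second;
        return String.format("U+%04X U+%04X", (int) a, (int) b);
    }

    public static void main(String[] args) {
        // Test input: high surrogate followed by a private-use char.
        System.out.println(describe((char) 0xD86D, (char) 0xE28F)); // U+FFFD U+E28F
        // Linux failure: second unit is the low surrogate 0xDE8F instead.
        System.out.println(describe((char) 0xD86D, (char) 0xDE8F)); // U+2B68F
    }
}
```

The first call matches the expected output of the test, and the second reproduces the value seen in the Linux failure, which is only reachable if the second code unit is 0xDE8F rather than 0xE28F.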
