And, for the record: enwiki does indeed contain an odd field with a
super-long term that looks like this:
13:24:08.000
{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=1680}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=738}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=358}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=197}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=305}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=59}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=482}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}=
{{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=613}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=361}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=141}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=34}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=484}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p8n|}}}={{{p8v|}}}|{{{p9n|}}}={{{p9v|}}}|{{{p10n|}}}={{{p10v|}}}|{{{mun|1}}}=1723}}{{{{{substc|}}}{{{1}}}|{{{p1n|}}}=
[snip]
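
If we do go the route of teaching the test framework to avoid such
lines, the check itself could be quite small. The snippet below is only
a rough sketch and not a patch against LineFileDocs: the class and
method names (ImmenseTermCheck, longestTokenBytes, hasImmenseTerm) are
made up for illustration, the 32766 limit is copied from the exception
message quoted below, and it assumes tokens are split on whitespace,
which may not match what every analyzer used by the tests does.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Lucene): flags lines whose tokens are too long to index. */
final class ImmenseTermCheck {
  // Limit copied from the exception message: "bytes can be at most 32766 in length".
  static final int MAX_TERM_BYTES = 32766;

  /** Returns the UTF-8 length, in bytes, of the longest whitespace-delimited token in the line. */
  static int longestTokenBytes(String line) {
    int max = 0;
    // Assumption: whitespace tokenization; a real analyzer may split differently.
    for (String token : line.split("\\s+")) {
      max = Math.max(max, token.getBytes(StandardCharsets.UTF_8).length);
    }
    return max;
  }

  /** True if at least one token in the line would exceed the term length limit. */
  static boolean hasImmenseTerm(String line) {
    return longestTokenBytes(line) > MAX_TERM_BYTES;
  }
}
```

A caller could skip (or truncate) any line for which hasImmenseTerm
returns true before building a document from it. Whether skipping or
truncating is the better behavior for the tests is an open question.
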
On Fri, Apr 22, 2022 at 11:57 PM Dawid Weiss <[email protected]> wrote:
>
> This actually reproduces (if you download enwiki). I wonder if we
> should tune LineFileDocs so that it avoids trying to add humongous
> terms.
>
> D.
>
> On Wed, Apr 20, 2022 at 3:42 AM Apache Jenkins Server
> <[email protected]> wrote:
> >
> > Build:
> > https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.1/42/
> >
> > 1 tests failed.
> > FAILED: org.apache.lucene.index.TestAllFilesCheckIndexHeader.test
> >
> > Error Message:
> > java.lang.IllegalArgumentException: Document contains at least one immense
> > term in field="body" (whose UTF8 encoding is longer than the max length
> > 32766), all of which were skipped. Please correct the analyzer to not
> > produce such terms. The prefix of the first immense term is: '[125, 125,
> > 123, 123, 123, 123, 123, 115, 117, 98, 115, 116, 99, 124, 125, 125, 125,
> > 123, 123, 123, 49, 125, 125, 125, 124, 123, 123, 123, 112, 49]...',
> > original message: bytes can be at most 32766 in length; got 94384
> >
> > Stack Trace:
> > java.lang.IllegalArgumentException: Document contains at least one immense
> > term in field="body" (whose UTF8 encoding is longer than the max length
> > 32766), all of which were skipped. Please correct the analyzer to not
> > produce such terms. The prefix of the first immense term is: '[125, 125,
> > 123, 123, 123, 123, 123, 115, 117, 98, 115, 116, 99, 124, 125, 125, 125,
> > 123, 123, 123, 49, 125, 125, 125, 124, 123, 123, 123, 112, 49]...',
> > original message: bytes can be at most 32766 in length; got 94384
> > at
> > __randomizedtesting.SeedInfo.seed([34ECEDA648B62DC2:BCB8D27CE64A403A]:0)
> > at
> > org.apache.lucene.index.IndexingChain$PerField.invert(IndexingChain.java:1242)
> > at
> > org.apache.lucene.index.IndexingChain.processField(IndexingChain.java:729)
> > at
> > org.apache.lucene.index.IndexingChain.processDocument(IndexingChain.java:620)
> > at
> > org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:241)
> > at
> > org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:432)
> > at
> > org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1531)
> > at
> > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1816)
> > at
> > org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1469)
> > at
> > org.apache.lucene.tests.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:222)
> > at
> > org.apache.lucene.index.TestAllFilesCheckIndexHeader.test(TestAllFilesCheckIndexHeader.java:58)
> > at
> > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > Method)
> > at
> > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at
> > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
> > at
> > org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
> > at
> > org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> > at
> > org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> > at
> > org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> > at
> > org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> > at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> > at
> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> > com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
> > at
> > com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819)
> > at
> > com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887)
> > at
> > com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898)
> > at
> > org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> > at
> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> > org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> > at
> > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> > at
> > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> > at
> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> > org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> > at
> > org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> > at
> > org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> > at
> > org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> > at
> > org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
> > at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> > at
> > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> > at
> > com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
> > at
> > com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:826)
> > at java.base/java.lang.Thread.run(Thread.java:834)
> > Suppressed: java.lang.IllegalStateException: close() called in
> > wrong state: INCREMENT
> > at
> > org.apache.lucene.tests.analysis.MockTokenizer.fail(MockTokenizer.java:136)
> > at
> > org.apache.lucene.tests.analysis.MockTokenizer.close(MockTokenizer.java:327)
> > at
> > org.apache.lucene.analysis.TokenFilter.close(TokenFilter.java:58)
> > at
> > org.apache.lucene.index.IndexingChain$PerField.invert(IndexingChain.java:1136)
> > ... 48 more
> > Caused by:
> > org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
> > can be at most 32766 in length; got 94384
> > at
> > app//org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:258)
> > at
> > app//org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:193)
> > at
> > app//org.apache.lucene.index.IndexingChain$PerField.invert(IndexingChain.java:1224)
> > ... 48 more
> >
> >
> >
> >
> > Build Log:
> > [...truncated 573 lines...]
> > ERROR: The following test(s) have failed:
> > - org.apache.lucene.index.TestAllFilesCheckIndexHeader.test (:lucene:core)
> > Test output:
> > /home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-9.1/checkout/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestAllFilesCheckIndexHeader.txt
> > Reproduce with: gradlew :lucene:core:test --tests
> > "org.apache.lucene.index.TestAllFilesCheckIndexHeader.test" -Ptests.jvms=4
> > -Ptests.haltonfailure=false -Ptests.jvmargs=-XX:TieredStopAtLevel=1
> > -Ptests.seed=34ECEDA648B62DC2 -Ptests.multiplier=2 -Ptests.nightly=true
> > -Ptests.badapples=false -Ptests.file.encoding=ISO-8859-1
> > -Ptests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-9.1/test-data/enwiki.random.lines.txt
> >
> >
> > BUILD SUCCESSFUL in 1h 49m 55s
> > 223 actionable tasks: 223 executed
> > Build step 'Invoke Gradle script' changed build result to SUCCESS
> > Archiving artifacts
> > java.lang.InterruptedException: no matches found within 10000
> > at hudson.FilePath$ValidateAntFileMask.hasMatch(FilePath.java:3079)
> > at hudson.FilePath$ValidateAntFileMask.invoke(FilePath.java:2958)
> > at hudson.FilePath$ValidateAntFileMask.invoke(FilePath.java:2939)
> > at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3329)
> > Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to
> > lucene-solr-2
> > at
> > hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1797)
> > at
> > hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
> > at hudson.remoting.Channel.call(Channel.java:1001)
> > at hudson.FilePath.act(FilePath.java:1165)
> > at hudson.FilePath.act(FilePath.java:1154)
> > at hudson.FilePath.validateAntFileMask(FilePath.java:2937)
> > at
> > hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:268)
> > at
> > hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:78)
> > at
> > hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
> > at hudson.model.Build$BuildExecution.post2(Build.java:178)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
> > at hudson.model.Run.execute(Run.java:1913)
> > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
> > at
> > hudson.model.ResourceController.execute(ResourceController.java:99)
> > at hudson.model.Executor.run(Executor.java:432)
> > Caused: hudson.FilePath$TunneledInterruptedException
> > at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3331)
> > at hudson.remoting.UserRequest.perform(UserRequest.java:211)
> > at hudson.remoting.UserRequest.perform(UserRequest.java:54)
> > at hudson.remoting.Request$2.run(Request.java:376)
> > at
> > hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > at java.lang.Thread.run(Thread.java:748)
> > Caused: java.lang.InterruptedException: java.lang.InterruptedException: no
> > matches found within 10000
> > at hudson.FilePath.act(FilePath.java:1167)
> > at hudson.FilePath.act(FilePath.java:1154)
> > at hudson.FilePath.validateAntFileMask(FilePath.java:2937)
> > at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:268)
> > at
> > hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:78)
> > at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
> > at hudson.model.Build$BuildExecution.post2(Build.java:178)
> > at
> > hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
> > at hudson.model.Run.execute(Run.java:1913)
> > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
> > at
> > hudson.model.ResourceController.execute(ResourceController.java:99)
> > at hudson.model.Executor.run(Executor.java:432)
> > No artifacts found that match the file pattern
> > "**/*.events,heapdumps/**,**/hs_err_pid*". Configuration error?
> > Recording test results
> > [Checks API] No suitable checks publisher found.
> > Build step 'Publish JUnit test result report' changed build result to
> > UNSTABLE
> > Email was triggered for: Unstable (Test Failures)
> > Sending email for trigger: Unstable (Test Failures)
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]