[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475566#comment-13475566 ]

Uwe Schindler commented on LUCENE-4482:
---

{quote}
The patch at least isolates the JVM bug even if it's not exactly a minimal test

Somehow the idfExplain method, which is overridden in this test's BoostingSimilarity, fails to be called (the super.idfExplain is called instead), which leads to the test failures.
{quote}

So it is definitely a JVM and not a Lucene bug! Have you reported it?

I would run Zing tests, too, but before doing that they should: Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I will never ever run it with outdated kernels. They should (if they really need a kernel module, which is in my opinion a no-go, too) use DKMS and make the kernel module open source, so my kernel is also not tainted.

Without that I will not support Zing, sorry. But I doubt if the kernel module is really needed! Without a clear explanation why this is needed on their homepage I don't agree.

Likely Zing JVM bug causes failures in TestPayloadNearQuery
---

Key: LUCENE-4482
URL: https://issues.apache.org/jira/browse/LUCENE-4482
Project: Lucene - Core
Issue Type: Bug
Environment: Lucene trunk, rev 1397735
Zing:
{noformat}
java version 1.6.0_31
Java(TM) SE Runtime Environment (build 1.6.0_31-6)
Java HotSpot(TM) 64-Bit Tiered VM (build 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
{noformat}
Ubuntu 12.04 LTS
3.2.0-23-generic kernel
Reporter: Michael McCandless
Attachments: LUCENE-4482.patch

I dug into one of the Lucene test failures when running with Zing JVM (available free for open source devs...). At least one other test sometimes fails but I haven't dug into that yet.
I managed to get the failure easily reproduced: with the attached patch, on rev 1397735 checkout, if you cd to lucene/core and run:
{noformat}
ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 -Dtests.showSuccess=true
{noformat}
Then you'll hit several failures in TestPayloadNearQuery, eg:
{noformat}
Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
1 FAILED
2 NOTE: reproduce with: ant test -Dtestcase=TestPayloadNearQuery -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
ERROR 0.01s | TestPayloadNearQuery.test
Throwable #1: java.lang.RuntimeException: overridden idfExplain method in TestPayloadNearQuery.BoostingSimilarity was not called
at __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
at org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
at org.apache.lucene.search.spans.SpanWeight.<init>(SpanWeight.java:62)
at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.<init>(PayloadNearQuery.java:147)
at org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
at org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
at org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9691/

1 tests failed.
REGRESSION: org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded

Error Message:
Captured an uncaught exception in thread: Thread[id=117, name=Thread-88, state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=117, name=Thread-88, state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]
Caused by: java.lang.OutOfMemoryError: Java heap space
at __randomizedtesting.SeedInfo.seed([3DC69507A6600F79]:0)
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at java.lang.StringBuilder.append(StringBuilder.java:128)
at org.apache.lucene.store.MockIndexInputWrapper.<init>(MockIndexInputWrapper.java:37)
at org.apache.lucene.store.MockIndexInputWrapper.clone(MockIndexInputWrapper.java:72)
at org.apache.lucene.store.MockIndexInputWrapper.clone(MockIndexInputWrapper.java:28)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$SegmentDocsEnumBase.<init>(Lucene40PostingsReader.java:329)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$AllDocsSegmentDocsEnum.<init>(Lucene40PostingsReader.java:511)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader.newDocsEnum(Lucene40PostingsReader.java:247)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader.docs(Lucene40PostingsReader.java:228)
at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(BlockTreeTermsReader.java:2188)
at org.apache.lucene.index.FilterAtomicReader$FilterTermsEnum.docs(FilterAtomicReader.java:188)
at org.apache.lucene.index.AssertingAtomicReader$AssertingTermsEnum.docs(AssertingAtomicReader.java:122)
at org.apache.lucene.index.MultiTermsEnum.docs(MultiTermsEnum.java:403)
at org.apache.lucene.index.TermsEnum.docs(TermsEnum.java:157)
at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:86)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:322)
at org.apache.lucene.search.AssertingIndexSearcher$1.scorer(AssertingIndexSearcher.java:80)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:587)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:293)
at org.apache.lucene.search.TestTimeLimitingCollector.search(TestTimeLimitingCollector.java:124)
at org.apache.lucene.search.TestTimeLimitingCollector.doTestSearch(TestTimeLimitingCollector.java:139)
at org.apache.lucene.search.TestTimeLimitingCollector.access$200(TestTimeLimitingCollector.java:42)
at org.apache.lucene.search.TestTimeLimitingCollector$1.run(TestTimeLimitingCollector.java:292)

Build Log:
[...truncated 624 lines...]
[junit4:junit4] Suite: org.apache.lucene.search.TestTimeLimitingCollector
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=680 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=760 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=720 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=720 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=760 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is slower than usual): lastDoc=1 , allowed=51 , elapsed=820 >= 658 = 7.0 * ( 2*resolution + TIME_ALLOWED + SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4] 1 Informative: timeout exceeded (no action required: most probably just because the test machine is
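
For reference, the threshold quoted in these "Informative" lines is 7.0 * (2*resolution + TIME_ALLOWED + SLOW_DOWN) with resolution=20, TIME_ALLOWED=51 and SLOW_DOWN=3. A minimal sketch of that arithmetic follows. The class and method names and the parameter name `slack` are hypothetical, and reading each elapsed value as being compared against the threshold is an interpretation based on the "timeout exceeded" wording, not something taken from the test source.

```java
// Hypothetical helper reproducing the arithmetic shown in the log lines.
// Only the numbers (7.0, 20, 51, 3, 680) come from the log itself.
class TimeoutThreshold {

    // slack * (2*resolution + timeAllowed + slowDown), all times in milliseconds.
    static double maxTimeMillis(double slack, long resolution, long timeAllowed, long slowDown) {
        return slack * (2 * resolution + timeAllowed + slowDown);
    }

    // Assumption: the informative message is logged when elapsed reaches the maximum.
    static boolean exceeded(long elapsedMillis, double maxMillis) {
        return elapsedMillis >= maxMillis;
    }

    public static void main(String[] args) {
        double max = maxTimeMillis(7.0, 20, 51, 3);
        System.out.println(max);                // 7.0 * 94 = 658.0
        System.out.println(exceeded(680, max)); // true
    }
}
```

With these inputs every elapsed value in the log (680 through 820) is above 658, which is consistent with the message being informational on a slow machine rather than a test failure.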
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_35) - Build # 1725 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/1725/
Java: 32bit/jdk1.6.0_35 -client -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 23382 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:342: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:65: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:511: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/common-build.xml:1910: Can't get https://issues.apache.org/jira/rest/api/2/project/LUCENE to /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/changes/jiraVersionList.json

Total time: 30 minutes 31 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_35 -client -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475624#comment-13475624 ]

Michael McCandless commented on LUCENE-4482:

bq. So it is definitely a JVM and not a Lucene bug!

I'm pretty sure: the JVM is failing to call the overridden method in a subclass (calling the base class method instead). No other JVMs fail here, and Zing won't fail if you run the test in isolation ... and it doesn't always fail if you run all tests (ie it seems to depend on the seed).

bq. Have you reported it?

Working on it ... trying to get an account at http://www.azulsystems.com/developers/bugzilla/

{quote}
I would run Zing tests, too, but before doing that they should: Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I will never ever run it with outdated kernels. They should (if they really need a kernel module, which is in my opinion a no-go, too) use DKMS and make the kernel module open source, so my kernel is also not tainted.

Without that I will not support Zing, sorry. But I doubt if the kernel module is really needed! Without a clear explanation why this is needed on their homepage I don't agree.
{quote}

I agree: it's crazy it only runs as binary module on old kernel versions ... they know this is a showstopper (I've complained about it several times...) and they're working on it.
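
To make the symptom concrete, here is a minimal Java sketch of the dispatch that is expected to happen. The class names (`DispatchCheck`, `BaseSimilarity`, `BoostingSimilaritySketch`) are hypothetical stand-ins rather than Lucene's classes, and the check is only loosely modeled on the exception message in the report; the attached patch itself is not reproduced here.

```java
// Hypothetical stand-ins for TFIDFSimilarity and the test's BoostingSimilarity.
// Names and signatures are illustrative assumptions, not Lucene's API.
class DispatchCheck {

    static class BaseSimilarity {
        // Base implementation; subclasses are expected to override this.
        String idfExplain() {
            return "base";
        }

        // Calls the virtual method and complains if a subclass instance
        // nevertheless ended up in the base implementation.
        String computeWeight() {
            String result = idfExplain();
            if (getClass() != BaseSimilarity.class && "base".equals(result)) {
                throw new RuntimeException("overridden idfExplain method was not called");
            }
            return result;
        }
    }

    static class BoostingSimilaritySketch extends BaseSimilarity {
        @Override
        String idfExplain() {
            return "boosting";
        }
    }

    public static void main(String[] args) {
        BaseSimilarity sim = new BoostingSimilaritySketch();
        // A JVM that dispatches correctly prints "boosting".
        System.out.println(sim.computeWeight());
    }
}
```

On a correctly behaving JVM the exception branch is unreachable for this subclass, which is why a failure of this kind points at the JVM rather than at the calling code.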
Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!
I can't repro (standalone or running all tests w/ same JVM count and seed: ant -Dtests.jvms=8 clean test-core -Dtests.seed=3DC69507A6600F79).

This build seems not to save OOM heap dumps. I realize these are space consuming ... but can we eg save up to N of them (ie delete oldest ones first)? This way we at least have a shot of seeing what was taking so much RAM...

Mike McCandless

http://blog.mikemccandless.com
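
The "save up to N, delete oldest first" idea could be sketched roughly as below. This is only an illustration under stated assumptions: the class name `HeapDumpPruner` is hypothetical, the dumps are assumed to be `.hprof` files collected in a single directory, and the code uses `java.nio.file` and streams, which are newer than the JVMs these builds ran on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: keeps at most maxToKeep heap dumps in a directory.
class HeapDumpPruner {

    // Deletes the oldest *.hprof files so that at most maxToKeep remain.
    // Returns the paths that were deleted, oldest first.
    static List<Path> prune(Path dir, int maxToKeep) throws IOException {
        List<Path> dumps;
        try (Stream<Path> files = Files.list(dir)) {
            dumps = files
                .filter(p -> p.getFileName().toString().endsWith(".hprof"))
                .sorted(Comparator.comparingLong(HeapDumpPruner::lastModified))
                .collect(Collectors.toList());
        }
        List<Path> deleted = new ArrayList<>();
        int excess = dumps.size() - Math.max(0, maxToKeep);
        for (int i = 0; i < excess; i++) {
            Files.delete(dumps.get(i));
            deleted.add(dumps.get(i));
        }
        return deleted;
    }

    private static long lastModified(Path p) {
        try {
            return Files.getLastModifiedTime(p).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: HeapDumpPruner <dir> <maxToKeep>");
            return;
        }
        for (Path p : prune(Paths.get(args[0]), Integer.parseInt(args[1]))) {
            System.out.println("deleted " + p);
        }
    }
}
```

A build job could run something like this after each test run, with the JVM started with HotSpot's `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath` pointing at the pruned directory. Whether that fits the disk budget of the Jenkins machines discussed here is not something this sketch can answer.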
[jira] [Commented] (LUCENE-4446) Switch to BlockPostingsFormat for Lucene 4.1
[ https://issues.apache.org/jira/browse/LUCENE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475636#comment-13475636 ]

Robert Muir commented on LUCENE-4446:
-

Branch is ready for merging (/lucene/dev/branches/lucene4446).

Switch to BlockPostingsFormat for Lucene 4.1

Key: LUCENE-4446
URL: https://issues.apache.org/jira/browse/LUCENE-4446
Project: Lucene - Core
Issue Type: Improvement
Components: core/codecs
Reporter: Robert Muir
Fix For: 4.1

This has baked for some time: no crazy fails in hudson or anything. The code (in my opinion) is actually a lot simpler than the current postings format, it's faster, the indexes are smaller, and so on. We should probably spend some time just going over the code and adding some more tests and such but I think it's time to start looking at cutting over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
RE: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!
I disabled them a while ago also for Apache Jenkins. The problem is: One of the real tests produces the OOM dump, too (I think the crash-my-JVM one). So it always consumes this space!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!
This one is now a nightly-only test! So maybe we can safely enable this for the hourly builds?
[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475643#comment-13475643 ]

Gil Tene commented on LUCENE-4482:
--

We're looking into this bug report. Will hopefully report back / resolve it soon. [But Michael, please go ahead and report it on our bugzilla as well per the above].

[Uwe Schindler wrote:]
I would run Zing tests, too, but before doing that they should: Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I will never ever run it with outdated kernels. They should (if they really need a kernel module, which is in my opinion a no-go, too) use DKMS and make the kernel module open source, so my kernel is also not tainted. Without that I will not support Zing, sorry. But I doubt if the kernel module is really needed! Without a clear explanation why this is needed on their homepage I don't agree.

This has two parts: one questioning why our loadable module is needed at all, and the other relating to its availability for various kernels and Linux distros.

1. Why is the ZST (which includes a loadable module) needed for Zing to operate?

One of Zing JVM's main distinctions is that its C4 garbage collector (aka GPGC internally) eliminates garbage collection as a response time concern for enterprise applications. Among other things, C4 relies on rapid manipulation of virtual memory and physical memory mappings to maintain continuous operation. While the semantics of the manipulations we do are possible using the vanilla mmap/mremap/munmap/madvise APIs, the rate at which those are supported in Linux (and most other OSs) is extremely low, due mostly to the historic, extremely conservative approach to in-process TLB invalidation, and due partly to issues with multiple-page size manipulations. We're not talking small change here. More like 4-6 orders of magnitude for our common operation, which is, right now, the difference between a practical and impractical implementation of C4. You can find a detailed discussion of the difference in metrics for these operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site http://tinyurl.com/7rydpvo).

2. Loadable Module availability and compatibility

To be clear, our loadable module is open source, under GPLv2, and you can have the sources for it if you wish. The reason for the current choice of packaging is that a wide range of current end-customers' Linux systems do not have (or wish to install) the tooling needed to build or re-build the module, and what they need operationally is an RPM that opens and installs without requiring kernel headers and the like. In addition, we tend to intensively test and examine the kernel module against specific distros and kernels to verify compatibility and stability, and declare official support for these well tested combinations.

On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is fairly slow, and the kernel API signatures tend to remain the same unless semantics are actually modified. As a result, we use a single module RPM for RHEL5 and CentOS 5 versions, and have only needed a single rev of the module packaging during the evolution of RHEL6/CentOS6 and SLES 11 thus far.

As we added Zing support for Ubuntu, primarily due to its popularity with developers, we found that kernel API signatures there change with practically every patch, even with no semantic change. This creates some serious friction with our current loadable module packaging and distribution choice for Ubuntu. We are working to resolve this, either by using DKMS or some other alternative, such that modules can continue to work or be properly updated as kernels rev up in Ubuntu-style distros.
So we're working on it, and it will get better...
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475643#comment-13475643 ] Gil Tene edited comment on LUCENE-4482 at 10/13/12 4:09 PM:
We're looking into this bug report. Will hopefully report back / resolve it soon. [But Michael, please go ahead and report it on our bugzilla as well per the above].
[Uwe Schindler wrote:]
{quote} I would run Zing tests, too, but before doing that they should: Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I will never ever run it with outdated kernels. They should (if they really need a kernel module, which is in my opinion a no-go, too) use DKMS and make the kernel module open source, so my kernel is also not tainted. Without that I will not support Zing, sorry. But I doubt if the kernel module is really needed! Without a clear explanation why this is needed on their homepage I don't agree. {quote}
This has two parts: one questioning why our loadable module is needed at all, and the other relating to its availability for various kernels and Linux distros.
1. Why is the ZST (which includes a loadable module) needed for Zing to operate?
One of the Zing JVM's main distinctions is that its C4 garbage collector (aka GPGC internally) eliminates garbage collection as a response time concern for enterprise applications. Among other things, C4 relies on rapid manipulation of virtual memory and physical memory mappings to maintain continuous operation. While the semantics of the manipulations we do are possible using the vanilla mmap/mremap/munmap/madvise APIs, the rate at which those are supported in Linux (and most other OSs) is extremely low, due mostly to the historic, extremely conservative approach to in-process TLB invalidation, and due partly to issues with manipulating multiple page sizes. We're not talking small change here. More like 4-6 orders of magnitude for our common operation, which is, right now, the difference between a practical and an impractical implementation of C4. You can find a detailed discussion of the difference in metrics for these operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site http://tinyurl.com/7rydpvo).
2. Loadable module availability and compatibility
To be clear, our loadable module is open source, under GPLv2, and you can have the sources for it if you wish. The reason for the current choice of packaging is that a wide range of current end customers' Linux systems do not have (or wish to install) the tooling needed to build or rebuild the module, and what they need operationally is an RPM that opens and installs without requiring kernel headers and the like. In addition, we tend to intensively test and examine the kernel module against specific distros and kernels to verify compatibility and stability, and declare official support for these well-tested combinations.
On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is fairly slow, and the kernel API signatures tend to remain the same unless semantics are actually modified. As a result, we use a single module RPM for RHEL5 and CentOS 5 versions, and have only needed a single rev of the module packaging during the evolution of RHEL6/CentOS6 and SLES 11 thus far.
As we added Zing support for Ubuntu, primarily due to its popularity with developers, we found that kernel API signatures there change with practically every patch, even with no semantic change. This creates some serious friction with our current loadable module packaging and distribution choice for Ubuntu. We are working to resolve this, either by using DKMS or some other alternative, such that modules can continue to work or be properly updated as kernels rev up in Ubuntu-style distros.
So we're working on it, and it will get better... -- Gil. [CTO, Azul Systems]
[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475658#comment-13475658 ] Uwe Schindler commented on LUCENE-4482: ---
Thanks, Gil, for the explanation. Maybe it would be a good idea to provide both C4 memory management layers, so that it also works on plain kernels (as a configuration option to the JVM, like huge pages in Oracle's). Or is your VM then only as fast as Oracle's?
bq. As we added Zing support for Ubuntu, primarily due to its popularity with developers, we found that kernel API signatures there change with practically every patch, even with no semantic change. This creates some serious friction with our current loadable module packaging and distribution choice for Ubuntu. We are working to resolve this, either by using DKMS or some other alternative, such that modules can continue to work or be properly updated as kernels rev up in Ubuntu-style distros.
VirtualBox has similar requirements and uses DKMS. They ship a deb package that contains the source code of their kernel module. It is rebuilt automatically on every kernel installation. DKMS itself depends on a compiler and kernel headers, so you only need to depend on the DKMS module and almost nothing more. The Jenkins server also runs Windows in a virtual machine, and therefore uses their kernel module, too. As VirtualBox's module has use cases similar to yours for virtualization, I hope yours does not conflict with that one.
By the way, Ubuntu LTS is also very popular on servers!
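For readers unfamiliar with the DKMS approach described above, here is a minimal sketch of the kind of `dkms.conf` a source-shipping package could carry. Everything specific in it is an assumption for illustration: the package name `example-mm`, the version, and the make invocations are hypothetical and are not Azul's or VirtualBox's actual packaging.

```shell
# dkms.conf: hypothetical sketch, sourced by dkms as shell-style assignments.
# "example-mm" / "example_mm" are made-up names, not a real module.
PACKAGE_NAME="example-mm"
PACKAGE_VERSION="1.0"

# Name of the .ko file the build produces, and where dkms installs it.
BUILT_MODULE_NAME[0]="example_mm"
DEST_MODULE_LOCATION[0]="/updates/dkms"

# Build against the headers of whichever kernel dkms is targeting.
MAKE[0]="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build modules"
CLEAN="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build clean"

# Ask dkms to rebuild and reinstall the module when a new kernel is installed.
AUTOINSTALL="yes"
```

With the sources unpacked under `/usr/src/example-mm-1.0/`, such a module would typically be registered and built with `dkms add`, `dkms build`, and `dkms install`, each taking `-m example-mm -v 1.0`. The `AUTOINSTALL` setting is what gives the automatic rebuild on kernel updates that the comment describes.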
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475658#comment-13475658 ] Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 4:56 PM: -
Thanks, Gil, for the explanation. Maybe it would be a good idea to provide both C4 memory management layers, so that it also works on plain kernels (as a configuration option to the JVM, like huge pages in Oracle's). Or is your VM then only as fast as Oracle's?
bq. As we added Zing support for Ubuntu, primarily due to its popularity with developers, we found that kernel API signatures there change with practically every patch, even with no semantic change. This creates some serious friction with our current loadable module packaging and distribution choice for Ubuntu. We are working to resolve this, either by using DKMS or some other alternative, such that modules can continue to work or be properly updated as kernels rev up in Ubuntu-style distros.
VirtualBox has similar requirements and uses DKMS. They ship a deb package that contains the source code of their kernel module. It is rebuilt automatically on every kernel installation. DKMS itself depends on a compiler and suggests kernel headers, so you only need to depend on the dkms module and linux-headers, and almost nothing more. The Jenkins server also runs Windows in a virtual machine, and therefore uses their kernel module, too. As VirtualBox's module has use cases similar to yours for virtualization, I hope yours does not conflict with that one.
By the way, Ubuntu LTS is also very popular on servers!
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475658#comment-13475658 ] Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 4:57 PM: -
Thanks, Gil, for the explanation. Maybe it would be a good idea to provide both C4 memory management layers, so that it also works on plain kernels (as a configuration option to the JVM, like huge pages in Oracle's). Or is your VM then only as fast as Oracle's?
bq. As we added Zing support for Ubuntu, primarily due to its popularity with developers, we found that kernel API signatures there change with practically every patch, even with no semantic change. This creates some serious friction with our current loadable module packaging and distribution choice for Ubuntu. We are working to resolve this, either by using DKMS or some other alternative, such that modules can continue to work or be properly updated as kernels rev up in Ubuntu-style distros.
VirtualBox has similar requirements and uses DKMS. They ship a deb package that contains the source code of their kernel module. It is rebuilt automatically on every kernel installation. DKMS itself depends on a compiler and suggests kernel headers, so you only need to depend on the dkms package and linux-headers, and almost nothing more. The Jenkins server also runs Windows in a virtual machine, and therefore uses their kernel module, too. As VirtualBox's module has use cases similar to yours for virtualization, I hope yours does not conflict with that one.
By the way, Ubuntu LTS is also very popular on servers!
[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666 ] Gil Tene commented on LUCENE-4482: --
bq. Maybe it would be a good idea to provide both C4 - Memory Management layers, so also for plain kernels (as configuration option to the JVM like huge pages in Oracle's). Or is your VM then only as fast as Oracle's?
It's not so much a matter of speed as it is a matter of pause time. Zing is not faster than Oracle's JVM; it's just as fast but without those pesky pauses. It's those pauses that keep people from using anything more than a tiny amount of memory in Java these days (to me, tiny means a small fraction of a commodity $4K server). With the ability to practically (i.e. without completely stopping for many seconds at a time once in a while) use the nice, cheap memory we now have in servers comes another form of speed: the kind that comes from not repeating work.
A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management API would be unusable at this point. Think of the difference between a 20 usec phase shift and a 20 second pause...
bq. ...As VirtualBox's module has similar use-cases like yours for virtualization, I hope yours does not conflict with that one.
We don't test on VirtualBox, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for the hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla Linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...
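As a quick check of the scale in that comparison (the 20 usec and 20 second figures come from the comment above; treating the lower bound of "4 to 6 orders" as exactly $10^4$ is an assumption made here for illustration):

```latex
\[
\frac{20\,\mathrm{s}}{20\,\mu\mathrm{s}}
  = \frac{20}{20 \times 10^{-6}}
  = 10^{6}
  \qquad \text{(six orders of magnitude)},
\]
\[
20\,\mu\mathrm{s} \times 10^{4} = 0.2\,\mathrm{s}
  \qquad \text{(four orders of magnitude)}.
\]
```

So the 20 usec versus 20 second example sits at the upper end of the stated range, while the lower end would already turn a 20 usec phase shift into a pause of about a fifth of a second.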
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666 ] Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:20 PM:
{quote} Maybe it would be a good idea to provide both C4 - Memory Management layers, so also for plain kernels (as configuration option to the JVM like huge pages in Oracle's). Or is your VM then only as fast as Oracle's? {quote}
It's not so much a matter of speed as it is a matter of pause time. Zing is not faster than Oracle's JVM; it's just as fast but without those pesky pauses. It's those pauses that keep people from using anything more than a tiny amount of memory in Java these days (to me, tiny means a small fraction of a commodity $4K server). With the ability to practically (i.e. without completely stopping for many seconds at a time once in a while) use the nice, cheap memory we now have in servers comes another form of speed: the kind that comes from not repeating work.
A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management API would be unusable at this point. Think of the difference between a 20 usec phase shift and a 20 second pause...
{quote} ...As VirtualBox's module has similar use-cases like yours for virtualization, I hope yours does not conflict with that one. {quote}
We don't test on VirtualBox, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for the hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla Linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666 ] Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:22 PM:
{quote} Maybe it would be a good idea to provide both C4 - Memory Management layers, so also for plain kernels (as configuration option to the JVM like huge pages in Oracle's). Or is your VM then only as fast as Oracle's? {quote}
It's not so much a matter of speed as it is a matter of pause time. Zing is not faster than Oracle's JVM; it's just as fast but without those pesky pauses. It's those pauses that keep people from using anything more than a tiny amount of memory in Java these days (to me, tiny means a small fraction of a commodity $4K server). With the ability to practically (i.e. without completely stopping for many seconds at a time once in a while) use the nice, cheap memory we now have in servers comes another form of speed: the kind that comes from not repeating work.
A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management API would be unusable at this point. Think of the difference between a 20 usec phase shift and a 20 second pause...
{quote} ...As VirtualBox's module has similar use-cases like yours for virtualization, I hope yours does not conflict with that one. {quote}
We don't test on VirtualBox, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for the hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla Linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666 ]

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:23 PM:

{quote} Maybe it would be a good idea to provide both C4 - Memory Management layers, so also for plain kernels (as configuration option to the JVM like huge pages in Oracle's). Or is your VM then only as fast as Oracle's? {quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not faster than Oracle's JVM; it's just as fast but without those pesky pauses. It's those pauses that keep people from using anything more than a tiny amount of memory in Java these days (to me, tiny means a small fraction of a commodity $4K server). With the ability to practically (i.e. without completely stopping for many seconds at a time once in a while) use the nice, cheap memory we now have in servers comes another form of speed: the kind that comes from not repeating work, or from not having to leave the process to look something up.

A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management API would be unusable at this point. Think of the difference between a 20usec phase shift and a 20 second pause...

{quote} ...As VirtualBOX's module has similar use-cases like yours for virtualization, I hope yours does not conflict with that one. {quote}

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for the hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla Linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...
[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475671#comment-13475671 ]

Uwe Schindler commented on LUCENE-4482:

bq. We don't test on VirtualBOX, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...

In that case, the VM is not running inside a guest OS, but in parallel to a hypervisor using the same Linux kernel. The question was whether the two modules may conflict with each other. But I could also imagine using Zing inside a virtual machine on one of my servers using Lucene (once the bugs are fixed).

bq. A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management api would be unusable at this point. Think of the difference between a 20usec phase shift and a 20 second pause...

Have you thought about making this kernel module available to the kernel developers for other potential use cases (like virtual machines) that also need to re-allocate lots of RAM and influence paging/unmapping/mapping? I did not find the GPL source code of your kernel module, only the binary downloads. Where can I get it? It's easy to hook its build into DKMS (if it's a standard Makefile, see https://help.ubuntu.com/community/DKMS) without any custom Debian package.
Likely Zing JVM bug causes failures in TestPayloadNearQuery --- Key: LUCENE-4482 URL: https://issues.apache.org/jira/browse/LUCENE-4482 Project: Lucene - Core Issue Type: Bug Environment: Lucene trunk, rev 1397735 Zing: {noformat} java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-6) Java HotSpot(TM) 64-Bit Tiered VM (build 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode) {noformat} Ubuntu 12.04 LTS 3.2.0-23-generic kernel Reporter: Michael McCandless Attachments: LUCENE-4482.patch I dug into one of the Lucene test failures when running with Zing JVM (available free for open source devs...). At least one other test sometimes fails but I haven't dug into that yet. I managed to get the failure easily reproduced: with the attached patch, on rev 1397735 checkout, if you cd to lucene/core and run: {noformat} ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 -Dtests.showSuccess=true {noformat} Then you'll hit several failures in TestPayloadNearQuery, eg: {noformat} Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery 1 FAILED 2 NOTE: reproduce with: ant test -Dtestcase=TestPayloadNearQuery -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII ERROR 0.01s | TestPayloadNearQuery.test Throwable #1: java.lang.RuntimeException: overridden idfExplain method in TestPayloadNearQuery.BoostingSimilarity was not called at __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0) at org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740) at org.apache.lucene.search.spans.SpanWeight.init(SpanWeight.java:62) at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.init(PayloadNearQuery.java:147) at org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75) at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648) at 
org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265) at org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at
[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475671#comment-13475671 ]

Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 5:41 PM:

bq. We don't test on VirtualBOX, so I don't know for sure. In general, Zing works fine when run on top of hypervisors that fully support things like 2MB page mappings (the same sort of support needed for hugetlb feature to work). Unfortunately, there are some hypervisors out there (e.g. some versions of Xen for paravirt guests) that don't support that, and will crash a vanilla linux kernel trying to use hugetlb. Zing won't work in such cases either, and for the same reasons...

In our/my case, the VM is not running inside a guest OS, but in parallel to a hypervisor (running Windows) using the same Linux kernel. The question was whether the two modules may conflict with each other. But I could also imagine using Zing inside a virtual machine on one of my servers using Lucene (once the bugs are fixed).

bq. A 4 to 6 order of magnitude difference in pause time and in sustainable allocation rate is so big that a C4 that uses the vanilla memory management api would be unusable at this point. Think of the difference between a 20usec phase shift and a 20 second pause...

Have you thought about making this kernel module available to the kernel developers for other potential use cases (like virtual machines) that also need to re-allocate lots of RAM and influence paging/unmapping/mapping? I did not find the GPL source code of your kernel module, only the binary downloads. Where can I get it? It's easy to hook its build into DKMS (if it's a standard Makefile, see https://help.ubuntu.com/community/DKMS) without any custom Debian package.
[jira] [Created] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf
Paul Elschot created LUCENE-4483: Summary: Make Term constructor javadoc refer to BytesRef.deepCopyOf Key: LUCENE-4483 URL: https://issues.apache.org/jira/browse/LUCENE-4483 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.1 Reporter: Paul Elschot Priority: Trivial Fix For: 4.1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf
[ https://issues.apache.org/jira/browse/LUCENE-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-4483:

Description: The javadoc of the Term constructor that takes a BytesRef indicates that a clone needs to be made of the BytesRef. But the clone() method of BytesRef is not what is meant; a deep copy needs to be made.

Key: LUCENE-4483
URL: https://issues.apache.org/jira/browse/LUCENE-4483
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Affects Versions: 4.1
Reporter: Paul Elschot
Priority: Trivial
Fix For: 4.1
Attachments: LUCENE-4483.patch
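The distinction the description draws between a clone and a deep copy can be illustrated with a small sketch. This is not Lucene code: `Ref` below is a hypothetical stand-in for a byte-slice reference, and `shallowCopy()` is an invented name used only to show a copy that shares the backing array. The sketch assumes, as the issue description implies, that a plain clone keeps pointing at the caller's `byte[]`, while a deep copy gets its own bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DeepCopySketch {

    /** Hypothetical stand-in for a byte-slice reference; not Lucene's BytesRef. */
    static final class Ref {
        final byte[] bytes;
        final int offset;
        final int length;

        Ref(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        /** Shallow copy: a new Ref object that still shares the same byte[]. */
        Ref shallowCopy() {
            return new Ref(bytes, offset, length);
        }

        /** Deep copy: a new Ref backed by a private copy of the referenced bytes. */
        static Ref deepCopyOf(Ref other) {
            byte[] copy = Arrays.copyOfRange(
                    other.bytes, other.offset, other.offset + other.length);
            return new Ref(copy, 0, other.length);
        }

        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = "lucene".getBytes(StandardCharsets.UTF_8);
        Ref original = new Ref(buffer, 0, buffer.length);

        Ref shallow = original.shallowCopy();
        Ref deep = Ref.deepCopyOf(original);

        // The caller reuses its buffer after handing out the copies.
        buffer[0] = 'L';

        System.out.println(shallow.utf8()); // prints "Lucene": shares the mutated array
        System.out.println(deep.utf8());    // prints "lucene": unaffected by the mutation
    }
}
```

If a caller reuses its buffer after constructing an object from the reference, only the deep copy keeps the original value, which appears to be why the javadoc should point at a deep-copy method rather than at clone().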
[jira] [Updated] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf
[ https://issues.apache.org/jira/browse/LUCENE-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-4483:

Attachment: LUCENE-4483.patch

Key: LUCENE-4483
URL: https://issues.apache.org/jira/browse/LUCENE-4483
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Affects Versions: 4.1
Reporter: Paul Elschot
Priority: Trivial
Fix For: 4.1
Attachments: LUCENE-4483.patch

The javadoc of the Term constructor that takes a BytesRef indicates that a clone needs to be made of the BytesRef. But the clone() method of BytesRef is not what is meant; a deep copy needs to be made.
[jira] [Commented] (LUCENE-4446) Switch to BlockPostingsFormat for Lucene 4.1
[ https://issues.apache.org/jira/browse/LUCENE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475725#comment-13475725 ]

Michael McCandless commented on LUCENE-4446:

+1 I diff'd branch vs trunk and it looks good! Looks like this was quite a bit of work :) Thanks Robert!

Switch to BlockPostingsFormat for Lucene 4.1

Key: LUCENE-4446
URL: https://issues.apache.org/jira/browse/LUCENE-4446
Project: Lucene - Core
Issue Type: Improvement
Components: core/codecs
Reporter: Robert Muir
Fix For: 4.1

This has baked for some time: no crazy fails in Hudson or anything. The code (in my opinion) is actually a lot simpler than the current postings format, it's faster, the indexes are smaller, and so on. We should probably spend some time just going over the code and adding some more tests and such, but I think it's time to start looking at cutting over.