[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475566#comment-13475566
 ] 

Uwe Schindler commented on LUCENE-4482:
---

{quote}
The patch at least isolates the JVM bug even if it's not exactly a
minimal test. Somehow the idfExplain method, which
is overridden in this test's BoostingSimilarity, fails to be called
(the super.idfExplain is called instead), which leads to the test
failures.
{quote}

So it is definitely a JVM and not a Lucene bug! Have you reported it?

I would run Zing tests, too, but before doing that they should:
Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 
LTS. The Jenkins server is running in DMZ so I will never ever run it with 
outdated kernels. They should (if they really need a kernel module, which is in 
my opinion a no-go, too) use DKMS and make the kernel module open source, so my 
kernel is also not tainted. Without that I will not support Zing, sorry. But I 
doubt if the kernel module is really needed! Without a clear explanation why 
this is needed on their homepage I don't agree.


 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | TestPayloadNearQuery.test 
 Throwable #1: java.lang.RuntimeException: overridden idfExplain method in TestPayloadNearQuery.BoostingSimilarity was not called
  at __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
  at org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
  at org.apache.lucene.search.spans.SpanWeight.<init>(SpanWeight.java:62)
  at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.<init>(PayloadNearQuery.java:147)
  at org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
  at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
  at org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
  at org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at 
 

[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!

2012-10-13 Thread builder
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9691/

1 tests failed.
REGRESSION:  
org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded

Error Message:
Captured an uncaught exception in thread: Thread[id=117, name=Thread-88, 
state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=117, name=Thread-88, state=RUNNABLE, 
group=TGRP-TestTimeLimitingCollector]
Caused by: java.lang.OutOfMemoryError: Java heap space
at __randomizedtesting.SeedInfo.seed([3DC69507A6600F79]:0)
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at java.lang.StringBuilder.append(StringBuilder.java:128)
at org.apache.lucene.store.MockIndexInputWrapper.<init>(MockIndexInputWrapper.java:37)
at org.apache.lucene.store.MockIndexInputWrapper.clone(MockIndexInputWrapper.java:72)
at org.apache.lucene.store.MockIndexInputWrapper.clone(MockIndexInputWrapper.java:28)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$SegmentDocsEnumBase.<init>(Lucene40PostingsReader.java:329)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$AllDocsSegmentDocsEnum.<init>(Lucene40PostingsReader.java:511)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader.newDocsEnum(Lucene40PostingsReader.java:247)
at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader.docs(Lucene40PostingsReader.java:228)
at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(BlockTreeTermsReader.java:2188)
at org.apache.lucene.index.FilterAtomicReader$FilterTermsEnum.docs(FilterAtomicReader.java:188)
at org.apache.lucene.index.AssertingAtomicReader$AssertingTermsEnum.docs(AssertingAtomicReader.java:122)
at org.apache.lucene.index.MultiTermsEnum.docs(MultiTermsEnum.java:403)
at org.apache.lucene.index.TermsEnum.docs(TermsEnum.java:157)
at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:86)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:322)
at org.apache.lucene.search.AssertingIndexSearcher$1.scorer(AssertingIndexSearcher.java:80)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:587)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:293)
at org.apache.lucene.search.TestTimeLimitingCollector.search(TestTimeLimitingCollector.java:124)
at org.apache.lucene.search.TestTimeLimitingCollector.doTestSearch(TestTimeLimitingCollector.java:139)
at org.apache.lucene.search.TestTimeLimitingCollector.access$200(TestTimeLimitingCollector.java:42)
at org.apache.lucene.search.TestTimeLimitingCollector$1.run(TestTimeLimitingCollector.java:292)




Build Log:
[...truncated 624 lines...]
[junit4:junit4] Suite: org.apache.lucene.search.TestTimeLimitingCollector
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=680 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=760 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=720 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=720 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=760 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is slower than usual):  lastDoc=1 , 
allowed=51 , elapsed=820 >= 658 = 7.0 * ( 2*resolution +  TIME_ALLOWED + 
SLOW_DOWN = 2*20 + 51 + 3)
[junit4:junit4]   1 Informative: timeout exceeded (no action required: most 
probably just  because the test machine is 

[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_35) - Build # 1725 - Failure!

2012-10-13 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/1725/
Java: 32bit/jdk1.6.0_35 -client -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 23382 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:342: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:65: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:511: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/common-build.xml:1910: 
Can't get https://issues.apache.org/jira/rest/api/2/project/LUCENE to 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/changes/jiraVersionList.json

Total time: 30 minutes 31 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_35 -client -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475624#comment-13475624
 ] 

Michael McCandless commented on LUCENE-4482:



bq. So it is definitely a JVM and not a Lucene bug!

I'm pretty sure: the JVM is failing to call the overridden method in a
subclass (calling the base class method instead).  No other JVMs fail
here, and Zing won't fail if you run the test in isolation ... and it
doesn't always fail if you run all tests (ie it seems to depend on the
seed).
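
The failure mode described here can be illustrated with a minimal sketch. The class names below (BaseSimilarity, BoostingSimilaritySketch, DispatchCheckSketch) are hypothetical stand-ins, not the Lucene classes or the attached patch; the sketch only shows how a base-class method could detect that a subclass override was skipped, under the assumption that every subclass overrides idfExplain:

```java
// Hypothetical stand-in for a base similarity class (not Lucene's TFIDFSimilarity).
class BaseSimilarity {
    // Set by an overriding idfExplain so the base class can tell it actually ran.
    protected boolean overrideCalled = false;

    String idfExplain() {
        return "base idf";
    }

    String computeWeight() {
        overrideCalled = false;
        // Virtual call: a correct JVM dispatches on the runtime class of "this".
        String idf = idfExplain();
        // Assumption of this sketch: every subclass overrides idfExplain.
        if (getClass() != BaseSimilarity.class && !overrideCalled) {
            throw new RuntimeException("overridden idfExplain method was not called");
        }
        return idf;
    }
}

// Hypothetical stand-in for the test's BoostingSimilarity.
class BoostingSimilaritySketch extends BaseSimilarity {
    @Override
    String idfExplain() {
        overrideCalled = true;
        return "boosted idf";
    }
}

class DispatchCheckSketch {
    public static void main(String[] args) {
        BaseSimilarity sim = new BoostingSimilaritySketch();
        // On a JVM that dispatches correctly this prints "boosted idf".
        System.out.println(sim.computeWeight());
    }
}
```

On a JVM with the suspected bug, the call inside computeWeight would run the base method instead, and the sketch would throw rather than return a value.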

bq. Have you reported it?

Working on it ... trying to get an account at 
http://www.azulsystems.com/developers/bugzilla/

{quote}
I would run Zing tests, too, but before doing that they should:
Not rely on strange binary kernel modules that are outdated on Ubuntu 12.04.1 
LTS. The Jenkins server is running in DMZ so I will never ever run it with 
outdated kernels. They should (if they really need a kernel module, which is in 
my opinion a no-go, too) use DKMS and make the kernel module open source, so my 
kernel is also not tainted. Without that I will not support Zing, sorry. But I 
doubt if the kernel module is really needed! Without a clear explanation why 
this is needed on their homepage I don't agree.
{quote}

I agree: it's crazy that it only runs as a binary module on old kernel
versions ... they know this is a showstopper (I've complained about
it several times...) and they're working on it.



Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!

2012-10-13 Thread Michael McCandless
I can't repro (standalone or running all tests with the same JVM count
and seed: ant -Dtests.jvms=8 clean test-core
-Dtests.seed=3DC69507A6600F79).

This build seems not to save OOM heap dumps.  I realize these are
space consuming ... but can we eg save up to N of them (ie delete
oldest ones first)?  This way we at least have a shot of seeing what
was taking so much RAM...
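
A minimal sketch of the "keep up to N, delete oldest first" idea, assuming the dumps are *.hprof files written into a single directory (for example by HotSpot's -XX:+HeapDumpOnOutOfMemoryError together with -XX:HeapDumpPath). The class name HeapDumpRetentionSketch and the method pruneOldDumps are hypothetical; this is not how either Jenkins server is actually configured:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, e.g. run as a post-build step.
class HeapDumpRetentionSketch {

    /** Deletes the oldest *.hprof files so that at most maxDumps remain; returns how many were deleted. */
    static int pruneOldDumps(Path dumpDir, int maxDumps) throws IOException {
        List<Path> dumps = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dumpDir, "*.hprof")) {
            for (Path p : stream) {
                dumps.add(p);
            }
        }
        // Oldest first, by last-modified time.
        dumps.sort(Comparator.comparingLong(p -> p.toFile().lastModified()));
        int toDelete = Math.max(0, dumps.size() - maxDumps);
        for (int i = 0; i < toDelete; i++) {
            Files.delete(dumps.get(i));
        }
        return toDelete;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: HeapDumpRetentionSketch <dumpDir> <maxDumps>");
            return;
        }
        int deleted = pruneOldDumps(Paths.get(args[0]), Integer.parseInt(args[1]));
        System.out.println("deleted " + deleted + " old heap dump(s)");
    }
}
```

Bounding the number of retained dumps caps disk usage while still leaving the most recent OOM available for inspection.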

Mike McCandless

http://blog.mikemccandless.com

On Sat, Oct 13, 2012 at 9:42 AM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9691/

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded


[jira] [Commented] (LUCENE-4446) Switch to BlockPostingsFormat for Lucene 4.1

2012-10-13 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475636#comment-13475636
 ] 

Robert Muir commented on LUCENE-4446:
-

Branch is ready for merging (/lucene/dev/branches/lucene4446).

 Switch to BlockPostingsFormat for Lucene 4.1
 

 Key: LUCENE-4446
 URL: https://issues.apache.org/jira/browse/LUCENE-4446
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs
Reporter: Robert Muir
 Fix For: 4.1


 This has baked for some time: no crazy fails in hudson or anything.
 The code (in my opinion) is actually a lot simpler than the current postings 
 format, it's faster, the indexes are smaller, and so on.
 We should probably spend some time just going over the code and adding some 
 more tests and such, but I think it's time to start looking at cutting over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




RE: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!

2012-10-13 Thread Uwe Schindler
I disabled them a while ago, also for Apache Jenkins. The problem is that one 
of the real tests produces an OOM dump, too (I think the crash-my-JVM one), so 
it always consumes this space!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 13, 2012 5:37 PM
 To: dev@lucene.apache.org
 Cc: sim...@apache.org
 Subject: Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 -
 Failure!
 
 I can't repro (standalone or running all tests with the same JVM count and
 seed: ant -Dtests.jvms=8 clean test-core -Dtests.seed=3DC69507A6600F79).
 
 This build seems not to save OOM heap dumps.  I realize these are space
 consuming ... but can we eg save up to N of them (ie delete oldest ones 
 first)?
 This way we at least have a shot of seeing what was taking so much RAM...
 
 Mike McCandless
 
 http://blog.mikemccandless.com
 

Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9691 - Failure!

2012-10-13 Thread Robert Muir
This one is now a nightly-only test! So maybe we can safely enable
this for the hourly builds?

On Sat, Oct 13, 2012 at 12:01 PM, Uwe Schindler u...@thetaphi.de wrote:
 I disabled them a while ago also for Apache Jenkins. The problem is: One of 
 the real tests produce the OOM dump, too (I think the crash-my-JVM one). So 
 it consumes always this space!

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475643#comment-13475643
 ] 

Gil Tene commented on LUCENE-4482:
--

We're looking into this bug report. Will hopefully report back / resolve it 
soon. [But Michael, please go ahead and report it on our bugzilla as well per 
the above].

[Uwe Schindler wrote:]
 I would run Zing tests, too, but before doing that they should:
 Not rely on strange binary kernel modules that are outdated on
 Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I
 will never ever run it with outdated kernels. They should (if
 they really need a kernel module, which is in my opinion a no-go,
 too) use DKMS and make the kernel module open source, so my kernel
 is also not tainted. Without that I will not support Zing, sorry.
 But I doubt if the kernel module is really needed! Without a
 clear explanation why this is needed on their homepage I don't agree.

This has two parts: one questioning why our loadable module is needed at 
all, and the other relating to its availability for various kernels and Linux 
distros.

1. Why is the ZST (which includes a loadable module) needed for Zing to operate?

One of Zing JVM's main distinctions is that its C4 garbage collector (aka GPGC 
internally) eliminates garbage collection as a response time concern for 
enterprise applications. Among other things, C4 relies on rapid manipulation of 
virtual memory and physical memory mappings to maintain continuous operation. 
While the semantics of the manipulations we do are possible using the vanilla 
mmap/mremap/munmap/madvise APIs, the rate at which those are supported in Linux 
(and most other OSs) is extremely low due mostly to the historic, extremely 
conservative approach to in-process TLB invalidation, and due partly to issues 
with multiple-page size manipulations. We're not talking small change here. 
More like 4-6 orders of magnitude for our common operation, which is, right 
now, the difference between a practical and impractical implementation of C4.
You can find a detailed discussion of the difference in metrics for these 
operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in 
our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site 
http://tinyurl.com/7rydpvo).
   
2. Loadable Module availability and compatibility

To be clear, our loadable module is open source, under GPLv2, and you can have 
the sources for it if you wish. The reason for the current choice of packaging 
is that a wide range of current end customers' Linux systems do not have (or 
do not wish to install) the tooling needed to build or rebuild the module, and 
what they need operationally is an RPM that opens and installs without requiring 
kernel headers and the like. In addition, we tend to intensively test and 
examine the kernel module against specific distros and kernels to verify 
compatibility and stability, and declare official support for these well-tested 
combinations.

On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is 
fairly slow, and the kernel API signatures tend to remain the same unless 
semantics are actually modified. As a result, we use a single module RPM for 
the RHEL5 and CentOS 5 versions, and have only needed a single rev of the module 
packaging during the evolution of RHEL6/CentOS6 and SLES 11 thus far.

As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

So we're working on it, and it will get better...


 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475643#comment-13475643
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 4:09 PM:


We're looking into this bug report. Will hopefully report back / resolve it 
soon. [But Michael, please go ahead and report it on our bugzilla as well per 
the above].

[Uwe Schindler wrote:]
 I would run Zing tests, too, but before doing that they should:
 Not rely on strange binary kernel modules that are outdated on
 Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I
 will never ever run it with outdated kernels. They should (if
 they really need a kernel module, which is in my opinion a no-go,
 too) use DKMS and make the kernel module open source, so my kernel
 is also not tainted. Without that I will not support Zing, sorry.
 But I doubt if the kernel module is really needed! Without a
 clear explanation why this is needed on their homepage I don't agree.

This has two parts: one questioning why our loadable module is needed at 
all, and the other relating to its availability for various kernels and Linux 
distros.

1. Why is the ZST (which includes a loadable module) needed for Zing to operate?

One of Zing JVM's main distinctions is that its C4 garbage collector (aka GPGC 
internally) eliminates garbage collection as a response time concern for 
enterprise applications. Among other things, C4 relies on rapid manipulation of 
virtual memory and physical memory mappings to maintain continuous operation. 
While the semantics of the manipulations we do are possible using the vanilla 
mmap/mremap/munmap/madvise APIs, the rate at which those are supported in Linux 
(and most other OSs) is extremely low due mostly to the historic, extremely 
conservative approach to in-process TLB invalidation, and due partly to issues 
with multiple-page size manipulations. We're not talking small change here. 
More like 4-6 orders of magnitude for our common operation, which is, right 
now, the difference between a practical and impractical implementation of C4.
You can find a detailed discussion of the difference in metrics for these 
operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in 
our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site 
http://tinyurl.com/7rydpvo).
   
2. Loadable Module availability and compatibility

To be clear, our loadable module is open source, under GPLv2, and you can have 
the sources for it if you wish. The reason for the current choice of packaging 
is that a wide range of current end customers' Linux systems do not have (or 
do not wish to install) the tooling needed to build or rebuild the module, and 
what they need operationally is an RPM that opens and installs without requiring 
kernel headers and the like. In addition, we tend to intensively test and 
examine the kernel module against specific distros and kernels to verify 
compatibility and stability, and declare official support for these well-tested 
combinations.

On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is 
fairly slow, and the kernel API signatures tend to remain the same unless 
semantics are actually modified. As a result, we use a single module RPM for 
the RHEL5 and CentOS 5 versions, and have only needed a single rev of the module 
packaging during the evolution of RHEL6/CentOS6 and SLES 11 thus far.

As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

So we're working on it, and it will get better...

-- Gil. [CTO, Azul Systems]



[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475658#comment-13475658
 ] 

Uwe Schindler commented on LUCENE-4482:
---

Thanks Gil for the explanation,

Maybe it would be a good idea to provide both C4 memory management layers, 
i.e. also one for plain kernels (as a configuration option to the JVM, like huge 
pages in Oracle's). Or would your VM then be only as fast as Oracle's?

bq. As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

VirtualBox has similar requirements and uses DKMS. It ships as a deb package 
that contains the source code of its kernel module, which is rebuilt 
automatically on every kernel installation. DKMS itself depends on the compiler 
and kernel headers, so you only need to depend on the DKMS module and almost 
nothing more. The Jenkins server also runs Windows in a virtual machine, and 
therefore uses VirtualBox's kernel module, too. As VirtualBox's module has use 
cases similar to yours for virtualization, I hope yours does not conflict with it.

By the way, Ubuntu LTS is also very popular on servers!

 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | TestPayloadNearQuery.test 
 Throwable #1: java.lang.RuntimeException: overridden idfExplain method 
 in TestPayloadNearQuery.BoostingSimilarity was not called
  at 
 __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
  at 
 org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
  at org.apache.lucene.search.spans.SpanWeight.init(SpanWeight.java:62)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.init(PayloadNearQuery.java:147)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
  at 
 org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
  at 
 org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
  at 
 org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
  at 
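To illustrate what the exception above is checking: the test's BoostingSimilarity overrides idfExplain, and TFIDFSimilarity.computeWeight is expected to reach that override through ordinary virtual dispatch. The following is a self-contained sketch of that kind of check, not the code from LUCENE-4482.patch. The classes DispatchCheck and BaseSimilarity, the method checkedComputeWeight and all method bodies are hypothetical stand-ins; only the names BoostingSimilarity, idfExplain and computeWeight are borrowed from the report.

```java
public class DispatchCheck {

    // Hypothetical stand-in for the base similarity: computeWeight() calls
    // idfExplain() virtually, so a subclass override must be the one that runs.
    static class BaseSimilarity {
        String idfExplain() {
            return "base idf";
        }

        String computeWeight() {
            return idfExplain();
        }
    }

    // Hypothetical stand-in for the test's subclass; it records whether
    // its override was actually invoked.
    static class BoostingSimilarity extends BaseSimilarity {
        boolean overrideCalled = false;

        @Override
        String idfExplain() {
            overrideCalled = true;
            return "boosting idf";
        }
    }

    // Runs computeWeight() and fails loudly if a BoostingSimilarity was passed
    // in but the base implementation of idfExplain() ran instead.
    static String checkedComputeWeight(BaseSimilarity sim) {
        String result = sim.computeWeight();
        if (sim instanceof BoostingSimilarity
                && !((BoostingSimilarity) sim).overrideCalled) {
            throw new RuntimeException(
                "overridden idfExplain method in BoostingSimilarity was not called");
        }
        return result;
    }

    public static void main(String[] args) {
        // Repeat the call many times; whether the reported problem depends on
        // the code having been compiled is an assumption, not something
        // confirmed in this thread.
        for (int i = 0; i < 100000; i++) {
            checkedComputeWeight(new BoostingSimilarity());
        }
        System.out.println(checkedComputeWeight(new BoostingSimilarity()));
    }
}
```

On a JVM that dispatches correctly the loop never throws and the final call returns the subclass's value; the reported failure corresponds to the exception firing.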
 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475658#comment-13475658
 ] 

Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 4:56 PM:
-

Thanks Gil for the explanation,

Maybe it would be a good idea to provide both C4 memory management layers, 
i.e. also one for plain kernels (as a configuration option to the JVM, like huge 
pages in Oracle's). Or would your VM then be only as fast as Oracle's?

bq. As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

VirtualBox has similar requirements and uses DKMS. It ships as a deb package 
that contains the source code of its kernel module, which is rebuilt 
automatically on every kernel installation. DKMS itself depends on the compiler 
and suggests kernel headers, so you only need to depend on the dkms module and 
linux-headers and almost nothing more. The Jenkins server also runs Windows in 
a virtual machine, and therefore uses VirtualBox's kernel module, too. As 
VirtualBox's module has use cases similar to yours for virtualization, I hope 
yours does not conflict with it.

By the way, Ubuntu LTS is also very popular on servers!


[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475658#comment-13475658
 ] 

Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 4:57 PM:
-

Thanks Gil for the explanation,

Maybe it would be a good idea to provide both C4 memory management layers, 
i.e. also one for plain kernels (as a configuration option to the JVM, like huge 
pages in Oracle's). Or would your VM then be only as fast as Oracle's?

bq. As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

VirtualBox has similar requirements and uses DKMS. It ships as a deb package 
that contains the source code of its kernel module, which is rebuilt 
automatically on every kernel installation. DKMS itself depends on the compiler 
and suggests kernel headers, so you only need to depend on the dkms package and 
linux-headers and almost nothing more. The Jenkins server also runs Windows in 
a virtual machine, and therefore uses VirtualBox's kernel module, too. As 
VirtualBox's module has use cases similar to yours for virtualization, I hope 
yours does not conflict with it.

By the way, Ubuntu LTS is also very popular on servers!


[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene commented on LUCENE-4482:
--

bq.
Maybe it would be a good idea to provide both C4 memory management layers, 
i.e. also one for plain kernels (as a configuration option to the JVM, like huge 
pages in Oracle's). Or would your VM then be only as fast as Oracle's?

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a 
commodity $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed: the kind that 
comes from not repeating work.

A 4 to 6 orders of magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...
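
As a rough check on that scale: the gap between a 20 usec phase shift and a 20 second pause works out to a factor of one million, i.e. six orders of magnitude, the top of the range quoted. Below is a minimal Java sketch of the arithmetic; the class name PauseRatio and the method ordersOfMagnitude are made up for illustration and are not part of Zing or Lucene.

```java
public class PauseRatio {

    // Number of decimal orders of magnitude separating two durations,
    // both given in microseconds.
    static double ordersOfMagnitude(long longerMicros, long shorterMicros) {
        return Math.log10((double) longerMicros / (double) shorterMicros);
    }

    public static void main(String[] args) {
        long pauseMicros = 20L * 1000L * 1000L; // a 20 second pause
        long phaseShiftMicros = 20L;            // a 20 usec phase shift

        // The ratio is 1,000,000, so the result is approximately 6.
        System.out.println(ordersOfMagnitude(pauseMicros, phaseShiftMicros));
    }
}
```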

bq.
...As VirtualBox's module has use cases similar to yours for virtualization, 
I hope yours does not conflict with it.

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...



[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:20 PM:


{quote}
Maybe it would be a good idea to provide both C4 memory management layers, 
i.e. also one for plain kernels (as a configuration option to the JVM, like huge 
pages in Oracle's). Or would your VM then be only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a 
commodity $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed: the kind that 
comes from not repeating work.

A 4 to 6 orders of magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

{quote}
...As VirtualBox's module has use cases similar to yours for virtualization, 
I hope yours does not conflict with it.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
bq.
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM, it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me tiny means a small fraction of a 
commodity, $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once is a while) use the nice, 
cheap memory we now have in servers comes another form of speed - the kind that 
comes from not repeating work.

A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

bq.
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:22 PM:


{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM, it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me tiny means a small fraction of a 
commodity, $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed - the kind that 
comes from not repeating work.

A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM, it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me tiny means a small fraction of a 
commodity, $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed - the kind that 
comes from not repeating work.

A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:23 PM:


{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM, it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me tiny means a small fraction of a 
commodity, $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed - the kind that 
comes from not repeating work, or from not having to leave the process to look 
something up.

A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM, it's just as fast but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me tiny means a small fraction of a 
commodity, $4K server). With the ability to practically (i.e. without 
completely stopping for many seconds at a time once in a while) use the nice, 
cheap memory we now have in servers comes another form of speed - the kind that 
comes from not repeating work.

A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBOX, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in 

[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475671#comment-13475671
 ] 

Uwe Schindler commented on LUCENE-4482:
---

bq. We don't test on VirtualBOX, so I don't know for sure. In general, Zing 
works fine when run on top of hypervisors that fully support things like 2MB 
page mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

In that case, the VM is not running inside a guest OS, but in parallel to a 
hypervisor using the same Linux kernel. The question was whether the two modules 
may conflict with each other. But I could also imagine using Zing inside a virtual 
machine on one of my servers using Lucene (once the bugs are fixed).

bq. A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

Have you thought about making this kernel module available to the kernel 
developers for other potential use cases (like virtual machines) also needing 
to re-allocate lots of RAM and influence paging/unmapping/mapping?

I did not find the GPL source code of your kernel module, only the binary 
downloads. Where can I get it? It's easy to hook its build into DKMS (if it's a 
standard Makefile, see https://help.ubuntu.com/community/DKMS) without any 
custom Debian package.

 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | TestPayloadNearQuery.test 
 Throwable #1: java.lang.RuntimeException: overridden idfExplain method 
 in TestPayloadNearQuery.BoostingSimilarity was not called
  at 
 __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
  at 
 org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
  at org.apache.lucene.search.spans.SpanWeight.init(SpanWeight.java:62)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.init(PayloadNearQuery.java:147)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
  at 
 org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
  at 
 org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
  at 
 org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
  at 
 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475671#comment-13475671
 ] 

Uwe Schindler edited comment on LUCENE-4482 at 10/13/12 5:41 PM:
-

bq. We don't test on VirtualBOX, so I don't know for sure. In general, Zing 
works fine when run on top of hypervisors that fully support things like 2MB 
page mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

In our/my case, the VM is not running inside a guest OS, but in parallel to a 
hypervisor (running Windows) using the same Linux kernel. The question was 
whether the two modules may conflict with each other. But I could also imagine 
using Zing inside a virtual machine on one of my servers using Lucene (once the 
bugs are fixed).

bq. A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

Have you thought about making this kernel module available to the kernel 
developers for other potential use cases (like virtual machines) also needing 
to re-allocate lots of RAM and influence paging/unmapping/mapping?

I did not find the GPL source code of your kernel module, only the binary 
downloads. Where can I get it? It's easy to hook its build into DKMS (if it's a 
standard Makefile, see https://help.ubuntu.com/community/DKMS) without any 
custom Debian package.

  was (Author: thetaphi):
bq. We don't test on VirtualBOX, so I don't know for sure. In general, Zing 
works fine when run on top of hypervisors that fully support things like 2MB 
page mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that, and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

In that case, the VM is not running inside a guest OS, but in parallel to a 
hypervisor using the same Linux kernel. The question was whether the two modules 
may conflict with each other. But I could also imagine using Zing inside a virtual 
machine on one of my servers using Lucene (once the bugs are fixed).

bq. A difference of 4 to 6 orders of magnitude in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management API 
would be unusable at this point. Think of the difference between a 20usec phase 
shift and a 20 second pause...

Have you thought about making this kernel module available to the kernel 
developers for other potential use cases (like virtual machines) also needing 
to re-allocate lots of RAM and influence paging/unmapping/mapping?

I did not find the GPL source code of your kernel module, only the binary 
downloads. Where can I get it? It's easy to hook its build into DKMS (if it's a 
standard Makefile, see https://help.ubuntu.com/community/DKMS) without any 
custom Debian package.
  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | 

[jira] [Created] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf

2012-10-13 Thread Paul Elschot (JIRA)
Paul Elschot created LUCENE-4483:


 Summary: Make Term constructor javadoc refer to BytesRef.deepCopyOf
 Key: LUCENE-4483
 URL: https://issues.apache.org/jira/browse/LUCENE-4483
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.1
Reporter: Paul Elschot
Priority: Trivial
 Fix For: 4.1




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf

2012-10-13 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-4483:
-

Description: 
The Term constructor from BytesRef javadoc indicates that a clone needs to be 
made of the BytesRef.
But the clone() method of BytesRef is not what is meant; a deep copy needs to 
be made.
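
To illustrate the distinction the description draws, here is a minimal,
self-contained sketch. It does not use Lucene: the Ref class below is a
hypothetical stand-in that only mimics the shape of a byte-slice reference
(array, offset, length), and its clone() and deepCopyOf() are assumptions made
for illustration, not the BytesRef implementation. The point it shows is that
a shallow clone keeps sharing the caller's byte array, so later changes to
that array leak into the clone, while a deep copy does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CopyDemo {

    /** Hypothetical byte-slice reference; NOT Lucene's BytesRef. */
    static final class Ref implements Cloneable {
        final byte[] bytes;
        final int offset;
        final int length;

        Ref(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        /** Shallow copy: a new Ref object that shares the same backing array. */
        @Override
        public Ref clone() {
            return new Ref(bytes, offset, length);
        }

        /** Deep copy: a new Ref that owns a private copy of the referenced slice. */
        static Ref deepCopyOf(Ref other) {
            byte[] copy = Arrays.copyOfRange(
                other.bytes, other.offset, other.offset + other.length);
            return new Ref(copy, 0, other.length);
        }

        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = "lucene".getBytes(StandardCharsets.UTF_8);
        Ref original = new Ref(buffer, 0, buffer.length);

        Ref shallow = original.clone();
        Ref deep = Ref.deepCopyOf(original);

        // The caller reuses its buffer after handing out the copies.
        buffer[0] = 'L';

        System.out.println(shallow.utf8()); // prints "Lucene": the change leaked in
        System.out.println(deep.utf8());    // prints "lucene": the copy is isolated
    }
}
```

If a constructor keeps the reference it is given, only the deep copy protects
the constructed object from later reuse of the caller's buffer, which is
presumably why the javadoc should point at the deep-copy method.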

 Make Term constructor javadoc refer to BytesRef.deepCopyOf
 --

 Key: LUCENE-4483
 URL: https://issues.apache.org/jira/browse/LUCENE-4483
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.1
Reporter: Paul Elschot
Priority: Trivial
 Fix For: 4.1

 Attachments: LUCENE-4483.patch


 The Term constructor from BytesRef javadoc indicates that a clone needs to be 
 made of the BytesRef.
 But the clone() method of BytesRef is not what is meant; a deep copy needs to 
 be made.




[jira] [Updated] (LUCENE-4483) Make Term constructor javadoc refer to BytesRef.deepCopyOf

2012-10-13 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-4483:
-

Attachment: LUCENE-4483.patch

 Make Term constructor javadoc refer to BytesRef.deepCopyOf
 --

 Key: LUCENE-4483
 URL: https://issues.apache.org/jira/browse/LUCENE-4483
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.1
Reporter: Paul Elschot
Priority: Trivial
 Fix For: 4.1

 Attachments: LUCENE-4483.patch


 The Term constructor from BytesRef javadoc indicates that a clone needs to be 
 made of the BytesRef.
 But the clone() method of BytesRef is not what is meant; a deep copy needs to 
 be made.




[jira] [Commented] (LUCENE-4446) Switch to BlockPostingsFormat for Lucene 4.1

2012-10-13 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475725#comment-13475725
 ] 

Michael McCandless commented on LUCENE-4446:


+1

I diff'd branch vs trunk and it looks good!  Looks like this was quite a bit of 
work :)  Thanks Robert!


 Switch to BlockPostingsFormat for Lucene 4.1
 

 Key: LUCENE-4446
 URL: https://issues.apache.org/jira/browse/LUCENE-4446
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs
Reporter: Robert Muir
 Fix For: 4.1


 This has baked for some time: no crazy fails in hudson or anything.
 The code (in my opinion) is actually a lot simpler than the current postings 
 format, it's faster, the indexes are smaller, and so on.
 We should probably spend some time just going over the code and adding some 
 more tests and such, but I think it's time to start looking at cutting over.
