[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475643#comment-13475643
 ] 

Gil Tene commented on LUCENE-4482:
--

We're looking into this bug report. Will hopefully report back / resolve it 
soon. [But Michael, please go ahead and report it on our bugzilla as well per 
the above].

[Uwe Schindler wrote:]
 I would run Zing tests, too, but before doing that they should:
 Not rely on strange binary kernel modules that are outdated on
 Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I
 will never ever run it with outdated kernels. They should (if
 they really need a kernel module, which is in my opinion a no-go,
 too) use DKMS and make the kernel module open source, so my kernel
 is also not tainted. Without that I will not support Zing, sorry.
 But I doubt if the kernel module is really needed! Without a
 clear explanation why this is needed on their homepage I don't agree.

This has two parts: one questioning why our loadable module is needed at all, 
and the other relating to its availability for various kernels and Linux 
distros.

1. Why is the ZST (which includes a loadable module) needed for Zing to operate?

One of the Zing JVM's main distinctions is that its C4 garbage collector (aka 
GPGC internally) eliminates garbage collection as a response-time concern for 
enterprise applications. Among other things, C4 relies on rapid manipulation of 
virtual memory and physical memory mappings to maintain continuous operation. 
While the semantics of the manipulations we do are possible using the vanilla 
mmap/mremap/munmap/madvise APIs, the rate at which those operations are 
supported in Linux (and most other OSs) is extremely low, due mostly to the 
historically very conservative approach to in-process TLB invalidation, and due 
partly to issues with multiple page-size manipulations. We're not talking small 
change here: more like 4-6 orders of magnitude for our common operations, 
which is, right now, the difference between a practical and an impractical 
implementation of C4.
You can find a detailed discussion of the difference in metrics for these 
operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in 
our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site 
http://tinyurl.com/7rydpvo).
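
To make the vanilla-API path concrete, below is a minimal sketch in C (an 
illustration only, not Zing code) of the kind of mmap/mremap/madvise page 
shuffling a C4-style collector would have to issue at a very high rate if it 
relied solely on the stock kernel interfaces; each call can force cross-CPU TLB 
invalidation, which is where those 4-6 orders of magnitude go. The 2MB region 
size and the overall flow are illustrative assumptions.

{noformat}
/* Hedged illustration only -- not Azul/Zing source. Shows the stock
 * mmap/mremap/madvise sequence a concurrent compacting collector would
 * otherwise depend on; each step can trigger a cross-CPU TLB shootdown. */
#define _GNU_SOURCE                  /* for mremap() / MREMAP_MAYMOVE on glibc */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define REGION (2UL * 1024 * 1024)   /* illustrative 2MB chunk */

int main(void) {
    /* Reserve and populate a region standing in for a GC "from" page batch. */
    void *from = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (from == MAP_FAILED) { perror("mmap"); return 1; }
    memset(from, 0xAB, REGION);

    /* Move the mapping to a new virtual address, as relocation/remapping
     * would; the kernel must invalidate stale TLB entries on all CPUs. */
    void *to = mremap(from, REGION, REGION, MREMAP_MAYMOVE);
    if (to == MAP_FAILED) { perror("mremap"); return 1; }
    printf("remapped %p -> %p\n", from, to);

    /* Release the physical pages behind the region -- another TLB-heavy step. */
    if (madvise(to, REGION, MADV_DONTNEED) != 0) perror("madvise");
    munmap(to, REGION);
    return 0;
}
{noformat}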
   
2. Loadable Module availability and compatibility

To be clear, our loadable module is open source, under GPLv2, and you can have 
the sources for it if you wish. The reason for the current choice of packaging 
is that a wide range of current end customers' Linux systems do not have (or 
wish to install) the tooling needed to build or rebuild the module; what they 
need operationally is an RPM that opens and installs without requiring kernel 
headers and the like. In addition, we tend to test and examine the kernel 
module intensively against specific distros and kernels to verify compatibility 
and stability, and we declare official support for these well-tested 
combinations.

On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is 
fairly slow, and the kernel API signatures tend to remain the same unless 
semantics are actually modified. As a result, we use a single module RPM across 
RHEL 5 and CentOS 5 versions, and have needed only a single rev of the module 
packaging during the evolution of RHEL 6/CentOS 6 and SLES 11 thus far.

As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

So we're working on it, and it will get better...


 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475643#comment-13475643
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 4:09 PM:


We're looking into this bug report. Will hopefully report back / resolve it 
soon. [But Michael, please go ahead and report it on our bugzilla as well per 
the above].

[Uwe Schindler wrote:]
 I would run Zing tests, too, but before doing that they should:
 Not rely on strange binary kernel modules that are outdated on
 Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I
 will never ever run it with outdated kernels. They should (if
 they really need a kernel module, which is in my opinion a no-go,
 too) use DKMS and make the kernel module open source, so my kernel
 is also not tainted. Without that I will not support Zing, sorry.
 But I doubt if the kernel module is really needed! Without a
 clear explanation why this is needed on their homepage I don't agree.

This has two parts: one questioning why our loadable module is needed at all, 
and the other relating to its availability for various kernels and Linux 
distros.

1. Why is the ZST (which includes a loadable module) needed for Zing to operate?

One of the Zing JVM's main distinctions is that its C4 garbage collector (aka 
GPGC internally) eliminates garbage collection as a response-time concern for 
enterprise applications. Among other things, C4 relies on rapid manipulation of 
virtual memory and physical memory mappings to maintain continuous operation. 
While the semantics of the manipulations we do are possible using the vanilla 
mmap/mremap/munmap/madvise APIs, the rate at which those operations are 
supported in Linux (and most other OSs) is extremely low, due mostly to the 
historically very conservative approach to in-process TLB invalidation, and due 
partly to issues with multiple page-size manipulations. We're not talking small 
change here: more like 4-6 orders of magnitude for our common operations, 
which is, right now, the difference between a practical and an impractical 
implementation of C4.
You can find a detailed discussion of the difference in metrics for these 
operations at http://tinyurl.com/34ytcvc, and a detailed discussion of C4 in 
our ISMM paper (http://tinyurl.com/94c9btb at the ACM site, or at the Azul site 
http://tinyurl.com/7rydpvo).
   
2. Loadable Module availability and compatibility

To be clear, our loadable module is open source, under GPLv2, and you can have 
the sources for it if you wish. The reason for the current choice of packaging 
is that a wide range of current end customers' Linux systems do not have (or 
wish to install) the tooling needed to build or rebuild the module; what they 
need operationally is an RPM that opens and installs without requiring kernel 
headers and the like. In addition, we tend to test and examine the kernel 
module intensively against specific distros and kernels to verify compatibility 
and stability, and we declare official support for these well-tested 
combinations.

On other Linux distros (RHEL, CentOS, SLES), the kernel revision velocity is 
fairly slow, and the kernel API signatures tend to remain the same unless 
semantics are actually modified. As a result, we use a single module RPM across 
RHEL 5 and CentOS 5 versions, and have needed only a single rev of the module 
packaging during the evolution of RHEL 6/CentOS 6 and SLES 11 thus far.

As we added Zing support for Ubuntu, primarily due to its popularity with 
developers, we found that kernel API signatures there change with practically 
every patch, even with no semantic change. This creates some serious friction 
with our current loadable module packaging and distribution choice for Ubuntu. 
We are working to resolve this, either by using DKMS or some other alternative, 
such that modules can continue to work or be properly updated as kernels rev up 
in Ubuntu-style distros.

So we're working on it, and it will get better...

-- Gil. [CTO, Azul Systems]


  was (Author: giltene):
We're looking into this bug report. Will hopefully report back / resolve it 
soon. [But Michael, please go ahead and report it on our bugzilla as well per 
the above].

[Uwe Schindler wrote:]
 I would run Zing tests, too, but before doing that they should:
 Not rely on strange binary kernel modules that are outdated on
 Ubuntu 12.04.1 LTS. The Jenkins server is running in DMZ so I
 will never ever run it with outdated kernels. They should (if
 they really need a kernel module, which is in my opinion a no-go,
 too) use DKMS and make the kernel module open source, so my kernel
 is also not tainted. Without that I will not support Zing, sorry.
 But I doubt if the kernel module is really needed! Without a
 clear explanation why this is needed on their homepage I don't agree.

This has two parts: One asking/questioning why 

[jira] [Commented] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene commented on LUCENE-4482:
--

bq.
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

bq.
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...
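
As a rough, hedged illustration of the 2MB-mapping capability in question (a 
minimal C probe, not an Azul diagnostic), the sketch below simply asks the 
kernel for an explicit MAP_HUGETLB mapping and touches it. On most hosts 
without working 2MB page support, or with no huge pages reserved via 
/proc/sys/vm/nr_hugepages, the mmap call just fails; as noted above, some 
broken paravirt setups can misbehave far more severely.

{noformat}
/* Hedged illustration only -- not an Azul tool. Probes whether this
 * kernel/hypervisor combination will grant an explicit 2MB huge-page
 * mapping, roughly the capability referred to above.
 * Typically requires reserved huge pages, e.g.:
 *   echo 8 > /proc/sys/vm/nr_hugepages
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_HUGETLB
#define MAP_HUGETLB 0x40000           /* Linux/x86 value; missing in old headers */
#endif

#define HUGE_2MB (2UL * 1024 * 1024)

int main(void) {
    void *p = mmap(NULL, HUGE_2MB, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");  /* e.g. ENOMEM when no huge pages reserved */
        return 1;
    }
    memset(p, 0, HUGE_2MB);           /* fault the huge page in */
    puts("explicit 2MB huge-page mapping succeeded");
    munmap(p, HUGE_2MB);
    return 0;
}
{noformat}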


 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | TestPayloadNearQuery.test 
 Throwable #1: java.lang.RuntimeException: overridden idfExplain method 
 in TestPayloadNearQuery.BoostingSimilarity was not called
  at 
 __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
  at 
 org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
  at org.apache.lucene.search.spans.SpanWeight.init(SpanWeight.java:62)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.init(PayloadNearQuery.java:147)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
  at 
 org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
  at 
 org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
  at 
 org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at 
 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:20 PM:


{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
bq.
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

bq.
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:22 PM:


{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: 

[jira] [Comment Edited] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-10-13 Thread Gil Tene (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475666#comment-13475666
 ] 

Gil Tene edited comment on LUCENE-4482 at 10/13/12 5:23 PM:


{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work, or from not having to leave the process to look something up.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...


  was (Author: giltene):
{quote}
Maybe it would be a good idea to provide both C4 - Memory Management layers, 
so also for plain kernels (as configuration option to the JVM like huge pages 
in Oracle's). Or is your VM then only as fast as Oracle's?
{quote}

It's not so much a matter of speed as it is a matter of pause time. Zing is not 
faster than Oracle's JVM; it's just as fast, but without those pesky pauses. 
It's those pauses that keep people from using anything more than a tiny amount 
of memory in Java these days (to me, tiny means a small fraction of a commodity 
$4K server). With the ability to practically (i.e. without completely stopping 
for many seconds at a time once in a while) use the nice, cheap memory we now 
have in servers comes another form of speed - the kind that comes from not 
repeating work.

A 4-to-6-order-of-magnitude difference in pause time and in sustainable 
allocation rate is so big that a C4 that uses the vanilla memory management 
APIs would be unusable at this point. Think of the difference between a 20 usec 
phase shift and a 20 second pause...

{quote}
...As VirtualBOX's module has similar use-cases like yours for virtualization, 
I hope yours does not conflict with that one.
{quote}

We don't test on VirtualBox, so I don't know for sure. In general, Zing works 
fine when run on top of hypervisors that fully support things like 2MB page 
mappings (the same sort of support needed for the hugetlb feature to work). 
Unfortunately, there are some hypervisors out there (e.g. some versions of Xen 
for paravirt guests) that don't support that and will crash a vanilla Linux 
kernel trying to use hugetlb. Zing won't work in such cases either, and for the 
same reasons...

  
 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in