[
https://issues.apache.org/jira/browse/LUCENE-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16828039#comment-16828039
]
Uwe Schindler edited comment on LUCENE-8780 at 4/28/19 3:32 PM:
----------------------------------------------------------------
These are the results after 20 runs of wikimediumall with 6 searcher threads (with
ParallelGC) on Mike's lucenebench:
{noformat}
use java command /home/jenkins/tools/java/64bit/jdk-11.0.2/bin/java -server
-Xms2g -Xmx2g -XX:+UseParallelGC -Xbatch
JAVA:
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
OS:
Linux serv1.sd-datasolutions.de 4.18.0-17-generic #18~18.04.1-Ubuntu SMP Fri
Mar 15 15:27:12 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[...]
Report after iter 19:
Task                    QPS orig   StdDev  QPS patch   StdDev   Pct diff
IntNRQ                     30.88   (0.6%)      26.33   (0.8%)   -14.7% ( -16% - -13%)
PKLookup                  107.70   (2.7%)      94.31   (2.9%)   -12.4% ( -17% - -7%)
AndHighHigh                10.76  (11.5%)      10.17   (3.3%)    -5.4% ( -18% - 10%)
Fuzzy2                     45.10   (7.7%)      43.21   (9.0%)    -4.2% ( -19% - 13%)
LowSloppyPhrase             7.28  (16.8%)       6.98   (6.3%)    -4.2% ( -23% - 22%)
OrHighNotLow              783.24   (7.1%)     751.37   (2.5%)    -4.1% ( -12% - 5%)
OrHighNotHigh             934.39   (6.5%)     896.38   (2.1%)    -4.1% ( -11% - 4%)
Respell                    45.36  (10.6%)      43.65   (7.0%)    -3.8% ( -19% - 15%)
OrNotHighHigh             779.95   (3.8%)     752.28   (1.8%)    -3.5% ( -8% - 2%)
HighSloppyPhrase           10.37  (12.8%)      10.03   (3.5%)    -3.3% ( -17% - 14%)
LowPhrase                  11.60   (8.9%)      11.23   (1.7%)    -3.2% ( -12% - 8%)
LowTerm                   1694.00   (8.9%)    1642.34   (5.5%)    -3.0% ( -16% - 12%)
MedTerm                   1292.82   (9.3%)    1253.69   (8.2%)    -3.0% ( -18% - 15%)
AndHighMed                 71.41   (9.9%)      69.77   (7.5%)    -2.3% ( -17% - 16%)
OrNotHighMed              634.32   (7.2%)     620.67   (7.5%)    -2.2% ( -15% - 13%)
Prefix3                   110.65  (14.9%)     108.55   (8.7%)    -1.9% ( -22% - 25%)
OrHighLow                 347.02   (4.3%)     340.51   (9.9%)    -1.9% ( -15% - 12%)
OrNotHighLow              591.61   (5.5%)     580.60   (9.0%)    -1.9% ( -15% - 13%)
OrHighNotMed              1258.21   (1.8%)    1237.28   (5.0%)    -1.7% ( -8% - 5%)
Fuzzy1                     91.79   (4.3%)      90.77  (11.1%)    -1.1% ( -15% - 14%)
OrHighMed                  10.29   (7.9%)      10.25  (11.8%)    -0.4% ( -18% - 20%)
Wildcard                   52.28   (6.3%)      52.21   (6.8%)    -0.1% ( -12% - 13%)
OrHighHigh                  8.16   (6.9%)       8.22   (9.3%)     0.8% ( -14% - 18%)
AndHighLow                563.89   (9.1%)     569.31  (15.3%)     1.0% ( -21% - 27%)
HighPhrase                 15.88   (9.3%)      16.04  (13.0%)     1.0% ( -19% - 25%)
MedPhrase                  14.84   (9.0%)      15.15  (12.8%)     2.1% ( -18% - 26%)
HighSpanNear                2.16   (9.8%)       2.21  (10.1%)     2.3% ( -16% - 24%)
MedSloppyPhrase            18.48  (15.4%)      18.96  (18.9%)     2.6% ( -27% - 43%)
MedSpanNear                17.75   (3.8%)      18.31  (10.0%)     3.1% ( -10% - 17%)
HighTerm                  1031.00   (9.9%)    1068.12  (17.1%)     3.6% ( -21% - 33%)
LowSpanNear                 8.22   (5.5%)       8.53  (13.3%)     3.7% ( -14% - 23%)
HighTermDayOfYearSort       9.78  (11.0%)      10.25  (18.2%)     4.8% ( -21% - 38%)
HighTermMonthSort          23.40  (26.5%)      27.11  (32.1%)    15.9% ( -33% - 101%)
{noformat}
The total runtime of each run did not change: approx. 280s per run, both patched
and unpatched. Not sure how to interpret this.
> Improve ByteBufferGuard in Java 11
> ----------------------------------
>
> Key: LUCENE-8780
> URL: https://issues.apache.org/jira/browse/LUCENE-8780
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/store
> Affects Versions: master (9.0)
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Priority: Major
> Labels: Java11
> Attachments: LUCENE-8780.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In LUCENE-7409 we added {{ByteBufferGuard}} to protect MMapDirectory from
> crashing the JVM with SIGSEGV when you close and unmap the mmapped buffers of
> an IndexInput while another thread is still accessing it.
> The idea was to do a volatile write to flush the caches (triggering a full
> fence) and to set a non-volatile boolean to true. All accesses would then
> check the boolean and stop the caller from accessing the underlying ByteBuffer.
> This worked most of the time, until the JVM optimized away the plain read of
> the boolean (you can easily see this after our by-default-ignored testcase has
> been running for a while).
> With master on Java 11 we can improve the whole thing. Using VarHandles you
> can choose any access mode when reading or writing the boolean. After reading
> Doug Lea's explanation <http://gee.cs.oswego.edu/dl/html/j9mm.html> and some
> testing, I was no longer able to crash my JVM (even after running for minutes
> while unmapping bytebuffers).
> The approach is the same: we do a full-fenced write (a standard volatile
> write) when we unmap, then we yield the thread (to let in-flight reads in
> other threads finish), and then we unmap all byte buffers.
> On the read side, instead of using a plain read, we use the new "opaque
> read". Opaque reads are the same as plain reads; they only have different
> ordering requirements. The main difference is explained by Doug like this:
> "For example in constructions in which the only modification of some variable
> x is for one thread to write in Opaque (or stronger) mode,
> X.setOpaque(this, 1), any other thread spinning in
> while(X.getOpaque(this)!=1){} will eventually terminate. Note that this
> guarantee does NOT hold in Plain mode, in which spin loops may (and usually
> do) infinitely loop -- they are not required to notice that a write ever
> occurred in another thread if it was not seen on first encounter." And that's
> what we want here: we don't want to do volatile reads, but we want to prevent
> the compiler from optimizing away our read of the boolean, so we want it to
> "eventually" see the change (a short illustration follows below). Thanks to
> the much stronger volatile write, the cache effects should be visible even
> faster (like in our Java 8 approach; only the read side is improved now).
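> A minimal, self-contained illustration of that guarantee (hypothetical demo
> code, not part of the patch; class and member names are made up):
> {code:java}
> import java.lang.invoke.MethodHandles;
> import java.lang.invoke.VarHandle;
> 
> public class OpaqueSpinDemo {
>   private boolean stop; // only ever accessed through the VarHandle below
> 
>   private static final VarHandle STOP;
>   static {
>     try {
>       STOP = MethodHandles.lookup().findVarHandle(OpaqueSpinDemo.class, "stop", boolean.class);
>     } catch (ReflectiveOperationException e) {
>       throw new ExceptionInInitializerError(e);
>     }
>   }
> 
>   void spinUntilStopped() {
>     // A plain read of the field could be hoisted out of the loop and spin
>     // forever; an opaque read must eventually observe the write below.
>     while (!(boolean) STOP.getOpaque(this)) {
>       Thread.onSpinWait();
>     }
>   }
> 
>   void requestStop() {
>     STOP.setVolatile(this, true); // full-fence write, like a volatile store
>   }
> }
> {code}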
> The new code is much slimmer (theoretically we could also use an AtomicBoolean
> and its new {{getOpaque()}} method, but I wanted to avoid extra method calls,
> so I used a VarHandle directly).
> It's set up like this (a minimal sketch of the pattern follows the list):
> - The underlying boolean field is a private member (annotated with
> SuppressWarnings, as it looks unused to the Java compiler), marked as volatile
> (that's the recommendation, but in reality it does not matter at all).
> - We create a VarHandle to access this boolean; we never access the field
> directly (which is why the volatile marking does not affect us).
> - We use VarHandle.setVolatile() to change our "invalidated" boolean to
> "true", thereby enforcing a full fence.
> - On the read side we use VarHandle.getOpaque() instead of a plain read such
> as VarHandle.get() (which would behave like our old Java 8 code).
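> For illustration, a minimal sketch of that pattern (not the actual
> {{ByteBufferGuard}} source from the attached patch; names and the exception
> type are placeholders):
> {code:java}
> import java.lang.invoke.MethodHandles;
> import java.lang.invoke.VarHandle;
> 
> final class GuardSketch {
>   @SuppressWarnings("unused") // only ever accessed through the VarHandle below
>   private volatile boolean invalidated;
> 
>   private static final VarHandle INVALIDATED;
>   static {
>     try {
>       INVALIDATED = MethodHandles.lookup()
>           .findVarHandle(GuardSketch.class, "invalidated", boolean.class);
>     } catch (ReflectiveOperationException e) {
>       throw new ExceptionInInitializerError(e);
>     }
>   }
> 
>   /** Called when the input is closed, before unmapping the buffers. */
>   void invalidateAndUnmap(Runnable unmapper) {
>     INVALIDATED.setVolatile(this, true); // volatile write => full fence
>     Thread.yield();                      // let in-flight reads in other threads finish
>     unmapper.run();                      // now actually unmap the ByteBuffers
>   }
> 
>   /** Called before every access to the underlying ByteBuffer. */
>   void ensureValid() {
>     // Opaque read: as cheap as a plain read, but not allowed to be optimized
>     // away completely, so it eventually observes the volatile write above.
>     if ((boolean) INVALIDATED.getOpaque(this)) {
>       throw new IllegalStateException("already unmapped"); // Lucene would throw AlreadyClosedException
>     }
>   }
> }
> {code}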
> I had to tune our test a bit, as the VarHandles make it take longer until it
> actually crashes (the optimizations kick in later). I also used a random value
> for the reads to prevent the optimizer from removing all the bytebuffer reads.
> When we commit this, we can disable the test again (it takes approx 50 secs
> on my machine).
> I'd still like to see the difference between the plain read and the opaque
> read in production, so maybe [~mikemccand] or [~rcmuir] can do a comparison
> with the nightly benchmarker?
> Have fun; maybe [~dweiss] has some ideas, too.