[
https://issues.apache.org/jira/browse/SOLR-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397032#comment-17397032
]
Mark Robert Miller commented on SOLR-15560:
-------------------------------------------
A quick crack at *{color:#ff8b00}JavaBinCodec#encode{color}*:
||Before Benchmark||(docCount)||Mode||Cnt||Score||Error||Units||
|JavaBinBasicPerf.encode|30|thrpt|5|{color:#00875a}157.581{color}|{color:#de350b}± 4.108{color}|ops/s|
||After Benchmark||(docCount)||Mode||Cnt||Score||Error||Units||
|JavaBinBasicPerf.encode|30|thrpt|5|{color:#00875a}674.324{color}|{color:#de350b}± 54.433{color}|ops/s|
!javabin.encode.before.and.after.summary.png!
!javabin.encode.before.and.after.compare.png!
FYI: this apparently breaks TestJavaBinCodec#testForwardCompat, so that will
still have to be resolved somehow.
> Optimize JavaBinCodec encode/decode performance.
> ------------------------------------------------
>
> Key: SOLR-15560
> URL: https://issues.apache.org/jira/browse/SOLR-15560
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Mark Robert Miller
> Assignee: Mark Robert Miller
> Priority: Minor
> Attachments: javabin.decode.1.before.json,
> javabin.decode.2.after.json, javabin.decode.before.and.after.compare.png,
> javabin.decode.before.and.after.summary.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> JavaBin performance can have a significant impact on search-side scatter/gather
> and especially the /export handler.
> It turns out that JavaBin's large switch, which dispatches based on the value's
> type, is a hot spot that is too large to be inlined.
> You can pull some of the less common paths out into another method to address this.
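The technique described above can be sketched as follows. This is a minimal, hypothetical illustration of splitting a dispatch method, not Solr's actual JavaBinCodec code (all names and the toy string encoding are invented for the example): the common types stay in a small method the JIT can inline, while the long tail of rarer types moves into a separate method.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the real JavaBinCodec): keep the hot
// dispatch method's bytecode small enough to be inlined, and push
// the rarely-taken type checks into a separate, larger method.
public class DispatchSketch {

    static String encode(Object val) {
        // the handful of most common types stays in the small hot method
        if (val instanceof String)  return "STR:" + val;
        if (val instanceof Integer) return "INT:" + val;
        if (val instanceof Long)    return "LONG:" + val;
        // everything else goes through the big remainder method
        return encodeLessCommon(val);
    }

    // the large remainder: may itself exceed the inlining threshold,
    // but the hot path above no longer pays for its bytecode size
    static String encodeLessCommon(Object val) {
        if (val instanceof Map)  return "MAP[" + ((Map<?, ?>) val).size() + "]";
        if (val instanceof List) return "LIST[" + ((List<?>) val).size() + "]";
        if (val == null)         return "NULL";
        return "OBJ:" + val.getClass().getSimpleName();
    }
}
```

The behavior is unchanged; only the bytecode size of the method on the hot path shrinks, which is what matters to the JIT's inlining decision.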
> I have not benchmarked this yet, and it's possible other bottlenecks may dampen
> the win, but I noticed the following on the ref branch (with a couple of other
> optimizations that were not nearly as wide-reaching or quite as hot):
> When you run the tests, you get the best results in "client" mode, i.e. when you
> prevent the C2 compiler from kicking in. Say I can run the core
> nightly tests serially on my laptop in about 8 minutes with C1; C2 might
> take another 2 to 3 minutes on top. This is because the work C2 does
> optimizing, compiling, and deoptimizing such a diverse workload ends up being
> the dominant performance drag.
> With a bit of key optimization here, running the tests with C2 ends up about
> on par with stopping at C1, even though C2 still dominates everything else.
> That's a pretty impactful win, to be able to move the needle like that.
> Why such a win on C2 without C1 also jumping forward? It's much more
> manageable to reduce the bytecode of a non-inlined hot method below C2's
> size threshold for inlining than below C1's.
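The C1-only runs and the inlining thresholds mentioned above can be inspected with standard HotSpot flags. The flags are real JVM options; the benchmark jar name is just a placeholder, so treat this as a config sketch rather than the exact commands used:

```shell
# Stop tiered compilation at C1 (the "client"-style mode described above)
java -XX:TieredStopAtLevel=1 -jar benchmarks.jar JavaBinBasicPerf.encode

# Print inlining decisions; look for "too large" / "hot method too big"
# messages to spot oversized dispatch methods
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -jar benchmarks.jar

# The bytecode-size inlining thresholds in play (HotSpot defaults:
# MaxInlineSize=35, FreqInlineSize=325 bytecodes)
java -XX:+PrintFlagsFinal -version | grep -E "MaxInlineSize|FreqInlineSize"
```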
> So this should be a decent win, I hope. There are a variety of differences
> that may outweigh it, though.
> * JavaBin on master has tail recursion.
> * It generates a tremendous number of byte arrays.
> * It converts between UTF-8 and UTF-16.
> * It does the encoding manually (the JVM can cheat).
> * It has a number of classes that extend it (vs. one here).
> * Lots of other things.
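One of the listed differences, the tremendous number of byte arrays, can be illustrated with a small sketch. This is not Solr's code; it is a hypothetical example of the general remedy, reusing one `CharsetEncoder` and a scratch buffer across writes instead of calling `String.getBytes(UTF_8)`, which allocates a fresh `byte[]` on every call:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: amortize UTF-8 encoding allocations by reusing
// an encoder and a growable scratch buffer, rather than producing a
// new byte[] per string (as String.getBytes(UTF_8) does).
public class Utf8Scratch {
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private ByteBuffer scratch = ByteBuffer.allocate(256);

    // Encodes s into the reused scratch buffer; returns bytes written.
    // The bytes sit in scratch[0..position) until the next call.
    int encode(String s) {
        scratch.clear();
        int worstCase = (int) (s.length() * (double) encoder.maxBytesPerChar());
        if (scratch.capacity() < worstCase) {
            scratch = ByteBuffer.allocate(worstCase); // grow only when needed
        }
        encoder.reset();
        encoder.encode(CharBuffer.wrap(s), scratch, true);
        encoder.flush(scratch);
        return scratch.position();
    }
}
```

The scratch buffer survives across calls, so steady-state encoding does no per-string allocation, which is exactly the kind of garbage-pressure difference the bullet list is pointing at.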
> I’m optimistic we can see some gain though.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)