[jira] [Commented] (SOLR-15509) Issues to potentially improve JSON faceting and Stats performance.

Mark Robert Miller (Jira) Wed, 30 Jun 2021 13:33:10 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-15509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17372197#comment-17372197
 ]


Mark Robert Miller commented on SOLR-15509:
-------------------------------------------

Yonik did most of this pipeline, so there are not a lot of great “tweak this or 
that and watch the improvements fall out” paths here.

You can’t simply replace noggit with one of the fastest Java JSON parsers, 
there is not much broad “if we just did this, all these cases win” excitement 
to be found, and the parts are already pretty fast and well tested.

There is still a lot of case-specific work available, but those are the types 
of things you go after with high energy when solving a particular problem.

It’s hard to target the JSON faceting path directly and broadly, so the 
approach here is a bit blunt, but it’s also fairly broad in terms of intended 
effect.

There are basically 3 motivations.

* Allowing minor changes in config or behavior that target more of a “decent 
hardware, larger scale issues starting to take priority over raw, top 
concurrent user, performance issues” situation.

* Some targeted and simple garbage management / efficiency options, e.g. 
reducing the top GC hotspots on the hot path.

* The ability to configure a less hands-on approach to keeping Java and Solr 
happy on servers that often have more than plenty of resources available. The 
principal reasons are to offer high-scale optimizations/trade-offs around 
small tight loops over large data objects, the cost of creating and discarding 
those objects, heap babysitting and crisis management, etc. - as well as to 
offer a path towards “don’t make me play around with the heap, don’t make me 
micromanage this as we scale (as much as you can), tell me a reasonable setting 
to use and then just use the damn hardware available.”

There may be some worthwhile, more targeted efforts around the near-real-time 
(NRT) per-segment trade-offs. Those trade-offs often end up charging many users 
who would be better served by coming at NRT from the other direction (starting 
with almost nothing favoring NRT) and stopping when they hit a match for their 
requirements.

I’ve seen this JSON faceting hot path run much faster, but that came from a 
large accumulation of changes, generally not directly related. So some of these 
items are an attempt to capture some of that, but they are not intended solely 
for performance here. Offheap is more important for offering a more 
predictable, more scale-friendly setup, but with some other targeted changes 
and options, it can also have a fairly positive impact on GC interactions, 
footprint, etc.
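
As a rough illustration of the offheap idea only (this is not Solr's code, and 
the class name OffHeapCounts and its methods are hypothetical), a per-bucket 
counter could keep its counts in a direct buffer so the data sits outside the 
Java heap and can be reset and reused instead of being reallocated per request:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

/** Hypothetical per-bucket counter whose counts are stored off the Java heap. */
final class OffHeapCounts {
    private final IntBuffer counts;

    OffHeapCounts(int numBuckets) {
        // allocateDirect reserves memory outside the Java heap; it starts zeroed.
        this.counts = ByteBuffer.allocateDirect(numBuckets * Integer.BYTES)
                .order(ByteOrder.nativeOrder())
                .asIntBuffer();
    }

    /** Adds one to the count for the given bucket. */
    void increment(int bucket) {
        counts.put(bucket, counts.get(bucket) + 1);
    }

    int get(int bucket) {
        return counts.get(bucket);
    }

    /** Zeroes all counts so the same buffer can serve the next request. */
    void reset() {
        for (int i = 0; i < counts.capacity(); i++) {
            counts.put(i, 0);
        }
    }

    /** True when the backing memory is a direct (off-heap) buffer. */
    boolean isOffHeap() {
        return counts.isDirect();
    }
}
```

Whether something like this helps in practice depends on access patterns and 
JVM settings; the point is only that the count data is no longer heap garbage.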

> Issues to potentially improve JSON faceting and Stats performance.
> ------------------------------------------------------------------
>
>                 Key: SOLR-15509
>                 URL: https://issues.apache.org/jira/browse/SOLR-15509
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mark Robert Miller
>            Priority: Minor
>              Labels: RobustSQL
>




