For what it’s worth, I looked into reducing the allocation footprint of 
CollapsingQParserPlugin a bit, but without success. See 
https://issues.apache.org/jira/browse/SOLR-9125

As it happened, I was collapsing on a field with such high cardinality that 
the chances of a query doing much collapsing of interest were pretty low. That 
allowed me to use a vastly stripped-down version of CollapsingQParserPlugin 
with a *much* lower memory footprint, in exchange for collapsed document heads 
essentially being picked at random. (That is, when two documents collapse 
together, the one that gets returned is arbitrary.)
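
The core of it is just "first document seen for a collapse key wins", with no
score or sort comparisons and no per-segment scratch arrays. Stripped of the
Solr PostFilter/DelegatingCollector plumbing, the idea is roughly this (a
simplified, illustrative sketch, not the actual class; names are made up):

    import java.util.HashSet;
    import java.util.Set;

    /** Keeps only the first document seen for each collapse key. */
    class FirstHitCollapser {
      private final Set<Long> seenKeys = new HashSet<>();

      /**
       * @param collapseKey ordinal (or hash) of the collapse field value
       * @return true if this doc should be passed on, false if it collapses away
       */
      boolean keep(long collapseKey) {
        // add() returns false if the key was already present, i.e. a head
        // document for this group has already been emitted.
        return seenKeys.add(collapseKey);
      }
    }

In the real collector the key comes from the collapse field's doc values, and
"first seen" is arbitrary with respect to score, which is the trade-off
described above.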

If that’s of interest, I could probably throw the code someplace public.


On 6/16/16, 3:39 PM, "Cas Rusnov" <c...@manzama.com> wrote:

>Hey, thanks for your reply.
>
>Running the suggested CMS config from Shawn, we're getting some nodes with
>30+ second pauses, I gather due to the large heap. Interestingly, the
>scenario Jeff described is remarkably similar to ours (we also use field
>collapsing), including the performance aspects, and we are seeing concurrent
>mode failures due to both new-space allocation failures and promotion
>failures. I suspect there's a lot of garbage building up. We're going to run
>tests with field collapsing disabled and see if that makes a difference.
>
>Cas
>
>
>On Thu, Jun 16, 2016 at 1:08 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
>> Check your gc log for CMS “concurrent mode failure” messages.
>>
>> If a concurrent CMS collection fails, it does a stop-the-world pause while
>> it cleans up using a *single thread*. This means the stop-the-world CMS
>> collection in the failure case is typically several times slower than a
>> concurrent CMS collection. The single-thread business means it will also be
>> several times slower than the Parallel collector, which is probably what
>> you’re seeing. I understand that it needs to stop the world in this case,
>> but I really wish the CMS failure would fall back to a Parallel collector
>> run instead.
>>
>> The Parallel collector is always going to be the fastest at getting rid of
>> garbage, but only because it stops all the application threads while it
>> runs, so it’s got less complexity to deal with. That said, it’s probably
>> not going to be orders of magnitude faster than a (successfully) concurrent
>> CMS collection.
>>
>> Regardless, the bigger the heap, the bigger the pause.
>>
>> If your application is generating a lot of garbage, or can generate a lot
>> of garbage very suddenly, CMS concurrent mode failures are more likely. You
>> can turn down the -XX:CMSInitiatingOccupancyFraction value in order to
>> give the CMS collection more of a head start at the cost of more frequent
>> collections. If that doesn’t work, you can try using a bigger heap, but you
>> may eventually find yourself trying to figure out what about your query
>> load generates so much garbage (or causes garbage spikes) and trying to
>> address that. Even G1 won’t protect you from highly unpredictable garbage
>> generation rates.
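>>
>> Concretely, a CMS setup along these lines is a reasonable starting point
>> (illustrative values only, not a recommendation; tune the fraction and heap
>> sizes for your own load):
>>
>>   -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
>>   -XX:CMSInitiatingOccupancyFraction=70 \
>>   -XX:+UseCMSInitiatingOccupancyOnly
>>
>> With UseCMSInitiatingOccupancyOnly set, the concurrent cycle starts whenever
>> old-gen occupancy crosses the given fraction, so lowering the fraction
>> trades more frequent background collections for more headroom during
>> allocation spikes.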
>>
>> In my case, for example, I found that a very small subset of my queries
>> were using the CollapsingQParserPlugin, which requires quite a lot of memory
>> allocations, especially on a large index. Although generally this was fine,
>> if I got several of these rare queries in a very short window, it would
>> always spike enough garbage to cause CMS concurrent mode failures. The
>> single-threaded concurrent-mode failure would then take long enough that
>> the ZK heartbeat would fail, and things would just go downhill from there.
>>
>>
>>
>> On 6/15/16, 3:57 PM, "Cas Rusnov" <c...@manzama.com> wrote:
>>
>> >Hey Shawn! Thanks for replying.
>> >
>> >Yes, I meant HugePages, not HugeTable; brain fart. I will give turning
>> >transparent huge pages off a go.
>> >
>> >I have attempted to use your CMS configs as-is, and also the default
>> >settings, and the cluster dies under our load: a node will get a 35-60s
>> >GC STW, the others in the shard will take over its load, and they will in
>> >turn get long STWs until the shard dies. That is why, in a fit of
>> >desperation, I tried out ParallelGC and found it to be half-way
>> >acceptable. I will run a test using your configs (and the defaults) again
>> >just to be sure, since I'm certain the machine config has changed since we
>> >used your unaltered settings.
>> >
>> >Thanks!
>> >Cas
>> >
>> >
>> >On Wed, Jun 15, 2016 at 3:41 PM, Shawn Heisey <apa...@elyograg.org>
>> wrote:
>> >
>> >> On 6/15/2016 3:05 PM, Cas Rusnov wrote:
>> >> > After trying many of the off-the-shelf configurations (including CMS
>> >> > configurations, but excluding G1GC, since we're still taking the
>> >> > warnings about it seriously), plus numerous tweaks, rumors, various
>> >> > instance sizes, and all the rest, most of which resulted in frequent
>> >> > 30+ second STW GCs regardless of heap and newspace size, we settled on
>> >> > the following configuration. It leads to occasional high pauses but
>> >> > mostly stays between 10-20 second STWs every few minutes (which is
>> >> > almost acceptable):
>> >> >
>> >> > -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions
>> >> > -XX:+UseAdaptiveSizePolicy -XX:+UseLargePages -XX:+UseParallelGC
>> >> > -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=15000 -XX:MaxNewSize=12000m
>> >> > -XX:ParGCCardsPerStrideChunk=4096 -XX:ParallelGCThreads=16 -Xms31000m
>> >> > -Xmx31000m
>> >>
>> >> You mentioned something called "HugeTable" ... I assume you're talking
>> >> about huge pages.  If that's what you're talking about, have you also
>> >> turned off transparent huge pages?  If you haven't, you might want to
>> >> completely disable huge pages in your OS.  There's evidence that the
>> >> transparent option can affect performance.
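>> >>
>> >> On most Linux systems you can check and change the transparent setting
>> >> at runtime with something like the following (exact paths vary a bit by
>> >> distribution):
>> >>
>> >>   cat /sys/kernel/mm/transparent_hugepage/enabled
>> >>   echo never > /sys/kernel/mm/transparent_hugepage/enabled
>> >>   echo never > /sys/kernel/mm/transparent_hugepage/defrag
>> >>
>> >> and make the change persistent through your boot configuration so it
>> >> survives a reboot.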
>> >>
>> >> I assume you've probably looked at my GC info at the following URL:
>> >>
>> >> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr
>> >>
>> >> The parallel collector is most definitely not a good choice.  It does
>> >> not optimize for latency.  It's my understanding that it actually
>> >> prefers full GCs, because it is optimized for throughput.  Solr thrives
>> >> on good latency; throughput doesn't matter very much.
>> >>
>> >> If you want to continue avoiding G1, you should definitely be using
>> >> CMS.  My recommendation right now would be to try the G1 settings on my
>> >> wiki page under the heading "Current experiments" or the CMS settings
>> >> just below that.
>> >>
>> >> The out-of-the-box GC tuning included with Solr 6 is probably a better
>> >> option than the parallel collector you've got configured now.
>> >>
>> >> Thanks,
>> >> Shawn
>> >>
>> >>
>> >
>> >
>> >--
>> >Cas Rusnov, Engineer, Manzama <http://www.manzama.com>
>>
>>
>
>
>-- 
>Cas Rusnov, Engineer, Manzama <http://www.manzama.com>
