And what is the rows parameter?

On Apr 12, 2017, at 21:32, "Chetas Joshi" <chetas.jo...@gmail.com>
wrote:

> Thanks for your response Shawn and Wunder.
>
> Hi Shawn,
>
> Here is the system config:
>
> Total system memory = 512 GB
> each server handles two 500 MB cores
> Number of solr docs per 500 MB core = 200 MM
>
> The average heap usage is around 4-6 GB. When the read using the cursor
> approach starts, the heap usage starts climbing, with the base of the
> sawtooth at 8 GB and peaks shooting up to 17 GB. Even after a full GC, the
> heap usage remains around 15 GB before it comes down to 8 GB.
>
> With 100K docs, the memory requirement should only be on the order of MBs,
> so it is strange that it jumps from 8 GB to 17 GB while preparing the
> sorted response.
>
> Thanks!
>
>
>
> On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood <wun...@wunderwood.org>
> wrote:
>
> > JVM version? We’re running v8 update 121 with the G1 collector and it is
> > working really well. We also have an 8GB heap.
> >
> > Graph your heap usage. You’ll see a sawtooth shape, where it grows, then
> > there is a major GC. The maximum of the base of the sawtooth is the
> > working set of heap that your Solr installation needs. Set the heap to
> > that value, plus a gigabyte or so. We run with a 2GB eden (new space)
> > because so much of Solr’s allocations have a lifetime of one request.
> > So, the base of the sawtooth, plus a gigabyte breathing room, plus two
> > more for eden. That should work.
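A worked example of that sizing recipe, using the numbers from earlier in
this thread: a sawtooth base of about 8 GB, plus 1 GB of breathing room,
plus 2 GB of eden comes to roughly an 11 GB heap. As a rough sketch only
(illustrative values, not a tested recommendation for this exact cluster),
the JVM startup options might look like:

    -Xms11g -Xmx11g -Xmn2g -XX:+UseConcMarkSweepGC    (CMS: heap plus an explicit new space)
    -Xms11g -Xmx11g -XX:+UseG1GC                      (G1: just the heap size)

The exact numbers should come from your own heap graph rather than from
this example.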
> >
> > I don’t set all the ratios and stuff. When we were running CMS, I set a
> > size for the heap and a size for the new space. Done. With G1, I don’t
> > even get that fussy.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >
> > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> > >
> > > On 4/11/2017 2:56 PM, Chetas Joshi wrote:
> > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr
> > >> collection with number of shards = 80 and replication factor = 2.
> > >>
> > >> Solr JVM heap size = 20 GB
> > >> solr.hdfs.blockcache.enabled = true
> > >> solr.hdfs.blockcache.direct.memory.allocation = true
> > >> MaxDirectMemorySize = 25 GB
> > >>
> > >> I am querying a solr collection with index size = 500 MB per core.
> > >
> > > I see that you and I have traded messages before on the list.
> > >
> > > How much total system memory is there per server?  How many of these
> > > 500MB cores are on each server?  How many docs are in a 500MB core?
> > > The answers to these questions may affect the other advice that I give
> > > you.
> > >
> > >> The off-heap (25 GB) is huge so that it can load the entire index.
> > >
> > > I still know very little about how HDFS handles caching and memory.
> > > You want to be sure that as much data as possible from your indexes
> > > is sitting in local memory on the server.
> > >
> > >> Using the cursor approach (number of rows = 100K), I read 2 fields
> > >> (total 40 bytes per Solr doc) from the Solr docs that satisfy the
> > >> query. The docs are sorted by "id" and then by those 2 fields.
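For reference, a minimal sketch of that kind of cursor-based read in SolrJ
(Java). The ZooKeeper address, collection name, and field names below are
placeholders, not the actual setup described in this thread:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    public class CursorRead {
        public static void main(String[] args) throws Exception {
            // Placeholder ZooKeeper ensemble and collection name.
            try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181/solr")) {
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(10000);                          // page size per cursor request
                q.setFields("id", "fieldA", "fieldB");     // the two stored fields plus the uniqueKey
                q.setSort(SolrQuery.SortClause.asc("id")); // cursor paging requires the uniqueKey in the sort

                String cursorMark = CursorMarkParams.CURSOR_MARK_START;
                boolean done = false;
                while (!done) {
                    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
                    QueryResponse rsp = client.query("myCollection", q);
                    for (SolrDocument doc : rsp.getResults()) {
                        // process doc.get("fieldA"), doc.get("fieldB")
                    }
                    String next = rsp.getNextCursorMark();
                    done = cursorMark.equals(next);        // the same mark back means no more results
                    cursorMark = next;
                }
            }
        }
    }

Each iteration only ever holds one page of results in memory on the client,
and a smaller rows value also keeps each per-request allocation on the Solr
side smaller.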
> > >>
> > >> I am not able to understand why the heap memory is getting full and
> > >> full GCs are running consecutively with long GC pauses (> 30 seconds).
> > >> I am using the CMS GC.
> > >
> > > A 20GB heap is quite large.  Do you actually need it to be that large?
> > > If you graph JVM heap usage over a long period of time, what are the
> > > low points in the graph?
> > >
> > > A result containing 100K docs is going to be pretty large, even with a
> > > limited number of fields.  It is likely to be several megabytes.  It
> > > will need to be entirely built in the heap memory before it is sent to
> > > the client -- both as Lucene data structures (which will probably be
> > > much larger than the actual response due to Java overhead) and as the
> > > actual response format.  Then it will be garbage as soon as the
> > > response is done.  Repeat this enough times, and you're going to go
> > > through even a 20GB heap pretty fast, and need a full GC.  Full GCs
> > > on a 20GB heap are slow.
> > >
> > > You could try switching to G1, as long as you realize that you're going
> > > against advice from Lucene experts.... but honestly, I do not expect
> > > this to really help, because you would probably still need full GCs due
> > > to the rate that garbage is being created.  If you do try it, I would
> > > strongly recommend the latest Java 8, either Oracle or OpenJDK.  Here's
> > > my wiki page where I discuss this:
> > >
> > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector
> > >
> > > Reducing the heap size (which may not be possible -- need to know the
> > > answer to the question about memory graphing) and reducing the number
> > > of rows per query are the only quick solutions I can think of.
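In the SolrJ sketch earlier in this thread, reducing the rows value just
means a smaller page size per request, for example:

    q.setRows(5000);   // smaller cursor pages mean smaller per-request allocations

The total number of documents read stays the same; the loop simply runs more
iterations, and each response (and the garbage it leaves behind) is much
smaller.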
> > >
> > > Thanks,
> > > Shawn
> > >
> >
> >
>
