+1 for Eclipse MAT. YourKit is another option. Heap dumps are invaluable, but a pain. If you're just interested in overall heap and GC analysis, I use GCViewer, which usually tells you all you need to know. I do heap dumps when there are large deviations from expectations and it is not obvious why.
Greg

On Fri, Jun 7, 2019 at 11:30 AM John Davis <johndavis925...@gmail.com> wrote:

> What would be the best way to understand where heap is being used?
>
> On Tue, Jun 4, 2019 at 9:31 PM Greg Harris <harrisgre...@gmail.com> wrote:
>
> > Just a couple of points I'd make here. I did some testing a while back in
> > which, if no commit is made (hard or soft), there are internal memory
> > structures holding tlogs, and it will continue to get worse the more docs
> > that come in. I don't know if that's changed in later versions. I'd
> > recommend doing commits with some amount of frequency in indexing-heavy
> > apps, otherwise you are likely to have heap issues. I personally would
> > advocate for some of the points already made. There are too many variables
> > going on here, and too many ways to modify stuff, to make sizing decisions
> > and think you're doing anything other than a pure guess if you don't test
> > and monitor. I'd advocate for a process in which testing is done regularly
> > to figure out questions like number of shards/replicas, heap size, memory,
> > etc. Hard data, a good process and regular testing will trump guesswork
> > every time.
> >
> > Greg
> >
> > On Tue, Jun 4, 2019 at 9:22 AM John Davis <johndavis925...@gmail.com> wrote:
> >
> > > You might want to test with a soft commit of hours vs. 5m for heavy
> > > indexing + light query -- even though there is internal memory structure
> > > overhead for no soft commits, in our testing a 5m soft commit (via
> > > commitWithin) has resulted in very large heap usage, which I suspect is
> > > because of other overhead associated with it.
> > >
> > > On Tue, Jun 4, 2019 at 8:03 AM Erick Erickson <erickerick...@gmail.com> wrote:
> > >
> > > > I need to update that; I didn't understand the bits about retaining
> > > > internal memory structures at the time.
> > > >
> > > > > On Jun 4, 2019, at 2:10 AM, John Davis <johndavis925...@gmail.com> wrote:
> > > > >
> > > > > Erick - These conflict, what's changed?
> > > > >
> > > > > So if I were going to recommend settings, they'd be something like this:
> > > > > Do a hard commit with openSearcher=false every 60 seconds.
> > > > > Do a soft commit every 5 minutes.
> > > > >
> > > > > vs
> > > > >
> > > > > Index-heavy, Query-light
> > > > > Set your soft commit interval quite long, up to the maximum latency
> > > > > you can stand for documents to be visible. This could be just a couple
> > > > > of minutes or much longer. Maybe even hours, with the capability of
> > > > > issuing a hard commit (openSearcher=true) or soft commit on demand.
> > > > >
> > > > > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > > > >
> > > > > On Sun, Jun 2, 2019 at 8:58 PM Erick Erickson <erickerick...@gmail.com> wrote:
> > > > >
> > > > > > > I've looked through SolrJ, DIH and others -- is the bottom line
> > > > > > > across all of them to "batch updates" and not commit as long as possible?
> > > > > >
> > > > > > Of course it's more complicated than that ;)...
> > > > > >
> > > > > > But to start, yes, I urge you to batch. Here are some stats:
> > > > > > https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/
> > > > > >
> > > > > > Note that at about 100 docs/batch you hit diminishing returns.
> > > > > > _However_, that test was run on a single-shard collection, so if you
> > > > > > have 10 shards you'd have to send 1,000 docs/batch. I wouldn't sweat
> > > > > > that number much, just don't send one at a time. And there are the
> > > > > > usual gotchas if your documents are 1M vs. 1K.
> > > > > >
> > > > > > About committing: no, don't hold off as long as possible. When you
> > > > > > commit, segments are merged. _However_, the default 100M internal
> > > > > > buffer size means that segments are written anyway, even if you don't
> > > > > > hit a commit point, once you have 100M of index data, and merges
> > > > > > happen anyway. So you won't save anything on merging by holding off
> > > > > > commits, and you'll incur penalties. Here's more than you want to know
> > > > > > about commits:
> > > > > >
> > > > > > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > > > > >
> > > > > > But some key take-aways... If for some reason Solr terminates
> > > > > > abnormally, the documents accumulated since the last hard commit are
> > > > > > replayed. So say you don't commit for an hour of furious indexing and
> > > > > > someone does a "kill -9". When you restart Solr it'll try to re-index
> > > > > > all the docs from the last hour. Hard commits with openSearcher=false
> > > > > > aren't all that expensive. I usually set mine for a minute and forget
> > > > > > about it.
> > > > > >
> > > > > > Transaction logs hold a window, _not_ the entire set of operations
> > > > > > since time began. When you do a hard commit, the current tlog is
> > > > > > closed, a new one is opened, and ones that are "too old" are deleted.
> > > > > > If you never commit, you have a huge transaction log to no good purpose.
> > > > > >
> > > > > > Also, while indexing, in order to accommodate "Real Time Get", all
> > > > > > the docs indexed since the last searcher was opened have a pointer
> > > > > > kept in memory. So if you _never_ open a new searcher, that internal
> > > > > > structure can get quite large. So in bulk-indexing operations, I
> > > > > > suggest you open a searcher every so often.
> > > > > >
> > > > > > Opening a new searcher isn't terribly expensive if you have no
> > > > > > autowarming going on. Autowarming is defined in solrconfig.xml on the
> > > > > > filterCache, queryResultCache, etc.
> > > > > >
> > > > > > So if I were going to recommend settings, they'd be something like this:
> > > > > > Do a hard commit with openSearcher=false every 60 seconds.
> > > > > > Do a soft commit every 5 minutes.
> > > > > >
> > > > > > I'd actually be surprised if you were able to measure differences
> > > > > > between those settings and just a hard commit with openSearcher=true
> > > > > > every 60 seconds and a soft commit at -1 (never)...
> > > > > >
> > > > > > Best,
> > > > > > Erick
> > > > > >
> > > > > > > On Jun 2, 2019, at 3:35 PM, John Davis <johndavis925...@gmail.com> wrote:
> > > > > > >
> > > > > > > If we assume there is no query load, then effectively this boils
> > > > > > > down to the most effective way of adding a large number of documents
> > > > > > > to the Solr index. I've looked through SolrJ, DIH and others -- is
> > > > > > > the bottom line across all of them to "batch updates" and not commit
> > > > > > > as long as possible?
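Not from the original thread, but as a concrete illustration of the batching and
commit advice Erick gives above, here is a minimal SolrJ sketch. The URL,
collection name, field names, batch size and commitWithin value are made-up
placeholders, and it assumes hard commits are left to an autoCommit section in
solrconfig.xml with openSearcher=false, so the client never calls commit() itself.

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
        public static void main(String[] args) throws Exception {
            // Base URL and collection name are hypothetical.
            try (SolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {

                List<SolrInputDocument> batch = new ArrayList<>();
                for (int i = 0; i < 100_000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    doc.addField("title_txt", "document " + i);
                    batch.add(doc);

                    // Send a few hundred docs per request instead of one add() per doc.
                    if (batch.size() == 500) {
                        // commitWithin of 5 minutes bounds visibility; no explicit
                        // commit() calls -- hard commits come from autoCommit.
                        client.add(batch, 300_000);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    client.add(batch, 300_000);
                }
            }
        }
    }

Per Erick's note, with 10 shards you would scale the batch size up accordingly, and
tune the commitWithin interval against how quickly documents need to be visible.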
> > > > > > >
> > > > > > > On Sun, Jun 2, 2019 at 7:44 AM Erick Erickson <erickerick...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Oh, there are about a zillion reasons ;).
> > > > > > > >
> > > > > > > > First of all, most tools that show heap usage also count uncollected
> > > > > > > > garbage, so your 10G could actually be much less "live" data. A quick
> > > > > > > > way to test is to attach jconsole to the running Solr and hit the
> > > > > > > > button that forces a full GC.
> > > > > > > >
> > > > > > > > Another way is to reduce your heap when you start Solr (on a test
> > > > > > > > system, of course) until bad stuff happens. If you reduce it to very
> > > > > > > > close to what Solr needs, you'll get slower as more and more cycles
> > > > > > > > are spent on GC; if you reduce it a little more you'll get OOMs.
> > > > > > > >
> > > > > > > > You can take heap dumps, of course, to see where all the memory is
> > > > > > > > being used, but that's tricky as it also includes garbage.
> > > > > > > >
> > > > > > > > I've seen cache sizes (filterCache in particular) be something that
> > > > > > > > uses lots of memory, but that requires queries to be fired. Each
> > > > > > > > filterCache entry can take up to roughly maxDoc/8 bytes + overhead...
> > > > > > > >
> > > > > > > > A classic error is to sort, group or facet on a docValues=false field.
> > > > > > > > Starting with Solr 7.6, you can add an option to fields to throw an
> > > > > > > > error if you do this, see: https://issues.apache.org/jira/browse/SOLR-12962
> > > > > > > >
> > > > > > > > In short, there's not enough information to tell until you dive in
> > > > > > > > and test bunches of stuff.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Erick
> > > > > > > >
> > > > > > > > > On Jun 2, 2019, at 2:22 AM, John Davis <johndavis925...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > This makes sense. Any ideas why Lucene/Solr will use a 10G heap for
> > > > > > > > > a 20G index? My hypothesis was that merging segments was trying to
> > > > > > > > > read it all, but if that's not the case I am out of ideas. The one
> > > > > > > > > caveat is we are trying to add the documents quickly (~1G an hour),
> > > > > > > > > but if Lucene does write 100M segments and does a streaming merge,
> > > > > > > > > it shouldn't matter?
> > > > > > > > >
> > > > > > > > > On Sat, Jun 1, 2019 at 9:24 AM Walter Underwood <wun...@wunderwood.org> wrote:
> > > > > > > > >
> > > > > > > > > > > On May 31, 2019, at 11:27 PM, John Davis <johndavis925...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > 2. Merging segments - does Solr load the entire segment in memory
> > > > > > > > > > > or chunks of it? If the latter, how large are these chunks?
> > > > > > > > > >
> > > > > > > > > > No, it does not read the entire segment into memory.
> > > > > > > > > >
> > > > > > > > > > A fundamental part of the Lucene design is streaming posting lists
> > > > > > > > > > into memory and processing them sequentially. The same amount of
> > > > > > > > > > memory is needed for small or large segments. Each posting list is
> > > > > > > > > > in document-id order. The merge is a merge of sorted lists, writing
> > > > > > > > > > a new posting list in document-id order.
> > > > > > > > > >
> > > > > > > > > > wunder
> > > > > > > > > > Walter Underwood
> > > > > > > > > > wun...@wunderwood.org
> > > > > > > > > > http://observer.wunderwood.org/ (my blog)
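An illustration added to this digest, not taken from the thread itself: Erick's
"roughly maxDoc/8 bytes + overhead" per filterCache entry is easy to turn into a
back-of-envelope estimate. The numbers below (a 50M-doc core and a 512-entry
filterCache) are invented for illustration; only the maxDoc/8 rule of thumb comes
from his mail above.

    public class FilterCacheEstimate {
        public static void main(String[] args) {
            long maxDoc = 50_000_000L;       // hypothetical core size
            long cacheEntries = 512;         // hypothetical filterCache size in solrconfig.xml
            long bytesPerEntry = maxDoc / 8; // one bit per document in the entry's bitset
            double mbPerEntry = bytesPerEntry / (1024.0 * 1024.0);
            double gbTotal = bytesPerEntry * cacheEntries / (1024.0 * 1024.0 * 1024.0);
            // Prints roughly: ~6.0 MB per entry, ~3.0 GB if the cache fills up
            System.out.printf("~%.1f MB per entry, ~%.1f GB if the cache fills up%n",
                    mbPerEntry, gbTotal);
        }
    }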
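Also added to this digest, not part of the thread: Walter's point that a segment
merge is a streaming merge of posting lists already in document-id order can be
sketched as a plain k-way merge over iterators -- memory use grows with the number
of lists, not their length. This is a toy illustration of the idea, not Lucene's
actual merge code (a real merge also remaps document ids, which is ignored here).

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;
    import java.util.PriorityQueue;

    public class PostingMergeSketch {

        /** Merge doc-id-ordered posting lists into one doc-id-ordered list, streaming. */
        static List<Integer> merge(List<List<Integer>> postings) {
            // The heap holds one cursor per input list, never a whole list.
            PriorityQueue<int[]> heap =
                    new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
            List<Iterator<Integer>> cursors = new ArrayList<>();
            for (int i = 0; i < postings.size(); i++) {
                Iterator<Integer> it = postings.get(i).iterator();
                cursors.add(it);
                if (it.hasNext()) heap.add(new int[] { it.next(), i });
            }
            List<Integer> merged = new ArrayList<>(); // stands in for the new segment's posting list
            while (!heap.isEmpty()) {
                int[] top = heap.poll();              // smallest remaining doc id across all lists
                merged.add(top[0]);
                Iterator<Integer> it = cursors.get(top[1]);
                if (it.hasNext()) heap.add(new int[] { it.next(), top[1] });
            }
            return merged;
        }

        public static void main(String[] args) {
            // Three tiny "posting lists", each already sorted by doc id.
            System.out.println(merge(Arrays.asList(
                    Arrays.asList(1, 4, 9),
                    Arrays.asList(2, 4, 7),
                    Arrays.asList(3, 8))));           // [1, 2, 3, 4, 4, 7, 8, 9]
        }
    }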