Re: Solr Heap Usage

2019-06-07 Thread Greg Harris
+1 for Eclipse MAT. YourKit is another option. Heap dumps are invaluable but a pain. If you’re just interested in overall heap and GC analysis, I use gc-viewer, which is usually all you need to know. I do heap dumps when there are large deviations from expectations and it is not obvious why

RE: Solr Heap Usage

2019-06-07 Thread Markus Jelsma
Subject: Re: Solr Heap Usage > > What would be the best way to understand where heap is being used? > > On Tue, Jun 4, 2019 at 9:31 PM Greg Harris wrote: > > > Just a couple of points I’d make here. I did some testing a while back in > > which if no commit is made, (

Re: Solr Heap Usage

2019-06-07 Thread John Davis
What would be the best way to understand where heap is being used? On Tue, Jun 4, 2019 at 9:31 PM Greg Harris wrote: > Just a couple of points I’d make here. I did some testing a while back in > which if no commit is made, (hard or soft) there are internal memory > structures holding tlogs and

Re: Solr Heap Usage

2019-06-04 Thread Greg Harris
Just a couple of points I’d make here. I did some testing a while back in which if no commit is made, (hard or soft) there are internal memory structures holding tlogs and it will continue to get worse the more docs that come in. I don’t know if that’s changed in further versions. I’d recommend

Re: Solr Heap Usage

2019-06-04 Thread John Davis
You might want to test with softcommit of hours vs 5m for heavy indexing + light query -- even though there is internal memory structure overhead for no soft commits, in our testing a 5m soft commit (via commitWithin) has resulted in a very very large heap usage which I suspect is because of other

Re: Solr Heap Usage

2019-06-04 Thread Erick Erickson
I need to update that, didn’t understand the bits about retaining internal memory structures at the time. > On Jun 4, 2019, at 2:10 AM, John Davis wrote: > > Erick - These conflict, what's changed? > > So if I were going to recommend settings, they’d be something like this: > Do a hard commit

Re: Solr Heap Usage

2019-06-04 Thread John Davis
Erick - These conflict, what's changed? So if I were going to recommend settings, they’d be something like this: Do a hard commit with openSearcher=false every 60 seconds. Do a soft commit every 5 minutes. vs Index-heavy, Query-light Set your soft commit interval quite long, up to the maximum

Re: Solr Heap Usage

2019-06-03 Thread Shawn Heisey
On 6/2/2019 4:35 PM, John Davis wrote: If we assume there is no query load then effectively this boils down to most effective way for adding a large number of documents to the solr index. I've looked through SolrJ, DIH and others -- is the bottomline across all of them to "batch updates" and not

Re: Solr Heap Usage

2019-06-02 Thread Erick Erickson
> I've looked through SolrJ, DIH and others -- is the bottomline > across all of them to "batch updates" and not commit as long as possible? Of course it’s more complicated than that ;)…. But to start, yes, I urge you to batch. Here’s some stats:
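
A minimal sketch of the batching advice in this message, not the stats it goes on to cite. It only shows how to split documents into fixed-size batches; the class name `BatchSketch`, the method `partition`, and the batch size are hypothetical, and the actual send step (for example through SolrJ) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSketch {
    /**
     * Splits docs into consecutive batches of at most batchSize elements.
     * Hypothetical helper: each returned batch would be sent to Solr in a
     * single update request instead of one request per document.
     */
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        for (List<String> batch : partition(docs, 2)) {
            // Placeholder for the real update call; no commit is issued per batch.
            System.out.println("would send batch of " + batch.size() + ": " + batch);
        }
    }
}
```

The point of the design is that commits stay out of the indexing loop: batches are sent as they fill, and commit timing is left to the server-side commit settings discussed elsewhere in this thread.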

Re: Solr Heap Usage

2019-06-02 Thread John Davis
If we assume there is no query load then effectively this boils down to most effective way for adding a large number of documents to the solr index. I've looked through SolrJ, DIH and others -- is the bottomline across all of them to "batch updates" and not commit as long as possible? On Sun, Jun

Re: Solr Heap Usage

2019-06-02 Thread Erick Erickson
Oh, there are about a zillion reasons ;). First of all, most tools that show heap usage also count uncollected garbage. So your 10G could actually be much less “live” data. Quick way to test is to attach jconsole to the running Solr and hit the button that forces a full GC. Another way is to
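
The "used heap includes uncollected garbage" point can be illustrated with the standard `java.lang.management` API. This is only a sketch under assumptions: it measures the JVM it runs in, not a remote Solr process (for that you would attach over JMX, as jconsole does), the class name `LiveHeapCheck` is hypothetical, and a GC request is a hint that the JVM may not fully honor.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class LiveHeapCheck {
    /** Used heap in bytes as currently reported; may include uncollected garbage. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Requests a garbage collection, then reports used heap in bytes. */
    static long usedHeapAfterGcBytes() {
        // Equivalent to System.gc(); the JVM treats this as a request.
        ManagementFactory.getMemoryMXBean().gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        // Create short-lived garbage so the two readings can differ.
        for (int i = 0; i < 1_000; i++) {
            byte[] garbage = new byte[64 * 1024];
            garbage[0] = 1;
        }
        long before = usedHeapBytes();
        long after = usedHeapAfterGcBytes();
        System.out.println("used before GC (bytes): " + before);
        System.out.println("used after GC (bytes):  " + after);
    }
}
```

The reading taken after the collection is the closer estimate of "live" data; the gap between the two readings is roughly the garbage that a heap-usage graph would otherwise count.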

Re: Solr Heap Usage

2019-06-02 Thread John Davis
This makes sense. Any ideas why Lucene/Solr would use a 10g heap for a 20g index? My hypothesis was that merging segments was trying to read it all, but if that's not the case I am out of ideas. The one caveat is we are trying to add the documents quickly (~1g an hour) but if lucene does write 100m segments

Re: Solr Heap Usage

2019-06-01 Thread Walter Underwood
> On May 31, 2019, at 11:27 PM, John Davis wrote: > > 2. Merging segments - does solr load the entire segment in memory or chunks > of it? if later how large are these chunks No, it does not read the entire segment into memory. A fundamental part of the Lucene design is streaming posting lists

Re: Solr Heap Usage

2019-06-01 Thread Erick Erickson
s of most common scenarios that you can > run on the Solr cluster to evaluate the performance. Then you can also > simulate concurrencies of scenarios and users etc. > >> Am 01.06.2019 um 08:27 schrieb John Davis : >> >> I've read a bunch of the wiki's on solr

Re: Solr Heap Usage

2019-06-01 Thread Jörn Franke
I recommend to setup JMeter Test cases of most common scenarios that you can run on the Solr cluster to evaluate the performance. Then you can also simulate concurrencies of scenarios and users etc. > Am 01.06.2019 um 08:27 schrieb John Davis : > > I've read a bunch of the wiki's on

Re: Solr Heap Usage

2019-06-01 Thread Shawn Heisey
On 6/1/2019 12:27 AM, John Davis wrote: I've read a bunch of the wiki's on solr heap usage and wanted to confirm my understanding of what all does solr use the heap for: This is something that's not straightforward to answer. It would not be wrong to say that Solr uses the Java heap

Solr Heap Usage

2019-06-01 Thread John Davis
I've read a bunch of the wiki's on solr heap usage and wanted to confirm my understanding of what all does solr use the heap for: 1. Indexing new documents - until committed? if not how long are the new documents kept in heap? 2. Merging segments - does solr load the entire segment in memory

Re: Solr Heap usage

2018-05-02 Thread Greenhorn Techie
Thanks Shawn for the inputs, which will definitely help us to scale our cluster better. Regards On 2 May 2018 at 18:15:12, Shawn Heisey (apa...@elyograg.org) wrote: On 5/1/2018 5:33 PM, Greenhorn Techie wrote: > Wondering what are the considerations to be aware to arrive at an optimal > heap

Re: Solr Heap usage

2018-05-02 Thread Shawn Heisey
On 5/1/2018 5:33 PM, Greenhorn Techie wrote: > Wondering what are the considerations to be aware to arrive at an optimal > heap size for Solr JVM? Though I did discuss this on the IRC, I am still > unclear on how Solr uses the JVM heap space. Are there any pointers to > understand this aspect

Re: Solr Heap usage

2018-05-02 Thread Susheel Kumar
Take a look at https://wiki.apache.org/solr/SolrPerformanceProblems. The section "How much heap do I need" talks about that. Caches also live on the JVM heap, so check how much you need or are allocating for the different caches. Thanks On Tue, May 1, 2018 at 7:33 PM, Greenhorn Techie

Solr Heap usage

2018-05-01 Thread Greenhorn Techie
Hi, Wondering what are the considerations to be aware to arrive at an optimal heap size for Solr JVM? Though I did discuss this on the IRC, I am still unclear on how Solr uses the JVM heap space. Are there any pointers to understand this aspect better? Given that Solr requires an optimally