Hi Shawn, thanks again for the reply.

I've tried increasing the machine to 32 GB of RAM with a 16 GB heap and 8 cores, 
and even though I still see peaks of 300% CPU on the Solr process, it can handle 
it (Solr doesn't go down).
But I've tried several different configurations for the auto commit and soft 
commit, and results always take a few minutes to show up in search, which is 
really unacceptable for us. I'm not sure how to proceed now.
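For reference, visibility of new documents is controlled by the autoCommit and autoSoftCommit blocks in solrconfig.xml. A sketch of a typical setup (the values here are illustrative, not the exact ones I tested):

```xml
<!-- Illustrative commit settings in solrconfig.xml; maxTime values are
     examples, not the configurations actually tried in this thread. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>            <!-- hard commit every 60 s -->
    <openSearcher>false</openSearcher>  <!-- flush to disk without opening a new searcher -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>10000</maxTime>            <!-- soft commit every 10 s; this controls search visibility -->
  </autoSoftCommit>
</updateHandler>
```

With openSearcher=false on the hard commit, only the soft commit interval should determine how quickly new documents become searchable.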

I've looked at the cores and for example of the collection I'm testing against 
right now, I see these values:

Core 1:
Num Docs: 4806841
Max Doc: 4845793
Heap Memory Usage: 387392
Core 2:
Num Docs: 4810159
Max Doc: 4849229
Heap Memory Usage: 450008

Other collections look fairly similar, except for this one:

Preview Core 1:
Num Docs: 5774937
Max Doc: 5832482
Heap Memory Usage: 407424

Preview Core 2:
Num Docs: 5774937
Max Doc: 5833942
Heap Memory Usage: 463632

Preview Core 3:
Num Docs: 5778245
Max Doc: 5790174
Heap Memory Usage: 480672

For some reason, the "Preview" collection has 3 cores instead of 2 like it had 
before... maybe that could be related? The collection overview says 2 shards and 
a replication factor of 2.

As additional info, ZooKeeper is running on its own server, and Solr is the 
only thing running on its server, aside from some system processes.

Thanks again! 

MATIAS LAINO | DIRECTOR OF PASSARE REMOTE DEVELOPMENT
matias.la...@passare.com | +54 11-6357-2143


-----Original Message-----
From: Shawn Heisey <elyog...@elyograg.org> 
Sent: Thursday, December 1, 2022 1:07 AM
To: users@solr.apache.org
Subject: Re: Very High CPU when indexing

On 11/30/22 08:57, Matias Laino wrote:
> Q: What is the total document count?
> A: Based on the dashboard, it's Total #docs: 68.6mn each node (I'm 
> replicating the same data on both)

Each core has a count.  And here you can see what I was talking about with max 
doc compared to num docs.

https://www.dropbox.com/s/jdgddn4ve5mluhr/core_doc_counts.png?dl=0

> Q: but it would be great to have an on-disk size and document count 
> (max docs, not num docs) for each collection
> A: I'm not sure where to get that from metrics, but based on the cloud dashboard 
> it says the following by shard:
> preview_s1r2:  1.9Gb
> preview_s2r11:  1.9Gb
> preview_s2r6:  1.9Gb
> staging-d_s1r1:  1.8Gb
> staging-d_s2r4:  1.8Gb
> staging-a_s1r1:  1.7Gb
> staging-a_s2r4:  1.7Gb
> staging-c_s2r5:  1.6Gb
> staging-c_s1r2:  1.6Gb
> pre-prod_s1r1:  1.6Gb
> pre-prod_s2r4:  1.6Gb
> staging-b_s1r2:  1.5Gb
> staging-b_s2r5:  1.5Gb
> That is replicated on the other node.
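For the on-disk size, the Metrics API can report it per core. A sketch, assuming Solr is listening on the default port 8983 (adjust host and port to match your deployment):

```shell
# Per-core index size in bytes, via Solr's Metrics API
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=INDEX.sizeInBytes"
```

The max doc and num docs counts are also visible per core on the admin UI core overview page, as in the screenshot above.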

So you've got 22GB of data and, assuming Solr is the only thing running on the 
machine, only about 8GB of memory to cache it (total RAM of 16GB minus 8GB for 
the Solr heap).  I would hope for at least 12GB of cache for that, and more 
is always better. 8GB may not be enough.  If you have other software running on 
the machine, it will be even less. Does ZK live on the same instance?  If so, 
how much heap are you giving to that?
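The arithmetic above can be sketched out, using the figures from this thread (16GB RAM, 8GB heap, ~22GB of index data):

```python
# Rough OS page-cache headroom check, using the figures from this thread.
total_ram_gb = 16    # total RAM on the node
solr_heap_gb = 8     # JVM heap given to Solr
index_size_gb = 22   # sum of the per-shard on-disk sizes listed above

cache_gb = total_ram_gb - solr_heap_gb      # memory left for the OS page cache
coverage = cache_gb / index_size_gb         # fraction of the index that can be cached
print(f"OS cache available: {cache_gb} GB ({coverage:.0%} of the index)")
```

Roughly a third of the index fits in cache here, well short of the 12GB-plus I'd hope for.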

Performance of a system is often perfectly fine up until some threshold, and 
once you throw just a little bit more data into the mix so it goes over that 
threshold, performance drops drastically. That is how a small increase can 
bring a system to its knees.

If you can upgrade the instance to one with more memory, that might also help, 
but I do think that the biggest problem is the autoSoftCommit setting.  If you 
really can't make it at least two minutes, which is the value I would use, then 
set it as high as you can; 10 to 30 seconds, maybe.

Thanks,
Shawn
