Hi Shawn! Thanks again for replying, a few answers to your questions.

Q: How did the index size compare 3 months ago to today?
A: Pretty much the same. We had been using websolr for years, but they had a lot of performance issues and it was expensive (their service went down pretty frequently), so we moved from their v4.10 to our own SolrCloud 8.11 cluster (our test environments are using a luceneMatchVersion of 7.1.0 though). So our collection might have grown by maybe 1M records at most.

Q: How much total index data is there on each Solr node?
A: I'm not sure what the difference is from the total # of docs, I think it's the same but I'm probably wrong; it's 68.6mn per node.

Q: What is the total document count?
A: Based on the dashboard, Total #docs is 68.6mn on each node (I'm replicating the same data on both).

Q: But it would be great to have an on-disk size and document count (max docs, not num docs) for each collection.
A: I'm not sure where to get that from metrics (my best guess is sketched after the list below); based on the cloud dashboard it says the following by shard:

preview_s1r2: 1.9Gb
preview_s2r11: 1.9Gb
preview_s2r6: 1.9Gb
staging-d_s1r1: 1.8Gb
staging-d_s2r4: 1.8Gb
staging-a_s1r1: 1.7Gb
staging-a_s2r4: 1.7Gb
staging-c_s2r5: 1.6Gb
staging-c_s1r2: 1.6Gb
pre-prod_s1r1: 1.6Gb
pre-prod_s2r4: 1.6Gb
staging-b_s1r2: 1.5Gb
staging-b_s2r5: 1.5Gb

That is replicated on the other node.
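In case it's relevant, this is where I was looking: my best guess for getting per-core on-disk size and maxDocs is the Metrics API, something like the calls below (assuming the default 8983 port; I took the metric names from the reference guide, so please correct me if these aren't the right numbers):

    # per-core index size on disk
    curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=INDEX.sizeInBytes"
    # per-core maxDoc (includes deleted docs, unlike numDocs)
    curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=SEARCHER.searcher.maxDoc"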
> I think what I would start with is lowering autoCommit to 15000 and raising
> autoSoftCommit to 60000.

I will try this. As far as I understood from the Solr documentation on NRT, autoSoftCommit should be lower than autoCommit, since a hard commit is the more expensive operation. Should I try autoSoftCommit at 15000 and autoCommit at 60000 instead? (I sketched what I mean below.)

The bottom line is that we need "almost instant" availability when indexing data: we use Solr for searches, so whenever we add a new record it needs to show up in search results almost immediately. I'm not sure what's best for this, but we've been using the configuration I mentioned for a long time (we had individual "indexes" on websolr; I'm fairly sure they weren't on different servers, but I also don't have any info on how much memory those servers had).
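To be concrete, this is the solrconfig.xml change I have in mind, just a sketch of my reading of the NRT docs (values in milliseconds); please tell me if you meant it the other way around:

    <autoCommit>
      <maxTime>60000</maxTime>            <!-- hard commit every 60s, flushes to disk -->
      <openSearcher>false</openSearcher>  <!-- don't open a new searcher on hard commit -->
    </autoCommit>
    <autoSoftCommit>
      <maxTime>15000</maxTime>            <!-- soft commit every 15s, controls visibility -->
    </autoSoftCommit>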
For GC logs, yes! Here is the .0 file for each node:
Node 1: https://drive.google.com/file/d/1IgneAh412HQbHC2cwZTD_PIAR7NFPWe8/view?usp=share_link
Node 2: https://drive.google.com/file/d/1lll7WQK3T_p3G9bFv3w1B5SPnvTQleV3/view?usp=share_link

Process list: sorry if this is not enough, I don't know how else to make it available, but I'm open to any suggestions!
Node 1: https://drive.google.com/file/d/1YQF0571oHecyPwEEuxZsxyafIf5EfUz9/view?usp=share_link
Node 2: https://drive.google.com/file/d/1xX72JS-LVb-VfJBxVUk45-jtGvx8SBHv/view?usp=share_link

Thank you very much again for your help on this!

MATIAS LAINO | DIRECTOR OF PASSARE REMOTE DEVELOPMENT
matias.la...@passare.com | +54 11-6357-2143

-----Original Message-----
From: Shawn Heisey <apa...@elyograg.org>
Sent: Tuesday, November 29, 2022 7:10 PM
To: users@solr.apache.org
Subject: Re: Very High CPU when indexing

On 11/29/22 13:58, Matias Laino wrote:
> Thank you Shawn, I'm definitely checking out those recommendations, but what
> I cannot explain is how this worked fine for the last 3 months and then
> suddenly this issue started happening.

I'd say you got REALLY lucky that there weren't problems sooner.

How did the index size compare 3 months ago to today? How much total index data is there on each Solr node? What is the total document count? From the original message, I can conclude it's probably in the ballpark of 60 million, but it would be great to have an on-disk size and document count (max docs, not num docs) for each collection.

> On our application, customers expect that when a record is created, that
> record should be available on search immediately (that's why the auto soft
> commit of 1 second), what can you recommend for a situation like this?

I think what I would start with is lowering autoCommit to 15000 and raising autoSoftCommit to 60000. As I said, it is completely unrealistic to expect 1 second latency unless the index is VERY small. With a total document count north of 60 million, I would not call it small, even though there are users with much bigger indexes.

By chance can you gather the GC logs from your install and make them available? That can answer a LOT of questions.

On the wiki article I sent last time is a section about getting a screenshot of a process list. Can you get that and make it available?

Depending on what I learn from that info, I may have more questions.

Thanks,
Shawn