Hello,

We have a 3-node SolrCloud cluster with 3 collections (roughly 1 million, 2 million, and 
8 million docs), each with a single shard. In Production each collection has 2 replicas 
for redundancy.

Each night there is a burst of indexing as the documents we index are updated, and 
each night I see timeout errors (and sometimes Task Queue errors) on the transaction 
collection. (We're using pysolr to do the indexing, with 5 parallel processes.) The 
failed updates typically succeed after a few retries, but when there is a larger 
volume of changes the run can take a few hours, which is longer than I'd like.
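
For context, the indexing calls look roughly like this (a simplified sketch of one of 
the worker processes; the batching and URL are placeholders rather than our exact script):

import pysolr

# One of the 5 parallel worker processes; commits are left to the
# server-side autoCommit rather than issued from the client.
solr = pysolr.Solr(
    "https://<redacted>/solr/transaction_solrize",
    timeout=60,  # matches the 60s read timeout in the error below
)

def index_batch(docs):
    # docs is a list of dicts built from the changed records
    solr.add(docs, commit=False)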

Most common error:
Connection to server 'https://<redacted>/solr/transaction_solrize/update/' timed out: 
HTTPSConnectionPool(host='<redacted>', port=443): Read timed out. (read timeout=60)

Second most common error:
Solr responded with an error (HTTP 500): [Reason: Task queue processing has stalled 
for 20011 ms with 0 remaining elements to process.]

When we were designing the system, we played around with the autoCommit and softCommit 
settings following this 
guide<https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/>.
 That was mainly to optimize the initial bulk load and reindexing after a schema change.

We landed on a 15-second autoCommit and no softCommit, as the only thing that seemed 
to mitigate the timeout issue was to remove all replicas while indexing heavily and 
add them back afterwards.

<autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>true</openSearcher>
</autoCommit>

This works fine for a full reindex, since I do it side-by-side using aliases (the swap 
is sketched below), but it doesn't really work for our nightly updates, and I'd like 
to keep the replicas active at all times for the live collections. I also still get 
the "Read timed out" errors even with no replicas, so removing them isn't a complete 
silver bullet, though it does help a lot.
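
For reference, the alias swap is just the standard Collections API CREATEALIAS call, 
roughly like this (the collection names are illustrative):

import requests

# Repoint the alias that the application queries at the freshly
# reindexed collection; the old collection can be dropped later.
requests.get(
    "https://<redacted>/solr/admin/collections",
    params={
        "action": "CREATEALIAS",
        "name": "transaction_solrize",             # alias used by the app
        "collections": "transaction_solrize_new",  # newly built collection
    },
    timeout=60,
)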

Can anyone recommend next steps for troubleshooting this issue and reducing the 
timeouts?

I haven't tried extending the autoCommit time, as I was concerned about the tlog 
growing too large. We don't have any strong requirement for near-real-time searching, 
so there is room to relax the commit settings; the kind of change I've been 
considering is sketched below.
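
For example (untried so far; the times here are guesses rather than tested values), a 
longer hard commit that doesn't open a searcher, plus a softCommit for visibility, 
would look something like:

<autoCommit>
    <maxTime>60000</maxTime>            <!-- flush the tlog/segments every 60s -->
    <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
    <maxTime>300000</maxTime>           <!-- make changes visible every 5 minutes -->
</autoSoftCommit>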


Config details:
Each Node: 9 GB RAM
-Xms3g
-Xmx3g
-Xss256k

https://iatistandard.org/en/

Thank you!
Nik Osvalds | IATI Developer