Hi All,

*tl;dr*: We are running into long GC pauses and Solr client socket timeouts when bulk indexing documents into Solr. Our commit strategy, in essence, is to hard commit every 50k documents (maxDocs=50k) and to disable soft commits altogether during bulk indexing. This is a simple SolrCloud setup with one node and one shard.
*Details*: We have about 6 million documents which we are trying to index into Solr. Of these, about 500k documents have a text field which holds abstracts of scientific papers/articles. We extract keywords from these abstracts and index the keywords into Solr as well. Articles and keywords have a many-to-many relationship. To store this, we have the following structure:

- Article documents
- Keyword documents
- Article-Keyword join documents

We use block join to index Articles together with their "Article-Keyword" join documents, and Keyword documents are indexed independently. In other words, we have blocks of "Article + Article-Keyword joins", and we have Keyword documents (they hold some additional metadata about the keyword).

We have a bulk processing operation which creates these documents and indexes them into Solr. During this bulk indexing, we don't need documents to be searchable. We need to search against them only after ALL the documents are indexed.

*Based on this, our current strategy is:*

- Soft commits are disabled, and hard commits are done at an interval of 50k documents with openSearcher=false.
- Our code triggers explicit commits 4 times, after various stages of bulk indexing.
- Transaction logs are enabled and have default settings.

    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:-1}</maxTime>
      <maxDocs>${solr.autoCommit.maxDocs:50000}</maxDocs>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
    </autoSoftCommit>

Other environmental details:

- Xms=8g and Xmx=14g
- Solr client socketTimeout=7 minutes
- zkClientTimeout=2 minutes

Our indexing operation triggers many "add" operations in parallel using RxJava (15 to 30 threads), and each "add" operation is passed about 1000 documents.

Currently, when we run this indexing operation, we notice that after a while Solr goes into long GC pauses (longer than our socketTimeout of 7 minutes) and we get SocketTimeoutExceptions.
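To put rough numbers on the setup described above, here is a minimal back-of-envelope sketch in plain Java (no Solr dependency). The class and method names are made up for illustration. It assumes that every client thread has an "add" request outstanding at the same time, and that the 6 million figure is the total number of documents counted toward maxDocs; both are assumptions rather than measurements, and the 4 explicit commits are not included.

```java
// Back-of-envelope figures for the indexing setup; hypothetical helper, not part of Solr or SolrJ.
class IndexingLoadEstimate {

    // Documents held in concurrent "add" requests, assuming every client thread is busy at once.
    static long docsInFlight(int threads, int docsPerAdd) {
        return (long) threads * docsPerAdd;
    }

    // Number of maxDocs-triggered hard commits over the whole run, rounded up.
    static long hardCommits(long totalDocs, long maxDocs) {
        return (totalDocs + maxDocs - 1) / maxDocs;
    }

    public static void main(String[] args) {
        long totalDocs = 6_000_000L; // "about 6 million documents" (assumed to be the total added)
        long maxDocs = 50_000L;      // solr.autoCommit.maxDocs
        int docsPerAdd = 1_000;      // "about 1000 documents" per add operation

        System.out.println("docs in flight (15 threads): " + docsInFlight(15, docsPerAdd));
        System.out.println("docs in flight (30 threads): " + docsInFlight(30, docsPerAdd));
        System.out.println("automatic hard commits: " + hardCommits(totalDocs, maxDocs));
    }
}
```

With the numbers from this post, that works out to 15,000 to 30,000 documents in flight at any moment and about 120 automatic hard commits over the full run.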
*What could be causing such long GC pauses?*

*Does this commit strategy make sense? If not, what is the recommended strategy that we can look into?*

*Any help on this is much appreciated. Thanks.*