Krishna Sahoo created SOLR-10676:
------------------------------------
Summary: Optimize the reindexing of sunspot solr
Key: SOLR-10676
URL: https://issues.apache.org/jira/browse/SOLR-10676
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: clients - ruby - flare
Affects Versions: 5.0
Reporter: Krishna Sahoo
We are using Solr 5.0 (<luceneMatchVersion>5.0.0</luceneMatchVersion>).
We have more than 5 million products, and a full reindex currently takes
around 3 hours 30 minutes.
To speed up reindexing, we have used the following configuration:
<indexConfig>
<ramBufferSizeMB>960</ramBufferSizeMB>
<mergeFactor>100</mergeFactor>
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
</indexConfig>
<autoCommit>
<maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>
We are indexing with the following options:
{ :batch_commit => false, :batch_size => 20000 }
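The batching idea can be sketched in plain Ruby (names here are illustrative, not Sunspot's API): split the record set into fixed-size slices and hand each slice to the caller. In a Rails app the block would call Sunspot.index(batch), with a single Sunspot.commit issued once after the loop rather than a commit per batch.

```ruby
# Split records into fixed-size batches and yield each batch to the caller;
# returns the number of batches processed. In a real Sunspot app the block
# would be { |batch| Sunspot.index(batch) }, followed by one Sunspot.commit.
def index_in_slices(records, batch_size)
  batches = 0
  records.each_slice(batch_size) do |batch|
    yield batch          # e.g. Sunspot.index(batch)
    batches += 1
  end
  batches
end
```

Keeping commits out of the loop matters because each Solr commit flushes and reopens expensive structures; one commit at the end amortizes that cost over the whole run.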
We have set automatic indexing to false in our model, so a newly inserted
record is not automatically added to the Solr index. When a record is
updated, we manually call Sunspot.index! for that particular product.
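For reference, a hypothetical model wired this way might look like the sketch below (field names and the after_update hook are assumptions, not taken from the reporter's code); auto_index: false disables Sunspot's after_save callback so inserts skip Solr, while updates trigger an explicit, immediately committed Sunspot.index!:

```ruby
# Hypothetical Sunspot-enabled model: inserts do not touch Solr,
# updates are pushed (and committed) one record at a time.
class Product < ActiveRecord::Base
  searchable auto_index: false, auto_remove: false do
    text :name, :description
    integer :category_id
  end

  after_update :reindex_in_solr

  private

  def reindex_in_solr
    Sunspot.index!(self)  # index! commits immediately, one HTTP round trip per record
  end
end
```

Note that Sunspot.index! commits on every call; for frequently updated records, Sunspot.index (without the bang) plus a periodic commit is usually cheaper.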
Every day we insert around 0.2 million (200,000) records, and our target is
50 million products.
Is there any way to add only new or updated records to the index?
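One common answer is incremental ("delta") indexing: remember when the last index run finished and reindex only records touched after that point. A minimal plain-Ruby sketch, assuming each record carries an updated_at timestamp (in a Rails app the selection would be Product.where('updated_at > ?', last_run) fed to Sunspot.index):

```ruby
# Stand-in for an ActiveRecord row; only id and updated_at matter here.
Record = Struct.new(:id, :updated_at)

# Return just the records modified after the last successful index run.
def changed_since(records, last_run)
  records.select { |r| r.updated_at > last_run }
end
```

Persisting the watermark (e.g. the finish time of the last run) in the database rather than in memory makes the delta run safe across restarts.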
Can we increase the indexing speed by changing any of the configuration
shown above?
If we add new products to Solr from Ruby code in a loop, it fails badly
because it takes too much time.
Please help us find the best way to improve Solr's indexing speed.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]