Krishna Sahoo created SOLR-10676:
------------------------------------

             Summary: Optimize the reindexing of sunspot solr
                 Key: SOLR-10676
                 URL: https://issues.apache.org/jira/browse/SOLR-10676
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: clients - ruby - flare
    Affects Versions: 5.0
            Reporter: Krishna Sahoo


We are using Solr 5.0 (<luceneMatchVersion>5.0.0</luceneMatchVersion>).
We have more than 5 million products, and it takes around 3.30 hours to reindex 
all of them.

To speed up reindexing, we use the following configuration:

<indexConfig>
    <ramBufferSizeMB>960</ramBufferSizeMB>
    <mergePolicyFactory>100</mergePolicyFactory>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
</indexConfig>

<autoCommit>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>

We are indexing with the following options:
{ :batch_commit => false, :batch_size => 20000 }

We have set autocommit to false in our model, so a newly inserted record is 
not automatically added to the Solr index. When a record is updated, we 
manually call the Sunspot.index! method for that particular product. 
Every day we insert around 0.2 million records. Our target is 50 million 
products.

Is there any way to add only the new or updated records to the index?
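
To illustrate what we mean by indexing only new or updated records, here is a 
minimal sketch. It assumes each product carries an updated_at timestamp and 
that we store the time of the previous index run somewhere. The Product struct 
and the changed_since and index_in_batches helpers are hypothetical names used 
only for illustration; in our application the block would presumably call 
Sunspot.index on each batch and Sunspot.commit once at the end.

```ruby
# Hypothetical stand-in for our model; only the fields this sketch needs.
Product = Struct.new(:id, :updated_at)

# Keep only the records modified after the previous index run.
def changed_since(products, last_indexed_at)
  products.select { |product| product.updated_at > last_indexed_at }
end

# Hand the records to the caller's block in fixed-size batches.
# Returns the number of batches that were yielded.
def index_in_batches(records, batch_size)
  batches = 0
  records.each_slice(batch_size) do |batch|
    # In a real application this block might call Sunspot.index(batch).
    yield batch
    batches += 1
  end
  batches
end
```

With this approach each run touches only the records changed since the last 
run instead of all products, and sending them in batches avoids one Solr 
request per record.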

Can we increase the indexing speed by changing any of the current 
configurations?

If we add new products to Solr from Ruby code in a loop, it takes far too 
long to be usable.

Please help us find the best way to improve the indexing speed of Solr.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
