Re: solr/lucene index merge and optimize performance improvement

2015-06-17 Thread Toke Eskildsen
On Tue, 2015-06-16 at 09:54 -0700, Shenghua(Daniel) Wan wrote: Hi, Toke, Did you try MapReduce with solr? I think it should be a good fit for your use case. Thanks for the suggestion. Improved logistics, such as starting build of a new shard while the previous shard is optimizing, would work
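A minimal sketch of the logistics idea mentioned above: start building the next shard while the previous one is still optimizing. `build_shard` and `optimize_shard` are hypothetical placeholders rather than Solr or Lucene calls, and the sketch assumes the optimize step can run in a background worker independently of the build.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shard(shard_id):
    # Hypothetical placeholder for the indexing phase of one shard.
    return f"shard-{shard_id}:built"


def optimize_shard(built):
    # Hypothetical placeholder for the optimize (forced merge) phase.
    return built.replace("built", "optimized")


def build_and_optimize(shard_ids):
    """Build shards one at a time, optimizing each in the background
    while the next shard is being built."""
    optimized = []
    pending = None  # Future for the optimize currently in flight, if any
    with ThreadPoolExecutor(max_workers=1) as optimizer:
        for shard_id in shard_ids:
            # This build overlaps with the previous shard's optimize.
            built = build_shard(shard_id)
            if pending is not None:
                # Wait for the previous optimize before queueing a new one.
                optimized.append(pending.result())
            pending = optimizer.submit(optimize_shard, built)
        if pending is not None:
            optimized.append(pending.result())
    return optimized


if __name__ == "__main__":
    print(build_and_optimize([1, 2, 3]))
```

With a single optimizer worker, at most one optimize runs at a time, so the overlap saves time only up to the shorter of the two phases per shard.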

Re: solr/lucene index merge and optimize performance improvement

2015-06-16 Thread Toke Eskildsen
Shenghua(Daniel) Wan wansheng...@gmail.com wrote: Actually, I am currently interested in how to boost merging/optimizing performance of a single Solr instance. We have the same challenge (we build static 900GB shards one at a time and the final optimization takes 8 hours with only 1 CPU core at

Re: solr/lucene index merge and optimize performance improvement

2015-06-16 Thread Shenghua(Daniel) Wan
Hi, Toke, Did you try MapReduce with solr? I think it should be a good fit for your use case. On Tue, Jun 16, 2015 at 5:02 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote: Shenghua(Daniel) Wan wansheng...@gmail.com wrote: Actually, I am currently interested in how to boost

Re: solr/lucene index merge and optimize performance improvement

2015-06-16 Thread Shenghua(Daniel) Wan
I think your advice on future incremental update is very useful. I will keep an eye on that. Actually, I am currently interested in how to boost merging/optimizing performance of a single Solr instance. Parallelism at MapReduce level does not help merging/optimizing much, unless Solr/Lucene

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Erick Erickson
Ah, OK. For very slowly changing indexes optimize can make sense. Do note, though, that if you incrementally index after the full build, and especially if you update documents, you're laying a trap for the future. Let's say you optimize down to a single segment. The default TieredMergePolicy

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Shenghua(Daniel) Wan
Hi, Erick, First, thanks for sharing your ideas. Here is some more context. 1. Why optimize? I have done some experiments to compare the query response time, and there is some difference. In addition, the searcher will be customer-facing. I think any performance boost will

solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Shenghua(Daniel) Wan
Hi, Do you have any suggestions to improve the performance of merging and optimizing an index? I have been using an embedded Solr server to merge and optimize the index. I am looking for the right parameters to tune. My use case has about 300 fields plus 250 copyfields, and moderate doc size (about

Re: solr/lucene index merge and optimize performance improvement

2015-06-15 Thread Erick Erickson
The first question is why you're optimizing at all. It's not recommended unless you can demonstrate that an optimized index is giving you enough of a performance boost to be worth the effort. And why are you using an embedded Solr server? That's kind of unusual so I wonder if you've gone down a