RE: Speculation on Memory needed to efficiently run a Solr Instance.

2016-01-15 Thread Gian Maria Ricci - aka Alkampfer
Thanks a lot, I'll have a look at Sematext SPM. Actually the index is not static, but the number of new documents will be small and they will probably be indexed during the night, so I'm not expecting too many problems from the merge factor. We can index new documents during the night and then optimize

Re: Speculation on Memory needed to efficiently run a Solr Instance.

2016-01-15 Thread Emir Arnautovic
Hi, the OS does not care much about search vs. retrieve, so the amount of RAM needed for file caches will depend on your index usage patterns. If you are not retrieving stored fields much and most/all results are only id+score, then it can be assumed that you can go with less RAM than the actual index
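The point above can be sketched with some back-of-the-envelope arithmetic: for id+score-only queries the stored-field files of a Lucene index are rarely read, so they don't need to fit in the OS page cache. All file sizes below are hypothetical placeholders, not measurements from any real index.

```python
# Rough sizing sketch: estimate the "hot" subset of a Lucene index that the
# OS page cache should hold when queries return only id+score, so stored
# fields (.fdt/.fdx) are rarely touched. Sizes are hypothetical, in GB.
index_files_gb = {
    ".tim": 4.0,   # term dictionary
    ".doc": 6.0,   # postings (doc ids / frequencies)
    ".pos": 5.0,   # term positions
    ".fdt": 20.0,  # stored field data -- skipped for id+score-only queries
    ".fdx": 0.5,   # stored field index
    ".dvd": 3.0,   # docValues data
}

stored_exts = {".fdt", ".fdx"}
total_gb = sum(index_files_gb.values())
hot_gb = sum(sz for ext, sz in index_files_gb.items() if ext not in stored_exts)

print(f"total index: {total_gb:.1f} GB, hot (cache-worthy) subset: {hot_gb:.1f} GB")
```

With these made-up numbers the cache-worthy subset is well under half the on-disk index size, which is the kind of gap Emir is describing.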

Re: Speculation on Memory needed to efficiently run a Solr Instance.

2016-01-15 Thread Erick Erickson
And to make matters worse, much worse (actually, better)... See: https://issues.apache.org/jira/browse/SOLR-8220 That ticket (and there will be related ones) is about returning data from DocValues fields rather than from the stored data in some situations. Which means it will soon (I hope) be
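For context, retrieving a value from docValues instead of stored fields is controlled per field in the schema. A sketch of what such a field definition might look like, with a placeholder field name ("price" is invented here, not from the thread):

```xml
<!-- Hypothetical schema.xml snippet: return this field's value from its
     docValues representation rather than from stored field data. -->
<field name="price" type="int" indexed="true" stored="false"
       docValues="true" useDocValuesAsStored="true"/>
```

Whether this applies to a given Solr version and field type should be checked against the reference guide for that release.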

Re: Speculation on Memory needed to efficiently run a Solr Instance.

2016-01-15 Thread Jack Krupansky
Personally, I'll continue to recommend that the ideal goal is to fully cache the entire Lucene index in system memory, as well as doing a proof-of-concept implementation to validate actual performance for your actual data. You can do a POC with a small fraction of your full data, like 15% or even
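The extrapolation step of such a POC can be sketched in a few lines. The fraction and measured size below are hypothetical placeholders; the linear scaling is a simplification (term dictionaries grow sub-linearly, so this tends to overestimate, which is the safe direction for capacity planning).

```python
# Back-of-the-envelope extrapolation from a proof-of-concept index built on
# a fraction of the corpus. All numbers are hypothetical placeholders.
poc_fraction = 0.15   # the POC indexed 15% of the documents
poc_index_gb = 6.0    # measured on-disk size of the POC index, in GB

# Assume index size grows roughly linearly with document count.
estimated_full_index_gb = poc_index_gb / poc_fraction

print(f"estimated full index: {estimated_full_index_gb:.0f} GB")
# To fully cache it, budget at least that much RAM beyond the JVM heap.
```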

Re: Speculation on Memory needed to efficiently run a Solr Instance.

2016-01-15 Thread Toke Eskildsen
Jack Krupansky wrote:
> Again to be clear, if you really do need the best/minimal overall query
> latency, your best bet is to have sufficient system memory to fully cache
> the entire index. If you actually don't need minimal latency, then of
> course you can feel free