Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Erick Erickson
bq: We also had an HDFS setup already so it looked like a good option to not lose data. Earlier we had a few cases where we lost the machines so HDFS looked safer for that. Right, that's one of the places where using HDFS to back Solr makes a lot of sense. The other approach is to just have

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
We actually use no autowarming. Our collections are pretty small and query performance has not really been a problem so far. We are using lots of collections, and most Solr caches seem to be per core rather than global, so we also have a problem with caching. I have to test the HDFS cache some more

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Erick Erickson
In my experience, for relatively static indexes the performance is roughly similar. Once the data has been read from whatever data source, it's in memory, and where the data came from is (largely) secondary in importance. In cases where there's a lot of I/O I expect HDFS to be slower; this fits Hendrik's

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Greenhorn Techie
Hendrik, Thanks for your response. Regarding "But this seems to depend greatly on what your setup looks like and what actions you perform." May I know which factors influence this and what considerations should be taken into account? Thanks On Wed, 22 Nov 2017 at 14:16 Hendrik Haddorp

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
We did some testing and the performance was strangely even better with HDFS than with the local file system. But this seems to depend greatly on what your setup looks like and what actions you perform. We now had a pattern with lots of small updates and commits and that seems to be quite a

Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Greenhorn Techie
Hi, Good Afternoon!! While the discussion around issues related to "Solr on HDFS" is live, I would like to understand if anyone has done any performance benchmarking of Solr indexing and search on HDFS versus the local file system. Also, from experience, what would the community folks