Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR
On 25/08/2015 13:21, Simer P wrote:

> http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
>
> *Question:* How can I get guaranteed commits with Apache SOLR where persisting data to disk and visibility are both equally important?
>
> *Background:* We have a website which requires high-end search functionality for machine learning and also requires guaranteed commits for financial transactions. We just want to use SOLR as our only datastore to keep things simple and *do not* want to use another database on the side.
>
> I can't seem to find any answer to this question. The simplest solution for a financial transaction seems to be to periodically query SOLR for the record after it has been persisted, but this can have a longer wait time. Is there a better solution?
>
> Can anyone please suggest a solution for achieving guaranteed commits with SOLR?

Firstly, if you're asking here, you're likely to be answered here, not on Stack Overflow.

A search engine is not a database. Although both Solr and Elasticsearch are often used as primary stores with varying degrees of success, they are, after all, search engines and designed for that purpose.

Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828
web: www.flax.co.uk
Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR
You could also look at an integrated product such as DataStax Enterprise, which fully integrates the Cassandra database and Solr: you execute your database transactions in Cassandra, and DSE Search then automatically indexes the data in the embedded version of Solr.

See: http://www.datastax.com/products/datastax-enterprise-search

About the only downside is that it is a proprietary product and the integration is not open source.

-- Jack Krupansky

On Tue, Aug 25, 2015 at 10:15 AM, Upayavira u...@odoko.co.uk wrote:

> On Tue, Aug 25, 2015, at 01:21 PM, Simer P wrote:
>
> > http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
> >
> > *Question:* How can I get guaranteed commits with Apache SOLR where persisting data to disk and visibility are both equally important?
> >
> > *Background:* We have a website which requires high-end search functionality for machine learning and also requires guaranteed commits for financial transactions. We just want to use SOLR as our only datastore to keep things simple and *do not* want to use another database on the side.
> >
> > I can't seem to find any answer to this question. The simplest solution for a financial transaction seems to be to periodically query SOLR for the record after it has been persisted, but this can have a longer wait time. Is there a better solution?
> >
> > Can anyone please suggest a solution for achieving guaranteed commits with SOLR?
>
> Consider whether you are trying to use the wrong tool for the job. Solr does not offer per-transaction guarantees. It is heavily optimised around high-read/low-write situations (i.e. more reads than writes). If you commit to disk too often, the implementation will be very inefficient (it will create lots of segments that need to be merged, and caches will become ineffective).
>
> Also, when you issue a commit, it commits all pending documents, regardless of who posted them to Solr. These do not sound like things that suit your application.
>
> There remains the possibility (even if extremely unlikely) that a transaction could be lost were a server to die or lose power in the few seconds between a post and a subsequent commit.
>
> Personally, I'd use a more traditional database for the data, then also post it to Solr for fast search, faceting, etc. as needed.
>
> But then, perhaps there's more to your use case than I have so far understood.
>
> Upayavira
Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR
On Tue, Aug 25, 2015, at 01:21 PM, Simer P wrote:

> http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
>
> *Question:* How can I get guaranteed commits with Apache SOLR where persisting data to disk and visibility are both equally important?
>
> *Background:* We have a website which requires high-end search functionality for machine learning and also requires guaranteed commits for financial transactions. We just want to use SOLR as our only datastore to keep things simple and *do not* want to use another database on the side.
>
> I can't seem to find any answer to this question. The simplest solution for a financial transaction seems to be to periodically query SOLR for the record after it has been persisted, but this can have a longer wait time. Is there a better solution?
>
> Can anyone please suggest a solution for achieving guaranteed commits with SOLR?

Consider whether you are trying to use the wrong tool for the job. Solr does not offer per-transaction guarantees. It is heavily optimised around high-read/low-write situations (i.e. more reads than writes). If you commit to disk too often, the implementation will be very inefficient (it will create lots of segments that need to be merged, and caches will become ineffective).

Also, when you issue a commit, it commits all pending documents, regardless of who posted them to Solr. These do not sound like things that suit your application.

There remains the possibility (even if extremely unlikely) that a transaction could be lost were a server to die or lose power in the few seconds between a post and a subsequent commit.

Personally, I'd use a more traditional database for the data, then also post it to Solr for fast search, faceting, etc. as needed.

But then, perhaps there's more to your use case than I have so far understood.

Upayavira
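
The suggestion above (a traditional database as the system of record, with Solr fed from it for search) could look roughly like the sketch below. It is only an illustration under several assumptions: SQLite from the Python standard library stands in for a real transactional database, and the table name `transactions`, the collection name `payments`, the host URL and the 5000 ms `commitWithin` value are all made up for the example. The Solr request is built but not sent. `commitWithin` is a standard Solr update parameter that asks Solr to commit within the given number of milliseconds, so commits are batched rather than issued per document.

```python
import json
import sqlite3
import urllib.parse
import urllib.request


def record_transaction(conn, txn_id, amount):
    """Store the transaction in the database, which is the system of record."""
    # Used as a context manager, the connection commits on success
    # and rolls back if the INSERT raises.
    with conn:
        conn.execute(
            "INSERT INTO transactions (id, amount) VALUES (?, ?)",
            (txn_id, amount),
        )


def build_solr_update(base_url, collection, doc, commit_within_ms=5000):
    """Build (but do not send) a Solr JSON update request for one document.

    base_url, collection and the default commit_within_ms are illustrative
    assumptions, not values taken from the thread.
    """
    query = urllib.parse.urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/solr/{collection}/update?{query}"
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (id TEXT PRIMARY KEY, amount REAL)")

    # 1. The durable write happens in the database first.
    record_transaction(conn, "txn-1", 42.50)

    # 2. Only then is the document handed to Solr for search.
    req = build_solr_update(
        "http://localhost:8983", "payments", {"id": "txn-1", "amount": 42.50}
    )
    print(req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

With this ordering, a Solr node dying before its commit costs nothing permanent, because the document can be re-indexed from the database.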
Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR
http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr

*Question:* How can I get guaranteed commits with Apache SOLR where persisting data to disk and visibility are both equally important?

*Background:* We have a website which requires high-end search functionality for machine learning and also requires guaranteed commits for financial transactions. We just want to use SOLR as our only datastore to keep things simple and *do not* want to use another database on the side.

I can't seem to find any answer to this question. The simplest solution for a financial transaction seems to be to periodically query SOLR for the record after it has been persisted, but this can have a longer wait time. Is there a better solution?

Can anyone please suggest a solution for achieving guaranteed commits with SOLR?
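
For reference, the polling approach mentioned in the question (query periodically until the record shows up) might be sketched as below. This is a hypothetical helper, not part of Solr or any client library: `is_visible` is an assumed caller-supplied function that runs the actual search and returns True once the document is found, and the timeout and interval defaults are arbitrary. Note that it only confirms visibility; as the replies in this thread point out, that is not the same as a per-transaction durability guarantee.

```python
import time


def wait_until_visible(is_visible, timeout_s=10.0, interval_s=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll is_visible() until it returns True or timeout_s has elapsed.

    Returns True if the record became visible, False on timeout.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_visible():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real Solr query: reports the record as visible
    # on the third check.
    checks = {"count": 0}

    def fake_lookup():
        checks["count"] += 1
        return checks["count"] >= 3

    print(wait_until_visible(fake_lookup, timeout_s=5.0, interval_s=0.1))
```

The wait the question worries about is visible here: the caller blocks for up to `timeout_s` per transaction, which is one reason the replies suggest a database for the write path.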