Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR

2015-08-26 Thread Charlie Hull

On 25/08/2015 13:21, Simer P wrote:

http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
.

*Question:* How can I get guaranteed commits with Apache SOLR where
persisting data to disk and visibility are both equally important?

*Background:* We have a website which requires high-end search
functionality for machine learning and also requires guaranteed commits for
financial transactions. We just want to use SOLR as our only datastore to keep
things simple and *do not* want to use another database on the side.

I can't seem to find any answer to this question. The simplest solution for
a financial transaction seems to be to periodically query SOLR for the
record after it has been persisted, but this can mean a longer wait. Is
there a better solution?

Can anyone please suggest a solution for achieving guaranteed commits
with SOLR?

Firstly, if you're asking here, you're likely to be answered here, not 
on Stack Overflow.


A search engine is not a database. Although both Solr and Elasticsearch 
are often used as primary stores with varying degrees of success, they 
are, after all, search engines and are designed for search.


Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR

2015-08-25 Thread Jack Krupansky
You could also look at an integrated product such as DataStax Enterprise
which fully integrates the Cassandra database and Solr - you execute your
database transactions in Cassandra and then DSE Search automatically
indexes the data in the embedded version of Solr.

See:
http://www.datastax.com/products/datastax-enterprise-search

About the only downside is that it is a proprietary product and the
integration is not open source.


-- Jack Krupansky

On Tue, Aug 25, 2015 at 10:15 AM, Upayavira u...@odoko.co.uk wrote:



 On Tue, Aug 25, 2015, at 01:21 PM, Simer P wrote:
 
 http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
  .
 
  *Question:* How can I get guaranteed commits with Apache SOLR where
  persisting data to disk and visibility are both equally important?
 
  *Background:* We have a website which requires high-end search
  functionality for machine learning and also requires guaranteed commits
  for financial transactions. We just want to use SOLR as our only datastore
  to keep things simple and *do not* want to use another database on the side.
 
  I can't seem to find any answer to this question. The simplest solution
  for a financial transaction seems to be to periodically query SOLR for the
  record after it has been persisted, but this can mean a longer wait. Is
  there a better solution?
 
  Can anyone please suggest a solution for achieving guaranteed commits
  with SOLR?

 Consider whether you are trying to use the wrong tool for the job. Solr
 does not offer per-transaction guarantees. It is heavily optimised
 around high-read/low-write situations (i.e. more reads than writes). If
 you commit to disk too often, the implementation will be very
 inefficient (it will create lots of segments that need to be merged, and
 caches will become ineffective).

 Also, when you issue a commit, it commits all pending documents,
 regardless of who posted them to Solr. These do not sound like things
 that suit your application.

 There remains the possibility (even if extremely unlikely) that
 a transaction could be lost were a server to die or lose power in the few
 seconds between a post and a subsequent commit.

 Personally, I'd use a more traditional database for the data, then also
 post it to Solr for fast search/faceting/etc. as needed.

 But then, perhaps there's more to your use case than I have so far
 understood.

 Upayavira



Re: Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR

2015-08-25 Thread Upayavira


On Tue, Aug 25, 2015, at 01:21 PM, Simer P wrote:
 http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
 .
 
 *Question:* How can I get guaranteed commits with Apache SOLR where
 persisting data to disk and visibility are both equally important?
 
 *Background:* We have a website which requires high-end search
 functionality for machine learning and also requires guaranteed commits
 for financial transactions. We just want to use SOLR as our only datastore
 to keep things simple and *do not* want to use another database on the side.
 
 I can't seem to find any answer to this question. The simplest solution
 for a financial transaction seems to be to periodically query SOLR for the
 record after it has been persisted, but this can mean a longer wait. Is
 there a better solution?
 
 Can anyone please suggest a solution for achieving guaranteed commits
 with SOLR?

Consider whether you are trying to use the wrong tool for the job. Solr
does not offer per-transaction guarantees. It is heavily optimised
around high-read/low-write situations (i.e. more reads than writes). If
you commit to disk too often, the implementation will be very
inefficient (it will create lots of segments that need to be merged, and
caches will become ineffective).
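
As a rough illustration of avoiding a hard commit per document, the sketch
below builds (but does not send) a batched JSON update request that asks Solr
to commit within a time window using the commitWithin update parameter. The
host, the collection name "transactions", the document fields, and the helper
name build_update_request are all assumptions made for the example, and this
does not by itself give any per-transaction guarantee.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url, collection, docs, commit_within_ms=5000):
    """Build (without sending) a JSON update request for a batch of documents.

    commitWithin asks Solr to commit the batch within the given number of
    milliseconds, instead of the client forcing a commit on every post.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/solr/{collection}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, collection, and document, for illustration only.
req = build_update_request(
    "http://localhost:8983",
    "transactions",
    [{"id": "tx-1", "amount": 100}],
)
```

Passing the request to urllib.request.urlopen would send it; that step is left
out here so the sketch runs without a Solr server.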

Also, when you issue a commit, it commits all pending documents,
regardless of who posted them to Solr. These do not sound like things
that suit your application.

There remains the possibility (even if extremely unlikely) that
a transaction could be lost were a server to die or lose power in the few
seconds between a post and a subsequent commit.

Personally, I'd use a more traditional database for the data, then also
post it to Solr for fast search/faceting/etc. as needed.
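
A minimal sketch of that database-first pattern, assuming SQLite as a stand-in
for the traditional database and a caller-supplied index_in_solr function
(hypothetical) that would post the document to Solr. The table layout and the
"indexed" flag are illustrative choices, not something Solr requires.

```python
import sqlite3


def record_transaction(conn, index_in_solr, tx_id, amount):
    """Commit to the database first; treat Solr indexing as best effort."""
    # The database commit is the durability point for the transaction.
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO transactions (id, amount, indexed) VALUES (?, ?, 0)",
            (tx_id, amount),
        )
    try:
        index_in_solr({"id": tx_id, "amount": amount})
    except Exception:
        # The row keeps indexed = 0 so a later job can retry the indexing.
        return False
    with conn:
        conn.execute("UPDATE transactions SET indexed = 1 WHERE id = ?", (tx_id,))
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id TEXT PRIMARY KEY, amount INTEGER, indexed INTEGER)"
)

# Stand-in indexer that just collects documents in a list.
indexed_docs = []
ok = record_transaction(conn, indexed_docs.append, "tx-1", 100)


def failing_indexer(doc):
    # Simulates Solr being unreachable.
    raise ConnectionError("Solr unavailable")


failed = record_transaction(conn, failing_indexer, "tx-2", 250)
pending = [row[0] for row in conn.execute("SELECT id FROM transactions WHERE indexed = 0")]
```

If indexing fails, the transaction is still safely stored and only its
searchability is delayed until the pending rows are re-posted.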

But then, perhaps there's more to your use case than I have so far
understood.

Upayavira


Please answer my question on StackOverflow ... Best approach to guarantee commits in SOLR

2015-08-25 Thread Simer P
http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
.

*Question:* How can I get guaranteed commits with Apache SOLR where
persisting data to disk and visibility are both equally important?

*Background:* We have a website which requires high-end search
functionality for machine learning and also requires guaranteed commits for
financial transactions. We just want to use SOLR as our only datastore to keep
things simple and *do not* want to use another database on the side.

I can't seem to find any answer to this question. The simplest solution for
a financial transaction seems to be to periodically query SOLR for the
record after it has been persisted, but this can mean a longer wait. Is
there a better solution?

Can anyone please suggest a solution for achieving guaranteed commits
with SOLR?
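
For reference, the polling approach described above might look roughly like
the sketch below. The is_visible callable is hypothetical: in practice it
would wrap a Solr query for the record's id. The timeout and interval values
are arbitrary assumptions.

```python
import time


def wait_until_visible(is_visible, timeout_s=5.0, interval_s=0.1):
    """Poll is_visible() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_visible():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in lookup that reports the record as visible on the third check.
calls = {"n": 0}


def fake_lookup():
    calls["n"] += 1
    return calls["n"] >= 3


found = wait_until_visible(fake_lookup, timeout_s=1.0, interval_s=0.01)
never = wait_until_visible(lambda: False, timeout_s=0.05, interval_s=0.01)
```

The wait grows with however long the record takes to become searchable, which
is the "longer wait" concern raised in the question.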