hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
I need to know when a document is committed in SOLR - i.e. is searchable.

Does anyone have a solution for how to do this?

I'm aware of three methods to create hooks for knowing when a doc is added or 
a commit is performed, but the doc(id) does not seem to be included for the 
commit-hooks (naturally I guess):

A. subclass DirectUpdateHandler2 and override commit and/or addDoc
B. subclass UpdateRequestProcessor (and include it in the update-chain) and 
override processAdd and/or processCommit
C. implement SolrEventListener and implement postCommit and/or postSoftCommit

The use-case is to let other parts of a system know that a document is 
searchable without having to create a poller which has to have state on 
when/how it polls.

Any ideas or tricks out there?


Fredrik


--
Fredrik Rødland   Mail:fredrik.rodl...@finn.no
FINN.no   Cell:+47 99 21 98 17
  Twitter: @fredrikr
Oslo, NORWAY  Web: http://about.me/fmr



Re: hook to know when a DOC is committed.

2013-05-23 Thread Jack Krupansky
A poller really is the most sensible, practical, and easiest route to go. If 
you add the versions=true parameter to your update request and have the 
transaction log enabled, the update response will include the version number 
for each document id. The poller can then also tell whether an update has 
been committed.
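
As a rough illustration of the bookkeeping such a poller needs, here is a minimal Java sketch. It assumes the poller can read the document's _version_ field from a search result, and that a later update to the same id is assigned a larger version. The names PendingVersions, recordUpdate and isSearchable are made up for this example and are not part of Solr. Deletes are not handled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Remembers the version each update was assigned until a search shows it. */
final class PendingVersions {
    // doc id -> version reported in the update response (versions=true)
    private final Map<String, Long> expected = new ConcurrentHashMap<>();

    /** Record the version the update response reported for a document id. */
    void recordUpdate(String docId, long version) {
        // Keep the newest version if the same id is updated more than once.
        expected.merge(docId, version, Long::max);
    }

    /**
     * Compare the version found by a search (null if the doc was not found)
     * with the expected one. Returns true and stops tracking the id once the
     * searchable version is at least the expected version. Also returns true
     * when nothing is pending for the id.
     */
    boolean isSearchable(String docId, Long searchableVersion) {
        Long want = expected.get(docId);
        if (want == null) {
            return true; // nothing pending for this id
        }
        if (searchableVersion == null || searchableVersion < want) {
            return false; // not visible yet, or an older version is visible
        }
        expected.remove(docId, want);
        return true;
    }

    /** Number of ids still waiting to become searchable. */
    int pendingCount() {
        return expected.size();
    }
}
```

The poller would call recordUpdate after each update request and isSearchable after each search, retrying while it returns false.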


Also, with soft commit, documents should be visible much more rapidly.

Do you have some other, unmentioned requirement that you feel is biasing you 
against a sensible poller? Clue us in as to the nature of such a 
requirement.


-- Jack Krupansky



Re: hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote:

Hi Jack,

thanks for your answer.

> A poller really is the most sensible, practical, and easiest route to go. If 
> you add the versions=true parameter to your update request and have the 
> transaction log enabled the update response will have the version numbers for 
> each document id, then the poller can also tell if an update has been 
> committed as well.

The poller will still have to retry before advertising a doc as searchable - 
won't it?

> Do you have some other, unmentioned requirement that you feel is biasing you 
> against a sensible poller? Clue us in as to the nature of such a requirement.

My plan was to link Solr with our already established high-volume 
messaging system, so that each time a document becomes searchable a message 
would be broadcast on a given channel.

Our system consists of approx. 10 indexes and 8 replicas of each of these, so 
keeping track of all these with pollers would require a whole bunch of logic.  
A push-based system would make it a lot easier to know where and when a 
document is searchable.



regards,


Fredrik





Re: hook to know when a DOC is committed.

2013-05-23 Thread Jack Krupansky
Yes, by definition, a poller retries. But by picking a sensible default for 
the initial poll and retry (possibly an initial delay tuned to match the 
average update/commit time), coupled with a traditional exponential backoff, 
that should not be a problem at all. In other words, an average request would 
not require a retry.
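
The tuned initial delay plus exponential backoff could look something like the following Java sketch. The class name BackoffPoller, its parameters and the check callback are hypothetical. The check would wrap whatever query tells you the document is visible.

```java
import java.util.function.BooleanSupplier;

/** Polls a check with an initial delay that doubles after every miss. */
final class BackoffPoller {
    private final long initialDelayMs; // tune to the average update/commit time
    private final long maxDelayMs;     // upper bound for a single wait
    private final int maxAttempts;     // give up after this many checks

    BackoffPoller(long initialDelayMs, long maxDelayMs, int maxAttempts) {
        if (initialDelayMs < 0 || maxDelayMs < initialDelayMs || maxAttempts < 1) {
            throw new IllegalArgumentException("invalid poller settings");
        }
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.maxAttempts = maxAttempts;
    }

    /** Wait before the given attempt (0-based): doubled per retry, capped. */
    long delayForAttempt(int attempt) {
        long delay = initialDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /**
     * Runs the check until it succeeds. Returns the number of checks used,
     * or -1 if all attempts failed or the thread was interrupted.
     */
    int pollUntil(BooleanSupplier check) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                Thread.sleep(delayForAttempt(attempt));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return -1;
            }
            if (check.getAsBoolean()) {
                return attempt + 1;
            }
        }
        return -1;
    }
}
```

With new BackoffPoller(200, 5000, 8), the first check happens after 200 ms and each miss doubles the wait, up to 5 seconds. If the initial delay matches the typical commit time, most documents are found on the first check.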


Even so, do you feel that there is some sort of problem with retry? If so, 
please state what it is.


Again, if you utilize soft commit, the time to commit will be significantly 
reduced.


Or, just go ahead and force a commit on every update where the delay of a 
poll request is not acceptable. But I'd recommend the tuned poller.


"would require a whole bunch of logic" - and you think the commit hooks and 
your push model implementation (on both the Solr and client side) will be 
less logic?!!


-- Jack Krupansky
