: Basically we have an application that does indexing of documents to 
: SOLR. This application is basically a third party and we didn't do much 
: meddling with it. There is another application that I'm developing to 
: use some of the fields data indexed (both when it is new or updated), do 
: some calculation, and adds a field with the result of that process. Then 
: I would like to index that document back to Solr (do an update).
: 
: I know it will mean twice reindexing of the whole document and that 
: sounds really inefficient, but we don't want the success of the first 
: application to depend on the extra processing.

Can you explain what you mean by "we don't want the success of the first 
application to depend on the extra processing"?

I'm guessing you are saying that you want the documents added by system#1 
to be in the index and searchable right away, before system#2 runs, and 
even if the calculations (and reindexing) done by system#2 fail?

: I am using PHP for the application and would prefer not to use Java just 
: for listener. Is there no way to do this?

Why not just include a "timestamp" field in your schema.xml that 
defaults to "NOW" (so system#1 doesn't even have to know about it) and 
have system#2 periodically query for docs with a timestamp value greater 
than the last time it ran? Then you don't have to worry about any special 
post-doc-added code, or deal with Java at all.
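
A rough sketch of the polling query system#2 could issue, written in Python 
only for illustration (the same request is easy to build from PHP). The 
Solr URL, the field name "timestamp", and the helper names are assumptions, 
not anything from your setup. The range is inclusive, so a doc stamped 
exactly at the last run time may be seen twice.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed values: adjust to the real Solr location and schema field name.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
TIMESTAMP_FIELD = "timestamp"


def solr_date(dt):
    """Format an aware datetime the way Solr date fields expect (UTC, 'Z' suffix)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_poll_params(last_run, rows=100):
    """Query parameters selecting docs stamped at or after last_run, oldest first."""
    return {
        "q": "%s:[%s TO *]" % (TIMESTAMP_FIELD, solr_date(last_run)),
        "sort": "%s asc" % TIMESTAMP_FIELD,
        "rows": rows,
        "wt": "json",
    }


def build_poll_url(last_run, rows=100):
    """Full select URL that system#2 would fetch on each polling run."""
    return SOLR_SELECT_URL + "?" + urlencode(build_poll_params(last_run, rows))
```

System#2 would fetch that URL, do its calculation for each returned doc, 
and send the updated docs back to Solr.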

If you *really* care that you get notified immediately after documents 
are committed (and don't want to just "poll"), you can still configure a 
postCommit listener that uses the RunExecutableListener to fire off 
whatever shell process you want to tell your (external) code to run, and 
then your external application can use the same polling logic to get the 
list of newly added docs since the last time it ran.
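
Whether the external code is started by cron or by a listener, it needs to 
remember when it last ran. A minimal sketch of that bookkeeping, again in 
Python purely as illustration; the state file format and the function 
names (load_last_run, save_last_run, run_once) are hypothetical.

```python
import json
import os
from datetime import datetime, timezone

# Used when no state file exists yet, so the first run sees every doc.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def load_last_run(path):
    """Return the start time of the previous successful run, or EPOCH if none."""
    try:
        with open(path) as f:
            return datetime.fromisoformat(json.load(f)["last_run"])
    except FileNotFoundError:
        return EPOCH


def save_last_run(path, when):
    """Write the state file via a temp file so a crash cannot leave it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_run": when.isoformat()}, f)
    os.replace(tmp, path)


def run_once(path, process, now=None):
    """Call process(last_run), then record this run's start time on success."""
    started = now or datetime.now(timezone.utc)
    last_run = load_last_run(path)
    process(last_run)  # if this raises, the saved time is left unchanged
    save_last_run(path, started)
    return last_run
```

Recording the start time of the run (rather than the end) means docs 
committed while the run was in progress get picked up next time.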


-Hoss
