On 6/2/06, Simon Willnauer <[EMAIL PROTECTED]> wrote:
> This is also true. The problem is still the server response: if I queue
> some updates/inserts or index them into a RAMDirectory, I still have the
> problem of concurrent indexing. The client should wait for the writing
> process to finish correctly; otherwise the response should be an Error 500.
> If the client does not wait (is not held), there is a risk of a lost update.
> The same problem appears when indexing entries into the search index. There
> won't be many concurrent inserts and updates, so I can't wait for other
> inserts to do batch indexing. I could index them into RAMDirectories and
> search multiple indexes, but what happens if the server crashes with a
> certain number of entries indexed only into a RAMDirectory?
> Any solutions for that in the Solr project?
But the problem is twofold:
1) You can't freely mix adds and deletes in Lucene.
2) Changes are not immediately visible... you need to close the
current writer and open a new IndexSearcher, both relatively
heavyweight operations.
Solr solved (1) by adding all documents immediately as they come in
(using the same thread as the client request). Deletes are acknowledged
immediately, but are deferred. When a "commit" happens, the writer is
closed, a new reader is opened, and all the deletes are processed.
Then a new IndexSearcher is opened, making all the adds and deletes
visible.
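A minimal sketch of that deferred-delete pattern, using plain in-memory
sets rather than the real Lucene/Solr APIs (class and method names here
are illustrative, not Solr's): adds hit the "writer" state immediately,
deletes are only queued, and commit() applies the queued deletes and then
swaps in a fresh "searcher" snapshot.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the Solr pattern described above (not real Solr code):
// adds apply immediately on the request thread; deletes are acknowledged but
// deferred; commit() applies deletes and opens a new searcher snapshot.
public class DeferredDeleteIndex {
    private final Set<String> writerDocs = new HashSet<>();    // live writer state
    private final List<String> pendingDeletes = new ArrayList<>();
    private Set<String> searcherView = new HashSet<>();        // what queries see

    public void add(String id) {
        writerDocs.add(id);           // applied immediately as the doc comes in
    }

    public void delete(String id) {
        pendingDeletes.add(id);       // acknowledged now, processed at commit
    }

    public boolean isVisible(String id) {
        return searcherView.contains(id);  // queries hit the last snapshot
    }

    public void commit() {
        // "the writer is closed ... and all the deletes are processed"
        for (String id : pendingDeletes) {
            writerDocs.remove(id);
        }
        pendingDeletes.clear();
        // "then a new IndexSearcher is opened", making changes visible
        searcherView = new HashSet<>(writerDocs);
    }
}
```

Note how a delete stays invisible to searchers until the next commit, which
is exactly why mixing adds and deletes freely was awkward in Lucene at the
time.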
Solr doesn't do anything to solve (2). Its main focus has been on
providing high-throughput, low-latency queries, not on the
"freshness" of updates.
Decoupling the indexing from storage might help if new additions don't
need to be searchable (but do need to be retrievable by id)... you
could make storage synchronous, but batch the adds/deletes in some
manner and open a new IndexSearcher less frequently.
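That decoupling could be sketched roughly as follows (an assumed design,
not Solr code; the class, the BATCH_SIZE threshold, and the method names
are all hypothetical): stored fields go into a synchronous store so a
document is retrievable by id right away, while searchability only arrives
when a batch threshold triggers a searcher reopen.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: synchronous storage (retrievable by id immediately),
// batched index changes, and an infrequently reopened searcher snapshot.
public class DecoupledStoreIndex {
    private static final int BATCH_SIZE = 3;  // illustrative batch threshold

    private final Map<String, String> store = new HashMap<>(); // synchronous storage
    private final Set<String> writerDocs = new HashSet<>();    // pending index state
    private Set<String> searcherView = new HashSet<>();        // snapshot queries see
    private int opsSinceCommit = 0;

    public void add(String id, String body) {
        store.put(id, body);      // retrievable by id right away
        writerDocs.add(id);       // searchable only after the next batch commit
        maybeCommit();
    }

    public String getById(String id) {
        return store.get(id);
    }

    public boolean isSearchable(String id) {
        return searcherView.contains(id);
    }

    private void maybeCommit() {
        if (++opsSinceCommit >= BATCH_SIZE) {
            // open a new "IndexSearcher" less frequently: only every BATCH_SIZE ops
            searcherView = new HashSet<>(writerDocs);
            opsSinceCommit = 0;
        }
    }
}
```

The point of the batching is that the heavyweight reopen cost is paid once
per batch instead of once per update, at the price of search freshness.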
-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server