Jason Rutherglen wrote:
> For Ocean I created a workaround where the IndexCommits from IndexDeletionPolicy are saved in a map in order to achieve deleting based on the IndexReader. It would be more straightforward to delete from the IndexCommit in IndexReader.
It seems like we are mixing up deleting a whole commit point, vs deleting individual documents? Or does Ocean somehow decide to delete a whole commit point based on which documents have been deleted?
> I realize people want to get away from IndexReader performing updates, however, for my use case, realtime search updating from IndexReader makes sense mainly for obtaining the doc ids of deletions. With IndexWriter managing the merges it would seem difficult to expose doc numbers, but perhaps there is a way.
IndexWriter can now delete by query, but it sounds like that's not sufficient for Ocean?
Under the hood, IndexWriter has the infrastructure to hold pending deleted docIDs and update these docIDs when a merge is committed. I.e., previously we forced a flush of all pending deletes on every flush/merge, but now we buffer the docIDs across flushes/merges. This means IndexWriter *could* delete by docID; however, none of this is exposed publicly.
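To make the remapping idea concrete, here is a toy sketch in plain Java. It is not Lucene's actual code: the class name PendingDeleteRemapper, the remap method, and the boolean array marking which old docIDs the merge dropped are all made up for illustration. It only shows why buffered docIDs have to be shifted once a merge compacts away documents that were already deleted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
class PendingDeleteRemapper {

    /**
     * Translates buffered (pending) docIDs into the numbering that is valid
     * after a merge. droppedByMerge[i] == true means old docID i was already
     * deleted and the merge removed it, so every later docID shifts down.
     */
    static List<Integer> remap(List<Integer> pending, boolean[] droppedByMerge) {
        // Build the old -> new docID map; dropped docs get -1.
        int[] newId = new int[droppedByMerge.length];
        int next = 0;
        for (int old = 0; old < droppedByMerge.length; old++) {
            newId[old] = droppedByMerge[old] ? -1 : next++;
        }

        // Carry over only the pending deletes whose docs survived the merge.
        List<Integer> remapped = new ArrayList<>();
        for (int old : pending) {
            if (newId[old] >= 0) {
                remapped.add(newId[old]);
            }
        }
        return remapped;
    }
}
```

In this sketch a pending delete that points at a doc the merge already dropped is simply discarded, since there is nothing left to delete.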
Also, this doesn't solve the problem of how you would get the docIDs to delete in the first place (i.e., one must still use a separate IndexReader for that).
I'm not sure this helps you (Ocean) since you presumably need to flush deletes very quickly to have realtime search...
Mike