I would find that unacceptable for many systems I have worked on. Having Lucene update behind the writes would be fine, but making the insert wait until all of the Lucene work has finished would not be acceptable.
I would much rather that Lucene update from the write log in batches that are as big as needed to catch up and keep up.

On Mon, Feb 14, 2011 at 9:48 AM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:

> > Yes, that should work. But doesn't it assume that the index is updated
> > synchronously with the HBase row? I can imagine this will sometimes be an
> > issue, e.g. if it would involve performing expensive content extraction
> > (tika) or analysis.
>
> I don't understand here. You mean that the delay in indexing a
> document will adversely affect the HBase row insert because it's all
> in the same transaction? I think that's fine, e.g., it's just how the
> system'd work?
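
A minimal sketch of the update-behind idea discussed in this thread, written in plain Python rather than against the real HBase or Lucene APIs. WriteLog and BehindIndexer are hypothetical names used only for illustration: the insert returns as soon as the write is logged, and the indexer later reads whatever has accumulated since its last offset, so the batch size grows with the backlog.

```python
class WriteLog:
    """Hypothetical append-only log standing in for the row store's write log."""

    def __init__(self):
        self._entries = []

    def append(self, row_key, text):
        # The insert returns once the entry is logged; no indexing happens here.
        self._entries.append((row_key, text))
        return len(self._entries)  # sequence number of this write

    def read_from(self, offset):
        # Everything written since `offset`, however much that is.
        return self._entries[offset:]


class BehindIndexer:
    """Toy inverted index that trails the log and catches up in batches."""

    def __init__(self, log):
        self.log = log
        self.offset = 0      # how far into the log has been indexed
        self.postings = {}   # term -> set of row keys

    def catch_up(self):
        # One batch covers the whole backlog, so it is as big as needed.
        batch = self.log.read_from(self.offset)
        for row_key, text in batch:
            for term in text.lower().split():
                self.postings.setdefault(term, set()).add(row_key)
        self.offset += len(batch)
        return len(batch)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


log = WriteLog()
indexer = BehindIndexer(log)

log.append("row1", "HBase row insert")
log.append("row2", "Lucene index update")

before = indexer.search("lucene")   # index has not caught up yet
indexed = indexer.catch_up()        # processes both pending writes at once
after = indexer.search("lucene")    # row2 is now searchable
```

The trade-off in this sketch is that a search issued between the insert and the next catch_up call does not see the new row, which is the price of keeping expensive extraction or analysis off the insert path.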