On Sun, Dec 20, 2009 at 12:14 AM, Marvin Humphrey <[email protected]> wrote:
>> I also think that Mike is making too much distinction between
>> "relying on the file system" and "using shared memory". I think
>> one can safely view them as two interfaces to the same underlying
>> mechanism.
>
> I agree with that, and it was kind of confusing since Mike had
> previously seemed to suggest that the flush() semantics were a
> "Lucy-ification" of the Lucene model.

I still need to answer on 2026, but this caught my eye first ;)

Using the filesystem for sharing vs using shared memory seems quite
different to me. EG one could create a rich data structure (say an
FST) to represent the terms dict in RAM, then share that terms dict
amongst many processes, right? Whereas using the filesystem really
requires a file-flat data structure?

Ie, "going through the filesystem" and "going through shared memory"
are two alternatives for enabling efficient process-only concurrency
models. They have interesting tradeoffs (I'll answer more in 2026),
but the fact that one of them is backed by a file managed by the OS
seems like a salient difference.

Net/net, I think the proposed Lucy flush vs commit semantics are a
good approach, given Lucy's design constraints. Just like with
Lucene's NRT, Lucy users won't be forced to trade off reopen time
for durability.

Mike
