How often is fsync called? If it's only during calls to commit, is it really that expensive? I mean, how often do you call commit?
If it is that expensive (do you have some numbers to share?), then I think that'd be a neat idea. Though "losing a few minutes worth of updates" may sometimes be unrecoverable, depending on the scenario, but I guess for those cases the 'standard way' should be used.

What if your background thread simply committed every couple of minutes? What's the difference between taking the snapshot (which means you had to call commit previously) and committing it, versus calling iw.commit from a background merge?

Shai

On Tue, Apr 6, 2010 at 5:11 PM, Earwin Burrfoot <ear...@gmail.com> wrote:

> So, I want to pump my IndexWriter hard and fast with documents.
>
> Removing fsync from FSDirectory helps. But for that I pay with possibility
> of index corruption, not only if my node suddenly loses power/kernelpanics,
> but also if it runs out of disk space (which happens more frequently).
>
> I invented the following solution:
> We write a special deletion policy that resembles SnapshotDeletionPolicy.
> At all times it takes hold of "current synced commit" and preserves it.
> Once every N minutes a special thread takes latest commit, syncs it and
> nominates as "current synced commit". The previous one gets deleted.
>
> Now we are disaster-proof, and do fsync asynchronously from indexing
> threads. We pay for this with somewhat bigger transient disc usage, and
> probably losing a few minutes worth of updates in case of a crash, but
> that's acceptable.
>
> How does this sound?
>
> --
> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
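
To make the quoted proposal concrete, here is a minimal Java sketch of the bookkeeping it describes: keep the newest commit plus the last synced one, and let a periodic task promote the newest commit to "current synced commit". It deliberately does not use Lucene classes. `Commit`, `SyncedCommitPolicy`, `onCommit`, `syncLatest` and `fsync` are hypothetical names invented for illustration, and the `fsync` stub only sets a flag; in a real policy that step would presumably sync the commit's files through the Directory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a commit point; this is NOT Lucene's IndexCommit. */
final class Commit {
    final long generation;
    boolean synced;   // set once the (simulated) fsync has finished
    boolean deleted;  // set once the policy has dropped this commit
    Commit(long generation) { this.generation = generation; }
}

/** Sketch of the proposed policy: preserve the newest commit and the last synced one. */
final class SyncedCommitPolicy {
    private final List<Commit> live = new ArrayList<>();
    private Commit currentSynced; // null until the first sync completes
    private Commit syncing;       // pinned while a sync is in progress

    /** Called by indexing threads after each cheap, unsynced commit. */
    public synchronized void onCommit(Commit c) {
        live.add(c);
        prune();
    }

    /** Called by the background thread once every N minutes. */
    public void syncLatest() {
        Commit latest;
        synchronized (this) {
            if (live.isEmpty()) return;
            latest = live.get(live.size() - 1);
            if (latest == currentSynced) return; // nothing new to sync
            syncing = latest;                    // protect it from prune()
        }
        fsync(latest); // slow step, done outside the lock
        synchronized (this) {
            currentSynced = latest; // nominate the new "current synced commit"
            syncing = null;
            prune();                // the previous synced commit is dropped here
        }
    }

    public synchronized int liveCount() { return live.size(); }

    /** Deletes every commit except the newest, the synced one, and one being synced. */
    private void prune() {
        Commit newest = live.get(live.size() - 1);
        live.removeIf(c -> {
            boolean keep = c == newest || c == currentSynced || c == syncing;
            if (!keep) c.deleted = true;
            return !keep;
        });
    }

    /** Assumption: placeholder for the real file sync; it only flips a flag. */
    private static void fsync(Commit c) { c.synced = true; }
}
```

With this bookkeeping at most three commits are alive at once (newest, synced, and one mid-sync), which matches the "somewhat bigger transient disc usage" mentioned in the quoted message. Commits made between two sync runs are never synced individually, and those are the updates that could be lost in a crash.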