I don't have the system at hand right now, but if I remember correctly, fsync took around 100-200 ms.
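
For illustration, moving fsync onto a dedicated thread, as discussed in the quoted thread below, could look roughly like this. It is only a minimal sketch under my own assumptions: BackgroundSyncer and syncLater are hypothetical names, not Lucene API, and it relies on plain java.nio FileChannel.force to do the sync. It also times each fsync, which is how a number like the one above could be measured.

```java
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper (not part of Lucene): fsyncs files on one dedicated thread. */
final class BackgroundSyncer implements AutoCloseable {
    // A single worker thread, so indexing/merging threads never block on fsync.
    private final ExecutorService syncThread = Executors.newSingleThreadExecutor();

    /** Queues an fsync of the file; the Future yields how long the fsync took, in nanoseconds. */
    Future<Long> syncLater(Path file) {
        return syncThread.submit(() -> {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                long start = System.nanoTime();
                ch.force(true); // ask the OS to flush file data and metadata to the device
                return System.nanoTime() - start;
            }
        });
    }

    /** Stops accepting new work; already queued syncs still run. */
    @Override
    public void close() {
        syncThread.shutdown();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("segment", ".bin");
        Files.write(file, "some index bytes".getBytes(StandardCharsets.UTF_8));
        try (BackgroundSyncer syncer = new BackgroundSyncer()) {
            Future<Long> pending = syncer.syncLater(file);
            // The writer thread is free to keep indexing here.
            // get() throws ExecutionException if the fsync itself failed.
            long nanos = pending.get();
            System.out.println("fsync took " + nanos / 1_000_000.0 + " ms");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The point of returning a Future is that a failed sync is not lost: whoever decides that a commit is durable has to call get() first and treat an exception as "this commit never reached the disk".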

2010/4/7 Shai Erera <ser...@gmail.com>:
> Earwin - do you have some numbers to share on the running time of the
> indexing application? You've mentioned that if you take out fsync into a BG
> thread, the running time improves, but I'm curious to know by how much.
>
> Shai
>
> On Wed, Apr 7, 2010 at 2:26 AM, Earwin Burrfoot <ear...@gmail.com> wrote:
>>
>> > Running out of disk space with fsync disabled won't lead to corruption.
>> > Even kill -9 the JRE process with fsync disabled won't corrupt.
>> > In these cases index just falls back to last successful commit.
>> >
>> > It's "only" power loss / OS / machine crash where you need fsync to
>> > avoid possible corruption (corruption may not even occur w/o fsync if
>> > you "get lucky").
>>
>> Sorry to disappoint you, but running out of disk space is worse than
>> kill -9.
>> You can write down the file (to cache in fact), close it, all without
>> getting any exceptions. And then it won't get flushed to disk because
>> the disk is full.
>> This can happen to segments file (and old one is deleted with default
>> deletion policy). This can happen to fat freq/prox files mentioned in
>> segments file (and yeah, the old segments file is deleted, so no
>> falling back).
>>
>> > What if your background thread simply committed every couple of minutes?
>> > What's the difference between taking the snapshot (which means you had
>> > to call commit previously) and commit it, to call iw.commit by a
>> > background merge?
>> --
>> > But: why do you need to commit so often?
>> To see stuff on reopen? Yes, I know about NRT.
>>
>> > You've reinvented autocommit=true!
>> ?? I'm doing regular commits, syncing down every Nth of it.
>>
>> > Doesn't this just BG the syncing? Ie you could make a dedicated
>> > thread to do this.
>> Yes, exactly, this BGs the syncing to a dedicated thread. Threads
>> doing indexation/merging can continue unhampered.
>>
>> > One possible win with this approach is.... the cost of fsync should go
>> > way down the longer you wait after writing bytes to the file and
>> > before calling fsync. This is because typically OS write caches
>> > expire by time (eg 30 seconds) so if you wait long enough the bytes
>> > will already at least be delivered to the IO system (but the IO system
>> > can do further caching which could still take time). On windows at
>> > least I definitely noticed this effect -- wait some before fsync'ing
>> > and it's net/net much less costly.
>> Yup. In fact you can just hold on to the latest commit for N seconds,
>> then switch to the new latest commit.
>> OS will fsync everything for you.
>>
>>
>> I'm just playing around with stupid idea. I'd like to have NRT
>> look-alike without binding readers and writers. :)
>> Right now it's probably best for me to save my time and cut over to
>> current NRT.
>> But. An important lesson was learnt - no fsyncing blows up your index
>> on out-of-disk-space.
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>>
>
>

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)