Same question. It would be very difficult to explain these two modes to users. We should do our best to fix LOG_ONLY first. Without these guarantees there is no reason to keep LOG_ONLY at all: users could simply use BACKGROUND with a high flush frequency. This is precisely how Cassandra works.

p.1 - sounds like a bug
p.2 - sounds like a bug as well; hopefully it will not introduce a serious
performance hit unless we write too much data to the WAL, which would mean
that we should work on its optimization (e.g. free list update overhead, no
delta updates, etc.)
p.3 - sounds like a bug as well

On Fri, Mar 16, 2018 at 8:17 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:

> Ivan,
>
> Is there a performance difference between LOG_ONLY and LOG_ONLY_SAFE?
>
> D.
>
> On Thu, Mar 15, 2018 at 4:23 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>
> > Igniters and especially Native Persistence experts,
> >
> > We decided to change default WAL mode from DEFAULT(FSYNC) to LOG_ONLY in
> > 2.4 release. That was difficult decision: we sacrificed power loss / OS
> > crash tolerance, but gained significant performance boost. From my
> > perspective, LOG_ONLY is right choice, but it still misses some critical
> > features that default mode should have.
> >
> > Let's focus on exact guarantees each mode provides. Documentation explains
> > it in pretty simple manner: LOG_ONLY - writes survive process crash, FSYNC
> > - writes survive power loss scenarios. I have to notice that documentation
> > doesn't describe what exactly can happen to node in LOG_ONLY mode in case
> > of power loss / OS crash scenario. Basically, there are two possible
> > negative outcomes: loss of several last updates (it's exactly what can
> > happen in BACKGROUND mode in case of process crash) and total storage
> > corruption (not only last updates, but all data will be lost). I've made a
> > quick research on this and came into conclusion that power loss in LOG_ONLY
> > can lead to storage corruption. There are several explanations for this:
> > 1) IgniteWriteAheadLogManager#fsync is kind of broken - it doesn't
> > perform actual fsync unless current WAL mode is FSYNC. We call this method
> > when we write checkpoint marker to WAL.
> > As long as part of WAL before checkpoint marker can be not synced,
> > "physical" records that are necessary for crash recovery in "Node stopped
> > in the middle of checkpoint" scenario may be corrupted after power loss.
> > If that happens, we won't be able to recover internal data structures,
> > which means loss of all data.
> > 2) We don't fsync WAL archive files unless current WAL mode is FSYNC. WAL
> > archive can contain necessary "physical" records as well, which leads us
> > to the case described above.
> > 3) We do perform fsync on rollover (switch of current WAL segment) in all
> > modes, but only when there's enough space to write switch segment record -
> > see FileWriteHandle#close. So there's a little chance that we'll skip
> > fsync and bump into the same case.
> >
> > Enforcing fsync on that three situations will give us a guarantee that
> > LOG_ONLY will survive power loss scenarios with possibility of losing
> > several last updates. There still can be a total binary mess in the last
> > part of WAL, but as long as we perform CRC check during WAL replay, we'll
> > detect start of that mess. Extra fsyncs may cause slight performance
> > degradation - all writes will have to await for one fsync on every
> > rollover and checkpoint. It's still much faster than fsync on every write
> > in WAL - I expect a few percent (0-5%) drop comparing to current LOG_ONLY.
> > But degradation is degradation, and LOG_ONLY mode without extra fsyncs
> > makes sense as well - that's why we need to introduce "LOG_ONLY + extra
> > fsyncs" as separate WAL mode. I think, we should make it default - it
> > provides significant durability bonus for the cost of one extra fsync for
> > each WAL segment written.
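
To make the CRC argument above concrete, here is a minimal sketch in plain Java of how replay can stop at the first damaged record and keep everything before it. This is not Ignite code: the record framing (length, payload, CRC32 of the payload) and the names WalCrcSketch, writeRecord and replay are assumptions made up for this example, and the real WAL record format is different.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

/**
 * Hypothetical framing, NOT the real Ignite WAL format:
 * [int payload length][payload bytes][int CRC32 of payload].
 */
final class WalCrcSketch {
    /** Appends one framed record at the buffer's current position. */
    static void writeRecord(ByteBuffer out, byte[] payload) {
        out.putInt(payload.length);
        out.put(payload);
        out.putInt(crc(payload));
    }

    /** CRC32 of the payload, truncated to the low 32 bits. */
    static int crc(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return (int) crc.getValue();
    }

    /**
     * Reads records from position to limit and returns the payloads of the
     * valid prefix. Replay stops at the first record that is truncated or
     * whose stored CRC does not match, which marks the start of the "mess".
     */
    static List<byte[]> replay(ByteBuffer in) {
        List<byte[]> valid = new ArrayList<>();

        while (in.remaining() >= Integer.BYTES) {
            int len = in.getInt();

            // A garbage length, or not enough bytes left for payload + CRC.
            if (len < 0 || len > in.remaining() - Integer.BYTES)
                break;

            byte[] payload = new byte[len];
            in.get(payload);

            if (in.getInt() != crc(payload))
                break;

            valid.add(payload);
        }

        return valid;
    }
}
```

With three records written and one byte flipped inside the third, replay returns the first two payloads. Note that this only bounds the damage to the tail; it does not help if records before the checkpoint marker were never synced, which is the case the extra fsyncs are meant to cover.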
> >
> > To sum it up, I propose a new set of possible WAL modes:
> > NONE - both process crash and power loss can lead to corruption
> > BACKGROUND - process crash can lead to last updates loss, power loss can
> > lead to corruption
> > LOG_ONLY - writes survive process crash, power loss can lead to corruption
> > LOG_ONLY_SAFE (default) - writes survive process crash, power loss can
> > lead to last updates loss
> > FSYNC - writes survive both process crash and power loss
> >
> > Thoughts?
> >
> > Best Regards,
> > Ivan Rakov
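
For reference, the difference between the three log-based modes in the list above can be sketched as an fsync policy. The code below is a simplified illustration in plain Java, not Ignite's implementation: SketchWalMode, SketchSegmentWriter and countFsyncs are hypothetical names, LOG_ONLY is modelled as never calling fsync (the current code does fsync on most rollovers), and NONE and BACKGROUND are left out. The only real API used is FileChannel.force.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Mode names follow the proposal; the flags are a simplification, not Ignite's WALMode. */
enum SketchWalMode {
    LOG_ONLY(false, false),
    LOG_ONLY_SAFE(false, true),
    FSYNC(true, true);

    final boolean syncEveryWrite;
    final boolean syncOnCheckpointAndRollover;

    SketchWalMode(boolean syncEveryWrite, boolean syncOnCheckpointAndRollover) {
        this.syncEveryWrite = syncEveryWrite;
        this.syncOnCheckpointAndRollover = syncOnCheckpointAndRollover;
    }
}

/** Hypothetical single-segment writer that counts how often it calls fsync. */
final class SketchSegmentWriter {
    private final FileChannel ch;
    private final SketchWalMode mode;
    private int fsyncs;

    SketchSegmentWriter(Path file, SketchWalMode mode) throws IOException {
        this.ch = FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        this.mode = mode;
    }

    private void write(byte[] rec) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(rec);
        while (buf.hasRemaining())
            ch.write(buf);
    }

    private void fsync() throws IOException {
        ch.force(true); // Forces file content and metadata to the storage device.
        fsyncs++;
    }

    /** Ordinary record: synced only in FSYNC mode. */
    void append(byte[] rec) throws IOException {
        write(rec);
        if (mode.syncEveryWrite)
            fsync();
    }

    /** Checkpoint marker: synced in LOG_ONLY_SAFE and FSYNC. */
    void checkpointMarker() throws IOException {
        write(new byte[] {'C', 'P'});
        if (mode.syncOnCheckpointAndRollover)
            fsync();
    }

    /** Segment switch: sync before closing, then report the fsync count. */
    int rollover() throws IOException {
        if (mode.syncOnCheckpointAndRollover)
            fsync();
        ch.close();
        return fsyncs;
    }

    /** Writes the given number of records, one checkpoint marker and one rollover to a temp file. */
    static int countFsyncs(SketchWalMode mode, int appends) {
        try {
            Path file = Files.createTempFile("wal-sketch", ".seg");
            try {
                SketchSegmentWriter w = new SketchSegmentWriter(file, mode);
                for (int i = 0; i < appends; i++)
                    w.append(new byte[] {(byte) i});
                w.checkpointMarker();
                return w.rollover();
            }
            finally {
                Files.deleteIfExists(file);
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this model the cost of LOG_ONLY_SAFE over LOG_ONLY is a fixed two fsyncs per segment (one per checkpoint marker, one per rollover) regardless of how many records are written, while FSYNC pays one per record on top of that.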