+1 for accepting the drop in LOG_ONLY. 7% is not that much, and it is not really a drop at all, given that we are fixing a bug. That is, had we implemented it correctly in the first place, we would never have noticed any "drop". I do not understand why anyone would want to use the current broken mode.

On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov <dpavlov....@gmail.com> wrote:

> Hi, I think option 1 is better. As Val said, any mode that allows
> corruption does not make much sense.
>
> What Ivan mentioned here as a drop is still a significant performance
> boost relative to the old DEFAULT mode (now FSYNC).
>
> Sincerely,
> Dmitriy Pavlov
>
> Wed, Mar 21, 2018 at 17:56, Ivan Rakov <ivan.glu...@gmail.com>:
>
> > I've attached benchmark results to the JIRA ticket.
> > We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of WAL
> > compaction enabled flag. It's pretty significant drop: WAL compaction
> > itself gives only ~3% drop.
> >
> > I see two options here:
> > 1) Change LOG_ONLY behavior. That implies that we'll be ready to
> > release AI 2.5 with 7% drop.
> > 2) Introduce LOG_ONLY_SAFE, make it default, add release note to
> > AI 2.5 that we added power loss durability in default mode, but user
> > may fall back to previous LOG_ONLY in order to retain performance.
> >
> > Thoughts?
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 20.03.2018 16:00, Ivan Rakov wrote:
> > > Val,
> > >
> > >> If a storage is in corrupted state, does it mean that it needs to
> > >> be completely removed and cluster needs to be restarted without
> > >> data?
> > >
> > > Yes, there's a chance that in LOG_ONLY all local data will be lost,
> > > but only in *power loss / OS crash* case.
> > > kill -9, JVM crash, death of critical system thread and all other
> > > cases that usually take place are variations of *process crash*. All
> > > WAL modes (except NONE, of course) ensure corruption-safety in case
> > > of process crash.
> > >
> > >> If so, I'm not sure any mode that allows corruption makes much
> > >> sense to me.
> > >
> > > It depends on performance impact of enforcing power-loss corruption
> > > safety. Price of full protection from power loss is high - FSYNC is
> > > way slower (2-10 times) than other WAL modes. The question is
> > > whether ensuring weaker guarantees (corruption can't happen, but
> > > loss of last updates can) will affect performance as badly as strong
> > > guarantees.
> > > I'll share benchmark results soon.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > > On 20.03.2018 5:09, Valentin Kulichenko wrote:
> > >> Guys,
> > >>
> > >> What do we understand under "data corruption" here? If a storage
> > >> is in corrupted state, does it mean that it needs to be completely
> > >> removed and cluster needs to be restarted without data? If so, I'm
> > >> not sure any mode that allows corruption makes much sense to me.
> > >> How am I supposed to use a database, if virtually any failure can
> > >> end with complete loss of data?
> > >>
> > >> In any case, this definitely should not be a default behavior. If
> > >> user ever switches to corruption-unsafe mode, there should be a
> > >> clear warning about this.
> > >>
> > >> -Val
> > >>
> > >> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glu...@gmail.com>
> > >> wrote:
> > >>
> > >>> Ticket to track changes:
> > >>> https://issues.apache.org/jira/browse/IGNITE-7754
> > >>>
> > >>> Best Regards,
> > >>> Ivan Rakov
> > >>>
> > >>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
> > >>>
> > >>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov
> > >>>> <ivan.glu...@gmail.com> wrote:
> > >>>>
> > >>>>> Vladimir,
> > >>>>>
> > >>>>> Unlike BACKGROUND, LOG_ONLY provides strict write guarantees
> > >>>>> unless power loss has happened.
> > >>>>> Seems like we need to measure performance difference to decide
> > >>>>> whether do we need separate WAL mode. If it will be invisible,
> > >>>>> we'll just fix these bugs without introducing new mode; if it
> > >>>>> will be perceptible, we'll continue the discussion about
> > >>>>> introducing LOG_ONLY_SAFE.
> > >>>>> Makes sense?
> > >>>>
> > >>>> Yes, this sounds like the right approach.
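
For reference, the process-crash versus power-loss distinction discussed in this thread can be illustrated with plain JDK file I/O. The following is only a rough sketch and not Ignite's WAL implementation: the class name `WalSyncSketch`, the `append` helper and the `sync` flag are hypothetical names made up for this illustration. A write that returns without `FileChannel.force` has been handed to the operating system, so it normally survives a crash of the writing process, but it may still sit in the OS cache and be lost on power loss or OS crash. Calling `force` asks the OS to push the data to the storage device before returning, which is where the extra cost of an FSYNC-style mode comes from.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; this is not how Ignite implements its WAL modes.
public class WalSyncSketch {

    /**
     * Appends one record to the log file and returns the file size afterwards.
     * If sync is false, the record is only handed to the OS (it may stay in the OS cache).
     * If sync is true, the channel is forced to the storage device before returning.
     */
    public static long append(Path log, byte[] record, boolean sync) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);

            // A single write() call may be partial, so loop until the buffer is drained.
            while (buf.hasRemaining())
                ch.write(buf);

            if (sync)
                ch.force(true); // flush file content and metadata to the device

            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("wal-sketch", ".log");
        byte[] record = "update-1\n".getBytes(StandardCharsets.UTF_8);

        long sizeNoSync = append(log, record, false); // fast, not power-loss safe
        long sizeSynced = append(log, record, true);  // slower, forced to the device

        System.out.println(sizeNoSync + " " + sizeSynced);
        Files.delete(log);
    }
}
```

Both calls leave the same bytes in the file; the only difference is when the data is guaranteed to have reached the device, which is why the trade-off shows up in benchmarks and not in functional tests.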