Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

Dmitry Pavlov Tue, 27 Mar 2018 10:13:48 -0700

Ivan, sure :)

Thank you for this contribution, merged to master.


вт, 27 мар. 2018 г. в 20:08, Ivan Rakov <[email protected]>:

> Dmitry,
>
> Firstly PR contained dirty fix for performance measurement, but now it
> contains good fix. :) Sorry for inconvenience.
> I've renamed the PR.
>
> Best Regards,
> Ivan Rakov
>
> On 27.03.2018 19:40, Dmitry Pavlov wrote:
> > Hi Eduard, thank you for review.
> >
> > Hi Ivan,
> >
> > I'm confused on PR naming
> > https://github.com/apache/ignite/pull/3656
> >
> > Could you rename?
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > вт, 27 мар. 2018 г. в 19:38, Eduard Shangareev <
> [email protected]
> >> :
> >> Ivan, I have reviewed your changes, looks good.
> >>
> >> On Tue, Mar 27, 2018 at 2:56 PM, Ivan Rakov <[email protected]>
> wrote:
> >>
> >>> Igniters,
> >>>
> >>> I've completed development of https://issues.apache.org/jira
> >>> /browse/IGNITE-7754. TeamCity state is ok. Please, review my changes.
> >>> Please note that it will be possible to track time of WAL fsync on
> >>> checkpoint begin by *walCpRecordFsyncDuration *metric in "Checkpoint
> >>> started" message.
> >>>
> >>> Also, I've created https://issues.apache.org/jira/browse/IGNITE-8057
> >> with
> >>> description of possible further improvement of WAL fsync on checkpoint
> >>> begin.
> >>>
> >>> Best Regards,
> >>> Ivan Rakov
> >>>
> >>>
> >>> On 26.03.2018 23:45, Valentin Kulichenko wrote:
> >>>
> >>>> Ivan,
> >>>>
> >>>> It's all good then :) Thanks!
> >>>>
> >>>> -Val
> >>>>
> >>>> On Mon, Mar 26, 2018 at 1:50 AM, Ivan Rakov <[email protected]>
> >>>> wrote:
> >>>>
> >>>> Val,
> >>>>> There's no any sense to use WalMode.NONE in production environment,
> >> it's
> >>>>> kept for testing and debugging purposes (including possible user
> >>>>> activities
> >>>>> like capacity planning).
> >>>>> We already print a warning at node start in case WalMode.NONE is set:
> >>>>>
> >>>>> U.quietAndWarn(log,"Started write-ahead log manager in NONE mode,
> >>>>>
> >>>>>> persisted data may be lost in " +
> >>>>>>        "a case of unexpected node failure. Make sure to deactivate
> the
> >>>>>> cluster before shutdown.");
> >>>>>>
> >>>>>> Best Regards,
> >>>>> Ivan Rakov
> >>>>>
> >>>>>
> >>>>> On 24.03.2018 1:40, Valentin Kulichenko wrote:
> >>>>>
> >>>>> Dmitry,
> >>>>>> Thanks for clarification. So it sounds like if we fix all other
> modes
> >> as
> >>>>>> we
> >>>>>> discuss here, NONE would be the only one allowing corruption. I also
> >>>>>> don't
> >>>>>> see much sense in this and I think we should clearly state this in
> the
> >>>>>> doc,
> >>>>>> as well print out a warning if NONE mode is used. Eventually, if
> it's
> >>>>>> confirmed that there are no reasonable use cases for it, we can
> >>>>>> deprecate
> >>>>>> it.
> >>>>>>
> >>>>>> -Val
> >>>>>>
> >>>>>> On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov <
> [email protected]
> >>>>>> wrote:
> >>>>>>
> >>>>>> Hi Val,
> >>>>>>
> >>>>>>> NONE means that the WAL log is disabled and not written at all. Use
> >> of
> >>>>>>> the
> >>>>>>> mode is at your own risk. It is possible that restore state after
> the
> >>>>>>> crash
> >>>>>>> at the middle of checkpoint will not succeed. I do not see much
> sence
> >>>>>>> in
> >>>>>>> it, especially in production.
> >>>>>>>
> >>>>>>> BACKGROUND is full functional WAL mode, but allows some delay
> before
> >>>>>>> flush
> >>>>>>> to disk.
> >>>>>>>
> >>>>>>> Sincerely,
> >>>>>>> Dmitriy Pavlov
> >>>>>>>
> >>>>>>> сб, 24 мар. 2018 г. в 1:07, Valentin Kulichenko <
> >>>>>>> [email protected]>:
> >>>>>>>
> >>>>>>> I agree. In my view, any possibility to get a corrupted storage is
> a
> >>>>>>> bug
> >>>>>>>
> >>>>>>>> which needs to be fixed.
> >>>>>>>>
> >>>>>>>> BTW, can someone explain semantics of NONE mode? What is the
> >>>>>>>> difference
> >>>>>>>> from BACKGROUND from user's perspective? Is there any particular
> use
> >>>>>>>> case
> >>>>>>>> where it can be used?
> >>>>>>>>
> >>>>>>>> -Val
> >>>>>>>>
> >>>>>>>> On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov <
> >> [email protected]
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Ivan,
> >>>>>>>>
> >>>>>>>>> IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
> >>>>>>>>>
> >>>>>>>>> Sincerely,
> >>>>>>>>> Dmitriy Pavlov
> >>>>>>>>>
> >>>>>>>>> пт, 23 мар. 2018 г. в 12:23, Ivan Rakov <[email protected]>:
> >>>>>>>>>
> >>>>>>>>> Igniters, there's another important question about this matter.
> >>>>>>>>>
> >>>>>>>>>> Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think
> >> that
> >>>>>>>>>> we
> >>>>>>>> have to do it: it will cause similar performance drop, but if we
> >>>>>>>>
> >>>>>>>>> consider LOG_ONLY broken without these fixes, BACKGROUND is
> broken
> >> as
> >>>>>>>>>> well.
> >>>>>>>>> Best Regards,
> >>>>>>>>>> Ivan Rakov
> >>>>>>>>>>
> >>>>>>>>>> On 23.03.2018 10:27, Ivan Rakov wrote:
> >>>>>>>>>>
> >>>>>>>>>> Fixes are quite simple.
> >>>>>>>>>>> I expect them to be merged in master in a week in worst case.
> >>>>>>>>>>>
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>
> >>>>>>>>>>> On 22.03.2018 17:49, Denis Magda wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Ivan,
> >>>>>>>>>>>> How quick are you going to merge the fix into the master? Many
> >>>>>>>>>>>> persistence
> >>>>>>>>>>>> related optimizations have already stacked up. Probably, we
> can
> >>>>>>>>>>>>
> >>>>>>>>>>>> release
> >>>>>>>>>> them sooner if the community agrees.
> >>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>>> Denis
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov <
> >>>>>>>>>>>>
> >>>>>>>>>>>> [email protected]>
> >>>>>>>>>> wrote:
> >>>>>>>>> Thanks all!
> >>>>>>>>>>>>> We seem to have reached a consensus on this issue. I'll just
> >> add
> >>>>>>>>>>>>> necessary
> >>>>>>>>>>>>> fsyncs under IGNITE-7754.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> +1 for fixing LOG_ONLY. If current implementation doesn't
> >>>>>>>>>>>>> protect
> >>>>>>>>>>>>>
> >>>>>>>>>>>> from
> >>>>>>>>> data
> >>>>>>>>>>> corruption, it doesn't make sence.
> >>>>>>>>>>>>>> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [email protected]>
> >>>>>>>>>>>> wrote:
> >>>>>>>>> +1 for the fix of LOG_ONLY
> >>>>>>>>>>>>>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> >>>>>>>>>>>>>>> [email protected]> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +1 for fixing LOG_ONLY to enforce corruption safety given
> the
> >>>>>>>>>>>>>>> provided
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> performance results.
> >>>>>>>>>>>>>>>> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>> :
> >>>>>>>>>> +1 for accepting drop in LOG_ONLY. 7% is not that much and
> >>>>>>>>>>>> not a
> >>>>>>>>>>>>>> drop
> >>>>>>>>> at
> >>>>>>>>>>>>>>>> all, provided that we fixing a bug. I.e. should we
> implement
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>> correctly
> >>>>>>>>> in the first place we would never notice any "drop".
> >>>>>>>>>>>>>>>> I do not understand why someone would like to use current
> >>>>>>>>>>>>>>>>> broken
> >>>>>>>>>>>>>>> mode.
> >>>>>>>>>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
> >>>>>>>>>>>>>>>>> <[email protected]>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi, I think option 1 is better. As Val said any mode that
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> allows
> >>>>>>>>>>>>>>> corruption
> >>>>>>>>>> does not make much sense.
> >>>>>>>>>>>>>>>>>> What Ivan mentioned here as drop, in relation to old
> mode
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> DEFAULT
> >>>>>>>>>>>>>>>> (FSYNC
> >>>>>>>>>>> now), is still significant perfromance boost.
> >>>>>>>>>>>>>>>>> Sincerely,
> >>>>>>>>>>>>>>>>>> Dmitriy Pavlov
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> ср, 21 мар. 2018 г. в 17:56, Ivan Rakov <
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>>>> :
> >>>>>>>>>> I've attached benchmark results to the JIRA ticket.
> >>>>>>>>>>>> We observe ~7% drop in "fair" LOG_ONLY_SAFE mode,
> >>>>>>>>>>>>>>>>>>> independent
> >>>>>>>>>>>>>>>>> of
> >>>>>>>>> WAL
> >>>>>>>>>>> compaction enabled flag. It's pretty significant drop: WAL
> >>>>>>>>>>>>>>>>> compaction
> >>>>>>>>>>>>>>>>> itself gives only ~3% drop.
> >>>>>>>>>>>>>>>>> I see two options here:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1) Change LOG_ONLY behavior. That implies that we'll be
> >>>>>>>>>>>>>>>>>>> ready
> >>>>>>>>>>>>>>>>> to
> >>>>>>>>> release
> >>>>>>>>>>> AI 2.5 with 7% drop.
> >>>>>>>>>>>>>>>>>> 2) Introduce LOG_ONLY_SAFE, make it default, add release
> >>>>>>>>>>>>>>>>>>> note
> >>>>>>>>>>>>>>>>> to AI
> >>>>>>>>> 2.5
> >>>>>>>>>>>>>>>>>> that we added power loss durability in default mode, but
> >>>>>>>>>>>>>>>>> user
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> may
> >>>>>>>>>>>>>>> fallback to previous LOG_ONLY in order to retain
> >>>>>>>>>> performance.
> >>>>>>>>>>>>>>>>> Thoughts?
> >>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On 20.03.2018 16:00, Ivan Rakov wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Val,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> If a storage is in
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> corrupted state, does it mean that it needs to be
> >>>>>>>>>>>>>>>>>>>>> completely
> >>>>>>>>>>>>>>>>>>> removed
> >>>>>>>>>> and
> >>>>>>>>>>>>>>>>>> cluster needs to be restarted without data?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Yes, there's a chance that in LOG_ONLY all local data
> >> will
> >>>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>> lost,
> >>>>>>>>>> but only in *power loss**/ OS crash* case.
> >>>>>>>>>>>>>>>>> kill -9, JVM crash, death of critical system thread and
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> all
> >>>>>>>>>>>>>>>>>> other
> >>>>>>>>> cases that usually take place are variations of *process
> >>>>>>>>>>>>>>>>>>>> crash*.
> >>>>>>>>>>>>>>>>>> All
> >>>>>>>>>>> WAL modes (except NONE, of course) ensure corruption-safety
> >>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>> case
> >>>>>>>>> of
> >>>>>>>>>>>>>>>>> process crash.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> If so, I'm not sure any mode
> >>>>>>>>>>>>>>>>>>>> that allows corruption makes much sense to me.
> >>>>>>>>>>>>>>>>>>>>> It depends on performance impact of enforcing
> >> power-loss
> >>>>>>>>>>>>>>>>>>>> corruption
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> safety. Price of full protection from power loss is
> high
> >> -
> >>>>>>>>>>>>>>>>> FSYNC
> >>>>>>>>>>>>>> is
> >>>>>>>>> way slower (2-10 times) than other WAL modes. The question is
> >>>>>>>>>>>>>>>>> whether
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> ensuring weaker guarantees (corruption can't happen, but
> >>>>>>>>>>>>>>>>>> loss
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>> last
> >>>>>>>>>> updates can) will affect performance as badly as strong
> >>>>>>>>>>>>>>>>>> guarantees.
> >>>>>>>>>>>>>>>>>> I'll share benchmark results soon.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>>>> On 20.03.2018 5:09, Valentin Kulichenko wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> What do we understand under "data corruption" here?
> If
> >> a
> >>>>>>>>>>>>>>>>>>>>> storage
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>> corrupted state, does it mean that it needs to be
> >> completely
> >>>>>>>>>>>>>>>>>> removed
> >>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> cluster needs to be restarted without data? If so, I'm
> not
> >>>>>>>>>>>>>>>>>> sure
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> any
> >>>>>>>>>> mode
> >>>>>>>>>>>>>>>>>> that allows corruption makes much sense to me. How am I
> >>>>>>>>>>>>>>>>>> supposed
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> to
> >>>>>>>>>>> use a
> >>>>>>>>>>>>>>>>>> database, if virtually any failure can end with complete
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> loss of
> >>>>>>>>>>>>>>>>>>>>> data?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> In any case, this definitely should not be a default
> >>>>>>>>>>>>>>>>>> behavior.
> >>>>>>>>>>>>>>>> If
> >>>>>>>>> user ever
> >>>>>>>>>>>>>>>>>> switches to corruption-unsafe mode, there should be a
> >>>>>>>>>>>>>>>>>> clear
> >>>>>>>>>>>>>>>>>>> warning
> >>>>>>>>> about
> >>>>>>>>>>>>>>>>>> this.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Val
> >>>>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> [email protected]>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Ticket to track changes:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-7754
> >>>>>>>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <
> >>>>>>>>>>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>> Vladimir,
> >>>>>>>>>>>>>>>>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write
> >>>>>>>>>>>>>>>>>>>>>>> guarantees
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> unless power
> >>>>>>>>>>> loss has happened.
> >>>>>>>>>>>>>>>>>>>>>>>> Seems like we need to measure performance
> difference
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> decide
> >>>>>>>>> whether do
> >>>>>>>>>>>>>>>>>>>>> we need separate WAL mode. If it will be invisible,
> >>>>>>>>>>>>>>>>>> we'll
> >>>>>>>>>>>>>>>>>>>>>> just
> >>>>>>>>>> fix
> >>>>>>>>>>>>>>>>>>>>> these
> >>>>>>>>>>>>>>>>>> bugs without introducing new mode; if it will be
> >>>>>>>>>>>>>>>>>>>> perceptible,
> >>>>>>>>>>>>>>>>>>>>>>>> we'll
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> continue the discussion about introducing
> >>>>>>>>>>>>>>>>>>>>>> LOG_ONLY_SAFE.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Makes sense?
> >>>>>>>>>>>>>>>>>>>> Yes, this sounds like the right approach.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
>
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

Reply via email to