Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

Eduard Shangareev Tue, 27 Mar 2018 09:38:30 -0700

Ivan, I have reviewed your changes, looks good.

On Tue, Mar 27, 2018 at 2:56 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:


> Igniters,
>
> I've completed development of https://issues.apache.org/jira
> /browse/IGNITE-7754. TeamCity state is ok. Please, review my changes.
> Please note that it will be possible to track time of WAL fsync on
> checkpoint begin by *walCpRecordFsyncDuration *metric in "Checkpoint
> started" message.
>
> Also, I've created https://issues.apache.org/jira/browse/IGNITE-8057 with
> description of possible further improvement of WAL fsync on checkpoint
> begin.
>
> Best Regards,
> Ivan Rakov
>
>
> On 26.03.2018 23:45, Valentin Kulichenko wrote:
>
>> Ivan,
>>
>> It's all good then :) Thanks!
>>
>> -Val
>>
>> On Mon, Mar 26, 2018 at 1:50 AM, Ivan Rakov <ivan.glu...@gmail.com>
>> wrote:
>>
>> Val,
>>>
>>> There's no any sense to use WalMode.NONE in production environment, it's
>>> kept for testing and debugging purposes (including possible user
>>> activities
>>> like capacity planning).
>>> We already print a warning at node start in case WalMode.NONE is set:
>>>
>>> U.quietAndWarn(log,"Started write-ahead log manager in NONE mode,
>>>
>>>> persisted data may be lost in " +
>>>>       "a case of unexpected node failure. Make sure to deactivate the
>>>> cluster before shutdown.");
>>>>
>>>> Best Regards,
>>> Ivan Rakov
>>>
>>>
>>> On 24.03.2018 1:40, Valentin Kulichenko wrote:
>>>
>>> Dmitry,
>>>>
>>>> Thanks for clarification. So it sounds like if we fix all other modes as
>>>> we
>>>> discuss here, NONE would be the only one allowing corruption. I also
>>>> don't
>>>> see much sense in this and I think we should clearly state this in the
>>>> doc,
>>>> as well print out a warning if NONE mode is used. Eventually, if it's
>>>> confirmed that there are no reasonable use cases for it, we can
>>>> deprecate
>>>> it.
>>>>
>>>> -Val
>>>>
>>>> On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov <dpavlov....@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Val,
>>>>
>>>>> NONE means that the WAL log is disabled and not written at all. Use of
>>>>> the
>>>>> mode is at your own risk. It is possible that restore state after the
>>>>> crash
>>>>> at the middle of checkpoint will not succeed. I do not see much sence
>>>>> in
>>>>> it, especially in production.
>>>>>
>>>>> BACKGROUND is full functional WAL mode, but allows some delay before
>>>>> flush
>>>>> to disk.
>>>>>
>>>>> Sincerely,
>>>>> Dmitriy Pavlov
>>>>>
>>>>> сб, 24 мар. 2018 г. в 1:07, Valentin Kulichenko <
>>>>> valentin.kuliche...@gmail.com>:
>>>>>
>>>>> I agree. In my view, any possibility to get a corrupted storage is a
>>>>> bug
>>>>>
>>>>>> which needs to be fixed.
>>>>>>
>>>>>> BTW, can someone explain semantics of NONE mode? What is the
>>>>>> difference
>>>>>> from BACKGROUND from user's perspective? Is there any particular use
>>>>>> case
>>>>>> where it can be used?
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov <dpavlov....@gmail.com
>>>>>> >
>>>>>> wrote:
>>>>>>
>>>>>> Hi Ivan,
>>>>>>
>>>>>>> IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
>>>>>>>
>>>>>>> Sincerely,
>>>>>>> Dmitriy Pavlov
>>>>>>>
>>>>>>> пт, 23 мар. 2018 г. в 12:23, Ivan Rakov <ivan.glu...@gmail.com>:
>>>>>>>
>>>>>>> Igniters, there's another important question about this matter.
>>>>>>>
>>>>>>>> Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that
>>>>>>>>
>>>>>>>> we
>>>>>>>
>>>>>> have to do it: it will cause similar performance drop, but if we
>>>>>>
>>>>>>> consider LOG_ONLY broken without these fixes, BACKGROUND is broken as
>>>>>>>>
>>>>>>>> well.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>>> Ivan Rakov
>>>>>>>>
>>>>>>>> On 23.03.2018 10:27, Ivan Rakov wrote:
>>>>>>>>
>>>>>>>> Fixes are quite simple.
>>>>>>>>> I expect them to be merged in master in a week in worst case.
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Ivan Rakov
>>>>>>>>>
>>>>>>>>> On 22.03.2018 17:49, Denis Magda wrote:
>>>>>>>>>
>>>>>>>>> Ivan,
>>>>>>>>>>
>>>>>>>>>> How quick are you going to merge the fix into the master? Many
>>>>>>>>>> persistence
>>>>>>>>>> related optimizations have already stacked up. Probably, we can
>>>>>>>>>>
>>>>>>>>>> release
>>>>>>>>>
>>>>>>>> them sooner if the community agrees.
>>>>>>>>
>>>>>>>>> --
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov <
>>>>>>>>>>
>>>>>>>>>> ivan.glu...@gmail.com>
>>>>>>>>>
>>>>>>>> wrote:
>>>>>>
>>>>>>> Thanks all!
>>>>>>>>>>
>>>>>>>>>>> We seem to have reached a consensus on this issue. I'll just add
>>>>>>>>>>> necessary
>>>>>>>>>>> fsyncs under IGNITE-7754.
>>>>>>>>>>>
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Ivan Rakov
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 22.03.2018 15:13, Ilya Lantukh wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 for fixing LOG_ONLY. If current implementation doesn't
>>>>>>>>>>> protect
>>>>>>>>>>>
>>>>>>>>>> from
>>>>>>
>>>>>>> data
>>>>>>>>
>>>>>>>>> corruption, it doesn't make sence.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <
>>>>>>>>>>>>
>>>>>>>>>>>> dma...@apache.org>
>>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>
>>>>>>> +1 for the fix of LOG_ONLY
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
>>>>>>>>>>>>> alexey.goncha...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 for fixing LOG_ONLY to enforce corruption safety given the
>>>>>>>>>>>>> provided
>>>>>>>>>>>>>
>>>>>>>>>>>>> performance results.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> voze...@gridgain.com
>>>>>>>>>>>>>
>>>>>>>>>>>> :
>>>>>>>
>>>>>>>> +1 for accepting drop in LOG_ONLY. 7% is not that much and
>>>>>>>>>
>>>>>>>>>> not a
>>>>>>>>>>>>>
>>>>>>>>>>>> drop
>>>>>>
>>>>>>> at
>>>>>>>>>>>>>> all, provided that we fixing a bug. I.e. should we implement
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>> correctly
>>>>>>
>>>>>>> in the first place we would never notice any "drop".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do not understand why someone would like to use current
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> broken
>>>>>>>>>>>>>>
>>>>>>>>>>>>> mode.
>>>>>>>
>>>>>>>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
>>>>>>>>>>>>>>> <dpavlov....@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, I think option 1 is better. As Val said any mode that
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> allows
>>>>>>>>>>>>>>
>>>>>>>>>>>>> corruption
>>>>>>>
>>>>>>>> does not make much sense.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What Ivan mentioned here as drop, in relation to old mode
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> DEFAULT
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (FSYNC
>>>>>>>>
>>>>>>>>> now), is still significant perfromance boost.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>>>>>> Dmitriy Pavlov
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ср, 21 мар. 2018 г. в 17:56, Ivan Rakov <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ivan.glu...@gmail.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> :
>>>>>>>
>>>>>>>> I've attached benchmark results to the JIRA ticket.
>>>>>>>>>
>>>>>>>>>> We observe ~7% drop in "fair" LOG_ONLY_SAFE mode,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> independent
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> of
>>>>>>
>>>>>>> WAL
>>>>>>>>
>>>>>>>>> compaction enabled flag. It's pretty significant drop: WAL
>>>>>>>>>>>>>>> compaction
>>>>>>>>>>>>>>> itself gives only ~3% drop.
>>>>>>>>>>>>>>> I see two options here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1) Change LOG_ONLY behavior. That implies that we'll be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ready
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> to
>>>>>>
>>>>>>> release
>>>>>>>>
>>>>>>>>> AI 2.5 with 7% drop.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2) Introduce LOG_ONLY_SAFE, make it default, add release
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> note
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> to AI
>>>>>>
>>>>>>> 2.5
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> that we added power loss durability in default mode, but
>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> may
>>>>>>>>>>>>>>
>>>>>>>>>>>>> fallback to previous LOG_ONLY in order to retain
>>>>>>>
>>>>>>>> performance.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thoughts?
>>>>>>
>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>> Ivan Rakov
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 20.03.2018 16:00, Ivan Rakov wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Val,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If a storage is in
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> corrupted state, does it mean that it needs to be
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> completely
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> removed
>>>>>>>
>>>>>>>> and
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> cluster needs to be restarted without data?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, there's a chance that in LOG_ONLY all local data will
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> lost,
>>>>>>>
>>>>>>>> but only in *power loss**/ OS crash* case.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> kill -9, JVM crash, death of critical system thread and
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> other
>>>>>>
>>>>>>> cases that usually take place are variations of *process
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> crash*.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> All
>>>>>>>>
>>>>>>>>> WAL modes (except NONE, of course) ensure corruption-safety
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>
>>>>>>>>>>>>> case
>>>>>>
>>>>>>> of
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> process crash.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If so, I'm not sure any mode
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> that allows corruption makes much sense to me.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It depends on performance impact of enforcing power-loss
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> corruption
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> safety. Price of full protection from power loss is high -
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FSYNC
>>>>>>>>>>>>>
>>>>>>>>>>>> is
>>>>>>
>>>>>>> way slower (2-10 times) than other WAL modes. The question is
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> whether
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ensuring weaker guarantees (corruption can't happen, but
>>>>>>>>>>>>>>>> loss
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>
>>>>>>>>>>>>> last
>>>>>>>
>>>>>>>> updates can) will affect performance as badly as strong
>>>>>>>>>>>>>>>> guarantees.
>>>>>>>>>>>>>>>> I'll share benchmark results soon.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ivan Rakov
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 20.03.2018 5:09, Valentin Kulichenko wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Guys,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What do we understand under "data corruption" here? If a
>>>>>>>>>>>>>>>>>>> storage
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> corrupted state, does it mean that it needs to be completely
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> removed
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> cluster needs to be restarted without data? If so, I'm not
>>>>>>>>>>>>>>>> sure
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> any
>>>>>>>
>>>>>>>> mode
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> that allows corruption makes much sense to me. How am I
>>>>>>>>>>>>>>>> supposed
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> to
>>>>>>>>
>>>>>>>>> use a
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> database, if virtually any failure can end with complete
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> loss of
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> data?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In any case, this definitely should not be a default
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> behavior.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If
>>>>>>
>>>>>>> user ever
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> switches to corruption-unsafe mode, there should be a
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> clear
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> warning
>>>>>>
>>>>>>> about
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ivan.glu...@gmail.com>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ticket to track changes:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-7754
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>> Ivan Rakov
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <
>>>>>>>>>>>>>>>>>>>> ivan.glu...@gmail.com
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write
>>>>>>>>>>>>>>>>>>>>> guarantees
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> unless power
>>>>>>>>
>>>>>>>>> loss has happened.
>>>>>>>>>>>>>>>>>>>>>> Seems like we need to measure performance difference
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> decide
>>>>>>
>>>>>>> whether do
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> we need separate WAL mode. If it will be invisible,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> we'll
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> just
>>>>>>>
>>>>>>>> fix
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> bugs without introducing new mode; if it will be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> perceptible,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> we'll
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> continue the discussion about introducing
>>>>>>>>>>>>>>>>>>>> LOG_ONLY_SAFE.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Makes sense?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, this sounds like the right approach.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

Reply via email to