Re: Exceeding the DataStorageConfiguration#getMaxWalArchiveSize due to historical rebalance

Andrey Gura Tue, 11 May 2021 03:18:41 -0700

Stan

> If archive size is less than min or more than max then the system 
> functionality can degrade (e.g. historical rebalance may not work as 
> expected).


Why does the condition "archive size is less than min" lead to system
degradation? Actually, the described case is a normal situation for
brand new clusters.

I'm okay with the proposed minWalArchiveSize property. It looks like a
relatively understandable property.

On Sun, May 9, 2021 at 7:12 PM Stanislav Lukyanov
<[email protected]> wrote:
>
> Discussed this with Kirill verbally.
>
> Kirill showed me that having the min threshold doesn't quite work.
> It doesn't work because we no longer know how much WAL we should remove if we 
> reach getMaxWalArchiveSize.
>
> For example, say we have minWalArchiveTimespan=2 hours and 
> maxWalArchiveSize=2GB.
> Say, under normal load on stable topology 2 hours of WAL use 1 GB of space.
> Now, say we're doing historical rebalance and reserve the WAL archive.
> The WAL archive starts growing and soon it occupies 2 GB.
> Now what?
> We're supposed to give up WAL reservations and start aggressively removing 
> the WAL archive.
> But it is not clear when we can stop removing the WAL archive: since the last 
> 2 hours of WAL are larger than our maxWalArchiveSize,
> there is no meaningful point the system can use as a "minimum" WAL size.
>
> I understand the description above is a bit messy but I believe that whoever 
> is interested in this will understand it
> after drawing this on paper.
>
>
> I'm giving up on my latest suggestion about time-based minimum. Let's keep it 
> simple.
>
> I suggest the minWalArchiveSize and maxWalArchiveSize properties as the 
> solution,
> with the behavior as initially described by Kirill.
>
> Stan
>
>
> > On 7 May 2021, at 15:09, Kirill Tkalenko <[email protected]> wrote:
> >
> > Hello Stas!
> >
> > I didn't quite get your last idea.
> > What will we do if we reach getMaxWalArchiveSize? Shall we not delete the 
> > segment until minWalArchiveTimespan?
> >
> > 06.05.2021, 20:00, "Stanislav Lukyanov" <[email protected]>:
> >> An interesting suggestion I heard today.
> >>
> >> The minWalArchiveSize property might actually be minWalArchiveTimespan - 
> >> i.e. be a number of seconds instead of a number of bytes!
> >>
> >> I think this makes perfect sense from the user's point of view.
> >> "I want to have WAL archive for at least N hours but I have a limit of M 
> >> gigabytes to store it".
> >>
> >> Do we have checkpoint timestamp stored anywhere? (cp start markers?)
> >> Perhaps we can actually implement this?
> >>
> >> Thanks,
> >> Stan
> >>
> >>>  On 6 May 2021, at 14:13, Stanislav Lukyanov <[email protected]> 
> >>> wrote:
> >>>
> >>>  +1 to cancel WAL reservation on reaching getMaxWalArchiveSize
> >>>  +1 to add a public property to replace 
> >>> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE
> >>>
> >>>  I don't like the name getWalArchiveSize - I think it's a bit confusing 
> >>> (is it the current size? the minimal size? the target size?)
> >>>  I suggest naming the property getMinWalArchiveSize. I think that this 
> >>> is exactly what it is - the minimal size of the archive that we want to 
> >>> have.
> >>>  The archive size at all times should be between min and max.
> >>>  If archive size is less than min or more than max then the system 
> >>> functionality can degrade (e.g. historical rebalance may not work as 
> >>> expected).
> >>>  I think these rules are intuitively understood from the "min" and "max" 
> >>> names.
> >>>
> >>>  Ilya's suggestion about throttling is great although I'd do this in a 
> >>> different ticket.
> >>>
> >>>  Thanks,
> >>>  Stan
> >>>
> >>>>  On 5 May 2021, at 19:25, Maxim Muzafarov <[email protected]> wrote:
> >>>>
> >>>>  Hello, Kirill
> >>>>
> >>>>  +1 for this change, however, there are too many configuration settings
> >>>>  that exist for the user to configure Ignite cluster. It is better to
> >>>>  keep the options that we already have and fix the behaviour of the
> >>>>  rebalance process as you suggested.
> >>>>
> >>>>  On Tue, 4 May 2021 at 19:01, Kirill Tkalenko <[email protected]> 
> >>>> wrote:
> >>>>>  Hi Ilya!
> >>>>>
> >>>>>  Then we may greatly reduce the user load on the cluster until the 
> >>>>> rebalance is over, which can be critical for the user.
> >>>>>
> >>>>>  04.05.2021, 18:43, "Ilya Kasnacheev" <[email protected]>:
> >>>>>>  Hello!
> >>>>>>
> >>>>>>  Maybe we can have a mechanism here similar (or equal) to checkpoint 
> >>>>>> based
> >>>>>>  write throttling?
> >>>>>>
> >>>>>>  So we will be throttling for both checkpoint page buffer and WAL 
> >>>>>> limit.
> >>>>>>
> >>>>>>  Regards,
> >>>>>>  --
> >>>>>>  Ilya Kasnacheev
> >>>>>>
> >>>>>>  Tue, 4 May 2021 at 11:29, Kirill Tkalenko <[email protected]>:
> >>>>>>
> >>>>>>>  Hello everybody!
> >>>>>>>
> >>>>>>>  At the moment, if historical rebalance is going to be used for any
> >>>>>>>  partitions, we reserve segments in the WAL archive (i.e. we do not
> >>>>>>>  allow the WAL archive to be cleaned) until the rebalance for all
> >>>>>>>  cache groups is over.
> >>>>>>>
> >>>>>>>  If a cluster is under load during the rebalance, WAL archive size may
> >>>>>>>  significantly exceed limits set in
> >>>>>>>  DataStorageConfiguration#getMaxWalArchiveSize until the process is
> >>>>>>>  complete. This may lead to user issues and nodes may crash with the 
> >>>>>>> "No
> >>>>>>>  space left on device" error.
> >>>>>>>
> >>>>>>>  We have a system property, IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE
> >>>>>>>  (0.5 by default), which sets the threshold (multiplied by
> >>>>>>>  getMaxWalArchiveSize) down to which the WAL archive is cleared, i.e.
> >>>>>>>  it sets the size of the WAL archive that will always be kept on the
> >>>>>>>  node. I propose to replace this system property with
> >>>>>>>  DataStorageConfiguration#getWalArchiveSize in bytes, with the default
> >>>>>>>  remaining (getMaxWalArchiveSize * 0.5) as it is now.
> >>>>>>>
> >>>>>>>  Main proposal:
> >>>>>>>  When DataStorageConfiguration#getMaxWalArchiveSize is reached, cancel
> >>>>>>>  the reservation of WAL segments and do not grant new reservations
> >>>>>>>  until the archive is back down to
> >>>>>>>  DataStorageConfiguration#getWalArchiveSize. In this case, if there is
> >>>>>>>  no segment for historical rebalance, we will automatically switch to
> >>>>>>>  full rebalance.
>
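
For readers following the thread, the size-based behavior proposed in the
original message (cancel the reservation at the maximum, then clean down to
the minimum) might look roughly like the sketch below. This is only an
illustration under stated assumptions: the class WalArchiveSketch and all of
its methods are hypothetical and are not Ignite code, segments are modeled as
plain sizes in arbitrary units, and cleanup is done synchronously, which the
real implementation may not do.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory model of a WAL archive; not Ignite code. */
final class WalArchiveSketch {
    private final long maxSize; // plays the role of getMaxWalArchiveSize
    private final long minSize; // plays the role of the proposed minWalArchiveSize
    private final Deque<Long> segments = new ArrayDeque<>(); // segment sizes, oldest first
    private long totalSize;
    private boolean reserved;   // true while a historical rebalance holds the archive

    WalArchiveSketch(long maxSize, long minSize) {
        if (minSize < 0 || minSize > maxSize)
            throw new IllegalArgumentException("expected 0 <= minSize <= maxSize");
        this.maxSize = maxSize;
        this.minSize = minSize;
    }

    /** Historical rebalance asks to keep archived segments. */
    void reserve() { reserved = true; }

    boolean isReserved() { return reserved; }

    long size() { return totalSize; }

    int segmentCount() { return segments.size(); }

    /** Adds a newly archived segment and applies the size limits. */
    void archiveSegment(long bytes) {
        segments.addLast(bytes);
        totalSize += bytes;

        // Below the maximum nothing is removed, reserved or not.
        if (totalSize < maxSize)
            return;

        // Maximum reached: give up the reservation (the rebalance would then
        // fall back to full rebalance) and drop the oldest segments until the
        // archive is no larger than the minimum.
        reserved = false;
        while (totalSize > minSize && !segments.isEmpty())
            totalSize -= segments.removeFirst();
    }

    public static void main(String[] args) {
        WalArchiveSketch archive = new WalArchiveSketch(2048, 1024);
        archive.reserve();
        for (int i = 0; i < 4; i++) {
            archive.archiveSegment(512);
            System.out.println("size=" + archive.size()
                + " reserved=" + archive.isReserved());
        }
    }
}
```

With a byte-based minimum the cleanup loop always has a well-defined stopping
point, which is the property the time-based minimum discussed above lacks.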
