Re: [DISCUSSION] IEP-47 Native persistence defragmentation

2020-06-02 Thread Anton Vinogradov
Ivan,

Thanks for the hints.
Invalidated memory cache was restored :)


Re: [DISCUSSION] IEP-47 Native persistence defragmentation

2020-06-02 Thread Ivan Bessonov
Hello Anton,

I'd like to address your last message. First of all, this was already
partially discussed in this thread: [1]. To reiterate: the expected
performance degradation will be significant. There's no way we can throttle
it, because the free/reuse lists would have to be kept sorted all the time,
and these are very optimized data structures.

More than that, "dummy" updates clash with data access, which is a very
dangerous thing to do. And these updates don't save you from the situation
where the last pages in the file are not data pages but, for example, tree
pages. Those are much harder to move: not only do you have to update all
links to them, you also have to do it efficiently, without blocking the
tree too much. I can think of many other examples.

*Easy to implement/understand*
 - no, it's not easy at all; defragmentation under load is a very
   challenging thing to implement.

*Why are we going to implement distributed system defragmentation in the old
(offline) way?*
 - because it's easier and safer, and it won't introduce any performance
   degradation.

[1]
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html


Re: [DISCUSSION] IEP-47 Native persistence defragmentation

2020-06-02 Thread Anton Vinogradov
Folks,

A modern OS never asks you to schedule defragmentation and turn your PC off;
it performs it while you're browsing.
Why are we going to implement distributed system defragmentation in the old
(offline) way?

All you need is to implement free/reuse-list sorting, so that the lists hand
out the pages closest to the beginning of the file.
Then every insert/update will automatically defragment the entry.
In addition, a special process should iterate over the partitions in reverse
order, just performing dummy updates.
The partition file can be safely truncated once the iterator has passed.
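
A minimal sketch of this proposal, assuming a toy in-memory model rather than
Ignite's real page memory: the file is a list of page slots, the free list is
a min-heap of free page indexes, and a "dummy update" rewrites an entry into
the lowest free page. The names PartitionFile, dummy_update and defragment are
hypothetical. The sketch ignores concurrent access, tree pages and links
between pages.

```python
import heapq


class PartitionFile:
    """Toy model of a partition file: a list of page slots, None = free page."""

    def __init__(self, pages):
        self.pages = list(pages)
        # Min-heap of free page indexes, so the lowest free page is reused first.
        self.free = [i for i, p in enumerate(self.pages) if p is None]
        heapq.heapify(self.free)

    def dummy_update(self, idx):
        """Rewrite the entry at idx into the lowest free page, if one lies before idx."""
        if self.pages[idx] is None or not self.free or self.free[0] >= idx:
            return False
        target = heapq.heappop(self.free)
        self.pages[target], self.pages[idx] = self.pages[idx], None
        heapq.heappush(self.free, idx)
        return True

    def defragment(self):
        """Walk the file in reverse, relocating entries, then truncate the free tail."""
        for idx in range(len(self.pages) - 1, -1, -1):
            self.dummy_update(idx)
        while self.pages and self.pages[-1] is None:
            self.pages.pop()
        # Drop free-list entries that pointed past the new end of the file.
        self.free = [i for i in self.free if i < len(self.pages)]
        heapq.heapify(self.free)
        return len(self.pages)


part = PartitionFile(["a", None, "b", None, None, "c", "d"])
new_size = part.defragment()
print(new_size, part.pages)
```

In this model every allocation pays for keeping the heap ordered, which is
the cost a real implementation would have to weigh against the benefit.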

Pros:
- Your system keeps operating (no downtime)
- Defragmentation can be performed partially
- Defragmentation can be scheduled for periods of inactivity or performed on
  a regular basis
- SQL will not be broken (no need to recalculate the whole index; it will
  be recalculated in the regular way on every entry update)
- Topology changes are allowed
- Easy to implement/understand

Cons:
- Performance degradation (solvable by throttling)


Re: [DISCUSSION] IEP-47 Native persistence defragmentation

2020-06-01 Thread Sergey Chugunov
Hi Ivan,

I have an idea about the suggested maintenance mode.

First of all, I agree with your ideas about discovery restrictions: a node
should not join the topology while performing defragmentation.

At the same time, I haven't heard of any requests for this mode from users,
so we don't know much about possible requirements.
So I suggest implementing it in a pragmatic way: instead of inventing
user scenarios (unknown in reality), let's develop minimal yet
well-designed functionality that suits our case. If we constrain our
implementation with a reasonable set of restrictions, that's OK.

So my idea is the following: to move a node into maintenance mode, the user
has to send a special command to the node (e.g. via a new command in the
control.sh utility or via the JMX interface). The node saves the maintenance
request in its local metastorage and waits for a restart. The user has to
restart the node manually to finish moving it into maintenance mode.

When the node restarts and finds the maintenance request, it creates a special
type of discovery SPI that will not try to join the topology at all, yet the
node is still able to start all necessary components and APIs, such as the
REST processor or the JMX interface.

While in maintenance mode, we'll be able to do defragmentation safely and
remove the maintenance request from the metastorage only when it is completed
(with all fault-tolerance logic in mind).
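
The flow above can be sketched roughly as follows. This is only an
illustration under stated assumptions: a plain dict stands in for the local
metastorage, and the Node class, the "maintenance.request" key and the
restart helper are hypothetical names, not existing Ignite APIs.

```python
class Node:
    """Toy node; `metastorage` stands in for the local metastorage and survives restarts."""

    MAINTENANCE_KEY = "maintenance.request"  # hypothetical key name

    def __init__(self, metastorage):
        self.metastorage = metastorage
        self.in_topology = False
        self.maintenance = False

    def request_maintenance(self):
        # Step 1: the command (control.sh or JMX in the proposal) only records the request.
        self.metastorage[self.MAINTENANCE_KEY] = "defragmentation"

    def start(self):
        # Step 2: on start, a pending request means "do not join the topology".
        self.maintenance = self.MAINTENANCE_KEY in self.metastorage
        self.in_topology = not self.maintenance

    def run_maintenance(self, task):
        # Step 3: run the task; clear the request only after it completes.
        if not self.maintenance:
            raise RuntimeError("node is not in maintenance mode")
        task()
        del self.metastorage[self.MAINTENANCE_KEY]


def restart(node):
    """Simulate a manual restart: new node object, same persisted metastorage."""
    fresh = Node(node.metastorage)
    fresh.start()
    return fresh


store = {}
node = Node(store)
node.start()                          # normal start: joins the topology
first_start_joined = node.in_topology
node.request_maintenance()
maint_node = restart(node)            # first manual restart: maintenance mode
done = []
maint_node.run_maintenance(lambda: done.append("defragmented"))
after = restart(maint_node)           # second manual restart: back to normal
```

Because the request is removed only after the task returns, a failure in the
middle of the task leaves the request in place, and the node comes up in
maintenance mode again on the next restart.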

As we don't have a mechanism (like a watcher) to perform a "safe restart" (by
safe I mean an Ignite restart without an OS-level process restart), we cannot
finish maintenance mode without another manual restart. I think this is a
reasonable restriction, as maintenance mode shouldn't be an everyday
routine and will be used quite rarely.

What do you think about it?


[DISCUSSION] IEP-47 Native persistence defragmentation

2020-05-26 Thread Ivan Bessonov
Hello Igniters,

I'd like to discuss this new IEP with you: [1]. The main idea is to have a
procedure that relocates pages to the top of the file as compactly as
possible, which allows us to trim the file and increase its fill-factor. It
will be configured manually and executed after the restart, but before the
node joins the topology (which means no load is possible during
defragmentation). It is described in detail in the IEP itself, please read
it. This topic was also previously discussed here on the dev list in [2].
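
To illustrate the main idea only (the actual procedure is the one described
in the IEP), here is a rough sketch, assuming a toy model where a page is a
dict with "data" and "links" to other pages and None marks a free page. The
compact and fill_factor functions are hypothetical, and fill-factor is
simplified here to the share of page slots that hold live pages.

```python
def compact(pages):
    """Pack live pages to the start of the file and rewrite links between them.

    `pages` is a list where None is a free page and a live page is a dict
    {"data": ..., "links": [page indexes]}. Returns (new_pages, old_to_new).
    """
    # First pass: assign each live page its new, dense index.
    old_to_new = {}
    for old_idx, page in enumerate(pages):
        if page is not None:
            old_to_new[old_idx] = len(old_to_new)

    # Second pass: copy pages in order, translating every link.
    new_pages = [
        {"data": page["data"], "links": [old_to_new[link] for link in page["links"]]}
        for page in pages
        if page is not None
    ]
    return new_pages, old_to_new


def fill_factor(pages):
    """Share of page slots in the file that hold live pages."""
    return sum(p is not None for p in pages) / len(pages) if pages else 1.0


before = [
    {"data": "root", "links": [3, 5]},
    None,
    None,
    {"data": "leaf-1", "links": []},
    None,
    {"data": "leaf-2", "links": [3]},
]
after, mapping = compact(before)
print(fill_factor(before), fill_factor(after), mapping)
```

The sketch relies on nothing else touching the pages while links are being
rewritten, which is what running before the node joins the topology provides.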

Here I would like to list a few points that are not as clear and require
your opinion.

 - what are your overall thoughts on the IEP? Any concerns?

 - maintenance mode - how do we communicate with a node that's not in the
   topology? What are the options? As far as I know, we currently have no
   tools like this.

 - checkpointer refactoring - these changes will involve intensive writing
   of pages to the storage. If we're going to reuse the offheap page model to
   perform defragmentation, then the checkpointing mechanism will have to be
   adapted in some form. Are you fine with this? Or do we need a separate
   discussion?

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
[2]
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html


-- 
Sincerely yours,
Ivan Bessonov