[jira] [Updated] (IGNITE-18033) Implement cooperative GC of MV data during RAFT commands execution

Semyon Danilov (Jira) Thu, 23 Feb 2023 07:54:09 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Semyon Danilov updated IGNITE-18033:
------------------------------------
    Description: 
Please refer to the Epic (IGNITE-17571) for the basic description. Please also 
refer to IGNITE-18031 for naive implementation details and general thoughts.

Technically, there is a possibility that the background GC process wouldn't 
catch up if there's too much data being loaded into the system. Scanning 
through the entire partition takes time, and only a small subset of data could 
be under a constant stream of modification.

To account for that, each update can be preceded by a manual GC of that row.
In this case, there's less work for the background processor, and, intuitively,
frequently updated data will be just as frequently vacuumed, so not much
garbage accumulates in the first place.
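
A minimal sketch of the "GC the row before updating it" idea, in Java. It is 
not the Ignite 3 storage API: the class MvStoreSketch, its methods, and the 
lowWatermark parameter (a timestamp below which no reader needs older 
versions) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory multi-version store; NOT the Ignite 3 storage API. */
final class MvStoreSketch {
    /** One committed version of a row. */
    static final class Version {
        final long commitTs;
        final String value;

        Version(long commitTs, String value) {
            this.commitTs = commitTs;
            this.value = value;
        }
    }

    /** Version chain per row id, newest version first. */
    private final Map<String, List<Version>> chains = new HashMap<>();

    /**
     * Removes the versions of a single row that no reader at or above
     * lowWatermark (assumed concept) can observe. Returns how many were removed.
     */
    int gcRow(String rowId, long lowWatermark) {
        List<Version> chain = chains.get(rowId);
        if (chain == null) {
            return 0;
        }
        // The newest version with commitTs <= lowWatermark is still visible
        // to a reader at lowWatermark; everything older than it is garbage.
        for (int i = 0; i < chain.size(); i++) {
            if (chain.get(i).commitTs <= lowWatermark) {
                int removed = chain.size() - (i + 1);
                chain.subList(i + 1, chain.size()).clear();
                return removed;
            }
        }
        return 0;
    }

    /** Cooperative step: vacuum this row first, then add the new version. */
    int update(String rowId, String value, long commitTs, long lowWatermark) {
        int removed = gcRow(rowId, lowWatermark);
        chains.computeIfAbsent(rowId, k -> new ArrayList<>())
                .add(0, new Version(commitTs, value));
        return removed;
    }

    int versionCount(String rowId) {
        List<Version> chain = chains.get(rowId);
        return chain == null ? 0 : chain.size();
    }
}
```

With this shape, a row that is updated often never keeps more than one version 
below the watermark, while rows that are never updated still depend on the 
background GC.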

Now that's not a guarantee that there would be no problems at all, so we should 
also think about other ways of cooperative GC.

Of course, it would be nice to have a queue of "old" rows ready to be cleaned.
How feasible is it? It would have to be supported by every engine. That's extra
data and extra time for management. It looks pretty much exactly like the TTL
manager in Ignite 2.x. I assume that we won't implement anything like that
right now and will start with more naive approaches. Anyway, please refer to
IGNITE-18113 for more details; maybe that's the way to go.
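
To make the queue idea concrete, here is a rough Java sketch. It is only an 
illustration and not the design from IGNITE-18113: GcQueueSketch, its methods, 
and the lowWatermark parameter are hypothetical, and the sketch assumes that 
writes are reported in non-decreasing commit timestamp order, so a plain FIFO 
queue is already sorted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical queue of rows that have an outdated version to clean. */
final class GcQueueSketch {
    /** A row whose previous version was replaced at supersededAtTs. */
    private static final class Entry {
        final String rowId;
        final long supersededAtTs;

        Entry(String rowId, long supersededAtTs) {
            this.rowId = rowId;
            this.supersededAtTs = supersededAtTs;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();

    /**
     * Called by a write that replaces an older version of rowId.
     * Assumption: calls arrive in non-decreasing commitTs order.
     */
    void onVersionSuperseded(String rowId, long commitTs) {
        queue.addLast(new Entry(rowId, commitTs));
    }

    /**
     * Returns up to maxRows row ids whose replaced version is no longer
     * visible to any reader at or above lowWatermark (assumed concept).
     */
    List<String> pollGarbage(long lowWatermark, int maxRows) {
        List<String> result = new ArrayList<>();
        while (result.size() < maxRows
                && !queue.isEmpty()
                && queue.peekFirst().supersededAtTs <= lowWatermark) {
            result.add(queue.pollFirst().rowId);
        }
        return result;
    }

    int size() {
        return queue.size();
    }
}
```

The maxRows bound lets a RAFT command clean a small batch on each invocation 
instead of draining the whole queue. The cost named above is visible here too: 
every write adds one more entry that each storage engine would have to persist 
and maintain.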

  was:
Please refer to the Epic for the basic description. Please also refer to 
IGNITE-18031 for naive implementation details and general thoughts.

Technically, there is a possibility that the background GC process wouldn't 
catch up if there's too much data being loaded into the system. Scanning 
through the entire partition takes time, and only a small subset of data could 
be under a constant stream of modification.

To account for that, each update can be preceded by a manual GC of that row.
In this case, there's less work for the background processor, and, intuitively,
frequently updated data will be just as frequently vacuumed, so not much
garbage accumulates in the first place.

Now that's not a guarantee that there would be no problems at all, so we should 
also think about other ways of cooperative GC.

Of course, it would be nice to have a queue of "old" rows ready to be cleaned.
How feasible is it? It would have to be supported by every engine. That's extra
data and extra time for management. It looks pretty much exactly like the TTL
manager in Ignite 2.x. I assume that we won't implement anything like that
right now and will start with more naive approaches. Anyway, please refer to
IGNITE-18113 for more details; maybe that's the way to go.


> Implement cooperative GC of MV data during RAFT commands execution
> ------------------------------------------------------------------
>
>                 Key: IGNITE-18033
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18033
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>
> Please refer to the Epic (IGNITE-17571) for the basic description. Please 
> also refer to IGNITE-18031 for naive implementation details and general 
> thoughts.
> Technically, there is a possibility that the background GC process wouldn't 
> catch up if there's too much data being loaded into the system. Scanning 
> through the entire partition takes time, and only a small subset of data 
> could be under a constant stream of modification.
> To account for that, each update can be preceded by a manual GC of that
> row. In this case, there's less work for the background processor, and,
> intuitively, frequently updated data will be just as frequently vacuumed,
> so not much garbage accumulates in the first place.
> Now that's not a guarantee that there would be no problems at all, so we 
> should also think about other ways of cooperative GC.
> Of course, it would be nice to have a queue of "old" rows ready to be
> cleaned. How feasible is it? It would have to be supported by every engine.
> That's extra data and extra time for management. It looks pretty much
> exactly like the TTL manager in Ignite 2.x. I assume that we won't implement
> anything like that right now and will start with more naive approaches.
> Anyway, please refer to IGNITE-18113 for more details; maybe that's the way
> to go.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
