[jira] [Updated] (IGNITE-22516) Shorten waiting out clock skew on DDL execution

Roman Puchkovskiy (Jira) Sun, 16 Jun 2024 00:59:04 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-22516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Roman Puchkovskiy updated IGNITE-22516:
---------------------------------------
    Description: 
After IGNITE-20378 is fixed, the wait after a DDL will be 
DelayDuration+MaxClockSkew. The second component is needed to make sure that 
the new schema activates on each node of the cluster, even if its clock is 
skewed. We can get an explicit ack about new schema activation from each 
cluster node instead of pessimistically waiting out MaxClockSkew. This will 
allow us to wait less.

This could look like the following. After we submit the new schema update to 
the Metastorage (and this write gets acked by its majority), we do the 
following:
 # Take the combined set of validated nodes and the logical topology from the 
CMG leader (let it be S)
 # Send a WaitForCatalogVersionActivationRequest(createdCatalogVersion) to each 
node in S
 # (A node getting such a request waits till the given catalog version activates 
on the node and then responds with an ack)
 # Complete the user's DDL future when for each node in S one of the following 
happens:
 ## An ack is received
 ## The node leaves the logical topology
 # As is done already, still wait for DelayDuration+MaxClockSkew; if this 
wait completes before the wait described in items 1-4, it completes the 
user's DDL future
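
The completion logic of items 4 and 5 can be sketched with plain Java futures. This is only an illustration under stated assumptions, not the Ignite implementation: the class name, the awaitActivation method, and the two per-node callbacks (one standing in for sending WaitForCatalogVersionActivationRequest and receiving its ack, one standing in for a "node left the logical topology" notification) are hypothetical. A failed request is deliberately ignored here, so such a node is covered by the fallback timer.

```java
import java.time.Duration;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch of the DDL-side wait; none of these names are Ignite API. */
final class DdlActivationWait {
    /**
     * Returns a future that completes when every node in S has either acked the
     * activation or left the logical topology, or when the fallback delay
     * (DelayDuration+MaxClockSkew) elapses, whichever happens first.
     */
    static CompletableFuture<Void> awaitActivation(
            Set<String> nodesInS,
            Function<String, CompletableFuture<Void>> sendWaitRequest, // assumed: completes on ack
            Function<String, CompletableFuture<Void>> nodeLeft,        // assumed: completes when the node leaves
            Duration fallback
    ) {
        CompletableFuture<?>[] perNode = nodesInS.stream()
                .map(node -> {
                    // Only a successful ack counts; a failed request leaves 'ack' incomplete.
                    CompletableFuture<Void> ack = new CompletableFuture<>();
                    sendWaitRequest.apply(node).thenRun(() -> ack.complete(null));

                    // Item 4: per node, either an ack arrives or the node leaves.
                    return CompletableFuture.anyOf(ack, nodeLeft.apply(node));
                })
                .toArray(CompletableFuture[]::new);

        CompletableFuture<Void> allNodesDone = CompletableFuture.allOf(perNode);

        // Item 5: the pessimistic wait that is kept as a fallback.
        CompletableFuture<Void> timer = new CompletableFuture<Void>()
                .completeOnTimeout(null, fallback.toMillis(), TimeUnit.MILLISECONDS);

        // The user's DDL future completes on whichever wait finishes first.
        return CompletableFuture.anyOf(allNodesDone, timer).thenAccept(unused -> { });
    }
}
```

With a stable topology and responsive nodes, the returned future completes as soon as the last ack arrives; otherwise it completes no later than the fallback delay.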

If the logical topology is stable, this will guarantee that either each node 
acks the activation or DelayDuration+MaxClockSkew passes (which will guarantee 
activation on the whole cluster, given that local clock skews are bounded by 
MaxClockSkew).

If a node gets validated after we execute item 1, then its validation happens 
after the new schema update is written to the Metastorage. The node does 
Metastorage recovery after validation, hence after the new schema update is 
written to the Metastorage; so the node will apply the new schema update 
during its recovery and will see it before becoming operational.

> Shorten waiting out clock skew on DDL execution
> -----------------------------------------------
>
>                 Key: IGNITE-22516
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22516
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
