[
https://issues.apache.org/jira/browse/IGNITE-22516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Puchkovskiy updated IGNITE-22516:
---------------------------------------
Description:
After IGNITE-20378 is fixed, the wait after a DDL will be
DelayDuration+MaxClockSkew. The second component is needed to make sure that
the new schema activates on each node of the cluster, even if that node's clock
is skewed. Instead of pessimistically waiting out MaxClockSkew, we can get an
explicit ack about new schema activation from each cluster node. This will
allow us to wait less.
This could look like the following. After we submit the new schema update to
the Metastorage (and this write gets acked by its majority), we do the
following:
# Take the combined set of validated nodes and the logical topology from the
CMG leader (let it be S)
# Send a WaitForCatalogVersionActivationRequest(createdCatalogVersion) to each
node in S
# (A node getting such a request waits till the given catalog version activates
on the node and then responds with an ack)
# Complete the user's DDL future when for each node in S one of the following
happens:
## An ack is received
## The node leaves the logical topology
# As is already done, still wait for DelayDuration+MaxClockSkew; if this
wait completes sooner than the wait described in items 1-4, it completes the
user's DDL future
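The completion rule in the steps above can be sketched with plain
CompletableFutures. This is only an illustration under assumptions, not Ignite
code: the class DdlActivationWait, the method ddlFuture, the use of String node
ids, and the two maps of per-node futures (one completed by an ack, one
completed when the node leaves the logical topology) are all hypothetical
names invented for the example.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposed wait; none of these names exist in Ignite. */
final class DdlActivationWait {
    /**
     * Returns a future that completes when every node in S has either acked the
     * activation or left the logical topology, or when the fallback wait of
     * DelayDuration+MaxClockSkew elapses, whichever happens first.
     *
     * Assumption: acks and departures each hold one future per node in S.
     */
    static CompletableFuture<Void> ddlFuture(
            Set<String> nodesInS,
            Map<String, CompletableFuture<Void>> acks,
            Map<String, CompletableFuture<Void>> departures,
            Duration delayDuration,
            Duration maxClockSkew) {
        // Per node: done as soon as an ack arrives OR the node leaves.
        List<CompletableFuture<?>> perNode = new ArrayList<>();
        for (String node : nodesInS) {
            perNode.add(CompletableFuture.anyOf(acks.get(node), departures.get(node)));
        }
        // Items 1-4: every node in S must reach one of the two outcomes.
        CompletableFuture<Void> allNodesDone =
                CompletableFuture.allOf(perNode.toArray(new CompletableFuture<?>[0]));

        // Item 5: the pessimistic wait that is kept as an upper bound.
        long fallbackMillis = delayDuration.plus(maxClockSkew).toMillis();
        CompletableFuture<Void> fallback = CompletableFuture.runAsync(
                () -> { },
                CompletableFuture.delayedExecutor(fallbackMillis, TimeUnit.MILLISECONDS));

        // Whichever wait finishes first completes the user's DDL future.
        return CompletableFuture.anyOf(allNodesDone, fallback).thenAccept(ignored -> { });
    }

    public static void main(String[] args) {
        CompletableFuture<Void> ackA = new CompletableFuture<>();
        CompletableFuture<Void> ackB = new CompletableFuture<>();
        CompletableFuture<Void> leftA = new CompletableFuture<>();
        CompletableFuture<Void> leftB = new CompletableFuture<>();

        CompletableFuture<Void> ddl = ddlFuture(
                Set.of("A", "B"),
                Map.of("A", ackA, "B", ackB),
                Map.of("A", leftA, "B", leftB),
                Duration.ofSeconds(10),
                Duration.ofSeconds(5));

        ackA.complete(null);   // node A acks the activation
        leftB.complete(null);  // node B leaves the logical topology
        ddl.join();            // returns without waiting out the 15 s fallback
        System.out.println("DDL future completed: " + ddl.isDone());
    }
}
```

The sketch ignores failure handling: a future that completes exceptionally
(for example, a failed ack request) would propagate through anyOf/allOf, so a
real implementation would have to decide how such errors are treated.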
If the logical topology is stable, this will guarantee that either each node
acks the activation or DelayDuration+MaxClockSkew passes (which will guarantee
activation on the whole cluster, given that local clock skews are bounded by
MaxClockSkew).
If a node gets validated after we execute item 1, then its validation happens
after the new schema update is written to the Metastorage. The node does
Metastorage recovery after validation, hence also after the new schema update
is written to the Metastorage. So the node will apply the new schema update
during its recovery and is sure to see it before becoming operational.
> Shorten waiting out clock skew on DDL execution
> -----------------------------------------------
>
> Key: IGNITE-22516
> URL: https://issues.apache.org/jira/browse/IGNITE-22516
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3