[ 
https://issues.apache.org/jira/browse/IGNITE-19028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-19028:
-----------------------------------
    Description: 
For the future implementation of schema synchronization, we need a hybrid 
timestamp associated with the meta-storage.

Database schema changes are always associated with time, and the proper place 
to store them is the meta-storage.

We don't have a "partition replica listener" that would act as a single 
source of truth for new "write" commands. In the case of the meta-storage, 
all nodes may create write commands. Assigning a time from the _hlc_ of the 
issuing node wouldn't work: there's a chance of out-of-order events, which is 
really, really bad.

In other words, timestamps should come in order. Does this mean that 
meta-storage should also have its own replica listener? That's one possibility.

Another possibility is to make the leader a timestamp generator. This would 
require changes in JRaft code, but it may still be the right way to go: it 
simply requires fewer changes to the code. We should just remember to adjust 
the clock on leader re-election, so that time stays monotonic.
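
A minimal Java sketch of the second option, under stated assumptions: a 
timestamp is simplified to a single long instead of a real hybrid timestamp, 
and LeaderTimestampGenerator, its constructor and next() are hypothetical 
names, not Ignite or JRaft API. It only illustrates the leader issuing strictly 
increasing timestamps, and a newly elected leader starting from the last 
timestamp it knows about, so that a lagging physical clock cannot move time 
backwards.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch; not part of Ignite or JRaft. */
final class LeaderTimestampGenerator {
    /** Physical clock of the current leader (assumed to return milliseconds). */
    private final LongSupplier physicalClock;

    /** Last timestamp issued or known to be committed. */
    private long lastIssued;

    /**
     * Created when a node becomes leader. The lastKnownTimestamp argument is the
     * "clock adjustment": the highest timestamp the new leader has seen in the log.
     */
    LeaderTimestampGenerator(LongSupplier physicalClock, long lastKnownTimestamp) {
        this.physicalClock = physicalClock;
        this.lastIssued = lastKnownTimestamp;
    }

    /** Returns a timestamp strictly greater than any previously issued one. */
    synchronized long next() {
        long now = physicalClock.getAsLong();
        // If the physical clock lags behind, fall back to lastIssued + 1.
        lastIssued = Math.max(now, lastIssued + 1);
        return lastIssued;
    }
}
```

For example, if the old leader last issued 1001 and the new leader's physical 
clock reads 900, the first timestamp issued by the new leader in this sketch is 
1002.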

By the way, if we go with the second option, it would also fit safe time 
propagation in partitions.

  was:
For the future implementation of schema-synchronization, we need to have a 
hybrid-timestamp, associated with the meta-storage.

Database schema changes are always associated with time, and proper place to 
store them would be a meta-storage.

We don't have an "action request processor" that would have been a single 
source of truth when it comes to new "write" commands. In case of meta-storage, 
all nodes may create write commands. Assigning a time from the _hlc_ wouldn't 
work - there's a chance of having out-of order events, which is really, really 
bad.

In other words, timestamps should come in order. Does this mean that 
meta-storage should also have its own action request processor? That's one 
possibility.

Another possibility is to make leader into a timestamp-generator. This would 
lead to changes in JRaft code, but still, this may be the right way to go. It 
simply requires less changes to the code. We should just remember to adjust the 
clock on leader's re-election, so that time would be monotonic.

By the way, if we go with the second option, it would also fit safe time 
propagation in partitions.


> Implement safe-time propagation for meta-storage raft-group
> -----------------------------------------------------------
>
>                 Key: IGNITE-19028
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19028
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>
> For the future implementation of schema synchronization, we need a hybrid 
> timestamp associated with the meta-storage.
> Database schema changes are always associated with time, and the proper 
> place to store them is the meta-storage.
> We don't have a "partition replica listener" that would act as a single 
> source of truth for new "write" commands. In the case of the meta-storage, 
> all nodes may create write commands. Assigning a time from the _hlc_ of the 
> issuing node wouldn't work: there's a chance of out-of-order events, which 
> is really, really bad.
> In other words, timestamps should come in order. Does this mean that 
> meta-storage should also have its own replica listener? That's one 
> possibility.
> Another possibility is to make the leader a timestamp generator. This would 
> require changes in JRaft code, but it may still be the right way to go: it 
> simply requires fewer changes to the code. We should just remember to adjust 
> the clock on leader re-election, so that time stays monotonic.
> By the way, if we go with the second option, it would also fit safe time 
> propagation in partitions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
