[jira] [Updated] (IGNITE-18502) Excessive full partition scan on node start

Denis Chudov (Jira) Mon, 09 Jan 2023 10:20:35 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-18502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Denis Chudov updated IGNITE-18502:
----------------------------------
    Description: 
*Motivation*
All tables are transactional, even when transactions are not used
explicitly. This means every entry is stored as a write intent (pending
entry) before it is committed. These entries are required by the
transactional protocol and are tracked in an in-memory collection
(_PartitionListener#txsPendingRowIds_). The collection is lost when the node
hosting the data restarts. It is then recovered by scanning all entries in
the partition, which can take a long time.

There are several possible ways to solve this problem: persist the pending
rows collection, or resolve write intents when another transaction encounters
them, including RW transactions. After some discussion, the latter seems to be
the better option, as it doesn't require an additional persistent
storage/structure, and RW transactions should be able to resolve write intents
anyway.

*Definition of Done*
No full partition scan on start is needed.

*Implementation notes*
The chosen approach implies that unresolved write intents can remain in a
partition even if they belong to a transaction that was already committed or
aborted. When another transaction (read-only or read-write) encounters such a
write intent, it tries to resolve it. If the write intent belongs to a
transaction that has already finished, local cleanup must be run for it. This
means the write intent should be committed or aborted as if it had been
processed by _TxCleanupCommand_, but only locally, because the corresponding
write intents on other nodes should already have been processed by the regular
transaction cleanup.

  was:
*Motivation*
All tables have a transactional nature, even if we don't use transactions 
explicitly. That means all entries are stored as write intents (pending 
entries) before they are committed. Those entries are required for 
transactional protocol and stored in in-memory collection 
(_PartitionListener#txsPendingRowIds_). The collection can be lost due to 
restart of the node where the data located. Recovery of the collection happens 
through scan of all entries in the partition and can take long time.

There are several possible ways to solve this problem: to persist pending rows 
collection, or resolve write intents when they are met by another transaction, 
including RW transactions. After some discussions the latter seems to be better 
decision, as it doesn't require one more persistent storage/structure, and RW 
transactions should be able to resolve write intents anyway.

*Definition of Done*
No full partition scan on start is needed.

*Implementation notes*
The chosen decision implies that unresolved write intents in partition can be 
present in partition, even if such write intents belong to transaction that was 
already committed or aborted. When another transaction (no matter read-only or 
read-write) meets such write-intent it tries to resolve it. If it belongs to 
transaction that was already finished, then local cleanup must be run for this 
write-intent. This means that the write intent should be committed or aborted 
as like it was processed by _TxCleanupCommand_, but locally, because the 
corresponding write intents on abother nodes should have been already processed 
by transaction cleanup as normally.


> Excessive full partition scan on node start
> -------------------------------------------
>
>                 Key: IGNITE-18502
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18502
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Priority: Major
>              Labels: ignite-3
>
> *Motivation*
> All tables are transactional, even when transactions are not used
> explicitly. This means every entry is stored as a write intent (pending
> entry) before it is committed. These entries are required by the
> transactional protocol and are tracked in an in-memory collection
> (_PartitionListener#txsPendingRowIds_). The collection is lost when the
> node hosting the data restarts. It is then recovered by scanning all
> entries in the partition, which can take a long time.
> There are several possible ways to solve this problem: persist the pending
> rows collection, or resolve write intents when another transaction
> encounters them, including RW transactions. After some discussion, the
> latter seems to be the better option, as it doesn't require an additional
> persistent storage/structure, and RW transactions should be able to resolve
> write intents anyway.
> *Definition of Done*
> No full partition scan on start is needed.
> *Implementation notes*
> The chosen approach implies that unresolved write intents can remain in a
> partition even if they belong to a transaction that was already committed
> or aborted. When another transaction (read-only or read-write) encounters
> such a write intent, it tries to resolve it. If the write intent belongs to
> a transaction that has already finished, local cleanup must be run for it.
> This means the write intent should be committed or aborted as if it had
> been processed by _TxCleanupCommand_, but only locally, because the
> corresponding write intents on other nodes should already have been
> processed by the regular transaction cleanup.
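
The resolve-on-read behaviour described in the implementation notes can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the Ignite implementation: the class
WriteIntentResolutionSketch, the Row and TxState types, and the methods
recordTxState and readAndResolve are all hypothetical names, and the map of
transaction states stands in for whatever source reports a transaction's
final outcome.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: none of these types or methods are Ignite APIs.
class WriteIntentResolutionSketch {
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** A row reduced to one committed value plus at most one write intent. */
    static final class Row {
        String committedValue; // last committed value, or null
        String intentValue;    // uncommitted value, or null if no intent
        UUID intentTxId;       // transaction owning the intent, or null

        Row(String committedValue, String intentValue, UUID intentTxId) {
            this.committedValue = committedValue;
            this.intentValue = intentValue;
            this.intentTxId = intentTxId;
        }
    }

    // Assumption: a stand-in for the source of truth about transaction outcomes.
    private final Map<UUID, TxState> txStates = new HashMap<>();

    void recordTxState(UUID txId, TxState state) {
        txStates.put(txId, state);
    }

    /**
     * Called when any transaction (read-only or read-write) meets a row.
     * If the row carries a write intent of a finished transaction, the
     * intent is committed or aborted locally, then the visible value is returned.
     */
    String readAndResolve(Row row) {
        if (row.intentTxId != null) {
            TxState state = txStates.getOrDefault(row.intentTxId, TxState.PENDING);
            if (state == TxState.COMMITTED) {
                row.committedValue = row.intentValue; // local commit of the intent
                clearIntent(row);
            } else if (state == TxState.ABORTED) {
                clearIntent(row); // local abort: drop the uncommitted value
            }
            // PENDING: the owner has not finished, so the intent is left untouched.
        }
        return row.committedValue;
    }

    private static void clearIntent(Row row) {
        row.intentValue = null;
        row.intentTxId = null;
    }

    public static void main(String[] args) {
        WriteIntentResolutionSketch partition = new WriteIntentResolutionSketch();
        UUID committedTx = UUID.randomUUID();
        UUID abortedTx = UUID.randomUUID();
        partition.recordTxState(committedTx, TxState.COMMITTED);
        partition.recordTxState(abortedTx, TxState.ABORTED);

        // Both intents survived a restart although their transactions finished.
        Row a = new Row("old", "new", committedTx);
        Row b = new Row("old", "new", abortedTx);
        System.out.println(partition.readAndResolve(a));
        System.out.println(partition.readAndResolve(b));
    }
}
```

With this shape, nothing has to be rebuilt on node start: an intent left behind
by a finished transaction is cleaned up the first time a reader touches the
row. Returning the last committed value while the owning transaction is still
pending is a simplification of the sketch; the issue text does not specify
that case.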



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
