[
https://issues.apache.org/jira/browse/IGNITE-18502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis Chudov updated IGNITE-18502:
----------------------------------
Description:
*Motivation*
All tables are transactional, even when transactions are not used explicitly.
This means every entry is stored as a write intent (pending entry) before it
is committed. These entries are required by the transactional protocol and are
tracked in an in-memory collection (_PartitionListener#txsPendingRowIds_). The
collection is lost when the node that holds the data restarts. It is then
recovered by scanning all entries in the partition, which can take a long time.
There are several ways to solve this problem: persist the pending rows
collection, or resolve write intents when another transaction encounters them,
including RW transactions. After some discussion the latter seems to be the
better option, as it does not require an additional persistent
storage/structure, and RW transactions should be able to resolve write intents
anyway.
*Definition of Done*
No full partition scan on start is needed.
*Implementation notes*
The chosen approach implies that unresolved write intents can remain in a
partition even if they belong to a transaction that was already committed or
aborted. When another transaction (read-only or read-write) meets such a write
intent, it tries to resolve it. If the write intent belongs to a transaction
that has already finished, local cleanup must be run for it. This means the
write intent should be committed or aborted as if it had been processed by
_TxCleanupCommand_, but only locally, because the corresponding write intents
on other nodes should already have been processed by the regular transaction
cleanup.
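A minimal sketch of the resolve-on-read idea described in the implementation
notes, for illustration only. All names here (WriteIntentResolutionSketch,
Partition, Row, TxState, resolveLocally) are hypothetical and are not Ignite
APIs. The sketch also assumes that the final transaction state can be looked up
from a local map, and that a reader simply skips a write intent whose
transaction is still pending. The real protocol may resolve the state remotely
and may wait or report a conflict instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of lazy write intent resolution; not Ignite code. */
public class WriteIntentResolutionSketch {
    /** State of a transaction as seen by the resolving node (assumed lookup). */
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** A row: last committed value plus an optional write intent on top of it. */
    static final class Row {
        String committedValue; // null if nothing has been committed yet
        String intentValue;    // null if there is no write intent
        UUID intentTxId;       // transaction that owns the write intent, or null
    }

    /** Simplified partition storage. Note there is no pending-rows collection. */
    static final class Partition {
        final Map<String, Row> rows = new HashMap<>();
        final Map<UUID, TxState> txStates = new HashMap<>();

        /** Stores a value as a write intent owned by the given transaction. */
        void writeIntent(String key, String value, UUID txId) {
            Row row = rows.computeIfAbsent(key, k -> new Row());
            row.intentValue = value;
            row.intentTxId = txId;
            txStates.putIfAbsent(txId, TxState.PENDING);
        }

        /** Reads a key, resolving a leftover write intent if one is met. */
        String read(String key) {
            Row row = rows.get(key);
            if (row == null) {
                return null;
            }
            if (row.intentTxId != null) {
                resolveLocally(row);
            }
            return row.committedValue;
        }

        /** Local cleanup: commit or abort the intent based on the tx state. */
        private void resolveLocally(Row row) {
            TxState state = txStates.getOrDefault(row.intentTxId, TxState.PENDING);
            if (state == TxState.PENDING) {
                return; // owner not finished; this sketch just skips the intent
            }
            if (state == TxState.COMMITTED) {
                row.committedValue = row.intentValue; // promote intent to committed
            }
            // In both the committed and the aborted case the intent is removed.
            row.intentValue = null;
            row.intentTxId = null;
        }
    }

    public static void main(String[] args) {
        Partition partition = new Partition();
        UUID tx = UUID.randomUUID();
        partition.writeIntent("k", "v1", tx);
        System.out.println("before commit: " + partition.read("k"));
        partition.txStates.put(tx, TxState.COMMITTED);
        System.out.println("after commit: " + partition.read("k"));
    }
}
```

In this sketch the cleanup cost is paid by the first reader that meets the
intent, so nothing has to be rebuilt on node start.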
was:
*Motivation*
All tables have a transactional nature, even if we don't use transactions
explicitly. That means all entries are stored as write intents (pending
entries) before they are committed. Those entries are required for
transactional protocol and stored in in-memory collection
(_PartitionListener#txsPendingRowIds_). The collection can be lost due to
restart of the node where the data located. Recovery of the collection happens
through scan of all entries in the partition and can take long time.
There are several possible ways to solve this problem: to persist pending rows
collection, or resolve write intents when they are met by another transaction,
including RW transactions. After some discussions the latter seems to be better
decision, as it doesn't require one more persistent storage/structure, and RW
transactions should be able to resolve write intents anyway.
*Definition of Done*
No full partition scan on start is needed.
*Implementation notes*
The chosen decision implies that unresolved write intents in partition can be
present in partition, even if such write intents belong to transaction that was
already committed or aborted. When another transaction (no matter read-only or
read-write) meets such write-intent it tries to resolve it. If it belongs to
transaction that was already finished, then local cleanup must be run for this
write-intent. This means that the write intent should be committed or aborted
as like it was processed by _TxCleanupCommand_, but locally, because the
corresponding write intents on abother nodes should have been already processed
by transaction cleanup as normally.
> Excessive full partition scan on node start
> -------------------------------------------
>
> Key: IGNITE-18502
> URL: https://issues.apache.org/jira/browse/IGNITE-18502
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3