[
https://issues.apache.org/jira/browse/IGNITE-25270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Pyatkov updated IGNITE-25270:
---------------------------------------
Issue Type: Improvement (was: Test)
> Come up with solution that allow persisting pending entries and does not have
> a performance impact
> --------------------------------------------------------------------------------------------------
>
> Key: IGNITE-25270
> URL: https://issues.apache.org/jira/browse/IGNITE-25270
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vladislav Pyatkov
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> h3. Motivation
> Our transaction protocol needs to know which entries were used in a
> transaction in a specific partition to switch their state from a write intent
> to a regular entry.
> {code:java|title=StorageUpdateHandler}
> /** A container for rows that were inserted, updated or removed. */
> private final PendingRows pendingRows = new PendingRows();
> {code}
> This set of rows is kept in volatile storage, so it is lost when the node
> restarts. This leads to issues like IGNITE-25079.
> h3. Implementation details
> The idea is to organize a doubly linked list that connects all version
> chain heads that correspond to write intents. It effectively plays the role
> of a “pending tree”, without being a tree.
> The structure:
> * Partition’s meta has a link to the list’s head.
> * Each version chain element will be enriched with two links: “prev” and
> “next”.
> * If it’s not a write intent (commit timestamp is not 0), then these links
> should be equal to 0.
> * If it’s a write intent, then the version chain element should be a node
> of the doubly linked list.
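> A minimal in-memory sketch of the structure above. All names here
> (WriteIntentList, VersionChain, headLink) are hypothetical, and a HashMap
> stands in for page memory, so this only illustrates the link bookkeeping and
> is not the actual storage code.
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative in-memory model only; not the real storage classes. */
final class WriteIntentList {
    /** A link equal to 0 means "no link", as in the description. */
    static final long NULL_LINK = 0L;

    /** Simplified version chain head (hypothetical fields). */
    static final class VersionChain {
        final long link;      // address of this element
        long commitTimestamp; // 0 while the head is a write intent
        long prev = NULL_LINK;
        long next = NULL_LINK;

        VersionChain(long link) {
            this.link = link;
        }

        boolean isWriteIntent() {
            return commitTimestamp == 0;
        }
    }

    /** Stand-in for page memory: resolves a link to an element. */
    private final Map<Long, VersionChain> storage = new HashMap<>();

    /** Stand-in for the list head link kept in the partition's meta. */
    private long headLink = NULL_LINK;

    long headLink() {
        return headLink;
    }

    /** Registers a new write intent and pushes it to the front of the list. */
    void addWriteIntent(long link) {
        if (link == NULL_LINK || storage.containsKey(link)) {
            throw new IllegalArgumentException("Invalid or duplicate link: " + link);
        }
        VersionChain chain = new VersionChain(link);
        storage.put(link, chain);
        chain.next = headLink;
        if (headLink != NULL_LINK) {
            storage.get(headLink).prev = link;
        }
        headLink = link;
    }

    /** Turns a write intent into a regular entry: unlink it and zero its links. */
    void commit(long link, long commitTimestamp) {
        VersionChain chain = storage.get(link);
        if (chain == null || !chain.isWriteIntent() || commitTimestamp == 0) {
            throw new IllegalStateException("Not a write intent or bad timestamp: " + link);
        }
        if (chain.prev == NULL_LINK) {
            headLink = chain.next;
        } else {
            storage.get(chain.prev).next = chain.next;
        }
        if (chain.next != NULL_LINK) {
            storage.get(chain.next).prev = chain.prev;
        }
        chain.prev = NULL_LINK;
        chain.next = NULL_LINK;
        chain.commitTimestamp = commitTimestamp;
    }

    /** Walks the list from the head and returns the links of all write intents. */
    List<Long> pendingLinks() {
        List<Long> result = new ArrayList<>();
        for (long cur = headLink; cur != NULL_LINK; cur = storage.get(cur).next) {
            result.add(cur);
        }
        return result;
    }
}
```
> Note that a single commit may touch the list head and up to two neighbouring
> elements in addition to the committed one.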
> Upon restart:
> * Scan the list to get all RowIds of all write intents.
> * Find each individual ID in the main partition tree to retrieve information
> about its transaction, and construct the volatile pending rows tree at the
> same time.
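> The restart steps above could look roughly like the sketch below. It assumes
> row IDs are plain longs, transaction IDs are UUIDs, and a Map stands in for
> the main partition tree; PendingRowsRecovery and rebuild are hypothetical
> names, not existing Ignite classes.
```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.UUID;

/** Illustrative recovery routine; types are simplified stand-ins. */
final class PendingRowsRecovery {
    private PendingRowsRecovery() {
    }

    /**
     * Rebuilds the volatile "transaction -> pending rows" structure.
     *
     * @param persistedRowIds row IDs collected by scanning the persisted list
     * @param partitionTree stand-in for the main partition tree:
     *        row ID -> ID of the transaction that owns the write intent
     */
    static Map<UUID, SortedSet<Long>> rebuild(List<Long> persistedRowIds, Map<Long, UUID> partitionTree) {
        Map<UUID, SortedSet<Long>> pendingRows = new TreeMap<>();
        for (Long rowId : persistedRowIds) {
            // One tree lookup per row ID: this is the potentially expensive part.
            UUID txId = partitionTree.get(rowId);
            if (txId == null) {
                throw new IllegalStateException("No write intent found for row " + rowId);
            }
            pendingRows.computeIfAbsent(txId, id -> new TreeSet<>()).add(rowId);
        }
        return pendingRows;
    }
}
```
> The cost is one lookup per write intent, so recovery time grows with the
> number of pending rows in the partition.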
> Problems:
> * A simple scan of the list is not enough: we have to perform a tree lookup
> for each RowId that we’ve found, which might be expensive.
> * Constructing a linked list on top of tree nodes directly is impossible
> because trees move data between nodes as they are modified.
> Cleanup can be performed concurrently (see scheduleAsyncWriteIntentSwitch),
> which means that our doubly linked list could be concurrently modified at any
> place. Guaranteeing the correctness of concurrent modifications is not an
> easy task and requires very careful consideration. This problem might be a
> blocker for this particular solution.
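> To illustrate the concurrency concern, here is a baseline sketch that guards
> every list modification with one coarse lock. This is an assumption made
> for illustration only, not the proposed design: it is easy to reason about,
> but it serializes all write intent switches in a partition. LockedIntentList
> and Node are hypothetical names.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: a doubly linked list protected by a single lock. */
final class LockedIntentList {
    static final class Node {
        final long rowId;
        Node prev;
        Node next;
        boolean linked; // read and written only under the list lock

        Node(long rowId) {
            this.rowId = rowId;
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private Node head;

    /** Links a new node at the head of the list. */
    Node add(long rowId) {
        Node node = new Node(rowId);
        lock.lock();
        try {
            node.next = head;
            if (head != null) {
                head.prev = node;
            }
            head = node;
            node.linked = true;
        } finally {
            lock.unlock();
        }
        return node;
    }

    /** Unlinks a node; returns false if it was already removed. */
    boolean remove(Node node) {
        lock.lock();
        try {
            if (!node.linked) {
                return false;
            }
            // The head and both neighbours may change, so they are updated atomically.
            if (node.prev == null) {
                head = node.next;
            } else {
                node.prev.next = node.next;
            }
            if (node.next != null) {
                node.next.prev = node.prev;
            }
            node.prev = null;
            node.next = null;
            node.linked = false;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Returns the row IDs currently in the list, head first. */
    List<Long> snapshot() {
        lock.lock();
        try {
            List<Long> result = new ArrayList<>();
            for (Node cur = head; cur != null; cur = cur.next) {
                result.add(cur.rowId);
            }
            return result;
        } finally {
            lock.unlock();
        }
    }
}
```
> A finer-grained scheme would have to lock or latch a node together with both
> of its neighbours in a consistent order, which is where the difficulty lies.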
> h3. Definition of done
> Write a document that describes the solution.
> Create the corresponding Jira tasks.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)