[jira] [Created] (IGNITE-19395) Reduce write amplification for RocksDB partition storage

Ivan Bessonov (Jira) Tue, 02 May 2023 01:04:08 -0700

Ivan Bessonov created IGNITE-19395:
--------------------------------------

             Summary: Reduce write amplification for RocksDB partition storage
                 Key: IGNITE-19395
                 URL: https://issues.apache.org/jira/browse/IGNITE-19395
             Project: Ignite
          Issue Type: Improvement
            Reporter: Ivan Bessonov



Currently, the "commit" operation in rocksdb storage looks like this:
{code:java}
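// Read the uncommitted payload, drop the write intent,
// then write the same payload again under the committed key.
var data = db.read(writeIntentKey);
db.remove(writeIntentKey);
db.write(committedKey, data);{code}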
This is wasteful: we end up writing every payload twice. An alternative is to 
add a level of indirection to the data:
{code:java}
// RowId index.
[ TableId?? | PartId | RowId | Timestamp ] -> [ DataId ]
[ TableId?? | PartId | RowId ] -> [ DataId | TxId | CommitTableId | CommitPartId ]

// Data.
[ DataId ] -> [ Payload ]{code}
{{DataId}} must be unique. I don't like the idea of an auto-incrementing key 
(we would always have to persist the latest value); there must be another way.

The main idea is that DataId doesn't change while committing the data, meaning 
that it can be generated using RowId and TxId.
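
To illustrate why a stable DataId helps, here is a minimal in-memory sketch of 
the proposed layout. Plain maps stand in for RocksDB, string keys stand in for 
the binary key layout, and the table/partition prefixes and 
CommitTableId/CommitPartId fields are left out. All class and method names are 
hypothetical and are not part of the Ignite code base.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the indirection scheme; maps replace RocksDB. */
final class IndirectStorageSketch {
    /** Value of the write-intent index entry: [ DataId | TxId ]. */
    static final class WriteIntent {
        final String dataId;
        final String txId;

        WriteIntent(String dataId, String txId) {
            this.dataId = dataId;
            this.txId = txId;
        }
    }

    // [ RowId ] -> [ DataId | TxId ]
    final Map<String, WriteIntent> writeIntents = new HashMap<>();
    // [ RowId | Timestamp ] -> [ DataId ]
    final Map<String, String> committed = new HashMap<>();
    // [ DataId ] -> [ Payload ]
    final Map<String, byte[]> data = new HashMap<>();
    // Counts how many times a payload was written, to make the saving visible.
    int payloadWrites = 0;

    /** Stores the payload once and points the write intent at it. */
    void addWrite(String rowId, String txId, String dataId, byte[] payload) {
        data.put(dataId, payload);
        payloadWrites++;
        writeIntents.put(rowId, new WriteIntent(dataId, txId));
    }

    /** Commit only moves the small index entry; the payload is not touched. */
    void commit(String rowId, long commitTimestamp) {
        WriteIntent intent = writeIntents.remove(rowId);
        if (intent == null) {
            throw new IllegalStateException("No write intent for row " + rowId);
        }
        committed.put(rowId + "@" + commitTimestamp, intent.dataId);
    }

    /** Resolves a committed version through the index to its payload. */
    byte[] readCommitted(String rowId, long commitTimestamp) {
        String dataId = committed.get(rowId + "@" + commitTimestamp);
        return dataId == null ? null : data.get(dataId);
    }
}
```

In this model a write followed by a commit writes the payload exactly once; 
only the index entries, which hold a DataId instead of the full row, are 
rewritten.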

For example, {{RowId ++ beginTimestamp(TxId)}} seems like a unique value (with 
a mandatory partition ID prefix and probably a table ID prefix).
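
A rough sketch of how such a DataId could be assembled follows. The field 
widths are assumptions made only for illustration: a 2-byte partition ID, a 
16-byte RowId modelled as a UUID, and an 8-byte transaction begin timestamp. 
The optional table ID prefix is omitted, and the class and method names are 
hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Hypothetical DataId encoder: PartId ++ RowId ++ beginTimestamp(TxId). */
final class DataIdSketch {
    /** Assumed layout: 2 bytes partition ID, 16 bytes RowId, 8 bytes timestamp. */
    static final int DATA_ID_SIZE = Short.BYTES + 2 * Long.BYTES + Long.BYTES;

    static byte[] dataId(int partitionId, UUID rowId, long txBeginTimestamp) {
        // ByteBuffer is big-endian by default, so byte-wise key order
        // follows partition, then row, then timestamp.
        return ByteBuffer.allocate(DATA_ID_SIZE)
                .putShort((short) partitionId)
                .putLong(rowId.getMostSignificantBits())
                .putLong(rowId.getLeastSignificantBits())
                .putLong(txBeginTimestamp)
                .array();
    }
}
```

Because every input is already known when the write intent is created and is 
still known at commit time, the same DataId can be recomputed at any point and 
no counter has to be persisted. Uniqueness rests on the assumption that a 
transaction writes at most one pending version per row.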



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
