[jira] [Comment Edited] (IGNITE-19395) Reduce write amplification for RocksDB partition storage

Aleksandr Polovtsev (Jira) Thu, 15 Aug 2024 05:38:39 -0700


    [ https://issues.apache.org/jira/browse/IGNITE-19395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17873900#comment-17873900 ]


Aleksandr Polovtsev edited comment on IGNITE-19395 at 8/15/24 12:37 PM:
------------------------------------------------------------------------

Benchmarking results (the first result is for the {{main}} branch, the second is for the branch with this change):

{noformat}
Benchmark                                    Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  thrpt   25  1.048 ± 0.209  ops/s

Benchmark                                    Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  thrpt   25  0.669 ± 0.102  ops/s
{noformat}



was (Author: apolovtcev):
Benchmarking results:

Throughput mode (first measurements are for the {{main}} branch):

{noformat}
Benchmark                                    Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  thrpt   25  1.563 ± 0.237  ops/s

Benchmark                                    Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  thrpt   25  0.897 ± 0.144  ops/s
{noformat}

Average time mode (first measurements are for the {{main}} branch):

{noformat}
Benchmark                                   Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  avgt   25  0.715 ± 0.087   s/op

Benchmark                                   Mode  Cnt  Score   Error  Units
CommitManyWritesBenchmark.commitManyWrites  avgt   25  0.657 ± 0.166   s/op
{noformat}



> Reduce write amplification for RocksDB partition storage
> --------------------------------------------------------
>
>                 Key: IGNITE-19395
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19395
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Aleksandr Polovtsev
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the "commit" operation in the RocksDB storage looks like this:
> {code:java}
> val data = db.read(writeIntentKey);
> db.remove(writeIntentKey);
> db.write(committedKey, data);{code}
> This is wasteful: we end up writing everything twice. An alternative 
> is to add a level of indirection to the data:
> {code:java}
> // RowId index.
> [ TableId?? | PartId | RowId | Timestamp ] -> [ DataId ]
> [ TableId?? | PartId | RowId ] -> [ DataId | TxId | CommitTableId | CommitPartId ]
> // Data.
> [ DataId ] -> [ Payload ]{code}
> {{DataId}} must be unique. I don't like the idea of an auto-incrementing key 
> (we would always have to persist the latest value); there must be another way.
> The main idea is that {{DataId}} doesn't change while the data is being 
> committed, meaning that it can be generated from {{RowId}} and {{TxId}}.
> For example, {{RowId ++ beginTimestamp(TxId)}} seems like a unique value 
> (with a mandatory partition ID prefix and probably a table ID prefix).
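
The difference between the two commit paths described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual Ignite storage code: {{CommitSketch}}, {{Store}}, the text key encoding and the {{data|}} / {{index|}} prefixes are all hypothetical, an in-memory sorted map stands in for RocksDB, and only value bytes are counted (keys, the WAL and compaction are ignored, so this says nothing about real throughput).

```java
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;

/** Toy model of both commit strategies; a sorted in-memory map stands in for RocksDB. */
final class CommitSketch {
    /** Hypothetical key-value store that counts the value bytes written to it. */
    static final class Store {
        final TreeMap<String, byte[]> map = new TreeMap<>();
        long bytesWritten;

        void put(String key, byte[] value) {
            map.put(key, value);
            bytesWritten += value.length;
        }

        byte[] get(String key) { return map.get(key); }

        void remove(String key) { map.remove(key); }
    }

    // --- Current scheme: the write intent holds the payload itself. ---

    static void addWriteDirect(Store db, String rowKey, byte[] payload) {
        db.put(rowKey, payload);
    }

    static void commitDirect(Store db, String rowKey, long commitTs) {
        byte[] data = db.get(rowKey);
        db.remove(rowKey);
        db.put(rowKey + "|" + commitTs, data); // the payload is written a second time
    }

    // --- Proposed scheme: index entries point at a DataId that never changes. ---

    /** DataId = PartId ++ RowId ++ beginTimestamp(TxId), encoded as text for readability. */
    static String dataId(int partId, String rowId, long txBeginTs) {
        return partId + "|" + rowId + "|" + txBeginTs;
    }

    static void addWriteIndirect(Store db, int partId, String rowId, long txBeginTs, byte[] payload) {
        String id = dataId(partId, rowId, txBeginTs);
        db.put("data|" + id, payload); // the payload is written once
        // Write-intent index entry; TxId, CommitTableId and CommitPartId are omitted here.
        db.put("index|" + partId + "|" + rowId, id.getBytes(StandardCharsets.UTF_8));
    }

    static void commitIndirect(Store db, int partId, String rowId, long commitTs) {
        String intentKey = "index|" + partId + "|" + rowId;
        byte[] id = db.get(intentKey);
        db.remove(intentKey);
        db.put(intentKey + "|" + commitTs, id); // only the small DataId is rewritten
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1000];

        Store direct = new Store();
        addWriteDirect(direct, "0|row-1", payload);
        commitDirect(direct, "0|row-1", 200L);

        Store indirect = new Store();
        addWriteIndirect(indirect, 0, "row-1", 100L, payload);
        commitIndirect(indirect, 0, "row-1", 200L);

        System.out.println("direct value bytes written:   " + direct.bytesWritten);
        System.out.println("indirect value bytes written: " + indirect.bytesWritten);
    }
}
```

In this model the direct path writes the payload twice, while the indirect path writes it once plus two short index values. The price is an extra lookup on reads (index entry first, then the data entry), which the sketch does not measure.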



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
