[ 
https://issues.apache.org/jira/browse/CASSANDRA-18648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740025#comment-17740025
 ] 

Berenguer Blasi commented on CASSANDRA-18648:
---------------------------------------------

To measure the impact of the change, a JMH benchmark has been included. The 
numbers for the raw algorithm against a ByteBuffer are pretty similar:

{noformat}
     [java] Benchmark                                (diskRAMParam)  (liveDTPcParam)  (sstableParam)  Mode  Cnt  Score   Error  Units
     [java] DeletionTimeDeSerBench.testRawAlgReads              RAM         70PcLive              NC  avgt   15  0.336 ± 0.002  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgReads              RAM         70PcLive              OA  avgt   15  0.339 ± 0.010  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgReads              RAM         30PcLive              NC  avgt   15  0.342 ± 0.009  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgReads              RAM         30PcLive              OA  avgt   15  0.348 ± 0.012  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgWrites             RAM         70PcLive              NC  avgt   15  0.346 ± 0.007  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgWrites             RAM         70PcLive              OA  avgt   15  0.339 ± 0.006  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgWrites             RAM         30PcLive              NC  avgt   15  0.349 ± 0.012  ns/op
     [java] DeletionTimeDeSerBench.testRawAlgWrites             RAM         30PcLive              OA  avgt   15  0.349 ± 0.009  ns/op
{noformat}

Here are the numbers for end-to-end de/serialization against an mmapped file 
and disk. Again, the new algorithm has little to no impact against memory, but 
there are big improvements when we hit disk (compare the OA rows with the NC 
rows; e.g. Disk/70PcLive serialization drops from ~1.56 ms/op to ~0.59 ms/op):

{noformat}
     [java] Benchmark                                    (diskRAMParam)  (liveDTPcParam)  (sstableParam)  Mode  Cnt        Score        Error  Units
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             RAM         70PcLive              NC  avgt   15   613153.276 ±  18373.103  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             RAM         70PcLive              OA  avgt   15   602673.939 ±  20984.133  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             RAM         30PcLive              NC  avgt   15   937167.359 ±  23716.146  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             RAM         30PcLive              OA  avgt   15   953676.229 ±  12097.928  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT            Disk         70PcLive              NC  avgt   15  1411795.076 ± 128808.350  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT            Disk         70PcLive              OA  avgt   15   819253.723 ±  29830.083  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT            Disk         30PcLive              NC  avgt   15  1610652.879 ±  64372.441  ns/op
     [java] DeletionTimeDeSerBench.testE2EDeSerializeDT            Disk         30PcLive              OA  avgt   15  1466368.673 ±  52166.339  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT               RAM         70PcLive              NC  avgt   15   301368.972 ±   8243.054  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT               RAM         70PcLive              OA  avgt   15   329025.463 ±   3750.320  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT               RAM         30PcLive              NC  avgt   15   456404.368 ±  11689.921  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT               RAM         30PcLive              OA  avgt   15   478303.985 ±   8203.872  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT              Disk         70PcLive              NC  avgt   15  1555390.965 ± 167212.659  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT              Disk         70PcLive              OA  avgt   15   590942.229 ±   8555.182  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT              Disk         30PcLive              NC  avgt   15  1934177.996 ± 242716.258  ns/op
     [java] DeletionTimeDeSerBench.testE2ESerializeDT              Disk         30PcLive              OA  avgt   15  1239555.345 ± 128650.505  ns/op
{noformat}

A perf test shows a slight improvement in ops/s, which could be just test 
noise, and no impact on latencies:

 !screenshot-1.png! 

 !screenshot-2.png! 

 !screenshot-3.png! 

As for disk size, I can see improvements of around 3% in the following test (20% deletes):

{noformat}
#Insert data
CQL(8)|INSERT INTO test.test (id, type0, type1, type2, type3, type4, type5, type6, type7, text) VALUES ($RANDOM_10000000, $RANDOM_10, $RANDOM_10,$RANDOM_10,$RANDOM_10,$RANDOM_10,$RANDOM_10,$RANDOM_10,$RANDOM_10,'DT deser Profiling')
CQL(2)|DELETE FROM test.test WHERE id = $RANDOM_10000000

#Select data
CQL|SELECT id, type$RANDOM_7, text FROM test.test WHERE id=$RANDOM_10000000
{noformat}

This will only show up in small-partition tests, as we're now writing 1B 
instead of 11B at _partition_ level.
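
For readers who want a concrete picture of the idea in the issue description (one flag byte for DeletionTime.LIVE, the full 8-byte long plus 4-byte int otherwise), here is a minimal sketch. It is not the actual patch: the class name, the sentinel values standing in for LIVE, and the choice of the top bit of the first byte as the flag are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Illustrative only: names, sentinel values and flag layout are assumptions,
// not Cassandra's actual DeletionTime serializer.
final class DeletionTimeSketch {
    // Assumed stand-ins for the fields of DeletionTime.LIVE.
    static final long LIVE_MFDA = Long.MIN_VALUE;
    static final int LIVE_LDT = Integer.MAX_VALUE;
    // Assumed flag: top bit of the first serialized byte means "LIVE".
    static final int IS_LIVE_FLAG = 0x80;

    static boolean isLive(long mfda, int ldt) {
        return mfda == LIVE_MFDA && ldt == LIVE_LDT;
    }

    // Writes 1 byte for LIVE, otherwise 12 bytes (8-byte long + 4-byte int).
    static void serialize(long mfda, int ldt, ByteBuffer out) {
        if (isLive(mfda, ldt)) {
            out.put((byte) IS_LIVE_FLAG);
            return;
        }
        // The 8th (most significant) byte must stay free for flags.
        if ((mfda >>> 56) != 0)
            throw new IllegalArgumentException("markedForDeleteAt does not fit in 7 bytes: " + mfda);
        out.putLong(mfda); // ByteBuffer is big-endian by default: the flag byte comes first
        out.putInt(ldt);
    }

    // Returns {markedForDeleteAt, localDeletionTime}.
    static long[] deserialize(ByteBuffer in) {
        int first = in.get() & 0xFF;
        if ((first & IS_LIVE_FLAG) != 0)
            return new long[] { LIVE_MFDA, LIVE_LDT };
        in.position(in.position() - 1); // not LIVE: step back and read the whole long
        long mfda = in.getLong();
        int ldt = in.getInt();
        return new long[] { mfda, ldt };
    }

    // 2^56 microseconds expressed in 365-day years (roughly 2,280 to 2,285
    // years depending on the year length used).
    static double yearsCoveredBy7Bytes() {
        double seconds = Math.pow(2, 56) / 1_000_000.0;
        return seconds / (365.0 * 24 * 60 * 60);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(32);
        serialize(LIVE_MFDA, LIVE_LDT, buf);
        System.out.println("LIVE bytes written: " + buf.position());
        int before = buf.position();
        serialize(1_688_000_000_000_000L, 1_688_000_000, buf);
        System.out.println("Non-LIVE bytes written: " + (buf.position() - before));
        System.out.printf("7 bytes of microseconds cover about %.1f years%n", yearsCoveredBy7Bytes());
    }
}
```

Under these assumptions, a live partition costs a single byte on both the read and write paths, which is why the saving is most visible with small partitions.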

> Improved DeletionTime serialization
> -----------------------------------
>
>                 Key: CASSANDRA-18648
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18648
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> DeletionTime.markedForDeleteAt is a long holding microseconds since the Unix 
> epoch. But I noticed that with 7 bytes we can already encode ~2284 years. We 
> can either shed the 8th byte, for reduced IO and disk, or encode some 
> sentinel values as flags there. [~blerer] suggested starting with 
> DeletionTime.LIVE. That would mean reading and writing 1 byte instead of 12 
> (8 mfda long + 4 ldts int). Yes, we already avoid serializing DeletionTime 
> (DT) in sstables at _row_ level entirely, but not at _partition_ level, and 
> it is also serialized at index, metadata, etc.


