[ 
https://issues.apache.org/jira/browse/CASSANDRA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155467#comment-16155467
 ] 

Ariel Weisberg commented on CASSANDRA-13530:
--------------------------------------------

One problem is that all the measurements with batch were taken with a small 
sync window which could be introducing extra syncs that hurt performance. I 
don't think it's likely, but I want to make sure.

The second issue is that it's not clear why one implementation is faster than 
the other and it's kind of unexpected that batch is slower then group. I want 
to find out why batch is slower than group so we know it's not just a bug in 
the batch implementation.

The third issue is that group isn't faster under every scenario. If it were 
clearly always faster I wouldn't dig deeper, but I don't want people who use 
group to have to make the tradeoff if we can fix the performance of batch. It's 
also not about the amount of code so much as it is adding another tuning and 
configuration choice for users.

Both approaches batch together several writes before calling sync. The 
difference is how they construct the batches. One is based on fixed time 
interval and the other is based on fsync latency. Fsync latency should yield 
the optimal batch size that is as low latency as possible under every level of 
throughput with zero configuration tuning.

The only way for group to be better is if batch is calling fsync when it 
shouldn't or with a small number of operations (why?) or if calling fsync as 
soon as possible is somehow killing throughput of the device.

This is dragging on a bit just because of coordination issues over what and how 
to measure. [~yuji] thank you for doing the work I really do appreciate you 
being so thorough.

> GroupCommitLogService
> ---------------------
>
>                 Key: CASSANDRA-13530
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13530
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuji Ito
>            Assignee: Yuji Ito
>             Fix For: 2.2.x, 3.0.x, 3.11.x
>
>         Attachments: groupCommit22.patch, groupCommit30.patch, 
> groupCommit3x.patch, groupCommitLog_noSerial_result.xlsx, 
> groupCommitLog_result.xlsx, GuavaRequestThread.java, MicroRequestThread.java
>
>
> I propose a new CommitLogService, GroupCommitLogService, to improve the 
> throughput when lots of requests are received.
> It improved the throughput by maximum 94%.
> I'd like to discuss about this CommitLogService.
> Currently, we can select either 2 CommitLog services; Periodic and Batch.
> In Periodic, we might lose some commit log which hasn't written to the disk.
> In Batch, we can write commit log to the disk every time. The size of commit 
> log to write is too small (< 4KB). When high concurrency, these writes are 
> gathered and persisted to the disk at once. But, when insufficient 
> concurrency, many small writes are issued and the performance decreases due 
> to the latency of the disk. Even if you use SSD, processes of many IO 
> commands decrease the performance.
> GroupCommitLogService writes some commitlog to the disk at once.
> The patch adds GroupCommitLogService (It is enabled by setting 
> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml).
> The difference from Batch is just only waiting for the semaphore.
> By waiting for the semaphore, some writes for commit logs are executed at the 
> same time.
> In GroupCommitLogService, the latency becomes worse if the there is no 
> concurrency.
> I measured the performance with my microbench (MicroRequestThread.java) by 
> increasing the number of threads.The cluster has 3 nodes (Replication factor: 
> 3). Each nodes is AWS EC2 m4.large instance + 200IOPS io1 volume.
> The result is as below. The GroupCommitLogService with 10ms window improved 
> update with Paxos by 94% and improved select with Paxos by 76%.
> h6. SELECT / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|192|103|
> |2|163|212|
> |4|264|416|
> |8|454|800|
> |16|744|1311|
> |32|1151|1481|
> |64|1767|1844|
> |128|2949|3011|
> |256|4723|5000|
> h6. UPDATE / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|45|26|
> |2|39|51|
> |4|58|102|
> |8|102|198|
> |16|167|213|
> |32|289|295|
> |64|544|548|
> |128|1046|1058|
> |256|2020|2061|



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to