[jira] [Work logged] (ARTEMIS-3282) Expose Replication response batching tuning

ASF GitHub Bot (Jira) Wed, 12 May 2021 06:41:08 -0700


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-3282?focusedWorklogId=595330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-595330
 ]


ASF GitHub Bot logged work on ARTEMIS-3282:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/21 13:40
            Start Date: 12/May/21 13:40
    Worklog Time Spent: 10m 
      Work Description: franz1981 commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r630973519



##########
File path: 
artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
       > I really don't see how a user can choose a value for maxReplicaResponseBatchBytes
   
   It can be 0 for users who care about 99.XXX percentile latencies and have configured kernel bypass drivers.
   It could be the MTU size for users who know it, and it should be -1 for "common" users (right now I've chosen 1500, which is the typical MTU size).
   I think it's a very low-level config really, and it is yet to be seen how useful it can be: I'm still in the process of validating it before dropping the "draft" status of the PR.
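
   To make the three choices above concrete, here is a minimal sketch of a size-based flush policy. It is not the code in this PR: `ResponseBatcher`, `add` and `flushed` are hypothetical names, and the semantics are my assumption (0 flushes every response immediately, a positive value flushes once that many bytes are pending, -1 leaves flushing to `endOfBatch()` alone).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based response batching; not the Artemis implementation. */
final class ResponseBatcher {
    // Stand-in for maxReplicaResponseBatchBytes (assumed semantics, see above).
    private final int maxBatchBytes;
    // Records the size of every flush so the behaviour can be inspected.
    private final List<Integer> flushedBatchSizes = new ArrayList<>();
    private int pendingBytes;

    ResponseBatcher(int maxBatchBytes) {
        this.maxBatchBytes = maxBatchBytes;
    }

    /** Queues one response and flushes if the size threshold is reached. */
    void add(int responseBytes) {
        pendingBytes += responseBytes;
        // -1: no size limit, only endOfBatch() flushes.
        //  0: the threshold is always reached, so each response is flushed alone.
        if (maxBatchBytes >= 0 && pendingBytes >= maxBatchBytes) {
            flush();
        }
    }

    /** Flushes whatever is pending, e.g. before a potentially blocking operation. */
    void endOfBatch() {
        flush();
    }

    private void flush() {
        if (pendingBytes > 0) {
            flushedBatchSizes.add(pendingBytes);
            pendingBytes = 0;
        }
    }

    List<Integer> flushed() {
        return flushedBatchSizes;
    }
}
```

   With 0, two 20-byte responses go out as two separate writes; with -1, they go out together when `endOfBatch()` is called.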
   
   >  I think it has to be automatic or automagical, based on the some limit on 
what can be read.
   
   From the point of view of network utilization and memory usage, just using -1 or 1500 is already a huge step forward compared with the "previous" (pre-https://issues.apache.org/jira/browse/ARTEMIS-2877) behaviour.
   
   Size of Ethernet frame - 24 bytes
   Size of IPv4 header (without any options) - 20 bytes
   Size of TCP header (without any options) - 20 bytes
   
   Total size of an Ethernet frame carrying an IP packet with an empty TCP segment - 24 + 20 + 20 = 64 bytes
   
   When a batch is larger than the MTU, TCP is going to split it across multiple segments, but that's fine because it still amortizes the syscall cost while maximizing network usage.
   Sending responses one by one, by contrast, means paying a ~3X data overhead for each response sent, which will hurt both latencies and CPU/network usage.
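
   The arithmetic above can be sketched as follows. The header sizes are the ones listed in this comment; the 21-byte response size is purely hypothetical, picked only because it gives a header-to-payload ratio of roughly 3, and `ResponseOverhead` is an illustrative name, not something in the codebase.

```java
/** Back-of-the-envelope wire overhead for responses; illustrative only. */
final class ResponseOverhead {
    // Header sizes as quoted in the comment above (no IP or TCP options).
    static final int ETHERNET_BYTES = 24;
    static final int IPV4_BYTES = 20;
    static final int TCP_BYTES = 20;
    static final int HEADERS_BYTES = ETHERNET_BYTES + IPV4_BYTES + TCP_BYTES;

    /** Total bytes on the wire when `count` responses are carried by `frames` frames. */
    static long wireBytes(int count, int payloadBytes, int frames) {
        return (long) count * payloadBytes + (long) frames * HEADERS_BYTES;
    }

    /** Header bytes divided by payload bytes for the same scenario. */
    static double overheadRatio(int count, int payloadBytes, int frames) {
        return (double) frames * HEADERS_BYTES / ((double) count * payloadBytes);
    }

    public static void main(String[] args) {
        int payloadBytes = 21; // hypothetical response size, not taken from Artemis
        int count = 50;        // 50 * 21 = 1050 payload bytes, below a 1500-byte MTU

        // One response per frame: every response pays the full header cost.
        System.out.printf("one by one: ratio=%.2f, wire=%d bytes%n",
                overheadRatio(1, payloadBytes, 1),
                count * wireBytes(1, payloadBytes, 1));

        // All responses in one frame: the header cost is paid once.
        System.out.printf("batched:    ratio=%.2f, wire=%d bytes%n",
                overheadRatio(count, payloadBytes, 1),
                wireBytes(count, payloadBytes, 1));
    }
}
```

   Under these assumptions the batched case pays the 64 header bytes once instead of 50 times, which is where the saving in network usage comes from.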
   
   
   >  I wonder if something similar based on confirmation-window could work here
   
   No idea. IIRC the replication packet flow doesn't obey any of the flow rules of the other cluster connection channels: no duplicate checks, no confirmation window or anything similar. @clebertsuconic can you confirm?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 595330)
    Time Spent: 2h 50m  (was: 2h 40m)

> Expose Replication response batching tuning
> -------------------------------------------
>
>                 Key: ARTEMIS-3282
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3282
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
