ninsmiracle opened a new issue, #1840: URL: https://github.com/apache/incubator-pegasus/issues/1840
## Bug Report

Please answer these questions before submitting your issue. Thanks!

1. What did you do?

   In an online production environment, some nodes in the master cluster could not send duplication data to the backup cluster.

2. What version of Pegasus are you using?

   [pegasus 2.4](https://github.com/apache/incubator-pegasus/tree/v2.4)

3. Why?

   The root cause is that the master cluster sent a write RPC to the backup cluster whose request body exceeded the maximum write size allowed by the backup cluster.

   The master cluster sends this illegal request because of how mutations are packaged for duplication: the master traverses the writes received within a time window and checks whether all writes have been traversed, or whether the bytes already accumulated in the current batch exceed the `duplicate_log_batch_bytes` set in the cluster's hot-standby configuration (`config.ini`). If the accumulated bytes do not yet exceed that limit, the next mutation is merged into the current batch. Because the size check is applied *before* the next mutation is added, a batch can overshoot the limit whenever the table's value-length distribution is wide.

   For example: the first mutation A is 200 bytes long, so the next mutation is naturally merged into the same batch. But the next mutation B is 1048376 bytes long (1048576 - 200). At this point **A and B are already in one batch**, the RPC is sent out successfully, but the backup cluster cannot accept such a large write and throws an ERR error. Duplication on the master cluster is then delayed in recovering and also throws an ERR error.
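To make the failure mode concrete, here is a minimal C++ sketch of the check-then-append batching pattern described above. The types and names (`mutation`, `batch_mutations`) are illustrative stand-ins, not the actual Pegasus code; the point is only that checking the accumulated bytes before appending the next mutation lets a batch exceed the limit by up to one whole mutation.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Pegasus mutation; name is illustrative only.
struct mutation {
    std::vector<char> data;
    size_t bytes() const { return data.size(); }
};

// Sketch of the batching loop described above: the size check looks at the
// bytes accumulated *before* the next mutation is appended, so a finished
// batch can overshoot `duplicate_log_batch_bytes` by up to one mutation.
std::vector<std::vector<mutation>> batch_mutations(
    const std::vector<mutation> &pending, uint64_t duplicate_log_batch_bytes)
{
    std::vector<std::vector<mutation>> batches;
    std::vector<mutation> current;
    uint64_t current_bytes = 0;

    for (const mutation &mu : pending) {
        // BUG (as described in the report): `current_bytes` does not yet
        // include `mu`, so a 200-byte batch happily absorbs a
        // 1048376-byte mutation and is sent as one 1048576-byte RPC that
        // the backup cluster rejects.
        if (current_bytes >= duplicate_log_batch_bytes && !current.empty()) {
            batches.push_back(std::move(current));
            current.clear();
            current_bytes = 0;
        }
        current_bytes += mu.bytes();
        current.push_back(mu);
    }
    if (!current.empty()) {
        batches.push_back(std::move(current));
    }
    return batches;
}
```

A plausible fix, under the same assumptions, is to check `current_bytes + mu.bytes()` against the limit (or against the backup cluster's maximum allowed write size) before appending, flushing the current batch first when the next mutation would push it over.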
