xiyang created NIFI-12812:
-----------------------------
Summary: CaptureChangeMySQL consumes binlog backlog
Key: NIFI-12812
URL: https://issues.apache.org/jira/browse/NIFI-12812
Project: Apache NiFi
Issue Type: Bug
Components: C2
Affects Versions: 1.23.2
Environment: java version "17.0.7" 2023-04-18 LTS
Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode, sharing)
Linux 3.10.0-1160.92.1.el7.x86_64
Reporter: xiyang
I use CaptureChangeMySQL (NiFi 1.23.2) to consume MySQL binlogs. Over long
runs the processor often stalls. We monitor this by comparing the binlog
position on the primary database with the binlog.position reported by the
processor, and we regularly see delayed consumption or a backlog. Under a large
volume of updates and inserts, the processor can fall behind by 10 or more
binlog files (judging by binlog.filename).
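
A minimal sketch of the comparison described above, for anyone who wants to
reproduce the monitoring. It assumes the primary's coordinates come from
SHOW MASTER STATUS (File, Position) and the consumer's from the
binlog.filename and binlog.position values. The names BinlogCoord, file_index
and binlog_lag are hypothetical helpers, not part of NiFi, and the file names
used with them are made-up examples.

```python
import re
from typing import NamedTuple, Optional, Tuple


class BinlogCoord(NamedTuple):
    filename: str  # e.g. "mysql-bin.000123" (example name, not from the report)
    position: int  # byte offset within that file


def file_index(filename: str) -> int:
    """Return the numeric suffix of a binlog file name."""
    match = re.search(r"\.(\d+)$", filename)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {filename!r}")
    return int(match.group(1))


def binlog_lag(source: BinlogCoord, consumer: BinlogCoord) -> Tuple[int, Optional[int]]:
    """Return (files_behind, bytes_behind).

    bytes_behind is only meaningful when both sides are in the same file,
    so it is None whenever files_behind is non-zero.
    """
    files_behind = file_index(source.filename) - file_index(consumer.filename)
    bytes_behind = source.position - consumer.position if files_behind == 0 else None
    return files_behind, bytes_behind
```

A files_behind value of 10 or more corresponds to the backlog reported here.
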
We tried increasing Events Per FlowFile and setting Include Begin/Commit
Events to true, which seems to relieve the backlog caused by heavy updates and
inserts. However, a second problem then appears: data seems to be emitted
intermittently, as if it were batched in memory or somewhere else. When we stop
the processor, it sends all of the previously cached data at once. With Events
Per FlowFile left at its default value and an insert rate of 3000+ rows per
second, a backlog builds up and the processor catches up very slowly. Sometimes
it has still not caught up when the binlog is purged, which eventually leads to
a "binlog file not found" error. How can we get real-time binlog (CDC) output,
or otherwise resolve this problem?
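
To make the catch-up concern concrete, here is a rough back-of-the-envelope
sketch. Only the 3000 rows per second figure comes from this report. The
function name catch_up_seconds and every other number are hypothetical, and
the model assumes constant rates, which a real workload will not have.

```python
import math


def catch_up_seconds(backlog_events: float, produce_rate: float, consume_rate: float) -> float:
    """Estimate seconds until the consumer drains the backlog.

    produce_rate: events/second written to the binlog by the database.
    consume_rate: events/second the consumer actually processes.
    Returns math.inf when the consumer is not faster than the producer,
    because the backlog then never shrinks.
    """
    net_rate = consume_rate - produce_rate
    if net_rate <= 0:
        return math.inf
    return backlog_events / net_rate
```

If the estimate is longer than the time the server keeps binlog files before
purging them, the consumer will eventually ask for a file that no longer
exists, which matches the failure described above.
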
Best regards!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)