[
https://issues.apache.org/jira/browse/NIFI-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xiyang updated NIFI-12812:
--------------------------
Description:
I am using CaptureChangeMySQL 1.23.3 to consume binlogs. During long-running
operation the processor often stalls: our monitoring compares the binlog
position on the main database with the processor's binlog.position, and it
shows delayed consumption or a growing backlog. Under a large number of updates
and inserts, binlog.filename can fall behind by 10 or more files.
We tried increasing Events Per FlowFile and setting Include Begin/Commit Events
to true, which may relieve the backlog caused by heavy updates and inserts.
However, in another case data appeared to be sent intermittently, in batches,
as if it were held in memory or somewhere else; when we stopped the processor,
it sent all of the previously cached data. With Events Per FlowFile at its
default value and an insert rate of 3000+ rows per second, a backlog builds up
and catch-up is very slow. Sometimes the processor does not catch up before the
binlog is purged, which eventually leads to the binlog file not being found.
How can real-time binlog (CDC) output be achieved?
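For reference, the monitoring comparison described above can be sketched
roughly as follows. This is only an illustration, not the actual monitoring
code: it assumes the main database's coordinates are read separately (for
example from SHOW MASTER STATUS) and that the consumer's coordinates are the
binlog.filename and binlog.position values mentioned above. The names
BinlogCoord, file_index and binlog_lag, and the "mysql-bin.NNNNNN" file names,
are hypothetical.

```python
import re
from typing import NamedTuple, Optional, Tuple


class BinlogCoord(NamedTuple):
    """A binlog coordinate (hypothetical helper type for this sketch)."""
    filename: str  # e.g. "mysql-bin.000123" (assumed naming scheme)
    position: int  # byte offset within that file


def file_index(filename: str) -> int:
    """Return the numeric suffix of a binlog file name."""
    match = re.search(r"\.(\d+)$", filename)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {filename!r}")
    return int(match.group(1))


def binlog_lag(primary: BinlogCoord,
               consumer: BinlogCoord) -> Tuple[int, Optional[int]]:
    """Return (files_behind, bytes_behind).

    bytes_behind is only meaningful when both coordinates point at the
    same file, so it is None otherwise.
    """
    files_behind = file_index(primary.filename) - file_index(consumer.filename)
    if files_behind == 0:
        return 0, primary.position - consumer.position
    return files_behind, None


# Example: the consumer is 10 files behind the main database.
lag = binlog_lag(BinlogCoord("mysql-bin.000133", 4),
                 BinlogCoord("mysql-bin.000123", 1000))
```

A files_behind value of 10 or more corresponds to the backlog described above;
alerting on it before the binlog purge removes the file the consumer still
needs would be one possible use.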
Best regards!
> CaptureChangeMySQL consumes binlog backlog
> ------------------------------------------
>
> Key: NIFI-12812
> URL: https://issues.apache.org/jira/browse/NIFI-12812
> Project: Apache NiFi
> Issue Type: Bug
> Components: C2
> Affects Versions: 1.23.2
> Environment: java version "17.0.7" 2023-04-18 LTS
> Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode,
> sharing)
> Linux 3.10.0-1160.92.1.el7.x86_64
> Reporter: xiyang
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)