[ 
https://issues.apache.org/jira/browse/NIFI-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiyang updated NIFI-12812:
--------------------------
    Description: 
 

I used CaptureChangeMySQL 1.23.3 to consume binlogs. During long-running 
operation we often see the processor stall: our monitoring, which compares the 
binlog position of the main database with the processor's binlog.position, 
shows delayed consumption or a backlog. Under a large volume of updates and 
inserts, binlog.filename can fall 10 or more files behind. 
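
The comparison described above can be sketched roughly as follows. This is 
only an illustration: it assumes binlog file names end in a numeric suffix 
(for example mysql-bin.000123), and the helper names binlog_index and 
binlog_lag are hypothetical. How the two (filename, position) pairs are 
collected (for instance from SHOW MASTER STATUS on the database and from the 
processor's binlog.filename / binlog.position values) is left out.

```python
import re
from typing import Optional, Tuple

BinlogCoord = Tuple[str, int]  # (binlog file name, byte position in that file)


def binlog_index(filename: str) -> int:
    """Return the numeric suffix of a binlog file name, e.g. 123 for 'mysql-bin.000123'.

    Assumes the file name ends in '.<digits>'; raises ValueError otherwise.
    """
    match = re.search(r"\.(\d+)$", filename)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {filename!r}")
    return int(match.group(1))


def binlog_lag(source: BinlogCoord, consumer: BinlogCoord) -> Tuple[int, Optional[int]]:
    """Compare the database's binlog coordinates with the consumer's.

    Returns (files_behind, bytes_behind). bytes_behind is only meaningful when
    both sides are in the same file, so it is None otherwise.
    """
    files_behind = binlog_index(source[0]) - binlog_index(consumer[0])
    bytes_behind = source[1] - consumer[1] if files_behind == 0 else None
    return files_behind, bytes_behind


if __name__ == "__main__":
    # Hypothetical values: the database is 10 files ahead of the consumer.
    print(binlog_lag(("mysql-bin.000130", 4096), ("mysql-bin.000120", 1024)))
```

A files_behind value that keeps growing between samples would correspond to 
the backlog described here.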

We tried increasing Events Per FlowFile and setting Include Begin/Commit 
Events to true, which may alleviate the backlog caused by a large number of 
updates and inserts. However, in another case data seemed to be sent 
intermittently, in batches, as if it were held in memory or somewhere else; 
when we stopped the processor, it sent all of the previously cached data. 
With Events Per FlowFile at its default value and an insertion rate of 3000+ 
rows per second, a backlog builds up and catch-up is very slow. Sometimes the 
processor does not catch up before the binlog is purged, which eventually 
leads to the binlog file not being found. How can real-time binlog (CDC) 
output be achieved, or this problem solved?

 

Best regards!

 

  was:
I used CaptureChangeMySQL 1.23.3 to consume binlogs. During long-running 
operation we often see the processor stall: our monitoring, which compares the 
binlog position of the main database with the processor's binlog.position, 
shows delayed consumption or a backlog. Under a large volume of updates and 
inserts, binlog.filename can fall 10 or more files behind. 


We tried increasing Events Per FlowFile and setting Include Begin/Commit 
Events to true, which may alleviate the backlog caused by a large number of 
updates and inserts. However, in another case data seemed to be sent 
intermittently, in batches, as if it were held in memory or somewhere else; 
when we stopped the processor, it sent all of the previously cached data. 
With Events Per FlowFile at its default value and an insertion rate of 3000+ 
rows per second, a backlog builds up and catch-up is very slow. Sometimes the 
processor does not catch up before the binlog is purged, which eventually 
leads to the binlog file not being found. How can real-time binlog (CDC) 
output be achieved, or this problem solved?

 

Best regards!

 


> CaptureChangeMySQL consumes binlog backlog
> ------------------------------------------
>
>                 Key: NIFI-12812
>                 URL: https://issues.apache.org/jira/browse/NIFI-12812
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: C2
>    Affects Versions: 1.23.2
>         Environment: java version "17.0.7" 2023-04-18 LTS
> Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode, 
> sharing)
> Linux 3.10.0-1160.92.1.el7.x86_64
>            Reporter: xiyang
>            Priority: Major
>
>  
> I used CaptureChangeMySQL 1.23.3 to consume binlogs. During long-running 
> operation we often see the processor stall: our monitoring, which compares 
> the binlog position of the main database with the processor's 
> binlog.position, shows delayed consumption or a backlog. Under a large 
> volume of updates and inserts, binlog.filename can fall 10 or more files 
> behind. 
> We tried increasing Events Per FlowFile and setting Include Begin/Commit 
> Events to true, which may alleviate the backlog caused by a large number of 
> updates and inserts. However, in another case data seemed to be sent 
> intermittently, in batches, as if it were held in memory or somewhere else; 
> when we stopped the processor, it sent all of the previously cached data. 
> With Events Per FlowFile at its default value and an insertion rate of 3000+ 
> rows per second, a backlog builds up and catch-up is very slow. Sometimes 
> the processor does not catch up before the binlog is purged, which 
> eventually leads to the binlog file not being found. How can real-time 
> binlog (CDC) output be achieved, or this problem solved?
>  
> Best regards!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
