Another possibility is that the command 'SHOW MASTER STATUS' and the ROTATE event just return the correct position, but the binlog file itself contain some issue. I'd suggest checking MySQL master server side logs for any disk related issue. In that case, some data loss is inevitable.
I tried different setups such as changing sync-binlog or rotating logs forcefully via 'flush logs' statement, but haven't been able to reproduce the error. On Mon, Nov 19, 2018 at 3:17 PM Anand Dev <[email protected]> wrote: > > > Where/How did you get the initial binlog filename and position... > I get it by querying "SHOW MASTER STATUS" > > I'm doubting that the ROTATE event is giving some invalid binlog position > because in the code I see for ROTATE case it is setting the binlog file and > position given in the ROTATE message. > > and just after ROTATE we get FORMAT_DESC but we don't do anything for that > event and right after that we get the ERROR saying could not find next log > > What are your views on this? > > On Mon, Nov 19, 2018 at 11:28 AM Koji Kawamura <[email protected]> wrote: >> >> > I face this scenario in every 2-3 days. And surprisingly, the binlog file >> > never changed still received ROTATE events, do you know any reason why >> > MySQL sends ROTATE and FOR_DES events even when there is no change in >> > binlog file? >> >> MySQL rotates binlog by expiration setting, mysql-log-rotate or when >> logs are flushed such as taking backup ... etc >> https://dev.mysql.com/doc/refman/8.0/en/log-file-maintenance.html >> >> Where/How did you get the initial binlog filename and position you set >> at the CaptureChangeMySQL "Initial Binlog Filename" and position? >> >> >> On Sat, Nov 17, 2018 at 12:50 AM Anand Dev <[email protected]> wrote: >> > >> > Yes, it fails with same message. >> > >> > And I noticed that it happened in the sequence, ROTATE event followed by >> > FORMAT_DESCRIPTION event followed by above ERROR >> > When I reset with same binlog position, it again follows same order and >> > fails. I then have to reset with latest binlog pos but that costs data >> > loss. >> > >> > I have also seen instances when it has successfully processed these events >> > but not sure why it fails sometimes and everytime it fails after >> > FORMAT_DESCRIPTION. >> > >> > Error is same: >> > INFO [Process Cluster Protocol Request-3] >> > o.a.n.c.p.impl.SocketProtocolListener Finished processing request >> > b731e751-c486-4820-bffd-2eb543240457 (type=HEARTBEAT, length=2781 bytes) >> > from xxxxxxxxxx:5001 in 1 millis >> > INFO [blc-172.16.2.223:5101] c.g.shyiko.mysql.binlog.BinaryLogClient >> > Connected to xxxxxxxx:5324 at (sid:65535, cid:239876) >> > INFO [Timer-Driven Process Thread-2] >> > o.a.n.c.m.processors.CaptureChangeMySQL >> > CaptureChangeMySQL[id=879f5c10-0166-1000-0000-00005772bb34] Got message >> > event type: ROTATE >> > INFO [Timer-Driven Process Thread-2] >> > o.a.n.c.m.processors.CaptureChangeMySQL >> > CaptureChangeMySQL[id=879f5c10-0166-1000-0000-00005772bb34] Got message >> > event type: FORMAT_DESCRIPTION >> > ERROR [Timer-Driven Process Thread-7] >> > o.a.n.c.m.processors.CaptureChangeMySQL >> > CaptureChangeMySQL[id=879f5c10-0166-1000-0000-00005772bb34] Binlog >> > connector communications failure: could not find next log; the first event >> > 'bin_5101.000013' at 2343174, the last event read from >> > '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 2343260, the last >> > byte read from '/u01/mysql/var/log/binlogs/bin_5101.000013' at 2343260.: >> > com.github.shyiko.mysql.binlog.network.ServerException: could not find >> > next log; the first event 'bin_5101.000013' at 2343174, the last event >> > read from '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 2343260, >> > the last byte read from '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' >> > at 2343260. >> > com.github.shyiko.mysql.binlog.network.ServerException: could not find >> > next log; the first event 'bin_5101.000013' at 2343174, the last event >> > read from '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 2343260, >> > the last byte read from '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' >> > at 2343260. >> > at >> > com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:882) >> > at >> > com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:559) >> > at >> > com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:793) >> > at java.lang.Thread.run(Thread.java:745) >> > >> > >> > I face this scenario in every 2-3 days. And surprisingly, the binlog file >> > never changed still received ROTATE events, do you know any reason why >> > MySQL sends ROTATE and FOR_DES events even when there is no change in >> > binlog file? >> > >> > P.S. using NiFi 1.4 >> > >> > On Thu, Nov 15, 2018 at 11:02 AM Koji Kawamura <[email protected]> >> > wrote: >> >> >> >> > Sometimes it fails (I ensure that the state is cleared), and then I >> >> > have to restart it with latest binlog position. >> >> >> >> Does it fail with the same error message? >> >> com.github.shyiko.mysql.binlog.network.ServerException: could not find >> >> next log; the first event 'bin_8101.000001' at 142923964, the last >> >> event read from '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at >> >> 142924061, the last byte read from >> >> '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061. >> >> >> >> Excuse me if I'm not understanding it correctly, but you mean when you >> >> clear the processor state and manually put "Initial Binlog Filename" >> >> and "Initial Binlog Position" the processor fails to start with the >> >> above error message? If so, where did you get the initial binlog >> >> filename and position? >> >> >> >> On Wed, Nov 14, 2018 at 8:50 PM Anand Dev <[email protected]> wrote: >> >> > >> >> > When I restart the Processor from the binlog pos where it stopped, not >> >> > everytime it starts successfully. >> >> > >> >> > Sometimes it fails (I ensure that the state is cleared), and then I >> >> > have to restart it with latest binlog position. >> >> > >> >> > Is there any thing apart from state to be taken care while restarting >> >> > the processor from any binlog position? >> >> > >> >> > >> >> > On Tue, Nov 13, 2018 at 7:27 AM Koji Kawamura <[email protected]> >> >> > wrote: >> >> >> >> >> >> Hi Anand, >> >> >> >> >> >> I'm not sure what caused the error, but I believe the error is MySQL >> >> >> error 1236. >> >> >> The error can happen with different reasons, you may find this article >> >> >> informative. >> >> >> https://www.percona.com/blog/2014/10/08/mysql-replication-got-fatal-error-1236-causes-and-cures/ >> >> >> >> >> >> Also, there is a JIRA that may be related to the issue. >> >> >> "Make Change Data Capture (CDC) processor for MySQL refer to GTID" >> >> >> https://issues.apache.org/jira/browse/NIFI-5739 >> >> >> >> >> >> Did your MySQL server cluster undergo master change recently? >> >> >> >> >> >> If the error keeps happening and you can't get events from the binlog >> >> >> any more, I'd stop the CaptureChangeMySQL processor and clear its >> >> >> state, then restart capturing to recover from that situation. If your >> >> >> use-case allows that.. >> >> >> This will do the same thing with the one described in the preceding >> >> >> percona article. >> >> >> > The solution would be to move the slave thread to the next available >> >> >> > binary log and initialize slave thread with the first available >> >> >> > position on binary log as below: >> >> >> >> >> >> Thanks, >> >> >> Koji >> >> >> On Mon, Nov 12, 2018 at 10:28 PM Anand Chandrashekhar Tirthgirikar >> >> >> <[email protected]> wrote: >> >> >> > >> >> >> > Thanks Koji for the resolution of previous issue. >> >> >> > >> >> >> > >> >> >> > >> >> >> > But I see below error again and again once in a day. There is >> >> >> > nothing else in the logs apart from below error. >> >> >> > >> >> >> > >> >> >> > >> >> >> > Can you please tell what could be the problem here? Is it from >> >> >> > server side or from processor or any network issue? We are not able >> >> >> > to reproduce it. >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > 2018-11-12 11:46:31,952 ERROR [Timer-Driven Process Thread-9] >> >> >> > o.a.n.c.m.processors.CaptureChangeMySQL >> >> >> > CaptureChangeMySQL[id=df913697-f589-121f-b681-da192917092a] Binlog >> >> >> > connector communications failure: could not find next log; the first >> >> >> > event 'bin_8101.000001' at 142923964, the last event read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061, the >> >> >> > last byte read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061.: >> >> >> > com.github.shyiko.mysql.binlog.network.ServerException: could not >> >> >> > find next log; the first event 'bin_8101.000001' at 142923964, the >> >> >> > last event read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061, the >> >> >> > last byte read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061. >> >> >> > >> >> >> > com.github.shyiko.mysql.binlog.network.ServerException: could not >> >> >> > find next log; the first event 'bin_8101.000001' at 142923964, the >> >> >> > last event read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061, the >> >> >> > last byte read from >> >> >> > '/u01/mysql/8101/var/log/binlogs/bin_8101.000001' at 142924061. >> >> >> > >> >> >> > at >> >> >> > com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:882) >> >> >> > >> >> >> > at >> >> >> > com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:559) >> >> >> > >> >> >> > at >> >> >> > com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:793) >> >> >> > >> >> >> > at java.lang.Thread.run(Thread.java:745) >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > From: Koji Kawamura <[email protected]> >> >> >> > Sent: 23 October 2018 12:02 >> >> >> > To: [email protected] >> >> >> > Subject: Re: CaptureChangeMySQL - throwing Binlog connector >> >> >> > communications failure >> >> >> > >> >> >> > >> >> >> > >> >> >> > Hello, >> >> >> > >> >> >> > Thanks for reporting the issue and the detailed analysis. >> >> >> > I was able to reproduce by setting short wait_timeout as 30 sec. >> >> >> > >> >> >> > I'm not aware of any work-around to make keep alive >> >> >> > CaptureChangeMySQL >> >> >> > at the moment. >> >> >> > But you can write a script to stop/start CaptureChangeMySQL processor >> >> >> > using NiFi API to refresh underlying JDBC connection. >> >> >> > NipyAPI can be useful to write such scripts. >> >> >> > https://github.com/Chaffelson/nipyapi >> >> >> > >> >> >> > I submit a JIRA for this issue. >> >> >> > https://issues.apache.org/jira/browse/NIFI-5739 >> >> >> > >> >> >> > Thanks, >> >> >> > Koji >> >> >> > >> >> >> > >> >> >> > On Wed, Oct 10, 2018 at 9:42 PM Anand Dev <[email protected]> >> >> >> > wrote: >> >> >> > > >> >> >> > > In logs, below error is reported >> >> >> > > >> >> >> > > 2018-10-10 00:16:19,766 ERROR [Timer-Driven Process Thread-1] >> >> >> > > o.a.n.c.m.processors.CaptureChangeMySQL >> >> >> > > CaptureChangeMySQL[id=3c3f6e89-8401-3a67-a49d-3af14abf13dc] Error >> >> >> > > in getting DDL Table info create table .... : >> >> >> > > com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: >> >> >> > > No operations allowed after connection closed. >> >> >> > > >> >> >> > > Caused by: >> >> >> > > com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last >> >> >> > > packet successfully received from the server was 58,006,344 >> >> >> > > milliseconds ago. The last packet sent successfully to the server >> >> >> > > was 58,006,345 milliseconds ago. is longer than the server >> >> >> > > configured value of 'wait_timeout'. You should consider either >> >> >> > > expiring and/or testing connection validity before use in your >> >> >> > > application, increasing the server configured values for client >> >> >> > > timeouts, or using the Connector/J connection property >> >> >> > > 'autoReconnect=true' to avoid this problem. >> >> >> > > >> >> >> > > It happened because the connection got stale / dead after 8 hours. >> >> >> > > >> >> >> > > How to keep alive CaptureChangeMySQL processor connection with DB ? >> >> >> > > One way would be increasing 'wait_timeout' to high value but the >> >> >> > > error can still occur post that time. >> >> >> > > How to keep alive connection forever unless processor is stopped? >> >> >> > > >> >> >> > > On Wed, Oct 10, 2018 at 4:41 PM Anand Dev <[email protected]> >> >> >> > > wrote: >> >> >> > >> >> >> >> > >> When I start the CaptureChangeMySQL processor, it worked fine >> >> >> > >> till the point when DDL event occured for Creating a table. >> >> >> > >> >> >> >> > >> Processor captured the CREATE table event successfully. >> >> >> > >> Next, when record was inserted in the newly created table, it >> >> >> > >> threw below error and stopped capturing further events. >> >> >> > >> >> >> >> > >> ERROR >> >> >> > >> =============== >> >> >> > >> Binlog connector communications failure: could not find next log; >> >> >> > >> the first event 'bin_5101.000013' at 921485, the last event read >> >> >> > >> from '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 921565, >> >> >> > >> the last byte read from >> >> >> > >> '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 921565 >> >> >> > >> >> >> >> > >> com.github.shyiko.mysql.binlog.network.ServerException: could not >> >> >> > >> find next log; the first event 'bin_5101.000013' at 921485, the >> >> >> > >> last event read from >> >> >> > >> '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 921565, the >> >> >> > >> last byte read from >> >> >> > >> '/u01/mysql/5101/var/log/binlogs/bin_5101.000013' at 921565 >> >> >> > >> ================ >> >> >> > >> >> >> >> > >> Can someone suggest what's going wrong here?
