[
https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380514#comment-17380514
]
Roman Khachatryan commented on FLINK-22889:
-------------------------------------------
Thanks for posting the failures.
With more logging I see that:
# all tasks started checkpoint 10 and called XA_PREPARE in its sync phase, where
they blocked
# which caused a checkpoint timeout (1s) and a job restart
# which caused the JM to cancel the tasks
# task cancellation timed out and the watchdogs went off after 1s
# which killed the TMs, and then the JM wasn't able to re-deploy the job
The tasks were blocked reading from a socket, so they had likely already
obtained the connection:
{code:java}
18:13:21,556 - Task 'Source: Custom Source -> Map -> Sink: Unnamed (4/4)#18' did not react to cancelling signal - notifying TM; it is stuck for 1 seconds in method:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
java.net.SocketInputStream.read(SocketInputStream.java:171)
java.net.SocketInputStream.read(SocketInputStream.java:141)
sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457)
sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1095)
sun.security.ssl.SSLSocketImpl.access$200(SSLSocketImpl.java:72)
sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:815)
java.io.FilterInputStream.read(FilterInputStream.java:133)
com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64)
com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63)
com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45)
com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52)
com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41)
com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54)
com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44)
com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:532)
com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:702)
com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:641)
com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:940)
com.mysql.cj.protocol.a.NativeProtocol.sendQueryString(NativeProtocol.java:886)
com.mysql.cj.NativeSession.execSQL(NativeSession.java:1073)
com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:724)
com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648)
com.mysql.cj.jdbc.MysqlXAConnection.dispatchCommand(MysqlXAConnection.java:323)
com.mysql.cj.jdbc.MysqlXAConnection.prepare(MysqlXAConnection.java:226)
org.apache.flink.connector.jdbc.xa.XaFacadeImpl.lambda$endAndPrepare$3(XaFacadeImpl.java:176)
org.apache.flink.connector.jdbc.xa.XaFacadeImpl$$Lambda$1040/95240942.call(Unknown Source)
org.apache.flink.connector.jdbc.xa.XaFacadeImpl.execute(XaFacadeImpl.java:273)
org.apache.flink.connector.jdbc.xa.XaFacadeImpl.endAndPrepare(XaFacadeImpl.java:176)
org.apache.flink.connector.jdbc.xa.XaFacadePoolingImpl.endAndPrepare(XaFacadePoolingImpl.java:97)
org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.prepareCurrentTx(JdbcXaSinkFunction.java:290)
org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.snapshotState(JdbcXaSinkFunction.java:244)
{code}
This suggests they were either blocked by previous transactions, deadlocked,
or hit a bug in the database such as [https://bugs.mysql.com/bug.php?id=86819].
I'll take a further look into it.
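For context on why the cancellation itself timed out: the trace ends in SocketInputStream.socketRead0, and a blocking read on a plain java.net.Socket does not react to Thread.interrupt() on a regular (platform) thread. The following is a minimal standalone sketch of that JDK behaviour only. It is not Flink or Connector/J code, and the class and method names are hypothetical. It assumes a loopback server that accepts a connection and never writes, standing in for a database that never answers.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of Flink or the MySQL driver.
class BlockedSocketReadDemo {

    /** Returns {reader alive after interrupt, reader alive after socket close}. */
    static boolean[] run() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            Thread reader = new Thread(() -> {
                try {
                    // Blocks here: the accepted peer never writes anything.
                    client.getInputStream().read();
                } catch (IOException e) {
                    // Reached once the socket is closed from another thread.
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(200);   // give the reader time to enter read()
            reader.interrupt();  // roughly what a cancel signal does to a task thread
            reader.join(500);
            boolean aliveAfterInterrupt = reader.isAlive();

            client.close();      // closing the socket is what unblocks the read
            reader.join(2000);
            boolean aliveAfterClose = reader.isAlive();

            return new boolean[] {aliveAfterInterrupt, aliveAfterClose};
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = run();
        System.out.println("alive after interrupt: " + result[0]);
        System.out.println("alive after close:     " + result[1]);
    }
}
```

In this sketch the interrupt alone leaves the reader thread blocked, and only closing the socket from another thread releases it. Whether the same mechanism fully explains the hang in the test is an assumption, but it matches the "did not react to cancelling signal" message above.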
> JdbcExactlyOnceSinkE2eTest hangs on azure
> -----------------------------------------
>
> Key: FLINK-22889
> URL: https://issues.apache.org/jira/browse/FLINK-22889
> Project: Flink
> Issue Type: Bug
> Components: Connectors / JDBC
> Affects Versions: 1.14.0, 1.13.1
> Reporter: Dawid Wysakowicz
> Assignee: Roman Khachatryan
> Priority: Critical
> Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=16658
--
This message was sent by Atlassian Jira
(v8.3.4#803005)