[ https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380514#comment-17380514 ]

Roman Khachatryan commented on FLINK-22889:
-------------------------------------------

Thanks for posting the failures.

With more logging I see that:
 # all tasks started checkpoint 10 and issued XA_PREPARE in its sync phase, where they blocked
 # which caused a checkpoint timeout (1s) and a job restart
 # which caused the JM to cancel the tasks
 # task cancellation timed out and the watchdogs fired after 1s
 # which killed the TMs, after which the JM wasn't able to re-deploy the job
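The escalation in the last three steps can be modelled with a small, self-contained sketch. It is a simplified illustration using only the JDK, not Flink's actual cancellation code; the class and method names (CancellationWatchdogSketch, cancelNeedsEscalation, startStuckTask, startCooperativeTask) are hypothetical. A task that ignores the interrupt is still alive when the timeout expires, which is the point at which a watchdog would escalate.

```java
import java.util.concurrent.CountDownLatch;

public class CancellationWatchdogSketch {

    /**
     * Sends the "cancelling signal" (an interrupt), waits up to timeoutMillis (> 0),
     * and reports whether the task is still running, i.e. whether a watchdog
     * would have to escalate.
     */
    public static boolean cancelNeedsEscalation(Thread task, long timeoutMillis)
            throws InterruptedException {
        task.interrupt();
        task.join(timeoutMillis);
        return task.isAlive();
    }

    /** A task that swallows interrupts until released; stands in for a thread stuck in a blocking call. */
    public static Thread startStuckTask(CountDownLatch release) {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    release.await();
                    return;
                } catch (InterruptedException ignored) {
                    // swallow the interrupt and keep waiting
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** A task that exits as soon as it is interrupted. */
    public static Thread startCooperativeTask() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: fall through and exit
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread stuck = startStuckTask(release);
        System.out.println("stuck task needs escalation: " + cancelNeedsEscalation(stuck, 1000));
        release.countDown();

        Thread cooperative = startCooperativeTask();
        System.out.println("cooperative task needs escalation: " + cancelNeedsEscalation(cooperative, 1000));
    }
}
```

The 1000 ms passed in main mirrors the 1s timeouts mentioned above; any positive value works for the sketch.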

The tasks were blocked reading from a socket, so they had likely already 
obtained the connection:
{code:java}
18:13:21,556  - Task 'Source: Custom Source -> Map -> Sink: Unnamed (4/4)#18' did not react to cancelling signal - notifying TM; it is stuck for 1 seconds in method:
    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    java.net.SocketInputStream.read(SocketInputStream.java:171)
    java.net.SocketInputStream.read(SocketInputStream.java:141)
    sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457)
    sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
    sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1095)
    sun.security.ssl.SSLSocketImpl.access$200(SSLSocketImpl.java:72)
    sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:815)
    java.io.FilterInputStream.read(FilterInputStream.java:133)
    com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64)
    com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63)
    com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45)
    com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52)
    com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41)
    com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54)
    com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44)
    com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:532)
    com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:702)
    com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:641)
    com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:940)
    com.mysql.cj.protocol.a.NativeProtocol.sendQueryString(NativeProtocol.java:886)
    com.mysql.cj.NativeSession.execSQL(NativeSession.java:1073)
    com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:724)
    com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648)
    com.mysql.cj.jdbc.MysqlXAConnection.dispatchCommand(MysqlXAConnection.java:323)
    com.mysql.cj.jdbc.MysqlXAConnection.prepare(MysqlXAConnection.java:226)
    org.apache.flink.connector.jdbc.xa.XaFacadeImpl.lambda$endAndPrepare$3(XaFacadeImpl.java:176)
    org.apache.flink.connector.jdbc.xa.XaFacadeImpl$$Lambda$1040/95240942.call(Unknown Source)
    org.apache.flink.connector.jdbc.xa.XaFacadeImpl.execute(XaFacadeImpl.java:273)
    org.apache.flink.connector.jdbc.xa.XaFacadeImpl.endAndPrepare(XaFacadeImpl.java:176)
    org.apache.flink.connector.jdbc.xa.XaFacadePoolingImpl.endAndPrepare(XaFacadePoolingImpl.java:97)
    org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.prepareCurrentTx(JdbcXaSinkFunction.java:290)
    org.apache.flink.connector.jdbc.xa.JdbcXaSinkFunction.snapshotState(JdbcXaSinkFunction.java:244)
{code}
This suggests they were either blocked by previous transactions, deadlocked, 
or hit a bug in the database such as [https://bugs.mysql.com/bug.php?id=86819].
I'll take a further look into it.
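As a side illustration of why a task stuck in SocketInputStream.socketRead0 does not react to the cancelling signal, here is a minimal JDK-only sketch. It assumes a plain java.net.Socket read on an ordinary (platform) thread, where Thread.interrupt() does not wake the reader but closing the socket from another thread does. The class name BlockedSocketReadSketch and the local server that never answers are hypothetical stand-ins; the sketch does not involve MySQL or the Flink JDBC connector.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockedSocketReadSketch {

    /** Outcome of the experiment. */
    public static final class Result {
        public final boolean blockedAfterInterrupt;
        public final boolean finishedAfterClose;

        Result(boolean blockedAfterInterrupt, boolean finishedAfterClose) {
            this.blockedAfterInterrupt = blockedAfterInterrupt;
            this.finishedAfterClose = finishedAfterClose;
        }
    }

    public static Result run() throws IOException, InterruptedException {
        // The accepted peer never writes anything, like a server that never answers a command.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {

            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks: no byte ever arrives
                } catch (IOException expected) {
                    // thrown once the socket is closed from another thread
                }
            });
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(200);   // give the reader time to enter read()
            reader.interrupt();  // the "cancelling signal"
            reader.join(500);
            boolean blockedAfterInterrupt = reader.isAlive();

            client.close();      // closing the socket is what releases the reader
            reader.join(2000);
            return new Result(blockedAfterInterrupt, !reader.isAlive());
        }
    }

    public static void main(String[] args) throws Exception {
        Result r = run();
        System.out.println("still blocked after interrupt: " + r.blockedAfterInterrupt);
        System.out.println("finished after socket close:   " + r.finishedAfterClose);
    }
}
```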

> JdbcExactlyOnceSinkE2eTest hangs on azure
> -----------------------------------------
>
>                 Key: FLINK-22889
>                 URL: https://issues.apache.org/jira/browse/FLINK-22889
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / JDBC
>    Affects Versions: 1.14.0, 1.13.1
>            Reporter: Dawid Wysakowicz
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690&view=logs&j=ba53eb01-1462-56a3-8e98-0dd97fbcaab5&t=bfbc6239-57a0-5db0-63f3-41551b4f7d51&l=16658



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
