[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-09-03 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190133#comment-17190133
 ] 

Roman Khachatryan commented on FLINK-19012:
---

Hi [~mapohl],

thanks for reporting this.

I think it's FLINK-19093, which was recently merged into master. Can you rebase 
and check again?

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.12.0
>
>
> Note: This error occurred in a custom branch with unreviewed changes. I don't 
> believe my changes affect this error, but I would keep this in mind when 
> investigating the error: 
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=0d2e35fc-a330-5cf2-a012-7267e2667b1d
>  
> {code}
> 2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Registering 
> task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING].
> 2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask  [] - No state 
> backend has been configured, using default (Memory / JobManager) 
> MemoryStateBackend (data in heap memory / checkpoints to JobManager) 
> (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 
> 5242880)
> 2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING.
> 2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO  
> org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
>  [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ...
> 2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO  
> org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
>  [] - Elasticsearch RestHighLevelClient is connected to 
> [http://127.0.0.1:9200]
> 2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO  
> org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl
>  [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 
> drained requests
> 2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
> 2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Freeing 
> task resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0).
> 2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - 
> Un-registering task and sending final execution state FINISHED to JobManager 
> for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> cbc357ccb763df2852fee8c4fc7d55f2_0_0.
> 2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO  
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of 
> checkpoint 1 could not be completed.
> 2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: 
> java.io.IOException: Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument.
> 2020-08-20T20:55:30.2418956Z  at 
> org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468)
>  ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420100Z  at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91)
>  [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420927Z  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_265]
> 2020-08-20T20:55:30.2421455Z  at 
> 

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-28 Thread Yun Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186330#comment-17186330
 ] 

Yun Gao commented on FLINK-19012:
-

Another instance: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5947=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529]

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.12.0
>
>

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-27 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186036#comment-17186036
 ] 

Roman Khachatryan commented on FLINK-19012:
---

I think the problem is that AsyncCheckpointRunnable throws an exception when it 
sees that SubtaskCheckpointCoordinator is already closed.

Upon close, SubtaskCheckpointCoordinator closes its runnables, which also changes 
their statuses, but it doesn't stop the threads that run them.

So AsyncCheckpointRunnable should check its own status first and only throw an 
exception if it has not been closed in the meantime.
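
To make the idea above concrete, here is a minimal sketch of the pattern. It 
is not Flink code: Registry, AsyncWork, IoAction and State are hypothetical 
stand-ins for the coordinator and its runnable, and the sketch assumes that 
closing a runnable flips its state from RUNNING to DISCARDED. The runnable only 
records a failure if it is still RUNNING when the exception arrives.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only: none of these types are Flink classes. */
final class CloseAwareSketch {

    /** Hypothetical registry that refuses new entries once it is closed. */
    static final class Registry implements Closeable {
        private final List<Closeable> entries = new ArrayList<>();
        private boolean closed;

        synchronized void register(Closeable c) throws IOException {
            if (closed) {
                c.close(); // close the argument, then reject the registration
                throw new IOException("Cannot register Closeable, registry is already closed.");
            }
            entries.add(c);
        }

        @Override
        public void close() throws IOException {
            List<Closeable> toClose;
            synchronized (this) {
                closed = true;
                toClose = new ArrayList<>(entries);
                entries.clear();
            }
            // Closing the entries changes their state; their threads keep running.
            for (Closeable c : toClose) {
                c.close();
            }
        }
    }

    enum State { RUNNING, COMPLETED, DISCARDED }

    /** Hypothetical unit of work that may fail with an IOException. */
    interface IoAction {
        void run() throws IOException;
    }

    static final class AsyncWork implements Runnable, Closeable {
        private final Registry registry;
        private final IoAction action;
        private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        private volatile IOException failure;

        AsyncWork(Registry registry, IoAction action) {
            this.registry = registry;
            this.action = action;
        }

        @Override
        public void run() {
            try {
                registry.register(this);
                action.run();
                state.compareAndSet(State.RUNNING, State.COMPLETED);
            } catch (IOException e) {
                // Record the failure only if nobody closed this runnable in the meantime.
                if (state.get() == State.RUNNING) {
                    failure = e;
                }
            }
        }

        @Override
        public void close() {
            state.compareAndSet(State.RUNNING, State.DISCARDED);
        }

        State state() {
            return state.get();
        }

        IOException failure() {
            return failure;
        }
    }

    public static void main(String[] args) throws IOException {
        Registry registry = new Registry();
        registry.close();
        AsyncWork late = new AsyncWork(registry, () -> {});
        late.run();
        System.out.println("state=" + late.state() + ", failure=" + late.failure());
    }
}
```

In this sketch a runnable that starts after the registry was closed ends up 
DISCARDED without recording a failure, while an IOException raised by the work 
itself is still recorded, because the state is RUNNING at that point.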

 

The tests started to fail after increasing the log level in FLINK-18962.

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
>

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183931#comment-17183931
 ] 

Robert Metzger commented on FLINK-19012:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5845=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
>

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-24 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183673#comment-17183673
 ] 

Dian Fu commented on FLINK-19012:
-

Upgraded to "Critical": this issue does not seem to be a one-off, as it has 
occurred several times in recent days.

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
>

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-24 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183670#comment-17183670
 ] 

Dian Fu commented on FLINK-19012:
-

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5834=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529]

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Priority: Major
>              Labels: test-stability
>

[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-21 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182230#comment-17182230
 ] 

Dian Fu commented on FLINK-19012:
-

Another instance on master: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5780=logs=739e6eac-8312-5d31-d437-294c4d26fced=a68b8d89-50e9-5977-4500-f4fde4f57f9b

> E2E test fails with "Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Priority: Major
>              Labels: test-stability
>
> Note: This error occurred in a custom branch with unreviewed changes. I don't 
> believe my changes affect this error, but I would keep this in mind when 
> investigating the error: 
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=0d2e35fc-a330-5cf2-a012-7267e2667b1d
>  
> {code}
> 2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Registering 
> task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING].
> 2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask  [] - No state 
> backend has been configured, using default (Memory / JobManager) 
> MemoryStateBackend (data in heap memory / checkpoints to JobManager) 
> (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 
> 5242880)
> 2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING.
> 2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO  
> org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
>  [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ...
> 2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO  
> org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
>  [] - Elasticsearch RestHighLevelClient is connected to 
> [http://127.0.0.1:9200]
> 2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO  
> org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl
>  [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 
> drained requests
> 2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
> 2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO  
> org.apache.flink.runtime.taskmanager.Task[] - Freeing 
> task resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> (cbc357ccb763df2852fee8c4fc7d55f2_0_0).
> 2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - 
> Un-registering task and sending final execution state FINISHED to JobManager 
> for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
> cbc357ccb763df2852fee8c4fc7d55f2_0_0.
> 2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO  
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: 
> Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of 
> checkpoint 1 could not be completed.
> 2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: 
> java.io.IOException: Cannot register Closeable, this 
> subtaskCheckpointCoordinator is already closed. Closing argument.
> 2020-08-20T20:55:30.2418956Z  at 
> org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468)
>  ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420100Z  at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91)
>  [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420927Z  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_265]
> 2020-08-20T20:55:30.2421455Z  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_265]
>