[jira] [Commented] (FLINK-20743) Print ContainerId For RemoteTransportException

2021-04-16 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17322891#comment-17322891
 ] 

Flink Jira Bot commented on FLINK-20743:


This issue is assigned but has not received an update in 7 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Print ContainerId For RemoteTransportException
> --
>
> Key: FLINK-20743
> URL: https://issues.apache.org/jira/browse/FLINK-20743
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.10.0, 1.11.1, 1.12.1
> Reporter: yang gang
> Assignee: yang gang
> Priority: Major
> Labels: stale-assigned
> Attachments: image-2020-12-23-15-13-21-226.png
>
>
> !image-2020-12-23-15-13-21-226.png|width=970,height=291!
> RemoteTransportException tells the user which service has a problem by
> reporting its IP/port.
> When we troubleshoot, this information is not precise enough. Usually we
> need to look at the log of the container that had the problem, but by the
> time we see this exception the container has already died, so the IP/port
> can no longer be used to quickly locate the specific container.
> So I hope that the containerId is printed when such an exception occurs.
> E.g.:
> Connection unexpectedly closed by remote task manager
> 'hostName/ip:port/containerId'. This might indicate that the remote task
> manager was lost.
>   
>  
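The message format proposed in the issue description can be sketched roughly
as follows. This is only an illustration, not Flink's actual code: the class
name RemoteCloseMessage, the method format, and the sample host and container
ID are all hypothetical, and the sketch assumes the container ID is already
available at the point where the message is built.

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not part of Flink: builds the message text proposed in the issue.
public class RemoteCloseMessage {

    // containerId is an assumed extra argument; when it is missing, the
    // message falls back to the plain host:port form.
    public static String format(InetSocketAddress remote, String containerId) {
        String where = remote.getHostString() + ":" + remote.getPort();
        if (containerId != null && !containerId.isEmpty()) {
            where = where + "/" + containerId;
        }
        return "Connection unexpectedly closed by remote task manager '" + where
                + "'. This might indicate that the remote task manager was lost.";
    }

    public static void main(String[] args) {
        // Made-up address and container ID, for illustration only.
        InetSocketAddress remote = InetSocketAddress.createUnresolved("worker-7", 41234);
        System.out.println(format(remote, "container_0001_01_000002"));
    }
}
```

Falling back to host:port when no container ID is known keeps the message
usable in deployments that do not run in containers.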



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20743) Print ContainerId For RemoteTransportException

2020-12-29 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255906#comment-17255906
 ] 

Zhijiang commented on FLINK-20743:
--

Already assigned.

> Print ContainerId For RemoteTransportException
> --
>
> Key: FLINK-20743
> URL: https://issues.apache.org/jira/browse/FLINK-20743
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.10.0, 1.11.1, 1.12.1
> Reporter: yang gang
> Assignee: yang gang
> Priority: Major
> Attachments: image-2020-12-23-15-13-21-226.png
>
>


[jira] [Commented] (FLINK-20743) Print ContainerId For RemoteTransportException

2020-12-25 Thread yang gang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254943#comment-17254943
 ] 

yang gang commented on FLINK-20743:
---

[~zjwang] Hi,
Thank you very much for your reply. Could you assign this issue to me? I would 
like to try to solve it. :)

> Print ContainerId For RemoteTransportException
> --
>
> Key: FLINK-20743
> URL: https://issues.apache.org/jira/browse/FLINK-20743
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.10.0, 1.11.1, 1.12.1
> Reporter: yang gang
> Priority: Major
> Attachments: image-2020-12-23-15-13-21-226.png
>
>


[jira] [Commented] (FLINK-20743) Print ContainerId For RemoteTransportException

2020-12-25 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254795#comment-17254795
 ] 

Zhijiang commented on FLINK-20743:
--

Thanks for creating this issue [~清月]. 
I remember running into the same concern while debugging some failover issues 
before. I guess we can rely on the port info to trace the other required 
information, such as the container ID, from the job manager log. But indeed 
that is not very convenient or efficient. So in general I support your 
proposal for improving the debugging process, but I am a bit worried that it 
might not be easy to pass the container ID into the network stack, which might 
touch many components.

Anyway, you can try it your way, and I can assign this ticket to you if you 
would like to contribute it. :) 

> Print ContainerId For RemoteTransportException
> --
>
> Key: FLINK-20743
> URL: https://issues.apache.org/jira/browse/FLINK-20743
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.10.0, 1.11.1, 1.12.1
> Reporter: yang gang
> Priority: Major
> Attachments: image-2020-12-23-15-13-21-226.png
>
>


[jira] [Commented] (FLINK-20743) Print ContainerId For RemoteTransportException

2020-12-22 Thread yang gang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17253926#comment-17253926
 ] 

yang gang commented on FLINK-20743:
---

[~zjwang] Hi.
Please see if this proposal is reasonable, thank you.

> Print ContainerId For RemoteTransportException
> --
>
> Key: FLINK-20743
> URL: https://issues.apache.org/jira/browse/FLINK-20743
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.10.0, 1.11.1, 1.12.1
> Reporter: yang gang
> Priority: Major
> Attachments: image-2020-12-23-15-13-21-226.png
>
>