[ https://issues.apache.org/jira/browse/FLINK-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024790#comment-16024790 ]

ASF GitHub Bot commented on FLINK-6708:
---------------------------------------

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3982#discussion_r118497398
  
    --- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
    @@ -413,14 +413,18 @@ public static void runInteractiveCli(YarnClusterClient yarnCluster, boolean read
                        while (true) {
                                // ------------------ check if there are updates by the cluster -----------
     
    -                           GetClusterStatusResponse status = yarnCluster.getClusterStatus();
    -                           LOG.debug("Received status message: {}", status);
    +                           try {
    +                                   GetClusterStatusResponse status = yarnCluster.getClusterStatus();
    +                                   LOG.debug("Received status message: {}", status);
     
    -                           if (status != null && numTaskmanagers != status.numRegisteredTaskManagers()) {
    -                                   System.err.println("Number of connected TaskManagers changed to " +
    +                                   if (status != null && numTaskmanagers != status.numRegisteredTaskManagers()) {
    +                                           System.err.println("Number of connected TaskManagers changed to " +
                                                        status.numRegisteredTaskManagers() + ". " +
    -                                           "Slots available: " + status.totalNumberOfSlots());
    -                                   numTaskmanagers = status.numRegisteredTaskManagers();
    +                                                   "Slots available: " + status.totalNumberOfSlots());
    +                                           numTaskmanagers = status.numRegisteredTaskManagers();
    +                                   }
    +                           } catch (Exception e) {
    +                                   LOG.warn("Could not retrieve the current cluster status. Retrying...", e);
    --- End diff --
    
    "Skipping" might be a better term here, because we aren't actually retrying 
to get the cluster status, just ignoring it for this loop attempt.
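
    To make the distinction concrete, here is a minimal standalone sketch. It is not the Flink code: `StatusPollSketch`, `Status`, and `poll` are hypothetical names, the loop is bounded for illustration, and the log wording is only one possible phrasing. A failed poll is recorded and the loop simply moves on to its next iteration; nothing calls the status source a second time within the same pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class StatusPollSketch {

    /** Hypothetical stand-in for a cluster status; not Flink's GetClusterStatusResponse. */
    static final class Status {
        final int taskManagers;
        final int slots;

        Status(int taskManagers, int slots) {
            this.taskManagers = taskManagers;
            this.slots = slots;
        }
    }

    /**
     * Polls the source a fixed number of times and returns the messages that
     * would have been printed or logged. A failing poll is skipped, not retried.
     */
    static List<String> poll(Supplier<Status> source, int iterations) {
        List<String> messages = new ArrayList<>();
        int numTaskmanagers = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                Status status = source.get();
                if (status != null && numTaskmanagers != status.taskManagers) {
                    messages.add("Number of connected TaskManagers changed to "
                            + status.taskManagers + ". Slots available: " + status.slots);
                    numTaskmanagers = status.taskManagers;
                }
            } catch (Exception e) {
                // No second call to source.get() here: this attempt is dropped
                // and the next loop iteration performs the next regular poll.
                messages.add("Could not retrieve the current cluster status. Skipping this attempt.");
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Assumed failure pattern for the demo: only the second poll throws.
        Supplier<Status> flaky = () -> {
            if (calls[0]++ == 1) {
                throw new IllegalStateException("cluster unreachable");
            }
            return new Status(2, 8);
        };
        poll(flaky, 3).forEach(System.out::println);
    }
}
```

    With the failure pattern assumed in `main`, the second poll is dropped and the third runs as a normal iteration, which is why "skipping" describes the behaviour more accurately than "retrying".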


> Don't let the FlinkYarnSessionCli fail if it cannot retrieve the ClusterStatus
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-6708
>                 URL: https://issues.apache.org/jira/browse/FLINK-6708
>             Project: Flink
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
>
> The {{FlinkYarnSessionCli}} should not fail if it cannot retrieve the 
> {{GetClusterStatusResponse}}. This would harden Flink's Yarn session.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
