[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621014#comment-15621014 ]

Naganarasimha G R commented on YARN-3924:
-----------------------------------------

I agree with [~kasha]'s comment. Eventually the user/admin needs to check the RM state anyway, so I think not much can be done in this regard. Can we close this jira if no further work is planned?

> Submitting an application to standby ResourceManager should respond better than Connection Refused
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3924
>                 URL: https://issues.apache.org/jira/browse/YARN-3924
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Dustin Cote
>            Assignee: Ajith S
>            Priority: Minor
>
> When submitting an application directly to a standby resource manager, the
> resource manager responds with 'Connection Refused' rather than indicating
> that it is a standby resource manager. Because the resource manager is aware
> of its own state, I feel like we can have the 8032 port open for standby
> resource managers and reject the request with something like 'Cannot process
> application submission from this standby resource manager'.
> This would be especially helpful for debugging oozie problems when users put
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM
> address but rather point to a specific resource manager).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698158#comment-14698158 ]

Karthik Kambatla commented on YARN-3924:
----------------------------------------

bq. not somehow making clear about two cases of (RMs being down,client config problems etc) vs (RMs in standby).

Fair point. I would like to understand what the user/admin would do differently in the two cases. Seeing the proposed message, the admin would likely go through all the RMs specified in the config and check their HA state. If it is a config issue, the admin should realize it straight away. If the RM is down or in standby, the admin would likely do what is needed to get it to active.

I see the value in making this simpler for the admin, but the config issue is likely a one-time thing. Augmenting ClientRMService and other user-visible services to have an Active/Standby mode is rather involved, and I want to make sure the usability improvement is worth the effort and risk.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694792#comment-14694792 ]

Ajith S commented on YARN-3924:
-------------------------------

Hi [~kasha] and [~rohithsharma],

Thank you for the inputs. The message *None of the RMs specified by ha-ids appear to be active.* is fine, but, as [~cote] and I mentioned previously, it still does not distinguish between the two cases: (RMs being down, client config problems, etc.) vs. (RMs in standby). I agree that in the current design it is hard to tell them apart, because the rpc servers are started only if the RM is active.

So, if I may suggest, we could change this so that when an RM is in standby its rpc servers are still up and throw a standby exception to the client?
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694797#comment-14694797 ]

Ajith S commented on YARN-3924:
-------------------------------

Adding to the suggestion: the *AdminService* in the RM already follows this pattern. If the RM is in standby, it throws StandbyException. We can implement the other RM services in a similar way.
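The guard described in this comment can be sketched roughly as follows. This is only an illustration in Python, not Hadoop code: `StandbyException` here is a local stand-in class, and `ClientServiceSketch`, `_check_active`, and `submit_application` are hypothetical names. The rejection text is the one proposed in the issue description.

```python
class StandbyException(Exception):
    """Local stand-in for the error a standby RM would send back (assumption)."""


class ClientServiceSketch:
    """Hypothetical client-facing service that stays reachable in standby."""

    def __init__(self, rm_id, active=False):
        self.rm_id = rm_id
        self.active = active
        self.submitted = []

    def _check_active(self, operation):
        # Reject the call with an explicit reason instead of keeping the
        # port closed, which the client would only see as a refused connection.
        if not self.active:
            raise StandbyException(
                f"Cannot process {operation} from this standby "
                f"resource manager ({self.rm_id})"
            )

    def submit_application(self, app_name):
        # Every state-changing call runs the guard before doing any work.
        self._check_active("application submission")
        self.submitted.append(app_name)
        return f"accepted {app_name}"
```

With such a guard, a client pointed at a standby RM would receive the standby message, while a client pointed at a stopped RM or a wrong address would still see a refused connection, so the two cases become distinguishable.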
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694770#comment-14694770 ]

Rohith Sharma K S commented on YARN-3924:
-----------------------------------------

bq. None of the RMs specified by ha-ids appear to be active.

This error message seems more appropriate to me.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695639#comment-14695639 ]

Robert Kanter commented on YARN-3924:
-------------------------------------

I think the original mention of JobTracker here is because Oozie names the workflow field job-tracker, regardless of whether you put a JobTracker or a ResourceManager address there. So, yes, if you are using Hadoop 2, the JobTracker is not involved at all.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694748#comment-14694748 ]

Karthik Kambatla commented on YARN-3924:
----------------------------------------

It is hard to say whether the failure to connect to an Active RM is because all RMs are in Standby mode or because they are not started. Wouldn't it suffice to say "None of the RMs specified by ha-ids appear to be active."?
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694745#comment-14694745 ]

Rohith Sharma K S commented on YARN-3924:
-----------------------------------------

bq. A more informative error message might be enough here?

Yes. The user wants to differentiate RM states such as *StandbyRM* vs. *Not Started RM / attempt to connect to invalid RM ha-ids*, so a better error message would help more.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694738#comment-14694738 ]

Karthik Kambatla commented on YARN-3924:
----------------------------------------

cc: [~rkanter]

IIRR, Oozie doesn't rely on the jobTracker to connect to the RMs. It just uses the client config available to it. The Yarn Client takes care of routing the request to the appropriate RM. If the client is unable to route to the appropriate RM, then there is no Active RM in the list of RMs specified in the config. A more informative error message might be enough here?
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681526#comment-14681526 ]

Rohith Sharma K S commented on YARN-3924:
-----------------------------------------

From the user's side, one would expect different exceptions for *connecting to a Standby RM* and *connecting to an invalid/not started ResourceManager address*. But as per the RM HA design, both scenarios are treated the same. The reason is that a Standby RM does not open any rpc server for client communication. If the client tries to submit a job, it retries for a certain amount of time against each of the configured yarn.resourcemanager.ha.rm-ids and then throws a ConnectionRefused exception.

There are 2 possibilities in which the client might throw connection refused:
# Wrong/invalid *ha.rm-ids* configured at the client. This is a user mistake and can be rechecked by the user.
# Both RMs are in Standby for a long time. This is a problem on the YARN side, and we need to find the reason for this state. Ideally, if there is any issue with ZK, the RM will shut down after some time.

If you can share the logs for both RMs while in Standby, that would be helpful for analysis.
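The retry behaviour described in this comment, combined with the summary message proposed elsewhere in the thread, can be sketched as below. This is a simplified illustration in Python and not the YARN client's actual retry policy: `submit_with_failover`, `NoActiveRMError`, the `rounds` parameter, and the `submit` callback are all hypothetical names, and the real client's timing and backoff are not modelled.

```python
from typing import Callable, Dict


class NoActiveRMError(Exception):
    """Hypothetical error raised once every configured RM has refused."""


def submit_with_failover(
    rm_addresses: Dict[str, str],
    submit: Callable[[str], str],
    rounds: int = 2,
) -> str:
    """Try each configured RM in turn; summarize the failures if none accept."""
    failures = {}
    for _ in range(rounds):
        for rm_id, address in rm_addresses.items():
            try:
                return submit(address)
            except ConnectionRefusedError as exc:
                # A standby RM and a stopped RM look identical at this point.
                failures[rm_id] = f"{address}: {str(exc) or 'connection refused'}"
    detail = "; ".join(f"{rm_id} -> {msg}" for rm_id, msg in failures.items())
    raise NoActiveRMError(
        "None of the RMs specified by ha-ids appear to be active. " + detail
    )
```

Listing each rm-id with the address that was tried gives the user something concrete to check against the client configuration, even though the sketch still cannot tell a standby RM from one that is not running.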
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681677#comment-14681677 ]

Ajith S commented on YARN-3924:
-------------------------------

Hi [~rohithsharma],

+1, and thanks for the input. I agree with you regarding the RM HA design. However, I think what [~cotedm] is conveying is that, in any scenario where both RM nodes in HA are in Standby (for whatever reason), the client should get back a reasonable StandbyException instead of connection refused.

If I may suggest: can we change it so that the rpc server is started in standby too, and before it sends a response it checks whether the RM is active and otherwise throws StandbyException? Any thoughts?
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682099#comment-14682099 ]

Rohith Sharma K S commented on YARN-3924:
-----------------------------------------

I agree with the concern that the user should be able to obtain a standby exception. I am not sure whether this point was discussed when RM HA was initially designed.

Keeping cc: [~ka...@cloudera.com] [~jianhe] [~xgong] [~vinodkv] for more discussion on this.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626840#comment-14626840 ]

Xuan Gong commented on YARN-3924:
---------------------------------

The submit application request should be redirected to the active RM, shouldn't it?
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626874#comment-14626874 ]

Dustin Cote commented on YARN-3924:
-----------------------------------

It doesn't if you put in the standby resource manager for the 'jobTracker' stanza in oozie, or if you misconfigure yarn.resourcemanager.ha.rm-ids to include only the standby resource manager. The oozie scenario is the more realistic user scenario, but I reproduced this using the yarn.resourcemanager.ha.rm-ids method.

I assume closing the 8032 port for the standby RM is by design, but can we indicate that the RM is in standby instead of just saying connection refused?
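To illustrate why the client cannot do better with a closed port, here is a minimal probe sketched in Python using only the standard socket module. The `probe` function name and its return labels are made up for this example. It shows that a plain TCP connect to a port with no listener yields only "refused", which carries no information about whether the process behind it is a standby RM or not running at all.

```python
import socket


def probe(host: str, port: int, timeout: float = 1.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused' or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on the port. A standby RM that keeps its
        # client port closed and a stopped RM both end up here.
        return "refused"
    except OSError:
        # Timeouts, name-resolution failures, routing errors and the like.
        return "unreachable"
```

For example, `probe("rm-host.example.com", 8032)` (a placeholder host name) against a standby RM as described in this issue would be expected to report "refused", the same answer as for a host where no RM is running.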