[
https://issues.apache.org/jira/browse/TEZ-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15684252#comment-15684252
]
TezQA commented on TEZ-3347:
----------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12839842/TEZ-3347.002.patch
against master revision 501a351.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2115//console
This message is automatically generated.
> Vertex UI throws an error while getting vertexProgress for a killed Vertex
> --------------------------------------------------------------------------
>
> Key: TEZ-3347
> URL: https://issues.apache.org/jira/browse/TEZ-3347
> Project: Apache Tez
> Issue Type: Bug
> Components: UI
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: ErrorCodeFailedVertex.png, TEZ-3347.001.patch,
> TEZ-3347.002.patch
>
>
> When an AM fails all of its attempts, the application fails, and the very
> first click on the killed/failed vertex throws the following error:
> {code}
> error code: Unknown, message: expected expression, got '<'
> {code}
> It self-corrects if the click is retried immediately after the failure.
> This is because the RM proxy redirects the call to the AHS server, and the
> REST call is malformed for that server. Inspecting the responses showed a
> URL that looked something like this:
> {code}
> http://<hostname>:<ahsport>/applicationhistory/app/application_123_456/ws/v1/tez/vertexProgress?dagID=1&vertexID=01&_=123
> {code}
> which is not a valid REST call on the AHS.
> I think the following code can cause this issue:
> {code}
> // Load progress in parallel for v1 version of the api
> _loadProgress: function (vertices) {
>   var that = this,
>       // Keep only the trailing index of each RUNNING vertex id
>       runningVerticesIdx = vertices
>         .filterBy('status', 'RUNNING')
>         .map(function(item) {
>           return item.get('id').split('_').splice(-1).pop();
>         });
>   if (runningVerticesIdx.length > 0) {
>     this.store.unloadAll('vertexProgress');
>     this.store.findQuery('vertexProgress', {
>       metadata: {
>         appId: that.get('applicationId'),
>         dagIdx: that.get('idx'),
>         vertexIds: runningVerticesIdx.join(',')
>       }
>     }).then(function(vertexProgressInfo) {
>       App.Helpers.emData.mergeRecords(
>         that.get('rowsDisplayed'),
>         vertexProgressInfo,
>         ['progress']
>       );
>     }).catch(function(error) {
>       error.message = "Failed to fetch vertexProgress. Application Master " +
>         "(AM) is out of reach. Either it's down, or CORS is not enabled for YARN " +
>         "ResourceManager.";
>       Em.Logger.error(error);
>       var err = App.Helpers.misc.formatError(error);
>       var msg = 'Error code: %@, message: %@'.fmt(err.errCode, err.msg);
>       App.Helpers.ErrorBar.getInstance().show(msg, err.details);
>     });
>   }
> },
> {code}
> This uses AMInfo, whose response depends on what the loadApp method finds:
> {code}
> loadApp: function (store, appId, useCache) {
>   if(!useCache) {
>     App.Helpers.misc.removeRecord(store, 'appDetail', appId);
>     App.Helpers.misc.removeRecord(store, 'clusterApp', appId);
>   }
>   // Try the RM first, then fall back to the history service
>   return store.find('clusterApp', appId).catch(function () {
>     return store.find('appDetail', appId);
>   }).catch(function (error) {
>     error.message = ("Couldn't get details of application %@. RM is not " +
>       "reachable, and history service is not enabled.").fmt(appId);
>     throw error;
>   });
> }
> {code}
> In the catch block here, we could either check whether the response type is
> not JSON, or not try to get vertexProgress at all, since at that point it is
> known that the application/AM has failed.
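
The two options in the last paragraph of the description could look roughly like the sketch below. It is only an illustration under stated assumptions: `looksLikeJson` and `shouldLoadProgress` are hypothetical helper names that do not exist in the Tez UI, and the status strings are assumed values rather than ones taken from the code quoted above.

```javascript
// Hypothetical helpers, not part of the Tez UI code base.

// Returns true when a response looks like JSON rather than an HTML page.
// `contentType` is the Content-Type header value, `body` the raw response text.
function looksLikeJson(contentType, body) {
  var type = (contentType || '').toLowerCase();
  if (type.indexOf('application/json') !== -1) {
    return true;
  }
  if (type.indexOf('text/html') !== -1) {
    return false;
  }
  // No usable header: fall back to the first non-whitespace character.
  var first = (body || '').replace(/^\s+/, '').charAt(0);
  return first === '{' || first === '[';
}

// Decides whether vertexProgress should be requested at all.
// `appStatus` is an assumed status string such as 'RUNNING', 'FAILED' or 'KILLED'.
function shouldLoadProgress(appStatus, runningVertexCount) {
  return appStatus === 'RUNNING' && runningVertexCount > 0;
}

// A redirect to an HTML page is rejected; a JSON body is accepted.
console.log(looksLikeJson('text/html; charset=utf-8', '<html>...</html>')); // false
console.log(looksLikeJson('application/json', '{}'));                       // true
// A failed application never triggers the progress request.
console.log(shouldLoadProgress('FAILED', 2));                               // false
```

Either check alone would avoid surfacing the "expected expression, got '<'" error: the first rejects the HTML page returned after the redirect, the second never issues the request once the application is known to have failed.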