[ 
https://issues.apache.org/jira/browse/TEZ-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294361#comment-14294361
 ] 

Hitesh Shah commented on TEZ-1893:
----------------------------------

This seems like a critical enough issue that it should also be fixed in
previous releases (0.5.x). Comments? \cc [~bikassaha] [~sseth] [~zjffdu]

> Some vertex init fail are still not propagated to clients
> ---------------------------------------------------------
>
>                 Key: TEZ-1893
>                 URL: https://issues.apache.org/jira/browse/TEZ-1893
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>
> {code}
>           throw new TezUncheckedException(vertex.getLogIdentifier() +
>           " has -1 tasks but does not have input initializers, " +
>           "1-1 uninited sources or custom vertex manager to set it at runtime");
> {code}
> IMO, this kind of verification could be done on the client side (DAG.verify).
> The messages seen on the client side are shown below. The client could not
> get the real status of the DAG because the Tez AM was killed due to this
> vertex init error:
> {code}
> 19:25:33,716 - Thread( main) - (RMProxy.java:98) - Connecting to 
> ResourceManager at /0.0.0.0:8032
> 19:25:33,717 - Thread( main) - (AHSProxy.java:42) - Connecting to Application 
> History server at /0.0.0.0:10200
> 19:25:34,724 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:35,725 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:36,726 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 2 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:36,846 - Thread( main) - (DAGClientImpl.java:463) - DAG initialized: 
> CurrentState=Running
> 19:25:38,351 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:39,352 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:40,354 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 2 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:41,356 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 3 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:42,357 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 4 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:43,358 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 5 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:44,359 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 6 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:45,360 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 7 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:46,361 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 8 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:47,362 - Thread( main) - (Client.java:858) - Retrying connect to 
> server: localhost/127.0.0.1:6000. Already tried 9 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19:25:47,369 - Thread( main) - (DAGClientImpl.java:463) - DAG completed. 
> FinalState=FAILED
> 19:25:47,369 - Thread( main) - (TezWordCount.java:203) - status=FAILED, 
> progress=null, diagnostics=Session stats:submittedDAGs=0, successfulDAGs=0, 
> failedDAGs=0, killedDAGs=0
> , counters=null
> 19:25:47,372 - Thread( main) - (TezClient.java:470) - Shutting down Tez 
> Session, sessionName=commonName, applicationId=application_1420335690331_0007
> 19:25:47,374 - Thread( main) - (TezClientUtils.java:838) - Application not 
> running, applicationId=application_1420335690331_0007, 
> yarnApplicationState=FINISHED, finalApplicationStatus=FAILED, 
> trackingUrl=http://localhost:8088/proxy/application_1420335690331_0007/A, 
> diagnostics=Session stats:submittedDAGs=0, successfulDAGs=0, failedDAGs=0, 
> killedDAGs=0
> 19:25:47,375 - Thread( main) - (TezClient.java:484) - Failed to shutdown Tez 
> Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1420335690331_0007, yarnApplicationState=FINISHED, 
> finalApplicationStatus=FAILED, 
> trackingUrl=http://localhost:8088/proxy/application_1420335690331_0007/A, 
> diagnostics=Session stats:submittedDAGs=0, successfulDAGs=0, failedDAGs=0, 
> killedDAGs=0
>       at org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:839)
>       at org.apache.tez.client.TezClient.getSessionAMProxy(TezClient.java:669)
>       at org.apache.tez.client.TezClient.stop(TezClient.java:476)
>       at com.zjffdu.tez.tutorial.TezWordCount.main(TezWordCount.java:204)
> 19:25:47,377 - Thread( main) - (TezClient.java:489) - Could not connect to 
> AM, killing session via YARN, sessionName=commonName, 
> applicationId=application_1420335690331_0007
> 19:25:47,381 - Thread( main) - (YarnClientImpl.java:364) - Killed application 
> application_1420335690331_0007
> {code}
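
To illustrate the client-side check suggested in the description, here is a minimal sketch of the same condition evaluated before submission. VertexSpec, its flags, and ClientSideVertexCheck are hypothetical stand-ins, not Tez classes; a real check would presumably live in DAG.verify and read from the actual vertex objects. Whether all three conditions can be known on the client is also an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClientSideVertexCheck {

    /** Hypothetical, simplified description of a vertex; not a Tez class. */
    static final class VertexSpec {
        final String name;
        final int numTasks; // -1 means the task count is decided at runtime
        final boolean hasInputInitializers;
        final boolean hasUninitedOneToOneSource;
        final boolean hasCustomVertexManager;

        VertexSpec(String name, int numTasks, boolean hasInputInitializers,
                   boolean hasUninitedOneToOneSource, boolean hasCustomVertexManager) {
            this.name = name;
            this.numTasks = numTasks;
            this.hasInputInitializers = hasInputInitializers;
            this.hasUninitedOneToOneSource = hasUninitedOneToOneSource;
            this.hasCustomVertexManager = hasCustomVertexManager;
        }
    }

    /** Returns one message per vertex whose task count could never be set. */
    static List<String> verify(List<VertexSpec> vertices) {
        List<String> errors = new ArrayList<>();
        for (VertexSpec v : vertices) {
            boolean resolvableAtRuntime = v.hasInputInitializers
                    || v.hasUninitedOneToOneSource
                    || v.hasCustomVertexManager;
            if (v.numTasks == -1 && !resolvableAtRuntime) {
                errors.add(v.name + " has -1 tasks but does not have input initializers, "
                        + "1-1 uninited sources or custom vertex manager to set it at runtime");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Only "tokenizer" has -1 tasks with nothing able to set the count later.
        List<VertexSpec> dag = Arrays.asList(
                new VertexSpec("tokenizer", -1, false, false, false),
                new VertexSpec("summation", -1, true, false, false),
                new VertexSpec("sorter", 2, false, false, false));
        for (String error : verify(dag)) {
            System.err.println(error);
        }
    }
}
```

Failing this way on the client would surface the error before the AM is launched, so the client would not have to recover the real DAG status from an AM that has already been killed.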
