[ https://issues.apache.org/jira/browse/PIG-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853376#comment-13853376 ]
Rohini Palaniswamy commented on PIG-3602:
-----------------------------------------
I upgraded my Mac two days ago and the minicluster has not worked since, so I
only ran it directly against the cluster and did not run ant test-tez, which I
should have done. Sorry about that, and thanks for catching it.
The actual problem is in the thread below: if the request to the RM does not go
through, the thread sleeps before retrying. Because the default retry interval
is in minutes, to support RM HA and rolling upgrades, shutdown can hang for a
long time. I will fix it to time out quickly if the session cannot be stopped.
{code}
"Thread-511" prio=5 tid=7f8ddb12c000 nid=0x11666f000 waiting on condition [11666e000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:150)
    at com.sun.proxy.$Proxy79.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195)
    at org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:590)
    at org.apache.tez.client.TezSession.stop(TezSession.java:210)
    - locked <7883c9928> (a org.apache.tez.client.TezSession)
    at org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.shutdown(TezSessionManager.java:179)
    - locked <7883c3818> (a java.util.ArrayList)
    at org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager$1.run(TezSessionManager.java:51)
{code}
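A minimal sketch of the "time out quickly if not able to stop" idea, not the actual patch. The class name BoundedShutdown, the method stopWithTimeout, the thread name, and the timeout values are all hypothetical. The stop call runs on a daemon thread because the sleep in the trace above (sleepAtLeastIgnoreInterrupts) ignores interrupts, so cancelling the task alone would not necessarily unblock it; a daemon thread at least does not keep the JVM alive.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedShutdown {

    /**
     * Runs stopAction on a daemon thread and waits at most timeoutMillis.
     * Returns true if the action completed normally within the timeout,
     * false if it timed out, threw, or the caller was interrupted.
     */
    public static boolean stopWithTimeout(Runnable stopAction, long timeoutMillis) {
        // Daemon thread: a stop call stuck in a retry sleep cannot block JVM exit.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "session-stop"); // hypothetical thread name
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(stopAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Best effort only; code that swallows interrupts keeps running.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The stop action itself failed.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stop call that returns promptly.
        Runnable fast = () -> { };
        // Stand-in for a stop call stuck waiting on an unreachable RM.
        Runnable slow = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // Interrupted by cancel(true); just return.
            }
        };
        System.out.println(stopWithTimeout(fast, 1_000)); // true
        System.out.println(stopWithTimeout(slow, 200));   // false
    }
}
```

In a shutdown hook, the Runnable would wrap the real session stop call, and a false return would be logged rather than retried. Lowering the client-side retry settings for that one call is another option; this sketch only bounds the wait on the caller's side.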
> Tear down TezSessions when Pig exits
> ------------------------------------
>
> Key: PIG-3602
> URL: https://issues.apache.org/jira/browse/PIG-3602
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Affects Versions: tez-branch
> Reporter: Cheolsoo Park
> Assignee: Rohini Palaniswamy
> Fix For: tez-branch
>
> Attachments: PIG-3602-1.patch, unit_test.txt
>
>
> Currently, Pig reuses AMs via TezSession, but they are not shut down when Pig
> exits. I noticed two problems with this:
> # Tez jobs are not marked as finished until their TezSessions expire after a
> timeout. Since they occupy task slots, they block new job submissions.
> # ant clean test-tez leaves orphan processes (DAGAppMaster).
> Ideally, a TezSession should be kept alive while Pig runs but torn down when
> Pig exits.