[jira] [Created] (TEZ-3405) Support ability for AM to kill itself if there is no client heartbeating to it

2016-08-09 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created TEZ-3405:
---

    Summary: Support ability for AM to kill itself if there is no client heartbeating to it
    Key: TEZ-3405
    URL: https://issues.apache.org/jira/browse/TEZ-3405
    Project: Apache Tez
    Issue Type: Bug
    Reporter: Gunther Hagleitner
    Priority: Critical


HiveServer2 optionally maintains a pool of AMs in either Tez or LLAP mode. This 
is done to amortize the cost of launching a Tez session.

We also try, in a shutdown hook, to kill all of these AMs when HS2 goes down. 
However, there are cases where HS2 doesn't get the chance to kill them before 
it goes away. As a result, these zombie AMs hang around until the timeout 
kicks in.

The trouble with the timeout is that we have to set it fairly high. Otherwise, 
in a lightly loaded cluster, the pre-launched AMs would time out between 
queries and the benefit of having them goes away.

So, if people kill/restart HS2, they often run into situations where the 
cluster/queue has no capacity left for new AMs. They have to either manually 
kill the zombies or wait.

The request is therefore for Tez to maintain a heartbeat between the client and 
the AM. If the client stops heartbeating (i.e. it has gone away), the AM should 
exit. That way we can keep the AMs alive for a long time regardless of activity, 
and at the same time we don't have to worry about them if HS2 goes down.
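
To make the request concrete, here is a minimal sketch of the kind of watchdog 
the AM could run. It is only an illustration under stated assumptions, not a 
proposed patch: the class ClientHeartbeatMonitor, its method names, and the 
timeout value are all hypothetical and are not part of the existing Tez API. 
The clock is injected so the logic can be exercised without sleeping.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/**
 * Hypothetical watchdog, NOT an existing Tez class. It tracks the time of the
 * last client heartbeat and runs a callback once the client has been silent
 * for longer than the configured timeout.
 */
final class ClientHeartbeatMonitor {
    private final long timeoutMillis;
    private final LongSupplier clock;     // e.g. System::currentTimeMillis
    private final Runnable onClientLost;  // e.g. code that shuts the AM down
    private final AtomicLong lastHeartbeatMillis;
    private final AtomicBoolean fired = new AtomicBoolean(false);

    ClientHeartbeatMonitor(long timeoutMillis, LongSupplier clock, Runnable onClientLost) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.onClientLost = onClientLost;
        // Treat creation time as the first heartbeat so a fresh AM is not killed at once.
        this.lastHeartbeatMillis = new AtomicLong(clock.getAsLong());
    }

    /** To be called whenever a heartbeat from the client arrives. */
    void recordHeartbeat() {
        lastHeartbeatMillis.set(clock.getAsLong());
    }

    /**
     * To be called periodically. Runs the callback at most once, the first
     * time the silence exceeds the timeout. Returns true once it has fired.
     */
    boolean checkExpired() {
        long silentFor = clock.getAsLong() - lastHeartbeatMillis.get();
        if (silentFor > timeoutMillis && fired.compareAndSet(false, true)) {
            onClientLost.run();
        }
        return fired.get();
    }
}
```

In an AM, checkExpired() could be driven by a periodic task (for example a 
ScheduledExecutorService), with recordHeartbeat() called from whatever RPC the 
client uses to ping the AM. The heartbeat timeout can then be much shorter than 
the idle session timeout, which is the point of the request.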



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (TEZ-3390) Package Shuffle Handler as a shaded uber-jar

2016-08-09 Thread Jonathan Eagles (JIRA)

 [ https://issues.apache.org/jira/browse/TEZ-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles resolved TEZ-3390.
--
   Resolution: Fixed
Fix Version/s: TEZ-3334

Thanks, [~kshukla]. Committed this to the TEZ-3334 feature branch.

> Package Shuffle Handler as a shaded uber-jar
> 
>
> Key: TEZ-3390
> URL: https://issues.apache.org/jira/browse/TEZ-3390
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Fix For: TEZ-3334
>
> Attachments: TEZ-3390.1.patch, TEZ-3390.2.patch
>
>
> This JIRA aims to isolate the shuffle handler dependencies from the YARN 
> dependencies of the NodeManager by packaging the shaded dependencies in an 
> uber-jar.





[jira] [Resolved] (TEZ-3381) Allow tez to enable mapreduce or tez shuffle handler

2016-08-09 Thread Jonathan Eagles (JIRA)

 [ https://issues.apache.org/jira/browse/TEZ-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles resolved TEZ-3381.
--
Resolution: Duplicate

> Allow tez to enable mapreduce or tez shuffle handler
> 
>
> Key: TEZ-3381
> URL: https://issues.apache.org/jira/browse/TEZ-3381
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Eagles
>
> Currently, in the TEZ-3334 branch, the shuffle handler is hard-coded to use 
> the 'tez_shuffle' service. This JIRA will allow jobs to specify the shuffle 
> service as a configuration parameter.





[jira] [Resolved] (TEZ-3336) Hive map-side join job sometimes fails with ROOT_INPUT_INIT_FAILURE

2016-08-09 Thread Jason Lowe (JIRA)

 [ https://issues.apache.org/jira/browse/TEZ-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved TEZ-3336.
-
Resolution: Invalid

Closing this as invalid since it seems like a problem with Hive's use of Tez 
rather than Tez itself.  [~mithun] please reopen with details if you find 
otherwise.

> Hive map-side join job sometimes fails with ROOT_INPUT_INIT_FAILURE
> ---
>
> Key: TEZ-3336
> URL: https://issues.apache.org/jira/browse/TEZ-3336
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
>
> When Hive does a map-side join, it can generate a DAG where a vertex has two 
> inputs: one from an upstream task and another using MRInputAMSplitGenerator. 
> If MRInputAMSplitGenerator takes a while to compute the splits and one of the 
> tasks for the other upstream vertex completes in the meantime, the job can 
> fail with an error, since MRInputAMSplitGenerator does not expect to receive 
> any events.


