> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > 1. I think webhcat-default.xml should be modified to include the jars that 
> > are now required in templeton.libjars to minimize out-of-the-box config for 
> > end users.
> > 2. Is there any test (e2e) that can be added for this? (with reasonable 
> > amount of effort)
> > 3. When you tested that Pig/Hive jobs get properly tagged, you mean you 
> > tested that MR jobs that are generated by Pig/Hive are tagged, correct?
> 
> Eugene Koifman wrote:
>     4. Actually, instead of doing 1, could WebHCat dynamically figure out 
> which hadoop version it's talking to and add only the necessary shim jar, 
> rather than shipping all of them?  It reduces the amount of config needed.  
> It would also be better if we can only ship the minimal set of jars.
>

1. I like your proposal from #4. I actually started down this route but ran 
into some issues when I tried to add libjars programmatically. Let me try 
harder and I'll reply back. 
2. I will have to check what e2e tests we currently have.
3. Correct, I validated that MR jobs generated by Pig/Hive are tagged properly. 
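
For illustration only, here is a rough sketch of what proposal #4 (picking a single shim jar based on the Hadoop version) might look like. The class, the method names and the jar file names are hypothetical and are not part of the patch. The mapping from Hadoop major version to the 0.20S and 0.23 shim directories is an assumption based on the shim directories listed in the diff. In a real setup the version string would presumably come from Hadoop's VersionInfo.getVersion().

```java
// Hypothetical sketch: choose one shim jar from the Hadoop version string
// and append it to a comma-separated libjars value.
class ShimJarSelectorSketch {

    // Assumed mapping: Hadoop 1.x -> "0.20S" shims, Hadoop 2.x -> "0.23" shims.
    static String shimNameFor(String hadoopVersion) {
        String[] parts = hadoopVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + hadoopVersion);
        }
        int major = Integer.parseInt(parts[0]);
        switch (major) {
            case 1:
                return "0.20S";
            case 2:
                return "0.23";
            default:
                throw new IllegalArgumentException("Unsupported version: " + hadoopVersion);
        }
    }

    // Appends a jar to an existing comma-separated list (may be null or empty).
    static String appendLibJar(String libJars, String jarPath) {
        if (libJars == null || libJars.isEmpty()) {
            return jarPath;
        }
        return libJars + "," + jarPath;
    }

    public static void main(String[] args) {
        // "2.4.0" is only an example value for the version string.
        String shim = shimNameFor("2.4.0");
        // The jar file name pattern below is made up for the example.
        String libJars = appendLibJar("existing.jar", "hive-shims-" + shim + ".jar");
        System.out.println(libJars);
    }
}
```

The point of the sketch is only that the launcher would need one jar instead of all of them; how the jar is located and shipped is left open.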


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java,
> >  line 44
> > <https://reviews.apache.org/r/22329/diff/1/?file=604984#file604984line44>
> >
> >     I think it would be useful to add a more detailed description of these 
> > props.  Something like what is in the JIRA ticket.  I would have added the 
> > ticket number to the comment, but Hive prohibits that.

Will fix this, thanks.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java,
> >  line 126
> > <https://reviews.apache.org/r/22329/diff/1/?file=604985#file604985line126>
> >
> >     Which user will this use?  Is it the user running WebHCat or the value 
> > of 'doAs' parameter?

This runs in the context of the task itself. In non-secure Hadoop that is the 
same context as the nodemanager/tasktracker. In secure Hadoop I believe it is 
the context of the user who submitted the job.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 157
> > <https://reviews.apache.org/r/22329/diff/1/?file=604987#file604987line157>
> >
> >     Is LOG.info() the right log level?  Seems like it will pollute the log 
> > file.

I think this is totally fine; it's just a single entry in the task syslog. This 
is super useful info (IMO a must-have) for users to understand what the 
templeton launcher job does.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 160
> > <https://reviews.apache.org/r/22329/diff/1/?file=604987#file604987line160>
> >
> >     Is LOG.info() the right level?

I think this is ok.


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java, 
> > line 189
> > <https://reviews.apache.org/r/22329/diff/1/?file=604987#file604987line189>
> >
> >     log level

Same as above, I think this is ok. 


- Ivan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22329/#review44992
-----------------------------------------------------------


On June 6, 2014, 10:02 p.m., Ivan Mitic wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22329/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 10:02 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The approach in the patch is similar to what Oozie does to handle this 
> situation. Specifically, all child map jobs get tagged with the launcher MR 
> job id. On launcher task restart, the launcher queries RM for the list of jobs 
> that have the tag and kills them. After that it moves on to start the same 
> child job again. Again, similarly to what Oozie does, a new 
> templeton.job.launch.time property is introduced that captures the launcher 
> job submit timestamp and is later used to narrow the search window when RM is 
> queried. 
> 
> To validate the patch, you will need to add the webhcat shim jars to 
> templeton.libjars, because the webhcat launcher now also depends on the 
> hadoop shims. 
> 
> I have noticed that in the case of the SqoopDelegator, webhcat currently does 
> not set the MR delegation token when the optionsFile flag is used. This also 
> creates a problem in this scenario. This looks like something that should be 
> handled via a separate Jira.
> 
> 
> Diffs
> -----
> 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java 23b1c4f 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarDelegator.java 41b1dc5 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LauncherDelegator.java 04a5c6f 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigDelegator.java 04e061d 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SqoopDelegator.java adcd917 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java a6355a6 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java 556ee62 
>   shims/0.20S/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java d3552c1 
>   shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java 5a728b2 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 299e918 
> 
> Diff: https://reviews.apache.org/r/22329/diff/
> 
> 
> Testing
> -------
> 
> I have validated that MR, Pig and Hive jobs do get tagged appropriately. I 
> have also validated that previous child jobs do get killed on RM 
> failover/task failure.
> 
> 
> Thanks,
> 
> Ivan Mitic
> 
>
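
The tag-and-kill approach from the quoted description can be pictured with a small in-memory sketch. This is not the code from the patch: the Job class and both methods are hypothetical stand-ins for what the RM reports, and a plain list takes the place of the real RM query. It only shows the filtering logic, assuming each child job carries the launcher job id as a tag and that the launch time is used as the lower bound of the search window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of "kill child jobs tagged with the launcher id".
class ChildJobCleanupSketch {

    // Stand-in for a job record as the RM might report it.
    static final class Job {
        final String id;
        final long startTimeMillis;
        final Set<String> tags;
        boolean running = true;

        Job(String id, long startTimeMillis, String... tags) {
            this.id = id;
            this.startTimeMillis = startTimeMillis;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    // Returns running jobs that carry the launcher tag and started at or
    // after the launcher submit time (the reduced search window).
    static List<Job> findChildJobs(List<Job> allJobs, String launcherJobId,
                                   long launchTimeMillis) {
        List<Job> matches = new ArrayList<>();
        for (Job job : allJobs) {
            if (job.running
                    && job.startTimeMillis >= launchTimeMillis
                    && job.tags.contains(launcherJobId)) {
                matches.add(job);
            }
        }
        return matches;
    }

    // Marks every matching child job as killed and returns the killed ids.
    static List<String> killChildJobs(List<Job> allJobs, String launcherJobId,
                                      long launchTimeMillis) {
        List<String> killed = new ArrayList<>();
        for (Job job : findChildJobs(allJobs, launcherJobId, launchTimeMillis)) {
            job.running = false;
            killed.add(job.id);
        }
        return killed;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("child_a", 1500L, "launcher_1"));
        jobs.add(new Job("child_b", 1600L, "launcher_2"));
        // On launcher task restart: clean up earlier children, then resubmit.
        System.out.println(killChildJobs(jobs, "launcher_1", 1000L));
    }
}
```

After the cleanup step the launcher would go on to start the same child job again, as the description says.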
