RE: Re: why only need to configure "the client node hadoop classpath"?

2016-04-05 Thread Bikas Saha
Maria, 

 

Are you just trying to understand the code, or are you planning to customize 
this behavior? If the latter, please open a JIRA for it. It would be good to 
understand your scenario.

 

Thanks

Bikas 

 

From: Siddharth Seth [mailto:ss...@apache.org] 
Sent: Tuesday, April 5, 2016 10:39 AM
To: user@tez.apache.org
Subject: Re: Re: why only need to configure "the client node hadoop classpath"?

 

Yes. It depends upon the YARN distributed cache. The relevant jars 
(tez.lib.uris, tez.aux.uris) are localized on each node using YARN local 
resources (distributed cache). Configs are constructed on the client node and 
then written out to HDFS, and localized again via the distributed cache. If 
you're looking for the implementation - 
TezClientUtils.createApplicationSubmissionContext.

If additional jars need to be made available as part of a running DAG - that is 
done by 1) setting tez.aux.uris, or 2) programmatically using APIs on TezClient 
/ during dag construction.
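
For reference, option 1 above might look roughly like the following 
tez-site.xml fragment on the client node. This is only a sketch: the HDFS 
paths and the jar name are hypothetical placeholders, not values from this 
thread.

```xml
<!-- tez-site.xml, needed on the client node only.
     All paths and file names below are hypothetical examples. -->
<configuration>
  <!-- Tez framework libraries, localized onto each node by YARN -->
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/apps/tez/tez.tar.gz</value>
  </property>
  <!-- Extra jars made available to the running DAG (comma-separated URIs) -->
  <property>
    <name>tez.aux.uris</name>
    <value>${fs.defaultFS}/apps/tez/aux/my-udfs.jar</value>
  </property>
</configuration>
```

Because both values point at a shared filesystem, the other nodes need no 
local Tez install; YARN fetches the files when containers launch.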

 

On Tue, Apr 5, 2016 at 12:47 AM, Maria wrote:


Many thanks for your quick reply, Siddharth~ :)
OK, I got it. But another question arises: how does it propagate the 
libraries and configuration to other nodes? Does it depend on the Hadoop 
distributed cache? I still cannot find the code logic. :(


At 2016-04-05 10:27:55, "Siddharth Seth" wrote:


Tez will run tasks throughout the cluster. However, it takes care of 
propagating the libraries and configuration to other nodes - which is why only 
the client needs to be configured.


On Mon, Apr 4, 2016 at 6:51 PM, Maria wrote:



Hi, all. 

 Is it just because Tez is a client-side plug-in that we only need to 
configure the Tez libraries on the Hadoop client node's classpath, and do not 
need to send them to, or configure them on, any datanode?  Are there any 
other reasons?



Thanks for any reply.



Maria.



 



Re: Tez UI in Pig

2016-04-05 Thread Hitesh Shah
Hi Kurt, 

The Tez UI as documented should work with any version beyond 0.5.2 if the 
history logging is configured to use YARN timeline. As for scopes, some bits of 
the vertex description are currently not displayed in the UI though I am not 
sure if Pig has integrated with that API yet. Depending on the version of 
hadoop you are running and the scale at which you are running, there are some 
known issues with the YARN timeline impl from a scalability perspective but the 
Yahoo folks have implemented some fixes/config workarounds to get around those. 
@Jon Eagles, any chance of publishing a wiki for the configs that you recommend 
running with for YARN Timeline with the LevelDB impl? (And also the HDFS-based 
impl, though this is not really available in any hadoop release as of now.) 
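
As a rough illustration of "history logging configured to use YARN timeline", 
the two properties below are the usual starting point. This is a sketch of a 
minimal setup, not the tuned configs asked about above; the Tez UI page linked 
later in this thread lists the remaining settings.

```xml
<!-- In tez-site.xml: send Tez history events to the YARN timeline server -->
<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>

<!-- In yarn-site.xml: enable the timeline service itself -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
```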

If you are trying out the UI, it would be good if you also try out tez-ui2, as 
it has some enhancements coming down the pipe, such as a vertex swim lane that 
provides a better overall view of the vertices, how they progress, and how 
long they took. The UI2 version is fairly new, so feedback will be highly 
appreciated. 

@Rohini, has Pig started setting the vertex info? 
@Sreenath, do we have an open jira for the vertex description to be displayed 
in the UI?

thanks
— Hitesh

On Apr 5, 2016, at 11:04 AM, Kurt Muehlner  wrote:

> I have a question about the availability of the Tez web UI in Pig on Tez.  
> The Pig ‘Performance and Efficiency’ doc states, "Tez specific GUI is not 
> available yet, there is no GUI to track task progress. However, log message 
> is available in GUI.”  What does this mean, precisely?  We have not deployed 
> and configured the Tez UI described here:  
> https://tez.apache.org/tez-ui.html.  Will that UI work when running Tez on 
> Pig?  If so, what does ‘Tez specific GUI is not available yet’ mean?
> 
> What I am most specifically concerned about is the ability to see which Pig 
> aliases are being assigned to which Tez vertices, or failing that, which Pig 
> aliases are being processed by a particular Tez DAG.  This is currently not 
> available in logs in pig 0.15.0, although I’m aware it is in master.
> 
> What are best practices for Pig 0.15.0?
> 
> Thanks,
> Kurt