RE: Re: why only need to configure "the client node hadoop classpath"?
Maria,

Are you just trying to understand the code, or are you planning to make some customization to this? If the latter, then please open a jira for that. It would be good to understand your scenario.

Thanks,
Bikas

From: Siddharth Seth [mailto:ss...@apache.org]
Sent: Tuesday, April 5, 2016 10:39 AM
To: user@tez.apache.org
Subject: Re: Re: why only need to configure "the client node hadoop classpath"?

Yes. It depends upon the YARN distributed cache. The relevant jars (tez.lib.uris, tez.aux.uris) are localized on each node using YARN local resources (distributed cache). Configs are constructed on the client node and then written out to HDFS, and localized again via the distributed cache. If you're looking for the implementation, see TezClientUtils.createApplicationSubmissionContext.

If additional jars need to be made available as part of a running DAG, that is done by 1) setting tez.aux.uris, or 2) programmatically using APIs on TezClient / during DAG construction.

On Tue, Apr 5, 2016 at 12:47 AM, Maria wrote:

Many thanks for your quick reply, Siddharth~ :)
OK, I got it. But another question arises: how does it propagate the libraries and configuration to other nodes? Does it depend on the Hadoop distributed cache? I still cannot find the code logic. :(

At 2016-04-05 10:27:55, "Siddharth Seth" wrote:

Tez will run tasks throughout the cluster. However, it takes care of propagating the libraries and configuration to other nodes, which is why only the client needs to be configured.

On Mon, Apr 4, 2016 at 6:51 PM, Maria wrote:

Hi, all.
Is it just because Tez is a client-side plug-in that we only need to configure the Tez libraries in the Hadoop client node's Hadoop classpath, and do not need to send and configure them on any datanode? Are there any other reasons?
Thanks for any reply.
Maria.
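To make the client-only configuration discussed in this thread more concrete, here is a minimal sketch in plain Java. It only models the two properties named above (tez.lib.uris and tez.aux.uris) as entries in a map; it does not call any Tez API. The class name, the helper method, and the HDFS paths are hypothetical, and joining multiple URIs with commas is an assumption to check against the Tez configuration documentation. In a real deployment these values would normally live in tez-site.xml on the client node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models the client-side Tez properties as a plain map.
// It does not use Tez or Hadoop classes and is not how Tez itself is implemented.
public class TezClientConfigSketch {
    // Property names as mentioned in the thread.
    static final String TEZ_LIB_URIS = "tez.lib.uris";
    static final String TEZ_AUX_URIS = "tez.aux.uris";

    // Builds the properties a client node would carry. The jars themselves
    // stay on HDFS and are localized on worker nodes by YARN at run time.
    static Map<String, String> buildClientConfig(String tezLibUri, String... auxUris) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(TEZ_LIB_URIS, tezLibUri);
        if (auxUris.length > 0) {
            // Assumption: multiple URIs are given as a comma-separated list.
            conf.put(TEZ_AUX_URIS, String.join(",", auxUris));
        }
        return conf;
    }

    public static void main(String[] args) {
        // Placeholder HDFS paths, for illustration only.
        Map<String, String> conf = buildClientConfig(
                "hdfs:///apps/tez/tez.tar.gz",
                "hdfs:///apps/extra/my-udfs.jar");
        conf.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Nothing in this sketch touches the worker nodes, which matches the point made in the thread: the client holds the configuration, and the distributed cache takes care of the rest.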
Re: Tez UI in Pig
Hi Kurt,

The Tez UI as documented should work with any version beyond 0.5.2 if the history logging is configured to use YARN timeline. As for scopes, some bits of the vertex description are currently not displayed in the UI, though I am not sure if Pig has integrated with that API yet.

Depending on the version of Hadoop you are running and the scale at which you are running, there are some known scalability issues with the YARN timeline impl, but the Yahoo folks have implemented some fixes/config workarounds to get around those.

@Jon Eagles, any chance of publishing a wiki for the configs that you recommend running with for YARN Timeline with the LevelDB impl (and also the HDFS-based impl, though this is not really available in any Hadoop release as of now)?

If you are trying out the UI, it would be good if you also try out tez-ui2, as it has some enhancements coming down the pipe, such as a vertex swim lane which provides a better overall view of the vertices, how they progress, and the time they took. The UI2 version is fairly new, so feedback will be highly appreciated.

@Rohini, has Pig started setting the vertex info?
@Sreenath, do we have an open jira for the vertex description to be displayed in the UI?

thanks
-- Hitesh

On Apr 5, 2016, at 11:04 AM, Kurt Muehlner wrote:

> I have a question about the availability of the Tez web UI in Pig on Tez.
> The Pig 'Performance and Efficiency' doc states, "Tez specific GUI is not
> available yet, there is no GUI to track task progress. However, log message
> is available in GUI." What does this mean, precisely? We have not deployed
> and configured the Tez UI described here:
> https://tez.apache.org/tez-ui.html. Will that UI work when running Pig on
> Tez? If so, what does 'Tez specific GUI is not available yet' mean?
>
> What I am most specifically concerned about is the ability to see which Pig
> aliases are being assigned to which Tez vertices, or failing that, which Pig
> aliases are being processed by a particular Tez DAG. This is currently not
> available in logs in Pig 0.15.0, although I'm aware it is in master.
>
> What are best practices for Pig 0.15.0?
>
> Thanks,
> Kurt
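Hitesh's reply makes the UI conditional on history logging being "configured to use YARN timeline". As a rough illustration of what that condition might look like, the plain-Java sketch below checks a set of properties for it. The property names and the logging service class name are recalled from the Tez UI setup documentation rather than from this thread, so treat them as assumptions and verify them against the version you deploy. The class and method names are hypothetical, and no Tez or YARN API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: checks whether a set of properties looks like
// "history logging configured to use YARN timeline".
public class TezUiConfigCheck {
    // Assumed property names and value; verify against the Tez UI docs.
    static final String HISTORY_LOGGING_CLASS_KEY = "tez.history.logging.service.class";
    static final String ATS_LOGGING_SERVICE =
            "org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService";
    static final String TIMELINE_ENABLED_KEY = "yarn.timeline-service.enabled";

    // True only when the timeline-based logging service is selected and the
    // timeline service itself is enabled.
    static boolean usesYarnTimeline(Map<String, String> conf) {
        boolean atsLogging = ATS_LOGGING_SERVICE.equals(conf.get(HISTORY_LOGGING_CLASS_KEY));
        boolean timelineEnabled =
                Boolean.parseBoolean(conf.getOrDefault(TIMELINE_ENABLED_KEY, "false"));
        return atsLogging && timelineEnabled;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(HISTORY_LOGGING_CLASS_KEY, ATS_LOGGING_SERVICE);
        conf.put(TIMELINE_ENABLED_KEY, "true");
        System.out.println("History logging uses YARN timeline: " + usesYarnTimeline(conf));
    }
}
```

A check like this says nothing about the scalability settings mentioned in the reply; those depend on the Hadoop version and the timeline store in use.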