Hello Xiaoyong,

Some of the information is easily available, some of it is not :). I will answer your questions in a different order:

2) The data is sent from the Tez AM to the Timeline server as and when events occur during the processing of a DAG. For example, whenever a task is started, a task entity is created in Timeline. When it finishes, its status is updated in Timeline. In that sense, as long as the Timeline server can keep up with the updates from the Tez AM, it will have close to real-time status of a running Tez application.

3) Could you clarify this question a bit more? The Timeline data is mostly raw data ( very little aggregation except in some cases ). By querying it, a UI can calculate values such as the completion rate. One example of aggregation is at the Vertex level, where Timeline stores stats such as the slowest/fastest task.

1) Unfortunately, there are no detailed docs on the structure of the data, but you can look at the code in HistoryEventTimelineConversion.java. Using this and the docs from YARN ( https://issues.apache.org/jira/browse/YARN-1530 has the design doc ), you will see what data is being stored and when. If you would like to help the Tez project document this, we would appreciate it a lot. I can help guide you if you would like to take this up.

thanks
-- Hitesh

On Dec 15, 2014, at 12:56 AM, Xiaoyong Zhu <[email protected]> wrote:

> Sorry for the consecutive mails. I investigated the Tez UI for several days and I
> guess what I care about most is the APIs and the fields in the returned
> JSON values.
> I have read the JIRA regarding the YARN timeline server integration
> (https://issues.apache.org/jira/browse/TEZ-1066) by Hitesh. My questions
> are:
> 1. How can I get the format of the final returned JSON results?
> 2. This is for finished Tez applications, right? Or will Tez also log
> the running status in the Timeline server, and will the Timeline server expose
> running Tez applications via REST APIs?
> 3. Do we have some intermediate results of a Tez application?
> For example, the completion rate of each stage (graph).
>
> Regarding #1, I found an article talking about this:
> https://www.altiscale.com/hadoop-blog/timeline-server-in-hadoop/
>
> Please ignore the previous mails, as I don’t care about them, at least for now :)
> Thanks!
>
> Xiaoyong
>
> From: Xiaoyong Zhu [mailto:[email protected]]
> Sent: Sunday, December 14, 2014 4:57 PM
> To: [email protected]; [email protected]
> Subject: RE: Tez UI early trial?
>
> Btw, I also didn’t find the property
> tez.runtime.convert.user-payload.to.history-text in the tez-site.xml in
> HDP 2.2. I don’t know why; maybe I missed something?
>
> From: Xiaoyong Zhu
> Sent: Sunday, December 14, 2014 4:53 PM
> To: [email protected]; '[email protected]'
> Subject: RE: Tez UI early trial?
>
> Thanks Hitesh! I finally found that Tez 0.6.0 is available in the HDP 2.2 preview,
> so I decided to use that. I guess the previous error occurred because I don’t have
> the YARN timeline packages on local disk.
> But currently I still cannot access the Timeline server in the HDP 2.2 Sandbox
> (there is no http://127.0.0.1:8188/ws/v1/timeline/TEZ_APPLICATIONID). This
> might be out of your scope, but I would still like to ask:
> do you know how to enable the Tez UI quickly in HDP 2.2, since I see that it is
> using Tez 0.6.0?
>
> The tar under /apps/tez/ is tez-0.6.0.2.2.0.0-1084.tar.gz
>
> Xiaoyong
>
>
> -----Original Message-----
> From: Hitesh Shah [mailto:[email protected]]
> Sent: Saturday, December 13, 2014 12:17 AM
> To: [email protected]
> Subject: Re: Tez UI early trial?
>
> Hello Xiaoyong,
>
> Could you shed more light on the problems you have been encountering ( and
> with which version of Hadoop )? Some of the details on how to use YARN
> timeline are documented here: http://tez.apache.org/tez_yarn_timeline.html -
> let me know if that helps.
> Let us also know what version of Tez you are trying and we can guide you
> further.
>
> Beyond that, assuming you manage to get Tez to write data correctly to YARN
> timeline, the UI, as you mentioned, is JS-based, but it is built on top of the
> YARN timeline web services. There is some initial info on how to set up the UI
> here:
> https://github.com/apache/tez/blob/master/tez-ui/README.TXT. If you face any
> issues using the instructions, please do let us know as this will get refined
> as we get closer to the 0.6 release in the next few weeks.
>
> Some of the web service calls can be found in
> app/scripts/models/TimelineRestAdapter.js.
>
> For the most part, all the data is retrieved via these endpoints:
>
> <TIMELINEHOST:PORT>/ws/v1/timeline/TEZ_APPLICATION
> <TIMELINEHOST:PORT>/ws/v1/timeline/TEZ_DAG_ID
> <TIMELINEHOST:PORT>/ws/v1/timeline/TEZ_VERTEX_ID
> <TIMELINEHOST:PORT>/ws/v1/timeline/TEZ_TASK_ID
> <TIMELINEHOST:PORT>/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID
>
> ( timeline usually runs on port 8188 by default )
>
> thanks
> -- Hitesh
>
>
> On Dec 12, 2014, at 2:21 AM, Xiaoyong Zhu <[email protected]> wrote:
>
> > Hi Tez experts
> >
> > I am really excited to see that the Tez UI will be coming out in 0.6! It’s
> > really helpful!
> > I want to try it in my local sandbox (Cloudera/Hortonworks), but I
> > encountered a lot of issues (such as not being able to find the
> > org.apache.tez.dag.history.logging.ats packages), so I decided to give up
> > building it myself. Is this feature available in the Tez 0.5 releases, so I
> > could download the bits and deploy rather than build from scratch?
> >
> > Another question: I know it is a JS build on top of a bunch of
> > RESTful APIs after talking with Gopal. Could someone point me to the
> > REST API documentation? I want to have a look at what the REST APIs can do.
> >
> > Thanks!
> >
> > Xiaoyong
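
For readers who want to try the endpoints listed in the thread, here is a minimal Python sketch of how a client might build the Timeline URLs and derive a completion rate from the raw task data, as described in answer 3. The function names are hypothetical. The "entities" wrapper and the "otherinfo"/"status" fields are assumptions about the returned JSON, since the thread notes the format is not documented in detail; check them against HistoryEventTimelineConversion.java or a real response before relying on them.

```python
import json
from urllib.request import urlopen

# Entity types taken from the endpoint list in the thread.
ENTITY_TYPES = (
    "TEZ_APPLICATION",
    "TEZ_DAG_ID",
    "TEZ_VERTEX_ID",
    "TEZ_TASK_ID",
    "TEZ_TASK_ATTEMPT_ID",
)


def timeline_url(host, entity_type, port=8188):
    """Build the Timeline REST URL for one entity type (8188 is the usual default port)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError("unknown entity type: %s" % entity_type)
    return "http://%s:%d/ws/v1/timeline/%s" % (host, port, entity_type)


def fetch_entities(host, entity_type, port=8188):
    """Fetch entities of one type. Assumes the response wraps them in an 'entities' list."""
    with urlopen(timeline_url(host, entity_type, port)) as resp:
        return json.load(resp).get("entities", [])


def completion_rate(task_entities):
    """Fraction of tasks marked SUCCEEDED.

    Assumes each task entity carries its status under otherinfo.status;
    tasks without a status are treated as not yet finished.
    """
    if not task_entities:
        return 0.0
    done = sum(
        1 for e in task_entities
        if e.get("otherinfo", {}).get("status") == "SUCCEEDED"
    )
    return done / len(task_entities)
```

Against a running Timeline server, something like completion_rate(fetch_entities("127.0.0.1", "TEZ_TASK_ID")) would give a rough progress figure across all stored tasks; restricting it to a single DAG would need a filter on the query, which is not covered in the thread.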
