Re: Tez statistics model
It sounds like there is interest in this feature from others in the community. XuanCao, if you are willing to open a TEZ JIRA request with a description of the features, we can start discussing requirements and solution ideas at that time.

Regards,
jeagles

On Thu, Mar 8, 2018 at 11:55 PM, Eric Wohlstadter wrote:

> Hi Xuan, Jonathan,
>
> We have some use-cases that could really benefit from
> "solidify and enhance the Tez Statistics API so that applications
> (pig/hive/scope/etc) can provide their own custom implementations"
>
> I'd be happy to contribute in some way.
>
> I think it would be a good idea to leverage standard libraries
> like DropWizard Metrics: http://metrics.dropwizard.io/4.0.0/
>
> DropWizard, for example, has a good number of off-the-shelf
> integrations: http://metrics.dropwizard.io/4.0.0/manual/third-party.html
>
> On Thu, Mar 8, 2018 at 1:14 PM, Jonathan Eagles wrote:
>
> > Thanks for reaching out to us, XuanCao.
> >
> > Sounds like you are planning to use new custom statistics that a new
> > custom Vertex can utilize to make better decisions. Are you planning to
> > contribute these custom statistics changes to Tez? Or are we trying to
> > solidify and enhance the Tez Statistics API so that applications
> > (pig/hive/scope/etc) can provide their own custom implementations? This
> > sounds like a great addition if I understand correctly. There are some
> > similar requests (TEZ-1167, TEZ-764), so we can try to find a way to
> > implement this that the whole community can benefit from.
> >
> > Regards,
> > jeagles
> >
> > On Tue, Mar 6, 2018 at 4:22 PM, Xuan Cao wrote:
> >
> > > Hi,
> > >
> > > This is Xuan with Microsoft. We are looking at the Tez statistics
> > > model, trying to extend it to support our custom workloads. Here are
> > > some observations of the few statistics objects in the current
> > > codebase:
> > >
> > > 1. InputStatistics, OutputStatistics and VertexStatistics are
> > > interfaces in tez-api.
> > >
> > > 2. TaskStatistics is a class in tez-runtime-internals.
> > >
> > > 3. VertexStatisticsImpl is an inner class of VertexImpl implementing
> > > VertexStatistics in tez-dag.
> > >
> > > 4. IOStatistics is a class in tez-runtime-internals, and
> > > IOStatisticsImpl is a static inner class of VertexImpl extending
> > > IOStatistics but implementing InputStatistics and OutputStatistics.
> > >
> > > Our intention is to extend VertexStatisticsImpl so that we can
> > > aggregate the custom payload in TaskStatistics. But the object model
> > > here seems to be rather confusing and not consistent. We are planning
> > > to do some work in this area to improve it, but we are not sure
> > > whether there is any work going on right now, and what its status is.
> > >
> > > Regards,
> > >
> > > XuanCao
Re: Tez statistics model
Hi Xuan, Jonathan,

We have some use-cases that could really benefit from "solidify and enhance the Tez Statistics API so that applications (pig/hive/scope/etc) can provide their own custom implementations".

I'd be happy to contribute in some way.

I think it would be a good idea to leverage standard libraries like DropWizard Metrics: http://metrics.dropwizard.io/4.0.0/

DropWizard, for example, has a good number of off-the-shelf integrations: http://metrics.dropwizard.io/4.0.0/manual/third-party.html
Re: Tez statistics model
Thanks for reaching out to us, XuanCao.

Sounds like you are planning to use new custom statistics that a new custom Vertex can utilize to make better decisions. Are you planning to contribute these custom statistics changes to Tez? Or are we trying to solidify and enhance the Tez Statistics API so that applications (pig/hive/scope/etc) can provide their own custom implementations? This sounds like a great addition if I understand correctly. There are some similar requests (TEZ-1167, TEZ-764), so we can try to find a way to implement this that the whole community can benefit from.

Regards,
jeagles
Tez statistics model
Hi,

This is Xuan with Microsoft. We are looking at the Tez statistics model, trying to extend it to support our custom workloads. Here are some observations of the few statistics objects in the current codebase:

1. InputStatistics, OutputStatistics and VertexStatistics are interfaces in tez-api.

2. TaskStatistics is a class in tez-runtime-internals.

3. VertexStatisticsImpl is an inner class of VertexImpl implementing VertexStatistics in tez-dag.

4. IOStatistics is a class in tez-runtime-internals, and IOStatisticsImpl is a static inner class of VertexImpl extending IOStatistics but implementing InputStatistics and OutputStatistics.

Our intention is to extend VertexStatisticsImpl so that we can aggregate the custom payload in TaskStatistics. But the object model here seems to be rather confusing and not consistent. We are planning to do some work in this area to improve it, but we are not sure whether there is any work going on right now, and what its status is.

Regards,

XuanCao
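
For readers of the thread, the relationships in items 1-4 and the proposed aggregation might be sketched roughly as below. This is a simplified illustration, not the Tez source: the type names mirror those in the message, but every method and field shown (getDataSize, getItemsProcessed, add, mergeFrom, customPayload, getCustom) and the class VertexStatisticsSketch are assumptions made for the example, and the real inner classes of VertexImpl are flattened into top-level classes.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for the tez-api interfaces (item 1); method names are assumed.
interface InputStatistics { long getDataSize(); long getItemsProcessed(); }
interface OutputStatistics { long getDataSize(); long getItemsProcessed(); }

// Stand-in for the tez-runtime-internals class (item 4).
class IOStatistics {
    private long dataSize;
    private long itemsProcessed;
    public long getDataSize() { return dataSize; }
    public long getItemsProcessed() { return itemsProcessed; }
    void add(long bytes, long items) { dataSize += bytes; itemsProcessed += items; }
    void mergeFrom(IOStatistics other) { add(other.dataSize, other.itemsProcessed); }
}

// Extends the class but implements both interfaces (item 4).
class IOStatisticsImpl extends IOStatistics implements InputStatistics, OutputStatistics { }

// Stand-in for TaskStatistics (item 2) with a hypothetical custom payload.
class TaskStatistics {
    final Map<String, IOStatistics> ioStatistics = new HashMap<>();
    final Map<String, Long> customPayload = new HashMap<>(); // assumption: named counters
}

// Hypothetical vertex-level aggregator in the role of VertexStatisticsImpl (item 3).
class VertexStatisticsSketch {
    private final Map<String, IOStatisticsImpl> io = new HashMap<>();
    private final Map<String, Long> custom = new HashMap<>();

    // Fold one task's statistics, including its custom payload, into the vertex totals.
    void merge(TaskStatistics task) {
        for (Map.Entry<String, IOStatistics> e : task.ioStatistics.entrySet()) {
            io.computeIfAbsent(e.getKey(), k -> new IOStatisticsImpl()).mergeFrom(e.getValue());
        }
        for (Map.Entry<String, Long> e : task.customPayload.entrySet()) {
            custom.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    InputStatistics getInputStatistics(String name) { return io.get(name); }
    OutputStatistics getOutputStatistics(String name) { return io.get(name); }
    long getCustom(String key) { return custom.getOrDefault(key, 0L); }
}
```

Summing counters with Long::sum is only one possible merge rule; a pluggable merge function per payload type would be closer to the "applications provide their own implementations" idea raised in the replies.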