Hi Xuan, Jonathan,

We have some use cases that could really benefit from the effort to "solidify and enhance the Tez Statistics API so that applications (pig/hive/scope/etc) can provide their own custom implementations".
I'd be happy to contribute in some way. I think it would be a good idea to leverage standard libraries like DropWizard Metrics:
http://metrics.dropwizard.io/4.0.0/

DropWizard, for example, has a good number of off-the-shelf integrations:
http://metrics.dropwizard.io/4.0.0/manual/third-party.html

On Thu, Mar 8, 2018 at 1:14 PM, Jonathan Eagles <jeag...@gmail.com> wrote:

> Thanks for reaching out to us, Xuan Cao.
>
> Sounds like you are planning to use new custom statistics that a new custom
> Vertex can utilize to make better decisions. Are you planning to contribute
> these custom statistics changes to Tez? Or are we trying to solidify
> and enhance the Tez Statistics API so that applications
> (pig/hive/scope/etc) can provide their own custom implementations? This
> sounds like a great addition if I understand correctly. There are some
> requests for similar features (TEZ-1167, TEZ-764), so we can try to find a
> way to implement this that the whole community can benefit from.
>
> Regards,
> jeagles
>
> On Tue, Mar 6, 2018 at 4:22 PM, Xuan Cao <cao...@yahoo.com.invalid> wrote:
>
> > Hi,
> >
> > This is Xuan with Microsoft. We are looking at the Tez statistics model,
> > trying to extend it to support our custom workloads. Here are some
> > observations on the few statistics objects in the current codebase:
> >
> > 1. InputStatistics, OutputStatistics and VertexStatistics are interfaces
> > in tez-api.
> >
> > 2. TaskStatistics is a class in tez-runtime-internals.
> >
> > 3. VertexStatisticsImpl is an inner class of VertexImpl implementing
> > VertexStatistics in tez-dag.
> >
> > 4. IOStatistics is a class in tez-runtime-internals, and
> > IOStatisticsImpl is a static inner class of VertexImpl extending
> > IOStatistics but implementing InputStatistics and OutputStatistics.
> >
> > Our intention is to extend VertexStatisticsImpl so that we can aggregate
> > the custom payload in TaskStatistics. But the object model here seems to
> > be rather confusing and inconsistent. We are planning to do some work in
> > this area to improve it, but are not sure whether there is any work going
> > on right now and, if so, what its status is.
> >
> > Regards,
> > Xuan Cao