Thanks Bill.

Sent from my iPhone

On Jan 22, 2013, at 10:10 PM, Bill Graham <billgra...@gmail.com> wrote:

> Sure, here you go:
>
> https://github.com/twitter/ambrose/blob/master/pig/src/main/java/com/twitter/ambrose/pig/AmbrosePigProgressNotificationListener.java
>
> On Tue, Jan 22, 2013 at 10:04 PM, Prashant Kommireddi
> <prash1...@gmail.com> wrote:
>
>> Thanks Bill. Can you please point me to the Ambrose code that uses PPNL?
>>
>> I will open a JIRA for getting hooks with explain in.
>>
>> Sent from my iPhone
>>
>> On Jan 22, 2013, at 9:03 PM, Bill Graham <billgra...@gmail.com> wrote:
>>
>>> Yeah, getting at the info here is tricky. For Ambrose we're getting info
>>> about submitted jobs, so we can just hook into the lifecycle of
>>> PigProgressNotificationListener. The PPNL notifiers are pretty coupled to
>>> PigStatsUtil and ScriptState, which aren't invoked during explain.
>>>
>>> The bulk of the action for explain all happens in the PigServer.explain(..)
>>> method. That's where the logical plan, physical plan and execution plan are
>>> generated before explain gets called on each to print the output. We could
>>> look to add some sort of listener interface and hook here perhaps that gets
>>> each of these passed during explain via a configured param.
>>>
>>> On Tue, Jan 22, 2013 at 3:05 PM, Jonathan Coveney <jcove...@gmail.com>
>>> wrote:
>>>
>>>> I think that this is all available, it's just not the easiest thing to get
>>>> at. If you look at the explain plan, it has a lot of this info, and you can
>>>> definitely get at that info. I'm not sure if it has the reducers or if
>>>> that's post MR setup, but you should be able to.
>>>>
>>>> That said, I do not think it would hurt to have hooks in to more clearly
>>>> do something with this info. Bill had to do stuff like this for Ambrose, so
>>>> maybe he can weigh in on what that could look like.
>>>>
>>>> 2013/1/22 Prashant Kommireddi <prash1...@gmail.com>
>>>>
>>>>> Jon/others - any pointers on this? I would like to patch in hooks if this
>>>>> is not possible at the moment.
>>>>>
>>>>> -Prashant
>>>>>
>>>>> On Mon, Jan 21, 2013 at 5:47 PM, Prashant Kommireddi
>>>>> <prash1...@gmail.com> wrote:
>>>>>
>>>>>> At the moment, basically info on I/O paths, operators used (group by,
>>>>>> foreach ..), job level info such as number of reducers etc.
>>>>>>
>>>>>> On Mon, Jan 21, 2013 at 5:30 PM, Jonathan Coveney <jcove...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> What level of information would you like? IE if you do "explain relation,"
>>>>>>> which of the three do you want to hook into?
>>>>>>>
>>>>>>> 2013/1/21 Prashant Kommireddi <prash1...@gmail.com>
>>>>>>>
>>>>>>>> Been coding with the APIs and wondering if there is anything that allows
>>>>>>>> you to only retrieve the operators, I/O paths etc without actually issuing
>>>>>>>> an execute or a store? Basically, being able to get information
>>>>>>>> post-parsing of the script but pre-execution.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Prashant