Re: Pig APIs

Prashant Kommireddi Tue, 22 Jan 2013 23:25:42 -0800

Thanks Bill.

Sent from my iPhone


On Jan 22, 2013, at 10:10 PM, Bill Graham <billgra...@gmail.com> wrote:

> Sure, here you go:
>
> https://github.com/twitter/ambrose/blob/master/pig/src/main/java/com/twitter/ambrose/pig/AmbrosePigProgressNotificationListener.java
>
>
>
> On Tue, Jan 22, 2013 at 10:04 PM, Prashant Kommireddi
> <prash1...@gmail.com> wrote:
>
>> Thanks Bill. Can you please point me to the Ambrose code that uses PPNL?
>>
>> I will open a JIRA for adding hooks into explain.
>>
>> Sent from my iPhone
>>
>> On Jan 22, 2013, at 9:03 PM, Bill Graham <billgra...@gmail.com> wrote:
>>
>>> Yeah, getting at the info here is tricky. For Ambrose we're getting info
>>> about submitted jobs, so we can just hook into the lifecycle of
>>> PigProgressNotificationListener. The PPNL notifiers are pretty coupled to
>>> PigStatsUtil and ScriptState, which aren't invoked during explain.
>>>
>>> The bulk of the action for explain all happens in the PigServer.explain(..)
>>> method. That's where the logical plan, physical plan and execution plan are
>>> generated before explain gets called on each to print the output. We could
>>> look to add some sort of listener interface and hook here, perhaps one that
>>> gets each of these plans passed to it during explain via a configured param.
>>>
>>>
>>>
>>> On Tue, Jan 22, 2013 at 3:05 PM, Jonathan Coveney <jcove...@gmail.com>
>>> wrote:
>>>
>>>> I think that this is all available, it's just not the easiest thing to get
>>>> at. If you look at the explain plan, it has a lot of this info, and you can
>>>> definitely get at that info. I'm not sure if it has the reducers or if
>>>> that's post MR setup, but you should be able to.
>>>>
>>>> That said, I do not think it would hurt to have hooks in to more clearly
>>>> do something with this info. Bill had to do stuff like this for Ambrose, so
>>>> maybe he can weigh in on what that could look like.
>>>>
>>>>
>>>> 2013/1/22 Prashant Kommireddi <prash1...@gmail.com>
>>>>
>>>>> Jon/others - any pointers on this? I would like to patch in hooks if this
>>>>> is not possible at the moment.
>>>>>
>>>>> -Prashant
>>>>>
>>>>> On Mon, Jan 21, 2013 at 5:47 PM, Prashant Kommireddi <prash1...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> At the moment, basically info on I/O paths, operators used (group by,
>>>>>> foreach ..), job level info such as number of reducers etc.
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 21, 2013 at 5:30 PM, Jonathan Coveney <jcove...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> What level of information would you like? I.e., if you do "explain
>>>>>>> relation," which of the three plans do you want to hook into?
>>>>>>>
>>>>>>>
>>>>>>> 2013/1/21 Prashant Kommireddi <prash1...@gmail.com>
>>>>>>>
>>>>>>>> Been coding with the APIs and wondering if there is anything that allows
>>>>>>>> you to only retrieve the operators, I/O paths etc without actually issuing
>>>>>>>> an execute or a store? Basically, being able to get information
>>>>>>>> post-parsing of the script but pre-execution.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Prashant
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>
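The listener idea Bill describes for PigServer.explain(..) could be sketched roughly as below. This is a minimal, self-contained illustration in plain Java with no Pig dependency, and everything in it is an assumption: the ExplainListener interface, the explain.listener.class property name, and the ExplainDriver class standing in for the explain code path are hypothetical names, not Pig APIs. Plans are represented as plain strings rather than Pig's own plan classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Usage example: configure a listener by class name, then run the stand-in explain.
class ExplainHookSketch {
    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ExplainDriver.LISTENER_KEY,
                RecordingExplainListener.class.getName());
        ExplainListener listener = ExplainDriver.loadListener(conf);
        ExplainDriver.explain("B", listener);
        System.out.println(((RecordingExplainListener) listener).seen);
    }
}

// Hypothetical callback interface: one method per plan generated during explain.
interface ExplainListener {
    void onLogicalPlan(String plan);
    void onPhysicalPlan(String plan);
    void onExecutionPlan(String plan);
}

// Example listener that simply records each plan it is handed.
class RecordingExplainListener implements ExplainListener {
    final List<String> seen = new ArrayList<>();

    @Override public void onLogicalPlan(String plan)   { seen.add("logical:" + plan); }
    @Override public void onPhysicalPlan(String plan)  { seen.add("physical:" + plan); }
    @Override public void onExecutionPlan(String plan) { seen.add("execution:" + plan); }
}

// Stand-in for the explain code path; this is NOT the real PigServer.
class ExplainDriver {
    // Hypothetical configuration key that names the listener class.
    static final String LISTENER_KEY = "explain.listener.class";

    // Instantiate the configured listener, or return null if none is configured.
    static ExplainListener loadListener(Properties conf) {
        String className = conf.getProperty(LISTENER_KEY);
        if (className == null) {
            return null;
        }
        try {
            return (ExplainListener) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load listener " + className, e);
        }
    }

    // Build the three plans (placeholder strings here) and pass each to the listener.
    static void explain(String alias, ExplainListener listener) {
        String logical = "LP(" + alias + ")";
        String physical = "PP(" + alias + ")";
        String execution = "EP(" + alias + ")";
        if (listener != null) {
            listener.onLogicalPlan(logical);
            listener.onPhysicalPlan(physical);
            listener.onExecutionPlan(execution);
        }
    }
}
```

Loading the listener by class name from configuration mirrors the "configured param" part of the suggestion: nothing is executed or stored, and callers that configure no listener see no change in behavior.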
