The code snippet I wrote was for use inside a UDF; it is not Pig Latin.
To get at things like counters when running a Pig script, you would
have to write a Java driver program that uses the new API added in
https://issues.apache.org/jira/browse/PIG-1478 and
https://issues.apache.org/jira/browse/PIG-1333
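
For reference, here is a minimal sketch of what the UDF-side usage might
look like. It assumes Pig 0.8 with the Pig and Hadoop jars on the classpath.
The class name CountingUpper and the enum UdfCounters are made up for
illustration; only the PigStatusReporter calls come from the snippet quoted
below. The null checks cover the case where the UDF is invoked on the
client side, before a task context exists.

```java
import java.io.IOException;

import org.apache.hadoop.mapreduce.Counter;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.tools.pigstats.PigStatusReporter;

// Hypothetical UDF: upper-cases its first field and counts null inputs.
public class CountingUpper extends EvalFunc<String> {

    // Hypothetical counter names; any enum can serve as the counter key.
    public enum UdfCounters { NULL_INPUT }

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            increment(UdfCounters.NULL_INPUT);
            return null;
        }
        return input.get(0).toString().toUpperCase();
    }

    // Increment a Hadoop counter, doing nothing when no reporter or
    // counter is available (for example, on the client side).
    private static void increment(Enum<?> key) {
        PigStatusReporter reporter = PigStatusReporter.getInstance();
        if (reporter == null) {
            return;
        }
        Counter counter = reporter.getCounter(key);
        if (counter != null) {
            counter.increment(1L);
        }
    }
}
```

From a script, such a UDF would be registered and called like any other
EvalFunc; the counter call itself stays in the Java code, not in Pig Latin.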

-Dmitriy

On Mon, Oct 18, 2010 at 2:57 AM, Josh Devins <[email protected]> wrote:
> Ah, sorry, just saw that this should read:
>
> PigStatusReporter.getInstance() and there is no special counters
> keyword/variable. However is this common for Pig, being able to access
> static methods directly from within a Pig script?
>
> Thanks,
>
> Josh
>
>
> On 18 October 2010 11:56, Josh Devins <[email protected]> wrote:
>
>> Thanks, I will explore the stats in MR mode a bit once I'm on 0.8/trunk.
>>
>> I will also have a look at wrapping some of the standard loaders to get
>> better stats out of them. Is this of interest to anyone else? Should I
>> submit back to PiggyBank?
>>
>> This syntax of counters.PigStatusReporter, is that documented somewhere? Is
>> it only on 0.8/trunk? What other variables do we have access to in the
>> "native" Pig script other than "counters"?
>>
>> Josh
>>
>>
>>
>> On 17 October 2010 19:44, Dmitriy Ryaboy <[email protected]> wrote:
>>
>>> No on Filters (though every MR job tells you the number of records
>>> ingested, and the number returned, and as of 0.8 it also tells you which
>>> relations were being produced in the job -- so you can sort of back into
>>> that).
>>> EB sort of gives you 2), most of the loaders in there give you number of
>>> malformed records, though they do not store the bad records anywhere.
>>> I am not sure what you mean by 3) -- you can just increment
>>> counters.
>>> PigStatusReporter.getInstance().getCounter(myEnum).increment(1L);
>>>
>>> (watch out for a null reporter when you are still in the client-side).
>>>
>>> -D
>>>
>>>
>>> On Sat, Oct 16, 2010 at 2:28 PM, Josh Devins <[email protected]> wrote:
>>>
>>> > I've seen a few threads about counters, PigStats, Elephant-Bird's stats
>>> > utility class, etc.
>>> >
>>> > http://www.mail-archive.com/[email protected]/msg00900.html
>>> > http://www.mail-archive.com/user%40pig.apache.org/msg00034.html
>>> >
>>> > Has any progress been made on this or to provide a comprehensive
>>> > stats/counter mechanism?
>>> >
>>> > What I'm looking to do is three-fold:
>>> >
>>> > 1) Get stats on the number of records that are filtered out when using
>>> > the FILTER operation
>>> > 2) Get stats on the number of records dropped/not loaded in a LOAD
>>> > function (and actual copies of the records/rows from the file for later
>>> > evaluation)
>>> > 3) Output my own stats from a Pig job (without resorting to writing my
>>> > own UDF and pushing things into PigStats using the Elephant-Bird utility)
>>> >
>>> > If any of this is possible, it would be great to see some examples or
>>> > documentation. I would hate to go to raw Hadoop MR code just to get to
>>> > counters.
>>> >
>>> > Thanks,
>>> >
>>> > Josh
>>> >
>>>
>>
>>
>
