After a bit of research and going through the code, I found where the issue comes from: this is actually a bug in Pig, and it seems it was recently logged (and fixed!) here: https://issues.apache.org/jira/browse/PIG-2358
The bug itself is pretty simple: in JobStats.java, the addCounters(RunningJob rjob) method never assigns the instance variable counters. Instead, it declares a new local variable named counters every time it is called. That local variable shadows the field, so the instance variable counters always stays null.

I was able to get this working correctly by applying this patch, which simply removes the declaration of the temporary counters variable, so that the assignment goes to the instance variable. Counters now work with PigRunner and PigServer as well.

Hope this information can prevent others from being stuck with this issue!

Thanks,

Charles

On Wed, Dec 7, 2011 at 10:04 AM, Charles Menguy <[email protected]> wrote:

> Yes, I already read this article, but that's almost exactly what I'm doing
> (except I'm using PigServer instead of PigRunner). I can get the JobGraph
> instance and the JobStats, but it's the call to getHadoopCounters() that
> always returns null.
>
> Do you have any clue why the Hadoop counters are null with PigServer? I
> can still get the other metrics from JobStats instances, but the counters
> specifically don't seem to return anything in that case. Not sure if this
> is a bug or if something is missing there.
>
> Thanks,
>
> Charles
>
>
> On Tue, Dec 6, 2011 at 11:47 PM, Aniket Mokashi <[email protected]> wrote:
>
>> There is a good blog article on this:
>>
>> http://squarecog.wordpress.com/2010/12/24/incrementing-hadoop-counters-in-apache-pig/
>>
>> Thanks,
>> Aniket
>>
>> On Tue, Dec 6, 2011 at 1:49 PM, Charles Menguy <[email protected]> wrote:
>>
>> > Hi All,
>> >
>> > I'm trying to play with counters with PigServer and have a couple of
>> > issues.
>> >
>> > First, I've found very little documentation on how to do this, so I'm
>> > not sure if the method I'm trying is the right one. Any feedback would
>> > be appreciated.
>> >
>> > From what I understand, we need a PigStats in order to be able to
>> > retrieve the counters from it.
>> > To get this PigStats from a PigServer instance, here is what I do:
>> >
>> > // needed to enable batch mode, which seems to be the only way to get
>> > // the ExecJob instances needed to get the stats
>> > pigServer.setBatchOn();
>> > // register the script I want to run
>> > pigServer.registerScript(pigScript, params);
>> > // get the ExecJobs associated with the script I just ran
>> > List<ExecJob> execJobs = pigServer.executeBatch();
>> >
>> > Now I am supposed to be able to get the counters from this ExecJob
>> > class.
>> >
>> > for (ExecJob execJob : execJobs) {
>> >     // not sure why we need to use the job graph to get the stats, but
>> >     // that seems to be the only solution I found
>> >     for (JobStats jobStats : execJob.getStatistics().getJobGraph()) {
>> >         Counters counters = jobStats.getHadoopCounters(); // this is always NULL!
>> >         for (Group group : counters) {
>> >             for (Counter counter : group) {
>> >
>> > Now the strange thing is that every time I call getHadoopCounters(),
>> > the resulting Counters object is null, and thus I cannot get any
>> > counter at all.
>> >
>> > This happens in local and mapreduce mode, and I checked that execJob
>> > and jobStats are indeed not null.
>> >
>> > Am I doing something wrong here to get the counters, or forgetting
>> > something? I'm using Pig 0.8.1 from cdh3u1.
>> >
>> > Thanks for your help!
>> >
>> > --
>> > Charles Menguy | Senior Software Engineer
>> > Proclivity Systems
>> > 22 West 19th Street | 9th Floor
>> > New York, NY 10011
>> > [email protected]
>> > www.proclivitysystems.com
>> >
>> > Proclivity® | We Value Your Customers™
>> >
>> > This message is the property of Proclivity Systems, Inc. and is intended
>> > only for the use of the addressee(s), and may contain material that is
>> > confidential and privileged for the sole use of the intended recipient.
>> > If you are not the intended recipient, reliance or forwarding without
>> > express permission is strictly prohibited; please contact the sender
>> > and delete all copies.
>> >
>>
>>
>> --
>> "...:::Aniket:::... Quetzalco@tl"
>>
>
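
P.S. For anyone landing on this thread later: the bug described at the top of this message is a plain case of a local variable shadowing an instance field. Below is a minimal, self-contained Java sketch of that pattern. The class names (ShadowedStats, FixedStats, ShadowingDemo), the Map-based counters and the RECORDS_READ counter name are all made up for illustration. This is not the actual JobStats source, only the same shape of mistake and fix.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a stats holder; NOT the real Pig JobStats class.
class ShadowedStats {
    private Map<String, Long> counters; // instance field, null until assigned

    // Buggy version: the local declaration shadows the field,
    // so the field itself is never assigned and stays null.
    void addCounters(Map<String, Long> fetched) {
        Map<String, Long> counters = new HashMap<>(fetched);
    }

    Map<String, Long> getCounters() {
        return counters;
    }
}

// Same hypothetical class with the local declaration removed.
class FixedStats {
    private Map<String, Long> counters;

    // Fixed version: without a local declaration, the assignment
    // targets the instance field.
    void addCounters(Map<String, Long> fetched) {
        counters = new HashMap<>(fetched);
    }

    Map<String, Long> getCounters() {
        return counters;
    }
}

public class ShadowingDemo {
    public static void main(String[] args) {
        Map<String, Long> fetched = new HashMap<>();
        fetched.put("RECORDS_READ", 42L); // made-up counter name and value

        ShadowedStats buggy = new ShadowedStats();
        buggy.addCounters(fetched);
        System.out.println("buggy: " + buggy.getCounters()); // prints "buggy: null"

        FixedStats fixed = new FixedStats();
        fixed.addCounters(fetched);
        System.out.println("fixed: " + fixed.getCounters()); // prints "fixed: {RECORDS_READ=42}"
    }
}
```

Until you can run a patched build, checking the result of getHadoopCounters() for null before iterating over it at least avoids a NullPointerException in the calling code.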
