There clearly is a race condition, as described on the JIRA, but as far as I can remember we haven't hit it at all, even on BIG clusters.
Surely something is different with these crunch jobs. Please enable debug logging on the client side; that can help figure out what is happening on the client. More debug-fu on the AM logs and the client logs can shed some light.

The other thing I assumed was that this happens during app-finish. Is that true, or is the error happening at the beginning?

Thanks,
+Vinod

On Aug 13, 2013, at 6:46 PM, Matt Christiansen wrote:

> I will give that a try; the idea in the MR ticket listed in the
> earlier part of the thread did not work for this.
>
> On Tue, Aug 13, 2013 at 5:47 PM, Josh Wills <[email protected]> wrote:
>> Hey Matt,
>>
>> I do a build against CDH4.3.0 and post it on Cloudera's local repository for
>> my customers:
>> https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/crunch/crunch-core/
>>
>> Does that work any better?
>>
>> J
>>
>> On Tue, Aug 13, 2013 at 5:37 PM, Matt Christiansen <[email protected]> wrote:
>>>
>>> CDH4.3.0, which I believe is 2.0.5-alpha (not sure).
>>>
>>> Yes, the mapreduce specific JobHistoryServer is started and configured.
>>> Our other jobs don't seem to have issues, just crunch.
>>>
>>> On Tue, Aug 13, 2013 at 3:31 PM, Vinod Kumar Vavilapalli
>>> <[email protected]> wrote:
>>>> Which version of hadoop-2 are you testing against? Did you start up the
>>>> mapreduce specific JobHistoryServer?
>>>>
>>>> Thanks,
>>>> +Vinod
>>>>
>>>> On Aug 7, 2013, at 5:37 PM, Matt Christiansen wrote:
>>>>
>>>> Hey guys, I'm getting this error while running a crunch job on our YARN
>>>> cluster. It doesn't seem to be serious, but it's generating a lot of QA
>>>> questions:
>>>>
>>>> ERROR exec.MRExecutor: Exception thrown fetching job counters for
>>>> stage: com.rr.crunch.ads.KeyValuePairsAggregator:
>>>> Avro(/QA/event_logs/2013_08_07/pixel_tracking)+S0+pre-distinct+GBK+post-distinct+S1+SeqFile(/tmp/crunch-1725623982/p1)
>>>> java.io.IOException
>>>>     at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:317)
>>>>     at org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:337)
>>>>     at org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:476)
>>>>     at org.apache.hadoop.mapreduce.Job$8.run(Job.java:757)
>>>>     at org.apache.hadoop.mapreduce.Job$8.run(Job.java:754)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>>>     at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:754)
>>>>     at org.apache.crunch.impl.mr.exec.MRExecutor.execute(MRExecutor.java:69)
>>>>     at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:153)
>>>>     at com.rr.crunch.ads.KeyValuePairsAggregator.process(KeyValuePairsAggregator.java:79)
>>>>     at com.rr.crunch.ads.KeyValuePairsAggregator.main(KeyValuePairsAggregator.java:63)
>>>>
>>>> This is with crunch 0.5.0, but I have tried 0.7.0 and get the same
>>>> general error (no message about it being job counters, but the same stack
>>>> trace, and in the code it's at the job.getJob().getCounters() part). I
>>>> have tried both the hadoop-1 and hadoop-2 jars; I was wondering if
>>>> anyone had come across this before or had any idea.
>>>>
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or entity to
>>>> which it is addressed and may contain information that is confidential,
>>>> privileged and exempt from disclosure under applicable law. If the reader of
>>>> this message is not the intended recipient, you are hereby notified that any
>>>> printing, copying, dissemination, distribution, disclosure or forwarding of
>>>> this communication is strictly prohibited. If you have received this
>>>> communication in error, please contact the sender immediately and delete it
>>>> from your system. Thank You.
>>
>> --
>> Director of Data Science
>> Cloudera
>> Twitter: @josh_wills
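
For the client-side debug logging suggested at the top of this thread, a minimal log4j 1.x sketch might look like the fragment below. The logger names are taken from the classes and packages in the stack trace, apart from org.apache.hadoop.ipc, which is added on the assumption that the RPC layer is worth seeing too. It also assumes the client picks up a log4j.properties from its classpath and that each class logs under its own name; neither is confirmed in the thread.

```
# log4j.properties fragment for the client that launches the crunch job.
# Assumption: log4j 1.x syntax and an appender already attached to the root logger.

# Client-side delegate that throws the IOException in the stack trace
log4j.logger.org.apache.hadoop.mapred.ClientServiceDelegate=DEBUG
log4j.logger.org.apache.hadoop.mapred.YARNRunner=DEBUG

# RPC layer between the client and the AM / JobHistoryServer (assumed to be relevant)
log4j.logger.org.apache.hadoop.ipc=DEBUG

# Crunch executor that calls Job.getCounters()
log4j.logger.org.apache.crunch.impl.mr.exec=DEBUG
```

If the job is launched through the hadoop script, setting HADOOP_ROOT_LOGGER=DEBUG,console in the environment is a blunter alternative that turns on debug output for everything.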
