Hi Jeremy,

I just took some time to dig into this a bit deeper. It turns out that it is indeed an issue with handling an empty output PCollection in the AvroPathPerKeyTarget -- I've logged https://issues.apache.org/jira/browse/CRUNCH-371 to resolve it.
The reason it was working on the local job tracker is a difference between the implementations of LocalFileSystem and DistributedFileSystem in hadoop-1. The good/bad news is that the current code will consistently crash with the same exception on hadoop-2 with both the local file system and HDFS. The short-term solution (other than patching your Crunch build with the patch in CRUNCH-371) would be to just ensure that the PCollection being output isn't empty.

- Gabriel

On Sat, Mar 29, 2014 at 2:27 AM, Jeremy Lewi <[email protected]> wrote:
> Thanks for the tip. I'll look into it and try to figure it out.
>
>
> On Fri, Mar 28, 2014 at 11:11 AM, Gabriel Reid <[email protected]>
> wrote:
>>
>> On Fri, Mar 28, 2014 at 6:13 PM, Jeremy Lewi <[email protected]> wrote:
>> > Unfortunately that didn't work. I still have a reduce-only job.
>> >
>> > Here's a link to the console output in case that's helpful:
>> >
>> > https://drive.google.com/a/lewi.us/file/d/0B6ngy4MCihWwcy1sdE9DQ2hiYnc/edit?usp=sharing
>> >
>> >
>> > I'm currently ungrouping my records before writing them (an earlier
>> > attempt to fix this issue). I'm trying without the ungroup now.
>>
>> Looking at the console output, I noticed that the second and third
>> jobs are logging "Total input paths to process : 0", which makes me
>> think that the first job being run doesn't have any output. Could you
>> check the job counters there to see if it is indeed outputting
>> anything? And was your local job running on the same data?
>>
>> The fact that there are no inputs would explain the reduce-only job,
>> and I'm guessing/hoping that this will be the reason the
>> AvroPathPerKeyTarget is breaking.
>>
>> - Gabriel
>>
>>
>> >
>> > J
>> >
>> >
>> > On Fri, Mar 28, 2014 at 10:08 AM, Jeremy Lewi <[email protected]> wrote:
>> >>
>> >> Unfortunately that didn't work. I still have a reduce-only job. I'm
>> >> attaching the console output from when I run my job in case that's
>> >> helpful.
>> >> I'm currently ungrouping my records before writing them (an earlier
>> >> attempt to fix this). I'm trying undoing that now.
>> >>
>> >> J
>> >>
>> >>
>> >> On Fri, Mar 28, 2014 at 9:51 AM, Jeremy Lewi <[email protected]> wrote:
>> >>>
>> >>> Thanks Gabriel, I'll give that a try now. I was actually planning on
>> >>> making that change once I realized that my current strategy was
>> >>> forcing me to materialize data early on.
>> >>>
>> >>>
>> >>> On Fri, Mar 28, 2014 at 7:44 AM, Gabriel Reid <[email protected]>
>> >>> wrote:
>> >>>>
>> >>>> On Fri, Mar 28, 2014 at 3:19 PM, Jeremy Lewi <[email protected]> wrote:
>> >>>> > No luck. I get the same error even when using a single reducer. I'm
>> >>>> > attaching the job configuration as shown in the web UI.
>> >>>> >
>> >>>> > When I look at the job tracker for the job, it has no map tasks. Is
>> >>>> > that expected? I've never heard of a reduce-only job.
>> >>>> >
>> >>>>
>> >>>> Nope, a job with no map tasks doesn't sound right to me. I noticed
>> >>>> that you're effectively doing a materialize at [1], and then
>> >>>> using a BloomFilterJoinStrategy. While this should work fine, I'm
>> >>>> thinking that it could also potentially lead to some issues such as
>> >>>> the one you're having (i.e. a job with no map tasks).
>> >>>>
>> >>>> Could you try using the default join strategy there to see what
>> >>>> happens? I'm thinking that the AvroPathPerKeyTarget issue could just
>> >>>> be a consequence of something else going wrong earlier on.
>> >>>>
>> >>>> 1.
>> >>>> https://code.google.com/p/contrail-bio/source/browse/src/main/java/contrail/scaffolding/FilterReads.java?name=dev_read_filtering#156
>> >>>>
>> >>>> >
>> >>>> > On Fri, Mar 28, 2014 at 6:45 AM, Jeremy Lewi <[email protected]>
>> >>>> > wrote:
>> >>>> >>
>> >>>> >> This is my first time on a cluster. I'll try what Josh suggests
>> >>>> >> now.
>> >>>> >>
>> >>>> >> J
>> >>>> >>
>> >>>> >>
>> >>>> >> On Fri, Mar 28, 2014 at 3:41 AM, Josh Wills <[email protected]>
>> >>>> >> wrote:
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On Fri, Mar 28, 2014 at 1:22 AM, Gabriel Reid
>> >>>> >>> <[email protected]>
>> >>>> >>> wrote:
>> >>>> >>>>
>> >>>> >>>> Hi Jeremy,
>> >>>> >>>>
>> >>>> >>>> On Thu, Mar 27, 2014 at 3:26 PM, Jeremy Lewi <[email protected]>
>> >>>> >>>> wrote:
>> >>>> >>>> > Hi,
>> >>>> >>>> >
>> >>>> >>>> > I'm hitting the exception pasted below when using
>> >>>> >>>> > AvroPathPerKeyTarget.
>> >>>> >>>> > Interestingly, my code works just fine when I run on a small
>> >>>> >>>> > dataset using the LocalJobTracker. However, when I run on a
>> >>>> >>>> > large dataset using a hadoop cluster I hit the exception.
>> >>>> >>>> >
>> >>>> >>>>
>> >>>> >>>> Have you ever been able to successfully use the
>> >>>> >>>> AvroPathPerKeyTarget on a real cluster, or is this the first
>> >>>> >>>> try with it?
>> >>>> >>>>
>> >>>> >>>> I'm wondering if this could be a problem that's always been
>> >>>> >>>> around (as the integration test for AvroPathPerKeyTarget also
>> >>>> >>>> runs in the local jobtracker), or if this could be something new.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> +1 -- Jeremy, if you force the job to run w/a single reducer on
>> >>>> >>> the cluster (i.e., via groupByKey(1)), does it work?
>> >>>> >>>
>> >>>> >>>>
>> >>>> >>>>
>> >>>> >>>> - Gabriel
>> >>>> >>>
>> >>>> >>>
>> >>>> >>
>> >>>> >
>> >>>
>> >>>
>> >>
>> >
>
>
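For reference, the two debugging suggestions in the thread -- forcing a single reducer via groupByKey(1), and swapping the BloomFilterJoinStrategy for the default reduce-side join -- look roughly like this in Crunch. This is a sketch only: the table names (reads, mates), the String value types, and the surrounding class are hypothetical stand-ins for the FilterReads code, and it assumes the Crunch JoinStrategy API is on the classpath.

```java
import org.apache.crunch.PGroupedTable;
import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.lib.join.DefaultJoinStrategy;
import org.apache.crunch.lib.join.JoinStrategy;
import org.apache.crunch.lib.join.JoinType;

public class JoinDebugSketch {
  // reads/mates are hypothetical PTables standing in for the real data.
  static <K> void sketch(PTable<K, String> reads, PTable<K, String> mates) {
    // Josh's suggestion: force the whole shuffle through a single reducer.
    PGroupedTable<K, String> grouped = reads.groupByKey(1);

    // Gabriel's suggestion: use the default reduce-side join instead of
    // BloomFilterJoinStrategy, to rule the bloom-filter path out as the
    // cause of the map-less job.
    JoinStrategy<K, String, String> strategy =
        new DefaultJoinStrategy<K, String, String>();
    PTable<K, Pair<String, String>> joined =
        strategy.join(reads, mates, JoinType.INNER_JOIN);
  }
}
```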

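Until the CRUNCH-371 patch is applied, the short-term workaround above (ensure the PCollection being written isn't empty) can be sketched as follows. The Crunch-specific calls are shown only in comments, since they need the Crunch/Hadoop jars and a running pipeline; the names (table, the output path) are hypothetical, and note that counting via PCollection#length() forces an extra materialization, so this is strictly a stopgap.

```java
// Minimal sketch of the workaround: skip the AvroPathPerKeyTarget write
// entirely when the output collection is empty.
public class GuardedWrite {

    // Decide whether it is safe to write, given a materialized record count.
    public static boolean shouldWrite(long recordCount) {
        return recordCount > 0;
    }

    public static void main(String[] args) {
        // In a real pipeline the count would come from Crunch, e.g.
        // (hypothetical names; length() triggers a materialization):
        //
        //   long count = table.length().getValue();
        //   if (shouldWrite(count)) {
        //       table.write(new AvroPathPerKeyTarget("/out/by-key"));
        //   }
        //
        System.out.println(shouldWrite(0));   // empty output: skip the write
        System.out.println(shouldWrite(42));  // non-empty: safe to write
    }
}
```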