Unfortunately that didn't work. I still have a reduce-only job. Here's a link to the console output in case that's helpful: https://drive.google.com/a/lewi.us/file/d/0B6ngy4MCihWwcy1sdE9DQ2hiYnc/edit?usp=sharing
I'm currently ungrouping my records before writing them (an earlier attempt to fix this issue). I'm trying without the ungroup now.

J

On Fri, Mar 28, 2014 at 10:08 AM, Jeremy Lewi <[email protected]> wrote:
> Unfortunately that didn't work. I still have a reduce-only job. I'm
> attaching the console output from when I run my job in case that's helpful.
> I'm currently ungrouping my records before writing them (an earlier
> attempt to fix this). I'm trying to undo that.
>
> J
>
> On Fri, Mar 28, 2014 at 9:51 AM, Jeremy Lewi <[email protected]> wrote:
>
>> Thanks Gabriel, I'll give that a try now. I was actually planning on
>> making that change once I realized that my current strategy was forcing
>> me to materialize data early on.
>>
>> On Fri, Mar 28, 2014 at 7:44 AM, Gabriel Reid <[email protected]> wrote:
>>
>>> On Fri, Mar 28, 2014 at 3:19 PM, Jeremy Lewi <[email protected]> wrote:
>>> > No luck. I get the same error even when using a single reducer. I'm
>>> > attaching the job configuration as shown in the web UI.
>>> >
>>> > When I look at the job tracker for the job, it has no map tasks. Is
>>> > that expected? I've never heard of a reduce-only job.
>>>
>>> Nope, a job with no map tasks doesn't sound right to me. I noticed
>>> that you're effectively doing a materialize at [1], and then
>>> using a BloomFilterJoinStrategy. While this should work fine, I'm
>>> thinking that it could also potentially lead to some issues such as
>>> the one you're having (i.e. a job with no map tasks).
>>>
>>> Could you try using the default join strategy there to see what
>>> happens? I'm thinking that the AvroPathPerKeyTarget issue could just
>>> be a consequence of something else going wrong earlier on.
>>>
>>> 1. https://code.google.com/p/contrail-bio/source/browse/src/main/java/contrail/scaffolding/FilterReads.java?name=dev_read_filtering#156
>>>
>>> > On Fri, Mar 28, 2014 at 6:45 AM, Jeremy Lewi <[email protected]> wrote:
>>> >>
>>> >> This is my first time using it on a cluster. I'll try what Josh
>>> >> suggests now.
>>> >>
>>> >> J
>>> >>
>>> >> On Fri, Mar 28, 2014 at 3:41 AM, Josh Wills <[email protected]> wrote:
>>> >>>
>>> >>> On Fri, Mar 28, 2014 at 1:22 AM, Gabriel Reid <[email protected]> wrote:
>>> >>>>
>>> >>>> Hi Jeremy,
>>> >>>>
>>> >>>> On Thu, Mar 27, 2014 at 3:26 PM, Jeremy Lewi <[email protected]> wrote:
>>> >>>> > Hi
>>> >>>> >
>>> >>>> > I'm hitting the exception pasted below when using
>>> >>>> > AvroPathPerKeyTarget.
>>> >>>> > Interestingly, my code works just fine when I run on a small
>>> >>>> > dataset using the LocalJobTracker. However, when I run on a
>>> >>>> > large dataset using a Hadoop cluster I hit the exception.
>>> >>>>
>>> >>>> Have you ever been able to successfully use the AvroPathPerKeyTarget
>>> >>>> on a real cluster, or is this the first try with it?
>>> >>>>
>>> >>>> I'm wondering if this could be a problem that's always been around
>>> >>>> (as the integration test for AvroPathPerKeyTarget also runs in the
>>> >>>> local jobtracker), or if this could be something new.
>>> >>>
>>> >>> +1 -- Jeremy, if you force the job to run w/ a single reducer on the
>>> >>> cluster (i.e., via groupByKey(1)), does it work?
>>> >>>
>>> >>>> - Gabriel
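For anyone following along: Gabriel's suggestion amounts to swapping the BloomFilterJoinStrategy in FilterReads for Crunch's plain reduce-side join. Below is a minimal sketch of what that change might look like, assuming a generic inner join; the class name JoinSketch and the method joinWithDefaultStrategy are hypothetical stand-ins, not names from the actual code.

import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.lib.join.DefaultJoinStrategy;
import org.apache.crunch.lib.join.JoinStrategy;
import org.apache.crunch.lib.join.JoinType;

public class JoinSketch {
  public static <K, U, V> PTable<K, Pair<U, V>> joinWithDefaultStrategy(
      PTable<K, U> left, PTable<K, V> right) {
    // Instead of new BloomFilterJoinStrategy<K, U, V>(...), fall back to
    // the plain reduce-side join, which does not need to materialize one
    // side of the join up front in order to build a bloom filter.
    JoinStrategy<K, U, V> strategy = new DefaultJoinStrategy<K, U, V>();
    return strategy.join(left, right, JoinType.INNER_JOIN);
  }
}

The trade-off is that the default strategy shuffles both sides of the join, whereas the bloom-filter strategy prefilters one side; for isolating the missing-map-task problem, the simpler plan is exactly the point.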

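Josh's single-reducer test from earlier in the thread is likewise a one-line change. A minimal sketch, assuming hypothetical key and value types and a hypothetical wrapper method; only the groupByKey(1) call itself comes from the thread.

import org.apache.crunch.PGroupedTable;
import org.apache.crunch.PTable;

public class SingleReducerCheck {
  // Force the entire shuffle through one reduce task. If the
  // AvroPathPerKeyTarget failure disappears with a single reducer, the
  // problem likely lies in partitioning rather than in the target itself.
  public static <K, V> PGroupedTable<K, V> forceSingleReducer(PTable<K, V> table) {
    return table.groupByKey(1);
  }
}

As Jeremy reports above, the job still showed no map tasks even with a single reducer, which points back to how the pipeline plan is being constructed rather than to the reducer count.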