On Fri, Mar 28, 2014 at 3:19 PM, Jeremy Lewi <[email protected]> wrote:
> No luck. I get the same error even when using a single reducer. I'm
> attaching the job configuration as shown in the web ui.
>
> When I look at the job tracker for the job, it has no map tasks. Is that
> expected? I've never heard of a reduce only job.
>

Nope, a job with no map tasks doesn't sound right to me. I noticed
that you're effectively doing a materialize at [1], and then
using a BloomFilterJoinStrategy. While this should work fine, I'm
thinking that it could also potentially lead to some issues such as
the one you're having (i.e. a job with no map tasks).

Could you try using the default join strategy there to see what
happens? I'm thinking that the AvroPathPerKeyTarget issue could just be
a consequence of something else going wrong earlier on.

[1] https://code.google.com/p/contrail-bio/source/browse/src/main/java/contrail/scaffolding/FilterReads.java?name=dev_read_filtering#156

>
> On Fri, Mar 28, 2014 at 6:45 AM, Jeremy Lewi <[email protected]> wrote:
>>
>> This is my first time on a cluster. I'll try what Josh suggests now.
>>
>> J
>>
>>
>> On Fri, Mar 28, 2014 at 3:41 AM, Josh Wills <[email protected]> wrote:
>>>
>>>
>>> On Fri, Mar 28, 2014 at 1:22 AM, Gabriel Reid <[email protected]>
>>> wrote:
>>>>
>>>> Hi Jeremy,
>>>>
>>>> On Thu, Mar 27, 2014 at 3:26 PM, Jeremy Lewi <[email protected]> wrote:
>>>> > Hi
>>>> >
>>>> > I'm hitting the exception pasted below when using
>>>> > AvroPathPerKeyTarget.
>>>> > Interestingly, my code works just fine when I run on a small dataset
>>>> > using
>>>> > the LocalJobTracker. However, when I run on a large dataset using a
>>>> > hadoop
>>>> > cluster I hit the exception.
>>>> >
>>>>
>>>> Have you ever been able to successfully use the AvroPathPerKeyTarget
>>>> on a real cluster, or is this the first try with it?
>>>>
>>>> I'm wondering if this could be a problem that's always been around (as
>>>> the integration test for AvroPathPerKeyTarget also runs in the local
>>>> jobtracker), or if this could be something new.
>>>
>>>
>>> +1-- Jeremy, if you force the job to run w/a single reducer on the
>>> cluster (i.e., via groupByKey(1)), does it work?
>>>
>>>>
>>>>
>>>> - Gabriel
>>>
>>>
>>
>
