Re: Job config before read fields

Shahab Yunus Fri, 30 Aug 2013 19:43:20 -0700

What I meant was that you might have to split or redesign your logic or
your use case (which we don't know about).


Regards,
Shahab


On Fri, Aug 30, 2013 at 10:31 PM, Adrian CAPDEFIER
<[email protected]> wrote:

> But how would the comparator have access to the job config?
>
>
> On Sat, Aug 31, 2013 at 2:38 AM, Shahab Yunus <[email protected]> wrote:
>
>> I think you have to override/extend the Comparator to achieve that,
>> something like what is done in Secondary Sort?
>>
>> Regards,
>> Shahab
>>
>>
>> On Fri, Aug 30, 2013 at 9:01 PM, Adrian CAPDEFIER <[email protected]> wrote:
>>
>>> Howdy,
>>>
>>> I apologise for the lack of code in this message, but the code is fairly
>>> convoluted and it would obscure my problem. That being said, I can put
>>> together some sample code if really needed.
>>>
>>> I am trying to pass some metadata between the map & reduce steps. This
>>> metadata is read and generated in the map step and stored in the job
>>> config. It also needs to be recreated on the reduce node before the
>>> key/value fields can be read in the readFields function.
>>>
>>> I had assumed that I would be able to override the Reducer.setup()
>>> function and that would be it, but apparently the readFields function is
>>> called before the Reducer.setup() function.
>>>
>>> My question is: what is the best (or any) place on the reduce node where I
>>> can access the job configuration/context before the readFields function is
>>> called?
>>>
>>> This is the stack trace:
>>>
>>>         at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1111)
>>>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:70)
>>>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1399)
>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
>>>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:699)
>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>
>>>
>>
>
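For what it's worth, the comparator idea discussed above can be sketched without any Hadoop dependency. In Hadoop, the relevant hook is org.apache.hadoop.conf.Configurable: ReflectionUtils.newInstance(Class, Configuration) calls setConf() on any object that implements it. Whether your comparator and keys are created that way depends on the Hadoop version and is worth verifying. Everything in the sketch below is a hypothetical stand-in and not Hadoop API: ConfigAware plays the role of Configurable, a Map plays the role of the job Configuration, and MetaKey, MetaKeyComparator and the property name sketch.key.width are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

// Dependency-free sketch: none of these types are Hadoop classes.
final class ConfigAwareCompareSketch {

    /** Stand-in for Hadoop's Configurable: the framework hands over the config. */
    interface ConfigAware {
        void setConf(Map<String, String> conf);
    }

    /** Hypothetical key whose wire format depends on metadata kept in the config. */
    static final class MetaKey implements ConfigAware {
        static final String WIDTH = "sketch.key.width"; // hypothetical property name
        private int width = -1;                          // unknown until setConf runs
        long[] fields = new long[0];

        @Override
        public void setConf(Map<String, String> conf) {
            width = Integer.parseInt(conf.get(WIDTH));
        }

        void write(DataOutput out) throws IOException {
            for (long f : fields) {
                out.writeLong(f);
            }
        }

        void readFields(DataInput in) throws IOException {
            if (width < 0) {
                throw new IllegalStateException("metadata not set before readFields");
            }
            fields = new long[width];
            for (int i = 0; i < width; i++) {
                fields[i] = in.readLong();
            }
        }
    }

    /** Comparator that is given the config and configures every key it creates. */
    static final class MetaKeyComparator implements ConfigAware {
        private Map<String, String> conf;

        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        int compare(byte[] a, byte[] b) {
            MetaKey ka = newKey(a);
            MetaKey kb = newKey(b);
            for (int i = 0; i < ka.fields.length; i++) {
                int c = Long.compare(ka.fields[i], kb.fields[i]);
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        }

        private MetaKey newKey(byte[] raw) {
            MetaKey k = new MetaKey();
            k.setConf(conf); // configure the key BEFORE deserializing it
            try {
                k.readFields(new DataInputStream(new ByteArrayInputStream(raw)));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return k;
        }
    }

    /** Serializes the given fields the way MetaKey.write would. */
    static byte[] serialize(long... fields) {
        MetaKey k = new MetaKey();
        k.fields = fields;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            k.write(new DataOutputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        MetaKeyComparator cmp = new MetaKeyComparator();
        cmp.setConf(Map.of(MetaKey.WIDTH, "2")); // what the framework would do
        boolean before = cmp.compare(serialize(1L, 5L), serialize(1L, 7L)) < 0;
        System.out.println("a sorts before b: " + before);
    }
}
```

The point of the sketch is the ordering inside newKey(): the comparator is the first object to receive the configuration, so it can pass the metadata to each key before readFields runs. That does not depend on Reducer.setup(), which matters because the stack trace in the original question shows the comparison running during the map-side sort (MapTask$MapOutputBuffer.sortAndSpill).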
