That design is fine.
You should read your map in the configure method of the reducer.
There is a MapFile format supported by Hadoop, but MapFiles tend to be pretty
slow. I usually find it better to just load my hash table by hand. If you
do that, you can use whatever format you like.
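To make that concrete, here is a rough sketch of a reducer that loads a
hand-written table in configure() and uses it to drop keys, written against
the org.apache.hadoop.mapred API of that era. The specifics are all
illustrative assumptions, not anything from your job: the FilteringReducer
name, the "filter.file" property, the key<TAB>count line format, and the
Text/LongWritable types.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class FilteringReducer extends MapReduceBase
    implements Reducer<Text, LongWritable, Text, LongWritable> {

  // The hand-loaded table: key -> frequency from the previous job.
  private final Map<String, Long> table = new HashMap<String, Long>();

  // configure() runs once per task, before any call to reduce(),
  // so the table is complete by the time records arrive.
  public void configure(JobConf job) {
    // "filter.file" and the "top.keys" default are made-up names.
    Path side = new Path(job.get("filter.file", "top.keys"));
    try {
      FileSystem fs = FileSystem.get(job);
      BufferedReader in =
          new BufferedReader(new InputStreamReader(fs.open(side)));
      String line;
      while ((line = in.readLine()) != null) {
        // Assumed layout: one "key<TAB>count" pair per line.
        String[] parts = line.split("\t");
        table.put(parts[0], Long.valueOf(parts[1]));
      }
      in.close();
    } catch (IOException e) {
      throw new RuntimeException("cannot load " + side, e);
    }
  }

  public void reduce(Text key, Iterator<LongWritable> values,
      OutputCollector<Text, LongWritable> output, Reporter reporter)
      throws IOException {
    // Drop any key that made it into the table; pass the rest through.
    if (table.containsKey(key.toString())) {
      return;
    }
    long sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new LongWritable(sum));
  }
}

Writing that file from your first job's run() can be just as plain: open a
stream with fs.create() on the same path and print one key, a tab, and the
count per line. Since you parse the file yourself in configure(), any line
format you can read back will work.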
On 4/16/08 12:41 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> The current structure of my program is:
>
> public class Upper {
>   static class Reduce extends MapReduceBase
>       implements Reducer<K1, V1, K2, V2> {
>     public void reduce(K1 key, Iterator<V1> values,
>         OutputCollector<K2, V2> output, Reporter reporter) {
>       // I count the frequency of each key.
>       // I add the output to a HashMap(key, value) instead of calling
>       // output.collect().
>     }
>   }
>
>   void run() {
>     runJob();
>     // Now eliminate the top-frequency keys from the HashMap built in the
>     // reduce function here, because only now is the HashMap complete.
>     // Then write this HashMap to a file in such a format that I can use
>     // it in the next MapReduce job, where the keys of this HashMap are
>     // taken as the keys in that job's mapper function. How and which
>     // format should I choose? Is this design and approach OK?
>   }
>
>   public static void main(String[] args) {}
> }
> I hope that makes my question clear.
>
> Thanks,
>
>
> On Wed, Apr 16, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:
>
>> Aayush Garg wrote:
>>
>>> Hi,
>>>
>>> Are you sure that another MR job is required for eliminating some rows?
>>> Can't I just somehow eliminate them from main() once I know which keys
>>> need to be removed?
>>>
>> Can you provide some more details on how exactly you are filtering?
>> Amar
>>