Aayush, out of curiosity, why do you want to model wordcount this way?
What benefit do you see?

Norbert

On 4/6/09, Aayush Garg <aayush.g...@gmail.com> wrote:
> Hi,
>
>  I want to experiment with the wordcount example in a different way.
>
>  Suppose we have a very large data set. Instead of splitting all the data at
>  once, we want to feed a few splits into the map-reduce job at a time. I want
>  to model the Hadoop job like this:
>
>  Suppose a batch of input splits arrives at the start and is handed to the
>  maps, and the reduce emits the (word, frequency) pairs for that batch of
>  input splits.
>  Then another batch of input splits arrives, and the results of its reduce
>  are aggregated with the previous results (if the word "that" had frequency 2
>  after the previous round and occurs 1 time in this round, its frequency is
>  now maintained as 3). If "that" occurs 4 times in the next map-reduce, its
>  frequency is maintained as 7, and so on.
>
>  This process goes on indefinitely.
>  How would I model the input splits this way, and how can these successive
>  map-reduce jobs be kept running? In what form should I keep the output of
>  each map-reduce so that I can aggregate it with the output of the next one?
>
>  Thanks,
>
> Aayush
>
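
For what it's worth, here is a minimal sketch of one way to do what Aayush
describes with the old mapred API: run one job per batch, and feed the
previous job's output directory in as a second input via MultipleInputs, so
the reducer folds the old totals into the new counts. This is untested sketch
code under those assumptions, and all class names are made up for
illustration, not anything from Hadoop's examples.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.mapred.lib.MultipleInputs;

    public class IncrementalWordCount {

      // Mapper for the new batch of raw text: emits (word, 1).
      public static class TokenMapper extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, LongWritable> {
        private final LongWritable ONE = new LongWritable(1);
        private final Text word = new Text();
        public void map(LongWritable key, Text value,
            OutputCollector<Text, LongWritable> out, Reporter reporter)
            throws IOException {
          for (String tok : value.toString().split("\\s+")) {
            if (tok.length() > 0) { word.set(tok); out.collect(word, ONE); }
          }
        }
      }

      // Mapper for the previous round's output ("word<TAB>count" lines, the
      // default TextOutputFormat layout): re-emits (word, count) so the
      // reducer adds the old total back in.
      public static class PreviousTotalsMapper extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, LongWritable> {
        public void map(LongWritable key, Text value,
            OutputCollector<Text, LongWritable> out, Reporter reporter)
            throws IOException {
          String[] parts = value.toString().split("\t");
          if (parts.length == 2) {
            out.collect(new Text(parts[0]),
                new LongWritable(Long.parseLong(parts[1])));
          }
        }
      }

      // Sums ones from the new batch plus the carried-over total.
      public static class SumReducer extends MapReduceBase
          implements Reducer<Text, LongWritable, Text, LongWritable> {
        public void reduce(Text key, Iterator<LongWritable> values,
            OutputCollector<Text, LongWritable> out, Reporter reporter)
            throws IOException {
          long sum = 0;
          while (values.hasNext()) sum += values.next().get();
          out.collect(key, new LongWritable(sum));
        }
      }

      public static void main(String[] args) throws IOException {
        // Usage: IncrementalWordCount <new-batch-dir> [<prev-totals-dir>] <out-dir>
        JobConf conf = new JobConf(IncrementalWordCount.class);
        conf.setJobName("incremental-wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setCombinerClass(SumReducer.class);
        conf.setReducerClass(SumReducer.class);

        // The new batch of splits to count.
        MultipleInputs.addInputPath(conf, new Path(args[0]),
            TextInputFormat.class, TokenMapper.class);
        // On every round after the first, also read the previous totals.
        if (args.length == 3) {
          MultipleInputs.addInputPath(conf, new Path(args[1]),
              TextInputFormat.class, PreviousTotalsMapper.class);
        }
        FileOutputFormat.setOutputPath(conf, new Path(args[args.length - 1]));
        JobClient.runJob(conf);
      }
    }

Each round writes its totals to a fresh directory, and the next round reads
the new batch plus that directory, so the running counts accumulate exactly
as in the example ("that": 2 + 1 = 3, then + 4 = 7). A driver script or loop
decides when each batch is ready and launches the next job.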
