Yup, exactly. A MapReduce is equivalent to a ParDo-GroupByKey-ParDo.
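
To make the equivalence concrete, here is a minimal sketch in plain Java. It deliberately does not use the Beam SDK: the class MapReduceSketch and its methods mapStage, groupByKey, and reduceStage are hypothetical names invented for illustration. They only model the shape of the three stages (per-element ParDo, GroupByKey, ParDo over each key's Iterable) on in-memory collections.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; these are not Beam classes or Beam APIs.
class MapReduceSketch {

    // "Mapper" stage, modeled on the first ParDo: each input element
    // independently emits zero or more (key, value) pairs.
    static List<Map.Entry<String, Integer>> mapStage(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(new SimpleEntry<>(word, 1));
                }
            }
        }
        return out;
    }

    // Shuffle stage, modeled on GroupByKey: collect all values per key,
    // giving the KV<key, Iterable<value>> shape mentioned in the thread.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reducer" stage, modeled on the second ParDo: invoked once per key
    // with that key's grouped values.
    static Map<String, Integer> reduceStage(Map<String, List<Integer>> grouped) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            out.put(entry.getKey(), sum);
        }
        return out;
    }

    // The full pipeline: ParDo -> GroupByKey -> ParDo.
    static Map<String, Integer> wordCount(List<String> lines) {
        return reduceStage(groupByKey(mapStage(lines)));
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(wordCount(Arrays.asList("a b a", "b c")));
    }
}
```

The sketch sorts keys with a TreeMap only to make the printed output deterministic; it is not meant to imply that GroupByKey sorts anything.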

On Tue, Jun 7, 2016 at 9:55 AM, Jesse Anderson <[email protected]>
wrote:

> Pawel,
>
> I think I understand it better now. When a ParDo runs after a GroupByKey,
> its DoFn does what the reducer would do in a MapReduce-style job.
>
> Thanks,
>
> Jesse
>
> On Tue, Jun 7, 2016 at 12:17 PM Pawel Szczur <[email protected]>
> wrote:
>
>> There's no shuffle sort (at least none documented), but elements may be
>> shuffled, e.g. when one uses GroupByKey. In that case, the result is:
>> PCollection<KV<?,Iterable<?>>>
>> If you apply a DoFn to it, it's similar to running a reducer.
>>
>> 2016-06-07 18:10 GMT+02:00 Jesse Anderson <[email protected]>:
>>
>>> I have a question about the ParDo JavaDocs. The JavaDocs say:
>>> The ParDo processing style is similar to what happens inside the
>>> "Mapper" or "Reducer" class of a MapReduce-style algorithm.
>>>
>>> I think a ParDo's DoFn is only what a Mapper class would do. A DoFn
>>> doesn't seem to run after a shuffle sort like a reducer does. Is my
>>> understanding correct?
>>>
>>> Thanks,
>>>
>>> Jesse
>>>
>>
>>
