Pawel, I think I understand it better now. When a ParDo runs after a GroupByKey, its DoFn does what the reducer would do in a MapReduce-style job.

Thanks,

Jesse

On Tue, Jun 7, 2016 at 12:17 PM Pawel Szczur <[email protected]> wrote:

> There's no shuffle sort (at least none documented), but there may be a
> shuffle, e.g. when one uses GroupByKey. In that case, the result is:
> PCollection<KV<?, Iterable<?>>>
> If you apply a DoFn to it, it's similar to running a reducer.
>
> 2016-06-07 18:10 GMT+02:00 Jesse Anderson <[email protected]>:
>
>> I have a question about the ParDo JavaDocs. The JavaDocs say:
>> The ParDo processing style is similar to what happens inside the "Mapper"
>> or "Reducer" class of a MapReduce-style algorithm.
>>
>> I think a ParDo's DoFn is only what a Mapper class would do. A DoFn
>> doesn't seem to run after a shuffle sort like a reducer does. Is my
>> understanding correct?
>>
>> Thanks,
>>
>> Jesse
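
To make the point from this thread concrete, here is a rough sketch in plain Java. It deliberately does not use the SDK: mapStep, groupByKey, reduceStep and countWords are hypothetical names made up for this illustration, and the in-memory maps only stand in for a PCollection. The idea it tries to show is that the same kind of per-element function plays the "Mapper" role before the grouping and the "Reducer" role after it, where each element is a key with all of its values.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names come from the SDK.
class ReducerStyleSketch {

    // Mapper-style step: each input line yields zero or more (word, 1) pairs,
    // much like a DoFn applied before any grouping.
    public static List<Map.Entry<String, Integer>> mapStep(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
                }
            }
        }
        return pairs;
    }

    // Stand-in for GroupByKey: collects all values that share a key.
    // A TreeMap is used only so the printed result is deterministic; this
    // sketch makes no claim about ordering in a real pipeline.
    public static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer-style step: each element is now a key plus all of its values,
    // analogous to a DoFn that receives KV<K, Iterable<V>> after GroupByKey.
    public static Map<String, Integer> reduceStep(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    // Chains the three steps: map, group, reduce.
    public static Map<String, Integer> countWords(List<String> lines) {
        return reduceStep(groupByKey(mapStep(lines)));
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(countWords(Arrays.asList("a b a", "b c")));
    }
}
```

In a real pipeline the first and last steps would each be a ParDo with its own DoFn, and the middle step would be the GroupByKey transform; only the position relative to the grouping decides which MapReduce role the DoFn resembles.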
