We have extended the implementation of MR3 so that all partition
inputs can be fetched with a single call, e.g.:
rssShuffleClient.readPartition(..., 0, 100)
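For reference, a minimal reducer-side sketch in Java (shuffleId, partitionId,
and attemptNumber are placeholder names, and the exact readPartition()
signature varies across Celeborn versions, so treat this as an illustration
rather than the actual MR3 code):

    import java.io.IOException;
    import java.io.InputStream;

    // One call covers the map index range [0, 100), i.e. the outputs of
    // all 100 mappers for this reducer's partition.
    InputStream fetchWholePartition(int shuffleId, int partitionId,
                                    int attemptNumber) throws IOException {
      return rssShuffleClient.readPartition(
          shuffleId, partitionId, attemptNumber,
          /* startMapIndex */ 0, /* endMapIndex */ 100);
    }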
Now, Hive-MR3 with Celeborn runs as fast as Hive-MR3 with its own shuffle
handlers when tested with the 10TB TPC-DS benchmark. For some queries, it is
even noticeably faster.
Thanks,
--- Sungwoo
On Thu, 13 Jul 2023, [email protected] wrote:
Hi Team,
I have a question about how a reducer should fetch the outputs of mappers.
As an example, consider this standard scenario:
1. There are 100 mappers and 50 reducers.
2. Each mapper creates 50 partitions, each of which is to be fetched by the
corresponding reducer.
3. Each reducer is responsible for a single partition and tries to fetch 100
pieces of that partition (one from each mapper), as sketched below.
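To make the indices explicit, the fetch plan implied by this scenario is
roughly the following (an illustrative sketch, assuming the usual convention
that reducer r owns partition r):

    // Each mapper writes one piece per partition, so 100 x 50 = 5000
    // pieces exist in total; each reducer gathers 100 of them.
    int numMappers = 100;
    int numReducers = 50;
    for (int r = 0; r < numReducers; r++) {
      for (int m = 0; m < numMappers; m++) {
        System.out.println("reducer " + r + " fetches piece (mapIndex=" + m
            + ", partitionId=" + r + ")");
      }
    }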
In our current implementation, a reducer calls
rssShuffleClient.readPartition() 100 times (once for each mapper):
rssShuffleClient.readPartition(..., mapIndex, mapIndex + 1)
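In Java, the loop looks roughly like this (shuffleId, partitionId,
attemptNumber, and merge() are placeholder names, not our actual code):

    import java.io.IOException;
    import java.io.InputStream;

    // 100 separate fetches: each call covers exactly one mapper's output,
    // i.e. the map index range [mapIndex, mapIndex + 1).
    for (int mapIndex = 0; mapIndex < 100; mapIndex++) {
      try (InputStream in = rssShuffleClient.readPartition(
               shuffleId, partitionId, attemptNumber,
               mapIndex, mapIndex + 1)) {
        merge(in);  // feed this mapper's piece to the reducer-side merge
      }
    }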
My question is: if reducers start only after all mappers have completed, can
we call (or should we try to call) rssShuffleClient.readPartition() just once,
as in the following?
rssShuffleClient.readPartition(..., 0, 100)
My understanding of remote shuffle services (like Magnet for Spark) is that
all the partition data destined for the same reducer is automatically merged
by the shuffle service, so we thought that a single call might be enough.
Thanks,
--- Sungwoo Park