It's not just a matter of transferring the data from the reducer to
the region server; you also have to take into account that the data is
then replicated to other nodes.

So in a suboptimal setup you have:

Reducer  -> Network -> RegionServer -> Local Datanode -> Network ->
Remote Datanode1 -> Network -> Remote Datanode2

What you are trying to get is:

Reducer  -> Local RegionServer -> Local Datanode -> Network -> Remote
Datanode1 -> Network -> Remote Datanode2

Subsequent flushes of the inserted data will also follow the latter
pattern. That's what I meant earlier when I said the gain would be
marginal: you're only saving one network trip among many others. Also,
I took a look at the JobTracker code and modifying it doesn't look
easy.

Instead, since you already use the HRegionPartitioner, why don't you
do an incremental bulk load? http://hbase.apache.org/bulk-loads.html
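For reference, a minimal sketch of what that could look like with the
0.90-era MapReduce APIs. The table name, mapper class, and paths here are
placeholders, not anything from your setup:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hfile-prepare");
    job.setJarByClass(BulkLoadSketch.class);
    job.setMapperClass(MyPutMapper.class);  // placeholder mapper emitting Puts
    FileInputFormat.addInputPath(job, new Path("/input"));
    FileOutputFormat.setOutputPath(job, new Path("/hfiles"));

    HTable table = new HTable(conf, "my_table");  // placeholder table name
    // Sets up the partitioner, reducer and output format so the job
    // writes one set of sorted HFiles per region.
    HFileOutputFormat.configureIncrementalLoad(job, table);

    if (job.waitForCompletion(true)) {
      // Hands the generated HFiles to the region servers wholesale,
      // instead of streaming every Put over the network.
      new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/hfiles"), table);
    }
  }
}
```

The write-side traffic becomes a file handoff rather than per-Put RPCs,
which is why it sidesteps the locality question entirely.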

J-D

On Wed, Apr 13, 2011 at 7:49 AM, Biedermann,S.,Fa. Post Direkt
<[email protected]> wrote:
> Hi Jean-Daniel,
>
> thx for your reply.
>
> What I assume is that the total network load during reduce is O(n) with n the 
> number of nodes in the cluster. We saw a major performance loss in the reduce 
> step when our network degraded to 100Mbit by accident (1h vs. 13 minutes).
>
> With more nodes I see 2 options:
>
> 1) using switches with a higher switching capacity
> 2) improve hbase/hadoop's assignment of reduce task to those nodes which 
> serve the corresponding hbase regions.
>
> What do you think?
>
> Sven
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of 
> Jean-Daniel Cryans
> Sent: Friday, April 8, 2011 18:04
> To: [email protected]
> Subject: Re: data locality for reducer writes?
>
> Unfortunately it seems that there's nothing in the OutputFormat
> interface that we could implement (like getSplits in the InputFormat)
> to inform the JobTracker of the location of the regions. It kinda makes
> sense, since when you're writing to HDFS in a "normal" MR job you
> always write to the local DataNode (well, if there is one), but even
> then it is replicated to two other nodes. IMO even if we had that, the
> gain would be marginal.
>
> J-D
>
> On Fri, Apr 8, 2011 at 4:18 AM, Biedermann,S.,Fa. Post Direkt
> <[email protected]> wrote:
>> Hi,
>>
>>
>>
>> we have a number of Reducer tasks, each writing a bunch of rows into the
>> latest HBase via Puts.
>>
>> What is working is that each Reducer only creates Puts for one single
>> Region, by using the HRegionPartitioner.
>>
>>
>>
>> However, we are seeing that the Region flush itself is not local, but
>> going to some other node in the cluster. This puts load on the network.
>>
>> We'd like to see that instead the Reducer would be run on the same node
>> where the region is served.
>>
>>
>>
>> Is that possible?
>>
>> Any ideas or suggestions?
>>
>>
>>
>> Sven
>>
>>
>
