Thanks J-D

> Let's start with...
>
> Are you using HTable directly or are you going through TableOutputFormat?
I'm using 
http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/IdentityTableReducer.html

> If former, do you use the write buffer?
Not explicitly set by me

> Are you inserting into multiple families?
1 family

> Are you using compression?
LZO

> Did you take a look at the region server logs?
I am now ;)

> If so, do you see a lot of messages like "Blocking ..."?
Indeed:

memstore size 138.7m is >= than blocking 128.0m size
2010-11-24 17:12:49,136 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 4 on 60020' on region raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.: memstore size 138.7m is >= than blocking 128.0m size
2010-11-24 17:12:49,155 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 10 on 60020' on region raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.: memstore size 146.3m is >= than blocking 128.0m size
2010-11-24 17:12:49,169 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 5 on 60020' on region raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.: memstore size 148.8m is >= than blocking 128.0m size
2010-11-24 17:12:49,193 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region

I guess this is bad, but I could use some guidance on what to do about it...
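
In case it helps anyone reading along, here is a rough Python sketch for tallying these messages from a region server log. It assumes the message format is exactly what appears in the excerpt above; the function name summarize_blocking and the regular expression are my own and not part of HBase.

```python
import re
from collections import Counter

# Assumed message shape, taken only from the excerpt in this mail.
BLOCKING = re.compile(
    r"Blocking updates for '(?P<handler>[^']+)' on region\s+(?P<region>\S+?):\s+"
    r"memstore size (?P<size>[\d.]+)m is >= than blocking (?P<limit>[\d.]+)m size"
)


def summarize_blocking(log_text):
    """Count blocking messages per region and report the largest memstore size (MB)."""
    # Collapse all whitespace so entries that were wrapped across lines still match.
    flat = " ".join(log_text.split())
    per_region = Counter()
    peak_mb = 0.0
    for match in BLOCKING.finditer(flat):
        per_region[match.group("region")] += 1
        peak_mb = max(peak_mb, float(match.group("size")))
    return per_region, peak_mb
```

Calling summarize_blocking(open(path).read()) on a log file (path being wherever your region server log lives) gives a Counter keyed by region name plus the peak memstore size seen, which should show whether the blocking is concentrated on a single region.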

> Are you monitoring the GCs?
> If so, do you see some pauses longer than a second?

What's the best way to do this, please? Once I know, I will set it up.

> Thx!

Thank you J-D

Tim


>
> J-D
>
> On Wed, Nov 24, 2010 at 11:00 AM, Tim Robertson
> <[email protected]> wrote:
>> Hi all,
>>
>> I am running an MR job that is loading an HBase table in the reduce,
>> and I am seeing hopeless performance - 10 million records of <1Kb in 2
>> hours so far.
>>
>> Please bear in mind I am a software guy, so go easy ;) but here is what
>> I know so far:
>>
>> (http://code.google.com/p/gbif-occurrencestore/wiki/ClusterConfig
>> describes the cluster, and currently 40 reducers are running, all on
>> CDH3)
>>
>> - RS and TT all have load averages way down at 1-2 max
>> - RS and TT CPUs are 398% idle on quad cores, 1598% idle on hyper
>> threading dual quads
>> - RS heap is 4G
>> - there seems to be no iowait anywhere
>> - free -m shows "swap used 0" on all machines, if I am reading it correctly
>>
>> Can anyone please suggest where I can go digging?  Please don't assume
>> I have looked at the basics - I'm learning as much as I can as I go.
>>
>> Thanks,
>> Tim
>>
>
