My GC options: -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
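(For anyone following along: a sketch of how GC logging could be enabled alongside those flags, as JG suggests below. These are standard HotSpot options of this era; the log path is just an example, not what I actually use.)

```
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode \
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
-Xloggc:/tmp/regionserver-gc.log
```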
Yes, it is >8ms now for random read.
In my previous report, the random-read evaluation was started soon after the
sequential-write (from empty) evaluation. The result in that case is still
good.
We have Ganglia, but I cannot access it since I am accessing my cluster
remotely. And swap: vm.swappiness=20 is set to minimize swapping.
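(For reference, a sketch of how that swappiness setting is typically applied on a stock Linux box; needs root, and the sysctl.conf path is the conventional default.)

```
# apply immediately (requires root)
sysctl -w vm.swappiness=20
# persist across reboots
echo 'vm.swappiness = 20' >> /etc/sysctl.conf
```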
I just checked more things carefully with commands (top, free, iostat, etc.)
and found the following issue:
On my slave nodes (1-4), which have 4 CPU cores, 8GB RAM, and 2 SATA disks
each, the memory and the region server heap are both adequate, and the CPU is
not very busy. Everything seems OK.
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 4398 schubert 23  0 2488m 2.0g 9.9m S   38 26.2 59:06.21 java (the region server)
But I am surprised that node 5, which has 8 CPU cores, 4GB RAM, and 6 disks
in SATA-RAID1, has a problem.
avg-cpu:  %user %nice %system %iowait %steal %idle
           7.46  0.00    3.28   23.11   0.00 66.15

Device: rrqm/s wrqm/s    r/s   w/s   rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda      84.83  25.12 485.57  2.49 53649.75 220.90   110.38     9.20 18.85  2.04 99.53
dm-0      0.00   0.00   0.00 25.12     0.00 201.00     8.00     0.01  0.27  0.01  0.02
dm-1      0.00   0.00 570.90  2.49 53655.72  19.90    93.61    10.74 18.72  1.74 99.53
It seems the disk I/O is very busy.
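(As a quick sanity check, the saturated devices can be picked out of such a report mechanically. This is just an illustrative one-liner over the rows above, assuming the `iostat -x` column layout with %util as the last field; it is not part of my actual procedure.)

```shell
# Print devices whose %util (last column) exceeds 90; sample rows
# are copied from the iostat report above.
BUSY=$(awk '$NF+0 > 90 {print $1}' <<'EOF'
sda 84.83 25.12 485.57 2.49 53649.75 220.90 110.38 9.20 18.85 2.04 99.53
dm-0 0.00 0.00 0.00 25.12 0.00 201.00 8.00 0.01 0.27 0.01 0.02
dm-1 0.00 0.00 570.90 2.49 53655.72 19.90 93.61 10.74 18.72 1.74 99.53
EOF
)
echo "$BUSY"
```

Both sda and dm-1 are pinned near 100% utilization, which matches the high %iowait.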
And top:
  PID USER     PR NI  VIRT  RES SHR S %CPU %MEM    TIME+ COMMAND
21037 schubert 20  0 2628m 1.6g 10m S   48 41.3 71:24.74 java
I will check more when I get to the office in the morning.
On Wed, Aug 19, 2009 at 1:36 AM, stack <[email protected]> wrote:
> What do you have for GC config, Schubert? Now it's 8ms a random read?
> St.Ack
>
> On Tue, Aug 18, 2009 at 10:28 AM, Jonathan Gray <[email protected]> wrote:
>
> > Schubert,
> >
> > I can't think of any reason your random reads would get slower after
> > inserting more data, besides GC issues.
> >
> > Do you have GC logging and JVM metrics logging turned on? I would inspect
> > those to see if you have any long-running GC pauses, or just lots and lots
> > of GC going on.
> >
> > If I recall, you are running on 4GB nodes, 2GB RS heap, and cohosted
> > DataNodes and TaskTrackers. We ran for a long time on a similar setup, but
> > once we moved to 0.20 (and to the CMS garbage collector), we really needed
> > to add more memory to the nodes and increase RS heap to 4 or 5GB. The CMS
> > GC is less efficient in memory, but if given sufficient resources, is much
> > better for overall performance/throughput.
> >
> > Also, do you have Ganglia setup? Are you seeing swapping on your RS nodes?
> > Is there high IO-wait CPU usage?
> >
> > JG
> >
> >
> > Schubert Zhang wrote:
> >
> >> Addition.
> >> Only random-reads become very slow, scans and sequential-reads are ok.
> >>
> >>
> >> On Tue, Aug 18, 2009 at 6:02 PM, Schubert Zhang <[email protected]>
> >> wrote:
> >>
> >> stack and J-G, Thank you very much for your helpful comment.
> >>>
> >>> But now, we find a critical issue for random reads.
> >>> I use sequential-writes to insert 5GB of data into our HBase table from
> >>> empty, and ~30 regions are generated. The random-reads then take about
> >>> 30 minutes to complete. After that, I run the sequential-writes again,
> >>> so another version of each cell is inserted and ~60 regions are
> >>> generated. But when I ran the random-reads against this table again, it
> >>> always takes a long time (more than 2 hours).
> >>>
> >>> I checked the heap usage and other metrics, but did not find the reason.
> >>>
> >>> Below is the status of one region server:
> >>> request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2,
> >>> memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080,
> >>> blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55
> >>>
> >>> Schubert
> >>>
> >>>
> >>> On Tue, Aug 18, 2009 at 5:02 AM, Schubert Zhang <[email protected]>
> >>> wrote:
> >>>
> >>> We have just done a Performance Evaluation on HBase-0.20.0.
> >>>> Refer to:
> >>>>
> >>>>
> >>>> http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html
> >>>>
> >>>>
> >>>
> >>
>