On Fri, May 17, 2013 at 8:23 AM, Jeremy Carroll <[email protected]> wrote:
> Look at how much Hard Disk utilization you have (IOPS / Svctm). You may
> just be under scaled for the QPS you desire for both read + write load. If
> you are performing random gets, you could expect around the low to mid
> 100's IOPS/sec per HDD. Use bonnie++ / IOZone / IOPing to verify.
>
> Also you could see how efficient your cache is (Saving Disk IOPS).

Thanks for the tips, Jeremy. I have used bonnie++ to benchmark both the fast and slow servers, and the outputs are very similar. I haven't tried running bonnie++ while the load was high, but I can try that later today since I just restarted my load test; it takes a few hours before the performance starts degrading.

Regarding IOPS/svctm, I ran iostat for a while when performance was bad and saw that the tps was pretty spiky. I have a striped RAID0 across my 4 disks and see the tps hovering anywhere between 100 and 4000; each disk individually maxes out at about 1000 tps.

I also checked another region server that handles almost the same amount of data, but the rowkey on that box is 8 bytes bigger than on the slow box (fast server: rowkey is 24 bytes, cf is 1 byte, cq is 6 bytes, value ranges from 25 bytes to 1.5 KB). That box shows a max of about 200 tps, and the GETs sent to that regionserver finish 10K requests per second (not great, but acceptable).

Given that the region sizes are almost the same (off by about 300 MB), I am still not clear what else to debug. Maybe I can try splitting the region and see if that speeds things up, but I wanted to keep that as my last option.

Thanks,
Viral
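P.S. For reference, an iostat invocation along these lines shows the per-disk svctm/%util numbers Jeremy mentioned; the 2-second sampling interval is just an example:

  # Extended per-device stats (r/s, w/s, await, svctm, %util), sampled every 2 seconds.
  iostat -dx 2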
