I'm running some YCSB tests and am seeing a performance loss during the "load" phase when I enable LZO compression on the table. At times the insert rate drops to 0 operations per second.
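
For reference, the table was created through the HBase shell with LZO set on the column family, roughly like this (the family name 'family' is just a placeholder for whatever YCSB is configured to write to):

  create 'usertable', {NAME => 'family', COMPRESSION => 'LZO'}

The non-LZO runs use the same table with COMPRESSION left at its default (NONE).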

The drop in ops/sec is caused by:
16:17:51,410 INFO HRegion: Blocking updates for 'IPC Server handler 72 on 60020' on region usertable,user1754755504,1301436104287.779d91282740038221bca426493d9ae0.: memstore size 192.0m is >= than blocking 192.0m size
16:17:51,808 INFO HRegion: Blocking updates for 'IPC Server handler 80 on 60020' on region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7.: memstore size 192.2m is >= than blocking 192.0m size
16:17:51,909 INFO HRegion: Blocking updates for 'IPC Server handler 39 on 60020' on region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7.: memstore size 192.2m is >= than blocking 192.0m size
16:17:51,983 INFO HRegion: Blocking updates for 'IPC Server handler 10 on 60020' on region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7.: memstore size 192.2m is >= than blocking 192.0m size
16:17:52,127 INFO HRegion: Blocking updates for 'IPC Server handler 34 on 60020' on region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7.: memstore size 192.2m is >= than blocking 192.0m size
16:17:52,223 INFO HRegion: Blocking updates for 'IPC Server handler 61 on 60020' on region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7.: memstore size 192.2m is >= than blocking 192.0m size
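
(For context, the 192.0m blocking threshold in those messages is just the flush size times my block multiplier; I haven't overridden hbase.hregion.memstore.flush.size, so assuming its 64m default:

  hbase.hregion.memstore.flush.size        = 64m   (default)
  hbase.hregion.memstore.block.multiplier  = 3     (set in my config below)
  blocking threshold = 64m * 3             = 192m

so a region blocks writes whenever flushing can't keep its memstore under 3x the flush size.)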

Over the same period I see:
16:17:50,116 INFO HRegion: Finished memstore flush of ~199.9m for region usertable,user1020000000,1301436193579.b2ad4a4b18f02a0d4c104dfd445cb81c. in 14882ms, sequenceid=434372473, compaction requested=false
16:17:50,277 INFO HRegion: Finished memstore flush of ~5.5k for region response-index,R_cwrJ0DhNp9u5lgo,1298774642178.043cabc33b864e030253cd692a608792. in 161ms, sequenceid=434372741, compaction requested=true
16:18:04,439 INFO HRegion: Finished memstore flush of ~177.1m for region usertable,user1264540941,1301436193579.97d94c6fed18aa0c35afcc4e4e3505d7. in 14161ms, sequenceid=434372759, compaction requested=true
16:18:18,289 INFO HRegion: Finished memstore flush of ~192.0m for region usertable,user1754755504,1301436104287.779d91282740038221bca426493d9ae0. in 13850ms, sequenceid=434373121, compaction requested=true
16:18:39,521 INFO HRegion: Finished memstore flush of ~168.5m for region usertable,user150997537,1301436104287.15f426db2efc810aa23ab753fa7b8965. in 21228ms, sequenceid=434373508, compaction requested=false

It looks like the memstore flushes are taking a long time with LZO enabled. For comparison, here are similar memstore flushes with and without LZO:

With LZO:
15:55:45,385 INFO HRegion: Finished memstore flush of ~64.1m for region usertable,user9\x09p:\xDA\xDA\xDA\xDA\xDA\x9C,1301435595000.9c47aed1f43b745ade509c4e390c14d7. in 4191ms, sequenceid=434307849, compaction requested=false
15:58:50,836 INFO HRegion: Finished memstore flush of ~64.1m for region usertable,user7\x08p\xBA\xDA\xDA\xDA\xDA\xDA\xAC,1301435594999.8f3df80dab5b6df73761f203f48c1bad. in 3773ms, sequenceid=434312129, compaction requested=false
16:00:04,289 INFO HRegion: Finished memstore flush of ~64.1m for region usertable,user5\x07q:\xDA\xDA\xDA\xDA\xDA\xBC,1301435594998.68f3fbd6fe3c364031b0dc0a7d4fbddc. in 3613ms, sequenceid=434313422, compaction requested=false
16:00:07,916 INFO HRegion: Finished memstore flush of ~66.1m for region usertable,user9\x09p:\xDA\xDA\xDA\xDA\xDA\x9C,1301435595000.9c47aed1f43b745ade509c4e390c14d7. in 3627ms, sequenceid=434313616, compaction requested=false


Without LZO:
16:41:54,415 INFO HRegion: Finished memstore flush of ~65.0m for region usertable,user20\xB1\xF000000(,1301438473877.3740787fab9695d18f77dc3ec08f5ce9. in 892ms, sequenceid=434359154, compaction requested=false
16:42:03,846 INFO HRegion: Finished memstore flush of ~64.0m for region usertable,user41\xB1p00000\x18,1301438473878.2f2de0ac90c3718bc6cfef2df4dd9398. in 808ms, sequenceid=434359429, compaction requested=false
16:42:08,658 INFO HRegion: Finished memstore flush of ~64.1m for region usertable,user62\xB0\xF000000\x08,1301438473879.914eaad85cfbf6f4d60f943dfbcc2221. in 1369ms, sequenceid=434359568, compaction requested=false
16:42:09,593 INFO HRegion: Finished memstore flush of ~65.3m for region usertable,user20\xB1\xF000000(,1301438473877.3740787fab9695d18f77dc3ec08f5ce9. in 935ms, sequenceid=434359632, compaction requested=false
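
The averages below come from the "in NNNNms" figures in lines like the above; here's a quick sketch to reproduce them (regionserver.log stands in for the actual log path):

  grep 'Finished memstore flush' regionserver.log |
    awk '{ for (i = 1; i <= NF; i++)         # scan each whitespace field
             if ($i ~ /^[0-9]+ms,$/)         # the "in 4191ms," duration token
               { sum += $i + 0; n++ }        # awk coerces "4191ms," to 4191
         }
         END { if (n) printf "%.0f ms average over %d flushes\n", sum/n, n }'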


The average LZO flush takes ~3801ms whereas the average non-LZO flush takes ~1001ms, so LZO is roughly 3.8x slower. Any ideas on how to prevent this problem? I'm guessing it might be fixable with some parameter tuning. Here are the important parts of my hbase-site.xml, followed by the tuning I'm considering.

<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
</property>
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>100</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>536870912</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>3</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>14</value>
</property>
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>5</value>
</property>
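
The two knobs I'm looking at are the block multiplier and the flush size, on the theory that a slow LZO flush needs more headroom before writes get blocked. Something like this (the values are just guesses on my part, untested):

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>

That would move the blocking threshold from 64m * 3 = 192m up to 128m * 4 = 512m, at the cost of more heap held in memstores.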


Any suggestions would be great.

Thanks,
~Jeff

--
Jeff Whiting
Qualtrics Senior Software Engineer
[email protected]
