Hi! Please help ^)

2015-05-21 11:04 GMT+03:00 Serega Sheypak <serega.shey...@gmail.com>:
> > Do you have the system sharing
> There are 2 HDDs, 7200 RPM, 2TB each. There is a 300GB OS partition on
> each drive with mirroring enabled. I can't persuade devops that mirroring
> could cause IO issues. What arguments can I bring? They use OS partition
> mirroring so that when a disk fails, we can boot the OS from the other
> partition and continue to work...
>
> > Do you have to compact? In other words, do you have read SLAs?
> Unfortunately, I have a mixed workload from web applications. I need to
> write and read, and the SLA is < 50ms.
>
> > How are your read times currently?
> Cloudera Manager says it's 4K reads per second and 500 writes per second.
>
> > Does your working dataset fit in RAM or do reads have to go to disk?
> I have several tables of 500GB each and many small tables of 10-20 GB.
> Small tables are loaded hourly/daily using bulkload (prepare HFiles using
> MR and move them to HBase using the utility). Big tables are used by
> webapps; they read and write them.
>
> > It looks like you are running at about three storefiles per column
> > family
> Is it hbase.hstore.compactionThreshold=3?
>
> > What if you upped the threshold at which minors run?
> You mean bump hbase.hstore.compactionThreshold to 8 or 10?
>
> > Do you have a downtime during which you could schedule compactions?
> Unfortunately no. It should work 24/7, and sometimes it doesn't.
>
> > Are you managing the major compactions yourself or are you having hbase
> > do it for you?
> HBase does it, once a day: hbase.hregion.majorcompaction = 1 day.
>
> I can disable the WAL. It's OK to lose some data in case of RS failure;
> I'm not doing banking transactions.
> If I disable the WAL, could it help?
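For reference, bumping hbase.hstore.compactionThreshold can be done
cluster-wide in hbase-site.xml, or per table through the client API so only
the hot tables are affected. A minimal sketch, assuming the HBase 1.0 Java
client; the table name "big_table" and the values 8 and 0 are made-up
examples, and depending on the version the table may need to be disabled
first (or online schema change enabled) for modifyTable to apply:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class CompactionTuning {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
          TableName table = TableName.valueOf("big_table"); // hypothetical name
          HTableDescriptor htd = admin.getTableDescriptor(table);
          // Let more storefiles accumulate before a minor compaction kicks in.
          htd.setConfiguration("hbase.hstore.compactionThreshold", "8");
          // Disable time-based major compactions; trigger them manually instead.
          htd.setConfiguration("hbase.hregion.majorcompaction", "0");
          admin.modifyTable(table, htd);
        }
      }
    }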
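Skipping the WAL, on the other hand, is a per-mutation client-side setting
rather than a cluster config. A minimal sketch, again assuming a
hypothetical table "big_table"; it trades durability for write I/O, and it
does nothing about compaction I/O itself:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SkipWalWrite {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("big_table"))) {
          Put put = new Put(Bytes.toBytes("row-key"));
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
              Bytes.toBytes("value"));
          // Skip the write-ahead log: rows still buffered in the memstore
          // are lost if the region server crashes before the next flush.
          put.setDurability(Durability.SKIP_WAL);
          table.put(put);
        }
      }
    }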
> 2015-05-20 18:04 GMT+03:00 Stack <st...@duboce.net>:
>
>> On Mon, May 18, 2015 at 4:26 PM, Serega Sheypak
>> <serega.shey...@gmail.com> wrote:
>>
>> > Hi, we are using extremely cheap HW:
>> > 2 HDDs, 7200 RPM
>> > 4*2 cores (Hyperthreading)
>> > 32GB RAM
>> >
>> > We are seeing serious IO performance issues.
>> > We have a more or less even distribution of read/write requests across
>> > nodes, and the same for data size:
>> >
>> > ServerName                             Request Per Second  Read Request Count  Write Request Count
>> > node01.domain.com,60020,1430172017193  195                 171871826           16761699
>> > node02.domain.com,60020,1426925053570  24                  34314930            16006603
>> > node03.domain.com,60020,1430860939797  22                  32054801            16913299
>> > node04.domain.com,60020,1431975656065  33                  1765121             253405
>> > node05.domain.com,60020,1430484646409  27                  42248883            16406280
>> > node07.domain.com,60020,1426776403757  27                  36324492            16299432
>> > node08.domain.com,60020,1426775898757  26                  38507165            13582109
>> > node09.domain.com,60020,1430440612531  27                  34360873            15080194
>> > node11.domain.com,60020,1431989669340  28                  44307               13466
>> > node12.domain.com,60020,1431927604238  30                  5318096             2020855
>> > node13.domain.com,60020,1431372874221  29                  31764957            15843688
>> > node14.domain.com,60020,1429640630771  41                  36300097            13049801
>> >
>> > ServerName                             Num. Stores  Num. Storefiles  Storefile Size Uncompressed  Storefile Size  Index Size  Bloom Size
>> > node01.domain.com,60020,1430172017193  82           186              1052080m                     76496mb         641849k     310111k
>> > node02.domain.com,60020,1426925053570  82           179              1062730m                     79713mb         649610k     318854k
>> > node03.domain.com,60020,1430860939797  82           179              1036597m                     76199mb         627346k     307136k
>> > node04.domain.com,60020,1431975656065  82           400              1034624m                     76405mb         655954k     289316k
>> > node05.domain.com,60020,1430484646409  82           185              1111807m                     81474mb         688136k     334127k
>> > node07.domain.com,60020,1426776403757  82           164              1023217m                     74830mb         631774k     296169k
>> > node08.domain.com,60020,1426775898757  81           171              1086446m                     79933mb         681486k     312325k
>> > node09.domain.com,60020,1430440612531  81           160              1073852m                     77874mb         658924k     309734k
>> > node11.domain.com,60020,1431989669340  81           166              1006322m                     75652mb         664753k     264081k
>> > node12.domain.com,60020,1431927604238  82           188              1050229m                     75140mb         652970k     304137k
>> > node13.domain.com,60020,1431372874221  82           178              937557m                      70042mb         601684k     257607k
>> > node14.domain.com,60020,1429640630771  82           145              949090m                      69749mb         592812k     266677k
>> >
>> > When compaction starts, a random node goes to 100% I/O, with iowait of
>> > seconds, even tens of seconds.
>> >
>> > What are the approaches to optimizing minor and major compactions when
>> > you are I/O bound?
>>
>> Yeah, with two disks, you will be crimped. Do you have the system sharing
>> with hbase/hdfs or is hdfs running on one disk only?
>>
>> Do you have to compact? In other words, do you have read SLAs? How are
>> your read times currently? Does your working dataset fit in RAM or do
>> reads have to go to disk? It looks like you are running at about three
>> storefiles per column family. What if you upped the threshold at which
>> minors run? Do you have a downtime during which you could schedule
>> compactions? Are you managing the major compactions yourself or are you
>> having hbase do it for you?
>>
>> St.Ack
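On "managing the major compactions yourself": with
hbase.hregion.majorcompaction set to 0, a scheduled job could request
majors during the quietest hours. A rough sketch under the same HBase 1.0
client assumption; note that admin.majorCompact() only queues the request
and returns, and the region servers still pay the same I/O cost, so this
shifts the load rather than removing it:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class NightlyMajorCompact {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
          for (TableName table : admin.listTableNames()) {
            // Queues a major compaction request; the call returns
            // immediately and the servers compact in the background.
            admin.majorCompact(table);
          }
        }
      }
    }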