[jira] [Updated] (HBASE-25619) 50% reading performance degradation 2.4.1 over 1.6.0

Danil Lipovoy (Jira) Sun, 28 Feb 2021 21:59:07 -0800


     [ 
https://issues.apache.org/jira/browse/HBASE-25619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Danil Lipovoy updated HBASE-25619:
----------------------------------
    Description: 
I have found performance issues. YCSB tests show:

*Operations per second (batch 1000)*

| |*1.4.13*|*1.6.0*|*2.2.6*|*2.4.1*|*comments*|
|INSERTS|68|68|75|76|< this is fine|
|GETS|92|100|72|48|< 50% less|
|FLUSHED GETS|126|141|120|108|< not good|
|GET+INSERT|69|71|68|66| |

GETS - gets issued right after the inserts.

FLUSHED GETS - gets issued after a flush and a major compaction.
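
The size of the regression can be checked directly from the table. The sketch below is only an illustration: the numbers are copied from the table, while the names `results` and `relative_change` are hypothetical and not part of YCSB or HBase.

```python
# Throughput figures copied from the table above (operations per second, batch 1000).
results = {
    "INSERTS":      {"1.4.13": 68,  "1.6.0": 68,  "2.2.6": 75,  "2.4.1": 76},
    "GETS":         {"1.4.13": 92,  "1.6.0": 100, "2.2.6": 72,  "2.4.1": 48},
    "FLUSHED GETS": {"1.4.13": 126, "1.6.0": 141, "2.2.6": 120, "2.4.1": 108},
    "GET+INSERT":   {"1.4.13": 69,  "1.6.0": 71,  "2.2.6": 68,  "2.4.1": 66},
}


def relative_change(test, baseline="1.6.0", candidate="2.4.1"):
    """Percent change of candidate versus baseline (negative means slower)."""
    base = results[test][baseline]
    cand = results[test][candidate]
    return (cand - base) / base * 100.0


for test in results:
    print(f"{test:13s} {relative_change(test):+6.1f}%")
```

With these inputs GETS on 2.4.1 comes out at roughly half of the 1.6.0 throughput, which is where the "50%" in the title comes from.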

All numbers are the average of 3 runs.

For example, GETS for 2.4.1 => (45 + 49 + 50) / 3 = 48, taken from:

— run 01 hdl300_LRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 108
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 45
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 76
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 66
 — run 02 hdl300_LRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 109
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 49
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 77
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 66
 — run 03 hdl300_LRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 108
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 50
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 76
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 65

There were always 4 runs (not 3), though. The first run is a warm-up and is excluded from 
the aggregation (usually it is faster than all the later runs).
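
The aggregation described above can be reproduced with a few lines of Python. This is a minimal sketch, not the script from the attachment: it assumes each result line ends with `<name> ops= <number>`, as in the listing above, and the names `RUNS`, `LINE` and `average_ops` are hypothetical. Only the three measured 2.4.1 runs are included, since the warm-up run is not listed in this description.

```python
import re
from collections import defaultdict

# Result lines copied from runs 01-03 of hdl300_LRU_thr30_reg100 above.
RUNS = """
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 108
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 45
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 76
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 66
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 109
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 49
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 77
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 66
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 108
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 50
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 76
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 65
"""

# Captures the test name (fget/get/ins/upd) and the ops value at the end of a line.
LINE = re.compile(r"(\w+) ops=\s*(\d+)\s*$")


def average_ops(lines):
    """Group ops values by test name and return the mean per test."""
    values = defaultdict(list)
    for line in lines:
        match = LINE.search(line)
        if match:  # lines that are not result lines are skipped
            values[match.group(1)].append(int(match.group(2)))
    return {name: sum(v) / len(v) for name, v in values.items()}


averages = average_ops(RUNS.splitlines())
for name, avg in averages.items():
    print(f"{name:5s} {avg:6.1f}")
```

Rounded to whole numbers, the averages match the 2.4.1 column of the table.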

All tests were done with AdaptiveLRU 
(https://issues.apache.org/jira/browse/HBASE-23887).

This is because:
 # With the old LRU, the RS often fails under pressure.
 # AdaptiveLRU is faster than the current LRU (much faster on a powerful server).

For example, on my PC (AMD Ryzen 7 2700X Eight-Core Processor, 32 GB MEM, SSD), 
this is the current LRU (1.4.13):
 --- run 01 hdl300_oldLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 116
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 76
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 67
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 65
 --- run 02 hdl300_oldLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 115
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 81
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 66
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 67
 --- run 03 hdl300_oldLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 116
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 82
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 66
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 66

This is the new LRU (1.4.13):
 – run 01 hdl300_newLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 128
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 93
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 67
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 70
 — run 02 hdl300_newLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 126
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 93
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 68
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 69
 — run 03 hdl300_newLRU_thr30_reg100 —
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 fget ops= 125
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 get ops= 91
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 ins ops= 68
 thr30 cnt100000 tim300 num0 max1 bch1000 reg100 upd ops= 67

All tests were done with the same parameters:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>./tmp/hb</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>/tmp/hbase</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>300</value>
  </property>
  <property>
    <name>hbase.regionserver.metahandler.count</name>
    <value>30</value>
  </property>
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>1342177280</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>6</value>
  </property>
  <property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>2</value>
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>18000000</value>
  </property>
  <property>
    <name>hbase.regionserver.thread.compaction.large</name>
    <value>12</value>
  </property>
  <property>
    <name>hbase.regionserver.wal.enablecompression</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.server.compactchecker.interval.multiplier</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.rest.threads.min</name>
    <value>8</value>
  </property>
  <property>
    <name>hbase.rest.threads.max</name>
    <value>150</value>
  </property>
  <property>
    <name>hbase.thrift.minWorkerThreads</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.regionserver.thread.compaction.small</name>
    <value>6</value>
  </property>
  <property>
    <name>hbase.ipc.server.read.threadpool.size</name>
    <value>60</value>
  </property>
  <property>
    <name>hbase.lru.cache.heavy.eviction.count.limit</name>
    <value>0</value>
  </property>
  <property>
    <name>hbase.lru.cache.heavy.eviction.mb.size.limit</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.lru.cache.heavy.eviction.overhead.coefficient</name>
    <value>0.01</value>
  </property>
  <property>
    <name>hbase.wal.provider</name>
    <value>multiwal</value>
  </property>
</configuration>
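
When repeating the tests on several HBase versions, it may help to confirm that each install really picked up the same values. The sketch below parses an excerpt of the configuration above with Python's standard `xml.etree.ElementTree`. The names `SITE_XML` and `load_properties` are hypothetical, and a real check would presumably read conf/hbase-site.xml from each install instead of an inline string.

```python
import xml.etree.ElementTree as ET

# Excerpt of the configuration above (a few properties only, values unchanged).
SITE_XML = """
<configuration>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>300</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>1342177280</value>
  </property>
  <property>
    <name>hbase.lru.cache.heavy.eviction.mb.size.limit</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.wal.provider</name>
    <value>multiwal</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return the <name>/<value> pairs of a Hadoop-style configuration as a dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
    }


props = load_properties(SITE_XML)

# The flush size is given in bytes; convert it to MiB for readability.
flush_mib = int(props["hbase.hregion.memstore.flush.size"]) / (1024 * 1024)

for name, value in sorted(props.items()):
    print(f"{name} = {value}")
print(f"memstore flush size: {flush_mib:.0f} MiB")
```

Comparing the resulting dictionaries between two installs is one simple way to rule out configuration drift as the cause of a throughput difference.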

The heap size was the same everywhere: export HBASE_HEAPSIZE=22G

ZK is separate (downloaded from the Apache site) because the RS just fails when the 
built-in ZK is used.

Full logs are in the attachment.

Anyone can repeat the tests. I used a modified YCSB (with batchsize added):
 [https://github.com/pustota2009/YCSB.git]

The steps are:

1. Download and set up ZK 
[https://www.apache.org/dyn/closer.lua/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz]

2. Download and set up HBase ([https://hbase.apache.org/downloads.html])

3. Tune HBase (with the parameters above)

4. Download [^scripts.zip] (it contains YCSB and the scripts) into the HBase dir, at the 
same level as bin, conf, log, etc.

5. Execute run-4-tests-30t-LRU.sh.

 

It will run for about 1.5 hours and collect the results into 
hdl300_LRU_thr30_reg100.res and results_agg.txt.

Maybe somebody would be interested in investigating the cause of this degradation and 
fixing it.

 


> 50% reading performance degradation 2.4.1 over 1.6.0
> ----------------------------------------------------
>
>                 Key: HBASE-25619
>                 URL: https://issues.apache.org/jira/browse/HBASE-25619
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Danil Lipovoy
>            Priority: Major
>         Attachments: logs.zip, scripts.zip
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
