[jira] [Updated] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

Qiang Tian (JIRA) Fri, 24 Oct 2014 03:03:08 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Qiang Tian updated HBASE-11368:
-------------------------------
    Attachment: performance_improvement_verification_98.5.patch

A simple comparison test using the updated TestHRegionServerBulkLoad.java 
(the numbers are just for reference; the real perf improvement might depend 
on a combination of factors, such as compaction time, bulkload time, 
scan/read workload type, request concurrency, etc.)


98.5:
-----------
2014-10-24 02:30:03,399 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(345):   loaded 16
2014-10-24 02:30:03,399 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(346):   compations 16

2014-10-24 02:30:03,399 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(348): Scanners:
//average # with 50 scanners
2014-10-24 02:30:03,399 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(350):   scanned 73
2014-10-24 02:30:03,400 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(351):   verified 18000 rows


98.5+patch
--------
//since bulkload conflicts less with compaction, we get more 
bulkload/compaction requests in the fixed test cycle (5 minutes)
2014-10-24 02:41:19,071 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(344): Loaders:
2014-10-24 02:41:19,072 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(345):   loaded 43
2014-10-24 02:41:19,072 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(346):   compations 43

2014-10-24 02:41:19,073 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(348): Scanners:
//since bulkload conflicts less with scans, we get more scans in the fixed 
test cycle (5 minutes)
//average # for 50 scanners
2014-10-24 02:41:19,073 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(350):   scanned 92  
2014-10-24 02:41:19,073 INFO  [main] 
regionserver.TestHRegionServerBulkLoad(351):   verified 25000 rows
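
To put the two runs side by side, the counters above can be turned into 
simple ratios. This is only a rough sketch: the class and method names 
(BulkLoadGain, ratio) are made up for illustration, the inputs are the 
figures from the logs above, and a single run like this is for reference 
only.

```java
import java.util.Locale;

public class BulkLoadGain {
    // Ratio of a patched-run counter to the matching 98.5 baseline counter.
    static double ratio(long patched, long baseline) {
        return (double) patched / baseline;
    }

    public static void main(String[] args) {
        // Counters copied from the two log excerpts above (one run each).
        System.out.println(String.format(Locale.ROOT,
            "bulk loads/compactions: %.2fx", ratio(43, 16)));
        System.out.println(String.format(Locale.ROOT,
            "scans per scanner:      %.2fx", ratio(92, 73)));
        System.out.println(String.format(Locale.ROOT,
            "rows verified:          %.2fx", ratio(25000, 18000)));
    }
}
```

The bulk load/compaction count grows the most between the two runs, which 
fits the note that bulkload conflicts less with compaction after the patch.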



> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
>                 Key: HBASE-11368
>                 URL: https://issues.apache.org/jira/browse/HBASE-11368
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Qiang Tian
>         Attachments: hbase-11368-0.98.5.patch, 
> performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock.  For a multi-column family region, we want to 
> take a write lock on the region before bulk loading.  If the compaction 
> takes too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in 
> HBASE-10882 as a work around.
> Does the compaction need a read lock for that long?  Does the bulk load need 
> a full write lock when multiple column families?  Can we fail more gracefully 
> at least?
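
The lock conflict described above can be reproduced in isolation with a 
plain java.util.concurrent read/write lock. This is a minimal sketch, not 
HBase code: RegionLockSketch, startCompaction and tryBulkLoad are 
hypothetical names, and the timings are arbitrary. It only shows that a 
writer with a bounded wait gives up while a long-running reader still 
holds the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RegionLockSketch {
    // Stand-in for a region-level lock; not HBase's actual field.
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    /** Simulated compaction: holds the read lock for holdMillis. */
    Thread startCompaction(long holdMillis, CountDownLatch started) {
        Thread t = new Thread(() -> {
            regionLock.readLock().lock();
            try {
                started.countDown();          // signal that the read lock is held
                Thread.sleep(holdMillis);     // pretend to compact
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                regionLock.readLock().unlock();
            }
        });
        t.start();
        return t;
    }

    /** Simulated multi-family bulk load: needs the write lock within waitMillis. */
    boolean tryBulkLoad(long waitMillis) throws InterruptedException {
        if (!regionLock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            return false;                     // timed out behind the reader
        }
        try {
            return true;                      // the load would happen here
        } finally {
            regionLock.writeLock().unlock();
        }
    }

    /** Returns {result during compaction, result after compaction}. */
    public static boolean[] demo() throws InterruptedException {
        RegionLockSketch region = new RegionLockSketch();
        CountDownLatch started = new CountDownLatch(1);
        Thread compaction = region.startCompaction(500, started);
        started.await();
        boolean during = region.tryBulkLoad(50);  // read lock still held
        compaction.join();
        boolean after = region.tryBulkLoad(50);   // lock is free again
        return new boolean[] { during, after };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = demo();
        System.out.println("bulk load during compaction: " + r[0]);
        System.out.println("bulk load after compaction:  " + r[1]);
    }
}
```

In this sketch the bulk load only succeeds once the simulated compaction 
has released its read lock, which is the behavior the questions above 
(shorter read lock, weaker write lock, more graceful failure) are aimed at.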



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
