[
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qiang Tian updated HBASE-11368:
-------------------------------
Attachment: performance_improvement_verification_98.5.patch
A simple comparison test using an updated TestHRegionServerBulkLoad.java.
(The numbers are just for reference; the real perf improvement might depend on
a combination of factors, such as compaction time, bulkload time, scan/read
workload type, request concurrency, etc.)
98.5:
-----------
2014-10-24 02:30:03,399 INFO [main]
regionserver.TestHRegionServerBulkLoad(345): loaded 16
2014-10-24 02:30:03,399 INFO [main]
regionserver.TestHRegionServerBulkLoad(346): compations 16
2014-10-24 02:30:03,399 INFO [main]
regionserver.TestHRegionServerBulkLoad(348): Scanners:
//average # with 50 scanners
2014-10-24 02:30:03,399 INFO [main]
regionserver.TestHRegionServerBulkLoad(350): scanned 73
2014-10-24 02:30:03,400 INFO [main]
regionserver.TestHRegionServerBulkLoad(351): verified 18000 rows
98.5+patch
--------
//since bulkload conflicts less with compaction, we get more
bulkload/compaction requests in the fixed test cycle (5 minutes)
2014-10-24 02:41:19,071 INFO [main]
regionserver.TestHRegionServerBulkLoad(344): Loaders:
2014-10-24 02:41:19,072 INFO [main]
regionserver.TestHRegionServerBulkLoad(345): loaded 43
2014-10-24 02:41:19,072 INFO [main]
regionserver.TestHRegionServerBulkLoad(346): compations 43
2014-10-24 02:41:19,073 INFO [main]
regionserver.TestHRegionServerBulkLoad(348): Scanners:
//since bulkload conflicts less with scans, we get more scans in the fixed
test cycle (5 minutes)
//average # for 50 scanners
2014-10-24 02:41:19,073 INFO [main]
regionserver.TestHRegionServerBulkLoad(350): scanned 92
2014-10-24 02:41:19,073 INFO [main]
regionserver.TestHRegionServerBulkLoad(351): verified 25000 rows
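For a quick read of the two runs above, the counters can be turned into ratios. This is only a sketch of the arithmetic: the class and method names are made up for illustration, and the inputs are the figures copied from the log lines (loaded 16 vs 43, scanned 73 vs 92, verified 18000 vs 25000 rows). As noted above, the numbers are for reference only.

```java
import java.util.Locale;

// Hypothetical helper, not part of the patch or of HBase.
class BulkLoadComparison {
    // Ratio of the patched run's counter to the baseline run's counter.
    static double ratio(long patched, long baseline) {
        return (double) patched / baseline;
    }

    public static void main(String[] args) {
        // Figures copied from the two log excerpts (5-minute test cycle each).
        System.out.printf(Locale.ROOT, "bulk loads:    %.2fx%n", ratio(43, 16));
        System.out.printf(Locale.ROOT, "scans:         %.2fx%n", ratio(92, 73));
        System.out.printf(Locale.ROOT, "verified rows: %.2fx%n", ratio(25000, 18000));
    }
}
```

In this single run that works out to roughly 2.7x the bulk loads, 1.26x the scans, and 1.39x the verified rows.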
> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
> Key: HBASE-11368
> URL: https://issues.apache.org/jira/browse/HBASE-11368
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Qiang Tian
> Attachments: hbase-11368-0.98.5.patch,
> performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock. For a multi-column family region, we want to
> take a write lock on the region before bulk loading. If the compaction takes
> too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in
> HBASE-10882 as a work around.
> Does the compaction need a read lock for that long? Does the bulk load need
> a full write lock when there are multiple column families? Can we fail more
> gracefully at least?
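The lock interaction described in the issue can be reproduced in isolation. The sketch below is not HBase code: it uses a plain java.util.concurrent ReentrantReadWriteLock as a stand-in for the region lock, and the class name, method names, and 50 ms wait are all assumptions made for illustration. A "compaction" thread holds the read lock, and a "bulk load" that needs the write lock gives up when its wait expires.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the contention in this issue; not the HRegion implementation.
class RegionLockSketch {
    // Stand-in for the region-level lock (assumption).
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    // Starts a simulated compaction that holds the read lock until 'finish' is released.
    Thread startCompaction(CountDownLatch started, CountDownLatch finish) {
        Thread t = new Thread(() -> {
            regionLock.readLock().lock();
            try {
                started.countDown();   // signal that the read lock is now held
                finish.await();        // "compaction work" lasts until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                regionLock.readLock().unlock();
            }
        });
        t.start();
        return t;
    }

    // Simulated multi-family bulk load: needs the write lock, fails if the wait expires.
    boolean tryBulkLoad(long waitMillis) throws InterruptedException {
        if (!regionLock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            return false;              // a reader still holds the lock
        }
        try {
            return true;               // files for all families would be loaded here
        } finally {
            regionLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RegionLockSketch region = new RegionLockSketch();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);
        Thread compaction = region.startCompaction(started, finish);
        started.await();
        System.out.println("during compaction: " + region.tryBulkLoad(50));
        finish.countDown();
        compaction.join();
        System.out.println("after compaction:  " + region.tryBulkLoad(50));
    }
}
```

The first attempt fails because the reader never releases the lock within the wait; the second succeeds once the compaction thread has finished. Shortening how long the reader holds the lock, or narrowing what the writer needs, are the two directions the questions above point at.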
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)