[jira] [Updated] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

Qiang Tian (JIRA) Tue, 14 Oct 2014 01:04:47 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Qiang Tian updated HBASE-11368:
-------------------------------
    Attachment: hbase-11368-0.98.5.patch

I forgot that StoreScanner is per CF, so my earlier analysis is wrong:
{quote}
After DefaultStoreFileManager#storefiles is updated in HStore#bulkLoadHFile, 
notifyChangedReadersObservers is called to reset the StoreScanner#heap, so 
checkReseek->resetScannerStack will be triggered in the next scan/read to 
recreate the store scanners based on the new storefiles.

So we could introduce a new region-level rwlock, multiCFLock. 
HRegion#bulkLoadHFiles acquires the write lock before the multi-CF 
HStore.bulkLoadHFile calls, and StoreScanner#resetScannerStack acquires the 
read lock. This way the scanners are recreated after all CFs' store files are 
populated.
{quote}

Instead, the new lock should be placed at the regionScanner layer. See the 
attached patch.
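
To illustrate the locking idea only (this is not the attached patch), here is a 
minimal sketch of a region-level read/write lock. RegionLockSketch and 
fileCountsSeenByScan are hypothetical names made up for this example; only 
multiCFLock and bulkLoadHFiles echo names used in this comment. It assumes the 
bulk load takes the write lock around the whole multi-CF update and each 
region-level read takes the read lock.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only: RegionLockSketch is a hypothetical stand-in, not HBase's HRegion.
class RegionLockSketch {
    // Region-level lock; the name multiCFLock follows the proposal in this comment.
    private final ReentrantReadWriteLock multiCFLock = new ReentrantReadWriteLock();
    // Column family name -> store file names (stand-in for the per-CF store file lists).
    private final Map<String, List<String>> storeFiles = new LinkedHashMap<>();

    RegionLockSketch(String... families) {
        for (String cf : families) {
            storeFiles.put(cf, new ArrayList<>());
        }
    }

    // Bulk load side: hold the write lock while every CF receives its new file.
    void bulkLoadHFiles(Map<String, String> familyToFile) {
        if (!storeFiles.keySet().containsAll(familyToFile.keySet())) {
            throw new IllegalArgumentException("unknown column family in " + familyToFile.keySet());
        }
        multiCFLock.writeLock().lock();
        try {
            for (Map.Entry<String, String> e : familyToFile.entrySet()) {
                storeFiles.get(e.getKey()).add(e.getValue());
            }
        } finally {
            multiCFLock.writeLock().unlock();
        }
    }

    // Region scanner side: hold the read lock so one read sees all CFs in the same state.
    Map<String, Integer> fileCountsSeenByScan() {
        multiCFLock.readLock().lock();
        try {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : storeFiles.entrySet()) {
                counts.put(e.getKey(), e.getValue().size());
            }
            return counts;
        } finally {
            multiCFLock.readLock().unlock();
        }
    }
}
```

With the lock at this level, a read either runs entirely before the bulk load 
or entirely after it, so it cannot observe one CF with the new file and 
another without it.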

"mvn test" and TestHRegionServerBulkLoad (the large test for atomic bulk load) 
passed. I still need to run the large tests and a performance test (any 
suggestions for it? YCSB?).

The lock can be limited to a smaller scope by splitting 
HStore#bulkLoadHFile into 2 parts: 1) rename the bulk load files and put the 
new files into the store file list; 2) notifyChangedReadersObservers. Only #2 
needs the lock. 
If the HDFS file rename is fast, the split may not be needed.
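
A rough sketch of the two-part split, again as a toy model rather than HBase 
code. StoreSketch, SplitBulkLoadSketch, addBulkLoadedFile and notifyReaders are 
hypothetical names; notifyReaders merely stands in for 
notifyChangedReadersObservers. The sketch assumes open scanners only pick up 
new files when they are notified, so part 1 can run outside the lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for one store (one CF); not HBase's HStore.
class StoreSketch {
    private final List<String> storeFiles = new ArrayList<>();
    // What already-open scanners read from; refreshed only by notifyReaders().
    private volatile List<String> scannerView = Collections.emptyList();

    // Part 1: stands in for the HDFS rename plus adding the file to the store file list.
    synchronized void addBulkLoadedFile(String file) {
        storeFiles.add(file);
    }

    // Part 2: stands in for notifyChangedReadersObservers.
    synchronized void notifyReaders() {
        scannerView = Collections.unmodifiableList(new ArrayList<>(storeFiles));
    }

    List<String> scannerView() {
        return scannerView;
    }
}

// Hypothetical stand-in for a region with several CFs.
class SplitBulkLoadSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, StoreSketch> stores = new LinkedHashMap<>();

    SplitBulkLoadSketch(String... families) {
        for (String cf : families) {
            stores.put(cf, new StoreSketch());
        }
    }

    void bulkLoadHFiles(Map<String, String> familyToFile) {
        if (!stores.keySet().containsAll(familyToFile.keySet())) {
            throw new IllegalArgumentException("unknown column family in " + familyToFile.keySet());
        }
        // Part 1 for every CF, outside the lock: possibly slow, but readers are not blocked.
        for (Map.Entry<String, String> e : familyToFile.entrySet()) {
            stores.get(e.getKey()).addBulkLoadedFile(e.getValue());
        }
        // Part 2 for every CF, under the write lock: short, and atomic with respect to scans.
        lock.writeLock().lock();
        try {
            for (String cf : familyToFile.keySet()) {
                stores.get(cf).notifyReaders();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A region-level read takes the read lock and sees each CF's current scanner view.
    Map<String, List<String>> scan() {
        lock.readLock().lock();
        try {
            Map<String, List<String>> view = new LinkedHashMap<>();
            for (Map.Entry<String, StoreSketch> e : stores.entrySet()) {
                view.put(e.getKey(), e.getValue().scannerView());
            }
            return view;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The write lock is then held only for the notification step, which is the 
trade-off described above: the slow rename happens while scans keep running.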



> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
>                 Key: HBASE-11368
>                 URL: https://issues.apache.org/jira/browse/HBASE-11368
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Qiang Tian
>         Attachments: hbase-11368-0.98.5.patch
>
>
> Compactions take a read lock.  For a multi-column family region, we want to 
> take a write lock on the region before bulk loading.  If the compaction 
> takes too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in 
> HBASE-10882 as a work around.
> Does the compaction need a read lock for that long?  Does the bulk load need 
> a full write lock when multiple column families?  Can we fail more gracefully 
> at least?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
