[ https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qiang Tian updated HBASE-11368:
-------------------------------
Attachment: hbase-11368-0.98.5.patch
I forgot that StoreScanner is per CF, so my earlier analysis is wrong:
{quote}
After DefaultStoreFileManager#storefiles is updated in HStore#bulkLoadHFile,
notifyChangedReadersObservers is called to reset the StoreScanner#heap, so
checkReseek->resetScannerStack will be triggered in next scan/read to recreate
store scanners based on new storefiles.
So we could introduce a new region-level rwlock, multiCFLock:
HRegion#bulkLoadHFiles acquires the write lock before the multi-CF
HStore.bulkLoadHFile calls, and StoreScanner#resetScannerStack acquires the
read lock. This way the scanners are recreated after all CFs' store files are
populated.
{quote}
Instead, the new lock should be put at the regionScanner layer. See the
attached patch.
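A minimal sketch of the region-level lock idea follows. It is a toy model,
not the attached patch and not HBase code: the class MultiCfLockSketch, its
methods, and the use of plain string lists as stand-ins for per-CF store
files are all made up for illustration. The bulk load takes the write lock
across all CFs; the region-level read takes the read lock, so it sees either
none or all of the newly loaded files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only: not HBase code. All names here are hypothetical. */
public class MultiCfLockSketch {
    // Hypothetical region-level lock guarding multi-CF bulk loads.
    private final ReentrantReadWriteLock multiCfLock = new ReentrantReadWriteLock();
    // Column family name -> store file names (stand-in for the per-CF stores).
    private final Map<String, List<String>> storeFiles = new LinkedHashMap<>();

    public MultiCfLockSketch(String... families) {
        for (String cf : families) {
            storeFiles.put(cf, new ArrayList<>());
        }
    }

    /** Adds one file per CF under the write lock, so readers never see a partial load. */
    public void bulkLoad(Map<String, String> fileByFamily) {
        multiCfLock.writeLock().lock();
        try {
            for (Map.Entry<String, String> e : fileByFamily.entrySet()) {
                storeFiles.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                          .add(e.getValue());
            }
        } finally {
            multiCfLock.writeLock().unlock();
        }
    }

    /** Region-scanner-level read: copies every CF's file list under the read lock. */
    public Map<String, List<String>> snapshotForScan() {
        multiCfLock.readLock().lock();
        try {
            Map<String, List<String>> view = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : storeFiles.entrySet()) {
                view.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
            return view;
        } finally {
            multiCfLock.readLock().unlock();
        }
    }
}
```

Many scans can hold the read lock at once; only a bulk load blocks them, and
only for as long as it holds the write lock.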
"mvn test" and "TestHRegionServerBulkLoad" (the large test for atomic bulk
load) passed. I still need to run the large tests and a performance test (any
suggestions for it? YCSB?).
The lock can be further limited to a smaller scope by splitting
HStore#bulkLoadHFile into 2 parts: 1) rename the bulkload files and put the
new files into the store files list; 2) notifyChangedReadersObservers. Only #2
needs the lock.
If the HDFS file rename is fast, the split may not be needed.
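The split described above could look roughly like this. Again this is a toy
model under stated assumptions, not the patch: SplitBulkLoadSketch, addFiles,
publish and scan are hypothetical names, addFiles stands in for part 1
(rename plus store file list update), publish stands in for part 2
(notifyChangedReadersObservers), and scanners are assumed to read only what
publish has made visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only: not HBase code. All names here are hypothetical. */
public class SplitBulkLoadSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Per-CF store file lists, updated in part 1.
    private final Map<String, List<String>> storeFiles = new LinkedHashMap<>();
    // Per-CF file lists that scanners read from, refreshed in part 2.
    private final Map<String, List<String>> scannerView = new LinkedHashMap<>();

    /** Part 1: the potentially slow step, done without the region-level lock. */
    public void addFiles(Map<String, String> fileByFamily) {
        synchronized (storeFiles) {
            for (Map.Entry<String, String> e : fileByFamily.entrySet()) {
                storeFiles.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                          .add(e.getValue());
            }
        }
    }

    /** Part 2: make the new files visible to scanners for all CFs at once. */
    public void publish() {
        lock.writeLock().lock();
        try {
            synchronized (storeFiles) {
                scannerView.clear();
                for (Map.Entry<String, List<String>> e : storeFiles.entrySet()) {
                    scannerView.put(e.getKey(), new ArrayList<>(e.getValue()));
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Scanner-side read: copies the currently published view under the read lock. */
    public Map<String, List<String>> scan() {
        lock.readLock().lock();
        try {
            Map<String, List<String>> copy = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : scannerView.entrySet()) {
                copy.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
            return copy;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this model the write lock is held only while the view is swapped, so the
time scans are blocked no longer depends on how long the rename takes.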
> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
> Key: HBASE-11368
> URL: https://issues.apache.org/jira/browse/HBASE-11368
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Qiang Tian
> Attachments: hbase-11368-0.98.5.patch
>
>
> Compactions take a read lock. For a multi-column family region, we want to
> take a write lock on the region before bulk loading. If the compaction takes
> too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in
> HBASE-10882 as a workaround.
> Does the compaction need a read lock for that long? Does the bulk load need
> a full write lock when there are multiple column families? Can we fail more
> gracefully at least?