[
https://issues.apache.org/jira/browse/HBASE-26225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xi chaomin resolved HBASE-26225.
--------------------------------
Resolution: Won't Fix
> let hbase.mapreduce.bulkload.assign.sequenceNumbers take effect in
> SecureBulkLoadManager.secureBulkLoadHFiles
> -------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-26225
> URL: https://issues.apache.org/jira/browse/HBASE-26225
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Reporter: xi chaomin
> Priority: Minor
> Attachments: SecureBulkLoadManager.patch
>
>
> HBASE-10958 added a flush before bulk load to obtain the latest sequence ID
> and prevent data loss during replay.
> '_hbase.mapreduce.bulkload.assign.sequenceNumbers_' controls whether to flush
> before a bulk load, but *SecureBulkLoadManager* always passes true for that
> argument, so the setting has no effect there. If we bulk load frequently, we
> flush a lot of small files. Can we make
> 'hbase.mapreduce.bulkload.assign.sequenceNumbers' take effect in
> SecureBulkLoadManager? With the setting disabled, -1 is passed as the
> sequenceId, and we won't lose data.
> SecureBulkLoadManager.java, secureBulkLoadHFiles:
> {code:java}
> // assignSeqId (the second argument) is hard-coded to true here
> return region.bulkLoadHFiles(familyPaths, true,
>     new SecureBulkLoadListener(fs, bulkToken, conf), request.getCopyFile(),
>     clusterIds, request.getReplicate());
> {code}
> HRegion.java
> {code:java}
> public Map<byte[], List<Path>> bulkLoadHFiles(
>     Collection<Pair<byte[], String>> familyPaths, boolean assignSeqId,
>     BulkLoadListener bulkLoadListener, boolean copyFile,
>     List<String> clusterIds, boolean replicate)
> {code}
>
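The proposal in the description can be illustrated with a minimal, self-contained sketch. It is not HBase code and not the attached patch: `BulkLoadFlagSketch`, `FakeRegion`, and `resolveAssignSeqId` are hypothetical names, a plain `Map` stands in for the configuration object, and the default of true is an assumption chosen to match the currently hard-coded behaviour. It only shows the flag being read and forwarded instead of a literal `true`.

```java
import java.util.HashMap;
import java.util.Map;

public class BulkLoadFlagSketch {
    // Property name as given in the issue text.
    static final String ASSIGN_SEQ_IDS =
        "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    // The issue states that -1 is passed as the sequenceId when no flush happens.
    static final long NO_SEQ_ID = -1L;

    /** Hypothetical stand-in for a region; it counts flushes so the effect is visible. */
    static class FakeRegion {
        long nextSeqId = 100;
        int flushCount = 0;

        /** Flushes and returns a fresh sequence id only when assignSeqId is true. */
        long bulkLoad(boolean assignSeqId) {
            if (!assignSeqId) {
                return NO_SEQ_ID; // no flush, so no small flush file is written
            }
            flushCount++;
            return nextSeqId++;
        }
    }

    /** Reads the flag from the configuration; the default of true is an assumption. */
    static boolean resolveAssignSeqId(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(ASSIGN_SEQ_IDS, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ASSIGN_SEQ_IDS, "false");

        FakeRegion region = new FakeRegion();
        // Forward the configured value instead of a hard-coded true.
        long seqId = region.bulkLoad(resolveAssignSeqId(conf));
        System.out.println("seqId=" + seqId + " flushes=" + region.flushCount);
    }
}
```

With the flag set to false, the stand-in region performs no flush and reports -1 as the sequence id; with the flag unset, it behaves as the hard-coded call does today.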
--
This message was sent by Atlassian Jira
(v8.20.10#820010)