Oh, and don't forget, you have to update bulk load to work with this approach.

Never knew that we had a utility to pick up files that folks wrote directly into the hbase.rootdir (RefreshHFilesClient). I am 110% behind ripping that out. We have bulk loading as the supported path for a reason :)

On 7/21/20 1:45 PM, Tak-Lon (Stephen) Wu wrote:
Hi guys,

I'm sending this email to get more comments and thoughts from the dev@ list
for an open discussion item on HBASE-24749
<https://issues.apache.org/jira/browse/HBASE-24749>.

Mainly, we're proposing a feature with a new store engine that skips the
.tmp directory in the HFile commit stage and writes directly to the data
directory.

The proposal doc
<https://issues.apache.org/jira/secure/attachment/13008049/Apache%20HBase%20-%20Direct%20insert%20HFiles%20and%20Persist%20in-memory%20HFile%20tracking.pdf>
is on the JIRA and we have provided initial results
<https://issues.apache.org/jira/secure/attachment/13008050/1B100m-25m25m-performance.pdf>
with YCSB 25m and 1B that show the changes have a positive effect.

Improvement Highlights
1. Lower write latency, especially the p99+
2. Higher write throughput on flush and compaction
3. Lower MTTR on region (re)open or assignment
4. Remove the dependency on consistency checks (e.g. DynamoDB) supported by
the file system implementation

Again, any suggestions are welcome.

Thanks,
Stephen
