[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

Anoop Sam John (Jira) Thu, 23 Jul 2020 06:25:01 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163565#comment-17163565
 ]


Anoop Sam John commented on HBASE-24749:
----------------------------------------

Yes, that will help with deleting the compacted-away files when opening the 
result file.
But consider the other case, where we write the compaction result file directly 
under region/cf and the start-compaction marker is in the WAL. The compaction 
did not finish and the RS crashed. By that time, the WAL file that held the 
start-compaction marker had already been rolled and deleted. Now, during WAL 
replay, we no longer see a start-compaction marker for this result file, so we 
treat it as an old, valid file. That is wrong: it is a partial file. With the 
approach of solving the issue through a WAL marker, is this a possible issue?
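
To make the scenario concrete, here is a toy model of it. This is only a 
sketch, not HBase code: the marker names, the file names, and the classify() 
helper are all made up for illustration, and it assumes replay decides a 
file is partial only when it sees a start marker without a matching end 
marker in the WALs that still exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    kind: str    # "COMPACTION_START" or "COMPACTION_END" (hypothetical names)
    output: str  # name of the compaction result file the marker refers to


def partial_files_seen_by_replay(wals):
    """Result files with a start marker but no end marker in the given WALs."""
    started, finished = set(), set()
    for wal in wals:
        for marker in wal:
            if marker.kind == "COMPACTION_START":
                started.add(marker.output)
            elif marker.kind == "COMPACTION_END":
                finished.add(marker.output)
    return started - finished


def classify(store_files, wals):
    """Label each file under region/cf the way marker-based replay would."""
    partial = partial_files_seen_by_replay(wals)
    return {f: ("partial" if f in partial else "valid") for f in store_files}


# wal_1 holds the start marker for hfile-c; the RS crashes before any end marker.
wal_1 = [Marker("COMPACTION_START", "hfile-c")]
wal_2 = []  # newer WAL, written after the roll

# hfile-c was being written directly under region/cf and is incomplete.
store_files = ["hfile-a", "hfile-b", "hfile-c"]

# If wal_1 still exists at replay time, the partial file is detected.
before_roll = classify(store_files, [wal_1, wal_2])

# If wal_1 was rolled and deleted before the crash, the marker is gone.
after_roll = classify(store_files, [wal_2])

print(before_roll["hfile-c"])
print(after_roll["hfile-c"])
```

In this model the second classification labels hfile-c as valid even though 
it is incomplete, which is the case I am asking about.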

> Direct insert HFiles and Persist in-memory HFile tracking
> ---------------------------------------------------------
>
>                 Key: HBASE-24749
>                 URL: https://issues.apache.org/jira/browse/HBASE-24749
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Compaction, HFile
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Tak-Lon (Stephen) Wu
>            Assignee: Tak-Lon (Stephen) Wu
>            Priority: Major
>              Labels: design, discussion, objectstore, storeFile, storeengine
>         Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community whether the proposed 
> improvement for the object store use case makes sense and whether we have 
> missed anything that should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistency-check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
