[
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164126#comment-17164126
]
Anoop Sam John commented on HBASE-24749:
----------------------------------------
I thought about that as well. If a cluster is dropped and later recreated from
the cluster FS, we won't have any WALs, so there is no WAL replay. That means
all the files are automatically valid: every file comes up as a valid HFile for
that region:cf.
WAL replay only comes into the picture in the case of an RS crash. We split all
the WAL files and replay them, and once the replay is over, the region comes
online on the next RS. So if we also track the files during this replay, at the
end of it we can find the uncommitted files and throw them away.
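
To make the replay-time tracking idea concrete, here is a rough Java sketch.
It is only an illustration under assumptions: the class and method names
(ReplayFileTracker, onFlushCommitted, onCompactionCommitted, untracked) are
hypothetical and are not HBase APIs, and it assumes replay can surface "flush
committed" and "compaction committed" events for the region:cf in some form.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of HBase: tracks the live file set during replay.
class ReplayFileTracker {
    // Files currently considered committed for one region:cf.
    private final Set<String> committed = new HashSet<>();

    // Start from whatever file list was known to be committed before the crash.
    ReplayFileTracker(Collection<String> knownCommitted) {
        committed.addAll(knownCommitted);
    }

    // Called when replay sees that a flush finished committing this file.
    void onFlushCommitted(String file) {
        committed.add(file);
    }

    // Called when replay sees that a compaction committed: the output replaces its inputs.
    void onCompactionCommitted(List<String> inputs, String output) {
        committed.removeAll(inputs);
        committed.add(output);
    }

    // After replay: files on disk that are not in the tracked set.
    // These are the candidates to throw away (or archive).
    Set<String> untracked(Collection<String> filesOnDisk) {
        Set<String> result = new TreeSet<>(filesOnDisk);
        result.removeAll(committed);
        return result;
    }

    public static void main(String[] args) {
        ReplayFileTracker tracker = new ReplayFileTracker(Arrays.asList("hf1"));
        tracker.onFlushCommitted("hf2");
        // "hf3" was written directly into the data directory but never committed.
        System.out.println(tracker.untracked(Arrays.asList("hf1", "hf2", "hf3")));
    }
}
```

Note that in this sketch compaction inputs drop out of the tracked set too, so
they show up alongside the truly uncommitted files. Whether those should be
deleted or archived is a separate decision that the sketch does not cover.
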
If we have to store the HFile list in META, every write op (flush/compaction)
from every RS depends on the META region and on a write to it. If the META
region is unavailable for some time, all flushes/compactions are blocked from
completing. That is one worry I have.
The next question is how to store the META table's own file list. In case of a
cluster recreate, the zk data is also lost, right [~zyork]? So that clearly
says storing it in zk is not possible.
> Direct insert HFiles and Persist in-memory HFile tracking
> ---------------------------------------------------------
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
> Issue Type: Umbrella
> Components: Compaction, HFile
> Affects Versions: 3.0.0-alpha-1
> Reporter: Tak-Lon (Stephen) Wu
> Assignee: Tak-Lon (Stephen) Wu
> Priority: Major
> Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}}
> directory used in the commit stage for common HFile operations such as flush
> and compaction to improve the write throughput and latency on object stores.
> Specifically for S3 filesystems, this will also mitigate read-after-write
> inconsistencies caused by validating HFiles immediately after moving them
> to the data directory.
> Please see attached for this proposal and the initial result captured with
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN,
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community whether the proposed
> improvement for the object store use case makes sense and whether we missed
> anything that should be included.
> Improvement Highlights
> 1. Lower write latency, especially the p99+
> 2. Higher write throughput on flush and compaction
> 3. Lower MTTR on region (re)open or assignment
> 4. Remove consistency check dependencies (e.g. DynamoDB) supported by the
> file system implementation