[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464862#comment-17464862 ] Duo Zhang commented on HBASE-24749: --- I also resolved all the sub tasks as Won't fix. Feel free to reopen and link them to other issues if they are still needed. Thanks. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464263#comment-17464263 ] Duo Zhang commented on HBASE-24749: --- HBASE-26067 has been merged to master. As discussed before, will close this issue later today. Thanks~ > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450629#comment-17450629 ] Josh Elser commented on HBASE-24749: HBASE-26067 seems to be the right path to me. I don't see a need to keep this open if no one is working on it, but don't feel strongly (given Duo suggested we only close this after the merge for 26067 to master happens). > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448305#comment-17448305 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- no issue here, thanks for working on HBASE-26067 > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448301#comment-17448301 ] Zach York commented on HBASE-24749: --- I think it's fine to close this issue. If we later determine this approach has better characteristics than the implemented approach, we can always implement it within the new interfaces and can reopen/create a new issue. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448051#comment-17448051 ] Duo Zhang commented on HBASE-24749: --- For me I think we could close this issue after we merge HBASE-26067 back to master. And there is an issue for implementing region based SFT. HBASE-26262 But let's see what [~zyork] and [~taklwu] think? They are the author of this issue. Thanks. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448029#comment-17448029 ] Wellington Chevreuil commented on HBASE-24749: -- We had favoured the design approach from HBASE-26067, and have been doing all necessary refactorings as subtasks to allow for a pluggable way of creating and tracking storefiles, so I think this Jira (and all its subtasks), is not relevant anymore. Can we close it as duplicate or abandoned, to avoid any confusions on what's still being worked? [~zhangduo] [~elserj] [~zyork] [~stack] > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380474#comment-17380474 ] Wellington Chevreuil commented on HBASE-24749: -- {quote} I know Wellington has been making good progress on the table approach as well {quote} I believe the extra work on "rename-less" flushes, compactions, splits, merges, etc can be easily ported to any other type of file tracker, not only the table based one. HBASE-25395 already defined a HStoreFilePathAccessor interface, which is being used by these functionalities that create new files to effectively commit files and get them tracked. Additional trackers, then, would just need to provide their implementation of HStoreFilePathAccessor interface, and all we would need is to modify StoreFileTrackingUtils.createStoreFilePathAccessor to dynamically load HStoreFilePathAccessor implementations defined in the configuration (currently, it's hardcoded to instantiate HTableStoreFilePathAccessor). > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17376012#comment-17376012 ] Zach York commented on HBASE-24749: --- I think we can implement both approaches since the StoreEngine/SFM can be configured and provides a nice interface to hide away the implementation specific details (unless the new metadata system being proposed doesn't make use of those existing abstractions). I know Wellington has been making good progress on the table approach as well. Good stuff > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374288#comment-17374288 ] Duo Zhang commented on HBASE-24749: --- Recently some guys in our company introduced how iceberg manages its data files with a separated metadata system. I think here we could also make use of this idea, to introduce a metadata system to manage our hfiles. For the old implementation, it just stores nothing, uses FileSystem.list to get the fhfiles. We could introduce different implementations for the metadata system, such as record the hfile list in region, or in file, etc. And we could also introduce combined metadata system, for example, read from and write to the primary metadata system, and mirror to the secondary metadata system, to make it possible to migrate from different implementations, and also make it possible to downgrade. Thanks. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248272#comment-17248272 ] Michael Stack commented on HBASE-24749: --- bq. Then we will followup tasks to merge it into hbase:meta via a single writer and show the Y write throughput that's not far from the the system table approach. Sounds good (There are a few recent paragraphs here that explain why I'm concerned when I see mention of a new System Table -- See '2.0.1 Avoid compounding of Region Assignment Complexity' in https://docs.google.com/document/d/11ChsSb2LGrSzrSJz8pDCAw5IewmaMV0ZDN1LrMkAj4s/edit#) bq. I'm wondered what the testing scope of hbase-on-s3 could be? are we testing the functionality of using S3A/DFS API to perform write operation? Could start small. Configure minihbasecluster so its on s3 then run a subset of tests that grows over time proving that hbase works on s3 across the variety of failures the test suite is full of (HBase has its own set of machines attached to Apache infrastructure donated by Xiaomi. These machines are EC2 instances if that helps). Good stuff. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248267#comment-17248267 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- Hi [~stack] , for the hbase:storefile system table, we will firstly introduce the system table approach and shows everything is working with X write throughput. Then we will followup tasks to merge it into hbase:meta via a single writer and show the Y write throughput that's not far from the the system table approach. The part of ZK will be completely removed for hbase:meta and master region by keeping it AS IS to use the Default Store Engine and Default Store File Manager. (sorry that we didn't update the design doc.) About the RefreshHFilesClient part, we will try to update the design doc or directly write on the upcoming JIRA how it will be recovered from online and offline repairing, tho this part is still pending implementation. Lastly, correct me if I'm wrong, AKA there is no much S3 test in nightly or integration tests within HBase, we could definitely add that at least as integration tests especially the [S3 Strong Consistency|https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/] is out . Meanwhile, I'm wondered what the testing scope of hbase-on-s3 could be? are we testing the functionality of using S3A/DFS API to perform write operation? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244166#comment-17244166 ] Michael Stack commented on HBASE-24749: --- Took a look at the design. Looks good. Fun. Still talks of hbase:storefile table. Is that intentional? The design also proposes using zk as a persistent store (See invariants for devs in [http://hbase.apache.org/book.html#design.invariants).] A bit more detail on RefreshHFilesClient would be appreciated... here or as amend to design. We'd run this tool to make the dfsclient listing view align with hbase's understanding? The testing section is interesting. s3 is a predicate. Anything we can do to make sure that a check-in doesn't break hbase-on-s3? Perhaps we need to add to our nightly runs hbase on s3? Or, a PR that touches fs code runs the s3 tests? Thanks > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226329#comment-17226329 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- thanks Nick ! > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225620#comment-17225620 ] Nick Dimiduk commented on HBASE-24749: -- bq. I have a question on branching, since this feature will be default disable, is it possible that we direct contribute into master branch (instead of having a feature branch)? mainly, we're learning how and when one feature branch can merge into master branch. I am of the opinion that feature development should always happen on a feature branch. Never merge to a releasable branch until you have something you're ready to release. We consider {{master}} a releasable branch. I do recommend that feature branches be frequently rebased onto their source branch. In my experience, once/week has been a good cadence. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225590#comment-17225590 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- Hi guys, we're about to sending out milestones and PRs under this parent JIRA. I have a question on branching, since this feature will be default disable, is it possible that we direct contribute into master branch (instead of having a feature branch)? mainly, we're learning how and when one feature branch can merge into master branch. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188056#comment-17188056 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- {quote} # When (re)open, ROOT region only cares HFiles in the data directory of ROOT Region (relies on the MVCC protection of what files should be included). # HFile Tracking of hbase:meta are written to ROOT region (similar to how the meta location is being handled), and this tracking metadata is being protected by the WAL of ROOT Region and HFiles in the data directory of ROOT Region. # HFile Tracking of any other tables are being updated to a column family cf:storefile in hbase:meta. The only read extensively period is during the region open and region assignment.{quote} hey guys, did you guys see any red flag on the last comment about scanning the file system directly for the ROOT region ? [~stack] [~zyork] [~anoop.hbase] [~zhangduo] ? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183739#comment-17183739 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- I updated a [design doc|https://docs.google.com/document/d/15Nx-xZ7FoPoud9vqkmIwphkNwBv0mdKMkvU7Ley5i4A/edit?usp=sharing] (google doc version), then we leave any design related comments on there directly to avoid a long page of comments in this JIRA. In addition, I'm wondered if we can simplify the circuit for tracking the HFiles of the ROOT region to directly rely on the file storage (assuming the WAL works fine + HFiles are always immutable) without adding a tracking layer as well as directly writes HFiles to the data directory. e.g. the dependencies flow should be # When (re)open, ROOT region only cares HFiles in the data directory of ROOT Region (relies on the MVCC protection of what files should be included). # HFile Tracking of hbase:meta are written to ROOT region (similar to how the meta location is being handled), and this tracking metadata is being protected by the WAL of ROOT Region and HFiles in the data directory of ROOT Region. # HFile Tracking of any other tables are being updated to a column family cf:storefile in hbase:meta. The only read extensively period is during the region open and region assignment. We have provided an [investigation (Appendix#1) within the new design doc |https://docs.google.com/document/d/15Nx-xZ7FoPoud9vqkmIwphkNwBv0mdKMkvU7Ley5i4A/edit?usp=sharing] that MVCC (Max Seq#) in the Store is the guard to reload cells from HFiles in the data directory without the tracking metadata and .tmp directory. But we should only use it for ROOT region because the amount of HFiles in ROOT directory is limited and normally won't change frequently > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171129#comment-17171129 ] Zach York commented on HBASE-24749: --- Yep, could do that as well. I was suggesting opening up only the one CF for writing outside of Master if it turns out having a single writer can't scale, but I think for now the single writer approach should work. Whether that writer is locked to Master or the RS hosting the CF, I don't have particularly strong feelings on. We would want to test how much overhead the additional hop would take (if Master is the writer, but meta is hosted on a different RS). > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171121#comment-17171121 ] Michael Stack commented on HBASE-24749: --- One comment on the [~zyork] approach is that if a dedicated hbase:meta CF for hfiles then there could still be one-writer only; it would just be the RS that was hosting the Region; not Master. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171116#comment-17171116 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- sorry for the delay, I was out few days last week. {quote}every flush and compaction will result in an update inline w/ the flush/compaction completion – if it fails, the flush/compaction fail? {quote} if updating hfile set in {{hbase:meta}} fails, it should be considered as failure if this feature is enabled. do you have concern on blocking the actual flush to be completed ? (it should be similar to other feature like {{hbase:quota}}) {quote}Master would update, or RS writes meta, a violation of a simplification we made trying to ensure one-writer {quote} for ensuring one-writer to {{hbase:meta}}, this is a good note and we haven't considered one writer scenario yet. I'm not sure the right way but for flush should be happened on the RS side, then either the RS create directly connection to {{hbase:meta}} with a limit on writing to only this column family outside of the master (suggested by [~zyork], pending investigation) or as you suggested that we package the hfile set information with a RPC call to Master and Master updates the hfile set. the amount of traffic (direct table connection or RPC call) should be the same, I still need to compare if the overhead (throughput) have any difference. In addition, I will try to come up a set of sub-tasks and update the proposal doc the coming week. please bear with me, the plan may have some transition tasks, e.g. 1. having the separate system table first, then have followup tasks to 2). compare the migration into the {{hbase:meta}} and actually 3). merge into {{hbase:meta}} (as a throughput sanity check) > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164147#comment-17164147 ] Michael Stack commented on HBASE-24749: --- Just a thought; keeping the hfile set in hbase:meta is going to up the read/write load on this table significantly; every flush and compaction will result in an update inline w/ the flush/compaction completion – if it fails, the flush/compaction fail? – and every open will be an hbase:meta read to find set of files to use. Currently Master only writes hbase:meta so RS will tell Master the compaction result – fine-by-me because master should be running compactions anyways – or the new flush file added, and Master would update, or RS writes meta, a violation of a simplification we made trying to ensure one-writer. Just a note. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164138#comment-17164138 ] ramkrishna.s.vasudevan commented on HBASE-24749: WAL based approach one advantage is that because of the fact that we anyway replay the WALs on RS crash and we can easily identify this fact of which were the files that were flushed/compacted completely. HBASE-20704 was to remove the files that were actually compacted but not yet removed by the discharger. This WAL marker is just to ensure on a restart of an RS and region opening we know the file in that region:cf path came out of a completed compaction or not. I think atleast in region open case prevoiusly we were reading that marker but we never validated the compacted file under region:cf with that . > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164129#comment-17164129 ] Anoop Sam John commented on HBASE-24749: So the diff is this. In case of HFiles list in META or a system table, that info is abt the valid files and always that has to be examined (when new cluster is created or cluster is restarted). In WAL based what we rely on the status of not committed files alone. By default all files are committed files. Only those which are on going will have markers in the WAL. So that is the basic difference. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164126#comment-17164126 ] Anoop Sam John commented on HBASE-24749: Thought abt that also.. In case of a cluster drop and later recreate based on the cluster FS, we wont have any WALs. So no wal replay. That means automatically all the files are valid. All files will come as valid HFiles for that region:cf. In case of RS crash only the WAL replay come into pic. We split all WAL files and replay and once replay also over, the region will come online in next RS. So during this replay, if we do the tracking of the files also, end of that we can find uncommitted files and throw them away. In case when we have to store the HFiles list in META, every write op (flush/compaction) from every RS is depending on the META region and a write to that. If the META region is not available for some time, all the flushes/compaction is blocked from completion. That is one worry I am having. And then the next is how to store the META table's file list itself. In case of cluster recreate, the zk data also lost right [~zyork]? So that also says clearly that storing at zk is not possible. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164094#comment-17164094 ] Zach York commented on HBASE-24749: --- Yes, I think that is potentially an alternative implementation that could work. One downside I could see is you would still want to be able to handle bulk loading/other procedures. If all updates to the state are controlled by the RS, this approach would work. I wonder what the perf difference might be... since in this case you would have to replay edits always. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164090#comment-17164090 ] Anoop Sam John commented on HBASE-24749: bq. Can you expand on how we can get in a situation where a partial file is written? I'm trying to see if there are any failure modes we haven't though of. If the case is a complete file written to the data directory, is there harm in picking up the new file (even if it hasn't successfully committed to the SFM)? That point was based on another direction what Stack was saying. Am not sure whether Stack suggested that for META table alone or for all. ie use the WAL event markers to know whether a HFile is committer or not. If we see, during WAL replay that there is a flush begin marker and later flush complete marker, this means it is a committed file. If no markers at all for a file, this is an old existing file. If only begin but no end means this is not a committed file and so while region reopen, we can ignore this. Same with compaction also. There one issue was what if the WAL file which is having the begin marker got rolled and deleted. We lost the track. But if that can be controlled, this is also a direction no? We can avoid the need to store all the files list into META and avoid the Q of how to handled the META's file list. Storing in zk is not a direction. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164048#comment-17164048 ] Guanghao Zhang commented on HBASE-24749: {quote}bq. it should be HBASE-20724, so for compaction we can reused that to confirm if the flushed StoreFile were from a compaction. {quote} Yes. The compaction event marker is not used anymore. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164039#comment-17164039 ] Zach York commented on HBASE-24749: --- [~anoop.hbase] Can you expand on how we can get in a situation where a partial file is written? I'm trying to see if there are any failure modes we haven't though of. If the case is a complete file written to the data directory, is there harm in picking up the new file (even if it hasn't successfully committed to the SFM)? In any case, for all user tables, this could be covered by the new store file list. The only case that is tricky/of concern is the file list for the root. > On HBASE-14090, it is old but still cool, virtuous, aiming to hit a bigger > target. [~stack] We've definitely looked at it and gained some inspiration :) I think at this point, we want to keep this in a manageable scope to be able to deliver something. However, this approach should help break some of the reliance on FS structure and make it easier to accomplish the goal in the future. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164037#comment-17164037 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- bq. If an HFile is written successfully but no marker in the WAL, then it doesn't exist, right? As part of the WAL replay you will reconstitute it from edits in the WAL? You're right if that happens, any uncommitted HFile should not be taken and replay from the WALs. (assuming that replay may generate the same content but a different HFile, I was thinking an store open optimization that but that would be too far from now.) bq. you have surveyed the calls to the NN made by HBase on a regular basis? we don't have how many rename calls captured to NN or even to object stores, and we will add a measurement survey task on the related milestone. But at one point we captured while running compaction with rename, the overall wall clock time only for the rename part were dominating ~60% of overall compaction time on object stores. bq. IIRC there is a issue for storing the compacted files in HFile's metadata, to solve the problem that the wal file contains the compaction marker may be deleted before wal splitting. it should be HBASE-20724, so for compaction we can reused that to confirm if the flushed StoreFile were from a compaction. bq. Now while replay of wal, we dont have start compaction marker for this wal file. So we think this is an old valid file but that is wrong. This is a partial file. This is possible. don't we only have the `end` compaction event marker only when compacted HFile(s) has been moved to cf directory right before updating the store file manager? but yeah if the WAL is rolled, then we lost this even marker. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163565#comment-17163565 ] Anoop Sam John commented on HBASE-24749: Ya that will help for deleting the compacted away files while opening the result file. But the other case where we write compaction result file directly under region/cf and we had the start compaction marker in wal. The compaction did not finish and RS crashed. By that time the wal file which had the start compaction marker got rolled and deleted also. Now while replay of wal, we dont have start compaction marker for this wal file. So we think this is an old valid file but that is wrong. This is a partial file. This is possible. On that approach of solving the issue with WAL marker, this is a possible issue (?) > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163558#comment-17163558 ] Duo Zhang commented on HBASE-24749: --- IIRC there is a issue for storing the compacted files in HFile's metadata, to solve the problem that the wal file contains the compaction marker may be deleted before wal splitting. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163552#comment-17163552 ] ramkrishna.s.vasudevan commented on HBASE-24749: I think in flushes this is possible to recover from a start marker (assuming we will abort on a flush failure). But for compactions I think the markers in the WAL may get rolled over, right? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163542#comment-17163542 ] Anoop Sam John commented on HBASE-24749: bq.On events such as flush and compaction, we write markers to the WAL w/ notes listing files that participated in the event. On recovery, we read these events completing compactions if all participants present and it looked like we crashed after compaction completed but before we got to slot the new files into place and remove the old. This is not only for META (or ROOT) table you suggesting stack? (But generically) Ya during flush and compaction we have start markers with files involved in that op and on completion another marker. During region recovery, we can use these markers to identify uncommitted files right? Need to see whether there is chance that we wall gets rolled and miss the start markers. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163163#comment-17163163 ] Michael Stack commented on HBASE-24749: --- {quote}and if any HFile is being written successfully without a even marker, we probably need a repair hook (maybe HBCK) to consider including the written storefile back to be tracked. {quote} If an HFile is written successfully but no marker in the WAL, then it doesn't exist, right? As part of the WAL replay you will reconstitute it from edits in the WAL? On HBASE-14090, it is old but still cool, virtuous, aiming to hit a bigger target. A question for your that I think might be of general utility is whether you have surveyed the calls to the NN made by HBase on a regular basis? It would be good to get a list of renames – files and dirs – for your project but also, my sense is that we are profligate w/ our NN calls. If a survey and report, we might be able up performance at least around MTTR. Anyways, just a thought. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163136#comment-17163136 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- Thanks Stack, HBASE-14090 and their [design doc|https://docs.google.com/document/d/10tSCSSWPwdFqOLLYtY2aVFe6iCIrsBk4Vqm8LSGUfhQ/edit#] seems share several directions, e.g. how to use a hbase:meta column to track storefile and create a cleaner efficiently remove left over Storefiles. So, we will take another review with those design docs to see what we can pick within this proposed scope. Also as [~elserj] pointed out from the dev@ list, Accumulo is also managing their data/RFile with a metadata table. Without the support from ZK, write an extra edit to the WAL as a `commit` marker and reuse it for recovering the ROOT region (for hbase:meta and maybe the MasterRegion) make senses, and if the flush failed normally, we don't write a new marker and remove that storefile if written. and if any HFile is being written successfully without a even marker, we probably need a repair hook (maybe HBCK) to consider including the written storefile back to be tracked. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Assignee: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163074#comment-17163074 ] Zach York commented on HBASE-24749: --- Hmm, perhaps a flush of the root table could include the currently tracked files (for root) in the flush edit, then replaying the WAL for the root would be a pretty good guarantee. If you don't have the WAL durability, there is a potential for dataloss anyways. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163069#comment-17163069 ] Michael Stack commented on HBASE-24749: --- On where to write ROOT hfile list, when no zk, I suppose you can't adapt the ROOT Region to write a manifest file that gets updated as file set changes because you don't have sufficient guarantees from storage? Will your WAL filesystem have better semantics? On events such as flush and compaction, we write markers to the WAL w/ notes listing files that participated in the event. On recovery, we read these events completing compactions if all participants present and it looked like we crashed after compaction completed but before we got to slot the new files into place and remove the old. Could you use this mechanism – aggregating result of hfile list changes (flushes/compactions)? Or add an event of your own that would make your job easier? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162470#comment-17162470 ] Michael Stack commented on HBASE-24749: --- There is overlap with HBASE-14090 Perhaps ideas from there of use here? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system implementation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162196#comment-17162196 ] Tak-Lon (Stephen) Wu commented on HBASE-24749: -- Thanks [~busbey] and [~zhangduo] , I will send a email to dev@ list. bq. Could this be a part of meta instead? Definitively we can more them into meta table, and it shouldn't put too much load and data (about 400MB for 10k region). but [~zhangduo] is right that, meta table itself still need to have something to store the tracking for the meta region(s). but other storing storefile tracking of metatable region in zookeeper, do you guys have other thoughts? {quote}But it is still a pain that we could see a HFile on the filesystem but we do not know whether it is valid... Check for trailer? {quote} For repairing and recovery, currently we're testing and reusing refresh hfiles coprocessor ({{RefreshHFilesClient}}) that relies on {{HStore#openStoreFiles}} to valid if the HFiles should be included (if the HFiles can be opened and/or it's a result from compaction). seem like that {{HStore#openStoreFiles}} ({{initReader}}) already did that for checking the metadata ? bq. And we should provide new HBCK tools to sync the file system with the storefiles family, or rebuild the storefiles family from the filesystem? We can change that to be part of repairing meta table in HBCK such user do not need to run a separate command. > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system imple -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162033#comment-17162033 ] Duo Zhang commented on HBASE-24749: --- What I agree with [~busbey] is that, let's use a new family for meta to store the storefiles, do not introduce new system tables on the critical path of recovery. And we should provide new HBCK tools to sync the file system with the storefiles family, or rebuild the storefiles family from the filesystem? But it is still a pain that we could see a HFile on the filesystem but we do not know whether it is valid... Check for trailer? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system imple -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162030#comment-17162030 ] Duo Zhang commented on HBASE-24749: --- Master local region can not be used to solve this, as the master local region itself also needs to know where to store the storefiles for its own, so at the root level we still need something else than a region to store this information... > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system imple -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking
[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162022#comment-17162022 ] Sean Busbey commented on HBASE-24749: - excellent! it's worth a heads up to dev@hbase pointing folks here, I think. {quote} For recovery from region server crashes or region reload, we persist the in-memory HFiles tracking in store file manager to a new HBase admin table, ‘hbase:storefile’. To prevent loading HFiles from incomplete flushes and compactions, and reduce the number of expensive LIST files calls against the file system, we will read directly from the hbase:storefile table. A write to the storefile table is used as the commit mechanism for a HFile, removing the rename from .tmp to the data directory. {quote} Could this be a part of meta instead? we just recently got through having {{hbase:namespace}} move into meta to improve operational robustness, and this proposed storefile lookup seems very likely to be an even greater tripping point since all the RS need access. {quote} To avoid a circular dependency on the storefile table, the store file manager for the meta and storefile tables will be persisted in ZooKeeper. {quote} no persistent state in zookeeper please. we could do this via a local region controlled by whomever is handling meta. or at least I think that the feature would work for this, what do you think [~zhangduo]? > Direct insert HFiles and Persist in-memory HFile tracking > - > > Key: HBASE-24749 > URL: https://issues.apache.org/jira/browse/HBASE-24749 > Project: HBase > Issue Type: Umbrella > Components: Compaction, HFile >Affects Versions: 3.0.0-alpha-1 >Reporter: Tak-Lon (Stephen) Wu >Priority: Major > Labels: design, discussion, objectstore, storeFile, storeengine > Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct > insert HFiles and Persist in-memory HFile tracking.pdf > > > We propose a new feature (a new store engine) to remove the {{.tmp}} > directory used in the commit stage for common HFile operations such as flush > and compaction to improve the write throughput and latency on object stores. > Specifically for S3 filesystems, this will also mitigate read-after-write > inconsistencies caused by immediate HFiles validation after moving the > HFile(s) to data directory. > Please see attached for this proposal and the initial result captured with > 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, > and workload C RUN result. > The goal of this JIRA is to discuss with the community if the proposed > improvement on the object stores use case makes senses and if we miss > anything should be included. > Improvement Highlights > 1. Lower write latency, especially the p99+ > 2. Higher write throughput on flush and compaction > 3. Lower MTTR on region (re)open or assignment > 4. Remove consistent check dependencies (e.g. DynamoDB) supported by file > system imple -- This message was sent by Atlassian Jira (v8.3.4#803005)