[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-12-23 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464862#comment-17464862
 ] 

Duo Zhang commented on HBASE-24749:
---

I also resolved all the sub tasks as Won't fix. Feel free to reopen and link 
them to other issues if they are still needed.

Thanks.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-12-22 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464263#comment-17464263
 ] 

Duo Zhang commented on HBASE-24749:
---

HBASE-26067 has been merged to master. As discussed before, will close this 
issue later today. Thanks~

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-11-29 Thread Josh Elser (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450629#comment-17450629
 ] 

Josh Elser commented on HBASE-24749:


HBASE-26067 seems to be the right path to me. I don't see a need to keep this 
open if no one is working on it, but don't feel strongly (given Duo suggested 
we only close this after the merge for 26067 to master happens).

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-11-23 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448305#comment-17448305
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

no issue here, thanks for working on HBASE-26067 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-11-23 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448301#comment-17448301
 ] 

Zach York commented on HBASE-24749:
---

I think it's fine to close this issue. If we later determine this approach has 
better characteristics than the implemented approach, we can always implement 
it within the new interfaces and can reopen/create a new issue.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-11-23 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448051#comment-17448051
 ] 

Duo Zhang commented on HBASE-24749:
---

For me I think we could close this issue after we merge HBASE-26067 back to 
master. And there is an issue for implementing region based SFT. HBASE-26262

But let's see what [~zyork] and [~taklwu] think? They are the author of this 
issue.

Thanks.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-11-23 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448029#comment-17448029
 ] 

Wellington Chevreuil commented on HBASE-24749:
--

We had favoured the design approach from HBASE-26067, and have been doing all 
necessary refactorings as subtasks to allow for a pluggable way of creating and 
tracking storefiles, so I think this Jira (and all its subtasks), is not 
relevant anymore. Can we close it as duplicate or abandoned, to avoid any 
confusions on what's still being worked? [~zhangduo] [~elserj] [~zyork] 
[~stack] 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-07-14 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380474#comment-17380474
 ] 

Wellington Chevreuil commented on HBASE-24749:
--

{quote}
I know Wellington has been making good progress on the table approach as well
{quote}
I believe the extra work on "rename-less" flushes, compactions, splits, merges, 
etc can be easily ported to any other type of file tracker, not only the table 
based one. HBASE-25395 already defined a HStoreFilePathAccessor interface, 
which is being used by these functionalities that create new files to 
effectively commit files and get them tracked. Additional trackers, then, would 
just need to provide their implementation of HStoreFilePathAccessor interface, 
and all we would need is to modify 
StoreFileTrackingUtils.createStoreFilePathAccessor to dynamically load 
HStoreFilePathAccessor implementations defined in the configuration (currently, 
it's hardcoded to instantiate HTableStoreFilePathAccessor).
 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-07-06 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17376012#comment-17376012
 ] 

Zach York commented on HBASE-24749:
---

I think we can implement both approaches since the StoreEngine/SFM can be 
configured and provides a nice interface to hide away the implementation 
specific details (unless the new metadata system being proposed doesn't make 
use of those existing abstractions). I know Wellington has been making good 
progress on the table approach as well. Good stuff

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2021-07-04 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374288#comment-17374288
 ] 

Duo Zhang commented on HBASE-24749:
---

Recently some guys in our company introduced how iceberg manages its data files 
with a separated metadata system.

I think here we could also make use of this idea, to introduce a metadata 
system to manage our hfiles. For the old implementation, it just stores 
nothing, uses FileSystem.list to get the fhfiles. We could introduce different 
implementations for the metadata system, such as record the hfile list in 
region, or in file, etc. And we could also introduce combined metadata system, 
for example, read from and write to the primary metadata system, and mirror to 
the secondary metadata system, to make it possible to migrate from different 
implementations, and also make it possible to downgrade.

Thanks.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-12-11 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248272#comment-17248272
 ] 

Michael Stack commented on HBASE-24749:
---

bq. Then we will followup tasks to merge it into hbase:meta via a single writer 
and show the Y write throughput that's not far from the the system table 
approach. 

Sounds good (There are a few recent paragraphs here that explain why I'm 
concerned when I see mention of a new System Table -- See '2.0.1 Avoid 
compounding of Region Assignment Complexity' in 
https://docs.google.com/document/d/11ChsSb2LGrSzrSJz8pDCAw5IewmaMV0ZDN1LrMkAj4s/edit#)

bq. I'm wondered what the testing scope of hbase-on-s3 could be? are we testing 
the functionality of using S3A/DFS API to perform write operation?

Could start small. Configure minihbasecluster so its on s3 then run a subset of 
tests that grows over time proving that hbase works on s3 across the variety of 
failures the test suite is full of (HBase has its own set of machines attached 
to Apache infrastructure donated by Xiaomi. These machines are EC2 instances if 
that helps).

Good stuff.



> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-12-11 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248267#comment-17248267
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

Hi [~stack] , for the hbase:storefile system table, we will firstly introduce 
the system table approach and shows everything is working with X write 
throughput. Then we will followup tasks to merge it into hbase:meta via a 
single writer and show the Y write throughput that's not far from the the 
system table approach. 

The part of ZK will be completely removed for hbase:meta and master region by 
keeping it AS IS to use the Default Store Engine and Default Store File 
Manager. (sorry that we didn't update the design doc.)

About the RefreshHFilesClient part, we will try to update the design doc or 
directly write on the upcoming JIRA how it will be recovered from online and 
offline repairing, tho this part is still pending implementation. 

 Lastly, correct me if I'm wrong, AKA there is no much S3 test in nightly or 
integration tests within HBase, we could definitely add that at least as 
integration tests especially the [S3 Strong 
Consistency|https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/]
 is out . Meanwhile, I'm wondered what the testing scope of hbase-on-s3 could 
be? are we testing the functionality of using S3A/DFS API to perform write 
operation?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-12-04 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244166#comment-17244166
 ] 

Michael Stack commented on HBASE-24749:
---

Took a look at the design. Looks good. Fun. Still talks of hbase:storefile 
table. Is that intentional? The design also proposes using zk as a persistent 
store (See invariants for devs in 
[http://hbase.apache.org/book.html#design.invariants).]

A bit more detail on RefreshHFilesClient would be appreciated... here or as 
amend to design. We'd run this tool to make the dfsclient listing view align 
with hbase's understanding?


The testing section is interesting. s3 is a predicate. Anything we can do to 
make sure that a check-in doesn't break hbase-on-s3? Perhaps we need to add to 
our nightly runs hbase on s3? Or, a PR that touches fs code runs the s3 tests?

Thanks

 

 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-11-04 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226329#comment-17226329
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

thanks Nick !

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-11-03 Thread Nick Dimiduk (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225620#comment-17225620
 ] 

Nick Dimiduk commented on HBASE-24749:
--

bq. I have a question on branching, since this feature will be default disable, 
is it possible that we direct contribute into master branch (instead of having 
a feature branch)? mainly, we're learning how and when one feature branch can 
merge into master branch.

I am of the opinion that feature development should always happen on a feature 
branch. Never merge to a releasable branch until you have something you're 
ready to release. We consider {{master}} a releasable branch.

I do recommend that feature branches be frequently rebased onto their source 
branch. In my experience, once/week has been a good cadence.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-11-03 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225590#comment-17225590
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

Hi guys, we're about to sending out milestones and PRs under this parent JIRA. 

I have a question on branching, since this feature will be default disable, is 
it possible that we direct contribute into master branch (instead of having a 
feature branch)? mainly, we're learning how and when one feature branch can 
merge into master branch.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-08-31 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188056#comment-17188056
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

{quote} # When (re)open, ROOT region only cares HFiles in the data directory of 
ROOT Region (relies on the MVCC protection of what files should be included). 
 # HFile Tracking of hbase:meta are written to ROOT region (similar to how the 
meta location is being handled), and this tracking metadata is being protected 
by the WAL of ROOT Region and HFiles in the data directory of ROOT Region.
 # HFile Tracking of any other tables are being updated to a column family 
cf:storefile in hbase:meta. The only read extensively period is during the 
region open and region assignment.{quote}
hey guys, did you guys see any red flag on the last comment about scanning the 
file system directly for the ROOT region ? [~stack]  [~zyork]  [~anoop.hbase] 
[~zhangduo]  ?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-08-24 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183739#comment-17183739
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

I updated a [design 
doc|https://docs.google.com/document/d/15Nx-xZ7FoPoud9vqkmIwphkNwBv0mdKMkvU7Ley5i4A/edit?usp=sharing]
 (google doc version), then we leave any design related comments on there 
directly to avoid a long page of comments in this JIRA.

In addition, I'm wondered if we can simplify the circuit for tracking the 
HFiles of the ROOT region to directly rely on the file storage (assuming the 
WAL works fine + HFiles are always immutable) without adding a tracking layer 
as well as directly writes HFiles to the data directory. e.g. the dependencies 
flow should be
 # When (re)open, ROOT region only cares HFiles in the data directory of ROOT 
Region (relies on the MVCC protection of what files should be included). 
 # HFile Tracking of hbase:meta are written to ROOT region (similar to how the 
meta location is being handled), and this tracking metadata is being protected 
by the WAL of ROOT Region and HFiles in the data directory of ROOT Region.
 # HFile Tracking of any other tables are being updated to a column family 
cf:storefile in hbase:meta. The only read extensively period is during the 
region open and region assignment.

We have provided an [investigation (Appendix#1) within the new design doc 
|https://docs.google.com/document/d/15Nx-xZ7FoPoud9vqkmIwphkNwBv0mdKMkvU7Ley5i4A/edit?usp=sharing]
 that MVCC (Max Seq#) in the Store is the guard to reload cells from HFiles in 
the data directory without the tracking metadata and .tmp directory. But we 
should only use it for ROOT region because the amount of HFiles in ROOT 
directory is limited and normally won't change frequently

 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-08-04 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171129#comment-17171129
 ] 

Zach York commented on HBASE-24749:
---

Yep, could do that as well. I was suggesting opening up only the one CF for 
writing outside of Master if it turns out having a single writer can't scale, 
but I think for now the single writer approach should work. Whether that writer 
is locked to Master or the RS hosting the CF, I don't have particularly strong 
feelings on. We would want to test how much overhead the additional hop would 
take (if Master is the writer, but meta is hosted on a different RS).

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-08-04 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171121#comment-17171121
 ] 

Michael Stack commented on HBASE-24749:
---

One comment on the [~zyork]  approach is that if a dedicated hbase:meta CF for 
hfiles then there could still be one-writer only; it would just be the RS that 
was hosting the Region; not Master.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-08-04 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171116#comment-17171116
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

sorry for the delay, I was out few days last week.
{quote}every flush and compaction will result in an update inline w/ the 
flush/compaction completion – if it fails, the flush/compaction fail?
{quote}
if updating hfile set in {{hbase:meta}} fails, it should be considered as 
failure if this feature is enabled. do you have concern on blocking the actual 
flush to be completed ? (it should be similar to other feature like 
{{hbase:quota}})
{quote}Master would update, or RS writes meta, a violation of a simplification 
we made trying to ensure one-writer
{quote}
for ensuring one-writer to {{hbase:meta}}, this is a good note and we haven't 
considered one writer scenario yet. I'm not sure the right way but for flush 
should be happened on the RS side, then either the RS create directly 
connection to {{hbase:meta}} with a limit on writing to only this column family 
outside of the master (suggested by [~zyork], pending investigation) or as you 
suggested that we package the hfile set information with a RPC call to Master 
and Master updates the hfile set. the amount of traffic (direct table 
connection or RPC call) should be the same, I still need to compare if the 
overhead (throughput) have any difference.

In addition, I will try to come up a set of sub-tasks and update the proposal 
doc the coming week. please bear with me, the plan may have some transition 
tasks, e.g. 1. having the separate system table first, then have followup tasks 
to 2). compare the migration into the {{hbase:meta}} and actually 3). merge 
into {{hbase:meta}} (as a throughput sanity check)

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164147#comment-17164147
 ] 

Michael Stack commented on HBASE-24749:
---

Just a thought; keeping the hfile set in hbase:meta is going to up the 
read/write load on this table significantly; every flush and compaction will 
result in an update inline w/ the flush/compaction completion – if it fails, 
the flush/compaction fail? – and every open will be an hbase:meta read to find 
set of files to use. Currently Master only writes hbase:meta so RS will tell 
Master the compaction result – fine-by-me because master should be running 
compactions anyways – or the new flush file added, and Master would update, or 
RS writes meta, a violation of a simplification we made trying to ensure 
one-writer. Just a note.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164138#comment-17164138
 ] 

ramkrishna.s.vasudevan commented on HBASE-24749:


WAL based approach one advantage is that because of the fact that we anyway 
replay the WALs on RS crash and we can easily identify this fact of which were 
the files that were flushed/compacted completely. 
HBASE-20704 was to remove the files that were actually compacted but not yet 
removed by the discharger. This WAL marker is just to ensure on a restart of an 
RS and region opening we know the file in that region:cf path came out of a  
completed compaction or not. I think atleast in region open case prevoiusly we 
were reading that marker but we never validated the compacted file under 
region:cf with that .  

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164129#comment-17164129
 ] 

Anoop Sam John commented on HBASE-24749:


So the diff is this.  In case of HFiles list in META or a system table, that 
info is abt the valid files and always that has to be examined (when new 
cluster is created or cluster is restarted).  In WAL based what we rely on the 
status of not committed files alone.  By default all files are committed files. 
 Only those which are on going will have markers in the WAL.  So that is the 
basic difference.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164126#comment-17164126
 ] 

Anoop Sam John commented on HBASE-24749:


Thought abt that also..  In case of a cluster drop and later recreate based on 
the cluster FS, we wont have any WALs.  So no wal replay. That means 
automatically all the files are valid. All files will come as valid HFiles for 
that region:cf.
In case of RS crash only the WAL replay come into pic.  We split all WAL files 
and replay and once replay also over, the region will come online in next RS.  
So during this replay,  if we do the tracking of the files also, end of that we 
can find uncommitted files and throw them away.
In case when we have to store the HFiles list in META, every write op 
(flush/compaction) from every RS is depending on the META region and a write to 
that. If the META region is not available for some time, all the 
flushes/compaction is blocked from completion.  That is one worry I am having.  
And then the next is how to store the META table's file list itself.  In case 
of cluster recreate, the zk data also lost right [~zyork]?  So that also says 
clearly that storing at zk is not possible.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164094#comment-17164094
 ] 

Zach York commented on HBASE-24749:
---

Yes, I think that is potentially an alternative implementation that could work. 
One downside I could see is you would still want to be able to handle bulk 
loading/other procedures. If all updates to the state are controlled by the RS, 
this approach would work. I wonder what the perf difference might be... since 
in this case you would have to replay edits always.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164090#comment-17164090
 ] 

Anoop Sam John commented on HBASE-24749:


bq. Can you expand on how we can get in a situation where a partial file is 
written? I'm trying to see if there are any failure modes we haven't though of. 
If the case is a complete file written to the data directory, is there harm in 
picking up the new file (even if it hasn't successfully committed to the SFM)?
That point was based on another direction what Stack was saying.  Am not sure 
whether Stack suggested that for META table alone or for all.  ie use the WAL 
event markers to know whether a HFile is committer or not. If we see, during 
WAL replay that there is a flush begin marker and later flush complete marker, 
this means it is  a committed file.  If no markers at all for a file, this is 
an old existing file. If only begin but no end means this is not a committed 
file and so while region reopen, we can ignore this.  Same with compaction 
also.   There one issue was what if the WAL file which is having the begin 
marker got rolled and deleted.  We lost the track.  But if that can be 
controlled, this is also a direction no?  We can avoid the need to store all 
the files list into META and avoid the Q of how to handled the META's file 
list. Storing in zk is not a direction.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Guanghao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164048#comment-17164048
 ] 

Guanghao Zhang commented on HBASE-24749:


{quote}bq. it should be HBASE-20724, so for compaction we can reused that to 
confirm if the flushed StoreFile were from a compaction.
{quote}
Yes. The compaction event marker is not used anymore.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164039#comment-17164039
 ] 

Zach York commented on HBASE-24749:
---

[~anoop.hbase] Can you expand on how we can get in a situation where a partial 
file is written? I'm trying to see if there are any failure modes we haven't 
though of. If the case is a complete file written to the data directory, is 
there harm in picking up the new file (even if it hasn't successfully committed 
to the SFM)?

In any case, for all user tables, this could be covered by the new store file 
list. The only case that is tricky/of concern is the file list for the root.

> On HBASE-14090, it is old but still cool, virtuous, aiming to hit a bigger 
> target.

[~stack] We've definitely looked at it and gained some inspiration :) I think 
at this point, we want to keep this in a manageable scope to be able to deliver 
something. However, this approach should help break some of the reliance on FS 
structure and make it easier to accomplish the goal in the future.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164037#comment-17164037
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

bq. If an HFile is written successfully but no marker in the WAL, then it 
doesn't exist, right? As part of the WAL replay you will reconstitute it from 
edits in the WAL?
You're right if that happens, any uncommitted HFile should not be taken and 
replay from the WALs. (assuming that replay may generate the same content but a 
different HFile, I was thinking an store open optimization that but that would 
be too far from now.)

bq. you have surveyed the calls to the NN made by HBase on a regular basis?
we don't have how many rename calls captured to NN or even to object stores, 
and we will add a measurement survey task on the related milestone. But at one 
point we captured while running compaction with rename, the overall wall clock 
time only for the rename part were dominating ~60% of overall compaction time 
on object stores.

bq. IIRC there is a issue for storing the compacted files in HFile's metadata, 
to solve the problem that the wal file contains the compaction marker may be 
deleted before wal splitting.
it should be HBASE-20724, so for compaction we can reused that to confirm if 
the flushed StoreFile were from a compaction. 

bq. Now while replay of wal, we dont have start compaction marker for this wal 
file. So we think this is an old valid file but that is wrong. This is a 
partial file. This is possible.
don't we only have the `end` compaction event marker only when compacted 
HFile(s) has been moved to cf directory right before updating the store file 
manager? but yeah if the WAL is rolled, then we lost this even marker. 


> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163565#comment-17163565
 ] 

Anoop Sam John commented on HBASE-24749:


Ya that will help for deleting the compacted away files while opening the 
result file.
But the other case where we write compaction result file directly under 
region/cf and we had the start compaction marker in wal. The compaction did not 
finish and RS crashed. By that time the wal file which had the start compaction 
marker got rolled and deleted also.  Now while replay of wal, we dont have 
start compaction marker for this wal file. So we think this is an old valid 
file but that is wrong. This is  a partial file.  This is possible.  On that 
approach of solving the issue with WAL marker, this is a possible issue (?) 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163558#comment-17163558
 ] 

Duo Zhang commented on HBASE-24749:
---

IIRC there is a issue for storing the compacted files in HFile's metadata, to 
solve the problem that the wal file contains the compaction marker may be 
deleted before wal splitting.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163552#comment-17163552
 ] 

ramkrishna.s.vasudevan commented on HBASE-24749:


I think in flushes this is possible to recover from a start marker (assuming we 
will abort on a flush failure). But for compactions I think the markers in the 
WAL may get rolled over, right?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163542#comment-17163542
 ] 

Anoop Sam John commented on HBASE-24749:


bq.On events such as flush and compaction, we write markers to the WAL w/ notes 
listing files that participated in the event. On recovery, we read these events 
completing compactions if all participants present and it looked like we 
crashed after compaction completed but before we got to slot the new files into 
place and remove the old.
This is not only for META (or ROOT) table you suggesting stack? (But 
generically)  Ya during flush and compaction we have start markers with files 
involved in that op and on completion another marker.  During region recovery, 
we can use these markers to identify uncommitted files right? Need to see 
whether there is chance that we wall gets rolled and miss the start markers. 

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-22 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163163#comment-17163163
 ] 

Michael Stack commented on HBASE-24749:
---

{quote}and if any HFile is being written successfully without a even marker, we 
probably need a repair hook (maybe HBCK) to consider including the written 
storefile back to be tracked.
{quote}
If an HFile is written successfully but no marker in the WAL, then it doesn't 
exist, right? As part of the WAL replay you will reconstitute it from edits in 
the WAL?

 

On HBASE-14090, it is old but still cool, virtuous, aiming to hit a bigger 
target.

 

A question for your that I think might be of general utility is whether you 
have surveyed the calls to the NN made by HBase on a regular basis? It would be 
good to get a list of renames – files and dirs – for your project but also, my 
sense is that we are profligate w/ our NN calls. If a survey and report, we 
might be able up performance at least around MTTR.  Anyways, just a thought.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-22 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163136#comment-17163136
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

Thanks Stack, HBASE-14090 and their [design 
doc|https://docs.google.com/document/d/10tSCSSWPwdFqOLLYtY2aVFe6iCIrsBk4Vqm8LSGUfhQ/edit#]
 seems share several directions, e.g. how to use a hbase:meta column to track 
storefile and create a cleaner efficiently remove left over Storefiles. So, we 
will take another review with those design docs to see what we can pick within 
this proposed scope. Also as [~elserj] pointed out from the dev@ list, Accumulo 
is also managing their data/RFile with a metadata table. 

Without the support from ZK, write an extra edit to the WAL as a `commit` 
marker and reuse it for recovering the ROOT region (for hbase:meta and maybe 
the MasterRegion) make senses, and if the flush failed normally, we don't write 
a new marker and remove that storefile if written. and if any HFile is being 
written successfully without a even marker, we probably need a repair hook 
(maybe HBCK) to consider including the written storefile back to be tracked.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-22 Thread Zach York (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163074#comment-17163074
 ] 

Zach York commented on HBASE-24749:
---

Hmm, perhaps a flush of the root table could include the currently tracked 
files (for root) in the flush edit, then replaying the WAL for the root would 
be a pretty good guarantee. If you don't have the WAL durability, there is a 
potential for dataloss anyways.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-22 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163069#comment-17163069
 ] 

Michael Stack commented on HBASE-24749:
---

On where to write ROOT hfile list, when no zk, I suppose you can't adapt the 
ROOT Region to write a manifest file that gets updated as file set changes 
because you don't have sufficient guarantees from storage? Will your WAL 
filesystem have better semantics? On events such as flush and compaction, we 
write markers to the WAL w/ notes listing files that participated in the event. 
On recovery, we read these events completing compactions if all participants 
present and it looked like we crashed after compaction completed but before we 
got to slot the new files into place and remove the old. Could you use this 
mechanism – aggregating result of hfile list changes (flushes/compactions)? Or 
add an event of your own that would make your job easier?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-21 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162470#comment-17162470
 ] 

Michael Stack commented on HBASE-24749:
---

There is overlap with HBASE-14090 Perhaps ideas from there of use here?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system implementation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-21 Thread Tak-Lon (Stephen) Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162196#comment-17162196
 ] 

Tak-Lon (Stephen) Wu commented on HBASE-24749:
--

Thanks [~busbey] and [~zhangduo] , I will send a email to dev@ list.

bq. Could this be a part of meta instead? 

Definitively we can more them into meta table, and it shouldn't put too much 
load and data (about 400MB for 10k region). but [~zhangduo]  is right that, 
meta table itself still need to have something to store the tracking for the 
meta region(s). but other storing storefile tracking of metatable region in 
zookeeper, do you guys have other thoughts? 

{quote}But it is still a pain that we could see a HFile on the filesystem but 
we do not know whether it is valid... Check for trailer?
{quote}
For repairing and recovery, currently we're testing and reusing refresh hfiles 
coprocessor ({{RefreshHFilesClient}}) that relies on {{HStore#openStoreFiles}} 
to valid if the HFiles should be included (if the HFiles can be opened and/or 
it's a result from compaction). seem like that {{HStore#openStoreFiles}}  
({{initReader}}) already did that for checking the metadata ? 

bq. And we should provide new HBCK tools to sync the file system with the 
storefiles family, or rebuild the storefiles family from the filesystem?

We can change that to be part of repairing meta table in HBCK such user do not 
need to run a separate command.

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system imple



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-21 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162033#comment-17162033
 ] 

Duo Zhang commented on HBASE-24749:
---

What I agree with [~busbey] is that, let's use a new family for meta to store 
the storefiles, do not introduce new system tables on the critical path of 
recovery.

And we should provide new HBCK tools to sync the file system with the 
storefiles family, or rebuild the storefiles family from the filesystem?

But it is still a pain that we could see a HFile on the filesystem but we do 
not know whether it is valid... Check for trailer?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system imple



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-21 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162030#comment-17162030
 ] 

Duo Zhang commented on HBASE-24749:
---

Master local region can not be used to solve this, as the master local region 
itself also needs to know where to store the storefiles for its own, so at the 
root level we still need something else than a region to store this 
information...

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system imple



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

2020-07-21 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162022#comment-17162022
 ] 

Sean Busbey commented on HBASE-24749:
-

excellent! it's worth a heads up to dev@hbase pointing folks here, I think.

{quote}
For recovery from region server crashes or region reload, we persist the 
in-memory HFiles tracking in store file manager to a new HBase admin table, 
‘hbase:storefile’. To prevent loading HFiles from incomplete flushes and 
compactions, and reduce the number of expensive LIST files calls against the 
file system, we will read directly from the hbase:storefile table. A write to 
the storefile table is used as the commit mechanism for a HFile, removing the 
rename from .tmp to the data directory.
{quote}

Could this be a part of meta instead? we just recently got through having 
{{hbase:namespace}} move into meta to improve operational robustness, and this 
proposed storefile lookup seems very likely to be an even greater tripping 
point since all the RS need access.

{quote}
To avoid a circular dependency on the storefile table, the store file manager 
for the meta and storefile tables will be persisted in ZooKeeper.
{quote}

no persistent state in zookeeper please. we could do this via a local region 
controlled by whomever is handling meta. or at least I think that the feature 
would work for this, what do you think [~zhangduo]?

> Direct insert HFiles and Persist in-memory HFile tracking
> -
>
> Key: HBASE-24749
> URL: https://issues.apache.org/jira/browse/HBASE-24749
> Project: HBase
>  Issue Type: Umbrella
>  Components: Compaction, HFile
>Affects Versions: 3.0.0-alpha-1
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: design, discussion, objectstore, storeFile, storeengine
> Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct 
> insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} 
> directory used in the commit stage for common HFile operations such as flush 
> and compaction to improve the write throughput and latency on object stores. 
> Specifically for S3 filesystems, this will also mitigate read-after-write 
> inconsistencies caused by immediate HFiles validation after moving the 
> HFile(s) to data directory.
> Please see attached for this proposal and the initial result captured with 
> 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, 
> and workload C RUN result.
> The goal of this JIRA is to discuss with the community if the proposed 
> improvement on the object stores use case makes senses and if we miss 
> anything should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistent check dependencies (e.g. DynamoDB) supported by file 
> system imple



--
This message was sent by Atlassian Jira
(v8.3.4#803005)