S3 now supports strong consistency, and I heard that they are also implementing atomic renaming currently, so maybe that's one of the reasons why the development is silent now...
For me, I also think deploying hbase on cloud storage is the future, so I would also like to participate here.

But I do not think storing the hfile list in meta is the only solution. It will cause cyclic dependencies for hbase:meta, and then force us to have a fallback solution, which makes the code a bit ugly. We should try to see if this could be done with only the FileSystem.

Thanks.

Andrew Purtell <[email protected]> wrote on Wed, May 19, 2021 at 8:04 AM:

> Wellington (and et. al),
>
> S3 is also an important piece of our future production plans.
> Unfortunately, we were unable to assist much with last year's work, on
> account of being sidetracked by more immediate concerns. Fortunately, this
> renewed interest is timely in that we have an HBase 2 project where, if
> this can land in a 2.5 or a 2.6, it could be an important cost to serve
> optimization, and one we could and would make use of. Therefore I would
> like to restate my employer's interest in this work too. It may just be
> Viraj and myself in the early days.
>
> I'm not sure how best to collaborate. We could review changes from the
> original authors, new changes, and/or divide up the development tasks. We
> can certainly offer our time for testing, and can afford the costs of
> testing against the S3 service.
>
>
> On Tue, May 18, 2021 at 12:16 PM Wellington Chevreuil <
> [email protected]> wrote:
>
> > Greetings everyone,
> >
> > HBASE-24749 has been proposed almost a year ago, introducing a new
> > StoreFile tracker as a way to allow for any hbase hfile modifications to be
> > safely completed without needing a file system rename. This seems pretty
> > relevant for deployments over S3 file systems, where rename operations are
> > not atomic and can have a performance degradation when multiple requests
> > get concurrently submitted to the same bucket. We had done superficial
> > tests and ycsb runs, where individual renames of files larger than 5GB can
> > take a few hundreds of seconds to complete. We also observed impacts in
> > write loads throughput, the bottleneck potentially being the renames.
> >
> > With S3 being an important piece of my employer cloud solution, we would
> > like to help it move forward. We plan to contribute new patches per the
> > original design/Jira, but we’d also be happy to review changes from the
> > original authors, too. Please let us know if anyone has any concerns,
> > otherwise we’ll start to self-assign issues on HBASE-24749
> >
> > Wellington
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
