Apache9 commented on PR #5545: URL: https://github.com/apache/hbase/pull/5545#issuecomment-1962840902
For me, the question of whether to introduce a new store engine here is clear. If this is a new file layout that cannot work together with other store engines, we should introduce a new store engine. If not, we should treat it as an optimization and apply it to every store engine that can benefit from it.

Here, I think at least DateTieredStoreEngine could still use this optimization: since we split tiers by timestamp, generating two files does not break the rule. For StripeStoreEngine, the optimization should be OK for range-based tiers, but for size-based tiers I'm not sure, and it may cause problems. For MobStoreEngine, I think it can at least be applied to the index file; I'm not sure whether it is OK to also apply it to the mob data file.

So in general I think the previous reviewers' opinion is correct: this should be considered an optimization, not a brand-new layout.

Back to the StoreFileWriter problem. As I said above, other store engines could also apply this optimization, so we should add support for writing the latest cells to a separate file to StoreFileWriter directly, rather than introducing a DualFileWriter and using it only in DefaultStoreEngine. Technically, I agree that we could have a 'StoreFileWriter' that always writes a single store file, and combine several of them to support writing multiple store files at once. The problem is still the naming: I do not think it is a good idea to use 'DualFileWriter' in DefaultStoreEngine, so we should give it another name. Also, the other store engines should use the new store file writer that supports writing the latest cells to a separate file, instead of the single-file store file writer. In the first version we can add a check so that it is not enabled for store engines other than the default one, but theoretically it should also work with the other store engines.

Thanks.

-- This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
