It would be nice using PathCapabilities to determine lease recovery as a feature flag. In fact, s3a and abfs have lots of feature flags being derived from this API already. It would be good for dfs and ozone to recognize lease recovery as a capability.
However, this alone might not be sufficient and something like RecoverableFileSystem interface would be helpful as long as we can abstract out lease recovery (and safe mode etc) options as hbase anyways need to perform them. Hence, having both: a) path capability to identify if lease recovery etc features are available and b) a new FileSystem interface that both dfs and ozone can implement, would be great IMHO. Because even if we just have path capability for the feature flag, we would still end up adding ozone dependency (unless done with reflection as Andrew mentioned) to perform lease recovery unless lease recovery is abstracted out somewhere in hadoop. > One of the original worries is if the Hadoop/HDFS community > would reject our proposal when we change the base interface/abstract class > in FileSystem (if it's non-backward compatible). I believe, new IA.Public interface in hadoop that can abstract out lease recovery etc would have less likelihood of getting rejected than "making changes in FileSystem directly". On Thu, Mar 16, 2023 at 2:07 AM Tak Lon (Stephen) Wu <tak...@apache.org> wrote: > In addition, I'm yet confirm but based on another search in the hadoop > code, we may be able to add recover lease as a feature flag in > CommonPathCapabilities [3] and can be used by the interface of > PathCapabilities#hasPathCapability [4]. (this is similar to > StreamCapabilities as mentioned by Viraj) > > 3. > https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonPathCapabilities.java > 4. > https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PathCapabilities.java > > -Stephen > > On Thu, Mar 16, 2023 at 12:00 AM Tak Lon (Stephen) Wu <tak...@apache.org> > wrote: > > > > Thanks everyone ! Sean helped to clarify that something like DFS specific > > APIs used by HBase has been in-place in many HBase modules as the feature > > implementation but yet standardized in hadoop general FileSystem API, > e.g. > > lease recovery. One of the original worries is if the Hadoop/HDFS > community > > would reject our proposal when we change the base interface/abstract > class > > in FileSystem (if it's non-backward compatible). The discussion here > helps > > to confirm the direction, and let's see how we can make it generic and > > could help to avoid confusion in both places. > > > > Thanks again, > > Stephen > > > > On Wed, Mar 15, 2023 at 2:54 PM Andrew Purtell <andrew.purt...@gmail.com > > > > wrote: > > > > > Then Hadoop should add one and although we would need a reflection > based > > > check in the interim we can converge toward the ideal. > > > > > > In any case I believe we can avoid a direct dependency on Ozone and > should > > > strongly avoid taking such unnecessary dependencies. The Hadoop and > HBase > > > build dependency sets are already very large and we and other users are > > > being hit with significant security issue remediation work, much of > which > > > represents compatibility problems and is not upstreamable (like > protobuf 2 > > > removal in 2.x). We struggle with the existing dependencies enough > already > > > at my employer. > > > > > > > On Mar 15, 2023, at 1:53 PM, Sean Busbey <bus...@apache.org> wrote: > > > > > > > > the check that Stephen is referring to is for logic around lease > > > recovery > > > > and not stream flush/sync. the lease recovery is specific to DFS > IIRC and > > > > doesn't have a FileSystem marker. > > > > > > > >> On Wed, Mar 15, 2023 at 3:22 PM Andrew Purtell <apurt...@apache.org > > > > > wrote: > > > >> > > > >> So we can test StreamCapabilities in code, in worst case by wrapping > > > some > > > >> probe code during startup with try-catch and examining the > exception. > > > >> > > > >>> On Wed, Mar 15, 2023 at 1:09 PM Viraj Jasani <vjas...@apache.org> > > > wrote: > > > >>> > > > >>> As of today, both WAL impl (fshlog and asyncfs) throw > > > >>> StreamLacksCapabilityException if the FS Data OutputStream probe > fails > > > >> for > > > >>> Hflush/Hsync: > > > >>> > > > >>> StreamLacksCapabilityException(StreamCapabilities.HFLUSH) > > > >>> and > > > >>> StreamLacksCapabilityException(StreamCapabilities.HSYNC) > > > >>> > > > >>> > > > >>> On Wed, Mar 15, 2023 at 12:51 PM Andrew Purtell < > apurt...@apache.org> > > > >>> wrote: > > > >>> > > > >>>> Does Hadoop have a marker interface that lets an application know > its > > > >>>> FileSystem instances can support hsync/hflush? Ideally all we > should > > > >> need > > > >>>> to do is test with instanceof for that marker and use reflection > (in > > > >> the > > > >>>> worst case) to get a handle to the hsync or hflush method, and > then > > > >> call > > > >>>> it. This approach should be taken wherever we have a requirement > to > > > >> use a > > > >>>> special WAL specific API provided by the underlying FileSystem, > so we > > > >> can > > > >>>> abstract it sufficiently to not require a direct dependency on > Ozone > > > or > > > >>> S3A > > > >>>> or any non HDFS filesystem. > > > >>>> > > > >>>> On Wed, Mar 15, 2023 at 12:31 PM Tak Lon (Stephen) Wu < > > > >> tak...@apache.org > > > >>>> > > > >>>> wrote: > > > >>>> > > > >>>>> Hi team, > > > >>>>> > > > >>>>> Recently, Wei-Chiu and I have been discussing about if HBase can > use > > > >>>>> Ozone as another storage as WAL (see the hsync and hflush JIRAs > [1]) > > > >>>>> and HFile, for HFile it’s pluggable by configuring the file > system to > > > >>>>> use Ozone File System (Ozone) > > > >>>>> > > > >>>>> But we found that the WAL it’s a bit different, especially > > > >>>>> RecoverLeaseFSUtils#recoverFileLease [2], it has one check about > if > > > >>>>> the file system is an instance of HDFS, and thus WAL recovery to > > > >>>>> execute file lease recovery from RS crashes. Here, if we would > like > > > >> to > > > >>>>> add Ozone, it does not matter by importing as a direct > dependency to > > > >>>>> perform similar lease recovery or via reflection by class name in > > > >>>>> plaintext String, we still need to somehow introduce Ozone to be > > > >>>>> another supported file system. (we can discuss how we can > implement > > > >>>>> better as well) > > > >>>>> > > > >>>>> We also found other places e.g. FSUtils and HFileSystem have used > > > >>>>> DistributedFileSystem, but it should be able to move them into > either > > > >>>>> hbase-asyncfs or a new FS related component to separate the use > of > > > >>>>> different supported file systems. > > > >>>>> > > > >>>>> So, we’re wondering if anyone would have any objections to adding > > > >>>>> Ozone as a dependency to hbase-asyncfs? or if you have a better > idea > > > >>>>> how this could be added without adding Ozone as dependency, > please > > > >>>>> feel free to comment on this thread. > > > >>>>> > > > >>>>> > > > >>>>> [1] Ozone is working on support for hsync and hflush, > > > >>>>> https://issues.apache.org/jira/browse/HDDS-7593, > > > >>>>> https://issues.apache.org/jira/browse/HDDS-4353 > > > >>>>> [2] RecoverLeaseFSUtils#recoverFileLease, > > > >>>>> > > > >>>>> > > > >>>> > > > >>> > > > >> > > > > https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/util/RecoverLeaseFSUtils.java#L53-L63 > > > >>>>> > > > >>>>> Thanks, > > > >>>>> Stephen > > > >>> > > > >> > > > > -- Thanks, Viraj