In addition, I have yet to confirm this, but based on another search of the Hadoop code, we may be able to add lease recovery as a feature flag in CommonPathCapabilities [3], which callers could then probe through PathCapabilities#hasPathCapability [4]. (This is similar to StreamCapabilities, as mentioned by Viraj.)
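
To make the idea concrete, here is a rough sketch of how a caller could probe by capability instead of checking the file system class. It does not use the real Hadoop classes: the nested PathCapabilities interface is a simplified stand-in (it takes a String where the real one takes a Path), FakeFileSystem is hypothetical, and the capability key is only an assumed name for illustration, not an existing constant on branch-3.3.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch: decide on lease recovery by probing a capability, not by instanceof. */
final class LeaseRecoveryProbe {

    // Assumed capability key, for illustration only.
    static final String LEASE_RECOVERABLE = "fs.capability.lease.recoverable";

    /** Simplified stand-in for org.apache.hadoop.fs.PathCapabilities. */
    interface PathCapabilities {
        boolean hasPathCapability(String path, String capability) throws IOException;
    }

    /** Hypothetical file system that advertises a fixed set of capabilities. */
    static final class FakeFileSystem implements PathCapabilities {
        private final Set<String> capabilities;

        FakeFileSystem(String... capabilities) {
            this.capabilities = new HashSet<>(Arrays.asList(capabilities));
        }

        @Override
        public boolean hasPathCapability(String path, String capability) {
            return capabilities.contains(capability);
        }
    }

    /** Returns true if fs advertises lease recovery for path; a failed probe counts as "no". */
    static boolean canRecoverLease(PathCapabilities fs, String path) {
        try {
            return fs.hasPathCapability(path, LEASE_RECOVERABLE);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        PathCapabilities dfsLike = new FakeFileSystem(LEASE_RECOVERABLE);
        PathCapabilities plain = new FakeFileSystem();
        System.out.println(canRecoverLease(dfsLike, "/hbase/WALs/wal-1")); // prints true
        System.out.println(canRecoverLease(plain, "/hbase/WALs/wal-1"));   // prints false
    }
}
```

With something like this, the HBase side would only need the capability key, so neither HDFS nor Ozone would have to be named in the check.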

3. https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonPathCapabilities.java
4. https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PathCapabilities.java

-Stephen

On Thu, Mar 16, 2023 at 12:00 AM Tak Lon (Stephen) Wu <[email protected]> wrote:
>
> Thanks everyone ! Sean helped to clarify that something like DFS specific
> APIs used by HBase has been in-place in many HBase modules as the feature
> implementation but yet standardized in hadoop general FileSystem API, e.g.
> lease recovery. One of the original worries is if the Hadoop/HDFS community
> would reject our proposal when we change the base interface/abstract class
> in FileSystem (if it's non-backward compatible). The discussion here helps
> to confirm the direction, and let's see how we can make it generic and
> could help to avoid confusion in both places.
>
> Thanks again,
> Stephen
>
> On Wed, Mar 15, 2023 at 2:54 PM Andrew Purtell <[email protected]> wrote:
>
> > Then Hadoop should add one and although we would need a reflection based
> > check in the interim we can converge toward the ideal.
> >
> > In any case I believe we can avoid a direct dependency on Ozone and should
> > strongly avoid taking such unnecessary dependencies. The Hadoop and HBase
> > build dependency sets are already very large and we and other users are
> > being hit with significant security issue remediation work, much of which
> > represents compatibility problems and is not upstreamable (like protobuf 2
> > removal in 2.x). We struggle with the existing dependencies enough already
> > at my employer.
> >
> > > On Mar 15, 2023, at 1:53 PM, Sean Busbey <[email protected]> wrote:
> > >
> > > the check that Stephen is referring to is for logic around lease recovery
> > > and not stream flush/sync. the lease recovery is specific to DFS IIRC and
> > > doesn't have a FileSystem marker.
> > >
> > >> On Wed, Mar 15, 2023 at 3:22 PM Andrew Purtell <[email protected]> wrote:
> > >>
> > >> So we can test StreamCapabilities in code, in worst case by wrapping some
> > >> probe code during startup with try-catch and examining the exception.
> > >>
> > >>> On Wed, Mar 15, 2023 at 1:09 PM Viraj Jasani <[email protected]> wrote:
> > >>>
> > >>> As of today, both WAL impl (fshlog and asyncfs) throw
> > >>> StreamLacksCapabilityException if the FS Data OutputStream probe fails for
> > >>> Hflush/Hsync:
> > >>>
> > >>> StreamLacksCapabilityException(StreamCapabilities.HFLUSH)
> > >>> and
> > >>> StreamLacksCapabilityException(StreamCapabilities.HSYNC)
> > >>>
> > >>>
> > >>> On Wed, Mar 15, 2023 at 12:51 PM Andrew Purtell <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Does Hadoop have a marker interface that lets an application know its
> > >>>> FileSystem instances can support hsync/hflush? Ideally all we should need
> > >>>> to do is test with instanceof for that marker and use reflection (in the
> > >>>> worst case) to get a handle to the hsync or hflush method, and then call
> > >>>> it. This approach should be taken wherever we have a requirement to use a
> > >>>> special WAL specific API provided by the underlying FileSystem, so we can
> > >>>> abstract it sufficiently to not require a direct dependency on Ozone or S3A
> > >>>> or any non HDFS filesystem.
> > >>>>
> > >>>> On Wed, Mar 15, 2023 at 12:31 PM Tak Lon (Stephen) Wu <[email protected]>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi team,
> > >>>>>
> > >>>>> Recently, Wei-Chiu and I have been discussing about if HBase can use
> > >>>>> Ozone as another storage as WAL (see the hsync and hflush JIRAs [1])
> > >>>>> and HFile, for HFile it’s pluggable by configuring the file system to
> > >>>>> use Ozone File System (Ozone)
> > >>>>>
> > >>>>> But we found that the WAL it’s a bit different, especially
> > >>>>> RecoverLeaseFSUtils#recoverFileLease [2], it has one check about if
> > >>>>> the file system is an instance of HDFS, and thus WAL recovery to
> > >>>>> execute file lease recovery from RS crashes. Here, if we would like to
> > >>>>> add Ozone, it does not matter by importing as a direct dependency to
> > >>>>> perform similar lease recovery or via reflection by class name in
> > >>>>> plaintext String, we still need to somehow introduce Ozone to be
> > >>>>> another supported file system. (we can discuss how we can implement
> > >>>>> better as well)
> > >>>>>
> > >>>>> We also found other places e.g. FSUtils and HFileSystem have used
> > >>>>> DistributedFileSystem, but it should be able to move them into either
> > >>>>> hbase-asyncfs or a new FS related component to separate the use of
> > >>>>> different supported file systems.
> > >>>>>
> > >>>>> So, we’re wondering if anyone would have any objections to adding
> > >>>>> Ozone as a dependency to hbase-asyncfs? or if you have a better idea
> > >>>>> how this could be added without adding Ozone as dependency, please
> > >>>>> feel free to comment on this thread.
> > >>>>>
> > >>>>>
> > >>>>> [1] Ozone is working on support for hsync and hflush,
> > >>>>> https://issues.apache.org/jira/browse/HDDS-7593,
> > >>>>> https://issues.apache.org/jira/browse/HDDS-4353
> > >>>>> [2] RecoverLeaseFSUtils#recoverFileLease,
> > >>>>> https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/util/RecoverLeaseFSUtils.java#L53-L63
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Stephen
