In addition, I'm yet confirm but based on another search in the hadoop
code, we may be able to add recover lease as a feature flag in
CommonPathCapabilities [3] and can be used by the interface of
PathCapabilities#hasPathCapability [4]. (this is similar to
StreamCapabilities as mentioned by Viraj)

3. 
https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonPathCapabilities.java
4. 
https://github.com/apache/hadoop/blob/branch-3.3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PathCapabilities.java

-Stephen

On Thu, Mar 16, 2023 at 12:00 AM Tak Lon (Stephen) Wu <[email protected]> wrote:
>
> Thanks everyone ! Sean helped to clarify that something like DFS specific
> APIs used by HBase has been in-place in many HBase modules as the feature
> implementation but yet standardized in hadoop general FileSystem API, e.g.
> lease recovery. One of the original worries is if the Hadoop/HDFS community
> would reject our proposal when we change the base interface/abstract class
> in FileSystem (if it's non-backward compatible). The discussion here helps
> to confirm the direction, and let's see how we can make it generic and
> could help to avoid confusion in both places.
>
> Thanks again,
> Stephen
>
> On Wed, Mar 15, 2023 at 2:54 PM Andrew Purtell <[email protected]>
> wrote:
>
> > Then Hadoop should add one and although we would need a reflection based
> > check in the interim we can converge toward the ideal.
> >
> > In any case I believe we can avoid a direct dependency on Ozone and should
> > strongly avoid taking such unnecessary dependencies. The Hadoop and HBase
> > build dependency sets are already very large and we and other users are
> > being hit with significant security issue remediation work, much of which
> > represents compatibility problems and is not upstreamable (like protobuf 2
> > removal in 2.x). We struggle with the existing dependencies enough already
> > at my employer.
> >
> > > On Mar 15, 2023, at 1:53 PM, Sean Busbey <[email protected]> wrote:
> > >
> > > the check that Stephen is referring to is for logic around lease
> > recovery
> > > and not stream flush/sync. the lease recovery is specific to DFS IIRC and
> > > doesn't have a FileSystem marker.
> > >
> > >> On Wed, Mar 15, 2023 at 3:22 PM Andrew Purtell <[email protected]>
> > wrote:
> > >>
> > >> So we can test StreamCapabilities in code, in worst case by wrapping
> > some
> > >> probe code during startup with try-catch and examining the exception.
> > >>
> > >>> On Wed, Mar 15, 2023 at 1:09 PM Viraj Jasani <[email protected]>
> > wrote:
> > >>>
> > >>> As of today, both WAL impl (fshlog and asyncfs) throw
> > >>> StreamLacksCapabilityException if the FS Data OutputStream probe fails
> > >> for
> > >>> Hflush/Hsync:
> > >>>
> > >>> StreamLacksCapabilityException(StreamCapabilities.HFLUSH)
> > >>> and
> > >>> StreamLacksCapabilityException(StreamCapabilities.HSYNC)
> > >>>
> > >>>
> > >>> On Wed, Mar 15, 2023 at 12:51 PM Andrew Purtell <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Does Hadoop have a marker interface that lets an application know its
> > >>>> FileSystem instances can support hsync/hflush? Ideally all we should
> > >> need
> > >>>> to do is test with instanceof for that marker and use reflection (in
> > >> the
> > >>>> worst case) to get a handle to the hsync or hflush method, and then
> > >> call
> > >>>> it. This approach should be taken wherever we have a requirement to
> > >> use a
> > >>>> special WAL specific API provided by the underlying FileSystem, so we
> > >> can
> > >>>> abstract it sufficiently to not require a direct dependency on Ozone
> > or
> > >>> S3A
> > >>>> or any non HDFS filesystem.
> > >>>>
> > >>>> On Wed, Mar 15, 2023 at 12:31 PM Tak Lon (Stephen) Wu <
> > >> [email protected]
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi team,
> > >>>>>
> > >>>>> Recently, Wei-Chiu and I have been discussing about if HBase can use
> > >>>>> Ozone as another storage as WAL (see the hsync and hflush JIRAs [1])
> > >>>>> and HFile, for HFile it’s pluggable by configuring the file system to
> > >>>>> use Ozone File System (Ozone)
> > >>>>>
> > >>>>> But we found that the WAL it’s a bit different, especially
> > >>>>> RecoverLeaseFSUtils#recoverFileLease [2], it has one check about if
> > >>>>> the file system is an instance of HDFS, and thus WAL recovery to
> > >>>>> execute file lease recovery from RS crashes. Here, if we would like
> > >> to
> > >>>>> add Ozone, it does not matter by importing as a direct dependency to
> > >>>>> perform similar lease recovery or via reflection by class name in
> > >>>>> plaintext String, we still need to somehow introduce Ozone to be
> > >>>>> another supported file system. (we can discuss how we can implement
> > >>>>> better as well)
> > >>>>>
> > >>>>> We also found other places e.g. FSUtils and HFileSystem have used
> > >>>>> DistributedFileSystem, but it should be able to move them into either
> > >>>>> hbase-asyncfs or a new FS related component to separate the use of
> > >>>>> different supported file systems.
> > >>>>>
> > >>>>> So, we’re wondering if anyone would have any objections to adding
> > >>>>> Ozone as a dependency to hbase-asyncfs? or if you have a better idea
> > >>>>> how this could be added without adding Ozone as dependency, please
> > >>>>> feel free to comment on this thread.
> > >>>>>
> > >>>>>
> > >>>>> [1] Ozone is working on support for hsync and hflush,
> > >>>>> https://issues.apache.org/jira/browse/HDDS-7593,
> > >>>>> https://issues.apache.org/jira/browse/HDDS-4353
> > >>>>> [2] RecoverLeaseFSUtils#recoverFileLease,
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/util/RecoverLeaseFSUtils.java#L53-L63
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Stephen
> > >>>
> > >>
> >

Reply via email to