Marton, this is what we had on the apache docs as of 2/17 when I last
checked:

S3 Fuse driver (goofys)
Goofys is an S3 FUSE driver. It can be used to mount any Ozone
bucket as a POSIX file system:

    goofys --endpoint http://localhost:9878 bucket1 /mount/bucket1


So +1 for the change above.

Also, do we have tests for what happens after the mount is successful? What
about the working behavior of the mounted volume? IMO we need to invest some
time in this for CSI readiness.
Agreed that it is a selling point, but if we claim support without a clear
disclaimer like the one we have now, it sends a very wrong message.
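
A minimal sketch of such a post-mount check, assuming the bucket is already
mounted at a local directory. The function name and the chosen operations are
illustrative and not an existing Ozone test. Append and rename are included
only as examples of operations where an S3-backed FUSE mount might differ from
a local POSIX file system.

```python
import os
import tempfile
import uuid


def smoke_test_mount(mount_dir):
    """Run basic file operations under mount_dir and report which ones pass."""
    results = {}
    name = "smoke-" + uuid.uuid4().hex
    path = os.path.join(mount_dir, name)
    renamed = path + ".renamed"
    payload = b"ozone mount smoke test\n"

    def check(label, fn):
        # Record False instead of raising, so one failure does not hide the rest.
        try:
            results[label] = bool(fn())
        except OSError:
            results[label] = False

    def write():
        with open(path, "wb") as f:
            f.write(payload)
        return True

    def read_back():
        with open(path, "rb") as f:
            return f.read() == payload

    def append():
        with open(path, "ab") as f:
            f.write(payload)
        return os.path.getsize(path) == 2 * len(payload)

    def rename():
        os.rename(path, renamed)
        return os.path.exists(renamed) and not os.path.exists(path)

    def delete():
        # Clean up whichever name the file ended up with.
        target = renamed if os.path.exists(renamed) else path
        os.remove(target)
        return not os.path.exists(target)

    check("write", write)
    check("read_back", read_back)
    check("list", lambda: name in os.listdir(mount_dir))
    check("append", append)
    check("rename", rename)
    check("delete", delete)
    return results


if __name__ == "__main__":
    # Stand-in directory; point this at the real mount (e.g. /mount/bucket1).
    with tempfile.TemporaryDirectory() as d:
        for op, ok in smoke_test_mount(d).items():
            print(f"{op}: {'ok' if ok else 'FAILED'}")
```

Running it against a plain local directory first gives a baseline to compare
with the results from the mounted bucket.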

- Sid



On Fri, Mar 12, 2021 at 10:15 AM Arpit Agarwal
<aagar...@cloudera.com.invalid> wrote:

> I am +1 on your proposed changes. It makes it clear upfront that the
> support is incomplete and uses fuse s3.
>
> Thanks for putting up the patch.
>
>
> > On Mar 12, 2021, at 9:57 AM, Elek, Marton <e...@apache.org> wrote:
> >
> >
> > > If we simplify the picture, the two biggest Apache Ozone advantages
> > > compared with HDFS are the following (IMHO!):
> >
> > > 1. better scalability: it can handle billions of files
> > > 2. better interface support: it can be used from multiple interfaces, not
> > > only from Hadoop compatible interfaces (S3, CSI)
> >
> >
> > > I think the second is as important as the first. Ozone can be
> > > used not only from Hadoop compatible tools like Spark and Hive but also
> > > from an S3 compatible data science or ML tool, or (via a Fuse file system)
> > > from Yarn or Kubernetes containers.
> >
> >
> > > There is a well-known slide about this which is used widely (at least by
> > > me): a big Ozone logo with smaller Hadoop/AWS/K8s logos. It's used in all the
> > > Ozone videos and other conference presentations, and it is part of the official
> > > documentation: https://ozone.apache.org/docs/1.0.0/
> >
> > > It was also used when Apache Ozone was shown at Cloud Native conf to a
> > > non-Hadoop audience.
> >
> >
> >
> > > Let's look at the CSI feature more closely:
> > >
> > > CSI is nothing more than a very lightweight interface which can receive
> > > requests from a container orchestrator to create storage (creating a bucket in
> > > our case) and can receive requests to mount it.
> >
> > > (for more information about CSI, check this video:
> > > https://www.youtube.com/watch?v=xQwXnuVr8hc&list=PLCaV-jpCBO8UK5Ged2A_iv3eHuozzMsYv&index=10&t=387s)
> >
> >
> > > The hard part is not the CSI interface; the hard part is mounting.
> > >
> > > How can I mount Ozone buckets?
> >
> >
> > > Using Ozone (or at least HDDS) as some kind of block store was always
> > > part of our vision:
> > >
> > > * HDFS-11118 showed how it is possible to mount huge HDDS containers
> > > (with jscsi) as an ext4 file system
> > >
> > > This worked very well, but it wasn't merged back to Hadoop trunk together
> > > with the other parts, and it had one big limitation: the containers were
> > > used as a raw storage backend, and files were not visible via other
> > > interfaces (S3 or ofs/o3fs).
> >
> >
> > To fix this there were multiple experiments:
> >
> > * Try to use libhdfs based fuse file system for Ozone (HDDS-3352)
> > * Try to support NFS based on Hadoop NFS support (HDDS-3001)
> >
> > > And (as we have a proper S3 compatible endpoint) we also tried to use S3
> > > compatible fuse file systems. We tested goofys, fixed incompatibilities,
> > > and it worked well.
> > >
> > > But long term, the most effective solution would be a native fuse driver
> > > (a prototype can be found at https://github.com/elek/ozone-go and we had
> > > an agreement to move it to the Apache Ozone repository).
> >
> >
> >
> > > So Ozone has simple but working CSI support today: it handles CSI
> > > requests, and the mount command is configurable. The default value is goofys,
> > > but there are other options, for example https://github.com/s3fs-fuse/s3fs-fuse
> > > or https://github.com/archiecobbs/s3backer
> > >
> > > You can use any of the available fuse drivers based on your requirements
> > > / environments.
> >
> >
> >
> > > Recently we had a debate with Arpit about the documentation of CSI
> > > (https://issues.apache.org/jira/browse/HDDS-4904).
> >
> >
> >
> > > Arpit claims that we should remove the documentation of the CSI driver
> > > because Goofys (one of the available implementations) is not production
> > > ready.
> >
> >
> > > I have strong concerns about it:
> > >
> > > * Goofys is just one possible configuration value; any other driver can
> > > be used as the mount implementation
> > >
> > > * As we have this feature implemented, it should be documented
> > >
> > > * It's an important part of Ozone's selling points and we already shared it
> > > with the wider community
> > >
> > > * Even today it can be used with the right choice of S3 fuse driver.
> > >
> > > * Default settings may or may not be acceptable in production (depends
> > > on whether you need strict POSIX compatibility in your prod env or not)
> >
> >
> >
> > > Instead, I suggest that we CLEARLY DOCUMENT the state of CSI, what kind
> > > of guarantees can be expected, and what the risks are (and what the
> > > long-term plans are):
> > >
> > > (my suggested patch is here:
> > > https://github.com/elek/ozone/commit/e56b23499686ce5e90c65285099445e5ee0a935f
> > >
> > > with updated image:
> > > https://github.com/elek/ozone/blob/csi-alpha/hadoop-hdds/docs/static/ozone-usage.png
> > > )
> >
> >
> > > Please let me know what your opinion is.
> >
> > Thanks a lot
> > Marton
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
>
>