Dear Hoppiverse,

Apache VFS is a very useful tool for supporting alternative ways of storing
files.
We have a plugin type for VFS drivers, which allows us to extend
functionality.
Where we're not doing a great job is in supporting driver-specific features
such as authentication, where this is needed.
For example, Amazon AWS needs an access key and a secret key (one of
several authentication options).

Usually the configuration of these access APIs underpinning the VFS drivers
is left as standard as possible.
However, this drags the configuration outside of the Hop-isphere, with
effects like the following:
- Variables can't be used.
- Keys and passwords are not obfuscated or encrypted.
- There are no GUI or configuration elements to configure the security.
- No metadata wrappers are available or supported.
...

To get past these shortcomings I propose to turn HopVfs from a singleton
into a proper class.
Class HopVfs is used about 800 times in the source code.  It's a wrapper
around the Apache VFS API.

In the ideal case we would have a new metadata object type called something
like "Amazon Web Services Authentication", which would have fields like
access key and secret key but also perhaps a checkbox:
[x] configured by the system.
We could then have a global configuration option for the S3 VFS driver
which simply says which "AWS Auth" object to use.
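
To make the metadata idea a bit more concrete, here is a rough sketch in
Java. Everything in it is hypothetical: the class and field names are made
up for illustration, a plain Map stands in for the Hop variable space, and
the meaning of the checkbox (ignore the stored keys and let the system
supply credentials) is my assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names exist in Hop as written here.
public class AwsAuthSketch {

  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  /** Replaces ${NAME} references; unknown names are left untouched. */
  public static String resolve(String text, Map<String, String> variables) {
    Matcher m = VAR.matcher(text);
    StringBuilder sb = new StringBuilder();
    while (m.find()) {
      String value = variables.getOrDefault(m.group(1), m.group(0));
      m.appendReplacement(sb, Matcher.quoteReplacement(value));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  /** Stand-in for an "Amazon Web Services Authentication" metadata object. */
  public static final class AwsAuth {
    public final String name;
    public final String accessKey;           // may contain ${VARIABLES}
    public final String secretKey;           // may contain ${VARIABLES}
    public final boolean configuredBySystem; // the proposed checkbox

    public AwsAuth(String name, String accessKey, String secretKey,
                   boolean configuredBySystem) {
      this.name = name;
      this.accessKey = accessKey;
      this.secretKey = secretKey;
      this.configuredBySystem = configuredBySystem;
    }

    /** Assumption: null means "let the system provide the credentials". */
    public String resolvedAccessKey(Map<String, String> variables) {
      return configuredBySystem ? null : resolve(accessKey, variables);
    }
  }
}
```

The point is only that the keys live in a named metadata object and pass
through variable resolution before they reach the driver. Encryption of
the secret key would hook in at the same place.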

This would address many if not all concerns.  The "only" thing we need to
change is moving from:

HopVfs.getFileObject(filename)

to

new HopVfs(metadataProvider, variables).getFileObject(filename)

Obviously a case can be made to cache the HopVfs objects at various
locations (one per pipeline, workflow, ...).
You get the idea.
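
As a rough illustration of the instance-based approach with caching, here
is a minimal Java sketch. It is not the real HopVfs: plain Maps stand in
for the metadata provider and the variable space, getFileObject() returns
the resolved file name instead of an Apache VFS FileObject, and the
forOwner() helper and its owner id are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; the real HopVfs wraps the Apache VFS API.
public class HopVfsSketch {

  // One cached instance per owner (pipeline, workflow, ...).
  private static final Map<String, HopVfsSketch> CACHE =
      new ConcurrentHashMap<>();

  private final Map<String, String> metadata;  // stand-in: scheme -> auth name
  private final Map<String, String> variables; // stand-in: variable space

  public HopVfsSketch(Map<String, String> metadata,
                      Map<String, String> variables) {
    this.metadata = metadata;
    this.variables = variables;
  }

  /** Returns the cached instance for an owner, creating it on first use. */
  public static HopVfsSketch forOwner(String ownerId,
                                      Map<String, String> metadata,
                                      Map<String, String> variables) {
    return CACHE.computeIfAbsent(
        ownerId, id -> new HopVfsSketch(metadata, variables));
  }

  /** Stand-in for getFileObject(): resolves ${NAME} variables in the name. */
  public String getFileObject(String filename) {
    String resolved = filename;
    for (Map.Entry<String, String> e : variables.entrySet()) {
      resolved = resolved.replace("${" + e.getKey() + "}", e.getValue());
    }
    return resolved;
  }

  /** Name of the auth metadata object configured for a scheme, or null. */
  public String authFor(String scheme) {
    return metadata.get(scheme);
  }
}
```

Because each instance carries its own metadata and variables, two
pipelines can resolve the same file name against different credentials,
which a static HopVfs cannot do.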

I have a feeling that if we don't do this we'll continue to carry this
architectural debt around for quite a while longer, and into other
territories like Azure and Google.

Thoughts?

Matt
-- 
Neo4j Chief Solutions Architect
✉  [email protected]
☎  +32486972937
