Hi Accumulo Devs,

Lately, Dave Marion (Apache ID: dlmarion) has been working on
prototyping some new class loader concepts for Accumulo that he and I
have discussed, and I wanted to pitch the idea here for consideration
for the project.

# Background:

Accumulo currently has two classloaders that are instantiated at
startup, and which can be used to bootstrap Accumulo dependencies (at
least, those not needed for the classloader code itself). This allows
us to use the `general.classpaths`[1] and
`general.dynamic.classpaths`[2] properties, as well as the per-context
classloaders (`general.vfs.*`[3] and `table.classpath.context`[4]) for
things like iterator class isolation. Since 2.0.0, we have deprecated
`general.classpaths` and `general.dynamic.classpaths`, the former
supplanted by the better use of the `CLASSPATH` environment variable
(along with much improved scripts in 2.0.0), and the latter being
replaceable by a user-provided class loader using the built-in Java
property, `java.system.class.loader`[5], at their discretion.
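
To illustrate that last option, here is a minimal sketch of what a
user-provided system class loader could look like. The class name
`ExampleSystemClassLoader` and its behavior are hypothetical and are
not part of Accumulo; the only fixed requirement, per the Java
documentation, is a public constructor taking a single `ClassLoader`
parameter, which is used as the delegation parent.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical user-provided system class loader (illustrative only,
 * not part of Accumulo).
 */
public class ExampleSystemClassLoader extends URLClassLoader {

  // The JVM instantiates the class named by java.system.class.loader
  // through a public constructor that receives the default system
  // class loader, which becomes the delegation parent.
  public ExampleSystemClassLoader(ClassLoader parent) {
    super(new URL[0], parent);
  }

  // Widen the protected addURL method so that additional locations
  // can be registered after startup.
  @Override
  public void addURL(URL url) {
    super.addURL(url);
  }
}
```

Assuming the class is on the class path, it would be selected at
launch with `-Djava.system.class.loader=ExampleSystemClassLoader`.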

# The Problem:

The main problem with the current code is complexity. Accumulo is
already complex enough without needing to be in the business of
developing and supporting complex custom class loading features,
especially when users have viable alternatives that can be better
supported by independent, dedicated projects. Furthermore, these
custom class loaders also have a dependency on commons-vfs2, which has
been the source of numerous problems and bugs that we have needed to
deal with, and which affect Accumulo, even though they are not
necessarily bugs in Accumulo itself. This also brings in a lot of
optional dependencies that aren't needed by users who don't rely on
these features.

# The Requirements:

In spite of these problems, I believe we still want to enable the use
cases that our classloaders are currently enabling.

Specifically,
1) the ability to have separate contexts for iterator class isolation
(A/B testing of iterators, updating iterators in a live system, etc.),
and
2) the ability for users to bootstrap their class path from
distributed storage rather than local disk.

# The Proposal:

1. Create a new reloading vfs class loader, with functionality similar
to our two current class loaders that do the reloading and provide vfs
features, that can easily be used as a system class loader if the
user chooses, and deprecate (for removal in 3.0) the built-in
implementations. This class loader could be used not only with
Accumulo, but also by any other project that chooses to use it,
because it will have few, if any, dependencies beyond commons-vfs2,
and will certainly not depend on Accumulo. Creating this separate
class loader gives us a path forward to simplify Accumulo by removing
these features from Accumulo directly (the properties are already
deprecated), and allows the class loader to be maintained
independently.
2. Create a new class loader factory property in Accumulo, with a
corresponding SPI interface, so that users can provide their own
class loader factory implementation that maps a per-table "context"
to a ClassLoader of the implementation's choosing.
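
As a rough sketch of what item 2 could look like (not the prototype's
actual API), the following shows a factory interface that maps a
context name to a ClassLoader, plus one possible implementation that
caches a `URLClassLoader` per context. All names here
(`ContextClassLoaderFactory`, `UrlContextFactory`, `getClassLoader`)
are hypothetical and chosen only for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ContextClassLoaderSketch {

  /** Hypothetical SPI: maps a per-table context name to a ClassLoader. */
  interface ContextClassLoaderFactory {
    ClassLoader getClassLoader(String contextName);
  }

  /** Illustrative implementation: one cached URLClassLoader per context. */
  static final class UrlContextFactory implements ContextClassLoaderFactory {
    private final Map<String, URL[]> contextUrls;
    private final ClassLoader parent;
    private final Map<String, ClassLoader> cache = new ConcurrentHashMap<>();

    UrlContextFactory(Map<String, URL[]> contextUrls, ClassLoader parent) {
      this.contextUrls = contextUrls;
      this.parent = parent;
    }

    @Override
    public ClassLoader getClassLoader(String contextName) {
      URL[] urls = contextUrls.get(contextName);
      if (urls == null) {
        throw new IllegalArgumentException("Unknown context: " + contextName);
      }
      // Each context gets its own loader, so the same class name can
      // resolve to different classes in different contexts.
      return cache.computeIfAbsent(contextName,
          name -> new URLClassLoader(urls, parent));
    }
  }
}
```

Because each context has its own loader instance, two tables
configured with different contexts would see isolated iterator
classes, while tables sharing a context would share a loader.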

Doing these two things will allow us to more flexibly support user
class loading needs, without being directly responsible for class
loading implementations inside Accumulo's core code. All the
same functionality that is available today will continue to exist, but
will be configured differently. The resulting code in Accumulo will be
dramatically simpler, as we would no longer have any complex class
loading implementations in our code base, and we would no longer have
any direct dependency on commons-vfs2, which has been problematic.
Independent implementations may use commons-vfs2, or something else,
but will be more easily testable and maintainable as independent
projects that are pluggable in Accumulo.

Dave has already been working on prototyping these proposed changes,
and it is looking very feasible.

We are now ready to:
1. get feedback on the overall proposal, and
2. decide on where to maintain the separate class loader.

As for where to maintain it, the options seem to be: A) try to donate
it to commons-vfs2, or B) maintain it in a new repository,
accumulo-vfs-classloader.

Note that we have not yet proposed the idea of a user-facing,
configurable, reloading vfs classloader to the commons-vfs2
developers. We wanted to get our own community's feedback on this
first.

Please discuss.

Thanks,
Christopher (in collaboration with Dave)

[1]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_classpaths
[2]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_dynamic_classpaths
[3]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_vfs_context_classpath_prefix
[4]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#table_classpath_context
[5]: 
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#getSystemClassLoader--
