Elek, Marton created HDDS-922:
---------------------------------
Summary: Create isolated classloader to use ozonefs with any older
hadoop versions
Key: HDDS-922
URL: https://issues.apache.org/jira/browse/HDDS-922
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Filesystem
Reporter: Elek, Marton
Assignee: Elek, Marton
As of now we create a shaded ozonefs artifact which includes all the required
class files to use ozonefs (the Hadoop compatible file system for Ozone).
But the shading process of this artifact is very simple: it includes all the
class files, but no relocation rules (package name renaming) are configured.
With this approach ozonefs can be used from the compatible hadoop version (this
is hadoop 3.1 only, I guess) but can't be used with any older hadoop version as
it requires the newer version of hadoop-common.
I tried to configure full shading (with relocation) but it's not a simple
task. For example, a pure (non-relocated) Configuration is required by
ozonefs itself, but another, newer Configuration class is required by the
ozone client code, which is a dependency of OzoneFileSystem. So we need a
relocated and a non-relocated class at the same time.
I tried out a different approach: I moved out all of the ozone specific classes
from the OzoneFileSystem to an adapter class (OzoneClientAdapter). In case of
an older hadoop version the adapter class itself can be loaded with an isolated
classloader. The isolated classloader can load all the required classes from
the jar file from a specific path. It doesn't require any specific package
relocation as the default class loader doesn't load these classes.
The OzoneFileSystem (in case of an older hadoop version) can load the adapter
with the isolated classloader, and only a few classes need to be shared between
the normal and the isolated classloader (the interface of the adapter and the
types in the method signatures). All of the other ozone classes and the newer
hadoop dependencies will be hidden by the isolated classloader.
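To make the idea more concrete, here is a minimal sketch of such a classloader.
It is not the code from the patch: the class name IsolatedClassLoader, the
sharedClasses set, and the fallback to the parent for classes missing from the
jar are all assumptions made for illustration. It looks for a class in its own
URLs first and goes to the parent only for JDK classes and for the explicitly
shared ones.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Set;

/**
 * Hypothetical sketch of an isolated (child-first) classloader.
 * Only the names listed in sharedClasses are taken from the parent;
 * everything else is looked up in this loader's own URLs first.
 */
final class IsolatedClassLoader extends URLClassLoader {

    // Names of the classes visible to both sides, e.g. the adapter
    // interface and the types used in its method signatures (assumption).
    private final Set<String> sharedClasses;

    IsolatedClassLoader(URL[] urls, ClassLoader parent, Set<String> sharedClasses) {
        super(urls, parent);
        this.sharedClasses = sharedClasses;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // JDK classes and the shared classes always come from the parent,
            // so both sides see the same Class objects for them.
            if (name.startsWith("java.") || sharedClasses.contains(name)) {
                return super.loadClass(name, resolve);
            }
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Child-first: search this loader's own URLs before the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the isolated jar: fall back to normal delegation.
                    // A stricter variant could rethrow here instead.
                    return super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

With something like this, the OzoneFileSystem side would create the loader with
the URL of the jar that holds the ozone classes, load the adapter
implementation by name, instantiate it via reflection, and cast the instance to
the shared adapter interface. The cast only works because the interface itself
is in the shared set and is therefore loaded by the parent on both sides.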
This patch is more like a proof of concept; I would like to start a discussion
about this approach. I successfully used the generated artifact to use ozonefs
from the default spark 2.4 distribution (which includes hadoop 2.7).
For a final patch I would add some check to use ozonefs without any
classpath separation by default. (The separation could be configured or chosen
automatically.)
For using spark (+ hadoop 2.7 + kubernetes scheduler) together with ozone, you
can check this screencast: https://www.youtube.com/watch?v=cpRJcSHIEdM&t=8s