[
https://issues.apache.org/jira/browse/DRILL-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038810#comment-17038810
]
ASF GitHub Bot commented on DRILL-7590:
---------------------------------------
paul-rogers commented on pull request #1988: DRILL-7590: Refactor plugin
registry
URL: https://github.com/apache/drill/pull/1988
Performs a thorough "spring cleaning" of the storage plugin registry to
prepare it to add a proper plugin API.
This is a complex PR with lots going on.
The plugin registry connects configurations, stored in ZK, with
implementations, which are Java classes. The registry handles a large number of
tasks including:
* Populating "bootstrap" plugin configurations and handling upgrades.
* Reading from, and writing to, the persistent store in ZK.
* Handling "normal" (configured) plugins and special system plugins (which
have no configuration.)
* Handle format plugins which are always associated with the DFS storage
plugin.
* Handle "ephemeral" plugins which correspond to configs not stored in the
registry.
* And so on.
The code has grown overly complex. As we look to add a new, cleaner plugin
mechanism, we will start by cleaning up what we have to allow the new mechanism
to be one of many.
## Terminology
There is no more confusing term in Drill than "plugin." That single term can
mean:
* The stored JSON definition for a plugin config. (What we see in the web
console.)
* The config object which holds the configuation.
* The storage plugin instance with the config attached. This is the
functional aspect
of a plugin.
* The storage plugin class itself.
To make the following discussion clearer, we redefine terms as:
* *Connector*: the storage plugin class (which needs a config to be useful)
* *Plugin*: the configuration of a plugin in any of its three forms: JSON,
config-only or as part
of a connector + config pair.
## Standard and System Connectors
The registry class handled many tasks itself, making the code hard to
follow. The first task is to split apart responsibilities into separate
classses.
The registry handles two kinds of plugins at present:
* "Classic" plugins are those defined by a `StoragePluginConfig` subclass
and a `StoragePlugin` subclass
with a specific constructor. Their configs are persistently stored in ZK.
That is, the storage plugins most of us think about.
* System plugins are a special case: they are always defined by default, and
have no (or, actually, an implicit) config.
Examples: `sys` and `information_schema` System plugins have the ``
annotation, are created at boot time, and do not reside in
the ZK store.
The first step is to split out these two kinds of plugins into separate
"provider" classes, along with a common interface. A new `ConnectorProvider`
interface has two implementations: one for "classic" plugins another for system
plugins. Then, when we add the new mechanism, it becomes a third plugin
provider.
## Bootstrap and Upgrade
The registry also handles the process of initializing a newly installed
Drill, or upgrading an existing one. The code for this is pulled out into a
separate class.
Moved the names of the bootstrap plugins and plugins upgrade files into the
config system
to allow easier testing with test-specific files. Added complete unit tests.
## Plugin Lifecycle
Plugins have a surprisingly robust lifecycle. Revised the code to better
model the nuances of the
lifecycle (and fix a number of subtle bugs).
Plugin instances must be created, but only for standard plugins (not system
plugins). Added a
`ConnectorHandle` so we can track the source of each connector so that the
locator can create
connector instances (for standard plugins) or not (for system plugins.)
Plugins are defined by persistent storage as a (name, config) pair. There is
no reason to
create a connector instance just to load plugins from storage. So, added a
`PluginHandle` class
to hold onto the (name, config, `ConnectorHandle`) triple.
This handle then allows us to do lazy instantiation of the connector class.
Rather than creating
it on load, we wait until some code actually needs the plugin. (Some code
still demands that we load all plugins; this can be fixed in a later PR.)
The registry API was changed to make this clear. `createOrUpdate()` is
renamed to `put` and
no longer returns the plugin instance (which, it turned out, was never
used.) Now, we don't
create the connector instance until `getPlugin()` is called. Added a new
`getConfig()` method for the many times we only want the config and don't
actually need the instance.
Drill is a concurrent, distributed system. Plugin (configurations) can
change at any time.
We might change `dfs` while queries run. The registry supports "ephemeral"
plugins, those
that occur in a query execution plan, but do not match a name in persistent
storage.
Previously, ephemeral plugins were not connected to normal named plugins.
Revised this so that
plugin instances migrate from the named cache to the ephemeral cache and
back again as the
user changes plugin definitions. This avoids having two plugins with the
same configuration
active at the same time (unless those plugins have different names.) It also
prevents redundant close/open pairs on the plugin instance.
## Plugin Bootstrap
The registry provides a way to load plugins on first start, and to upgrade
plugins
in existing versions. This code was gathered into a `PluginBootstrapLoader`
interface
and implementation so it can be extended for the new plugin system.
The upgrade code seems to have never been used, and would be somewhat
fragile if we
tried. It relies on detecting that a certain file exists in Drill, doing the
upgrade
if do, then deleting the file. While this works in the ideal case, it does
not work if
the same Drill install is used with two persistent stores, if the persistent
store is
restored to an earlier state, or if the user upgrades across multiple
versions in one
go. We need a version number in the persistent store in order to make
upgrade work
well, but doing so is out of scope of this particular PR.
The bootstrap system also has code to load or upgrade format plugins.
Wonderful idea.
However, today, format plugins are part of the storage plugin; there is no
good
mechanism to upgrade format plugins separately. (This limitation has more
the flavor
of "bug" than "feature.") Rather than remove the non-functional code, it is
left as
a future PR can/should allow sharing format plugins across file system
storage plugins,
at which time format plugin upgrades will make perfect sense as we add new
ones.
## Concurrency
The `StoragePluginMap` maintains two maps: one from name to plugins, another
from config
to plugins. The map tried to be thread-safe, but had some race conditions
that, under heavy load,
the two maps could get out of sync. Reworked this class to avoid the race
conditions. Then,
carefully reviewed all code paths to ensure that either we obtain locks (for
the `StoragePluginMap`)
or the code flow is such that a race condition is benign.
There was some unfortunate code that modified storage plugin configs *after*
they were stored in the registry. This must be strictly forbidden as it causes
the (name --> plugin) and (config --> plugin) maps to get out of sync. Fixed
all such issues that could be found.
## Removed Ad-Hoc Plugins
The registry contained support for ad-hoc plugins: the ability to pass
`false` for the `isPersistent` argument to the old `createOrUpdate()` method.
These were added (in part by this author)
to support test-only plugins. However, ad-hoc plugins create very messy
semantics. For example,
once we get the list of plugin classes at startup, we want the list to be
immutable to avoid
the need to synchronize the list. Adding test-only classes after start
breaks this assumption.
Test-only plugin configs can just be added with the `put()` method. Such
plugins will persist in the plugin store. This is fine, because tests usually
use a throw-away persistent store after each test.
Custom plugin classes can be marked with the new `@PrivatePlugin` annotation
so that the
"classic" locator won't load them. Then, set the new `PRIVATE_CONNECTORS`
property to list
those private classes that should be loaded. The private plugins are then
available for use.
## Revise Registry API
The registry wants to be pluggable (though we have only one implementation).
Replaced the method to get the underlying store with one to return the list of
plugins, which is what the only client of the former method needed.
Already mentioned that the `createOrUpdate` method changed to the simpler
`put`. Added a method to get a config without the plugin. Removed the plugin
return from `put` since no code
used it and it forces creation of the plugin (see below).
## Advice for Test Creators
The right way to create a plugin for a unit test is:
* Make sure your storage plugin (connector) class is on the class path. (If
you've just
added the class, you may need to do a new build, or use the code shown
below to force
a run-time scan.)
* Add the plugin via the `put` method on the registry giving a (name,
config) pair.
* If you need access to the storage plugin instance, call the `get()` method
with the name.
* If you use `ClusterTest`, call `defineStoragePlugin()` to define the
plugin on all Drillbits.
The code to force a run-time class scan is:
```
OperatorFixture.Builder builder =
OperatorFixture.builder(dirTestWatcher);
builder.configBuilder().put(ClassPathScanner.IMPLEMENTATIONS_SCAN_CACHE,
false);
```
Also, if you want to change the content of a plugin config stored in the
registry, you
*must* make a copy first, else the internal maps will get out of sync. Think
of formats
as immutable once stored in the registry. Code should be modified to reflect
this fact.
## Retire the "Named" Storage Plugin
Execution plans sent from the planner to the execution engine contain the
full configuration, in JSON form, for any storage or format plugins used by the
query. This may seem redundant: why include the information in the plan when we
could obtain it from the plugin registry? All we need is the plugin name and we
can do the retrieval.
The code contained an obscure plugin config classes:
`NamedStoragePluginConfig` which seemed to implement this idea. At present,
they are used by exactly one obscure test: `TextRecordReaderTest`.
One might think that we should broaden the use of this idea. But, a bit of
thought suggests why the original authors might not have done so. Consider what
happens if the user changes the config concurrently with running a query. The
update and query "race" each other to the worker nodes.
In the worst case, some nodes run the query with the old configuration, some
with the new, causing infrequent, random and obscure errors.
By copying the config into the plan, we avoid the race condition. It is a
poor man's versioned plugin storage. (The ephemeral store ensures that the
query sees the "old" config for the duration of the query.)
A more elegant solution might be to version each plugin configuration in the
shared DB so that the "named" config could say, "I want the s3 plugin at
version 2", even while version 3 is being propagated across the cluster.
However, since the current system works, and there are no plans for a
versioned system, it seems to make sense to simply remove the unused "named"
plugins and remove the one tests which uses them.
Note that the code also contains a `NamedFormatPluginConfig` class used in
several places. This format allows forcing the type of a format in a query.
This format does not have the same issues as the above class, and so remains in
the code.
## Add Unit Tests
At present there are no unit tests that directly exercise the plugin
registry. An analysis of a test run showed that some code is called only a few
times, by tests that are doing something that only indirectly relates to
plugins. (The previously-mentioned, `TextRecordReaderTest`, which was the only
use of a specific if-statement, impersonation tests which are the only use of a
"delete" method, and so on.
Add a unit tests that directly tests all plugin functionality. Refactor the
plugin code to allow such tests. (The upgrade mechanism, for example, only
works with a very specific system state which is hard to reproduce in tests.)
## Plugin Context
Plugin constructs accept a `DrillbitContext` instance. `DrillbitContext`
provides access to all
of Drill's internals. As we think about an improved plugin API, it is clear
we cannot have
external code depend on Drill's internals. We need a plugin-specific
context. This PR takes
a step in that direction with the `PluginRegistryContext` interface used, at
present, only
by the registry itself. This class also allows mocking for unit tests.
A future PR will revise storage plugins to accept a new
`StoragePluginContext` interface instead of `DrillbitContext`.
## Logical Plan Persistence
Cleaned up redundant code for finding serialized classes for operators.
Cleaned up references to
`LogicalPlanPersistence` when `ObjectMapper` would do.
## Random Other Changes
* Added the ability to customize the local persistent store even running in
embedded mode. Previously, the persistent store option was used only for a
distributed run, with a fixed store selection in embedded mode.
* Added an annotation to mark a class as a "private plugin" that will not be
found via the normal class path search.
* Added a config option to tell the server which (if any) private plugins to
load. This is requred because the list of plugin classes wants to be fixed
after Drillbit start.
* Fixed some formatting, removed some warnings.
* Separating the definition of a plugin (name, config) from the plugin
implementation, via a handle
* Pull persistent store code into a separate class
* Move the `DrillSchemaFactory` into a separate file.
* Fixed numerous small bugs exposed during code inspection and unit tests.
* Removes the unused `DistributedStorageEngine`, `BatchExceededException`,
`StorageDriver`, `StorageDriverFactory`, and `PluginHost` classes.
* Reformatted the `StoragePlugin` interface, but did not modify the set of
methods.
* Added a `PlanStringBuilder` class and used it to add `toString()` methods
to several plugins.
## Additional Work
A number of additional improvements are possible:
* The above-mentioned improvement to not load all schemas for every query.
Instead,
resolve schemas only as referenced.
* Rather than doing the class path analysis twice, perform it once and share
it
between classic and system connectors locators.
* Revise the `StoragePlugin` class to accept a plugin-specific context
instead of
the "kitchen sink" `DrillbitContext`.
## Documentation
Care was taken to ensure that, despite the many code changes, there is no
visible difference in behavior for end users.
## Testing
Added multiple new unit tests. Reran all unit tests and fixed issues.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Refactor plugin registry
> ------------------------
>
> Key: DRILL-7590
> URL: https://issues.apache.org/jira/browse/DRILL-7590
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.17.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Fix For: 1.18.0
>
>
> The plugin registry connects configurations, stored in ZK, with
> implementations, which are Java classes. The registry handles a large number
> of tasks including:
> * Populating "bootstrap" plugin configurations and handling upgrades.
> * Reading from, and writing to, the persistent store in ZK.
> * Handling "normal" (configured) plugins and special system plugins (which
> have no configuration.)
> * Handle format plugins which are always associated with the DFS storage
> plugin.
> * And so on.
> The code has grown overly complex. As we look to add a new, cleaner plugin
> mechanism, we will start by cleaning up what we have to allow the new
> mechanism to be one of many.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)