[GitHub] [drill] paul-rogers opened a new pull request #1988: DRILL-7590: Refactor plugin registry

GitBox Mon, 17 Feb 2020 22:21:13 -0800

paul-rogers opened a new pull request #1988: DRILL-7590: Refactor plugin 
registry
URL: https://github.com/apache/drill/pull/1988
 
 
   Performs a thorough "spring cleaning" of the storage plugin registry to 
prepare it to add a proper plugin API.
   
   This is a complex PR with lots going on.
   
   The plugin registry connects configurations, stored in ZK, with 
implementations, which are Java classes. The registry handles a large number of 
tasks including:
   
   * Populating "bootstrap" plugin configurations and handling upgrades.
   * Reading from, and writing to, the persistent store in ZK.
   * Handling "normal" (configured) plugins and special system plugins (which 
have no configuration.)
   * Handle format plugins which are always associated with the DFS storage 
plugin.
   * Handle "ephemeral" plugins which correspond to configs not stored in the 
registry.
   * And so on.
   
   The code has grown overly complex. As we look to add a new, cleaner plugin 
mechanism, we will start by cleaning up what we have to allow the new mechanism 
to be one of many.
   
   ## Terminology
   
   There is no more confusing term in Drill than "plugin." That single term can 
mean:
   
   * The stored JSON definition for a plugin config. (What we see in the web 
console.)
   * The config object which holds the configuation.
   * The storage plugin instance with the config attached. This is the 
functional aspect
     of a plugin.
   * The storage plugin class itself.
   
   To make the following discussion clearer, we redefine terms as:
   
   * *Connector*: the storage plugin class (which needs a config to be useful)
   * *Plugin*: the configuration of a plugin in any of its three forms: JSON, 
config-only or as part
     of a connector + config pair.
   
   ## Standard and System Connectors
   
   The registry class handled many tasks itself, making the code hard to 
follow. The first task is to split apart responsibilities into separate 
classses.
   
   The registry handles two kinds of plugins at present:
   
   * "Classic" plugins are those defined by a `StoragePluginConfig` subclass 
and a `StoragePlugin` subclass
   with a specific constructor. Their configs are persistently stored in ZK. 
That is, the storage plugins most of us think about.
   * System plugins are a special case: they are always defined by default, and 
have no (or, actually, an implicit) config.
   Examples: `sys` and `information_schema` System plugins have the `` 
annotation, are created at boot time, and do not reside in
   the ZK store.
   
   The first step is to split out these two kinds of plugins into separate 
"provider" classes, along with a common interface. A new `ConnectorProvider` 
interface has two implementations: one for "classic" plugins another for system 
plugins. Then, when we add the new mechanism, it becomes a third plugin 
provider.
   
   ## Bootstrap and Upgrade
   
   The registry also handles the process of initializing a newly installed 
Drill, or upgrading an existing one. The code for this is pulled out into a 
separate class.
   
   Moved the names of the bootstrap plugins and plugins upgrade files into the 
config system
   to allow easier testing with test-specific files. Added complete unit tests.
   
   ## Plugin Lifecycle
   
   Plugins have a surprisingly robust lifecycle. Revised the code to better 
model the nuances of the
   lifecycle (and fix a number of subtle bugs).
   
   Plugin instances must be created, but only for standard plugins (not system 
plugins). Added a
   `ConnectorHandle` so we can track the source of each connector so that the 
locator can create
   connector instances (for standard plugins) or not (for system plugins.)
   
   Plugins are defined by persistent storage as a (name, config) pair. There is 
no reason to
   create a connector instance just to load plugins from storage. So, added a 
`PluginHandle` class
   to hold onto the (name, config, `ConnectorHandle`) triple.
   
   This handle then allows us to do lazy instantiation of the connector class. 
Rather than creating
   it on load, we wait until some code actually needs the plugin. (Some code 
still demands that we load all plugins; this can be fixed in a later PR.)
   
   The registry API was changed to make this clear. `createOrUpdate()` is 
renamed to `put` and
   no longer returns the plugin instance (which, it turned out, was never 
used.) Now, we don't
   create the connector instance until `getPlugin()` is called. Added a new 
`getConfig()` method for the many times we only want the config and don't 
actually need the instance.
   
   Drill is a concurrent, distributed system. Plugin (configurations) can 
change at any time.
   We might change `dfs` while queries run. The registry supports "ephemeral" 
plugins, those
   that occur in a query execution plan, but do not match a name in persistent 
storage.
   
   Previously, ephemeral plugins were not connected to normal named plugins. 
Revised this so that
   plugin instances migrate from the named cache to the ephemeral cache and 
back again as the
   user changes plugin definitions. This avoids having two plugins with the 
same configuration
   active at the same time (unless those plugins have different names.) It also 
prevents redundant close/open pairs on the plugin instance.
   
   ## Plugin Bootstrap
   
   The registry provides a way to load plugins on first start, and to upgrade 
plugins
   in existing versions. This code was gathered into a `PluginBootstrapLoader` 
interface
   and implementation so it can be extended for the new plugin system.
   
   The upgrade code seems to have never been used, and would be somewhat 
fragile if we
   tried. It relies on detecting that a certain file exists in Drill, doing the 
upgrade
   if do, then deleting the file. While this works in the ideal case, it does 
not work if
   the same Drill install is used with two persistent stores, if the persistent 
store is
   restored to an earlier state, or if the user upgrades across multiple 
versions in one
   go. We need a version number in the persistent store in order to make 
upgrade work
   well, but doing so is out of scope of this particular PR.
   
   The bootstrap system also has code to load or upgrade format plugins. 
Wonderful idea.
   However, today, format plugins are part of the storage plugin; there is no 
good
   mechanism to upgrade format plugins separately. (This limitation has more 
the flavor
   of "bug" than "feature.") Rather than remove the non-functional code, it is 
left as
   a future PR can/should allow sharing format plugins across file system 
storage plugins,
   at which time format plugin upgrades will make perfect sense as we add new 
ones.
   
   ## Concurrency
   
   The `StoragePluginMap` maintains two maps: one from name to plugins, another 
from config
   to plugins. The map tried to be thread-safe, but had some race conditions 
that, under heavy load,
   the two maps could get out of sync. Reworked this class to avoid the race 
conditions. Then,
   carefully reviewed all code paths to ensure that either we obtain locks (for 
the `StoragePluginMap`)
   or the code flow is such that a race condition is benign.
   
   There was some unfortunate code that modified storage plugin configs *after* 
they were stored in the registry. This must be strictly forbidden as it causes 
the (name --> plugin) and (config --> plugin) maps to get out of sync. Fixed 
all such issues that could be found.
   
   ## Removed Ad-Hoc Plugins
   
   The registry contained support for ad-hoc plugins: the ability to pass 
`false` for the `isPersistent` argument to the old `createOrUpdate()` method. 
These were added (in part by this author)
   to support test-only plugins. However, ad-hoc plugins create very messy 
semantics. For example,
   once we get the list of plugin classes at startup, we want the list to be 
immutable to avoid
   the need to synchronize the list. Adding test-only classes after start 
breaks this assumption.
   
   Test-only plugin configs can just be added with the `put()` method. Such 
plugins will persist in the plugin store. This is fine, because tests usually 
use a throw-away persistent store after each test.
   
   Custom plugin classes can be marked with the new `@PrivatePlugin` annotation 
so that the
   "classic" locator won't load them. Then, set the new `PRIVATE_CONNECTORS` 
property to list
   those private classes that should be loaded. The private plugins are then 
available for use.
   
   ## Revise Registry API
   
   The registry wants to be pluggable (though we have only one implementation). 
Replaced the method to get the underlying store with one to return the list of 
plugins, which is what the only client of the former method needed.
   
   Already mentioned that the `createOrUpdate` method changed to the simpler 
`put`. Added a method to get a config without the plugin. Removed the plugin 
return from `put` since no code
   used it and it forces creation of the plugin (see below).
   
   ## Advice for Test Creators
   
   The right way to create a plugin for a unit test is:
   
   * Make sure your storage plugin (connector) class is on the class path. (If 
you've just
     added the class, you may need to do a new build, or use the code shown 
below to force
     a run-time scan.)
   * Add the plugin via the `put` method on the registry giving a (name, 
config) pair.
   * If you need access to the storage plugin instance, call the `get()` method 
with the name.
   * If you use `ClusterTest`, call `defineStoragePlugin()` to define the 
plugin on all Drillbits.
   
   The code to force a run-time class scan is:
   
   ```
       OperatorFixture.Builder builder = 
OperatorFixture.builder(dirTestWatcher);
       builder.configBuilder().put(ClassPathScanner.IMPLEMENTATIONS_SCAN_CACHE, 
false);
   ```
   
   Also, if you want to change the content of a plugin config stored in the 
registry, you
   *must* make a copy first, else the internal maps will get out of sync. Think 
of formats
   as immutable once stored in the registry. Code should be modified to reflect 
this fact.
   
   ## Retire the "Named" Storage Plugin
   
   Execution plans sent from the planner to the execution engine contain the 
full configuration, in JSON form, for any storage or format plugins used by the 
query. This may seem redundant: why include the information in the plan when we 
could obtain it from the plugin registry? All we need is the plugin name and we 
can do the retrieval.
   
   The code contained an obscure plugin config classes: 
`NamedStoragePluginConfig` which seemed to implement this idea. At present, 
they are used by exactly one obscure test: `TextRecordReaderTest`.
   
   One might think that we should broaden the use of this idea. But, a bit of 
thought suggests why the original authors might not have done so. Consider what 
happens if the user changes the config concurrently with running a query. The 
update and query "race" each other to the worker nodes.
   In the worst case, some nodes run the query with the old configuration, some 
with the new, causing infrequent, random and obscure errors.
   
   By copying the config into the plan, we avoid the race condition. It is a 
poor man's versioned plugin storage. (The ephemeral store ensures that the 
query sees the "old" config for the duration of the query.)
   
   A more elegant solution might be to version each plugin configuration in the 
shared DB so that the "named" config could say, "I want the s3 plugin at 
version 2", even while version 3 is being propagated across the cluster.
   
   However, since the current system works, and there are no plans for a 
versioned system, it seems to make sense to simply remove the unused "named" 
plugins and remove the one tests which uses them.
   
   Note that the code also contains a `NamedFormatPluginConfig` class used in 
several places. This format allows forcing the type of a format in a query. 
This format does not have the same issues as the above class, and so remains in 
the code.
   
   ## Add Unit Tests
   
   At present there are no unit tests that directly exercise the plugin 
registry. An analysis of a test run showed that some code is called only a few 
times, by tests that are doing something that only indirectly relates to 
plugins. (The previously-mentioned, `TextRecordReaderTest`, which was the only 
use of a specific if-statement, impersonation tests which are the only use of a 
"delete" method, and so on.
   
   Add a unit tests that directly tests all plugin functionality. Refactor the 
plugin code to allow such tests. (The upgrade mechanism, for example, only 
works with a very specific system state which is hard to reproduce in tests.)
   
   ## Plugin Context
   
   Plugin constructs accept a `DrillbitContext` instance. `DrillbitContext` 
provides access to all
   of Drill's internals. As we think about an improved plugin API, it is clear 
we cannot have
   external code depend on Drill's internals. We need a plugin-specific 
context. This PR takes
   a step in that direction with the `PluginRegistryContext` interface used, at 
present, only
   by the registry itself. This class also allows mocking for unit tests.
   
   A future PR will revise storage plugins to accept a new 
`StoragePluginContext` interface instead of `DrillbitContext`.
   
   ## Logical Plan Persistence
   
   Cleaned up redundant code for finding serialized classes for operators. 
Cleaned up references to
   `LogicalPlanPersistence` when `ObjectMapper` would do.
   
   ## Random Other Changes
   
   * Added the ability to customize the local persistent store even running in 
embedded mode. Previously, the persistent store option was used only for a 
distributed run, with a fixed store selection in embedded mode.
   * Added an annotation to mark a class as a "private plugin" that will not be 
found via the normal class path search.
   * Added a config option to tell the server which (if any) private plugins to 
load. This is requred because the list of plugin classes wants to be fixed 
after Drillbit start.
   * Fixed some formatting, removed some warnings.
   * Separating the definition of a plugin (name, config) from the plugin 
implementation, via a handle
   * Pull persistent store code into a separate class
   * Move the `DrillSchemaFactory` into a separate file.
   * Fixed numerous small bugs exposed during code inspection and unit tests.
   * Removes the unused `DistributedStorageEngine`, `BatchExceededException`, 
`StorageDriver`, `StorageDriverFactory`, and  `PluginHost` classes.
   * Reformatted the `StoragePlugin` interface, but did not modify the set of 
methods.
   * Added a `PlanStringBuilder` class and used it to add `toString()` methods 
to several plugins.
   
   ## Additional Work
   
   A number of additional improvements are possible:
   
   * The above-mentioned improvement to not load all schemas for every query. 
Instead,
     resolve schemas only as referenced.
   * Rather than doing the class path analysis twice, perform it once and share 
it
     between classic and system connectors locators.
   * Revise the `StoragePlugin` class to accept a plugin-specific context 
instead of
     the "kitchen sink" `DrillbitContext`.
   
   ## Documentation
   
   Care was taken to ensure that, despite the many code changes, there is no 
visible difference in behavior for end users.
   
   ## Testing
   
   Added multiple new unit tests. Reran all unit tests and fixed issues.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers opened a new pull request #1988: DRILL-7590: Refactor plugin registry

Reply via email to