[GitHub] [drill] paul-rogers commented on pull request #2215: DRILL-7916: Support new plugin installation on the running system

GitBox Fri, 04 Feb 2022 21:29:00 -0800


paul-rogers commented on pull request #2215:
URL: https://github.com/apache/drill/pull/2215#issuecomment-1030531217



   @luocooong, here are a few more thoughts. First issues, then solutions.
   
   Installing a plugin on a running system is a bit hard, as you've discovered. 
Looks like I had some earlier comments. Adding a new plugin requires some kind 
of coordination: it can't be used on any node until it is available on all 
nodes.
   
   The plugin config system is still a bit brittle. The config can't go into ZK 
until the plugin is available on all nodes. Else, when a node tries to 
deserialize the config, it will fail because the plugin config class is not 
(yet) available.
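   
   To make that failure mode concrete, here is a minimal Python simulation (not Drill code). It assumes the stored config is JSON carrying a `type` field that each node resolves against the plugin config classes it has loaded. `Node`, `ExamplePluginConfig`, and `deserialize_config` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class ExamplePluginConfig:
    """Hypothetical config class that ships inside the new plugin jar."""
    endpoint: str


class Node:
    """Simulated Drillbit: knows only the config types whose classes it has loaded."""

    def __init__(self, name):
        self.name = name
        self.config_types = {}  # type name -> config class

    def load_plugin(self, type_name, config_class):
        self.config_types[type_name] = config_class

    def deserialize_config(self, raw):
        data = json.loads(raw)
        type_name = data.pop("type")
        if type_name not in self.config_types:
            # The class is not (yet) available on this node, so decoding fails.
            raise LookupError(f"{self.name}: unknown plugin config type '{type_name}'")
        return self.config_types[type_name](**data)


# Stand-in for the config as it would be stored centrally (assumed JSON shape).
stored = json.dumps({"type": "example", "endpoint": "http://localhost:1234"})

upgraded = Node("node-1")
upgraded.load_plugin("example", ExamplePluginConfig)
stale = Node("node-2")  # plugin jar not installed here yet

config = upgraded.deserialize_config(stored)
try:
    stale.deserialize_config(stored)
    stale_failed = False
except LookupError:
    stale_failed = True
```

   The node that has the plugin decodes the config; the node that lacks it fails on the same bytes, which is why the config has to wait until every node has the jar.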
   
   The situation for upgrades is worse: it is hard to remove the existing 
classes and their references. (Special design must be done to allow classes to be 
unloaded.) Such replacement can only be done when we know no queries are 
running that need the plugin.
   
   Plugins are not as easy to manage as UDFs, so the approach used for dynamic 
UDFs (which was already a bit complex) probably won't work for plugins. For 
one, UDFs have no ZK-based config file. On the other hand, the complexity of 
UDFs comes from preventing race conditions: from ensuring that the UDF is 
available to all nodes before allowing a query to reference the UDF.
   
   So, solutions. The simplest is to use Drill's graceful shutdown feature and 
simply restart each Drillbit. This process also works for a patch release, to 
change the memory allocated to Drill, etc. So, you should already have a 
rolling restart mechanism available anyway if you are running Drill in 
production. If so, then just use that mechanism for adding a new plugin.
   
   The process would be:
   
   * Install the plugin jar on every node. (See below.)
   * Use graceful shutdown to perform a rolling restart: shut down and restart 
nodes one at a time. This is safe because there should be no configs in ZK for 
the new plugin, which means no queries can use the plugin yet, even on the 
nodes that have already loaded it.
   * When all Drillbits are restarted, write the plugin config into ZK.
   * Once the plugin config is picked up by the nodes, queries can be issued 
against the plugin from any node.
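   
   The ordering of those steps can be sketched as follows. This is a toy Python model under stated assumptions, not an operational script: `Drillbit`, `roll_out_plugin`, and the dict standing in for the ZK-backed config store are all hypothetical, and a real rollout would replace each method with the actual copy, graceful-shutdown, and config-write steps.

```python
class Drillbit:
    """Simulated node; each method stands in for a real operational step."""

    def __init__(self, name):
        self.name = name
        self.jar_installed = False
        self.plugin_loaded = False

    def install_jar(self):
        self.jar_installed = True

    def graceful_restart(self):
        # A restarted node loads whatever jars are installed at that moment.
        self.plugin_loaded = self.jar_installed


def roll_out_plugin(nodes, config_store, plugin_name, config):
    """Install everywhere, restart one node at a time, then publish the config."""
    log = []
    for node in nodes:
        node.install_jar()
        log.append(f"install {node.name}")
    for node in nodes:
        node.graceful_restart()
        log.append(f"restart {node.name}")
    # Publish only once every node can deserialize the config.
    if not all(node.plugin_loaded for node in nodes):
        raise RuntimeError("plugin missing on some node; not publishing config")
    config_store[plugin_name] = config
    log.append(f"publish {plugin_name}")
    return log


nodes = [Drillbit("node-1"), Drillbit("node-2")]
zk = {}  # hypothetical stand-in for the ZK-backed config store
steps = roll_out_plugin(nodes, zk, "example", {"type": "example"})
```

   The point of the sketch is the ordering: the config write is the last step and is guarded by a check that every node has loaded the plugin.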
   
   The config is needed only on the node acting as the Foreman: all other nodes 
get the required config handed to them in the physical plan. So, no worry about 
race conditions in plugin config rollout.
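   
   A small Python sketch of that division of labor, again as an illustration rather than Drill's actual planner: `plan_query`, `run_fragment`, and the fragment layout are hypothetical. The Foreman reads the config once and embeds a copy in each fragment; workers use only what the fragment carries.

```python
def plan_query(config_store, plugin_name, worker_names):
    """Foreman side: look up the config once and embed a copy in each fragment."""
    config = config_store[plugin_name]
    return [
        {"worker": worker, "plugin": plugin_name, "config": dict(config)}
        for worker in worker_names
    ]


def run_fragment(fragment):
    """Worker side: relies only on the config carried by the fragment."""
    return f"{fragment['worker']} scans via {fragment['config']['endpoint']}"


store = {"example": {"type": "example", "endpoint": "http://localhost:1234"}}
fragments = plan_query(store, "example", ["node-2", "node-3"])
store.clear()  # workers never consult the store, so this does not affect them
results = [run_fragment(fragment) for fragment in fragments]
```

   Workers still need the plugin classes themselves to interpret the embedded config, which is why the jar must already be on every node before the config is published.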
   
   One thing Drill is lacking is simple REST APIs for management. There should 
be an API to trigger graceful shutdown. Another to post a config. (There are a 
few APIs, but they are designed for use by the UI. Better than nothing.)
   
   One other item: I notice that the proposed code change adds a plugin class 
path to the Drill config. That's not really needed. If you use the `--site 
<path>` option, then all your unique files (configs, UDFs, plugins) can reside 
in your custom config directory. Drill adds the `$DRILL_CONF/lib` directory to 
the class path. (Double check, I'm writing this from memory, being the guy who 
added this feature.) So, your custom plugins simply go into that custom config 
directory.
   
   Longer term, we should add classloader isolation for plugins. Without that, 
your new plugin could bring down the system if, say, you use a version of Guava 
that conflicts with Drill's. The same is true of the many other dependencies.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
