paul-rogers commented on pull request #2215: URL: https://github.com/apache/drill/pull/2215#issuecomment-1030531217
@luocooong, here are a few more thoughts. First issues, then solutions.

Installing a plugin on a running system is a bit hard, as you've discovered. Looks like I had some earlier comments. Adding a new plugin requires some kind of coordination: it can't be used on any node until it is available on all nodes.

The plugin config system is still a bit brittle. The config can't go into ZK until the plugin is available on all nodes. Otherwise, when a node tries to deserialize the config, it will fail because the plugin config class is not (yet) available.

The situation for upgrades is worse: it is hard to remove the existing classes and the references to them. (Special design work is needed to allow classes to be unloaded.) Such a replacement can only be done when we know no queries are running that need the plugin.

Plugins are not as easy to manage as UDFs, so the approach used for dynamic UDFs (which was already a bit complex) probably won't work for plugins. For one, UDFs have no ZK-based config file. On the other hand, the complexity of UDFs comes from preventing race conditions: from ensuring that the UDF is available to all nodes before allowing a query to reference the UDF.

So, solutions. The simplest is to use Drill's graceful shutdown feature and simply restart each Drillbit. This process also works for a patch release, to change the memory allocated to Drill, etc. So, you should already have a rolling restart mechanism available anyway if you are running Drill in production. If so, then just use that mechanism for adding a new plugin. The process would be:

* Install the plugin jar on every node. (See below.)
* Use graceful shutdown to perform a rolling restart: shut down and restart nodes one at a time. This is safe because there should be no configs in ZK for the new plugin yet, which means no queries can use the plugin, even on the nodes that have already loaded it.
* When all Drillbits are restarted, write the plugin config into ZK.
* Once the plugin config is picked up by the nodes, queries can be issued against the nodes.

The config is needed only on the node acting as the Foreman: all other nodes get the required config handed to them in the physical plan. So, no worry about race conditions in plugin config rollout.

One thing Drill is lacking is simple REST APIs for management. There should be an API to trigger graceful shutdown, and another to post a config. (There are a few APIs, but they are designed for use by the UI. Better than nothing.)

One other item: I notice that the proposed code change adds a plugin class path to the Drill config. That's not really needed. If you use the `--site <path>` option, then all your unique files (configs, UDFs, plugins) can reside in your custom config directory. Drill adds the `$DRILL_CONF/lib` directory to the class path. (Double check, I'm writing this from memory, being the guy who added this feature.) So, your custom plugins simply go into that custom config directory.

Longer term, we should add classloader isolation for plugins. Without that, your new plugin could bring down the system if, say, you use a version of Guava that conflicts with Drill's. The same is true of the many other dependencies.
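
For what it's worth, the rollout order described above (jar everywhere, then a rolling restart, then the config into ZK) could be scripted roughly like this. It is only a sketch in Python, not anything Drill ships: `rolling_plugin_rollout` and the four hooks (`install_jar`, `restart_gracefully`, `is_healthy`, `publish_config`) are hypothetical names, and each hook would wrap whatever your own ops tooling already does for copying files, graceful shutdown, health checks and posting a storage config.

```python
from typing import Callable, Iterable, List


def rolling_plugin_rollout(
    nodes: Iterable[str],
    install_jar: Callable[[str], None],         # hypothetical hook: copy the jar to one node
    restart_gracefully: Callable[[str], None],  # hypothetical hook: graceful shutdown + restart
    is_healthy: Callable[[str], bool],          # hypothetical hook: is the Drillbit back up?
    publish_config: Callable[[], None],         # hypothetical hook: write the plugin config
) -> List[str]:
    """Run the rollout steps in order and return a log of what was done."""
    node_list = list(nodes)
    log: List[str] = []

    # Step 1: the jar must be on every node before anything is restarted.
    for node in node_list:
        install_jar(node)
        log.append(f"install:{node}")

    # Step 2: restart one node at a time. No plugin config exists yet,
    # so no query can reference the plugin while the cluster is mixed.
    for node in node_list:
        restart_gracefully(node)
        if not is_healthy(node):
            # Stop here: publishing the config now would break deserialization
            # on any node that has not loaded the plugin classes.
            raise RuntimeError(f"{node} did not come back; config not published")
        log.append(f"restart:{node}")

    # Step 3: only after every node has the classes, publish the config.
    publish_config()
    log.append("publish-config")
    return log
```

The one design point worth keeping is that the config step is last and is skipped entirely if any node fails to come back, since a config in ZK for a class that some node cannot load is exactly the failure described earlier.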
