Hey John, I think this is some great stuff. Thanks for the write up.
It looks like you have ideas around what might go into a first release of this plug-in framework. Were you thinking we'd have enough time to squeeze that first rev into 4.3? I'm just wondering (it's not a huge deal to hit that release for this) because we would only have about five weeks.

Thanks

On Tue, Aug 20, 2013 at 3:43 PM, John Burwell <[email protected]> wrote:

> All,
>
> In capturing my thoughts on storage, my thinking backed into the driver
> model. While we have the beginnings of such a model today, I see the
> following deficiencies:
>
>    1. *Multiple Models*: The Storage, Hypervisor, and Security layers
>       each have a slightly different model for allowing system
>       functionality to be extended/substituted. These differences raise
>       the barrier to entry for vendors seeking to extend CloudStack and
>       accrete code paths to be maintained and verified.
>    2. *Leaky Abstraction*: Plugins are registered through a Spring
>       configuration file. In addition to being operator unfriendly (most
>       sysadmins are not Spring experts nor do they want to be), we expose
>       the core bootstrapping mechanism to operators. Therefore, a
>       misconfiguration could negatively impact the injection/configuration
>       of internal management server components. Essentially handing them
>       a loaded shotgun pointed at our right foot.
>    3. *Nondeterministic Load/Unload Model*: Because the core loading
>       mechanism is Spring, the management server has little control over
>       the timing and order of component loading/unloading. Changes to the
>       Management Server's component dependency graph could break a driver
>       by causing it to be started at an unexpected time.
>    4. *Lack of Execution Isolation*: As a Spring component, plugins are
>       loaded into the same execution context as core management server
>       components. Therefore, an errant plugin can corrupt the entire
>       management server.
>
> For the next revision of the plugin/driver mechanism, I would like to see
> us migrate towards a standard pluggable driver model that supports all of
> the management server's extension points (e.g. network devices, storage
> devices, hypervisors, etc) with the following capabilities:
>
>    - *Consolidated Lifecycle and Startup Procedure*: Drivers share a
>      common state machine and categorization (e.g. network, storage,
>      hypervisor, etc) that permits the deterministic calculation of
>      initialization and destruction order (i.e. network layer drivers ->
>      storage layer drivers -> hypervisor drivers). Plugin
>      inter-dependencies would be supported between plugins sharing the
>      same category.
>    - *In-process Installation and Upgrade*: Adding or upgrading a driver
>      does not require the management server to be restarted. This
>      capability implies a system that supports the simultaneous execution
>      of multiple driver versions and the ability to suspend execution of
>      work on a resource while the underlying driver instance is replaced.
>    - *Execution Isolation*: The deployment packaging and execution
>      environment allow different (and potentially conflicting) versions
>      of dependencies to be used simultaneously. Additionally, plugins
>      would be sufficiently sandboxed to protect the management server
>      against driver instability.
>    - *Extension Data Model*: Drivers provide a property bag with a
>      metadata descriptor to validate and render vendor specific data. The
>      contents of this property bag will be provided to every driver
>      operation invocation at runtime. The metadata descriptor would be a
>      lightweight description that provides a label resource key, a
>      description resource key, data type (string, date, number, boolean),
>      required flag, and optional length limit.
>    - *Introspection*: Administrative APIs/UIs allow operators to
>      understand which drivers are in the system, their configuration,
>      and their current state.
>    - *Discoverability*: Optionally, drivers can be discovered via a
>      project repository definition (similar to Yum) allowing drivers to
>      be remotely acquired and operators to be notified regarding update
>      availability. The project would also provide, free of charge,
>      certificates to sign plugins. This mechanism would support local
>      mirroring to accommodate air-gapped management networks.
>
> Fundamentally, I do not want to turn CloudStack into an erector set with
> more screws than nuts, which is a risk with highly pluggable
> architectures. As such, I think we would need to tightly bound the scope
> of drivers and their behaviors to prevent the loss of system usability and
> stability. My thinking is that drivers would be packaged into a custom
> JAR, a CAR (CloudStack ARchive), structured as follows:
>
>    - META-INF
>       - MANIFEST.MF
>       - driver.yaml (driver metadata (e.g. version, name, description,
>         etc) serialized in YAML format)
>       - LICENSE (a text file containing the driver's license)
>    - lib (driver dependencies)
>    - classes (driver implementation)
>    - resources (driver message files and potentially JS resources)
>
> The management server would acquire drivers through a simple scan of a URL
> (e.g. file directory, S3 bucket, etc). For every CAR object found, the
> management server would create an execution environment (likely a
> dedicated ExecutorService and Classloader), and transition the state of
> the driver to Running (the exact state model would need to be worked out).
> To be really nice, we could develop a custom Ant task/Maven plugin/Gradle
> plugin to create CARs. I can also imagine opportunities to add hooks to
> this model to register instrumentation information with JMX and
> authorization.
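For what it's worth, the scan-and-load step described above could be sketched roughly as follows. This is only an illustration under assumptions: the names CarScanner, DriverEnvironment, and scan, the State values, and the *.car file extension are all hypothetical and not part of the proposal or of CloudStack. It scans a local directory only (no S3 bucket) and does not read driver.yaml.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: one isolated execution environment per CAR file. */
final class CarScanner {

    /** Placeholder states; the actual state model is still to be worked out. */
    enum State { LOADED, RUNNING, STOPPED }

    /** Pairs a CAR with its own class loader and executor. */
    static final class DriverEnvironment implements AutoCloseable {
        final Path car;
        final URLClassLoader classLoader;
        final ExecutorService executor;
        State state = State.LOADED;

        DriverEnvironment(Path car) throws IOException {
            this.car = car;
            // A dedicated loader keeps this driver's classes apart from other
            // drivers. Resolving the archive's lib/ and classes/ entries is
            // omitted from this sketch.
            this.classLoader = new URLClassLoader(
                    new URL[] { car.toUri().toURL() },
                    CarScanner.class.getClassLoader());
            // A dedicated executor keeps driver work off shared threads.
            this.executor = Executors.newSingleThreadExecutor();
        }

        void start() {
            state = State.RUNNING;
        }

        @Override
        public void close() throws IOException {
            executor.shutdownNow();
            classLoader.close();
            state = State.STOPPED;
        }
    }

    /** Scans a directory for *.car files and starts an environment for each. */
    static List<DriverEnvironment> scan(Path dir) throws IOException {
        List<DriverEnvironment> environments = new ArrayList<>();
        try (DirectoryStream<Path> cars = Files.newDirectoryStream(dir, "*.car")) {
            for (Path car : cars) {
                DriverEnvironment env = new DriverEnvironment(car);
                env.start();
                environments.add(env);
            }
        }
        return environments;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (DriverEnvironment env : scan(dir)) {
            System.out.println(env.car.getFileName() + " -> " + env.state);
            env.close();
        }
    }
}
```

Pointed at a directory, this lists each CAR it finds along with its state. A class loader and executor per driver give only a starting point for isolation; real sandboxing and the in-place upgrade path would need considerably more.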
>
> To keep the scope of this email confined, we would introduce the general
> notion of a Resource, and (hand wave hand wave) eventually
> compartmentalize the execution of work around a resource [1]. This (hand
> waved) compartmentalization would give us the controls necessary to
> safely and reliably perform in-place driver upgrades. For an initial
> release, I would recommend implementing the abstractions, loading
> mechanism, extension data model, and discovery features. With these
> capabilities in place, we could attack the in-place upgrade model.
>
> If we were to adopt such a pluggable capability, we would have the
> opportunity to decouple the vendor and CloudStack release schedules. For
> example, if a vendor were introducing a new product that required a new or
> updated driver, they would no longer need to wait for a CloudStack release
> to support it. They would also gain the ability to fix high priority
> defects in the same manner.
>
> I have hand waved a number of issues that would need to be resolved before
> such an approach could be implemented. However, I think we need to decide,
> as a community, that it is worth devoting energy and effort to enhancing
> the plugin/driver model, and agree on the goals of that effort, before
> diving head first into the deep rabbit hole of design/implementation.
>
> Thoughts? (/me ducks)
> -John
>
> [1]: My opinions on the matter from CloudStack Collab 2013 ->
> http://www.slideshare.net/JohnBurwell1/how-to-run-from-a-zombie-cloud-stack-distributed-process-management
>

-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: [email protected]
o: 303.746.7302
Advancing the way the world uses the cloud<http://solidfire.com/solution/overview/?video=play> *™*
