Re: [DISCUSS/PROPOSAL] Upgrading Driver Model

Mike Tutkowski Tue, 20 Aug 2013 15:23:44 -0700

Hey John,

I think this is some great stuff. Thanks for the write up.


It looks like you have ideas around what might go into a first release of
this plug-in framework. Were you thinking we'd have enough time to squeeze
that first rev into 4.3? I'm just wondering (it's not a huge deal to hit
that release for this) because we would only have about five weeks.

Thanks


On Tue, Aug 20, 2013 at 3:43 PM, John Burwell <[email protected]> wrote:

> All,
>
> In capturing my thoughts on storage, my thinking backed into the driver
> model.  While we have the beginnings of such a model today, I see the
> following deficiencies:
>
>
>    1. *Multiple Models*: The Storage, Hypervisor, and Security layers
>    each have a slightly different model for allowing system functionality to
>    be extended/substituted.  These differences raise the barrier to entry
>    for vendors seeking to extend CloudStack and add code paths that must be
>    maintained and verified.
>    2. *Leaky Abstraction*:  Plugins are registered through a Spring
>    configuration file.  In addition to being operator unfriendly (most
>    sysadmins are not Spring experts nor do they want to be), we expose the
>    core bootstrapping mechanism to operators.  Therefore, a misconfiguration
>    could negatively impact the injection/configuration of internal management
>    server components.  We are essentially handing them a loaded shotgun
>    pointed at our right foot.
>    3. *Nondeterministic Load/Unload Model*:  Because the core loading
>    mechanism is Spring, the management server has little control over the timing and
>    order of component loading/unloading.  Changes to the Management Server's
>    component dependency graph could break a driver by causing it to be started
>    at an unexpected time.
>    4. *Lack of Execution Isolation*: As a Spring component, plugins are
>    loaded into the same execution context as core management server
>    components.  Therefore, an errant plugin can corrupt the entire management
>    server.
>
>
> For the next revision of the plugin/driver mechanism, I would like to see us
> migrate towards a standard pluggable driver model that supports all of the
> management server's extension points (e.g. network devices, storage
> devices, hypervisors, etc) with the following capabilities:
>
>
>    - *Consolidated Lifecycle and Startup Procedure*:  Drivers share a
>    common state machine and categorization (e.g. network, storage, hypervisor,
>    etc) that permits the deterministic calculation of initialization and
>    destruction order (i.e. network layer drivers -> storage layer drivers ->
>    hypervisor drivers).  Plugin inter-dependencies would be supported between
>    plugins sharing the same category.
>    - *In-process Installation and Upgrade*: Adding or upgrading a driver
>    does not require the management server to be restarted.  This capability
>    implies a system that supports the simultaneous execution of multiple
>    driver versions and the ability to suspend work on a
>    resource while the underlying driver instance is replaced.
>    - *Execution Isolation*: The deployment packaging and execution
>    environment supports different (and potentially conflicting) versions of
>    dependencies to be simultaneously used.  Additionally, plugins would be
>    sufficiently sandboxed to protect the management server against driver
>    instability.
>    - *Extension Data Model*: Drivers provide a property bag with a
>    metadata descriptor to validate and render vendor specific data.  The
>    contents of this property bag will be provided to every driver operation
>    invocation at runtime.  The metadata descriptor would be a lightweight
>    description that provides a label resource key, a description resource key,
>    data type (string, date, number, boolean), required flag, and optional
>    length limit.
>    - *Introspection*: Administrative APIs/UIs allow operators to
>    understand which drivers are installed in the system, their
>    configuration, and their current state.
>    - *Discoverability*: Optionally, drivers can be discovered via a
>    project repository definition (similar to Yum) allowing drivers to be
>    remotely acquired and operators to be notified regarding update
>    availability.  The project would also provide, free of charge, certificates
>    to sign plugins.  This mechanism would support local mirroring to support
>    air gapped management networks.
>
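
To check my reading of the lifecycle bullet, here is a rough Java sketch of the category-based ordering. All of the names (DriverCategory, DriverState, DriverInfo, DriverOrdering) are made up for illustration, and the states are placeholders since the real state model still has to be worked out. It assumes initialization runs network -> storage -> hypervisor and destruction runs in the reverse order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical categories, declared in the initialization order from the proposal.
enum DriverCategory { NETWORK, STORAGE, HYPERVISOR }

// Placeholder states; the actual state model is still to be defined.
enum DriverState { INSTALLED, RUNNING, STOPPED }

final class DriverInfo {
    final String name;
    final DriverCategory category;
    DriverState state = DriverState.INSTALLED;

    DriverInfo(String name, DriverCategory category) {
        this.name = name;
        this.category = category;
    }
}

final class DriverOrdering {
    // Sort by category (enum declaration order), then by name, so the
    // result does not depend on the order in which drivers were found.
    static List<DriverInfo> startupOrder(List<DriverInfo> drivers) {
        List<DriverInfo> ordered = new ArrayList<>(drivers);
        ordered.sort(Comparator
                .comparing((DriverInfo d) -> d.category)
                .thenComparing(d -> d.name));
        return ordered;
    }

    // Destruction order is the reverse of the startup order.
    static List<DriverInfo> shutdownOrder(List<DriverInfo> drivers) {
        List<DriverInfo> ordered = startupOrder(drivers);
        Collections.reverse(ordered);
        return ordered;
    }
}
```

This ignores dependencies between plugins in the same category; those would need a topological sort within each category rather than the name tie-breaker used here.
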
>
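
For the extension data model, this is roughly how I picture the metadata descriptor and the validation of a property bag against it. It is only a sketch: PropertyType, PropertyDescriptor, and PropertyBagValidator are hypothetical names, the bag is assumed to be a simple String-to-String map, and dates are assumed to be ISO-8601 (yyyy-MM-dd), none of which the proposal specifies.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// The four data types named in the proposal.
enum PropertyType { STRING, DATE, NUMBER, BOOLEAN }

// Hypothetical descriptor for one vendor-specific property.
final class PropertyDescriptor {
    final String name;
    final String labelKey;        // resource key for the UI label
    final String descriptionKey;  // resource key for the description
    final PropertyType type;
    final boolean required;
    final Integer maxLength;      // null means no length limit

    PropertyDescriptor(String name, String labelKey, String descriptionKey,
                       PropertyType type, boolean required, Integer maxLength) {
        this.name = name;
        this.labelKey = labelKey;
        this.descriptionKey = descriptionKey;
        this.type = type;
        this.required = required;
        this.maxLength = maxLength;
    }
}

final class PropertyBagValidator {
    // Returns one message per violation; an empty list means the bag is valid.
    static List<String> validate(List<PropertyDescriptor> descriptors, Map<String, String> bag) {
        List<String> errors = new ArrayList<>();
        for (PropertyDescriptor d : descriptors) {
            String value = bag.get(d.name);
            if (value == null || value.isEmpty()) {
                if (d.required) {
                    errors.add(d.name + ": required");
                }
                continue;
            }
            if (d.maxLength != null && value.length() > d.maxLength) {
                errors.add(d.name + ": longer than " + d.maxLength);
            }
            if (!matchesType(d.type, value)) {
                errors.add(d.name + ": not a valid " + d.type);
            }
        }
        return errors;
    }

    private static boolean matchesType(PropertyType type, String value) {
        try {
            switch (type) {
                case NUMBER:
                    new BigDecimal(value);   // throws NumberFormatException if malformed
                    return true;
                case DATE:
                    LocalDate.parse(value);  // assumption: ISO-8601 dates
                    return true;
                case BOOLEAN:
                    return value.equalsIgnoreCase("true") || value.equalsIgnoreCase("false");
                default:
                    return true;             // STRING: any value is acceptable
            }
        } catch (NumberFormatException | DateTimeParseException e) {
            return false;
        }
    }
}
```

The same descriptor list could drive both server-side validation and UI rendering, which seems to be the point of keeping it lightweight.
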
> Fundamentally, I do not want to turn CloudStack into an erector set with
> more screws than nuts, which is a risk with highly pluggable architectures.
> As such, I think we would need to tightly bound the scope of drivers and
> their behaviors to prevent the loss of system usability and stability.  My
> thinking is that drivers would be packaged into a custom JAR, a CAR
> (CloudStack ARchive), that would be structured as follows:
>
>
>    - META-INF
>       - MANIFEST.MF
>       - driver.yaml (driver metadata (e.g. version, name, description,
>       etc.) serialized in YAML format)
>       - LICENSE (a text file containing the driver's license)
>    - lib (driver dependencies)
>    - classes (driver implementation)
>    - resources (driver message files and potentially JS resources)
>
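
Since a CAR would just be a JAR with a known layout, the management server could sanity-check one with the standard java.util.jar API before loading it. A minimal sketch follows; CarLayout is a hypothetical name, and treating the three META-INF files as mandatory is my assumption, since the proposal does not say which entries are required.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class CarLayout {
    // Assumption: these metadata entries are mandatory in every CAR.
    static final List<String> REQUIRED = Arrays.asList(
            "META-INF/MANIFEST.MF",
            "META-INF/driver.yaml",
            "META-INF/LICENSE");

    // Returns the required entries that are absent from the given entry names.
    static List<String> missingEntries(Collection<String> entryNames) {
        List<String> missing = new ArrayList<>(REQUIRED);
        missing.removeAll(entryNames);
        return missing;
    }

    // Convenience overload that reads the entry names from a CAR file on disk.
    static List<String> missingEntries(File car) throws IOException {
        try (JarFile jar = new JarFile(car)) {
            List<String> names = new ArrayList<>();
            for (JarEntry entry : Collections.list(jar.entries())) {
                names.add(entry.getName());
            }
            return missingEntries(names);
        }
    }
}
```

A CAR with missing entries could then be rejected before any driver code is loaded.
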
>
> The management server would acquire drivers through a simple scan of a URL
> (e.g. file directory, S3 bucket, etc).  For every CAR object found, the
> management server would create an execution environment (likely a dedicated
> ExecutorService and Classloader), and transition the state of the driver to
> Running (the exact state model would need to be worked out).  To be really
> nice, we could develop a custom Ant task/Maven plugin/Gradle plugin to
> create CARs.   I can also imagine opportunities to add hooks to this
> model to register instrumentation information with JMX and authorization.
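
As a strawman for the "dedicated ExecutorService and Classloader" idea, something like the following is what I have in mind. DriverEnvironment is a hypothetical name, a single worker thread per driver is an arbitrary choice, and the classpath URLs would presumably come from the CAR's lib and classes entries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical per-driver execution environment: one class loader, one executor.
final class DriverEnvironment implements AutoCloseable {
    private final URLClassLoader loader;
    private final ExecutorService executor;

    DriverEnvironment(String driverName, URL[] classpath) {
        // The parent is the loader of this class, so driver code can still see
        // core classes; a real implementation would probably narrow that down.
        URLClassLoader cl = new URLClassLoader(classpath, DriverEnvironment.class.getClassLoader());
        this.loader = cl;
        // All driver work runs on a dedicated, named thread whose context
        // class loader is the driver's own loader.
        this.executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "driver-" + driverName);
            thread.setContextClassLoader(cl);
            thread.setDaemon(true);
            return thread;
        });
    }

    // Runs a unit of driver work inside the driver's environment.
    <T> Future<T> submit(Callable<T> work) {
        return executor.submit(work);
    }

    ClassLoader classLoader() {
        return loader;
    }

    @Override
    public void close() {
        executor.shutdownNow();
        try {
            loader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This separates class loading and threads per driver, but it is not a sandbox: a misbehaving driver can still exhaust heap or CPU in the shared JVM, so the isolation goal would need more than this.
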
>
> To keep the scope of this email confined, we would introduce the general
> notion of a Resource, and (hand wave hand wave) eventually compartmentalize
> the execution of work around a resource [1].  This (hand waved)
> compartmentalization would allow us the controls necessary to safely and
> reliably perform in-place driver upgrades.  For an initial release, I would
> recommend implementing the abstractions, loading mechanism, extension data
> model, and discovery features.  With these capabilities in place, we could
> attack the in-place upgrade model.
>
> If we were to adopt such a pluggable capability, we would have the
> opportunity to decouple the vendor and CloudStack release schedules.  For
> example, if a vendor were introducing a new product that required a new or
> updated driver, they would no longer need to wait for a CloudStack release
> to support it.  They would also gain the ability to fix high priority
> defects in the same manner.
>
> I have hand waved a number of issues that would need to be resolved before
> such an approach could be implemented.  However, I think we need to decide,
> as a community, that it is worth devoting energy and effort to enhancing the
> plugin/driver model, and on the goals of that effort, before diving head first
> into the deep rabbit hole of design/implementation.
>
> Thoughts? (/me ducks)
> -John
>
> [1]: My opinions on the matter from CloudStack Collab 2013 ->
> http://www.slideshare.net/JohnBurwell1/how-to-run-from-a-zombie-cloud-stack-distributed-process-management
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: [email protected]
o: 303.746.7302
Advancing the way the world uses the cloud™
<http://solidfire.com/solution/overview/?video=play>
