Some feedback on the cloud deployment use case
(http://amdatu.org/confluence/display/Amdatu/Amdatu+-+Cloud+Deployment+Use+Case):
1) I think that, at the lowest level, Amdatu should always be built on top of
some "infrastructure as a service", in other words, some kind of cloud
infrastructure. That can be a public cloud, such as Amazon's EC2, or a private
cloud (running Eucalyptus on a set of your own servers).
2) Leon should be the one in charge of that IaaS layer. Taking 1) into
account, he should just be responsible for providing a cloud. In my opinion he
should not yet do any installation of OSGi + management agent, because that
should be the responsibility of Marcel, who manages the Educa deployment and
can make the trade-off between involving more hardware and running more stuff
on one node. I'm assuming there is some relationship between managing the
Educa deployment and actually getting paid by customers who want a certain
level of service: in my opinion that is more Marcel's role than Leon's.
Consideration: do we also want to support a fixed, unmanaged cloud? In other
words, just a fixed set of machines that do not run any cloud-supporting
software at all. This would be just a bunch of machines that run OSGi with a
management agent directly. I'm hesitant about supporting this scenario. On the
other hand, if Leon's role is only to provide a cloud infrastructure, an
unmanaged, fixed cloud simply means he's mostly out of work in this case.
In my view, initial deployment would look like this:
- Leon has set up (or rented) a cloud.
- Marcel installs a provisioning server in the cloud.
- Marcel creates a number of targets, which automatically triggers the
activation of the same number of cloud nodes that already have OSGi and a
management agent running.
- The interface of the provisioning server shows an overview of all connected
targets, and shows that every target is ready but none of them is running a
distribution.
- In the provisioning server an overview of distributions is shown.
- Dion has uploaded a set of OSGi bundles and other resources. Together,
those define the application version Educa 1.0.
- Marcel chooses a target, and selects the distribution 'Educa 1.0' to
install on that node.
- The agent on the target retrieves all needed bundles and bootstraps itself.
- Educa 1.0 is now running.
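To make the flow above a bit more concrete, here is a rough sketch of the
provisioning server's model of targets and distributions. All names here are
hypothetical, just for illustration; this is not an actual Amdatu API:

```java
import java.util.*;

// Minimal sketch of the provisioning server described above: Dion uploads
// distributions, Marcel creates targets and installs a distribution on one.
public class ProvisioningServer {
    public enum TargetState { READY, RUNNING }

    public static final class Target {
        public final String id;
        public TargetState state = TargetState.READY;
        public String distribution; // null until one is installed
        Target(String id) { this.id = id; }
    }

    private final Map<String, Target> targets = new LinkedHashMap<>();
    private final Set<String> distributions = new LinkedHashSet<>();

    // Dion uploads bundles/resources that together define a distribution.
    public void uploadDistribution(String name) { distributions.add(name); }

    // Marcel creates a target; the cloud node is assumed to come up
    // with OSGi and a management agent already running.
    public Target createTarget(String id) {
        Target t = new Target(id);
        targets.put(id, t);
        return t;
    }

    // Marcel selects a distribution to install; the agent on the target
    // then retrieves the bundles and bootstraps itself.
    public void install(String targetId, String distribution) {
        if (!distributions.contains(distribution))
            throw new IllegalArgumentException("unknown distribution: " + distribution);
        Target t = targets.get(targetId);
        t.distribution = distribution;
        t.state = TargetState.RUNNING;
    }

    // The target overview shown in the provisioning server's interface.
    public Collection<Target> overview() { return targets.values(); }
}
```

The point of the sketch is only that "ready" and "running a distribution" are
distinct target states the overview has to show.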
Adding or removing cluster nodes:
- After setting up a single target running Educa 1.0, Marcel wants to add a
second target with the same application in the cloud, clustered automatically
with the first target.
- Marcel navigates back to the target overview of the provisioning server.
- Next to the already created Educa target, a 'copy' button is shown; Marcel
presses the button.
- A new target is started on a new node in the server cloud, with Educa 1.0
automatically added.
- The new target automatically joins the other target to form a cluster of
Educa targets (clusters can be identified by a name; all targets that share
the same cluster name are in the same cluster).
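The cluster-name rule in that last bullet is simple enough to sketch: a
cluster is just the set of targets that share a name. Names below are
hypothetical, not an actual Amdatu API:

```java
import java.util.*;

// Sketch of the cluster-naming rule: all targets that share the same
// cluster name are in the same cluster.
public class Clusters {
    public static final class Target {
        public final String id;
        public final String clusterName;
        public Target(String id, String clusterName) {
            this.id = id;
            this.clusterName = clusterName;
        }
    }

    // Group targets into clusters, keyed by cluster name.
    public static Map<String, List<Target>> byCluster(List<Target> targets) {
        Map<String, List<Target>> clusters = new LinkedHashMap<>();
        for (Target t : targets)
            clusters.computeIfAbsent(t.clusterName, k -> new ArrayList<>()).add(t);
        return clusters;
    }
}
```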
Regarding clusters, I think there are actually a couple of options for
managing them:
1) as described in this use case, each cluster target runs the same
distribution and some external load balancer distributes the incoming
requests between the cluster targets. Each cluster target "reports for duty"
with that load balancer.
2) there is some kind of cluster manager running somewhere (or everywhere) in
the cluster and it decides which distributions are installed on which
targets, so targets can run a subset of the whole set of components, or
everything, depending on decisions the cluster manager makes.
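To illustrate option 2: a cluster manager essentially computes an assignment
of components to targets. The round-robin policy below is just one arbitrary
example of such a decision, and all names are hypothetical:

```java
import java.util.*;

// Sketch of option 2: a cluster manager deciding which subset of the
// whole set of components runs on which target. Round-robin is just an
// illustrative policy; a real manager would weigh load, affinity, etc.
public class ClusterManager {
    public static Map<String, List<String>> assign(List<String> targets,
                                                   List<String> components) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String t : targets) plan.put(t, new ArrayList<>());
        for (int i = 0; i < components.size(); i++) {
            // Each component lands on exactly one target, round-robin.
            String target = targets.get(i % targets.size());
            plan.get(target).add(components.get(i));
        }
        return plan;
    }
}
```

With such a plan in hand, the manager would drive the provisioning server to
install the corresponding (sub)distributions on each target.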
Removing a cluster node:
- After some time Marcel wants to remove a target from the cluster to
perform maintenance.
- Marcel goes back to the provisioning server and presses the '-' button.
- The first target is now automatically a single-instance cluster.
- The second target is gone.
- The second target is automatically removed from the load balancer.
Updating:
- Dion has finished Educa 1.1 and uploads the new artifacts to the
provisioning server.
- Marcel approves the update for one or more targets in the cluster
(depending on whether he wants to do a rolling update, or simply update the
cluster as quickly as possible).
Note: of course you can define a whole new distribution for every update, but
updating a target actually creates a new version of the software going to that
target anyway, and you can always roll back to a previous version, so there is
no fundamental need to create a new distribution every time. Of course you can,
if you want to, but the end result on the target is exactly the same.
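The rolling-versus-all-at-once choice can be expressed as a batch size: a
batch of 1 gives a rolling update, a batch equal to the cluster size updates
everything at once. A rough sketch (all names hypothetical, and the load
balancer steps assume option 1 from above):

```java
import java.util.*;

// Sketch of planning an update over a cluster. batchSize = 1 gives a
// rolling update; batchSize = targets.size() updates the whole cluster
// as quickly as possible.
public class RollingUpdate {
    public static List<String> plan(List<String> targets, String newVersion,
                                    int batchSize) {
        List<String> steps = new ArrayList<>();
        for (int i = 0; i < targets.size(); i += batchSize) {
            List<String> batch =
                targets.subList(i, Math.min(i + batchSize, targets.size()));
            // Take the batch out of rotation, update it, put it back.
            for (String t : batch)
                steps.add("drain " + t + " from load balancer");
            for (String t : batch)
                steps.add("install " + newVersion + " on " + t);
            for (String t : batch)
                steps.add("re-add " + t + " to load balancer");
        }
        return steps;
    }
}
```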
On 15 Oct 2010, at 15:22 , Martijn van Berkum wrote:
> Thanks for the feedback. I agree on splitting the administrator role in two
> roles: one generic sysop, and one per application (1 or more tenants). Based
> on that I added a third role: Marcel the Educa Administrator, and explained
> the various roles a little bit more.
Notes above.
> About the multiple composites, from my point of view this is not a valid use
> case for Amdatu management/deployment.
That's interesting, so an application can never become bigger than one
node/target? I think for big applications you do want to be able to partition
the application, running parts of it on one target and parts on another. Just
like multi-tenancy is scaling in one direction, this is scaling in the other.
It could, and I think should be the responsibility of a (possible application
specific or agnostic) cluster manager to actually decide how to do this
partitioning.
I do think we don't need to provide this right away, but it does not hurt to
discuss how we would implement it when we need to.
> Just like on the generic Internet, every application should itself be
> prepared for unexpected updated REST APIs, services that are down, changed
> dependencies and other horrors. This should not be managed centrally, that is
> 'old' thinking from a viewpoint that everything can be controlled.
Well, if you want single components to actually scale out by themselves instead
of doing that at an application level, then you end up with cluster managers
for each component. That does not fundamentally differ.
I totally agree that a component should simply try to satisfy its
dependencies and deal with all the dynamics in this environment (which is
actually hard to do if your dependencies are all REST APIs, because they
currently have no mechanism whatsoever for things like discovery and
notifications based on that, like the OSGi service registry provides, even
for remote services).
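What the OSGi service registry gives you here is essentially discovery plus
notification. A stripped-down illustration of that combination (this is a
hypothetical sketch, not the OSGi API itself, which is far richer):

```java
import java.util.*;
import java.util.function.BiConsumer;

// Sketch of discovery + notification, the two things plain REST
// dependencies lack: a component can look up a service by name and be
// notified when one it depends on appears.
public class ServiceRegistry {
    private final Map<String, String> services = new LinkedHashMap<>(); // name -> endpoint
    private final List<BiConsumer<String, String>> listeners = new ArrayList<>();

    // A dependent component asks to be told about newly registered services.
    public void onRegister(BiConsumer<String, String> listener) {
        listeners.add(listener);
    }

    // A service "reports for duty" under a name.
    public void register(String name, String endpoint) {
        services.put(name, endpoint);
        for (BiConsumer<String, String> l : listeners)
            l.accept(name, endpoint); // notify dependents
    }

    // Discovery: look a service up by name instead of hardcoding its URL.
    public Optional<String> discover(String name) {
        return Optional.ofNullable(services.get(name));
    }
}
```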
> Really service oriented architecture means be very service oriented; if the
> other party decides not to show up, do something else, want something else;
> adapt, not demand another contract. Graceful degradation and design for
> failure are common architectural design goals following this philosophy.
Agreed.
> Rolling restarts/updates/deployments; I put it in just as a check for us that
> this could be a very common use case, although I know this is really hard.
> Not only if you have only 2 nodes in a cluster and want to update one, but
> also when you want to update thousands of servers. For example the new
> twitter interface was gradually introduced in a few weeks, some users got it
> much earlier than others; apparently they have some kind of rolling update
> mechanism for that.
I'm actually wondering why you're stating that these rolling updates are
hard. What makes them hard? I can agree that doing schema updates on
relational databases as part of updates can be hard or time consuming,
especially if you need to be able to roll them back too. But since we're not
using relational databases anymore, and having one central database used by
all components is a bad idea from a component perspective anyway, I don't see
big problems.
Greetings, Marcel