Back in the day, before the ephemeral hack (though that was something folk have said they would like for libvirt too - so it's not such a hack per se), this was (broadly) sketched out. We spoke with the cinder PTL at the time in Portland, from memory.
There was no spec, so here is my brain-dumpy recollection...

 - actual volumes are a poor match because we wouldn't be running
   cinder-volume on an ongoing basis, and service records would
   accumulate etc.
 - we'd need cross-service scheduler support to make cinder operations
   line up with allocated bare metal nodes (and to e.g. make sure both
   our data volume and golden image volume are scheduled to the same
   machine).
 - folk want to be able to do fairly arbitrary RAID (& JBOD) setups and
   that affects scheduling as well. One way to work it is to have Ironic
   export capabilities and specify actual RAID setups via matching
   flavors - this is the direction the ephemeral work took us, and it
   extends straightforwardly, conceptually, to RAID.

We did talk about doing a little JSON schema to describe RAID / volume
layouts, which cinder could potentially use for user-defined volume
flavors too.

One thing I think is missing from your description is in this:

"To be clear, in TripleO, we need a way to keep the data on a local
direct attached storage device while deploying a new image to the box."

I think we need to be able to do this with a single drive shared between
image and data - dedicating one disk to the image and one disk to data
would add substantial waste given the size of disks these days (and for
some form factors like Moonshot it would rule out using them at all).

Of course, being able to do entirely network-stored golden images might
be something some deployments want, but we can't require them all to do
that ;)

-Rob

On 13 November 2014 11:30, Clint Byrum <cl...@fewbar.com> wrote:
> Each summit since we created "preserve ephemeral" mode in Nova, I have
> some conversations where at least one person's brain breaks for a
> second. There isn't always alcohol involved before, there almost
> certainly is always a drink needed after. The very term is vexing, and I
> think we have done ourselves a disservice to have it, even if it was the
> best option at the time.
>
> To be clear, in TripleO, we need a way to keep the data on a local
> direct attached storage device while deploying a new image to the box.
> If we were on VMs, we'd attach volumes, and just deploy new VMs and move
> the volume over. If we had a SAN, we'd just move the LUN's. But at some
> point when you deploy a cloud you're holding data that is expensive to
> replicate all at once, and so you'd rather just keep using the same
> server instead of trying to move the data.
>
> Since we don't have baremetal Cinder, we had to come up with a way to
> do this, so we used Nova rebuild, and slipped it a special command that
> said "don't overwrite the partition you'd normally make the 'ephemeral'"
> partition. This works fine, but it is confusing and limiting. We'd like
> something better.
>
> I had an interesting discussion with Devananda in which he suggested an
> alternative approach. If we were to bring up cinder-volume on our deploy
> ramdisks, and configure it in such a way that it claimed ownership of
> the section of disk we'd like to preserve, then we could allocate that
> storage as a volume. From there, we could boot from volume, or "attach"
> the volume to the instance (which would really just tell us how to find
> the volume). When we want to write a new image, we can just delete the old
> instance and create a new one, scheduled to wherever that volume already
> is. This would require the nova scheduler to have a filter available
> where we could select a host by the volumes it has, so we can make sure to
> send the instance request back to the box that still has all of the data.
>
> Alternatively we can keep on using rebuild, but let the volume model the
> preservation rather than our special case.
>
> Thoughts? Suggestions? I feel like this might take some time, but it is
> necessary to consider it now so we can drive any work we need to get it
> done soon.
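To make the "select a host by the volumes it has" idea concrete, here is
a rough sketch of the selection logic only. It is plain Python, not the
nova scheduler filter API; HostState, volume_ids and hosts_with_volumes
are all hypothetical names made up for illustration, and the node and
volume names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    """Hypothetical stand-in for the scheduler's view of one bare metal node."""
    name: str
    # IDs of the volumes whose data lives on this node's local disks.
    volume_ids: set = field(default_factory=set)


def hosts_with_volumes(hosts, required_volume_ids):
    """Return the hosts that already hold every requested volume."""
    required = set(required_volume_ids)
    # Subset test: a host passes only if it has all required volumes.
    return [h for h in hosts if required <= h.volume_ids]


hosts = [
    HostState("node-1", {"vol-data-a"}),
    HostState("node-2", {"vol-data-b", "vol-image-b"}),
    HostState("node-3"),
]

# A redeploy that must land on the box still holding vol-data-b.
candidates = hosts_with_volumes(hosts, ["vol-data-b"])
print([h.name for h in candidates])
```

Asking for both a data volume and an image volume in one request, e.g.
hosts_with_volumes(hosts, ["vol-data-b", "vol-image-b"]), is the "both
volumes on the same machine" case. The sketch assumes the scheduler can
already see which volumes live on which node, which is the cross-service
part that does not exist today.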
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStackfirstname.lastname@example.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

-- 
Robert Collins <rbtcoll...@hp.com>
Distinguished Technologist
HP Converged Cloud