great, thanks
On Sat, Nov 16, 2013 at 5:10 AM, Mark Washenberger <[email protected]> wrote:

Hi folks,

My summary notes from the OpenStack Design Summit Glance sessions follow.
Enjoy, and please help correct any misunderstandings.


Image State Consistency:
------------------------

https://etherpad.openstack.org/p/icehouse-summit-image-state-consistency

In this session, we focused on the problem that a snapshot which fails
after the image record is created but before the image data is uploaded
leaves behind a pending image that will never become active, and the
only operation Nova can perform is to delete the image. Thus there is
no good way to communicate the failure to users without leaving a
useless image record around.

A solution was proposed to allow Nova to directly set the status
of the image, say to "killed" or some other state.

A problem with the proposed solution is that we have generally kept
the "status" field internally controlled by Glance, which raises some
modeling and authorization concerns. However, Nova could actually do
this today through the hacky mechanism of initiating a PUT with data
and then terminating the connection without sending a complete body,
so the authorization aspects are not really a fundamental concern.

It was suggested that the solution to this problem is to make Nova
responsible for reporting these failures rather than Glance. In the
short term, we could do the following:
- have Nova delete the image when a snapshot fails (already merged; a
  sketch of this cleanup follows below)
- merge the Nova patch to report the failure as part of instance
  error reporting

In the longer term, it was seen as desirable for Nova to treat
snapshots as asynchronous tasks and reflect those tasks in the API,
including the failure or success of those tasks.

Another long-term option that was viewed mostly favorably was to add
another asynchronous task to Glance for vanilla uploads, so that Nova
snapshots can avoid creating the image until it is fully active.

Fei Long Wang is going to follow up on what approach makes the most
sense for Nova and report back for our next steps.
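As a rough illustration of that short-term fix, here is a minimal sketch of
the Nova-side cleanup, assuming a python-glanceclient handle; the upload and
fault-reporting callables are hypothetical stand-ins for Nova's real
plumbing, not its actual code.

```python
from glanceclient import exc as glance_exc


def upload_snapshot_or_cleanup(glance, image_id, upload_fn, report_fault):
    """Run the snapshot upload; on failure, delete the pending image.

    `glance` is a python-glanceclient client; `upload_fn` and
    `report_fault` are caller-supplied callables standing in for the
    real Nova upload and instance-fault machinery.
    """
    try:
        upload_fn(image_id)
    except Exception as err:
        # Short-term approach from the session: do not leave behind a
        # 'queued'/'saving' image record that can never become active.
        try:
            glance.images.delete(image_id)
        except glance_exc.HTTPNotFound:
            pass  # already gone; nothing to clean up
        # Surface the failure through instance error reporting instead.
        report_fault(err)
        raise
```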
What to do about v1?
--------------------

https://etherpad.openstack.org/p/icehouse-summit-images-v1-api

In this discussion, we hammered out the details of how to drop the v1
API and on what timetable.

Leaning heavily on Cinder's experience dropping its v1, we came up
with the following schedule.

Icehouse:
- Announce the plan to deprecate the v1 API and registry in "J" and
  remove them in "K"
- Announce a feature freeze for the v1 API immediately
- Make sure everything in OpenStack is using v2 (cinder, nova, ?)
- Ensure v2 is fully covered in tempest tests
- Ensure there are no gaps in the migration strategy from v1 to v2
  - after the fact, it seems to me we need to produce a migration
    guide as a way to evaluate the presence of such gaps
- Make v2 the default in glanceclient
- Turn v2 on by default in the glance API

"J":
- Mark v1 as deprecated
- Turn v1 off by default in config

"K":
- Delete the v1 API and v1 registry

A few gotchas were identified. In particular, a concern was raised
about breaking stable branch testing when we switch the default in
glanceclient to v2, since the latest glanceclient will be used to test
Glance in, say, Folsom or Grizzly, where the v2 API didn't really
work at all.

In addition, it was suggested that we should be very aggressive in
using deprecation warnings for config options to communicate this
change as loudly as possible.


Image Sharing
-------------

https://etherpad.openstack.org/p/icehouse-summit-enhance-v2-image-sharing

This session focused on the gaps between the current image sharing
functionality and what is needed to establish an image marketplace.

One issue was the lack of verification of project ids when sharing an
image.

A few other issues were identified:
- there is no way to share an image with a large number of projects in
  a single API operation (today each project requires its own member
  call; see the sketch below)
- membership lists are not currently paged
- there is no way to share an image with everyone; you must know every
  other project's id

We identified a potential issue with bulk operations and verification,
namely that there is no way we know of to do bulk verification of
project ids in Keystone, so Keystone work would probably be needed to
have both of these features in place without implying super slow API
calls.

In addition, we spent some time toying with the idea of image catalogs.
If publishers put images in catalogs, rather than having shared images
show up directly in other users' image lists, things would be a lot
safer and we could relax some of our restrictions. However, there are
some issues with this approach as well:
- How do you find the catalog of a trusted image publisher?
- Are we just pushing the issue of sensible world-listings to another
  resource?
- This would be a big change.
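To make the bulk-sharing gap concrete, here is a minimal sketch of sharing
one image with several projects through the v2 members resource, one HTTP
call per project. The endpoint, token, image id, and project ids are
placeholder values, and the request shapes reflect my reading of the v2
sharing workflow rather than anything agreed in the session.

```python
import json

import requests

# Placeholder values -- substitute a real endpoint, token, and ids.
GLANCE = 'http://glance.example.com:9292'
HEADERS = {'X-Auth-Token': 'PLACEHOLDER-TOKEN',
           'Content-Type': 'application/json'}
IMAGE_ID = '11111111-2222-3333-4444-555555555555'
PROJECT_IDS = ['project-a-id', 'project-b-id', 'project-c-id']

# There is no bulk operation, so sharing with N projects is N calls.
for project_id in PROJECT_IDS:
    resp = requests.post(
        '%s/v2/images/%s/members' % (GLANCE, IMAGE_ID),
        headers=HEADERS,
        data=json.dumps({'member': project_id}))
    resp.raise_for_status()

# Each recipient must then accept the share from their own project:
#   PUT /v2/images/{image_id}/members/{project_id}
#   {"status": "accepted"}
```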
Enhancing Image Locations:
--------------------------

https://etherpad.openstack.org/p/icehouse-summit-enhance-image-location-property

This session proposed adding several attributes to image locations.

1. Add 'status' to each location.

I think the consensus was that this approach makes sense moving
forward. In particular, it would be nice to have a 'pending-delete'
status for image locations, so that when you delete a single location
from an image it can be picked up properly by the glance scrubber.

There was some concern about how we define the overall image status if
we allow other statuses on locations. Is image status just stored
independently of image location statuses? Or is it newly defined as a
function of those image location statuses?

2. Allow disk_format, container_format, and checksum to vary per location.

The use case here is that in a multi-hypervisor cloud, where different
formats are needed, the client can automatically select the correct
format when it downloads an image.

This idea was initially met with some skepticism because we have a
strong view that an image is immutable once it is created, and the
checksum is a big part of how we enforce that.

However, it was correctly pointed out that the immutability we care
about is actually a property of the block device that each image
format represents. For the moment, though, we were unsure how to
enforce that block-device immutability except by keeping the checksum
and image formats the same.

3. Add metrics to each image location.

The essential idea here is to track the performance metrics of each
image location so that we choose the fastest location. These metrics
would not be revealed as part of the API.

I think most of us were initially a bit confused by this suggestion.
However, after talking with Zhi Yan after the session, I think it
makes sense to support this in a local sense rather than storing such
information in the database. Locality is critical because different
glance nodes likely have different relationships to the underlying
locations in terms of network distance, so each node should gear
towards what is best for it.

We can also probably reuse a local metrics-tracking library to enable
similar optimizations in a future incarnation of the glance client.


Images and Taskflow
-------------------

https://etherpad.openstack.org/p/icehouse-summit-taskflow-and-glance

In this session we discussed both the general layout of taskflow and
the strategy for porting the current image tasks under development to
use taskflow, and came up with the following basic outline.

Short Term:

As we add more and more complexity to the import task, we can try to
compose the work as a flow of tasks (see the sketch below). With this
set up, our local, eventlet-backed executor (the glance task execution
engine) could be just a thin wrapper around a local taskflow engine.

Medium Term:

At some point pretty early on we are going to want to have glance
tasks running on distributed worker processes, most likely having the
tasks triggered by rpc. At this point, we can copy the existing
approach in cinder circa Havana.

Longer Term:

When taskflow engines support distributing tasks across different
workers, we can fall back to having a local task engine that
distributes tasks using that support.

During the discussion a few concerns were raised about working with
taskflow:
- tasks have to be structured in the right way to make restart,
  recovery, and rollback work
  - in other words, if we don't think about this carefully, we'll
    likely screw things up
- it remains difficult to determine if a task has stalled or failed
- we are not sure how to restart a failed task at this point

Some of these concerns may already be being addressed in the library.
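As a rough sketch of the short-term idea of composing the import work as a
flow of tasks run on a local taskflow engine: the task names, the fake image
id, and the stubbed-out bodies below are illustrative placeholders, not the
actual glance import code.

```python
import taskflow.engines
from taskflow import task
from taskflow.patterns import linear_flow


class CreateImageRecord(task.Task):
    """Create the image record; its id feeds the next task."""
    default_provides = 'image_id'

    def execute(self, image_name):
        # Stub: the real task would call the images API here.
        print('creating image record for %s' % image_name)
        return 'fake-image-id'

    def revert(self, *args, **kwargs):
        # Rollback hook: delete the half-created record so a failed
        # import does not leave a useless 'queued' image behind.
        print('rolling back image record')


class UploadImageData(task.Task):
    def execute(self, image_id, import_from):
        # Stub: the real task would fetch import_from and upload it.
        print('uploading %s to image %s' % (import_from, image_id))


# Compose the import as a linear flow of tasks.
flow = linear_flow.Flow('import-image')
flow.add(CreateImageRecord(), UploadImageData())

# Run on a local, in-process engine; 'store' supplies the inputs that
# the tasks' execute() signatures declare.
taskflow.engines.run(flow, store={
    'image_name': 'demo',
    'import_from': 'http://example.com/image.qcow2',
})
```

If the upload task raises, the engine reverts the already-completed tasks in
reverse order, which is exactly the restart/recovery/rollback structuring
concern noted above.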
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
