On Sat, Jun 14, 2014 at 7:33 AM, Eoghan Glynn <[email protected]> wrote:
> > > ----- Original Message ----- > > On 11 June 2014 20:07, Joe Gordon <[email protected]> wrote: > > > On Wed, Jun 11, 2014 at 11:38 AM, Matt Riedemann > > > <[email protected]> wrote: > > >> On 6/11/2014 10:01 AM, Eoghan Glynn wrote: > > >>> Thanks for bringing this to the list Matt, comments inline ... > > >>> > > >>>> tl;dr: some pervasive changes were made to nova to enable polling in > > >>>> ceilometer which broke some things and in my opinion shouldn't have > been > > >>>> merged as a bug fix but rather should have been a blueprint. > > >>>> > > >>>> === > > >>>> > > >>>> The detailed version: > > >>>> > > >>>> I opened bug 1328694 [1] yesterday and found that came back to some > > >>>> changes made in ceilometer for bug 1262124 [2]. > > >>>> > > >>>> Upon further inspection, the original ceilometer bug 1262124 made > some > > >>>> changes to the nova os-floating-ips API extension and the database > API > > >>>> [3], and changes to python-novaclient [4] to enable ceilometer to > use > > >>>> the new API changes (basically pass --all-tenants when listing > floating > > >>>> IPs). > > >>>> > > >>>> The original nova change introduced bug 1328694 which spams the > nova-api > > >>>> logs due to the ceilometer change [5] which does the polling, and > right > > >>>> now in the gate ceilometer is polling every 15 seconds. > > >>> > > >>> > > >>> IIUC that polling cadence in the gate is in the process of being > reverted > > >>> to the out-of-the-box default of 600s. > > >>> > > >>>> I pushed a revert in ceilometer to fix the spam bug and a separate > patch > > >>>> was pushed to nova to fix the problem in the network API. > > >>> > > >>> > > >>> Thank you for that. The revert is just now approved on the ceilometer > > >>> side, > > >>> and is wending its merry way through the gate. > > >>> > > >>>> The bigger problem I see here is that these changes were all made > under > > >>>> the guise of a bug when I think this is actually a blueprint. We > have > > >>>> changes to the nova API, changes to the nova database API, CLI > changes, > > >>>> potential performance impacts (ceilometer can be hitting the nova > > >>>> database a lot when polling here), security impacts (ceilometer > needs > > >>>> admin access to the nova API to list floating IPs for all tenants), > > >>>> documentation impacts (the API and CLI changes are not documented), > etc. > > >>>> > > >>>> So right now we're left with, in my mind, two questions: > > >>>> > > >>>> 1. Do we just fix the spam bug 1328694 and move on, or > > >>>> 2. Do we revert the nova API/CLI changes and require this goes > through > > >>>> the nova-spec blueprint review process, which should have happened > in > > >>>> the first place. > > >>> > > >>> > > >>> So just to repeat the points I made on the unlogged #os-nova IRC > channel > > >>> earlier, for posterity here ... > > >>> > > >>> Nova already exposed an all_tenants flag in multiple APIs (servers, > > >>> volumes, > > >>> security-groups etc.) and these would have: > > >>> > > >>> (a) generally pre-existed ceilometer's usage of the corresponding > APIs > > >>> > > >>> and: > > >>> > > >>> (b) been tracked and proposed at the time via straight-forward LP > > >>> bugs, > > >>> as opposed to being considered blueprint material > > >>> > > >>> So the manner of the addition of the all_tenants flag to the > floating_ips > > >>> API looks like it just followed existing custom & practice. > > >>> > > >>> Though that said, the blueprint process and in particular the > nova-specs > > >>> aspect, has been tightened up since then. > > >>> > > >>> My preference would be to fix the issue in the underlying API, but > to use > > >>> this as "a teachable moment" ... i.e. to require more oversight (in > the > > >>> form of a reviewed & approved BP spec) when such API changes are > proposed > > >>> in the future. > > >>> > > >>> Cheers, > > >>> Eoghan > > >>> > > >>>> Are there other concerns here? If there are no major objections to > the > > >>>> code that's already merged, then #2 might be excessive but we'd > still > > >>>> need docs changes. > > >>>> > > >>>> I've already put this on the nova meeting agenda for tomorrow. > > >>>> > > >>>> [1] https://bugs.launchpad.net/ceilometer/+bug/1328694 > > >>>> [2] https://bugs.launchpad.net/nova/+bug/1262124 > > >>>> [3] https://review.openstack.org/#/c/81429/ > > >>>> [4] https://review.openstack.org/#/c/83660/ > > >>>> [5] https://review.openstack.org/#/c/83676/ > > >>>> > > >>>> -- > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Matt Riedemann > > >>>> > > >>>> > > >>>> _______________________________________________ > > >>>> OpenStack-dev mailing list > > >>>> [email protected] > > >>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > >>>> > > >>> > > >>> _______________________________________________ > > >>> OpenStack-dev mailing list > > >>> [email protected] > > >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > >>> > > >> > > >> While there is precedent for --all-tenants with some of the other > APIs, > > >> I'm concerned about where this stops. When ceilometer wants polling > on > > >> some > > >> other resources that the nova API exposes, will it need the same > thing? > > >> Doing all of this polling for resources in all tenants in nova puts an > > >> undue > > >> burden on the nova API and the database. > > >> > > >> Can we do something with notifications here instead? That's where the > > >> nova-spec process would have probably caught this. > > > > > > ++ to notifications and not polling. > > > > Yeah, I think we need to revert this, and go through the specs > > process. Its been released in Juno-1 now, so this revert feels bad, > > but perhaps its the best of a bad situation? > > > > Word of caution, we need to get notifications versioned correctly if > > we want this as a more formal external "API". I think Heat have > > similar issues in this area, efficiently knowing about something > > happening in Nova. So we do need to "solve this". > > > > Some kind of callback to ceilometer's REST API sounds attractive, but > > messes up dependencies. > > Yeah, I don't think that would be a workable approach, as it breaks > the separation of concerns ... i.e. nova would then need to become > aware of the format of ceilometer metering samples. > > Instead, the established model is that nova (or any other openstack > service in general) either: > > * emits resource lifecycle data in the form of notifications (which > generally contains a super-set of the data that ceilometer is > specifically interested in), > > or else: > > * allows these data to be surfaced via a RESTful API that ceilometer > can call into > > > Maybe implemented as a plugin to a Nova > > "notification system", could work better somehow? A versioned > > interface that issues "server life cycle" events (some versioned > > subset of notifications), and allow services to implement their own > > plugins to that interface. But maybe we should probably just version > > the current notifications, rather than re-invent another interface. > > I don't fully understand the concept of services implementing their > own plugins to this versioned interface. > > Would the implementing service in this case be nova or ceilometer? > > If the latter, why re-invent the existing notifications mechanism > layered over AMQP (i.e. oslo.messaging or the old openstack.common > nofications, now deprecated)? > > In terms of versioning, it would be great to protect ceilometer from > unilateral changes to or inconsistencies in the notification formats > used by the services, as this has been found to be brittle in practice. > ++, right now nova only produces one stable type of API: REST. Notifications aren't versioned and we don't really gate on their format (or document them). Without versioning and proper gating, notifications will never become a stable contractual API. While I would like to see a way to protect ceilometer from unilateral changes, I think we should focus on making sure notifications become a contractual stable API, just like REST. As that is the more general issue is at play here. > Cheers, > Eoghan > > > Thanks, > > John > > > > _______________________________________________ > > OpenStack-dev mailing list > > [email protected] > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > > _______________________________________________ > OpenStack-dev mailing list > [email protected] > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >
_______________________________________________ OpenStack-dev mailing list [email protected] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
