On Thu, Feb 19, 2015, at 01:09 PM, Ben Nemec wrote:
> Hi,
>
> Mike Bayer recently tracked down an issue with database errors in Cinder
> to a single database connection being shared over multiple processes.
> This is not something that should happen, and it turns out to cause
> intermittent failures in the Cinder volume service. Full details can be
> found in the bug here: https://bugs.launchpad.net/cinder/+bug/1417018
> and his mailing list thread here:
> http://lists.openstack.org/pipermail/openstack-dev/2015-February/057184.html
>
> The question we're facing is what to do about it. There's quite a lot
> of discussion on https://review.openstack.org/#/c/156725 and in
> http://eavesdrop.openstack.org/irclogs/%23openstack-oslo/%23openstack-oslo.2015-02-18.log
> starting at 2015-02-18T21:38:12 but I'll try to summarize it here.
>
> On the plus side, we have a way to detect this sort of thing in oslo.db.
> That's what Mike's change 156725 is about. On the minus side,
> recovering from this in oslo.db is papering over a legitimate problem in
> the calling service, and a lot of the discussion has been around how to
> communicate that to the calling service. A few options that have been
> mentioned:
>
> 1) Leave the linked change as-is, with a warning logged that will
> hopefully be seen and prompt a fix in the service.
>
> The concern raised with this is that the warning log level is a very
> operator-visible thing and there's nothing an operator can do to fix
> this other than pester the developers. Also, it seems developers tend
> to ignore logs, so it's unlikely they'll pick up on it themselves.
>
> Note that while the errors resulting from this situation are
> intermittent, the actual situation happens on every startup of
> cinder-volume, so these messages will always be logged as it stands
> today.
>
> 2) Change the log message to debug.
>
> This is the developer-focused log level, but as noted above developers
> tend to ignore logs and it will be very easy for the message to get lost
> in the debug noise. This option would likely require someone to go
> specifically looking for the error to find it.
>
> 3) Make the error a hard failure.
>
> Rather than hide the error by recovering, fail immediately when it's
> detected. This has the problem of making all the existing Cinder code
> (and any other services with the same problem) in the wild incompatible
> with any new releases of oslo.db, but it's about the only way to make
> sure the error will be addressed now and in any future occurrences.
>
> 4) Leave the bug alone for now and just log a message so we can find out
> how widespread this problem actually is.
>
> At the moment we only know it exists in Cinder, but due to the way the
> service code works it's quite possible other projects have the same
> problem and don't know it yet.
>
> 5) Allow this sort of connection sharing to continue for a deprecation
> period with appropriate logging, then make it a hard failure.
>
> This would provide services time to find and fix any sharing problems
> they might have, but would delay the timeframe for a final fix.
>
> 6-ish) Fix oslo-incubator service.py to close all file descriptors after
> forking.
>
> This is a best practice anyway so it's something we intend to pursue,
> but it's probably more of a long-term fix because it will take some work
> to implement and make sure it doesn't break existing services. It also
> papers over the problem and according to Mike is basically a slower and
> messier alternative to his current proposed change, so it's probably a
> tangential change to avoid this in the future as opposed to a solution.
>
> If you've made it this far, thank you and please provide thoughts on the
> options presented above. :-)
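
To make the difference between recovering with a warning (options 1 and 5) and failing hard (option 3) concrete, here is a minimal sketch. It assumes detection works by comparing the PID that opened a connection with the PID that is using it; the thread does not say how change 156725 detects sharing, so treat that as an assumption. This is not oslo.db code, and TrackedConnection, SharedConnectionError and checkout() are hypothetical names invented for illustration.

```python
import logging
import os

LOG = logging.getLogger(__name__)


class SharedConnectionError(RuntimeError):
    """Raised when a connection is used outside the process that opened it."""


class TrackedConnection:
    """Hypothetical wrapper that records which process opened a connection."""

    def __init__(self, raw):
        self.raw = raw
        self.owner_pid = os.getpid()  # PID of the process that created it


def checkout(conn, connect, policy="warn", current_pid=None):
    """Return a connection that is safe to use in the calling process.

    connect: zero-argument callable that opens a new raw connection.
    policy: "warn" logs and reconnects (options 1 and 5);
            "fail" raises immediately (option 3).
    current_pid: override used only for demonstration; defaults to os.getpid().
    """
    pid = os.getpid() if current_pid is None else current_pid
    if pid == conn.owner_pid:
        return conn  # same process, nothing to do
    if policy == "fail":
        raise SharedConnectionError(
            "connection opened in pid %d used in pid %d" % (conn.owner_pid, pid))
    LOG.warning("connection opened in pid %d used in pid %d; reconnecting",
                conn.owner_pid, pid)
    # Recover by replacing the inherited connection with a fresh one.
    return TrackedConnection(connect())
```

With policy="warn" the caller keeps working but the underlying bug stays in the service; with policy="fail" the same check surfaces the bug at the first use in a child process, which is the trade-off described in the options above.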

I'm not sure why 6 is "slower", can someone elaborate on that?

Doug

>
> -Ben
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe:
> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev