On Sep 7, 2014, at 9:39 AM, John Schwarz <[email protected]> wrote:
> Hi,
>
> Long story short: for future reference, if you initialize an eventlet
> Timeout, make sure you cancel it (either with a context manager or simply
> timeout.cancel()), and be extra careful when writing tests using
> eventlet Timeouts, because these timeouts are not cancelled implicitly
> and will cause unexpected behaviours (see [1]) like gate failures. In our
> case this caused non-deterministic failures on the dsvm-functional test
> suite.

It would be good to have a fixture class in oslotest to set up eventlet
timeouts properly.

Doug

>
> Late last week, a bug was found ([2]) in which an eventlet Timeout
> object was initialized but not cancelled. This instance was left inside
> eventlet's inner workings and triggered non-deterministic "Timeout: 10
> seconds" errors and failures in dsvm-functional tests.
>
> As mentioned earlier, initializing a new eventlet.timeout.Timeout
> instance also registers it with mechanisms inside the library, and the
> reference remains there until it is explicitly removed (not merely
> until execution leaves the enclosing function, as one might expect).
> Thus, the old code (simply creating an instance without assigning it to
> a variable) left no way to cancel the timeout object. This reference
> remains throughout the "life" of a worker, so this can (and did) affect
> other tests and procedures using eventlet under the same process.
> Obviously this could easily affect production-grade systems with very
> high load.
>
> For future reference:
> 1) If you run into a "Timeout: %d seconds" exception whose traceback
> includes "hub.switch()" and "self.greenlet.switch()" calls, there might
> be a latent Timeout somewhere in the code, and a search for all
> eventlet.timeout.Timeout instances will probably produce the culprit.
>
> 2) The setup used to reproduce this error for debugging purposes is a
> baremetal machine running a VM with devstack. On the baremetal machine
> I ran some 6 "dd if=/dev/zero of=/dev/null" processes to simulate high
> CPU load (the full command can be found at [3]), and in the VM I ran the
> dsvm-functional suite. Using only a VM with a similar high-CPU
> simulation fails to reproduce the result.
>
> [1]
> http://eventlet.net/doc/modules/timeout.html#eventlet.timeout.eventlet.timeout.Timeout.Timeout.cancel
> [2] https://review.openstack.org/#/c/119001/
> [3]
> http://stackoverflow.com/questions/2925606/how-to-create-a-cpu-spike-with-a-bash-command
>
>
> --
> John Schwarz,
> Software Engineer, Red Hat.
>
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
