Re: [openstack-dev] [cinder] Volume Drivers unit tests
Eric, you're right. I've disabled all such tests using the
'@unittest.skip("Skip until bug #1578986 is fixed")' decorator in my
patch [1]:

$ grep -r '1578986' cinder/tests/unit/ | grep -v 'pyc' | wc -l
37

The next step is to fix them.

[1] https://review.openstack.org/#/c/320148/

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/
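For readers unfamiliar with the decorator, the disabling pattern Ivan describes can be sketched as follows. The test class and method here are made up for illustration; they are not taken from the Cinder tree:

```python
import unittest


class HypotheticalDriverTestCase(unittest.TestCase):
    """Illustrative only -- not a real Cinder test class."""

    @unittest.skip("Skip until bug #1578986 is fixed")
    def test_create_volume(self):
        # Never executed while the decorator is in place.
        self.fail("should not run")


# Run the case programmatically to show it is reported as skipped,
# not as a pass or a failure.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    HypotheticalDriverTestCase)
result = unittest.TestResult()
suite.run(result)
print("skipped:", len(result.skipped))
```

A plain `grep` for the bug number, as in Ivan's mail, then finds every disabled test when it is time to re-enable them.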
Re: [openstack-dev] [cinder] Volume Drivers unit tests
On 07/21/2016 05:26 PM, Knight, Clinton wrote:
> Nate, you have to press Ctrl-C to see the in-progress test, that's why
> you don't see it in the logs. The bug report shows this and points to
> the patch where it appeared to begin.
> https://bugs.launchpad.net/cinder/+bug/1578986
>
> Clinton

I think this only gives a backtrace of the test runner and not the test.

I attached gdb when this hang occurred and saw this; it looks like we
still have a thread running the oslo.messaging fake driver:

http://paste.openstack.org/raw/539769/

(Linked in the bug report as well.)
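Attaching gdb with the python extensions (the 'py-bt' command) is one way to see what a hung run is doing, as Eric did here. As a lighter-weight alternative (my suggestion, not something discussed in the thread), the stdlib faulthandler module can dump the Python-level traceback of every thread from inside the process; the sleeping worker below merely simulates a stuck thread:

```python
import faulthandler
import tempfile
import threading
import time


def dump_all_threads():
    # faulthandler writes at the file-descriptor level, so it needs a
    # real file rather than a StringIO.
    with tempfile.TemporaryFile(mode='w+') as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


# Simulate a stuck background thread, then inspect what it is doing.
worker = threading.Thread(target=time.sleep, args=(5,), daemon=True)
worker.start()
time.sleep(0.2)  # let the worker enter its sleep
print(dump_all_threads())
```

Wiring `dump_all_threads()` into a test-runner timeout handler (or using `faulthandler.dump_traceback_later`) gives a backtrace of the hung test itself, not just the runner.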
Re: [openstack-dev] [cinder] Volume Drivers unit tests
Nate, you have to press Ctrl-C to see the in-progress test; that's why
you don't see it in the logs. The bug report shows this and points to the
patch where it appeared to begin:
https://bugs.launchpad.net/cinder/+bug/1578986

Clinton
Re: [openstack-dev] [cinder] Volume Drivers unit tests
Hi all,

I'm not totally sure that this is the same issue, but lately I've seen
the gate tests fail while hanging at this point [1], though they say 'ok'
rather than 'inprogress'. Has anyone else come across this? It only
happens sometimes, and a recheck can get past it. The full log is
here [2].

[1] http://paste.openstack.org/show/539314/
[2] http://logs.openstack.org/90/341090/6/check/gate-cinder-python34-db/ea65de5/console.html

Thanks,
Nate

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [cinder] Volume Drivers unit tests
Hi Ivan,

Do you have any logs for the VMAX driver? We'll take a look.

Thanks,
Xing
Re: [openstack-dev] [cinder] Volume Drivers unit tests
Thank you, Xing.

The issue is related to both the VNX and VMAX EMC drivers.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/
Re: [openstack-dev] [cinder] Volume Drivers unit tests
Hi Ivan,

Thanks for sending this out. Regarding the issue in the EMC VNX driver
unit tests, it is tracked by this bug:
https://bugs.launchpad.net/cinder/+bug/1578986. The driver was recently
refactored, so this is probably a new issue introduced by the refactor.
We are investigating it.

Thanks,
Xing
[openstack-dev] [cinder] Volume Drivers unit tests
Hi team,

First of all, I would like to apologize if my mail is too emotional. I
spent too much time trying to fix this and failed.

TL;DR:

What I want to say is: "Let's spend some time making our tests better and
fixing all the issues." Patch [1] is still unstable: unit tests can pass
or fail in random order. Also, I've disabled some tests to pass CI.

Long version:

While I was working on the patch "Move drivers unit tests to
unit.volume.drivers directory" [1], I found a lot of issues with our unit
tests :(. Not all of them are fixed yet, so that patch is still in
progress.

What I found, and what we have to fix:

1) Execution time [2]. I don't want to argue about what a unit test is,
but 2-4 seconds per test should be unacceptable, IMO.

2) Execution order. Seriously, did you know that our tests will fail or
hang if the execution order changes? Even if one test for driver A fails,
some tests for driver B will fail too.

3) Lack of mocking. It's the root cause of #2. We don't mock sleeps and
event loops correctly, and we don't mock RPC calls well either [3]. We
don't have a 'cinder.openstack.common.rpc.impl_fake' module in the Cinder
tree.

In some drivers, we use oslo_service.loopingcall.FixedIntervalLoopingCall
[4]. We've got a ZeroIntervalLoopingCall [5] class in Cinder. Do we use
it everywhere, or mock FixedIntervalLoopingCall correctly? I don't think
so: I hacked oslo_service in my environment to raise an exception if
interval > 0, and 297 tests failed. That means our tests really sleep. We
have to get rid of this. TBH, not only volume driver unit tests failed;
some API unit tests failed too, for example.

4) Due to #3, unit tests sometimes hang even on the master branch with
minor changes. If I stop execution of such tests, I usually see something
like [6]. In most cases, the hanging tests belong to the EMC, Huawei,
Dell and RBD drivers.

It's hard to debug such failures because of the lack of tooling for
eventlet debugging. Eventlet backdoors and gdb-python help a bit. Maybe
somebody knows a better solution.

[1] https://review.openstack.org/#/c/320148/
[2] http://paste.openstack.org/show/539081/
[3] https://github.com/openstack/cinder/search?utf8=%E2%9C%93=impl_fake
[4] https://github.com/openstack/oslo.service/blob/master/oslo_service/loopingcall.py#L162
[5] https://github.com/openstack/cinder/blob/cfbb5bde4d9b37c39f6813fe685f987f8a990483/cinder/tests/unit/utils.py#L289
[6] http://paste.openstack.org/show/539090/

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/
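Ivan's point 3 — patch out sleeps and fixed-interval loops instead of really waiting — can be sketched with plain mock. The wait_for_ready helper below is made up for illustration, not real Cinder code; in Cinder itself the analogous move is patching oslo_service.loopingcall.FixedIntervalLoopingCall with the in-tree ZeroIntervalLoopingCall [5]:

```python
import time
from unittest import mock


# Hypothetical driver helper that polls a backend with real sleeps --
# the kind of code that makes a unit test take seconds to run.
def wait_for_ready(check, interval=2, retries=3):
    for _ in range(retries):
        if check():
            return True
        time.sleep(interval)
    return False


# In a test, patch the sleep away so the retries are instantaneous.
with mock.patch('time.sleep') as fake_sleep:
    results = iter([False, False, True])
    ok = wait_for_ready(lambda: next(results))

print(ok, fake_sleep.call_count)  # True 2
```

The mock also records how often the code tried to sleep, so a test can assert on the retry behaviour itself instead of relying on wall-clock time.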