[openstack-dev] [all] In defence of faking

2014-09-22 Thread Matthew Booth
If you missed the inaugural OpenStack Bootstrapping Hour, it's here:
http://youtu.be/jCWtLoSEfmw . I think this is a fantastic idea and big
thanks to Sean, Jay and Dan for doing this. I liked the format, the
informal style and the content. Unfortunately I missed the live event,
but I can confirm that watching it after the event worked just fine
(thanks for reading out live questions for the stream!).

I'd like to make a brief defence of faking, which perhaps predictably in
a talk about mock took a bit of a bashing.

Firstly, when not to fake. As Jay pointed out, faking adds an element of
complexity to a test, so if you can achieve what you need to with a
simple mock then you should. But, as the quote goes, you should make
things as simple as possible, but not simpler.

Here are some simple situations where I believe fake is the better solution:

* Mock assertions aren't sufficiently expressive on their own

For example, imagine your code is calling:

def complex_set(key, value)

You want to assert that, on completion of your unit, the final value
assigned to key was value. This is difficult to capture with mock alone
without risking false assertion failures: complex_set may set other keys
you aren't interested in, or may set key multiple times when only the
last value matters. A little fake function which stores the final value
assigned to key does this simply and accurately, without adding a great
deal of complexity. e.g.

mykey = [None]
def fake_complex_set(key, value):
    if key == 'FOO':
        mykey[0] = value

with mock.patch.object(unit, 'complex_set', side_effect=fake_complex_set):
    run_test()
self.assertEqual('expected', mykey[0])

Summary: fake method is a custom mock assertion.

* A simple fake function is simpler than a complex mock dance

For example, you're mocking 2 functions: start_frobnicating(key) and
wait_for_frobnication(key). Calls to them can overlap with different
keys, and the desired return value of one depends on the arguments
passed to the other. This is better handled with a couple of little fake
functions sharing some external state; otherwise you risk introducing
artificial constraints on the code under test.
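A minimal sketch of that shape of test (all names invented for illustration): the two fakes share a dict, so the value returned by the wait depends on what the start recorded, which a pair of independent mock return values can't express:

```python
from unittest import mock

class FrobnicatorClient:
    """Hypothetical client; the real methods would talk to a remote service."""
    def start_frobnication(self, key):
        raise NotImplementedError('talks to a real service')

    def wait_for_frobnication(self, key):
        raise NotImplementedError('talks to a real service')

def process(client, a, b):
    # Code under test: two overlapping frobnications with different keys.
    client.start_frobnication(a)
    client.start_frobnication(b)
    return client.wait_for_frobnication(a), client.wait_for_frobnication(b)

def run_example():
    client = FrobnicatorClient()
    started = {}  # shared state linking the two fakes

    def fake_start(key):
        started[key] = 'frobnicated-%s' % key

    def fake_wait(key):
        # Returns whatever fake_start recorded for this key.
        return started.pop(key)

    with mock.patch.object(client, 'start_frobnication',
                           side_effect=fake_start), \
         mock.patch.object(client, 'wait_for_frobnication',
                           side_effect=fake_wait):
        return process(client, 'a', 'b')
```

Each fake stays tiny; the only added complexity is the shared dict.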

Jay pointed out that faking methods creates more opportunities for
errors. For this reason, in the above cases, you want to keep your fake
function as simple as possible (but no simpler). However, there's a big
one: the fake driver!

This may make less sense outside of driver code, although faking the
image service came up in the talk. Without looking at the detail, that
doesn't necessarily sound awful to me, depending on context. In the
driver, though, the ultimate measure of correctness isn't a Nova call:
it's the effect produced on the state of an external system.

For the VMware driver we have nova.tests.virt.vmwareapi.fake. This is a
lot of code: 1599 lines as of writing. It contains bugs, and it contains
inaccuracies, and both of these can mess up tests. However:

* It's vastly simpler than the system it models (vSphere server)
* It's common code, so gets fixed over time
* It allows tests to run almost all driver code unmodified

So, for example, it knows that you can't move a file before you create
it. It knows that creating a VM creates a bunch of different files, and
where they're created. It knows what objects are created by the server,
and what attributes they have. And what attributes they don't have. If
you do an object lookup, it knows which objects to return, and what
their properties are.

All of this knowledge is vital to testing, and if it weren't contained in
the fake driver, or something like it[1], it would have to be replicated
across all the tests which require it. That is, it may be 1599 lines of
complexity, but it's all complexity which has to live somewhere anyway.
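To illustrate the kind of rule such a fake encodes, here's a toy sketch (nothing like the real 1599-line nova.tests.virt.vmwareapi.fake; names invented) of a datastore model that refuses to move a file before it exists:

```python
class FakeDatastore:
    """Toy in-memory datastore model, purely illustrative. It encodes one
    rule the real system enforces: you can't move a file before you
    create it."""

    def __init__(self):
        self._files = set()

    def create_file(self, path):
        self._files.add(path)

    def move_file(self, src, dst):
        if src not in self._files:
            # Fail the way the real server would.
            raise FileNotFoundError(src)
        self._files.remove(src)
        self._files.add(dst)

    def exists(self, path):
        return path in self._files
```

Driver code exercised against such a model fails in roughly the same places it would fail against the real system, without every test having to restate the rule.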

Incidentally, this is fresh in my mind because of
https://review.openstack.org/#/c/122760/ . Note the diff stat: +70,
-161, and the rewrite has better coverage, too :) It executes the
function under test, it checks that it produces the correct outcome, and
other than that it doesn't care how the function is implemented.

TL;DR

* Bootstrap hour is awesome
* Don't fake if you don't have to
* However, there are situations where it's a good choice

Thanks for reading :)

Matt

[1] There are other ways to skin this cat, but ultimately if you aren't
actually spinning up a vSphere server, you're modelling it somehow.
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] In defence of faking

2014-09-22 Thread Monty Taylor
On 09/22/2014 08:58 AM, Matthew Booth wrote:
 [snip]
 
 TL;DR
 
 * Bootstrap hour is awesome
 * Don't fake if you don't have to
 * However, there are situations where it's a good choice

Yes. I agree with all of these things.

For reference, we use several complex-ish fakes in nodepool testing
because it's the sanest thing to do.

Also, the nova fake-virt driver is actually completely amazing and
allows us to test system state things that otherwise would be hard to model.

However - 100 times yes - don't go faking things when a mock will do the
job - and that's almost always.




Re: [openstack-dev] [all] In defence of faking

2014-09-22 Thread Robert Collins
On 23 September 2014 03:58, Matthew Booth mbo...@redhat.com wrote:
 [snip]

 TL;DR

 * Bootstrap hour is awesome
 * Don't fake if you don't have to
 * However, there are situations where it's a good choice

I'm going to push on this a bit further. Mocking is fine in many
cases, but it encodes dependencies from within your objects into the
test suite: e.g. that X will call Y and then Z. If that's what you want
to *test*, that's great, but outside of the very smallest unit tests it
often isn't, because it makes your tests fragile to change: tests for a
class Foo need to change when a parent class Bar alters, or when a
utility class Quux alters. That coupling has a cost and is a pain.
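To make the fragility concrete, a minimal sketch (invented names) contrasting a call-sequence mock test with an outcome-based fake:

```python
from unittest import mock

def save_user(db, name):
    # Implementation detail: today it happens to call begin/insert/commit.
    db.begin()
    db.insert('users', name)
    db.commit()

def fragile_test():
    # Mock style: the exact call sequence is encoded into the test, so a
    # refactor of save_user (say, to a transaction context manager) breaks
    # it even though the behaviour is unchanged.
    db = mock.Mock()
    save_user(db, 'alice')
    db.assert_has_calls([mock.call.begin(),
                         mock.call.insert('users', 'alice'),
                         mock.call.commit()])
    return True

class FakeDB:
    # Fake style: only the committed rows matter.
    def __init__(self):
        self.rows = []
        self._pending = []

    def begin(self):
        self._pending = []

    def insert(self, table, value):
        self._pending.append((table, value))

    def commit(self):
        self.rows.extend(self._pending)
        self._pending = []

def outcome_test():
    db = FakeDB()
    save_user(db, 'alice')
    return db.rows
```

The fake-based test keeps passing across refactors as long as the right rows end up committed.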

Ad-hoc fakes are no better than mocks in this regard, but verified
fakes are substantially better and give 

Re: [openstack-dev] [all] In defence of faking

2014-09-22 Thread Joshua Harlow
I would have to agree, fakes certainly have their benefits.

IMHO, sometimes when you are testing a complex set of interactions you
don't want to have to mock the full set of interactions when you just
want to determine whether the end result worked or not.

For example, let's say I have a workflow:

A -> B -> C

Where B is composed of 20 different things (or B could itself be
composed of 20 sub-workflows...).

To use mock here you would have to tightly couple the test to whatever
those workflows are doing internally, just to make mock work correctly,
when in reality you just want to know whether A -> B -> C worked as
expected, with the expected C being created/adjusted/whatever for the
given input of A.
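A sketch of that shape of test (all names invented): the steps write through a fake store, and the test only inspects the end state, never the internals of B:

```python
class FakeStore:
    """Hypothetical in-memory backend shared by all workflow steps."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data[key]

def step_a(store):
    store.put('a', 1)

def step_b(store):
    # Imagine 20 sub-steps in here; the test doesn't need to know.
    store.put('b', store.get('a') + 1)

def step_c(store):
    store.put('c', store.get('b') * 10)

def run_workflow(store):
    for step in (step_a, step_b, step_c):
        step(store)
    return store.get('c')
```

B can be rewritten freely; the test only cares that the expected C appears for the given input of A.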

An example that I think is useful is one that I have created for faking
zookeeper:

https://pypi.python.org/pypi/zake*

*mentioned @ http://kazoo.readthedocs.org/en/latest/testing.html#zake

Since it's not typically possible to run zookeeper during testing, or to
have a strong dependency on zookeeper when running unit tests, the above
was created to function like a real kazoo zookeeper client. It makes it
possible to trigger the same set of things a real zookeeper client would
trigger, but without requiring zookeeper: you can inject data into that
fake kazoo client, delete data, and test how that affects your code...

This allows users of kazoo to test interactions of complex systems
without trying to mock out that entire interaction and without having to
set up zookeeper to do the same (coupling your tests to a functioning
zookeeper application).

TL;DR: imho testing the interaction and expected outcome is easier with
fakes, and doesn't tightly couple you to an implementation. Mock is
great for testing methods, small APIs, simple checks, and
returned/raised results, but I believe the better test is one that tests
an interaction in a complex system for a desired result (without
requiring that full system to be set up to test that result), because in
the end that complex system is what users use as a whole.

My 2 cents.

On Sep 22, 2014, at 12:36 PM, Robert Collins robe...@robertcollins.net wrote:

 On 23 September 2014 03:58, Matthew Booth mbo...@redhat.com wrote:
 [snip]

Re: [openstack-dev] [all] In defence of faking

2014-09-22 Thread Jay Pipes

On 09/22/2014 03:36 PM, Robert Collins wrote:

On 23 September 2014 03:58, Matthew Booth mbo...@redhat.com wrote:

[snip]


I'm going to push on this a bit further. Mocking is fine in many
cases, but it encodes dependencies from within your objects into the
test suite: e.g. that X will call Y and then Z - and if thats what you
want to *test*, thats great, but it often isn't outside of the very
smallest unit tests - because it makes your tests fragile to change -
tests for a class Foo need to change when a parent class Bar alters,
or when a utility class Quux alters. And that coupling has a cost and
is a pain.


It's a pain that is on purpose, though. The coupling of the unit test to
the class relationship is precisely to cause pain when the interface
changes. It is this pain that informs you that oh, I changed an
interface, so I need to change the unit test that verifies that
interface.

Re: [openstack-dev] [all] In defence of faking

2014-09-22 Thread Robert Collins
On 23 September 2014 08:09, Jay Pipes jaypi...@gmail.com wrote:
 On 09/22/2014 03:36 PM, Robert Collins wrote:

  [snip]


 It's a pain that is on purpose, though. The coupling of the unit test to the
 class relationship is precisely to cause pain when the interface changes. It
 is this pain that informs you that oh, I changed an interface, so I need to
 change the unit test that verifies that interface.

 The problem we have in OpenStack right now is too many fake stub methods
 that hide too much, either using stuff like:

 def fake_get_instance(*args, **kwargs):
     ... blah

 or returning completely incorrect stuff (like dicts instead of objects, thus
 in some cases making our unit tests the *only* remaining place where certain
 things are passed as dicts and not objects).

 Also, for the record, when I refer to fakes, I'm mostly referring to stubbed
 out methods that really should instead be calls to
 mock.MagicMock.assert_called[_once]_with().

 I'm not referring to fake drivers. Those are a special kind of problem in
 OpenStack, but not related to the use of fake stub methods when mock should
 be used instead.

 The problem with some fake drivers in OpenStack code is demonstrated here:

 https://github.com/openstack/nova/blob/e7d4ede1317ca0401bf8b4d49d58d000c53b1ed2/nova/tests/image/fake.py#L36

 * The fixtures contain types that don't actually come back from Glance (for
 example size isn't a string and the datetime format is wrong for some
 versions of Glance API)

 * The fake download() method ends up only exposing part of the real
 download() method, and it gets the return arguments wrong for other parts

 * The fake download() method masks over the real driver's complexity in
 requiring the v2 Glance API (the locations attribute of the image) and thus
 introduces further mistakes into the unit tests that (used to) depend on
 this thing.

 * The fake stub_out_image_service() completely messes up the return values
 by returning the wrong parameters

 All of this is just the stubbed out fakes for the image service. All of the
 above introduced subtle bugs into the unit tests and produced a situation
 where the unit tests were erroneously passing when the RBD image backend
 patches (the ones that were reverted days after inclusion right before
 Icehouse was released, remember?) used the fake image service instead of
 actually verifying the interface that exists in the real driver.

 If instead of writing full fake drivers for some of these important parts of
 the code base, we instead just mocked out the specific methods that cross
 performance or dependency-sensitive boundaries, I think we'd all be a bit
 saner.
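The alternative Jay describes above - mocking just the method that crosses the boundary and asserting on the call - might look like this minimal sketch (class and signatures invented, not the real Glance client):

```python
from unittest import mock

class ImageService:
    """Stand-in for the real image service client; signature invented."""
    def download(self, context, image_id, dest_path):
        raise NotImplementedError('talks to the real image service')

def fetch_image(svc, context, image_id, dest_path):
    # Unit under test: crosses the service boundary exactly once.
    svc.download(context, image_id, dest_path)
    return dest_path

def run_example():
    svc = ImageService()
    with mock.patch.object(svc, 'download') as download:
        result = fetch_image(svc, 'ctx', 'an-image-id', '/tmp/img')
    # Verify the real interface was used as expected, rather than hiding
    # it behind a hand-written stub that can drift from the real signature.
    download.assert_called_once_with('ctx', 'an-image-id', '/tmp/img')
    return result
```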

So that fake ImageService is *not* a verified fake - as you say, it
had many forms of skew vs the real service, and those were fatal for
us. Mocking is no safer than that fake ImageService though: because
mocks are tautologies, they are subject to exactly the same skew, just
worse. They are no use for ensuring that code works in reality, only
for checking that the locally provided definition works. Mocks that
record interactions with real objects and capture them for you are a lot
better, but we don't have any of those in OpenStack today.
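Concretely, a verified fake is one exercised by the same contract tests as the real implementation. A minimal sketch of the pattern, using a hypothetical key-value store:

```python
def check_contract(store_factory):
    """Shared contract: run against the fake in unit tests and against the
    real client in an integration job. Passing both is what makes the fake
    'verified' and keeps it from skewing."""
    store = store_factory()
    store.put('k', 'v')
    assert store.get('k') == 'v'
    store.delete('k')
    assert store.get('k') is None
    return True

class FakeStore:
    """In-memory fake of the hypothetical key-value service."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)
```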

-Rob

-- 
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud
