Re: Feedback on a base fake type in the testing repo

2015-02-14 Thread John Meinel

 ...



 Still, keeping the per-test stubbed return values in sync with the
 actual low-level behavior is the real challenge when using stubs.
 Perhaps a happy medium (without an ideal fake) would be to drive a
 simple fake with tables of known outputs.  Then those tables could be
 vetted against the real thing.  At my last job (manufacturing testing
 for SSDs) we actually did something like this for our code base.  We
 would periodically run drives through the system and record the
 low-level inputs and outputs.  Then for testing we would use a fake to
 play it back and make sure we got the same behavior.
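
A minimal sketch of that record-and-playback idea in Go (hypothetical names
and encodings; not the actual SSD tooling):

    package playback

    import "testing"

    // recordedCall is one interaction captured from the real system.
    type recordedCall struct {
        Method string
        Args   string // serialized arguments, as recorded
        Reply  string // serialized response, as recorded
    }

    // playbackFake answers each call from the recorded table and fails the
    // test on anything that was never recorded, so drift from the real
    // behavior shows up immediately.
    type playbackFake struct {
        t     *testing.T
        table map[string]recordedCall // keyed by Method + "|" + Args
    }

    func (f *playbackFake) call(method, args string) string {
        rec, ok := f.table[method+"|"+args]
        if !ok {
            f.t.Fatalf("no recorded reply for %s(%s)", method, args)
        }
        return rec.Reply
    }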


Just as a quick note, that was the goal of the
environs/jujutest/livetests.go suite. Specifically, it was intended to be
run against:
1) the testing server (i.e. the fake) for a given provider
2) the live server for a given provider, when supplied with a -live sort
of flag.
3) both 1 and 2 for all providers that we have implemented.

The idea is that, rather than relying on stubs that return canned
responses, you can test that your fake server acts like the real remote
production server, and can be substituted for the production server in your
testing. And then, ideally, each Environ implementation (which should be
called Provider, fwiw) conforms to the Environ interface abstraction, and
you can again substitute any of them in a given test.
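
A sketch of what that substitution can look like in a test, assuming a
hypothetical, trimmed-down Environ-like interface (the real suite is
environs/jujutest/livetests.go):

    package environ // in an environ_test.go file

    import "testing"

    // Environ is a hypothetical, trimmed-down provider abstraction.
    type Environ interface {
        StartInstance(name string) error
        Instances() ([]string, error)
    }

    // fakeEnviron is a tiny in-memory provider; the live provider would
    // implement the same interface.
    type fakeEnviron struct{ started []string }

    func (f *fakeEnviron) StartInstance(name string) error {
        f.started = append(f.started, name)
        return nil
    }

    func (f *fakeEnviron) Instances() ([]string, error) { return f.started, nil }

    // runEnvironTests is the shared suite: the fake server and the live
    // server (selected with a -live style flag) must both pass it.
    func runEnvironTests(t *testing.T, e Environ) {
        if err := e.StartInstance("test-0"); err != nil {
            t.Fatalf("StartInstance: %v", err)
        }
        insts, err := e.Instances()
        if err != nil || len(insts) != 1 {
            t.Fatalf("Instances = %v, %v; want exactly one instance", insts, err)
        }
    }

    func TestFakeEnviron(t *testing.T) { runEnvironTests(t, &fakeEnviron{}) }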

Arguably, at this point we could drop (1) in favor of a really good 2 and
3, running those against the live services in CI, since most of our testing
uses the 'dummy' provider anyway. Very few tests actually end up using the
goose testing server other than the direct openstack provider tests. And
while I think those are still better run against a fake, I'd personally be
willing to figure out what the pragmatic call is.

One of the really nice bits is running non-live interface tests against
each provider implementation (done in pre-commit-to-trunk testing), which
makes it more likely that we catch problems before CI notices and blocks
commits to master.

John
=:-


Re: Feedback on a base fake type in the testing repo

2015-02-13 Thread Gustavo Niemeyer
On Fri, Feb 13, 2015 at 2:05 PM, Eric Snow eric.s...@canonical.com wrote:
 As for me, by fake I mean a struct that implements an interface with
 essentially no logic other than to keep track of method calls and
 facilitate controlling their return values explicitly.  For examples
 see the implementations for GCE and in `testing/fake.go`.  Thus in
 tests a fake may be used in place of the concrete implementation of
 the interface that would be used in production.

To me this is a good fake implementation:

https://github.com/juju/juju/tree/master/provider/dummy

 The onus is on the test writer to populate the fake with the correct
 return values such that they would match the expected behavior of the
 concrete implementation.

That's an optimistic view of it, as I described.

 Regardless, I'm convinced that testing needs to include both high
 coverage via isolated unit tests and good enough coverage via full
 stack integration tests.  Essentially we have to ensure that layers
 work together properly and that low-level APIs work the way we expect
 (and don't change unexpectedly).

That's globally agreed. What's at stake is how to do these.


gustavo @ http://niemeyer.net



Re: Feedback on a base fake type in the testing repo

2015-02-13 Thread Gustavo Niemeyer
On Fri, Feb 13, 2015 at 3:25 PM, Eric Snow eric.s...@canonical.com wrote:
 This is a mock object under some well known people's terminology [1].

 With all due respect to Fowler, the terminology in this space is
 fairly muddled still. :)

Sure, I'm happy to use any terminology, but I'd prefer to not make one
up just now.

 The most problematic aspect of this approach is that tests are pretty
 much always very closely tied to the implementation, in a way that you
 suddenly cannot touch the implementation anymore without also fixing a
 vast number of tests to comply.

 Let's look at this from the context of unit (i.e. function
 signature) testing.  By implementation do you mean the
 function you are testing, or the low-level API the function is using,
 or both?  If the low-level API then it seems like the real fake
 object you describe further on would help by moving at least part of
 the test setup down out of the test and down into the fake.  However
 aren't you then just as susceptible to changes in the fake with the
 same maintenance consequences?

No, because the fake should behave as a normal type would, instead of
expecting a very precisely constrained orchestration of calls into its
interface. If we hand the implementation a fake value, it should be
able to call that value as many times as it wants, with whatever
parameters it wants, in whatever order it wants, and its behavior
should be consistent with a realistic implementation. Again, see the
dummy provider for a convenient example of that in practice.
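
A sketch of that distinction, assuming a hypothetical Store interface; the
fake keeps real state, so it answers any sequence of calls consistently
rather than expecting one scripted orchestration:

    package store

    import "fmt"

    // Store is a hypothetical low-level interface the code under test uses.
    type Store interface {
        Put(key, value string) error
        Get(key string) (string, error)
    }

    // fakeStore keeps real state, so the code under test may call it as many
    // times as it wants, with whatever arguments, in whatever order, and
    // still see behavior consistent with a realistic backend.
    type fakeStore struct{ data map[string]string }

    func newFakeStore() *fakeStore { return &fakeStore{data: map[string]string{}} }

    func (s *fakeStore) Put(key, value string) error {
        s.data[key] = value
        return nil
    }

    func (s *fakeStore) Get(key string) (string, error) {
        v, ok := s.data[key]
        if !ok {
            return "", fmt.Errorf("not found: %q", key)
        }
        return v, nil
    }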

 Ultimately I just don't see how you can avoid depending on low-level
 details (closely tied to the implementation) in your tests and still
 have confidence that you are testing things rigorously.  I think the

I could perceive that on your original email, and it's precisely why
I'm worried and responding to this thread.

If that logic held any ground, we'd never be able to have
organizations that certify the quality and conformance of devices based
on the device itself. Instead, they'd have to go into the factories to
see how each device was manufactured. But that's not what happens:
these organizations get the outcome of the production line, no matter
how that line worked, because that's the most relevant thing to test.
You can change the production line, you can optimize it away, and you
can even replace entire components, and it doesn't matter as long as
you preserve the quality of the outcome. Of course, on the way to
producing a device you'll generally make use of smaller devices, which
have their own production lines and their own checks ensuring that what
comes out of them is of high quality.

The same thing is true in code. If you spend a lot of time writing
tests for your production line, you are optimizing for the wrong goal.
You are spending a lot of time, the outcome can still be of poor
quality, and you are making it hard to optimize your production line
and potentially replace its components and methods with something
completely different. Of course, as in actual devices, code is
layered, so sub-components can be tested on their own to ensure their
promised interfaces hold water, but even there what matters is
ensuring that what they promise is being satisfied, rather than how
they are doing it.

 Also, the testing world puts a lot of emphasis on branch coverage in
 tests.  It almost sounds like you are suggesting that is not such an
 important goal.  Could you clarify?  Perhaps I'm inferring too much
 from what you've said. :)

I'd be happy to dive into that, but it's a distraction in this
conversation. You can use or not use your coverage tool irrespective
of your testing approach.

 As a recommendation to avoid digging a hole -- one that is pretty
 difficult to climb out of once you're in -- instead of testing method
 calls and cooking fake return values in your own test, build a real
 fake object: one that pretends to be a real implementation of that
 interface, and understands the business logic of it. Then, have
 methods on it that allow tailoring its behavior, but in a high-level
 way, closer to the problem than to the code.

 Ah, I like that!  So to rephrase, instead of a type where you just
 track calls and explicitly control return values, it is better to use
 a type that implements your expectations about the low-level system,
 exposed via the same API as the actual one?  This would likely still
 involve both implementing the same interface, right?  The thing I like

That's right.
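
As a sketch of what "tailoring its behavior in a high-level way" can look
like, with hypothetical names; the knob is expressed in business terms, and
the fake works out the low-level consequences:

    package cloud

    import "errors"

    // fakeCloud is a hypothetical fake provider that understands a little of
    // the business domain (instances and a quota) rather than holding canned
    // return values for particular calls.
    type fakeCloud struct {
        quota     int
        instances []string
    }

    func newFakeCloud() *fakeCloud { return &fakeCloud{quota: 10} }

    // ExhaustQuota is the high-level knob: it puts the fake into the state a
    // real account reaches when it runs out of capacity, and every later
    // StartInstance then fails the way the real service would.
    func (c *fakeCloud) ExhaustQuota() { c.quota = len(c.instances) }

    func (c *fakeCloud) StartInstance(name string) error {
        if len(c.instances) >= c.quota {
            return errors.New("quota exceeded")
        }
        c.instances = append(c.instances, name)
        return nil
    }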

 about that approach is that it forces you to document your
 expectations (i.e. dependencies) as code.  The problem is that you pay
 (in development time and in complexity) for an extra layer to engineer

This is irrelevant if you take into account the monumental future cost
of mocking everything up.

 Regardless, as I noted in an earlier message, I think testing needs to 
 involve:

 1. a mix of high branch coverage through isolated unit tests,

I'd be very careful to not 

Re: Feedback on a base fake type in the testing repo

2015-02-13 Thread Gustavo Niemeyer
On Fri, Feb 13, 2015 at 6:50 PM, Eric Snow eric.s...@canonical.com wrote:
 Using a fake for that input means you don't have to encode the
 low-level business logic in each test (just any setup of the fake's
 state).  You can be confident about the low-level behavior during
 tests as matching production operation (as long as the fake
 implementation is correct and bug free).  The potential downsides are
 any performance costs of using the fake, maintaining the fake (if
 applicable), and knowing how to manage the fake's state.
 Consequently, there should be a mechanism to ensure that the fake's
 behavior matches the real thing.

All of these costs exist whether you are dealing with one fake
implementation, or with logic spread through five hundred tests which
all include details of that interface. Hopefully the saner approach is
obvious.

 Alternately you can use a stub (what I was calling a fake) for that
 input.  On the upside, stubs are lightweight, both performance-wise
 and in terms of engineer time.  They also help limit the scope of what
 executes to just the code in the function under test.  The downside is
 that each test must encode the relevant business logic (mapping
 low-level inputs to low-level outputs) into the test's setup.  Not
 only is that fragile but the low-level return values will probably not
 have a lot of context where they appear in the test setup code
 (without comments).

Precisely. And that also implies the test knows exactly how the
function is using the interface, thus leaking its implementation into
the test, and preventing the implementation from changing even in simple
ways without breaking the test.
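
A sketch of that coupling, with hypothetical names; the scripted stub bakes
the implementation's current usage of the interface straight into the test
setup:

    package stub

    import "testing"

    // scriptedStub hands back one canned reply per expected call, in order,
    // so the test setup encodes exactly how the implementation currently
    // uses the interface (how many Gets, and in what order).
    type scriptedStub struct {
        t       *testing.T
        replies []string
        calls   int
    }

    func (s *scriptedStub) Get(key string) (string, error) {
        if s.calls >= len(s.replies) {
            s.t.Fatalf("unexpected extra call Get(%q)", key)
        }
        reply := s.replies[s.calls]
        s.calls++
        return reply, nil
    }

If the implementation later caches a value or reorders its reads, every test
scripted this way has to be updated, even though the observable behavior has
not changed.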

 I'd be very careful to not overdo this. Covering a line just for the
 fun of seeing the CPU passing there is irrelevant. If you fake every
 single thing around it with no care, you'll have a CPU pointer jumping
 in and out of it, without any relevant achievement.

 I was talking about branches in the source code, not actual CPU branch
 operations. :)

Oh, so you mean you are not testing all CPU branches!?

(/me provokes the inner perfectionist spirit ;-)


gustavo @ http://niemeyer.net



Re: Feedback on a base fake type in the testing repo

2015-02-11 Thread Andrew Wilkins
On Thu, Feb 12, 2015 at 2:53 AM, Eric Snow eric.s...@canonical.com wrote:

 tl;dr Using fakes for testing works well so I wrote a base fake type. [1]

 While working on the GCE provider, Wayne and I started taking a
 different approach to unit testing than the usual 1. expose an
 external dependency as an unexported var; 2. export it in
 export_test.go; 3. patch it out in tests. [2]  Instead we wrote an
 interface for the provider's low-level API and implemented the
 provider relative to that. [3]  Then in tests we used a fake
 implementation of that interface instead of the concrete one.  Instead
 of making the actual API requests, the fake simply tracked method
 calls and controlled return values.
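
Roughly, that shape looks like this (a sketch with hypothetical names; the
real interface is gceConnection in the GCE patch):

    package gce

    // connection is a hypothetical stand-in for the provider's low-level API
    // interface (gceConnection in the actual patch).
    type connection interface {
        Instances() ([]string, error)
        AddInstance(name string) error
    }

    // fakeConnection records the calls made on it and returns whatever the
    // test configured, instead of making real API requests.
    type fakeConnection struct {
        Calls           []string // method names, in call order
        ReturnInstances []string // handed back by Instances
        Err             error    // returned by every method when set
    }

    func (f *fakeConnection) Instances() ([]string, error) {
        f.Calls = append(f.Calls, "Instances")
        return f.ReturnInstances, f.Err
    }

    func (f *fakeConnection) AddInstance(name string) error {
        f.Calls = append(f.Calls, "AddInstance")
        return f.Err
    }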


Sounds good. The less patching the better.

Using the fake had a number of benefits:

 1. our tests were very focused on what code gets exercised, meaning we
 don't take responsibility for what amounts to testing our dependencies
 at the same time as the code at hand.
 2. the tests ran faster since they aren't making HTTP requests (even
 if just to localhost).
 3. the provider code isn't cluttered up with
 vars-only-there-to-be-patched-out. [2]
 4. the test code isn't cluttered with many separate s.PatchValue calls. [2]
 5. setting up the faked return values was cleaner.
 6. checking which methods were called (in order) along with the
 arguments, was easier and cleaner.

 In addition to all that, taking the fake approach required that we
 encapsulate our low-level dependencies in a single interface.  This
 was good because it forced us to spell out those dependencies.  That
 helped us write the provider better.  The low-level interface also
 made the code more maintainable since it makes the boundary layers
 explicit and presents them in one concise place (the interface
 definition).

 So I've taken the base fake type from the GCE patch and pulled it into
 a patch against the testing repo. [1]  I've made some adjustments based
 on use cases I've had in subsequent patches.  Nate has the bright idea
 of getting some feedback from the team before landing anything since
 it's the kind of thing that'll start popping up everywhere, and I want
 to make sure it passes muster with others.

 I'm not suggesting that fakes are the solution to all testing problems
 so let's please avoid that discussion. :)  Neither am I proposing that
 any existing tests should be converted to use fakes.  Nor am I
 suggesting the need for any policy recommending the use of fakes.  As
 well, we still need to make sure we have enough full stack test
 coverage to be confident about integration between layers.  Relatedly,
 the concrete implementation of low-level interfaces needs adequate
 test coverage to ensure the underlying APIs (remote, etc.) continue to
 match expected behavior.  So yeah, neither fakes nor isolated unit
 tests meet all our testing needs.  I just think the base fake type
 I've written will help facilitate a cleaner approach where it's
 appropriate.


Looks okay in principle, though I'm not sure how much benefit it would
add over the existing, tailored fakes in the codebase. It seems a little bit
weird that error results are encapsulated, but not non-error results.

Thoughts?

 -eric

 [1] https://github.com/juju/testing/pull/48
 [2] We were still feeling out the approach, so we took the var-and-patch
 route in some cases.
 [3] See gceConnection in http://reviews.vapour.ws/r/771/diff/#46.




Re: Feedback on a base fake type in the testing repo

2015-02-11 Thread Eric Snow
On Wed, Feb 11, 2015 at 6:43 PM, Andrew Wilkins
andrew.wilk...@canonical.com wrote:
 Looks okay in principle, though I'm not sure how much benefit it would
 add over the existing, tailored fakes in the codebase. It seems a little bit
 weird that error results are encapsulated, but not non-error results.

Yeah, I've tried to make that work several different ways, to no avail.
It's messy since return types vary greatly, and a mechanism to
accommodate them would require extra boilerplate in faked methods to
handle type conversions.  So it didn't seem worth the bother to
me.*  Error returns are a different story since there is only one type
to deal with (error).

-eric

* I suppose I could use reflection to make it all work out, but I'd
rather avoid the complexity (at least in this initial patch).
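
For a sense of the split, a sketch with hypothetical names (not the actual
juju/testing API): the base type only handles call recording and a queue of
errors, since error is the one return type every method shares, while typed
results stay on the concrete fake:

    package fake

    // Fake is a hypothetical base type handling what every fake shares:
    // recording calls and popping queued errors, error being the one return
    // type common to all methods.
    type Fake struct {
        Calls  []string
        Errors []error // one entry consumed per call; nil entry means success
    }

    func (f *Fake) AddCall(name string) error {
        f.Calls = append(f.Calls, name)
        if len(f.Errors) == 0 {
            return nil
        }
        err := f.Errors[0]
        f.Errors = f.Errors[1:]
        return err
    }

    // A concrete fake embeds the base and keeps its own typed results, which
    // is why non-error return values stay out of the base type.
    type fakeConn struct {
        Fake
        ReturnInstances []string
    }

    func (c *fakeConn) Instances() ([]string, error) {
        if err := c.AddCall("Instances"); err != nil {
            return nil, err
        }
        return c.ReturnInstances, nil
    }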
