Re: How expensive are tests?

2008-04-27 Thread bsdlogical

The testing issue was something that I emphasized in my GSOC proposal
(which wasn't accepted). I may or may not work on it this summer, but
for anyone interested, the full proposal is available here:

https://www.dupontmanual.org/wikis/spectops//FrontPage/DjangoTestFramework

Part 2 might be of interest specifically.

Nick
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: How expensive are tests?

2008-04-27 Thread Malcolm Tredinnick


On Sun, 2008-04-27 at 19:42 -0400, Marty Alchin wrote:
[...]
> Well, my concern now is the notion of separate tests for separate
> things. We have modeltests and regressiontests, and it seems (to me)
> like the file storage does a little of both.

This is a case of (probably unfortunate) historical evolution. We used to
have model tests, miscellaneous tests (which lived in the root test
directory) and then we introduced regression tests for stuff that was
truly boring, but needed to be tested for some cases that might have a
chance of coming back in the future. They were initially all tests
against specific ticket items. For better or worse, "regressiontests"
has been effectively redefined to mean "everything but model tests",
leading to this conundrum every time.

The problem is that this means we've confused the cases where tests also
act as best practice examples, but aren't exhibiting model features, and
the cases where tests are doing truly ugly stuff to try to trigger
edge-case bugs. This makes me sad. :-(


> So, I guess it's less of a technical question and more a philosophical
> one. Which tests should go where?

So use both. Put the example file field usage in model tests and the
other stuff in regression tests.

Malcolm

-- 
The cost of feathers has risen; even down is up! 
http://www.pointy-stick.com/blog/





Re: How expensive are tests?

2008-04-27 Thread Marty Alchin

On Sun, Apr 27, 2008 at 1:37 AM, Russell Keith-Magee
<[EMAIL PROTECTED]> wrote:
>  - The database reset is only used by django.test.TestCase. If you are
>  using raw unittest.TestCase or doctests, the database isn't flushed,
>  and so they can run quite quickly.

This addresses the root of my question. I planned on doing everything
in doctests, so I don't feel bad having two separate test modules.

>  At the top level, you have regression tests for file uploads, so it
>  makes sense that they are all in a single package. However, that
>  doesn't preclude some internal organization. Having multiple test
>  classes in a single tests.py file is one way to do this (and there are
>  already a few examples of this in the regression tests for the
>  fixtures and test cases).

Well, my concern now is the notion of separate tests for separate
things. We have modeltests and regressiontests, and it seems (to me)
like the file storage does a little of both.

On one hand, there's the file storage code itself, which solely deals
with storing and retrieving files. This has absolutely nothing to do
with the database or any models, and has nothing to do with the upload
process itself, or any of the form handling for it. It deals solely
with mapping filenames to actual stored content. I was thinking
tests for this would be best placed in regressiontests.

On the other hand, the file storage patch also includes a number of
changes and improvements to FileField and how its various operations
work. As model fields, I figured tests for this portion of the patch
would be best placed in modeltests.

So, I guess it's less of a technical question and more a philosophical
one. Which tests should go where?

-Gul




Re: How expensive are tests?

2008-04-27 Thread Russell Keith-Magee

On Sun, Apr 27, 2008 at 1:51 PM, Trevor Caira <[EMAIL PROTECTED]> wrote:
>
>  Would it be possible to, instead of manually specifying when to reset
>  the database, automatically reset the database the first time that a
>  test attempts to access it in a test case, and not reset again until
>  the next test case after one which modifies it? This way, the database
>  can be guaranteed to be reset in the same way as before, but no
>  unnecessary resets are done.

It's not as simple as you make it sound. Completely aside from the
problem of how to instrument the database cursor to support this idea,
it's not as simple as just checking for 'database access'.

If a test doesn't use the database at all, it doesn't need to use the
django.test.TestCase, so you can already avoid the overhead of doing
database resets.

If you are using the database, it makes a difference whether you are
doing reads or writes. If all you are doing is reading the database,
you don't need to reset (except perhaps for an initial per TestCase
reset when the fixture is established), but if you write you will need
to reset. This means that you need to be able to distinguish between
read and write operations on the database - but the only interface to
the database is a cursor that takes string input.

I'm open to suggestions on how this could be done, but to my eyes,
it's not a trivial problem.
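
For illustration only, here is a naive sketch of the kind of cursor
instrumentation being discussed. It wraps a plain sqlite3 cursor (not
Django's) and guesses "write" from the first keyword of the SQL string.
TrackingCursor, WRITE_KEYWORDS and demo() are made-up names for this
sketch, not anything that exists in Django.

```python
import sqlite3

# Assumption: a statement is a "write" if it starts with one of these words.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "DROP", "ALTER"}


class TrackingCursor:
    """Hypothetical wrapper that records whether any statement looked like a write."""

    def __init__(self, cursor):
        self._cursor = cursor
        self.dirty = False  # True once a statement that looks like a write has run

    def execute(self, sql, params=()):
        # Naive heuristic: classify the statement by its first keyword only.
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word in WRITE_KEYWORDS:
            self.dirty = True
        return self._cursor.execute(sql, params)

    def __getattr__(self, name):
        # Delegate fetchone(), fetchall(), etc. to the real cursor.
        return getattr(self._cursor, name)


def demo():
    conn = sqlite3.connect(":memory:")
    cursor = TrackingCursor(conn.cursor())
    cursor.execute("CREATE TABLE item (name TEXT)")
    cursor.dirty = False  # pretend the fixture has just been loaded

    cursor.execute("SELECT COUNT(*) FROM item")
    after_read = cursor.dirty

    cursor.execute("INSERT INTO item (name) VALUES (?)", ("spam",))
    after_write = cursor.dirty
    conn.close()
    return after_read, after_write
```

The heuristic breaks easily: a statement that begins with a comment or
a WITH clause, or a SELECT that calls a function with side effects, is
misclassified. That fragility is the reason string inspection alone is
not a reliable answer here.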

Yours,
Russ Magee %-)




Re: How expensive are tests?

2008-04-27 Thread Russell Keith-Magee

On Sun, Apr 27, 2008 at 3:23 PM, Simon Willison <[EMAIL PROTECTED]> wrote:
>
>  On Apr 27, 6:37 am, "Russell Keith-Magee" <[EMAIL PROTECTED]>
>  wrote:
>
>  > - The slow operation is resetting the database. As far as I know,
>  > there isn't much that can be done about this. No matter which way you
>  > do it, deleting a whole lot of data then re-establishing database
>  > structure is an expensive operation.
>
>  There's probably an obvious reason why this wouldn't work, but could
>  this be dealt with by running each test suite inside a transaction and
>  rolling back at the end of it instead of committing? Transactions can
>  be nested so theoretically it shouldn't interfere with Django's other
>  transaction related features.

As I understand it, SQLite and MySQL don't support nested transactions.

http://www.sqlite.org/lang_transaction.html
http://dev.mysql.com/doc/refman/5.0/en/implicit-commit.html
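
A small sketch with Python's sqlite3 module may make both halves of
this exchange concrete: wrapping a test in a transaction and rolling it
back, and what happens when a second BEGIN is issued inside an open
transaction. The table and the helper names are invented for the
example, and this is not how Django's test runner is implemented.

```python
import sqlite3


def run_in_rolled_back_transaction(conn, test_body):
    """Run test_body(conn) inside a transaction, then discard its writes."""
    conn.execute("BEGIN")
    try:
        test_body(conn)
    finally:
        conn.execute("ROLLBACK")


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]


def demo():
    # isolation_level=None stops the sqlite3 module from opening
    # transactions implicitly, so BEGIN/ROLLBACK are fully explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE item (name TEXT)")
    conn.execute("INSERT INTO item (name) VALUES ('fixture row')")

    seen_inside = []

    def test_body(c):
        c.execute("INSERT INTO item (name) VALUES ('written by test')")
        seen_inside.append(count_rows(c))

    run_in_rolled_back_transaction(conn, test_body)
    rows_after = count_rows(conn)

    # A second BEGIN inside an open transaction is rejected by SQLite.
    conn.execute("BEGIN")
    try:
        conn.execute("BEGIN")
        nested_ok = True
    except sqlite3.OperationalError:
        nested_ok = False
    conn.execute("ROLLBACK")
    conn.close()
    return seen_inside[0], rows_after, nested_ok
```

The test sees its own write while it runs, and the fixture row alone
survives the rollback. The rollback only helps as long as the code
under test never commits on its own, which is where the nesting
question comes in.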

Yours,
Russ Magee %-)




Re: How expensive are tests?

2008-04-27 Thread Simon Willison

On Apr 27, 6:37 am, "Russell Keith-Magee" <[EMAIL PROTECTED]>
wrote:
> - The slow operation is resetting the database. As far as I know,
> there isn't much that can be done about this. No matter which way you
> do it, deleting a whole lot of data then re-establishing database
> structure is an expensive operation.

There's probably an obvious reason why this wouldn't work, but could
this be dealt with by running each test suite inside a transaction and
rolling back at the end of it instead of committing? Transactions can
be nested so theoretically it shouldn't interfere with Django's other
transaction related features.

Cheers,

Simon



Re: How expensive are tests?

2008-04-26 Thread Trevor Caira

On Apr 27, 1:37 am, "Russell Keith-Magee" <[EMAIL PROTECTED]>
wrote:
> On Sun, Apr 27, 2008 at 12:27 PM, Marty Alchin <[EMAIL PROTECTED]> wrote:
>
> >  In particular, I thought I had remembered some discussion a while back
> >  about how expensive different test packages were, since each package
> >  requires a setup/teardown of the database. Normally this is an
> >  acceptable inconvenience, but it becomes a burden for tests that
> >  don't actually use the database.
>
> Test performance is a long standing problem (ticket #4998 for those
> keeping track). I'm open to any suggestions to improve things on this
> front. One of the GSOC projects made an interesting suggestion; more
> details on that below.
>
> >  So, my question is this: how expensive is it, really, to set up and
> >  tear down the database and whatnot, with two separate packages?
>
> - The slow operation is resetting the database. As far as I know,
> there isn't much that can be done about this. No matter which way you
> do it, deleting a whole lot of data then re-establishing database
> structure is an expensive operation. However, it is also an essential
> operation for certain tests because guaranteeing initial test state
> can only really be done if you have a clean slate to start with.
>
> - The database reset is only used by django.test.TestCase. If you are
> using raw unittest.TestCase or doctests, the database isn't flushed,
> and so they can run quite quickly.
>
> - There is nothing stopping you from mixing and matching doctests, raw
> unittest.TestCases and django.test.TestCases in a single test module.
>
> The one immediate area for improvement is to cover the case where you
> need the django.test.TestCase capabilities (fixtures, test client,
> assertions etc), but don't need the database to be reset (because the
> tests aren't modifying the test data). One idea stemming from this
> year's GSOC proposals is to add a decorator (or some similar
> mechanism) to disable the database reset on a per-test basis and/or
> adding a class variable to disable the reset for all tests in the
> class. This would allow you to use the Django test cases to set up
> data, but speed up the tests for those cases that can support the
> optimization.

Would it be possible to, instead of manually specifying when to reset
the database, automatically reset the database the first time that a
test attempts to access it in a test case, and not reset again until
the next test case after one which modifies it? This way, the database
can be guaranteed to be reset in the same way as before, but no
unnecessary resets are done.

Trevor Caira



Re: How expensive are tests?

2008-04-26 Thread Russell Keith-Magee

On Sun, Apr 27, 2008 at 12:27 PM, Marty Alchin <[EMAIL PROTECTED]> wrote:
>
>  In particular, I thought I had remembered some discussion a while back
>  about how expensive different test packages were, since each package
>  requires a setup/teardown of the database. Normally this is an
>  acceptable inconvenience, but it becomes a burden for tests that
>  don't actually use the database.

Test performance is a long standing problem (ticket #4998 for those
keeping track). I'm open to any suggestions to improve things on this
front. One of the GSOC projects made an interesting suggestion; more
details on that below.

>  So, my question is this: how expensive is it, really, to set up and
>  tear down the database and whatnot, with two separate packages?

- The slow operation is resetting the database. As far as I know,
there isn't much that can be done about this. No matter which way you
do it, deleting a whole lot of data then re-establishing database
structure is an expensive operation. However, it is also an essential
operation for certain tests because guaranteeing initial test state
can only really be done if you have a clean slate to start with.

- The database reset is only used by django.test.TestCase. If you are
using raw unittest.TestCase or doctests, the database isn't flushed,
and so they can run quite quickly.

- There is nothing stopping you from mixing and matching doctests, raw
unittest.TestCases and django.test.TestCases in a single test module.
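
As a rough illustration of the last two points, the sketch below keeps
a doctest and a raw unittest.TestCase side by side in one module, with
no database involved. slugify() is a made-up function used only as
something to test, and build_suite() collects both by hand using the
standard library; Django's own runner does its own collection.

```python
import doctest
import unittest


def slugify(text):
    """Join the lowercased words of text with hyphens.

    >>> slugify("How Expensive Are Tests")
    'how-expensive-are-tests'
    """
    return "-".join(text.lower().split())


class SlugifyTests(unittest.TestCase):
    # Plain unittest.TestCase: nothing here touches a database.
    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  file   storage "), "file-storage")


def build_suite():
    """Collect the unittest class and the doctest into one suite."""
    suite = unittest.TestSuite()
    suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests))
    finder = doctest.DocTestFinder()
    for test in finder.find(slugify, "slugify", globs={"slugify": slugify}):
        suite.addTest(doctest.DocTestCase(test))
    return suite


def run_suite():
    result = unittest.TestResult()
    build_suite().run(result)
    return result
```

A django.test.TestCase subclass could sit in the same file next to
these; only that class would pay for the database reset.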

The one immediate area for improvement is to cover the case where you
need the django.test.TestCase capabilities (fixtures, test client,
assertions etc), but don't need the database to be reset (because the
tests aren't modifying the test data). One idea stemming from this
year's GSOC proposals is to add a decorator (or some similar
mechanism) to disable the database reset on a per-test basis and/or
adding a class variable to disable the reset for all tests in the
class. This would allow you to use the Django test cases to set up
data, but speed up the tests for those cases that can support the
optimization.
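
To make the idea concrete, here is one possible shape for it, sketched
against a stand-in base class rather than django.test.TestCase. The
names skip_database_reset, reset_database and ResettingTestCase are all
hypothetical, and the "reset" is only a counter so the sketch runs
without a database.

```python
import unittest


def skip_database_reset(test_method):
    """Hypothetical decorator: mark a test as not needing a reset first."""
    test_method.skip_database_reset = True
    return test_method


class ResettingTestCase(unittest.TestCase):
    """Stand-in for a resetting test case; the 'reset' only counts calls."""

    reset_database = True  # hypothetical class-wide switch
    reset_count = 0        # how many pretend resets have happened

    def setUp(self):
        method = getattr(self, self._testMethodName)
        marked = getattr(method, "skip_database_reset", False)
        if self.reset_database and not marked:
            # A real implementation would flush and reload fixtures here.
            ResettingTestCase.reset_count += 1


class ArticleTests(ResettingTestCase):
    def test_writes_data(self):
        pass  # would modify rows, so it keeps the default reset

    @skip_database_reset
    def test_only_reads(self):
        pass  # would only read fixture data


class ReadOnlyTests(ResettingTestCase):
    reset_database = False  # no test in this class modifies data

    def test_lookup(self):
        pass


def run_tests():
    ResettingTestCase.reset_count = 0
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(ArticleTests))
    suite.addTests(loader.loadTestsFromTestCase(ReadOnlyTests))
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, ResettingTestCase.reset_count
```

Three tests run but only one reset happens. The trade-off is that a
test which skips the reset sees whatever earlier tests left behind, so
the marking is only safe when it is accurate.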

>  Is it
>  enough to merit rolling all these tests into one package, or would it
>  be worth it to maintain the separation of concerns?

At the top level, you have regression tests for file uploads, so it
makes sense that they are all in a single package. However, that
doesn't preclude some internal organization. Having multiple test
classes in a single tests.py file is one way to do this (and there are
already a few examples of this in the regression tests for the
fixtures and test cases).

If you have a lot of test cases and a single test file gets too big,
you can also consider splitting up the test module into submodules.
Rather than a single tests.py file, make a tests directory containing
a bunch of .py files containing tests. Doctests use the __test__
variable to find subtests; unittests can be broken up by putting 'from
submodule import *' into the __init__.py file in the tests directory.
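
A minimal sketch of the doctest half, using the module-level __test__
dictionary that the standard doctest module looks for. To keep it
self-contained, the "tests package" is built in memory with
types.ModuleType instead of being a real directory, and the two
docstrings stand in for what would live in separate submodules; the
names storage and fields are assumptions for the example.

```python
import doctest
import types

# Stand-ins for doctests that would normally live in tests/storage.py
# and tests/fields.py (hypothetical file names).
STORAGE_DOCTEST = """
>>> "uploads/" + "photo.jpg"
'uploads/photo.jpg'
"""

FIELD_DOCTEST = """
>>> len("photo.jpg".split("."))
2
"""


def build_tests_package():
    """Build an in-memory module playing the role of tests/__init__.py."""
    package = types.ModuleType("tests")
    # doctest consults the module-level __test__ dict for extra
    # docstrings to run, keyed by an arbitrary name.
    package.__test__ = {"storage": STORAGE_DOCTEST, "fields": FIELD_DOCTEST}
    return package


def run_doctests():
    results = doctest.testmod(build_tests_package(), verbose=False)
    return results.attempted, results.failed
```

In a real tests directory the same dictionary would simply be assigned
in __init__.py, next to the 'from submodule import *' lines for the
unittest classes.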

Yours,
Russ Magee %-)




How expensive are tests?

2008-04-26 Thread Marty Alchin

I had mentioned this a while back, in passing, but I'd like to bring
it up again now that the filestorage patch is getting close to making
it into trunk.

In particular, I thought I had remembered some discussion a while back
about how expensive different test packages were, since each package
requires a setup/teardown of the database. Normally this is an
acceptable inconvenience, but it becomes a burden for tests that
don't actually use the database.

This ties in with file storage because there are actually two
distinctly separate things to test: model-related file stuffs and
those features that don't care about models. I already have a number
of tests for the model-related work: FileField upload_to, etc. That's
all well and good, and it essentially tests the underlying file
storage code by virtue of relying on it.

However, I'm noticing that there are a number of features that exist
solely in the storage aspect, without any relation to FileField. I'd
rather test these in a separate test package, so it's obvious where
the line is drawn. For instance, it doesn't make sense for modeltests
to include a test of whether a given file path is actually beneath the
storage system's base location. I'd rather have a separate package for
those types of things.
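
As a sketch of what such a database-free test might look like, the
example below checks path containment with a raw unittest.TestCase.
is_beneath() is a made-up helper written for this illustration and is
not part of the filestorage patch; it uses posixpath so the behaviour
is the same on every platform.

```python
import posixpath
import unittest


def is_beneath(base_location, name):
    """Hypothetical helper: True if name resolves to a path inside base_location."""
    base = posixpath.normpath(base_location)
    candidate = posixpath.normpath(posixpath.join(base, name))
    return candidate.startswith(base + "/")


class StoragePathTests(unittest.TestCase):
    # Raw unittest.TestCase: no models, no database setup or teardown.
    def test_plain_name_is_beneath_base(self):
        self.assertTrue(is_beneath("/var/media", "uploads/photo.jpg"))

    def test_parent_traversal_escapes_base(self):
        self.assertFalse(is_beneath("/var/media", "../etc/passwd"))


def run_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StoragePathTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```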

So, my question is this: how expensive is it, really, to set up and
tear down the database and whatnot, with two separate packages? Is it
enough to merit rolling all these tests into one package, or would it
be worth it to maintain the separation of concerns?

If it's best to have two separate packages, I'll make that change and
upload a new patch to be looked over while we wait on the streaming
upload stuff to finish up.

-Gul
