Re: [HACKERS] Test code is worth the space

2015-08-22 Thread Noah Misch
On Tue, Aug 18, 2015 at 02:03:19PM +0100, Greg Stark wrote:
 On Tue, Aug 18, 2015 at 6:57 AM, Noah Misch n...@leadboat.com wrote:
  My own position is based on having maintained a pg_regress suite an order of
  magnitude larger than that.  I don't know why that outcome was so different.

 And does your pg_regress test suite actually find many bugs or does it
 mainly detect when functionality has changed and require updating
 expected results to match?

It found plenty of bugs and required plenty of expected output updates.

  I suspect any effort to significantly improve Postgres test coverage is
  doomed until there's an alternative to pg_regress.
 
  There is the src/test/perl/TestLib.pm harness.
 
 Sadly I think the test suite is only half the battle.

Certainly.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-22 Thread Jeff Janes
On Tue, Aug 18, 2015 at 3:32 PM, David Fetter da...@fetter.org wrote:

 On Tue, Aug 18, 2015 at 04:54:07PM +0100, Greg Stark wrote:
  On Tue, Aug 18, 2015 at 2:16 PM, David Fetter da...@fetter.org wrote:
   I'm given to understand that this tight coupling is necessary for
   performance.  Are you saying that it could be unwound, or that
   testing strategies mostly need to take it into account, or...?
 
  I'm just saying that we shouldn't expect to find a magic bullet test
  framework that solves all these problems. Without restructuring
  code, which I don't think is really feasible, we won't be able to
  have good unit test coverage for most existing code.
 
  It might be more practical to start using such a new tool for new
  code only. Then the new code could be structured in ways that allow
  the environment to be mocked more easily and the results observed
  more easily.

 Great!

 Do we have examples of such tools and code bases structured to
 accommodate them that we'd like to use for reference, or at least for
 inspiration?


+1 on that.  It would be helpful to see successful examples.  Especially
ones written in C.

I can't really figure out what success looks like just from reading the
descriptions.

Cheers,

Jeff


Re: [HACKERS] Test code is worth the space

2015-08-18 Thread David Fetter
On Tue, Aug 18, 2015 at 02:03:19PM +0100, Greg Stark wrote:
 On Tue, Aug 18, 2015 at 6:57 AM, Noah Misch n...@leadboat.com wrote:

  I suspect any effort to significantly improve Postgres test
  coverage is doomed until there's an alternative to pg_regress.
 
  There is the src/test/perl/TestLib.pm harness.
 
 Sadly I think the test suite is only half the battle. The coding
 style of Postgres predates modern test suite systems and makes it
 hard to test. Most functions require a specific environment set up
 that would be laborious and difficult to do in any sane way. Even
 something as self-contained as tuplesort would be difficult to test
 without the whole operator class and syscache mechanisms initialized
 and populated. And that's an easy case, imagine trying to test
 individual functions in the planner without doing a complete planner
 run on a query.

I'm given to understand that this tight coupling is necessary for
performance.  Are you saying that it could be unwound, or that testing
strategies mostly need to take it into account, or...?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-18 Thread Greg Stark
On Tue, Aug 18, 2015 at 6:57 AM, Noah Misch n...@leadboat.com wrote:
 My own position is based on having maintained a pg_regress suite an order of
 magnitude larger than that.  I don't know why that outcome was so different.

Comparing the size of test suites by these numbers is impossible
because people put more or fewer tests in each schedule file. I would
be more interested in the run-time as a metric but even that is
fallible as I've seen individual tests that took 30min to run.

And does your pg_regress test suite actually find many bugs or does it
mainly detect when functionality has changed and require updating
expected results to match?


 I suspect any effort to significantly improve Postgres test coverage is
 doomed until there's an alternative to pg_regress.

 There is the src/test/perl/TestLib.pm harness.

Sadly I think the test suite is only half the battle. The coding style
of Postgres predates modern test suite systems and makes it hard to
test. Most functions require a specific environment set up that would
be laborious and difficult to do in any sane way. Even something as
self-contained as tuplesort would be difficult to test without the
whole operator class and syscache mechanisms initialized and
populated. And that's an easy case, imagine trying to test individual
functions in the planner without doing a complete planner run on a
query.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-18 Thread David Fetter
On Tue, Aug 18, 2015 at 04:54:07PM +0100, Greg Stark wrote:
 On Tue, Aug 18, 2015 at 2:16 PM, David Fetter da...@fetter.org wrote:
  I'm given to understand that this tight coupling is necessary for
  performance.  Are you saying that it could be unwound, or that
  testing strategies mostly need to take it into account, or...?
 
 I'm just saying that we shouldn't expect to find a magic bullet test
 framework that solves all these problems. Without restructuring
 code, which I don't think is really feasible, we won't be able to
 have good unit test coverage for most existing code.
 
 It might be more practical to start using such a new tool for new
 code only. Then the new code could be structured in ways that allow
 the environment to be mocked more easily and the results observed
 more easily.

Great!

Do we have examples of such tools and code bases structured to
accommodate them that we'd like to use for reference, or at least for
inspiration?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-18 Thread Greg Stark
On Tue, Aug 18, 2015 at 2:16 PM, David Fetter da...@fetter.org wrote:
 I'm given to understand that this tight coupling is necessary for
 performance.  Are you saying that it could be unwound, or that testing
 strategies mostly need to take it into account, or...?

I'm just saying that we shouldn't expect to find a magic bullet test
framework that solves all these problems. Without restructuring code,
which I don't think is really feasible, we won't be able to have good
unit test coverage for most existing code.

It might be more practical to start using such a new tool for new code
only. Then the new code could be structured in ways that allow the
environment to be mocked more easily and the results observed more
easily.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-17 Thread Noah Misch
On Mon, Aug 17, 2015 at 02:04:40PM -0500, Jim Nasby wrote:
 On 8/16/15 8:48 AM, Greg Stark wrote:
 On Sun, Aug 16, 2015 at 7:33 AM, Noah Misch n...@leadboat.com wrote:
 When I've just spent awhile implementing a behavior change, the test diffs 
 are
 a comforting sight.  They confirm that the test suite exercises the topic I
 just changed.  Furthermore, most tests today do not qualify under this
 stringent metric you suggest.  The nature of pg_regress opposes it.
 
 It's not a comforting sight for me. It means I need to change the test
 results and then of course the tests will pass. So they didn't really
 test anything and I don't really know whether the changes broke
 anything. Any test I had to adjust for my change was effectively
 worthless.
 
 This is precisely my point about pg_regress and why it's the reason
 there's pressure not to have extensive tests. It's useful to have some
 end-to-end black box tests like this but it's self-limiting. You can't
 test many details without tying the tests to specific behaviour.
 
 I have worked with insanely extensive testing suites that didn't have
 this problem. But they were unit tests for code that was structured
 around testability. When the API is designed to be testable and you
 have good infrastructure for mocking up the environment and comparing
 function results in a much narrower way than just looking at text
 output you can test the correctness of each function without tying the
 test to the specific behaviour.
 
 *That* gives a real comfort. When you change a function to behave
 entirely differently and can still run all the tests and see that all
 the code that depends on it still operate correctly then you know you
 haven't broken anything. In that case it actually *speeds up*
 development rather than slowing it down. It lets newbie developers
 hack on code where they don't necessarily know all the downstream
 dependent code and not be paralyzed by fear that they'll break
 something.
 
 As someone who oversaw a database test suite with almost 500 test files
 (each testing multiple cases), I completely agree. Our early framework was
 actually similar to pg_regress and that worked OK up to about 200 test
 files, but it got very unwieldy very quickly. We switched to pgTap and it
 was far easier to work with.

My own position is based on having maintained a pg_regress suite an order of
magnitude larger than that.  I don't know why that outcome was so different.

 I suspect any effort to significantly improve Postgres test coverage is
 doomed until there's an alternative to pg_regress.

There is the src/test/perl/TestLib.pm harness.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-17 Thread Jim Nasby

On 8/15/15 4:45 AM, Petr Jelinek wrote:

We could fix a) by adding ORDER BY to those queries but I don't see how
to fix the rest easily or at all without sacrificing some test coverage.


Hopefully at some point we'll have test frameworks that don't depend on 
capturing raw psql output, which presumably makes dealing with a lot of 
this easier.


Maybe some of this could be handled by factoring BLKSZ into the queries...

A really crude solution would be to have expected output be BLKSZ 
dependent where appropriate. No one will remember to test that by hand, 
but if we have CI setup then maybe it's workable...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-17 Thread Jim Nasby

On 8/16/15 8:48 AM, Greg Stark wrote:

On Sun, Aug 16, 2015 at 7:33 AM, Noah Misch n...@leadboat.com wrote:

When I've just spent awhile implementing a behavior change, the test diffs are
a comforting sight.  They confirm that the test suite exercises the topic I
just changed.  Furthermore, most tests today do not qualify under this
stringent metric you suggest.  The nature of pg_regress opposes it.


It's not a comforting sight for me. It means I need to change the test
results and then of course the tests will pass. So they didn't really
test anything and I don't really know whether the changes broke
anything. Any test I had to adjust for my change was effectively
worthless.

This is precisely my point about pg_regress and why it's the reason
there's pressure not to have extensive tests. It's useful to have some
end-to-end black box tests like this but it's self-limiting. You can't
test many details without tying the tests to specific behaviour.

I have worked with insanely extensive testing suites that didn't have
this problem. But they were unit tests for code that was structured
around testability. When the API is designed to be testable and you
have good infrastructure for mocking up the environment and comparing
function results in a much narrower way than just looking at text
output you can test the correctness of each function without tying the
test to the specific behaviour.

*That* gives a real comfort. When you change a function to behave
entirely differently and can still run all the tests and see that all
the code that depends on it still operate correctly then you know you
haven't broken anything. In that case it actually *speeds up*
development rather than slowing it down. It lets newbie developers
hack on code where they don't necessarily know all the downstream
dependent code and not be paralyzed by fear that they'll break
something.


As someone who oversaw a database test suite with almost 500 test files 
(each testing multiple cases), I completely agree. Our early framework 
was actually similar to pg_regress and that worked OK up to about 200 
test files, but it got very unwieldy very quickly. We switched to pgTap 
and it was far easier to work with.


I suspect any effort to significantly improve Postgres test coverage is 
doomed until there's an alternative to pg_regress.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-16 Thread Noah Misch
On Wed, Aug 12, 2015 at 06:46:19PM +0100, Greg Stark wrote:
 On Wed, Aug 12, 2015 at 3:10 AM, Noah Misch n...@leadboat.com wrote:
  Committers press authors to delete tests more often than we press them to
  resubmit with more tests.  No wonder so many patches have insufficient 
  tests;
  we treat those patches more favorably, on average.  I have no objective
  principles for determining whether a test is pointlessly redundant, but I
  think the principles should become roughly 10x more permissive than the
  (unspecified) ones we've been using.
 
 I would suggest the metric should be if this test fails is it more
 likely to be noise due to an intentional change in behaviour or more
 likely to be a bug?

When I've just spent awhile implementing a behavior change, the test diffs are
a comforting sight.  They confirm that the test suite exercises the topic I
just changed.  Furthermore, most tests today do not qualify under this
stringent metric you suggest.  The nature of pg_regress opposes it.

I sometimes hear a myth that tests catch the bugs their authors anticipated.
We have tests like that (opr_sanity comes to mind), but much test-induced bug
discovery is serendipitous.  To give a recent example, Peter Eisentraut didn't
write src/bin tests to reveal the bug that led to commit d73d14c.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-16 Thread Noah Misch
On Fri, Aug 14, 2015 at 12:47:49PM +0100, Simon Riggs wrote:
 On 13 August 2015 at 00:31, Robert Haas robertmh...@gmail.com wrote:
  On Wed, Aug 12, 2015 at 7:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   We've talked about having some sort of second rank of tests that
   people wouldn't necessarily run before committing, and that would
   be allowed to eat more time than the core regression tests would.
   I think that might be a valuable direction to pursue if people start
   submitting very bulky tests.
 
  Maybe.  Adding a whole new test suite is significantly more
  administratively complex, because the BF client has to get updated to
  run it.  And if expected outputs in that test suite change very often
  at all, then committers will have to run it before committing anyway.
 
  The value of a core regression suite that takes less time to run has
  to be weighed against the possibility that a better core regression
  suite might cause us to find more bugs before committing.  That could
  easily be worth the price in runtime.
 
 Seems like a simple fix. We maintain all regression tests in full, but keep
 slow tests in separate files accessed only by a different schedule.
 
 make check == fast-parallel_schedule
 make check-full == parallel_schedule

+1 for a split, though I would do make quickcheck and make check.  Using
fewer tests should be a conscious decision, and check is the widely-known
Makefile target.  In particular, non-hackers building production binaries need
the thorough test battery.  (As a bonus, the buildfarm wouldn't miss a beat.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-16 Thread Greg Stark
On Sun, Aug 16, 2015 at 7:33 AM, Noah Misch n...@leadboat.com wrote:
 When I've just spent awhile implementing a behavior change, the test diffs are
 a comforting sight.  They confirm that the test suite exercises the topic I
 just changed.  Furthermore, most tests today do not qualify under this
 stringent metric you suggest.  The nature of pg_regress opposes it.

It's not a comforting sight for me. It means I need to change the test
results and then of course the tests will pass. So they didn't really
test anything and I don't really know whether the changes broke
anything. Any test I had to adjust for my change was effectively
worthless.

This is precisely my point about pg_regress and why it's the reason
there's pressure not to have extensive tests. It's useful to have some
end-to-end black box tests like this but it's self-limiting. You can't
test many details without tying the tests to specific behaviour.

I have worked with insanely extensive testing suites that didn't have
this problem. But they were unit tests for code that was structured
around testability. When the API is designed to be testable and you
have good infrastructure for mocking up the environment and comparing
function results in a much narrower way than just looking at text
output you can test the correctness of each function without tying the
test to the specific behaviour.

*That* gives a real comfort. When you change a function to behave
entirely differently and can still run all the tests and see that all
the code that depends on it still operate correctly then you know you
haven't broken anything. In that case it actually *speeds up*
development rather than slowing it down. It lets newbie developers
hack on code where they don't necessarily know all the downstream
dependent code and not be paralyzed by fear that they'll break
something.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-15 Thread Petr Jelinek

On 2015-08-15 03:35, Jim Nasby wrote:


I setup a simple example of this with 64 variations of TAP tests, BLKSZ
and WAL blocksize. Unfortunately to make this work you have to commit a
.travis.yml file to your fork.

build: https://travis-ci.org/decibel/postgres/builds/75692344
.travis.yml: https://github.com/decibel/postgres/blob/master/.travis.yml

Looks like we might have some problems with BLKSZ != 8...


I went through several of those and ISTM that most test failures are to 
be expected:

a) order of resulting rows is different for some of the joins
b) planner behaves differently because of different number of pages
c) tablesample returns different results because it uses ctids as random 
source
d) toaster behaves differently because of different threshold which is 
calculated from BLKSZ
e) max size of btree value is exceeded with BLKSZ = 4 in the test that 
tries to create deep enough btree


We could fix a) by adding ORDER BY to those queries but I don't see how 
to fix the rest easily or at all without sacrificing some test coverage.


However I see one real error in the BLKSZ = 4 tests - two of the merge 
join queries in equivclass die with ERROR:  could not find commutator 
for operator ( example build output 
https://travis-ci.org/decibel/postgres/jobs/75692377 ). I didn't yet 
investigate what's causing this.



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-14 Thread Simon Riggs
On 13 August 2015 at 00:31, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Aug 12, 2015 at 7:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  FWIW, I've objected in the past to tests that would significantly
  increase the runtime of make check, unless I thought they were
  especially valuable (which enumerating every minor behavior of a
  feature patch generally isn't IMO).  I still think that that's an
  important consideration: every second you add to make check is
  multiplied many times over when you consider how many developers
  run that how many times a day.
 
  We've talked about having some sort of second rank of tests that
  people wouldn't necessarily run before committing, and that would
  be allowed to eat more time than the core regression tests would.
  I think that might be a valuable direction to pursue if people start
  submitting very bulky tests.

 Maybe.  Adding a whole new test suite is significantly more
 administratively complex, because the BF client has to get updated to
 run it.  And if expected outputs in that test suite change very often
 at all, then committers will have to run it before committing anyway.

 The value of a core regression suite that takes less time to run has
 to be weighed against the possibility that a better core regression
 suite might cause us to find more bugs before committing.  That could
 easily be worth the price in runtime.


Seems like a simple fix. We maintain all regression tests in full, but keep
slow tests in separate files accessed only by a different schedule.

make check == fast-parallel_schedule

make check-full == parallel_schedule

Probably easier to make one schedule call the other, so we don't duplicate
anything.

Tom gets his fast schedule, others get their full schedule.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Test code is worth the space

2015-08-14 Thread Jim Nasby

On 8/14/15 12:11 AM, Jim Nasby wrote:


I favor splitting the regression tests to add all the time and
before commit targets as you describe. I think that once the
facility is there, we can determine over time how expansive that
second category gets to be.


I don't know how many folks work in a github fork of Postgres, but
anyone that does could run slow checks on every single push via Travis-CI.


I setup a simple example of this with 64 variations of TAP tests, BLKSZ 
and WAL blocksize. Unfortunately to make this work you have to commit a 
.travis.yml file to your fork.


build: https://travis-ci.org/decibel/postgres/builds/75692344
.travis.yml: https://github.com/decibel/postgres/blob/master/.travis.yml

Looks like we might have some problems with BLKSZ != 8...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Magnus Hagander
On Thu, Aug 13, 2015 at 1:31 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Aug 12, 2015 at 7:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  FWIW, I've objected in the past to tests that would significantly
  increase the runtime of make check, unless I thought they were
  especially valuable (which enumerating every minor behavior of a
  feature patch generally isn't IMO).  I still think that that's an
  important consideration: every second you add to make check is
  multiplied many times over when you consider how many developers
  run that how many times a day.
 
  We've talked about having some sort of second rank of tests that
  people wouldn't necessarily run before committing, and that would
  be allowed to eat more time than the core regression tests would.
  I think that might be a valuable direction to pursue if people start
  submitting very bulky tests.

 Maybe.  Adding a whole new test suite is significantly more
 administratively complex, because the BF client has to get updated to
 run it.  And if expected outputs in that test suite change very often
 at all, then committers will have to run it before committing anyway.



We could always do that the other way - meaning put everything in make
check, and invent a make quickcheck for developers with the smaller
subset.



 The value of a core regression suite that takes less time to run has
 to be weighed against the possibility that a better core regression
 suite might cause us to find more bugs before committing.  That could
 easily be worth the price in runtime.



Or have a quickcheck you run all the time and then run the bigger one
once before committing perhaps?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Fabien COELHO



FWIW, I've objected in the past to tests that would significantly
increase the runtime of make check, unless I thought they were
especially valuable (which enumerating every minor behavior of a
feature patch generally isn't IMO).  I still think that that's an
important consideration: every second you add to make check is
multiplied many times over when you consider how many developers
run that how many times a day.


Sure. These are developer's tests run 50 times per day just to check that 
nothing was just completly broken. It's just a kind of test.


I agree that having ISN conversions tested everytime does not make much 
sense, especially as the source file is touched every two years. On the 
other hand, when I tried to fix ISN bugs and figure out that there was no 
single tests for the module, then it is hard to spot regressions, so the 
tests should be there even if they are not run often: Testing the obvious 
is useful when you start meddling in the code.



We've talked about having some sort of second rank of tests that
people wouldn't necessarily run before committing, and that would
be allowed to eat more time than the core regression tests would.
I think that might be a valuable direction to pursue if people start
submitting very bulky tests.


Yep.

For regression tests, the list of tests run is maintained in the 
*_schedule files. There could be a large_parallel_schedule which 
included more tests. This is already more or less the case, as there are 
big* targets which run numeric_big in addition to the others.

This could be expanded and taken into account by the build farm.

I recall test submissions which were rejected on the ground of 'it takes 1 
second of my time every day' and is 'not very useful as the feature is 
known to work'.


I think that a key change needed is that committers are more open to such 
additions and the discussion rather focus on (1) are the tests portable 
(2) do the test cover features not already tested (3) should they be in 
the default list of tests run under make check. But some should be 
accepted in *core* nevertheless, and not push out in the bin.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Peter Geoghegan
On Wed, Aug 12, 2015 at 11:23 PM, Magnus Hagander mag...@hagander.net wrote:
 The value of a core regression suite that takes less time to run has
 to be weighed against the possibility that a better core regression
 suite might cause us to find more bugs before committing.  That could
 easily be worth the price in runtime.


 Or have a quickcheck you run all the time and then run the bigger one once
 before committing perhaps?

I favor splitting the regression tests to add all the time and
before commit targets as you describe. I think that once the
facility is there, we can determine over time how expansive that
second category gets to be.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/12/15 9:24 PM, Stephen Frost wrote:

* Michael Paquier (michael.paqu...@gmail.com) wrote:

On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:

The regression tests included in pgBackRest (available here:
https://github.com/pgmasters/backrest) go through a number of different
recovery tests.  There's vagrant configs for a few different VMs too
(CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
been testing.
We're working to continue expanding those tests and will also be adding
tests for replication and promotion in the future.  Eventually, we plan
to write a buildfarm module for pgBackRest, to allow it to be run in the
same manner as the regular buildfarm animals with the results posted.


Interesting. Do you mind if I pick up from it some ideas for the
in-core replication test suite based on TAP stuff? That's still in the
works for the next CF.


Certainly don't mind at all, entirely open source under the MIT
license.


Absolutely, and I'm always look to add new tests so some 
cross-pollination may be beneficial as well.


Currently the regression tests work with 9.0 - 9.5alpha1 (and back to 
8.3, actually).  The pause_on_recovery_target test is currently disabled 
for 9.5 and I have not implemented recovery_target_action or 
recovery_target = immediate yet.  I haven't tested alpha2 yet but it 
should work unless catalog or other IDs have changed.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Andres Freund
On 2015-08-13 09:32:02 -0400, David Steele wrote:
 On 8/12/15 9:32 PM, Robert Haas wrote:
 On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:
 Certainly don't mind at all, entirely open source under the MIT
 license.
 
 Why not the PG license?  It would be nicer if we didn't have to worry
 about license contamination here.

I don't think MIT is particularly problematic, it's rather similar to a
3 clause BSD and both are pretty similar to PG's license.

 There are actually a few reasons I chose the MIT license:
 
 1) It's one of the most permissive licenses around.

 2) I originally had plans to extend backrest to other database systems.
 Nearly two years into development I don't think that sounds like a great
 idea anymore but it was the original plan.

I don't see the difference to/with postgres' license there.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/12/15 9:32 PM, Robert Haas wrote:

On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:

* Michael Paquier (michael.paqu...@gmail.com) wrote:

Interesting. Do you mind if I pick up from it some ideas for the
in-core replication test suite based on TAP stuff? That's still in the
works for the next CF.


Certainly don't mind at all, entirely open source under the MIT
license.


Why not the PG license?  It would be nicer if we didn't have to worry
about license contamination here.


There are actually a few reasons I chose the MIT license:

1) It's one of the most permissive licenses around.

2) I originally had plans to extend backrest to other database systems. 
 Nearly two years into development I don't think that sounds like a 
great idea anymore but it was the original plan.


3) It's common for GitHub projects and backrest has lived there its 
entire life.


I'm not against a license change in theory though I can't see why it 
matters very much.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/13/15 9:55 AM, Andres Freund wrote:

On 2015-08-13 09:32:02 -0400, David Steele wrote:

On 8/12/15 9:32 PM, Robert Haas wrote:

On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:

Certainly don't mind at all, entirely open source under the MIT
license.


Why not the PG license?  It would be nicer if we didn't have to worry
about license contamination here.


I don't think MIT is particularly problematic, it's rather similar to a
3 clause BSD and both are pretty similar to PG's license.


There are actually a few reasons I chose the MIT license:

1) It's one of the most permissive licenses around.



2) I originally had plans to extend backrest to other database systems.
Nearly two years into development I don't think that sounds like a great
idea anymore but it was the original plan.


I don't see the difference to/with postgres' license there.


Perhaps just me but it was really about perception.  If I extended 
BackRest to work with MySQL (shudders) then I thought it would be weird 
if it used the PostgreSQL license.


Anyway, now BackRest is now pretty solidly pgBackRest and I don't have 
any immediate plans to port to other database systems so the point is 
probably moot.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Jim Nasby

On 8/13/15 1:31 AM, Peter Geoghegan wrote:

On Wed, Aug 12, 2015 at 11:23 PM, Magnus Hagander mag...@hagander.net wrote:

The value of a core regression suite that takes less time to run has
to be weighed against the possibility that a better core regression
suite might cause us to find more bugs before committing.  That could
easily be worth the price in runtime.



Or have a quickcheck you run all the time and then run the bigger one once
before committing perhaps?


I favor splitting the regression tests to add all the time and
before commit targets as you describe. I think that once the
facility is there, we can determine over time how expansive that
second category gets to be.


I don't know how many folks work in a github fork of Postgres, but 
anyone that does could run slow checks on every single push via 
Travis-CI. [1] is an example of that. That wouldn't work directly since 
it depends on Peter Eisentraut's scripts [2] that pull from 
apt.postgresql.org, but presumably it wouldn't be too hard to get tests 
running directly out of a Postgres repo.


[1] https://travis-ci.org/decibel/variant
[2] See wget URLs in 
https://github.com/decibel/variant/blob/master/.travis.yml

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Aug 12, 2015 at 2:04 PM, Peter Geoghegan p...@heroku.com wrote:
  This resistance to adding tests seems quite short sighted to me,
  especially when the concern is about queries that will each typically
  take less than 1ms to execute. Like Noah, I think that it would be
  very helpful to simply be more inclusive of additional tests that
  don't increase test coverage by as much as each query in a minimal
  subset. I am not at all convinced by arguments about the cost of
  maintaining tests when a simple behavioral change occurs.
 
 I've removed tests from patches that in my opinion were unlikely to
 fail either (a) for any reason or (b) for any reason other than an
 intentional change, and I think that's a reasonable thing to do.
 However, I still think it's a good idea, and useful, to try to expand
 the code coverage we get from 'make check'.  However, the bigger
 issue, IMHO, is the stuff that can't be tested via pg_regress, e.g.
 because it needs hooks, like what Alvaro is talking about, or because
 it needs a custom testing framework.  Recovery, for example, really
 needs a lot more testing, as we talked about at PGCon.  If we just
 expand what 'make check' covers and don't deal with those kinds of
 things, we will be improving our test coverage but maybe not all that
 much.

Agreed, and this is something which I said we'd work to help with and
which we have some things to show now, for those interested.

The regression tests included in pgBackRest (available here: 
https://github.com/pgmasters/backrest) go through a number of different
recovery tests.  There's vagrant configs for a few different VMs too
(CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
been testing.

We're working to continue expanding those tests and will also be adding
tests for replication and promotion in the future.  Eventually, we plan
to write a buildfarm module for pgBackRest, to allow it to be run in the
same manner as the regular buildfarm animals with the results posted.

David, feel free to correct me if I'm misconstrued anything above.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Robert Haas
On Wed, Aug 12, 2015 at 7:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 FWIW, I've objected in the past to tests that would significantly
 increase the runtime of make check, unless I thought they were
 especially valuable (which enumerating every minor behavior of a
 feature patch generally isn't IMO).  I still think that that's an
 important consideration: every second you add to make check is
 multiplied many times over when you consider how many developers
 run that how many times a day.

 We've talked about having some sort of second rank of tests that
 people wouldn't necessarily run before committing, and that would
 be allowed to eat more time than the core regression tests would.
 I think that might be a valuable direction to pursue if people start
 submitting very bulky tests.

Maybe.  Adding a whole new test suite is significantly more
administratively complex, because the BF client has to get updated to
run it.  And if expected outputs in that test suite change very often
at all, then committers will have to run it before committing anyway.

The value of a core regression suite that takes less time to run has
to be weighed against the possibility that a better core regression
suite might cause us to find more bugs before committing.  That could
easily be worth the price in runtime.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Aug 12, 2015 at 2:04 PM, Peter Geoghegan p...@heroku.com wrote:
 This resistance to adding tests seems quite short sighted to me,
 especially when the concern is about queries that will each typically
 take less than 1ms to execute. Like Noah, I think that it would be
 very helpful to simply be more inclusive of additional tests that
 don't increase test coverage by as much as each query in a minimal
 subset. I am not at all convinced by arguments about the cost of
 maintaining tests when a simple behavioral change occurs.

 I've removed tests from patches that in my opinion were unlikely to
 fail either (a) for any reason or (b) for any reason other than an
 intentional change, and I think that's a reasonable thing to do.

FWIW, I've objected in the past to tests that would significantly
increase the runtime of make check, unless I thought they were
especially valuable (which enumerating every minor behavior of a
feature patch generally isn't IMO).  I still think that that's an
important consideration: every second you add to make check is
multiplied many times over when you consider how many developers
run that how many times a day.

We've talked about having some sort of second rank of tests that
people wouldn't necessarily run before committing, and that would
be allowed to eat more time than the core regression tests would.
I think that might be a valuable direction to pursue if people start
submitting very bulky tests.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Robert Haas
On Wed, Aug 12, 2015 at 2:04 PM, Peter Geoghegan p...@heroku.com wrote:
 This resistance to adding tests seems quite short sighted to me,
 especially when the concern is about queries that will each typically
 take less than 1ms to execute. Like Noah, I think that it would be
 very helpful to simply be more inclusive of additional tests that
 don't increase test coverage by as much as each query in a minimal
 subset. I am not at all convinced by arguments about the cost of
 maintaining tests when a simple behavioral change occurs.

I've removed tests from patches that in my opinion were unlikely to
fail either (a) for any reason or (b) for any reason other than an
intentional change, and I think that's a reasonable thing to do.
However, I still think it's a good idea, and useful, to try to expand
the code coverage we get from 'make check'.  However, the bigger
issue, IMHO, is the stuff that can't be tested via pg_regress, e.g.
because it needs hooks, like what Alvaro is talking about, or because
it needs a custom testing framework.  Recovery, for example, really
needs a lot more testing, as we talked about at PGCon.  If we just
expand what 'make check' covers and don't deal with those kinds of
things, we will be improving our test coverage but maybe not all that
much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Michael Paquier
On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:
 The regression tests included in pgBackRest (available here:
 https://github.com/pgmasters/backrest) go through a number of different
 recovery tests.  There's vagrant configs for a few different VMs too
 (CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
 been testing.
 We're working to continue expanding those tests and will also be adding
 tests for replication and promotion in the future.  Eventually, we plan
 to write a buildfarm module for pgBackRest, to allow it to be run in the
 same manner as the regular buildfarm animals with the results posted.

Interesting. Do you mind if I pick up from it some ideas for the
in-core replication test suite based on TAP stuff? That's still in the
works for the next CF.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Robert Haas
On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:
 * Michael Paquier (michael.paqu...@gmail.com) wrote:
 On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:
  The regression tests included in pgBackRest (available here:
  https://github.com/pgmasters/backrest) go through a number of different
  recovery tests.  There's vagrant configs for a few different VMs too
  (CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
  been testing.
  We're working to continue expanding those tests and will also be adding
  tests for replication and promotion in the future.  Eventually, we plan
  to write a buildfarm module for pgBackRest, to allow it to be run in the
  same manner as the regular buildfarm animals with the results posted.

 Interesting. Do you mind if I pick up from it some ideas for the
 in-core replication test suite based on TAP stuff? That's still in the
 works for the next CF.

 Certainly don't mind at all, entirely open source under the MIT
 license.

Why not the PG license?  It would be nicer if we didn't have to worry
about license contamination here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
 On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:
  The regression tests included in pgBackRest (available here:
  https://github.com/pgmasters/backrest) go through a number of different
  recovery tests.  There's vagrant configs for a few different VMs too
  (CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
  been testing.
  We're working to continue expanding those tests and will also be adding
  tests for replication and promotion in the future.  Eventually, we plan
  to write a buildfarm module for pgBackRest, to allow it to be run in the
  same manner as the regular buildfarm animals with the results posted.
 
 Interesting. Do you mind if I pick up from it some ideas for the
 in-core replication test suite based on TAP stuff? That's still in the
 works for the next CF.

Certainly don't mind at all, entirely open source under the MIT
license.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:
  * Michael Paquier (michael.paqu...@gmail.com) wrote:
  On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:
   The regression tests included in pgBackRest (available here:
   https://github.com/pgmasters/backrest) go through a number of different
   recovery tests.  There's vagrant configs for a few different VMs too
   (CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
   been testing.
   We're working to continue expanding those tests and will also be adding
   tests for replication and promotion in the future.  Eventually, we plan
   to write a buildfarm module for pgBackRest, to allow it to be run in the
   same manner as the regular buildfarm animals with the results posted.
 
  Interesting. Do you mind if I pick up from it some ideas for the
  in-core replication test suite based on TAP stuff? That's still in the
  works for the next CF.
 
  Certainly don't mind at all, entirely open source under the MIT
  license.
 
 Why not the PG license?  It would be nicer if we didn't have to worry
 about license contamination here.

There is no conflict between the licenses and pgBackRest is not part of
nor maintained as part of the PostgreSQL project.

Not that I'm against that changing, certainly, but it was independently
developed and I wouldn't ask the community to take on maintaining
something which wasn't developed following the community process, not to
mention that I imagine some would probably want it rewritten in C.

We clearly have dependencies, to the extent that there's concern about
licenses, on much more restrictive and interesting licenses, and there
currently isn't even any real dependency here and would only be if the
PGDG decided to fork pgBackRest and start maintaining it independently.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Simon Riggs
On 12 August 2015 at 03:10, Noah Misch n...@leadboat.com wrote:


  On another review I suggested we add a function to core to allow it to be
  used in regression tests. A long debate ensued, deciding that we must be
  consistent and put diagnostic functions in contrib. My understanding is
  that we are not allowed to write regression tests that use contrib
 modules,
  yet the consistent place to put diagnostic functions is contrib -
  therefore, we are never allowed to write tests utilizing diagnostic
  functions. We are allowed to put modules for testing in another
 directory,
  but these are not distributed to users so cannot be used for production
  diagnosis. Systemic fail, advice needed.

 If you want a user-visible function for production diagnosis, set aside the
 fact that you wish to use it in a test, and find the best place for it.
 Then,
 place the tests with the function.  (That is, if the function belongs in
 contrib for other reasons, put tests calling it in the contrib module
 itself.)

 Place non-user-visible test support functions in regress.c, or use one of
 the
 options Robert described.


That helps, thanks.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Peter Geoghegan
On Wed, Aug 12, 2015 at 10:46 AM, Greg Stark st...@mit.edu wrote:
 The only time I've seen pushback against tests is when the test author
 made valiant efforts to test every codepath and the expected output
 embeds the precise behaviour of the current code as correct. Even
 when patches have extensive tests I don't recall seeing much pushback
 (though I've been having trouble keeping up with the list in recent
 months) if the tests are written in a way that they will only fail if
 there's a bug, even if behaviour changes in unrelated ways.

Really? I think Noah's description of how less testing is in effect
incentivized by committers is totally accurate. No patch author is
going to dig their heals in over the objections of a committer when
the complaint is about brevity of tests.

This resistance to adding tests seems quite short sighted to me,
especially when the concern is about queries that will each typically
take less than 1ms to execute. Like Noah, I think that it would be
very helpful to simply be more inclusive of additional tests that
don't increase test coverage by as much as each query in a minimal
subset. I am not at all convinced by arguments about the cost of
maintaining tests when a simple behavioral change occurs.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Greg Stark
On Wed, Aug 12, 2015 at 3:10 AM, Noah Misch n...@leadboat.com wrote:
 Committers press authors to delete tests more often than we press them to
 resubmit with more tests.  No wonder so many patches have insufficient tests;
 we treat those patches more favorably, on average.  I have no objective
 principles for determining whether a test is pointlessly redundant, but I
 think the principles should become roughly 10x more permissive than the
 (unspecified) ones we've been using.


I would suggest the metric should be if this test fails is it more
likely to be noise due to an intentional change in behaviour or more
likely to be a bug?

The only time I've seen pushback against tests is when the test author
made valiant efforts to test every codepath and the expected output
embeds the precise behaviour of the current code as correct. Even
when patches have extensive tests I don't recall seeing much pushback
(though I've been having trouble keeping up with the list in recent
months) if the tests are written in a way that they will only fail if
there's a bug, even if behaviour changes in unrelated ways.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-12 Thread Alvaro Herrera
One trouble I face when adding tests is that sometimes they require
hooks in the code, to test for race conditions.  In BRIN I cannot test
some code paths without resorting to adding breakpoints in GDB, for
instance.  If there's no support for such in the core code, it's
essentially impossible to add meaningful test for certain code paths.  I
toyed with the notion of adding hook points (such that a test module can
put the backend to sleep until the other backend acknowledges hitting
the race condition window); I decided not to because it seems a more
general discussion is necessary first about the desirability of such.

As I recall we needed this in SKIP LOCKED and of course for multixacts
also.

I agree with the general idea that it's worthwhile to add lots more
tests than we currently have, both for the current code and for future
patches.  Surely we don't need to reach the point where every single
nuance of every single user interface is verified to the point that
tests are such a large burden to maintain as is being suggested.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-11 Thread Noah Misch
On Mon, Aug 10, 2015 at 07:02:17AM +0100, Simon Riggs wrote:
 Almost every patch I review has either zero or insufficient tests.
 
 If we care about robustness, then we must discuss tests. Here are my two
 recent experiences:
 
 I agree we could do with x10 as many tests, but that doesn't mean all tests
 make sense, so there needs to be some limiting principles to avoid adding
 pointless test cases. I recently objected to adding a test case to
 Isolation tester for the latest ALTER TABLE patch because in that case
 there is no concurrent behavior to test. There is already a regression test
 that tests lock levels for those statements, so in my view it is more
 sensible to modify the existing test than to add a whole new isolation test
 that does nothing more than demonstrate the lock levels work as described,
 which we already knew.

Committers press authors to delete tests more often than we press them to
resubmit with more tests.  No wonder so many patches have insufficient tests;
we treat those patches more favorably, on average.  I have no objective
principles for determining whether a test is pointlessly redundant, but I
think the principles should become roughly 10x more permissive than the
(unspecified) ones we've been using.

 On another review I suggested we add a function to core to allow it to be
 used in regression tests. A long debate ensued, deciding that we must be
 consistent and put diagnostic functions in contrib. My understanding is
 that we are not allowed to write regression tests that use contrib modules,
 yet the consistent place to put diagnostic functions is contrib -
 therefore, we are never allowed to write tests utilizing diagnostic
 functions. We are allowed to put modules for testing in another directory,
 but these are not distributed to users so cannot be used for production
 diagnosis. Systemic fail, advice needed.

If you want a user-visible function for production diagnosis, set aside the
fact that you wish to use it in a test, and find the best place for it.  Then,
place the tests with the function.  (That is, if the function belongs in
contrib for other reasons, put tests calling it in the contrib module itself.)

Place non-user-visible test support functions in regress.c, or use one of the
options Robert described.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-10 Thread Robert Haas
On Mon, Aug 10, 2015 at 2:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Almost every patch I review has either zero or insufficient tests.

 If we care about robustness, then we must discuss tests. Here are my two
 recent experiences:

 I agree we could do with x10 as many tests, but that doesn't mean all tests
 make sense, so there needs to be some limiting principles to avoid adding
 pointless test cases. I recently objected to adding a test case to Isolation
 tester for the latest ALTER TABLE patch because in that case there is no
 concurrent behavior to test. There is already a regression test that tests
 lock levels for those statements, so in my view it is more sensible to
 modify the existing test than to add a whole new isolation test that does
 nothing more than demonstrate the lock levels work as described, which we
 already knew.

 On another review I suggested we add a function to core to allow it to be
 used in regression tests. A long debate ensued, deciding that we must be
 consistent and put diagnostic functions in contrib. My understanding is that
 we are not allowed to write regression tests that use contrib modules, yet
 the consistent place to put diagnostic functions is contrib - therefore, we
 are never allowed to write tests utilizing diagnostic functions. We are
 allowed to put modules for testing in another directory, but these are not
 distributed to users so cannot be used for production diagnosis. Systemic
 fail, advice needed.

There are actually two ways to do this.

One model is the dummy_seclabel module.  The build system arranges for
that to be available when running the core regression tests, so they
can use it.  And the dummy_seclabel test does.  There are actually a
couple of other loadable modules that are used by the core regression
tests in this kind of way (refint and autoinc, from contrib/spi).

The other model is what I'll call the test_decoding model.  Since the
core code can't be tested without a module, we have a module.  But
then the tests are in the module's directory
(contrib/test_decoding/{sql,expected}) not the core regression tests.

In general, I have a mild preference for the second model.  It seems
more scalable, and keeps the core tests quick to run, which is
appropriate for more obscure tests that are unlikely to break very
often.  But the first model can also be done, as show by the fact that
we have in fact done it several times.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-10 Thread Simon Riggs
On 8 August 2015 at 17:47, Noah Misch n...@leadboat.com wrote:

 We've too often criticized patches for carrying many lines/bytes of test
 case
 additions.  Let's continue to demand debuggable, secure tests that fail
 only
 when something is wrong, but let's stop pushing for test minimalism.  Such
 objections may improve the individual patch, but that doesn't make up for
 the
 chilling effect on test contributions.  I remember clearly the first time I
 submitted thorough test coverage with a feature.  Multiple committers
 attacked
 it in the name of brevity.  PostgreSQL would be better off with 10x its
 current test bulk, even if the average line of test code were considerably
 less important than we expect today.


Almost every patch I review has either zero or insufficient tests.

If we care about robustness, then we must discuss tests. Here are my two
recent experiences:

I agree we could do with x10 as many tests, but that doesn't mean all tests
make sense, so there needs to be some limiting principles to avoid adding
pointless test cases. I recently objected to adding a test case to
Isolation tester for the latest ALTER TABLE patch because in that case
there is no concurrent behavior to test. There is already a regression test
that tests lock levels for those statements, so in my view it is more
sensible to modify the existing test than to add a whole new isolation test
that does nothing more than demonstrate the lock levels work as described,
which we already knew.

On another review I suggested we add a function to core to allow it to be
used in regression tests. A long debate ensued, deciding that we must be
consistent and put diagnostic functions in contrib. My understanding is
that we are not allowed to write regression tests that use contrib modules,
yet the consistent place to put diagnostic functions is contrib -
therefore, we are never allowed to write tests utilizing diagnostic
functions. We are allowed to put modules for testing in another directory,
but these are not distributed to users so cannot be used for production
diagnosis. Systemic fail, advice needed.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Test code is worth the space

2015-08-10 Thread Simon Riggs
On 10 August 2015 at 13:55, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 10, 2015 at 2:02 AM, Simon Riggs si...@2ndquadrant.com
 wrote:
  On another review I suggested we add a function to core to allow it to be
  used in regression tests. A long debate ensued, deciding that we must be
  consistent and put diagnostic functions in contrib. My understanding is
 that
  we are not allowed to write regression tests that use contrib modules,
 yet
  the consistent place to put diagnostic functions is contrib - therefore,
 we
  are never allowed to write tests utilizing diagnostic functions. We are
  allowed to put modules for testing in another directory, but these are
 not
  distributed to users so cannot be used for production diagnosis. Systemic
  fail, advice needed.

 There are actually two ways to do this.

 One model is the dummy_seclabel module.  The build system arranges for
 that to be available when running the core regression tests, so they
 can use it.  And the dummy_seclabel test does.  There are actually a
 couple of other loadable modules that are used by the core regression
 tests in this kind of way (refint and autoinc, from contrib/spi).

 The other model is what I'll call the test_decoding model.  Since the
 core code can't be tested without a module, we have a module.  But
 then the tests are in the module's directory
 (contrib/test_decoding/{sql,expected}) not the core regression tests.

 In general, I have a mild preference for the second model.  It seems
 more scalable, and keeps the core tests quick to run, which is
 appropriate for more obscure tests that are unlikely to break very
 often.  But the first model can also be done, as show by the fact that
 we have in fact done it several times.


Neither of those uses a diagnostic function that also has value in
production environments - for logical decoding we added something to core
specifically to allow this pg_recvlogical.

Anyway, resolving this isn't important anymore because I wish to pursue a
different mechanism for freezing, but its possible I hit the same issue.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


[HACKERS] Test code is worth the space

2015-08-08 Thread Noah Misch
We've too often criticized patches for carrying many lines/bytes of test case
additions.  Let's continue to demand debuggable, secure tests that fail only
when something is wrong, but let's stop pushing for test minimalism.  Such
objections may improve the individual patch, but that doesn't make up for the
chilling effect on test contributions.  I remember clearly the first time I
submitted thorough test coverage with a feature.  Multiple committers attacked
it in the name of brevity.  PostgreSQL would be better off with 10x its
current test bulk, even if the average line of test code were considerably
less important than we expect today.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Peter Geoghegan
On Sat, Aug 8, 2015 at 9:47 AM, Noah Misch n...@leadboat.com wrote:
 We've too often criticized patches for carrying many lines/bytes of test case
 additions.  Let's continue to demand debuggable, secure tests that fail only
 when something is wrong, but let's stop pushing for test minimalism.  Such
 objections may improve the individual patch, but that doesn't make up for the
 chilling effect on test contributions.  I remember clearly the first time I
 submitted thorough test coverage with a feature.  Multiple committers attacked
 it in the name of brevity.  PostgreSQL would be better off with 10x its
 current test bulk, even if the average line of test code were considerably
 less important than we expect today.

I strongly agree. I am fond of the example of external sorting, which
at present has exactly zero test coverage, even though in production
those code paths are exercised all the time.

I think that there needs to be a way of running an extended set of
regression tests. I could definitely respect the desire for minimalism
when it comes to adding tests to the regression tests proper if there
was an extended set of tests that could be run during development less
frequently. I thought about doing the extended set as a satellite
project, but that may not be workable.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Josh Berkus
On 08/08/2015 12:24 PM, Peter Geoghegan wrote:
 I think that there needs to be a way of running an extended set of
 regression tests. I could definitely respect the desire for minimalism
 when it comes to adding tests to the regression tests proper if there
 was an extended set of tests that could be run during development less
 frequently. I thought about doing the extended set as a satellite
 project, but that may not be workable.

There already is, isn't there?  All of those named sets of regression
tests which aren't run by default.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Andrew Dunstan


On 08/08/2015 05:09 PM, Josh Berkus wrote:

On 08/08/2015 12:24 PM, Peter Geoghegan wrote:

I think that there needs to be a way of running an extended set of
regression tests. I could definitely respect the desire for minimalism
when it comes to adding tests to the regression tests proper if there
was an extended set of tests that could be run during development less
frequently. I thought about doing the extended set as a satellite
project, but that may not be workable.

There already is, isn't there?  All of those named sets of regression
tests which aren't run by default.



Exactly. And there is nothing to stop us expanding those. For example, 
it might be useful to have a suite that provably touched every code 
path, or something close to it.


Incidentally, making the buildfarm client run extra sets of tests in the 
main tree is almost completely trivial. Making it run tests from 
somewhere else, such as a different git repo, is only slightly harder.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Greg Stark
On Sat, Aug 8, 2015 at 8:24 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that there needs to be a way of running an extended set of
 regression tests. I could definitely respect the desire for minimalism

The larger expense in having extensive test suites is the cost to
maintain them. With our current test framework tests have to be fairly
carefully written to produce output that isn't very fragile and
sensitive to minor changes in the code. In practice this is what
really drives the push towards minimal tests. If we tried testing
every code path right now -- which I tried to do once for the TOAST
code -- what you'll find is that what you're really testing is not
that the code is correct but that it does exactly what it does today.
At that point the test failures become entirely noise meaning
something changed rather than signal meaning something's broken.

A better test framework can go a long way towards fixing this. It
would be much less of a problem if we had a unit test framework rather
than only black box integration tests. That would allow us to test
that every function in tuplestore.c meets its contract, not just that
some SQL query produces output that's correct and we think happened to
go through some code path we were interested in. There are many code
paths that will be hard to arrange to reach from SQL and impossible to
have any confidence will continue to be reached in the future when the
behaviour is intentionally changed.

That said, I don't think anyone would object to adding some regression
tests that at least test basic correctness of the sorting code. That's
a pretty embarrassing gap and iirc the only reason for it is that it
would make the regression tests sensitive to collation locale
settings.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers