Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-10 Thread Tom Lane
Noah Misch  writes:
> On Sat, Apr 09, 2016 at 10:08:01PM -0400, Tom Lane wrote:
>> Still, we've reached the most coverage this test can give us at 1000
>> rows, which still means it's wasting the last 99% of its runtime.

> If dropping the row count to 1000 shaves >500ms on your primary machine, +1
> for committing such a row count change.  This is exactly what I meant by
> "someone identifies a way to realize similar coverage with lower duration."
> Thanks for contributing this study.

Further testing with gcov showed that the max coverage point was reached
somewhere between 500 and 750 rows (I didn't bother to pin it down
exactly).  So I thought setting the table size to 1000 rows was probably
shaving things a bit too close; I made it 2000 rows instead.

I also added a few more test cases to improve the coverage beyond what
it was to start with, using a white-box approach of modifying the test
script until gcov said that it was hitting the areas it wasn't before.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-10 Thread Alexander Korotkov
On Sun, Apr 10, 2016 at 9:01 AM, Noah Misch  wrote:

> On Sat, Apr 09, 2016 at 10:08:01PM -0400, Tom Lane wrote:
> > I wrote:
> > > I was depressed, though not entirely surprised, to find that you get
> > > exactly that same line-count coverage if the table size is cut back
> > > to ONE row.
> >
> > Oh, I found the flaw in my testing: there are two INSERTs in the test
> > script and I was changing only one of them.  After correcting that,
> > the results behave a little more sanely:
> >
> >Line Coverage   Functions
> > 1 row: 70.4 % 349 / 496   93.1 %  27 / 29
> > 10 row:73.6 % 365 / 496   93.1 %  27 / 29
> > 100 rows:  73.6 % 365 / 496   93.1 %  27 / 29
> > 1000 rows: 75.4 % 374 / 496   93.1 %  27 / 29
> >
> > Still, we've reached the most coverage this test can give us at 1000
> > rows, which still means it's wasting the last 99% of its runtime.
>
> If dropping the row count to 1000 shaves >500ms on your primary machine, +1
> for committing such a row count change.  This is exactly what I meant by
> "someone identifies a way to realize similar coverage with lower duration."
> Thanks for contributing this study.


+1, row count reduction is a good to reduce regression test time.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-10 Thread Noah Misch
On Sat, Apr 09, 2016 at 10:08:01PM -0400, Tom Lane wrote:
> I wrote:
> > I was depressed, though not entirely surprised, to find that you get
> > exactly that same line-count coverage if the table size is cut back
> > to ONE row.
> 
> Oh, I found the flaw in my testing: there are two INSERTs in the test
> script and I was changing only one of them.  After correcting that,
> the results behave a little more sanely:
> 
>Line Coverage   Functions
> 1 row: 70.4 % 349 / 496   93.1 %  27 / 29
> 10 row:73.6 % 365 / 496   93.1 %  27 / 29
> 100 rows:  73.6 % 365 / 496   93.1 %  27 / 29
> 1000 rows: 75.4 % 374 / 496   93.1 %  27 / 29
> 
> Still, we've reached the most coverage this test can give us at 1000
> rows, which still means it's wasting the last 99% of its runtime.

If dropping the row count to 1000 shaves >500ms on your primary machine, +1
for committing such a row count change.  This is exactly what I meant by
"someone identifies a way to realize similar coverage with lower duration."
Thanks for contributing this study.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Tom Lane
I wrote:
> I was depressed, though not entirely surprised, to find that you get
> exactly that same line-count coverage if the table size is cut back
> to ONE row.

Oh, I found the flaw in my testing: there are two INSERTs in the test
script and I was changing only one of them.  After correcting that,
the results behave a little more sanely:

   Line Coverage   Functions
1 row: 70.4 %   349 / 496   93.1 %  27 / 29
10 row:73.6 %   365 / 496   93.1 %  27 / 29
100 rows:  73.6 %   365 / 496   93.1 %  27 / 29
1000 rows: 75.4 %   374 / 496   93.1 %  27 / 29

Still, we've reached the most coverage this test can give us at 1000
rows, which still means it's wasting the last 99% of its runtime.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Tom Lane
Noah Misch  writes:
> On Sat, Apr 09, 2016 at 11:50:08AM -0400, Tom Lane wrote:
>> Would it be possible to dial down the amount of runtime consumed by
>> the regression tests for this module?

> I find this added test duration reasonable.  If someone identifies a way to
> realize similar coverage with lower duration, I'd value that contribution.  -1
> for meeting some runtime target at the expense of coverage.  Older modules
> have rather little test coverage, so they're poor as benchmarks.

That argument is fine in principle, but the thing about applications such
as SQL databases is that shoving more rows through them doesn't in itself
increase test coverage; it may just iterate the loops more times.

The contrib/bloom regression test currently creates a 10-row table.
I wondered how many rows were really required to achieve the same level
of code coverage.  I experimented with gcov, and what I find is that
as of HEAD that test provides these levels of coverage:

 Line Coverage   Functions

contrib/bloom/:  75.4 % 374 / 496   93.1 %  27 / 29
generic_xlog.c:  68.5 % 98 / 14377.8 %  7 / 9

(I looked specifically at generic_xlog.c because the main point of this
contrib module, so far as the core code is concerned, is to exercise
that file.)

I was depressed, though not entirely surprised, to find that you get
exactly that same line-count coverage if the table size is cut back
to ONE row.  Only when you remove the INSERT command entirely is there
any change in these gcov statistics.  Therefore, the last 9 rows
are wasting my time, and yours, and that of every other developer who
will ever run this test suite in future.  I do not like having the
community's time wasted.

I'm all for improving our test coverage, but just shoving lots of rows
through a not-very-well-designed-in-the-first-place regression test
isn't a reliable way to do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Peter Geoghegan
On Sat, Apr 9, 2016 at 4:43 PM, Noah Misch  wrote:
> I find this added test duration reasonable.  If someone identifies a way to
> realize similar coverage with lower duration, I'd value that contribution.  -1
> for meeting some runtime target at the expense of coverage.  Older modules
> have rather little test coverage, so they're poor as benchmarks.

+1.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Noah Misch
On Sat, Apr 09, 2016 at 11:50:08AM -0400, Tom Lane wrote:
> Teodor Sigaev  writes:
> > Bloom index contrib module
> 
> Would it be possible to dial down the amount of runtime consumed by
> the regression tests for this module?
> 
> On my primary dev machine, "make installcheck" in contrib/bloom takes
> 4.5 seconds, which seems a bit excessive when make installcheck across
> all the rest of contrib takes 22 seconds.
> 
> On prairiedog, which admittedly is one of the slower buildfarm animals
> (though far from the slowest), the contrib-install-check-C step went
> from 2:54 immediately before this patch went in to 4:28 immediately
> after.
> 
> That seems out of line for a single contrib module, especially one of
> unproven usefulness.

I find this added test duration reasonable.  If someone identifies a way to
realize similar coverage with lower duration, I'd value that contribution.  -1
for meeting some runtime target at the expense of coverage.  Older modules
have rather little test coverage, so they're poor as benchmarks.

A feature's lack of usefulness can be a good reason to exclude it from the
tree entirely, but it's a bad reason to slash the feature's test cases.  (I
have not evaluated the usefulness of this access method.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Tom Lane
I wrote:
> Would it be possible to dial down the amount of runtime consumed by
> the regression tests for this module?

Hmm ... "perf" says that a full 50% of the runtime of contrib/bloom's
regression test is consumed by GenericXLogFinish, and of that, the vast
majority is consumed by the extremely inefficient one-byte-at-a-time
matching in writeDelta() --- even dwarfing the "exchange two page images"
memcpy's.  So there seems to be good reason to expend some
micro-optimization effort in that area.  But aside from the writeDelta
problems, what in the world is the reason for the page exchange logic?
I could see copying a working image into place in the shared buffer
arena, but not saving the previous contents.

Even if these costs went to zero, though, the bloom regression tests
would still be consuming a disproportionate amount of time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Bloom index contrib module

2016-04-09 Thread Tom Lane
Teodor Sigaev  writes:
> Bloom index contrib module

Would it be possible to dial down the amount of runtime consumed by
the regression tests for this module?

On my primary dev machine, "make installcheck" in contrib/bloom takes
4.5 seconds, which seems a bit excessive when make installcheck across
all the rest of contrib takes 22 seconds.

On prairiedog, which admittedly is one of the slower buildfarm animals
(though far from the slowest), the contrib-install-check-C step went
from 2:54 immediately before this patch went in to 4:28 immediately
after.

That seems out of line for a single contrib module, especially one of
unproven usefulness.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers