Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-19 Thread Andres Freund
On 2015-07-19 12:35:46 -0400, Peter Eisentraut wrote:
 I would like to do that, but the current host has no more capacity and
 the hoster is complaining, so I need to look for a new hosting solution
 before I can expand what is being built.

Perhaps this could be moved into a virtual machine hosted by the
postgres project infrastructure?

What kind of resources do you currently have and what would you need?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-19 Thread Peter Eisentraut
On 7/6/15 3:52 PM, Alvaro Herrera wrote:
 Kevin Grittner wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Peter Geoghegan wrote:

 It would be nice to always have a html report from gcov always
 available on the internet. That would be something useful to
 automate, IMV.

 http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/

 What would it take to get something like that which uses the
 check-world target instead of just the check target?
 
 I suggest CC'ing Peter as a first measure.  I already suggested this (or
 something similar) to him months ago.

I would like to do that, but the current host has no more capacity and
the hoster is complaining, so I need to look for a new hosting solution
before I can expand what is being built.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-14 Thread Peter Geoghegan
On Mon, Jul 6, 2015 at 12:52 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I suggest CC'ing Peter as a first measure.  I already suggested this (or
 something similar) to him months ago.

This would be a worthwhile effort. tuplesort.c only has 46% coverage.
There is no coverage for functions that I know are used all the time
in production, like dumptuples(), or ExecHashIncreaseNumBatches().

We should make increasing test coverage an explicit goal. Ideally,
there'd be a lower quality set of tests that fill in certain gaps in
code coverage, but are used less frequently, and may take much longer
to run. Simply having the code executed will allow tools like Valgrind
to catch bugs.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-06 Thread Alvaro Herrera
Kevin Grittner wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
  Peter Geoghegan wrote:
 
  It would be nice to always have a html report from gcov always
  available on the internet. That would be something useful to
  automate, IMV.
 
  http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/
 
 What would it take to get something like that which uses the
 check-world target instead of just the check target?

I suggest CC'ing Peter as a first measure.  I already suggested this (or
something similar) to him months ago.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-06 Thread Peter Geoghegan
On Mon, Jul 6, 2015 at 12:27 PM, Kevin Grittner kgri...@ymail.com wrote:
 What would it take to get something like that which uses the
 check-world target instead of just the check target?  Without the
 additional tests (like the isolation tests), some of these numbers
 don't reflect the coverage of regularly run tests.  While it is of
 some interest what coverage we get on the 30 second `make check`
 runs, I would be a lot more interested in what our coverage is when
 the full 8.5 minute set of regression tests we have in git is run.

I agree. Obviously in some cases the coverage is bad because
concurrency is naturally required. It's even more glaring that
recovery codepaths are always all-red in coverage reports.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-06 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Peter Geoghegan wrote:

 It would be nice to always have a html report from gcov always
 available on the internet. That would be something useful to
 automate, IMV.

 http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/

What would it take to get something like that which uses the
check-world target instead of just the check target?  Without the
additional tests (like the isolation tests), some of these numbers
don't reflect the coverage of regularly run tests.  While it is of
some interest what coverage we get on the 30 second `make check`
runs, I would be a lot more interested in what our coverage is when
the full 8.5 minute set of regression tests we have in git is run.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-06 Thread Alvaro Herrera
Peter Geoghegan wrote:

 It would be nice to always have a html report from gcov always
 available on the internet. That would be something useful to automate,
 IMV.

http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-07-01 Thread Michael Paquier
On Wed, Jul 1, 2015 at 2:38 PM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jun 30, 2015 at 10:35 PM, Peter Geoghegan p...@heroku.com wrote:
 The regression tests have zero coverage for this
 tuplesort_performsort() btspool2 case. That's a fairly common case
 to have no coverage for, and that took me all of 5 minutes to find.

 BTW, I looked here because I added a bunch of sortsupport stuff to
 _bt_load() in 9.5. It appears that that new code is entirely without
 coverage.

That's not cool. A patch for the src/test/regress area would be welcome.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Peter Geoghegan
On Tue, Jun 30, 2015 at 5:25 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 Isn't this the kind of thing Coverty's supposed to find?

I don't know, but in general I'm not very excited about static
analysis tools. The best things that they have going for them is that
they're available, and don't require test coverage in the same way as
running the regression tests with Valgrind enabled.

There is no real testing of sorting in the regression tests. It would
be nice to have a way of generating a large and varied selection of
sort operations programmatically, to catch this kind of thing.
pg_regress is not really up to it. The same applies to various other
cases where having a lot of expected output makes using pg_regress
infeasible.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Jim Nasby

On 6/29/15 6:47 PM, Peter Geoghegan wrote:

As we all know, the state of automated testing is pretty lamentable.
This is the kind of thing that we could catch more easily in the
future if better infrastructure were in place. I caught this by
eyeballing bttext_abbrev_convert() with slightly fresher eyes than the
last time I looked.


Isn't this the kind of thing Coverty's supposed to find?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Michael Paquier
On Wed, Jul 1, 2015 at 9:36 AM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jun 30, 2015 at 5:25 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 Isn't this the kind of thing Coverty's supposed to find?

 I don't know, but in general I'm not very excited about static
 analysis tools. The best things that they have going for them is that
 they're available, and don't require test coverage in the same way as
 running the regression tests with Valgrind enabled.

Well, yes. It should have complained about that if it were a perfect
tool able to emulate all code paths with all possible variable values.
But Coverity is not actually that perfect, and sometimes it misses the
shot, like here. by experience when you look at reports of a static
tool, you need to have a look first at other code paths that share
similarities with the problem reported, and you will actually see that
most of the time what the static tool saw is just the tip of the
iceberg. The human factor is determinant in this case.

 There is no real testing of sorting in the regression tests. It would
 be nice to have a way of generating a large and varied selection of
 sort operations programmatically, to catch this kind of thing.
 pg_regress is not really up to it. The same applies to various other
 cases where having a lot of expected output makes using pg_regress
 infeasible.

Well, the issue is double here:
1) Having a buildfarm member with valgrind would greatly help.
2) This code path is not used at all AFAIK in the test suite, so we
definitely lack regression tests here. In your opinion what would be a
sort set large enough to be able to trigger this code path? The idea
is to not make the regression test suite take too much time because of
it, and not to have the server created by pg_regress running the
regression tests having a too large PGDATA folder. For example, could
a btree index do it with a correct data set do it on at least one
platform?
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Peter Geoghegan
On Tue, Jun 30, 2015 at 9:39 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 There is no real testing of sorting in the regression tests. It would
 be nice to have a way of generating a large and varied selection of
 sort operations programmatically, to catch this kind of thing.
 pg_regress is not really up to it. The same applies to various other
 cases where having a lot of expected output makes using pg_regress
 infeasible.

 Well, the issue is double here:
 1) Having a buildfarm member with valgrind would greatly help.

Not sure that I agree with that. It wouldn't hurt, but people are
running Valgrind often enough these days that I think it's unlikely
that having it run consistently actually adds all that much. Maybe it
would make a difference to do it on a 32-bit platform, since most of
us have not used a 32-bit machine as our primary development
environment in about a decade, and we probably miss some things
because of that.

If you run Valgrind on Postgres master these days, you actually have a
pretty good chance of not finding any issues, which is great.
shared_buffers support for Valgrind would also help a lot (e.g. making
it so that referencing a shared buffer that isn't pinned causes a
Valgrind error -- Noah and I discussed this a while back; should be
doable). If you're looking for a new project, may I highly recommend
working on that.  :-)

 2) This code path is not used at all AFAIK in the test suite, so we
 definitely lack regression tests here. In your opinion what would be a
 sort set large enough to be able to trigger this code path? The idea
 is to not make the regression test suite take too much time because of
 it, and not to have the server created by pg_regress running the
 regression tests having a too large PGDATA folder. For example, could
 a btree index do it with a correct data set do it on at least one
 platform?

Maybe. The trick would be constructing a case where many different
code paths are covered, including the many different permutations and
combinations of how an external sort can go (number of runs, merge
steps, datatypes, etc).

In general, I think that there is a lot of value to be added by
someone making it their explicit goal to increase test coverage, as
measured by a tool like gcov (plus subjective expert analysis, of
course), particularly when it comes to things like memory corruption
bugs (hopefully including shared memory corruption bugs, and hopefully
including recovery). If someone was doing that in a focused way, then
the codepath where we must explicitly pfree() (in the case of this
bug) would probably have coverage, and then Valgrind probably would
catch this.

As long as that has to be a part of adding things to the standard
regression test suite (particularly with sorts), a suite which is
expected to run quickly, and as long as we're entirely relying on
pg_regress, we will not make progress here. Maybe if there was an
extended regression test suite that was explicitly about meeting a
code coverage goal we'd do better. Yes, I think mechanically
increasing code coverage is useful as an end in itself (although I do
accept that meeting a particularly coverage percentage is often not
especially useful). Increasing coverage has led me to the right
question before, just as static analysis has done that for you. For
example, the effort of increasing coverage can find dead code, and you
can intuitively get a sense of where to look for bugs manually by
determining mechanically what code paths are hard to hit, or are
naturally seldom hit.

It would be nice to always have a html report from gcov always
available on the internet. That would be something useful to automate,
IMV. Watching that regress over time might provide useful insight, but
I only use gcov a couple of times a year, so that's not going to
happen on its own.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Peter Geoghegan
On Tue, Jun 30, 2015 at 10:25 PM, Peter Geoghegan p...@heroku.com wrote:
 It would be nice to always have a html report from gcov always
 available on the internet. That would be something useful to automate,
 IMV. Watching that regress over time might provide useful insight, but
 I only use gcov a couple of times a year, so that's not going to
 happen on its own.

I generated such a report just now, and noticed this:

   1137 : tuplesort_performsort(btspool-sortstate);
 2141131 : if (btspool2)
 215   0 : tuplesort_performsort(btspool2-sortstate);
 216 :
 2171131 : wstate.heap = btspool-heap;
 2181131 : wstate.index = btspool-index;

The regression tests have zero coverage for this
tuplesort_performsort() btspool2 case. That's a fairly common case
to have no coverage for, and that took me all of 5 minutes to find.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-30 Thread Peter Geoghegan
On Tue, Jun 30, 2015 at 10:35 PM, Peter Geoghegan p...@heroku.com wrote:
 The regression tests have zero coverage for this
 tuplesort_performsort() btspool2 case. That's a fairly common case
 to have no coverage for, and that took me all of 5 minutes to find.

BTW, I looked here because I added a bunch of sortsupport stuff to
_bt_load() in 9.5. It appears that that new code is entirely without
coverage.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in bttext_abbrev_convert()

2015-06-29 Thread Robert Haas
On Mon, Jun 29, 2015 at 7:47 PM, Peter Geoghegan p...@heroku.com wrote:
 Commits b181a919 and arguably c79b6413 added bugs to
 bttext_abbrev_convert() in the process of fixing some others. In the
 master branch, bttext_abbrev_convert() can leak memory when the C
 locale happens to be used and we must detoast, which is unacceptable
 for about the same reason that it's unacceptable for a standard B-Tree
 comparator routine. There could also be a use-after-free issue for
 large strings that are detoasted, because bttext_abbrev_convert()
 hashes memory which might already be freed (when hashing the
 authoritative value).

 Attached patch fixes these issues.

 As we all know, the state of automated testing is pretty lamentable.
 This is the kind of thing that we could catch more easily in the
 future if better infrastructure were in place. I caught this by
 eyeballing bttext_abbrev_convert() with slightly fresher eyes than the
 last time I looked.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers