Re: [HACKERS] Bug in bttext_abbrev_convert()
On 2015-07-19 12:35:46 -0400, Peter Eisentraut wrote: I would like to do that, but the current host has no more capacity and the hoster is complaining, so I need to look for a new hosting solution before I can expand what is being built. Perhaps this could be moved into a virtual machine hosted by the postgres project infrastructure? What kind of resources do you currently have and what would you need? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On 7/6/15 3:52 PM, Alvaro Herrera wrote: Kevin Grittner wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: Peter Geoghegan wrote: It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/ What would it take to get something like that which uses the check-world target instead of just the check target? I suggest CC'ing Peter as a first measure. I already suggested this (or something similar) to him months ago. I would like to do that, but the current host has no more capacity and the hoster is complaining, so I need to look for a new hosting solution before I can expand what is being built. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Mon, Jul 6, 2015 at 12:52 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I suggest CC'ing Peter as a first measure. I already suggested this (or something similar) to him months ago. This would be a worthwhile effort. tuplesort.c only has 46% coverage. There is no coverage for functions that I know are used all the time in production, like dumptuples(), or ExecHashIncreaseNumBatches(). We should make increasing test coverage an explicit goal. Ideally, there'd be a lower quality set of tests that fill in certain gaps in code coverage, but are used less frequently, and may take much longer to run. Simply having the code executed will allow tools like Valgrind to catch bugs. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
Kevin Grittner wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: Peter Geoghegan wrote: It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/ What would it take to get something like that which uses the check-world target instead of just the check target? I suggest CC'ing Peter as a first measure. I already suggested this (or something similar) to him months ago. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Mon, Jul 6, 2015 at 12:27 PM, Kevin Grittner kgri...@ymail.com wrote: What would it take to get something like that which uses the check-world target instead of just the check target? Without the additional tests (like the isolation tests), some of these numbers don't reflect the coverage of regularly run tests. While it is of some interest what coverage we get on the 30 second `make check` runs, I would be a lot more interested in what our coverage is when the full 8.5 minute set of regression tests we have in git is run. I agree. Obviously in some cases the coverage is bad because concurrency is naturally required. It's even more glaring that recovery codepaths are always all-red in coverage reports. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
Alvaro Herrera alvhe...@2ndquadrant.com wrote: Peter Geoghegan wrote: It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/ What would it take to get something like that which uses the check-world target instead of just the check target? Without the additional tests (like the isolation tests), some of these numbers don't reflect the coverage of regularly run tests. While it is of some interest what coverage we get on the 30 second `make check` runs, I would be a lot more interested in what our coverage is when the full 8.5 minute set of regression tests we have in git is run. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
Peter Geoghegan wrote: It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/ -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Wed, Jul 1, 2015 at 2:38 PM, Peter Geoghegan p...@heroku.com wrote: On Tue, Jun 30, 2015 at 10:35 PM, Peter Geoghegan p...@heroku.com wrote: The regression tests have zero coverage for this tuplesort_performsort() btspool2 case. That's a fairly common case to have no coverage for, and that took me all of 5 minutes to find. BTW, I looked here because I added a bunch of sortsupport stuff to _bt_load() in 9.5. It appears that that new code is entirely without coverage. That's not cool. A patch for the src/test/regress area would be welcome. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Tue, Jun 30, 2015 at 5:25 PM, Jim Nasby jim.na...@bluetreble.com wrote: Isn't this the kind of thing Coverty's supposed to find? I don't know, but in general I'm not very excited about static analysis tools. The best things that they have going for them is that they're available, and don't require test coverage in the same way as running the regression tests with Valgrind enabled. There is no real testing of sorting in the regression tests. It would be nice to have a way of generating a large and varied selection of sort operations programmatically, to catch this kind of thing. pg_regress is not really up to it. The same applies to various other cases where having a lot of expected output makes using pg_regress infeasible. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On 6/29/15 6:47 PM, Peter Geoghegan wrote: As we all know, the state of automated testing is pretty lamentable. This is the kind of thing that we could catch more easily in the future if better infrastructure were in place. I caught this by eyeballing bttext_abbrev_convert() with slightly fresher eyes than the last time I looked. Isn't this the kind of thing Coverty's supposed to find? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Wed, Jul 1, 2015 at 9:36 AM, Peter Geoghegan p...@heroku.com wrote: On Tue, Jun 30, 2015 at 5:25 PM, Jim Nasby jim.na...@bluetreble.com wrote: Isn't this the kind of thing Coverty's supposed to find? I don't know, but in general I'm not very excited about static analysis tools. The best things that they have going for them is that they're available, and don't require test coverage in the same way as running the regression tests with Valgrind enabled. Well, yes. It should have complained about that if it were a perfect tool able to emulate all code paths with all possible variable values. But Coverity is not actually that perfect, and sometimes it misses the shot, like here. by experience when you look at reports of a static tool, you need to have a look first at other code paths that share similarities with the problem reported, and you will actually see that most of the time what the static tool saw is just the tip of the iceberg. The human factor is determinant in this case. There is no real testing of sorting in the regression tests. It would be nice to have a way of generating a large and varied selection of sort operations programmatically, to catch this kind of thing. pg_regress is not really up to it. The same applies to various other cases where having a lot of expected output makes using pg_regress infeasible. Well, the issue is double here: 1) Having a buildfarm member with valgrind would greatly help. 2) This code path is not used at all AFAIK in the test suite, so we definitely lack regression tests here. In your opinion what would be a sort set large enough to be able to trigger this code path? The idea is to not make the regression test suite take too much time because of it, and not to have the server created by pg_regress running the regression tests having a too large PGDATA folder. For example, could a btree index do it with a correct data set do it on at least one platform? Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Tue, Jun 30, 2015 at 9:39 PM, Michael Paquier michael.paqu...@gmail.com wrote: There is no real testing of sorting in the regression tests. It would be nice to have a way of generating a large and varied selection of sort operations programmatically, to catch this kind of thing. pg_regress is not really up to it. The same applies to various other cases where having a lot of expected output makes using pg_regress infeasible. Well, the issue is double here: 1) Having a buildfarm member with valgrind would greatly help. Not sure that I agree with that. It wouldn't hurt, but people are running Valgrind often enough these days that I think it's unlikely that having it run consistently actually adds all that much. Maybe it would make a difference to do it on a 32-bit platform, since most of us have not used a 32-bit machine as our primary development environment in about a decade, and we probably miss some things because of that. If you run Valgrind on Postgres master these days, you actually have a pretty good chance of not finding any issues, which is great. shared_buffers support for Valgrind would also help a lot (e.g. making it so that referencing a shared buffer that isn't pinned causes a Valgrind error -- Noah and I discussed this a while back; should be doable). If you're looking for a new project, may I highly recommend working on that. :-) 2) This code path is not used at all AFAIK in the test suite, so we definitely lack regression tests here. In your opinion what would be a sort set large enough to be able to trigger this code path? The idea is to not make the regression test suite take too much time because of it, and not to have the server created by pg_regress running the regression tests having a too large PGDATA folder. For example, could a btree index do it with a correct data set do it on at least one platform? Maybe. The trick would be constructing a case where many different code paths are covered, including the many different permutations and combinations of how an external sort can go (number of runs, merge steps, datatypes, etc). In general, I think that there is a lot of value to be added by someone making it their explicit goal to increase test coverage, as measured by a tool like gcov (plus subjective expert analysis, of course), particularly when it comes to things like memory corruption bugs (hopefully including shared memory corruption bugs, and hopefully including recovery). If someone was doing that in a focused way, then the codepath where we must explicitly pfree() (in the case of this bug) would probably have coverage, and then Valgrind probably would catch this. As long as that has to be a part of adding things to the standard regression test suite (particularly with sorts), a suite which is expected to run quickly, and as long as we're entirely relying on pg_regress, we will not make progress here. Maybe if there was an extended regression test suite that was explicitly about meeting a code coverage goal we'd do better. Yes, I think mechanically increasing code coverage is useful as an end in itself (although I do accept that meeting a particularly coverage percentage is often not especially useful). Increasing coverage has led me to the right question before, just as static analysis has done that for you. For example, the effort of increasing coverage can find dead code, and you can intuitively get a sense of where to look for bugs manually by determining mechanically what code paths are hard to hit, or are naturally seldom hit. It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. Watching that regress over time might provide useful insight, but I only use gcov a couple of times a year, so that's not going to happen on its own. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Tue, Jun 30, 2015 at 10:25 PM, Peter Geoghegan p...@heroku.com wrote: It would be nice to always have a html report from gcov always available on the internet. That would be something useful to automate, IMV. Watching that regress over time might provide useful insight, but I only use gcov a couple of times a year, so that's not going to happen on its own. I generated such a report just now, and noticed this: 1137 : tuplesort_performsort(btspool-sortstate); 2141131 : if (btspool2) 215 0 : tuplesort_performsort(btspool2-sortstate); 216 : 2171131 : wstate.heap = btspool-heap; 2181131 : wstate.index = btspool-index; The regression tests have zero coverage for this tuplesort_performsort() btspool2 case. That's a fairly common case to have no coverage for, and that took me all of 5 minutes to find. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Tue, Jun 30, 2015 at 10:35 PM, Peter Geoghegan p...@heroku.com wrote: The regression tests have zero coverage for this tuplesort_performsort() btspool2 case. That's a fairly common case to have no coverage for, and that took me all of 5 minutes to find. BTW, I looked here because I added a bunch of sortsupport stuff to _bt_load() in 9.5. It appears that that new code is entirely without coverage. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in bttext_abbrev_convert()
On Mon, Jun 29, 2015 at 7:47 PM, Peter Geoghegan p...@heroku.com wrote: Commits b181a919 and arguably c79b6413 added bugs to bttext_abbrev_convert() in the process of fixing some others. In the master branch, bttext_abbrev_convert() can leak memory when the C locale happens to be used and we must detoast, which is unacceptable for about the same reason that it's unacceptable for a standard B-Tree comparator routine. There could also be a use-after-free issue for large strings that are detoasted, because bttext_abbrev_convert() hashes memory which might already be freed (when hashing the authoritative value). Attached patch fixes these issues. As we all know, the state of automated testing is pretty lamentable. This is the kind of thing that we could catch more easily in the future if better infrastructure were in place. I caught this by eyeballing bttext_abbrev_convert() with slightly fresher eyes than the last time I looked. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers