Andreas Seltenreich's sqlsmith tool has found an impressive number of
bugs. In light of that, it seems to me that it would be reasonable for
a contributor to write a regression test that avoids dropping database
objects specifically so that sqlsmith has some chance of finding bugs
after the fact, by generating a query whose plan ends up accessing
said objects. Commit 975ad4e6 fixed a nasty bug in BRIN that was
originally found by Andreas' fuzz testing. Perhaps that bug would
still be around if Alvaro had included any DROP statements within
brin.sql. Presumably his choice not to do so was completely
arbitrary. Actually, it seems likely that it was a fortunate accident.

I understand that there is a cost to increasing the high-watermark
amount of disk space used during regression testing, since many
buildfarm animals are rather underpowered. I also understand that in
most cases the objects themselves aren't really what we're testing, so
it doesn't make much sense to leave them behind. However, it still
seems like a good idea to try to think about this a little more
strategically. For example, the new index_including.sql file drops all
INCLUDE indexes/tables proactively, even though it looks like they're
rather small, and in a certain sense worth keeping around. I also see
that both spgist.sql and gin.sql avoid dropping objects, but gist.sql
seems to drop several. hash_index.sql drops some objects, too
(sqlsmith also found hash index bugs despite this).

Maybe a cheap, scalable solution is possible. For example, perhaps we
could maintain a version of the regression tests that has Postgres
somehow silently ignore most DROP statements. I doubt that people like
Andreas will ever be remotely concerned about the extra disk space
overhead, since fuzzing is inherently a computationally intensive
process; CPU costs are bound to dominate.  A strategy like this would
also make it easier to run amcheck on indexes left behind by the
regression tests.
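
As a rough illustration of the "ignore most DROP statements" idea, here is
a minimal sketch that preprocesses a regression script rather than changing
the server. It is not an existing tool: the helper name strip_drops and the
comment prefix are made up for this example, and it only handles DROP TABLE
and DROP INDEX statements that start at the beginning of a line. It does
not understand dollar-quoted function bodies or string literals, and it does
nothing about the expected output files, which would no longer match. Tests
that recreate an object under the same name after dropping it would also
start failing.

```python
import re

# Hypothetical pattern: a DROP TABLE / DROP INDEX statement that begins at the
# start of a line and runs up to the next semicolon (possibly across lines).
DROP_RE = re.compile(
    r"^[ \t]*DROP[ \t]+(?:TABLE|INDEX)\b[^;]*;[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_drops(sql_text: str) -> str:
    """Comment out DROP TABLE/INDEX statements; leave all other text as is."""

    def comment_out(match: re.Match) -> str:
        # Prefix every line of the matched statement with an SQL comment
        # marker so the objects survive until the end of the test run.
        lines = match.group(0).splitlines()
        return "\n".join("-- (kept for fuzzing) " + line for line in lines)

    return DROP_RE.sub(comment_out, sql_text)


if __name__ == "__main__":
    sample = "CREATE TABLE t (a int);\nDROP TABLE t;\nSELECT 1;\n"
    print(strip_drops(sample), end="")
```

Running the filtered scripts against a scratch cluster would leave the
objects in place for sqlsmith, or for amcheck, to examine afterwards.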

How can we do better?

Peter Geoghegan
