On 30 September 2017 at 21:03, Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
> I heard from customers that they periodically dump contents of > pg_stat_statements and then build statistics over long period of time. If > even they leave default pg_stat_statements.max unchanged, probability of > collision would be significantly higher. Indeed. It's simple enough to export stats to prometheus with queryid as the key. Then even if the query ages out of the database stats you have graphs and derivative metrics going further back. I have to admit this was my first reaction to the idea of using sha1 hashes for git commits as well. But eventually I came around. If the chances of a hash collision are smaller than a cosmic ray flipping a bit or a digital electronics falling into a meta-stable state then I had to admit there's not much value in being theoretically more correct. In practice if the query has aged out of pg_stat_statements and you're exporting pg_stat_statements metrics to longer-term storage there's really nothing more "correct" than just using a long enough hash and assuming there are no collisions anyways. If 64-bits is still not sufficient we could just go to 160-bit sha1 or 256-bit sha256. Actually there's a reason I'm wondering if we shouldn't use a cryptographic hash or even an HMAC. Currently if you're non-superuser we, quite sensibly, hide the query text. But we also hide the queryid. The latter is really inconvenient since it really makes the stats utterly useless. I'm not sure what the rationale was but the only thing I can think of is a fear that it's possible to reverse engineer the query using brute force. An HMAC, or for most purposes even a simple cryptographic hash with a secret salt would make that impossible. (I have other advances in pg_stat_statements I would love to see. It would be so much more helpful if pg_stat_statements also kept a few examples of query parameters such as the most recent set, the set that caused the longest execution, maybe the set with the largest of each metric.) -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers