On Tue, May 2, 2017 at 9:01 PM, Jeff Davis <pg...@j-davis.com> wrote:
> On Tue, Feb 28, 2017 at 6:33 AM, Yugo Nagata <nag...@sraoss.co.jp> wrote:
>> In this patch, user can specify a hash function USING. However,
>> we might need default hash functions which are useful and
>> proper for hash partitioning.
>
> I suggest that we consider the hash functions more carefully. This is
> (effectively) an on-disk format so it can't be changed easily later.
>
> 1. Consider a partition-wise join of two hash-partitioned tables. If
> that's a hash join, and we just use the hash opclass, we immediately
> lose some useful bits of the hash function. Same for hash aggregation
> where the grouping key is the partition key.

Hmm, that could be a problem in some cases.  I think there's probably much
less of a problem if the modulus isn't a power of two?
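To make the "losing useful bits" point concrete, here's a rough
back-of-the-envelope illustration using the SQL-callable hashint4(), not
anything in the partitioning patch or the executor.  If rows are routed to
partitions by hash % modulus, and the join then picks buckets from the
low-order bits of the same hash value, a modulus-8 partition has already
pinned the bits the join wants, whereas modulus 7 leaves them alone:

-- Rough illustration only: pretend hashint4(i) is both the partition hash
-- and the hash-join hash for an int4 key, partitions are chosen by
-- hash % modulus, and join buckets by the low 10 bits of the same hash.

-- Keys landing in the remainder-0 partition of a modulus-8 table can only
-- ever reach 128 of 1024 join buckets.
SELECT count(DISTINCT hashint4(i) & 1023) AS join_buckets_used
FROM generate_series(1, 200000) AS g(i)
WHERE hashint4(i) % 8 = 0;

-- With a non-power-of-two modulus (7), the same keys spread across
-- essentially all 1024 buckets.
SELECT count(DISTINCT hashint4(i) & 1023) AS join_buckets_used
FROM generate_series(1, 200000) AS g(i)
WHERE hashint4(i) % 7 = 0;

That's obviously not the real code path, but it's the same arithmetic: with
a power-of-two modulus, the partition has already consumed exactly the bits
the join would otherwise use.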
> To fix this, I think we need to include a salt in the hash API. Each
> level of hashing can choose a random salt.

Do you mean that we'd salt partitioning hashing differently from grouping
hashing, which would be salted differently from aggregation hashing, which,
I suppose, would be salted differently from hash index hashing?  Or do you
mean that you'd have to specify a salt when creating a hash-partitioned
table, and make sure it's the same across all compatibly partitioned tables
you might want to hash-join?  The latter sounds unappealing.

> 2. Consider a partition-wise join where the join keys are varchar(10)
> and char(10). We can't do that join if we just use the existing hash
> strategy, because 'foo' = 'foo ' should match, but those values
> have different hashes when using the standard hash opclass.
>
> To fix this, we need to be smarter about normalizing values at a
> logical level before hashing. We can take this to varying degrees,
> perhaps even normalizing an integer to a numeric before hashing so
> that you can do a cross-type join on int=numeric.
>
> Furthermore, we need catalog metadata to indicate which hash functions
> are suitable for which cross-type comparisons. Or, to put it the other
> way, which typecasts preserve the partitioning.

You're basically describing what a hash opfamily already does, except that
we don't have a single opfamily that covers both varchar(10) and char(10),
nor one that covers both int and numeric.  We do have one that covers int2,
int4, and int8, though.  If somebody wanted to create the opfamilies you're
suggesting, there's nothing preventing it, although I'm not sure exactly how
we'd encourage people to start using the new one and deprecate the old one.
We don't seem to have good infrastructure for that.
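For what it's worth, the existing coverage is easy to see from the catalogs:
the hash AM already has one family (integer_ops, if I remember the name
correctly) that ties the int2, int4, and int8 opclasses together, which is
what makes compatible hashing across those types work today.  Something
along these lines shows it:

-- Opclasses grouped under the hash integer_ops opfamily: int2, int4 and
-- int8 hash compatibly because they share one family.
SELECT opf.opfname AS opfamily,
       opc.opcname AS opclass,
       opc.opcintype::regtype AS input_type
FROM pg_opclass opc
JOIN pg_opfamily opf ON opf.oid = opc.opcfamily
JOIN pg_am am ON am.oid = opc.opcmethod
WHERE am.amname = 'hash'
  AND opf.opfname = 'integer_ops'
ORDER BY opc.opcname;

A family covering int and numeric (or text and char(n)) could in principle
be created with CREATE OPERATOR FAMILY ... USING hash plus cross-type
equality operators and hash support functions that agree across the types;
as I said, the harder part is getting anything to migrate to it from the
existing defaults.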
> 3. We might want to use a hash function that is a little slower but
> more resistant to collisions. We may even want to use a 64-bit
> hash.
>
> My opinion is that we should work on this hashing infrastructure
> first, and then support the DDL. If we get the hash functions right,
> that frees us up to create better plans, with better push-downs, which
> will be good for parallel query.

I am opposed to linking the fate of this patch to multiple independent,
possibly large, possibly difficult, possibly controversial enhancements to
the hashing mechanism.  If there are simple things that can reasonably be
done in this patch to make hash partitioning better, great.  If you want to
work on improving the hashing mechanism as an independent project, also
great.  But I think most people would rather have hash partitioning in v11
than wait for v12 or v13 so that other hashing improvements can be
completed; I know I would.

If we say "we shouldn't implement hash partitioning because some day we
might make incompatible changes to the hashing mechanism", then we'll never
implement it, because that will always be true.  Even the day after we
change it, there may still come a future day when we change it again.  The
stakes have already been raised by making hash indexes durable; that, too,
arguably makes future changes to the hashing infrastructure harder.  But I
still think it was the right thing to proceed with that work.  If we get
64-bit hash codes in the next release, and we want hash indexes to use them,
then we will have to invalidate existing hash indexes (again).  That's sad,
but not as sad as it would have been to not commit the work to make hash
indexes durable.

There's a chicken-and-egg problem here: without durable hash indexes and
hash partitioning, there's not much incentive to make hashing better, but
once we have them, changes create a backward-compatibility issue.  Such is
life; nothing we do is infinitely future-proof.

The last significant overhaul of the hashing mechanism that I know about was
in 2009, cf. 2604359251d34177a14ef58250d8b4a51d83103b and
8205258fa675115439017b626c4932d5fefe2ea8.  Until this email, I hadn't seen
any complaints about the quality of that hash function, either in terms of
speed or in terms of collision properties - what makes you think those
things are serious problems?  I *have* heard some interest in widening the
output to 64 bits, and also in finding a way to combine multiple hash values
in some smarter way than we do at present.  Seeding has come up, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company