On Fri, May 12, 2017 at 11:45 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Forget hash partitioning.  There's no law saying that that's a good
> idea and we have to have it.  With a different set of constraints,
> maybe we could do it, but I think the existing design decisions have
> basically locked it out --- and I doubt that hash partitioning is so
> valuable that we should jettison other desirable properties to get it.

A lot of the optimizations that can make use of hash partitioning
could also make use of range partitioning. But let me defend hash
partitioning:

* hash partitioning requires fewer decisions by the user
* naturally balances data and workload among partitions in most cases
* easy to match with a degree of parallelism

But with range partitioning, you can have situations where different
tables have different distributions of data. If you partition to
balance the data between partitions in both cases, then that makes
partition-wise join a lot harder because the boundaries don't line up.
If you make the boundaries line up to do partition-wise join, the
partitions might have wildly different amounts of data in them. Either
way, it makes parallelism harder.

Even without considering joins, range partitioning could force you to
make a choice between balancing the data and balancing the workload.
If you are partitioning based on date, then a lot of the workload will
be on more recent partitions. That's desirable sometimes (e.g. for
vacuum) but not always desirable for parallelism.

Hash partitioning doesn't have these issues and goes very nicely with
parallel query.

Regards,
     Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
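
The balance argument above can be sketched with a small simulation. This is a toy illustration in Python, not PostgreSQL code: the function names, the SHA-256-based hash, the key ranges, and the partition boundaries are all assumptions made up for the example. It takes two tables whose keys are distributed differently, gives them the same range boundaries (as a partition-wise join would want), and compares the resulting partition sizes with those from hashing the key.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def hash_partition(key, n_parts):
    """Assign a key to a partition by hashing it (hypothetical helper).

    SHA-256 is used only to get a hash that is stable across runs;
    it is not the hash function PostgreSQL would use.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_parts


def range_partition(key, upper_bounds):
    """Assign a key to a range partition (hypothetical helper).

    upper_bounds holds the exclusive upper bound of every partition
    except the last, in ascending order.
    """
    return bisect_right(upper_bounds, key)


def partition_sizes(keys, assign, n_parts):
    """Count how many keys land in each partition."""
    counts = Counter(assign(k) for k in keys)
    return [counts[p] for p in range(n_parts)]


N_PARTS = 4
# Same boundaries for both tables, so the partitions line up for a join.
SHARED_BOUNDS = [250, 500, 750]

# Assumed data: table_a has evenly spread keys, table_b is skewed
# toward high ("recent") keys. Both have 1000 distinct keys.
table_a = list(range(1000))
table_b = list(range(100)) + list(range(750, 1650))

for name, keys in (("table_a", table_a), ("table_b", table_b)):
    by_range = partition_sizes(
        keys, lambda k: range_partition(k, SHARED_BOUNDS), N_PARTS)
    by_hash = partition_sizes(
        keys, lambda k: hash_partition(k, N_PARTS), N_PARTS)
    print(name, "range:", by_range, "hash:", by_hash)
```

With the shared boundaries, table_a splits into four partitions of 250 keys each, while table_b ends up with 100, 0, 0, and 900. The hash-based sizes should come out roughly even for both tables, and partition i of one table always holds the same hash bucket as partition i of the other.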