On Wed, May 17, 2017 at 2:35 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> On Tue, May 16, 2017 at 4:25 PM, Jeff Davis <pg...@j-davis.com> wrote: >>> Why can't hash partitions be stored in tables the same way as we do TOAST? >>> That should take care of the naming problem. > >> Hmm, yeah, something like that could be done, but every place where >> you are currently allowed to refer to a partition by name would have >> to be be changed to accept some other syntax for specifying the >> partition. > > Uh ... toast tables have regular names, and can be specified in commands > just like any other table. I don't see why these "auto" partition tables > couldn't be handled the same way.
Really? That seems like a huge usability fail to me. If somebody wants to create an index on one partition of a hash-partitioned table, or reindex an index, do you really want them to have to dig out an internal name to do it? And how exactly would you dump and restore the partitions and their indexes? It's true that there are some operations that can be performed directly on a TOAST table, but the saving grace is that you usually don't need to do any of them. That won't be true here. >> Beyond that, I think it's a bad idea to make hash partitions behave >> completely differently from list and range partitions. > > I think the question is whether we are going to make a distinction between > logical partitions (where the data division rule makes some sense to the > user) and physical partitions (where it needn't). I think it might be > perfectly reasonable for those to behave differently. I don't think I'd like to go so far as to say that it's unreasonable, but I certainly wouldn't say I'm optimistic about such a design. I do not think that it is going to work to conceal from the user that the partitions are really separate tables with their own indexes. I also think that trying to make such a thing work is just going to lead to a lot of time and energy spent trying to paper over problems that are basically self-inflicted, and that papering over those problems won't really end up having any value for users. Remember, the chain of reasoning here is something like: 1. To handle dump-and-reload the way we partitioning does today, hash functions would need to be portable across encodings. 2. That's impractically difficult. 3. So let's always load data through the top-parent. 4. But that could fail due to e.g. a UNIQUE index on an individual child, so let's try to prohibit all of the things that could be done to an individual partition that could cause a reload failure. 5. And then for good measure let's hide the existence of the partitions, too. Every step in that chain of logic has a certain sense to it, but none of them are exactly water-tight. #1 is basically a value judgement: would people rather (a) have faster hash functions, or (b) would they rather be able to port a database to a different encoding without having rows move between hash functions? The statement is only true if you think it's the latter, but I tend to think it's the former. #2 is a judgement that the performance characteristics of as-yet-unwritten portable hashing will be so bad that nobody could possibly be satisfied with it. #3 is a great idea as an optional behavior, but it's only a strict necessity if you're totally committed to #1 and #2. It also has some performance cost, which makes it somewhat undesirable as a default behavior. #4 is *probably* a necessary consequence of #3. I don't know what the argument for #5 is unless it's that #4 isn't hard enough already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers