On Thu, Jun 1, 2017 at 10:59 AM, Robert Haas <robertmh...@gmail.com> wrote:
> 1. Are the new problems worse than the old ones?
>
> 2. What could we do about it?
Exactly the right questions.

1. For range partitioning, I think it's "yes, a little". As you point out, there are already some weird edge cases -- the main way range partitioning would make the problem worse is simply by having more users. But for hash partitioning I think the problems will become more substantial. Different encodings, endian issues, etc. will be a headache for someone, and potentially a bad day if they are urgently trying to restore on a new machine. We should remember that not everyone is a full-time postgres DBA, and users might reasonably think that the default options to pg_dump[all] will give them a portable dump.

2. I basically see two approaches to solving the problem:

(a) Tom suggested at PGCon that we could have a GUC that automatically causes inserts to a partition to be re-routed through the parent. We could discuss whether to always route through the parent, or to recheck the partition constraints and re-route only the tuples that would fail them. If the user gets into trouble, the worst that would happen is a helpful error message telling them to set the GUC. I like this idea.

(b) I had suggested before that we could make the default text dump (and the default output from pg_restore, for consistency) route through the parent. Advanced users would dump with -Fc, and pg_restore would support an option to do partition-wise loading. To me, this is simpler, but users might forget to use (or not know about) the pg_restore option and then it would load more slowly.

Also, the ship is sailing on range partitioning, so we might prefer option (a) just to avoid making any changes. I am fine with either option.

> 2. Add an option like --dump-partition-data-with-parent. I'm not sure
> who originally proposed this, but it seems that everybody likes it.
> What we disagree about is the degree to which it's sufficient. Jeff
> Davis thinks it doesn't go far enough: what if you have an old
> plain-format dump that you don't want to hand-edit, and the source
> database is no longer available? Most people involved in the
> unconference discussion of partitioning at PGCon seemed to feel that
> wasn't really something we should worry about too much. I had been
> taking that position also, more or less because I don't see that
> there are better alternatives.

If the suggestions above are unacceptable, and we don't come up with anything better, then of course we have to move on. I am worrying now primarily because now is the best time to worry; I don't expect any horrible outcome.

> 3. Implement portable hash functions (Jeff Davis or me, not sure
> which). Andres scoffed at this idea, but I still think it might have
> legs.

I think it reduces the problem, which has value, but it's hard to make it rock-solid.

> make fast. Those two things also solve different parts of the
> problem; one is insulating the user from a difference in hardware
> architecture, while the other is insulating the user from a difference
> in user-selected settings. I think that the first of those things is
> more important than the second, because it's easier to change your
> settings than it is to change your hardware.

Good point.

Regards,
    Jeff Davis
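
PS: To make options (a) and (b) a little more concrete, here is a rough sketch of how each would look to someone restoring a plain-format dump. The table, column, and GUC names are made up for illustration; this is a sketch of the intended behavior, not a worked-out design.

    -- Today, a plain text dump loads each leaf partition directly:
    COPY measurement_y2017 (city_id, logdate, peaktemp) FROM stdin;
    1	2017-06-01	29
    \.

    -- (a) A GUC (name purely hypothetical) that re-routes inserts into a
    -- leaf partition through the parent, either always or only when a
    -- recheck of the leaf's partition constraint fails:
    SET route_partition_inserts_through_parent = on;

    -- (b) Have the default text dump target the parent instead, relying
    -- on ordinary tuple routing at some cost in load speed:
    COPY measurement (city_id, logdate, peaktemp) FROM stdin;
    1	2017-06-01	29
    \.

Either way, rows end up in whichever partition the destination server's hash function (or range bounds) assigns them to, so the dump stays loadable even when that assignment differs from the source machine.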
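
And on point 3, a minimal sketch of what "portable" would have to mean for text values: hashtext() hashes the string's stored bytes, so the same logical value can hash differently under different server encodings, while a portable variant would canonicalize the representation before hashing. md5() below merely stands in for whatever hash partitioning actually uses; it is not a proposal to use md5.

    -- Encoding-dependent: hashes the bytes as stored, so a UTF8 and a
    -- LATIN1 database can disagree about where 'café' belongs.
    SELECT hashtext('café');

    -- Encoding-independent idea: convert to a canonical encoding first,
    -- then hash those bytes (md5 is only a stand-in for the real hash).
    SELECT md5(convert_to('café', 'UTF8'));

That only covers encodings, though; endianness is a separate issue, which is part of why I don't think this can be made completely rock-solid.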