FOR VALUE clause of a partition does not allow a constant expression like (10000/5 -1). It gives syntax error regression=# create table pt1_p1 partition of pt1 for values start (0) end ((10000/5) - 1); ERROR: syntax error at or near "(" LINE 1: ...pt1_p1 partition of pt1 for values start (0) end ((10000/5) ...
Shouldn't we allow constant expressions here? If this has been already discussed, please forgive me and point out the relevant mail chain. On Tue, Aug 9, 2016 at 12:48 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > What strikes me odd about these patches is RelOptInfo has remained > unmodified. For a base partitioned table, I would expect it to be marked as > partitioned may be indicating the partitioning scheme. Instead of that, I > see that the code directly deals with PartitionDesc, PartitionKey which are > part of Relation. It uses other structures like PartitionKeyExecInfo. I > don't have any specific observation as to why we need such information in > RelOptInfo, but lack of it makes me uncomfortable. It could be because > inheritance code does not require any mark in RelOptInfo and your patch > re-uses inheritance code. I am not sure. > > On Tue, Aug 9, 2016 at 9:17 AM, Amit Langote < > langote_amit...@lab.ntt.co.jp> wrote: > >> On 2016/08/09 6:02, Robert Haas wrote: >> > On Mon, Aug 8, 2016 at 1:40 AM, Amit Langote >> > <langote_amit...@lab.ntt.co.jp> wrote: >> >>> +1, if we could do it. It will need a change in the way Amit's patch >> stores >> >>> partitioning scheme in PartitionDesc. >> >> >> >> Okay, I will try to implement this in the next version of the patch. >> >> >> >> One thing that comes to mind is what if a user wants to apply hash >> >> operator class equality to list partitioned key by specifying a hash >> >> operator class for the corresponding column. In that case, we would >> not >> >> have the ordering procedure with an hash operator class, hence any >> >> ordering based optimization becomes impossible to implement. The >> current >> >> patch rejects a column for partition key if its type does not have a >> btree >> >> operator class for both range and list methods, so this issue doesn't >> >> exist, however it could be seen as a limitation. >> > >> > Yes, I think you should expect careful scrutiny of that issue. It >> > seems clear to me that range partitioning requires a btree opclass, >> > that hash partitioning requires a hash opclass, and that list >> > partitioning requires at least one of those things. It would probably >> > be reasonable to pick one or the other and insist that list >> > partitioning always requires exactly that, but I can't see how it's >> > reasonable to insist that you must have both types of opclass for any >> > type of partitioning. >> >> So because we intend to implement optimizations for list partition >> metadata that presuppose existence of corresponding btree operator class, >> we should just always require user to specify one (or error out if user >> doesn't specify and a default one doesn't exist). That way, we explicitly >> do not support specify hash equality operator for list partitioning. >> >> >> So, we have 3 choices for the internal representation of list >> partitions: >> >> >> >> Choice 1 (the current approach): Load them in the same order as they >> are >> >> found in the partition catalog: >> >> >> >> Table 1: p1 {'b', 'f'}, p2 {'c', 'd'}, p3 {'a', 'e'} >> >> Table 2: p1 {'c', 'd'}, p2 {'a', 'e'}, p3 {'b', 'f'} >> >> >> >> In this case, mismatch on the first list would make the two tables >> >> incompatibly partitioned, whereas they really aren't incompatible. >> > >> > Such a limitation seems clearly unacceptable. We absolutely must be >> > able to match up compatible partitioning schemes without getting >> > confused by little details like the order of the partitions. >> >> Agreed. Will change my patch to adopt the below method. >> >> >> Choice 2: Representation with 2 arrays: >> >> >> >> Table 1: ['a', 'b', 'c', 'd', 'e', 'f'], [3, 1, 2, 2, 3, 1] >> >> Table 2: ['a', 'b', 'c', 'd', 'e', 'f'], [2, 3, 1, 1, 2, 3] >> >> >> >> It still doesn't help the case of pairwise joins because it's hard to >> tell >> >> which value belongs to which partition (the 2nd array carries the >> original >> >> partition numbers). Although it might still work for tuple-routing. >> > >> > It's very good for tuple routing. It can also be used to match up >> > partitions for pairwise joins. Compare the first arrays. If they are >> > unequal, stop. Else, compare the second arrays, incrementally >> > building a mapping between them and returning false if the mapping >> > turns out to be non-bijective. For example, in this case, we look at >> > index 0 and decide that 3 -> 2. We look at index 1 and decide 1 -> 3. >> > We look at index 2 and decide 2 -> 1. We look at index 4 and find >> > that we already have a mapping for 2, but it's compatible because we >> > need 2 -> 1 and that's what is already there. Similarly for the >> > remaining entries. This is really a pretty easy loop to write and it >> > should run very quickly. >> >> I see, it does make sense to try to implement this way. >> >> >> Choice 3: Order all lists' elements for each list individually and then >> >> order the lists themselves on their first values: >> >> >> >> Table 1: p3 {'a', 'e'}, p2 {'b', 'f'}, p1 {'c', 'd'} >> >> Table 2: p2 {'a', 'e'}, p1 {'b', 'f'}, p3 {'c', 'd'} >> >> >> >> This representation makes pairing partitions for pairwise joining >> >> convenient but for tuple-routing we still need to visit each partition >> in >> >> the worst case. >> > >> > I think this is clearly not good enough for tuple routing. If the >> > algorithm I proposed above turns out to be too slow for matching >> > partitions, then we could keep both this representation and the >> > previous one. We are not limited to just one. But I don't think >> > that's likely to be the case. >> >> I agree. Let's see how the option 2 turns out. >> >> > Also, note that all of this presupposes we're doing range >> > partitioning, or perhaps list partitioning with a btree opclass. For >> > partitioning based on a hash opclass, you'd organize the data based on >> > the hash values rather than range comparisons. >> >> Yes, the current patch does not implement hash partitioning, although I >> have to think about how to support the hash case when designing the >> internal data structures. >> >> >> By the way, I am planning to start a new thread with the latest set of >> patches which I will post in a day or two. I have tried to implement all >> the bug fixes and improvements that have been suggested on this thread so >> far. Thanks to all those who reviewed and gave their comments. Please >> check this page to get a link to the new thread: >> https://commitfest.postgresql.org/10/611/ >> >> Thanks, >> Amit >> >> >> > > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company