On 2017/06/08 1:44, Robert Haas wrote:
> On Wed, Jun 7, 2017 at 7:47 AM, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
>> In ATExecAttachPartition() there's following code
>>
>> 13715         partnatts = get_partition_natts(key);
>> 13716         for (i = 0; i < partnatts; i++)
>> 13717         {
>> 13718             AttrNumber  partattno;
>> 13719
>> 13720             partattno = get_partition_col_attnum(key, i);
>> 13721
>> 13722             /* If partition key is an expression, must not skip
>> validation */
>> 13723             if (!partition_accepts_null &&
>> 13724                 (partattno == 0 ||
>> 13725                  !bms_is_member(partattno, not_null_attrs)))
>> 13726                 skip_validate = false;
>> 13727         }
>>
>> partattno obtained from the partition key is the attribute number of
>> the partitioned table but not_null_attrs contains the attribute
>> numbers of attributes of the table being attached which have NOT NULL
>> constraint on them. But the attribute numbers of partitioned table and
>> the table being attached may not agree i.e. partition key attribute in
>> partitioned table may have a different position in the table being
>> attached. So, this code looks buggy. Probably we don't have a test
>> which tests this code with different attribute order between
>> partitioned table and the table being attached. I didn't get time to
>> actually construct a testcase and test it.

There seem to be couple of bugs here:

1. When partition's key attributes differ in ordering from the parent,
   predicate_implied_by() will give up due to structural inequality of
   Vars in the expressions.  By fixing this, we can get it to return
   'true' when it's really so.

2. As you said, we store partition's attribute numbers in the
   not_null_attrs bitmap, but then check partattno (which is the parent's
   attribute number which might differ) against the bitmap, which seems
   like it might produce incorrect result.  If, for example,
   predicate_implied_by() set skip_validate to true, and the above code
   failed to set skip_validate to false where it should have, then we
   would wrongly end up skipping the scan.  That is, rows in the partition
   will contain null values whereas the partition constraint does not
   allow it.  It's hard to reproduce this currently, because with
   different ordering of attributes, predicate_refute_by() never returns
   true (as mentioned in 1 above), so skip_validate does not need to be
   set to false again.

Consider this example:

create table p (a int, b char) partition by list (a);

create table p1 (b char not null, a int check (a in (1)));
insert into p1 values ('b', null);

Note that not_null_attrs for p1 will contain 1 corresponding to column b,
which matches key attribute of the parent, that is 1, corresponding to
column a.  Hence we end up wrongly concluding that p1's partition key
column does not allow nulls.

> I think this code could be removed entirely in light of commit
> 3ec76ff1f2cf52e9b900349957b42d28128b7bc7.

I am assuming you think that because now we emit IS NOT NULL constraint
internally for any partition keys that do not allow null values (including
all the keys of range partitions as of commit
3ec76ff1f2cf52e9b900349957b42d28128b7bc7).  But those IS NOT NULL
constraint expressions are inconclusive as far as the application of
predicate_implied_by() to determine if we can skip the scan is concerned.
So even if predicate_implied_by() returned 'true', we cannot conclude,
just based on that result, that there are not any null values in the
partition keys.

The code in question is there to check if there are explicit NOT NULL
constraints on the partition keys.  It cannot be true for expression keys,
so we give up on skip_validate in that case anyway.  But if 1) there are
no expression keys, 2) all the partition key columns of the table being
attached have NOT NULL constraint, and 3) predicate_implied_by() returned
'true', we can skip the scan.

Thoughts?

I am working on a patch to fix the above mentioned issues and will post
the same no later than Friday.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to