Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-11-12 Thread Andreas Karlsson

On 11/10/2017 01:47 AM, Mark Rofail wrote:

I am sorry for the late reply

There is no reason for you to be. It did not take you 6 weeks to do a 
review. :) Thanks for this new version.

== Functional review

 >1) MATCH FULL does not seem to care about NULLS in arrays. In the
example below I expected both inserts into the referring table to fail.

It seems in your example the only failed case was: INSERT INTO fk VALUES 
(NULL, '{1}');

which shouldn't work, can you clarify this?

I think that if you use MATCH FULL the query should fail if you have a 
NULL in the array.

 >2) To me it was not obvious that ON DELETE CASCADE would delete
the whole rows rather than delete the members from the array, and
this kind of misunderstanding can lead to pretty bad surprises in
production. I am leaning towards not supporting CASCADE.

I would say so too, maybe we should remove ON DELETE CASCADE until we 
have supported all remaining actions.

I am leaning towards this too. I would personally be fine with a first 
version without support for CASCADE since it is not obvious to me what 
CASCADE should do.

== The @>> operator
I would argue that allocating an array of datums and building an array 
would have the same complexity

I am not sure what you mean here. Just because something has the same 
complexity does not mean there can't be major performance differences.

== Code review

 >I think the code in RI_Initial_Check() would be cleaner if you
used "CROSS JOIN LATERAL unnest(col)" rather than having unnest() in
the target list. This way you would not need to rename all columns
and the code paths for the array case could look more like the code
path for the normal case.

Can you clarify what you mean a bit more?

I think the code would look cleaner if you generate the following query:

SELECT fk.x, fk.ys FROM ONLY t2 fk CROSS JOIN LATERAL 
pg_catalog.unnest(ys) a2 (v) LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.x 
AND pk.y = a2.v WHERE [...]

rather than:

SELECT fk.k1, fk.ak2 FROM (SELECT x k1, pg_catalog.unnest(ys) k2, ys ak2 
FROM ONLY t2) fk LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.k1 AND pk.y = 
fk.k2 WHERE [...]

= New stuff

When applying the patch I got some white space warnings:

Array-ELEMENT-foreign-key-v5.3.patch:1343: space before tab in indent, 
indent with spaces.

format_type_be(oprleft), 
format_type_be(oprright;
Array-ELEMENT-foreign-key-v5.3.patch:1345: trailing whitespace.

When compiling I got an error:

ri_triggers.c: In function ‘ri_GenerateQual’:
ri_triggers.c:2693:19: error: unknown type name ‘d’
   Oid   oprcommon;d
   ^
ri_triggers.c:2700:3: error: conflicting types for ‘oprright’
   oprright = get_array_type(operform->oprleft);
   ^~~~
ri_triggers.c:2691:9: note: previous declaration of ‘oprright’ was here
   Oid   oprright;
 ^~~~
<builtin>: recipe for target 'ri_triggers.o' failed

When building the documentation I got two warnings:

/usr/bin/osx:catalogs.sgml:2349:17:W: empty end-tag
/usr/bin/osx:catalogs.sgml:2350:17:W: empty end-tag

When running the tests I got a failure in element_foreign_key.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-10-29 Thread Andreas Karlsson


Sorry for the very late review.

I like this feature and have needed it myself in the past, and the 
current syntax seems pretty good. One could argue about whether the 
syntax could be generalized to support other things like json and 
hstore, but I do not think it would be fair to block this patch due to 
that.


== Limitations of the current design

1) Array element foreign keys can only be specified at the table level 
(not at columns): I think this limitation is fine. Other PostgreSQL 
specific features like exclusion constraints can also only be specified 
at the table level.


2) Lack of support for SET NULL and SET DEFAULT: these do not seem very 
useful for arrays.


3) Lack of support for specifying multiple arrays in the foreign key: 
seems like a good thing to me since it is not obvious what such a thing 
even would do.


4) That you need to add a cast to the index if you have different types: 
since there is no int4[] <@ int2[] operator, you need to add an index on 
(col::int4[]) to speed up deletes and updates. This one is annoying since 
EXPLAIN won't give you the query plans for the foreign key queries, but I 
do not think fixing this should be within the scope of the patch, and 
having a smaller integer in the referring table is rare.


5) The use of count(DISTINCT) requiring types to support btree equality: 
this has been discussed a lot up-thread and I think the current state is 
good enough.


== Functional review

I have played around some with it and things seem to work and the test 
suite passes, but I noticed a couple of strange behaviors.


1) MATCH FULL does not seem to care about NULLS in arrays. In the 
example below I expected both inserts into the referring table to fail.


CREATE TABLE t (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE fk (x int, ys int[], FOREIGN KEY (x, EACH ELEMENT OF ys) 
REFERENCES t MATCH FULL);

INSERT INTO t VALUES (10, 1);
INSERT INTO fk VALUES (10, '{1,NULL}');
INSERT INTO fk VALUES (NULL, '{1}');

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
ERROR:  insert or update on table "fk" violates foreign key constraint 
"fk_x_fkey"

DETAIL:  MATCH FULL does not allow mixing of null and nonnull key values.
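
To make the expected behaviour concrete, here is a toy Python model (not 
PostgreSQL code) of the MATCH FULL rule as I read it. It assumes that a 
NULL inside the array counts as a null key value, which is exactly the 
point under discussion; the function name match_full_ok is made up for 
this sketch.

```python
def match_full_ok(scalar_keys, array_key):
    """Toy model of the MATCH FULL null rule for a key with one array part.

    Assumption (not confirmed by the patch): a NULL (None) element inside
    the array counts as a null key value, just like a NULL scalar column.
    Returns True when the key is either entirely null or entirely non-null.
    """
    parts = list(scalar_keys)
    if array_key is None:
        parts.append(None)          # a NULL array is one null key value
    else:
        parts.extend(array_key)     # every element is a key value
    nulls = [p is None for p in parts]
    return all(nulls) or not any(nulls)


# The two inserts from the example above, plus a fully non-null key.
print(match_full_ok([10], [1]))        # no nulls at all
print(match_full_ok([10], [1, None]))  # null hidden inside the array
print(match_full_ok([None], [1]))      # null scalar, non-null array
```

Under this reading both inserts into fk mix null and non-null key 
values, so both would be rejected.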

2) To me it was not obvious that ON DELETE CASCADE would delete the 
whole rows rather than delete the members from the array, and this kind 
of misunderstanding can lead to pretty bad surprises in production. I am 
leaning towards not supporting CASCADE.


== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray" 
operator to avoid having to build arrays, but that part was reverted due 
to a bug.


I am not an expert on the gin code, but as far as I can tell it would be 
relatively simple to fix that bug. Just allocate an array of Datums of 
length one where you put the element you are searching for (or maybe a 
copy of it).


Potential issues with adding the operators:

1) Do we really want to add an operator just for array element foreign 
keys? I think this is not an issue since it seems like it should be 
useful in general. I know I have wanted it myself.


2) I am not sure, but the committers might prefer if adding the 
operators is done in a separate patch.


3) Bikeshedding about operator names. I personally think @>> is clear 
enough and as far as I know it is not used for anything else.


== Code review

The patch no longer applies to HEAD, but the conflicts are small.

I think we should be more consistent in the naming, both in code and in 
the documentation. Right now we have "array foreign keys", "element 
foreign keys", "ELEMENT foreign keys", etc.


+   /*
+* If this is an array foreign key, we must look up the operators for
+* the array element type, not the array type itself.
+*/
+   if (fkreftypes[i] != FKCONSTR_REF_PLAIN)

+   if (fkreftypes[i] != FKCONSTR_REF_PLAIN)
+   {
+   old_fktype = get_base_element_type(old_fktype);
+   /* this shouldn't happen ... */
+   if (!OidIsValid(old_fktype))
+   elog(ERROR, "old foreign key column is not an array");
+   }

+   if (riinfo->fk_reftypes[i] != FKCONSTR_REF_PLAIN)
+   {
+   riinfo->has_array = true;
+   riinfo->ff_eq_oprs[i] = ARRAY_EQ_OP;
+   }

In the three diffs above it would be much cleaner to check for "== 
FKCONSTR_REF_EACH_ELEMENT" since that better conveys the intent and is 
safer for adding new types in the future.


+   /* We look through any domain here */
+   fktype = get_base_element_type(fktype);

What does the comment above mean?

if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
-errmsg("foreign key constraint \"%s\" "
-   "cannot be implemented",
-   fkconstraint->conname),
-errdetail("Key columns \"%s\" and

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-09-17 Thread Andreas Karlsson

I have not looked at the issue with the btree_gin tests yet, but here is 
the first part of my review.


= Review

This is my first quick review where I just read the documentation and 
quickly tested the feature. I will review it more in-depth later.


This is a very useful feature, one which I have wished for for a long time.

The patch applies, compiles and passes the test suite with just one warning.

parse_coerce.c: In function ‘select_common_type_2args’:
parse_coerce.c:1379:7: warning: statement with no effect [-Wunused-value]
   rightOid;
   ^~~~

= Functional

The documentation does not agree with the code on the syntax. The 
documentation claims it is "FOREIGN KEY (ELEMENT xs) REFERENCES t1 (x)" 
when it actually is "FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)".


Likewise I can't get the "final_positions integer[] ELEMENT REFERENCES 
drivers" syntax to work, but here I cannot see any change in the syntax 
to support it.


Related to the above: I am not sure if it is a good idea to make ELEMENT 
a reserved word in column definitions. What if the SQL standard wants to 
use it for something?


The documentation claims ON DELETE CASCADE is not supported by array 
element foreign keys, but I do not think that is actually the case.


I think I prefer (EACH ELEMENT OF xs) over (ELEMENT xs) given how the 
former is more in what I feel is the spirit of SQL. And if so we should 
match it as "xs integer[] EACH ELEMENT REFERENCES t1 (x)", assuming we 
want that syntax.


Once I have created an array element foreign key the basic features seem 
to work as expected.


The error message below fails to mention that it is an array element 
foreign key, but I do not think that is a blocker for getting this 
feature merged. Right now I cannot think of how to improve it either.


$ INSERT INTO t3 VALUES ('{1,3}');
ERROR:  insert or update on table "t3" violates foreign key constraint 
"t3_xs_fkey"

DETAIL:  Key (xs)=({1,3}) is not present in table "t1".

= Nitpicking/style comments

In doc/src/sgml/catalogs.sgml the "conpfeqop" line is incorrectly 
indented.


I am not a fan of calling it "array-vs-scalar". What about array to scalar?

In ddl.sgml "date" should be lower case like the other types in "race_day 
DATE,".


In ddl.sgml I suggest removing the "..." from the examples to make it 
possible to copy paste them easily.


Your text wrapping in ddl.sgml and create_table.sgml is quite 
arbitrary. I suggest wrapping all paragraphs at 80 characters (except 
for code which should not be wrapped). Your text editor probably has 
tools for wrapping paragraphs.


Please be consistent about how you write table names and SQL in general. 
I think almost all places use lower case for table names, while your 
examples in create_table.sgml are FKTABLEFORARRAY.


Andreas



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-19 Thread Mark Rofail

I have a concern that after supporting UPDATE/DELETE CASCADE, the
performance would drop.

On Thu, Jul 27, 2017 at 12:54 PM, Alexander Korotkov wrote:
>
> I wonder how may RI trigger work so fast if it has to do some job besides
> index search with no results?
>

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-14 Thread Tom Lane

Alexander Korotkov  writes:
> On Mon, Aug 14, 2017 at 2:09 PM, Mark Rofail  wrote:
>> I think we should cast the operands in the RI queries fired as follows
>> 1. we get the array type from the right operand
>> 2. compare the two array type and see which type is more "general" (as to
>> which should be cast to which, int2 should be cast to int4, since casting
>> int4 to int2 could lead to data loss). This can be done by seeing which Oid
>> is larger numerically since, coincidentally, they are declared in this way
>> in pg_type.h.

> I'm not sure numerical comparison of Oids is a good idea.

I absolutely, positively guarantee that a patch written that way will be
rejected.

> Should we instead use logic similar to select_common_type() and underlying
> functions?

Right.  What we typically do in cases like this is check to see if there
is an implicit coercion available in one direction but not the other.
I don't know if you can use select_common_type() directly, but it would
be worth looking at.
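
The "implicit coercion in one direction but not the other" test can be 
sketched as a toy model in Python. This is not select_common_type() and 
does not touch the catalogs: the cast table is a small hand-written 
subset used only for illustration, and the helper names are invented for 
this sketch.

```python
# Toy subset of implicit casts, written as (source, target) pairs.
# Assumption: this hand-written set stands in for a real catalog lookup.
IMPLICIT_CASTS = {
    ("int2", "int4"), ("int2", "int8"), ("int4", "int8"),
    ("int2", "numeric"), ("int4", "numeric"), ("int8", "numeric"),
}


def implicitly_castable(source, target):
    """True if the toy table lists an implicit cast from source to target."""
    return (source, target) in IMPLICIT_CASTS


def pick_common_type(left, right):
    """Pick the type that the other side can be implicitly cast to.

    Returns None when there is no cast, or when casts exist both ways,
    because then the direction is not decided by this rule alone.
    """
    if left == right:
        return left
    left_to_right = implicitly_castable(left, right)
    right_to_left = implicitly_castable(right, left)
    if left_to_right and not right_to_left:
        return right
    if right_to_left and not left_to_right:
        return left
    return None


print(pick_common_type("int2", "int4"))
print(pick_common_type("int4", "text"))
```

Unlike comparing Oids numerically, the answer here depends only on which 
casts are declared, so it behaves the same for user-defined types 
regardless of creation order.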

Also, given that the context here is RI constraints, what you're really
worried about is whether the referenced column's uniqueness constraint
is associated with compatible operators, so looking into its operator
class for relevant operators might be the right way to think about it.
I wrote something just very recently that touches on that ... ah,
here it is:
https://www.postgresql.org/message-id/13220.1502376...@sss.pgh.pa.us

regards, tom lane


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-14 Thread Alexander Korotkov

On Mon, Aug 14, 2017 at 2:09 PM, Mark Rofail  wrote:

> On Tue, Aug 8, 2017 at 3:24 PM, Alexander Korotkov 
> wrote:
>
>> On Tue, Aug 8, 2017 at 4:12 PM, Mark Rofail 
>> wrote:
>>
>>> On Tue, Aug 8, 2017 at 2:25 PM, Alexander Korotkov wrote:
>>>
>>>> GROUP BY would also use default btree/hash opclass for element type.  It
>>>> doesn't differ from DISTINCT from that point.

>>> Then there's no going around this limitation,
>>>
>> It seems so.
>>
>
> Since for now, the limitation
>
>> ✗ presupposes that count(distinct y) has exactly the same notion of
>> equality that the PK unique index has. In reality, count(distinct) will
>> fall back to the default btree opclass for the array element type.
>
> is unavoidable.
>
> I started to look at the next one on the list.
>
>> ✗ coercion is unsupported, i.e. a numeric can't reference int8
>
>
> The limitation in short.
>
> #= CREATE TABLE PKTABLEFORARRAY ( ptest1 int4 PRIMARY KEY, ptest2 text );
> #= CREATE TABLE FKTABLEFORARRAY ( ftest1 int2[], FOREIGN KEY (EACH ELEMENT
> OF ftest1) REFERENCES PKTABLEFORARRAY, ftest2 int );
>
> should be accepted but this produces the following error
> operator does not exist: integer[] @> smallint
>
> The algorithm I propose:
> I don't think it's easy to modify the @>> operator as we discussed here.
> 
>
> I think we should cast the operands in the RI queries fired as follows
> 1. we get the array type from the right operand
> 2. compare the two array type and see which type is more "general" (as to
> which should be cast to which, int2 should be cast to int4, since casting
> int4 to int2 could lead to data loss). This can be done by seeing which Oid
> is larger numerically since, coincidentally, they are declared in this way
> in pg_type.h.
>

I'm not sure numerical comparison of Oids is a good idea.  AFAIK, any
regularity in Oid assignment is a coincidence...  Also, consider
user-defined data types: their Oids depend on the order of their creation.
Should we instead use logic similar to select_common_type() and underlying
functions?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-08 Thread Alexander Korotkov

On Tue, Aug 8, 2017 at 4:12 PM, Mark Rofail  wrote:

> On Tue, Aug 8, 2017 at 2:25 PM, Alexander Korotkov 
> wrote:
>>
>> Do we already assume that default btree opclass for array element type
>> matches PK opclass when using @>> operator on UPDATE/DELETE of referenced
>> table?
>>
> I believe so, since it's a polymorphic function.
>
>
>> If so, we don't introduce additional restriction here...
>>
> You mean to remove the wrapper query ?
>

I think we should choose the query which would be better planned (and
presumably faster executed).  You can make some experiments and then choose
the query.


>> GROUP BY would also use default btree/hash opclass for element type.  It
>> doesn't differ from DISTINCT from that point.
>>
> Then there's no going around this limitation,
>

It seems so.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-08 Thread Mark Rofail

On Tue, Aug 8, 2017 at 2:25 PM, Alexander Korotkov 
wrote:
>
> Do we already assume that default btree opclass for array element type
> matches PK opclass when using @>> operator on UPDATE/DELETE of referenced
> table?
>
I believe so, since it's a polymorphic function.


> If so, we don't introduce additional restriction here...
>
You mean to remove the wrapper query ?


> GROUP BY would also use default btree/hash opclass for element type.  It
> doesn't differ from DISTINCT from that point.
>
Then there's no going around this limitation,

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-08 Thread Alexander Korotkov

On Sat, Aug 5, 2017 at 11:36 PM, Mark Rofail  wrote:

> This is the query fired upon any UPDATE/DELETE for RI checks:
>
> SELECT 1 FROM ONLY  x WHERE pkatt1 = $1 [AND ...] FOR KEY SHARE
> OF x
>
> in  the case of foreign key arrays, it's wrapped in this query:
>
> SELECT 1 WHERE
> (SELECT count(DISTINCT y) FROM unnest($1) y)
> = (SELECT count(*) FROM () z)
>
> This is where the limitation appears, the DISTINCT keyword. Since in
> reality, count(DISTINCT) will fall back to the default btree opclass for
> the array element type regardless of the opclass indicated in the access
> method. Thus I believe going around DISTINCT is the way to go.
>

Do we already assume that default btree opclass for array element type
matches PK opclass when using @>> operator on UPDATE/DELETE of referenced
table?
If so, we don't introduce additional restriction here...


> This is what I came up with:
>
> SELECT 1 WHERE
> (SELECT COUNT(*)
> FROM
> (
> SELECT y
> FROM unnest($1) y
> GROUP BY y
> )
> )
> = (SELECT count(*) () z)
>
> I understand there might be some syntax errors but this is just a proof of
> concept.
>

GROUP BY would also use default btree/hash opclass for element type.  It
doesn't differ from DISTINCT from that point.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-05 Thread Mark Rofail

This is the query fired upon any UPDATE/DELETE for RI checks:

SELECT 1 FROM ONLY  x WHERE pkatt1 = $1 [AND ...] FOR KEY SHARE OF
x

in  the case of foreign key arrays, it's wrapped in this query:

SELECT 1 WHERE
(SELECT count(DISTINCT y) FROM unnest($1) y)
= (SELECT count(*) FROM () z)

This is where the limitation appears, the DISTINCT keyword. Since in
reality, count(DISTINCT) will fall back to the default btree opclass for
the array element type regardless of the opclass indicated in the access
method. Thus I believe going around DISTINCT is the way to go.
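
The mismatch can be modelled outside PostgreSQL with a toy Python 
sketch. It is only an illustration under stated assumptions: the PK 
"index" is imagined to use a case-insensitive equality, while the 
distinct count uses plain (default) equality. The helper names ri_check 
and ci_equal are invented for this sketch.

```python
def ci_equal(a, b):
    """Stand-in for a non-default opclass equality: case-insensitive."""
    return a.lower() == b.lower()


def ri_check(array_elems, pk_values, pk_equal, distinct_key):
    """Toy version of the wrapped RI query.

    Left side:  count(DISTINCT y) over the array, where distinct_key
                decides which elements count as the same value.
    Right side: number of PK rows matched by some array element, using
                the PK index's own notion of equality (pk_equal).
    """
    distinct_count = len({distinct_key(e) for e in array_elems})
    matched = sum(
        1 for pk in pk_values
        if any(pk_equal(pk, e) for e in array_elems)
    )
    return distinct_count == matched


pk_values = ["a"]
array_elems = ["a", "A"]   # equal to each other under ci_equal only

# Default equality for DISTINCT: 2 distinct values vs 1 matched PK row.
print(ri_check(array_elems, pk_values, ci_equal, lambda s: s))
# Same notion of equality on both sides: 1 vs 1.
print(ri_check(array_elems, pk_values, ci_equal, str.lower))
```

In the first call every element has a matching PK row, yet the counts 
disagree, which is the false violation the limitation describes.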

This is what I came up with:

SELECT 1 WHERE
(SELECT COUNT(*)
FROM
(
SELECT y
FROM unnest($1) y
GROUP BY y
)
)
= (SELECT count(*) () z)

I understand there might be some syntax errors but this is just a proof of
concept.

Is this the right way to go?
It's been a week and I don't think I made significant progress. Any
pointers?

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-03 Thread Mark Rofail

To better understand a limitation I ask 5 questions

What is the limitation?
Why is there a limitation?
Why is it a limitation?
What can we do?
Is it feasible?

Through some reading:

*What is the limitation?*
presupposes that count(distinct y) has exactly the same notion of equality
that the PK unique index has. In reality, count(distinct) will fall back to
the default btree opclass for the array element type.

the planner may choose an optimization of this sort when the index's
opclass matches the one
DISTINCT will use, ie the default for the data type.

*Why is there a limitation?*
necessary because ri_triggers.c relies on COUNT(DISTINCT x) on the element
type, as well as on array_eq() on the array type, and we need those
operations to have the same notion of equality that we're using otherwise.

*Why is it a limitation?*
That's wrong: DISTINCT should use the equality operator that corresponds
to the index' operator class instead, not the default one.

*What can we do ?*
I'm sure that we can replace array_eq() with a newer polymorphic version
but I don't know how we could get around COUNT(DISTINCT x)

*Is it feasible? *
I don't think I have the experience to answer that

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-31 Thread Mark Rofail

On Mon, Jul 31, 2017 at 5:18 PM, Alvaro Herrera 
wrote:

> Tom Lane wrote:
> > Alvaro Herrera  writes:
> > > ...  However, when you create an index, you can
> > > indicate which operator class to use, and it may not be the default
> one.
> > > If a different one is chosen at index creation time, then a query using
> > > COUNT(distinct) will do the wrong thing, because DISTINCT will select
> > > an equality type using the type's default operator class, not the
> > > equality that belongs to the operator class used to create the index.
> >
> > > That's wrong: DISTINCT should use the equality operator that
> corresponds
> > > to the index' operator class instead, not the default one.
> >
> > Uh, what?  Surely the semantics of count(distinct x) *must not* vary
> > depending on what indexes happen to be available.
>
> Err ...
>
> > I think what you meant to say is that the planner may only choose an
> > optimization of this sort when the index's opclass matches the one
> > DISTINCT will use, ie the default for the data type.


I understand the problem. I am currently researching how to resolve it.

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-31 Thread Alvaro Herrera

Tom Lane wrote:
> Alvaro Herrera  writes:
> > ...  However, when you create an index, you can
> > indicate which operator class to use, and it may not be the default one.
> > If a different one is chosen at index creation time, then a query using
> > COUNT(distinct) will do the wrong thing, because DISTINCT will select
> > an equality type using the type's default operator class, not the
> > equality that belongs to the operator class used to create the index.
> 
> > That's wrong: DISTINCT should use the equality operator that corresponds
> > to the index' operator class instead, not the default one.
> 
> Uh, what?  Surely the semantics of count(distinct x) *must not* vary
> depending on what indexes happen to be available.

Err ...

> I think what you meant to say is that the planner may only choose an
> optimization of this sort when the index's opclass matches the one
> DISTINCT will use, ie the default for the data type.

Um, yeah, absolutely.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-29 Thread Tom Lane

Alvaro Herrera  writes:
> ...  However, when you create an index, you can
> indicate which operator class to use, and it may not be the default one.
> If a different one is chosen at index creation time, then a query using
> COUNT(distinct) will do the wrong thing, because DISTINCT will select
> an equality type using the type's default operator class, not the
> equality that belongs to the operator class used to create the index.

> That's wrong: DISTINCT should use the equality operator that corresponds
> to the index' operator class instead, not the default one.

Uh, what?  Surely the semantics of count(distinct x) *must not* vary
depending on what indexes happen to be available.

I think what you meant to say is that the planner may only choose an
optimization of this sort when the index's opclass matches the one
DISTINCT will use, ie the default for the data type.

regards, tom lane



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-29 Thread Alvaro Herrera

Mark Rofail wrote:
> These are limitations of the patch ordered by importance:
> 
> ✗ presupposes that count(distinct y) has exactly the same notion of
> equality that the PK unique index has. In reality, count(distinct) will
> fall back to the default btree opclass for the array element type.

Operators are classified in operator classes; each data type may have
more than one operator class for a particular access method.  Exactly
one operator class for some access method can be designated as the
default one for a type.  However, when you create an index, you can
indicate which operator class to use, and it may not be the default one.
If a different one is chosen at index creation time, then a query using
COUNT(distinct) will do the wrong thing, because DISTINCT will select
an equality type using the type's default operator class, not the
equality that belongs to the operator class used to create the index.

That's wrong: DISTINCT should use the equality operator that corresponds
to the index' operator class instead, not the default one.

I hope that made sense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-29 Thread Mark Rofail

These are limitations of the patch ordered by importance:

✗ presupposes that count(distinct y) has exactly the same notion of
equality that the PK unique index has. In reality, count(distinct) will
fall back to the default btree opclass for the array element type.

- Supported actions:
 ✔ NO ACTION
 ✔ RESTRICT
 ✗ CASCADE
 ✗ SET NULL
 ✗ SET DEFAULT

✗ coercion is unsupported, i.e. a numeric can't reference int8

✗ Only one "ELEMENT" column allowed in a multi-column key

✗ undesirable dependency on default opclass semantics in the patch, which
is that it supposes it can use array_eq() to detect whether or not the
referencing column has changed.  But I think that can be fixed without
undue pain by providing a refactored version of array_eq() that can be told
which element-comparison function to use

✗ cross-type FKs are unsupported

-- Resolved limitations --

✔ fatal performance issues.  If you issue any UPDATE or DELETE against the
PK table, you get a query like this for checking to see if the RI
constraint would be violated:
SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;
/* Changed into SELECT 1 FROM ONLY fktable x WHERE $1 @> fkcol FOR SHARE OF
x; */

-- 

Can someone help me understand the first limitation?

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-28 Thread Mark Rofail

On Fri, Jul 28, 2017 at 1:19 PM, Erik Rijkers  wrote:

> One small thing while building docs:
>
> $  cd doc/src/sgml && make html
> osx -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -x lower
> postgres.sgml >postgres.xml.tmp
> osx:ref/create_table.sgml:960:100:E: document type does not allow element
> "VARLISTENTRY" here
> Makefile:147: recipe for target 'postgres.xml' failed
> make: *** [postgres.xml] Error 1
>

I will work on it.

How's the rest of the patch ?

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-28 Thread Erik Rijkers


On 2017-07-27 21:08, Mark Rofail wrote:

On Thu, Jul 27, 2017 at 7:15 PM, Erik Rijkers  wrote:


It would help (me at least) if you could be more explicit about what
exactly each instance is.



I apologize, I thought it was clear through the context.


Thanks a lot.  It's just really easy for testers like me that aren't 
following a thread too closely and just snatch a half hour here and 
there to look into a feature/patch.



One small thing while building docs:

$  cd doc/src/sgml && make html
osx -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -x lower 
postgres.sgml >postgres.xml.tmp
osx:ref/create_table.sgml:960:100:E: document type does not allow 
element "VARLISTENTRY" here

Makefile:147: recipe for target 'postgres.xml' failed
make: *** [postgres.xml] Error 1

(Debian 8/jessie)


thanks,


Erik Rijkers




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail

On Thu, Jul 27, 2017 at 7:30 PM, Alexander Korotkov 
wrote:

> Oh, ok.  I missed that.
>>
> Could you remind me why don't we have DELETE CASCADE?  I understand that
> UPDATE CASCADE is problematic because it's unclear which way should we
> delete elements from array.  But what about DELETE CASCADE?
>

Honestly, I didn't touch that part of the patch. It's very interesting
though, I think it would be great to spend the rest of GSoC in it.

Off the top of my head though, there are many ways to go about DELETE
CASCADE. You could delete only the member of the referencing array, or the
whole array. I think there are a lot of options the user might want to
consider and it's hard to generalize to DELETE CASCADE. Maybe new grammar
would be introduced here?
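
The two readings can be compared with a toy Python sketch. It is not 
PostgreSQL code and not what the patch implements: the row layout and 
the two function names are made up purely to show how differently the 
same DELETE on the PK side could cascade.

```python
def cascade_delete_rows(fk_rows, deleted_key):
    """Reading 1: drop every referencing row whose array mentions the key."""
    return [row for row in fk_rows if deleted_key not in row["ys"]]


def cascade_remove_elements(fk_rows, deleted_key):
    """Reading 2: keep every row but strip the key from its array."""
    return [
        {**row, "ys": [y for y in row["ys"] if y != deleted_key]}
        for row in fk_rows
    ]


# Hypothetical referencing table: three rows with an array column "ys".
fk_rows = [
    {"id": 1, "ys": [1, 2]},
    {"id": 2, "ys": [2, 3]},
    {"id": 3, "ys": [3]},
]

# Deleting PK value 2 under each reading.
print(cascade_delete_rows(fk_rows, 2))
print(cascade_remove_elements(fk_rows, 2))
```

The first reading loses two whole rows while the second keeps all three, 
which is why the choice probably needs to be explicit in the grammar.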

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail

On Thu, Jul 27, 2017 at 7:15 PM, Erik Rijkers  wrote:

> It would help (me at least) if you could be more explicit about what
> exactly each instance is.
>

I apologize, I thought it was clear through the context.

By the original patch I meant all the work done before my GSoC project,
the latest version of which was submitted by Tom Lane[1] and rebased here[2].

The new patch is the latest one submitted by me[3].

And the new patch with index is the same[3], but with a GIN index built
over it. CREATE INDEX ON fktableforarray USING gin (fktest array_ops);

[1] https://www.postgresql.org/message-id/28617.1351095...@sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/CAJvoCutcMEYNFYK8Hdiui-M2y0ZGg%3DBe17fHgQ%3D8nHexZ6ft7w%40mail.gmail.com
[3]
https://www.postgresql.org/message-id/CAJvoCuuoGo5zJTpmPm90doYTUWoeUc%2BONXK2%2BH_vxsi%2BZi09bQ%40mail.gmail.com

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Alexander Korotkov

On Thu, Jul 27, 2017 at 3:07 PM, Mark Rofail  wrote:

> On Thu, Jul 27, 2017 at 12:54 PM, Alexander Korotkov wrote:
>>
>> How many rows of FK table were referencing the PK table row you're
>> updating/deleting.
>> I wonder how may RI trigger work so fast if it has to do some job besides
>> index search with no results?
>>
> The problem here is that the only two options for foreign key arrays are
> NO ACTION and RESTRICT, which don't allow me to update/delete a referenced
> row in the PK table. EXPLAIN ANALYZE only tells me that this violates
> the FK constraint.
>
> So we have two options: either implement CASCADE, or find out whether
> there is a configuration for EXPLAIN that shows costs even if the
> statement violates the FK constraint.
>

Oh, ok.  I missed that.
Could you remind me why don't we have DELETE CASCADE?  I understand that
UPDATE CASCADE is problematic because it's unclear which way should we
delete elements from array.  But what about DELETE CASCADE?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Erik Rijkers


On 2017-07-27 02:31, Mark Rofail wrote:

> I have written some benchmark test.

It would help (me at least) if you could be more explicit about what 
exactly each instance is.


Apparently there is an 'original patch': is this the original patch by 
Marco Nenciarini?

Or is it something you posted earlier?

I guess it could be distilled from the earlier posts but when I looked 
those over yesterday evening I still didn't get it.


A link to the post where the 'original patch' is would be ideal...

thanks!

Erik Rijkers



> With two tables a PK table with 5 rows and an FK table with growing row
> count.
>
> Once triggering an RI check
> at 10 rows,
> 100 rows,
> 1,000 rows,
> 10,000 rows,
> 100,000 rows and
> 1,000,000 rows
>
> Please find the graph with the findings attached below


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail

On Thu, Jul 27, 2017 at 12:54 PM, Alexander Korotkov 
wrote:
>
> How many rows of the FK table were referencing the PK table row you're
> updating/deleting?
> I wonder how the RI trigger can work so fast if it has to do some job
> besides an index search with no results?
>
The problem here is that the only two options for the foreign key arrays are
NO ACTION and RESTRICT, which don't allow me to update/delete a referenced
row in the PK table. The EXPLAIN ANALYZE only tells me that this violates
the FK constraint.

So we have two options: either implement CASCADE or, if there is one, use a
configuration for EXPLAIN that shows costs even if the statement violates
the FK constraints.


> I think we should also vary the number of referencing rows.
>
The x axis is the number of referencing rows in the FK table.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Alexander Korotkov

On Thu, Jul 27, 2017 at 3:31 AM, Mark Rofail  wrote:

> I have written some benchmark test.
>
> With two tables a PK table with 5 rows and an FK table with growing row
> count.
>
> Once triggering an RI check
> at 10 rows,
> 100 rows,
> 1,000 rows,
> 10,000 rows,
> 100,000 rows and
> 1,000,000 rows
>

How many rows of the FK table were referencing the PK table row you're
updating/deleting?
I wonder how the RI trigger can work so fast if it has to do some job
besides an index search with no results?
I think we should also vary the number of referencing rows.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Erik Rijkers


On 2017-07-24 23:31, Mark Rofail wrote:

> On Mon, Jul 24, 2017 at 11:25 PM, Erik Rijkers wrote:
>> This patch doesn't apply to HEAD at the moment ( e2c8100e6072936 ).
>
> My bad, I should have mentioned that the patch is dependent on the
> original patch.
> Here is a *unified* patch that I just tested.


Thanks.  Apply is now good, but I get this error when compiling:

'ELEMENT' not present in UNRESERVED_KEYWORD section of gram.y
make[4]: *** [gram.c] Error 1
make[3]: *** [parser/gram.h] Error 2
make[2]: *** [../../src/include/parser/gram.h] Error 2
make[1]: *** [all-common-recurse] Error 2
make: *** [all-src-recurse] Error 2






Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Erik Rijkers


On 2017-07-24 23:08, Mark Rofail wrote:
> Here is the new Patch with the bug fixes and the New Patch with the Index
> in place performance results.
>
> I just want to point this out because I still can't believe the numbers. In
> reference to the old patch:
> The new patch without the index suffers a 41.68% slow down, while the new
> patch with the index has a 95.18% speed up!
>
> [elemOperatorV4.patch]


This patch doesn't apply to HEAD at the moment ( e2c8100e6072936 ).

Can you have a look?

thanks,

Erik Rijkers




patching file doc/src/sgml/ref/create_table.sgml
Hunk #1 succeeded at 816 with fuzz 3.
patching file src/backend/access/gin/ginarrayproc.c
patching file src/backend/utils/adt/arrayfuncs.c
patching file src/backend/utils/adt/ri_triggers.c
Hunk #1 FAILED at 2650.
Hunk #2 FAILED at 2694.
2 out of 2 hunks FAILED -- saving rejects to file src/backend/utils/adt/ri_triggers.c.rej

patching file src/include/catalog/pg_amop.h
patching file src/include/catalog/pg_operator.h
patching file src/include/catalog/pg_proc.h
patching file src/test/regress/expected/arrays.out
patching file src/test/regress/expected/opr_sanity.out
patching file src/test/regress/sql/arrays.sql




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Mark Rofail

It certainly is, thank you for the heads up. I included a note to encourage
the user to index the referencing column instead.

On Sun, Jul 23, 2017 at 4:41 AM, Robert Haas  wrote:
>
> This is a jumbo king-sized can of worms, and even a very experienced
> contributor would likely find it extremely difficult to sort all of
> the problems that would result from a change in this area.

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Mark Rofail

>
> However, there is a bug that prevented me from testing the third scenario.
> I assume there's an incompatible-types problem, since the right
> operand type is anyelement and the supporting procedures expect anyarray.
> I am working on debugging it right now.
>

I have also solved the bug that prevented me from performance testing the
New Patch with the Index in place.

Here is a summary of the results:

A-  Original Patch
DELETE Average Execution time = 3.508 ms
UPDATE Average Execution time = 3.239 ms

B- New Patch
DELETE Average Execution time = 4.970 ms
UPDATE Average Execution time = 4.170 ms

C- With Index
DELETE Average Execution time = 0.169 ms
UPDATE Average Execution time = 0.147 ms
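
The percentages quoted elsewhere in this thread (a 41.68% slowdown and a
95.18% speedup) can be reproduced from the DELETE averages above. A minimal
sketch in Python, not part of the patch; the helper name relative_change is
made up for illustration:

```python
# Average execution times in ms, copied from the summary above.
timings = {
    "original":   {"DELETE": 3.508, "UPDATE": 3.239},
    "new":        {"DELETE": 4.970, "UPDATE": 4.170},
    "with_index": {"DELETE": 0.169, "UPDATE": 0.147},
}


def relative_change(baseline, measured):
    """Percentage change of measured against baseline; negative means faster."""
    return (measured - baseline) / baseline * 100.0


for op in ("DELETE", "UPDATE"):
    base = timings["original"][op]
    for variant in ("new", "with_index"):
        change = relative_change(base, timings[variant][op])
        print(f"{op:6} {variant:10} {change:+.2f}%")
```

For DELETE this gives +41.68% for the new patch without the index and
-95.18% with it; the UPDATE figures work out to +28.74% and -95.46%.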

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-22 Thread Robert Haas

On Sat, Jul 22, 2017 at 5:50 PM, Mark Rofail  wrote:
> so personally I don't think we should leave creating a GIN index up to the
> user, it should be automatically generated instead.

I can certainly understand why you feel that way, but trying to do
that in your patch is just going to get your patch rejected.  We don't
want array foreign keys to have different behavior than regular
foreign keys, and regular foreign keys don't do this automatically.
We could change that, but I suspect it would cause us some pretty
serious problems with upgrades from older versions with the existing
behavior to newer versions with the revised behavior.

There are other problems, too.  Suppose the user creates the foreign
key and then drops the associated index; then, they run pg_dump.  Will
restoring the dump recreate the index?  If so, then you've broken
dump/restore, because now it doesn't actually recreate the original
state of the database.  You might think of fixing this by not letting
the index be dropped, but that's problematic too, because a
fairly-standard way of removing index bloat is to create a new index
with the "concurrently" flag and then drop the old one.  Another
problem entirely is that the auto-generated index will need to have an
auto-generated name, and that name might happen to conflict with the
name of some other object that already exists in the database, which
doesn't initially seem like a problem because you can just generate a
different name instead; indeed, we already do such things.  But the
thorny point is that you have to preserve whatever name you choose --
and the linkage to the array foreign key that caused it to be created
-- across a dump/restore cycle; otherwise you'll have cases where
conflicting names cause failures.  I doubt this is a comprehensive
list of things that might go wrong; it's intended as an illustrative
list, not an exhaustive one.

This is a jumbo king-sized can of worms, and even a very experienced
contributor would likely find it extremely difficult to sort all of
the problems that would result from a change in this area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-21 Thread Alexander Korotkov

On Wed, Jul 19, 2017 at 11:08 PM, Alvaro Herrera 
wrote:

> I'm not entirely sure what's the best way to deal with the polymorphic
> problem, but on the other hand as Robert says downthread maybe we
> shouldn't be solving it at this stage anyway.  So let's step back a bit,
> get a patch that works for the case where the types match on both sides
> of the FK, then we review that patch; if all is well, we can discuss the
> other problem as a stretch goal.


+1
Regular FK functionality has type restrictions based on btree opfamilies
and implicit casts.  Array FKs should not necessarily have the same type
restrictions.  Also, we don't necessarily need to make those restrictions as
soft as possible during this GSoC project.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail

On Wed, Jul 19, 2017 at 10:08 PM, Alvaro Herrera 
wrote:

> So let's step back a bit,
> get a patch that works for the case where the types match on both sides
> of the FK, then we review that patch; if all is well, we can discuss the
> other problem as a stretch goal.

Agreed. This should be a future improvement.

I think the next step should be testing the performance before/after the
modifications.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Alvaro Herrera

Mark Rofail wrote:
> On Tue, Jul 18, 2017 at 11:14 PM, Alvaro Herrera 
> wrote:
> >
> > Why did we add an operator and not a support
> > procedure?
> 
> I thought the support procedures were constant within an opclass.

Uhh ... I apologize but I think I was barking up the wrong tree.  I was
thinking that it mattered that the opclass mechanism was able to
determine whether some array @>> some element, but that's not true: it's
the queries in ri_triggers.c, which have no idea about opclasses.

(I think we would have wanted to use opclasses in order to find out
what operator to use in the first place, if ri_triggers.c was already
using that general idea; but in reality it's already using hardcoded
operator names, so it doesn't matter.)

I'm not entirely sure what's the best way to deal with the polymorphic
problem, but on the other hand as Robert says downthread maybe we
shouldn't be solving it at this stage anyway.  So let's step back a bit,
get a patch that works for the case where the types match on both sides
of the FK, then we review that patch; if all is well, we can discuss the
other problem as a stretch goal.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Robert Haas

On Wed, Jul 19, 2017 at 2:29 PM, Mark Rofail  wrote:
> On Wed, Jul 19, 2017 at 7:28 PM, Robert Haas  wrote:
>>
>> Why do we have to solve that limitation?
>
> Since the regress test labeled element_foreing_key fails now that I made the
> RI queries utilise @>(anyarray, anyelement), that means it's not functioning
> as it is meant to be.

Well, if this is a new test introduced by the patch, you could also
just change the test.  Off-hand, I'm not sure that it's very important
to make the case work where the types don't match between the
referenced table and the referencing table, which is what you seem to
be talking about here.  But maybe I'm misunderstanding the situation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail

On Wed, Jul 19, 2017 at 7:28 PM, Robert Haas  wrote:

> Why do we have to solve that limitation?


Since the regress test labeled element_foreing_key fails now that I made the
RI queries utilise @>(anyarray, anyelement), that means it's not functioning
as it is meant to be.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Robert Haas

On Wed, Jul 19, 2017 at 8:08 AM, Mark Rofail  wrote:
> To summarise, the options we have to solve the limitation of the @>(anyarray
> , anyelement) where it produces the following error: operator does not
> exist: integer[] @> smallint

Why do we have to solve that limitation?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail

*To summarise,* the options we have to solve the limitation of the
@>(anyarray , anyelement) where it produces the following error: operator
does not exist: integer[] @> smallint

*Option 1: *Multiple Operators
Have separate operators for every combination of datatypes instead of a
single polymorphic definition (i.e int4[] @>> int8, int4[] @>> int4, int4[]
@>> int2, int4[] @>> numeric.)

Drawback: High maintenance.


*Option 2: *Explicit casting
Where we compare the datatype of the 2 operands and cast with the
appropriate datatype

Drawback: figuring out the appropriate cast may require considerable
computation


*Option 3:* Unsafe Polymorphic datatypes
This is a little out there. But since the operands of @>(anyarray, anyelement)
have to resolve to the same datatype, how about defining new datatypes without
this constraint, where we handle the datatypes ourselves? It would be something
like @>(unsafeAnyarray, unsafeAnyelement).

Drawback: a lot of defensive programming has to be implemented to guard
against any exception.


*Another thing*
Until this is settled, another thing I have to go through is performance
testing. To provide evidence that all we did actually enhances the
performance of the RI checks. How can I go about this?

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail

On Tue, Jul 18, 2017 at 11:14 PM, Alvaro Herrera 
wrote:
>
> Why did we add an operator and not a support
> procedure?

I thought the support procedures were constant within an opclass. They
implement the mandatory functions required of an opclass. I don't see why we
would need to implement new ones, since they already deal with the left-hand
operand, which is the referencing column and is always an array, so anyarray
would suffice.

Also, the support procedures don't interact with the left and right operands
simultaneously. And we want to target the combinations int4[] @>> int8,
int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.

So I think implementing operators is the way to go.

Best Regards,
Mark Rofail.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alvaro Herrera

Alexander Korotkov wrote:

> The problem is that you need to have not only opclass entries for the
> operators, but also operators themselves.  I.e. separate operators for
> int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
> tried to add multiple pg_amop rows for single operator and consequently get
> unique index violation.
> 
> Alvaro, do you think we need to define all these operators?  I'm not sure.
> Even if we need it, I think we shouldn't do this during this GSoC.  What
> particular shortcomings do you see in explicit cast in RI triggers queries?

I'm probably confused.  Why did we add an operator and not a support
procedure?  I think we should have added rows in pg_amproc, not
pg_amop.  I'm very tired right now so I may be speaking nonsense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alvaro Herrera

Mark Rofail wrote:
> On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
> wrote:
> 
> >  separate operators for int4[] @>> int8, int4[] @>> int4, int4[] @>> int2,
> > int4[] @>> numeric.
> >
> 
> My only concern with the separate operators is the high maintenance: for any
> new datatype introduced, a corresponding operator would have to be created.

Yes.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail

On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

>  separate operators for int4[] @>> int8, int4[] @>> int4, int4[] @>> int2,
> int4[] @>> numeric.
>

My only concern with the separate operators is the high maintenance: for any
new datatype introduced, a corresponding operator would have to be created.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail

On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

> On Tue, Jul 18, 2017 at 2:24 AM, Mark Rofail
> wrote:
>
>> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera <
>> alvhe...@2ndquadrant.com> wrote:
>>>
>>> We have one opclass for each type combination -- int4 to int2, int4 to
>>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>>> the opclasses.
>>
>>
>>  I tried this approach by manually declaring the operator multiple
>> times in pg_amop.h (src/include/catalog/pg_amop.h)
>>
>> so instead of the polymorphic declaration
>> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
>> anyelem */
>>
>> multiple declarations were used, for example for int4[] :
>> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
>> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
>> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
>> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric
>> */
>>
>> However, make check produced:
>> could not create unique index "pg_amop_opr_fam_index"
>> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>>
>> Am I implementing this the wrong way or do we need to look for another
>> approach?
>>
>
> The problem is that you need to have not only opclass entries for the
> operators, but also operators themselves.  I.e. separate operators for
> int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
> tried to add multiple pg_amop rows for single operator and consequently get
> unique index violation.
>
> Alvaro, do you think we need to define all these operators?  I'm not
> sure.  If even we need it, I think
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alexander Korotkov

On Tue, Jul 18, 2017 at 2:24 AM, Mark Rofail  wrote:

> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera wrote:
>>
>> We have one opclass for each type combination -- int4 to int2, int4 to
>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>> the opclasses.
>
>
>  I tried this approach by manually declaring the operator multiple
> times in pg_amop.h (src/include/catalog/pg_amop.h)
>
> so instead of the polymorphic declaration
> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
> anyelem */
>
> multiple declarations were used, for example for int4[] :
> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric */
>
> However, make check produced:
> could not create unique index "pg_amop_opr_fam_index"
> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>
> Am I implementing this the wrong way or do we need to look for another
> approach?
>

The problem is that you need to have not only opclass entries for the
operators, but also operators themselves.  I.e. separate operators for
int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
tried to add multiple pg_amop rows for single operator and consequently get
unique index violation.
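
To make the failure concrete: the error message names (amopopr, amoppurpose,
amopfamily) as the unique key, and all four int4[] rows share the values
(6108, s, 2745). A minimal sketch of that check in Python, assuming the DATA
columns read family, left type, right type, strategy, purpose, operator,
method, sort family (a reading that matches the key in the error). The
operator OIDs 6109-6111 are made up for illustration and are not part of any
patch:

```python
from collections import Counter

# Column order assumed for the DATA(insert ...) lines:
# (amopfamily, lefttype, righttype, strategy, purpose, amopopr, method, sortfamily)
rows_one_operator = [
    (2745, 1007, 20,   5, "s", 6108, 2742, 0),  # int4[] @>> int8
    (2745, 1007, 23,   5, "s", 6108, 2742, 0),  # int4[] @>> int4
    (2745, 1007, 21,   5, "s", 6108, 2742, 0),  # int4[] @>> int2
    (2745, 1007, 1700, 5, "s", 6108, 2742, 0),  # int4[] @>> numeric
]

# Hypothetical alternative: one operator per type pair (OIDs invented here).
rows_separate_operators = [
    (2745, 1007, 20,   5, "s", 6108, 2742, 0),
    (2745, 1007, 23,   5, "s", 6109, 2742, 0),
    (2745, 1007, 21,   5, "s", 6110, 2742, 0),
    (2745, 1007, 1700, 5, "s", 6111, 2742, 0),
]


def duplicate_keys(rows):
    """Return (amopopr, amoppurpose, amopfamily) keys that occur more than once."""
    counts = Counter((row[5], row[4], row[0]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


print(duplicate_keys(rows_one_operator))        # [(6108, 's', 2745)]
print(duplicate_keys(rows_separate_operators))  # []
```

With a single operator OID reused for every right-hand type the key repeats,
which is the violation reported by make check; with one operator per type
pair it does not.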

Alvaro, do you think we need to define all these operators?  I'm not sure.
Even if we need it, I think we shouldn't do this during this GSoC.  What
particular shortcomings do you see in explicit cast in RI triggers queries?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-17 Thread Enrique Meneses

There is a generic definition for any array added as part of
https://commitfest.postgresql.org/10/708/ (it may be the reason for the
duplicate error). I am not sure what your change is but I would review the
above just in case. There is also a defect with a misleading error that is
still being triggered for UUID arrays.

Enrique

On Mon, Jul 17, 2017 at 4:25 PM Mark Rofail  wrote:

> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera wrote:
>>
>> We have one opclass for each type combination -- int4 to int2, int4 to
>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>> the opclasses.
>
>
>  I tried this approach by manually declaring the operator multiple
> times in pg_amop.h (src/include/catalog/pg_amop.h)
>
> so instead of the polymorphic declaration
> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
> anyelem */
>
> multiple declarations were used, for example for int4[] :
> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric */
>
> However, make check produced:
> could not create unique index "pg_amop_opr_fam_index"
> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>
> Am I implementing this the wrong way or do we need to look for another
> approach?
>
>
>

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-17 Thread Mark Rofail

On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera 
 wrote:
>
> We have one opclass for each type combination -- int4 to int2, int4 to
> int4, int4 to int8, etc.  You just need to add the new strategy to all
> the opclasses.


 I tried this approach by manually declaring the operator multiple times
in pg_amop.h (src/include/catalog/pg_amop.h)

so instead of the polymorphic declaration
DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>> anyelem */

multiple declarations were used, for example for int4[] :
DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric */

However, make check produced:
could not create unique index "pg_amop_opr_fam_index"
Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.

Am I implementing this the wrong way or do we need to look for another
approach?
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index a5238c3af5..9d6447923d 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy	5
 
 
 /*
@@ -43,7 +44,7 @@ ginarrayextract(PG_FUNCTION_ARGS)
 	bool	   *nulls;
 	int			nelems;
 
-	get_typlenbyvalalign(ARR_ELEMTYPE(array),
+	get_typlenbyvalalign(ARR_ELEMTYPE(array),	
 						 &elmlen, &elmbyval, &elmalign);
 
 	deconstruct_array(array,
@@ -110,6 +111,11 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
+			/* only items that match the queried element 
+are considered candidate  */
+			*searchMode = GIN_SEARCH_MODE_DEFAULT;
+			break;
 		case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
@@ -171,6 +177,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +265,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 34dadd6e19..8c9eb0c676 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,117 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a specific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid element_type,
+bool element_isnull, Oid collation,	void **fn_extra)
+{
+	Oid 		arr_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	if (arr_type != element_type)
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare different element types")));
+	
+	if (element_isnull)
+		return false;
+		
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != arr_type)
+	{
+		typentry = lookup_type_cache(arr_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(arr_type))));
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			continue;
+
+		/*
+			*

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-14 Thread Alvaro Herrera

Mark Rofail wrote:
> On Wed, Jul 12, 2017 at 2:30 PM, Mark Rofail  wrote:
> 
> > On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera wrote:
> >>
> >> We have one opclass for each type combination -- int4 to int2, int4 to
> >> int4, int4 to int8, etc.  You just need to add the new strategy to all
> >> the opclasses.
> >>
> >
> > Can you clarify this solution ? I think another solution would be external
> > casting
> >
> If external casting is to be used. If for example the two types in
> question are smallint and integer. Would a function get_common_type(Oid
> leftopr, Oid rightopr) be useful ?, that given the two types return the
> "common" type between the two in this case integer.

Do you mean adding cast decorators to the query constructed by
ri_triggers.c?  That looks like an inferior solution.  What problem do
you see with adding more rows to the opclass?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-14 Thread Mark Rofail

On Wed, Jul 12, 2017 at 2:30 PM, Mark Rofail  wrote:

> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera wrote:
>>
>> We have one opclass for each type combination -- int4 to int2, int4 to
>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>> the opclasses.
>>
>
> Can you clarify this solution ? I think another solution would be external
> casting
>
If external casting is to be used and, for example, the two types in
question are smallint and integer, would a function get_common_type(Oid
leftopr, Oid rightopr) be useful? Given the two types, it would return the
"common" type between the two, in this case integer.

Best Regards,
 Mark Rofail
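
A rough sketch of what such a helper might do, written in Python for brevity
rather than as backend C. The name get_common_type and its parameters come
from the proposal above; the widening order int2 < int4 < int8 < numeric and
the function body are assumptions for illustration only and do not reflect
how PostgreSQL actually resolves types. The OIDs are the ones that appear in
the pg_amop.h lines elsewhere in this thread.

```python
# Type OIDs as they appear in the pg_amop.h lines in this thread.
INT2, INT4, INT8, NUMERIC = 21, 23, 20, 1700

# Assumed widening order, narrowest first (illustrative, not PostgreSQL's rules).
WIDENING_ORDER = [INT2, INT4, INT8, NUMERIC]
TYPE_NAMES = {INT2: "smallint", INT4: "integer", INT8: "bigint", NUMERIC: "numeric"}


def get_common_type(leftopr, rightopr):
    """Return the wider of two type OIDs, or None if either is not in the list."""
    if leftopr not in WIDENING_ORDER or rightopr not in WIDENING_ORDER:
        return None
    # The type that sits later in the widening order can hold both values.
    return max(leftopr, rightopr, key=WIDENING_ORDER.index)


common = get_common_type(INT2, INT4)
print(TYPE_NAMES[common])
```

For smallint and integer this picks integer, as in the example in the
message; any pair outside the short list yields None, so a caller would have
to fall back to some other rule.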

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-12 Thread Mark Rofail

On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera 
wrote:
>
> We have one opclass for each type combination -- int4 to int2, int4 to
> int4, int4 to int8, etc.  You just need to add the new strategy to all
> the opclasses.
>

Can you clarify this solution ? I think another solution would be external
casting

> BTW now that we've gone through this a little further, it's starting to
> look like a mistake to me to use the same @> operator for (anyarray,
> anyelement) than we use for (anyarray, anyarray).


I agree. Changed to @>>

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-11 Thread Alvaro Herrera

Mark Rofail wrote:

>- now the RI checks utilise the @>(anyarray, anyelement)
>   - however there's a small problem:
>   operator does not exist: integer[] @> smallint
>   I assume that external casting would be required here. But how can I
>   downcast smallint to integer or integer to numeric automatically ?

We have one opclass for each type combination -- int4 to int2, int4 to
int4, int4 to int8, etc.  You just need to add the new strategy to all
the opclasses.

BTW now that we've gone through this a little further, it's starting to
look like a mistake to me to use the same @> operator for (anyarray,
anyelement) than we use for (anyarray, anyarray).  I have the feeling
we'd do better by having some other operator for this purpose -- dunno,
maybe @>> or @>.  ... whatever you think is reasonable and not already
in use.  Unless there is some other reason to pick @> for this purpose.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-11 Thread Mark Rofail

here are the modifications to ri_triggers.c

On Wed, Jul 12, 2017 at 12:26 AM, Mark Rofail 
wrote:
>
> *What I did *
>
>- now the RI checks utilise the @>(anyarray, anyelement)
>
> Best Regards,
> Mark Rofail
>
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..2d2b8e6a4f 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2694,21 +2694,34 @@ ri_GenerateQual(StringInfo buf,
  	else
  		oprright = operform->oprright;
  
-	appendStringInfo(buf, " %s %s", sep, leftop);
-	if (leftoptype != operform->oprleft)
-		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
+ 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT){
+		appendStringInfo(buf, " %s %s", sep, rightop);
+
+		if (rightoptype != oprright)
+ 			ri_add_cast_to(buf, oprright);
+
+		appendStringInfo(buf, " @> ");
+
+		appendStringInfoString(buf, leftop);
+
+		if (leftoptype != operform->oprleft)
+			ri_add_cast_to(buf, operform->oprleft);
+	 }	
+	else{
+		appendStringInfo(buf, " %s %s", sep, leftop);
+
+		if (leftoptype != operform->oprleft)
+			ri_add_cast_to(buf, operform->oprleft);
+
+		appendStringInfo(buf, " OPERATOR(%s.%s) ",
  	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
- 	appendStringInfoString(buf, rightop);
- 	if (rightoptype != oprright)
- 		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
+		appendStringInfoString(buf, rightop);
+
+		if (rightoptype != oprright)
+ 			ri_add_cast_to(buf, oprright);
+	}
+	
 	ReleaseSysCache(opertup);
 }
 


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-11 Thread Mark Rofail

On Sun, Jul 9, 2017 at 7:42 PM, Alexander Korotkov 
wrote:

> We may document that GIN index is required to accelerate RI queries for
> array FKs.  And users are encouraged to manually define them.
> It's also possible to define new option when index on referencing
> column(s) would be created automatically.  But I think this option should
> work the same way for regular FKs and array FKs.
>

I just thought because GIN index is suited for composite elements, it would
be appropriate for array FKs.

So we should leave it to the user? I think that would be fine too.

*What I did *

   - now the RI checks utilise the @>(anyarray, anyelement)
      - however there's a small problem:
      operator does not exist: integer[] @> smallint
      I assume that external casting would be required here. But how can I
      cast smallint to integer or integer to numeric automatically?

*What I plan to do*

   - work on the above mentioned bug/limitation
   - otherwise, I think this concludes limitation #5

> fatal performance issues.  If you issue any UPDATE or DELETE against the PK
> table, you get a query like this for checking to see if the RI constraint
> would be violated:
>
> SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;

or is there anything remaining ?

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-09 Thread Alexander Korotkov

On Sun, Jul 9, 2017 at 1:11 PM, Mark Rofail  wrote:

> On Sun, Jul 9, 2017 at 2:38 AM, Alexander Korotkov 
> wrote:
>
>> Could you, please, specify idea of what you're implementing in more
>> detail?
>>
>
> Ultimately we would like an indexed scan instead of a sequential scan, so
> I thought we needed to index the FK array columns first.
>

Indeed, this is right.
But look at how that works for regular FKs.  When you declare a FK, you
necessarily need a unique index on the referenced column(s).  However, an
index on the referencing column(s) is not required.  Without an index on the
referencing column(s), deleting a row in the referenced table or updating a
referenced column is expensive because it requires a sequential scan of the
referencing table.  Users are encouraged to index the referencing column(s)
to accelerate queries produced by RI triggers. [1]
Given this, it's unclear why array FKs should behave differently.
We may document that a GIN index is required to accelerate RI queries for
array FKs, and that users are encouraged to define one manually.
It's also possible to define a new option under which an index on the
referencing column(s) would be created automatically.  But I think this
option should work the same way for regular FKs and array FKs.

1.
https://www.postgresql.org/docs/current/static/ddl-constraints.html#DDL-CONSTRAINTS-FK

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-09 Thread Mark Rofail

On Sun, Jul 9, 2017 at 2:38 AM, Alexander Korotkov 
wrote:

> Could you, please, specify idea of what you're implementing in more
> detail?
>

Ultimately we would like an indexed scan instead of a sequential scan, so I
thought we needed to index the FK array columns first.

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-08 Thread Alexander Korotkov

On Sun, Jul 9, 2017 at 2:35 AM, Mark Rofail  wrote:

> * What I am working on*
>
>- since we want to create an index on the referencing column, I am
>working on firing a 'CREATE INDEX' query programmatically right after
>the 'CREATE TABLE' query
>- The problem I ran into is how to specify my Strategy (
>   GinContainsElemStrategy) within the CREATE INDEX query. For
>   example: CREATE INDEX ON fktable USING gin (fkcolumn array_ops)
>   Where does the strategy number fit?
>   - The patch is attached here, is the approach I took to creating an
>   index programmatically, correct?
>
>
Could you please describe in more detail what you're implementing?
AFAICS, you're going to automatically create GIN indexes on FK
array columns.  However, if we don't do this for regular columns, why
should we do it for array columns?  To me that sounds like a separate feature
which should be implemented for both regular and array FK columns.

Regarding your questions: if you need to create an index supporting a given
operator, you shouldn't care about the strategy number.  The strategy number
makes sense only in opclass internals.  You just need to specify an opclass
which supports your operator.  In principle, you can find all of them in the
pg_amop table.  Alternatively you can just stick to GIN array_ops.

In general, the approach you take to create the index looks OK.  It's OK to
manually create a DDL node and execute it.  As you can see, this is done in
many other places in the backend code.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-08 Thread Mark Rofail

* What I am working on*

   - since we want to create an index on the referencing column, I am
   working on firing a 'CREATE INDEX' query programmatically right after the
   'CREATE TABLE' query
      - The problem I ran into is how to specify my Strategy
      (GinContainsElemStrategy) within the CREATE INDEX query. For example:
      CREATE INDEX ON fktable USING gin (fkcolumn array_ops)
      Where does the strategy number fit?
      - The patch is attached here. Is the approach I took to creating an
      index programmatically correct?

Best Regards,
Mark Rofail
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index dc18fd1eae..085b63aa98 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7139,6 +7139,31 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_FOREIGN_KEY),
 	 errmsg("array foreign keys support only NO ACTION and RESTRICT actions")));
+
+		IndexStmt *stmt = makeNode(IndexStmt);
+		stmt->unique = false; /* is index unique? Nope, should allow duplicates*/
+		stmt->concurrent = false; /* should this be a concurrent index build? we want
+									 to lock out writes on the table until it's done. */
+		stmt->idxname = NULL; 		/* let the idxname be generated */ 
+		stmt->relation = /* relation name */;
+		stmt->accessMethod = "gin";	/* name of access method: GIN */
+		stmt->indexParams = /* column name + */"array_ops";
+		stmt->options = NULL;
+		stmt->tableSpace = NULL; 	/* NULL for default */
+		stmt->whereClause = NULL;
+		stmt->excludeOpNames = NIL;
+		stmt->idxcomment = NULL;
+		stmt->indexOid = InvalidOid;
+		stmt->oldNode = InvalidOid; /* relfilenode of existing storage, if any: None*/
+		stmt->primary = false; 		/* is index a primary key? Nope */
+		stmt->isconstraint = false; /* is it for a pkey/unique constraint? Nope */
+		stmt->deferrable = false;
+		stmt->initdeferred = false;
+		stmt->transformed = false;
+		stmt->if_not_exists = false; /* just do nothing if index already exists? Nope
+										(this shouldn't happen)*/
+
+		ATExecAddIndex(tab, rel, stmt, true, lockmode);
 	}
 
  	/*
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..0045f64c9e 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2697,17 +2697,10 @@ ri_GenerateQual(StringInfo buf,
 	appendStringInfo(buf, " %s %s", sep, leftop);
 	if (leftoptype != operform->oprleft)
 		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
- 	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
+ 	appendStringInfo(buf, " @> "); 
  	appendStringInfoString(buf, rightop);
  	if (rightoptype != oprright)
  		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
 	ReleaseSysCache(opertup);
 }


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-05 Thread Mark Rofail

To make the queries fired by the RI triggers use a GIN index, we need, as
Tom Lane has previously suggested [1], to replace the query

SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;

with

SELECT 1 FROM ONLY fktable x WHERE ARRAY[$1] <@ fkcol FOR SHARE OF x;

but since we have @>(anyarray, anyelement) it can be improved to

SELECT 1 FROM ONLY fktable x WHERE fkcol @> $1 FOR SHARE OF x;

and the piece of code responsible for all of this is ri_GenerateQual in
ri_triggers.c.

How to accomplish that is the next step. I don't know if we should hardcode
the "@>" symbol, or if, once we index the fk table, ri_GenerateQual would
be able to find the operator on its own.
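
To make the two query shapes concrete, here is a standalone sketch of the string that the qualifier generator has to produce in each case. It is a simplified model, not ri_GenerateQual itself: toy_generate_qual() and its parameters are hypothetical, it writes into a fixed buffer instead of a StringInfo, and it omits the casts and identifier quoting that the real function applies. It assumes the (anyarray, anyelement) operator is spelled @> as in this message, with the array column on the left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Format one qualifier into out[].
 *
 * Regular FK:        " sep leftop OPERATOR(opname) rightop"
 * Array element FK:  " sep rightop @> leftop"   (array column first)
 *
 * Returns the number of characters written, excluding the terminator,
 * or -1 if the buffer is too small.
 */
int
toy_generate_qual(char *out, size_t outlen, const char *sep,
				  const char *leftop, const char *opname,
				  const char *rightop, bool each_element)
{
	int			n;

	assert(out != NULL && sep != NULL && leftop != NULL && rightop != NULL);

	if (each_element)
	{
		/* rightop is the FK array column, leftop the single PK value */
		n = snprintf(out, outlen, " %s %s @> %s", sep, rightop, leftop);
	}
	else
	{
		assert(opname != NULL);
		n = snprintf(out, outlen, " %s %s OPERATOR(%s) %s",
					 sep, leftop, opname, rightop);
	}

	if (n < 0 || (size_t) n >= outlen)
		return -1;
	return n;
}
```

Hardcoding the operator symbol, as the sketch does, is the simpler of the two options raised above; looking the operator up through the index's operator class would avoid the literal but needs catalog access that a standalone model cannot show.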

*What I plan to do:*

   - study how to index the fk table upon its creation. I suspect this can
   be done in tablecmds.c

*Questions:*

   - how can you programmatically in C index a table?

[1] https://www.postgresql.org/message-id/28389.1351094795%40sss.pgh.pa.us

Best Regards,
Mark Rofail
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..0045f64c9e 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2697,17 +2697,10 @@ ri_GenerateQual(StringInfo buf,
 	appendStringInfo(buf, " %s %s", sep, leftop);
 	if (leftoptype != operform->oprleft)
 		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
- 	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
+ 	appendStringInfo(buf, " @> "); 
  	appendStringInfoString(buf, rightop);
  	if (rightoptype != oprright)
  		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
 	ReleaseSysCache(opertup);
 }


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-03 Thread Alvaro Herrera

Mark Rofail wrote:
> On Mon, Jun 26, 2017 at 6:44 PM, Alexander Korotkov 
> wrote:
> 
> > Have you met any particular problem here?  Or is it just a lot of
> > mechanical work?
> >
> 
> Just A LOT of mechanical work, thankfully. The patch is now rebased and
> all regress tests have passed (even the element_foreign_key). Please find
> the patch below !

Great!

> *What I plan to do next *
> 
>- study ri_triggers.c (src/backend/utils/adt/ri_triggers.c) since this
>is where the new RI code will reside

Any news?


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-26 Thread Alexander Korotkov

On Mon, Jun 26, 2017 at 2:26 AM, Mark Rofail  wrote:

> *What I did:*
>
>
>
>- read into the old patch but couldn't apply it since it's quite old.
>It needs to be rebased and that's what I am working on.  It's a lot of
>work.
>   - incomplete patch can be found attached here
>

Have you met any particular problem here?  Or is it just a lot of
mechanical work?

> *Bugs*
>
>- problem with the @>(anyarray, anyelement) operator: if for example,
>you apply the operator as follows  '{AA646}' @> 'AA646' it
>maps to @>(anyarray, anyarray) since 'AA646' is interpreted as
>char[] instead of Text
>

I don't think it is a bug.  When types are not specified explicitly, the
optimizer does its best to guess them.  Sometimes the results are
counterintuitive to the user.  But that is not a bug; it's probably room for
improvement.  And I don't think this improvement should be a subject of this
GSoC.  Anyway, array FK code should use explicit type casts, and then you
wouldn't meet this problem.

On the other hand, you could just choose another operator name for
arraycontainselem.
Then such a problem probably wouldn't occur.

> *Suggestion:*
>
>- since I needed to check if the Datum was null and its type, I had to
>do it in the arraycontainselem and pass it as a parameter to the underlying
>function array_contains_elem. I'm proposing to introduce a new struct like
>ArrayType, but ElementType along with brand new MACROs to make dealing
>with anyelement easier in any polymorphic context.
>

You don't need to do an explicit check for nulls, because arraycontainselem
is marked as a strict function.  The executor never passes null inputs to
your function if it's declared as strict.  See evaluate_function().
Also, during query planning it's checked that all polymorphic types are
consistent with each other.  See
https://www.postgresql.org/docs/devel/static/extend-type-system.html#extend-types-polymorphic
and check_generic_type_consistency() for details.
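
The point about strict functions can be modelled in a few lines. This is a hedged, standalone sketch rather than PostgreSQL's executor code: ToyDatum, toy_add() and toy_call_strict() are hypothetical names invented for illustration. It shows only the contract: when any argument is NULL the function body is skipped and the result is NULL, so the body itself never needs a null check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Datum plus its null flag. */
typedef struct
{
	long		value;
	bool		isnull;
} ToyDatum;

typedef ToyDatum (*ToyFunc) (const ToyDatum *args, int nargs);

/* Counts how often toy_add() actually runs. */
int			toy_add_calls = 0;

/* Example function body: it may assume that no argument is NULL. */
ToyDatum
toy_add(const ToyDatum *args, int nargs)
{
	ToyDatum	result = {0, false};
	int			i;

	toy_add_calls++;
	for (i = 0; i < nargs; i++)
		result.value += args[i].value;
	return result;
}

/*
 * Call fn the way a caller treats a strict function: if any argument is
 * NULL, skip the call entirely and return NULL.
 */
ToyDatum
toy_call_strict(ToyFunc fn, const ToyDatum *args, int nargs)
{
	ToyDatum	null_result = {0, true};
	int			i;

	assert(fn != NULL && args != NULL);

	for (i = 0; i < nargs; i++)
	{
		if (args[i].isnull)
			return null_result;
	}
	return fn(args, nargs);
}
```

Under this model the element_isnull parameter in the patch is redundant for calls that arrive through the strict SQL-level function.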

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-25 Thread Mark Rofail

*What I did:*



   - read into the old patch but couldn't apply it since it's quite old. It
   needs to be rebased and that's what I am working on.  It's a lot of work.
  - incomplete patch can be found attached here

*Bugs*

   - problem with the @>(anyarray, anyelement) operator: if for example,
   you apply the operator as follows  '{AA646}' @> 'AA646' it
   maps to @>(anyarray, anyarray) since 'AA646' is interpreted as
   char[] instead of Text

*Suggestion:*

   - since I needed to check if the Datum was null and its type, I had to
   do it in the arraycontainselem and pass it as a parameter to the underlying
   function array_contains_elem. I'm proposing to introduce a new struct like
   ArrayType, but ElementType along with brand new MACROs to make dealing
   with anyelement easier in any polymorphic context.


Best Regards,
Mark Rofail

On Tue, Jun 20, 2017 at 12:19 AM, Alvaro Herrera 
wrote:

> Mark Rofail wrote:
> > Okay, so major breakthrough.
> >
> > *Updates:*
> >
> >- The operator @>(anyarray, anyelement) is now functional
> >   - The segmentation fault was due to applying PG_FREE_IF_COPY on a
> >   datum when it should only be applied on TOASTed inputs
> >   - The only problem now is if for example you apply the operator as
> >   follows  '{AA646}' @> 'AA646' it maps to @>(anyarray,
> >   anyarray) since 'AA646' is interpreted as char[] instead of Text
> >- Added some regression tests (src/test/regress/sql/arrays.sql) and
> >their results(src/test/regress/expected/arrays.out)
> >- worked on the new GIN strategy, I don't think it would vary much from
> >GinContainsStrategy.
>
> OK, that's great.
>
> > *What I plan to do:*
> >
> >- I need to start working on the Referential Integrity code but I don't
> >know where to start
>
> You need to study the old patch posted by Marco Nenciarini.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ea655a10a8..712f631e88 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2288,6 +2288,14 @@ SCRAM-SHA-256$iteration count:salt<
  
 
  
+  confiselement
+  bool
+  
+  If a foreign key, is it an array ELEMENT
+  foreign key?
+ 
+
+ 
   coninhcount
   int4
   
@@ -2324,6 +2332,18 @@ SCRAM-SHA-256$iteration count:salt<
  
 
  
+  confelement
+  bool[]
+  
+  
+ 	If a foreign key, list of booleans expressing which columns
+ 	are array ELEMENT columns; see
+ 	
+ 	for details
+  
+ 
+
+ 
   conpfeqop
   oid[]
   pg_operator.oid
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b05a9c2150..c1c847bc7e 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -881,7 +881,112 @@ CREATE TABLE order_items (
 .

   
-
+  
+   
+Array ELEMENT Foreign Keys
+ 
+
+ ELEMENT foreign key
+
+ 
+
+ constraint
+ Array ELEMENT foreign key
+
+ 
+
+ constraint
+ ELEMENT foreign key
+
+ 
+
+ referential integrity
+
+ 
+
+ Another option you have with foreign keys is to use a
+ referencing column which is an array of elements with
+ the same type (or a compatible one) as the referenced
+ column in the related table. This feature is called
+ array element foreign key and is implemented
+ in PostgreSQL with ELEMENT foreign key constraints,
+ as described in the following example:
+ 
+
+CREATE TABLE drivers (
+driver_id integer PRIMARY KEY,
+first_name text,
+last_name text,
+...
+);
+
+CREATE TABLE races (
+race_id integer PRIMARY KEY,
+title text,
+race_day DATE,
+...
+final_positions integer[] ELEMENT REFERENCES drivers
+);
+
+ 
+ The above example uses an array (final_positions)
+ to store the results of a race: for each of its elements
+ a referential integrity check is enforced on the
+ drivers table.
+ Note that ELEMENT REFERENCES is an extension
+ of PostgreSQL and it is not included in the SQL standard.
+
+ 
+
+ Even though the most common use case for array ELEMENT
+ foreign keys is on a single column key, you can define an array
+ ELEMENT foreign key constraint on a group
+ of columns. As the following example shows, it must be written in table
+ constraint form:
+ 
+
+CREATE TABLE available_moves (
+kind text,
+move text,
+description text,
+PRIMARY KEY (kind, move)
+);
+
+CREATE TABLE paths (
+description text,
+kind text,
+moves text[],
+FOREIGN KEY (kind, ELEMENT moves) REFERENCES available_moves

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-19 Thread Alvaro Herrera

Mark Rofail wrote:
> Okay, so major breakthrough.
> 
> *Updates:*
> 
>- The operator @>(anyarray, anyelement) is now functional
>   - The segmentation fault was due to applying PG_FREE_IF_COPY on a
>   datum when it should only be applied on TOASTed inputs
>   - The only problem now is if for example you apply the operator as
>   follows  '{AA646}' @> 'AA646' it maps to @>(anyarray,
>   anyarray) since 'AA646' is interpreted as char[] instead of Text
>- Added some regression tests (src/test/regress/sql/arrays.sql) and
>their results(src/test/regress/expected/arrays.out)
>- worked on the new GIN strategy, I don't think it would vary much from
>GinContainsStrategy.

OK, that's great.

> *What I plan to do:*
> 
>- I need to start working on the Referential Integrity code but I don't
>know where to start

You need to study the old patch posted by Marco Nenciarini.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-19 Thread Mark Rofail

Okay, so major breakthrough.

*Updates:*

   - The operator @>(anyarray, anyelement) is now functional
  - The segmentation fault was due to applying PG_FREE_IF_COPY on a
  datum when it should only be applied on TOASTed inputs
  - The only problem now is if for example you apply the operator as
  follows  '{AA646}' @> 'AA646' it maps to @>(anyarray,
  anyarray) since 'AA646' is interpreted as char[] instead of Text
   - Added some regression tests (src/test/regress/sql/arrays.sql) and
   their results(src/test/regress/expected/arrays.out)
   - worked on the new GIN strategy, I don't think it would vary much from
   GinContainsStrategy.

*What I plan to do:*

   - I need to start working on the Referential Integrity code but I don't
   know where to start

Best Regards,
Mark Rofail
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..a1b3f53ed9 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy	5
 
 
 /*
@@ -110,6 +111,11 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
+			/* only items that match the queried element are considered candidates */
+			*searchMode = GIN_SEARCH_MODE_DEFAULT;
+			break;
 		case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
@@ -171,6 +177,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +265,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..c563aa564e 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,117 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a specific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid element_type,
+bool element_isnull, Oid collation,	void **fn_extra)
+{
+	Oid 		arr_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	if (arr_type != element_type)
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare different element types")));
+	
+	if (element_isnull)
+		return false;
+		
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != arr_type)
+	{
+		typentry = lookup_type_cache(arr_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(arr_type))));
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			continue;
+
+		/*
+			* Apply the operator to the element pair
+			*/
+		locfcinfo.arg[0] = elt1;
+		locfcinfo.arg[1] = elem;
+		locfcinfo.argnull[0] = false;
+		locfcinfo.argnull[1] = false;
+		locfcinfo.isnull = false;
+		oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo));
+		if (oprresult)
+			return true;
+	}
+
+	return false;
+}
+
+Datum
+arraycontainselem(PG_FUNCTION_ARGS)
+{
+	AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
+	Datum elem = PG_GETARG_DATUM(1);
+	Oid element_type =

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-18 Thread Alexander Korotkov

On Sun, Jun 18, 2017 at 12:41 AM, Mark Rofail 
wrote:

> *Questions:*
>
>- I'd like to check that anyelem and anyarray have the same element
>type. but anyelem is obtained from PG_FUNCTION_ARGS as a Datum. How
>can I make such a check?
>
>
As I know, it's implicitly checked during query analyze stage.  You don't
have to implement your own check inside function implementation.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-17 Thread Mark Rofail

*Updates till now:*

   - added a record to pg_proc (src/include/catalog/pg_proc.h)
   - modified opr_sanity regression check expected results
   - implemented a  low-level function called `array_contains_elem` as an
   equivalent to `array_contain_compare` but accepts anyelement instead of
   anyarray as the right operand. This is more efficient than constructing an
   array and then immediately deconstructing it.
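
The efficiency argument above comes down to a single pass over the array with an early exit, instead of wrapping the element in a one-item array first. A minimal standalone sketch of that scan follows. It assumes plain int elements and a parallel null-flag array, and toy_array_contains_elem() is a hypothetical name; the function in the patch works on Datums and calls the type's equality operator through the type cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if elem occurs in elems[0..nelems-1].
 *
 * nulls[i] marks elems[i] as NULL; nulls may be NULL when the array has no
 * NULL entries.  A NULL search value or a NULL array entry never matches,
 * mirroring the strict-comparison assumption in the patch.
 */
bool
toy_array_contains_elem(const int *elems, const bool *nulls, size_t nelems,
						int elem, bool elem_isnull)
{
	size_t		i;

	assert(elems != NULL || nelems == 0);

	if (elem_isnull)
		return false;

	for (i = 0; i < nelems; i++)
	{
		/* skip NULL entries: they cannot equal anything */
		if (nulls != NULL && nulls[i])
			continue;

		/* stop at the first match */
		if (elems[i] == elem)
			return true;
	}
	return false;
}
```

The scan allocates nothing and stops at the first hit, which is where the saving over building and then deconstructing a temporary array would come from.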

*Questions:*

   - I'd like to check that anyelem and anyarray have the same element
   type. but anyelem is obtained from PG_FUNCTION_ARGS as a Datum. How can
   I make such a check?

Best Regards,
Mark Rofail
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..214aac8fba 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy		5
 
 
 /*
@@ -43,7 +44,7 @@ ginarrayextract(PG_FUNCTION_ARGS)
 	bool	   *nulls;
 	int			nelems;
 
-	get_typlenbyvalalign(ARR_ELEMTYPE(array),
+	get_typlenbyvalalign(ARR_ELEMTYPE(array),	
 						 &elmlen, &elmbyval, &elmalign);
 
 	deconstruct_array(array,
@@ -110,7 +111,8 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
 			else	/* everything contains the empty set */
@@ -171,6 +173,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +261,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..8009ab5acb 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,107 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a specific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid collation,
+	void **fn_extra)
+{
+	Oid 		element_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != element_type)
+	{
+		typentry = lookup_type_cache(element_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(element_type))));
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			continue;
+
+		/*
+		 * Apply the operator to the element pair
+		 */
+		locfcinfo.arg[0] = elt1;
+		locfcinfo.arg[1] = elem;
+		locfcinfo.argnull[0] = false;
+		locfcinfo.argnull[1] = false;
+		locfcinfo.isnull = false;
+		oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo));
+		if (oprresult)
+			return true;
+	}
+
+	return false;
+}
+
+Datum
+arraycontainselem(PG_FUNCTION_ARGS)
+{
+	AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
+	Datum elem = PG_GETARG_DATUM(1);
+	Oid collation = PG_GET_COLLATION();
+	bool result;
+
+	result = array_contains_elem(array, elem, collation,
+								 &fcinfo->flinfo->fn_extra);
+
+	/* Avoid leaking memory when handed toasted input. */
+	AARR_FREE_IF_COPY(array, 0);
+	PG_FREE_IF_COPY( , 1);
+
+	PG_RETURN_BOOL(result);

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-10 Thread Mark Rofail

• After finding the arraycontains function, I implemented
arraycontainselem, which corresponds to the operator @>(anyarray,
anyelement)
   ◦ Please read the attached patch file to view my progress.

•  In addition to src/backend/utils/adt/arrayfuncs.c where I
implemented arraycontainselem.

   ◦ I also edited pg_amop (src/include/catalog/pg_amop.h) since
it stores information about operators associated with access method
operator families.

+DATA(insert ( 2745   2277 2283 5 s 2753 2742 0 ));
{
2745: Oid amopfamily; (denotes gin array_ops)
2277: Oid amoplefttype; (denotes anyarray)
2283: Oid amoprighttype; (denotes anyelement)
5: int16 amopstrategy; /* operator strategy number */ (denotes the new
strategy that is yet to be created)
's': char amoppurpose; (denotes 's' for search)
2753: Oid amopopr; (denotes the new operator Oid)
2742: Oid amopmethod;(denotes gin)
0: Oid amopsortfamily; (0 since search operator)
}

   ◦ And pg_operator (src/include/catalog/pg_operator.h) since it
stores information about operators.
+DATA(insert OID = 2753 (  "@>"   PGNSP PGUID b f f 2277 2283 16 0  0
arraycontainselem 0 0 ));
{
"@>": NameData oprname; /* name of operator */
PGNSP: Oid oprnamespace; /* OID of namespace containing this oper */
PGUID: Oid oprowner; /* operator owner */
'b': char oprkind; /* 'l', 'r', or 'b' */ (denotes infix)
'f': bool oprcanmerge; /* can be used in merge join? */
'f': bool oprcanhash; /* can be used in hash join? */
2277: Oid oprleft; (denotes anyarray)
2283: Oid oprright; (denotes anyelement)
16: Oid oprresult;  (denotes boolean)
0: Oid oprcom; /* OID of commutator oper, or 0 if none */ (needs to be
revisited)
0: Oid oprnegate; /* OID of negator oper, or 0 if none */ (needs to be
revisited)
arraycontainselem: regproc oprcode; /* OID of underlying function */
0: regproc oprrest; /* OID of restriction estimator, or 0 */
0: regproc oprjoin; /* OID of join estimator, or 0 */
}
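
As a rough illustration of the semantics arraycontainselem is aiming for, here is a standalone sketch in plain C. It does not use any PostgreSQL headers: NullableInt and contains_elem are hypothetical names made up for this example, and plain int equality stands in for the element type's equality operator. The one behavior it tries to show is that NULL array elements are skipped, on the assumption that the equality operator is strict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an array element: a value plus a NULL flag. */
typedef struct
{
	int		value;
	bool	isnull;
} NullableInt;

/*
 * Return true if any non-NULL element of arr equals elem.
 *
 * NULL elements are skipped, mirroring the assumption that the equality
 * operator is strict, so a NULL can never match anything.
 */
bool
contains_elem(const NullableInt *arr, size_t nelems, int elem)
{
	size_t		i;

	/* A NULL pointer is only acceptable for an empty array. */
	assert(arr != NULL || nelems == 0);

	for (i = 0; i < nelems; i++)
	{
		if (arr[i].isnull)
			continue;			/* a NULL element never matches */
		if (arr[i].value == elem)
			return true;		/* stop at the first match */
	}

	return false;
}
```

For an array holding {1, NULL, 3}, contains_elem reports 3 as present and 0 as absent, because the NULL slot is never compared against the search value.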
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..14fedc8066 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy		5
 
 
 /*
@@ -110,7 +111,8 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			if (nelems > 0)
 				*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			else	/* everything contains the empty set */
@@ -171,6 +173,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 				}
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +261,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 				}
 			}
 			break;
-		case GinContainsStrategy:
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..e1ff6d33b5 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4215,6 +4215,40 @@ arraycontains(PG_FUNCTION_ARGS)
 }
 
 Datum
+arraycontainselem(PG_FUNCTION_ARGS)
+{
+		Datum *elem = PG_GETARG_DATUM(0);
+		AnyArrayType *array1;
+		AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
+		Oid collation = PG_GET_COLLATION();
+		bool result;
+
+		int16 typlen;
+		bool typbyval;
+		char typalign;
+		int nelems;
+
+		/* we have one element */
+		nelems= 1;
+
+		/* get required info about the element type */
+		get_typlenbyvalalign(ARR_ELEMTYPE(array),
+							 &typlen, &typbyval, &typalign);
+
+		/* now build the array */
+		array1 =  construct_array(, nelems,collation, , , );
+
+		result = array_contain_compare(array2, array1, collation, true,
+									   &fcinfo->flinfo->fn_extra);
+
+		/* Avoid leaking memory when handed toasted input. */
+		PG_FREE_IF_COPY(elem,0);
+		AARR_FREE_IF_COPY(array, 1);
+
+		PG_RETURN_BOOL(result);
+}
+
+Datum
 arraycontained(PG_FUNCTION_ARGS)
 {
 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index da0228de6b..2da9002577 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -687,6 +687,8 @@ DATA(insert (	2595   718 600 15 o 3291 783 1970 ));
  */
 DATA(insert (	2745   2277 2277 1 s 2750 2742 0 ));
 DATA(insert (	2745   2277 2277 2 s 2751 2742 0 ));
+//TODO link the operator's pg_operator OID
+DATA(insert ( 2745   2277 2283 5 s 2753 2742 0 ));
 DATA(insert (	2745   2277 2277 3 s 2752 2742 0 ));
 DATA(insert (	2745   2277 2277 4 s 1070 2742 0 ));
 
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index ccbb17efec..626a0b1c49 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1567,6 +1567,9 @@

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-05-30 Thread Alexander Korotkov

Hi, Mark!

On Tue, May 30, 2017 at 2:18 AM, Mark Rofail  wrote:

> rhaas=# select oid, * from pg_opfamily where opfmethod = 2742;
>>  oid  | opfmethod |    opfname     | opfnamespace | opfowner
>> ------+-----------+----------------+--------------+----------
>>  2745 |      2742 | array_ops      |           11 |       10
>>  3659 |      2742 | tsvector_ops   |           11 |       10
>>  4036 |      2742 | jsonb_ops      |           11 |       10
>>  4037 |      2742 | jsonb_path_ops |           11 |       10
>> (4 rows)
>
> I am particularly interested in array_ops but I have failed to locate the
> code behind it. Where is it reflected in the source code?
>

Let's look at what this particular opclass consists of.  Besides the record
in pg_opfamily, it also has records in pg_opclass, pg_amproc and pg_amop.

=# select * from pg_opclass where opcfamily = 2745;
 opcmethod |  opcname  | opcnamespace | opcowner | opcfamily | opcintype | opcdefault | opckeytype
-----------+-----------+--------------+----------+-----------+-----------+------------+------------
      2742 | array_ops |           11 |       10 |      2745 |      2277 | t          |       2283
(1 row)

=# select * from pg_amproc where amprocfamily = 2745;
 amprocfamily | amproclefttype | amprocrighttype | amprocnum |           amproc
--------------+----------------+-----------------+-----------+----------------------------
         2745 |           2277 |            2277 |         2 | pg_catalog.ginarrayextract
         2745 |           2277 |            2277 |         3 | ginqueryarrayextract
         2745 |           2277 |            2277 |         4 | ginarrayconsistent
         2745 |           2277 |            2277 |         6 | ginarraytriconsistent
(4 rows)

=# select * from pg_amop where amopfamily = 2745;
 amopfamily | amoplefttype | amoprighttype | amopstrategy | amoppurpose | amopopr | amopmethod | amopsortfamily
------------+--------------+---------------+--------------+-------------+---------+------------+----------------
       2745 |         2277 |          2277 |            1 | s           |    2750 |       2742 |              0
       2745 |         2277 |          2277 |            2 | s           |    2751 |       2742 |              0
       2745 |         2277 |          2277 |            3 | s           |    2752 |       2742 |              0
       2745 |         2277 |          2277 |            4 | s           |    1070 |       2742 |              0
(4 rows)

These system catalog records are defined in special headers in the source
code:
src/include/catalog/pg_amop.h
src/include/catalog/pg_amproc.h
src/include/catalog/pg_opclass.h
src/include/catalog/pg_opfamily.h
These records are written to the system catalog during the bootstrap process
(see src/backend/catalog/README).

As you can see, the pg_amproc records refer to some procedures.  Those
procedures make up most of the source code behind the opclass.  They are
defined in src/backend/access/gin/ginarrayproc.c.
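
To make the link between the strategy numbers in pg_amop and the support procedures a bit more concrete, here is a simplified standalone model of what a GIN consistent function does with the strategy number. This is only a sketch under stated assumptions: array_consistent_sketch and the enum names are hypothetical, NULL handling is left out entirely, and the authoritative logic is ginarrayconsistent() in ginarrayproc.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Strategy numbers 1..4, matching the amopstrategy column shown above. */
enum
{
	OVERLAP_STRATEGY = 1,
	CONTAINS_STRATEGY = 2,
	CONTAINED_STRATEGY = 3,
	EQUAL_STRATEGY = 4
};

/*
 * check[i] tells whether the i-th key extracted from the query array was
 * found in the indexed array.  The return value says whether the indexed
 * item may match; *recheck is set when the index alone cannot decide and
 * the original operator has to be re-evaluated on the heap tuple.
 */
bool
array_consistent_sketch(int strategy, const bool *check, int nkeys,
						bool *recheck)
{
	bool		any = false;
	bool		all = true;
	int			i;

	for (i = 0; i < nkeys; i++)
	{
		if (check[i])
			any = true;
		else
			all = false;
	}

	switch (strategy)
	{
		case OVERLAP_STRATEGY:
			*recheck = false;	/* one shared key is enough */
			return any;
		case CONTAINS_STRATEGY:
			*recheck = false;	/* every query key must be present */
			return all;
		case CONTAINED_STRATEGY:
			*recheck = true;	/* the index cannot rule anything out */
			return true;
		case EQUAL_STRATEGY:
			*recheck = true;	/* all keys present is necessary, not sufficient */
			return all;
		default:
			assert(!"unrecognized strategy number");
			*recheck = true;
			return false;
	}
}
```

With check = {true, false}, the overlap strategy accepts the item without a recheck, contains rejects it, contained accepts it but asks for a recheck, and equal rejects it.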

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-05-29 Thread Mark Rofail

>
> rhaas=# select oid, * from pg_opfamily where opfmethod = 2742;
>  oid  | opfmethod |    opfname     | opfnamespace | opfowner
> ------+-----------+----------------+--------------+----------
>  2745 |      2742 | array_ops      |           11 |       10
>  3659 |      2742 | tsvector_ops   |           11 |       10
>  4036 |      2742 | jsonb_ops      |           11 |       10
>  4037 |      2742 | jsonb_path_ops |           11 |       10
> (4 rows)

I am particularly interested in array_ops but I have failed to locate the
code behind it. Where is it reflected in the source code?

Best Regards,
Mark Rofail

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-05-24 Thread Robert Haas

On Mon, May 22, 2017 at 7:51 PM, Mark Rofail  wrote:
> Cloned the git repo found @ https://github.com/postgres/postgres and
> identified the main two files I will be concerned with. (I know I may need
> to edit other files but these seem to be where I will spend most of my summer)
>
> src/backend/commands/tablecmds.c
> src/backend/utils/adt/ri_triggers.c
>
> I am yet to identify the files concerned with the GIN opclass. <-- if anyone
> can help with this

There's not only one GIN opclass.  You can get a list like this:

select oid, * from pg_opclass where opcmethod = 2742;

Actually, you probably want to look for GIN opfamilies:

rhaas=# select oid, * from pg_opfamily where opfmethod = 2742;
 oid  | opfmethod |    opfname     | opfnamespace | opfowner
------+-----------+----------------+--------------+----------
 2745 |      2742 | array_ops      |           11 |       10
 3659 |      2742 | tsvector_ops   |           11 |       10
 4036 |      2742 | jsonb_ops      |           11 |       10
 4037 |      2742 | jsonb_path_ops |           11 |       10
(4 rows)

To see which SQL functions are used to implement a particular
opfamily, use the OID from the previous step in a query like this:

rhaas=# select prosrc from pg_amop, pg_operator, pg_proc where
amopfamily = 2745 and amopopr = pg_operator.oid and oprcode =
pg_proc.oid;
     prosrc
----------------
 array_eq
 arrayoverlap
 arraycontains
 arraycontained
(4 rows)

Then, you can look for those in the source tree.  You can also search
for the associated support functions, e.g.:

rhaas=# select distinct amprocnum, prosrc from pg_amproc, pg_proc
where amprocfamily = 2745 and amproc = pg_proc.oid order by 1, 2;
 amprocnum |        prosrc
-----------+-----------------------
         1 | bitcmp
         1 | bpcharcmp
         1 | btabstimecmp
         1 | btboolcmp
         1 | btcharcmp
         1 | btfloat4cmp
         1 | btfloat8cmp
         1 | btint2cmp
         1 | btint4cmp
         1 | btint8cmp
         1 | btnamecmp
         1 | btoidcmp
         1 | btoidvectorcmp
         1 | btreltimecmp
         1 | bttextcmp
         1 | bttintervalcmp
         1 | byteacmp
         1 | cash_cmp
         1 | date_cmp
         1 | interval_cmp
         1 | macaddr_cmp
         1 | network_cmp
         1 | numeric_cmp
         1 | time_cmp
         1 | timestamp_cmp
         1 | timetz_cmp
         2 | ginarrayextract
         3 | ginqueryarrayextract
         4 | ginarrayconsistent
         6 | ginarraytriconsistent
(30 rows)

You might want to read https://www.postgresql.org/docs/devel/static/xindex.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
