Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-13 Thread Jim Nasby

On 12/12/14, 7:16 PM, Tom Lane wrote:

Jim Nasby  writes:

I'd say that array_eq (and probably _cmp) just needs to be taught to fall back 
to what oper() does, but this part of the commit message gives me pause:



"Change the operator search algorithms to look for appropriate btree or hash index 
opclasses, instead of assuming operators named '<' or '=' have the right semantics."


As it should.  array_cmp is the basis for a btree opclass, therefore
it must *not* use operators that are not themselves btree operators.

Quite aside from that, we need to move even further away from having
internal system operations depend on operator-name-based lookups;
see for instance the recent complaints over stuff like IS DISTINCT FROM
failing for types whose operators aren't in the search path.


Agreed, but in a way that's not what we're doing here; we're trying to utilize 
an operator the user asked us to use. Of course, array_eq assumes that we want 
equality; I think that's a problem itself. rowtypes suffer from this too, but 
presumably it's not as big a deal because we require typid's to match exactly.


It's arguable that the typcache code should be taught to look for
binary-compatible opclasses if it can't find one directly for the
specified type.  I'm not sure offhand what rules we'd need to make
to ensure such a search would yield deterministic results, though.


The risk I see is what happens when someone adds a new operator or cast and 
suddenly we have multiple paths. That should be fine for regular comparisons, 
but probably not in an index.


Another possibility is that we might be able to extend the "text_ops"
btree operator family to include an opclass entry for varchar, rather than
relying on binary compatibility to find the text opclass.  But that would
also require some careful thought to understand what the relaxed
invariants should be for the opfamily structure as a whole.  We don't want
to add more actual operators, for fear of creating ambiguous-operator
lookup failures.


Yeah, to me that sounds like heading back down the road of assuming = means = 
and the other fun we had before classes and families... but maybe I'm just 
being paranoid.

I have an alternative idea, though I'm not sure it's worth the work. Instead of 
having special array-only operators we could instead apply regular operators to 
arrays. I believe we can do this and reuse existing operators, if we store an 
expression of how to combine the result from the previous iteration to the current 
one (ie: for < this would be (prev AND current), if there is a result value 
that should stop iteration (for comparison operators that would be false), and 
what to do with different size arrays. In the last case, you're either going to 
use that to provide a final result, substitute a specific value for any missing 
elements, or throw an error.

Pros:
With some additional information in the catalog, we could provide a lot more 
array operations, using existing operator functions.
We can use the same operator search logic we use for elements. If you can 
perform an operation on 2 elements and that operator has array support, then 
we're good to go. If we'd perform casting on the elements, we'd do the same 
casting with the array values.
We no longer need to assume things like = means equal. If text = int actually 
meant length(text) = int then as long as that operator had array support you 
could do text[] = int[].
This keeps all operator logic together, regardless of whether the input is an 
array or a base element. Not

Cons:
Could this cause problems in the planner? Selectivity estimation comes to mind.
transformAExprOp/make_op would need to do something different for arrays, 
because the function would need more information. If we can just extend OpExpr 
then maybe this isn't a big deal. (A simple grep shows 401 instances of OpExpr 
in src/backend).
I don't think we could use this same technique with rowtypes.
We'd still allow for array-specific operators; perhaps that might cause issues 
or at least confusion.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-12 Thread Tom Lane
Jim Nasby  writes:
> I'd say that array_eq (and probably _cmp) just needs to be taught to fall 
> back to what oper() does, but this part of the commit message gives me pause:

> "Change the operator search algorithms to look for appropriate btree or hash 
> index opclasses, instead of assuming operators named '<' or '=' have the 
> right semantics."

As it should.  array_cmp is the basis for a btree opclass, therefore
it must *not* use operators that are not themselves btree operators.

Quite aside from that, we need to move even further away from having
internal system operations depend on operator-name-based lookups;
see for instance the recent complaints over stuff like IS DISTINCT FROM
failing for types whose operators aren't in the search path.

It's arguable that the typcache code should be taught to look for
binary-compatible opclasses if it can't find one directly for the
specified type.  I'm not sure offhand what rules we'd need to make
to ensure such a search would yield deterministic results, though.

Another possibility is that we might be able to extend the "text_ops"
btree operator family to include an opclass entry for varchar, rather than
relying on binary compatibility to find the text opclass.  But that would
also require some careful thought to understand what the relaxed
invariants should be for the opfamily structure as a whole.  We don't want
to add more actual operators, for fear of creating ambiguous-operator
lookup failures.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-12 Thread Jim Nasby

On 12/9/14, 5:06 PM, Jim Nasby wrote:

On 12/9/14, 4:30 PM, Tom Lane wrote:

Jim Nasby  writes:

On 12/9/14, 4:19 PM, Jim Nasby wrote:

Is there any particular reason we don't allow comparing char and varchar 
arrays? If not I'll submit a patch.



We're also missing operators on text and varchar arrays.


Adding operators would be an incorrect fix.


Right, I'm assuming this is a problem somewhere else (haven't looked into it 
yet).

I just wanted confirmation that this is unexpected before I try and fix it. 
I'll take your silence on that point as confirmation that this is a bug. :)


I've tracked down what's going on here; array_eq is lazy about finding an 
equality operator. It asks lookup_type_cache for TYPECACHE_EQ_OPR_FINFO, which 
means it looks first for a Btree Opclass, then a Hash Opclass. If neither is 
found then we fail.

OTOH, the path taken in transformAExprOp is very different. It ends up at 
oper(), which looks for an exact operator match; if that fails we look for 
binary operators we can coerce to. That's the path that allows this to work in 
the non-array case.

The question is why. :)

array_eq's call to lookup_type_cache was created in 2003 [1] and hasn't been 
touched since. Previously it called equality_oper, which called 
compatible_oper, which called oper (same as transforAExprOp does).

I'd say that array_eq (and probably _cmp) just needs to be taught to fall back 
to what oper() does, but this part of the commit message gives me pause:

"Change the operator search algorithms to look for appropriate btree or hash index 
opclasses, instead of assuming operators named '<' or '=' have the right semantics."

I can see where there are many places where we don't want to just assume than an oprname 
of = actually means =, but does that apply to arrays? If the user says "array = 
array", isn't it safe to assume that that's the same thing as if tried to compare 
two values of the respective typelem's? Wouldn't the same be true for row comparison as 
well?


[1] 
https://github.com/postgres/postgres/blame/master/src/backend/utils/adt/arrayfuncs.c#L3231
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-09 Thread Jim Nasby

On 12/9/14, 4:30 PM, Tom Lane wrote:

Jim Nasby  writes:

On 12/9/14, 4:19 PM, Jim Nasby wrote:

Is there any particular reason we don't allow comparing char and varchar 
arrays? If not I'll submit a patch.



We're also missing operators on text and varchar arrays.


Adding operators would be an incorrect fix.


Right, I'm assuming this is a problem somewhere else (haven't looked into it 
yet).

I just wanted confirmation that this is unexpected before I try and fix it. 
I'll take your silence on that point as confirmation that this is a bug. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-09 Thread Jim Nasby

On 12/9/14, 4:19 PM, Jim Nasby wrote:

Is there any particular reason we don't allow comparing char and varchar 
arrays? If not I'll submit a patch.


We're also missing operators on text and varchar arrays.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] operator does not exist: character varying[] <> character[]

2014-12-09 Thread Tom Lane
Jim Nasby  writes:
> On 12/9/14, 4:19 PM, Jim Nasby wrote:
>> Is there any particular reason we don't allow comparing char and varchar 
>> arrays? If not I'll submit a patch.

> We're also missing operators on text and varchar arrays.

Adding operators would be an incorrect fix.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] operator does not exist: character varying[] <> character[]

2014-12-09 Thread Jim Nasby

Is there any particular reason we don't allow comparing char and varchar 
arrays? If not I'll submit a patch.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers