Re: [HACKERS] proposal: searching in array function - array_position
2015-03-22 11:30 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com: On 22 March 2015 at 06:11, Pavel Stehule pavel.steh...@gmail.com wrote: Hi here is updated patch with array_position, array_positions implementation. It is based on committed code - so please, revert commit 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch I checked this and the changes look good, except that you missed a couple of places in src/include/utils/array.h that need updating. In the public docs, you should s/position/subscript because that's the term used throughout the docs for an index into an array. I still like the name array_position() for the function though, because it's consistent with the existing position() functions. updated patch Regards, Dean commit 86ec9f27c473de3de0b1fee3c90d40a610fbbdcd Author: Pavel Stehule pavel.steh...@gooddata.com Date: Sun Mar 22 11:55:24 2015 +0100 doc and header files fix diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml index 9ea1068..5e4130a 100644 --- a/doc/src/sgml/array.sgml +++ b/doc/src/sgml/array.sgml @@ -600,6 +600,25 @@ SELECT * FROM sal_emp WHERE pay_by_quarter ARRAY[1]; index, as described in xref linkend=indexes-types. /para + para + You can also search for specific values in an array using the functionarray_position/ + and functionarray_positions/ functions. The former returns the subscript of + the first occurrence of a value in an array; the latter returns an array with the + subscripts of all occurrences of the value in the array. For example: + +programlisting +SELECT array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon'); + array_positions +- + 2 + +SELECT array_positions(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1); + array_positions +- + {1,4,8} +/programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index c89f343..8b12c26 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11480,6 +11480,12 @@ SELECT NULLIF(value, '(none)') ... primaryarray_lower/primary /indexterm indexterm +primaryarray_position/primary + /indexterm + indexterm +primaryarray_positions/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm @@ -11599,6 +11605,32 @@ SELECT NULLIF(value, '(none)') ... row entry literal + functionarray_position/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal +/entry +entrytypeint/type/entry +entryreturns the subscript of the first occurrence of the second +argument in the array, starting at the element indicated by the third +argument or at the first element (array must be one-dimensional)/entry +entryliteralarray_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry +entryliteral2/literal/entry + /row + row +entry + literal + functionarray_positions/function(typeanyarray/type, typeanyelement/type) + /literal +/entry +entrytypeint[]/type/entry +entryreturns an array of subscripts of all occurrences of the second +argument in the array given as first argument (array must be +one-dimensional)/entry +entryliteralarray_positions(ARRAY['A','A','B','A'], 'A')/literal/entry +entryliteral{1,2,4}/literal/entry + /row + row +entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry @@ -11708,6 +11740,23 @@ NULL baz/literallayout(3 rows)/entry /table para +In functionarray_position/function and functionarray_positions/, +each array element is compared to the searched value using +literalIS NOT DISTINCT FROM/literal semantics. + /para + + para +In functionarray_position/function, literalNULL/literal is returned +if the value is not found. + /para + + para +In functionarray_positions/function, literalNULL/literal is returned +only if the array is literalNULL/literal; if the value is not found in +the array, an empty array is returned instead. + /para + + para In functionstring_to_array/function, if the delimiter parameter is NULL, each character in the input string will become a separate element in the resulting array. If the delimiter is an empty string, then the entire diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 6679333..c0bfd33 100644 --- a/src/backend/utils/adt/array_userfuncs.c +++ b/src/backend/utils/adt/array_userfuncs.c @@ -12,9 +12,14 @@ */ #include postgres.h +#include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h
Re: [HACKERS] proposal: searching in array function - array_position
On 22 March 2015 at 06:11, Pavel Stehule pavel.steh...@gmail.com wrote: Hi here is updated patch with array_position, array_positions implementation. It is based on committed code - so please, revert commit 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch I checked this and the changes look good, except that you missed a couple of places in src/include/utils/array.h that need updating. In the public docs, you should s/position/subscript because that's the term used throughout the docs for an index into an array. I still like the name array_position() for the function though, because it's consistent with the existing position() functions. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Hi here is updated patch with array_position, array_positions implementation. It is based on committed code - so please, revert commit 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch Regards Pavel 2015-03-20 18:29 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com: There's an issue when the array's lower bound isn't 1: select array_offset('[2:4]={1,2,3}'::int[], 1); array_offset -- 1 (1 row) whereas I would expect this to return 2. Similarly for array_offsets(), so the offsets can be used as indexes into the original array. I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..85dc647 *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,624 index, as described in xref linkend=indexes-types. /para + para + You can also search for specific values in an array using the functionarray_position/ + and functionarray_positions/ functions. The former returns the position of + the first occurrence of a value in an array; the latter returns an array with the + positions of all occurrences of the value in the array. For example: + + programlisting + SELECT array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon'); + array_positions + - + 2 + + SELECT array_positions(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1); + array_positions + - + {1,4,8} + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index c89f343..c865f30 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11480,11485 --- 11480,11491 primaryarray_lower/primary /indexterm indexterm + primaryarray_position/primary + /indexterm + indexterm + primaryarray_positions/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11599,11604 --- 11605,11636 row entry literal + functionarray_position/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns the position of the first occurrence of the second + argument in the array, starting at the element indicated by the third + argument or at the first element (array must be one-dimensional)/entry + entryliteralarray_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +row + entry + literal + functionarray_positions/function(typeanyarray/type, typeanyelement/type) + /literal + /entry + entrytypeint[]/type/entry + entryreturns an array of positions of all occurrences of the second + argument in the array given as first argument (array must be + one-dimensional)/entry + entryliteralarray_positions(ARRAY['A','A','B','A'], 'A')/literal/entry + entryliteral{1,2,4}/literal/entry +/row +row + entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry *** NULL baz/literallayout(3 rows)/entry *** 11708,11713 --- 11740,11762 /table para + In functionarray_position/function and functionarray_positions/, + each array element is compared to the searched value using + literalIS NOT DISTINCT FROM/literal semantics. +/para + +para + In functionarray_position/function, literalNULL/literal is returned + if the value is not found. +/para + +para + In functionarray_positions/function, literalNULL/literal is returned + only if the array is literalNULL/literal; if the value is not found in + the array, an empty array is returned instead. +/para + +para In functionstring_to_array/function, if the delimiter parameter is NULL, each character in the input string will become a separate element in the resulting array. If the delimiter is an empty string, then the entire diff --git a/src/backend/utils/adt/array_userfuncs.c
Re: [HACKERS] proposal: searching in array function - array_position
do you have any idea about name for this function? array_position is ok? +1 on array_position. It's possible at some point we'll actually want array_offset that does what it claims. +1 for array_position. -1 for keeping array_offset. I'm not convinced that there are sufficient use cases for it. No other array functions deal in offsets relative to the first element, and if you do want that, it is trivial to achieve with array_position() and array_lower(). IMO having 2 functions for searching in an array will just increase the risk of users picking the wrong one by accident. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-21 0:27 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 3/20/15 2:48 PM, Pavel Stehule wrote: 2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us mailto:t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@2ndquadrant.com mailto:alvhe...@2ndquadrant.com writes: Pavel Stehule wrote: I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. I concur; perhaps offset was the design intention, but it's wrong. The result should be a subscript. do you have any idea about name for this function? array_position is ok? +1 on array_position. It's possible at some point we'll actually want array_offset that does what it claims. additional implementation of array_position needs few lines more On another note, you mentioned elsewhere that it's not possible to return anything other than an integer. Why can't there be a variation of this function that returns an array of ndims-1 that is the slice where a value was found? We talked about it, when we talked about MD searching - and we moved it to next stage. I am thinking so array_postions can support MD arrays due returning a array Regards Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com commit 8ceb761fcd4bca3859c0ec371ec783a36795dd49 Author: Pavel Stehule pavel.steh...@gooddata.com Date: Sat Mar 21 07:01:51 2015 +0100 initial diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 57074e0..91e2824 100644 --- a/src/backend/utils/adt/array_userfuncs.c +++ b/src/backend/utils/adt/array_userfuncs.c @@ -19,8 +19,8 @@ #include utils/typcache.h -static Datum array_offset_common(FunctionCallInfo fcinfo); - +static Datum array_offset_common(FunctionCallInfo fcinfo, bool expected_position); +static Datum array_offsets_common(FunctionCallInfo fcinfo, bool expected_position); /* * fetch_array_arg_replace_nulls @@ -669,13 +669,33 @@ array_agg_array_finalfn(PG_FUNCTION_ARGS) Datum array_offset(PG_FUNCTION_ARGS) { - return array_offset_common(fcinfo); + return array_offset_common(fcinfo, false); } Datum array_offset_start(PG_FUNCTION_ARGS) { - return array_offset_common(fcinfo); + return array_offset_common(fcinfo, false); +} + +/*- + * array_positiob, array_position_start : + * return the position of a value in an array. + * + * IS NOT DISTINCT FROM semantics are used for comparisons. Return NULL when + * the value is not found. + *- + */ +Datum +array_position(PG_FUNCTION_ARGS) +{ + return array_offset_common(fcinfo, true); +} + +Datum +array_position_start(PG_FUNCTION_ARGS) +{ + return array_offset_common(fcinfo, true); } /* @@ -686,7 +706,7 @@ array_offset_start(PG_FUNCTION_ARGS) * They are not strict so we have to test for null inputs explicitly. */ static Datum -array_offset_common(FunctionCallInfo fcinfo) +array_offset_common(FunctionCallInfo fcinfo, bool expected_positions) { ArrayType *array; Oid collation = PG_GET_COLLATION(); @@ -701,6 +721,7 @@ array_offset_common(FunctionCallInfo fcinfo) ArrayMetaState *my_extra; bool null_search; ArrayIterator array_iterator; + intlb = 1; if (PG_ARGISNULL(0)) PG_RETURN_NULL(); @@ -731,6 +752,8 @@ array_offset_common(FunctionCallInfo fcinfo) null_search = false; } + lb = expected_positions ? (ARR_LBOUND(array))[0] : 1; + /* figure out where to start */ if (PG_NARGS() == 3) { @@ -739,7 +762,7 @@ array_offset_common(FunctionCallInfo fcinfo) (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), errmsg(initial offset should not be NULL))); - offset_min = PG_GETARG_INT32(2); + offset_min = PG_GETARG_INT32(2) - lb + 1; } else offset_min = 1; @@ -818,9 +841,25 @@ array_offset_common(FunctionCallInfo fcinfo) if (!found) PG_RETURN_NULL(); - PG_RETURN_INT32(offset); + PG_RETURN_INT32(offset + lb - 1); +} + +/* + * simple wrappers over array_offsets_common + */ +Datum +array_offsets(PG_FUNCTION_ARGS) +{ + return array_offsets_common(fcinfo, false); } +Datum +array_positions(PG_FUNCTION_ARGS) +{ + return array_offsets_common(fcinfo, true); +} + + /*- * array_offsets : * return an array of offsets of a value in an array. @@ -833,7 +872,7 @@ array_offset_common(FunctionCallInfo fcinfo) *- */ Datum -array_offsets(PG_FUNCTION_ARGS) +array_offsets_common(FunctionCallInfo fcinfo, bool expected_positions) { ArrayType *array; Oid collation = PG_GET_COLLATION(); @@ -847,6 +886,7 @@ array_offsets(PG_FUNCTION_ARGS) bool
Re: [HACKERS] proposal: searching in array function - array_position
On 3/20/15 2:48 PM, Pavel Stehule wrote: 2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us mailto:t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@2ndquadrant.com mailto:alvhe...@2ndquadrant.com writes: Pavel Stehule wrote: I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. I concur; perhaps offset was the design intention, but it's wrong. The result should be a subscript. do you have any idea about name for this function? array_position is ok? +1 on array_position. It's possible at some point we'll actually want array_offset that does what it claims. On another note, you mentioned elsewhere that it's not possible to return anything other than an integer. Why can't there be a variation of this function that returns an array of ndims-1 that is the slice where a value was found? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Alvaro Herrera alvhe...@2ndquadrant.com writes: Pavel Stehule wrote: I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. I concur; perhaps offset was the design intention, but it's wrong. The result should be a subscript. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Pavel Stehule wrote: 2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com: There's an issue when the array's lower bound isn't 1: select array_offset('[2:4]={1,2,3}'::int[], 1); array_offset -- 1 (1 row) whereas I would expect this to return 2. Similarly for array_offsets(), so the offsets can be used as indexes into the original array. I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-20 18:29 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com: There's an issue when the array's lower bound isn't 1: select array_offset('[2:4]={1,2,3}'::int[], 1); array_offset -- 1 (1 row) whereas I would expect this to return 2. Similarly for array_offsets(), so the offsets can be used as indexes into the original array. I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. depends what you want. It means - it is on Nth position from start. So it is useful when iterate over array, because it is safe against different array start dimensions. it works, if you use it as offset. It is named array_offset It can be changed and renamed to array_position - it is simple fix. But I am not sure, if it is better. Regards Pavel -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] proposal: searching in array function - array_position
On 18 March 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed after adding error checks there and fixing the docs to match. Please verify. There's an issue when the array's lower bound isn't 1: select array_offset('[2:4]={1,2,3}'::int[], 1); array_offset -- 1 (1 row) whereas I would expect this to return 2. Similarly for array_offsets(), so the offsets can be used as indexes into the original array. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com: On 18 March 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed after adding error checks there and fixing the docs to match. Please verify. There's an issue when the array's lower bound isn't 1: select array_offset('[2:4]={1,2,3}'::int[], 1); array_offset -- 1 (1 row) whereas I would expect this to return 2. Similarly for array_offsets(), so the offsets can be used as indexes into the original array. I am thinking, so it is ok - it returns a offset, not position. Regards Pavel Regards, Dean
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@2ndquadrant.com writes: Pavel Stehule wrote: I am thinking, so it is ok - it returns a offset, not position. So you can't use it as a subscript? That sounds unfriendly. Almost every function using this will be subtly broken. I concur; perhaps offset was the design intention, but it's wrong. The result should be a subscript. do you have any idea about name for this function? array_position is ok? Regards Pavel regards, tom lane
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-18 20:03 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to: I am thinking, so this behave is correct (there is no other possible), but it is only corner case for this functionality - and if you are thinking, so better to disallow it, I am not against. My main focus is 1ND array. A nonsensical answer for multi-dimensional arrays is worse than no answer at all. I think raising an exception is better. I have not problem with it. Pushed after adding error checks there and fixing the docs to match. Please verify. it is looking well, thank you. small issue - there is not documented, so multidimensional arrays are not supported, Regards Pavel -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] proposal: searching in array function - array_position
Pavel Stehule wrote: 2015-03-18 20:03 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pushed after adding error checks there and fixing the docs to match. Please verify. it is looking well, thank you. Thanks for checking. small issue - there is not documented, so multidimensional arrays are not supported, I added a parenthised comment in the table, (array must be one-dimensional). I copied this from the entry for the array_remove function IIRC; it seemed enough ... -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-18 3:45 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 3/17/15 8:06 PM, Alvaro Herrera wrote: My main question regarding this patch is whether the behavior with MD arrays is useful at all. Suppose I give it this: alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3); array_offset -- 3 (1 fila) What can I do with the 3 value it returned? Certainly not use it as an offset to get a slice of the original array. The only thing that seems sensible to me here is to reject the whole thing with an error, that is, only accept 1-D arrays here. We can later extend the function by allowing higher dimensionality as long as the second argument is an array one dimension less than the first argument. But if we allow the case on its appearance, it's going to be difficult to change the behavior later. This behave is consistent with unnest function, when all multidimensional arrays are reduced to 1ND arrays. Other argument for this behave is impossibility to design other behave. array_offset function have to returns integer always. You cannot to returns a array of integers, what is necessary for MD position. And one integer can be only position in flatted 1ND array. I agree, so this is not user friendly, but there is not any other possible solution - we have not anyarray and anymdarray types. I designed this possibility (use ND arrays) mainly for info, if some value exists or not. I am thinking, so this behave is correct (there is no other possible), but it is only corner case for this functionality - and if you are thinking, so better to disallow it, I am not against. My main focus is 1ND array. Regards Pavel +1 Has a case been made for the current behavior? Not that I remember. There was discussion about how this should properly support MD arrays. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to: On 3/18/15 12:27 PM, Pavel Stehule wrote: On 3/17/15 8:06 PM, Alvaro Herrera wrote: My main question regarding this patch is whether the behavior with MD arrays is useful at all. Suppose I give it this: alvherre=# select array_offset('{{{1,2},{3,4},{ 5,6}},{{2,3},{4,5},{6,7}}}', 3); array_offset -- 3 (1 fila) What can I do with the 3 value it returned? Certainly not use it as an offset to get a slice of the original array. The only thing that seems sensible to me here is to reject the whole thing with an error, that is, only accept 1-D arrays here. We can later extend the function by allowing higher dimensionality as long as the second argument is an array one dimension less than the first argument. But if we allow the case on its appearance, it's going to be difficult to change the behavior later. I designed this possibility (use ND arrays) mainly for info, if some value exists or not. Why not use =ANY() for that? It is only partial solution - array_offset use IS NOT DISTINCT FROM operator. I am thinking, so this behave is correct (there is no other possible), but it is only corner case for this functionality - and if you are thinking, so better to disallow it, I am not against. My main focus is 1ND array. A nonsensical answer for multi-dimensional arrays is worse than no answer at all. I think raising an exception is better. I have not problem with it. Regards Pavel .m
Re: [HACKERS] proposal: searching in array function - array_position
On 3/18/15 12:27 PM, Pavel Stehule wrote: On 3/17/15 8:06 PM, Alvaro Herrera wrote: My main question regarding this patch is whether the behavior with MD arrays is useful at all. Suppose I give it this: alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3); array_offset -- 3 (1 fila) What can I do with the 3 value it returned? Certainly not use it as an offset to get a slice of the original array. The only thing that seems sensible to me here is to reject the whole thing with an error, that is, only accept 1-D arrays here. We can later extend the function by allowing higher dimensionality as long as the second argument is an array one dimension less than the first argument. But if we allow the case on its appearance, it's going to be difficult to change the behavior later. I designed this possibility (use ND arrays) mainly for info, if some value exists or not. Why not use =ANY() for that? I am thinking, so this behave is correct (there is no other possible), but it is only corner case for this functionality - and if you are thinking, so better to disallow it, I am not against. My main focus is 1ND array. A nonsensical answer for multi-dimensional arrays is worse than no answer at all. I think raising an exception is better. .m -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Pavel Stehule wrote: 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to: I am thinking, so this behave is correct (there is no other possible), but it is only corner case for this functionality - and if you are thinking, so better to disallow it, I am not against. My main focus is 1ND array. A nonsensical answer for multi-dimensional arrays is worse than no answer at all. I think raising an exception is better. I have not problem with it. Pushed after adding error checks there and fixing the docs to match. Please verify. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
My main question regarding this patch is whether the behavior with MD arrays is useful at all. Suppose I give it this: alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3); array_offset -- 3 (1 fila) What can I do with the 3 value it returned? Certainly not use it as an offset to get a slice of the original array. The only thing that seems sensible to me here is to reject the whole thing with an error, that is, only accept 1-D arrays here. We can later extend the function by allowing higher dimensionality as long as the second argument is an array one dimension less than the first argument. But if we allow the case on its appearance, it's going to be difficult to change the behavior later. Has a case been made for the current behavior? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml index 9ea1068..092013b 100644 --- a/doc/src/sgml/array.sgml +++ b/doc/src/sgml/array.sgml @@ -600,6 +600,25 @@ SELECT * FROM sal_emp WHERE pay_by_quarter ARRAY[1]; index, as described in xref linkend=indexes-types. /para + para + You can also search for specific values in an array using the functionarray_offset/ + and functionarray_offsets/ functions. The former returns the position of + the first occurrence of a value in an array; the latter returns an array with the + positions of all occurrences of the value in the array. For example: + +programlisting +SELECT array_offset(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon'); + array_offset +-- + 2 + +SELECT array_offsets(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1); + array_offsets +--- + {1,4,8} +/programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index c198bea..d8c542e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11480,6 +11480,12 @@ SELECT NULLIF(value, '(none)') ... primaryarray_lower/primary /indexterm indexterm +primaryarray_offset/primary + /indexterm + indexterm +primaryarray_offsets/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm @@ -11599,6 +11605,39 @@ SELECT NULLIF(value, '(none)') ... row entry literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal +/entry +entrytypeint/type/entry +entryreturns the offset of the first occurrence of the second +argument in the array. Each array element is compared using the +literalIS NOT DISTINCT FROM/ operator. The optional third +argument specifies an initial offset to begin the search at. +Returns literalNULL/ if the value is not found. +Multi-dimensional arrays are squashed to one dimension before +searching./entry +entryliteralarray_offset(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry +entryliteral2/literal/entry + /row + row +entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal +/entry +entrytypeint[]/type/entry +entryreturns an array of offsets of all occurrences of the second +argument in the array given as first argument. Each array element is +compared using the the literalIS NOT DISTINCT FROM/ operator. +Returns an empty array when there are no occurences of the value in +the array. Multi-dimensional arrays are squashed to one dimension before +searching./entry +entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry +entryliteral{1,2,4}/literal/entry + /row + row +entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 6679333..dcc7c62 100644 --- a/src/backend/utils/adt/array_userfuncs.c +++ b/src/backend/utils/adt/array_userfuncs.c @@ -12,9 +12,14 @@ */ #include postgres.h +#include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h +#include utils/typcache.h + + +static Datum array_offset_common(FunctionCallInfo fcinfo); /* @@ -652,3 +657,276 @@ array_agg_array_finalfn(PG_FUNCTION_ARGS) PG_RETURN_DATUM(result); } + +/*- + * array_offset, array_offset_start : + * return the offset of a value in an array. + * + * IS NOT DISTINCT FROM semantics are used for comparisons. Return NULL when + * the value is not found. +
Re: [HACKERS] proposal: searching in array function - array_position
On 3/17/15 8:06 PM, Alvaro Herrera wrote: My main question regarding this patch is whether the behavior with MD arrays is useful at all. Suppose I give it this: alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3); array_offset -- 3 (1 fila) What can I do with the 3 value it returned? Certainly not use it as an offset to get a slice of the original array. The only thing that seems sensible to me here is to reject the whole thing with an error, that is, only accept 1-D arrays here. We can later extend the function by allowing higher dimensionality as long as the second argument is an array one dimension less than the first argument. But if we allow the case on its appearance, it's going to be difficult to change the behavior later. +1 Has a case been made for the current behavior? Not that I remember. There was discussion about how this should properly support MD arrays. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 3/11/15 1:19 AM, Pavel Stehule wrote: 2015-03-11 2:57 GMT+01:00 Robert Haas robertmh...@gmail.com mailto:robertmh...@gmail.com: On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com wrote: I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? Not if you want the opr_sanity tests to pass. (But I'm seriously starting to wonder if that's actually a smart rule for us to be enforcing. It seems to be something of a pain in the neck, and I'm unclear as to whether it is preventing any real problem.) It is simple protection against some oversights. I am not against this check - this rule cleans a interface between C and SQL. More, the additional C code is usually very short and trivial. But it should be commented well. Ahh, ok, makes more sense now. If the separate C functions are serving a purpose that's fine. I think the comment should mention it though, as it's not exactly the most obvious thing. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-11 22:50 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 3/11/15 4:37 PM, Pavel Stehule wrote: + /* + * array_offset - returns the offset of a value in an array (array_offset and + * array_offset_start are wrappers for safe call (look on opr_sanity test) a + * array_offset_common function. + * + * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator + * for comparation to be safe against NULL. + */ would be better as... + /* + * array_offset - returns the offset of a value in an array. array_offset and + * array_offset_start are wrappers for the sake of the opr_sanity test. + * + * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator + * for comparation to be safe against NULL. + */ fixed Regards Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..d90266f *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,614 index, as described in xref linkend=indexes-types. /para + para + You can also search for a value in an array using the functionarray_offset/ + function. It returns the position of the first occurrence of a value in an array: + + programlisting + SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index c198bea..311f2fe *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11480,11485 --- 11480,11488 primaryarray_lower/primary /indexterm indexterm + primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11598,11603 --- 11601,11637 /row row entry + literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns the offset of the first occurrence of a value in an + array. It uses the literalIS NOT DISTINCT FROM/ operator for + comparation. The optional third argument specifies an initial offset to + begin the search at. Returns NULL when the value is not found. Note: + multi-dimensional arrays are squashed to one dimension before + searching./entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +row + entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal + /entry + entrytypeint[]/type/entry + entryreturns an array of offsets of all occurrences of a value in a array. It uses + the literalIS NOT DISTINCT FROM/ operator for comparation. Returns an empty array + when there are no occurences of the value in the array. Note: + multi-dimensional input arrays are squashed to one dimension before + searching./entry + entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry + entryliteral{1,2,4}/literal/entry +/row +row + entry literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c new file mode 100644 index 6679333..0f9ea48 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *** *** 12,20 --- 12,24 */ #include postgres.h + #include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h + #include utils/typcache.h + + static Datum array_offset_common(FunctionCallInfo fcinfo); /* *** array_agg_array_finalfn(PG_FUNCTION_ARGS *** 652,654 --- 656,905 PG_RETURN_DATUM(result); } + + + /* + * array_offset - returns the offset of a value in an array. array_offset and + * array_offset_start are wrappers for the sake of the opr_sanity test. + * + * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator + * for comparation to be safe against NULL. + */ + Datum + array_offset(PG_FUNCTION_ARGS) + { + return array_offset_common(fcinfo); + } + + Datum + array_offset_start(PG_FUNCTION_ARGS) + { + return array_offset_common(fcinfo); + } + + /* + * Common part
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 9:30 AM, Robert Haas wrote: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? I'll take a look at it. The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appear to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. Since this patch is here and ready to go I would prefer that we commit it and refactor later. I can tackle that unless Pavel specifically wants to. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 16:53 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 3/10/15 9:30 AM, Robert Haas wrote: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? I'll take a look at it. The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appear to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. Since this patch is here and ready to go I would prefer that we commit it and refactor later. I can tackle that unless Pavel specifically wants to. I'll look on this part this evening - but I don't have any idea how to find some common pattern yet. So I am with Jim - we can do it later. Regards Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 3/11/15 1:29 AM, Pavel Stehule wrote: What is this comment supposed to mean? There is no 'width_array'... It is typo (I am sorry) - should be width_bucket(, array) http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac The code is similar, but it expect large **sorted** input. array_offset works on unsorted (alphabetical unsorted) data sets - like days of week .. The functions are serving rather different purposes, so I'm not sure it's worth mentioning. If we do want to mention it, then something like the following should be added to *both* functions: * This code is similar to width_bucket() and * This code is similar to array_offset() Incidentally, isn't it bad that we're doing all these static assignments inside the loop in width_bucket? Or can we count on the compiler to recognize this? http://lnk.nu/github.com/1dvrr.c -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-11 22:14 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 3/11/15 1:29 AM, Pavel Stehule wrote: What is this comment supposed to mean? There is no 'width_array'... It is typo (I am sorry) - should be width_bucket(, array) http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= e80252d424278abf65b624669c8e6b3fe8587cac The code is similar, but it expect large **sorted** input. array_offset works on unsorted (alphabetical unsorted) data sets - like days of week .. The functions are serving rather different purposes, so I'm not sure it's worth mentioning. If we do want to mention it, then something like the following should be added to *both* functions: ok, I removed this note. I added comment about wrapping and I simplified a code there - this method is used more time in pg for same purposes. Merged Jim's changes in doc Pavel * This code is similar to width_bucket() and * This code is similar to array_offset() Incidentally, isn't it bad that we're doing all these static assignments inside the loop in width_bucket? Or can we count on the compiler to recognize this? http://lnk.nu/github.com/1dvrr.c -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..d90266f *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,614 index, as described in xref linkend=indexes-types. /para + para + You can also search for a value in an array using the functionarray_offset/ + function. It returns the position of the first occurrence of a value in an array: + + programlisting + SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index c198bea..311f2fe *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11480,11485 --- 11480,11488 primaryarray_lower/primary /indexterm indexterm + primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11598,11603 --- 11601,11637 /row row entry + literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns the offset of the first occurrence of a value in an + array. It uses the literalIS NOT DISTINCT FROM/ operator for + comparation. The optional third argument specifies an initial offset to + begin the search at. Returns NULL when the value is not found. Note: + multi-dimensional arrays are squashed to one dimension before + searching./entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +row + entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal + /entry + entrytypeint[]/type/entry + entryreturns an array of offsets of all occurrences of a value in a array. It uses + the literalIS NOT DISTINCT FROM/ operator for comparation. Returns an empty array + when there are no occurences of the value in the array. Note: + multi-dimensional input arrays are squashed to one dimension before + searching./entry + entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry + entryliteral{1,2,4}/literal/entry +/row +row + entry literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c new file mode 100644 index 6679333..a847358 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *** *** 12,20 --- 12,24 */ #include postgres.h + #include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h + #include utils/typcache.h + + static Datum array_offset_common(FunctionCallInfo fcinfo); /* *** array_agg_array_finalfn(PG_FUNCTION_ARGS *** 652,654 --- 656,906 PG_RETURN_DATUM(result); } + + + /* + * array_offset - returns the offset of a value in an array (array_offset and + * array_offset_start are wrappers for safe call
Re: [HACKERS] proposal: searching in array function - array_position
On 3/11/15 4:37 PM, Pavel Stehule wrote: + /* + * array_offset - returns the offset of a value in an array (array_offset and + * array_offset_start are wrappers for safe call (look on opr_sanity test) a + * array_offset_common function. + * + * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator + * for comparation to be safe against NULL. + */ would be better as... + /* + * array_offset - returns the offset of a value in an array. array_offset and + * array_offset_start are wrappers for the sake of the opr_sanity test. + * + * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator + * for comparation to be safe against NULL. + */ -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 5:25 PM, Tom Lane wrote: Jim Nasby jim.na...@bluetreble.com writes: That said, this pattern with fn_extra is repeated a lot, even just in the backend (not counting contrib or extensions). It would be nice if there was generic support for this. What do you mean by generic support? Most of those functions are doing quite different things with fn_extra --- granted, nearly all of them are caching something there, but I don't see very much that a generic infrastructure could bring to the table. At a glance, almost all the use under src/backend is doing some combination of 3 things: get_typlenbyvalalign(), get_type_io_data() or getting some operator for a type. This is most notable for records, arrays and ranges (though, records actually need an array of this info, so maybe we're out of luck there). That pattern exists outside of backend too; I think it's used in contrib, and I know at least one extension does this. So my thought was something that accepted fcinfo, IOFuncSelector, and TypeCacheEntry.flags. If IOFuncSelector was set, use get_type_io_data; else use get_typlenbyvalalign. If TypeCacheEntry.flags is set also look up the operator. Hmm... now that I look at it, the only use of get_type_io_data in src/backend seems to be to support arrays. Ranges make use of it too, but there's a comment in there that it returns more than what's needed. Even if a generic support function doesn't make sense, there's a lot of repeated fn_extra code for arrays and records. It would be good to at least refactor that like what was done for rangetypes. That process might provide a more obvious answer to how this could be done generically. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-11 2:57 GMT+01:00 Robert Haas robertmh...@gmail.com: On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com wrote: I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? Not if you want the opr_sanity tests to pass. (But I'm seriously starting to wonder if that's actually a smart rule for us to be enforcing. It seems to be something of a pain in the neck, and I'm unclear as to whether it is preventing any real problem.) It is simple protection against some oversights. I am not against this check - this rule cleans a interface between C and SQL. More, the additional C code is usually very short and trivial. But it should be commented well. Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 22:53 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 2/22/15 5:19 AM, Pavel Stehule wrote: 2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com: On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com mailto:Jim.Nasby@bluetreble.__com mailto:jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). You actually do caching via fn_extra in both case and I think that's the correct way, and yes that part can be moved common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). rebased + fixed docs I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? There is a rule about unique mapping C functions to SQL space - and I don't think so this rule is bad. It might be worth combining the array and non-array versions of this, by having a _common function that accepts a boolean and then just run one or the other of the while loops. Most of the code seems to be shared between the two versions. What is this comment supposed to mean? There is no 'width_array'... It is typo (I am sorry) - should be width_bucket(, array) http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac The code is similar, but it expect large **sorted** input. array_offset works on unsorted (alphabetical unsorted) data sets - like days of week .. * Biggest difference against width_array is unsorted input array. I've attached my doc changes, both alone and with the code. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appear to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com wrote: I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? Not if you want the opr_sanity tests to pass. (But I'm seriously starting to wonder if that's actually a smart rule for us to be enforcing. It seems to be something of a pain in the neck, and I'm unclear as to whether it is preventing any real problem.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 19:50 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: 2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. The common part is following code: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. I have not good feeling about it too. If we would to enhance this are, we probably need a specific flinfo field and flags to specify more precious the context of cached informations. my_extra should be reserved for generic usage. But still there is relative big space for settings some less common fields like proc. With extra field in flinfo we can have interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo-flinfo-typecache-typlen, .. I agree with Robert, this can be nice, but it needs more time for design :( Regards Pavel regards, tom lane
Re: [HACKERS] proposal: searching in array function - array_position
On Tue, Mar 10, 2015 at 3:43 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I have not good feeling about it too. If we would to enhance this are, we probably need a specific flinfo field and flags to specify more precious the context of cached informations. my_extra should be reserved for generic usage. But still there is relative big space for settings some less common fields like proc. With extra field in flinfo we can have interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo-flinfo-typecache-typlen, .. I agree with Robert, this can be nice, but it needs more time for design :( OK. If I'm in the minority, I'll desist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appear to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. yes, it is a reason. I'll comment it. I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. The common part is following code: my_extra = (ArrayMetaState *) fcinfo-flinfo-fn_extra; if (my_extra == NULL) { fcinfo-flinfo-fn_extra = MemoryContextAlloc(fcinfo-flinfo-fn_mcxt, sizeof(ArrayMetaState)); my_extra = (ArrayMetaState *) fcinfo-flinfo-fn_extra; my_extra-element_type = ~element_type; } and if (my_extra-element_type != element_type) { get_typlenbyvalalign(element_type, my_extra-typlen, my_extra-typbyval, my_extra-typalign); my_extra-element_type = element_type; } so we can design function like (ArrayMetaState *) GetCachedArrayMetaState(FunctionCallInfo fcinfo, Oid element_type, bool *reused) { ArrayMetaState *state = (ArrayMetaState *) fcinfo-flinfo-fn_extra; if (state == NULL) { fcinfo-flinfo-fn_extra = MemoryContextAlloc(fcinfo-flinfo-fn_mcxt, sizeof(ArrayMetaState)); state = (ArrayMetaState *) fcinfo-flinfo-fn_extra; state-element_type = ~element_type; } if (state-element_type != element_type) { get_typlenbyvalalign(element_type, my_extra-typlen, my_extra-typbyval, my_extra-typalign); state-element_type = element_type; *resused = false; } else *reused = true; } Usage in code: array_state = GetCachedArrayMetaState(fceinfo, element_type, reused); if (!reused) { // some other initialization } The content is relatively clean, but the most big problem is naming (as usual) Comments? Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: searching in array function - array_position
Pavel Stehule pavel.steh...@gmail.com writes: 2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. The common part is following code: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 10:53 AM, Jim Nasby wrote: On 3/10/15 9:30 AM, Robert Haas wrote: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? I'll take a look at it. The documentation should discuss what happens if the array is multi-dimensional. This is the wording I have to describe what happens with a multi-dimensional array. I'm not thrilled with it; anyone have better ideas? Note: multi-dimensional arrays are squashed to one dimension before searching. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 2:43 PM, Pavel Stehule wrote: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. I have not good feeling about it too. If we would to enhance this are, we probably need a specific flinfo field and flags to specify more precious the context of cached informations. my_extra should be reserved for generic usage. But still there is relative big space for settings some less common fields like proc. With extra field in flinfo we can have interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo-flinfo-typecache-typlen, .. I'm not following what you intended there, but in any case I don't think we'd need to change all that much, because there's only 3 cases: 1) Get only the base type info 2) Get base type info and equality operator 3) Get IO info for one IOFunc Passing the function an enum (and perhaps keeping it in ArrayMetaState) would be enough to distinguish between the 3 cases. You'd also need to pass in IOFuncSelector. That said, this pattern with fn_extra is repeated a lot, even just in the backend (not counting contrib or extensions). It would be nice if there was generic support for this. decibel@decina:[17:15]~/pgsql/HEAD/src/backend (array_offset_v4 $)$pg_grep fn_extra|cut -d: -f1|uniq -c 14 access/gist/gistscan.c 7 executor/functions.c 1 executor/nodeWindowAgg.c 14 utils/adt/array_userfuncs.c 31 utils/adt/arrayfuncs.c 4 utils/adt/domains.c 2 utils/adt/enum.c 1 utils/adt/int.c 6 utils/adt/jsonfuncs.c 1 utils/adt/oid.c 4 utils/adt/orderedsetaggs.c 7 utils/adt/rangetypes.c 24 utils/adt/rowtypes.c 8 utils/adt/varlena.c (utils/fmgr/* doesn't count) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Jim Nasby jim.na...@bluetreble.com writes: That said, this pattern with fn_extra is repeated a lot, even just in the backend (not counting contrib or extensions). It would be nice if there was generic support for this. What do you mean by generic support? Most of those functions are doing quite different things with fn_extra --- granted, nearly all of them are caching something there, but I don't see very much that a generic infrastructure could bring to the table. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 2/22/15 5:19 AM, Pavel Stehule wrote: 2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com: On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com mailto:Jim.Nasby@bluetreble.__com mailto:jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). You actually do caching via fn_extra in both case and I think that's the correct way, and yes that part can be moved common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). rebased + fixed docs I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? It might be worth combining the array and non-array versions of this, by having a _common function that accepts a boolean and then just run one or the other of the while loops. Most of the code seems to be shared between the two versions. What is this comment supposed to mean? There is no 'width_array'... * Biggest difference against width_array is unsorted input array. I've attached my doc changes, both alone and with the code. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml index 9ea1068..d90266f 100644 --- a/doc/src/sgml/array.sgml +++ b/doc/src/sgml/array.sgml @@ -600,6 +600,15 @@ SELECT * FROM sal_emp WHERE pay_by_quarter ARRAY[1]; index, as described in xref linkend=indexes-types. /para + para + You can also search for a value in an array using the functionarray_offset/ + function. It returns the position of the first occurrence of a value in an array: + +programlisting +SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); +/programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index c198bea..311f2fe 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11480,6 +11480,9 @@ SELECT NULLIF(value, '(none)') ... primaryarray_lower/primary /indexterm indexterm +primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm @@ -11599,6 +11602,37 @@ SELECT NULLIF(value, '(none)') ... row entry literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal +/entry +entrytypeint/type/entry +entryreturns the offset of the first occurrence of a value in an +array. It uses the literalIS NOT DISTINCT FROM/ operator for +comparation. The optional third argument specifies an initial offset to +begin the search at. Returns NULL when the value is not found. Note: +multi-dimensional arrays are squashed to one dimension before +searching./entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry +entryliteral2/literal/entry + /row + row +entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal +/entry +entrytypeint[]/type/entry +entryreturns an array of offsets of all occurrences of a value in a array. It uses +the literalIS NOT DISTINCT FROM/ operator for comparation. Returns an empty array +when there are no occurences of the value in the array. Note: +multi-dimensional input arrays are squashed to one dimension before +searching./entry +entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry +entryliteral{1,2,4}/literal/entry + /row + row +entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 6679333..f7b7932 100644 ---
Re: [HACKERS] proposal: searching in array function - array_position
On 22/02/15 12:19, Pavel Stehule wrote: 2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com: On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). You actually do caching via fn_extra in both case and I think that's the correct way, and yes that part can be moved common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). rebased + fixed docs You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com: On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). You actually do caching via fn_extra in both case and I think that's the correct way, and yes that part can be moved common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). rebased + fixed docs Regards Pavel -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..b3630b4 *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,614 index, as described in xref linkend=indexes-types. /para + para + You can also search any value in array using the functionarray_offset/ + function (It returns a position of first occurrence of value in the array): + + programlisting + SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index da2ed67..459343a *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11480,11485 --- 11480,11488 primaryarray_lower/primary /indexterm indexterm + primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11598,11603 --- 11601,11633 /row row entry + literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns a offset of first occurrence of some element in a array. It uses + a literalIS NOT DISTINCT FROM/ operator for comparation. Third + optional argument can specify a initial offset when searching starts. + Returns NULL when there are not occurence of value in array./entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +row + entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal + /entry + entrytypeint[]/type/entry + entryreturns a array of offset of all occurrences some element in a array. It uses + a literalIS NOT DISTINCT FROM/ operator for comparation. Returns empty array, + when there are no occurence of element in input array./entry + entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry + entryliteral{1,2,4}/literal/entry +/row +row + entry literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c new file mode 100644 index 5781b95..9bf0eb5 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *** *** 12,20 --- 12,24 */ #include postgres.h + #include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h + #include utils/typcache.h + + static bool array_offset_common(FunctionCallInfo fcinfo, int *result); /* *** array_agg_array_finalfn(PG_FUNCTION_ARGS *** 644,646 --- 648,911 PG_RETURN_DATUM(result); } + + + /* + * array_offset - returns a offset of entered element in a array. + * Returns NULL when values is not a element of the array. It allow + * searching a NULL value due using a NOT DISTINCT FROM operator. + * + * Biggest difference against width_array is unsorted input array. + */ + Datum + array_offset(PG_FUNCTION_ARGS) + { + int result; + +
Re: [HACKERS] proposal: searching in array function - array_position
On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). You actually do caching via fn_extra in both case and I think that's the correct way, and yes that part can be moved common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On 1/27/15 4:36 AM, Pavel Stehule wrote: 2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com: On 1/26/15 4:17 PM, Pavel Stehule wrote: Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? I though about it - but there is different checks, different result processing, different result type. I didn't find any readable and reduced form :( Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines... It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/26/15 4:17 PM, Pavel Stehule wrote: Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? I though about it - but there is different checks, different result processing, different result type. I didn't find any readable and reduced form :( Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines... It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Regards Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: 2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto: jim.na...@bluetreble.com: On 1/26/15 4:17 PM, Pavel Stehule wrote: Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? I though about it - but there is different checks, different result processing, different result type. I didn't find any readable and reduced form :( Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines... It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data. Externally cached data? Some from these functions has own caches for minimize access to typcache (array_map, array_cmp is example). And in first case, I am trying to push these information from fn_extra, in second case I don't do it, because I don't expect a repeated call (and I am expecting so type cache will be enough). I plan to do some benchmark to calculate if we have to do, or we can use type cache only. Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 1/24/15 2:48 AM, Pavel Stehule wrote: with array_offsets - returns a array of offsets + entryreturns a offset of first occurrence of some element in a array. It uses should be + entryreturns the offset of the first occurrence of some element in an array. It uses + entryreturns a array of offset of all occurrences some element in a array. It uses should be + entryreturns an array of the offsets of all occurrences of some element in an array. It uses Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? You should remove the array_length() from the last array_offsets test; I don't see that it buys anything. I think there should be tests for what happens when you feed these functions a multi-dimensional array. Other than that, looks good. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-26 23:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/24/15 2:48 AM, Pavel Stehule wrote: with array_offsets - returns a array of offsets + entryreturns a offset of first occurrence of some element in a array. It uses should be + entryreturns the offset of the first occurrence of some element in an array. It uses + entryreturns a array of offset of all occurrences some element in a array. It uses should be + entryreturns an array of the offsets of all occurrences of some element in an array. It uses Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? I though about it - but there is different checks, different result processing, different result type. I didn't find any readable and reduced form :( You should remove the array_length() from the last array_offsets test; I don't see that it buys anything. ok I think there should be tests for what happens when you feed these functions a multi-dimensional array. I can do it. Result should be expected - it searching row by row due MD format Other than that, looks good. Thank you Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 1/26/15 4:17 PM, Pavel Stehule wrote: Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code? I though about it - but there is different checks, different result processing, different result type. I didn't find any readable and reduced form :( Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Hi with array_offsets - returns a array of offsets Regards Pavel 2015-01-20 21:32 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/20/15 11:12 AM, Pavel Stehule wrote: I am sending updated version - it allow third optional argument that specify where searching should to start. With it is possible repeatably call this function. What happened to returning an array of offsets? I think that would be both easier to use than this version as well as performing better. I see you dropped multi-dimension support, but I think that's fine. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..b3630b4 *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,614 index, as described in xref linkend=indexes-types. /para + para + You can also search any value in array using the functionarray_offset/ + function (It returns a position of first occurrence of value in the array): + + programlisting + SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index d57243a..c563165 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11474,11479 --- 11474,11482 primaryarray_lower/primary /indexterm indexterm + primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11592,11597 --- 11595,11626 /row row entry + literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns a offset of first occurrence of some element in a array. It uses + a literalIS NOT DISTINCT FROM/ operator for comparation. Third + optional argument can specify a initial offset when searching starts. /entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +row + entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal + /entry + entrytypeint[]/type/entry + entryreturns a array of offset of all occurrences some element in a array. It uses + a literalIS NOT DISTINCT FROM/ operator for comparation. Returns empty array, + when there are no occurence of element in input array./entry + entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry + entryliteral{1,2,4}/literal/entry +/row +row + entry literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c new file mode 100644 index 600646e..61a080d *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *** *** 12,20 --- 12,24 */ #include postgres.h + #include catalog/pg_type.h #include utils/array.h #include utils/builtins.h #include utils/lsyscache.h + #include utils/typcache.h + + static bool array_offset_common(FunctionCallInfo fcinfo, int *result); /*- *** array_agg_array_finalfn(PG_FUNCTION_ARGS *** 612,614 --- 616,879 PG_RETURN_DATUM(result); } + + + /* + * array_offset - returns a offset of entered element in a array. + * Returns NULL when values is not a element of the array. It allow + * searching a NULL value due using a NOT DISTINCT FROM operator. + * + * Biggest difference against width_array is unsorted input array. + */ + Datum + array_offset(PG_FUNCTION_ARGS) + { + int result; + + if (array_offset_common(fcinfo, result)) + PG_RETURN_INT32(result); + + PG_RETURN_NULL(); + } + + + Datum + array_offset_start(PG_FUNCTION_ARGS) + { + int result; + + if (array_offset_common(fcinfo, result)) + PG_RETURN_INT32(result); + + PG_RETURN_NULL(); + } + + /* + * Common part for functions array_offset and array_offset_startpos + */ + static bool + array_offset_common(FunctionCallInfo fcinfo, + int *result) + { + ArrayType *array; + Oid collation = PG_GET_COLLATION(); + Oid element_type; + Datum searched_element = (Datum) 0, + value; + bool
Re: [HACKERS] proposal: searching in array function - array_position
Hi I am sending updated version - it allow third optional argument that specify where searching should to start. With it is possible repeatably call this function. Regards Pavel 2015-01-17 23:43 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com: Hi here is a proof concept of array_offset function possible question: * used comparation = or IS NOT DISTINCT FROM In this initial proof concept I used IS NOT DISTINCT FROM operator - but my opinion is not strong in this question. Both has some advantages and disadvantages. Regards Pavel 2015-01-16 19:12 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com: 2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/16/15 11:16 AM, Pavel Stehule wrote: 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto: jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array array_position([1,2,3],2) -- 2 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to have N-1 dimension of first parameter */ The problem with that is you can't actually use '2' to get [2,3] back: select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL; ?column? -- t (1 row) yes, but when you are searching a array in array you can use a full slice selection: postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a constant every time in this case -- so it should not be returned array - {{1,2}} (1 row) I think the bigger problem here is we need something better than slices for handling subsets of arrays. Even if the function returned [2:2] it's still going to behave differently than it will in the non-array case because you won't be getting the expected number of dimensions back. :( you cannot to return a slice and I don't propose it, although we can return a range type or array of range type - but still we cannot to use range for a arrays. array_position_md([1,2,3],2) -- [2] array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1] another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence Gee, if only way had some way to return multiple elements of something... ;P In other words, I think all of these should actually return an array of positions. I think it's OK for someone that only cares about the first instance to just do [1]. there can be two functions - position - returns first and positions returns all as a array -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml new file mode 100644 index 9ea1068..b3630b4 *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** SELECT * FROM sal_emp WHERE pay_by_quart *** 600,605 --- 600,614 index, as described in xref linkend=indexes-types. /para + para + You can also search any value in array using the functionarray_offset/ + function (It returns a position of first occurrence of value in the array): + + programlisting + SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); + /programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index 5e7b000..62c9f7f *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** SELECT NULLIF(value, '(none)') ... *** 11474,11479 --- 11474,11482 primaryarray_lower/primary /indexterm indexterm + primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm *** SELECT NULLIF(value, '(none)') ... *** 11592,11597 --- 11595,11613 /row row entry + literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal + /entry + entrytypeint/type/entry + entryreturns a offset of first occurrence of some element in a array. It uses + a literalIS NOT DISTINCT FROM/ operator for comparation. Third + optional argument can specify a initial offset when searching starts. /entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry + entryliteral2/literal/entry +/row +
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-20 21:32 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/20/15 11:12 AM, Pavel Stehule wrote: I am sending updated version - it allow third optional argument that specify where searching should to start. With it is possible repeatably call this function. What happened to returning an array of offsets? I think that would be both easier to use than this version as well as performing better. I have still thinking about this idea. It needs a different function and I didn't start with this. Implementation a optional start parameter to array_offset is cheap - and I am thinking so it can be useful for some use cases. I see you dropped multi-dimension support, but I think that's fine. It can be implemented later. There is no any barriers to later implementation. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 1/20/15 11:12 AM, Pavel Stehule wrote: I am sending updated version - it allow third optional argument that specify where searching should to start. With it is possible repeatably call this function. What happened to returning an array of offsets? I think that would be both easier to use than this version as well as performing better. I see you dropped multi-dimension support, but I think that's fine. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
Hi here is a proof concept of array_offset function possible question: * used comparation = or IS NOT DISTINCT FROM In this initial proof concept I used IS NOT DISTINCT FROM operator - but my opinion is not strong in this question. Both has some advantages and disadvantages. Regards Pavel 2015-01-16 19:12 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com: 2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/16/15 11:16 AM, Pavel Stehule wrote: 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto: jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array array_position([1,2,3],2) -- 2 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to have N-1 dimension of first parameter */ The problem with that is you can't actually use '2' to get [2,3] back: select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL; ?column? -- t (1 row) yes, but when you are searching a array in array you can use a full slice selection: postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a constant every time in this case -- so it should not be returned array - {{1,2}} (1 row) I think the bigger problem here is we need something better than slices for handling subsets of arrays. Even if the function returned [2:2] it's still going to behave differently than it will in the non-array case because you won't be getting the expected number of dimensions back. :( you cannot to return a slice and I don't propose it, although we can return a range type or array of range type - but still we cannot to use range for a arrays. array_position_md([1,2,3],2) -- [2] array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1] another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence Gee, if only way had some way to return multiple elements of something... ;P In other words, I think all of these should actually return an array of positions. I think it's OK for someone that only cares about the first instance to just do [1]. there can be two functions - position - returns first and positions returns all as a array -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com commit 9562ae2bf9d0e8afdf745e50857052f6b9052086 Author: Pavel Stehule pavel.steh...@gooddata.com Date: Sat Jan 17 23:35:34 2015 +0100 initial implementation of array_offset function. diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml index 9ea1068..b3630b4 100644 --- a/doc/src/sgml/array.sgml +++ b/doc/src/sgml/array.sgml @@ -600,6 +600,15 @@ SELECT * FROM sal_emp WHERE pay_by_quarter ARRAY[1]; index, as described in xref linkend=indexes-types. /para + para + You can also search any value in array using the functionarray_offset/ + function (It returns a position of first occurrence of value in the array): + +programlisting +SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); +/programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 5e7b000..4aac21b 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11474,6 +11474,9 @@ SELECT NULLIF(value, '(none)') ... primaryarray_lower/primary /indexterm indexterm +primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm @@ -11593,6 +11596,18 @@ SELECT NULLIF(value, '(none)') ... row entry literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type) + /literal +/entry +entrytypeint/type/entry +entryreturns a offset of first occurrence of some element in a array. It uses +a literalIS NOT DISTINCT FROM/ operator for comparation./entry +entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry +entryliteral2/literal/entry + /row + row +entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 600646e..62768fb 100644 --- a/src/backend/utils/adt/array_userfuncs.c +++
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. theoretically you can use this function for md arrays too. This function returns offset, and you can calculate a Nd possition so maybe better name -- array_offset or some similar Regards Pavel This wouldn't be so bad if we had an easier way to extract subsets of an array, but even that is really ugly because whatever you extract keeps the original number of dimensions. Implementation is simple (plpgsql code) This would actually be written in C though, yes? Otherwise it's not really any better than just doing an extension... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. This wouldn't be so bad if we had an easier way to extract subsets of an array, but even that is really ugly because whatever you extract keeps the original number of dimensions. Implementation is simple (plpgsql code) This would actually be written in C though, yes? Otherwise it's not really any better than just doing an extension... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array array_position([1,2,3],2) -- 2 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to have N-1 dimension of first parameter */ array_position_md([1,2,3],2) -- [2] array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1] another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence This wouldn't be so bad if we had an easier way to extract subsets of an array, but even that is really ugly because whatever you extract keeps the original number of dimensions. Implementation is simple (plpgsql code) This would actually be written in C though, yes? Otherwise it's not really any better than just doing an extension... Sure, I expect a C implementation Pavel -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
On 1/16/15 11:16 AM, Pavel Stehule wrote: 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array array_position([1,2,3],2) -- 2 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to have N-1 dimension of first parameter */ The problem with that is you can't actually use '2' to get [2,3] back: select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL; ?column? -- t (1 row) I think the bigger problem here is we need something better than slices for handling subsets of arrays. Even if the function returned [2:2] it's still going to behave differently than it will in the non-array case because you won't be getting the expected number of dimensions back. :( array_position_md([1,2,3],2) -- [2] array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1] another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence Gee, if only way had some way to return multiple elements of something... ;P In other words, I think all of these should actually return an array of positions. I think it's OK for someone that only cares about the first instance to just do [1]. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/16/15 11:16 AM, Pavel Stehule wrote: 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto: jim.na...@bluetreble.com: On 1/16/15 3:39 AM, Pavel Stehule wrote: I am proposing a simple function, that returns a position of element in array. Yes please! FUNCTION array_position(anyarray, anyelement) RETURNS int That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice. It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array array_position([1,2,3],2) -- 2 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to have N-1 dimension of first parameter */ The problem with that is you can't actually use '2' to get [2,3] back: select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL; ?column? -- t (1 row) yes, but when you are searching a array in array you can use a full slice selection: postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a constant every time in this case -- so it should not be returned array - {{1,2}} (1 row) I think the bigger problem here is we need something better than slices for handling subsets of arrays. Even if the function returned [2:2] it's still going to behave differently than it will in the non-array case because you won't be getting the expected number of dimensions back. :( you cannot to return a slice and I don't propose it, although we can return a range type or array of range type - but still we cannot to use range for a arrays. array_position_md([1,2,3],2) -- [2] array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1] another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence Gee, if only way had some way to return multiple elements of something... ;P In other words, I think all of these should actually return an array of positions. I think it's OK for someone that only cares about the first instance to just do [1]. there can be two functions - position - returns first and positions returns all as a array -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com