Re: [HACKERS] proposal: searching in array function - array_position

2015-03-22 Thread Pavel Stehule
2015-03-22 11:30 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com:

 On 22 March 2015 at 06:11, Pavel Stehule pavel.steh...@gmail.com wrote:
  Hi
 
  here is updated patch with array_position, array_positions
 implementation.
 
  It is based on committed code - so please, revert commit
  13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch
 

 I checked this and the changes look good, except that you missed a
 couple of places in src/include/utils/array.h that need updating.

 In the public docs, you should s/position/subscript because that's the
 term used throughout the docs for an index into an array. I still like
 the name array_position() for the function though, because it's
 consistent with the existing position() functions.


updated patch





 Regards,
 Dean

commit 86ec9f27c473de3de0b1fee3c90d40a610fbbdcd
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Sun Mar 22 11:55:24 2015 +0100

doc and header files fix

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index 9ea1068..5e4130a 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -600,6 +600,25 @@ SELECT * FROM sal_emp WHERE pay_by_quarter  ARRAY[1];
   index, as described in xref linkend=indexes-types.
  /para
 
+ para
+  You can also search for specific values in an array using the functionarray_position/
+  and functionarray_positions/ functions. The former returns the subscript of
+  the first occurrence of a value in an array; the latter returns an array with the
+  subscripts of all occurrences of the value in the array.  For example:
+
+programlisting
+SELECT array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon');
+ array_positions
+-
+ 2
+
+SELECT array_positions(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1);
+ array_positions
+-
+ {1,4,8}
+/programlisting
+ /para
+
  tip
   para
Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c89f343..8b12c26 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11480,6 +11480,12 @@ SELECT NULLIF(value, '(none)') ...
 primaryarray_lower/primary
   /indexterm
   indexterm
+primaryarray_position/primary
+  /indexterm
+  indexterm
+primaryarray_positions/primary
+  /indexterm
+  indexterm
 primaryarray_prepend/primary
   /indexterm
   indexterm
@@ -11599,6 +11605,32 @@ SELECT NULLIF(value, '(none)') ...
row
 entry
  literal
+  functionarray_position/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+ /literal
+/entry
+entrytypeint/type/entry
+entryreturns the subscript of the first occurrence of the second
+argument in the array, starting at the element indicated by the third
+argument or at the first element (array must be one-dimensional)/entry
+entryliteralarray_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry
+entryliteral2/literal/entry
+   /row
+   row
+entry
+ literal
+  functionarray_positions/function(typeanyarray/type, typeanyelement/type)
+ /literal
+/entry
+entrytypeint[]/type/entry
+entryreturns an array of subscripts of all occurrences of the second
+argument in the array given as first argument (array must be
+one-dimensional)/entry
+entryliteralarray_positions(ARRAY['A','A','B','A'], 'A')/literal/entry
+entryliteral{1,2,4}/literal/entry
+   /row
+   row
+entry
+ literal
   functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
  /literal
 /entry
@@ -11708,6 +11740,23 @@ NULL baz/literallayout(3 rows)/entry
 /table
 
para
+In functionarray_position/function and functionarray_positions/,
+each array element is compared to the searched value using
+literalIS NOT DISTINCT FROM/literal semantics.
+   /para
+
+   para
+In functionarray_position/function, literalNULL/literal is returned
+if the value is not found.
+   /para
+
+   para
+In functionarray_positions/function, literalNULL/literal is returned
+only if the array is literalNULL/literal; if the value is not found in
+the array, an empty array is returned instead.
+   /para
+
+   para
 In functionstring_to_array/function, if the delimiter parameter is
 NULL, each character in the input string will become a separate element in
 the resulting array.  If the delimiter is an empty string, then the entire
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 6679333..c0bfd33 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -12,9 +12,14 @@
  */
 #include postgres.h
 
+#include catalog/pg_type.h
 #include utils/array.h
 #include utils/builtins.h
 #include utils/lsyscache.h

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-22 Thread Dean Rasheed
On 22 March 2015 at 06:11, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hi

 here is updated patch with array_position, array_positions implementation.

 It is based on committed code - so please, revert commit
 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch


I checked this and the changes look good, except that you missed a
couple of places in src/include/utils/array.h that need updating.

In the public docs, you should s/position/subscript because that's the
term used throughout the docs for an index into an array. I still like
the name array_position() for the function though, because it's
consistent with the existing position() functions.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-22 Thread Pavel Stehule
Hi

here is updated patch with array_position, array_positions implementation.

It is based on committed code - so please, revert commit
13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch

Regards

Pavel


2015-03-20 18:29 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com:
 
   There's an issue when the array's lower bound isn't 1:
  
   select array_offset('[2:4]={1,2,3}'::int[], 1);
array_offset
   --
   1
   (1 row)
  
   whereas I would expect this to return 2. Similarly for
   array_offsets(), so the offsets can be used as indexes into the
   original array.
  
 
  I am thinking, so it is ok - it returns a offset, not position.

 So you can't use it as a subscript?  That sounds unfriendly.  Almost
 every function using this will be subtly broken.

 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..85dc647
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,624 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search for specific values in an array using the functionarray_position/
+   and functionarray_positions/ functions. The former returns the position of
+   the first occurrence of a value in an array; the latter returns an array with the
+   positions of all occurrences of the value in the array.  For example:
+ 
+ programlisting
+ SELECT array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon');
+  array_positions
+ -
+  2
+ 
+ SELECT array_positions(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1);
+  array_positions
+ -
+  {1,4,8}
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index c89f343..c865f30
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11480,11485 
--- 11480,11491 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_position/primary
+   /indexterm
+   indexterm
+ primaryarray_positions/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11599,11604 
--- 11605,11636 
 row
  entry
   literal
+   functionarray_position/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns the position of the first occurrence of the second
+ argument in the array, starting at the element indicated by the third
+ argument or at the first element (array must be one-dimensional)/entry
+ entryliteralarray_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+row
+ entry
+  literal
+   functionarray_positions/function(typeanyarray/type, typeanyelement/type)
+  /literal
+ /entry
+ entrytypeint[]/type/entry
+ entryreturns an array of positions of all occurrences of the second
+ argument in the array given as first argument (array must be
+ one-dimensional)/entry
+ entryliteralarray_positions(ARRAY['A','A','B','A'], 'A')/literal/entry
+ entryliteral{1,2,4}/literal/entry
+/row
+row
+ entry
+  literal
functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
   /literal
  /entry
*** NULL baz/literallayout(3 rows)/entry
*** 11708,11713 
--- 11740,11762 
  /table
  
 para
+ In functionarray_position/function and functionarray_positions/,
+ each array element is compared to the searched value using
+ literalIS NOT DISTINCT FROM/literal semantics.
+/para
+ 
+para
+ In functionarray_position/function, literalNULL/literal is returned
+ if the value is not found.
+/para
+ 
+para
+ In functionarray_positions/function, literalNULL/literal is returned
+ only if the array is literalNULL/literal; if the value is not found in
+ the array, an empty array is returned instead.
+/para
+ 
+para
  In functionstring_to_array/function, if the delimiter parameter is
  NULL, each character in the input string will become a separate element in
  the resulting array.  If the delimiter is an empty string, then the entire
diff --git a/src/backend/utils/adt/array_userfuncs.c 

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-21 Thread Dean Rasheed
 do you have any idea about name for this function? array_position is ok?

 +1 on array_position. It's possible at some point we'll actually want
 array_offset that does what it claims.


+1 for array_position.

-1 for keeping array_offset. I'm not convinced that there are
sufficient use cases for it. No other array functions deal in offsets
relative to the first element, and if you do want that, it is trivial
to achieve with array_position() and array_lower(). IMO having 2
functions for searching in an array will just increase the risk of
users picking the wrong one by accident.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-21 Thread Pavel Stehule
2015-03-21 0:27 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/20/15 2:48 PM, Pavel Stehule wrote:



 2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us
 mailto:t...@sss.pgh.pa.us:

 Alvaro Herrera alvhe...@2ndquadrant.com
 mailto:alvhe...@2ndquadrant.com writes:
  Pavel Stehule wrote:
  I am thinking, so it is ok - it returns a offset, not position.

  So you can't use it as a subscript?  That sounds unfriendly.  Almost
  every function using this will be subtly broken.

 I concur; perhaps offset was the design intention, but it's wrong.
 The result should be a subscript.


 do you have any idea about name for this function? array_position is ok?


 +1 on array_position. It's possible at some point we'll actually want
 array_offset that does what it claims.


additional implementation of array_position needs few lines more



 On another note, you mentioned elsewhere that it's not possible to return
 anything other than an integer. Why can't there be a variation of this
 function that returns an array of ndims-1 that is the slice where a value
 was found?


We talked about it, when we talked about MD searching - and we moved it to
next stage.

I am thinking so array_postions can support MD arrays due returning a array

Regards

Pavel




 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com

commit 8ceb761fcd4bca3859c0ec371ec783a36795dd49
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Sat Mar 21 07:01:51 2015 +0100

initial

diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 57074e0..91e2824 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -19,8 +19,8 @@
 #include utils/typcache.h
 
 
-static Datum array_offset_common(FunctionCallInfo fcinfo);
-
+static Datum array_offset_common(FunctionCallInfo fcinfo, bool expected_position);
+static Datum array_offsets_common(FunctionCallInfo fcinfo, bool expected_position);
 
 /*
  * fetch_array_arg_replace_nulls
@@ -669,13 +669,33 @@ array_agg_array_finalfn(PG_FUNCTION_ARGS)
 Datum
 array_offset(PG_FUNCTION_ARGS)
 {
-	return array_offset_common(fcinfo);
+	return array_offset_common(fcinfo, false);
 }
 
 Datum
 array_offset_start(PG_FUNCTION_ARGS)
 {
-	return array_offset_common(fcinfo);
+	return array_offset_common(fcinfo, false);
+}
+
+/*-
+ * array_positiob, array_position_start :
+ *			return the position of a value in an array.
+ *
+ * IS NOT DISTINCT FROM semantics are used for comparisons.  Return NULL when
+ * the value is not found.
+ *-
+ */
+Datum
+array_position(PG_FUNCTION_ARGS)
+{
+	return array_offset_common(fcinfo, true);
+}
+
+Datum
+array_position_start(PG_FUNCTION_ARGS)
+{
+	return array_offset_common(fcinfo, true);
 }
 
 /*
@@ -686,7 +706,7 @@ array_offset_start(PG_FUNCTION_ARGS)
  * They are not strict so we have to test for null inputs explicitly.
  */
 static Datum
-array_offset_common(FunctionCallInfo fcinfo)
+array_offset_common(FunctionCallInfo fcinfo, bool expected_positions)
 {
 	ArrayType  *array;
 	Oid			collation = PG_GET_COLLATION();
@@ -701,6 +721,7 @@ array_offset_common(FunctionCallInfo fcinfo)
 	ArrayMetaState *my_extra;
 	bool		null_search;
 	ArrayIterator array_iterator;
+	intlb = 1;
 
 	if (PG_ARGISNULL(0))
 		PG_RETURN_NULL();
@@ -731,6 +752,8 @@ array_offset_common(FunctionCallInfo fcinfo)
 		null_search = false;
 	}
 
+	lb = expected_positions ? (ARR_LBOUND(array))[0] : 1;
+
 	/* figure out where to start */
 	if (PG_NARGS() == 3)
 	{
@@ -739,7 +762,7 @@ array_offset_common(FunctionCallInfo fcinfo)
 	(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 	 errmsg(initial offset should not be NULL)));
 
-		offset_min = PG_GETARG_INT32(2);
+		offset_min = PG_GETARG_INT32(2) - lb + 1;
 	}
 	else
 		offset_min = 1;
@@ -818,9 +841,25 @@ array_offset_common(FunctionCallInfo fcinfo)
 	if (!found)
 		PG_RETURN_NULL();
 
-	PG_RETURN_INT32(offset);
+	PG_RETURN_INT32(offset + lb - 1);
+}
+
+/*
+ * simple wrappers over array_offsets_common
+ */
+Datum
+array_offsets(PG_FUNCTION_ARGS)
+{
+	return array_offsets_common(fcinfo, false);
 }
 
+Datum
+array_positions(PG_FUNCTION_ARGS)
+{
+	return array_offsets_common(fcinfo, true);
+}
+
+
 /*-
  * array_offsets :
  *			return an array of offsets of a value in an array.
@@ -833,7 +872,7 @@ array_offset_common(FunctionCallInfo fcinfo)
  *-
  */
 Datum
-array_offsets(PG_FUNCTION_ARGS)
+array_offsets_common(FunctionCallInfo fcinfo, bool expected_positions)
 {
 	ArrayType  *array;
 	Oid			collation = PG_GET_COLLATION();
@@ -847,6 +886,7 @@ array_offsets(PG_FUNCTION_ARGS)
 	bool		

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Jim Nasby

On 3/20/15 2:48 PM, Pavel Stehule wrote:



2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us
mailto:t...@sss.pgh.pa.us:

Alvaro Herrera alvhe...@2ndquadrant.com
mailto:alvhe...@2ndquadrant.com writes:
 Pavel Stehule wrote:
 I am thinking, so it is ok - it returns a offset, not position.

 So you can't use it as a subscript?  That sounds unfriendly.  Almost
 every function using this will be subtly broken.

I concur; perhaps offset was the design intention, but it's wrong.
The result should be a subscript.


do you have any idea about name for this function? array_position is ok?


+1 on array_position. It's possible at some point we'll actually want 
array_offset that does what it claims.


On another note, you mentioned elsewhere that it's not possible to 
return anything other than an integer. Why can't there be a variation of 
this function that returns an array of ndims-1 that is the slice where a 
value was found?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Pavel Stehule wrote:
 I am thinking, so it is ok - it returns a offset, not position.

 So you can't use it as a subscript?  That sounds unfriendly.  Almost
 every function using this will be subtly broken.

I concur; perhaps offset was the design intention, but it's wrong.
The result should be a subscript.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Alvaro Herrera
Pavel Stehule wrote:
 2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com:
 
  There's an issue when the array's lower bound isn't 1:
 
  select array_offset('[2:4]={1,2,3}'::int[], 1);
   array_offset
  --
  1
  (1 row)
 
  whereas I would expect this to return 2. Similarly for
  array_offsets(), so the offsets can be used as indexes into the
  original array.
 
 
 I am thinking, so it is ok - it returns a offset, not position.

So you can't use it as a subscript?  That sounds unfriendly.  Almost
every function using this will be subtly broken.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Pavel Stehule
2015-03-20 18:29 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com:
 
   There's an issue when the array's lower bound isn't 1:
  
   select array_offset('[2:4]={1,2,3}'::int[], 1);
array_offset
   --
   1
   (1 row)
  
   whereas I would expect this to return 2. Similarly for
   array_offsets(), so the offsets can be used as indexes into the
   original array.
  
 
  I am thinking, so it is ok - it returns a offset, not position.

 So you can't use it as a subscript?  That sounds unfriendly.  Almost
 every function using this will be subtly broken.



depends what you want. It means - it is on Nth position from start. So it
is useful when iterate over array, because it is safe against different
array start dimensions. it works, if you use it as offset. It is named
array_offset

It can be changed and renamed to array_position - it is simple fix.  But I
am not sure, if it is better.

Regards

Pavel


 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Dean Rasheed
On 18 March 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Pushed after adding error checks there and fixing the docs to match.
 Please verify.


There's an issue when the array's lower bound isn't 1:

select array_offset('[2:4]={1,2,3}'::int[], 1);
 array_offset
--
1
(1 row)

whereas I would expect this to return 2. Similarly for
array_offsets(), so the offsets can be used as indexes into the
original array.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Pavel Stehule
2015-03-20 17:49 GMT+01:00 Dean Rasheed dean.a.rash...@gmail.com:

 On 18 March 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
  Pushed after adding error checks there and fixing the docs to match.
  Please verify.
 

 There's an issue when the array's lower bound isn't 1:

 select array_offset('[2:4]={1,2,3}'::int[], 1);
  array_offset
 --
 1
 (1 row)

 whereas I would expect this to return 2. Similarly for
 array_offsets(), so the offsets can be used as indexes into the
 original array.


I am thinking, so it is ok - it returns a offset, not position.

Regards

Pavel



 Regards,
 Dean



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-20 Thread Pavel Stehule
2015-03-20 18:47 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Pavel Stehule wrote:
  I am thinking, so it is ok - it returns a offset, not position.

  So you can't use it as a subscript?  That sounds unfriendly.  Almost
  every function using this will be subtly broken.

 I concur; perhaps offset was the design intention, but it's wrong.
 The result should be a subscript.


do you have any idea about name for this function? array_position is ok?

Regards

Pavel



 regards, tom lane



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-19 Thread Pavel Stehule
2015-03-18 20:03 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to:

   I am thinking, so this behave is correct (there is no other
   possible), but it is only corner case for this functionality - and
   if you are thinking, so better to disallow it, I am not against. My
   main focus is 1ND array.
  
   A nonsensical answer for multi-dimensional arrays is worse than no
 answer
   at all.  I think raising an exception is better.
  
 
  I have not problem with it.

 Pushed after adding error checks there and fixing the docs to match.
 Please verify.


it is looking well, thank you.

small issue - there is not documented, so multidimensional arrays are not
supported,

Regards

Pavel



 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-19 Thread Alvaro Herrera
Pavel Stehule wrote:
 2015-03-18 20:03 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

  Pushed after adding error checks there and fixing the docs to match.
  Please verify.
 
 
 it is looking well, thank you.

Thanks for checking.

 small issue - there is not documented, so multidimensional arrays are not
 supported,

I added a parenthised comment in the table, (array must be
one-dimensional).  I copied this from the entry for the array_remove
function IIRC; it seemed enough ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Pavel Stehule
2015-03-18 3:45 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/17/15 8:06 PM, Alvaro Herrera wrote:

 My main question regarding this patch is whether the behavior with MD
 arrays is useful at all.  Suppose I give it this:

 alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
 3);
   array_offset
 --
  3
 (1 fila)

 What can I do with the 3 value it returned?  Certainly not use it as
 an offset to get a slice of the original array.  The only thing that
 seems sensible to me here is to reject the whole thing with an error,
 that is, only accept 1-D arrays here.  We can later extend the function
 by allowing higher dimensionality as long as the second argument is an
 array one dimension less than the first argument.  But if we allow the
 case on its appearance, it's going to be difficult to change the
 behavior later.


This behave is consistent with unnest function, when all multidimensional
arrays are reduced to 1ND arrays.

Other argument for this behave is impossibility to design other behave.
array_offset function have to returns integer always. You cannot to returns
a array of integers, what is necessary for MD position. And one integer can
be only position in flatted  1ND array. I agree, so this is not user
friendly, but there is not any other possible solution - we have not
anyarray and anymdarray types. I designed this possibility (use ND arrays)
mainly for info, if some value exists or not.

I am thinking, so this behave is correct (there is no other possible), but
it is only corner case for this functionality - and if you are thinking, so
better to disallow it, I am not against. My main focus is 1ND array.

Regards

Pavel





 +1

  Has a case been made for the current behavior?


 Not that I remember. There was discussion about how this should properly
 support MD arrays.

 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Pavel Stehule
2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to:

 On 3/18/15 12:27 PM, Pavel Stehule wrote:

 On 3/17/15 8:06 PM, Alvaro Herrera wrote:

  My main question regarding this patch is whether the behavior with MD
 arrays is useful at all.  Suppose I give it this:

 alvherre=# select array_offset('{{{1,2},{3,4},{
 5,6}},{{2,3},{4,5},{6,7}}}',
 3);
array_offset
 --
   3
 (1 fila)

 What can I do with the 3 value it returned?  Certainly not use it as
 an offset to get a slice of the original array.  The only thing that
 seems sensible to me here is to reject the whole thing with an error,
 that is, only accept 1-D arrays here.  We can later extend the function
 by allowing higher dimensionality as long as the second argument is an
 array one dimension less than the first argument.  But if we allow the
 case on its appearance, it's going to be difficult to change the
 behavior later.

  I designed this possibility (use ND arrays)
 mainly for info, if some value exists or not.


 Why not use =ANY() for that?


It is only partial solution - array_offset use IS NOT DISTINCT FROM
operator.


  I am thinking, so this behave is correct (there is no other possible), but
 it is only corner case for this functionality - and if you are thinking,
 so
 better to disallow it, I am not against. My main focus is 1ND array.


 A nonsensical answer for multi-dimensional arrays is worse than no answer
 at all.  I think raising an exception is better.


I have not problem with it.

Regards

Pavel




 .m



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Marko Tiikkaja

On 3/18/15 12:27 PM, Pavel Stehule wrote:

On 3/17/15 8:06 PM, Alvaro Herrera wrote:


My main question regarding this patch is whether the behavior with MD
arrays is useful at all.  Suppose I give it this:

alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
3);
   array_offset
--
  3
(1 fila)

What can I do with the 3 value it returned?  Certainly not use it as
an offset to get a slice of the original array.  The only thing that
seems sensible to me here is to reject the whole thing with an error,
that is, only accept 1-D arrays here.  We can later extend the function
by allowing higher dimensionality as long as the second argument is an
array one dimension less than the first argument.  But if we allow the
case on its appearance, it's going to be difficult to change the
behavior later.


I designed this possibility (use ND arrays)
mainly for info, if some value exists or not.


Why not use =ANY() for that?


I am thinking, so this behave is correct (there is no other possible), but
it is only corner case for this functionality - and if you are thinking, so
better to disallow it, I am not against. My main focus is 1ND array.


A nonsensical answer for multi-dimensional arrays is worse than no 
answer at all.  I think raising an exception is better.



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Alvaro Herrera
Pavel Stehule wrote:
 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to:

  I am thinking, so this behave is correct (there is no other
  possible), but it is only corner case for this functionality - and
  if you are thinking, so better to disallow it, I am not against. My
  main focus is 1ND array.
 
  A nonsensical answer for multi-dimensional arrays is worse than no answer
  at all.  I think raising an exception is better.
 
 
 I have not problem with it.

Pushed after adding error checks there and fixing the docs to match.
Please verify.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-17 Thread Alvaro Herrera
My main question regarding this patch is whether the behavior with MD
arrays is useful at all.  Suppose I give it this:

alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3);
 array_offset 
--
3
(1 fila)

What can I do with the 3 value it returned?  Certainly not use it as
an offset to get a slice of the original array.  The only thing that
seems sensible to me here is to reject the whole thing with an error,
that is, only accept 1-D arrays here.  We can later extend the function
by allowing higher dimensionality as long as the second argument is an
array one dimension less than the first argument.  But if we allow the
case on its appearance, it's going to be difficult to change the
behavior later.

Has a case been made for the current behavior?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services
diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index 9ea1068..092013b 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -600,6 +600,25 @@ SELECT * FROM sal_emp WHERE pay_by_quarter  ARRAY[1];
   index, as described in xref linkend=indexes-types.
  /para
 
+ para
+  You can also search for specific values in an array using the functionarray_offset/
+  and functionarray_offsets/ functions. The former returns the position of
+  the first occurrence of a value in an array; the latter returns an array with the
+  positions of all occurrences of the value in the array.  For example:
+
+programlisting
+SELECT array_offset(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon');
+ array_offset
+--
+ 2
+
+SELECT array_offsets(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1);
+ array_offsets 
+---
+ {1,4,8}
+/programlisting
+ /para
+
  tip
   para
Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c198bea..d8c542e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11480,6 +11480,12 @@ SELECT NULLIF(value, '(none)') ...
 primaryarray_lower/primary
   /indexterm
   indexterm
+primaryarray_offset/primary
+  /indexterm
+  indexterm
+primaryarray_offsets/primary
+  /indexterm
+  indexterm
 primaryarray_prepend/primary
   /indexterm
   indexterm
@@ -11599,6 +11605,39 @@ SELECT NULLIF(value, '(none)') ...
row
 entry
  literal
+  functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+ /literal
+/entry
+entrytypeint/type/entry
+entryreturns the offset of the first occurrence of the second
+argument in the array.  Each array element is compared using the 
+literalIS NOT DISTINCT FROM/ operator.  The optional third
+argument specifies an initial offset to begin the search at.
+Returns literalNULL/ if the value is not found. 
+Multi-dimensional arrays are squashed to one dimension before
+searching./entry
+entryliteralarray_offset(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')/literal/entry
+entryliteral2/literal/entry
+   /row
+   row
+entry
+ literal
+  functionarray_offsets/function(typeanyarray/type, typeanyelement/type)
+ /literal
+/entry
+entrytypeint[]/type/entry
+entryreturns an array of offsets of all occurrences of the second
+argument in the array given as first argument.  Each array element is
+compared using the the literalIS NOT DISTINCT FROM/ operator.
+Returns an empty array when there are no occurences of the value in
+the array. Multi-dimensional arrays are squashed to one dimension before
+searching./entry
+entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry
+entryliteral{1,2,4}/literal/entry
+   /row
+   row
+entry
+ literal
   functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
  /literal
 /entry
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 6679333..dcc7c62 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -12,9 +12,14 @@
  */
 #include postgres.h
 
+#include catalog/pg_type.h
 #include utils/array.h
 #include utils/builtins.h
 #include utils/lsyscache.h
+#include utils/typcache.h
+
+
+static Datum array_offset_common(FunctionCallInfo fcinfo);
 
 
 /*
@@ -652,3 +657,276 @@ array_agg_array_finalfn(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+/*-
+ * array_offset, array_offset_start :
+ *			return the offset of a value in an array.
+ *
+ * IS NOT DISTINCT FROM semantics are used for comparisons.  Return NULL when
+ * the value is not found.
+ 

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-17 Thread Jim Nasby

On 3/17/15 8:06 PM, Alvaro Herrera wrote:

My main question regarding this patch is whether the behavior with MD
arrays is useful at all.  Suppose I give it this:

alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3);
  array_offset
--
 3
(1 fila)

What can I do with the 3 value it returned?  Certainly not use it as
an offset to get a slice of the original array.  The only thing that
seems sensible to me here is to reject the whole thing with an error,
that is, only accept 1-D arrays here.  We can later extend the function
by allowing higher dimensionality as long as the second argument is an
array one dimension less than the first argument.  But if we allow the
case on its appearance, it's going to be difficult to change the
behavior later.


+1


Has a case been made for the current behavior?


Not that I remember. There was discussion about how this should properly 
support MD arrays.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-12 Thread Jim Nasby

On 3/11/15 1:19 AM, Pavel Stehule wrote:

2015-03-11 2:57 GMT+01:00 Robert Haas robertmh...@gmail.com
mailto:robertmh...@gmail.com:

On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com
mailto:jim.na...@bluetreble.com wrote:
 I don't think we need both array_offset and array_offset_start; can't both
 SQL functions just call one C function?

Not if you want the opr_sanity tests to pass.

(But I'm seriously starting to wonder if that's actually a smart rule
for us to be enforcing.  It seems to be something of a pain in the
neck, and I'm unclear as to whether it is preventing any real
problem.)


It is simple protection against some oversights.  I am not against this
check - this rule cleans a interface between C and SQL. More, the
additional C code is usually very short and trivial.

But it should be commented well.


Ahh, ok, makes more sense now. If the separate C functions are serving a 
purpose that's fine. I think the comment should mention it though, as 
it's not exactly the most obvious thing.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-12 Thread Pavel Stehule
2015-03-11 22:50 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/11/15 4:37 PM, Pavel Stehule wrote:
 + /*
 +  * array_offset - returns the offset of a value in an array
 (array_offset and
 +  * array_offset_start are wrappers for safe call (look on opr_sanity
 test) a
 +  * array_offset_common function.
 +  *
 +  * Returns NULL when value is not found. It uses a NOT DISTINCT FROM
 operator
 +  * for comparation to be safe against NULL.
 +  */

 would be better as...

 + /*
 +  * array_offset - returns the offset of a value in an array.
 array_offset and
 +  * array_offset_start are wrappers for the sake of the opr_sanity test.
 +  *
 +  * Returns NULL when value is not found. It uses a NOT DISTINCT FROM
 operator
 +  * for comparation to be safe against NULL.
 +  */


fixed

Regards

Pavel



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..d90266f
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,614 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search for a value in an array using the functionarray_offset/
+   function. It returns the position of the first occurrence of a value in an array:
+ 
+ programlisting
+ SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index c198bea..311f2fe
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11480,11485 
--- 11480,11488 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_offset/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11598,11603 
--- 11601,11637 
 /row
 row
  entry
+  literal
+   functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns the offset of the first occurrence of a value in an
+ array. It uses the literalIS NOT DISTINCT FROM/ operator for
+ comparation. The optional third argument specifies an initial offset to
+ begin the search at.  Returns NULL when the value is not found. Note:
+ multi-dimensional arrays are squashed to one dimension before
+ searching./entry
+ entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+row
+ entry
+  literal
+   functionarray_offsets/function(typeanyarray/type, typeanyelement/type)
+  /literal
+ /entry
+ entrytypeint[]/type/entry
+ entryreturns an array of offsets of all occurrences of a value in a array. It uses
+ the literalIS NOT DISTINCT FROM/ operator for comparation. Returns an empty array
+ when there are no occurences of the value in the array. Note:
+ multi-dimensional input arrays are squashed to one dimension before
+ searching./entry
+ entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry
+ entryliteral{1,2,4}/literal/entry
+/row
+row
+ entry
   literal
functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
   /literal
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
new file mode 100644
index 6679333..0f9ea48
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***
*** 12,20 
--- 12,24 
   */
  #include postgres.h
  
+ #include catalog/pg_type.h
  #include utils/array.h
  #include utils/builtins.h
  #include utils/lsyscache.h
+ #include utils/typcache.h
+ 
+ static Datum array_offset_common(FunctionCallInfo fcinfo);
  
  
  /*
*** array_agg_array_finalfn(PG_FUNCTION_ARGS
*** 652,654 
--- 656,905 
  
  	PG_RETURN_DATUM(result);
  }
+ 
+ 
+ /*
+  * array_offset - returns the offset of a value in an array. array_offset and
+  * array_offset_start are wrappers for the sake of the opr_sanity test.
+  *
+  * Returns NULL when value is not found. It uses a NOT DISTINCT FROM operator
+  * for comparation to be safe against NULL.
+  */
+ Datum
+ array_offset(PG_FUNCTION_ARGS)
+ {
+ 	return array_offset_common(fcinfo);
+ }
+ 
+ Datum
+ array_offset_start(PG_FUNCTION_ARGS)
+ {
+ 	return array_offset_common(fcinfo);
+ }
+ 
+ /*
+  * Common part 

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-12 Thread Jim Nasby

On 3/10/15 9:30 AM, Robert Haas wrote:

On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote:

You still duplicate the type cache code, but many other array functions do
that too so I will not hold that against you. (Maybe somebody should write
separate patch that would put all that duplicate code into common function?)

I think this patch is ready for committer so I am going to mark it as such.


The documentation in this patch needs some improvements to the
English.  Can anyone help with that?


I'll take a look at it.


The documentation should discuss what happens if the array is multi-dimensional.

The code for array_offset and for array_offset_start appear to be
byte-for-byte identical.  There's no comment explaining why, but I bet
it's to make the opr_sanity test pass.  How about adding a comment?

The comment for array_offset_common refers to array_offset_start as
array_offset_startpos.

I am sure in agreement with the idea that it would be good to factor
out the common typecache code (for setting up my_extra).  Any chance
we get a preliminary patch that does that refactoring, and then rebase
the main patch on top of it?  I agree that it's not really this
patch's job to solve that problem, but it would be nice.


Since this patch is here and ready to go I would prefer that we commit 
it and refactor later. I can tackle that unless Pavel specifically wants to.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-12 Thread Pavel Stehule
2015-03-10 16:53 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/10/15 9:30 AM, Robert Haas wrote:

 On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com
 wrote:

 You still duplicate the type cache code, but many other array functions
 do
 that too so I will not hold that against you. (Maybe somebody should
 write
 separate patch that would put all that duplicate code into common
 function?)

 I think this patch is ready for committer so I am going to mark it as
 such.


 The documentation in this patch needs some improvements to the
 English.  Can anyone help with that?


 I'll take a look at it.

  The documentation should discuss what happens if the array is
 multi-dimensional.

 The code for array_offset and for array_offset_start appear to be
 byte-for-byte identical.  There's no comment explaining why, but I bet
 it's to make the opr_sanity test pass.  How about adding a comment?

 The comment for array_offset_common refers to array_offset_start as
 array_offset_startpos.

 I am sure in agreement with the idea that it would be good to factor
 out the common typecache code (for setting up my_extra).  Any chance
 we get a preliminary patch that does that refactoring, and then rebase
 the main patch on top of it?  I agree that it's not really this
 patch's job to solve that problem, but it would be nice.


 Since this patch is here and ready to go I would prefer that we commit it
 and refactor later. I can tackle that unless Pavel specifically wants to.


I'll look on this part this evening - but I don't have any idea how to find
some common pattern yet. So I am with Jim - we can do it later.

Regards

Pavel



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Jim Nasby

On 3/11/15 1:29 AM, Pavel Stehule wrote:


What is this comment supposed to mean? There is no 'width_array'...


It is typo (I am sorry) - should be width_bucket(, array)

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac

The code is similar, but it expect large **sorted** input. array_offset
works on unsorted (alphabetical unsorted) data sets - like days of week ..


The functions are serving rather different purposes, so I'm not sure 
it's worth mentioning. If we do want to mention it, then something like 
the following should be added to *both* functions:


 * This code is similar to width_bucket()

and

 * This code is similar to array_offset()

Incidentally, isn't it bad that we're doing all these static assignments 
inside the loop in width_bucket? Or can we count on the compiler to 
recognize this?


http://lnk.nu/github.com/1dvrr.c
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Pavel Stehule
2015-03-11 22:14 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/11/15 1:29 AM, Pavel Stehule wrote:


 What is this comment supposed to mean? There is no 'width_array'...


 It is typo (I am sorry) - should be width_bucket(, array)

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
 e80252d424278abf65b624669c8e6b3fe8587cac

 The code is similar, but it expect large **sorted** input. array_offset
 works on unsorted (alphabetical unsorted) data sets - like days of week ..


 The functions are serving rather different purposes, so I'm not sure it's
 worth mentioning. If we do want to mention it, then something like the
 following should be added to *both* functions:


ok, I removed this note.

I added  comment about wrapping and I simplified a code there - this method
is used more time in pg for same purposes.

Merged Jim's changes in doc

Pavel





  * This code is similar to width_bucket()

 and

  * This code is similar to array_offset()

 Incidentally, isn't it bad that we're doing all these static assignments
 inside the loop in width_bucket? Or can we count on the compiler to
 recognize this?

 http://lnk.nu/github.com/1dvrr.c

 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..d90266f
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,614 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search for a value in an array using the functionarray_offset/
+   function. It returns the position of the first occurrence of a value in an array:
+ 
+ programlisting
+ SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index c198bea..311f2fe
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11480,11485 
--- 11480,11488 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_offset/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11598,11603 
--- 11601,11637 
 /row
 row
  entry
+  literal
+   functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns the offset of the first occurrence of a value in an
+ array. It uses the literalIS NOT DISTINCT FROM/ operator for
+ comparation. The optional third argument specifies an initial offset to
+ begin the search at.  Returns NULL when the value is not found. Note:
+ multi-dimensional arrays are squashed to one dimension before
+ searching./entry
+ entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+row
+ entry
+  literal
+   functionarray_offsets/function(typeanyarray/type, typeanyelement/type)
+  /literal
+ /entry
+ entrytypeint[]/type/entry
+ entryreturns an array of offsets of all occurrences of a value in a array. It uses
+ the literalIS NOT DISTINCT FROM/ operator for comparation. Returns an empty array
+ when there are no occurences of the value in the array. Note:
+ multi-dimensional input arrays are squashed to one dimension before
+ searching./entry
+ entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry
+ entryliteral{1,2,4}/literal/entry
+/row
+row
+ entry
   literal
functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
   /literal
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
new file mode 100644
index 6679333..a847358
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***
*** 12,20 
--- 12,24 
   */
  #include postgres.h
  
+ #include catalog/pg_type.h
  #include utils/array.h
  #include utils/builtins.h
  #include utils/lsyscache.h
+ #include utils/typcache.h
+ 
+ static Datum array_offset_common(FunctionCallInfo fcinfo);
  
  
  /*
*** array_agg_array_finalfn(PG_FUNCTION_ARGS
*** 652,654 
--- 656,906 
  
  	PG_RETURN_DATUM(result);
  }
+ 
+ 
+ /*
+  * array_offset - returns the offset of a value in an array (array_offset and
+  * array_offset_start are wrappers for safe call 

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Jim Nasby

On 3/11/15 4:37 PM, Pavel Stehule wrote:
+ /*
+  * array_offset - returns the offset of a value in an array 
(array_offset and
+  * array_offset_start are wrappers for safe call (look on opr_sanity 
test) a

+  * array_offset_common function.
+  *
+  * Returns NULL when value is not found. It uses a NOT DISTINCT FROM 
operator

+  * for comparation to be safe against NULL.
+  */

would be better as...

+ /*
+  * array_offset - returns the offset of a value in an array. 
array_offset and

+  * array_offset_start are wrappers for the sake of the opr_sanity test.
+  *
+  * Returns NULL when value is not found. It uses a NOT DISTINCT FROM 
operator

+  * for comparation to be safe against NULL.
+  */
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Jim Nasby

On 3/10/15 5:25 PM, Tom Lane wrote:

Jim Nasby jim.na...@bluetreble.com writes:

That said, this pattern with fn_extra is repeated a lot, even just in
the backend (not counting contrib or extensions). It would be nice if
there was generic support for this.


What do you mean by generic support?  Most of those functions are doing
quite different things with fn_extra --- granted, nearly all of them are
caching something there, but I don't see very much that a generic
infrastructure could bring to the table.


At a glance, almost all the use under src/backend is doing some 
combination of 3 things: get_typlenbyvalalign(), get_type_io_data() or 
getting some operator for a type. This is most notable for records, 
arrays and ranges (though, records actually need an array of this info, 
so maybe we're out of luck there). That pattern exists outside of 
backend too; I think it's used in contrib, and I know at least one 
extension does this.


So my thought was something that accepted fcinfo, IOFuncSelector, and 
TypeCacheEntry.flags. If IOFuncSelector was set, use get_type_io_data; 
else use get_typlenbyvalalign. If TypeCacheEntry.flags is set also look 
up the operator.


Hmm... now that I look at it, the only use of get_type_io_data in 
src/backend seems to be to support arrays. Ranges make use of it too, 
but there's a comment in there that it returns more than what's needed.


Even if a generic support function doesn't make sense, there's a lot of 
repeated fn_extra code for arrays and records. It would be good to at 
least refactor that like what was done for rangetypes. That process 
might provide a more obvious answer to how this could be done generically.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Pavel Stehule
2015-03-11 2:57 GMT+01:00 Robert Haas robertmh...@gmail.com:

 On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com
 wrote:
  I don't think we need both array_offset and array_offset_start; can't
 both
  SQL functions just call one C function?

 Not if you want the opr_sanity tests to pass.

 (But I'm seriously starting to wonder if that's actually a smart rule
 for us to be enforcing.  It seems to be something of a pain in the
 neck, and I'm unclear as to whether it is preventing any real
 problem.)


It is simple protection against some oversights.  I am not against this
check - this rule cleans a interface between C and SQL. More, the
additional C code is usually very short and trivial.

But it should be commented well.

Pavel



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-11 Thread Pavel Stehule
2015-03-10 22:53 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 2/22/15 5:19 AM, Pavel Stehule wrote:



 2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com
 mailto:p...@2ndquadrant.com:

 On 28/01/15 08:15, Pavel Stehule wrote:



 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com
 mailto:jim.na...@bluetreble.com
 mailto:Jim.Nasby@bluetreble.__com
 mailto:jim.na...@bluetreble.com:

  On 1/27/15 4:36 AM, Pavel Stehule wrote:


  It is only partially identical - I would to use cache for
  array_offset, but it is not necessary for array_offsets
 ..
  depends how we would to modify current API to support
 externally
  cached data.


  Externally cached data?


 Some from these functions has own caches for minimize access to
 typcache
 (array_map, array_cmp is example). And in first case, I am trying
 to
 push these information from fn_extra, in second case I don't do
 it,
 because I don't expect a repeated call (and I am expecting so
 type cache
 will be enough).


 You actually do caching via fn_extra in both case and I think that's
 the correct way, and yes that part can be moved common function.

 I also see that the documentation does not say what is returned by
 array_offset if nothing is found (it's documented in code but not in
 sgml).


 rebased + fixed docs


 I don't think we need both array_offset and array_offset_start; can't both
 SQL functions just call one C function?


There is a rule about unique mapping C functions to SQL space - and I don't
think so this rule is bad.



 It might be worth combining the array and non-array versions of this, by
 having a _common function that accepts a boolean and then just run one or
 the other of the while loops. Most of the code seems to be shared between
 the two versions.

 What is this comment supposed to mean? There is no 'width_array'...


It is typo (I am sorry) - should be width_bucket(, array)

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac

The code is similar, but it expect large **sorted** input. array_offset
works on unsorted (alphabetical unsorted) data sets - like days of week ..



  * Biggest difference against width_array is unsorted input array.

 I've attached my doc changes, both alone and with the code.

 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Robert Haas
On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 You still duplicate the type cache code, but many other array functions do
 that too so I will not hold that against you. (Maybe somebody should write
 separate patch that would put all that duplicate code into common function?)

 I think this patch is ready for committer so I am going to mark it as such.

The documentation in this patch needs some improvements to the
English.  Can anyone help with that?

The documentation should discuss what happens if the array is multi-dimensional.

The code for array_offset and for array_offset_start appear to be
byte-for-byte identical.  There's no comment explaining why, but I bet
it's to make the opr_sanity test pass.  How about adding a comment?

The comment for array_offset_common refers to array_offset_start as
array_offset_startpos.

I am sure in agreement with the idea that it would be good to factor
out the common typecache code (for setting up my_extra).  Any chance
we get a preliminary patch that does that refactoring, and then rebase
the main patch on top of it?  I agree that it's not really this
patch's job to solve that problem, but it would be nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Robert Haas
On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 I don't think we need both array_offset and array_offset_start; can't both
 SQL functions just call one C function?

Not if you want the opr_sanity tests to pass.

(But I'm seriously starting to wonder if that's actually a smart rule
for us to be enforcing.  It seems to be something of a pain in the
neck, and I'm unclear as to whether it is preventing any real
problem.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Pavel Stehule
2015-03-10 19:50 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Pavel Stehule pavel.steh...@gmail.com writes:
  2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com:
  I am sure in agreement with the idea that it would be good to factor
  out the common typecache code (for setting up my_extra).  Any chance
  we get a preliminary patch that does that refactoring, and then rebase
  the main patch on top of it?  I agree that it's not really this
  patch's job to solve that problem, but it would be nice.

  The common part is following code:

 There is not all that much commonality; many functions don't bother to
 populate all of the ArrayMetaState fields because they don't need all of
 them.  (The ones you quote don't, in fact.)  You are either going to end
 up with a subroutine that does extra syscache lookups to populate the
 whole struct every time, or a confusing collection of ad-hoc subroutines.
 I'm not convinced that there's a lot of mileage to be gained here.


I have not good feeling about it too. If we would to enhance this are, we
probably need a specific flinfo field and flags to specify more precious
the context of cached informations. my_extra should be reserved for generic
usage. But still there is relative big space for settings some less common
fields like proc.

With extra field in flinfo we can have interface like

bool set_flinfo_type_cache(fcinfo, type, flags);
and usage fcinfo-flinfo-typecache-typlen, ..

I agree with Robert, this can be nice, but it needs more time for design :(

Regards

Pavel



 regards, tom lane



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Robert Haas
On Tue, Mar 10, 2015 at 3:43 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I have not good feeling about it too. If we would to enhance this are, we
 probably need a specific flinfo field and flags to specify more precious the
 context of cached informations. my_extra should be reserved for generic
 usage. But still there is relative big space for settings some less common
 fields like proc.

 With extra field in flinfo we can have interface like

 bool set_flinfo_type_cache(fcinfo, type, flags);
 and usage fcinfo-flinfo-typecache-typlen, ..

 I agree with Robert, this can be nice, but it needs more time for design :(

OK.  If I'm in the minority, I'll desist.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Pavel Stehule
2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com:

 On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote:
  You still duplicate the type cache code, but many other array functions
 do
  that too so I will not hold that against you. (Maybe somebody should
 write
  separate patch that would put all that duplicate code into common
 function?)
 
  I think this patch is ready for committer so I am going to mark it as
 such.

 The documentation in this patch needs some improvements to the
 English.  Can anyone help with that?

 The documentation should discuss what happens if the array is
 multi-dimensional.

 The code for array_offset and for array_offset_start appear to be
 byte-for-byte identical.  There's no comment explaining why, but I bet
 it's to make the opr_sanity test pass.  How about adding a comment?

 The comment for array_offset_common refers to array_offset_start as
 array_offset_startpos.


yes, it is a reason. I'll comment it.


 I am sure in agreement with the idea that it would be good to factor
 out the common typecache code (for setting up my_extra).  Any chance
 we get a preliminary patch that does that refactoring, and then rebase
 the main patch on top of it?  I agree that it's not really this
 patch's job to solve that problem, but it would be nice.


The common part is following code:

my_extra = (ArrayMetaState *) fcinfo-flinfo-fn_extra;
if (my_extra == NULL)
{
fcinfo-flinfo-fn_extra =
MemoryContextAlloc(fcinfo-flinfo-fn_mcxt,

sizeof(ArrayMetaState));
my_extra = (ArrayMetaState *) fcinfo-flinfo-fn_extra;
my_extra-element_type = ~element_type;
}

and

if (my_extra-element_type != element_type)
{
get_typlenbyvalalign(element_type,
 my_extra-typlen,
 my_extra-typbyval,
 my_extra-typalign);

my_extra-element_type = element_type;
}


so we can design function like

(ArrayMetaState *)
GetCachedArrayMetaState(FunctionCallInfo fcinfo, Oid element_type, bool
*reused)
{
   ArrayMetaState *state = (ArrayMetaState *) fcinfo-flinfo-fn_extra;
   if (state == NULL)
   {
 fcinfo-flinfo-fn_extra =
MemoryContextAlloc(fcinfo-flinfo-fn_mcxt,

sizeof(ArrayMetaState));
 state = (ArrayMetaState *) fcinfo-flinfo-fn_extra;
 state-element_type = ~element_type;
   }
  if (state-element_type != element_type)
  {
 get_typlenbyvalalign(element_type,
 my_extra-typlen,
 my_extra-typbyval,
 my_extra-typalign);

  state-element_type = element_type;
  *resused = false;
  }
  else
*reused = true;
}

Usage in code:

array_state = GetCachedArrayMetaState(fceinfo, element_type, reused);
if (!reused)
{
// some other initialization
}

The content is relatively clean, but the most big problem is naming (as
usual)

Comments?

Regards

Pavel


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com:
 I am sure in agreement with the idea that it would be good to factor
 out the common typecache code (for setting up my_extra).  Any chance
 we get a preliminary patch that does that refactoring, and then rebase
 the main patch on top of it?  I agree that it's not really this
 patch's job to solve that problem, but it would be nice.

 The common part is following code:

There is not all that much commonality; many functions don't bother to
populate all of the ArrayMetaState fields because they don't need all of
them.  (The ones you quote don't, in fact.)  You are either going to end
up with a subroutine that does extra syscache lookups to populate the
whole struct every time, or a confusing collection of ad-hoc subroutines.
I'm not convinced that there's a lot of mileage to be gained here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Jim Nasby

On 3/10/15 10:53 AM, Jim Nasby wrote:

On 3/10/15 9:30 AM, Robert Haas wrote:

On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com
wrote:

You still duplicate the type cache code, but many other array
functions do
that too so I will not hold that against you. (Maybe somebody should
write
separate patch that would put all that duplicate code into common
function?)

I think this patch is ready for committer so I am going to mark it as
such.


The documentation in this patch needs some improvements to the
English.  Can anyone help with that?


I'll take a look at it.


The documentation should discuss what happens if the array is
multi-dimensional.


This is the wording I have to describe what happens with a 
multi-dimensional array. I'm not thrilled with it; anyone have better ideas?


Note: multi-dimensional arrays are squashed to one dimension before 
searching.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Jim Nasby

On 3/10/15 2:43 PM, Pavel Stehule wrote:


There is not all that much commonality; many functions don't bother to
populate all of the ArrayMetaState fields because they don't need all of
them.  (The ones you quote don't, in fact.)  You are either going to end
up with a subroutine that does extra syscache lookups to populate the
whole struct every time, or a confusing collection of ad-hoc
subroutines.
I'm not convinced that there's a lot of mileage to be gained here.


I have not good feeling about it too. If we would to enhance this are,
we probably need a specific flinfo field and flags to specify more
precious the context of cached informations. my_extra should be reserved
for generic usage. But still there is relative big space for settings
some less common fields like proc.

With extra field in flinfo we can have interface like

bool set_flinfo_type_cache(fcinfo, type, flags);
and usage fcinfo-flinfo-typecache-typlen, ..


I'm not following what you intended there, but in any case I don't think 
we'd need to change all that much, because there's only 3 cases:


1) Get only the base type info
2) Get base type info and equality operator
3) Get IO info for one IOFunc

Passing the function an enum (and perhaps keeping it in ArrayMetaState) 
would be enough to distinguish between the 3 cases. You'd also need to 
pass in IOFuncSelector.


That said, this pattern with fn_extra is repeated a lot, even just in 
the backend (not counting contrib or extensions). It would be nice if 
there was generic support for this.


decibel@decina:[17:15]~/pgsql/HEAD/src/backend (array_offset_v4 
$)$pg_grep fn_extra|cut -d: -f1|uniq -c

  14 access/gist/gistscan.c
   7 executor/functions.c
   1 executor/nodeWindowAgg.c
  14 utils/adt/array_userfuncs.c
  31 utils/adt/arrayfuncs.c
   4 utils/adt/domains.c
   2 utils/adt/enum.c
   1 utils/adt/int.c
   6 utils/adt/jsonfuncs.c
   1 utils/adt/oid.c
   4 utils/adt/orderedsetaggs.c
   7 utils/adt/rangetypes.c
  24 utils/adt/rowtypes.c
   8 utils/adt/varlena.c
(utils/fmgr/* doesn't count)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 That said, this pattern with fn_extra is repeated a lot, even just in 
 the backend (not counting contrib or extensions). It would be nice if 
 there was generic support for this.

What do you mean by generic support?  Most of those functions are doing
quite different things with fn_extra --- granted, nearly all of them are
caching something there, but I don't see very much that a generic
infrastructure could bring to the table.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-10 Thread Jim Nasby

On 2/22/15 5:19 AM, Pavel Stehule wrote:



2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com
mailto:p...@2ndquadrant.com:

On 28/01/15 08:15, Pavel Stehule wrote:



2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com
mailto:jim.na...@bluetreble.com
mailto:Jim.Nasby@bluetreble.__com
mailto:jim.na...@bluetreble.com:

 On 1/27/15 4:36 AM, Pavel Stehule wrote:


 It is only partially identical - I would to use cache for
 array_offset, but it is not necessary for array_offsets ..
 depends how we would to modify current API to support
externally
 cached data.


 Externally cached data?


Some from these functions has own caches for minimize access to
typcache
(array_map, array_cmp is example). And in first case, I am trying to
push these information from fn_extra, in second case I don't do it,
because I don't expect a repeated call (and I am expecting so
type cache
will be enough).


You actually do caching via fn_extra in both case and I think that's
the correct way, and yes that part can be moved common function.

I also see that the documentation does not say what is returned by
array_offset if nothing is found (it's documented in code but not in
sgml).


rebased + fixed docs


I don't think we need both array_offset and array_offset_start; can't 
both SQL functions just call one C function?


It might be worth combining the array and non-array versions of this, by 
having a _common function that accepts a boolean and then just run one 
or the other of the while loops. Most of the code seems to be shared 
between the two versions.


What is this comment supposed to mean? There is no 'width_array'...

 * Biggest difference against width_array is unsorted input array.

I've attached my doc changes, both alone and with the code.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index 9ea1068..d90266f 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -600,6 +600,15 @@ SELECT * FROM sal_emp WHERE pay_by_quarter  ARRAY[1];
   index, as described in xref linkend=indexes-types.
  /para
 
+ para
+  You can also search for a value in an array using the 
functionarray_offset/
+  function. It returns the position of the first occurrence of a value in an 
array:
+
+programlisting
+SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+/programlisting
+ /para
+
  tip
   para
Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c198bea..311f2fe 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11480,6 +11480,9 @@ SELECT NULLIF(value, '(none)') ...
 primaryarray_lower/primary
   /indexterm
   indexterm
+primaryarray_offset/primary
+  /indexterm
+  indexterm
 primaryarray_prepend/primary
   /indexterm
   indexterm
@@ -11599,6 +11602,37 @@ SELECT NULLIF(value, '(none)') ...
row
 entry
  literal
+  functionarray_offset/function(typeanyarray/type, 
typeanyelement/type optional, typeint/type/optional)
+ /literal
+/entry
+entrytypeint/type/entry
+entryreturns the offset of the first occurrence of a value in an
+array. It uses the literalIS NOT DISTINCT FROM/ operator for
+comparation. The optional third argument specifies an initial offset to
+begin the search at.  Returns NULL when the value is not found. Note:
+multi-dimensional arrays are squashed to one dimension before
+searching./entry
+
entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 
'mon')/literal/entry
+entryliteral2/literal/entry
+   /row
+   row
+entry
+ literal
+  functionarray_offsets/function(typeanyarray/type, 
typeanyelement/type)
+ /literal
+/entry
+entrytypeint[]/type/entry
+entryreturns an array of offsets of all occurrences of a value in a 
array. It uses
+the literalIS NOT DISTINCT FROM/ operator for comparation. Returns 
an empty array
+when there are no occurences of the value in the array. Note:
+multi-dimensional input arrays are squashed to one dimension before
+searching./entry
+entryliteralarray_offsets(ARRAY['A','A','B','A'], 
'A')/literal/entry
+entryliteral{1,2,4}/literal/entry
+   /row
+   row
+entry
+ literal
   functionarray_prepend/function(typeanyelement/type, 
typeanyarray/type)
  /literal
 /entry
diff --git a/src/backend/utils/adt/array_userfuncs.c 
b/src/backend/utils/adt/array_userfuncs.c
index 6679333..f7b7932 100644
--- 

Re: [HACKERS] proposal: searching in array function - array_position

2015-03-07 Thread Petr Jelinek

On 22/02/15 12:19, Pavel Stehule wrote:



2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com
mailto:p...@2ndquadrant.com:

On 28/01/15 08:15, Pavel Stehule wrote:



2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com
mailto:jim.na...@bluetreble.com:

 On 1/27/15 4:36 AM, Pavel Stehule wrote:


 It is only partially identical - I would to use cache for
 array_offset, but it is not necessary for array_offsets ..
 depends how we would to modify current API to support
externally
 cached data.


 Externally cached data?


Some from these functions has own caches for minimize access to
typcache
(array_map, array_cmp is example). And in first case, I am trying to
push these information from fn_extra, in second case I don't do it,
because I don't expect a repeated call (and I am expecting so
type cache
will be enough).


You actually do caching via fn_extra in both case and I think that's
the correct way, and yes that part can be moved common function.

I also see that the documentation does not say what is returned by
array_offset if nothing is found (it's documented in code but not in
sgml).


rebased + fixed docs



You still duplicate the type cache code, but many other array functions 
do that too so I will not hold that against you. (Maybe somebody should 
write separate patch that would put all that duplicate code into common 
function?)


I think this patch is ready for committer so I am going to mark it as such.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-02-22 Thread Pavel Stehule
2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com:

 On 28/01/15 08:15, Pavel Stehule wrote:



 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com
 mailto:jim.na...@bluetreble.com:

 On 1/27/15 4:36 AM, Pavel Stehule wrote:


 It is only partially identical - I would to use cache for
 array_offset, but it is not necessary for array_offsets ..
 depends how we would to modify current API to support externally
 cached data.


 Externally cached data?


 Some from these functions has own caches for minimize access to typcache
 (array_map, array_cmp is example). And in first case, I am trying to
 push these information from fn_extra, in second case I don't do it,
 because I don't expect a repeated call (and I am expecting so type cache
 will be enough).


 You actually do caching via fn_extra in both case and I think that's the
 correct way, and yes that part can be moved common function.

 I also see that the documentation does not say what is returned by
 array_offset if nothing is found (it's documented in code but not in sgml).


rebased + fixed docs

Regards

Pavel



 --
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..b3630b4
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,614 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search any value in array using the functionarray_offset/
+   function (It returns a position of first occurrence of value in the array):
+ 
+ programlisting
+ SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index da2ed67..459343a
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11480,11485 
--- 11480,11488 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_offset/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11598,11603 
--- 11601,11633 
 /row
 row
  entry
+  literal
+   functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns a offset of first occurrence of some element in a array. It uses
+ a literalIS NOT DISTINCT FROM/ operator for comparation. Third
+ optional argument can specify a initial offset when searching starts.
+ Returns NULL when there are not occurence of value in array./entry
+ entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+row
+ entry
+  literal
+   functionarray_offsets/function(typeanyarray/type, typeanyelement/type)
+  /literal
+ /entry
+ entrytypeint[]/type/entry
+ entryreturns a array of offset of all occurrences some element in a array. It uses
+ a literalIS NOT DISTINCT FROM/ operator for comparation. Returns empty array,
+ when there are no occurence of element in input array./entry
+ entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry
+ entryliteral{1,2,4}/literal/entry
+/row
+row
+ entry
   literal
functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
   /literal
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
new file mode 100644
index 5781b95..9bf0eb5
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***
*** 12,20 
--- 12,24 
   */
  #include postgres.h
  
+ #include catalog/pg_type.h
  #include utils/array.h
  #include utils/builtins.h
  #include utils/lsyscache.h
+ #include utils/typcache.h
+ 
+ static bool array_offset_common(FunctionCallInfo fcinfo, int *result);
  
  
  /*
*** array_agg_array_finalfn(PG_FUNCTION_ARGS
*** 644,646 
--- 648,911 
  
  	PG_RETURN_DATUM(result);
  }
+ 
+ 
+ /*
+  * array_offset - returns a offset of entered element in a array.
+  * Returns NULL when values is not a element of the array. It allow
+  * searching a NULL value due using a NOT DISTINCT FROM operator. 
+  * 
+  * Biggest difference against width_array is unsorted input array.
+  */
+ Datum
+ array_offset(PG_FUNCTION_ARGS)
+ {
+ 	int	result;
+ 
+ 	

Re: [HACKERS] proposal: searching in array function - array_position

2015-02-21 Thread Petr Jelinek

On 28/01/15 08:15, Pavel Stehule wrote:



2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com
mailto:jim.na...@bluetreble.com:

On 1/27/15 4:36 AM, Pavel Stehule wrote:


It is only partially identical - I would to use cache for
array_offset, but it is not necessary for array_offsets ..
depends how we would to modify current API to support externally
cached data.


Externally cached data?


Some from these functions has own caches for minimize access to typcache
(array_map, array_cmp is example). And in first case, I am trying to
push these information from fn_extra, in second case I don't do it,
because I don't expect a repeated call (and I am expecting so type cache
will be enough).



You actually do caching via fn_extra in both case and I think that's the 
correct way, and yes that part can be moved common function.


I also see that the documentation does not say what is returned by 
array_offset if nothing is found (it's documented in code but not in sgml).



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-27 Thread Jim Nasby

On 1/27/15 4:36 AM, Pavel Stehule wrote:

2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com 
mailto:jim.na...@bluetreble.com:

On 1/26/15 4:17 PM, Pavel Stehule wrote:

 Any way to reduce the code duplication between the array and 
non-array versions? Maybe factor out the operator caching code?


I though about it - but there is different checks, different result 
processing, different result type.

I didn't find any readable and reduced form :(


Yeah, that's why I was thinking specifically of the operator caching 
code... isn't that identical? That would at least remove a dozen lines...


It is only partially identical - I would to use cache for array_offset, but it 
is not necessary for array_offsets .. depends how we would to modify current 
API to support externally cached data.


Externally cached data?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-27 Thread Pavel Stehule
2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/26/15 4:17 PM, Pavel Stehule wrote:

 Any way to reduce the code duplication between the array and
 non-array versions? Maybe factor out the operator caching code?


 I though about it - but there is different checks, different result
 processing, different result type.

 I didn't find any readable and reduced form :(


 Yeah, that's why I was thinking specifically of the operator caching
 code... isn't that identical? That would at least remove a dozen lines...


It is only partially identical - I would to use cache for array_offset, but
it is not necessary for array_offsets .. depends how we would to modify
current API to support externally cached data.

Regards

Pavel




 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-27 Thread Pavel Stehule
2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/27/15 4:36 AM, Pavel Stehule wrote:

 2015-01-26 23:29 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:
 jim.na...@bluetreble.com:

 On 1/26/15 4:17 PM, Pavel Stehule wrote:

  Any way to reduce the code duplication between the array and
 non-array versions? Maybe factor out the operator caching code?


 I though about it - but there is different checks, different
 result processing, different result type.

 I didn't find any readable and reduced form :(


 Yeah, that's why I was thinking specifically of the operator caching
 code... isn't that identical? That would at least remove a dozen lines...


 It is only partially identical - I would to use cache for array_offset,
 but it is not necessary for array_offsets .. depends how we would to modify
 current API to support externally cached data.


 Externally cached data?


Some from these functions has own caches for minimize access to typcache
(array_map, array_cmp is example). And in first case, I am trying to push
these information from fn_extra, in second case I don't do it, because I
don't expect a repeated call (and I am expecting so type cache will be
enough).

I plan to do some benchmark to calculate if we have to do, or we can use
type cache only.

Pavel



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-26 Thread Jim Nasby

On 1/24/15 2:48 AM, Pavel Stehule wrote:

with array_offsets - returns a array of offsets


+ entryreturns a offset of first occurrence of some element in a 
array. It uses
should be
+ entryreturns the offset of the first occurrence of some element in 
an array. It uses

+ entryreturns a array of offset of all occurrences some element in a 
array. It uses
should be
+ entryreturns an array of the offsets of all occurrences of some 
element in an array. It uses

Any way to reduce the code duplication between the array and non-array 
versions? Maybe factor out the operator caching code?

You should remove the array_length() from the last array_offsets test; I don't 
see that it buys anything.

I think there should be tests for what happens when you feed these functions a 
multi-dimensional array.

Other than that, looks good.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-26 Thread Pavel Stehule
2015-01-26 23:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/24/15 2:48 AM, Pavel Stehule wrote:

 with array_offsets - returns a array of offsets


 + entryreturns a offset of first occurrence of some element in a
 array. It uses
 should be
 + entryreturns the offset of the first occurrence of some
 element in an array. It uses

 + entryreturns a array of offset of all occurrences some element
 in a array. It uses
 should be
 + entryreturns an array of the offsets of all occurrences of
 some element in an array. It uses

 Any way to reduce the code duplication between the array and non-array
 versions? Maybe factor out the operator caching code?


I though about it - but there is different checks, different result
processing, different result type.

I didn't find any readable and reduced form :(


 You should remove the array_length() from the last array_offsets test; I
 don't see that it buys anything.


ok



 I think there should be tests for what happens when you feed these
 functions a multi-dimensional array.


I can do it. Result should be expected - it searching row by row due MD
format


 Other than that, looks good.


Thank you

Pavel


 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-26 Thread Jim Nasby

On 1/26/15 4:17 PM, Pavel Stehule wrote:

Any way to reduce the code duplication between the array and non-array 
versions? Maybe factor out the operator caching code?


I though about it - but there is different checks, different result processing, 
different result type.

I didn't find any readable and reduced form :(


Yeah, that's why I was thinking specifically of the operator caching code... 
isn't that identical? That would at least remove a dozen lines...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-24 Thread Pavel Stehule
Hi

with array_offsets - returns a array of offsets

Regards

Pavel

2015-01-20 21:32 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/20/15 11:12 AM, Pavel Stehule wrote:

 I am sending updated version - it allow third optional argument that
 specify where searching should to start. With it is possible repeatably
 call this function.


 What happened to returning an array of offsets? I think that would be both
 easier to use than this version as well as performing better.

 I see you dropped multi-dimension support, but I think that's fine.

 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..b3630b4
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,614 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search any value in array using the functionarray_offset/
+   function (It returns a position of first occurrence of value in the array):
+ 
+ programlisting
+ SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index d57243a..c563165
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11474,11479 
--- 11474,11482 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_offset/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11592,11597 
--- 11595,11626 
 /row
 row
  entry
+  literal
+   functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns a offset of first occurrence of some element in a array. It uses
+ a literalIS NOT DISTINCT FROM/ operator for comparation. Third
+ optional argument can specify a initial offset when searching starts. /entry
+ entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+row
+ entry
+  literal
+   functionarray_offsets/function(typeanyarray/type, typeanyelement/type)
+  /literal
+ /entry
+ entrytypeint[]/type/entry
+ entryreturns a array of offset of all occurrences some element in a array. It uses
+ a literalIS NOT DISTINCT FROM/ operator for comparation. Returns empty array,
+ when there are no occurence of element in input array./entry
+ entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry
+ entryliteral{1,2,4}/literal/entry
+/row
+row
+ entry
   literal
functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
   /literal
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
new file mode 100644
index 600646e..61a080d
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***
*** 12,20 
--- 12,24 
   */
  #include postgres.h
  
+ #include catalog/pg_type.h
  #include utils/array.h
  #include utils/builtins.h
  #include utils/lsyscache.h
+ #include utils/typcache.h
+ 
+ static bool array_offset_common(FunctionCallInfo fcinfo, int *result);
  
  
  /*-
*** array_agg_array_finalfn(PG_FUNCTION_ARGS
*** 612,614 
--- 616,879 
  
  	PG_RETURN_DATUM(result);
  }
+ 
+ 
+ /*
+  * array_offset - returns a offset of entered element in a array.
+  * Returns NULL when values is not a element of the array. It allow
+  * searching a NULL value due using a NOT DISTINCT FROM operator. 
+  * 
+  * Biggest difference against width_array is unsorted input array.
+  */
+ Datum
+ array_offset(PG_FUNCTION_ARGS)
+ {
+ 	int	result;
+ 
+ 	if (array_offset_common(fcinfo, result))
+ 		PG_RETURN_INT32(result);
+ 
+ 	PG_RETURN_NULL();
+ }
+ 
+ 
+ Datum
+ array_offset_start(PG_FUNCTION_ARGS)
+ {
+ 	int	result;
+ 
+ 	if (array_offset_common(fcinfo, result))
+ 		PG_RETURN_INT32(result);
+ 
+ 	PG_RETURN_NULL();
+ }
+ 
+ /*
+  * Common part for functions array_offset and array_offset_startpos
+  */
+ static bool
+ array_offset_common(FunctionCallInfo fcinfo,
+ 		 int 	*result)
+ {
+ 	ArrayType	*array;
+ 	Oid		collation = PG_GET_COLLATION();
+ 	Oid			element_type;
+ 	Datum		searched_element = (Datum) 0,
+ value;
+ 	bool		

Re: [HACKERS] proposal: searching in array function - array_position

2015-01-20 Thread Pavel Stehule
Hi

I am sending updated version - it allow third optional argument that
specify where searching should to start. With it is possible repeatably
call this function.

Regards

Pavel

2015-01-17 23:43 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi

 here is a proof concept of array_offset function

 possible question:

 * used comparation = or IS NOT DISTINCT FROM

 In this initial proof concept I used IS NOT DISTINCT FROM operator - but
 my opinion is not strong in this question. Both has some advantages and
 disadvantages.

 Regards

 Pavel


 2015-01-16 19:12 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:



 2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/16/15 11:16 AM, Pavel Stehule wrote:



 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:
 jim.na...@bluetreble.com:

 On 1/16/15 3:39 AM, Pavel Stehule wrote:

 I am proposing a simple function, that returns a position of
 element in array.


 Yes please!

 FUNCTION array_position(anyarray, anyelement) RETURNS int


 That won't work on a multi-dimensional array. Ideally it needs to
 accept a slice or an element and return the specifier for the slice.


 It is question, what is a result - probably, there can be a
 multidimensional variant, where result will be a array

 array_position([1,2,3],2) -- 2
 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter
 should to have N-1 dimension of first parameter */


 The problem with that is you can't actually use '2' to get [2,3] back:

 select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
  ?column?
 --
  t
 (1 row)


 yes, but when you are searching a array in array you can use a full slice
 selection:

 postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
 constant every time in this case -- so it should not be returned
   array
 -
  {{1,2}}
 (1 row)





 I think the bigger problem here is we need something better than slices
 for handling subsets of arrays. Even if the function returned [2:2] it's
 still going to behave differently than it will in the non-array case
 because you won't be getting the expected number of dimensions back. :(


 you cannot to return a slice and I don't propose it, although we can
 return a range type or array of range type - but still we cannot to use
 range for a arrays.


  array_position_md([1,2,3],2) -- [2]
 array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1]

 another question is how to solve more than one occurrence on one value
 - probably two sets of functions - first returns first occurrence of value,
 second returns set of occurrence


 Gee, if only way had some way to return multiple elements of
 something... ;P

 In other words, I think all of these should actually return an array of
 positions. I think it's OK for someone that only cares about the first
 instance to just do [1].


 there can be two functions - position - returns first and positions
 returns all as a array



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com




diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
new file mode 100644
index 9ea1068..b3630b4
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
*** SELECT * FROM sal_emp WHERE pay_by_quart
*** 600,605 
--- 600,614 
index, as described in xref linkend=indexes-types.
   /para
  
+  para
+   You can also search any value in array using the functionarray_offset/
+   function (It returns a position of first occurrence of value in the array):
+ 
+ programlisting
+ SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+ /programlisting
+  /para
+ 
   tip
para
 Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 5e7b000..62c9f7f
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT NULLIF(value, '(none)') ...
*** 11474,11479 
--- 11474,11482 
  primaryarray_lower/primary
/indexterm
indexterm
+ primaryarray_offset/primary
+   /indexterm
+   indexterm
  primaryarray_prepend/primary
/indexterm
indexterm
*** SELECT NULLIF(value, '(none)') ...
*** 11592,11597 
--- 11595,11613 
 /row
 row
  entry
+  literal
+   functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional)
+  /literal
+ /entry
+ entrytypeint/type/entry
+ entryreturns a offset of first occurrence of some element in a array. It uses
+ a literalIS NOT DISTINCT FROM/ operator for comparation. Third
+ optional argument can specify a initial offset when searching starts. /entry
+ entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+ entryliteral2/literal/entry
+/row
+   

Re: [HACKERS] proposal: searching in array function - array_position

2015-01-20 Thread Pavel Stehule
2015-01-20 21:32 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/20/15 11:12 AM, Pavel Stehule wrote:

 I am sending updated version - it allow third optional argument that
 specify where searching should to start. With it is possible repeatably
 call this function.


 What happened to returning an array of offsets? I think that would be both
 easier to use than this version as well as performing better.


I have still thinking about this idea. It needs a different function and I
didn't start with this.

Implementation a optional start parameter to array_offset is cheap - and I
am thinking so it can be useful for some use cases.



 I see you dropped multi-dimension support, but I think that's fine.


It can be implemented later. There is no any barriers to later
implementation.


 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-20 Thread Jim Nasby

On 1/20/15 11:12 AM, Pavel Stehule wrote:

I am sending updated version - it allow third optional argument that specify 
where searching should to start. With it is possible repeatably call this 
function.


What happened to returning an array of offsets? I think that would be both 
easier to use than this version as well as performing better.

I see you dropped multi-dimension support, but I think that's fine.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-17 Thread Pavel Stehule
Hi

here is a proof concept of array_offset function

possible question:

* used comparation = or IS NOT DISTINCT FROM

In this initial proof concept I used IS NOT DISTINCT FROM operator - but
my opinion is not strong in this question. Both has some advantages and
disadvantages.

Regards

Pavel


2015-01-16 19:12 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:



 2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/16/15 11:16 AM, Pavel Stehule wrote:



 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:
 jim.na...@bluetreble.com:

 On 1/16/15 3:39 AM, Pavel Stehule wrote:

 I am proposing a simple function, that returns a position of
 element in array.


 Yes please!

 FUNCTION array_position(anyarray, anyelement) RETURNS int


 That won't work on a multi-dimensional array. Ideally it needs to
 accept a slice or an element and return the specifier for the slice.


 It is question, what is a result - probably, there can be a
 multidimensional variant, where result will be a array

 array_position([1,2,3],2) -- 2
 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter
 should to have N-1 dimension of first parameter */


 The problem with that is you can't actually use '2' to get [2,3] back:

 select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
  ?column?
 --
  t
 (1 row)


 yes, but when you are searching a array in array you can use a full slice
 selection:

 postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
 constant every time in this case -- so it should not be returned
   array
 -
  {{1,2}}
 (1 row)





 I think the bigger problem here is we need something better than slices
 for handling subsets of arrays. Even if the function returned [2:2] it's
 still going to behave differently than it will in the non-array case
 because you won't be getting the expected number of dimensions back. :(


 you cannot to return a slice and I don't propose it, although we can
 return a range type or array of range type - but still we cannot to use
 range for a arrays.


  array_position_md([1,2,3],2) -- [2]
 array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1]

 another question is how to solve more than one occurrence on one value -
 probably two sets of functions - first returns first occurrence of value,
 second returns set of occurrence


 Gee, if only way had some way to return multiple elements of something...
 ;P

 In other words, I think all of these should actually return an array of
 positions. I think it's OK for someone that only cares about the first
 instance to just do [1].


 there can be two functions - position - returns first and positions
 returns all as a array



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



commit 9562ae2bf9d0e8afdf745e50857052f6b9052086
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Sat Jan 17 23:35:34 2015 +0100

initial implementation of array_offset function.

diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index 9ea1068..b3630b4 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -600,6 +600,15 @@ SELECT * FROM sal_emp WHERE pay_by_quarter  ARRAY[1];
   index, as described in xref linkend=indexes-types.
  /para
 
+ para
+  You can also search any value in array using the functionarray_offset/
+  function (It returns a position of first occurrence of value in the array):
+
+programlisting
+SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon');
+/programlisting
+ /para
+
  tip
   para
Arrays are not sets; searching for specific array elements
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5e7b000..4aac21b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11474,6 +11474,9 @@ SELECT NULLIF(value, '(none)') ...
 primaryarray_lower/primary
   /indexterm
   indexterm
+primaryarray_offset/primary
+  /indexterm
+  indexterm
 primaryarray_prepend/primary
   /indexterm
   indexterm
@@ -11593,6 +11596,18 @@ SELECT NULLIF(value, '(none)') ...
row
 entry
  literal
+  functionarray_offset/function(typeanyarray/type, typeanyelement/type)
+ /literal
+/entry
+entrytypeint/type/entry
+entryreturns a offset of first occurrence of some element in a array. It uses
+a literalIS NOT DISTINCT FROM/ operator for comparation./entry
+entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry
+entryliteral2/literal/entry
+   /row
+   row
+entry
+ literal
   functionarray_prepend/function(typeanyelement/type, typeanyarray/type)
  /literal
 /entry
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 600646e..62768fb 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ 

Re: [HACKERS] proposal: searching in array function - array_position

2015-01-17 Thread Pavel Stehule
2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/16/15 3:39 AM, Pavel Stehule wrote:

 I am proposing a simple function, that returns a position of element in
 array.


 Yes please!

  FUNCTION array_position(anyarray, anyelement) RETURNS int


 That won't work on a multi-dimensional array. Ideally it needs to accept a
 slice or an element and return the specifier for the slice.


theoretically you can use this function for md arrays too. This function
returns offset, and you can calculate a Nd possition

so maybe better name -- array_offset or some similar

Regards

Pavel



 This wouldn't be so bad if we had an easier way to extract subsets of an
 array, but even that is really ugly because whatever you extract keeps the
 original number of dimensions.

  Implementation is simple (plpgsql code)


 This would actually be written in C though, yes? Otherwise it's not really
 any better than just doing an extension...
 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-16 Thread Jim Nasby

On 1/16/15 3:39 AM, Pavel Stehule wrote:

I am proposing a simple function, that returns a position of element in array.


Yes please!


FUNCTION array_position(anyarray, anyelement) RETURNS int


That won't work on a multi-dimensional array. Ideally it needs to accept a 
slice or an element and return the specifier for the slice.

This wouldn't be so bad if we had an easier way to extract subsets of an array, 
but even that is really ugly because whatever you extract keeps the original 
number of dimensions.


Implementation is simple (plpgsql code)


This would actually be written in C though, yes? Otherwise it's not really any 
better than just doing an extension...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-16 Thread Pavel Stehule
2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/16/15 3:39 AM, Pavel Stehule wrote:

 I am proposing a simple function, that returns a position of element in
 array.


 Yes please!

  FUNCTION array_position(anyarray, anyelement) RETURNS int


 That won't work on a multi-dimensional array. Ideally it needs to accept a
 slice or an element and return the specifier for the slice.


It is question, what is a result - probably, there can be a
multidimensional variant, where result will be a array

array_position([1,2,3],2) -- 2
array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to
have N-1 dimension of first parameter */
array_position_md([1,2,3],2) -- [2]
array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1]

another question is how to solve more than one occurrence on one value -
probably two sets of functions - first returns first occurrence of value,
second returns set of occurrence



 This wouldn't be so bad if we had an easier way to extract subsets of an
 array, but even that is really ugly because whatever you extract keeps the
 original number of dimensions.

  Implementation is simple (plpgsql code)


 This would actually be written in C though, yes? Otherwise it's not really
 any better than just doing an extension...


Sure, I expect a C implementation

Pavel


 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] proposal: searching in array function - array_position

2015-01-16 Thread Jim Nasby

On 1/16/15 11:16 AM, Pavel Stehule wrote:



2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com 
mailto:jim.na...@bluetreble.com:

On 1/16/15 3:39 AM, Pavel Stehule wrote:

I am proposing a simple function, that returns a position of element in 
array.


Yes please!

FUNCTION array_position(anyarray, anyelement) RETURNS int


That won't work on a multi-dimensional array. Ideally it needs to accept a 
slice or an element and return the specifier for the slice.


It is question, what is a result - probably, there can be a multidimensional 
variant, where result will be a array

array_position([1,2,3],2) -- 2
array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should to 
have N-1 dimension of first parameter */


The problem with that is you can't actually use '2' to get [2,3] back:

select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
 ?column?
--
 t
(1 row)

I think the bigger problem here is we need something better than slices for 
handling subsets of arrays. Even if the function returned [2:2] it's still 
going to behave differently than it will in the non-array case because you 
won't be getting the expected number of dimensions back. :(


array_position_md([1,2,3],2) -- [2]
array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1]

another question is how to solve more than one occurrence on one value - 
probably two sets of functions - first returns first occurrence of value, 
second returns set of occurrence


Gee, if only way had some way to return multiple elements of something... ;P

In other words, I think all of these should actually return an array of 
positions. I think it's OK for someone that only cares about the first instance 
to just do [1].
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-01-16 Thread Pavel Stehule
2015-01-16 18:37 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 1/16/15 11:16 AM, Pavel Stehule wrote:



 2015-01-16 17:57 GMT+01:00 Jim Nasby jim.na...@bluetreble.com mailto:
 jim.na...@bluetreble.com:

 On 1/16/15 3:39 AM, Pavel Stehule wrote:

 I am proposing a simple function, that returns a position of
 element in array.


 Yes please!

 FUNCTION array_position(anyarray, anyelement) RETURNS int


 That won't work on a multi-dimensional array. Ideally it needs to
 accept a slice or an element and return the specifier for the slice.


 It is question, what is a result - probably, there can be a
 multidimensional variant, where result will be a array

 array_position([1,2,3],2) -- 2
 array_position([[1,2],[2,3],[3,4]], [2,3]) -- 2 /* 2nd parameter should
 to have N-1 dimension of first parameter */


 The problem with that is you can't actually use '2' to get [2,3] back:

 select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
  ?column?
 --
  t
 (1 row)


yes, but when you are searching a array in array you can use a full slice
selection:

postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
constant every time in this case -- so it should not be returned
  array
-
 {{1,2}}
(1 row)





 I think the bigger problem here is we need something better than slices
 for handling subsets of arrays. Even if the function returned [2:2] it's
 still going to behave differently than it will in the non-array case
 because you won't be getting the expected number of dimensions back. :(


you cannot to return a slice and I don't propose it, although we can return
a range type or array of range type - but still we cannot to use range for
a arrays.


  array_position_md([1,2,3],2) -- [2]
 array_position_md([[1,2],[2,3],[3,4]], 2) -- [2,1]

 another question is how to solve more than one occurrence on one value -
 probably two sets of functions - first returns first occurrence of value,
 second returns set of occurrence


 Gee, if only way had some way to return multiple elements of something...
 ;P

 In other words, I think all of these should actually return an array of
 positions. I think it's OK for someone that only cares about the first
 instance to just do [1].


there can be two functions - position - returns first and positions
returns all as a array



 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com