Re: [PERFORM] Postgres array parser

2011-12-14 Thread Marc Mamin
> Yes, it would be great, but I haven't found such a function, which
> splits 2 dimensional array into rows =) Maybe we'll modify existing
> function, but unfortunately we have tried hstore type and function in
> postgres and we see a significant performance improvements. So we only
> need to convert existing data into hstore and I think this is a good
> solution.



I haven't tested hstore yet, but I would be interested to find out if it still 
better perform with custom "numeric" aggregates on the hstore values.

I've made a short "proof of concept"  test with a custom key/value type to 
achieve such an aggregation.
Something like:


   SELECT x, distinct_sum( (currency,amount)::keyval ) overview  FROM ... GROUP 
BY x

   x currency amount
   a  EUR   15.0
   a  EUR5.0
   a  CHF7.5
   b  USD   12.0
   =>

   x  overview
   -  
   a {(EUR,20.0), (CHF,7.5)}
   b {(USD,10.0)}


regards,

Marc Mamin

 
> On 12/14/2011 11:21 AM, Marc Mamin wrote:
> > Hello,
> >
> > For such cases (see below), it would be nice to have an unnest
> function that only affect the first array dimension.
> >
> > Something like
> >
> > unnest(ARRAY[[1,2],[2,3]], SLICE=1)
> > =>
> > unnest
> > --
> > [1,2]
> > [2,3]
> >
> >
> > With this function, I imagine that following sql function
> > might beat the plpgsql FOREACH version.
> >
> >
> > CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[],
> in_input_nr numeric)
> > RETURNS numeric AS
> > $BODY$
> >
> >SELECT u[1][2]
> >FROM unnest($1, SLICE =1) u
> >WHERE u[1][1]=in_input_nr
> >LIMIT 1;
> >
> > $BODY$
> > LANGUAGE sql IMMUTABLE;
> >
> >
> >
> > best regards,
> >
> > Marc Mamin
> >
> >
> >> -Original Message-
> >> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-
> performance-
> >> ow...@postgresql.org] On Behalf Of Pavel Stehule
> >> Sent: Dienstag, 13. Dezember 2011 15:43
> >> To: Aleksej Trofimov
> >> Cc: pgsql-performance@postgresql.org
> >> Subject: Re: [PERFORM] Postgres array parser
> >>
> >> Hello
> >>
> >> 2011/12/13 Aleksej Trofimov:
> >>> We have tried foreach syntax, but we have noticed performance
> >> degradation:
> >>> Function with for: 203ms
> >>> Function with foreach: ~250ms:
> >>>
> >>> there is functions code:
> >>> CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[],
> >> in_input_nr
> >>> numeric)
> >>>   RETURNS numeric AS
> >>> $BODY$
> >>> declare i numeric[];
> >>> BEGIN
> >>> FOREACH i SLICE 1 IN ARRAY in_inputs
> >>> LOOP
> >>>  if i[1] = in_input_nr then
> >>> return i[2];
> >>>  end if;
> >>> END LOOP;
> >>>
> >>> return null;
> >>> END;
> >>> $BODY$
> >>>   LANGUAGE plpgsql VOLATILE
> >>>   COST 100;
> >>>
> >>> CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[],
> >> in_input_nr
> >>> numeric)
> >>>   RETURNS numeric AS
> >>> $BODY$
> >>> declare
> >>>   size int;
> >>> BEGIN
> >>>   size = array_upper(in_inputs, 1);
> >>> IF size IS NOT NULL THEN
> >>>
> >>> FOR i IN 1 .. size LOOP
> >>> if in_inputs[i][1] = in_input_nr then
> >>> return in_inputs[i][2];
> >>> end if;
> >>> END LOOP;
> >>> END IF;
> >>>
> >>> return null;
> >>> END;
> >>> $BODY$
> >>>   LANGUAGE plpgsql VOLATILE
> >>>   COST 100;
> >>>
> >>>
> >>> On 12/13/2011 04:02 PM, Pavel Stehule wrote:
> >>>> Hello
> >>>>
> >>>> do you know FOREACH IN ARRAY statement in 9.1
> >>>>
> >>>> this significantly accelerate iteration over array
> >>>>
> >>>>
> >>>> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-
> foreach-
> >> in-array/
> >>>>
> >>>>
> >>>> 2011/12/13 Aleksej Trofimov:
> >>>>> Hello, I wanted to 

Re: [PERFORM] Postgres array parser

2011-12-14 Thread Aleksej Trofimov
Yes, it would be great, but I haven't found such a function, which 
splits 2 dimensional array into rows =) Maybe we'll modify existing 
function, but unfortunately we have tried hstore type and function in 
postgres and we see a significant performance improvements. So we only 
need to convert existing data into hstore and I think this is a good 
solution.


On 12/14/2011 11:21 AM, Marc Mamin wrote:

Hello,

For such cases (see below), it would be nice to have an unnest function that 
only affect the first array dimension.

Something like

unnest(ARRAY[[1,2],[2,3]], SLICE=1)
=>
unnest
--
[1,2]
[2,3]


With this function, I imagine that following sql function
might beat the plpgsql FOREACH version.


CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], in_input_nr 
numeric)
RETURNS numeric AS
$BODY$

   SELECT u[1][2]
   FROM unnest($1, SLICE =1) u
   WHERE u[1][1]=in_input_nr
   LIMIT 1;

$BODY$
LANGUAGE sql IMMUTABLE;



best regards,

Marc Mamin



-Original Message-
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
ow...@postgresql.org] On Behalf Of Pavel Stehule
Sent: Dienstag, 13. Dezember 2011 15:43
To: Aleksej Trofimov
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Postgres array parser

Hello

2011/12/13 Aleksej Trofimov:

We have tried foreach syntax, but we have noticed performance

degradation:

Function with for: 203ms
Function with foreach: ~250ms:

there is functions code:
CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[],

in_input_nr

numeric)
  RETURNS numeric AS
$BODY$
declare i numeric[];
BEGIN
FOREACH i SLICE 1 IN ARRAY in_inputs
LOOP
 if i[1] = in_input_nr then
return i[2];
 end if;
END LOOP;

return null;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[],

in_input_nr

numeric)
  RETURNS numeric AS
$BODY$
declare
  size int;
BEGIN
  size = array_upper(in_inputs, 1);
IF size IS NOT NULL THEN

FOR i IN 1 .. size LOOP
if in_inputs[i][1] = in_input_nr then
return in_inputs[i][2];
end if;
END LOOP;
END IF;

return null;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;


On 12/13/2011 04:02 PM, Pavel Stehule wrote:

Hello

do you know FOREACH IN ARRAY statement in 9.1

this significantly accelerate iteration over array


http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-

in-array/



2011/12/13 Aleksej Trofimov:

Hello, I wanted to ask according such a problem which we had faced

with.

We are widely using postgres arrays like key->value array by doing

like

this:

{{1,5},{2,6},{3,7}}

where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions

we are

using self written array_input(array::numeric[], key::numeric)

function

which makes a loop on whole array and searches for key like
FOR i IN 1 .. size LOOP
if array[i][1] = key then
return array[i][2];
end if;
END LOOP;

But this was a good solution until our arrays and database had

grown. So

now
FOR loop takes a lot of time to find value of an array.

And my question is, how this problem of performance could be

solved? We

had
tried pgperl for string parsing, but it takes much more time than

our

current solution. Also we are thinking about self-written C++

function,

may
be someone had implemented this algorithm before?


you can use indexes or you can use hstore

Regards

Pavel Stehule


--
Best regards

Aleksej Trofimov


--
Sent via pgsql-performance mailing list
(pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



It is strange - on my comp FOREACH is about 2x faster

postgres=# select input_value(array(select
generate_series(1,100)::numeric), 10);
  input_value
-

(1 row)

Time: 495.426 ms

postgres=# select input_value_fe(array(select
generate_series(1,100)::numeric), 10);
  input_value_fe


(1 row)

Time: 248.980 ms

Regards

Pavel


--
Sent via pgsql-performance mailing list (pgsql-
performa...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Best regards

Aleksej Trofimov


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres array parser

2011-12-14 Thread Marc Mamin
Hello,

For such cases (see below), it would be nice to have an unnest function that 
only affect the first array dimension.

Something like 

unnest(ARRAY[[1,2],[2,3]], SLICE=1)
=>
unnest
--
[1,2]
[2,3]


With this function, I imagine that following sql function
might beat the plpgsql FOREACH version. 


CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], in_input_nr 
numeric)
   RETURNS numeric AS
$BODY$
   
  SELECT u[1][2]
  FROM unnest($1, SLICE =1) u
  WHERE u[1][1]=in_input_nr
  LIMIT 1;

$BODY$
   LANGUAGE sql IMMUTABLE;

   
   
best regards,

Marc Mamin
   

> -Original Message-
> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Pavel Stehule
> Sent: Dienstag, 13. Dezember 2011 15:43
> To: Aleksej Trofimov
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Postgres array parser
> 
> Hello
> 
> 2011/12/13 Aleksej Trofimov :
> > We have tried foreach syntax, but we have noticed performance
> degradation:
> > Function with for: 203ms
> > Function with foreach: ~250ms:
> >
> > there is functions code:
> > CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[],
> in_input_nr
> > numeric)
> >  RETURNS numeric AS
> > $BODY$
> > declare i numeric[];
> > BEGIN
> >        FOREACH i SLICE 1 IN ARRAY in_inputs
> >            LOOP
> >                 if i[1] = in_input_nr then
> >                    return i[2];
> >                 end if;
> >            END LOOP;
> >
> >    return null;
> > END;
> > $BODY$
> >  LANGUAGE plpgsql VOLATILE
> >  COST 100;
> >
> > CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[],
> in_input_nr
> > numeric)
> >  RETURNS numeric AS
> > $BODY$
> > declare
> >  size int;
> > BEGIN
> >  size = array_upper(in_inputs, 1);
> >    IF size IS NOT NULL THEN
> >
> >        FOR i IN 1 .. size LOOP
> >            if in_inputs[i][1] = in_input_nr then
> >                return in_inputs[i][2];
> >            end if;
> >        END LOOP;
> >    END IF;
> >
> >    return null;
> > END;
> > $BODY$
> >  LANGUAGE plpgsql VOLATILE
> >  COST 100;
> >
> >
> > On 12/13/2011 04:02 PM, Pavel Stehule wrote:
> >>
> >> Hello
> >>
> >> do you know FOREACH IN ARRAY statement in 9.1
> >>
> >> this significantly accelerate iteration over array
> >>
> >>
> >> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-
> in-array/
> >>
> >>
> >>
> >> 2011/12/13 Aleksej Trofimov:
> >>>
> >>> Hello, I wanted to ask according such a problem which we had faced
> with.
> >>> We are widely using postgres arrays like key->value array by doing
> like
> >>> this:
> >>>
> >>> {{1,5},{2,6},{3,7}}
> >>>
> >>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions
> we are
> >>> using self written array_input(array::numeric[], key::numeric)
> function
> >>> which makes a loop on whole array and searches for key like
> >>> FOR i IN 1 .. size LOOP
> >>>            if array[i][1] = key then
> >>>                return array[i][2];
> >>>            end if;
> >>> END LOOP;
> >>>
> >>> But this was a good solution until our arrays and database had
> grown. So
> >>> now
> >>> FOR loop takes a lot of time to find value of an array.
> >>>
> >>> And my question is, how this problem of performance could be
> solved? We
> >>> had
> >>> tried pgperl for string parsing, but it takes much more time than
> our
> >>> current solution. Also we are thinking about self-written C++
> function,
> >>> may
> >>> be someone had implemented this algorithm before?
> >>>
> >> you can use indexes or you can use hstore
> >>
> >> Regards
> >>
> >> Pavel Stehule
> >>
> >>> --
> >>> Best regards
> >>>
> >>> Aleksej Trofimov
> >>>
> >>>
> >>> --
> >>> Sent via pgsql-performance mailing list
> >>> (pgsql-performance@postgresql.org)
> >>> To make changes to your subscription:
> >>> http://www.postgresql.org/mailpref/pgsql-performance
> >
> >
> 
> It is strange - on my comp FOREACH is about 2x faster
> 
> postgres=# select inp

Re: [PERFORM] Postgres array parser

2011-12-13 Thread Pavel Stehule
Hello

2011/12/13 Aleksej Trofimov :
> We have tried foreach syntax, but we have noticed performance degradation:
> Function with for: 203ms
> Function with foreach: ~250ms:
>
> there is functions code:
> CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[], in_input_nr
> numeric)
>  RETURNS numeric AS
> $BODY$
> declare i numeric[];
> BEGIN
>        FOREACH i SLICE 1 IN ARRAY in_inputs
>            LOOP
>                 if i[1] = in_input_nr then
>                    return i[2];
>                 end if;
>            END LOOP;
>
>    return null;
> END;
> $BODY$
>  LANGUAGE plpgsql VOLATILE
>  COST 100;
>
> CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[], in_input_nr
> numeric)
>  RETURNS numeric AS
> $BODY$
> declare
>  size int;
> BEGIN
>  size = array_upper(in_inputs, 1);
>    IF size IS NOT NULL THEN
>
>        FOR i IN 1 .. size LOOP
>            if in_inputs[i][1] = in_input_nr then
>                return in_inputs[i][2];
>            end if;
>        END LOOP;
>    END IF;
>
>    return null;
> END;
> $BODY$
>  LANGUAGE plpgsql VOLATILE
>  COST 100;
>
>
> On 12/13/2011 04:02 PM, Pavel Stehule wrote:
>>
>> Hello
>>
>> do you know FOREACH IN ARRAY statement in 9.1
>>
>> this significantly accelerate iteration over array
>>
>>
>> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-in-array/
>>
>>
>>
>> 2011/12/13 Aleksej Trofimov:
>>>
>>> Hello, I wanted to ask according such a problem which we had faced with.
>>> We are widely using postgres arrays like key->value array by doing like
>>> this:
>>>
>>> {{1,5},{2,6},{3,7}}
>>>
>>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions we are
>>> using self written array_input(array::numeric[], key::numeric) function
>>> which makes a loop on whole array and searches for key like
>>> FOR i IN 1 .. size LOOP
>>>            if array[i][1] = key then
>>>                return array[i][2];
>>>            end if;
>>> END LOOP;
>>>
>>> But this was a good solution until our arrays and database had grown. So
>>> now
>>> FOR loop takes a lot of time to find value of an array.
>>>
>>> And my question is, how this problem of performance could be solved? We
>>> had
>>> tried pgperl for string parsing, but it takes much more time than our
>>> current solution. Also we are thinking about self-written C++ function,
>>> may
>>> be someone had implemented this algorithm before?
>>>
>> you can use indexes or you can use hstore
>>
>> Regards
>>
>> Pavel Stehule
>>
>>> --
>>> Best regards
>>>
>>> Aleksej Trofimov
>>>
>>>
>>> --
>>> Sent via pgsql-performance mailing list
>>> (pgsql-performance@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-performance
>
>

It is strange - on my comp FOREACH is about 2x faster

postgres=# select input_value(array(select
generate_series(1,100)::numeric), 10);
 input_value
-

(1 row)

Time: 495.426 ms

postgres=# select input_value_fe(array(select
generate_series(1,100)::numeric), 10);
 input_value_fe


(1 row)

Time: 248.980 ms

Regards

Pavel


>
> --
> Best regards
>
> Aleksej Trofimov
>
> UAB "Ruptela"
>
> Phone: +370 657 80475
>
> E-Mail: aleksej.trofi...@ruptela.lt
> Web:    http://www.ruptela.lt
>
> Ruptela - the most successful IT company in Lithuania 2011
> Ruptela - sekmingiausia Lietuvos aukštųjų technologijų įmonė 2011
> http://www.prezidentas.lt/lt/spaudos_centras_392/pranesimai_spaudai/inovatyvus_verslas_-_konkurencingos_lietuvos_pagrindas.html
> http://www.ruptela.lt/news/37/121/Ruptela-sekmingiausia-jauna-aukstuju-technologiju-imone-Lietuvoje
>

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres array parser

2011-12-13 Thread Aleksej Trofimov

We have tried foreach syntax, but we have noticed performance degradation:
Function with for: 203ms
Function with foreach: ~250ms:

there is functions code:
CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[], 
in_input_nr numeric)

  RETURNS numeric AS
$BODY$
declare i numeric[];
BEGIN
FOREACH i SLICE 1 IN ARRAY in_inputs
LOOP
 if i[1] = in_input_nr then
return i[2];
 end if;
END LOOP;

return null;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[], in_input_nr 
numeric)

  RETURNS numeric AS
$BODY$
declare
  size int;
BEGIN
  size = array_upper(in_inputs, 1);
IF size IS NOT NULL THEN
FOR i IN 1 .. size LOOP
if in_inputs[i][1] = in_input_nr then
return in_inputs[i][2];
end if;
END LOOP;
END IF;

return null;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

On 12/13/2011 04:02 PM, Pavel Stehule wrote:

Hello

do you know FOREACH IN ARRAY statement in 9.1

this significantly accelerate iteration over array

http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-in-array/



2011/12/13 Aleksej Trofimov:

Hello, I wanted to ask according such a problem which we had faced with.
We are widely using postgres arrays like key->value array by doing like
this:

{{1,5},{2,6},{3,7}}

where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions we are
using self written array_input(array::numeric[], key::numeric) function
which makes a loop on whole array and searches for key like
FOR i IN 1 .. size LOOP
if array[i][1] = key then
return array[i][2];
end if;
END LOOP;

But this was a good solution until our arrays and database had grown. So now
FOR loop takes a lot of time to find value of an array.

And my question is, how this problem of performance could be solved? We had
tried pgperl for string parsing, but it takes much more time than our
current solution. Also we are thinking about self-written C++ function, may
be someone had implemented this algorithm before?


you can use indexes or you can use hstore

Regards

Pavel Stehule


--
Best regards

Aleksej Trofimov


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Best regards

Aleksej Trofimov

UAB "Ruptela"

Phone: +370 657 80475

E-Mail: aleksej.trofi...@ruptela.lt
Web:http://www.ruptela.lt

Ruptela - the most successful IT company in Lithuania 2011
Ruptela - sekmingiausia Lietuvos aukštųjų technologijų įmonė 2011
http://www.prezidentas.lt/lt/spaudos_centras_392/pranesimai_spaudai/inovatyvus_verslas_-_konkurencingos_lietuvos_pagrindas.html
http://www.ruptela.lt/news/37/121/Ruptela-sekmingiausia-jauna-aukstuju-technologiju-imone-Lietuvoje


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres array parser

2011-12-13 Thread Pavel Stehule
Hello

do you know FOREACH IN ARRAY statement in 9.1

this significantly accelerate iteration over array

http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-in-array/



2011/12/13 Aleksej Trofimov :
> Hello, I wanted to ask according such a problem which we had faced with.
> We are widely using postgres arrays like key->value array by doing like
> this:
>
> {{1,5},{2,6},{3,7}}
>
> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions we are
> using self written array_input(array::numeric[], key::numeric) function
> which makes a loop on whole array and searches for key like
> FOR i IN 1 .. size LOOP
>            if array[i][1] = key then
>                return array[i][2];
>            end if;
> END LOOP;
>
> But this was a good solution until our arrays and database had grown. So now
> FOR loop takes a lot of time to find value of an array.
>
> And my question is, how this problem of performance could be solved? We had
> tried pgperl for string parsing, but it takes much more time than our
> current solution. Also we are thinking about self-written C++ function, may
> be someone had implemented this algorithm before?
>

you can use indexes or you can use hstore

Regards

Pavel Stehule

> --
> Best regards
>
> Aleksej Trofimov
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Postgres array parser

2011-12-13 Thread Aleksej Trofimov

Hello, I wanted to ask according such a problem which we had faced with.
We are widely using postgres arrays like key->value array by doing like 
this:


{{1,5},{2,6},{3,7}}

where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions we 
are using self written array_input(array::numeric[], key::numeric) 
function which makes a loop on whole array and searches for key like

FOR i IN 1 .. size LOOP
if array[i][1] = key then
return array[i][2];
end if;
END LOOP;

But this was a good solution until our arrays and database had grown. So 
now FOR loop takes a lot of time to find value of an array.


And my question is, how this problem of performance could be solved? We 
had tried pgperl for string parsing, but it takes much more time than 
our current solution. Also we are thinking about self-written C++ 
function, may be someone had implemented this algorithm before?


--
Best regards

Aleksej Trofimov


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance