Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-28 Thread Cédric Villemain
   Is that behaviour is correct?
  
  yes - access to n position means in postgresql - skip n-1 elements
  
  Hmmm...
  
  how many elements to be skipped here ?
 
 there are two independent stages:
 
 a) detoast - loading and decompression (complete array is detoasted)
 b) access
 
 if you has very large arrays, then @a is significant

There is a place to add PG_GETARG_ARRAY_P_SLICE. The code is just not done 
yet. 

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-26 Thread Marc Mamin

 On 22/06/12 09:02, Maxim Boguk wrote: 

 May be I completely wrong but I always assumed that the access speed to the 
 array element in PostgreSQL should be close to constant time.
 But in tests I found that access speed degrade as O(N) of array size.

 Is that behaviour is correct?


 From: pgsql-performance-ow...@postgresql.org On Behalf Of Jesper Krogh

 Default column storage is to compress it, and store in TOAST with large 
 values. 
 This it what is causing the shift. Try to change the column storage of the 
 column
 to EXTERNAL instead and rerun the test. 


Hello,

I've repeated your test in a simplified form:
you are right :-(

create table t1 ( _array int[]);
alter table t1 alter _array set storage external;
insert into t1 SELECT ARRAY(SELECT * FROM generate_series(1,5));

create table t2 ( _array int[]);
alter table t2 alter _array set storage external;
insert into t2 SELECT ARRAY(SELECT * FROM generate_series(1,500));

explain analyze SELECT _array[1] FROM t1;
Total runtime: 0.125 ms

explain analyze SELECT _array[1] FROM t2;
Total runtime: 8.649 ms


best regards,

Marc Mamin



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-26 Thread Pavel Stehule
2012/6/26 Marc Mamin m.ma...@intershop.de:

 On 22/06/12 09:02, Maxim Boguk wrote:

 May be I completely wrong but I always assumed that the access speed to the 
 array element in PostgreSQL should be close to constant time.
 But in tests I found that access speed degrade as O(N) of array size.

 Is that behaviour is correct?

yes - access to n position means in postgresql - skip n-1 elements

Regards

Pavel



 From: pgsql-performance-ow...@postgresql.org On Behalf Of Jesper Krogh

 Default column storage is to compress it, and store in TOAST with large 
 values.
 This it what is causing the shift. Try to change the column storage of the 
 column
 to EXTERNAL instead and rerun the test.


 Hello,

 I've repeated your test in a simplified form:
 you are right :-(

 create table t1 ( _array int[]);
 alter table t1 alter _array set storage external;
 insert into t1 SELECT ARRAY(SELECT * FROM generate_series(1,5));

 create table t2 ( _array int[]);
 alter table t2 alter _array set storage external;
 insert into t2 SELECT ARRAY(SELECT * FROM generate_series(1,500));

 explain analyze SELECT _array[1] FROM t1;
 Total runtime: 0.125 ms

 explain analyze SELECT _array[1] FROM t2;
 Total runtime: 8.649 ms


 best regards,

 Marc Mamin



 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-26 Thread Marc Mamin


 -Original Message-
 From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
 
 2012/6/26 Marc Mamin m.ma...@intershop.de:
 
  On 22/06/12 09:02, Maxim Boguk wrote:
 
  May be I completely wrong but I always assumed that the access
 speed to the array element in PostgreSQL should be close to constant
 time.
  But in tests I found that access speed degrade as O(N) of array
 size.
 
  Is that behaviour is correct?
 
 yes - access to n position means in postgresql - skip n-1 elements


Hmmm...

how many elements to be skipped here ?

SELECT _array[1] FROM t2;

I wonder if the time rather get spent in first retrieving the array itself 
before accessing its elements.

regards,

Marc Mamin

 
 Regards
 
 Pavel
 
 
 
  From: pgsql-performance-ow...@postgresql.org On Behalf Of Jesper
 Krogh
 
  Default column storage is to compress it, and store in TOAST with
 large values.
  This it what is causing the shift. Try to change the column storage
 of the column
  to EXTERNAL instead and rerun the test.
 
 
  Hello,
 
  I've repeated your test in a simplified form:
  you are right :-(
 
  create table t1 ( _array int[]);
  alter table t1 alter _array set storage external;
  insert into t1 SELECT ARRAY(SELECT * FROM generate_series(1,5));
 
  create table t2 ( _array int[]);
  alter table t2 alter _array set storage external;
  insert into t2 SELECT ARRAY(SELECT * FROM
 generate_series(1,500));
 
  explain analyze SELECT _array[1] FROM t1;
  Total runtime: 0.125 ms
 
  explain analyze SELECT _array[1] FROM t2;
  Total runtime: 8.649 ms
 
 
  best regards,
 
  Marc Mamin
 
 
 
  --
  Sent via pgsql-performance mailing list (pgsql-
 performa...@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-26 Thread Pavel Stehule
2012/6/26 Marc Mamin m.ma...@intershop.de:


 -Original Message-
 From: Pavel Stehule [mailto:pavel.steh...@gmail.com]

 2012/6/26 Marc Mamin m.ma...@intershop.de:
 
  On 22/06/12 09:02, Maxim Boguk wrote:
 
  May be I completely wrong but I always assumed that the access
 speed to the array element in PostgreSQL should be close to constant
 time.
  But in tests I found that access speed degrade as O(N) of array
 size.
 
  Is that behaviour is correct?

 yes - access to n position means in postgresql - skip n-1 elements


 Hmmm...

 how many elements to be skipped here ?

there are two independent stages:

a) detoast - loading and decompression (complete array is detoasted)
b) access

if you has very large arrays, then @a is significant

Regards

Pavel



 SELECT _array[1] FROM t2;

 I wonder if the time rather get spent in first retrieving the array itself 
 before accessing its elements.

 regards,

 Marc Mamin


 Regards

 Pavel

 
 
  From: pgsql-performance-ow...@postgresql.org On Behalf Of Jesper
 Krogh
 
  Default column storage is to compress it, and store in TOAST with
 large values.
  This it what is causing the shift. Try to change the column storage
 of the column
  to EXTERNAL instead and rerun the test.
 
 
  Hello,
 
  I've repeated your test in a simplified form:
  you are right :-(
 
  create table t1 ( _array int[]);
  alter table t1 alter _array set storage external;
  insert into t1 SELECT ARRAY(SELECT * FROM generate_series(1,5));
 
  create table t2 ( _array int[]);
  alter table t2 alter _array set storage external;
  insert into t2 SELECT ARRAY(SELECT * FROM
 generate_series(1,500));
 
  explain analyze SELECT _array[1] FROM t1;
  Total runtime: 0.125 ms
 
  explain analyze SELECT _array[1] FROM t2;
  Total runtime: 8.649 ms
 
 
  best regards,
 
  Marc Mamin
 
 
 
  --
  Sent via pgsql-performance mailing list (pgsql-
 performa...@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-26 Thread Pavel Stehule
2012/6/26 Maxim Boguk maxim.bo...@gmail.com:


 On Tue, Jun 26, 2012 at 6:04 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 2012/6/26 Marc Mamin m.ma...@intershop.de:
 
  On 22/06/12 09:02, Maxim Boguk wrote:
 
  May be I completely wrong but I always assumed that the access speed
  to the array element in PostgreSQL should be close to constant time.
  But in tests I found that access speed degrade as O(N) of array size.
 
  Is that behaviour is correct?

 yes - access to n position means in postgresql - skip n-1 elements


 Hi,

 I understand what you mean, but in my test for all values of N test
 performed access only to first 1 elements of the array independent of
 the array size
 So i still can't see why access time degrade soo fast for N1...:

 WITH
 --GENERATE single entry table with single ARRAY field of size N
 t AS (SELECT ARRAY(SELECT * FROM generate_series(1,N)) AS _array)
 --iterate over first 1 elements of that ARRAY
 SELECT count((SELECT _array[i] FROM t)) FROM generate_series(1,1) as
 g(i);

 ... if access time depend only on position then after 10k there should not
 be any serious degradation, in fact a perfromance degradation is almost
 linear.
 10k    34ms
 50k   177ms
 100k  321ms
 500k 4100ms
 1M   8100ms
 2M  22000ms
 5M  61000ms
 10M    22ms = 22ms to sinlge array element access.

in this use case TOAST/DETOAST is not used, but probably there are
problem with array copy. Taking n element of array means calling some
function, and postgresql uses only passing parameters by value - so
copy of large array needs higher time.

this issue is solved partially in 9.2, where you can use FOR EACH
cycle 
http://www.postgresql.org/docs/9.1/interactive/plpgsql-control-structures.html#PLPGSQL-FOREACH-ARRAY

Regards

Pavel


 And I think no toasting/detoasting happen in my test case.

 Kind Regards,
 Maksym

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-25 Thread Maxim Boguk
Hi all,

May be I completely wrong but I always assumed that the access speed to the
array element in PostgreSQL should be close to constant time.
But in tests I found that access speed degrade as O(N) of array size.

Test case (performed on large not busy server with 1GB work_mem to ensure I
working with memory only):

WITH
t AS (SELECT ARRAY(SELECT * FROM generate_series(1,N)) AS _array)
SELECT count((SELECT _array[i] FROM t)) FROM generate_series(1,1) as
g(i);

Results for N between 1 and 10.000.000 (used locally connected psql with
\timing):

N:  Time:
1   5.8ms
10  5.8ms
100 5.8ms
10006.7ms
--until there all reasonable
5k 21ms
10k34ms
50k   177ms
100k  321ms
500k 4100ms
1M   8100ms
2M  22000ms
5M  61000ms
10M22ms = 22ms to sinlge array element access.


Is that behaviour is correct?

PS: what I actually lookin for - constant fast access by position
tuplestore for use with recursive queries and/or pl/pgsql, but without
using C programming.

-- 
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com
МойКруг: http://mboguk.moikrug.ru/

People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage.


Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-25 Thread Jesper Krogh

On 22/06/12 09:02, Maxim Boguk wrote:

Hi all,

May be I completely wrong but I always assumed that the access speed 
to the array element in PostgreSQL should be close to constant time.

But in tests I found that access speed degrade as O(N) of array size.

Test case (performed on large not busy server with 1GB work_mem to 
ensure I working with memory only):


WITH
t AS (SELECT ARRAY(SELECT * FROM generate_series(1,N)) AS _array)
SELECT count((SELECT _array[i] FROM t)) FROM generate_series(1,1) 
as g(i);


Results for N between 1 and 10.000.000 (used locally connected psql 
with \timing):


N:  Time:
1   5.8ms
10  5.8ms
100 5.8ms
10006.7ms
--until there all reasonable
5k 21ms
10k34ms
50k   177ms
100k  321ms
500k 4100ms
1M   8100ms
2M  22000ms
5M  61000ms
10M22ms = 22ms to sinlge array element access.


Is that behaviour is correct?

PS: what I actually lookin for - constant fast access by position 
tuplestore for use with recursive queries and/or pl/pgsql, but without 
using C programming.


Default column storage is to compress it, and store in TOAST with 
large values.
This it what is causing the shift. Try to change the column storage of 
the column

to EXTERNAL instead and rerun the test.

ALTER TABLE tablename ALTER COLUMN column name SET STORAGE EXTERNAL

Default is EXTENDED which runs compression on it, which again makes it 
hard to

position into without reading and decompressing everything.

http://www.postgresql.org/docs/9.1/static/sql-altertable.html

Let us know what you get.?

Jesper