Re: [PERFORM] [SQL] OFFSET impact on Performance???

Oleg Bartunov Thu, 27 Jan 2005 05:57:45 -0800

On Thu, 27 Jan 2005, PFC wrote:

for example, http://www.sai.msu.su/~megera/postgres/gist/code/7.3/README.intarray see OPERATIONS and EXAMPLE USAGE:

Thanks, I already know this documentation and have used intarray before (I find it absolutely fabulous in the right application, it has a great potential for getting out of tight situations which would involve huge unmanageable pivot or attributes tables). Its only drawback is that the gist index creation time is slow and sometimes just... takes forever until the disk is full. However, it seems that integer && integer[] does not exist :


Try intset(id) && int[]. intset is an undocumented function :)
I'm going to add intset() to README.

SELECT * FROM table WHERE id && int[]
explain analyze select * from temp t where id && ( '{1,2,3,4,5,6,7,8,9,10,11,12}'::integer[] ); ERREUR: L'operateur n'existe pas : integer && integer[] ASTUCE : Aucun operateur correspond au nom donne et aux types d'arguments. Vous devez ajouter des conversions explicites de type.

I have already used this type of intarray indexes, but you have to create a special gist index with the gist__int_ops on the column, and the column has to be an array. In my case the column is just a SERIAL PRIMARY KEY, and should stay this way, and I don't want to create a functional index in array[id] just for this feature ; so I guess I can't use the && operator. Am I mistaken ? My index is the standard btree here. It would be nice if the =ANY() could use the index just like IN does ; besides at planning time the length of the array is known which makes it behave quite just like IN().

So I'll use either an EXECUTE'd plpgsql-generated query (IN (....)) , which I don't like because it's a kludge ; or this other solution which I find more elegant :

CREATE OR REPLACE FUNCTION tools.array_srf( INTEGER[] ) RETURNS SETOF INTEGER RETURNS NULL ON NULL INPUT LANGUAGE plpgsql AS $$ DECLARE _data ALIAS FOR $1; _i INTEGER; BEGIN FOR _i IN 1..icount(_data) LOOP RETURN NEXT _data[_i]; END LOOP; RETURN; END; $$;

----------------------------------------------------------------------------------- explain analyze select * from temp t where id =ANY( '{1,2,3,4,5,6,7,8,9,10,11,12}' ); Seq Scan on "temp" t (cost=0.00..5165.52 rows=65536 width=8) (actual time=0.030..173.319 rows=12 loops=1) Filter: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12}'::integer[])) Total runtime: 173.391 ms

----------------------------------------------------------------------------------- explain analyze select * from temp t where id IN( 1,2,3,4,5,6,7,8,9,10,11,12 ); Index Scan using temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey on "temp" t (cost=0.00..36.49 rows=12 width=8) (actual time=0.046..0.137 rows=12 loops=1) Index Cond: ((id = 1) OR (id = 2) OR (id = 3) OR (id = 4) OR (id = 5) OR (id = 6) OR (id = 7) OR (id = 8) OR (id = 9) OR (id = 10) OR (id = 11) OR (id = 12)) Total runtime: 0.292 ms

----------------------------------------------------------------------------------- explain analyze select * from temp t where id in (select * from tools.array_srf('{1,2,3,4,5,6,7,8,9,10,11,12}')); Nested Loop (cost=15.00..620.20 rows=200 width=8) (actual time=0.211..0.368 rows=12 loops=1) -> HashAggregate (cost=15.00..15.00 rows=200 width=4) (actual time=0.160..0.173 rows=12 loops=1) -> Function Scan on array_srf (cost=0.00..12.50 rows=1000 width=4) (actual time=0.127..0.139 rows=12 loops=1) -> Index Scan using temp_pkey on "temp" t (cost=0.00..3.01 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=12) Index Cond: (t.id = "outer".array_srf) Total runtime: 0.494 ms

----------------------------------------------------------------------------------- explain analyze select * from temp t, (select * from tools.array_srf('{1,2,3,4,5,6,7,8,9,10,11,12}')) foo where foo.array_srf = t.id;

Merge Join (cost=62.33..2824.80 rows=1000 width=12) (actual time=0.215..0.286 rows=12 loops=1) Merge Cond: ("outer".id = "inner".array_srf) -> Index Scan using temp_pkey on "temp" t (cost=0.00..2419.79 rows=131072 width=8) (actual time=0.032..0.056 rows=13 loops=1) -> Sort (cost=62.33..64.83 rows=1000 width=4) (actual time=0.169..0.173 rows=12 loops=1) Sort Key: array_srf.array_srf -> Function Scan on array_srf (cost=0.00..12.50 rows=1000 width=4) (actual time=0.127..0.139 rows=12 loops=1) Total runtime: 0.391 ms

----------------------------------------------------------------------------------- Note that the meaning is different ; the IN removes duplicates in the array but the join does not.


    Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [email protected], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
     joining column's datatypes do not match

Re: [PERFORM] [SQL] OFFSET impact on Performance???

Reply via email to