I've found an interesting performance problem in the intarray extension
module.  It doesn't seem to be version dependent; I've verified it in 9.4.4
and 9.6devel.

If I do a query that involves both an = op and a @@ op ANDed together, it
gives a high cost estimate, which is justified by the slow runtime.  If I
omit the @@ it gives a low estimate, also justified.  If I trick it into
thinking it cannot use the index to satisfy the @@ op, then it gives a low
estimate and low runtime, applying the @@ in the filter step and using only
the fast = in the bitmap index scan.

Now it could use the index for the fast = operation and apply the @@ in the
recheck, rather than the filter.  But I guess it doesn't think of that,
despite knowing that applying the @@ as an index operation is slow.

So it seems to be missing a trick someplace, but I don't know if it is
reasonable to expect it to find that trick, or how easy/hard it would be to
implement.

The proposed selectivity estimate improvement patch for @@ does not fix
this (and evaluating that patch was how I found this issue).


Set up:

create table foobar as
select (
  select
    array_agg(floor(sqrt(random()*10000000+g.y/1000000+f.x/10000000))::int)
    from generate_series(1,10) g(y)
  ) foo
  from generate_series(1,10000000) f(x);
create index on foobar using gin(foo gin__int_ops);
analyze;

You can knock an order of magnitude off the table size and should still be
able to see the problem.
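
For a quicker repro, the same setup scaled down by a factor of ten should
do; this is just the statements above with the row count reduced (I haven't
re-run this exact variant, and foobar_small is only an illustrative name):

create table foobar_small as
select (
  select
    array_agg(floor(sqrt(random()*10000000+g.y/1000000+f.x/10000000))::int)
    from generate_series(1,10) g(y)
  ) foo
  from generate_series(1,1000000) f(x);
create index on foobar_small using gin(foo gin__int_ops);
analyze foobar_small;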

example:

explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@
'!1'::query_int;

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Seq Scan on foobar  (cost=0.00..263637.00 rows=1 width=61) (actual time=9717.957..9717.957 rows=0 loops=1)
   Filter: ((foo @@ '!1'::query_int) AND (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[]))
   Rows Removed by Filter: 10000000
   Buffers: shared hit=64 read=113573
   I/O Timings: read=361.402
 Planning time: 0.101 ms
 Execution time: 9717.991 ms
(7 rows)
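
For what it's worth, to look at the estimate and runtime of the index path
for this combined query (the one the planner is passing over in favor of
the seq scan above), one can temporarily disable seq scans.  I'm only
sketching the commands here, not pasting the output:

set enable_seqscan = off;
explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@
'!1'::query_int;
reset enable_seqscan;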


explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' ;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.02 rows=1 width=61) (actual time=0.027..0.027 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.023..0.023 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.097 ms
 Execution time: 0.061 ms

If I trick it into thinking the @@ operator cannot be used in the index,
then it gets really fast again:

explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo||'{}'
@@ '!1'::query_int;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.03 rows=1 width=61) (actual time=0.082..0.082 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Filter: ((foo || '{}'::integer[]) @@ '!1'::query_int)
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.139 ms
 Execution time: 0.129 ms

Cheers,

Jeff
