Andres Freund <and...@2ndquadrant.com> writes:
>> On 2014-04-25 12:05:17 -0400, Tom Lane wrote:
>>> Meh ... is it likely that the columns involved in an ordering comparison
>>> would be so wide as to be toasted out-of-line?  Such a query would only be
>>> fast if the row value were indexed, which would pretty much preclude use
>>> of wide columns.

> Just for some clarity, that also happens with expressions like:
> WHERE
>     ROW(ev_class, rulename, ev_action) >= ROW('pg_rewrite'::regclass, 
> '_RETURN', NULL)
> ORDER BY ROW(ev_class, rulename, ev_action);

> which is what is generated by such query generators - where the leading
> columns *are* indexed but not necessarily unique.

Ah, I see.  Well, we're pretty darn stupid about such queries anyway :-(.
Your first example could be greatly improved by expanding the whole-row
Var into a ROW() construct (so that RowCompareExpr could be used), and
the second one by exploding the ROW() order-by into separate order-by
columns.  Maybe someday we can do that, or persuade the query generators
not to generate such brain-dead SQL in the first place.  But in the
meantime these coding techniques lead to highly suboptimal plans anyway,
with or without TOAST considerations.  It's also worth noting that
it's merest luck that the existing code isn't *slower* about such
queries; if there were any significant number of comparisons of the
toasted columns occurring during the sort step, it could come out far
behind.  So I'm not finding myself terribly concerned here.

Also, I did a bit more research and verified that my patch doesn't cause
any extra detoasting activity for simple set-returning-function cases,
for example:

regression=# create or replace function pgr() returns setof pg_rewrite as
'declare r pg_rewrite;                                     
begin
for r in select * from pg_rewrite loop
  return next r;
end loop;
end' language plpgsql;
CREATE FUNCTION

regression=# explain (analyze, buffers) select r.* from pgr() r;
                                                 QUERY PLAN                     
------------------------------------------------------------------------------------------------------------
 Function Scan on pgr r  (cost=0.25..10.25 rows=1000 width=135) (actual 
time=0.881..0.911 rows=177 loops=1)
   Buffers: shared hit=36
 Planning time: 0.059 ms
 Execution time: 0.986 ms

The same for SQL-language functions, either inlined or not.  It's not so
good if you insist on putting the SRF call in the targetlist:

explain (analyze, buffers) select pgr();  
                                        QUERY PLAN                              
------------------------------------------------------------------------------------------
 Result  (cost=0.00..5.25 rows=1000 width=0) (actual time=0.941..10.575 
rows=177 loops=1)
   Buffers: shared hit=179
 Planning time: 0.029 ms
 Execution time: 10.677 ms

On the other hand, in real-world usage (not EXPLAIN), a query like that is
certainly going to be detoasting all the fields anyway to return them to
the client.

On the whole I feel fairly good about the opinion that this change won't
be disastrous for mainstream usages, and will be beneficial for
performance some of the time.  Since I'm not hearing any volunteers to
try to convert the other approach into a complete patch, I plan to push
forward with this one.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to