Okay, forget what i've just written, sorry for that.

I've digged deeper and checked my generic example - it works perfectly using
a merge-append in combination with limit.

i'm sorry i didn't check that first. it just won't work in my real-life example. though each part of the query is using an index-scan it is than using a 'normal' append
instead of a merge-append, but i don't know why.

here is the expected query plan as used with my generic example:

Limit  (cost=0.86..0.90 rows=1 width=12)
  ->  Merge Append  (cost=0.86..104314.37 rows=3435900 width=12)
        Sort Key: kpkp_orders.buchungsdatum DESC
        ->  Index Only Scan Backward using kpkp_orders_1 on kpkp_orders  (cost=0.43..34977.68 rows=1717950 width=12)         ->  Index Only Scan Backward using kpkp_invoices_1 on kpkp_invoices  (cost=0.43..34977.68 rows=1717950 width=12)

this is fast.

and here is the query plan used on my real-life example:


Limit  (cost=918.40..918.41 rows=1 width=8)
  ->  Sort  (cost=918.40..919.59 rows=473 width=8)
        Sort Key: s.belegdatum DESC
        ->  Subquery Scan on s  (cost=0.56..916.04 rows=473 width=8)
              ->  Append  (cost=0.56..911.31 rows=473 width=327)
                    ->  Subquery Scan on "*SELECT* 1" (cost=0.56..624.08 rows=318 width=187)                           ->  Index Scan using soit_erpprozesshistorie_12 on erpprozesshistorie eph (cost=0.56..620.10 rows=318 width=181)                                 Index Cond: ((clientid = '0'::numeric) AND (activeflag = 't'::bpchar) AND (prozessuntertyp = 2) AND (CASE WHEN ((typ = 1) OR (typ = 7)) THEN 1 ELSE CASE WHEN (typ = 3) THEN 2 ELSE CASE WHEN ((typ = 4) OR (typ = 9)) THEN 3 ELSE 4 END END END = 1) AND (CASE WHEN (prozesstyp = 1) THEN true ELSE false END = true) AND (artikelvariante_objid = '1064748816'::numeric))                     ->  Subquery Scan on "*SELECT* 2" (cost=0.43..284.87 rows=155 width=214)                           ->  Index Scan using soit_umsatz_alt_08 on umsatz_alt  (cost=0.43..282.93 rows=155 width=186)                                 Index Cond: ((clientid = '0'::numeric) AND (activeflag = 't'::bpchar) AND (verkauf = true) AND (artikelvariante_objid = '1064748816'::numeric))


so my question is: under wich circumstance does the query-planner use or prefer the 'merge append' over 'append'?


Thanks in advance!


Am 08.05.25 um 11:57 schrieb Karsten P:
Hi,

i've already googled so far but didn't find anything regarding my problem..
I hope i'm here at the right place.

Following situation (this is just an simplyfied example):

suppose we have two tables, lets say

orders
  - column 'order_number' -> varchar
  - column 'order_date' -> timestamp

with index on order_date

and

invoices
  - column 'invoice_number' -> varchar
  - column 'invoice_date' -> timestamp

with index on invoice_date

and many records in both if them.

now we have a view combining both of them as

create view documents as
(
    select order_number as document_number, order_date as document_date from orders
    union all select invoice_number, invoice_date from invoices
)


finding the last order placed in the database ist really easy:

  select order_number from orders order by order_date desc limit 1

will result in an index scan backward on orders

same with invoices only...

but when querying the view

  select document_number from documents order by document_date desc limit 1

seems to break down to
  - collect all rows from orders
  - combine it with all rows from invoices
  - sort all rows (descending)
  - limit to one row

with many data this is quite slow.

I've tested this with PG9.6 and PG14, it doesn't seem to make a difference (correct me if i'm wrong).


So my question is: What about optimizing the query-planner that if

- a query with unions of selects is executed
- and an 'order by' in combination with 'limit' is applied on the complete query (not only on subselects)
- and there is a matching index for each select

the order by and limit - part of the sql is also beeing applied on each sub-select ?

actually
    select document_number from documents order by document_date desc limit 1

is beeing processed as
    select order_number from orders
    union all select invoice_number from invoices
    order by document_number desc
    limit 1

but would it be possible to let the query-optimizer expand the query to
    select order_number from (
        (select order_number, order_date from orders order by order_date desc limit 1)         union all (select invoice_number, invoice_date from invoices order by invoice_date desc limit 1)
    ) as subselect
   order by order_date desc
   limit 1

as this would use two (or number of unions) index-backward-scans
and than only has to reorder at maximum two rows before limiting to the first of it?

this should be significantly faster.

thanks a lot and greetz,
Karsten






Reply via email to