Re: Sort is generating rows

Nicolas Seinlet Thu, 31 May 2018 06:16:53 -0700

2018-05-31 15:10 GMT+02:00 Jeff Janes <jeff.ja...@gmail.com>:

> On Thu, May 31, 2018 at 7:22 AM, Nicolas Seinlet <nico...@seinlet.com>
> wrote:
>
>> Hi,
>>
>> I have a query with a strange query plan.
>>
>> This query is roughly searching for sales, and convert them with a
>> currency rate. As currency rate changes from time to time, table contains
>> the currency, the company, the rate, the start date of availability of this
>> rate and the end date of availability.
>>
>> The join is done using :
>>     left join currency_rate cr on (cr.currency_id = pp.currency_id and
>>           cr.company_id = s.company_id and
>>           cr.date_start <= coalesce(s.date_order, now()) and
>>          (cr.date_end is null or cr.date_end > coalesce(s.date_order,
>> now())))
>>
>> The tricky part is the date range on the currency rate, which is not an
>> equality.
>>
>> the query plan shows:
>> ->  Sort  (cost=120.13..124.22 rows=1637 width=56) (actual
>> time=14.300..72084.758 rows=308054684 loops=1)
>>                           Sort Key: cr.currency_id, cr.company_id
>>                           Sort Method: quicksort  Memory: 172kB
>>                           ->  CTE Scan on currency_rate cr
>> (cost=0.00..32.74 rows=1637 width=56) (actual time=1.403..13.610 rows=1576
>> loops=1)
>>
>> There's 2 challenging things :
>> - planner estimates 1637 rows, and get 300 million lines
>> - sorting is generating lines
>>
>
> These are both explained by the same thing.  The sort is feeding into a
> merge join.  For every row in the other node which have the same value of
> the scan keys, the entire section of this sort with those same keys gets
> scanned again.  The repeated scanning gets counted in the actual row count,
> but isn't counted in the expected row count, or the actual row count of the
> thing feeding into the sort (the CTE)
>
>
>>
>>
> For now, the more currency rates, the slowest the query. There's not that
>> much currency rates (1k in this case), as you can only have one rate per
>> day per currency.
>>
>
> If it is only per currency per day, then why is company_id present? In any
> case, you might be better off listing the rates per day, rather than as a
> range, and then doing an equality join.
>
> Cheers,
>
> Jeff
>


Hi,

Thanks for the answer.

You're right, company_id is present, so you can have one rate per day per
currency per company. I've tried to simplify the question without modifying
the query plan, so I didn't talk about it even if it's present. I will now
try to generate a virtual table of rates per dates.

Re: Sort is generating rows

Reply via email to