2018-05-31 15:10 GMT+02:00 Jeff Janes <jeff.ja...@gmail.com>: > On Thu, May 31, 2018 at 7:22 AM, Nicolas Seinlet <nico...@seinlet.com> > wrote: > >> Hi, >> >> I have a query with a strange query plan. >> >> This query is roughly searching for sales, and convert them with a >> currency rate. As currency rate changes from time to time, table contains >> the currency, the company, the rate, the start date of availability of this >> rate and the end date of availability. >> >> The join is done using : >> left join currency_rate cr on (cr.currency_id = pp.currency_id and >> cr.company_id = s.company_id and >> cr.date_start <= coalesce(s.date_order, now()) and >> (cr.date_end is null or cr.date_end > coalesce(s.date_order, >> now()))) >> >> The tricky part is the date range on the currency rate, which is not an >> equality. >> >> the query plan shows: >> -> Sort (cost=120.13..124.22 rows=1637 width=56) (actual >> time=14.300..72084.758 rows=308054684 loops=1) >> Sort Key: cr.currency_id, cr.company_id >> Sort Method: quicksort Memory: 172kB >> -> CTE Scan on currency_rate cr >> (cost=0.00..32.74 rows=1637 width=56) (actual time=1.403..13.610 rows=1576 >> loops=1) >> >> There's 2 challenging things : >> - planner estimates 1637 rows, and get 300 million lines >> - sorting is generating lines >> > > These are both explained by the same thing. The sort is feeding into a > merge join. For every row in the other node which have the same value of > the scan keys, the entire section of this sort with those same keys gets > scanned again. The repeated scanning gets counted in the actual row count, > but isn't counted in the expected row count, or the actual row count of the > thing feeding into the sort (the CTE) > > >> >> > For now, the more currency rates, the slowest the query. There's not that >> much currency rates (1k in this case), as you can only have one rate per >> day per currency. >> > > If it is only per currency per day, then why is company_id present? In any > case, you might be better off listing the rates per day, rather than as a > range, and then doing an equality join. > > Cheers, > > Jeff >
Hi, Thanks for the answer. You're right, company_id is present, so you can have one rate per day per currency per company. I've tried to simplify the question without modifying the query plan, so I didn't talk about it even if it's present. I will now try to generate a virtual table of rates per dates.