Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-07 Thread Francisco Olarte
Alan: On Mon, 6 Dec 2021 at 18:58, Alan Hodgson wrote: > On Mon, 2021-12-06 at 18:20 +0100, Francisco Olarte wrote: > Can you post an explain analyze? To me it seems like the planner > thinks shipment_import_id is randomly distributed and the table is > well correlated with it's PK, so scanning

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Mladen Gogala
On 12/6/21 22:16, Laurenz Albe wrote: An ugly alternative is to use "ORDER BY id + 0", which prevents PostgreSQL from using the index. That was actually the earliest form of Oracle hints. I remember doing exactly that in Oracle 5.1.22 on VAX/VMS. -- Mladen Gogala Database Consultant Tel:

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Laurenz Albe
On Mon, 2021-12-06 at 19:22 +0100, Pavel Stehule wrote: > po 6. 12. 2021 v 18:21 odesílatel Francisco Olarte > napsal: > > On Mon, 6 Dec 2021 at 18:03, Alan Hodgson wrote: > > > # explain SELECT "shipment_import_records".* FROM > > > "shipment_import_records" WHERE > > >

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Rob Sargent
On 12/6/21 11:02 AM, Alan Hodgson wrote: On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote: To be clear, is it the devs or the ORM that's adding the ORDER  and the LIMIT?  I'm betting on devs.  Do they need the smallest id (first occurrance?) or do they need data common to all 5096 entries

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Pavel Stehule
po 6. 12. 2021 v 18:21 odesílatel Francisco Olarte napsal: > On Mon, 6 Dec 2021 at 18:03, Alan Hodgson > wrote: > ... > > The table has nearly 29 million records. 5069 of them match > shipment_import_id = 5090609. There is an index on shipment_import_id, > which the planner happily uses without

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Adrian Klaver
On 12/6/21 10:02, Alan Hodgson wrote: On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote: To be clear, is it the devs or the ORM that's adding the ORDER  and the LIMIT?  I'm betting on devs.  Do they need the smallest id (first occurrance?) or do they need data common to all 5096 entries

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Alan Hodgson
On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote: > To be clear, is it the devs or the ORM that's adding the ORDER  and > the  > LIMIT?  I'm betting on devs.  Do they need the smallest id (first > occurrance?) or do they need data common to all 5096 entries > (Name?) and > any record will

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Alan Hodgson
On Mon, 2021-12-06 at 18:20 +0100, Francisco Olarte wrote: > > Can you post an explain analyze? To me it seems like the planner > thinks shipment_import_id is randomly distributed and the table is > well correlated with it's PK, so scanning it for the first id > should > be fast.    #explain

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Alan Hodgson
On Mon, 2021-12-06 at 10:18 -0700, Michael Lewis wrote: > What is your default_statistics_target and how accurate is that > estimate of 5668 rows? What is random_page_cost set to by the way? > > > default_statistics_target = 1000 random_page_cost = 2.0 (it's on AWS on a 9000 iops gp2 volume)

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Francisco Olarte
On Mon, 6 Dec 2021 at 18:03, Alan Hodgson wrote: ... > The table has nearly 29 million records. 5069 of them match > shipment_import_id = 5090609. There is an index on shipment_import_id, which > the planner happily uses without the LIMIT specifically. Yet with it the > query planner will

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Rob Sargent
On 12/6/21 10:03 AM, Alan Hodgson wrote: I keep running into problems like these: Devs are using an ORM. It really likes to produce queries like: SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE shipment_import_records"."shipment_import_id" = 5090609 ORDER BY

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Michael Lewis
What is your default_statistics_target and how accurate is that estimate of 5668 rows? What is random_page_cost set to by the way? More importantly, what is the better plan that you'd like the planner to use with your existing indexes? It would seem logical to me to scan for the matching

Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Alan Hodgson
I keep running into problems like these: Devs are using an ORM. It really likes to produce queries like: SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1; I don't