Our DBA mentioned that he uses LIMIT 0 himself.
But from what I gather in regards to the ORDER BY RANDOM(), its because when 
you don't have column that has small gaps between values (such as our pk 
UID's), then attaching an order by with a random value can actually help the 
query pull a result. You could very well be right that the query plan may not 
be optimized, but we have found that using that `order by random()` can take a 
5 min query down to less than 5 seconds. Again, this is only when we attach a 
LIMIT clause. Stackoverflow (i know, sorry), shows many uses that run into such 
issues and do in fact use the same trick.
________________________________
From: sequel-talk@googlegroups.com <sequel-talk@googlegroups.com> on behalf of 
Jeremy Evans <jeremyeva...@gmail.com>
Sent: Tuesday, October 12, 2021 8:33 PM
To: sequel-talk@googlegroups.com <sequel-talk@googlegroups.com>
Subject: Re: dataset columns method performance

On Tue, Oct 12, 2021 at 2:40 PM cincy_kal 
<joshua.cr...@cordatahealth.com<mailto:joshua.cr...@cordatahealth.com>> wrote:
I just ran into an issue where I thought a query that was being dynamically 
built was hanging. After checking the DB for the query, i don't see any of the 
where clauses in the one that was hung. What I saw was:
`SELECT * FROM table_name LIMIT 1`
So then I saw this conditional that had me wondering...
`if query.columns.include?(x)`
so I looked at the docs and saw that the columns function does actually run the 
query that was being hung. It never actually reached my dynamic query's 
execution point.
>From previous experience in postgresql, I found that in largely populated 
>tables, when running a query with a `LIMIT` clause and no `ORDER BY`, there 
>are big performance implications. So even if I don't know the potential 
>columns, or I don't really care about the order, I will still add an ORDER 
>clause.
`SELECT * FROM my_table ORDER BY RANDOM() LIMIT 1`

Weird.  There shouldn't be an issue with not having an ORDER BY clause, and 
adding an ORDER BY clause should never improve performance.  ORDER BY RANDOM() 
LIMIT 1 should always perform worse, since it should require all rows be 
calculated by the database, sorted by the random number, and then have the 
first row returned.  I would think if adding ORDER BY improves performance, 
there is likely a problem in the database's query planner/optimizer.

Thinking more about it, I wonder if LIMIT 0 would be a better way to handle the 
query to get the columns, since we shouldn't need any rows to be returned, we 
should only need the metadata from the query.

Thanks,
Jeremy

--
You received this message because you are subscribed to a topic in the Google 
Groups "sequel-talk" group.
To unsubscribe from this topic, visit 
https://groups.google.com/d/topic/sequel-talk/It67_FsCMaM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to 
sequel-talk+unsubscr...@googlegroups.com<mailto:sequel-talk+unsubscr...@googlegroups.com>.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sequel-talk/CADGZSSeTsS6BTBnkrrZZuCsrH90E5ujnOktDM_BcMomL77vj_w%40mail.gmail.com<https://groups.google.com/d/msgid/sequel-talk/CADGZSSeTsS6BTBnkrrZZuCsrH90E5ujnOktDM_BcMomL77vj_w%40mail.gmail.com?utm_medium=email&utm_source=footer>.

-- 
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sequel-talk+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sequel-talk/SJ0PR18MB40604B3496B3F67CCAC2B34496B79%40SJ0PR18MB4060.namprd18.prod.outlook.com.

Reply via email to