Our DBA mentioned that he uses LIMIT 0 himself. But from what I gather in regards to the ORDER BY RANDOM(), its because when you don't have column that has small gaps between values (such as our pk UID's), then attaching an order by with a random value can actually help the query pull a result. You could very well be right that the query plan may not be optimized, but we have found that using that `order by random()` can take a 5 min query down to less than 5 seconds. Again, this is only when we attach a LIMIT clause. Stackoverflow (i know, sorry), shows many uses that run into such issues and do in fact use the same trick. ________________________________ From: sequel-talk@googlegroups.com <sequel-talk@googlegroups.com> on behalf of Jeremy Evans <jeremyeva...@gmail.com> Sent: Tuesday, October 12, 2021 8:33 PM To: sequel-talk@googlegroups.com <sequel-talk@googlegroups.com> Subject: Re: dataset columns method performance
On Tue, Oct 12, 2021 at 2:40 PM cincy_kal <joshua.cr...@cordatahealth.com<mailto:joshua.cr...@cordatahealth.com>> wrote: I just ran into an issue where I thought a query that was being dynamically built was hanging. After checking the DB for the query, i don't see any of the where clauses in the one that was hung. What I saw was: `SELECT * FROM table_name LIMIT 1` So then I saw this conditional that had me wondering... `if query.columns.include?(x)` so I looked at the docs and saw that the columns function does actually run the query that was being hung. It never actually reached my dynamic query's execution point. >From previous experience in postgresql, I found that in largely populated >tables, when running a query with a `LIMIT` clause and no `ORDER BY`, there >are big performance implications. So even if I don't know the potential >columns, or I don't really care about the order, I will still add an ORDER >clause. `SELECT * FROM my_table ORDER BY RANDOM() LIMIT 1` Weird. There shouldn't be an issue with not having an ORDER BY clause, and adding an ORDER BY clause should never improve performance. ORDER BY RANDOM() LIMIT 1 should always perform worse, since it should require all rows be calculated by the database, sorted by the random number, and then have the first row returned. I would think if adding ORDER BY improves performance, there is likely a problem in the database's query planner/optimizer. Thinking more about it, I wonder if LIMIT 0 would be a better way to handle the query to get the columns, since we shouldn't need any rows to be returned, we should only need the metadata from the query. Thanks, Jeremy -- You received this message because you are subscribed to a topic in the Google Groups "sequel-talk" group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/sequel-talk/It67_FsCMaM/unsubscribe. To unsubscribe from this group and all its topics, send an email to sequel-talk+unsubscr...@googlegroups.com<mailto:sequel-talk+unsubscr...@googlegroups.com>. To view this discussion on the web visit https://groups.google.com/d/msgid/sequel-talk/CADGZSSeTsS6BTBnkrrZZuCsrH90E5ujnOktDM_BcMomL77vj_w%40mail.gmail.com<https://groups.google.com/d/msgid/sequel-talk/CADGZSSeTsS6BTBnkrrZZuCsrH90E5ujnOktDM_BcMomL77vj_w%40mail.gmail.com?utm_medium=email&utm_source=footer>. -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to sequel-talk+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sequel-talk/SJ0PR18MB40604B3496B3F67CCAC2B34496B79%40SJ0PR18MB4060.namprd18.prod.outlook.com.