Re: [GENERAL] How do I bump a row to the front of sort efficiently
Or maybe instead of a view you could write a set-returning function, e.g. as described here: I thought I'd see if I could make this work just for fun. Here is a simple proof of concept (on 9.3): -- DROP TABLE IF EXISTS topics; CREATE TABLE topics ( id INTEGER PRIMARY KEY, bumped_at INTEGER NOT NULL ); INSERT INTO topics SELECT a, a * 2 FROM generate_series(1, 1000) s(a) ; CREATE OR REPLACE FUNCTION topics_sorted_after_id(INT, INT) RETURNS TABLE(id int, after_top int, bumped_at int) AS $$ SELECT id, 0 AS after_top, bumped_at FROMtopics WHERE id = $1 UNION ALL (SELECT id, 1 AS after_top, bumped_at FROM topics WHERE id IS DISTINCT FROM $1 ORDER BY bumped_at DESC LIMIT $2 - 1) ORDER BY after_top, bumped_at DESC $$ LANGUAGE sql; SELECT * FROM topics_sorted_after_id(45, 30); That looks to me like it gives the right results. I'm curious if RETURNS TABLE is the right approach to use here or if there is something nicer. What if the ORM insists on `FROM topics`? Is there any way to rewrite the query or function to work around that? Paul -- _ Pulchritudo splendor veritatis. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
Hey, I'm not a guru, here is what I understood. You are mixing several problems in the same question : - 1. why the planner isn't more efficient - 2. why the workaround is difficult to use with an ORM. for 1. you can't do much (as said by others, you don't really need a case here anyway). I think using a CASE is equivalent for the planner to using your own custom blackbox function. So no way to improve that. for 2. : if you can't pass limit and offset in your ORM, a small workaround is to number your row following the order you defined with the function row_number() over(your order here), then you can use your ORM to design where conditions equivalent to limit and offset : WHERE row_number BETWEEN your_offset AND your_limit Cheers, Rémi-C 2015-02-04 21:40 GMT+01:00 Paul Jungwirth p...@illuminatedcomputing.com: I imagine your original would be at risk of LIMITing out the very row you seek to get at the top, since you don't have an ORDER BY to tell it which ones to keep during the outer LIMIT. Here is an old thread about combining ORDER BY with UNION: http://www.postgresql.org/message-id/16814.1280268...@sss.pgh.pa.us So I think this query would work: select * from topic where id = 1000 union all (select * from topic where id 1000 order by bumped_at desc limit 29) order by case when id = 1000 then 0 else 1 end, bumped_at desc ; I need to be able to offset and limit the union hack in a view, which is proving very tricky. Since this is sort of a parameterized view (which Postgres does not have) you are probably better off figuring out how to make the UNION query work with your ORM. What ORM is it? Maybe someone here can help you with that. Or maybe instead of a view you could write a set-returning function, e.g. as described here: http://stackoverflow.com/questions/11401749/pass-in-where-parameters-to-postgresql-view Paul -- _ Pulchritudo splendor veritatis. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
I imagine your original would be at risk of LIMITing out the very row you seek to get at the top, since you don't have an ORDER BY to tell it which ones to keep during the outer LIMIT. Here is an old thread about combining ORDER BY with UNION: http://www.postgresql.org/message-id/16814.1280268...@sss.pgh.pa.us So I think this query would work: select * from topic where id = 1000 union all (select * from topic where id 1000 order by bumped_at desc limit 29) order by case when id = 1000 then 0 else 1 end, bumped_at desc ; I need to be able to offset and limit the union hack in a view, which is proving very tricky. Since this is sort of a parameterized view (which Postgres does not have) you are probably better off figuring out how to make the UNION query work with your ORM. What ORM is it? Maybe someone here can help you with that. Or maybe instead of a view you could write a set-returning function, e.g. as described here: http://stackoverflow.com/questions/11401749/pass-in-where-parameters-to-postgresql-view Paul -- _ Pulchritudo splendor veritatis. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
The union hack may be able to work, but what I want is slightly more complex I want this to work efficiently, even without the case it chokes: explain select * from testing order by id in (100,2,-1) desc, id limit 30; QUERY PLAN --- Limit (cost=4869.45..4869.52 rows=30 width=9) - Sort (cost=4869.45..5119.45 rows=10 width=9) Sort Key: ((id = ANY ('{100,2,-1}'::integer[]))), id - Seq Scan on testing (cost=0.00..1916.00 rows=10 width=9) (4 rows) I need to be able to offset and limit the union hack in a view, which is proving very tricky. On Wed, Feb 4, 2015 at 9:15 PM, BladeOfLight16 bladeofligh...@gmail.com wrote: On Tue, Feb 3, 2015 at 11:28 PM, Sam Saffron sam.saff...@gmail.com wrote: Note: I still consider this a bug/missing feature of sorts since the planner could do better here, and there is no real clean way of structuring a query to perform efficiently here, which is why I erroneously cross posted this to hacker initially: No, it should not be considered a bug or a deficiency. You're telling the system to use a computed value that uses an arbitrary logic construct for sorting, and that's before you actually give it a filter to work with. You're also trying to abuse ORDER BY to make LIMIT do filtering. How do you expect that to go? I would expect the planner to do exactly what you told it you wanted: sort the entire table by that computed value, and it would have to do a sequential scan to compute the value for every row before it knew which ones came first for the LIMIT. CASE is well known to cause optimization problems; arbitrary conditional logic isn't especially conducive to a planner optimizing things. In general, the planner has no idea what the logic really means, so the planner would have to have some kind of special logic trying to pick up on this case. Your particular use case is uncommon; why should the planner code be junked up with a thousand little optimizations for uncommon situations like this (which would make it unmaintainable) when you already have a reasonable alternative? PG is great for a reason: the devs have made a lot of fantastic choices in designing it. Trying to keep the planner relatively simple is one of them, if I recall correctly. What you want is well represented by a UNION query: you want it to fetch one particular row by ID, and then you want to tack on 29 other particular rows based on a particular sort order and possibly an offset. These are two completely disparate ways of fetching data; of course the most optimal way is going to be to essentially write two queries and put the results together. That's also going to be the clearest way of writing the query, so the next person to work on it knows what you were doing. And that aside, you don't even need to use CASE. You could've just used (id = 1), which would give you a boolean result. You can most certainly sort by a boolean. (I believe PG sorts TRUE first.) You should see if that gets optimized at all. (If it doesn't, I won't be surprised and everything else I said still holds.) By the way, you can do better on your UNION query: select * from topic where id = 1000 union all (select * from topic where id 1000 order by bumped_at desc limit 29) I imagine your original would be at risk of LIMITing out the very row you seek to get at the top, since you don't have an ORDER BY to tell it which ones to keep during the outer LIMIT. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
On Tue, Feb 3, 2015 at 11:28 PM, Sam Saffron sam.saff...@gmail.com wrote: Note: I still consider this a bug/missing feature of sorts since the planner could do better here, and there is no real clean way of structuring a query to perform efficiently here, which is why I erroneously cross posted this to hacker initially: No, it should not be considered a bug or a deficiency. You're telling the system to use a *computed value* that uses an *arbitrary logic construct *for *sorting*, and that's *before* you actually give it a filter to work with. You're also trying to abuse ORDER BY to make LIMIT do filtering. How do you expect that to go? I would expect the planner to do exactly what you told it you wanted: sort the entire table by that computed value, and it would have to do a sequential scan to compute the value for every row before it knew which ones came first for the LIMIT. CASE is well known to cause optimization problems; arbitrary conditional logic isn't especially conducive to a planner optimizing things. In general, the planner has *no* idea what the logic really means, so the planner would have to have some kind of special logic trying to pick up on this case. Your particular use case is uncommon; why should the planner code be junked up with a thousand little optimizations for uncommon situations like this (which would make it unmaintainable) when you already have a reasonable alternative? PG is great for a reason: the devs have made a lot of fantastic choices in designing it. Trying to keep the planner relatively simple is one of them, if I recall correctly. What you want is well represented by a UNION query: you want it to fetch one particular row by ID, and then you want to tack on 29 other particular rows based on a particular sort order and possibly an offset. These are two completely disparate ways of fetching data; of course the most optimal way is going to be to essentially write two queries and put the results together. That's also going to be the *clearest* way of writing the query, so the next person to work on it knows what you were doing. And that aside, you don't even need to use CASE. You could've just used (id = 1), which would give you a boolean result. You can most certainly sort by a boolean. (I believe PG sorts TRUE first.) You should see if that gets optimized at all. (If it doesn't, I won't be surprised and everything else I said still holds.) By the way, you can do better on your UNION query: select * from topic where id = 1000 union all (select * from topic where id 1000 order by bumped_at desc limit 29) I imagine your original would be at risk of LIMITing out the very row you seek to get at the top, since you don't have an ORDER BY to tell it which ones to keep during the outer LIMIT.
Re: [GENERAL] How do I bump a row to the front of sort efficiently
On Tue, Feb 3, 2015 at 9:33 PM, BladeOfLight16 bladeofligh...@gmail.com wrote: This is why ORMs are bad. They make hard problems *much* harder, and the only benefit is that they maybe make easy problems a little quicker. The cost/savings is *heavily* skewed toward the cost, since there's no upper bound on the cost and there is a pretty small lower bound on the savings. Micro-ORMs tend to do a better job of not shielding you from (or rather, getting in the way of) the SQL while still providing some good result-to-object translation. Whether even that is necessary depends on your language, though. (For example, in Python, psycopg2 has a built in way of spitting out namedtuples, which means you get result-to-object translation out of the box. That makes even a micro-ORM pretty unnecessary. On the other hand, a micro-ORM that does this well without blocking you from the SQL, such as PetaPOCO, is a boon in .NET.) If you can, your best bet would probably be to find a way to get your ORM to execute raw SQL (with good parametrization to prevent injection attacks) and be done with it. It took me way too much experience fighting with an ORM on complicated queries to realize that. Er, *pretty small upper bound on the savings.
Re: [GENERAL] How do I bump a row to the front of sort efficiently
On Mon, Feb 2, 2015 at 1:16 AM, Sam Saffron sam.saff...@gmail.com wrote: However, the contortions on the above query make it very un-ORM friendly as I would need to define a view for it but would have no clean way to pass limits and offsets in. This is why ORMs are bad. They make hard problems *much* harder, and the only benefit is that they maybe make easy problems a little quicker. The cost/savings is *heavily* skewed toward the cost, since there's no upper bound on the cost and there is a pretty small lower bound on the savings. Micro-ORMs tend to do a better job of not shielding you from (or rather, getting in the way of) the SQL while still providing some good result-to-object translation. Whether even that is necessary depends on your language, though. (For example, in Python, psycopg2 has a built in way of spitting out namedtuples, which means you get result-to-object translation out of the box. That makes even a micro-ORM pretty unnecessary. On the other hand, a micro-ORM that does this well without blocking you from the SQL, such as PetaPOCO, is a boon in .NET.) If you can, your best bet would probably be to find a way to get your ORM to execute raw SQL (with good parametrization to prevent injection attacks) and be done with it. It took me way too much experience fighting with an ORM on complicated queries to realize that.
Re: [GENERAL] How do I bump a row to the front of sort efficiently
Note: I still consider this a bug/missing feature of sorts since the planner could do better here, and there is no real clean way of structuring a query to perform efficiently here, which is why I erroneously cross posted this to hacker initially: # create table testing(id serial primary key, data varchar); # insert into testing(data) select 'test' from pg_tables a,pg_tables b,pg_tables c,pg_tables d limit 10 # explain select * from testing order by id limit 30; QUERY PLAN -- Limit (cost=0.29..1.24 rows=30 width=9) - Index Scan using testing_pkey on testing (cost=0.29..3148.29 rows=10 width=9) (2 rows) # explain select * from testing where id = 1000; QUERY PLAN Index Scan using testing_pkey on testing (cost=0.29..8.31 rows=1 width=9) Index Cond: (id = 1000) (2 rows) # explain select * from testing order by case when id = 1000 then 0 else 1 end, id limit 30; QUERY PLAN --- Limit (cost=4744.45..4744.52 rows=30 width=9) - Sort (cost=4744.45..4994.45 rows=10 width=9) Sort Key: (CASE WHEN (id = 1000) THEN 0 ELSE 1 END), id - Seq Scan on testing (cost=0.00..1791.00 rows=10 width=9) (4 rows) Cost goes through the roof for a query that pg could have have done better with if it were able to better understand the case statement. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
On 02/02/15 08:40, hubert depesz lubaczewski wrote: On Mon, Feb 02, 2015 at 05:16:40PM +1100, Sam Saffron wrote: Even this is fast, and logically equiv as id is primary key unique select * from topic where id = 1000 union all select * from ( select * from topics where id 1000 order by bumped_at desc limit 30 ) as x limit 30 Is there any clean technique to bump up particular rows to the front of a sort if a certain condition is met without paying a huge performance hit? Why don't you use the union-all approach? If it's fast, and does what you need ? depesz Or create another index or column that's build at insertion time with the sort that you want in it. At least then retrieval time will be fast at the cost of extra space. -- Tim Clarke A: Because we read from top to bottom, left to right. Q: Why should I start my reply below the quoted text? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do I bump a row to the front of sort efficiently
On Mon, Feb 02, 2015 at 05:16:40PM +1100, Sam Saffron wrote: Even this is fast, and logically equiv as id is primary key unique select * from topic where id = 1000 union all select * from ( select * from topics where id 1000 order by bumped_at desc limit 30 ) as x limit 30 Is there any clean technique to bump up particular rows to the front of a sort if a certain condition is met without paying a huge performance hit? Why don't you use the union-all approach? If it's fast, and does what you need ? depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] How do I bump a row to the front of sort efficiently
I have this query: select * from topics order by case when id=1 then 0 else 1 end, bumped_at desc limit 30 It works fine, bumps id 1 to the front of the sort fine but is terribly inefficient and scans OTH select * from topics where id = 1 is super fast select * from topics order by bumped_at desc limit 30 is super fast Even this is fast, and logically equiv as id is primary key unique select * from topic where id = 1000 union all select * from ( select * from topics where id 1000 order by bumped_at desc limit 30 ) as x limit 30 However, the contortions on the above query make it very un-ORM friendly as I would need to define a view for it but would have no clean way to pass limits and offsets in. Is there any clean technique to bump up particular rows to the front of a sort if a certain condition is met without paying a huge performance hit? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general