Re: [PERFORM] Planner question - bit data types
On Fri, Sep 4, 2009 at 6:29 PM, Josh Berkusj...@agliodbs.com wrote: Karl, For reference, I was having SEVERE performance problems with the following comparison in an SQL statement where mask was an integer: select ... from where .. and (permission mask = permission) AFAIK, the only way to use an index on these queries is through expression indexes. That's why a lot of folks use INTARRAY instead; it comes with a GIN index type. It would probably be possible to create a new index type using GiST or GIN which indexed bitstrings automatically, but I don't know that anyone has done it yet. Changing your integer to a bitstring will not, to my knowledge, improve this. agreed. also, gist/gin is no free lunch, maintaining these type of indexes is fairly expensive. If you are only interested in one or a very small number of cases of 'permission', you can use an expression index to target constant values: select ... from where .. and (permission mask = permission) create index foo_permission_xyz_idx on foo((64 mask = 64)); select * from foo where 64 mask = 64; --indexed! this optimizes a _particular_ case of permission into a boolean based index. this can be a big win if the # of matching cases is very small or you want to use this in a multi-column index. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Merlin Moncure mmonc...@gmail.com writes: If you are only interested in one or a very small number of cases of 'permission', you can use an expression index to target constant values: select ... from where .. and (permission mask = permission) create index foo_permission_xyz_idx on foo((64 mask = 64)); select * from foo where 64 mask = 64; --indexed! A possibly more useful variant is to treat the permission condition as a partial index's WHERE condition. The advantage of that is that the index's actual content can be some other column, so that you can combine the permission check with a second indexable test. The index is still available for queries that don't use the other column, but it's more useful for those that do. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Tom Lane wrote: Merlin Moncure mmonc...@gmail.com writes: If you are only interested in one or a very small number of cases of 'permission', you can use an expression index to target constant values: select ... from where .. and (permission mask = permission) create index foo_permission_xyz_idx on foo((64 mask = 64)); select * from foo where 64 mask = 64; --indexed! A possibly more useful variant is to treat the permission condition as a partial index's WHERE condition. The advantage of that is that the index's actual content can be some other column, so that you can combine the permission check with a second indexable test. The index is still available for queries that don't use the other column, but it's more useful for those that do. regards, tom lane That doesn't help in this case as the returned set will typically be quite large, with the condition typically being valid on anywhere from 10-80% of the returned tuples. What I am trying to avoid is creating a boolean column for EACH potential bit (and an index on each), as that makes the schema non-portable for others and quite messy as well - while there are a handful of known masks the system also has a number of user defined bit positions that vary from installation to installation. -- Karl attachment: karl.vcf -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Karl Denninger k...@denninger.net writes: That doesn't help in this case as the returned set will typically be quite large, with the condition typically being valid on anywhere from 10-80% of the returned tuples. In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Tom Lane wrote: Karl Denninger k...@denninger.net writes: That doesn't help in this case as the returned set will typically be quite large, with the condition typically being valid on anywhere from 10-80% of the returned tuples. In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. regards, tom lane The query, sans this condition, is extremely fast and contains a LOT of other conditions (none of which cause trouble.) It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble. -- Karl attachment: karl.vcf -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Karl Denninger k...@denninger.net writes: Tom Lane wrote: In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. The query, sans this condition, is extremely fast and contains a LOT of other conditions (none of which cause trouble.) It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble. My comment stands: asking about how to use an index for this is the wrong question. You never showed us any EXPLAIN results, but I suspect what is happening is that the planner thinks the permission mask = permission condition is fairly selective (offhand I think it'd default to DEFAULT_EQ_SEL or 0.005) whereas the true selectivity per your prior comment is only 0.1 to 0.8. This is causing it to change to a plan that would be good for a small number of rows, when it should stick to a plan that is good for a large number of rows. So the right question is how do I fix the bad selectivity estimate?. Unfortunately there's no non-kluge answer. What I think I'd try is wrapping the condition into a function, say create function permission_match(perms int, mask int) returns bool as $$begin return perms mask = mask; end$$ language plpgsql strict immutable; The planner won't know what to make of where permission_match(perms, 64) either, but the default selectivity estimate for a boolean function is 0.333, much closer to what you need. Or plan B, which I'd recommend, is to forget the mask business and go over to a boolean column per permission flag. Then the planner would actually have decent statistics about the flag selectivities, and the queries would be a lot more readable too. Your objection that you'd need an index per flag column is misguided --- at these selectivities an index is really pointless. And I entirely fail to understand the complaint about it being unportable; you think is more portable than boolean? Only one of those things is in the SQL standard. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Tom Lane wrote: Karl Denninger k...@denninger.net writes: Tom Lane wrote: In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. The query, sans this condition, is extremely fast and contains a LOT of other conditions (none of which cause trouble.) It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble. My comment stands: asking about how to use an index for this is the wrong question. You never showed us any EXPLAIN results, Yes I did. Go back and look at the archives. I provided full EXPLAIN and EXPLAIN ANALYZE results for the original query. Sheesh. Or plan B, which I'd recommend, is to forget the mask business and go over to a boolean column per permission flag. Then the planner would actually have decent statistics about the flag selectivities, and the queries would be a lot more readable too. Your objection that you'd need an index per flag column is misguided --- at these selectivities an index is really pointless. And I entirely fail to understand the complaint about it being unportable; you think is more portable than boolean? Only one of those things is in the SQL standard. regards, tom lane The point isn't portability to other SQL engines - it is to other people's installations. The bitmask is (since it requires only changing the mask constants in the container file that makes the SQL calls by reference) where explicit columns is not by a long shot. In any event it looks like that's the only reasonable way to do this, so thanks (I think) -- Karl attachment: karl.vcf -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
Karl Denninger escribió: Tom Lane wrote: You never showed us any EXPLAIN results, Yes I did. Go back and look at the archives. I provided full EXPLAIN and EXPLAIN ANALYZE results for the original query. Sheesh. You did? Where? This is your first message in this thread: http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php No EXPLAINs anywhere to be seen. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it. I give up; the attack-dog crowd has successfully driven me off. Ciao. Alvaro Herrera wrote: Karl Denninger escribió: Tom Lane wrote: You never showed us any EXPLAIN results, Yes I did. Go back and look at the archives. I provided full EXPLAIN and EXPLAIN ANALYZE results for the original query. Sheesh. You did? Where? This is your first message in this thread: http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php No EXPLAINs anywhere to be seen. attachment: karl.vcf -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner question - bit data types
On Sat, Sep 5, 2009 at 8:19 PM, Karl Denningerk...@denninger.net wrote: There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it. I give up; the attack-dog crowd has successfully driven me off. Ciao. Another more standard sql approach is to push the flags out to a subordinate table. This is less efficient of course but now you get to use standard join tactics to match conditions... merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance