[GENERAL] partitioning for speed, but query planner ignores
We have a by-our-standards large table (about 40e6 rows). Since it is the bottleneck in some places, I thought I'd experiment with partitioning. I'm following the instructions here: http://www.postgresql.org/docs/current/static/ddl-partitioning.html The table holds data about certain objects, each of which has an object number and some number of historical entries (like account activity at a bank, say). The typical usage pattern is: relatively rare inserts that happen in the background via an automated process (meaning I don't care if they take a little longer) and frequent querying, including some where a human is sitting in front of it (i.e. I'd like it to be a lot faster). Our most frequent queries either select all history for object N or most recent item for some subset of objects. Because object number figure so prominently, I thought I'd partition on that. To me, it makes the most sense from a load-balancing perspective to partition on the mod of the object number (for this test, evens vs odds, but planning to go up to mod 10 or even mod 100). Lower numbers are going to be queried much less often than higher numbers. This scheme also means I never have to add partitions in the future. I set up my check constraints ((objnum % 2) = 0 and (objnum % 2) = 1 on the relevant tables) and turned constraint_exclusion to 'partition' in postgresql.conf. I also turned it to 'on' in my psql interface. However, when I run an explain or an explain analyze, I still seeing it checking both partitions. Is this because the query planner doesn't want to do a mod? Should I go with simple ranges, even though this adds a maintenance task? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning for speed, but query planner ignores
On Wed, 2 Oct 2013 08:34:44 -0400, David Rysdam drys...@ll.mit.edu wrote: However, when I run an explain or an explain analyze, I still seeing it checking both partitions. Is this because the query planner doesn't want to do a mod? Should I go with simple ranges, even though this adds a maintenance task? I guess I should give some administrivia as well: Server is 9.2.1 running Linux. The configuration is otherwise pretty vanilla with only minor, and poorly-understood, conf changes. smime.p7s Description: S/MIME cryptographic signature
Re: [GENERAL] partitioning for speed, but query planner ignores
On Wed, 2 Oct 2013 08:34:44 -0400 David Rysdam drys...@ll.mit.edu wrote: We have a by-our-standards large table (about 40e6 rows). Since it is the bottleneck in some places, I thought I'd experiment with partitioning. I'm following the instructions here: http://www.postgresql.org/docs/current/static/ddl-partitioning.html The table holds data about certain objects, each of which has an object number and some number of historical entries (like account activity at a bank, say). The typical usage pattern is: relatively rare inserts that happen in the background via an automated process (meaning I don't care if they take a little longer) and frequent querying, including some where a human is sitting in front of it (i.e. I'd like it to be a lot faster). Our most frequent queries either select all history for object N or most recent item for some subset of objects. Because object number figure so prominently, I thought I'd partition on that. To me, it makes the most sense from a load-balancing perspective to partition on the mod of the object number (for this test, evens vs odds, but planning to go up to mod 10 or even mod 100). Lower numbers are going to be queried much less often than higher numbers. This scheme also means I never have to add partitions in the future. I set up my check constraints ((objnum % 2) = 0 and (objnum % 2) = 1 on the relevant tables) and turned constraint_exclusion to 'partition' in postgresql.conf. I also turned it to 'on' in my psql interface. However, when I run an explain or an explain analyze, I still seeing it checking both partitions. Is this because the query planner doesn't want to do a mod? Should I go with simple ranges, even though this adds a maintenance task? Last I looked, the partitioning mechanism isn't _quite_ as smart as could be desired. For example: SELECT * FROM table WHERE objnum = 5; -- will not take advantage of partition You have to give the planner a little more hint as to the fact that it can take advantage of the partition: SELECT * FROM table WHERE (objnum % 2) = 1 AND objnum = 5; As silly as it seems, this is enough information for the planner to know that it only needs to scan one partition. If this doesn't answer your question, you should probably provide some more details (actual query and actual explain output, for example) to help people better help you. -- Bill Moran wmo...@potentialtech.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning for speed, but query planner ignores
On Wed, 2 Oct 2013 09:12:02 -0400, Bill Moran wmo...@potentialtech.com wrote: Last I looked, the partitioning mechanism isn't _quite_ as smart as could be desired. For example: SELECT * FROM table WHERE objnum = 5; -- will not take advantage of partition You have to give the planner a little more hint as to the fact that it can take advantage of the partition: SELECT * FROM table WHERE (objnum % 2) = 1 AND objnum = 5; As silly as it seems, this is enough information for the planner to know that it only needs to scan one partition. This seemed ridiculously silly until I thought about it. I guess it has no way of unwrapping my constraint and figuring out what to do. Would this also apply if I did ranges or is that a common enough constraint that it *can* figure it out without me having to modify all my queries? smime.p7s Description: S/MIME cryptographic signature
Re: [GENERAL] partitioning for speed, but query planner ignores
David Rysdam drys...@ll.mit.edu wrote: Would the planner be smart enough to figure out ranges without me having to hint my queries? Yes, it handles ranges well. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning for speed, but query planner ignores
On Wed, Oct 2, 2013 at 9:01 AM, David Rysdam drys...@ll.mit.edu wrote: I had some vague notion of tables doing work but really if it can load one partition into RAM and get most of my hits from there, it'd be a big win. The same concept applies to the frequently-used indexes on that partition.