Re: [PERFORM] Index usage for tstzrange?
Heikki Linnakangas writes: > We should do this automatically. Or am I missing something? Yes. This is not equality. > ALTER OPERATOR FAMILY integer_ops USING btree ADD > OPERATOR 3 <@ (int4, int4range), > FUNCTION 1 btint4rangecmp(int4, int4range); That will break approximately everything in sight, starting with the planner's opinion of what equality is. There is *way* too much stuff that knows the semantics of btree opclasses for us to start jamming random operators into them, even if this seemed to work in trivial testing. (See the last section of src/backend/access/nbtree/README to just scratch the surface of the assumptions this breaks.) It's possible that for constant ranges we could have the planner expand "intcol <@ 'x,y'::int4range" into "intcol between x and y", using something similar to the index LIKE optimization (ie, the "special operator" stuff in indxpath.c). I'd like to find a way to make that type of optimization pluggable, though --- the existing approach of hard-wiring knowledge into indxpath.c has never been anything but a kluge, and it definitely doesn't scale as-is to anything except built-in types and operators. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage for tstzrange?
> We should do this automatically. Or am I missing something? Aside from the need to support @> as well, not that I can see. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage for tstzrange?
On 22.03.2013 02:05, Josh Berkus wrote: Well, no.<@ is not a btree-indexable operator. Yes, but it's equivalent to ( ( a>= b1 or b1 is null ) and ( a< b2 or b2 is null ) ), which *is* btree-indexable and can use an index. So it seems like the kind of optimization we could eventually make. Yeah. The sort order of <@ is the same as regular b-tree, so it should be possible. In fact, nothing stops you from creating the suitable operator and b-tree support functions. See attached patch for int4, but the same should work for timestamptz. We should do this automatically. Or am I missing something? - Heikki CREATE or replace FUNCTION btint4rangecmp(int4, int4range) RETURNS int4 AS $$ SELECT CASE WHEN $1 < lower($2) THEN -1 WHEN $1 < upper($2) OR upper($2) IS NULL THEN 0 ELSE 1 END $$ LANGUAGE SQL IMMUTABLE STRICT; CREATE FUNCTION int4_contained_by_int4range (int4, int4range) RETURNS bool AS 'elem_contained_by_range' LANGUAGE internal IMMUTABLE STRICT; CREATE OPERATOR <@ ( PROCEDURE=int4_contained_by_int4range, LEFTARG=int4, RIGHTARG=int4range ); ALTER OPERATOR FAMILY integer_ops USING btree ADD OPERATOR 3 <@ (int4, int4range), FUNCTION 1 btint4rangecmp(int4, int4range); -- Create a test table create table inttable (i int4 primary key); insert into inttable select g from generate_series(1,10) g; -- Query it. Uses index, hooray! select * from inttable where i <@ int4range(111, 120); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage for tstzrange?
On 21.03.2013 17:55, Alexander Korotkov wrote: On Thu, Mar 21, 2013 at 12:52 PM, Heikki Linnakangas< The immediate fix is attached, but this made me realize that rangesel() is still missing estimation for the "element<@ range" operator. It shouldn't be hard to implement, I'm pretty sure we have all the statistics we need for that. Probably we could even call existing scalarltsel and scalargtsel for this case. I came up with the attached. I didn't quite use scalarltsel, but I used the scalarineqsel function, which contains the "guts" of scalarltsel and scalargtsel. One thing I wasn't quite sure of (from the patch): /* * We use the data type's default < operator. This is bogus, if the range * type's rngsubopc operator class is different. In practice, that ought * to be rare. It would also be bogus to use the < operator from the * rngsubopc operator class, because the statistics are collected using * using the default operator class, anyway. * * For the same reason, use the default collation. The statistics are * collected with the default collation. */ Does that make sense? The other option would be to use the < operator from the rngsubopc op class, even though the scalar statistics are collected with the default b-tree < operator. As long as the two sort roughly the same way, you get reasonable results either way. Yet another option would be to use histogram_selectivity() instead of ineq_histogram_selectivity(), if the range's rngsubopc opclass isn't the type's default opclass. histogram_selectivity() works with any operator regardless of the sort ordering, basically using the histogram values merely as a sample, rather than as a histogram. But I'm reluctant to make this any more complicated, as using a non-default opclass for the range type is rare. - Heikki diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c index c450c6a..60700b8 100644 --- a/src/backend/utils/adt/rangetypes_selfuncs.c +++ b/src/backend/utils/adt/rangetypes_selfuncs.c @@ -55,6 +55,8 @@ static double calc_hist_selectivity_contains(TypeCacheEntry *typcache, RangeBound *lower, RangeBound *upper, RangeBound *hist_lower, int hist_nvalues, Datum *length_hist_values, int length_hist_nvalues); +static double calc_elem_contained_by_sel(PlannerInfo *root, TypeCacheEntry *typcache, + VariableStatData *vardata, RangeType *constval); /* * Returns a default selectivity estimate for given operator, when we don't @@ -155,18 +157,35 @@ rangesel(PG_FUNCTION_ARGS) } /* - * OK, there's a Var and a Const we're dealing with here. We need the - * Const to be of same range type as the column, else we can't do anything - * useful. (Such cases will likely fail at runtime, but here we'd rather - * just return a default estimate.) - * - * If the operator is "range @> element", the constant should be of the - * element type of the range column. Convert it to a range that includes - * only that single point, so that we don't need special handling for - * that in what follows. + * OK, there's a Var and a Const we're dealing with here. Check that the + * Const is of the right type, else we can't do anything useful. (Such + * cases will likely fail at runtime, but here we'd rather just return a + * default estimate.) */ - if (operator == OID_RANGE_CONTAINS_ELEM_OP) + if (operator == OID_RANGE_ELEM_CONTAINED_OP) + { + /* + * "element <@ range" is quite different from the other range + * operators, in that the Var is not a range, but of the element type. + */ + typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype); + + if (typcache->rngelemtype->type_id == vardata.vartype) + { + constrange = DatumGetRangeType(((Const *) other)->constvalue); + selec = calc_elem_contained_by_sel(root, typcache, &vardata, constrange); + } + else + selec = default_range_selectivity(operator); + } + else if (operator == OID_RANGE_CONTAINS_ELEM_OP) { + /* + * In "range @> element", the constant should be of the element type + * of the range column. Convert it to a range that includes only that + * single point, so that we don't need special handling for that in + * what follows. + */ typcache = range_get_typcache(fcinfo, vardata.vartype); if (((Const *) other)->consttype == typcache->rngelemtype->type_id) @@ -181,26 +200,29 @@ rangesel(PG_FUNCTION_ARGS) upper.infinite = false; upper.lower = false; constrange = range_serialize(typcache, &lower, &upper, false); + + selec = calc_rangesel(typcache, &vardata, constrange, operator); } + else + selec = default_range_selectivity(operator); } else { - typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype); + /* + * In all other range operators, both operands are ranges, and they + * must be of the same type. + */ + typcache
Re: [PERFORM] Index usage for tstzrange?
> Well, no. <@ is not a btree-indexable operator. Yes, but it's equivalent to ( ( a >= b1 or b1 is null ) and ( a < b2 or b2 is null ) ), which *is* btree-indexable and can use an index. So it seems like the kind of optimization we could eventually make. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage for tstzrange?
On 21.03.2013 06:07, Vasilis Ventirozos wrote: On Thu, Mar 21, 2013 at 5:58 AM, Tom Lane wrote: What I find more disturbing is that this is what I get from the example in HEAD: regression=# explain SELECT * FROM a WHERE ts<@ tstzrange('2013-01-01','2013-01-01 00:10:00'); ERROR: XX000: type 1184 is not a range type LOCATION: range_get_typcache, rangetypes.c:1451 Haven't traced through it to determine exactly what's happening, but isn't this a legitimate usage? And if it isn't, surely a more user-facing error ought to be getting thrown somewhere upstream of here. It is a legit usage, this is from a test i did myself (9.2.3) test=# explain SELECT * FROM a WHERE ts<@ tstzrange('2013-01-01','2013-04-01 00:10:00'); QUERY PLAN Seq Scan on a (cost=0.00..23.75 rows=1 width=44) Filter: (ts<@ '["2013-01-01 00:00:00+02","2013-04-01 00:10:00+03")'::tstzrange) Looks like the range type cost estimation patch broke this, back in August already. The case of var <@ constant, where constant is a range and var is an element, that's broken. The cost estimation function, rangesel(), incorrectly assumes that the 'var' is always a range type. It's a bit worrying that no-one noticed until now. I'll add a test for that operator to the rangetypes regression test. The immediate fix is attached, but this made me realize that rangesel() is still missing estimation for the "element <@ range" operator. It shouldn't be hard to implement, I'm pretty sure we have all the statistics we need for that. - Heikki diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c index 76dc913..c450c6a 100644 --- a/src/backend/utils/adt/rangetypes_selfuncs.c +++ b/src/backend/utils/adt/rangetypes_selfuncs.c @@ -154,8 +154,6 @@ rangesel(PG_FUNCTION_ARGS) } } - typcache = range_get_typcache(fcinfo, vardata.vartype); - /* * OK, there's a Var and a Const we're dealing with here. We need the * Const to be of same range type as the column, else we can't do anything @@ -169,6 +167,8 @@ rangesel(PG_FUNCTION_ARGS) */ if (operator == OID_RANGE_CONTAINS_ELEM_OP) { + typcache = range_get_typcache(fcinfo, vardata.vartype); + if (((Const *) other)->consttype == typcache->rngelemtype->type_id) { RangeBound lower, upper; @@ -185,6 +185,8 @@ rangesel(PG_FUNCTION_ARGS) } else { + typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype); + if (((Const *) other)->consttype == vardata.vartype) constrange = DatumGetRangeType(((Const *) other)->constvalue); } -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage for tstzrange?
On Thu, Mar 21, 2013 at 5:58 AM, Tom Lane wrote: > Josh Berkus writes: > > I just noticed that if I use a tstzrange for convenience, a standard > > btree index on a timestamp won't get used for it. Example: > > > table a ( > > id int, > > val text, > > ts timestamptz > > ); > > index a_ts on a(ts); > > > SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-01-01 00:10:00') > > > ... will NOT use the index a_ts. > > Well, no. <@ is not a btree-indexable operator. > > What I find more disturbing is that this is what I get from the example > in HEAD: > > regression=# explain SELECT * FROM a WHERE ts <@ > tstzrange('2013-01-01','2013-01-01 00:10:00'); > ERROR: XX000: type 1184 is not a range type > LOCATION: range_get_typcache, rangetypes.c:1451 > > Haven't traced through it to determine exactly what's happening, but > isn't this a legitimate usage? And if it isn't, surely a more > user-facing error ought to be getting thrown somewhere upstream of here. > > regards, tom lane > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > It is a legit usage, this is from a test i did myself (9.2.3) test=# explain SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-04-01 00:10:00'); QUERY PLAN Seq Scan on a (cost=0.00..23.75 rows=1 width=44) Filter: (ts <@ '["2013-01-01 00:00:00+02","2013-04-01 00:10:00+03")'::tstzrange)
Re: [PERFORM] Index usage for tstzrange?
Josh Berkus writes: > I just noticed that if I use a tstzrange for convenience, a standard > btree index on a timestamp won't get used for it. Example: > table a ( > id int, > val text, > ts timestamptz > ); > index a_ts on a(ts); > SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-01-01 00:10:00') > ... will NOT use the index a_ts. Well, no. <@ is not a btree-indexable operator. What I find more disturbing is that this is what I get from the example in HEAD: regression=# explain SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-01-01 00:10:00'); ERROR: XX000: type 1184 is not a range type LOCATION: range_get_typcache, rangetypes.c:1451 Haven't traced through it to determine exactly what's happening, but isn't this a legitimate usage? And if it isn't, surely a more user-facing error ought to be getting thrown somewhere upstream of here. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Index usage for tstzrange?
Folks, I just noticed that if I use a tstzrange for convenience, a standard btree index on a timestamp won't get used for it. Example: table a ( id int, val text, ts timestamptz ); index a_ts on a(ts); SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-01-01 00:10:00') ... will NOT use the index a_ts. Is this something which could be fixed for 9.4? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage for min() vs. "order by asc limit 1"
can you run an analyze command first and then post here the results of: select * FROM pg_stats WHERE tablename = 'delayed_jobs'; ? -- View this message in context: http://postgresql.1045698.n5.nabble.com/index-usage-for-min-vs-order-by-asc-limit-1-tp5002928p5004410.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage for min() vs. "order by asc limit 1"
On Nov 17, 2011, at 5:20 PM, Steve Atkins wrote: > > I don't think you want the group by in that first query. Heh, I tried to simply the example, but in reality that = becomes an in clause of multiple values. So the group by is needed. >> >> >> postgres=# explain analyze select min(id) from delayed_jobs where >> strand='sis_batch:account:15' group by strand; >> QUERY PLAN >> -- >> GroupAggregate (cost=0.00..8918.59 rows=66 width=29) (actual >> time=226.759..226.760 rows=1 loops=1) >> -> Seq Scan on delayed_jobs (cost=0.00..8553.30 rows=72927 width=29) >> (actual time=0.014..169.941 rows=72268 loops=1) >>Filter: ((strand)::text = 'sis_batch:account:15'::text) >> Total runtime: 226.817 ms >> (4 rows) >> >> postgres=# explain analyze select id from delayed_jobs where >> strand='sis_batch:account:15' order by id limit 1; >> QUERY >> PLAN >> - >> Limit (cost=0.00..0.33 rows=1 width=8) (actual time=0.097..0.098 rows=1 >> loops=1) >> -> Index Scan using index_delayed_jobs_on_strand on delayed_jobs >> (cost=0.00..24181.74 rows=72927 width=8) (actual time=0.095..0.095 rows=1 >> loops=1) >>Index Cond: ((strand)::text = 'sis_batch:account:15'::text) >> Total runtime: 0.129 ms >> (4 rows) >> >> >> -- >> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-performance > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage for min() vs. "order by asc limit 1"
On Nov 17, 2011, at 5:12 PM, Ben Chobot wrote: > I have two queries in PG 9.1. One uses an index like I would like, the other > does not. Is this expected behavior? If so, is there any way around it? I don't think you want the group by in that first query. Cheers, Steve > > > postgres=# explain analyze select min(id) from delayed_jobs where > strand='sis_batch:account:15' group by strand; >QUERY PLAN > -- > GroupAggregate (cost=0.00..8918.59 rows=66 width=29) (actual > time=226.759..226.760 rows=1 loops=1) > -> Seq Scan on delayed_jobs (cost=0.00..8553.30 rows=72927 width=29) > (actual time=0.014..169.941 rows=72268 loops=1) > Filter: ((strand)::text = 'sis_batch:account:15'::text) > Total runtime: 226.817 ms > (4 rows) > > postgres=# explain analyze select id from delayed_jobs where > strand='sis_batch:account:15' order by id limit 1; > QUERY > PLAN > - > Limit (cost=0.00..0.33 rows=1 width=8) (actual time=0.097..0.098 rows=1 > loops=1) > -> Index Scan using index_delayed_jobs_on_strand on delayed_jobs > (cost=0.00..24181.74 rows=72927 width=8) (actual time=0.095..0.095 rows=1 > loops=1) > Index Cond: ((strand)::text = 'sis_batch:account:15'::text) > Total runtime: 0.129 ms > (4 rows) > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] index usage for min() vs. "order by asc limit 1"
I have two queries in PG 9.1. One uses an index like I would like, the other does not. Is this expected behavior? If so, is there any way around it? postgres=# explain analyze select min(id) from delayed_jobs where strand='sis_batch:account:15' group by strand; QUERY PLAN -- GroupAggregate (cost=0.00..8918.59 rows=66 width=29) (actual time=226.759..226.760 rows=1 loops=1) -> Seq Scan on delayed_jobs (cost=0.00..8553.30 rows=72927 width=29) (actual time=0.014..169.941 rows=72268 loops=1) Filter: ((strand)::text = 'sis_batch:account:15'::text) Total runtime: 226.817 ms (4 rows) postgres=# explain analyze select id from delayed_jobs where strand='sis_batch:account:15' order by id limit 1; QUERY PLAN - Limit (cost=0.00..0.33 rows=1 width=8) (actual time=0.097..0.098 rows=1 loops=1) -> Index Scan using index_delayed_jobs_on_strand on delayed_jobs (cost=0.00..24181.74 rows=72927 width=8) (actual time=0.095..0.095 rows=1 loops=1) Index Cond: ((strand)::text = 'sis_batch:account:15'::text) Total runtime: 0.129 ms (4 rows) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage on queries on inherited tables
On Apr 27, 2011, at 11:11 PM, Joseph Shraibman wrote: > On 04/27/2011 04:32 PM, Robert Haas wrote: >> In the first case, PostgreSQL evidently thinks that using the indexes >> will be slower than just ignoring them. You could find out whether >> it's right by trying it with enable_seqscan=off. > > My point is that this is just a problem with inherited tables. It > should be obvious to postgres that few rows are being returned, but in > the inherited tables case it doesn't use indexes. This was just an > example. In a 52 gig table I have a "select id from table limit 1 order > by id desc" returns instantly, but as soon as you declare a child table > it tries to seq scan all the tables. Oh, sorry, I must have misunderstood. As Greg says, this is fixed in 9.1. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage on queries on inherited tables
Joseph Shraibman wrote: In a 52 gig table I have a "select id from table limit 1 order by id desc" returns instantly, but as soon as you declare a child table it tries to seq scan all the tables. This is probably the limitation that's fixed in PostgreSQL 9.1 by this commit (following a few others leading up to it): http://archives.postgresql.org/pgsql-committers/2010-11/msg00028.php There was a good example showing what didn't work as expected before (along with an earlier patch that didn't everything the larger 9.1 improvement does) at http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php ; "ORDER BY x DESC LIMIT 1" returns the same things as MAX(x). It's a pretty serious issue with the partitioning in earlier versions. I know of multiple people, myself included, who have been compelled to apply this change to an earlier version of PostgreSQL to make larger partitioned databases work correctly. The other option is to manually decompose the queries into ones that target each of the child tables individually, then combine the results, which is no fun either. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage on queries on inherited tables
On Wed, Apr 27, 2011 at 2:11 PM, Joseph Shraibman wrote: > On 04/27/2011 04:32 PM, Robert Haas wrote: > > In the first case, PostgreSQL evidently thinks that using the indexes > > will be slower than just ignoring them. You could find out whether > > it's right by trying it with enable_seqscan=off. > > My point is that this is just a problem with inherited tables. It > should be obvious to postgres that few rows are being returned, but in > the inherited tables case it doesn't use indexes. This was just an > example. In a 52 gig table I have a "select id from table limit 1 order > by id desc" returns instantly, but as soon as you declare a child table > it tries to seq scan all the tables. > > If I'm understanding correctly, this kind of obviates the utility of partitioning if you structure a warehouse in a traditional manner. Assuming a fact table partitioned by time, but with foreign keys to a time dimension, it is now not possible to gain any advantage from the partitioning if selecting on columns in the time dimension. "select * from fact_table f join time_dimension t on f.time_id = t.time_id where t.quarter=3 and t.year = 2010" will scan all partitions of the fact table despite the fact that all of the rows would come from 3 partitions, assuming a partitioning schema that uses one partition for each month. I use a time id that is calculable from the from the timestamp so it doesn't need to be looked up, and partitioning on time_id directly is easy enough to handle, but if I'm understanding the problem, it sounds like nothing short of computing the appropriate time ids before issuing the query and then including a 'where f.time_id between x and y' clause to the query will result in the partitions being correctly excluded. Is that what people are doing to solve this problem? The alternative is to leave a timestamp column in the fact table (something I tend to do since it makes typing ad-hoc queries in psql much easier) and partition on that column and then always include a where clause for that column that is at least as large as the requested row range. Both result in fairly ugly queries, though I can certainly see how I might structure my code to always build queries which adhere to this. I'm just in the process of designing a star schema for a project and was intending to use exactly the structure I described at the top of the email. Is there a postgres best-practices for solving this problem? There's no way I can get away without partitioning. I'm looking at a worst case table of 100,000 rows being written every 5 minutes, 24x7 - 29 million rows per day, a billion rows per month - with most queries running over a single month or comparing same months from differing years and quarters - so a month based partitioning. Normal case is closer to 10K rows per 5 minutes. Suggestions? --sam
Re: [PERFORM] index usage on queries on inherited tables
On 04/27/2011 04:32 PM, Robert Haas wrote: > In the first case, PostgreSQL evidently thinks that using the indexes > will be slower than just ignoring them. You could find out whether > it's right by trying it with enable_seqscan=off. My point is that this is just a problem with inherited tables. It should be obvious to postgres that few rows are being returned, but in the inherited tables case it doesn't use indexes. This was just an example. In a 52 gig table I have a "select id from table limit 1 order by id desc" returns instantly, but as soon as you declare a child table it tries to seq scan all the tables. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage on queries on inherited tables
On Fri, Apr 1, 2011 at 2:41 AM, Joseph Shraibman wrote: > When I do a query on a table with child tables on certain queries pg > uses indexes and on others it doesn't. Why does this happen? For example: > > > [local]:playpen=> explain analyze select * from vis where id > 10747 ; > QUERY > PLAN > > Result (cost=4.29..115.11 rows=325 width=634) (actual > time=0.063..0.116 rows=5 loops=1) > -> Append (cost=4.29..115.11 rows=325 width=634) (actual > time=0.053..0.090 rows=5 loops=1) > -> Bitmap Heap Scan on vis (cost=4.29..23.11 rows=5 > width=948) (actual time=0.051..0.058 rows=5 loops=1) > Recheck Cond: (id > 10747) > -> Bitmap Index Scan on vis_pkey (cost=0.00..4.29 > rows=5 width=0) (actual time=0.037..0.037 rows=5 loops=1) > Index Cond: (id > 10747) > -> Seq Scan on vis_for_seg_1_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_4_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_66_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_69_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_79_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_80_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_82_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) > Filter: (id > 10747) > -> Seq Scan on vis_for_seg_87_2011_03 vis (cost=0.00..11.50 > rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) > Filter: (id > 10747) > Total runtime: 0.724 ms > (23 rows) > > Time: 5.804 ms > [local]:playpen=> explain analyze select * from vis where id = 10747 ; > > QUERY > PLAN > > > Result (cost=0.00..74.41 rows=9 width=664) (actual time=0.060..0.503 > rows=1 loops=1) > -> Append (cost=0.00..74.41 rows=9 width=664) (actual > time=0.053..0.493 rows=1 loops=1) > -> Index Scan using vis_pkey on vis (cost=0.00..8.27 rows=1 > width=948) (actual time=0.051..0.055 rows=1 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_1_2011_03_pkey on > vis_for_seg_1_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.122..0.122 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_4_2011_03_pkey on > vis_for_seg_4_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.043..0.043 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_66_2011_03_pkey on > vis_for_seg_66_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.041..0.041 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_69_2011_03_pkey on > vis_for_seg_69_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.041..0.041 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_79_2011_03_pkey on > vis_for_seg_79_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.043..0.043 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_80_2011_03_pkey on > vis_for_seg_80_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.041..0.041 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_82_2011_03_pkey on > vis_for_seg_82_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.049..0.049 rows=0 loops=1) > Index Cond: (id = 10747) > -> Index Scan using vis_for_seg_87_2011_03_pkey on > vis_for_seg_87_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual > time=0.043..0.043 rows=0 loops=1) > Index Cond: (id = 10747) > Total runtime: 1.110 ms > (21 rows) > > [local]:playpen=> select version(); > > version > > PostgreSQL 9.0.3 on i
[PERFORM] index usage on queries on inherited tables
When I do a query on a table with child tables on certain queries pg uses indexes and on others it doesn't. Why does this happen? For example: [local]:playpen=> explain analyze select * from vis where id > 10747 ; QUERY PLAN Result (cost=4.29..115.11 rows=325 width=634) (actual time=0.063..0.116 rows=5 loops=1) -> Append (cost=4.29..115.11 rows=325 width=634) (actual time=0.053..0.090 rows=5 loops=1) -> Bitmap Heap Scan on vis (cost=4.29..23.11 rows=5 width=948) (actual time=0.051..0.058 rows=5 loops=1) Recheck Cond: (id > 10747) -> Bitmap Index Scan on vis_pkey (cost=0.00..4.29 rows=5 width=0) (actual time=0.037..0.037 rows=5 loops=1) Index Cond: (id > 10747) -> Seq Scan on vis_for_seg_1_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_4_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_66_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_69_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_79_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_80_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_82_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.001..0.001 rows=0 loops=1) Filter: (id > 10747) -> Seq Scan on vis_for_seg_87_2011_03 vis (cost=0.00..11.50 rows=40 width=629) (actual time=0.002..0.002 rows=0 loops=1) Filter: (id > 10747) Total runtime: 0.724 ms (23 rows) Time: 5.804 ms [local]:playpen=> explain analyze select * from vis where id = 10747 ; QUERY PLAN Result (cost=0.00..74.41 rows=9 width=664) (actual time=0.060..0.503 rows=1 loops=1) -> Append (cost=0.00..74.41 rows=9 width=664) (actual time=0.053..0.493 rows=1 loops=1) -> Index Scan using vis_pkey on vis (cost=0.00..8.27 rows=1 width=948) (actual time=0.051..0.055 rows=1 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_1_2011_03_pkey on vis_for_seg_1_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.122..0.122 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_4_2011_03_pkey on vis_for_seg_4_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.043..0.043 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_66_2011_03_pkey on vis_for_seg_66_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.041..0.041 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_69_2011_03_pkey on vis_for_seg_69_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.041..0.041 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_79_2011_03_pkey on vis_for_seg_79_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.043..0.043 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_80_2011_03_pkey on vis_for_seg_80_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.041..0.041 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_82_2011_03_pkey on vis_for_seg_82_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.049..0.049 rows=0 loops=1) Index Cond: (id = 10747) -> Index Scan using vis_for_seg_87_2011_03_pkey on vis_for_seg_87_2011_03 vis (cost=0.00..8.27 rows=1 width=629) (actual time=0.043..0.043 rows=0 loops=1) Index Cond: (id = 10747) Total runtime: 1.110 ms (21 rows) [local]:playpen=> select version(); version
Re: [PERFORM] Index usage with functions in where condition
Hi Tom, Thanks for the help - much appreciated. Yes I'm using PostGIS, and with a simply join to a relating table I could get access to the geometry for these point positions. Is using the GIST r-tree index faster than using the 2 b-tree indexes on the lat and long values? I guess this is a question for the PostGIS guys and a quick test could tell me anyway! My memory is that the GIST r-tree index is slow for points at the moment, and that a good implementation of a kd-tree index over GIST is required for better speed. Regards, Jeremy Palmer Geodetic Surveyor National Geodetic Office Land Information New Zealand | Toitu te whenua 160 Lambton Quay | Private Box 5501 | Wellington 6145 DDI: 64 (0)4 498 3537 | Fax: 64 (0)4 498 3837 | www.linz.govt.nz -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Saturday, 10 July 2010 11:20 a.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Index usage with functions in where condition Jeremy Palmer writes: > This is the query that does not use the indexes: > SELECT > coo.nod_id, > 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - > 175.58461)*cos(radians(-41.0618)))^2)) as distance > FROM > crs_coordinate coo > WHERE > coo.value1 between -41.0618-degrees(1200.0/640.0) and > -41.0618+degrees(1200.0/640.0) and > coo.value2 between > 175.58461-degrees(1200.0/640.0)/(cos(radians(-41.0618))) and > 175.58461+degrees(1200.0/640.0)/(cos(radians(-41.0618))); Those expressions yield float8, not numeric, and numeric vs float8 isn't an indexable operator for reasons we needn't get into here. You should probably rethink whether numeric is really the best choice of datatype for your columns, if this is the sort of value you expect to work with --- you're paying a considerable price in speed and space for perhaps-illusory precision gains. But if you insist on using numeric then the solution is to cast the expression results to numeric explicitly. BTW I wonder whether you ought not be looking into postgis rather than rolling-your-own coordinate arithmetic ... regards, tom lane __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage with functions in where condition
Jeremy Palmer writes: > This is the query that does not use the indexes: > SELECT > coo.nod_id, > 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - > 175.58461)*cos(radians(-41.0618)))^2)) as distance > FROM > crs_coordinate coo > WHERE > coo.value1 between -41.0618-degrees(1200.0/640.0) and > -41.0618+degrees(1200.0/640.0) and > coo.value2 between > 175.58461-degrees(1200.0/640.0)/(cos(radians(-41.0618))) and > 175.58461+degrees(1200.0/640.0)/(cos(radians(-41.0618))); Those expressions yield float8, not numeric, and numeric vs float8 isn't an indexable operator for reasons we needn't get into here. You should probably rethink whether numeric is really the best choice of datatype for your columns, if this is the sort of value you expect to work with --- you're paying a considerable price in speed and space for perhaps-illusory precision gains. But if you insist on using numeric then the solution is to cast the expression results to numeric explicitly. BTW I wonder whether you ought not be looking into postgis rather than rolling-your-own coordinate arithmetic ... regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Index usage with functions in where condition
I'm having trouble getting the query planner to use indexes. The situation occurs when writing a query that uses functions for defining the parameters for the conditions on the indexed columns. The system I'm running is Windows Server 2003, using version 8.4.2 of PostgreSQL. This is the following table that I'm running my query against: CREATE TABLE crs_coordinate ( id integer NOT NULL, nod_id integer NOT NULL, value1 numeric(22,12), value2 numeric(22,12), CONSTRAINT crs_coordinate_pkey PRIMARY KEY (id) ) WITH ( OIDS=FALSE ); CREATE INDEX coo_value1 ON crs_coordinate USING btree (value1); CREATE INDEX coo_value2 ON crs_coordinate USING btree (value2); This table has 23 million rows in it and was analysed just before planning my queries. This is the query that does not use the indexes: SELECT coo.nod_id, 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 between -41.0618-degrees(1200.0/640.0) and -41.0618+degrees(1200.0/640.0) and coo.value2 between 175.58461-degrees(1200.0/640.0)/(cos(radians(-41.0618))) and 175.58461+degrees(1200.0/640.0)/(cos(radians(-41.0618))); Seq Scan on crs_coordinate coo (cost=0.00..1039607.49 rows=592 width=28) Filter: (((value1)::double precision >= (-41.0725429586587)::double precision) AND ((value1)::double precision <= (-41.0510570413413)::double precision) AND ((value2)::double precision >= 175.570362072701::double precision) AND ((value2)::double precision <= 175.598857927299::double precision)) However if I pre-evaluated the parameters for the where condition on the value1 and value2 columns, the planner chooses to use the indexes: SELECT coo.nod_id, 640*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 BETWEEN -41.07254296 AND -41.05105704 AND coo.value2 BETWEEN 175.57036207 AND 175.59885792; Bitmap Heap Scan on crs_coordinate coo (cost=5299.61..6705.41 rows=356 width=28) Recheck Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704)) AND (value2 >= 175.57036207) AND (value2 <= 175.59885792)) -> BitmapAnd (cost=5299.61..5299.61 rows=356 width=0) -> Bitmap Index Scan on coo_value1 (cost=0.00..1401.12 rows=54923 width=0) Index Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704))) -> Bitmap Index Scan on coo_value2 (cost=0.00..3898.06 rows=153417 width=0) Index Cond: ((value2 >= 175.57036207) AND (value2 <= 175.59885792)) So why is the first query not using the indexes on the value1 and value2 columns? I'm assuming that both the COS and RAIDIANS functions are STRICT IMMUTABLE, so logically the evaluation of these functions in the where clause should be inlined. Looking at the query plan this inlining does seem to be happening... At this stage I have a work around by putting the query into a plpgsql function and using dynamic SQL. But it is still frustrating why the planner seems to be working in a far from optimal fashion. Can anyone shed some light on this for me? Thanks, Jeremy __ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. __ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
In response to Kenneth Marshall : > > > How many rows do you have in your table? If there are relatively few, > > > it probably guesses it to be cheaper to do a sequential scan and > > > calculate lower values on-the-fly rather than bother with the index. > > > > That's one reason, an other reason, i think, is, that a btree-index can't > > search with an 'not like' - operator. > > The 'not like' condition is likely to be extremely non-selective > which would cause a sequential scan to be used in any event whether > or not an index could be used. That's true, but i have an example where the 'not like' condition is extremely selective: ,[ sql ] | test=*# select count(1) from words where lower(w) not like lower('f%'); | count | --- | 10 | (1 row) | | test=*# select count(1) from words where lower(w) like lower('f%'); | count | --- | 1 | (1 row) ` But the index can't use: ,[ code ] | test=*# explain select * from words where lower(w) not like lower('f%'); | QUERY PLAN | -- | Seq Scan on words (cost=0.00..4396.15 rows=10 width=47) |Filter: (lower(w) !~~ 'f%'::text) | (2 rows) ` And i think, the reason is: ,[ quote from docu ] | B-trees can handle equality and range queries on data that can be sorted | into some ordering. In particular, the PostgreSQL query planner will | consider using a B-tree index whenever an indexed column is involved in | a comparison using one of these operators: | | < | <= | = | >= | > ` Regards, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
On 18 February 2010 12:18, A. Kretschmer wrote: > In response to Thom Brown : >> On 18 February 2010 11:55, AI Rumman wrote: >> > "Not like" operation does not use index. >> > >> > select * from vtiger_contactscf where lower(cf_1253) not like >> > lower('Former%') >> > >> > I created index on lower(cf_1253). >> > >> > How can I ensure index usage in not like operation? >> > Anyone please help. >> > >> >> How many rows do you have in your table? If there are relatively few, >> it probably guesses it to be cheaper to do a sequential scan and >> calculate lower values on-the-fly rather than bother with the index. > > That's one reason, an other reason, i think, is, that a btree-index can't > search with an 'not like' - operator. > Erm.. yes. Now that you say it, it's obvious. :S Thom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
On Thu, Feb 18, 2010 at 01:18:10PM +0100, A. Kretschmer wrote: > In response to Thom Brown : > > On 18 February 2010 11:55, AI Rumman wrote: > > > "Not like" operation does not use index. > > > > > > select * from vtiger_contactscf where lower(cf_1253) not like > > > lower('Former%') > > > > > > I created index on lower(cf_1253). > > > > > > How can I ensure index usage in not like operation? > > > Anyone please help. > > > > > > > How many rows do you have in your table? If there are relatively few, > > it probably guesses it to be cheaper to do a sequential scan and > > calculate lower values on-the-fly rather than bother with the index. > > That's one reason, an other reason, i think, is, that a btree-index can't > search with an 'not like' - operator. > > > > test=*# insert into words select 'fucking example' from > generate_series(1,1); > INSERT 0 1 > test=*# insert into words select 'abc' from generate_series(1,10); > INSERT 0 10 > test=*# explain select * from words where lower(w) like lower('a%') or > lower(w) like lower('b%'); > QUERY PLAN > - > Bitmap Heap Scan on words (cost=1538.75..6933.39 rows=55643 width=36) >Recheck Cond: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) >Filter: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) >-> BitmapOr (cost=1538.75..1538.75 rows=57432 width=0) > -> Bitmap Index Scan on idx_words (cost=0.00..1027.04 rows=39073 > width=0) >Index Cond: ((lower(w) ~>=~ 'a'::text) AND (lower(w) ~<~ > 'b'::text)) > -> Bitmap Index Scan on idx_words (cost=0.00..483.90 rows=18359 > width=0) >Index Cond: ((lower(w) ~>=~ 'b'::text) AND (lower(w) ~<~ > 'c'::text)) > (8 rows) > > test=*# explain select * from words where lower(w) not like lower('a%') or > lower(w) like lower('b%'); > QUERY PLAN > --- > Seq Scan on words (cost=0.00..10624.48 rows=282609 width=36) >Filter: ((lower(w) !~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) > (2 rows) > > > In other words: revert your where-condition from 'not like' to multiple > 'like' conditions for all letters except 'f%'. > > > Andreas The 'not like' condition is likely to be extremely non-selective which would cause a sequential scan to be used in any event whether or not an index could be used. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
In response to Thom Brown : > On 18 February 2010 11:55, AI Rumman wrote: > > "Not like" operation does not use index. > > > > select * from vtiger_contactscf where lower(cf_1253) not like > > lower('Former%') > > > > I created index on lower(cf_1253). > > > > How can I ensure index usage in not like operation? > > Anyone please help. > > > > How many rows do you have in your table? If there are relatively few, > it probably guesses it to be cheaper to do a sequential scan and > calculate lower values on-the-fly rather than bother with the index. That's one reason, an other reason, i think, is, that a btree-index can't search with an 'not like' - operator. test=*# insert into words select 'fucking example' from generate_series(1,1); INSERT 0 1 test=*# insert into words select 'abc' from generate_series(1,10); INSERT 0 10 test=*# explain select * from words where lower(w) like lower('a%') or lower(w) like lower('b%'); QUERY PLAN - Bitmap Heap Scan on words (cost=1538.75..6933.39 rows=55643 width=36) Recheck Cond: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) Filter: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) -> BitmapOr (cost=1538.75..1538.75 rows=57432 width=0) -> Bitmap Index Scan on idx_words (cost=0.00..1027.04 rows=39073 width=0) Index Cond: ((lower(w) ~>=~ 'a'::text) AND (lower(w) ~<~ 'b'::text)) -> Bitmap Index Scan on idx_words (cost=0.00..483.90 rows=18359 width=0) Index Cond: ((lower(w) ~>=~ 'b'::text) AND (lower(w) ~<~ 'c'::text)) (8 rows) test=*# explain select * from words where lower(w) not like lower('a%') or lower(w) like lower('b%'); QUERY PLAN --- Seq Scan on words (cost=0.00..10624.48 rows=282609 width=36) Filter: ((lower(w) !~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) (2 rows) In other words: revert your where-condition from 'not like' to multiple 'like' conditions for all letters except 'f%'. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
> On Thu, Feb 18, 2010 at 6:00 PM, Thom Brown wrote: >> >> On 18 February 2010 11:55, AI Rumman wrote: >> > "Not like" operation does not use index. >> > >> > select * from vtiger_contactscf where lower(cf_1253) not like >> > lower('Former%') >> > >> > I created index on lower(cf_1253). >> > >> > How can I ensure index usage in not like operation? >> > Anyone please help. >> > >> >> How many rows do you have in your table? If there are relatively few, >> it probably guesses it to be cheaper to do a sequential scan and >> calculate lower values on-the-fly rather than bother with the index. >> >> Thom > On 18 February 2010 12:06, AI Rumman wrote: > vtigercrm504=# explain analyze select * from vtiger_contactscf where > lower(cf_1253) like 'customer'; > > QUERY > PLAN > > Index Scan using vtiger_contactscf_cf_1253_idx on vtiger_contactscf > (cost=0.00..146.54 rows=6093 width=179) (actual time=0.083..29.868 rows=5171 > loops=1) >Index Cond: (lower((cf_1253)::text) ~=~ 'customer'::character varying) >Filter: (lower((cf_1253)::text) ~~ 'customer'::text) > Total runtime: 34.956 ms > (4 rows) > vtigercrm504=# explain analyze select * from vtiger_contactscf where > lower(cf_1253) like 'customer'; > > QUERY > PLAN > > Index Scan using vtiger_contactscf_cf_1253_idx on vtiger_contactscf > (cost=0.00..146.54 rows=6093 width=179) (actual time=0.083..29.868 rows=5171 > loops=1) >Index Cond: (lower((cf_1253)::text) ~=~ 'customer'::character varying) >Filter: (lower((cf_1253)::text) ~~ 'customer'::text) > Total runtime: 34.956 ms > (4 rows) Could you do the same again for a "not like" query? Thom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
On 18 February 2010 11:55, AI Rumman wrote: > "Not like" operation does not use index. > > select * from vtiger_contactscf where lower(cf_1253) not like > lower('Former%') > > I created index on lower(cf_1253). > > How can I ensure index usage in not like operation? > Anyone please help. > How many rows do you have in your table? If there are relatively few, it probably guesses it to be cheaper to do a sequential scan and calculate lower values on-the-fly rather than bother with the index. Thom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] index usage in not like
"Not like" operation does not use index. select * from vtiger_contactscf where lower(cf_1253) not like lower('Former%') I created index on lower(cf_1253). How can I ensure index usage in not like operation? Anyone please help.
Re: [PERFORM] Index usage with sub select or outer joins
Hello, Joshua, I did different test cases and here are the results (numbers in seconds), using (case sub queries) or not (case join) the index: Rows (main table) Outer Join Sub queries setting 1396163 rows39.219.6 work_mem=256Mb 3347443 rows72.2203.1 work_mem=256Mb 3347443 rows70.331.1 work_mem=1024Mb 4321072 rows115 554.9 work_mem=256Mb 4321072 rows111 583 work_mem=1024Mb All outer joins where done without index uses To force the use of the index for the first case (outer join), I have change the seq_scan cost (from 1 to 2.5), it takes now only 6.1s for the outer join on 1.4M rows. New explain plan below: "HashAggregate (cost=457881.84..460248.84 rows=39450 width=49)" " -> Nested Loop Left Join (cost=0.00..456994.22 rows=39450 width=49)" "-> Seq Scan on bm_us_views_main_2608 a (cost=0.00..223677.45 rows=39450 width=41)" " Filter: ((item_type = ANY ('{7,9}'::numeric[])) AND (qty > 1))" "-> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..5.65 rows=13 width=19)" " Index Cond: ((b.item_id = a.item_id) AND (b.bid_date < a.pv_timestamp) AND (b.bid_date >= (a.pv_timestamp - '60 days'::interval)))" Index bm_us_bids_item_ix is on item_id, bidder_id (not used in the condition) & bid_date What can be the recommendations on tuning the different costs so it can better estimate the seq scan & index scans costs? I think the issue is there. But didn't find any figures helping to choose the correct parameters according to cpu & disks speed Regards, Julien Theulier -Message d'origine- De : Joshua Tolley [mailto:[EMAIL PROTECTED] Envoyé : mercredi 12 novembre 2008 14:54 À : Julien Theulier Cc : pgsql-performance@postgresql.org Objet : Re: [PERFORM] Index usage with sub select or inner joins On Wed, Nov 12, 2008 at 02:22:47PM +0100, Julien Theulier wrote: > QUESTION: Why the planner choose seq scan in the first case & indexes > scan in the second case? In a more general way, I observed that the > planner has difficulties to select index scans & does in almost all > the cases seq scan, when doing join queries. After investigations, it > looks like when you join table a with table b on a column x and y and > you have an index on column x only, the planner is not able to choose > the index scan. You have to build the index corresponding exactly to > the join statement btw the 2 tables Short, general answer: index scans aren't always faster than sequential scans, and the planner is smart enough to know that. Googling "Why isn't postgresql using my index" provides more detailed results, but in short, if it scans an index, it has to read pages from the index, and for all the tuples it finds in the index, it has to read once again from the heap, whereas a sequential scan requires reading once from the heap. If your query will visit most of the rows of the table, pgsql will choose a sequential scan over an index scan. - Josh / eggyknap -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage with sub select or inner joins
On Wed, Nov 12, 2008 at 02:22:47PM +0100, Julien Theulier wrote: > QUESTION: Why the planner choose seq scan in the first case & indexes scan > in the second case? In a more general way, I observed that the planner has > difficulties to select index scans & does in almost all the cases seq scan, > when doing join queries. After investigations, it looks like when you join > table a with table b on a column x and y and you have an index on column x > only, the planner is not able to choose the index scan. You have to build > the index corresponding exactly to the join statement btw the 2 tables Short, general answer: index scans aren't always faster than sequential scans, and the planner is smart enough to know that. Googling "Why isn't postgresql using my index" provides more detailed results, but in short, if it scans an index, it has to read pages from the index, and for all the tuples it finds in the index, it has to read once again from the heap, whereas a sequential scan requires reading once from the heap. If your query will visit most of the rows of the table, pgsql will choose a sequential scan over an index scan. - Josh / eggyknap signature.asc Description: Digital signature
[PERFORM] Index usage with sub select or inner joins
Hello, I am doing some performances testing on Postgres & I discovered the following behavior, when using 2 different ways of writing selects (but doing the same aggregations at the end): 1. test case 1, using outer join: create table test2 as select soj_session_log_id, pv_timestamp, vi_pv_id,a.item_id, coalesce(sum(case when (bid_date=pv_timestamp - INTERVAL '3 day') then 1 else 0 end)) as recent_sales_3d1, coalesce(sum(case when (bid_date=pv_timestamp - INTERVAL '7 day') then 1 else 0 end)) as recent_sales_7d1, coalesce(sum(case when (bid_date=pv_timestamp - INTERVAL '14 day') then 1 else 0 end)) as recent_sales_14d1, coalesce(sum(case when (bid_date=pv_timestamp - INTERVAL '30 day') then 1 else 0 end)) as recent_sales_30d1, coalesce(sum(case when (bid_date=pv_timestamp - INTERVAL '60 day') then 1 else 0 end)) as recent_sales_60d1 from bm_us_views_main_1609 a left outer join bm_us_bids b on (b.item_id=a.item_id and b.bid_date=a.pv_timestamp - INTERVAL '60 day')) where a.item_type in (7,9) and qty>1 group by soj_session_log_id, pv_timestamp, vi_pv_id, a.item_id;; This query doesn't use any index according to the explain plan: "HashAggregate (cost=672109.07..683054.81 rows=182429 width=49)" " -> Merge Left Join (cost=646489.83..668004.42 rows=182429 width=49)" "Merge Cond: (a.item_id = b.item_id)" "Join Filter: ((b.bid_date < a.pv_timestamp) AND (b.bid_date >= (a.pv_timestamp - '60 days'::interval)))" "-> Sort (cost=331768.62..332224.69 rows=182429 width=41)" " Sort Key: a.item_id" " -> Seq Scan on bm_us_views_main_1609 a (cost=0.00..315827.08 rows=182429 width=41)" "Filter: ((item_type = ANY ('{7,9}'::numeric[])) AND (qty > 1))" "-> Sort (cost=314669.01..320949.52 rows=2512205 width=19)" " Sort Key: b.item_id" " -> Seq Scan on bm_us_bids b (cost=0.00..47615.05 rows=2512205 width=19)" 2. Test case 2, using sub queries: create table test2 as select soj_session_log_id, pv_timestamp, vi_pv_id,item_id, coalesce((select count(*) from bm_us_bids b where b.item_id=a.item_id and bid_date=pv_timestamp - INTERVAL '3 day' group by item_id ),0) as recent_sales_3d, coalesce((select count(*) from bm_us_bids b where b.item_id=a.item_id and bid_date=pv_timestamp - INTERVAL '7 day' group by item_id ),0) as recent_sales_7d, coalesce((select count(*) from bm_us_bids b where b.item_id=a.item_id and bid_date=pv_timestamp - INTERVAL '14 day' group by item_id ),0) as recent_sales_14d, coalesce((select count(*) from bm_us_bids b where b.item_id=a.item_id and bid_date=pv_timestamp - INTERVAL '30 day' group by item_id ),0) as recent_sales_30d, coalesce((select count(*) from bm_us_bids b where b.item_id=a.item_id and bid_date=pv_timestamp - INTERVAL '60 day' group by item_id ),0) as recent_sales_60d from bm_us_views_main_1609 a where item_type in (7,9) and qty>1; This query uses indexes according to the explain plan: "Seq Scan on bm_us_views_main_1609 a (cost=0.00..8720230.77 rows=182429 width=41)" " Filter: ((item_type = ANY ('{7,9}'::numeric[])) AND (qty > 1))" " SubPlan" "-> GroupAggregate (cost=0.00..9.21 rows=1 width=11)" " -> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..9.20 rows=1 width=11)" "Index Cond: ((item_id = $0) AND (bid_date < $1) AND (bid_date >= ($1 - '60 days'::interval)))" "-> GroupAggregate (cost=0.00..9.21 rows=1 width=11)" " -> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..9.20 rows=1 width=11)" "Index Cond: ((item_id = $0) AND (bid_date < $1) AND (bid_date >= ($1 - '30 days'::interval)))" "-> GroupAggregate (cost=0.00..9.21 rows=1 width=11)" " -> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..9.20 rows=1 width=11)" "Index Cond: ((item_id = $0) AND (bid_date < $1) AND (bid_date >= ($1 - '14 days'::interval)))" "-> GroupAggregate (cost=0.00..9.21 rows=1 width=11)" " -> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..9.20 rows=1 width=11)" "Index Cond: ((item_id = $0) AND (bid_date < $1) AND (bid_date >= ($1 - '7 days'::interval)))" "-> GroupAggregate (cost=0.00..9.21 rows=1 width=11)" " -> Index Scan using bm_us_bids_item_ix on bm_us_bids b (cost=0.00..9.20 rows=1 width=11)" "Index Cond: ((item_id = $0) AND (bid_date < $1) AND (bid_date >= ($1 - '3 days'::interval)))" The index bm_us_bids_item_ix is on columns item_id, bidder_id, bid_date QUESTION: Why the planner choose seq scan in the first case & indexes scan in the second case? In a more general way, I observed that the planner has difficulties to select index scans & does in almost all the cases seq scan, when doing join queries. After investigations, it looks like when you join table a with table b on a column x and y and you have an index on column x only, the planner is not able to choose the index sca
Re: [PERFORM] Index usage problem on 8.3.3
On Fri, 31 Oct 2008, Gregory Stark wrote: Tom Lane <[EMAIL PROTECTED]> writes: Jeff Frost <[EMAIL PROTECTED]> writes: Tom Lane wrote: Huh. That does sound like it's a version-to-version difference. There's nothing in the CVS log that seems related though. Are you willing to post your test case? It's a customer DB, so I'll contact them and see if we can boil it down to a test case with no sensitive data. Well, if there was a change it seems to have been in the right direction ;-) so this is mostly just idle curiosity. Don't jump through hoops to get a test case. Assuming it's not a bug... Well, after boiling down my test case to the bare essentials, I was unable to reproduce the different behavior between 8.3.3 and 8.3.4. Now, I've gone back to the original script and can't reproduce the behavior I previously saw on 8.3.4 and my screen session doesn't have enough scrollback to look at what happened previously. I was thinking perhaps I had inadvertently committed the transaction, but then the act would have been dropped as it's a temp table created with ON COMMIT DROP. But, I've tested 3 times in a row and every time 8.3.4 uses the seq scan just like 8.3.3 now, so I must've done something differently to get that result as Tom had originally suggested. I just can't think what it might have been. Perhaps it's time to buy some glasses. :-/ -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Tom Lane <[EMAIL PROTECTED]> writes: > Jeff Frost <[EMAIL PROTECTED]> writes: >> Tom Lane wrote: >>> Huh. That does sound like it's a version-to-version difference. >>> There's nothing in the CVS log that seems related though. Are you >>> willing to post your test case? >>> >> It's a customer DB, so I'll contact them and see if we can boil it down >> to a test case with no sensitive data. > > Well, if there was a change it seems to have been in the right direction > ;-) so this is mostly just idle curiosity. Don't jump through hoops to > get a test case. Assuming it's not a bug... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Jeff Frost <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Huh. That does sound like it's a version-to-version difference. >> There's nothing in the CVS log that seems related though. Are you >> willing to post your test case? >> > It's a customer DB, so I'll contact them and see if we can boil it down > to a test case with no sensitive data. Well, if there was a change it seems to have been in the right direction ;-) so this is mostly just idle curiosity. Don't jump through hoops to get a test case. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Tom Lane wrote: > Jeff Frost <[EMAIL PROTECTED]> writes: > >> On Thu, 30 Oct 2008, Tom Lane wrote: >> Any idea why I don't see it on 8.3.4? >>> I think it's more likely some small difference in your test conditions >>> than any real version-to-version difference. In particular I think the >>> "still see" test might be influenced by the ages of transactions running >>> concurrently. >>> > > >> Interesting. This is on a test server which has no other concurrent >> transactions and it acts the same way after I stopped 8.3.4 and started up >> 8.3.3 again as it did before stopping 8.3.3 to bring up 8.3.4. Hrmm..I'm >> not >> sure that makes sense. So, I did the test with the sql script on 8.3.3, >> then >> shut down 8.3.3, started up 8.3.4 on the same data dir, ran the test >> successfully. Next I shut down 8.3.4 and started 8.3.3 and verified that >> the >> behavior was still the same on 8.3.3. I wonder what else I might be doing >> differently. >> > > Huh. That does sound like it's a version-to-version difference. > There's nothing in the CVS log that seems related though. Are you > willing to post your test case? > > It's a customer DB, so I'll contact them and see if we can boil it down to a test case with no sensitive data. -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032
Re: [PERFORM] Index usage problem on 8.3.3
Jeff Frost <[EMAIL PROTECTED]> writes: > On Thu, 30 Oct 2008, Tom Lane wrote: >>> Any idea why I don't see it on 8.3.4? >> >> I think it's more likely some small difference in your test conditions >> than any real version-to-version difference. In particular I think the >> "still see" test might be influenced by the ages of transactions running >> concurrently. > Interesting. This is on a test server which has no other concurrent > transactions and it acts the same way after I stopped 8.3.4 and started up > 8.3.3 again as it did before stopping 8.3.3 to bring up 8.3.4. Hrmm..I'm not > sure that makes sense. So, I did the test with the sql script on 8.3.3, then > shut down 8.3.3, started up 8.3.4 on the same data dir, ran the test > successfully. Next I shut down 8.3.4 and started 8.3.3 and verified that the > behavior was still the same on 8.3.3. I wonder what else I might be doing > differently. Huh. That does sound like it's a version-to-version difference. There's nothing in the CVS log that seems related though. Are you willing to post your test case? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
On Thu, 30 Oct 2008, Tom Lane wrote: Any idea why I don't see it on 8.3.4? I think it's more likely some small difference in your test conditions than any real version-to-version difference. In particular I think the "still see" test might be influenced by the ages of transactions running concurrently. Interesting. This is on a test server which has no other concurrent transactions and it acts the same way after I stopped 8.3.4 and started up 8.3.3 again as it did before stopping 8.3.3 to bring up 8.3.4. Hrmm..I'm not sure that makes sense. So, I did the test with the sql script on 8.3.3, then shut down 8.3.3, started up 8.3.4 on the same data dir, ran the test successfully. Next I shut down 8.3.4 and started 8.3.3 and verified that the behavior was still the same on 8.3.3. I wonder what else I might be doing differently. The good news is that making the indexes before the updates seems to make the planner happy! -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Jeff Frost <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Okay. What that means is that the indexes were created on data that had >> already been inserted and updated to some extent, resulting in >> HOT-update chains that turned out to be illegal for the new indexes. >> The way we deal with this is to mark the indexes as not usable by any >> query that can still see the dead HOT-updated tuples. > Any idea why I don't see it on 8.3.4? I think it's more likely some small difference in your test conditions than any real version-to-version difference. In particular I think the "still see" test might be influenced by the ages of transactions running concurrently. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Tom Lane wrote: > Okay. What that means is that the indexes were created on data that had > already been inserted and updated to some extent, resulting in > HOT-update chains that turned out to be illegal for the new indexes. > The way we deal with this is to mark the indexes as not usable by any > query that can still see the dead HOT-updated tuples. > > Your best bet for dodging the problem is probably to break the operation > into two transactions, if that's possible. INSERT and UPDATE in the > first xact, create the indexes at the start of the second. (Hmm ... > I'm not sure if that's sufficient if there are other concurrent > transactions; but it's certainly necessary.) Another possibility is > to create the indexes just after data load, before you start updating > the columns they're on. > > Thanks Tom! Any idea why I don't see it on 8.3.4? -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Jeff Frost <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> This may be a HOT side-effect ... is pg_index.indcheckxmin set for >> the index? >> > Yep, sure enough, the 'act' table's indexes have it set and jefftest and > jefftest2's indexes do not. Okay. What that means is that the indexes were created on data that had already been inserted and updated to some extent, resulting in HOT-update chains that turned out to be illegal for the new indexes. The way we deal with this is to mark the indexes as not usable by any query that can still see the dead HOT-updated tuples. Your best bet for dodging the problem is probably to break the operation into two transactions, if that's possible. INSERT and UPDATE in the first xact, create the indexes at the start of the second. (Hmm ... I'm not sure if that's sufficient if there are other concurrent transactions; but it's certainly necessary.) Another possibility is to create the indexes just after data load, before you start updating the columns they're on. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index usage problem on 8.3.3
Tom Lane wrote: > Jeff Frost <[EMAIL PROTECTED]> writes: > >> I've run across a strange problem with PG 8.3.3 not using indexes on a >> particular table after building the table during a transaction. >> > > This may be a HOT side-effect ... is pg_index.indcheckxmin set for > the index? > Yep, sure enough, the 'act' table's indexes have it set and jefftest and jefftest2's indexes do not. select c.relname,i.indcheckxmin from pg_class c, pg_index i WHERE i.indexrelid = c.oid AND c.relname IN ('act_act_usr_id', 'act_arrived', 'act_closing', 'act_place'); relname | indcheckxmin +-- act_closing| t act_act_usr_id | t act_place | t act_arrived| t (4 rows) consdb=# select c.relname,i.indcheckxmin from pg_class c, pg_index i WHERE i.indexrelid = c.oid AND c.relname IN ('jefftest2_jefftest_usr_id', 'jefftest2_arrived', 'jefftest2_closing', 'jefftest2_place'); relname | indcheckxmin ---+-- jefftest2_jefftest_usr_id | f jefftest2_place | f jefftest2_arrived | f jefftest2_closing | f (4 rows) -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032
Re: [PERFORM] Index usage problem on 8.3.3
Jeff Frost <[EMAIL PROTECTED]> writes: > I've run across a strange problem with PG 8.3.3 not using indexes on a > particular table after building the table during a transaction. This may be a HOT side-effect ... is pg_index.indcheckxmin set for the index? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Index usage problem on 8.3.3
I've run across a strange problem with PG 8.3.3 not using indexes on a particular table after building the table during a transaction. You can see a transcript of the issue here: http://gist.github.com/21154 Interestingly, if I create another temp table 'CREATE TEMP TABLE AS SELECT * FROM act' as seen on line 107, then add the same indexes to that table, PG will use the indexes. While it's not in the gist transcript, even an extremely simple query like: SELECT * FROM act WHERE act_usr_id = 1; will not use the index on the original act table, but the jefftest and jefftest2 tables both work fine. As you can probably see in the transcript, the tables have been ANALYZEd. I even tried 'enable seqscan=0;' and that made the cost really high for the seq scan, but the planner still chose the seq scan. The issue does not affect 8.2.3 nor does it affect 8.3.4. I didn't see any mention of a fix for this sort of thing in 8.3.4's release notes. I was wondering if this is a known bug in 8.3.3 (and maybe other 8.3.x versions) and just didn't make it into the release notes of 8.3.4? -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 916-647-6411 FAX: 916-405-4032 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] index usage makes problem
I had 50 lakh records in my table... while counting that am using that row in where condition... which makes problem, cpu is waiting for device... Debian OS, postresql 7.4, 50 lakh records. Query is EXPLAIN ANALYZE select count(call_id) from call_log where call_id > 1; while seeing the top, cpu is waiting for i/o, and without this call_id condition if i do EXPLAIN ANALYZE select count(oid) from call_log where oid > 1; it executed in 21 seconds
Re: [PERFORM] index usage on arrays
andrew klassen <[EMAIL PROTECTED]> writes: > Is there any alternative to what am I currently doing other than creating a > row for > each array element, Since (I think) 8.2, you could create a GIN index on the array column and then array overlap (&&) would be indexable. GIN has some performance issues if the table is heavily updated, though. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] index usage on arrays
andrew, what are your queries ? Have you seen contrib/intarray, GIN index ? On Thu, 7 Feb 2008, andrew klassen wrote: I am using Postgres 8.2.5. I have a table that has rows containing a variable length array with a known maximum. I was doing selects on the array elements using an ANY match. The performance was not too good as my table got bigger. So I added an index on the array. That didn't help since the select was not using it. I saw a thread in the mailing lists stating the index wouldn't be used. So I created indices on the individual array elements and then do a select on each element separately and then combine each match using OR. This did substantially increase the select performance. However, it may be difficult to maintain this approach over time as the maximum array size may increase dramatically and forming the query will become tedious. Is there any alternative to what am I currently doing other than creating a row for each array element, i.e. stop using an array and use a separate row for each array index? The reason I didn't want to take this approach is because there are other columns in the row that will be duplicated needlessly. Thanks, Andrew Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] index usage on arrays
I am using Postgres 8.2.5. I have a table that has rows containing a variable length array with a known maximum. I was doing selects on the array elements using an ANY match. The performance was not too good as my table got bigger. So I added an index on the array. That didn't help since the select was not using it. I saw a thread in the mailing lists stating the index wouldn't be used. So I created indices on the individual array elements and then do a select on each element separately and then combine each match using OR. This did substantially increase the select performance. However, it may be difficult to maintain this approach over time as the maximum array size may increase dramatically and forming the query will become tedious. Is there any alternative to what am I currently doing other than creating a row for each array element, i.e. stop using an array and use a separate row for each array index? The reason I didn't want to take this approach is because there are other columns in the row that will be duplicated needlessly. Thanks, Andrew Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
Re: [PERFORM] Index usage when bitwise operator is used
Hi Tom, do you think it would be a good idea to ask GIN index team to implement an int-based bitmap set indexing operator for GIN/GiST based indexes? Or there will be a possibility to somehow optimally index arrays of enumerations to implement such bitmap structures in 8.3 or later postgresql versions? With best regards, -- Valentine On Sep 17, 3:37 pm, [EMAIL PROTECTED] (Tom Lane) wrote: > "Kevin Grittner" <[EMAIL PROTECTED]> writes: > > On Mon, Sep 17, 2007 at 2:49 AM, in message > > <[EMAIL PROTECTED]>, valgog > > <[EMAIL PROTECTED]> wrote:=20 > >> Are you sure you understood what was the question? > > >> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & > >> TBL2.CATEGORY > 0? > > Yes, given that he stipulated that one and only one bit would be set. > > Really? In that case, isn't this bit-field just a bad implementation of > an enum-style field? > > regards, tom lane > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index usage when bitwise operator is used
Hi, A little clarification. Actually, TBL1.CATEGORY and/or TBL2.CATEGORY may hold a binary value having multiple binary(ies) '1'. Each binary value column represent an business attribute. If a binary value column is equal to '1', it means that the business attribute is True, otherwise it is false. I adopted this avoid defining a detail table to table TBL1. Idem to TBL2. If TBL1.CATEGORY | TBL2.CATEGORY > 0 => it means that we have at least one common business attribute that is TRUE for TBL1 and TBL2. Regards W.Alf On 9/17/07, Tom Lane <[EMAIL PROTECTED]> wrote: > > "Kevin Grittner" <[EMAIL PROTECTED]> writes: > > On Mon, Sep 17, 2007 at 2:49 AM, in message > > <[EMAIL PROTECTED]>, valgog > > <[EMAIL PROTECTED]> wrote:=20 > >> Are you sure you understood what was the question? > >> > >> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & > >> TBL2.CATEGORY > 0? > > > Yes, given that he stipulated that one and only one bit would be set. > > Really? In that case, isn't this bit-field just a bad implementation of > an enum-style field? > >regards, tom lane > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly >
Re: [PERFORM] Index usage when bitwise operator is used
>>> On Mon, Sep 17, 2007 at 8:37 AM, in message <[EMAIL PROTECTED]>, Tom Lane <[EMAIL PROTECTED]> wrote: > "Kevin Grittner" <[EMAIL PROTECTED]> writes: >> On Mon, Sep 17, 2007 at 2:49 AM, in message >> <[EMAIL PROTECTED]>, valgog >> <[EMAIL PROTECTED]> wrote:=20 >>> Are you sure you understood what was the question? >>> >>> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & >>> TBL2.CATEGORY > 0? > >> Yes, given that he stipulated that one and only one bit would be set. > > Really? In that case, isn't this bit-field just a bad implementation of > an enum-style field? My bad. I did misread it. Sorry, all. -Kevin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index usage when bitwise operator is used
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > On Mon, Sep 17, 2007 at 2:49 AM, in message > <[EMAIL PROTECTED]>, valgog > <[EMAIL PROTECTED]> wrote:=20 >> Are you sure you understood what was the question? >> >> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & >> TBL2.CATEGORY > 0? > Yes, given that he stipulated that one and only one bit would be set. Really? In that case, isn't this bit-field just a bad implementation of an enum-style field? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index usage when bitwise operator is used
>>> On Mon, Sep 17, 2007 at 2:49 AM, in message <[EMAIL PROTECTED]>, valgog <[EMAIL PROTECTED]> wrote: >> What about saying?: >> >> TBL1.CATEGORY = TBL2.CATEGORY >> > > Are you sure you understood what was the question? > > Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & > TBL2.CATEGORY > 0? Yes, given that he stipulated that one and only one bit would be set. -Kevin ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Index usage when bitwise operator is used
> What about saying?: > > TBL1.CATEGORY = TBL2.CATEGORY > Are you sure you understood what was the question? Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY & TBL2.CATEGORY > 0? ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Index usage when bitwise operator is used
Hi, I could not find and normal solution for that issue. But I am using some workarounds for that issue. The solution, that I am using now is to create an index for every bit of your bitmap field. So something like CREATE INDEX idx_hobbybit_0_limited ON "versionA".user_fast_index USING btree (gender, dateofbirth) -- here the gender and dateofbirth fields are the fields that we usually ORDER BY in the select statements, but you can play with the needed fields WHERE (hobby_bitmap & 1) > 0; by creating such an index for every used bit and combining WHERE (hobby_bitmap & 1 ) > 0 like statements the planner will be choosing the right index to use. Another workaround, that will be more applicable in your case I think, is to create a functional GIN index on your bitmap field using a static function to create an array of bitmap keys from your bitmap field. CREATE OR REPLACE FUNCTION "versionA".bitmap_to_bit_array(source_bitmap integer) RETURNS integer[] AS 'select ARRAY( select (1 << s.i) from generate_series(0, 32) as s(i) where ( 1 << s.i ) & $1 > 0 )' LANGUAGE 'sql' IMMUTABLE STRICT; And than create a GIN index on the needed field using this stored procedure. After that, it would be possible to use intarray set operators on the result of that function. This will also make it possible to use that GIN index. Actually it would be much much better if it were possible to build GIN indexes directly on the bitmap fields. But this is to be implemented by GIN and GiST index development team. Probably would be not a bad idea to make a feature request on them. With best regards, Valentine Gogichashvili On Sep 13, 2:30 pm, [EMAIL PROTECTED] ("W.Alphonse HAROUNY") wrote: > Hello, > > My question is about index usage when bitwise operations are invoked. > Situation Context: > -- > > Lets suppose we have 2 tables TBL1 and TBL2 as the following: > TBL1 { > . ; > integer categoryGroup; // categoryGroup is declared as an index on TABL1 > . ; > > } > > TBL2 { > . ; > integer categoryGroup; // categoryGroup is declared as an index on TABL2 > . ; > > } > > By conception, I suppose that: > - [categoryGroup] may hold a limited number of values, less than 32 values. > - [categoryGroup] is of type integer => it means 4 bytes => 32 bits > => 32 places available to hold binary '0' or binary '1' values. > - [categoryGroup] is the result of an "OR bitwise operation" among a > predefined set of variables [variableCategory]. >We suppose that [variableCategory] is of type integer (=>32 bits) >and each binary value of [variableCategory] may only hold a single binary > '1'. > > Ex: variableCategory1 = 0010 > variableCategory2 = 0010 > variableCategory3 = 1000 > > If [categoryGroup] = variableCategory1 | variableCategory2 | > variableCategory3 > =>[categoryGroup] = 00101010 > > Question: > -- > I have an SQL request similar to: > > SELECT . FROM TBL1, TBL2 WHERE > AND > TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise > operator > > Qst: > 1/ IS the above SQL request will use the INDEX [categoryGroup] defined on > TBL1 and TBL2 ? > 2/ What should I do or How should I modify my SQL request in order >to force the query engine to use an index ? (the already defined index or > another useful index) > > Thx a lot ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Index usage when bitwise operator is used
>>> On Thu, Sep 13, 2007 at 7:30 AM, in message <[EMAIL PROTECTED]>, "W.Alphonse HAROUNY" <[EMAIL PROTECTED]> wrote: >and each binary value of [variableCategory] may only hold a single binary > '1'. > TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise > operator What about saying?: TBL1.CATEGORY = TBL2.CATEGORY If your indexes include this and the other columns which cause the tables to be related, one or both of them stand a pretty good chance of evaluating to the lowest-cost method to join the tables. Forcing a query to use an index outside of it being the cheapest path is rarely productive. -Kevin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Index usage when bitwise operator is used
Hi, I could not find and normal solution for that issue. But I am using some workarounds for that issue. The solution, that I am using now is to create an index for every bit of your bitmap field. So something like CREATE INDEX idx_hobbybit_0_limited ON "versionA".user_fast_index USING btree (gender, dateofbirth) -- here the gender and dateofbirth fields are the fields that we usually ORDER BY in the select statements, but you can play with the needed fields WHERE (hobby_bitmap & 1) > 0; by creating such an index for every used bit and combining WHERE (hobby_bitmap & 1 ) > 0 like statements the planner will be choosing the right index to use. Another workaround, that will be more applicable in your case I think, is to create a functional GIN index on your bitmap field using a static function to create an array of bitmap keys from your bitmap field. CREATE OR REPLACE FUNCTION "versionA".bitmap_to_bit_array(source_bitmap integer) RETURNS integer[] AS 'select ARRAY( select (1 << s.i) from generate_series(0, 32) as s(i) where ( 1 << s.i ) & $1 > 0 )' LANGUAGE 'sql' IMMUTABLE STRICT; And than create a GIN index on the needed field using this stored procedure. After that, it would be possible to use intarray set operators on the result of that function. This will also make it possible to use that GIN index. Actually it would be much much better if it were possible to build GIN indexes directly on the bitmap fields. But this is to be implemented by GIN and GiST index development team. Probably would be not a bad idea to make a feature request on them. With best regards, Valentine Gogichashvili On 9/13/07, W.Alphonse HAROUNY <[EMAIL PROTECTED]> wrote: > > Hello, > > My question is about index usage when bitwise operations are invoked. > Situation Context: > -- > > Lets suppose we have 2 tables TBL1 and TBL2 as the following: > TBL1 { > . ; > integer categoryGroup; // categoryGroup is declared as an index on TABL1 > . ; > } > > TBL2 { > . ; > integer categoryGroup; // categoryGroup is declared as an index on TABL2 > . ; > } > > By conception, I suppose that: > - [categoryGroup] may hold a limited number of values, less than 32 > values. > - [categoryGroup] is of type integer => it means 4 bytes => 32 bits > => 32 places available to hold binary '0' or binary '1' values. > - [categoryGroup] is the result of an "OR bitwise operation" among a > predefined set of variables [variableCategory]. >We suppose that [variableCategory] is of type integer (=>32 bits) >and each binary value of [variableCategory] may only hold a single > binary '1'. > > > Ex: variableCategory1 = 0010 > variableCategory2 = 0010 > variableCategory3 = 1000 > > If [categoryGroup] = variableCategory1 | variableCategory2 | > variableCategory3 > =>[categoryGroup] = 00101010 > > > > Question: > -- > I have an SQL request similar to: > > SELECT . FROM TBL1, TBL2 WHERE > AND > TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise > operator > > Qst: > 1/ IS the above SQL request will use the INDEX [categoryGroup] defined on > TBL1 and TBL2 ? > 2/ What should I do or How should I modify my SQL request in order >to force the query engine to use an index ? (the already defined index > or another useful index) > > > > Thx a loт >
[PERFORM] Index usage when bitwise operator is used
Hello, My question is about index usage when bitwise operations are invoked. Situation Context: -- Lets suppose we have 2 tables TBL1 and TBL2 as the following: TBL1 { . ; integer categoryGroup; // categoryGroup is declared as an index on TABL1 . ; } TBL2 { . ; integer categoryGroup; // categoryGroup is declared as an index on TABL2 . ; } By conception, I suppose that: - [categoryGroup] may hold a limited number of values, less than 32 values. - [categoryGroup] is of type integer => it means 4 bytes => 32 bits => 32 places available to hold binary '0' or binary '1' values. - [categoryGroup] is the result of an "OR bitwise operation" among a predefined set of variables [variableCategory]. We suppose that [variableCategory] is of type integer (=>32 bits) and each binary value of [variableCategory] may only hold a single binary '1'. Ex: variableCategory1 = 0010 variableCategory2 = 0010 variableCategory3 = 1000 If [categoryGroup] = variableCategory1 | variableCategory2 | variableCategory3 =>[categoryGroup] = 00101010 Question: -- I have an SQL request similar to: SELECT . FROM TBL1, TBL2 WHERE AND TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise operator Qst: 1/ IS the above SQL request will use the INDEX [categoryGroup] defined on TBL1 and TBL2 ? 2/ What should I do or How should I modify my SQL request in order to force the query engine to use an index ? (the already defined index or another useful index) Thx a lot
Re: [PERFORM] index usage
On Mon, Apr 23, 2007 at 07:20:29PM +0200, Arkadiusz Raj wrote: > I have a table in my database that is updated every minute with new acquired > data. Anyway there is a query to get latest values to be displayed on > screen. I have postgresql 7.4.2 that work very fine. You want _at least_ the latest 7.4 version -- ideally, the latest 8.2 version. > The problem was that > after hdd crash I have rebuild database from the archive and... Execution > time of this query starts to be unacceptable. Have you re-ANALYZEd after the data load? Anyhow, the issue with the planner not knowing how to estimate expressions like "now() - interval '5 minutes'" correctly is a known 7.4 issue, and it's fixed in later versions. It might have worked more or less by accident earlier, although it seems odd that it wouldn't even have considered the index scan... /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] index usage
Hi, I have a table in my database that is updated every minute with new acquired data. Anyway there is a query to get latest values to be displayed on screen. I have postgresql 7.4.2 that work very fine. The problem was that after hdd crash I have rebuild database from the archive and... Execution time of this query starts to be unacceptable. And I found funny thing. Using static value in place expression remove this problem. Query started to be executed fast again. I did not change any settings in postgresql configuration. Just had to restart all the services. Can someone tell me why the optimizer stopped to choose index? I had seqscan disabled already. One note about those two outputs below: there are different number of touples returned due to the fact that in fact the timestamp is chosen differently. Regards, /Arek -- explain analyze SELECT distinct on (index) index, status, value FROM _values WHERE device=1 and timestamp>(now()-5*interval '1 min') ORDER by index, timestamp desc; QUERY PLAN - Unique (cost=100117679.93..100117756.29 rows=1 width=24) (actual time=5279.262..5279.308 rows=10 loops=1) -> Sort (cost=100117679.93..100117718.11 rows=15272 width=24) (actual time=5279.260..5279.275 rows=21 loops=1) Sort Key: "index", "timestamp" -> Seq Scan on _values (cost=1.00..100116618.64 rows=15272 width=24) (actual time=5277.596..5279.184 rows=21 loops=1) Filter: ((device = 1) AND (("timestamp")::timestamp with time zone > (now() - '00:05:00'::interval))) Total runtime: 5279.391 ms (6 rows) explain analyze SELECT distinct on (index) index, status, value FROM _values WHERE device=1 and timestamp>'2007-04-22 21:20' ORDER by index, timestamp desc; QUERY PLAN --- Unique (cost=703.45..703.47 rows=1 width=24) (actual time=4.807..4.867 rows=10 loops=1) -> Sort (cost=703.45..703.46 rows=5 width=24) (actual time=4.804..4.827 rows=31 loops=1) Sort Key: "index", "timestamp" -> Index Scan using _values_dbidx_idx on _values (cost=0.00..703.39 rows=5 width=24) (actual time=0.260..4.728 rows=31 loops=1) Index Cond: ("timestamp" > '2007-04-22 21:20:00'::timestamp without time zone) Filter: (device = 1) Total runtime: 4.958 ms (7 rows) -- List przeskanowano programem ArcaMail, ArcaVir 2007 przeskanowano 2007-04-23 19:20:29, silnik: 2007.01.01 12:00:00, bazy: 2007.04.15 09:21:20 This message has been scanned by ArcaMail, ArcaVir 2007 scanned 2007-04-23 19:20:29, engine: 2007.01.01 12:00:00, base: 2007.04.15 09:21:20 http://www.arcabit.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Index usage
Tom Lane wrote: What other planner parameters have you been fooling with? Hi Tom, The other parameters that have been changed are: set join_collapse_limit to 1 set enable_sort to off We are using version 8.1.3. We've noticed the query plan changing depending on the amount of data in the tables especially when the query looks at more rows in tableA. The parameter work_mem is set to 262,144. Thanks, Scott ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index usage
Scott Matseas <[EMAIL PROTECTED]> writes: > If I enable sequential scan the Index Cond in > question gets replaced with a Seq scan. What other planner parameters have you been fooling with? With no data in the tables, I get a reasonably sane-looking plan, so I'm thinking you've chosen bad values for something or other (starting with enable_seqscan = off ;-)) explain SELECT * FROM last_summarized ls JOIN tableA s ON s.table_idA > ls.summarized_id AND s.table_idA <= ls.max_session_id LEFT JOIN tableB sf ON s.table_idA = sf.table_idA AND sf.direction = 'a'::char LEFT JOIN tableB sfb ON s.table_idA = sfb.table_idA AND sfb.direction = 'b'::char WHERE ls.summary_name::text = 'summary'::text ; QUERY PLAN - Nested Loop Left Join (cost=6.16..54.51 rows=216 width=116) -> Nested Loop Left Join (cost=6.16..42.05 rows=216 width=95) -> Nested Loop (cost=6.16..29.58 rows=216 width=74) -> Index Scan using last_summarized_pk on last_summarized ls (cost=0.00..8.02 rows=1 width=66) Index Cond: ((summary_name)::text = 'summary'::text) -> Bitmap Heap Scan on tablea s (cost=6.16..18.32 rows=216 width=8) Recheck Cond: ((s.table_ida > ls.summarized_id) AND (s.table_ida <= ls.max_session_id)) -> Bitmap Index Scan on table_ida_pk (cost=0.00..6.16 rows=216 width=0) Index Cond: ((s.table_ida > ls.summarized_id) AND (s.table_ida <= ls.max_session_id)) -> Index Scan using tableb_unq on tableb sfb (cost=0.00..0.05 rows=1 width=21) Index Cond: ((s.table_ida = sfb.table_ida) AND (sfb.direction = 'b'::bpchar)) -> Index Scan using tableb_unq on tableb sf (cost=0.00..0.05 rows=1 width=21) Index Cond: ((s.table_ida = sf.table_ida) AND (sf.direction = 'a'::bpchar)) (13 rows) regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Index usage
We're having a problem with one of our queries being slow. It appears to be due to the index being used to go from tableA to tableB. Here are the tables: CREATE TABLE tableA ( table_idA int8 NOT NULL DEFAULT nextval('tableA_id_seq'::regclass), CONSTRAINT table_idA_pk PRIMARY KEY (table_idA), ) WITHOUT OIDS; CREATE TABLE tableB ( table_idB int8 NOT NULL DEFAULT nextval('tableB_id_seq'::regclass), table_idA int8 NOT NULL, direction char NOT NULL, CONSTRAINT tableB_pk PRIMARY KEY (table_idB), CONSTRAINT tableB_unq UNIQUE (table_idA, direction), ) WITHOUT OIDS; CREATE TABLE last_summarized ( summary_name varchar(64) NOT NULL, summarized_id int8, max_session_id int8, CONSTRAINT last_summarized_pk PRIMARY KEY (summary_name) ) WITHOUT OIDS; Here is the query: explain SELECT * FROM last_summarized ls JOIN tableA s ON s.table_idA > ls.summarized_id AND s.table_idA <= ls.max_session_id LEFT JOIN tableB sf ON s.table_idA = sf.table_idA AND sf.direction = 'a'::"char" LEFT JOIN tableB sfb ON s.table_idA = sfb.table_idA AND sfb.direction = 'b'::"char" WHERE ls.summary_name::text = 'summary'::text Size of tables in # of rows tableA: 9,244,816 tableB: 15,398,497 last_summarized: 1 Explain of the above query: "Hash Left Join (cost=1811349.31..18546527.89 rows=1029087 width=294)" " Hash Cond: ("outer".table_idA = "inner".table_idA)" " -> Hash Left Join (cost=915760.88..7519203.61 rows=1029087 width=219)" "Hash Cond: ("outer".table_idA = "inner".table_idA)" "-> Nested Loop (cost=0.00..126328.57 rows=1029087 width=144)" " -> Index Scan using last_summarized_pk on last_summarized ls (cost=0.00..5.98 rows=1 width=82)" "Index Cond: ((summary_name)::text = 'summary'::text)" " -> Index Scan using table_idA_pk on tableA s (cost=0.00..110886.29 rows=1029087 width=62)" "Index Cond: ((s.table_idA > "outer".summarized_id) AND (s.table_idA <= "outer".max_session_id))" "-> Hash (cost=784763.16..784763.16 rows=8100289 width=75)" " -> Bitmap Heap Scan on tableB sf (cost=216418.55..784763.16 rows=8100289 width=75)" "Recheck Cond: (direction = 'a'::"char")" "-> Bitmap Index Scan on tableB_unq (cost=0.00..216418.55 rows=8100289 width=0)" " Index Cond: (direction = 'a'::"char")" <-- USING part of Index " -> Hash (cost=775968.61..775968.61 rows=7396725 width=75)" "-> Bitmap Heap Scan on tableB sfb (cost=216418.55..775968.61 rows=7396725 width=75)" " Recheck Cond: (direction = 'b'::"char")" " -> Bitmap Index Scan on tableB_unq (cost=0.00..216418.55 rows=7396725 width=0)" "Index Cond: (direction = 'b'::"char")" <-- USING part of Index From the above explain see inline comment("<-- USING part of Index"). The table_idA column looks like it is being ignored in the index Cond. If I enable sequential scan the Index Cond in question gets replaced with a Seq scan. Also if I disable enable_bitmapscan sometimes both columns of the index(tableB_unq) will be used. Does anyone know why we're experiencing this behavior? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] index usage
Ben <[EMAIL PROTECTED]> writes: > It's volatile, but it will always return an integer. If it's volatile then it can't be used for an index condition. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] index usage
A volatile function has may return a different result for each row; think of the random() or nextval() functions for example. You wouldn't want them to return the same value for each row returned. -- Mark Lewis On Fri, 2006-07-28 at 13:59 -0700, Ben wrote: > It's volatile, but it will always return an integer. > > On Fri, 28 Jul 2006, Daniel Caune wrote: > > >> De : [EMAIL PROTECTED] [mailto:pgsql-performance- > >> [EMAIL PROTECTED] De la part de Ben > >> Envoyé : vendredi, juillet 28, 2006 15:21 > >> À : pgsql-performance@postgresql.org > >> Objet : [PERFORM] index usage > >> > >> I have a table with 37000 rows, an integer column, and an index on that > >> column. I've got a function that returns an integer. When I do a select > >> where I restrict that column to being equal to a static number, explain > >> tells me the index will be used. When I do the same thing but use the > >> function instead of a static number, explain shows me a full scan on the > >> table. > >> > >> I must be missing something, because my understanding is that the function > >> will be evaluated once for the statement and then collapsed into a static > >> number for the filtering. But the results of the explain seem to imply > >> that's not the case? > >> > > > > Is your function IMMUTABLE, STABLE or VOLATILE? > > > > -- > > Daniel > > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] index usage
It's volatile, but it will always return an integer. On Fri, 28 Jul 2006, Daniel Caune wrote: De : [EMAIL PROTECTED] [mailto:pgsql-performance- [EMAIL PROTECTED] De la part de Ben Envoyé : vendredi, juillet 28, 2006 15:21 À : pgsql-performance@postgresql.org Objet : [PERFORM] index usage I have a table with 37000 rows, an integer column, and an index on that column. I've got a function that returns an integer. When I do a select where I restrict that column to being equal to a static number, explain tells me the index will be used. When I do the same thing but use the function instead of a static number, explain shows me a full scan on the table. I must be missing something, because my understanding is that the function will be evaluated once for the statement and then collapsed into a static number for the filtering. But the results of the explain seem to imply that's not the case? Is your function IMMUTABLE, STABLE or VOLATILE? -- Daniel ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] index usage
> De : [EMAIL PROTECTED] [mailto:pgsql-performance- > [EMAIL PROTECTED] De la part de Ben > Envoyé : vendredi, juillet 28, 2006 15:21 > À : pgsql-performance@postgresql.org > Objet : [PERFORM] index usage > > I have a table with 37000 rows, an integer column, and an index on that > column. I've got a function that returns an integer. When I do a select > where I restrict that column to being equal to a static number, explain > tells me the index will be used. When I do the same thing but use the > function instead of a static number, explain shows me a full scan on the > table. > > I must be missing something, because my understanding is that the function > will be evaluated once for the statement and then collapsed into a static > number for the filtering. But the results of the explain seem to imply > that's not the case? > Is your function IMMUTABLE, STABLE or VOLATILE? -- Daniel ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] index usage
I have a table with 37000 rows, an integer column, and an index on that column. I've got a function that returns an integer. When I do a select where I restrict that column to being equal to a static number, explain tells me the index will be used. When I do the same thing but use the function instead of a static number, explain shows me a full scan on the table. I must be missing something, because my understanding is that the function will be evaluated once for the statement and then collapsed into a static number for the filtering. But the results of the explain seem to imply that's not the case? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Index Usage using IN
Select ID from TableA where not exists ( Select ID from Table B where ID = TableA.ID) might give you index scan. Of course, that is only useful is TableA is very small table. Not appropriate for 250k rows on 2/1/2006 12:12 PM Ralph Mason said the following: Hi, I have 2 tables both have an index on ID (both ID columns are an oid). I want to find only only rows in one and not the other. Select ID from TableA where ID not IN ( Select ID from Table B) This always generates sequential scans. Table A has about 250,000 rows. Table B has about 250,000 Rows. We should get a Scan on Table B and a Index Lookup on Table A. Is there any way to force this? enable_seqscan off doesn't help at all. The Plan is Seq Scan on tablea(cost=1.00..23883423070450.96 rows=119414 width=4) Filter: (NOT (subplan))" SubPlan-> Seq Scan on tableb (cost=1.00..14611.17 rows=242617 width=4) Thanks Ralph ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index Usage using IN
On Thu, Feb 02, 2006 at 09:12:59 +1300, Ralph Mason <[EMAIL PROTECTED]> wrote: > Hi, > > I have 2 tables both have an index on ID (both ID columns are an oid). > > I want to find only only rows in one and not the other. > > Select ID from TableA where ID not IN ( Select ID from Table B) > > This always generates sequential scans. > > Table A has about 250,000 rows. Table B has about 250,000 Rows. > > We should get a Scan on Table B and a Index Lookup on Table A. I don't think that is going to work if there are NULLs in table B. I don't know whether or not Postgres has code to special case NULL testing (either for constraints ruling them out, or doing probes for them in addition to the key it is trying to match) for doing NOT IN. Just doing a simple index probe into table A isn't going to tell you all you need to know if you don't find a match. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Index Usage using IN
"Jeffrey W. Baker" <[EMAIL PROTECTED]> writes: > On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: >> Select ID from TableA where ID not IN ( Select ID from Table B) > Have you considered this: > SELECT ID from TableA EXCEPT Select ID from Table B Also, increasing work_mem might persuade the planner to try a hashed subplan, which'd be a lot better than what you have. Note that it's quite unlikely that indexes are going to help for this. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index Usage using IN
On Wed, 2006-02-01 at 12:22 -0800, Jeffrey W. Baker wrote: > On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: > > Hi, > > > > I have 2 tables both have an index on ID (both ID columns are an oid). > > > > I want to find only only rows in one and not the other. > > > > Select ID from TableA where ID not IN ( Select ID from Table B) > > Have you considered this: > > SELECT ID from TableA EXCEPT Select ID from Table B Alternately: SELECT a.ID FROM TableA AS a LEFT JOIN TableB AS b ON a.ID = b.ID WHERE b.ID IS NULL -jwb ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Index Usage using IN
On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: > Hi, > > I have 2 tables both have an index on ID (both ID columns are an oid). > > I want to find only only rows in one and not the other. > > Select ID from TableA where ID not IN ( Select ID from Table B) Have you considered this: SELECT ID from TableA EXCEPT Select ID from Table B ? -jwb ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Index Usage using IN
Hi, I have 2 tables both have an index on ID (both ID columns are an oid). I want to find only only rows in one and not the other. Select ID from TableA where ID not IN ( Select ID from Table B) This always generates sequential scans. Table A has about 250,000 rows. Table B has about 250,000 Rows. We should get a Scan on Table B and a Index Lookup on Table A. Is there any way to force this? enable_seqscan off doesn't help at all. The Plan is Seq Scan on tablea(cost=1.00..23883423070450.96 rows=119414 width=4) Filter: (NOT (subplan))" SubPlan-> Seq Scan on tableb (cost=1.00..14611.17 rows=242617 width=4) Thanks Ralph ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Index usage for sorted query
Hi, Pierre-Frédéric, On Sat, 20 Nov 2004 17:12:43 +0100 Pierre-Frédéric Caillaud <[EMAIL PROTECTED]> wrote: > > WHERE cd='ca' ORDER BY l_postcode; > > Write : > > > WHERE cd='ca' ORDER BY cd, l_postcode; > > You have a multicolumn index, so you should specify a multicolumn sort > exactly the same as your index, and the planner will get it. Thanks, that seems to help. Seems weird to order by a column that is all the same value, but well, why not :-) Thanks a lot, Markus -- markus schaber | dipl. informatiker logi-track ag | rennweg 14-16 | ch 8001 zürich phone +41-43-888 62 52 | fax +41-43-888 62 53 mailto:[EMAIL PROTECTED] | www.logi-track.com ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index usage for sorted query
Markus Schaber <[EMAIL PROTECTED]> writes: > But as it fetches all the rows through the index, why doesn't it > recognize that, fetching this way, the rows are already sorted by > l_postcode? Tell it to "ORDER BY cd, l_postcode". > Is Postgresql 8 more intelligend in this case? No. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index usage for sorted query
Instead of : WHERE cd='ca' ORDER BY l_postcode; Write : WHERE cd='ca' ORDER BY cd, l_postcode; You have a multicolumn index, so you should specify a multicolumn sort exactly the same as your index, and the planner will get it. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[PERFORM] Index usage for sorted query
Hello, I have the following query plan: logigis=# explain SELECT geom, ref_in_id as ref, nref_in_id as nref, st_name as name, substr(l_postcode,1,2) as postfirst, func_class as level FROM schabi.streets WHERE cd='ca' ORDER BY l_postcode; QUERY PLAN --- Sort (cost=2950123.42..2952466.07 rows=937059 width=290) Sort Key: l_postcode -> Index Scan using streets_name_idx on streets (cost=0.00..2857177.57 rows=937059 width=290) Index Cond: ((cd)::text = 'ca'::text) And I have, beside others, the following index: »streets_name_idx« btree (cd, l_postcode) As the query plan shows, my postgresql 7.4 does fine on using the index for the WHERE clause. But as it fetches all the rows through the index, why doesn't it recognize that, fetching this way, the rows are already sorted by l_postcode? As I have a larger set of data, it nearly breaks down our developer machine every time we do this, as it always creates a temporary copy of the large amount of data to sort it (setting sort_mem higher makes it swap, setting it lower makes it thrashing disk directly). Is Postgresql 8 more intelligend in this case? Thanks for your hints, Markus -- markus schaber | dipl. informatiker logi-track ag | rennweg 14-16 | ch 8001 zürich phone +41-43-888 62 52 | fax +41-43-888 62 53 mailto:[EMAIL PROTECTED] | www.logi-track.com ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] index usage
"scott.marlowe" <[EMAIL PROTECTED]> writes: > There are tons of hints that it works this way in how they're written, but > nothing that just comes out and says that with pgsql's mvcc > implementation, an index scan still has to hit the pages that contain the > tuples, so often in pgsql a seq scan is a win where in other databases and > index scan would have been a win? > If not, where would I add it if I were going to write something up for the > docs? Just wondering... AFAIR the only place in the docs that mentions seqscan or indexscan at all is the discussion of EXPLAIN in "Performance Tips". Perhaps a suitably-enlarged version of that section could cover this. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] index usage
On Mon, 26 Apr 2004, Stephan Szabo wrote: > > On Fri, 23 Apr 2004 [EMAIL PROTECTED] wrote: > > > I have a query which I think should be using an index all of the time but > > postgres only uses the index part of the time. The index > > (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed > > first followed by the selected column (support_person_id). Wouldn't the > > most efficient plan be to scan the index each time because the only columns > > needed are in the index? Below is the table, 2 queries showing the > > Not necessarily. The rows in the actual file still need to be checked to > see if they're visible to the select and if it's expected that the entire > file (or a reasonable % of the pages anyway) will need to be loaded using > the index isn't necessarily a win. While those of us familiar with PostgreSQL are well aware of the fact that indexes can't be used directly to garner information, but only as a lookup to a tuple in the table, it seems this misconception is quite common among those coming to postgreSQL from other databases. Is there any information that directly reflects this issue in the docs? There are tons of hints that it works this way in how they're written, but nothing that just comes out and says that with pgsql's mvcc implementation, an index scan still has to hit the pages that contain the tuples, so often in pgsql a seq scan is a win where in other databases and index scan would have been a win? If not, where would I add it if I were going to write something up for the docs? Just wondering... ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] index usage
On Fri, 23 Apr 2004 [EMAIL PROTECTED] wrote: > I have a query which I think should be using an index all of the time but > postgres only uses the index part of the time. The index > (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed > first followed by the selected column (support_person_id). Wouldn't the > most efficient plan be to scan the index each time because the only columns > needed are in the index? Below is the table, 2 queries showing the Not necessarily. The rows in the actual file still need to be checked to see if they're visible to the select and if it's expected that the entire file (or a reasonable % of the pages anyway) will need to be loaded using the index isn't necessarily a win. > athenapost=> explain analyze select distinct support_person_id from > ticket_crm_map where crm_id = 1; > QUERY PLAN > > - > Unique (cost=10911.12..11349.26 rows=32 width=4) (actual > time=659.102..791.517 rows=24 loops=1) > -> Sort (cost=10911.12..11130.19 rows=87628 width=4) (actual > time=659.090..713.285 rows=93889 loops=1) > Sort Key: support_person_id > -> Seq Scan on ticket_crm_map (cost=0.00..3717.25 rows=87628 > width=4) (actual time=0.027..359.299 rows=93889 loops=1) > Filter: (crm_id = 1) > Total runtime: 814.601 ms How far off is this from the index scan version in time? Try doing set enable_seqscan=off; and then explain analyzing again. It's possible that you may wish to lower random_page_cost to change the estimated effect of how much more expensive random reads are compared to sequential ones. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] index usage
When checking an index in postgres the original table has to be checked for each result to find if the index entry is still valid? In which case you can't blindly scan the whole index and assume the data is good. I was used to Oracle behavior where the index is up to date so it can do the scan without hitting the original table. Does this sound correct to anyone? Thanks, Brad Stephan Szabo writes: On Fri, 23 Apr 2004 [EMAIL PROTECTED] wrote: I have a query which I think should be using an index all of the time but postgres only uses the index part of the time. The index (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed first followed by the selected column (support_person_id). Wouldn't the most efficient plan be to scan the index each time because the only columns needed are in the index? Below is the table, 2 queries showing the Not necessarily. The rows in the actual file still need to be checked to see if they're visible to the select and if it's expected that the entire file (or a reasonable % of the pages anyway) will need to be loaded using the index isn't necessarily a win. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[PERFORM] index usage
Hi, I have a query which I think should be using an index all of the time but postgres only uses the index part of the time. The index (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed first followed by the selected column (support_person_id). Wouldn't the most efficient plan be to scan the index each time because the only columns needed are in the index? Below is the table, 2 queries showing the difference in plans, followed by the record distribution of ticket_crm_map. I first did a 'vacuum analyze' to update the statistics. Thanks, Brad athenapost=> \d ticket_crm_map Table "public.ticket_crm_map" Column |Type | Modifiers +-+- --- tcrm_map_id| integer | not null ticket_id | integer | not null crm_id | integer | not null support_person_id | integer | not null escalated_to_person_id | integer | not null status | character varying(50) | not null default 'Open'::character varying close_date | timestamp without time zone | updated_date | timestamp without time zone | updated_by | character varying(255) | created_date | timestamp without time zone | created_by | character varying(255) | additional_info| text| subject| character varying(255) | Indexes: "ticket_crm_map_pkey" primary key, btree (tcrm_map_id) "ticket_crm_map_crm_id_key" unique, btree (crm_id, ticket_id) "ticket_crm_map_crm_id_suppid" btree (crm_id, support_person_id) "ticket_crm_map_status" btree (status) "ticket_crm_map_ticket_id" btree (ticket_id) Foreign-key constraints: "$1" FOREIGN KEY (ticket_id) REFERENCES ticket(ticket_id) "$2" FOREIGN KEY (crm_id) REFERENCES company_crm(crm_id) "$3" FOREIGN KEY (support_person_id) REFERENCES person(person_id) "$4" FOREIGN KEY (escalated_to_person_id) REFERENCES person(person_id) "$5" FOREIGN KEY (status) REFERENCES ticket_status(status) athenapost=> explain analyze select distinct support_person_id from ticket_crm_map where crm_id = 7; QUERY PLAN -- Unique (cost=1262.99..1265.27 rows=1 width=4) (actual time=15.335..18.245 rows=20 loops=1) -> Sort (cost=1262.99..1264.13 rows=456 width=4) (actual time=15.332..16.605 rows=2275 loops=1) Sort Key: support_person_id -> Index Scan using ticket_crm_map_crm_id_suppid on ticket_crm_map (cost=0.00..1242.85 rows=456 width=4) (actual time=0.055..11.281 rows=2275 loops=1) Index Cond: (crm_id = 7) Total runtime: 18.553 ms (6 rows) Time: 20.598 ms athenapost=> explain analyze select distinct support_person_id from ticket_crm_map where crm_id = 1; QUERY PLAN - Unique (cost=10911.12..11349.26 rows=32 width=4) (actual time=659.102..791.517 rows=24 loops=1) -> Sort (cost=10911.12..11130.19 rows=87628 width=4) (actual time=659.090..713.285 rows=93889 loops=1) Sort Key: support_person_id -> Seq Scan on ticket_crm_map (cost=0.00..3717.25 rows=87628 width=4) (actual time=0.027..359.299 rows=93889 loops=1) Filter: (crm_id = 1) Total runtime: 814.601 ms (6 rows) Time: 817.095 ms athenapost=> select count(*), crm_id from ticket_crm_map group by crm_id; count | crm_id ---+ 2554 | 63 129 | 25 17 | 24 110 | 23 74 | 22 69 | 21 2 | 20 53 | 82 10 | 17 16 | 81 46637 | 16 14 | 80 2 | 15 1062 | 79 87 | 78 93 | 77 60 | 44 363 | 76 225 | 10 4 | 74 83 | 9 27 | 73 182 | 8 2275 | 7 15 | 71 554 | 6 44 | 70 631 | 5 37 | 4 190 | 3 112 | 2 93889 | 1 (32 rows) Time: 436.697 ms ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org