> increase default stats target, analyze, try again. This field has only 5 values. I had put values/frequencies in my first post. Based on the values (see below) - there is no reason for planner to think that mt_flags&134=0 should return 12200 rows. select mt_flags, count(*) from staging.tmp_t group by 1; mt_flags | count ----------+--------- 128 | 57362 4 | 1371 8 | 627 2 | 19072 0 | 2361630 (5 rows)
In fact, if I rewrite the query using value matching - the estimations are right on: explain analyze select count(*) from staging.tmp_t where mt_flags not in (128,2,4); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=85878.63..85878.64 rows=1 width=0) (actual time=2904.005..2904.005 rows=1 loops=1) -> Seq Scan on tmp_t (cost=0.00..79973.85 rows=**2361910** width=0) (actual time=0.008..2263.983 rows=2362257 loops=1) Filter: (mt_flags <> ALL ('{128,2,4}'::integer[])) Total runtime: 2904.038 ms (4 rows) Anyways, I've been using statistics target of 100 in 8.3 and in 8.4 100 is default. I am currently using default_statistics_target=1000. Do you think that bit-and function might be skewing the statistics for execution plan somehow? Thanks, -Slava. -----Original Message----- From: Scott Marlowe [mailto:scott.marl...@gmail.com] Sent: Tuesday, August 18, 2009 2:58 PM To: Slava Moudry Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] number of rows estimation for bit-AND operation 2009/8/18 Slava Moudry <smou...@4info.net>: > Hi Scott, > Thank you for reply. > I am using Postgres 8.4.0 (btw - great release --very happy about it) and I > got a different plan after following your advice: Yeah, you're returning most of the rows, so a seq scan makes sense. Try indexing / matching on something more uncommon and you should get an index scan. > The seq scan is OK, since I don't expect Postgres to use index scan for such > low-selective condition. > It would be tough for me to support indexes for each bit flag value and their > combinations. E.g. in the query below it is again 200x off on number of rows. increase default stats target, analyze, try again. > explain analyze select count(*) from staging.tmp_t where mt_flags&134=0; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=83054.43..83054.44 rows=1 width=0) (actual > time=2964.960..2964.960 rows=1 loops=1) > -> Seq Scan on tmp_t (cost=0.00..83023.93 rows=12200 width=0) (actual > time=0.014..2152.031 rows=2362257 loops=1) > Filter: ((mt_flags & 134) = 0) > Total runtime: 2965.009 ms > (4 rows) > > I still wonder if it's something I could/should report as a bug? I've been > struggling with this issue in 8.2, 8.3.x (now using 8.4.0). > We can more or less work around this by disabling nestloop in our analytics > queries but I have problems enforcing this in reporting applications. Looks more like a low stats target. Try increasing that first. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance