> increase default stats target, analyze, try again.
This field has only 5 values. I had put values/frequencies in my first post.
Based on the values (see below) - there is no reason for planner to think that 
mt_flags&134=0 should return 12200 rows.
select mt_flags, count(*) from staging.tmp_t group by 1;
 mt_flags |  count  
----------+---------
      128 |   57362
        4 |    1371
        8 |     627
        2 |   19072
        0 | 2361630
(5 rows)

In fact, if I rewrite the query using value matching - the estimations are 
right on:
explain analyze select count(*) from staging.tmp_t where mt_flags not in 
(128,2,4);
                                                       QUERY PLAN               
                                        
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=85878.63..85878.64 rows=1 width=0) (actual 
time=2904.005..2904.005 rows=1 loops=1)
   ->  Seq Scan on tmp_t  (cost=0.00..79973.85 rows=**2361910** width=0) 
(actual time=0.008..2263.983 rows=2362257 loops=1)
         Filter: (mt_flags <> ALL ('{128,2,4}'::integer[]))
 Total runtime: 2904.038 ms
(4 rows)


Anyways, I've been using statistics target of 100 in 8.3 and in 8.4 100 is 
default. I am currently using default_statistics_target=1000.

Do you think that bit-and function might be skewing the statistics for 
execution plan somehow?
Thanks,
-Slava.

-----Original Message-----
From: Scott Marlowe [mailto:scott.marl...@gmail.com] 
Sent: Tuesday, August 18, 2009 2:58 PM
To: Slava Moudry
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] number of rows estimation for bit-AND operation

2009/8/18 Slava Moudry <smou...@4info.net>:
> Hi Scott,
> Thank you for reply.
> I am using Postgres 8.4.0 (btw - great release --very happy about it) and I 
> got a different plan after following your advice:

Yeah, you're returning most of the rows, so a seq scan makes sense.
Try indexing / matching on something more uncommon and you should get
an index scan.



> The seq scan is OK, since I don't expect Postgres to use index scan for such 
> low-selective condition.
> It would be tough for me to support indexes for each bit flag value and their 
> combinations. E.g. in the query below it is again 200x off on number of rows.

increase default stats target, analyze, try again.


> explain analyze select count(*) from staging.tmp_t where  mt_flags&134=0;
>                                                      QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=83054.43..83054.44 rows=1 width=0) (actual 
> time=2964.960..2964.960 rows=1 loops=1)
>   ->  Seq Scan on tmp_t  (cost=0.00..83023.93 rows=12200 width=0) (actual 
> time=0.014..2152.031 rows=2362257 loops=1)
>         Filter: ((mt_flags & 134) = 0)
>  Total runtime: 2965.009 ms
> (4 rows)
>
> I still wonder if it's something I could/should report as a bug? I've been 
> struggling with this issue in 8.2, 8.3.x  (now using 8.4.0).
> We can more or less work around this by disabling nestloop in our analytics 
> queries but I have problems enforcing this in reporting applications.

Looks more like a low stats target.  Try increasing that first.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to