Re: [PERFORM] query a table with lots of columns

2014-09-19 Thread Szymon Guz
On 19 September 2014 13:51, Björn Wittich bjoern_witt...@gmx.de wrote:

 Hi mailing list,

 I am relatively new to postgres. I have a table with 500 columns and about
 40 million rows. I call this a cache table, where one column is a unique key
 (indexed) and the other 499 columns (type integer) are values belonging to
 this key.

 Now I have a second (temporary) table (only 2 columns, one of which is the key
 of my cache table) and I want to do an inner join between my temporary table
 and the large cache table and export all matching rows. I found out that the
 performance increases when I split the join into lots of small parts.
 But it seems that the database needs a lot of disk I/O to gather all 499
 data columns.
 Is there a possibility to tell the database that all these columns are
 always treated as tuples and I always want to get the whole row? Perhaps
 the disk organization could then be optimized?


Hi,
do you have indexes on the columns you use for joins?
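
If not, something along these lines might help (just a sketch; the temporary table and column names are placeholders, since they were not given):

CREATE INDEX tmp_keys_cache_key_idx ON tmp_keys (cache_key);
ANALYZE tmp_keys;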

Szymon


Re: [PERFORM] question about partial index

2014-03-18 Thread Szymon Guz
On 18 March 2014 22:26, Yu Zhao yzha...@gmail.com wrote:

 In PostgreSQL 9.3.3 Documentation 11.8. Partial Indexes Example 11-2
 (http://www.postgresql.org/docs/9.3/interactive/indexes-partial.html),
 the partial index is created

 CREATE INDEX orders_unbilled_index ON orders (order_nr) WHERE billed
 is not true;

 And the suggested use mode is

 SELECT * FROM orders WHERE billed is not true AND order_nr < 1;

 My question is: after an update to the billed column is done, will PG
 automatically add or remove records whose billed value was just set to false
 or true to/from the b-tree?

 Thanks in advance.



Hi,
the short answer is: yes, it will work as you expect.

The long answer is: no, it will not simply add/remove, because postgres
keeps many different versions of the same row. When you change the column so
that the new row version matches the index condition, that new version will be
added to the index; when you change it so that it no longer matches, the
previous versions will still be stored in the index as well, because there
could be some older transaction which should see an older version of the row.

The mechanism is quite internal, and you shouldn't need to worry about it. As
a database user you should just see that the index is updated automatically,
and it will cover all rows where billed is not true.
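
If you want to convince yourself that the partial index is maintained automatically, a small self-contained test along these lines should do (the table and data here are made up):

CREATE TABLE orders (order_nr int, billed boolean);
CREATE INDEX orders_unbilled_index ON orders (order_nr) WHERE billed IS NOT TRUE;
INSERT INTO orders SELECT g, true FROM generate_series(1, 100000) g;
UPDATE orders SET billed = false WHERE order_nr = 42;
ANALYZE orders;
-- the partial index only covers the unbilled rows, so the planner should pick it here:
EXPLAIN SELECT * FROM orders WHERE billed IS NOT TRUE AND order_nr < 1000;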

regards,
Szymon


Re: [PERFORM] SQL performance

2013-06-02 Thread Szymon Guz
On 2 June 2013 21:39, Robert DiFalco robert.difa...@gmail.com wrote:

 I have a table called contacts. It has a BIGINT owner_id which references
 a record in the user table. It also has a BIGINT user_id which may be null.
 Additionally it has a BOOLEAN blocked column to indicate if a contact is
 blocked. The final detail is that multiple contacts for an owner may
 reference the same user.

 I have a query to get all the user_ids of a non-blocked contact that is a
 mutual contact of the user. The important part of the table looks like this:

 CREATE TABLE contacts
 (
 id BIGINT PRIMARY KEY NOT NULL, -- generated
 blocked BOOL,
 owner_id BIGINT NOT NULL,
 user_id BIGINT,
 FOREIGN KEY ( owner_id ) REFERENCES app_users ( id ) ON DELETE CASCADE,
 FOREIGN KEY ( user_id ) REFERENCES app_users ( id ) ON DELETE SET NULL
 );
 CREATE INDEX idx_contact_owner ON contacts ( owner_id );
 CREATE INDEX idx_contact_mutual ON contacts ( owner_id, user_id ) WHERE
 user_id IS NOT NULL AND NOT blocked;

 The query looks like this:

 explain analyze verbose
 select c.user_id
 from contact_entity c
 where c.owner_id=24 and c.user_id <> 24 and c.user_id IS NOT NULL and NOT
 c.blocked and (exists (
   select 1
   from contact_entity c1
   where NOT c1.blocked and c1.owner_id=c.user_id and c1.user_id IS NOT
 NULL and c1.user_id=24))
 group by c.user_id;

 This will get all the users for user 24 that are mutual unblocked contacts,
 but excludes user 24 itself.

 I have run this through explain several times and I'm out of ideas on the
 index. I note that I can also write the query like this:

 explain analyze verbose
 select distinct c.user_id
 from contact_entity c left outer join contact_entity c1 on c1.owner_id =
 c.user_id and c1.user_id = c.owner_id
 where NOT c.blocked AND NOT c1.blocked AND c.owner_id = 24 AND c.user_id <> 24
 AND c.user_id IS NOT NULL AND c1.user_id IS NOT NULL
 group by c.user_id;

 I don't notice a big difference in the query plans. I also notice no
 difference if I replace the GROUP BY with DISTINCT.

 My question is, can this be tightened further in a way I haven't been
 creative enough to try? Does it matter if I use the EXISTS versus the OUTER
 JOIN, or the GROUP BY versus the DISTINCT?

 Is there a better index and I just have not been clever enough to come up
 with it yet? I've tried a bunch.

 Thanks in advance!!

 Robert



Hi Robert,
could you show us the plans?

thanks,
Szymon


Re: [PERFORM] [planner] Ignore order by in subselect if parrent do count(*)

2012-03-01 Thread Szymon Guz
On 1 March 2012 13:02, Marcin Mirosław mar...@mejor.pl wrote:

 On 01.03.2012 12:50, Szymon Guz wrote:
 Hi Szymon,
  If you have only 2 rows in the table, then the plan really doesn't
  matter too much. Sorting two rows would be really fast :)
 
  Try to check it with 10k rows.

 It doesn't matter (in this case) how many records are in the users_profile
 table. The planner still does the sorting.
 Here is version with more rows:
 $ explain (analyze,verbose,buffers) SELECT count(*) from (select * from
 users_profile order by id) u_p;
                                            QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Aggregate  (cost=1593639.92..1593639.93 rows=1 width=0) (actual
 time=11738.498..11738.498 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=2499 read=41749 written=10595, temp read=17107
 written=17107
    ->  Sort  (cost=1443640.26..1468640.21 rows=977 width=4) (actual
 time=9804.461..10963.911 rows=1000 loops=1)
 Output: users_profile.id
 Sort Key: users_profile.id
 Sort Method: external sort  Disk: 136856kB
 Buffers: shared hit=2499 read=41749 written=10595, temp
 read=17107 written=17107
        ->  Seq Scan on public.users_profile  (cost=0.00..144247.77
 rows=977 width=4) (actual time=0.021..1192.202 rows=1000 loops=1)
   Output: users_profile.id
   Buffers: shared hit=2499 read=41749 written=10595
  Total runtime: 11768.199 ms
 (12 rows)

 And without order by:
 $ explain (analyze,verbose,buffers) SELECT count(*) from (select * from
 users_profile ) u_p;
                                            QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Aggregate  (cost=169247.71..169247.72 rows=1 width=0) (actual
 time=1757.613..1757.613 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=2522 read=41726
    ->  Seq Scan on public.users_profile  (cost=0.00..144247.77
 rows=977 width=0) (actual time=0.032..946.166 rows=1000 loops=1)
 Output: users_profile.id
 Buffers: shared hit=2522 read=41726
  Total runtime: 1757.656 ms
 (7 rows)



Could you provide the postgres version and the structure of the users_profile
table (with indexes)?

- Szymon


Re: [PERFORM] Query tuning help

2011-10-11 Thread Szymon Guz
On 11 October 2011 19:52, CS DBA cs_...@consistentstate.com wrote:

  Hi all ;

 I'm trying to tune a difficult query.

 I have 2 tables:
 cust_acct (9million rows)
 cust_orders (200,000 rows)

 Here's the query:

 SELECT
 a.account_id, a.customer_id, a.order_id, a.primary_contact_id,
 a.status,  a.customer_location_id, a.added_date,
 o.agent_id, o.order_location_id,
 COALESCE(a.customer_location_id, o.order_location_id) AS
 order_location_id
 FROM
 cust_acct a JOIN
 cust_orders o
 ON a.order_id = o.order_id;

 I can't get it to run much faster than about 13 seconds; in most cases it's
 more like 30 seconds.
 We have a box with 8 2.5GHz cores and 28GB of RAM; shared_buffers is at 8GB.


 I've tried separating the queries as filtering queries and joining the
 results, disabling seq scans, upping work_mem and half a dozen other
 approaches.  Here's the explain plan:

  Hash Join  (cost=151.05..684860.30 rows=9783130 width=100)
Hash Cond: (a.order_id = o.order_id)
    ->  Seq Scan on cust_acct a  (cost=0.00..537962.30 rows=9783130
 width=92)
    ->  Hash  (cost=122.69..122.69 rows=2269 width=12)
          ->  Seq Scan on cust_orders o  (cost=0.00..122.69 rows=2269
 width=12)

 Thanks in advance for any help, tips, etc...





Hi,
two simple questions:

- do you really need to fetch all 9M rows?
- show us the table structure, together with the index definitions (see the sketch below)
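
The quickest way to gather the second point is usually the psql meta-commands (table names as in your query):

\d+ cust_acct
\d+ cust_orders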

regards
Szymon


Re: [PERFORM] postgresql query runtime

2011-10-11 Thread Szymon Guz
On 11 October 2011 21:08, Radhya sahal rad_cs_2...@yahoo.com wrote:

 Hi,
 I want to know how I can measure query runtime in PostgreSQL if I use the
 command line psql?
 Not the explain runtime for the query, but the runtime which appears in
 pgAdmin,
 such as Total query runtime: 203 ms.



run this in psql:

\t

regards
Szymon


Re: [PERFORM] postgresql query runtime

2011-10-11 Thread Szymon Guz
On 11 October 2011 21:13, Szymon Guz mabew...@gmail.com wrote:



 On 11 October 2011 21:08, Radhya sahal rad_cs_2...@yahoo.com wrote:

 Hi,
 I want to know how I can measure query runtime in PostgreSQL if I use the
 command line psql?
 Not the explain runtime for the query, but the runtime which appears in
 pgAdmin,
 such as Total query runtime: 203 ms.



 run this in psql:

 \t

 regards
 Szymon



yes... \timing of course... I think I shouldn't send emails when I've got a
fever :)
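
For completeness, a minimal example (the table name here is made up):

\timing on
SELECT count(*) FROM my_table;
-- psql then prints a line like "Time: ... ms" after each statement
\timing off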


- Szymon


Re: [PERFORM] Which Join is better

2011-08-02 Thread Szymon Guz
On 2 August 2011 08:42, Adarsh Sharma adarsh.sha...@orkash.com wrote:

 Dear all,

 Just want to know which join is better for querying data faster.

 I have 2 tables: A ( 70 GB ) and B ( 7 MB ).

 A has 10 columns and B has 3 columns. Indexes exist on both tables' ids.

 select p.* from A p, B q where p.id=q.id

 or

 select p.* from B q, A p where q.id=p.id



Hi,
it really doesn't matter. PostgreSQL can reorder the joins as it likes.
And you can always check, but I think the plans will be the same.
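
If you want to verify it, compare the two plans; they should come out identical (a sketch using the tables from your mail):

EXPLAIN select p.* from A p, B q where p.id = q.id;
EXPLAIN select p.* from B q, A p where q.id = p.id;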

regards
Szymon


Re: [PERFORM] help speeding up a query in postgres 8.4.5

2011-04-06 Thread Szymon Guz
On 5 April 2011 21:25, Maria L. Wilson maria.l.wilso...@nasa.gov wrote:

  Would really appreciate someone taking a look at the query below
 Thanks in advance!


 this is on a linux box...
 Linux dsrvr201.larc.nasa.gov 2.6.18-164.9.1.el5 #1 SMP Wed Dec 9 03:27:37
 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

 explain analyze
 select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
 from GRAN_VER GV
 left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
 INVS
 where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and
 INVS.sensor_id='13'


 Aggregate  (cost=736364.52..736364.53 rows=1 width=8) (actual
 time=17532.930..17532.930 rows=1 loops=1)
   ->  Hash Join  (cost=690287.33..734679.77 rows=336949 width=8) (actual
 time=13791.593..17323.080 rows=924675 loops=1)
 Hash Cond: (invs.granule_id = gv.granule_id)
  ->  Seq Scan on invsensor invs  (cost=0.00..36189.41 rows=1288943
 width=4) (actual time=0.297..735.375 rows=1277121 loops=1)
   Filter: (sensor_id = 13)
  ->  Hash  (cost=674401.52..674401.52 rows=1270865 width=16)
 (actual time=13787.698..13787.698 rows=1270750 loops=1)
    ->  Hash Join  (cost=513545.62..674401.52 rows=1270865
 width=16) (actual time=1998.702..13105.578 rows=1270750 loops=1)
 Hash Cond: (gv.granule_id = iv.granule_id)
  ->  Seq Scan on gran_ver gv  (cost=0.00..75224.90
 rows=4861490 width=4) (actual time=0.008..1034.885 rows=4867542 loops=1)
  ->  Hash  (cost=497659.81..497659.81 rows=1270865
 width=12) (actual time=1968.918..1968.918 rows=1270750 loops=1)
    ->  Bitmap Heap Scan on inventory iv
 (cost=24050.00..497659.81 rows=1270865 width=12) (actual
 time=253.542..1387.957 rows=1270750 loops=1)
 Recheck Cond: (inv_id = 65)
  ->  Bitmap Index Scan on inven_idx1
 (cost=0.00..23732.28 rows=1270865 width=0) (actual time=214.364..214.364
 rows=1270977 loops=1)
   Index Cond: (inv_id = 65)
 Total runtime: 17533.100 ms

 some additional info.
 the table inventory is about 4481 MB and also has postgis types.
 the table gran_ver is about 523 MB
 the table INVSENSOR is about 217 MB

 the server itself has 32G RAM with the following set in the postgres conf
 shared_buffers = 3GB
 work_mem = 64MB
 maintenance_work_mem = 512MB
 wal_buffers = 6MB

 let me know if I've forgotten anything!  thanks a bunch!!

 Maria Wilson
 NASA/Langley Research Center
 Hampton, Virginia
 m.l.wil...@nasa.gov



Hi,
could you show us the indexes that you have on all the tables from this query?
Have you tried running VACUUM ANALYZE on those tables? Do you have autovacuum
active?
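
Something like this would gather that information (a sketch; the table names are taken from your query):

SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('inventory', 'gran_ver', 'invsensor');

VACUUM ANALYZE inventory;
VACUUM ANALYZE gran_ver;
VACUUM ANALYZE invsensor;

SHOW autovacuum;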

regards
Szymon


Re: [PERFORM] Running PostgreSQL as fast as possible no matter the consequences

2010-11-05 Thread Szymon Guz
On 5 November 2010 11:59, A B gentosa...@gmail.com wrote:

 Hi there.

 If you just wanted PostgreSQL to go as fast as possible WITHOUT any
 care for your data (you accept 100% data loss and data corruption if any
 error should occur), what settings should you use then?



I'm just curious, what do you need that for?


regards
Szymon


Re: [PERFORM] which one is faster

2010-10-26 Thread Szymon Guz
On 26 October 2010 12:56, AI Rumman rumman...@gmail.com wrote:

 Which one is faster?
 select count(*) from table
 or
 select count(id) from table
 where id is the primary key.



Check the query plan, both queries are the same.

regards
Szymon


Re: [PERFORM] which one is faster

2010-10-26 Thread Szymon Guz
2010/10/26 Marcin Mirosław mar...@mejor.pl

 On 26.10.2010 12:59, Szymon Guz wrote:
  both queries are the same.

 IMHO they aren't the same, but they return the same value in this case.
 I mean count(field) doesn't count NULL values, while count(*) does.
 I'm writing this only as a note :)
 Regards


Yup, indeed. I omitted that note, as it was written that the field is the
primary key :).

regards
Szymon


Re: [PERFORM] which one is faster

2010-10-26 Thread Szymon Guz
2010/10/26 Grzegorz Jaśkiewicz gryz...@gmail.com

 Implementation-wise, count(*) is faster. Very easy to test:

 SELECT COUNT(*) FROM generate_series(1,100) a, generate_series(1,1000) b;

 SELECT COUNT(a) FROM generate_series(1,100) a, generate_series(1,1000) b;


 ;]


Well, strange. Why is that slower?


Re: [PERFORM] which one is faster

2010-10-26 Thread Szymon Guz
2010/10/26 Grzegorz Jaśkiewicz gryz...@gmail.com

 2010/10/26 Szymon Guz mabew...@gmail.com:
 
  Well, strange. Why is that slower?

 To answer that fully, you would need to see the implementation.
 suffice to say,

 count(a) does:

 if (a <> NULL)
 {
  count++;
 }

 and count(*) does:

  count++;



Yup, I was afraid of that, even if there is a NOT NULL constraint on the
column... but I think usually nobody notices the difference with count.

regards
Szymon


Re: [PERFORM] why index is not working in operation?

2010-07-22 Thread Szymon Guz
2010/7/22 AI Rumman rumman...@gmail.com

 I have a table.

 \d email_track
 Table public.email_track
  Column |  Type   | Modifiers
 --------+---------+--------------------
  crmid  | integer | not null default 0
  mailid | integer | not null default 0
  count  | integer |
 Indexes:
 email_track_pkey PRIMARY KEY, btree (crmid, mailid) CLUSTER
 email_track_count_idx btree (count)


 explain analyze select * from email_track where count > 10 ;
                                            QUERY PLAN
 ------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on email_track  (cost=12.79..518.05 rows=1941 width=12)
 (actual time=0.430..3.047 rows=1743 loops=1)
    Recheck Cond: (count > 10)
    ->  Bitmap Index Scan on email_track_count_idx  (cost=0.00..12.79
 rows=1941 width=0) (actual time=0.330..0.330 rows=1743 loops=1)
          Index Cond: (count > 10)
  Total runtime: 4.702 ms
 (5 rows)

 explain analyze select * from email_track where count > 1 ;
                                            QUERY PLAN
 ------------------------------------------------------------------------------------------------
  Seq Scan on email_track  (cost=0.00..1591.65 rows=88851 width=12) (actual
 time=0.011..118.499 rows=88852 loops=1)
    Filter: (count > 1)
  Total runtime: 201.206 ms
 (3 rows)

 I don't know why the index scan is not used for the count > 1 operation.
 Any idea please.


The database knows, from the table statistics, that the query with count > 10
would return a small number of rows (1941), while the query with count > 1
would return a big number of rows (88851). "Small" and "big" are quite
relative, but the result is that the database knows it would be faster not to
use the index when the number of returned rows is big.
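
If you want to see this for yourself, temporarily discourage sequential scans and compare the plans, for example:

SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM email_track WHERE count > 1;
RESET enable_seqscan;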

regards
Szymon Guz


Re: [PERFORM] Slow function in queries SELECT clause.

2010-06-20 Thread Szymon Guz
2010/6/19 Davor J. dav...@live.com

 I think I have read what is to be read about queries being prepared in
 plpgsql functions, but I still can not explain the following, so I thought
 to post it here:

 Suppose 2 functions: factor(int,int) and offset(int, int).
 Suppose a third function: convert(float,int,int) which simply returns
 $1*factor($2,$3)+offset($2,$3)
 All three functions are IMMUTABLE.

 Very simple, right? Now I have very fast AND very slow executing queries on
 some 150k records:

 VERY FAST (half a second):
 
 SELECT data*factor(1,2)+offset(1,2) FROM tbl_data;

 VERY SLOW (a minute):
 
 SELECT convert(data, 1, 2) FROM tbl_data;

 The slowness cannot be due to calling a function 150k times. If I define
 convert2(float,int,int) to return a constant value, then it executes in
 about a second (still about twice as slow as the VERY FAST query).

 I assume that factor and offset are cached in the VERY FAST query, and not
 in the slow one? If so, why not and how can I force it? Currently I need
 only one function for conversions.

 Regards,
 Davor




Hi,
show us the code of those functions and the EXPLAIN ANALYZE output of those
queries.
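
For reference, a minimal sketch of the setup as I understand it (the bodies below are made up, since the real definitions were not posted):

CREATE FUNCTION factor(int, int) RETURNS float AS $$
BEGIN
    RETURN 1.5;  -- placeholder body
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- "offset" has to be quoted, because OFFSET is a reserved word
CREATE FUNCTION "offset"(int, int) RETURNS float AS $$
BEGIN
    RETURN 0.25;  -- placeholder body
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE FUNCTION convert(float, int, int) RETURNS float AS $$
BEGIN
    RETURN $1 * factor($2, $3) + "offset"($2, $3);
END;
$$ LANGUAGE plpgsql IMMUTABLE;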

regards
Szymon Guz


Re: [PERFORM] query hangs

2010-06-10 Thread Szymon Guz
2010/6/10 AI Rumman rumman...@gmail.com

 Can anyone please tell me why the following query hangs?
 This is a part of a large query.

 explain
 select *
 from vtiger_emaildetails
 inner join vtiger_vantage_email_track on vtiger_emaildetails.emailid =
 vtiger_vantage_email_track.mailid
 left join vtiger_seactivityrel on vtiger_seactivityrel.activityid =
 vtiger_emaildetails.emailid

                                     QUERY PLAN
 ----------------------------------------------------------------------------------
  Merge Left Join  (cost=9500.30..101672.51 rows=2629549 width=506)
Merge Cond: (outer.emailid = inner.activityid)
    ->  Merge Join  (cost=9500.30..11658.97 rows=88852 width=498)
  Merge Cond: (outer.emailid = inner.mailid)
   ->  Index Scan using vtiger_emaildetails_pkey on
 vtiger_emaildetails  (cost=0.00..714.40 rows=44595 width=486)
   ->  Sort  (cost=9500.30..9722.43 rows=88852 width=12)
Sort Key: vtiger_vantage_email_track.mailid
     ->  Seq Scan on vtiger_vantage_email_track
 (cost=0.00..1369.52 rows=88852 width=12)
    ->  Index Scan using seactivityrel_activityid_idx on
 vtiger_seactivityrel  (cost=0.00..28569.29 rows=1319776 width=8)
 (9 rows)

 select relname, reltuples, relpages
 from pg_class
 where relname in
 ('vtiger_emaildetails','vtiger_vantage_email_track','vtiger_seactivityrel');


   relname   |  reltuples  | relpages
 ----------------------------+-------------+----------
  vtiger_emaildetails|   44595 | 1360
  vtiger_seactivityrel   | 1.31978e+06 | 6470
  vtiger_vantage_email_track |   88852 |  481
 (3 rows)




Could you define what you mean by 'hangs'? Does it work or not?
Check the pg_locks table for locking issues; maybe the query is just slow, not
hung.
Notice that the query returns over 2M rows, which can be quite a lot given
your database structure, data volume and current server configuration.
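
For example, while the query is running in one session, you could check for ungranted locks from another (a sketch using the standard pg_locks view):

SELECT locktype, relation::regclass, mode, granted
FROM pg_locks
WHERE NOT granted;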

regards
Szymon Guz


Re: [PERFORM] query hangs

2010-06-10 Thread Szymon Guz
2010/6/10 AI Rumman rumman...@gmail.com

 I found only AccessShareLock entries in pg_locks during the query.
 And the query does not return data even though I have been waiting for 10
 minutes.

 Do you have any idea ?


 On Thu, Jun 10, 2010 at 5:26 PM, Szymon Guz mabew...@gmail.com wrote:



 2010/6/10 AI Rumman rumman...@gmail.com

 Can anyone please tell me why the following query hangs?
 This is a part of a large query.

 explain
 select *
 from vtiger_emaildetails
 inner join vtiger_vantage_email_track on vtiger_emaildetails.emailid =
 vtiger_vantage_email_track.mailid
 left join vtiger_seactivityrel on vtiger_seactivityrel.activityid =
 vtiger_emaildetails.emailid

                                     QUERY PLAN
 ----------------------------------------------------------------------------------
  Merge Left Join  (cost=9500.30..101672.51 rows=2629549 width=506)
Merge Cond: (outer.emailid = inner.activityid)
    ->  Merge Join  (cost=9500.30..11658.97 rows=88852 width=498)
  Merge Cond: (outer.emailid = inner.mailid)
   ->  Index Scan using vtiger_emaildetails_pkey on
 vtiger_emaildetails  (cost=0.00..714.40 rows=44595 width=486)
   ->  Sort  (cost=9500.30..9722.43 rows=88852 width=12)
Sort Key: vtiger_vantage_email_track.mailid
     ->  Seq Scan on vtiger_vantage_email_track
 (cost=0.00..1369.52 rows=88852 width=12)
    ->  Index Scan using seactivityrel_activityid_idx on
 vtiger_seactivityrel  (cost=0.00..28569.29 rows=1319776 width=8)
 (9 rows)

 select relname, reltuples, relpages
 from pg_class
 where relname in
 ('vtiger_emaildetails','vtiger_vantage_email_track','vtiger_seactivityrel');


   relname   |  reltuples  | relpages
 ----------------------------+-------------+----------
  vtiger_emaildetails|   44595 | 1360
  vtiger_seactivityrel   | 1.31978e+06 | 6470
  vtiger_vantage_email_track |   88852 |  481
 (3 rows)




  Could you define what you mean by 'hangs'? Does it work or not?
  Check the pg_locks table for locking issues; maybe the query is just slow,
  not hung.
  Notice that the query returns over 2M rows, which can be quite a lot given
  your database structure, data volume and current server configuration.

 regards
 Szymon Guz



1. Run VACUUM ANALYZE on the tables used in the query (see the sketch below).
2. Check how long it takes if you limit the number of returned rows to just
100.
3. Do you have indexes on the columns used in the joins?
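
A sketch of the first two steps (the table names are taken from your query; the LIMIT is just illustrative):

VACUUM ANALYZE vtiger_emaildetails;
VACUUM ANALYZE vtiger_vantage_email_track;
VACUUM ANALYZE vtiger_seactivityrel;

SELECT *
FROM vtiger_emaildetails
INNER JOIN vtiger_vantage_email_track ON vtiger_emaildetails.emailid = vtiger_vantage_email_track.mailid
LEFT JOIN vtiger_seactivityrel ON vtiger_seactivityrel.activityid = vtiger_emaildetails.emailid
LIMIT 100;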

regards
Szymon Guz


Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present

2010-06-02 Thread Szymon Guz
2010/6/2 Jori Jovanovich j...@dimensiology.com

 hi,

 I have a problem space where the main goal is to search backward in time
 for events.  Time can go back very far into the past, and so the
 table can get quite large.  However, the vast majority of queries are all
 satisfied by relatively recent data.  I have an index on the row creation
 date and I would like almost all of my queries to have a query plan looking
 something like:



[CUT]

Do you have autovacuum running? Have you tried updating statistics?

regards
Szymon Guz


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Szymon Guz
2010/3/28 Tadipathri Raghu traghu@gmail.com

 Hi All,

 Example on optimizer
 ===
 postgres=# create table test(id int);
 CREATE TABLE
 postgres=# insert into test VALUES (1);
 INSERT 0 1
 postgres=# select * from test;
  id
 
   1
 (1 row)
 postgres=# explain select * from test;
QUERY PLAN
 
  Seq Scan on test  (cost=0.00..34.00 *rows=2400* width=4)
 (1 row)
 In the above example, the optimizer is showing that many rows when
 there is only one row in that table. If I run ANALYZE, I am getting one row.


No, the optimizer is not retrieving anything; it just assumes that there are
2400 rows because that is the row count recorded in the statistics for this
table. The optimizer tries to find the best plan for executing the query,
taking into consideration all the information available for this table
(including the statistics about its rows).
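
You can look at the numbers the planner works from directly in the catalogs, for example:

SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'test';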



 postgres=# ANALYZE test;
 ANALYZE
 postgres=# explain select * from test;
  QUERY PLAN
 
  Seq Scan on test  (cost=0.00..1.01 *rows=1* width=4)
 (1 row)

 My question here is, what is it reporting as rows when there are no such
 rows? One more thing: if I don't run ANALYZE but run the explain plan three
 or more times, the catalogs get updated automatically and the correct row
 count of 1 is shown.



Now ANALYZE has changed the statistics for this table, so the planner knows
that there is just one row. Autovacuum can also be working in the background,
so the statistics can change automatically (autovacuum behaviour depends on
the settings for the database).


 Q2. Does EXPLAIN update the catalogs automatically?



No, EXPLAIN doesn't update a table's statistics.


regards
Szymon Guz


Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-28 Thread Szymon Guz
2010/3/28 Tadipathri Raghu traghu@gmail.com

 Hi Guz,

 Thank you for the prompt reply.


 No, the optimizer is not retrieving anything; it just assumes that there are
 2400 rows because that is the row count recorded in the statistics for this
 table. The optimizer tries to find the best plan for executing the query,
 taking into consideration all the information available for this table
 (including the statistics about its rows).


 So, what is it assuming here as rows (2400)? Could you explain this?



It is assuming that there are 2400 rows in this table. Probably you've
deleted some rows from the table leaving just one.

regards
Szymon Guz