R: R: [PERFORM] Query plan on identical tables differs . Why ?

2004-05-14 Thread Fabio Panizzutti


-Messaggio originale-
Da: Tom Lane [mailto:[EMAIL PROTECTED] 
Inviato: giovedì 13 maggio 2004 17.01
A: Fabio Panizzutti
Cc: 'Shridhar Daithankar'; [EMAIL PROTECTED]
Oggetto: Re: R: [PERFORM] Query plan on identical tables 
differs . Why ? 


Fabio Panizzutti [EMAIL PROTECTED] writes:
 I don't understand why the planner chose a different query plan on 
 identical tables with same indexes .

Different data statistics; not to mention different table 
sizes (the cost equations are not linear).

Have you ANALYZEd (or VACUUM ANALYZEd) both tables recently?

If the stats are up to date but still not doing the right 
thing, you might try increasing the statistics target for 
the larger table's tag_id column.  See ALTER TABLE SET STATISTICS.

 regards, tom lane


All tables are vacumed and analyzed . 
I try so set statistics to 1000 to tag_id columns with ALTER TABLE SET
STATISTIC, revacuum analyze ,  but the planner choose the same query
plan . 
I'm trying now to change the indexes .

Thanks 




---(end of broadcast)---
TIP 8: explain analyze is your friend


R: R: [PERFORM] Query plan on identical tables differs . Why ?

2004-05-14 Thread Fabio Panizzutti


-Messaggio originale-
Da: Stephan Szabo [mailto:[EMAIL PROTECTED] 
Inviato: giovedì 13 maggio 2004 17.17
A: Fabio Panizzutti
Cc: 'Shridhar Daithankar'; [EMAIL PROTECTED]
Oggetto: Re: R: [PERFORM] Query plan on identical tables 
differs . Why ?


On Thu, 13 May 2004, Fabio Panizzutti wrote:


 I don't understand why the planner chose a different query plan on 
 identical tables with same indexes .

Because it's more than table structure that affects the 
choice made by the planner.  In addition the statistics 
about the values that are there as well as the estimated 
size of the table have effects.  One way to see is to see 
what it thinks is best is to remove the indexes it is using 
and see what plan it gives then, how long it takes and the 
estimated costs for those plans.

In other suggestions, I think having a (tag_id, data_tag) 
index rather than (data_tag, tag_id) may be a win for 
queries like this. Also, unless you're doing many select 
queries by only the first field of the composite index and 
you're not doing very many insert/update/deletes, you may 
want to drop the other index on just that field.


Thanks for your attention , i change the indexes on the tables as you
suggested  :

 storico=# \d storico_misure_short
Table tenore.storico_misure_short
 Column  |Type | Modifiers
-+-+---
 data_tag| timestamp without time zone | not null
 tag_id  | integer | not null
 unita_misura| character varying(6)| not null
 valore_tag  | numeric(20,3)   | not null
 qualita | integer | not null
 numero_campioni | numeric(5,0)|
 frequenza_campionamento | numeric(3,0)|
Indexes:
storico_misure_short_idx primary key, btree (tag_id, data_tag)
storico_misure_short_data_tag_idx2 btree (data_tag)

storico=# \d storico_misure
   Table tenore.storico_misure
 Column  |Type | Modifiers
-+-+---
 data_tag| timestamp without time zone | not null
 tag_id  | integer | not null
 unita_misura| character varying(6)| not null
 valore_tag  | numeric(20,3)   | not null
 qualita | integer | not null
 numero_campioni | numeric(5,0)|
 frequenza_campionamento | numeric(3,0)|
Indexes:
storico_misure_idx primary key, btree (tag_id, data_tag)
storico_misure_data_tag_idx2 btree (data_tag)

And now performance are similar and the planner works correctly :

storico=# \d storico_misure_short
Table tenore.storico_misure_short
 Column  |Type | Modifiers
-+-+---
 data_tag| timestamp without time zone | not null
 tag_id  | integer | not null
 unita_misura| character varying(6)| not null
 valore_tag  | numeric(20,3)   | not null
 qualita | integer | not null
 numero_campioni | numeric(5,0)|
 frequenza_campionamento | numeric(3,0)|
Indexes:
storico_misure_short_idx primary key, btree (tag_id, data_tag)
storico_misure_short_data_tag_idx2 btree (data_tag)

storico=# \d storico_misure
   Table tenore.storico_misure
 Column  |Type | Modifiers
-+-+---
 data_tag| timestamp without time zone | not null
 tag_id  | integer | not null
 unita_misura| character varying(6)| not null
 valore_tag  | numeric(20,3)   | not null
 qualita | integer | not null
 numero_campioni | numeric(5,0)|
 frequenza_campionamento | numeric(3,0)|
Indexes:
storico_misure_idx primary key, btree (tag_id, data_tag)
storico_misure_data_tag_idx2 btree (data_tag)

storico=# explain analyze select tag_id,valore_tag,data_tag from
storico_misure_short where (data_tag'2004-05-03' and data_tag
'2004-05-12') and tag_id=37423 ;
 
QUERY PLAN


--
 Index Scan using storico_misure_short_idx on storico_misure_short
(cost=0.00..2104.47 rows=584 width=20) (actual time=0.232..39.932
rows=864 loops=1)
   Index Cond: ((tag_id = 37423) AND (data_tag  '2004-05-03
00:00:00

R: R: [PERFORM] Query plan on identical tables differs . Why ?

2004-05-14 Thread Fabio Panizzutti


 Index Scan using pk_storico_misure_2 on storico_misure 
(cost=0.00..1984.64 rows=658 width=21) (actual 
time=723.441..1858.107 
rows=835 loops=1)
   Index Cond: ((data_tag  '2004-05-03 
00:00:00'::timestamp without 
time zone) AND (data_tag  '2004-05-12 00:00:00'::timestamp without 
time
zone) AND (tag_id = 37423))

Either most of the time is spent skipping index tuples in 
the data_tag range 2004-05-03 to 2004-05-12 which have 
tag_id  37423, or getting those 835 rows causes a lot of 
disk seeks.

If the former is true, an index on (tag_id, data_tag) will help.

Is true , i recreate the indexes making an index on  (tag_id, data_tag)
and works fine . 


In your first message you wrote:
fsync = false

Do this only if you don't care for your data.


I set it to false , for performance tests .I've a stored procedure that
make about 2000 insert in 2 tables and 2000 delete in another and with
fsync false perfomrmance are  2.000 -3.000 ms (stable) with fsync 3.000
ms to 15.000 ms . I trust in my hardware an O.S so fsync setting is a
big dubt for my production enviroment .

Thanks a lot

Bye 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: R: R: [PERFORM] Query plan on identical tables differs . Why ?

2004-05-14 Thread Christopher Kings-Lynne
 I trust in my hardware an O.S so fsync setting is a
 big dubt for my production enviroment .

Then you are making a big mistake, loving your hardware more than your
data...

Chris



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


R: R: R: [PERFORM] Query plan on identical tables differs . Why ?

2004-05-14 Thread Fabio Panizzutti


-Messaggio originale-
Da: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] Per conto di 
Christopher Kings-Lynne
Inviato: venerdì 14 maggio 2004 11.55
A: Fabio Panizzutti
Cc: 'Manfred Koizar'; [EMAIL PROTECTED]
Oggetto: Re: R: R: [PERFORM] Query plan on identical tables 
differs . Why ?


 I trust in my hardware an O.S so fsync setting is a
 big dubt for my production enviroment .

Then you are making a big mistake, loving your hardware more 
than your data...

Chris



I'm testing for better performance in insert/delete so i turn off fsync
, i don't love hardware more than data , so i'll set fsync on in the
production enviroment .
Thanks a lot

Best regards

Fabio


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly