[PERFORM] Problem with large query

2004-09-08 Thread Marc Cousin
Hi. I hope I'm not asking a too trivial question here...

I'm having trouble with a (quite big) query, and can't find a way to make it 
faster.

Here is the information :

Tables :

sces_vte - 2753539 rows
sces_art - 602327
sces_fsf - 8126
sces_frc - 7763
sces_tps - 38
sces_gtr - 35


Query :
===
SELECT
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','99'),
TO_NUMBER('200405','99'),
sces_tps.tps_libc,
sum(sces_vte.vte_mnt),
sum(sces_vte.vte_qte),
sum(sces_vte.vte_ton),
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib
FROM
sces_vte,
sces_art,
sces_fsf  sces_famille,
sces_fsf  sces_s_famille,
sces_frc  sces_frc_art,
sces_tps,
sces_gtr  sces_gtr_art
WHERE
( sces_famille.fsf_codfam=sces_s_famille.fsf_codfam  )
AND  ( sces_famille.fsf_codseg= 0 and sces_famille.fsf_codsfm = 0  )
AND  ( sces_vte.tps_annee_mois=sces_tps.tps_annee_mois  )
AND  ( sces_vte.art_cod=sces_art.art_cod and 
sces_vte.dos_cod=sces_art.dos_cod  )
AND  ( sces_gtr_art.gtr_cod=sces_frc_art.gtr_cod  )
AND  ( sces_frc_art.gtr_cod=sces_art.gtr_cod and 
sces_frc_art.fou_cod=sces_art.fou_cod )
AND  ( sces_s_famille.fsf_codfam=sces_art.fsf_codfam and 
sces_s_famille.fsf_codsfm=sces_art.fsf_codsfm  )
AND  ( sces_s_famille.fsf_codseg = 0  )
AND  (
( ( ( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200401','99') ) and 
( sces_tps.tps_annee_mois ) = (
TO_NUMBER('200405','99') )
)
OR
(
( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200401','99') )-100 and 
( sces_tps.tps_annee_mois ) = (
TO_NUMBER('200405','99') )-100
)  )
AND  ( sces_gtr_art.gtr_cod  in  (2))
)
GROUP BY
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','99'),
TO_NUMBER('200405','99'),
sces_tps.tps_libc,
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib

Explain Analyze Plan : 

 GroupAggregate  (cost=27161.91..27938.72 rows=16354 width=280) (actual 
time=484509.210..544436.148 rows=4115 loops=1)
   -  Sort  (cost=27161.91..27202.79 rows=16354 width=280) (actual 
time=484496.188..485334.151 rows=799758 loops=1)
 Sort Key: sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, 
sces_frc_art.fou_lib, sces_tps.tps_annee_mois, 200401::numeric, 200405::numeric, 
sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, 
sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib
 -  Merge Join  (cost=25727.79..26017.34 rows=16354 width=280) (actual 
time=58945.821..69321.146 rows=799758 loops=1)
   Merge Cond: ((outer.fsf_codfam = inner.fsf_codfam) AND 
(outer.fsf_codsfm = inner.fsf_codsfm))
   -  Sort  (cost=301.36..304.60 rows=1298 width=83) (actual 
time=27.926..28.256 rows=332 loops=1)
 Sort Key: sces_s_famille.fsf_codfam, sces_s_famille.fsf_codsfm
 -  Seq Scan on sces_fsf sces_s_famille  (cost=0.00..234.24 
rows=1298 width=83) (actual time=0.042..19.124 rows=1341 loops=1)
   Filter: (fsf_codseg = 0::numeric)
   -  Sort  (cost=25426.43..25448.05 rows=8646 width=225) (actual 
time=58917.106..59693.810 rows=799758 loops=1)
 Sort Key: sces_art.fsf_codfam, sces_art.fsf_codsfm
 -  Merge Join  (cost=24726.32..24861.08 rows=8646 width=225) 
(actual time=19036.709..29404.943 rows=799758 loops=1)
   Merge Cond: (outer.tps_annee_mois = 
inner.tps_annee_mois)
   -  Sort  (cost=2.49..2.53 rows=17 width=23) (actual 
time=0.401..0.428 rows=20 loops=1)
 Sort Key: sces_tps.tps_annee_mois
 -  Seq Scan on sces_tps  (cost=0.00..2.14 rows=17 
width=23) (actual time=0.068..0.333 rows=20 loops=1)
   Filter: (((tps_annee_mois = 200301::numeric) 
OR (tps_annee_mois = 200401::numeric)) AND ((tps_annee_mois = 200305::numeric) OR 
(tps_annee_mois = 200401::numeric)) AND ((tps_annee_mois = 200301::numeric) OR 
(tps_annee_mois = 200405::numeric)) AND ((tps_annee_mois = 200305::numeric) OR 
(tps_annee_mois = 200405::numeric)))
   -  Sort  (cost=24723.83..24747.97 rows=9656 width=214) 
(actual time=19036.223..19917.214 rows=799757 loops=1)
 Sort Key: sces_vte.tps_annee_mois
 -  Nested Loop  (cost=21825.09..24084.74 rows=9656 
width=214) (actual time=417.603..8644.294 rows=399879 loops=1)
   -  Nested Loop  (cost=21825.09..21837.50 
rows=373 width=195) (actual time=417.444..672.741 rows=14158 loops=1)
 -  Seq Scan on sces_gtr sces_gtr_art  
(cost=0.00..1.44 rows=1 width=40) (actual 

Re: [PERFORM] Problem with large query

2004-09-08 Thread Tom Lane
Marc Cousin [EMAIL PROTECTED] writes:
 I'm having trouble with a (quite big) query, and can't find a way to make it 
 faster.

Seems like it might help if the thing could use a HashAggregate instead
of sort/group.  Numeric is not hashable, so having those TO_NUMBER
constants in GROUP BY destroys this option instantly ... but why in the
world are you grouping by constants anyway?  You didn't say what the
datatypes of the other columns were...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Problem with large query

2004-09-08 Thread Marc Cousin
The query has been generated by business objects ... i'ill try to suggest to the 
developpers to remove this constant (if they can)...
The fields used by the sort are of type numeric(6,0) or (10,0) ...
Could it be better if the fields were integer or anything else ?


On Wednesday 08 September 2004 16:40, you wrote:
 Marc Cousin [EMAIL PROTECTED] writes:
  I'm having trouble with a (quite big) query, and can't find a way to make it 
  faster.
 
 Seems like it might help if the thing could use a HashAggregate instead
 of sort/group.  Numeric is not hashable, so having those TO_NUMBER
 constants in GROUP BY destroys this option instantly ... but why in the
 world are you grouping by constants anyway?  You didn't say what the
 datatypes of the other columns were...
 
regards, tom lane
 

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Problem with large query

2004-09-08 Thread Tom Lane
Marc Cousin [EMAIL PROTECTED] writes:
 The query has been generated by business objects ... i'ill try to suggest to the 
 developpers to remove this constant (if they can)...
 The fields used by the sort are of type numeric(6,0) or (10,0) ...
 Could it be better if the fields were integer or anything else ?

integer or bigint would be a WHOLE lot faster.  I'd venture that
comparing two numerics is order of a hundred times slower than
comparing two integers.

Even if you don't want to change the fields on-disk, you might think
about casting them all to int/bigint in the query.

Another thing that might or might not be easy is to change the order of
the GROUP BY items so that the fields with the largest number of
distinct values are listed first.  If two rows are distinct at the first
column, the sorting comparison doesn't even have to look at the
remaining columns ...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Problem with large query

2004-09-08 Thread Marc Cousin
On Wednesday 08 September 2004 16:56, you wrote:
 Marc Cousin [EMAIL PROTECTED] writes:
  The query has been generated by business objects ... i'ill try to suggest to the 
  developpers to remove this constant (if they can)...
  The fields used by the sort are of type numeric(6,0) or (10,0) ...
  Could it be better if the fields were integer or anything else ?
 
 integer or bigint would be a WHOLE lot faster.  I'd venture that
 comparing two numerics is order of a hundred times slower than
 comparing two integers.
 
 Even if you don't want to change the fields on-disk, you might think
 about casting them all to int/bigint in the query.
 
 Another thing that might or might not be easy is to change the order of
 the GROUP BY items so that the fields with the largest number of
 distinct values are listed first.  If two rows are distinct at the first
 column, the sorting comparison doesn't even have to look at the
 remaining columns ...
 
regards, tom lane
 
Thanks. I've just had confirmation that they can remove the two constants (allready 
won 100 seconds thanks to that)
I've tried the cast, and got down to 72 seconds.
So now we're going to try to convert the fields to int or bigint.

Thanks a lot for your help and time.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org