Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec

2003-08-04 Thread Fernando Papa

Sorry Chris... a little slower...

esdc= EXPLAIN ANALYZE 
SELECT
  cont_contenido.id_contenido
 ,cont_contenido.pertenece_premium
 ,cont_contenido.Titulo_esp as v_sufix 
 ,cont_contenido.url_contenido
 ,cont_contenido.tipo_acceso
 ,cont_contenido.id_sbc
 ,cont_contenido.cant_vistos
 ,cont_contenido.cant_votos 
 ,cont_contenido.puntaje_total 
 ,cont_contenido.id_contenido_padre 
 ,juegos_config.imagen_tapa_especial 
 ,juegos_config.info_general_esp as info_general 
 ,juegos_config.ayuda 
 ,juegos_config.tips_tricks_esp as tips_tricks 
 ,juegos_config.mod_imagen_tapa_especial 
 ,cont_publicacion.fecha_publicacion as fecha_publicacion 
 ,cont_publicacion.generar_Vainilla 
FROM 
 cont_contenido   
 ,juegos_config
 ,cont_publicacion 
WHERE 
 cont_contenido.id_instalacion= 2
 AND cont_contenido.id_sbc   = 619
 AND cont_contenido.id_tipo   = 2
 AND cont_contenido.id_instalacion  = juegos_config.id_instalacion 
 AND cont_contenido.id_contenido   = juegos_config.id_contenido   
 AND upper(cont_publicacion.generar_Vainilla) = 'S'
 AND cont_publicacion.id_instalacion = cont_contenido.id_instalacion 
 AND cont_publicacion.id_contenido = cont_contenido.id_contenido 
 AND cont_publicacion.fecha_publicacion = (SELECT cp1.fecha_publicacion
FROM cont_publicacion cp1
 WHERE cp1.id_instalacion = cont_publicacion.id_instalacion
 AND cp1.id_contenido = cont_publicacion.id_contenido
 AND cp1.generar_vainilla = cont_publicacion.generar_vainilla
 ORDER BY fecha_publicacion LIMIT 1)
ORDER BY  cont_publicacion.fecha_publicacion desc 
 LIMIT 10
 OFFSET 0
 ;

  QUERY PLAN   

---
 Limit  (cost=9.75..9.76 rows=1 width=479) (actual time=465085.25..465085.27 rows=8 
loops=1)
   -  Sort  (cost=9.75..9.76 rows=1 width=479) (actual time=465085.23..465085.24 
rows=8 loops=1)
 Sort Key: cont_publicacion.fecha_publicacion
 -  Merge Join  (cost=9.73..9.74 rows=1 width=479) (actual 
time=210743.83..465083.31 rows=8 loops=1)
   Merge Cond: ((outer.id_instalacion = inner.id_instalacion) AND 
(outer.id_contenido = inner.id_contenido))
   -  Nested Loop  (cost=0.00..284756.79 rows=1 width=367) (actual 
time=8319.87..464981.68 rows=40 loops=1)
 Join Filter: ((inner.id_contenido = outer.id_contenido) AND 
(inner.id_instalacion = outer.id_instalacion))
 -  Index Scan using jue_conf_pk on juegos_config  
(cost=0.00..12.19 rows=40 width=332) (actual time=52.93..142.31 rows=40 loops=1)
 -  Seq Scan on cont_publicacion  (cost=0.00..7118.60 rows=1 
width=35) (actual time=51.79..11617.12 rows=97 loops=40)
   Filter: ((upper((generar_vainilla)::text) = 'S'::text) AND 
(fecha_publicacion = (subplan)))
   SubPlan
 -  Limit  (cost=15.85..15.85 rows=1 width=8) (actual 
time=25.86..25.86 rows=1 loops=17880)
   -  Sort  (cost=15.85..15.86 rows=1 width=8) 
(actual time=25.82..25.82 rows=2 loops=17880)
 Sort Key: fecha_publicacion
 -  Seq Scan on cont_publicacion cp1  
(cost=0.00..15.84 rows=1 width=8) (actual time=10.68..25.32 rows=7 loops=17880)
   Filter: ((id_instalacion = $0) AND 
(id_contenido = $1) AND (generar_vainilla = $2))
   -  Sort  (cost=9.73..9.74 rows=3 width=112) (actual time=94.91..94.93 
rows=8 loops=1)
 Sort Key: cont_contenido.id_instalacion, 
cont_contenido.id_contenido
 -  Seq Scan on cont_contenido  (cost=0.00..9.70 rows=3 
width=112) (actual time=21.70..92.96 rows=8 loops=1)
   Filter: ((id_instalacion = 2::numeric) AND (id_sbc = 
619::numeric) AND (id_tipo = 2::numeric))
 Total runtime: 465088.66 msec
(21 rows)



-Mensaje original-
De: Christopher Browne [mailto:[EMAIL PROTECTED] 
Enviado el: viernes, 01 de agosto de 2003 18:27
Para: Fernando Papa
CC: [EMAIL PROTECTED]
Asunto: Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec



I'd point at the following as being a sterling candidate for being a cause of this 
being slow...

 AND cont_publicacion.fecha_publicacion = (SELECT max(cp1.fecha_publicacion) 
  FROM cont_publicacion cp1  
  WHERE cp1.id_instalacion = cont_publicacion.id_instalacion 
    AND cp1.id_contenido = cont_publicacion.id_contenido  
    AND cp1.generar_vainilla = cont_publicacion.generar_vainilla) 

May I suggest changing it to:

 AND cont_publicacion.fecha_publicacion = (SELECT 

Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec

2003-08-04 Thread Fernando Papa

Hi Josh... a little worse time:

EXPLAIN ANALYZE 
SELECT
 cont_contenido.id_contenido
 ,cont_contenido.pertenece_premium
 ,cont_contenido.Titulo_esp as v_sufix 
 ,cont_contenido.url_contenido
 ,cont_contenido.tipo_acceso
 ,cont_contenido.id_sbc
 ,cont_contenido.cant_vistos
 ,cont_contenido.cant_votos 
 ,cont_contenido.puntaje_total 
 ,cont_contenido.id_contenido_padre 
 ,juegos_config.imagen_tapa_especial 
 ,juegos_config.info_general_esp as info_general 
 ,juegos_config.ayuda 
 ,juegos_config.tips_tricks_esp as tips_tricks 
 ,juegos_config.mod_imagen_tapa_especial 
 ,cont_publicacion.fecha_publicacion as fecha_publicacion 
 ,cont_publicacion.generar_Vainilla 
 FROM 
 cont_contenido   
 ,juegos_config
,cont_publicacion 
 WHERE 
 cont_contenido.id_instalacion= 2
 AND cont_contenido.id_sbc   = 619
 AND cont_contenido.id_tipo   = 2
 AND cont_contenido.id_instalacion  = juegos_config.id_instalacion 
 AND cont_contenido.id_contenido   = juegos_config.id_contenido 
 AND upper(cont_publicacion.generar_Vainilla) = 'S'
 AND cont_publicacion.id_instalacion = cont_contenido.id_instalacion 
 AND cont_publicacion.id_contenido = cont_contenido.id_contenido 
 AND EXISTS (SELECT max(cp1.fecha_publicacion)
 FROM cont_publicacion cp1
WHERE cp1.id_instalacion = cont_publicacion.id_instalacion
AND cp1.id_contenido = cont_publicacion.id_contenido
AND cp1.generar_vainilla = cont_publicacion.generar_vainilla
   HAVING max(cp1.fecha_publicacion) =
cont_publicacion.fecha_publicacion)
ORDER BY  cont_publicacion.fecha_publicacion desc 
 LIMIT 10
OFFSET 0
;


 
QUERY PLAN



-
 Limit  (cost=9.75..9.76 rows=1 width=479) (actual
time=449760.88..449760.91 rows=8 loops=1)
   -  Sort  (cost=9.75..9.76 rows=1 width=479) (actual
time=449760.87..449760.88 rows=8 loops=1)
 Sort Key: cont_publicacion.fecha_publicacion
 -  Merge Join  (cost=9.73..9.74 rows=1 width=479) (actual
time=202257.20..449759.00 rows=8 loops=1)
   Merge Cond: ((outer.id_instalacion =
inner.id_instalacion) AND (outer.id_contenido =
inner.id_contenido))
   -  Nested Loop  (cost=0.00..284556.86 rows=1 width=367)
(actual time=7794.28..449741.85 rows=40 loops=1)
 Join Filter: ((inner.id_contenido =
outer.id_contenido) AND (inner.id_instalacion =
outer.id_instalacion))
 -  Index Scan using jue_conf_pk on juegos_config
(cost=0.00..12.19 rows=40 width=332) (actual time=0.43..8.12 rows=40
loops=1)
 -  Seq Scan on cont_publicacion
(cost=0.00..7113.60 rows=1 width=35) (actual time=24.10..11239.67
rows=97 loops=40)
   Filter: ((upper((generar_vainilla)::text) =
'S'::text) AND (subplan))
   SubPlan
 -  Aggregate  (cost=15.85..15.85 rows=1
width=8) (actual time=25.03..25.03 rows=0 loops=17880)
   Filter: (max(fecha_publicacion) = $3)
   -  Seq Scan on cont_publicacion cp1
(cost=0.00..15.84 rows=1 width=8) (actual time=10.51..24.85 rows=7
loops=17880)
 Filter: ((id_instalacion = $0)
AND (id_contenido = $1) AND (generar_vainilla = $2))
   -  Sort  (cost=9.73..9.74 rows=3 width=112) (actual
time=10.49..10.52 rows=8 loops=1)
 Sort Key: cont_contenido.id_instalacion,
cont_contenido.id_contenido
 -  Seq Scan on cont_contenido  (cost=0.00..9.70
rows=3 width=112) (actual time=0.59..8.07 rows=8 loops=1)
   Filter: ((id_instalacion = 2::numeric) AND
(id_sbc = 619::numeric) AND (id_tipo = 2::numeric))
 Total runtime: 449765.69 msec
(20 rows)



-Mensaje original-
De: Josh Berkus [mailto:[EMAIL PROTECTED] 
Enviado el: viernes, 01 de agosto de 2003 18:32
Para: Christopher Browne; Fernando Papa
CC: [EMAIL PROTECTED]
Asunto: Re: [PERFORM] I can't wait too much: Total runtime 432478.44
msec


Fernando,

  AND cont_publicacion.fecha_publicacion = (SELECT
max(cp1.fecha_publicacion) 
   FROM cont_publicacion cp1  
   WHERE cp1.id_instalacion =
cont_publicacion.id_instalacion 
 AND cp1.id_contenido =
cont_publicacion.id_contenido  
 AND cp1.generar_vainilla =
cont_publicacion.generar_vainilla) 

Or event changing it to:

AND EXISTS (SELECT max(cp1.fecha_publicacion)
FROM cont_publicacion cp1
WHERE cp1.id_instalacion = cont_publicacion.id_instalacion 
AND cp1.id_contenido = cont_publicacion.id_contenido  
AND cp1.generar_vainilla = cont_publicacion.generar_vainilla
HAVING max(cp1.fecha_publicacion) =
cont_publicacion.fecha_publicacion)
-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco



Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec

2003-08-04 Thread Fernando Papa

I create the index, but doesn't help too much:

   QUERY 
PLAN
-
 Limit  (cost=9.75..9.76 rows=1 width=479) (actual time=486421.35..486421.38 rows=8 
loops=1)
   -  Sort  (cost=9.75..9.76 rows=1 width=479) (actual time=486421.33..486421.34 
rows=8 loops=1)
 Sort Key: cont_publicacion.fecha_publicacion
 -  Merge Join  (cost=9.73..9.74 rows=1 width=479) (actual 
time=220253.76..486420.35 rows=8 loops=1)
   Merge Cond: ((outer.id_instalacion = inner.id_instalacion) AND 
(outer.id_contenido = inner.id_contenido))
   -  Nested Loop  (cost=0.00..1828.35 rows=1 width=367) (actual 
time=8347.78..486405.02 rows=40 loops=1)
 Join Filter: ((inner.id_contenido = outer.id_contenido) AND 
(inner.id_instalacion = outer.id_instalacion))
 -  Index Scan using jue_conf_pk on juegos_config  
(cost=0.00..12.19 rows=40 width=332) (actual time=0.23..6.73 rows=40 loops=1)
 -  Index Scan using idx_generar_vainilla_ci on cont_publicacion  
(cost=0.00..45.39 rows=1 width=35) (actual time=56.01..12156.48 rows=97 loops=40)
   Index Cond: (upper((generar_vainilla)::text) = 'S'::text)
   Filter: (fecha_publicacion = (subplan))
   SubPlan
 -  Aggregate  (cost=15.84..15.84 rows=1 width=8) (actual 
time=27.03..27.03 rows=1 loops=17880)
   -  Seq Scan on cont_publicacion cp1  
(cost=0.00..15.84 rows=1 width=8) (actual time=11.21..26.86 rows=7 loops=17880)
 Filter: ((id_instalacion = $0) AND 
(id_contenido = $1) AND (generar_vainilla = $2))
   -  Sort  (cost=9.73..9.74 rows=3 width=112) (actual time=9.28..9.32 
rows=8 loops=1)
 Sort Key: cont_contenido.id_instalacion, 
cont_contenido.id_contenido
 -  Seq Scan on cont_contenido  (cost=0.00..9.70 rows=3 
width=112) (actual time=0.47..7.48 rows=8 loops=1)
   Filter: ((id_instalacion = 2::numeric) AND (id_sbc = 
619::numeric) AND (id_tipo = 2::numeric))
 Total runtime: 486445.19 msec
(20 rows)


-Mensaje original-
De: Mendola Gaetano [mailto:[EMAIL PROTECTED] 
Enviado el: sábado, 02 de agosto de 2003 7:36
Para: [EMAIL PROTECTED]
CC: Fernando Papa
Asunto: Re: I can't wait too much: Total runtime 432478.44 msec


From: Fernando Papa [EMAIL PROTECTED]

 AND upper(cont_publicacion.generar_Vainilla) = 'S'


  Filter: ((upper((generar_vainilla)::text) = 'S'::text) AND
(fecha_publicacion = (subplan)))

using a functional index on this field should help

create index idx_generar_vainilla_ci on cont_publicacion (
upper(generar_Vainilla) )



Regards
Gaetano Mendola



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec

2003-08-04 Thread Fernando Papa
Title: Mensaje



Hi 
Volker!!! I think you're right. Look at times:
 
QUERY 
PLAN 
---Limit 
(cost=23.37..23.37 rows=1 width=487) (actual time=2245.61..2245.61 rows=0 
loops=1) - Sort (cost=23.37..23.37 rows=1 
width=487) (actual time=2245.60..2245.60 rows=0 
loops=1) Sort Key: 
cont_publicacion.fecha_publicacion 
- Nested Loop (cost=23.33..23.36 rows=1 width=487) (actual 
time=2244.10..2244.10 rows=0 
loops=1) 
Join Filter: ("outer".fecha_publicacion = 
"inner".max_pub) 
- Merge Join (cost=9.73..9.74 rows=1 width=479) (actual 
time=918.73..1988.43 rows=16 
loops=1) 
Merge Cond: (("outer".id_instalacion = "inner".id_instalacion) AND 
("outer".id_contenido = 
"inner".id_contenido)) 
- Nested Loop (cost=0.00..409.35 rows=1 width=367) (actual 
time=35.44..1967.20 rows=82 
loops=1) 
Join Filter: (("inner".id_contenido = "outer".id_contenido) AND 
("inner".id_instalacion = 
"outer".id_instalacion)) 
- Index Scan using jue_conf_pk on juegos_config 
(cost=0.00..12.19 rows=40 width=332) (actual time=0.42..6.73 rows=40 
loops=1) 
- Index Scan using idx_generar_vainilla_ci on cont_publicacion 
(cost=0.00..9.90 rows=2 width=35) (actual time=0.20..35.19 rows=447 
loops=40) 
Index Cond: (upper((generar_vainilla)::text) = 
'S'::text) 
- Sort (cost=9.73..9.74 rows=3 width=112) (actual 
time=10.42..10.48 rows=15 
loops=1) 
Sort Key: cont_contenido.id_instalacion, 
cont_contenido.id_contenido 
- Seq Scan on cont_contenido (cost=0.00..9.70 rows=3 width=112) 
(actual time=0.57..8.11 rows=8 
loops=1) 
Filter: ((id_instalacion = 2::numeric) AND (id_sbc = 619::numeric) AND (id_tipo 
= 
2::numeric)) 
- Subquery Scan a (cost=13.60..13.60 rows=1 width=8) (actual 
time=15.89..15.90 rows=1 
loops=16) 
- Aggregate (cost=13.60..13.60 rows=1 width=8) (actual 
time=15.87..15.88 rows=1 
loops=16) 
- Seq Scan on cont_publicacion cp1 (cost=0.00..12.48 rows=448 
width=8) (actual time=0.05..11.62 rows=448 loops=16)Total runtime: 
2250.92 msec(20 rows)
The 
problem was the subquery, no doubt.


  
  -Mensaje original-De: Volker Helm 
  [mailto:[EMAIL PROTECTED] Enviado el: lunes, 04 de agosto de 2003 
  11:45Para: Fernando PapaAsunto: AW: [PERFORM] I can't 
  wait too much: Total runtime 432478.44 msec
  Hi,
  
  just use the subquery as inline-View an join the 
  tables:
  
  SELECT 
  cont_contenido.id_contenido,cont_contenido.pertenece_premium,cont_contenido.Titulo_esp 
  as v_sufix 
  ,cont_contenido.url_contenido,cont_contenido.tipo_acceso,cont_contenido.id_sbc,cont_contenido.cant_vistos,cont_contenido.cant_votos 
  ,cont_contenido.puntaje_total 
  ,cont_contenido.id_contenido_padre 
  ,juegos_config.imagen_tapa_especial 
  ,juegos_config.info_general_esp as info_general 
  ,juegos_config.ayuda ,juegos_config.tips_tricks_esp as 
  tips_tricks ,juegos_config.mod_imagen_tapa_especial 
  ,cont_publicacion.fecha_publicacion as fecha_publicacion 
  ,cont_publicacion.generar_Vainilla FROM 
  cont_contenido 
  ,juegos_config ,cont_publicacion 
  
  ,(SELECT max(cp1.fecha_publicacion) as 
  max_pub 
  --change 
  here 
  FROM cont_publicacion cp1) a--change 
  hereWHERE 
  cont_contenido.id_instalacion 
  = 2AND 
  cont_contenido.id_sbc 
  = 619AND 
  cont_contenido.id_tipo 
  = 2AND cont_contenido.id_instalacion = 
  juegos_config.id_instalacion AND 
  cont_contenido.id_contenido = 
  juegos_config.id_contenido 
  AND upper(cont_publicacion.generar_Vainilla) = 'S'AND 
  cont_publicacion.id_instalacion = cont_contenido.id_instalacion AND 
  cont_publicacion.id_contenido = cont_contenido.id_contenido AND 
  cont_publicacion.fecha_publicacion 
  =a.max_pub 
  -- change hereORDER BY 
  cont_publicacion.fecha_publicacion desc
  
  hope it helps,
  
  Volker Helm
  
-Ursprüngliche Nachricht-Von: 
[EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED]Im Auftrag von 
Fernando PapaGesendet: Freitag, 1. August 2003 
23:17An: [EMAIL PROTECTED]Betreff: 
[PERFORM] I can't wait too much: Total runtime 432478.44 
msec

Hi 
all!
Really I don't 
know what happened with this query. I'm running PG 7.3.1 on solaris, 
vaccumed (full) every nigth.
The cardinality 
of each table was:

cont_contenido: 97 rows
juegos_config: 40 rows
cont_publicacion: 446 rows
not huge 
tables...

however, this 
query took a lot of time to run:Total runtime: 432478.44 
msecI made a explain analyze, but really I don't undertand 
why...

esdc= explain analyzeSELECT 
cont_contenido.id_contenido,cont_contenido.pertenece_premium,cont_contenido.Titulo_esp 
as v_sufix 
,cont_contenido.url_contenido,cont_contenido.tipo_acceso,cont_contenido.id_sbc,cont_contenido.cant_vistos,cont_contenido.cant_votos 
,cont_contenido.puntaje_total 
,cont_contenido.id_contenido_padre 

Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec

2003-08-04 Thread Fernando Papa
Err... you're right... one of us say the same thing when I show the
Volker mail...

-Mensaje original-
De: Manfred Koizar [mailto:[EMAIL PROTECTED] 
Enviado el: lunes, 04 de agosto de 2003 12:17
Para: Fernando Papa
CC: Volker Helm; [EMAIL PROTECTED]
Asunto: Re: [PERFORM] I can't wait too much: Total runtime 432478.44
msec


On Mon, 4 Aug 2003 12:02:46 -0300, Fernando Papa [EMAIL PROTECTED]
wrote:
FROM 
cont_contenido   
,juegos_config
,cont_publicacion 
,(SELECT max(cp1.fecha_publicacion) as max_pub
--change here
 FROM cont_publicacion cp1) a --change here

But this calculates the global maximum, not per id_instalacion,
id_contenido, and generar_vainilla as in

AND cont_publicacion.fecha_publicacion = (SELECT
max(cp1.fecha_publicacion) 
 FROM cont_publicacion cp1  
 WHERE cp1.id_instalacion =
cont_publicacion.id_instalacion 
   AND cp1.id_contenido =
cont_publicacion.id_contenido  
   AND cp1.generar_vainilla =
cont_publicacion.generar_vainilla) 

Servus
 Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] EXTERNAL storage and substring on long strings

2003-08-04 Thread Scott Cain
Hello,

Note: there is a SQL question way at the bottom of this narrative :-)

Last week I asked about doing substring operations on very long strings
(10 million characters).  I was given a suggestion to use EXTERNAL
storage on the column via the ALTER TABLE ... SET STORAGE command.  In
one test case, the performance of substring actually got worse using
EXTERNAL storage.  

In an effort to find the best way to do this operation, I decided to
look at what is my worst case scenario: the DNA sequence for human
chromosome 1, which is about 250 million characters long (previous
strings where about 20 million characters long).  I wrote a perl script
to do several substring operations over this very long string, with
substring lengths varying between 1000 and 40,000 characters spread out
over various locations along the string.  While EXTENDED storage won in
this case, it was a hollow victory: 38 seconds per operation versus 40
seconds, both of which are way too long to for an interactive
application.

Time for a new method.  A suggestion from my boss was to shred the DNA
into smallish chunks and a column giving offsets from the beginning of
the string, so that it can be reassembled when needed. Here is the test
table:

string= \d dna
  Table public.dna
 Column  |  Type   | Modifiers
-+-+---
 foffset | integer |
 pdna| text|
Indexes: foffset_idx btree (foffset)

In practice, there would also be a foreign key column to give the
identifier of the dna.  Then I wrote the following function (here's the
SQL part promised above):

CREATE OR REPLACE FUNCTION dna_string (integer, integer) RETURNS TEXT AS '
DECLARE
smin ALIAS FOR $1;
smax ALIAS FOR $2;
longdna TEXT := ;
dna_row dna%ROWTYPE;
dnastring   TEXT;
firstchunk  INTEGER;
lastchunk   INTEGER;
in_longdnastart INTEGER;
in_longdnalen   INTEGER;
chunksize   INTEGER;
BEGIN
SELECT INTO chunksize min(foffset) FROM dna WHERE foffset0;
firstchunk :=  chunksize*(smin/chunksize);
lastchunk  :=  chunksize*(smax/chunksize);

in_longdnastart := smin % chunksize;
in_longdnalen   := smax - smin + 1;

FOR dna_row IN
SELECT * FROM dna
WHERE foffset = firstchunk AND foffset = lastchunk
ORDER BY foffset
LOOP

longdna := longdna || dna_row.pdna;
END LOOP;

dnastring := substring(longdna FROM in_longdnastart FOR in_longdnalen);

RETURN dnastring;
END;
' LANGUAGE 'plpgsql';

So here's the question: I've never written a plpgsql function before, so
I don't have much experience with it; is there anything obviously wrong
with this function, or are there things that could be done better?  At
least this appears to work and is much faster, completing substring
operations like above in about 0.27 secs (that's about two orders of
magnitude improvement!)

Thanks,
Scott


-- 

Scott Cain, Ph. D. [EMAIL PROTECTED]
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] [PERFORM] EXTERNAL storage and substring on long strings

2003-08-04 Thread Tom Lane
Scott Cain [EMAIL PROTECTED] writes:
 At least this appears to work and is much faster, completing substring
 operations like above in about 0.27 secs (that's about two orders of
 magnitude improvement!)

I find it really, really hard to believe that a crude reimplementation
in plpgsql of the TOAST concept could beat the built-in implementation
at all, let alone beat it by two orders of magnitude.

Either there's something unrealistic about your testing of the
dna_string function, or your original tests are not causing TOAST to be
invoked in the expected way, or there's a bug we need to fix.  I'd
really like to see some profiling of the poor-performing
external-storage case, so we can figure out what's going on.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [osdldbt-general] Re: [PERFORM] OSDL Database Test Suite 3 is

2003-08-04 Thread scott.marlowe
On 4 Aug 2003, Jenny Zhang wrote:

 On Mon, 2003-08-04 at 06:33, Manfred Koizar wrote:
  | effective_cache_size   | 1000
  
  With 4GB of memory this is definitely too low and *can* (note that I
  don't say *must*) lead the planner to wrong decisions.
  
 I changed the default to effective_cache_size=393216 as calculated by
 Scott.  Another way to check the execution plan is to go to the results
 dir:
 http://khack.osdl.org/stp/276917/results
 There is a 'power_plan.out' file to record the execution plan.  I am
 running a test with the changed effective_cache_size, I will see how it
 affect the plan.
 
  | shared_buffers | 15200
  
  ... looks reasonable.  Did you test with other values?
 I have only one with shared_buffers=120 at:
 http://khack.osdl.org/stp/276847/
 The performance degraded.  

Well, that's truly huge, even for a machine with lots-o-ram.  Most tests 
find that once the shared_buffers are big enough to use more than about 25 
to 33% of RAM, they're too big, as you get little return.

  | sort_mem   | 524288
  
  This is a bit high, IMHO, but might be ok given that DBT3 is not run
  with many concurrent sessions (right?).
  http://khack.osdl.org/stp/276912/results/plot/vmstat_swap.png shows
  some swapping activity towards the end of the run which could be
  caused by a too high sort_mem setting.
 Right, I run only 4 streams.  Setting this parameter lower caused more
 reading/writing to the pgsql/tmp.  I guess the database has to do it if
 it can not do sorting in memory. 

Note that IF your sortmem really is 1/2 gig, then you'll likely have LESS 
than 3 gigs left for OS system cache.  About how big does top show buff 
and cached to be on that box under load?  Not that it's a big deal if you 
get the effective cache size off by a little bit, it's more of a rubber 
mallet setting than a jeweler's screw driver setting.


Thanks a bunch for all the great testing.  It's a very nice tool to have 
for convincing the bosses to go with Postgresql.


---(end of broadcast)---
TIP 8: explain analyze is your friend