Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Steinar H. Gunderson
On Wed, Sep 07, 2005 at 12:22:27PM -0400, Christian Compagnon wrote:
 I'm a newbie in postgresql, I've installed it on a Windows XP machine
 ( I can't use linux, it's a company machine ), I'm courious why this
 query takes so long

It sounds like you've set work_mem too low; increasing it might help. Also
try rewriting your query to

  SELECT Rut Cliente
  FROM Internet_Abril
  WHERE
Rut Cliente NOT IN ( SELECT Rut Cliente FROM Internet_Enero )
AND Rut Cliente NOT IN ( SELECT Rut Cliente FROM Internet_Febrero )
AND Rut Cliente NOT IN ( SELECT Rut Cliente FROM Internet_Marzo )

(I'm not sure how optimized UNION inside an IN/NOT IN is.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Meetesh Karia
PG is creating the union of January, February and March tables first
and that doesn't have an index on it. If you're going to do many
queries using the union of those three tables, you might want to place
their contents into one table and create an index on it.

Otherwise, try something like this:

SELECT Rut Cliente
FROM Internet_Abril
WHERE Rut Cliente NOT IN (SELECT Rut Cliente FROM
Internet_Enero)
AND Rut Cliente NOT IN (SELECT Rut Cliente FROM
Internet_Febrero)
AND Rut Cliente NOT IN (SELECT Rut Cliente FROM
Internet_Marzo);

You could also compare the performance of that to this and choose the one that works the best:

SELECT Rut Cliente

FROM Internet_Abril a
LEFT JOIN Internet_Enero e ON a.Rut Cliente = e.Rut Cliente
LEFT JOIN Internet_Febrero f ON a.Rut Cliente = f.Rut Cliente
LEFT JOIN Internet_Marzo m ON a.Rut Cliente = m.Rut Cliente
WHERE e.Rut Cliente IS NULL AND f.Rut Cliente IS NULL and m.Rut Cliente IS NULL;

MeeteshOn 9/7/05, Christian Compagnon [EMAIL PROTECTED] wrote:
Hello,I'm a newbie in postgresql, I've installed it on a Windows XP machine( I can't use linux, it's a company machine ), I'm courious why thisquery takes so longSELECT Rut ClienteFROM Internet_Abril
WHERE Rut ClienteNOT IN ((SELECT Rut ClienteFROMInternet_Enero) UNION (SELECT Rut ClienteFROMInternet_Febrero) UNION (SELECT Rut ClienteFROM
Internet_Marzo));it takes about 100 minutes to complete the query.All tables has index created ( Rut Cliente is a VarChar ), and tableshas 50.000 records each.The explain for the query tells the following
QUERY PLAN Seq Scan on Internet_Abril(cost=19406.67..62126112.70 rows=24731 width=13)Filter: (NOT (subplan)) SubPlan -Materialize(cost=19406.67..21576.07 rows=136740 width=13)
-Unique(cost=17784.23..18467.93
rows=136740 width=13)-Sort(cost=17784.23..18126.08 rows=136740 width=13)
SortKey: Rut Cliente-Append(cost=0.00..3741.80
rows=136740 width=13)-Subquery
Scan *SELECT* 1(cost=0.00..1233.38rows=45069 width=13)
-Seq Scan on
Internet_Enero(cost=0.00..782.69rows=45069 width=13)
-Subquery Scan *SELECT*
2(cost=0.00..1104.06rows=40353 width=13)
-Seq Scan on
Internet_Febrero(cost=0.00..700.53rows=40353 width=13)
-Subquery Scan *SELECT*
3(cost=0.00..1404.36rows=51318 width=13)-Seq
Scan on Internet_Marzo(cost=0.00..891.18rows=51318 width=13)Any help will be apreciated, It's for my thesissaludosChristian---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly



Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Alex Hayward
On Wed, 7 Sep 2005, Meetesh Karia wrote:

 PG is creating the union of January, February and March tables first and
 that doesn't have an index on it. If you're going to do many queries using
 the union of those three tables, you might want to place their contents into
 one table and create an index on it.

 Otherwise, try something like this:

 SELECT Rut Cliente
 FROM Internet_Abril
 WHERE Rut Cliente NOT IN (SELECT Rut Cliente FROM
 Internet_Enero)
 AND Rut Cliente NOT IN (SELECT Rut Cliente FROM
 Internet_Febrero)
 AND Rut Cliente NOT IN (SELECT Rut Cliente FROM
 Internet_Marzo);

You may also wish to try:

SELECT Rut Cliente
FROM Internet_Abril
WHERE NOT EXISTS
  (SELECT 1 FROM Internet_Enero
   WHERE Internet_Enero.Rut Cliente=Internet_Abril.Rut Cliente)
AND NOT EXISTS
  (SELECT 1 FROM Internet_Febrero
   WHERE Internet_Febrero.Rut Cliente=Internet_Abril.Rut Cliente)
AND NOT EXISTS
  (SELECT 1 FROM Internet_Marzo
   WHERE Internet_Marzo.Rut Cliente=Internet_Abril.Rut Cliente)

which will probably scan the indexes on the January, February and March
indexes once for each row in the April table.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Tom Lane
Steinar H. Gunderson [EMAIL PROTECTED] writes:
 (I'm not sure how optimized UNION inside an IN/NOT IN is.)

NOT IN is pretty nonoptimal, period.  It'd help a lot to boost work_mem
to the point where the planner figures it can use a hashtable (look for
EXPLAIN to say hashed subplan rather than just subplan).  Of course,
if there's enough stuff in the UNION that that drives you into swapping,
it's gonna be painful anyway.

Using UNION ALL instead of UNION might save a few cycles too.

If you're willing to rewrite the query wholesale, you could try the old
trick of a LEFT JOIN where you discard rows for which there's a match,
ie, the righthand join value isn't NULL.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend