Reuven--

a few years back this same scenario happened for me working on a project with a 
large Oracle DB which held clinical data 
We needed to implement a join condition but the column we were using wa not 
using an Index scan 
so since the cardinality of the results was extraordinary we re-implemented the 
column to use EXISTS (for large result sets)
*following the omnipresent example used by Oracle books where IN is used for 
columns with low cardinality *
and following the docs from EXISTS where the EXISTS clause ALWAYS has to return 
something 

In our case we were joining on a doctorid with IN (which made no sense as there 
were millions of PatientIDs)  to find all patients whose doctors
were in PPO's so to increase performance we changed the IN clause for the 
column with high cardinality (doctorid) to EXISTS

select p.PATIENT_NAME from PATIENT p where p.doctorid in (select doctorid from 
ppo_table)

/******there are many doctorids in this ppo table so we will change to 
EXISTS********/

select p.PATIENT_NAME from PATIENT p
    where exists (select 0 from  ppo_table ppo where p.doctorID = 
ppo.doctorID); 

Shalom/
Martin--
--------------------------------------------------------------------------- 
This e-mail message (including attachments, if any) is intended for the use of 
the individual or entity to which it is addressed and may contain information 
that is privileged, proprietary , confidential and exempt from disclosure. If 
you are not the intended recipient, you are notified that any dissemination, 
distribution or copying of this communication is strictly prohibited.
--------------------------------------------------------------------------- 
Le présent message électronique (y compris les pièces qui y sont annexées, le 
cas échéant) s'adresse au destinataire indiqué et peut contenir des 
renseignements de caractère privé ou confidentiel. Si vous n'êtes pas le 
destinataire de ce document, nous vous signalons qu'il est strictement interdit 
de le diffuser, de le distribuer ou de le reproduire.
----- Original Message ----- 
From: "Alvaro Herrera" <[EMAIL PROTECTED]>
To: "Reuven M. Lerner" <[EMAIL PROTECTED]>
Cc: "Webb Sprague" <[EMAIL PROTECTED]>; <pgsql-general@postgresql.org>
Sent: Tuesday, March 06, 2007 4:19 PM
Subject: Re: [GENERAL] Database slowness -- my design, hardware, or both?


> Reuven M. Lerner escribió:
>> Hi, Webb Sprague.  You wrote:
>> >... but I see two seq scans in your explain in a loop -- this is
>> >probably not good.  If you can find a way to rewrite the IN clause
>> >(either de-normalizing through triggers to save whatever you need on
>> >an insert and not have to deal with a set, or by using except in the
>> >query, or someing else more creative)...
>> I would normally agree that an IN clause is a place to worry -- except 
>> that I'm using IN on a very small table, with about 4-5 rows.  That 
>> might indeed affect things, and I could certainly pull out these values 
>> into a Perl variable that I then interpolate literally into the SQL.  
>> However, I have to assume that this wouldn't affect things all that much.
> 
> Don't assume -- measure.  I had a query which ran orders of magnitude
> faster because I interpolated the constant list in the big query.  The
> table from which the interpolated values were being extracted had about
> 30 rows or so.
> 
> -- 
> Alvaro Herrera                                http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org/

Reply via email to