You're probably right about my being overly optimistic about the load
imposed by EXPLAIN ANALYZE. It was just that in my previous experience
with it, I'd never seen such a large runtime discrepancy before. I even
allowed for a "caching effect" by making sure the server was all but
quiescent, and then running the three queries as quickly after one another
as I could.
The server itself is an IBM x345 with dual Xeon 3ghz CPU's (hyperthreading
turned off) and 2.5gb of RAM. O/S is RHEL3 Update 4. Disks are a
ServeRAID of some flavor, I'm not sure what.
Thanks for the heads-up about the performance of IN in 7.3. We're looking
to migrate to 8.0 or 8.0.1 when they become GA, but some of our databases
are in excess of 200gb-300gb, and we need to make sure we have a good
migration plan in place (space to store the dump out of the 7.3 db) before
we start.
___________________________________________________________________________________
Steven Rosenstein
IT Architect/Developer | IBM Virtual Server Administration
Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001
Text Messaging: 6463456978 @ mobile.mycingular.com
Email: srosenst @ us.ibm.com
"Learn from the mistakes of others because you can't live long enough to
make them all yourself." -- Eleanor Roosevelt
Tom Lane
<[EMAIL PROTECTED]
s> To
Steven Rosenstein/New
02/06/2005 05:46 York/[EMAIL PROTECTED]
PM cc
pgsql-performance@postgresql.org
Subject
Re: [PERFORM] Can the V7.3 EXPLAIN
ANALYZE be trusted?
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Steven
Rosenstein
> >> I don't think EXPLAIN ANALYZE puts that much overhead on a query.
I think you're being overly optimistic. The explain shows that the
Materialize subnode is being entered upwards of 32 million times:
-> Materialize (cost=505.06..511.38 rows=632 width=4) (actual
time=0.00..0.02 rows=43 loops=752066)
43 * 752066 = 32338838. The instrumentation overhead is basically two
gettimeofday() kernel calls per node entry. Doing the math shows that
your machine is able to do gettimeofday() in about half a microsecond,
which isn't stellar but it's not all that slow for a kernel call.
(What's the platform here, anyway?) Nonetheless it's a couple of times
larger than the actual time needed to pull a row from a materialized
array ...
The real answer to your question is "IN (subselect) sucks before PG 7.4;
get a newer release".
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org