Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted?
Hi, @all, Greg Stark schrieb: Leeuw van der, Tim [EMAIL PROTECTED] writes: I don't think EXPLAIN ANALYZE puts that much overhead on a query. EXPLAIN ANALYZE does indeed impose a significant overhead. Additional note: In some rare cases, you can experience just the opposite effect, explain analyze can be quicker then the actual query. This is the case for rather expensive send/output functions, like the PostGIS ones: lwgeom=# \timing Zeitmessung ist an. lwgeom=# explain analyze select setsrid(geom,4326) from adminbndy1; QUERY PLAN --- Seq Scan on adminbndy1 (cost=0.00..4.04 rows=83 width=89) (actual time=11.793..2170.184 rows=83 loops=1) Total runtime: 2170.834 ms (2 Zeilen) Zeit: 2171,688 ms lwgeom=# \o /dev/null lwgeom=# select setsrid(geom,4326) from adminbndy1; Zeit: 9681,001 ms BTW: I use the cheap setsrid(geom,4326) to force deTOASTing of the geometry column. Not using it seems to ignore TOASTed columns in sequential scan simulation.) lwgeom=# explain analyze select geom from adminbndy1; QUERY PLAN --- Seq Scan on adminbndy1 (cost=0.00..3.83 rows=83 width=89) (actual time=0.089..0.499 rows=83 loops=1) Total runtime: 0.820 ms (2 Zeilen) Markus -- markus schaber | dipl. informatiker logi-track ag | rennweg 14-16 | ch 8001 zürich phone +41-43-888 62 52 | fax +41-43-888 62 53 mailto:[EMAIL PROTECTED] | www.logi-track.com signature.asc Description: OpenPGP digital signature
Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted?
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven Rosenstein Sent: Sunday, February 06, 2005 8:51 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted? While working on a previous question I posed to this group, I ran a number of EXPLAIN ANALYZE's to provide as examples. After sending up my last email, I ran the same query *without* EXPLAIN ANALYZE. The runtimes were vastly different. In the following example, I ran two identical queries one right after the other. The runtimes for both was very close (44.77 sec). I then immediately ran the exact same query, but without EXPLAIN ANALYZE. The same number of rows was returned, but the runtime was only 8.7 sec. I don't think EXPLAIN ANALYZE puts that much overhead on a query. Does anyone have any idea what is going on here? --- Steve Caching by the OS? (Did you try to *first* run the query w/o EXPLAIN ANALYZE, and then with? What's the timing if you do that?) --Tim ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted?
Leeuw van der, Tim [EMAIL PROTECTED] writes: I don't think EXPLAIN ANALYZE puts that much overhead on a query. EXPLAIN ANALYZE does indeed impose a significant overhead. What percentage of the time is overhead depends heavily on how much i/o the query is doing. For queries that are primarily cpu bound because they're processing data from the cache it can be substantial. If all the data is in the shared buffers then the gettimeofday calls for explain analyze can be just about the only syscalls being executed and they're executed a lot. It would be interesting to try to subtract out the profiling overhead from the data like most profilers do. But it's not an easy thing to do since the times are nested. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted?
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven Rosenstein I don't think EXPLAIN ANALYZE puts that much overhead on a query. I think you're being overly optimistic. The explain shows that the Materialize subnode is being entered upwards of 32 million times: - Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) 43 * 752066 = 32338838. The instrumentation overhead is basically two gettimeofday() kernel calls per node entry. Doing the math shows that your machine is able to do gettimeofday() in about half a microsecond, which isn't stellar but it's not all that slow for a kernel call. (What's the platform here, anyway?) Nonetheless it's a couple of times larger than the actual time needed to pull a row from a materialized array ... The real answer to your question is IN (subselect) sucks before PG 7.4; get a newer release. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted?
You're probably right about my being overly optimistic about the load imposed by EXPLAIN ANALYZE. It was just that in my previous experience with it, I'd never seen such a large runtime discrepancy before. I even allowed for a caching effect by making sure the server was all but quiescent, and then running the three queries as quickly after one another as I could. The server itself is an IBM x345 with dual Xeon 3ghz CPU's (hyperthreading turned off) and 2.5gb of RAM. O/S is RHEL3 Update 4. Disks are a ServeRAID of some flavor, I'm not sure what. Thanks for the heads-up about the performance of IN in 7.3. We're looking to migrate to 8.0 or 8.0.1 when they become GA, but some of our databases are in excess of 200gb-300gb, and we need to make sure we have a good migration plan in place (space to store the dump out of the 7.3 db) before we start. ___ Steven Rosenstein IT Architect/Developer | IBM Virtual Server Administration Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001 Text Messaging: 6463456978 @ mobile.mycingular.com Email: srosenst @ us.ibm.com Learn from the mistakes of others because you can't live long enough to make them all yourself. -- Eleanor Roosevelt Tom Lane [EMAIL PROTECTED] s To Steven Rosenstein/New 02/06/2005 05:46 York/[EMAIL PROTECTED] PM cc pgsql-performance@postgresql.org Subject Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted? From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven Rosenstein I don't think EXPLAIN ANALYZE puts that much overhead on a query. I think you're being overly optimistic. The explain shows that the Materialize subnode is being entered upwards of 32 million times: - Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) 43 * 752066 = 32338838. The instrumentation overhead is basically two gettimeofday() kernel calls per node entry. Doing the math shows that your machine is able to do gettimeofday() in about half a microsecond, which isn't stellar but it's not all that slow for a kernel call. (What's the platform here, anyway?) Nonetheless it's a couple of times larger than the actual time needed to pull a row from a materialized array ... The real answer to your question is IN (subselect) sucks before PG 7.4; get a newer release. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org