Re: [PERFORM] query performance question

2006-04-30 Thread Dave Dutcher



You are pulling a fair amount of data from the database and doing a
lot of computation in the SQL.  I'm not sure how fast this query could
be expected to run, but I had one idea.  If you've inserted and
deleted a lot into this table, you will need to run vacuum
occasionally.  If you haven't been doing that, I would try a VACUUM
FULL ANALYZE on the table.  (That will take a lock on the table and
prevent clients from reading data while it is running.)
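
For reference, a minimal sketch of that, using the table name from the
query quoted below (run it off-hours, since it blocks other access for
the duration):

    -- Reclaims dead-row space and refreshes planner statistics.
    -- Takes an exclusive lock on the table while it runs.
    VACUUM FULL ANALYZE vectordb;

A plain VACUUM ANALYZE does the same bookkeeping without the exclusive
lock, though it only marks dead space as reusable rather than returning
it to the operating system.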
 
 

  
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of gulsah
Sent: Friday, April 28, 2006 6:31 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] query performance question

Hi,

I have a performance problem with PostgreSQL version 8.1 installed on
Fedora Core release 4 (Stentz) with kernel version 2.6.11. The machine
I am working on has 512MB of RAM and a Pentium III 800 MHz CPU.

I have only one table in the database, which consists of 256 columns
and 1 rows. Each column is of float type and each row corresponds to a
vector in my application. What I want to do is to compute the distance
between a predefined vector in hand and the ones in the database.

The computation proceeds according to the following pseudocode:

    for (i = 1; i <= 256; i++) {
        distance += abs(x1_i - x2_i);
    }

where x1_i denotes the i'th coordinate of the vector in hand and x2_i
denotes the i'th coordinate of the vector in the database.

The distance computation has to be done for all the vectors in the
database by means of a query, and the result set should be sorted in
terms of the computed distances.

When I implement the query and measure the time spent for it in an
application, I see that the query takes more than 8 seconds, which is
undesirable in my application.

What I want to ask you all is: is this normal performance for a
computer with the properties mentioned above? Is there any solution in
your mind to increase the performance of my query?

To make it more understandable, I give the query here for vectors of
size 3, but in my case their size is 256:

    select
        id as vectorid,
        abs(40.9546-x2_1)+abs(-72.9964-x2_2)+abs(53.5348-x2_3) as distance
    from vectordb
    order by distance

Thank you all for your help.

-gulsah
  
  


Re: [PERFORM] Why so slow?

2006-04-30 Thread Bill Moran
"Bealach-na Bo" <[EMAIL PROTECTED]> wrote:

> >If you don't need access to the old data constantly:
> >
> >  - copy the live data to a new table
> >  - TRUNCATE the old table (which needs an exclusive lock but is very fast)
> >  - insert the data back in
> >  - for an event log I would imagine this could work
> 
> Obtaining exclusive locks on this table is very difficult, or rather,
> will make life very difficult for others, so I'm averse to running
> vacuum full or truncate (though I don't know how fast truncate is)
> on a regular basis. I might just get away with running it
> once a month, but no more.
> 
> (Lazy) vacuum, however, is a much more palatable option. But (lazy)
> vacuum does not always reclaim space. Will this affect performance and
> does this mean that a full vacuum is unavoidable? Or can I get away
> with daily (lazy) vacuums?  Disk space is not an issue for me, but
> performance is a BIG issue. Of course, I realize that I could improve
> the latter with better schema design - I'm working on a new schema,
> but can't kill this one yet :|.

My understanding is basically that if you vacuum with the correct
frequency, you'll never need to vacuum full.  This is why the
autovacuum system is so nice: it adjusts the frequency of vacuum
according to how much use the DB is getting.

The problem is that if you get behind, plain vacuum is unable to get things
caught up again, and a vacuum full is required to recover disk space.

At this point, it seems like you need to do 2 things:
1) Schedule lazy vacuum to run, or configure autovacuum.
2) Schedule some downtime to run "vacuum full" to recover some disk space.

#2 only needs to be done once to get you back on track, assuming that
#1 is done properly.
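
For #1 on 8.1, autovacuum can be switched on in postgresql.conf (it
depends on the statistics collector being enabled):

    # postgresql.conf -- settings for 8.1's integrated autovacuum
    stats_start_collector = on
    stats_row_level = on
    autovacuum = on

Alternatively, a nightly lazy vacuum can be driven from cron with the
vacuumdb utility (the database name and schedule here are just
placeholders):

    # crontab entry: lazy VACUUM ANALYZE of a hypothetical "eventdb" at 3am
    0 3 * * *  vacuumdb --analyze --dbname eventdb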

A little bit of wasted space in the database is OK, and lazy vacuum done
on a reasonable schedule will keep the level of wasted space to an
acceptable level.

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com



Re: [PERFORM] Performance Issues on Opteron Dual Core

2006-04-30 Thread Mark Kirkwood

Gregory Stewart wrote:

Hello,

We are currently developing a web application and have the webserver and 
PostgreSQL with our dev db running on a machine with these specs:


Win 2003 standard
AMD Athlon XP 3000 / 2.1 GHZ
2 Gig ram
120 gig SATA HD
PostgreSQL 8.1.0
Default pgsql configuration + shared buffers = 30,000

The performance of postgresql and our web application is good on that 
machine, but we decided to build a dedicated database server for our 
production database that scales better and that we can also use for internal 
applications (CRM and so on).


To make a long story short, we built a machine with these specs:

Windows 2003 Standard
AMD Opteron 165 Dual Core / running at 2 GHZ
2 gig ram
2 x 150 Gig SATA II HDs in RAID 1 mode (mirror)
PostgreSQL 8.1.3
Default pgsql configuration + shared buffers = 30,000

Performance tests in Windows show that the new box outperforms our dev
machine quite a bit in CPU, HD and memory performance.


I did some EXPLAIN ANALYZE tests on queries and the results were very good, 
3 to 4 times faster than our dev db.


However one thing is really throwing me off.
When I open a table with 320,000 rows / 16 fields in the pgadmin tool (v 
1.4.0) it takes about 6 seconds on the dev server to display the result (all 
rows). During these 6 seconds the CPU usage jumps to 90%-100%.


When I open the same table on the new, faster, better production box, it 
takes 28 seconds!?! During these 28 seconds the CPU usage jumps to 30% for 1 
second, and goes back to 0% for the remaining time while it is running the 
query.


What is going wrong here? It is my understanding that PostgreSQL
supports multi-core / multi-CPU environments out of the box, but to me
it appears that it isn't utilizing either of the 2 CPUs available. I
doubt that my server is so fast that it can perform this operation
while the CPU sits idle.


I played around with the shared buffers and tried out versions 8.1.3, 8.1.2, 
8.1.0 with the same result.


Has anyone experienced this kind of behaviour before?
How representative is the query performance in pgadmin?



Pgadmin can give misleading times for queries that return large result
sets over a network, for two reasons:

1/ It takes time to format the (large) result set for display.
2/ It has to count the time spent waiting for the (large) result set
to travel across the network.


You aren't running Pgadmin off the dev server, are you? If not, check
your network links to dev and prod - is one faster than the other?


To eliminate Pgadmin and the network as factors, try wrapping your
query in a 'SELECT count(*) FROM (your query here) AS a', and see if
that changes anything!
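
For example (using a placeholder table name for the 320,000-row table
you mentioned):

    -- The server still produces all the rows but ships only the count,
    -- so network transfer and client-side rendering drop out of the timing.
    SELECT count(*) FROM (SELECT * FROM mytable) AS a;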


Cheers

Mark



Re: [PERFORM] Why so slow?

2006-04-30 Thread Bealach-na Bo

>If you don't need access to the old data constantly:
>
> - copy the live data to a new table
> - TRUNCATE the old table (which needs an exclusive lock but is very fast)
> - insert the data back in
> - for an event log I would imagine this could work
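
For concreteness, the rotation suggested above might look like the
following sketch - table and column names are placeholders, and it
assumes a brief exclusive lock during the swap is tolerable:

    BEGIN;
    -- Keep only the rows still considered live; the cutoff is illustrative.
    CREATE TABLE eventlog_live AS
        SELECT * FROM eventlog WHERE logged_at > now() - interval '7 days';
    TRUNCATE eventlog;  -- needs an exclusive lock, but completes quickly
    INSERT INTO eventlog SELECT * FROM eventlog_live;
    DROP TABLE eventlog_live;
    COMMIT;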


Obtaining exclusive locks on this table is very difficult, or rather,
will make life very difficult for others, so I'm averse to running
vacuum full or truncate (though I don't know how fast truncate is)
on a regular basis. I might just get away with running it
once a month, but no more.

(Lazy) vacuum, however, is a much more palatable option. But (lazy)
vacuum does not always reclaim space. Will this affect performance and
does this mean that a full vacuum is unavoidable? Or can I get away
with daily (lazy) vacuums?  Disk space is not an issue for me, but
performance is a BIG issue. Of course, I realize that I could improve
the latter with better schema design - I'm working on a new schema,
but can't kill this one yet :|.

Regards,

Bealach


