Gaetano Mendola wrote:

Martin Foster wrote:

Gaetano Mendola wrote:

Martin Foster wrote:

I run a Perl/CGI driven website that makes extensive use of PostgreSQL (7.4.3) for everything from user information to formatting and display of specific sections of the site. The server itself, is a dual processor AMD Opteron 1.4Ghz w/ 2GB Ram and 2 x 120GB hard drives mirrored for redundancy running under FreeBSD 5.2.1 (AMD64).

Recently loads on the site have increased during peak hours to the point of showing considerable loss in performance. This can be observed when connections move from the 120 concurrent connections to PostgreSQL to roughly 175 or more. Essentially, the machine seems to struggle to keep up with continual requests and slows down respectively as resources are tied down.

Code changes have been made to the scripts to essentially back off in high load working environments which have worked to an extent. However, as loads continue to increase the database itself is not taking well to the increased traffic taking place.

I have taken a look at 'Tuning PostgreSQL for Performance' and used it as best I could to set my settings. However, even with statistics disabled and every setting tweaked, things still continue to deteriorate.

Is there anything anyone can recommend in order to give the system a necessary speed boost? It would seem to me that a modest dataset of roughly a gig combined with that type of hardware should be able to handle substantially more load than it is. Can anyone provide me with clues as to where to pursue this? Would disabling 'fsync' provide more performance if I accept that information may be lost in case of a crash?

If anyone needs access to logs, settings et cetera. Please ask, I simply wish to test the waters first on what is needed. Thanks!

Tell us about your typical queries and show us your configuration file.
Is the access read-only, or do you have concurrent writers
and readers? During peak hours are your processors tied up at 100%?
What do vmstat and iostat say?

Maybe you are not using indexes somewhere, or maybe you are but the
planner is not using them... In two words, we need more information
in order to help you.
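For reference, the numbers being asked for can be captured during peak load with something like the following; the flags shown are the FreeBSD spellings (Linux iostat/top differ), and the sample intervals are only a suggestion:

```shell
# Sample system activity every 5 seconds during peak hours.
vmstat 5 12          # memory, paging and CPU summary, 12 samples
iostat -w 5 -c 12    # per-disk throughput, 5 s wait, 12 counts
top -S -s 5          # include system processes, 5 s refresh
```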

Gaetano Mendola

I included all the files as attachments, which will hopefully cut down on follow-up emails. As for things like connection pooling, the web server makes use of Apache::DBI to pool the connections for the Perl scripts being driven on that server. For the sake of being thorough, a quick 'apachectl status' was thrown in while the database was under a good load.

Let's start from your postgres configuration:

shared_buffers = 8192 <==== This is really too small for your configuration
sort_mem = 2048

wal_buffers = 128    <==== This is really too small for your configuration

effective_cache_size = 16000

Change these values to:

shared_buffers = 50000
sort_mem = 16084

wal_buffers = 1500

effective_cache_size = 32000

To bump up the SHM usage you have to configure your OS to allow that amount of shared memory.
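On FreeBSD the SysV shared memory limits are runtime sysctls. A minimal sketch, assuming shared_buffers = 50000 (50000 x 8 kB pages, roughly 400 MB); the exact ceilings below are illustrative, not values taken from this thread:

```shell
# /etc/sysctl.conf -- raise kernel SHM limits so PostgreSQL can
# allocate ~400 MB of shared buffers (50000 x 8 kB pages).
kern.ipc.shmmax=536870912    # max size of one segment, bytes (512 MB)
kern.ipc.shmall=131072       # max total SHM, in 4 kB pages (512 MB)
kern.ipc.semmns=512          # semaphores, for a high max_connections
```

After editing the file, `sysctl -f /etc/sysctl.conf` applies the values without a reboot.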

These are the numbers that I feel are good for your HW; the second step now is
to analyze your queries

The queries themselves are simple, normally drawing information from one table with a few conditions or, in the most complex cases, using joins on two tables or subqueries. These have always behaved very well; the problem is that these queries take place in rather large volumes due to the dumb nature of the scripts themselves.

Show us the EXPLAIN ANALYZE output for those queries and how many rows the tables
contain; the table schema would also be useful.
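For reference, that output comes from prefixing a query with EXPLAIN ANALYZE in psql. The query below is a hypothetical example against the Post table, with guessed column names, not one of the actual slow queries:

```sql
-- Hypothetical example; substitute one of the real queries.
-- Column names (PuppetId, PostTimestamp) are assumptions.
EXPLAIN ANALYZE
SELECT p.*
  FROM Post p
  JOIN Puppet pu ON pu.PuppetId = p.PuppetId
 WHERE p.PostTimestamp > now() - interval '10 minutes';
```

The plan shows, per node, estimated versus actual rows and timings, which is what reveals whether an index is being used or a sequential scan is eating the time.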

Gaetano Mendola

I will look into moving up those values and seeing how they interact with the system once I get back from work. Since it was requested, I have a visual representation of an older schema, one that was used under MySQL. Note that all of the timestamps are now properly set to LOCALTIME on PostgreSQL.

The amount of rows for tables of note are as follows:
 Puppeteer 1606
 Puppet    33176
 Realm     83
 Post      36156
 Audit     61961

The post table is continually cleared of old information, since the information is very time-critical and archiving would only hinder performance. As a result, its size will vary wildly based on time of day, since users (Puppeteers) tend to post more during peak hours.

NOTE:   The scripts make use of different schemas with the same
        structure in order to virtualize the script and
        support more than one site on the same hardware.
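For reference, this style of schema-based virtualization is typically wired up by giving each site its own schema and selecting it via search_path; a minimal sketch, with hypothetical schema names (the thread says the real selection happens in a stored procedure):

```sql
-- Each site gets its own copy of the tables in a dedicated schema.
CREATE SCHEMA realm_one;
CREATE SCHEMA realm_two;

-- The script picks the site's schema per connection; unqualified
-- table names (Post, Puppet, ...) then resolve inside that schema.
SET search_path TO realm_one, public;
```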

On a side note, this would be a normal post-authentication session once in realm for getting new posts:
* Script is executed and schema is determined through stored procedure;
* Formatting information is fetched from Tag and RealmDesign as needed;
* Script will retrieve stored parameters in the Param table;
* Script will decode, analyze and authenticate against Puppeteer;
* Script will scan the Puppet and Post tables to generate posts;
* Sub-query to determine ignored puppeteers/users;
* Sub-query to determine ignored puppets/handles; and
* Loop above if necessary until expiry of script delaying
the execution of the script from 5 to 25 seconds.
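The polling step above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Perl code; the function names, the linear back-off, and the overall lifetime are assumptions, with only the 5-25 second delay bounds taken from the list above:

```python
import time

# Sketch of the polling loop: scan for new posts, and if nothing turned
# up, sleep a bounded interval before retrying, until the script's
# overall lifetime expires.  fetch_new_posts, SCRIPT_LIFETIME and the
# back-off shape are assumptions, not the Ethereal Realms code.

MIN_DELAY, MAX_DELAY = 5, 25   # seconds between polls (per the list above)
SCRIPT_LIFETIME = 300          # assumed overall script lifetime, seconds

def poll_delay(attempt, min_delay=MIN_DELAY, max_delay=MAX_DELAY):
    """Back off linearly with each empty poll, capped at max_delay."""
    return min(min_delay * (attempt + 1), max_delay)

def poll_for_posts(fetch_new_posts, sleep=time.sleep, now=time.monotonic):
    """Poll fetch_new_posts() until it returns rows or the lifetime expires."""
    deadline = now() + SCRIPT_LIFETIME
    attempt = 0
    while now() < deadline:
        posts = fetch_new_posts()
        if posts:
            return posts
        sleep(poll_delay(attempt))
        attempt += 1
    return []   # lifetime expired with nothing new to show
```

Backing off between empty polls matters here because at 175 concurrent connections every idle realm would otherwise be re-scanning the Post table every few seconds.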

This should provide an idea of that portion. Of course the flow changes when one posts, but that is handled by a different script instance, as is authentication et cetera.

        Martin Foster
        Creator/Designer Ethereal Realms
