On Fri, 7 May 2004, Michael Ryan S. Puncia wrote:

> Hi,
> 
>             I am a newbie here and just starting to use postgresql. My
> problems is how to tune up my server because it its too slow.

First, read this:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

> This is my PC specs: P4, 512Ram, Linux 9

get more ram.

Hard Drives:  interface, how many, RAID???

For a mostly read database IDEs are pretty good.  Having multiple drives 
in a RAID-5 or RAID1+0 works well on a mostly read database too.  Keep the 
stripe size small is setting up a RAID array for a database.

> Because I am working in a statistical organization we have a very large data
> volume
> 
> These are my data:
> 
>  
> 
> Table 1 with 60 million data but only with 10 fields
> 
> Table 2 with 30 million data with 15 fields
> 
> Table 3 with 30 million data with 10 fields

That's not really that big, but it's big enough you have to make sure your 
server is tuned properly.

> I will only use this server for querying ... I already read and apply those
> articles found in the archives section but still the performance is not
> good.
> 
> I am planning to add another 512 RAM .Another question is how to calculate
> shared_buffer size ..

I'm assuming you've recently vacuumed full and analyzed your database...

Shared buffers should probably be between 1000 and 10000 on about 98% of 
all installations.  Setting it higher than 25% of memory is usually a bad 
idea.  Since they're in 8k blocks (unless you compiled with a customer 
block size, you'd know if you did, it's not something you can accidentally 
do by leaning on the wrong switch...) you probably want about 10000 blocks 
or so to start, which will give you about 80 megs of shared buffer.

PostgreSQL doesn't really cache as well as the kernel, so it's better to 
leave more memory available for kernel cache than you allocate to buffer 
cache.  On a machine with only 512Meg, I'm guessing you'll get about 128 
to 200 megs of kernel cache if you're only running postgresql and you have 
it set to 10000 buffers.

The important things to check / set are things lik effective_cache_size.  
It too is measured in 8k blocks, and reflects the approximate amount of 
kernel cache being dedicated to postgresql.  assuming a single service 
postgresql only box, that will be the number that a server that's been up 
for a while shows under top like so:

  9:50am  up 12:16,  4 users,  load average: 0.00, 0.00, 0.00                          
             
104 processes: 102 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  0.7% user,  0.3% system,  0.0% nice,  1.7% idle
Mem:   512924K av,  499248K used,   13676K free,       0K shrd,   54856K buff
Swap: 2048248K av,    5860K used, 2042388K free                  229572K cached

the 229572k cached entry shows about 230 megs.  divided by 8192 we get 
about 28000.

sort_mem might do with a small bump, especially if you're only handling a 
few connections at a time.  Be careful, it's per sort, and measured in 
megs, so it's easy for folks to set it too high and make their machine 
start flushing too much kernel cache, which will slow down the other 
backends that have to go to disk for data.

A good starting point for testing is anywhere from 8192 to 32768.  32768 
is 32 megs, which can starve a machine as small as yours if there are a 
couple of queries each running a couple of sorts on large sets at the same 
time.

Lastly, using explain analyze <your query here> you can see if postgresql 
is making a bad plan choice.  compared estimated rows to actual rows.  
Look for things like nested loops being run on what the planner thinks 
will be 80 rows but is, in fact, 8000 rows.

You can change random page cost to change the tendency of the server to 
favor seq scans to index scans.  Lower = greater tendency towards index 
scans.  the default is 4, but most production servers with enough memory 
to cache most of their data will run well on a setting of 1.2 to 1.4.  My 
dual 2800 with 2 gig ram runs quite well at 1.3 to 1.4. 

You can also change the settings to random_page_cost, as well as turning 
off options to the planner with the following env vars:

enable_hashagg
enable_hashjoin
enable_indexscan
enable_mergejoin
enable_nestloop
enable_seqscan
enable_sort
enable_tidscan

They are all on by default, and shouldn't really be turned off by default 
for the most part.  but for an individual session to figure out if the 
query planner is making the right plan you can set them to off to see if 
using another plan works better.  

so, if you've got a nested loop running over 80000 rows that the planner 
thought was gonna be 80 rows, you can force it to stop using the nested 
loop for your session with:

set enable_nestloop=off;

and use explain analyze to see if it runs faster.

You can set effective_cache_size and sort_mem on the fly for a single 
connection, or set them in postgresql.conf and restart or reload to make a 
change in the default.

shared_buffers is set on postgresql startup, and can't be changed without 
restarting the database.  Reloading won't do it.




---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Reply via email to