Re: [PERFORM] help tuning queries on large database

2006-01-10 Thread Ron

At 12:23 PM 1/9/2006, peter royal wrote:


On Jan 8, 2006, at 4:35 PM, Ron wrote:

Areca ARC-1220 8-port PCI-E controller


Make sure you have 1GB or 2GB of cache.  Get the battery backup and
set the cache for write back rather than write through.


The card we've got doesn't have a SODIMM socket, since it's only an
8-port card.  My understanding was that the cache is used when writing?
Trade in your 8 port ARC-1220 that doesn't support 1-2GB of cache for 
a 12, 16, or 24 port Areca one that does.  It's that important.


Present generation SATA2 HDs should average ~50MBps raw ASTR.  The 
Intel IOP333 DSP on the ARC's is limited to 800MBps, so that's your 
limit per card.  That's 16 SATA2 HD's operating in parallel (16HD 
RAID 0, 17 HD RAID 5, 32 HD RAID 10).
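The per-card drive ceiling above is just controller bandwidth divided by per-drive sustained rate; a quick sanity check of Ron's arithmetic:

```shell
# drives needed to saturate one ARC-12xx: controller limit / per-drive ASTR
echo "$((800 / 50)) drives"   # 800MBps IOP333 ceiling / ~50MBps per SATA2 HD
```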


Next generation 2.5 form factor 10Krpm SAS HD's due to retail in 
2006 are supposed to average ~90MBps raw ASTR.  8 such HDs in 
parallel per ARC-12xx will be the limit.


Side Note: the PCI-E x8 bus on the 12xx cards is good for ~1.6GBps 
RWPB, so I expect Areca will at some point upgrade this controller's 
bandwidth to at least 2x, if not 4x (which would require replacing the 
x8 bus with an x16 bus).


A PCI-E x16 bus is good for ~3.2GBps RWPB, so if you have the slots, 4 
such populated ARC cards will max out a PCI-E x16 bus.


In your shoes, I would replace your 8-port ARC-1220 with a 12-port 
ARC-1230 with 1-2GB of battery-backed cache, and plan to get more of 
them as the need arises.




A 2.6.12 or later based Linux distro should have NO problems using
more than 4GB of RAM.


Upgraded the kernel to 2.6.15, then we were able to set the BIOS
option for the 'Memory Hole' to 'Software' and it saw all 4G (under
2.6.11 we got a kernel panic with that set).
There are some other kernel tuning params that should help memory and 
physical IO performance.  Talk to a Linux kernel guru to get the 
correct advice specific to your installation and application.
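As a sketch only (these are illustrative assumptions, not the specific advice Ron defers to a kernel guru for), the knobs such tuning usually starts with are the VM writeback and swap settings:

```shell
# Illustrative 2.6-era starting points for a write-heavy DB host --
# the exact values must be validated against your own workload
sysctl -w vm.swappiness=10              # prefer keeping page cache over swapping
sysctl -w vm.dirty_background_ratio=5   # start background writeback earlier
sysctl -w vm.dirty_ratio=40             # cap dirty pages before writers block
```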



It should be noted that there are indications of some major 
inefficiencies in pg's IO layer that make it compute bound under some 
circumstances before it becomes IO bound.  These may or may not cause 
trouble for you as you keep pushing the envelope for maximum IO performance.



With the kind of work you are doing and we are describing, I'm sure 
you can have a _very_ zippy system.


Ron



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] help tuning queries on large database

2006-01-10 Thread Mark Lewis
Ron,

A few days back you mentioned:

 Upgrade your kernel to at least  2.6.12
 There's a known issue with earlier versions of the 2.6.x kernel and 
 64b CPUs like the Opteron.  See kernel.org for details.
 

I did some searching and couldn't find any obvious mention of this issue
(I gave up after searching through the first few hundred instances of
64 in the 2.6.12 changelog).

Would you mind being a little more specific about which issue you're
talking about?  We're about to deploy some new 16GB RAM Opteron DB
servers and I'd like to check and make sure RH backported whatever the
fix was to their current RHEL4 kernel.

Thanks,
Mark Lewis



Re: [PERFORM] help tuning queries on large database

2006-01-10 Thread Ron

At 07:28 PM 1/10/2006, Mark Lewis wrote:

Ron,

A few days back you mentioned:

 Upgrade your kernel to at least  2.6.12
 There's a known issue with earlier versions of the 2.6.x kernel and
 64b CPUs like the Opteron.  See kernel.org for details.


I did some searching and couldn't find any obvious mention of this issue
(I gave up after searching through the first few hundred instances of
64 in the 2.6.12 changelog).

Would you mind being a little more specific about which issue you're
talking about?  We're about to deploy some new 16GB RAM Opteron DB
servers and I'd like to check and make sure RH backported whatever the
fix was to their current RHEL4 kernel.

There are 3 issues I know about in general:
1= As Peter Royal noted on this list, pre-12 versions of 2.6.x have 
problems with RAM >= 4GB.


2= Pre-12 versions of 2.6.x, when running A64 or Xeon 64b SMP, seem to 
be susceptible to context switch storms.


3= Physical and memory IO is considerably improved in the later 
versions of 2.6.x compared to 2.6.11 or earlier.


Talk to a real Linux kernel guru (I am not) for details and specifics.
Ron





Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread Kelly Burkhart
On 1/8/06, Ron [EMAIL PROTECTED] wrote:
<snip>
Among the other tricks having lots of RAM allows:
If some of your tables are Read Only or VERY rarely written to, you
can preload them at boot time and make them RAM resident using the
/etc/tmpfs trick.

What is the /etc/tmpfs trick?

-K 



Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread Harry Jackson
On 1/9/06, Kelly Burkhart [EMAIL PROTECTED] wrote:
 On 1/8/06, Ron [EMAIL PROTECTED] wrote:
  <snip>
  Among the other tricks having lots of RAM allows:
  If some of your tables are Read Only or VERY rarely written to, you
  can preload them at boot time and make them RAM resident using the
  /etc/tmpfs trick.

  What is the /etc/tmpfs trick?

I think he means you can create a directory that mounts an area of
RAM. If you put the tables on it then access will be very fast. I would
not recommend it for any data you cannot afford to lose.

I have also tried it and found that it did not produce as good
performance as I expected.
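For anyone curious, the trick being discussed is presumably along these lines (the mount point, size, and tablespace path below are made-up examples; note that tmpfs contents vanish on reboot, hence the read-only caveat):

```shell
# /etc/fstab entry for a RAM-backed filesystem (example size):
#   tmpfs  /mnt/ramdata  tmpfs  size=2g  0  0
mount -t tmpfs -o size=2g tmpfs /mnt/ramdata            # or mount it by hand
cp -a /var/lib/pgsql/readonly_tablespace/. /mnt/ramdata/  # preload at boot
```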

--
Harry
http://www.hjackson.org
http://www.uklug.co.uk



Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread peter royal

On Jan 8, 2006, at 1:42 PM, Luke Lonergan wrote:
Have you tested the underlying filesystem for its performance?
Run this:

  time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k
count=<your_memory_size_in_GB * 250000> && sync'


This is a 2-disk RAID0

[EMAIL PROTECTED] /opt/alt-2]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real    1m27.143s
user    0m0.276s
sys     0m37.338s

'iostat -x' showed writes peaking at ~100MB/s
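(The elapsed time is consistent with the iostat figure: the count shown appears truncated by the archive, but assuming the intended 1,000,000 8k blocks, the file is ~8GB, giving an average close to the ~100MB/s peaks:)

```shell
# approximate write throughput: file size in MB / elapsed seconds
file_mb=$((8 * 1000000 / 1000))   # 8k blocks * 1e6 blocks = ~8000 MB
echo "$((file_mb / 87)) MB/s"     # ~87s elapsed for the write + sync
```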



Then run this:
  time dd if=/my_file_system/bigfile bs=8k of=/dev/null


[EMAIL PROTECTED] /opt/alt-2]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null

1000000+0 records in
1000000+0 records out

real    1m9.846s
user    0m0.189s
sys     0m11.099s

'iostat -x' showed reads peaking at ~116MB/s


Again with kernel 2.6.15:

[EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real    1m29.144s
user    0m0.204s
sys     0m48.415s

[EMAIL PROTECTED] ~]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null
1000000+0 records in
1000000+0 records out

real    1m9.701s
user    0m0.168s
sys     0m11.933s


And report the times here please.  With your 8 disks in any of the RAID0
configurations you describe, you should be getting 480MB/s.  In the RAID10
configuration you should get 240.


Not anywhere near that. I'm scouring the 'net looking to see what  
needs to be tuned at the HW level.



You should also experiment with using larger readahead, which you can
implement like this:
  blockdev --setra 16384 /dev/my_block_device

E.g. blockdev --setra 16384 /dev/sda


wow, this helped nicely. Without using the updated kernel, it took  
28% off my testcase time.



From what you describe, one of these is likely:
- hardware isn't configured properly or a driver problem.


Using the latest Areca driver, looking to see if there is some  
configuration that was missed.



- you need to use xfs and tune your Linux readahead


Will try XFS soon, concentrating on the 'dd' speed issue first.


On Jan 8, 2006, at 4:35 PM, Ron wrote:

Areca ARC-1220 8-port PCI-E controller


Make sure you have 1GB or 2GB of cache.  Get the battery backup and  
set the cache for write back rather than write through.


The card we've got doesn't have a SODIMM socket, since it's only an
8-port card.  My understanding was that the cache is used when writing?


A 2.6.12 or later based Linux distro should have NO problems using
more than 4GB of RAM.


Upgraded the kernel to 2.6.15, then we were able to set the BIOS
option for the 'Memory Hole' to 'Software' and it saw all 4G (under
2.6.11 we got a kernel panic with that set).



RAID Layout:

4 2-disk RAID0 sets created
You do know that a RAID 0 set provides _worse_ data protection than  
a single HD?  Don't use RAID 0 for any data you want kept reliably.


yup, aware of that. was planning on RAID10 for production, but just  
broke it out into RAID0 sets for testing (from what I read, I  
gathered that the read performance of RAID0 and RAID10 were comparable)



thanks for all the suggestions, I'll report back as I continue testing.

-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi





Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread Luke Lonergan
Peter,

On 1/9/06 9:23 AM, peter royal [EMAIL PROTECTED] wrote:

 This is a 2-disk RAID0

Your 2-disk results look fine - what about your 8-disk results?

Given that you want to run in production with RAID10, the most you should
expect is 2x the 2-disk results using all 8 of your disks.  If you want the
best rate for production while preserving data integrity, I recommend
running your Areca in RAID5, in which case you should expect 3.5x your
2-disk results (7 drives).  You can assume you'll get that if you use XFS +
readahead.  OTOH - I'd like to see your test results anyway :-)

- Luke





Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread peter royal

On Jan 9, 2006, at 2:01 PM, Luke Lonergan wrote:

Peter,

On 1/9/06 9:23 AM, peter royal [EMAIL PROTECTED] wrote:


This is a 2-disk RAID0


Your 2-disk results look fine - what about your 8-disk results?


after some further research the 2-disk RAID0 numbers are not bad.

I have a single drive of the same type hooked up to the SATA2 port on  
the motherboard to boot from, and its performance numbers are (linux  
2.6.15, ext3):


[EMAIL PROTECTED] ~]# time bash -c 'dd if=/dev/zero of=/tmp/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real    4m55.032s
user    0m0.256s
sys     0m47.299s
[EMAIL PROTECTED] ~]# time dd if=/tmp/bigfile bs=8k of=/dev/null
1000000+0 records in
1000000+0 records out

real    3m27.229s
user    0m0.156s
sys     0m13.377s

so, there is a clear advantage to RAID over a single drive.


now, some stats in an 8-disk configuration:

8-disk RAID0, ext3, 16k read-ahead

[EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/pgdata/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real    0m53.030s
user    0m0.204s
sys     0m42.015s

[EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/dev/null

1000000+0 records in
1000000+0 records out

real    0m23.232s
user    0m0.144s
sys     0m13.213s


8-disk RAID0, xfs, 16k read-ahead

[EMAIL PROTECTED] /opt/pgdata]# time bash -c 'dd if=/dev/zero of=/opt/pgdata/bigfile bs=8k count=1000000 && sync'

1000000+0 records in
1000000+0 records out

real    0m32.177s
user    0m0.212s
sys     0m21.277s

[EMAIL PROTECTED] /opt/pgdata]# time dd if=/opt/pgdata/bigfile bs=8k of=/dev/null

1000000+0 records in
1000000+0 records out

real    0m21.814s
user    0m0.172s
sys     0m13.881s


... WOW.. highly impressed with the XFS write speed! going to stick  
with that!


Overall, I got a 50% boost in the overall speed of my test suite by  
using XFS and the 16k read-ahead.


Given that you want to run in production with RAID10, the most you should
expect is 2x the 2-disk results using all 8 of your disks.  If you want the
best rate for production while preserving data integrity, I recommend
running your Areca in RAID5, in which case you should expect 3.5x your
2-disk results (7 drives).  You can assume you'll get that if you use XFS +
readahead.  OTOH - I'd like to see your test results anyway :-)


I've been avoiding RAID5 after reading how performance drops when a 
drive is out/rebuilding. I think the performance benefit will outweigh 
that cost.


Thanks for the help!
-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi





Re: [PERFORM] help tuning queries on large database

2006-01-09 Thread Luke Lonergan
Peter,

On 1/9/06 12:59 PM, peter royal [EMAIL PROTECTED] wrote:


 Overall, I got a 50% boost in the overall speed of my test suite by
 using XFS and the 16k read-ahead.

Yes, it all looks pretty good for your config, though it looks like you
might be adapter limited with the Areca - you should have seen a read time
with XFS of about 17 seconds.

OTOH - with RAID5 you are probably about balanced: you should see a read
time of about 19 seconds, and instead you'll get your 22, which isn't too
big of a deal.
 
 Thanks for the help!

Sure - no problem!

BTW - I'm running tests right now with the 3Ware 9550SX controllers.  Two of
them on one machine running simultaneously with 16 drives and we're getting
an 800MB/s sustained read rate.  That's a 32GB file read in 40 seconds (!!)

At that rate, we're only about 3x slower than memory access (practically
limited at around 2GB/s even though the system bus peak is 10GB/s). So, the
point is, if you want to get close to your warm speed, you need to get
your disk I/O as close to main memory speed as you can.  With parallel I/O
you can do that (see Bizgres MPP for more).

- Luke





Re: [PERFORM] help tuning queries on large database

2006-01-08 Thread Luke Lonergan
Peter,

On 1/6/06 2:59 PM, peter royal [EMAIL PROTECTED] wrote:

 I have experimented with having all 8 disks in a single RAID0 set, a
 single RAID10 set, and currently 4 RAID0 sets of 2 disks each. There
 hasn't been an appreciable difference in the overall performance of
 my test suite (which randomly generates queries like the samples
 below as well as a few other types. this problem manifests itself on
 other queries in the test suite as well).

Have you tested the underlying filesystem for its performance?  Run this:
  time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k
count=<your_memory_size_in_GB * 250000> && sync'

Then run this:
  time dd if=/my_file_system/bigfile bs=8k of=/dev/null

And report the times here please.  With your 8 disks in any of the RAID0
configurations you describe, you should be getting 480MB/s.  In the RAID10
configuration you should get 240.

Note that ext3 will not go faster than about 300MB/s in our experience.  You
should use xfs, which will run *much* faster.
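(The 480/240 figures above follow from an assumed rule of thumb of roughly 60MB/s sustained per 7200rpm SATA drive, with RAID10 halving the effective streaming spindle count:)

```shell
# expected aggregate throughput for 8 drives at ~60MB/s each
echo "RAID0: $((8 * 60)) MB/s, RAID10: $((8 * 60 / 2)) MB/s"
```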

You should also experiment with using larger readahead, which you can
implement like this:
  blockdev --setra 16384 /dev/my_block_device

E.g. blockdev --setra 16384 /dev/sda

This will set the readahead of Linux block device reads to 8MB.  Using
3Ware's newest controllers we have seen 500MB/s + on 8 disk drives in RAID0
on CentOS 4.1 with xfs.  Note that you will need to run the CentOS
unsupported kernel to get xfs.
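One unit caveat worth noting: blockdev --setra counts in 512-byte sectors, not bytes, so 16384 corresponds to 8MiB of readahead (use --getra to check the current value):

```shell
# blockdev readahead is measured in 512-byte sectors
echo "$((16384 * 512 / 1024 / 1024)) MiB"
```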

 So, my question is, is there anything I can do to boost performance
 with what I've got, or am I in a position where the only 'fix' is
 more faster disks? I can't think of any schema/index changes that
 would help, since everything looks pretty optimal from the 'explain
 analyze' output. I'd like to get a 10x improvement when querying from
 the 'cold' state.

From what you describe, one of these is likely:
- hardware isn't configured properly or a driver problem.
- you need to use xfs and tune your Linux readahead

- Luke





Re: [PERFORM] help tuning queries on large database

2006-01-08 Thread Ron

I'll second all of Luke Lonergan's comments and add these.

You should be able to increase both cold and warm performance (as 
well as data integrity; read below) considerably.

Ron

At 05:59 PM 1/6/2006, peter royal wrote:

Howdy.

I'm running into scaling problems when testing with a 16GB
(data+indexes) database.


I can run a query, and it returns in a few seconds. If I run it
again, it returns in a few milliseconds. I realize this is because
during subsequent runs, the necessary disk pages have been cached by
the OS.

I have experimented with having all 8 disks in a single RAID0 set, a
single RAID10 set, and currently 4 RAID0 sets of 2 disks each. There
hasn't been an appreciable difference in the overall performance of
my test suite (which randomly generates queries like the samples
below as well as a few other types. this problem manifests itself on
other queries in the test suite as well).

So, my question is, is there anything I can do to boost performance
with what I've got, or am I in a position where the only 'fix' is
more faster disks? I can't think of any schema/index changes that
would help, since everything looks pretty optimal from the 'explain
analyze' output. I'd like to get a 10x improvement when querying from
the 'cold' state.

Thanks for any assistance. The advice gleaned from reading this list
while getting to where I am now has been invaluable.
-peter


Configuration:

PostgreSQL 8.1.1

shared_buffers = 10000  # (It was higher, 50k, but didn't help any,
so brought down to free ram for disk cache)
work_mem = 8196
random_page_cost = 3
effective_cache_size = 250000


Hardware:

CentOS 4.2 (Linux 2.6.9-22.0.1.ELsmp)


Upgrade your kernel to at least  2.6.12
There's a known issue with earlier versions of the 2.6.x kernel and 
64b CPUs like the Opteron.  See kernel.org for details.



Areca ARC-1220 8-port PCI-E controller


Make sure you have 1GB or 2GB of cache.  Get the battery backup and 
set the cache for write back rather than write through.



8 x Hitachi Deskstar 7K80 (SATA2) (7200rpm)
2 x Opteron 242 @ 1.6ghz
3gb RAM (should be 4gb, but separate Linux issue preventing us from
getting it to see all of it)
Tyan Thunder K8WE
The K8WE has 8 DIMM slots.  That should be good for 16 or 32 GB of 
RAM (depending on whether the mainboard recognizes 4GB DIMMs or 
not; ask Tyan about the latest K8WE firmware).  If nothing else, 1GB 
DIMMs are now so cheap that you should have no problems having 8GB on the K8WE.


A 2.6.12 or later based Linux distro should have NO problems using 
more than 4GB of RAM.


Among the other tricks having lots of RAM allows:
If some of your tables are Read Only or VERY rarely written to, you 
can preload them at boot time and make them RAM resident using the 
/etc/tmpfs trick.


In addition there is at least one company making a cheap battery 
backed PCI-X card that can hold up to 4GB of RAM and pretend to be a 
small HD to the OS.  I don't remember any names at the moment, but 
there have been posts here and at storage.review.com on such products.




RAID Layout:

4 2-disk RAID0 sets created
You do know that a RAID 0 set provides _worse_ data protection than a 
single HD?  Don't use RAID 0 for any data you want kept reliably.
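Ron's point can be made quantitative: if each drive fails independently with annual probability p (the 3% below is an assumed illustrative figure, not a measured rate), RAID0 loses the whole array when any one drive fails:

```shell
# P(array loss) for an n-drive RAID0 = 1 - (1-p)^n
awk 'BEGIN { p = 0.03;
  printf "1 drive: %.3f, 2-drive RAID0: %.3f, 8-drive RAID0: %.3f\n",
         p, 1 - (1-p)^2, 1 - (1-p)^8 }'
```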


With 8 HDs, the best config is probably
1 2HD RAID 1 + 1 6HD RAID 10  or
2 4HD RAID 10's

It is certainly true that once you have done everything you can with 
RAM, the next set of HW optimizations is to add HDs.  The more the 
better, up to the limits of your available PCI-X bandwidth.


In short, a 2nd fully populated RAID controller is not unreasonable.





[PERFORM] help tuning queries on large database

2006-01-06 Thread peter royal

Howdy.

I'm running into scaling problems when testing with a 16GB
(data+indexes) database.


I can run a query, and it returns in a few seconds. If I run it  
again, it returns in a few milliseconds. I realize this is because  
during subsequent runs, the necessary disk pages have been cached by  
the OS.


I have experimented with having all 8 disks in a single RAID0 set, a  
single RAID10 set, and currently 4 RAID0 sets of 2 disks each. There  
hasn't been an appreciable difference in the overall performance of  
my test suite (which randomly generates queries like the samples  
below as well as a few other types. this problem manifests itself on  
other queries in the test suite as well).


So, my question is, is there anything I can do to boost performance  
with what I've got, or am I in a position where the only 'fix' is  
more faster disks? I can't think of any schema/index changes that  
would help, since everything looks pretty optimal from the 'explain  
analyze' output. I'd like to get a 10x improvement when querying from  
the 'cold' state.


Thanks for any assistance. The advice gleaned from reading this list  
while getting to where I am now has been invaluable.

-peter


Configuration:

PostgreSQL 8.1.1

shared_buffers = 10000  # (It was higher, 50k, but didn't help any,  
so brought down to free ram for disk cache)

work_mem = 8196
random_page_cost = 3
effective_cache_size = 250000
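(effective_cache_size is measured in 8KB pages, so it can be derived from the OS cache figure in /proc/meminfo; a sketch using an assumed 2GB "Cached:" value:)

```shell
# convert cached memory (KB, as reported by /proc/meminfo) to 8KB pages
cache_kb=2097152   # assumed example: the "Cached:" line showing 2GB
echo "effective_cache_size = $((cache_kb / 8))"
```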


Hardware:

CentOS 4.2 (Linux 2.6.9-22.0.1.ELsmp)
Areca ARC-1220 8-port PCI-E controller
8 x Hitachi Deskstar 7K80 (SATA2) (7200rpm)
2 x Opteron 242 @ 1.6ghz
3gb RAM (should be 4gb, but separate Linux issue preventing us from  
getting it to see all of it)

Tyan Thunder K8WE


RAID Layout:

4 2-disk RAID0 sets created

Each raid set is a tablespace, formatted ext3. The majority of the  
database is in the primary tablespace, and the popular object_data  
table is in its own tablespace.



Sample 1:

triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM  
object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND  
O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag =  
'transmitter\'s' LIMIT 1000;

                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Limit  (cost=1245.07..1245.55 rows=97 width=4) (actual time=3702.697..3704.665 rows=206 loops=1)
   ->  Unique  (cost=1245.07..1245.55 rows=97 width=4) (actual time=3702.691..3703.900 rows=206 loops=1)
         ->  Sort  (cost=1245.07..1245.31 rows=97 width=4) (actual time=3702.686..3703.056 rows=206 loops=1)
               Sort Key: o.subject
               ->  Nested Loop  (cost=2.82..1241.87 rows=97 width=4) (actual time=97.166..3701.970 rows=206 loops=1)
                     ->  Nested Loop  (cost=2.82..678.57 rows=186 width=4) (actual time=59.903..1213.170 rows=446 loops=1)
                           ->  Index Scan using tags_tag_key on tags t2  (cost=0.00..5.01 rows=1 width=4) (actual time=13.139..13.143 rows=1 loops=1)
                                 Index Cond: (tag = 'transmitter''s'::text)
                           ->  Bitmap Heap Scan on object_tags t1  (cost=2.82..670.65 rows=233 width=8) (actual time=46.751..1198.198 rows=446 loops=1)
                                 Recheck Cond: (t1.tag_id = "outer".tag_id)
                                 ->  Bitmap Index Scan on object_tags_tag_id_object_id  (cost=0.00..2.82 rows=233 width=0) (actual time=31.571..31.571 rows=446 loops=1)
                                       Index Cond: (t1.tag_id = "outer".tag_id)
                     ->  Index Scan using object_data_pkey on object_data o  (cost=0.00..3.02 rows=1 width=4) (actual time=5.573..5.574 rows=0 loops=446)
                           Index Cond: (o.subject = "outer".object_id)
                           Filter: (type = 179)
 Total runtime: 3705.166 ms
(16 rows)

triple_store=# explain analyze SELECT DISTINCT O.subject AS oid FROM  
object_data O, object_tags T1, tags T2 WHERE O.type = 179 AND  
O.subject = T1.object_id AND T1.tag_id = T2.tag_id AND T2.tag =  
'transmitter\'s' LIMIT 1000;

                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Limit  (cost=1245.07..1245.55 rows=97 width=4) (actual time=11.037..12.923 rows=206 loops=1)
   ->  Unique  (cost=1245.07..1245.55 rows=97 width=4) (actual time=11.031..12.190 rows=206 loops=1)
         ->  Sort  (cost=1245.07..1245.31 rows=97 width=4) (actual time=11.027..11.396 rows=206 loops=1)
               Sort Key: o.subject
               ->  Nested Loop  (cost=2.82..1241.87 rows=97 width=4) (actual time=0.430..10.461 rows=206 loops=1)

Re: [PERFORM] help tuning queries on large database

2006-01-06 Thread Tom Lane
peter royal [EMAIL PROTECTED] writes:
 So, my question is, is there anything I can do to boost performance  
 with what I've got, or am I in a position where the only 'fix' is  
 more faster disks? I can't think of any schema/index changes that  
 would help, since everything looks pretty optimal from the 'explain  
 analyze' output. I'd like to get a 10x improvement when querying from  
 the 'cold' state.

I don't think you have any hope of improving the cold state much.
The right way to think about this is not to be in the cold state.
Check your kernel parameters and make sure it's not set to limit
the amount of memory used for cache (I'm not actually sure if there
is such a limit on Linux, but there definitely is on some other Unixen).
Look around and see if you can reduce the memory used by processes,
or even better, offload non-database tasks to other machines.

Basically you need to get as much of the database as you can to stay
in disk cache.

regards, tom lane
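One way to "not be in the cold state" is to pre-read the database files into the OS page cache after a reboot; a minimal sketch (the data directory path is an example):

```shell
#!/bin/sh
# Read every file under a directory once so its pages land in the OS cache.
warm_cache() {
  find "$1" -type f -exec cat {} + > /dev/null
}

# warm_cache /opt/pgdata   # example path; run once after each reboot
```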



Re: [PERFORM] help tuning queries on large database

2006-01-06 Thread Harry Jackson
On 1/6/06, peter royal [EMAIL PROTECTED] wrote:
 PostgreSQL 8.1.1

 shared_buffers = 10000  # (It was higher, 50k, but didn't help any,
 so brought down to free ram for disk cache)
 work_mem = 8196
 random_page_cost = 3
 effective_cache_size = 250000

I have played with both disk cache settings and shared buffers and I
found that if I increased the shared buffers above a certain value
performance would increase dramatically. Playing with the effective
cache did not have the same amount of impact. I am currently running
with

shared_buffers = 254288 # approx 2.1Gb

and this is on a smaller dataset than yours.
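(For reference, the pages-to-bytes arithmetic behind that figure, at 8KB per shared_buffers page, comes out just under the quoted 2.1GB:)

```shell
# 254288 pages * 8KB each, expressed in MB
echo "$((254288 * 8 / 1024)) MB"
```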

--
Harry
http://www.hjackson.org
http://www.uklug.co.uk



Re: [PERFORM] help tuning queries on large database

2006-01-06 Thread David Lang

On Fri, 6 Jan 2006, Tom Lane wrote:


Date: Fri, 06 Jan 2006 18:47:55 -0500
From: Tom Lane [EMAIL PROTECTED]
To: peter royal [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] help tuning queries on large database

peter royal [EMAIL PROTECTED] writes:

So, my question is, is there anything I can do to boost performance
with what I've got, or am I in a position where the only 'fix' is
more faster disks? I can't think of any schema/index changes that
would help, since everything looks pretty optimal from the 'explain
analyze' output. I'd like to get a 10x improvement when querying from
the 'cold' state.


I don't think you have any hope of improving the cold state much.
The right way to think about this is not to be in the cold state.
Check your kernel parameters and make sure it's not set to limit
the amount of memory used for cache (I'm not actually sure if there
is such a limit on Linux, but there definitely is on some other Unixen).


Linux doesn't have any ability to limit the amount of memory used for 
caching (there are periodic requests for such a feature).


David Lang


Look around and see if you can reduce the memory used by processes,
or even better, offload non-database tasks to other machines.

Basically you need to get as much of the database as you can to stay
in disk cache.

regards, tom lane
