Re: [PERFORM] Postgresql works too slow

2005-04-18 Thread Tom Lane
Steve Poe [EMAIL PROTECTED] writes:
Try increasing your checkpoint_segments. In my example, our database 
restore took 75 mins. After increasing checkpoint_segments to 20, we cut 
it down to less than 30 minutes.

Increasing maintenance_work_mem might help too ... or several other
settings ... with no information about exactly *what* is slow, it's
hard to say.

regards, tom lane



Re: [PERFORM] Postgresql works too slow

2005-04-18 Thread Christopher Kings-Lynne
Try increasing your checkpoint_segments. In my example, our database 
restore took 75 mins. After increasing checkpoint_segments to 20, we cut 
it down to less than 30 minutes.

Increasing maintenance_work_mem might help too ... or several other
settings ... with no information about exactly *what* is slow, it's
hard to say.
Try setting fsync = false for the duration of your reload.
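For example (illustrative values only, not from this thread -- and remember to
turn fsync back on after the reload, since running with it off risks data loss
on a crash):

checkpoint_segments  = 20        # fewer, less frequent checkpoints during the load
maintenance_work_mem = 262144    # in kB; speeds up the index builds after the data is loaded
fsync = false                    # only for the duration of the reload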
Chris



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread William Yu
There's a problem with this strategy. You want battery-backed write caching for 
best performance & safety. (I've tried IDE for WAL before w/ write 
caching off -- the DB got crippled whenever I had to copy files from/to 
the drive on the WAL partition -- ended up just moving WAL back onto the 
same SCSI drive as the main DB.) That means in addition to a $$$ SCSI 
caching controller, you also need a $$$ SATA caching controller. From my 
glance at prices, advanced SATA controllers seem to cost nearly as much as 
their SCSI counterparts.

This also looks to be the case for the drives themselves. Sure you can 
get super cheap 7200RPM SATA drives but they absolutely suck for 
database work. Believe me, I gave it a try once -- ugh. The high-end WD 
10K Raptors look pretty good though -- the benchmarks @ storagereview 
seem to put these drives at about 90% of SCSI 10Ks for both single-user 
and multi-user. However, they're also priced like SCSIs -- here's what I 
found @ Mwave (going through pricewatch to find WD740GDs):

Seagate 7200 SATA -- 80GB -- $59
WD 10K SATA       -- 72GB -- $182
Seagate 10K U320  -- 72GB -- $289
Using the above prices for a fixed budget for RAID-10, you could get:
SATA 7200 -- 680GB per $1000
SATA 10K  -- 200GB per $1000
SCSI 10K  -- 125GB per $1000
For a 99% read-only DB that requires lots of disk space (say something 
like Wikipedia or a blog host), using consumer-level SATA is probably ok. 
For anything else, I'd consider SATA 10K if (1) I do not need 15K RPM 
and (2) I don't have SCSI infrastructure already.

Steve Poe wrote:
If SATA drives don't have the ability to replace SCSI for multi-user
Postgres apps, but you needed to save on cost (ALWAYS an issue), 
could/would you implement SATA for your logs (pg_xlog) and keep the rest 
on SCSI?

Steve Poe
Mohan, Ross wrote:
I've been doing some reading up on this, trying to keep up here, and 
have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a Phase Zero
   implementation of Tagged Command Queueing (the special sauce
   for SCSI).
2) This SATA TCQ is called NCQ and I believe it basically
   allows the disk software itself to do the reordering
   (this is called simple in TCQ terminology) It does not
   yet allow the TCQ head of queue command, allowing the
   current tagged request to go to head of queue, which is
   a simple way of manifesting a high priority request.
3) SATA drives are not yet multi-initiator?
Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multi-user Postgres applications.
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Kevin Brown [EMAIL PROTECTED] writes:
 

Greg Stark wrote:
  

I think you're being misled by analyzing the write case.
Consider the read case. When a user process requests a block and 
that read makes its way down to the driver level, the driver can't 
just put it aside and wait until it's convenient. It has to go ahead 
and issue the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a 
couple of milliseconds to see if other requests come in, and then 
issue the read if none do.  If there are already other requests being 
fulfilled, then it'll schedule the request in question just like the 
rest.
  

But then the cure is worse than the disease. You're basically 
describing exactly what does happen anyways, only you're delaying more 
requests than necessary. That intervening time isn't really idle, it's 
filled with all the requests that were delayed during the previous 
large seek...

 

Once the first request has been fulfilled, the driver can now 
schedule the rest of the queued-up requests in disk-layout order.

I really don't see how this is any different between a system that 
has tagged queueing to the disks and one that doesn't.  The only 
difference is where the queueing happens.
  

And *when* it happens. Instead of being able to issue requests while a 
large seek is happening and having some of them satisfied, they have to 
wait until that seek is finished and get acted on during the next 
large seek.

If my theory is correct then I would expect bandwidth to be 
essentially equivalent but the latency on SATA drives to be increased 
by about 50% of the average seek time. I.e., while a busy SCSI drive can 
satisfy most requests in about 10ms, a busy SATA drive would satisfy 
most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI 
drives have even lower seek times and no such IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing 
your application to run slower causing bandwidth demands to be lower 
as well. It's often hard to distinguish root causes from symptoms when 
optimizing...

Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-18 Thread Dawid Kuroczko
On 4/15/05, Enrico Weigelt [EMAIL PROTECTED] wrote:
 a) SELECT items.a, items.b, ..., users.username FROM items, users
 WHERE items.uid = users.uid;
 
 c) CREATE FUNCTION id2username(oid) RETURNS text
 LANGUAGE 'SQL' IMMUTABLE AS '
 SELECT username AS RESULT FROM users WHERE uid = $1';

You will be told that this function is not immutable but stable, and this
is quite right.  But consider such a function:

CREATE OR REPLACE FUNCTION id2username (oid int) RETURNS TEXT AS $$
  BEGIN
IF oid = 0 THEN RETURN 'foo';
ELSIF oid = 1 THEN RETURN 'bar';
END IF;
  END;
$$ LANGUAGE plpgsql IMMUTABLE;

versus a lookup table with similar data.  Logic suggests it should be faster
than a table...  It got me worried when I added RAISE WARNING 'Called'
after BEGIN and got lots of 'Called' warnings when using this IMMUTABLE
function in a SELECT...  And the timings for ~6000 values in the aaa table
(and two values in the lookup table) are:

For each case there is the query, the output of EXPLAIN ANALYZE, the time of
the EXPLAIN ANALYZE, and the real time of the SELECT (without EXPLAIN ANALYZE):

a) simple select from temp table, and a lookup cost:
  EXPLAIN ANALYZE SELECT n FROM aaa;
 Seq Scan on aaa  (cost=0.00..87.92 rows=5992 width=4) (actual
time=0.011..24.849 rows=6144 loops=1)
 Total runtime: 51.881 ms
(2 rows)
Time: 52,882 ms
Real time: 16,261 ms

  EXPLAIN ANALYZE SELECT id2username(n) FROM aaa limit 2;
Limit  (cost=0.00..0.03 rows=2 width=4) (actual time=0.111..0.150
rows=2 loops=1)
   ->  Seq Scan on aaa  (cost=0.00..104.80 rows=6144 width=4) (actual
time=0.102..0.129 rows=2 loops=1)
 Total runtime: 0.224 ms
(3 rows)
Time: 1,308 ms
Real time: 1,380 ms

b) natural join with lookup table:
  EXPLAIN ANALYZE SELECT username FROM aaa NATURAL JOIN lookup;
 Hash Join  (cost=2.45..155.09 rows=3476 width=32) (actual
time=0.306..83.677 rows=6144 loops=1)
   Hash Cond: ("outer".n = "inner".n)
   ->  Seq Scan on aaa  (cost=0.00..87.92 rows=5992 width=4) (actual
time=0.006..25.517 rows=6144 loops=1)
   ->  Hash  (cost=2.16..2.16 rows=116 width=36) (actual
time=0.237..0.237 rows=0 loops=1)
 ->  Seq Scan on lookup  (cost=0.00..2.16 rows=116 width=36)
(actual time=0.016..0.034 rows=2 loops=1)
 Total runtime: 107.378 ms
(6 rows)
Time: 109,040 ms
Real time: 25,364 ms

c) IMMUTABLE static lookup function:
  EXPLAIN ANALYZE SELECT id2username(n) FROM aaa;
Seq Scan on aaa  (cost=0.00..104.80 rows=6144 width=4) (actual
time=0.089..116.397 rows=6144 loops=1)
 Total runtime: 143.800 ms
(2 rows)
Time: 144,869 ms
Real time: 102,428 ms

d) self-join with a function ;)
  EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username
FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN
aaa USING (n);
 Hash Left Join  (cost=506.82..688.42 rows=6144 width=36) (actual
time=102.382..182.661 rows=6144 loops=1)
   Hash Cond: ("outer".n = "inner".n)
   ->  Seq Scan on aaa  (cost=0.00..89.44 rows=6144 width=4) (actual
time=0.012..24.360 rows=6144 loops=1)
   ->  Hash  (cost=506.82..506.82 rows=2 width=36) (actual
time=102.217..102.217 rows=0 loops=1)
 ->  Subquery Scan v_lookup  (cost=476.05..506.82 rows=2
width=36) (actual time=53.626..102.057 rows=2 loops=1)
   ->  Subquery Scan values  (cost=476.05..506.80 rows=2
width=4) (actual time=53.613..102.023 rows=2 loops=1)
 ->  Unique  (cost=476.05..506.77 rows=2 width=4)
(actual time=53.456..101.772 rows=2 loops=1)
   ->  Sort  (cost=476.05..491.41 rows=6144
width=4) (actual time=53.440..76.710 rows=6144 loops=1)
 Sort Key: n
 ->  Seq Scan on aaa 
(cost=0.00..89.44 rows=6144 width=4) (actual time=0.013..26.626
rows=6144 loops=1)
 Total runtime: 209.378 ms
(11 rows)
Time: 211,460 ms
Real time:  46,682 ms

...so this IMMUTABLE function (~100 ms) is twice as slow as the query joining
the table with a SELECT DISTINCT over the IMMUTABLE function (~50 ms),
which in turn is twice as slow as the JOIN against the lookup table (~25 ms),
and I feel this IMMUTABLE function should be able to get down to around ~20 ms
(~16 ms plus the cost of calling the function twice and returning the values).

Ah, and this is PostgreSQL 8.0.1 running under FreeBSD on a
CPU: Intel(R) Celeron(R) CPU 2.40GHz (2400.10-MHz 686-class CPU).

   Regards,
  Dawid

PS: I have a feeling that IMMUTABLE functions worked better in 7.4,
yet I am unable to confirm this.
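For anyone wanting to reproduce the timings above: the test tables themselves
aren't shown in the post, but presumably they look roughly like this (a sketch
-- aaa holds ~6000 integer values of 0 and 1 in column n, and lookup maps n to
a username):

CREATE TABLE aaa (n int);
INSERT INTO aaa
  SELECT CASE WHEN random() < 0.5 THEN 0 ELSE 1 END
  FROM generate_series(1, 6144);
CREATE TABLE lookup (n int, username text);
INSERT INTO lookup VALUES (0, 'foo');
INSERT INTO lookup VALUES (1, 'bar');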



Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-18 Thread Merlin Moncure
 d) self-join with a function ;)
   EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username
 FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN
 aaa USING (n);

That's pretty clever.  
It sure seems like the server was not caching the results of the
function...maybe the server thought it was too small a table to bother?  

Merlin



FW: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin
Another odd thing: when I tried turning off merge joins on the XP desktop,
it took 32 secs to run compared to the 6 secs it was taking.
On the Linux (4-proc) box it is now running in 3 secs with the merge joins
turned off.

Unfortunately it takes over 2 minutes to actually return the 160,000+ rows.
I am guessing that is either the network (I have gig cards on a LAN) or perhaps
the ODBC driver (I am using pgAdmin III to do the select).

I tried to run it in psql on the server but it was paging the output through 'more'.
If I do it and use > test.txt will it run it all the way out so I can get a time?
Does it display the time anywhere like pgAdmin III does?




FW: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin

What are the statistics
for tbljobtitle.id and tbljobtitle.clientnum?
I added default_statistics_target = 250 to the config and reloaded the
database, if that is what you mean.

--- how many distinct values of each, 

tbljobtitle.id 6764 for all clients 1018 for SAKS
tbljobtitle.clientnum 237 distinct clientnums just 1 for SAKS

and are the distributions skewed to a few popular values?
There are 3903 distinct values for jobtitle

Not sure if I answered the questions, let me know if you need more info.
It appears there are 1018 job titles in the table for SAKS and 6764 for all
the clients. There can be more values, as the presentation layer can have more
than one value for an id. SAKS is not using the presentation layer yet, as there
are only 1018 distinct values, 1 for each id.

Joel








Re: [PERFORM] speed of querry?

2005-04-18 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Joel Fradkin
 Sent: 18 April 2005 14:02
 To: PostgreSQL Perform
 Subject: FW: [PERFORM] speed of querry?
 
 Another odd thing: when I tried turning off merge joins on 
 the XP desktop,
 it took 32 secs to run compared to the 6 secs it was taking.
 On the Linux (4proc box) it is now running in 3 secs with the 
 mergejoins
 turned off.
 
 Unfortunately it takes over 2 minutes to actually return the 
 160,000+ rows.
 I am guessing that is either network (I have gig cards on a 
 LAN) or perhaps
 the ODBC driver (using PGADMIN III to do the select).

pgAdmin III uses libpq, not the ODBC driver.

Regards, Dave



Re: [PERFORM] speed of querry?

2005-04-18 Thread Joel Fradkin
pgAdmin III uses libpq, not the ODBC driver.

Sorry, I am not too aware of all the semantics.
I guess the question is whether it is normal to take 2 mins to get 160K
records, or whether there is something else I can do (I plan on limiting the
query screens using limit and offset; I realize this will only be effective
for the early part of the returned record set, but I believe users don't page
through a bunch of records, they probably add search criteria). But for
reporting I will need to return all the records, and this seems slow to me
(but it might be in line with what I get now; I will have to do some
benchmarking).
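As an illustration of the limit/offset paging idea (the table and column names
here are made up, not the actual schema):

SELECT caseid, casedate, description
  FROM tblcase
 WHERE clientnum = 'SAKS'
 ORDER BY casedate DESC
 LIMIT 50 OFFSET 100;  -- third page of 50 rows; large OFFSETs get progressively slower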

The application is a mixture of .net and asp and will soon have java.
So I am using the .net library for the .net pages and the ODBC driver for
the asp pages.

I did find that using a view for the location join sped up the query a great
deal; I will have to see if there are other places I can use that thinking
(instead of joining on the associate table and its dependents I can just 
join on a view of that data, etc).

Basically I have a view that joins the location table to the district, region
and division tables. The old viwassoclist had those joined to the assoc 
table directly; I changed it to use the view I created, where the
tables are joined to the location table, and in assoclist I just join to the
location view. This really made a huge difference in speed.
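A rough sketch of that kind of view (the actual table and column names in this
schema aren't shown, so these are guesses):

CREATE VIEW viwlocation AS
SELECT l.locationid,
       l.name AS locationname,
       d.name AS districtname,
       r.name AS regionname,
       v.name AS divisionname
  FROM tbllocation l
  JOIN tbldistrict d ON d.districtid = l.districtid
  JOIN tblregion   r ON r.regionid   = d.regionid
  JOIN tbldivision v ON v.divisionid = r.divisionid;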

Regards, Dave




Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-18 Thread Dawid Kuroczko
On 4/18/05, Merlin Moncure [EMAIL PROTECTED] wrote:
  d) self-join with a function ;)
EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username
  FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN
  aaa USING (n);
 
 That's pretty clever.
 It sure seems like the server was not caching the results of the
 function...maybe the server thought it was to small a table to bother?

Nah, I don't think so.  With around 2 097 152 rows of 1s and 0s, the
id2username() query takes 48 seconds.
The self-join you've quoted above takes 32 seconds.
SELECT n FROM aaa; takes 7 seconds.

Thinking further...
SELECT CASE n WHEN 0 THEN 'foo' WHEN 1 THEN 'bar' END FROM aaa;
takes 9 seconds.

CREATE OR REPLACE FUNCTION id2un_case(oid int) RETURNS text AS $$
BEGIN RETURN CASE oid WHEN 0 THEN 'foo' WHEN 1 THEN 'bar' END; END; $$
LANGUAGE plpgsql IMMUTABLE;
SELECT id2un_case(n) FROM aaa;
...takes 36 seconds

...and to see how it depends on flags used:
SELECT count(id2un_case(n)) FROM aaa;
...id2un_case(n) IMMUTABLE takes 29900,114 ms
...id2un_case(n) IMMUTABLE STRICT takes 30187,958 ms
...id2un_case(n) STABLE takes 31457,560 ms
...id2un_case(n) takes 33545,178 ms
...id2un_case(n) VOLATILE takes 35150,920 ms
(and a count(CASE n WHEN ... END) FROM aaa takes: 2564,188 ms)


I understand that these measurements are not too accurate.  They
were done on an idle system, and the queries were run a couple of times
(to make sure they're cached :)).  I believe either something is wrong or
there is only a minor performance difference between IMMUTABLE, STABLE
and even VOLATILE plpgsql functions... :(

Oh, and doing things like ORDER BY n or WHERE n = 1 didn't help
either...

I still wonder whether it's only my case or whether there is really something
wrong with these functions?

   Regards,
   Dawid



Re: FW: [PERFORM] speed of querry?

2005-04-18 Thread Stephan Szabo
On Mon, 18 Apr 2005, Joel Fradkin wrote:

 Another odd thing is when I tried turning off merge joins on the XP desktop
 It took 32 secs to run compared to the 6 secs it was taking.
 On the Linux (4proc box) it is now running in 3 secs with the mergejoins
 turned off.

 Unfortunately it takes over 2 minutes to actually return the 160,000+ rows.
 I am guessing that is either network (I have gig cards on a LAN) or perhaps
 the ODBC driver (using PGADMIN III to do the select).

 I tried to run it in psql on the server but it was paging the output through 'more'.
 If I do it and use > test.txt will it run it all the way out so I can get a time?
 Does it display the time anywhere like pgAdmin III does?

Redirecting should turn the pager off. \timing will add a timing number
after queries.  If you don't want to be bothered by the pager, you can turn
it off with \pset pager off.
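For example, a psql session along these lines (the query and file name are just
placeholders):

\pset pager off
\timing
\o test.txt
SELECT * FROM viwassoclist WHERE clientnum = 'SAKS';
\o

\timing prints the elapsed time after each query, and \o sends query output to
the named file instead of the screen.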




[PERFORM] How to improve postgres performace

2005-04-18 Thread Rodrigo Moreno
Hi all,

Some months ago I posted a similar problem here and it was solved by running
vacuumdb from time to time.
When I started using Postgres I never used vacuumdb; after 2 months I started
running it once a week, after a few weeks I tried once a day, and now twice a
day.

This weekend I started using pg_autovacuum with default settings.

I'm really worried about that, because it's not enough anymore, and users
complain about performance. But after running vacuumdb with the full option,
everything starts to run better again, so I think the problem is not related
to a specific query.

What can I do to check what I should change to get more performance?
Could I use VACUUM VERBOSE to check what is going on? If so, how?

Most of the time, even with users querying the server, the machine is 96%-100%
idle. The discs are SCSI, the OS is FreeBSD 5.3, the size of the database is
1.1Gb, max 30 connections and 10 concurrent connections. My server has 512Mb
RAM and SHMMAX has been set to 256Mb. There are at most 1000
inserted/deleted/updated rows per day.

These are my kernel params:
--
options SHMMAXPGS=65536
options SEMMNI=40
options SEMMNS=240
options SEMUME=40
options SEMMNU=120

Postgresql.conf non-default settings

tcpip_socket = true
max_connections = 30

shared_buffers = 1024
sort_mem = 2048
vacuum_mem = 16384

wal_buffers = 16
checkpoint_segments = 5

effective_cache_size = 16384
random_page_cost = 2

stats_start_collector = true
stats_row_level = true


I follow most of the discussions in this group and have tried changing the
parameters myself, but now I don't know what more to do to get better
performance.

Thanks a Lot
Rodrigo Moreno






Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Greg Stark

William Yu [EMAIL PROTECTED] writes:

 Using the above prices for a fixed budget for RAID-10, you could get:
 
 SATA 7200 -- 680GB per $1000
 SATA 10K  -- 200GB per $1000
 SCSI 10K  -- 125GB per $1000

What a lot of these analyses miss is that cheaper == faster because cheaper
means you can buy more spindles for the same price. I'm assuming you picked
equal sized drives to compare so that 200GB/$1000 for SATA is almost twice as
many spindles as the 125GB/$1000. That means it would have almost double the
bandwidth. And the 7200 RPM case would have more than 5x the bandwidth.

While 10k RPM drives have lower seek times, and SCSI drives have a natural
seek time advantage, under load a RAID array with fewer spindles will start
hitting contention sooner, which results in higher latency. If the controller
works well the larger SATA arrays above should be able to maintain their
mediocre latency much better under load than the SCSI array with fewer drives
would maintain its low latency response time despite its drives' lower average
seek time.

-- 
greg




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
This is fundamentally untrue.

A mirror is still a mirror.  At most in a RAID 10 you can have two
simultaneous seeks.  You are always going to be limited by the seek
time of your drives.  It's a stripe, so you have to read from all
members of the stripe to get data, requiring all drives to seek. 
There is no advantage to seek time in adding more drives.  By adding
more drives you can increase throughput, but the max throughput of the
PCI-X bus isn't that high (I think around 400MB/sec).  You can easily
get this with a six or seven drive RAID 5, or a ten drive RAID 10.  At
that point you start having to factor in the cost of a bigger chassis
to hold more drives, which can be big bucks.

Alex Turner
netEconomist

On 18 Apr 2005 10:59:05 -0400, Greg Stark [EMAIL PROTECTED] wrote:
 
 William Yu [EMAIL PROTECTED] writes:
 
  Using the above prices for a fixed budget for RAID-10, you could get:
 
  SATA 7200 -- 680GB per $1000
  SATA 10K  -- 200GB per $1000
  SCSI 10K  -- 125GB per $1000
 
 What a lot of these analyses miss is that cheaper == faster because cheaper
 means you can buy more spindles for the same price. I'm assuming you picked
 equal sized drives to compare so that 200GB/$1000 for SATA is almost twice as
 many spindles as the 125GB/$1000. That means it would have almost double the
 bandwidth. And the 7200 RPM case would have more than 5x the bandwidth.
 
 While 10k RPM drives have lower seek times, and SCSI drives have a natural
 seek time advantage, under load a RAID array with fewer spindles will start
 hitting contention sooner which results into higher latency. If the controller
 works well the larger SATA arrays above should be able to maintain their
 mediocre latency much better under load than the SCSI array with fewer drives
 would maintain its low latency response time despite its drives' lower average
 seek time.
 
 --
 greg
 
 




[PERFORM] Sort and index

2005-04-18 Thread Andrei Gaspar
Hi,
I thought that an index can be used for sorting.
I'm a little confused about the following result:
create index OperationsName on Operations(cOperationName);
explain SELECT * FROM Operations ORDER BY cOperationName;
 QUERY PLAN
---
Sort  (cost=185.37..189.20 rows=1532 width=498)
  Sort Key: coperationname
  ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
(3 rows)
Is this supposed to be so?
Andrei


Re: [PERFORM] How to improve postgres performace

2005-04-18 Thread Tom Lane
Rodrigo Moreno [EMAIL PROTECTED] writes:
 At this weekend i have started to use pg_autovacuum with default settings.

 I really worried about that, because it's no enough anymore, and users claim
 about performace. But running the vacuumdb full, everthing starts to run
 better again, so i think the problem is not related to a specific query.

It sounds like you may not have the FSM settings set large enough for
your database.  The default settings are only enough for a small DB
(perhaps a few hundred meg).
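The relevant postgresql.conf settings, with purely illustrative values (the
right numbers depend on how much free space your tables accumulate between
vacuums, which the tail of the VACUUM VERBOSE output reports):

max_fsm_pages = 200000        # total pages with free space to track, cluster-wide
max_fsm_relations = 1000      # number of tables and indexes to track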

regards, tom lane



Re: [PERFORM] Sort and index

2005-04-18 Thread Andrei Gaspar
Thanks for the quick response
Andrei
Dave Held wrote:
-Original Message-
From: Andrei Gaspar [mailto:[EMAIL PROTECTED]
Sent: Monday, April 18, 2005 10:36 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Sort and index
I thought that an index can be used for sorting.
I'm a little confused about the following result:
create index OperationsName on Operations(cOperationName);
explain SELECT * FROM Operations ORDER BY cOperationName;
 QUERY PLAN
---------------------------------------------------------------------
Sort  (cost=185.37..189.20 rows=1532 width=498)
  Sort Key: coperationname
  ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
(3 rows)
Is this supposed to be so?
   

Since you are fetching the entire table, you are touching all the rows.
If the query were to fetch the rows in index order, it would be seeking
all over the table's tracks.  By fetching in sequence order, it has a
much better chance of fetching rows in a way that minimizes head seeks.
Since disk I/O is generally 10-100x slower than RAM, the in-memory sort 
can be surprisingly slow and still beat indexed disk access.  Of course,
this is only true if the table can fit and be sorted entirely in memory
(which, with 1500 rows, probably can).
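For what it's worth, the index does tend to get used when only part of the
sorted output is needed -- a sketch, not from the original thread:

EXPLAIN SELECT * FROM Operations ORDER BY cOperationName LIMIT 10;
-- with the OperationsName index in place, the planner will typically pick an
-- index scan here, since only the first few rows in sorted order are required.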

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
 




RES: [PERFORM] How to improve postgres performace

2005-04-18 Thread Rodrigo Moreno
Tom,

How do I check if the value is enough? Can the log generated by VACUUM VERBOSE
help?

The current values are:

max_fsm_pages = 1048576
max_fsm_relations = 1000

Is this enough?

Regards,
Rodrigo 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, 18 April 2005 12:58
To: Rodrigo Moreno
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve postgres performace

Rodrigo Moreno [EMAIL PROTECTED] writes:
 At this weekend i have started to use pg_autovacuum with default settings.

 I really worried about that, because it's no enough anymore, and users 
 claim about performace. But running the vacuumdb full, everthing 
 starts to run better again, so i think the problem is not related to a
specific query.

It sounds like you may not have the FSM settings set large enough for your
database.  The default settings are only enough for a small DB (perhaps a
few hundred meg).

regards, tom lane






Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
[snip]
 
 Adding drives will not let you get lower response times than the average seek
 time on your drives*. But it will let you reach that response time more often.
 
[snip]

I believe your assertion is fundamentally flawed.  Adding more drives
will not let you reach that response time more often.  All drives are
required to fill every request in all RAID levels (except possibly
0+1, but that isn't used for enterprise applications).  Most requests
in OLTP require most of the request time to seek, not to read.  Only
in single large block data transfers will you get any benefit from
adding more drives, which is atypical in most database applications. 
For most database applications, the only way to increase
transactions/sec is to decrease request service time, which is
generally achieved with better seek times or a better controller card,
or possibly spreading your database across multiple tablespaces on
separate partitions.

My assertion therefore is that simply adding more drives to an already
competent* configuration is about as likely to increase your database
effectiveness as swiss cheese is to make your car run faster.

Alex Turner
netEconomist

*Assertion here is that the DBA didn't simply configure all tables and
xlog on a single 7200 RPM disk, but has separate physical drives for
xlog and tablespace, at least on 10k drives.



Re: [PERFORM] Sort and index

2005-04-18 Thread Michael Fuhr
On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote:
  
  I thought that an index can be used for sorting.
  I'm a little confused about the following result:
  
  create index OperationsName on Operations(cOperationName);
  explain SELECT * FROM Operations ORDER BY cOperationName;
QUERY PLAN
  ---------------------------------------------------------------------
   Sort  (cost=185.37..189.20 rows=1532 width=498)
     Sort Key: coperationname
     ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
  (3 rows)
  
  Is this supposed to be so?
 
 Since you are fetching the entire table, you are touching all the rows.
 If the query were to fetch the rows in index order, it would be seeking
 all over the table's tracks.  By fetching in sequence order, it has a
 much better chance of fetching rows in a way that minimizes head seeks.
 Since disk I/O is generally 10-100x slower than RAM, the in-memory sort 
 can be surprisingly slow and still beat indexed disk access.  Of course,
 this is only true if the table can fit and be sorted entirely in memory
 (which, with 1500 rows, probably can).

Out of curiosity, what are the results of the following queries?
(Queries run twice to make sure time differences aren't due to
caching.)

SET enable_seqscan TO on;
SET enable_indexscan TO off;
EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;

SET enable_seqscan TO off;
SET enable_indexscan TO on;
EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;

SELECT version();

With 1500 rows of random data, I consistently see better performance
with an index scan (about twice as fast as a sequence scan), and
the planner uses an index scan if it has a choice (i.e., when
enable_seqscan and enable_indexscan are both on).  But my test case
and postgresql.conf settings might be different enough from yours
to account for different behavior.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Jacques Caron
Hi,
At 18:56 18/04/2005, Alex Turner wrote:
All drives are required to fill every request in all RAID levels
No, this is definitely wrong. In many cases, most drives don't actually 
have the data requested -- how could they handle the request?

When reading one random sector, only *one* drive out of N is ever used to 
service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.

When writing:
- in RAID 0, 1 drive
- in RAID 1, RAID 0+1 or 1+0, 2 drives
- in RAID 5, you need to read on all drives and write on 2.
Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
Jacques.



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alan Stange
Alex Turner wrote:
[snip]
 

Adding drives will not let you get lower response times than the average seek
time on your drives*. But it will let you reach that response time more often.
   

[snip]
I believe your assertion is fundamentaly flawed.  Adding more drives
will not let you reach that response time more often.  All drives are
required to fill every request in all RAID levels (except possibly
0+1, but that isn't used for enterprise applicaitons).  Most requests
in OLTP require most of the request time to seek, not to read.  Only
in single large block data transfers will you get any benefit from
adding more drives, which is atypical in most database applications. 
For most database applications, the only way to increase
transactions/sec is to decrease request service time, which is
generaly achieved with better seek times or a better controller card,
or possibly spreading your database accross multiple tablespaces on
seperate paritions.

My assertion therefore is that simply adding more drives to an already
competent* configuration is about as likely to increase your database
effectiveness as swiss cheese is to make your car run faster.
 

Consider the case of a mirrored file system with a mostly read() 
workload.  Typical behavior is to use a round-robin method for issuing 
the read operations to each mirror in turn, but one can use other 
methods, like a geometric algorithm that will issue the reads to the 
drive with the head located closest to the desired track.  Some 
systems have many mirrors of the data for exactly this behavior.   In 
fact, one can carry this logic to the extreme and have one drive for 
every cylinder in the mirror, thus removing seek latencies completely.  
In fact this extreme case would also remove the rotational latency as 
the cylinder will be in the disk's read cache.  :-)   Of course, writing 
data would be a bit slow!

I'm not sure I understand your assertion that all drives are required 
to fill every request in all RAID levels.   After all, in mirrored 
reads only one mirror needs to read any given block of data, so I don't 
know what goal is achieved in making other mirrors read the same data.

My assertion (based on ample personal experience) is that one can 
*always* get improved performance by adding more drives.  Just limit the 
drives to use the first few cylinders so that the average seek time is 
greatly reduced and concatenate the drives together.  One can then build 
the usual RAID device out of these concatenated metadevices.  Yes, one 
is wasting lots of disk space, but that's life.   If your goal is 
performance, then you need to put your money on the table. The 
system will be somewhat unreliable because of the device count, 
additional SCSI buses, etc., but that too is life in the high 
performance world.

-- Alan


Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread John A Meinel
Alex Turner wrote:
[snip]

Adding drives will not let you get lower response times than the average seek
time on your drives*. But it will let you reach that response time more often.

[snip]
I believe your assertion is fundamentaly flawed.  Adding more drives
will not let you reach that response time more often.  All drives are
required to fill every request in all RAID levels (except possibly
0+1, but that isn't used for enterprise applicaitons).
Actually 0+1 is the recommended configuration for postgres databases
(both for xlog and for the bulk data), because the write speed of RAID5
is quite poor.
Hence your base assumption is not correct, and adding drives *does* help.
Most requests
in OLTP require most of the request time to seek, not to read.  Only
in single large block data transfers will you get any benefit from
adding more drives, which is atypical in most database applications.
For most database applications, the only way to increase
transactions/sec is to decrease request service time, which is
generaly achieved with better seek times or a better controller card,
or possibly spreading your database accross multiple tablespaces on
seperate paritions.

This is probably true. However, if you are doing lots of concurrent
connections, and things are properly spread across multiple spindles
(using RAID 0+1, or possibly tablespaces across multiple RAIDs),
then each seek occurs on a separate drive, which allows them to occur at
the same time, rather than sequentially. Having 2 processes competing
for seeks on the same drive is going to be worse than having them on
separate drives.
John
=:-




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Jacques Caron
Hi,
At 16:59 18/04/2005, Greg Stark wrote:
William Yu [EMAIL PROTECTED] writes:
 Using the above prices for a fixed budget for RAID-10, you could get:

 SATA 7200 -- 680GB per $1000
 SATA 10K  -- 200GB per $1000
 SCSI 10K  -- 125GB per $1000
What a lot of these analyses miss is that cheaper == faster because cheaper
means you can buy more spindles for the same price. I'm assuming you picked
equal sized drives to compare so that 200GB/$1000 for SATA is almost twice as
many spindles as the 125GB/$1000. That means it would have almost double the
bandwidth. And the 7200 RPM case would have more than 5x the bandwidth.
While 10k RPM drives have lower seek times, and SCSI drives have a natural
seek time advantage, under load a RAID array with fewer spindles will start
hitting contention sooner which results into higher latency. If the controller
works well the larger SATA arrays above should be able to maintain their
mediocre latency much better under load than the SCSI array with fewer drives
would maintain its low latency response time despite its drives' lower average
seek time.
I would definitely agree. More factors in favor of more cheap drives:
- cheaper drives (7200 rpm) have larger platters (3.7" diameter against 2.6" or 
3.3"). That means the outer tracks hold more data, and the same amount of 
data is held in a smaller area, which means fewer tracks, which means 
reduced seek times. You can roughly estimate the real average seek time as 
(average seek time over full disk * size of dataset / capacity of disk). 
And you actually need to physically seek less often too.

- more disks means less data per disk, which means the data is further 
concentrated on outer tracks, which means even lower seek times

Also, what counts is indeed not so much the time it takes to do one single 
random seek, but the number of random seeks you can do per second. Hence, 
more disks means more seeks per second (if requests are evenly distributed 
among all disks, which a good stripe size should achieve).

Not taking into account TCQ/NCQ or write cache optimizations, the important 
parameter (random seeks per second) can be approximated as:

N * 1000 / (lat + seek * ds / (N * cap))
Where:
N is the number of disks
lat is the average rotational latency in milliseconds (500/(rpm/60))
seek is the average seek over the full disk in milliseconds
ds is the dataset size
cap is the capacity of each disk
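As a worked example with this formula, the 21 x 80GB SATA case mentioned below,
assuming 7200 rpm and an average full-disk seek of roughly 8.5 ms, gives:

lat = 500 / (7200/60) = ~4.2 ms
seeks/sec = 21 * 1000 / (4.2 + 8.5 * 100 / (21 * 80))
          = 21000 / (4.2 + 0.5)
          = ~4500

which is consistent with the "more than 4000 seeks/second" figure below.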
Using this formula and a variety of disks, counting only the disks 
themselves (no enclosures, controllers, rack space, power, maintenance...), 
trying to maximize the number of seeks/second for a fixed budget (1000 
euros) with a dataset size of 100 GB makes SATA drives clear winners: you 
can get more than 4000 seeks/second (with 21 x 80GB disks) where SCSI 
cannot even make it to the 1400 seek/second point (with 8 x 36 GB disks). 
Results can vary quite a lot based on the dataset size, which illustrates 
the importance of staying on the edges of the disks. I'll try to make the 
analysis more complete by counting some of the overhead (obviously 21 
drives have a lot of other implications!), but I believe SATA drives still 
win in theory.

It would be interesting to actually compare this to real-world (or 
nearly-real-world) benchmarks to measure the effectiveness of features like 
TCQ/NCQ etc.

Jacques.
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Steve Poe
Alex,
In the situation of the animal hospital server I oversee, their 
application is OLTP. Adding hard drives (6-8) does help performance. 
Benchmarks like pgbench and OSDB agree with it, but in reality users 
could not see a noticeable change. However, moving the top 5/10 tables and 
indexes to their own space made a greater impact.

Someone who reads the "PostgreSQL 8.0 Performance Checklist" is going to see 
that point #1, "add more disks", is the key. How about adding a subpoint 
explaining when more disks aren't enough or applicable? I may be 
generalizing the complexity of tuning an OLTP application, but some 
clarity could help.

Steve Poe




RES: RES: [PERFORM] How to improve postgres performace

2005-04-18 Thread Rodrigo Moreno
 That max_fsm_pages value is enough to cover 8Gb, so it should work OK for
a database disk footprint up to 10 or so Gb.   I don't know how many tables
in your installation so I can't say if max_fsm_relations is high enough, but
you can check that by looking at the tail end of the output of VACUUM
VERBOSE.  (Or just count 'em ;-))
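(The 8Gb figure follows from the 8kB default page size: 1048576 pages * 8kB per
page = 8GB of table and index data whose free space can be tracked.)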

The last count in VACUUM VERBOSE shows me 92 relations, so I know a lower
value for max_fsm_relations would be enough; maybe I'll change it to 500.

 Offhand these look reasonable, though, so if you are seeing database bloat
over time it probably means you need to tweak  your autovacuum settings.
I'm not much of an autovacuum expert, but maybe someone else can help you
there.

I'll leave autovacuum running this week to see what happens. 

 You might want to keep track of physical file sizes over a period of time
and try to determine exactly where the bloat  is happening.

There are two heavily used, larger tables; I'll keep an eye on both.

Thanks 
Rodrigo Moreno





Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
Not true - the recommended RAID level is RAID 10, not RAID 0+1 (at
least I would never recommend 1+0 for anything).

RAID 10 and RAID 0+1 are _quite_ different.  One gives you very good
redundancy, the other is only slightly better than RAID 5, but
operates faster in degraded mode (single drive).

Alex Turner
netEconomist

On 4/18/05, John A Meinel [EMAIL PROTECTED] wrote:
 Alex Turner wrote:
 
 [snip]
 
 
 Adding drives will not let you get lower response times than the average 
 seek
 time on your drives*. But it will let you reach that response time more 
 often.
 
 
 
 [snip]
 
 I believe your assertion is fundamentaly flawed.  Adding more drives
 will not let you reach that response time more often.  All drives are
 required to fill every request in all RAID levels (except possibly
 0+1, but that isn't used for enterprise applicaitons).
 
 Actually 0+1 is the recommended configuration for postgres databases
 (both for xlog and for the bulk data), because the write speed of RAID5
 is quite poor.
 Hence you base assumption is not correct, and adding drives *does* help.
 
 Most requests
 in OLTP require most of the request time to seek, not to read.  Only
 in single large block data transfers will you get any benefit from
 adding more drives, which is atypical in most database applications.
 For most database applications, the only way to increase
 transactions/sec is to decrease request service time, which is
 generaly achieved with better seek times or a better controller card,
 or possibly spreading your database accross multiple tablespaces on
 seperate paritions.
 
 
 This is probably true. However, if you are doing lots of concurrent
 connections, and things are properly spread across multiple spindles
 (using RAID0+1, or possibly tablespaces across multiple raids).
 Then each seek occurs on a separate drive, which allows them to occur at
 the same time, rather than sequentially. Having 2 processes competing
 for seeking on the same drive is going to be worse than having them on
 separate drives.
 John
 =:-
 
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
I think the "add more disks" thing is really from the point of view that
one disk isn't ever enough.  You should really have at least four
drives configured into two RAID 1s.  Most DBAs will know this, but
most average Joes won't.

Alex Turner
netEconomist

On 4/18/05, Steve Poe [EMAIL PROTECTED] wrote:
 Alex,
 
 In the situation of the animal hospital server I oversee, their
 application is OLTP. Adding hard drives (6-8) does help performance.
 Benchmarks like pgbench and OSDB agree with it, but in reality users
 could not see noticeable change. However, moving the top 5/10 tables and
 indexes to their own space made a greater impact.
 
 Someone who reads PostgreSQL 8.0 Performance Checklist is going to see
 point #1 add more disks is the key. How about adding a subpoint to
 explaining when more disks isn't enough or applicable? I maybe
 generalizing the complexity of tuning an OLTP application, but some
 clarity could help.
 
 Steve Poe
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
Ok - well - I am partially wrong...

If your stripe size is 64kB, and you are reading 256kB worth of data,
it will be spread across four drives, so you will need to read from
four devices to get your 256kB of data (RAID 0 or 5 or 10), but if you
are only reading 64kB of data, I guess you would only need to read
from one disk.

So my assertion that adding more drives doesn't help is pretty
wrong... particularly with OLTP, because it's always dealing with
blocks that are smaller than the stripe size.

Alex Turner
netEconomist

On 4/18/05, Jacques Caron [EMAIL PROTECTED] wrote:
 Hi,
 
 At 18:56 18/04/2005, Alex Turner wrote:
 All drives are required to fill every request in all RAID levels
 
 No, this is definitely wrong. In many cases, most drives don't actually
 have the data requested, how could they handle the request?
 
 When reading one random sector, only *one* drive out of N is ever used to
 service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.
 
 When writing:
 - in RAID 0, 1 drive
 - in RAID 1, RAID 0+1 or 1+0, 2 drives
 - in RAID 5, you need to read on all drives and write on 2.
 
 Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
 
 Jacques.
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
So I wonder if one could take this stripe size thing further and say
that a larger stripe size is more likely to result in requests getting
served parallelized across disks, which would lead to increased
performance?

Again, thanks to all people on this list, I know that I have learnt a
_hell_ of alot since subscribing.

Alex Turner
netEconomist

On 4/18/05, Alex Turner [EMAIL PROTECTED] wrote:
 Ok - well - I am partially wrong...
 
 If you're stripe size is 64Kb, and you are reading 256k worth of data,
 it will be spread across four drives, so you will need to read from
 four devices to get your 256k of data (RAID 0 or 5 or 10), but if you
 are only reading 64kb of data, I guess you would only need to read
 from one disk.
 
 So my assertion that adding more drives doesn't help is pretty
 wrong... particularly with OLTP because it's always dealing with
 blocks that are smaller that the stripe size.
 
 Alex Turner
 netEconomist
 
 On 4/18/05, Jacques Caron [EMAIL PROTECTED] wrote:
  Hi,
 
  At 18:56 18/04/2005, Alex Turner wrote:
  All drives are required to fill every request in all RAID levels
 
  No, this is definitely wrong. In many cases, most drives don't actually
  have the data requested, how could they handle the request?
 
  When reading one random sector, only *one* drive out of N is ever used to
  service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.
 
  When writing:
  - in RAID 0, 1 drive
  - in RAID 1, RAID 0+1 or 1+0, 2 drives
  - in RAID 5, you need to read on all drives and write on 2.
 
  Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
 
  Jacques.
 
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Greg Stark

Jacques Caron [EMAIL PROTECTED] writes:

 When writing:
 - in RAID 0, 1 drive
 - in RAID 1, RAID 0+1 or 1+0, 2 drives
 - in RAID 5, you need to read on all drives and write on 2.

Actually RAID 5 only really needs to read from two drives: the existing parity
block and the block you're replacing. It just XORs the old block, the new
block, and the existing parity block to generate the new parity block.
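A one-bit worked example of that update: if old_block = 1, old_parity = 0 and
new_block = 0, then new_parity = old_parity XOR old_block XOR new_block =
0 XOR 1 XOR 0 = 1, and the stripe's parity stays consistent without reading
the other data drives.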

-- 
greg




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Joshua D. Drake
Alex Turner wrote:
Not true - the recommended RAID level is RAID 10, not RAID 0+1 (at
least I would never recommend 1+0 for anything).
Uhmm I was under the impression that 1+0 was RAID 10 and that 0+1 is NOT
RAID 10.
Ref: http://www.acnc.com/raid.html
Sincerely,
Joshua D. Drake



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Jacques Caron
Hi,
At 20:16 18/04/2005, Alex Turner wrote:
So my assertion that adding more drives doesn't help is pretty
wrong... particularly with OLTP because it's always dealing with
blocks that are smaller that the stripe size.
When doing random seeks (which is what a database needs most of the time), 
the number of disks helps improve the number of seeks per second (which is 
the bottleneck in this case). When doing sequential reads, the number of 
disks helps improve total throughput (which is the bottleneck in that case).

In short: it always helps :-)
Jacques.



Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-18 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 It would be interesting sometime to try to teach the planner about
 inlining SQL-language functions to become joins.  That is, given
 
 create function id2name(int) returns text as
 'select name from mytab where id = $1' language sql stable;
 
 select uid, id2name(uid) from othertab where something;
 
 I think that in principle this could automatically be converted to
 
 select uid, name from othertab left join mytab on (uid = id) where something;

The Inlining of the function is presumably a side-issue. I have tons of
queries that use subqueries in the select list for which the same behaviour
would be appropriate.

Things like

select uid, (select name from mytab where id = uid) as name from othertab ...


 There are some pitfalls though, particularly that you'd have to be able to
 prove that the function's query couldn't return more than one row (else the
 join might produce more result rows than the original query).

Or just have a special join type that has the desired behaviour in that case.
Ie, pretend the query was really

SELECT * FROM othertab LEFT SINGLE JOIN mytab ...

Where LEFT SINGLE JOIN is an imaginary syntax that doesn't actually have to
exist in the parser, but exists in the planner/executor and behaves
differently in the case of duplicate matches.

Actually I could see such a syntax being useful directly too.

-- 
greg




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Mohan, Ross
Don't you think optimal stripe width would be
a good question to research the binaries for? I'd
think that drives the answer, largely.  (uh oh, pun alert)

E.g., Oracle issues IO requests (this may have changed _just_ 
recently) in 64KB chunks, regardless of what you ask for. 
So when I did my striping (many moons ago, when the Earth 
was young...) I did it in 128KB widths, and set the Oracle 
multiblock read count accordingly. For Oracle, any stripe size
under 64KB = stupid, anything much over 128K/256K = wasteful. 

I am eager to find out how PG handles all this. 


- Ross



p.s. Brooklyn thug accent 'You want a database record? I 
  gotcher record right here' http://en.wikipedia.org/wiki/Akashic_Records



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Turner
Sent: Monday, April 18, 2005 2:21 PM
To: Jacques Caron
Cc: Greg Stark; William Yu; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


So I wonder if one could take this stripe size thing further and say that a 
larger stripe size is more likely to result in requests getting served 
parallized across disks which would lead to increased performance?

Again, thanks to all people on this list, I know that I have learnt a _hell_ of 
alot since subscribing.

Alex Turner
netEconomist

On 4/18/05, Alex Turner [EMAIL PROTECTED] wrote:
 Ok - well - I am partially wrong...
 
 If you're stripe size is 64Kb, and you are reading 256k worth of data, 
 it will be spread across four drives, so you will need to read from 
 four devices to get your 256k of data (RAID 0 or 5 or 10), but if you 
 are only reading 64kb of data, I guess you would only need to read 
 from one disk.
 
 So my assertion that adding more drives doesn't help is pretty 
 wrong... particularly with OLTP because it's always dealing with 
 blocks that are smaller that the stripe size.
 
 Alex Turner
 netEconomist
 
 On 4/18/05, Jacques Caron [EMAIL PROTECTED] wrote:
  Hi,
 
  At 18:56 18/04/2005, Alex Turner wrote:
  All drives are required to fill every request in all RAID levels
 
  No, this is definitely wrong. In many cases, most drives don't 
  actually have the data requested, how could they handle the request?
 
  When reading one random sector, only *one* drive out of N is ever 
  used to service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.
 
  When writing:
  - in RAID 0, 1 drive
  - in RAID 1, RAID 0+1 or 1+0, 2 drives
  - in RAID 5, you need to read on all drives and write on 2.
 
  Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
 
  Jacques.
 
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Jacques Caron
Hi,
At 20:21 18/04/2005, Alex Turner wrote:
So I wonder if one could take this stripe size thing further and say
that a larger stripe size is more likely to result in requests getting
served parallized across disks which would lead to increased
performance?
Actually, it would be pretty much the opposite. The smaller the stripe 
size, the more evenly distributed data is, and the more disks can be used 
to serve requests. If your stripe size is too large, many random accesses 
within one single file (whose size is smaller than the stripe size/number 
of disks) may all end up on the same disk, rather than being split across 
multiple disks (the extreme case being stripe size = total size of all 
disks, which means concatenation). If all accesses had the same cost (i.e. 
no seek time, only transfer time), the ideal would be to have a stripe size 
equal to the number of disks.

But below a certain size, you're going to use multiple disks to serve one 
single request which would not have taken much more time from a single disk 
(reading even a large number of consecutive blocks within one cylinder does 
not take much more time than reading a single block), so you would add 
unnecessary seeks on a disk that could have served another request in the 
meantime. You should definitely not go below the filesystem block size or 
the database block size.

There is an interesting discussion of the optimal stripe size in the vinum 
manpage on FreeBSD:

http://www.freebsd.org/cgi/man.cgi?query=vinum&apropos=0&sektion=0&manpath=FreeBSD+5.3-RELEASE+and+Ports&format=html
(look for "Performance considerations", towards the end -- note however 
that some of the calculations are not entirely correct).

Basically it says the optimal stripe size is somewhere between 256KB and 
4MB, preferably an odd number, and that some hardware RAID controllers 
don't like big stripe sizes. YMMV, as always.
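
To make the tradeoff concrete with invented numbers: on a 4-disk array with 
a 64 KB stripe, a random 8 KB read touches a single disk, so four such reads 
can proceed in parallel, while a 256 KB sequential read is spread across all 
four spindles. With a 4 MB stripe, that same 256 KB read almost always stays 
on one disk, and a file smaller than 4 MB may sit entirely on one disk, 
serializing every access to it.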

Jacques.



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
Mistype.. I meant 0+1 in the second instance :(


On 4/18/05, Joshua D. Drake [EMAIL PROTECTED] wrote:
 Alex Turner wrote:
  Not true - the recommended RAID level is RAID 10, not RAID 0+1 (at
  least I would never recommend 1+0 for anything).
 
 Uhmm I was under the impression that 1+0 was RAID 10 and that 0+1 is NOT
 RAID 10.
 
 Ref: http://www.acnc.com/raid.html
 
 Sincerely,
 
 Joshua D. Drake
 
 
 




Re: [PERFORM] Compressing WAL

2005-04-18 Thread Bruce Momjian

Added to TODO:

* Compress WAL entries [wal]

I have also added this email to TODO.detail.

---

Simon Riggs wrote:
 On Sun, 2005-04-10 at 21:12 -0400, Bruce Momjian wrote:
  Jim C. Nasby wrote:
   Maybe better for -hackers, but here it goes anyway...
   
   Has anyone looked at compressing WAL's before writing to disk? On a
   system generating a lot of WAL it seems there might be some gains to be
   had WAL data could be compressed before going to disk, since today's
   machines are generally more I/O bound than CPU bound. And unlike the
   base tables, you generally don't need to read the WAL, so you don't
   really need to worry about not being able to quickly scan through the
   data without decompressing it.
  
  I have never heard anyone talk about it, but it seems useful.  I think
  compressing the page images written on first page modification since
  checkpoint would be a big win.
 
 Well it was discussed 2-3 years ago as part of the PITR preamble. You
 may be surprised to read that over...
 
 A summary of thoughts to date on this are:
 
 xlog.c XLogInsert places backup blocks into the wal buffers before
 insertion, so is the right place to do this. It would be possible to do
 this before any LWlocks are taken, so would not necessarily impair
 scalability.
 
 Currently XLogInsert is a severe CPU bottleneck around the CRC
 calculation, as identified recently by Tom. Digging further, the code
 used seems to cause processor stalls on Intel CPUs, possibly responsible
 for much of the CPU time. Discussions to move to a 32-bit CRC would also
 be affected by this because of the byte-by-byte nature of the algorithm,
 whatever the length of the generating polynomial. PostgreSQL's CRC
 algorithm is the fastest BSD code available. Until improvement is made
 there, I would not investigate compression further. Some input from
 hardware tuning specialists is required...
 
 The current LZW compression code uses a 4096 byte lookback size, so that
 would need to be modified to extend across a whole block. An
 alternative, suggested originally by Tom and rediscovered by me because
 I just don't read everybody's fine words in history, is to simply take
 out the freespace in the middle of every heap block that consists of
 zeros.
 
 Any solution in this area must take into account the variability of the
 size of freespace in database blocks. Some databases have mostly full
 blocks, others vary. There would also be considerable variation in
 compressability of blocks, especially since some blocks (e.g. TOAST) are
 likely to already be compressed. There'd need to be some testing done to
 see exactly the point where the costs of compression produce realisable
 benefits.
 
 So any solution must be able to cope with both compressed blocks and
 non-compressed blocks. My current thinking is that this could be
 achieved by using the spare fourth bit of the BkpBlocks portion of the
 XLog structure, so that either all included BkpBlocks are compressed or
 none of them are, and hope that allows benefit to shine through. Not
 thought about heap/index issues.
 
 It is possible that an XLogWriter process could be used to assist in the
 CRC and compression calculations also, and a similar process used to
 assist decompression for recovery, in time.
 
 I regret I do not currently have time to pursue further.
 
 Best Regards, Simon Riggs
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



[PERFORM] Question on REINDEX

2005-04-18 Thread Bill Chandler
All,

A couple of questions regarding REINDEX command:

Running PostgreSQL 7.4.2 on Solaris.

1) When is it necessary to run REINDEX or drop/create
an index?  All I could really find in the docs is:

In some situations it is worthwhile to rebuild
indexes periodically with the REINDEX command. (There
is also contrib/reindexdb which can reindex an entire
database.) However, PostgreSQL 7.4 has substantially
reduced the need for this activity compared to earlier
releases.

What are these situations?  We have a database with
some large tables.  Currently we reindex (actually
drop/create) nightly.  But as the tables have grown
this has become prohibitively time-consuming. 
According to the above comment it may not be necessary
at all.

2) If reindexing is necessary, how can this be done in
a non-obtrusive way in a production environment?  Our
database is being updated constantly.  REINDEX locks
client apps out while in progress.  Same with CREATE
INDEX when we drop/create.  The table can have over
10 million rows.  Recreating the indexes seems to take
hours.  This is too long to lock the client apps out. 
Is there any other solution?

thanks,

Bill






Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
On 4/18/05, Jacques Caron [EMAIL PROTECTED] wrote:
 Hi,
 
 At 20:21 18/04/2005, Alex Turner wrote:
 So I wonder if one could take this stripe size thing further and say
 that a larger stripe size is more likely to result in requests getting
 served parallized across disks which would lead to increased
 performance?
 
 Actually, it would be pretty much the opposite. The smaller the stripe
 size, the more evenly distributed data is, and the more disks can be used
 to serve requests. If your stripe size is too large, many random accesses
 within one single file (whose size is smaller than the stripe size/number
 of disks) may all end up on the same disk, rather than being split across
 multiple disks (the extreme case being stripe size = total size of all
 disks, which means concatenation). If all accesses had the same cost (i.e.
 no seek time, only transfer time), the ideal would be to have a stripe size
 equal to the number of disks.
 
[snip]

Ahh yes - but the critical distinction is this:
The smaller the stripe size, the more disks will be used to serve _a_
request - which is bad for OLTP because you want fewer disks per
request so that you can have more requests per second because the cost
is mostly seek.  If more than one disk has to seek to serve a single
request, you are preventing that disk from serving a second request at
the same time.

To have more throughput in MB/sec, you want a smaller stripe size so
that you have more disks serving a single request, allowing you to
multiply by the number of effective drives to get total bandwidth.

Because OLTP is made up of small reads and writes to a small number of
different files, I would guess that you want those files split up
across your RAID, but not so much that a single small read or write
operation would traverse more than one disk.   That would imply that
your optimal stripe size is somewhere on the right side of the bell
curve that represents your database read and write block count
distribution.  If on average the db writer never flushes less than 1MB
to disk at a time, then I guess your best stripe size would be 1MB,
but that seems very large to me.

So I think therefore that I may be contending the exact opposite of
what you are postulating!

Alex Turner
netEconomist



Re: [PERFORM] Question on REINDEX

2005-04-18 Thread Josh Berkus
Bill,

 1) When is it necessary to run REINDEX or drop/create
 an index?  All I could really find in the docs is:

If you need to VACUUM FULL, you need to REINDEX as well.  For example, if you 
drop millions of rows from a table.

 2) If reindexing is necessary, how can this be done in
 a non-obtrusive way in a production environment.  Our
 database is being updated constantly.  REINDEX locks
 client apps out while in progress.  Same with CREATE
 INDEX when we drop/create.  The table can have over
 10 million row.  Recreating the indexes seems to take
 hours.  This is too long to lock the client apps out.
 Is there any other solution?

Better to up your max_fsm_pages and run VACUUM regularly and 
frequently so that you don't have to REINDEX at all.
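
As a rough sketch (the numbers are placeholders, not tuned recommendations), 
that boils down to something like:

  # postgresql.conf -- the FSM must be able to remember all pages
  # freed between vacuums
  max_fsm_pages = 500000
  max_fsm_relations = 1000

  -- run frequently, e.g. from cron every night or every few hours
  VACUUM ANALYZE;

If the vacuums keep up with the dead-tuple rate and the FSM is large enough, 
table and index sizes reach a steady state and the nightly REINDEX becomes 
unnecessary.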


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Question on REINDEX

2005-04-18 Thread Bruno Wolff III
On Mon, Apr 18, 2005 at 12:21:42 -0700,
  Bill Chandler [EMAIL PROTECTED] wrote:
 
 Running PostgreSQL 7.4.2 on Solaris.
 
 1) When is it necessary to run REINDEX or drop/create
 an index?  All I could really find in the docs is:
 
 In some situations it is worthwhile to rebuild
 indexes periodically with the REINDEX command. (There
 is also contrib/reindexdb which can reindex an entire
 database.) However, PostgreSQL 7.4 has substantially
 reduced the need for this activity compared to earlier
 releases.

In pathological cases it is possible to have a lot of empty space on a lot
of your index pages, and reindexing shrinks the index back down.
In earlier versions, I think it was possible to have completely empty
pages, and this happened with patterns of use (new values monotonically
increasing, oldest values deleted first) that were actually seen in
practice.



Re: [PERFORM] Question on vacuumdb

2005-04-18 Thread Bruno Wolff III
On Mon, Apr 18, 2005 at 12:27:08 -0700,
  Bill Chandler [EMAIL PROTECTED] wrote:
 All,
 
 If I run the command vacuumdb mydb I understand that
 it does some disk space recovery (but not as much as
 vacuumdb --full mydb).  

You are better off not using vacuum full unless some unusual event has
bloated your database. By running normal vacuums often enough (and with
a large enough fsm setting) your database should reach a steady state size.

 Question: if I run the command vacuumdb --analyze
 mydb does it still do the aforementioned disk space
 recovery AS WELL AS update query planning statistics? 
 Or are those two completely separate operations
 requiring separate invocations of 'vacuumdb'.

It is better to do both with one command.
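
For example (illustrative only), a single run of

  vacuumdb --analyze mydb

does both jobs: it issues the equivalent of

  VACUUM ANALYZE;

in the target database, recovering space for reuse and refreshing the 
planner statistics in one pass.  Adding --full would turn it into the 
space-compacting variant, which as noted above should rarely be needed.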



Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-18 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 It would be interesting sometime to try to teach the planner about
 inlining SQL-language functions to become joins.  That is, given

 The Inlining of the function is presumably a side-issue. I have tons of
 queries that use subqueries in the select list for which the same behaviour
 would be appropriate.

Yeah, I was actually thinking about a two-step process: inline the
function to produce something equivalent to a handwritten scalar
sub-SELECT, and then try to convert sub-SELECTs into joins.
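
A hand-written illustration of those two steps, using the tables from the 
example earlier in the thread:

  -- step 1: inline the function body as a scalar sub-SELECT
  select uid, (select name from mytab where id = uid) as name
  from othertab where something;

  -- step 2: convert the sub-SELECT into a join; only valid if mytab.id
  -- is unique, otherwise the join could return more rows than the original
  select uid, name
  from othertab left join mytab on (mytab.id = othertab.uid)
  where something;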

regards, tom lane



Re: [PERFORM] Question on REINDEX

2005-04-18 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 1) When is it necessary to run REINDEX or drop/create
 an index?  All I could really find in the docs is:

 If you need to VACUUM FULL, you need to REINDEX as well.  For example, if you
 drop millions of rows from a table.

That's probably a pretty good rule of thumb.  It's worth noting that
VACUUM FULL tends to actively bloat indexes, not reduce them in size,
because it has to create new index entries for the rows it moves before
it can delete the old ones.  So if a VACUUM FULL moves many rows you
are likely to see the indexes get bigger not smaller.

 Better to up your max_fsm_pages and do regular VACUUMs regularly and 
 frequently so that you don't have to REINDEX at all.

Yes, definitely.  Also consider using CLUSTER rather than VACUUM FULL
when you need to clean up after massive deletions from a table.  It's
not any less intrusive in terms of locking, but it's often faster and it
avoids the index bloat problem (since it effectively does a REINDEX).
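
For example (index and table names invented), on 7.4 that could look like:

  -- rewrites the table in index order and rebuilds all of its indexes,
  -- reclaiming the space left behind by the mass delete
  CLUSTER mytab_pkey ON mytab;
  ANALYZE mytab;

Like VACUUM FULL it holds an exclusive lock on the table for the duration, 
so it still needs a maintenance window, but it avoids the index-bloat 
problem described above.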

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Bruce Momjian
Kevin Brown wrote:
 Greg Stark wrote:
 
 
  I think you're being misled by analyzing the write case.
  
  Consider the read case. When a user process requests a block and
  that read makes its way down to the driver level, the driver can't
  just put it aside and wait until it's convenient. It has to go ahead
  and issue the read right away.
 
 Well, strictly speaking it doesn't *have* to.  It could delay for a
 couple of milliseconds to see if other requests come in, and then
 issue the read if none do.  If there are already other requests being
 fulfilled, then it'll schedule the request in question just like the
 rest.

The idea with SCSI or any command queuing is that you don't have to wait
for another request to come in --- you can send the request as it
arrives, then if another shows up, you send that too, and the drive
optimizes the grouping at a later time, knowing what the drive is doing,
rather than queueing in the kernel.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alex Turner
Does it really matter at which end of the cable the queueing is done
(Assuming both ends know as much about drive geometry etc..)?

Alex Turner
netEconomist

On 4/18/05, Bruce Momjian pgman@candle.pha.pa.us wrote:
 Kevin Brown wrote:
  Greg Stark wrote:
 
 
   I think you're being misled by analyzing the write case.
  
   Consider the read case. When a user process requests a block and
   that read makes its way down to the driver level, the driver can't
   just put it aside and wait until it's convenient. It has to go ahead
   and issue the read right away.
 
  Well, strictly speaking it doesn't *have* to.  It could delay for a
  couple of milliseconds to see if other requests come in, and then
  issue the read if none do.  If there are already other requests being
  fulfilled, then it'll schedule the request in question just like the
  rest.
 
 The idea with SCSI or any command queuing is that you don't have to wait
 for another request to come in --- you can send the request as it
 arrives, then if another shows up, you send that too, and the drive
 optimizes the grouping at a later time, knowing what the drive is doing,
 rather queueing in the kernel.
 
 --
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
 




Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Alvaro Herrera
On Mon, Apr 18, 2005 at 06:49:44PM -0400, Alex Turner wrote:
 Does it really matter at which end of the cable the queueing is done
 (Assuming both ends know as much about drive geometry etc..)?

That is a pretty strong assumption, isn't it?  Also you seem to be
assuming that the controller-disk protocol (some internal mechanism,
unknown to mere mortals) is as powerful as the host-controller protocol
(SATA, SCSI, etc.).

I'm lost whether this thread is about what is possible with current,
in-market technology, or about what could in theory be possible [if you
were to design open source disk controllers and disks.]

-- 
Alvaro Herrera ([EMAIL PROTECTED])
Strength does not come from physical means
but from an indomitable will (Gandhi)



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Matthew Nuzum
On 4/14/05, Tom Lane [EMAIL PROTECTED] wrote:
 
 That's basically what it comes down to: SCSI lets the disk drive itself
 do the low-level I/O scheduling whereas the ATA spec prevents the drive
 from doing so (unless it cheats, ie, caches writes).  Also, in SCSI it's
 possible for the drive to rearrange reads as well as writes --- which
 AFAICS is just not possible in ATA.  (Maybe in the newest spec...)
 
 The reason this is so much more of a win than it was when ATA was
 designed is that in modern drives the kernel has very little clue about
 the physical geometry of the disk.  Variable-size tracks, bad-block
 sparing, and stuff like that make for a very hard-to-predict mapping
 from linear sector addresses to actual disk locations.  Combine that
 with the fact that the drive controller can be much smarter than it was
 twenty years ago, and you can see that the case for doing I/O scheduling
 in the kernel and not in the drive is pretty weak.
   
 

So if you all were going to choose between two hard drives where:
drive A has capacity C and spins at 15K rpm, and
drive B has capacity 2 x C and spins at 10K rpm, and
all other features are the same, the price is the same, and C is enough
disk space, which would you choose?

I've noticed that on IDE drives, as the capacity increases the data
density increases and there is a perceived (I've not measured it)
performance increase.

Would the increased data density of the higher capacity drive be of
greater benefit than the faster spindle speed of drive A?

-- 
Matthew Nuzum
www.bearfruit.org



[PERFORM] Foreign key slows down copy/insert

2005-04-18 Thread Richard van den Berg
I am new to cross references between tables, and I am trying to 
understand how they impact performance. From reading the documentation I 
was under the impression that deferring foreign keys would yield about 
the same performance as dropping them before a copy, and adding them 
after. However, I cannot see this in my test case.

I have a table A with an int column ID that references table B column 
ID. Table B has about 150k rows, and has an index on B.ID. When trying 
to copy 1 million rows into A, I get the following \timings:

1) drop FK, copy (200s), add FK (5s)
2) add FK deferrable initially deferred, copy (I aborted after 30min)
3) add FK deferrable initially deferred, begin, copy (200s), commit (I 
aborted after 30min)

How do I explain why test cases 2 and 3 do not come close to case 1? Am 
I missing something obvious?

Since the database I am working on has many FKs, I would rather not have 
to drop/add them when I am loading large data sets.

If it would help I can write this out in a reproducible scenario. I am 
using postgresql 7.4.5 at the moment.
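
A minimal sketch of that scenario (names and file path invented):

  CREATE TABLE b (id integer PRIMARY KEY);   -- ~150k rows
  CREATE TABLE a (id integer);               -- target of the 1M-row COPY

  -- case 1: load first, add the FK afterwards (one big check at the end)
  COPY a FROM '/tmp/a.dat';
  ALTER TABLE a ADD FOREIGN KEY (id) REFERENCES b (id);

  -- cases 2 and 3: FK already present, deferred until commit
  ALTER TABLE a ADD FOREIGN KEY (id) REFERENCES b (id)
      DEFERRABLE INITIALLY DEFERRED;
  BEGIN;
  COPY a FROM '/tmp/a.dat';
  COMMIT;   -- the queued per-row FK checks all fire here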

Sincerely,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---


Re: [Fwd: Re: [PERFORM] Functionscan estimates]

2005-04-18 Thread elein
Hmmm. My brain is being jostled and I'm confusing illustra-postgres,
informix-postgres and postgresql.  Some things had functions and
some things had constants and I do not remember which products had
what combination. But probably how they are in postgresql, post
hellerstein, is how I am remembering.

I can find out for sure, given a little time, by querying old contacts.
It would be best if I had a clear question to ask, though.

--elein


On Thu, Apr 14, 2005 at 02:58:09PM -0400, Alvaro Herrera wrote:
 On Thu, Apr 14, 2005 at 10:39:03AM -0700, elein wrote:
 
  All functions could have a cost associated with them, set by the writer of
  the function in order for the planner to reorder function calls.
  The Stonebraker airplane level example was:
  select ... from ... where f(id) = 3 and expensive_image_function(img)
  The idea, of course is to weight the expensive function so it was
  pushed to the end of the execution.
 
 So there was only a constant cost associated with the function?  No
 estimator function, for example?
 
 -- 
 Alvaro Herrera ([EMAIL PROTECTED])
 If you have nothing to say, maybe you need just the right tool to help you
 not say it.   (New York Times, about Microsoft PowerPoint)
 



Re: [PERFORM] Foreign key slows down copy/insert

2005-04-18 Thread Joshua D. Drake
Since the database I am working on has many FKs, I would rather not have to 
drop/add them when I am loading large data sets.
You may want to hunt the archives. IIRC I saw a couple of posts in 
recent months about an update you can do to one of the system tables to disable
the key checks and then re-enable them after you're done with the import.
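
One such trick that has circulated for 7.4-era servers is sketched below. 
Treat it as a sketch only, and only use it when the incoming data is known 
to be valid, since the skipped checks are never re-run ('a' stands in for 
the table being loaded):

  -- disable all triggers on the table, including the FK check triggers
  UPDATE pg_class SET reltriggers = 0 WHERE relname = 'a';

  -- ... bulk load here ...

  -- restore the trigger count afterwards
  UPDATE pg_class SET reltriggers = (
      SELECT count(*) FROM pg_trigger WHERE tgrelid = pg_class.oid
  ) WHERE relname = 'a';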

Sincerely,
Joshua D. Drake

--
Command Prompt, Inc., Your PostgreSQL solutions company. 503-667-4564
Custom programming, 24x7 support, managed services, and hosting
Open Source Authors: plPHP, pgManage, Co-Authors: plPerlNG
Reliable replication, Mammoth Replicator - http://www.commandprompt.com/



Re: [PERFORM] refcurosr vs. setof

2005-04-18 Thread Michael Fuhr
On Sun, Apr 17, 2005 at 10:05:29PM +0200, Rüdiger Herrmann wrote:
 
 I need to write several PL/pgSQL functions all returning a result set which 
 can be obtained by a single SELECT statement.
 For now the functions are called by a Java application.
 Both REFCURSOR and SETOF serve my purpose, but I was wondering if there is a 
 performance difference between the two. The result set can become quite 
 large.

Here's an excerpt from the Control Structures section of the
PL/pgSQL documentation:

The current implementation of RETURN NEXT for PL/pgSQL stores
the entire result set before returning from the function, as
discussed above.  That means that if a PL/pgSQL function produces
a very large result set, performance may be poor: data will be
written to disk to avoid memory exhaustion, but the function
itself will not return until the entire result set has been
generated. ... Currently, the point at which data begins being
written to disk is controlled by the work_mem configuration
variable.

You might want to test both ways in typical and worst-case scenarios
and see how each performs.
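
Minimal sketches of the two styles, with made-up table and column names 
(7.4-style quoted function bodies):

  -- SETOF: the whole result is materialized before the function returns
  CREATE FUNCTION big_result_setof() RETURNS SETOF mytab AS '
  DECLARE
      r mytab%ROWTYPE;
  BEGIN
      FOR r IN SELECT * FROM mytab WHERE some_flag LOOP
          RETURN NEXT r;
      END LOOP;
      RETURN;
  END;
  ' LANGUAGE plpgsql STABLE;

  -- REFCURSOR: rows are produced as the client fetches them
  CREATE FUNCTION big_result_cursor() RETURNS refcursor AS '
  DECLARE
      c refcursor;
  BEGIN
      OPEN c FOR SELECT * FROM mytab WHERE some_flag;
      RETURN c;
  END;
  ' LANGUAGE plpgsql;

With the cursor version the Java side has to call the function and FETCH 
from the returned cursor inside a single transaction.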

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
