Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Kevin Brown
Rosser Schwarz wrote:
> while you weren't looking, Kevin Brown wrote:
> 
> [reordering bursty reads]
> 
> > In other words, it's a corner case that I strongly suspect
> > isn't typical in situations where SCSI has historically made a big
> > difference.
> 
> [...]
> 
> > But I rather doubt that has to be a huge penalty, if any.  When a
> > process issues an fsync (or even a sync), the kernel doesn't *have* to
> > drop everything it's doing and get to work on it immediately.  It
> > could easily gather a few more requests, bundle them up, and then
> > issue them.
> 
> To make sure I'm following you here, are you or are you not suggesting
> that the kernel could sit on -all- IO requests for some small handful
> of ms before actually performing any IO to address what you "strongly
> suspect" is a "corner case"?

The kernel *can* do so.  Whether or not it's a good idea depends on
the activity in the system.  You'd only consider doing this if you
didn't already have a relatively large backlog of I/O requests to
handle.  You wouldn't do this for every I/O request.

Consider this: I/O operations to a block device are so slow compared
with the speed of other (non-I/O) operations on the system that the
system can easily wait for, say, a hundredth of the typical latency on
the target device before issuing requests to it and not have any real
negative impact on the system's I/O throughput.

A process running on my test system, a 3 GHz Xeon, can issue a million
read system calls per second (I've measured it; I can post the rather
trivial source code if you're interested).  That's the full round trip
of issuing the system call and having the kernel return back.  So in
the span of a millisecond, the system could receive 1000 requests if it
were busy enough.  If the average latency for a random read from the
disk (including head movement and everything) is 10 milliseconds, and
we decide to delay the issuance of the first I/O request by a tenth of
a millisecond (a hundredth of the latency), then the system might
receive 100 additional I/O requests, which it could put into the queue
and sort by block address before issuing the first read.  As long as
the system knows the last block requested from that physical device, it
can order the requests properly and then begin issuing them.  Since the
latency on the target device is so high, this is likely to be a rather
big win for overall throughput.
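
To put a rough number on the sorting win (illustrative figures, not
measurements): for requests scattered uniformly over the platter, the
average seek between two independent random blocks spans about 1/3 of
the full stroke, while sweeping through n pending requests in sorted
order crosses the platter only about once, i.e. roughly 1/n of the
stroke per request:

   E[seek, random order]  ~ 1/3 of full stroke
   E[seek, sorted sweep]  ~ 1/n of full stroke
   n = 100  =>  around 30x less head travel per request

The saving in seek *time* is smaller than that, since settle time
dominates short seeks, but it should still be substantial.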


-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Kevin Brown
Vivek Khera wrote:
> 
> On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote:
> 
> >Now, bad block remapping destroys that guarantee, but unless you've
> >got a LOT of bad blocks, it shouldn't destroy your performance, right?
> >
> 
> ALL disks have bad blocks, even when you receive them.  Do you honestly 
> think that these large disks made today (18+ GB is the smallest now) 
> have no defects on the surfaces?

Oh, I'm not at all arguing that you won't have bad blocks.  My
argument is that the probability of any given block read or write
operation actually dealing with a remapped block is going to be
relatively small, unless the fraction of bad blocks to total blocks is
large (in which case you basically have a bad disk).  And so the
ability to account for remapped blocks shouldn't itself represent a
huge improvement in overall throughput.
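
As a rough illustration (the figures here are assumptions, not
measurements): an 80GB drive holds about 156 million 512-byte sectors.
If even 10,000 of them have been remapped, the chance that a uniformly
random access touches a remapped sector is

   10,000 / 156,000,000  ~  0.006%

i.e. fewer than one random I/O in ten thousand would even notice the
remapping.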




-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Kevin Brown
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > In the case of pure random reads, you'll end up having to wait an
> > average of half of a rotation before beginning the read.
> 
> You're assuming the conclusion.  The above is true if the disk is handed
> one request at a time by a kernel that doesn't have any low-level timing
> information.  If there are multiple random requests on the same track,
> the drive has an opportunity to do better than that --- if it's got all
> the requests in hand.

True, but see below.  Actually, I suspect what matters is whether
they're on the same cylinder (which may be what you're talking about
here).  And in the above, I was assuming randomly distributed
single-sector reads.  In that situation, we can't generically know the
probability that more than one request will land on the same cylinder
without knowing something about the drive geometry.


That said, most modern drives have tens of thousands of cylinders (the
Seagate ST380011a, an 80 gigabyte drive, has 94,600 tracks per inch
according to its datasheet), but much, much smaller queue lengths
(tens of entries, hundreds at most, I'd expect.  Hard data on this
would be appreciated).  For purely random reads, the probability that
two or more requests in the queue happen to be in the same cylinder is
going to be quite small.
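
A back-of-the-envelope birthday-problem estimate (both figures are
assumptions): with c cylinders and q queued random requests,

   P(some two requests share a cylinder)  ~  1 - exp(-q(q-1) / 2c)

so with c = 50,000 cylinders and a generous q = 64 queue entries this
is about 1 - exp(-0.04), i.e. roughly a 4% chance.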


-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How

2005-04-15 Thread Ron Mayer
Rod Taylor wrote:
> On Fri, 2005-04-15 at 15:43 -0500, Matthew Nuzum wrote:
>> * I agree with the threads that more disks are better.
>> * I also agree that SCSI is better, but can be hard to justify
Here's another approach to spending $7000 that we're currently
trying, though it'll only work for certain systems: those where you
can use load balancing and/or application-level partitioning of
your software.
For $859 you can buy
a Dell SC1425 with  (*see footnote)
2 Xeon 2.8GHz processors  (*see footnote)
1 GB ram
1 80GB hard drive. (*see footnote)
Doing the math, it seems I could get 8 of
these systems for $6,872, giving me:
   16  Xeon processors (*see footnote),
   640 GB of disk space spread over 8 spindles
   8   GB of ram
   16  1Gbps network adapters.
Despite the non-optimal hardware (*see footnote), the price
of each system and the extra redundancy may make up the difference
for some applications.
For example, I didn't see many other $7000 proposals
that have nearly 10GB of ram, or over a dozen CPUs (even counting
the raid controllers), or over half a terabyte of storage,
or capable of 5-10 Gbit/sec of network traffic...  The extra
capacity would allow me to have redundancy that would somewhat
make up for the flakier hardware, no raid, etc.
Thoughts?  Over the next couple of months I'll be evaluating
a cluster of 4 systems almost exactly as described (but
with cheaper dual hard drives in each system), for a GIS
system that does lend itself well to application-level
partitioning.
Ron
(* footnotes)
 Yeah, I know some reports here say that dual Xeons can suck;
 but Dell's throwing in the second one for free.
 Yeah, I know some reports here say Dells can suck, but it
 was easy to get a price quote online, and they're a nice
 business partner of ours.
 Yeah, I should get 2 hard drives in each system, but Dell
 wanting an additional $160 for an 80GB hard drive is not a good deal.
 Yeah, I know I'd be better off with 2GB ram, but Dell
 wants $400 (half the price of an entire additional
 system) for the upgrade from 1GB to 2.
 I also realize that application level partitioning needed
 to take advantage of a loose cluster like this is not practical
 for many applications.


Re: [PERFORM] speed of querry?

2005-04-15 Thread Tom Lane
"Joel Fradkin" <[EMAIL PROTECTED]> writes:
> Turning off merge joins seems to have done it, but what do I need to do so I
> am not telling the system explicitly not to use them?  I must be missing
> some setting.

> "  ->  Hash Left Join  (cost=185.57..226218.77 rows=177236 width=75) (actual
> time=21.147..2221.098 rows=177041 loops=1)"
> "Hash Cond: (("outer".jobtitleid = "inner".id) AND
> (("outer".clientnum)::text = ("inner".clientnum)::text))"

It's overestimating the cost of this join for some reason ... and I
think I see why.  It's not accounting for the combined effect of the
two hash clauses, only for the "better" one.  What are the statistics
for tbljobtitle.id and tbljobtitle.clientnum --- how many distinct
values of each, and are the distributions skewed to a few popular values?
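
One way to pull those numbers is from the pg_stats system view (a
sketch; the table and column names are taken from the plan upthread):

SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'tbljobtitle' AND attname IN ('id', 'clientnum');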

regards, tom lane



Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How

2005-04-15 Thread Rod Taylor
On Fri, 2005-04-15 at 15:43 -0500, Matthew Nuzum wrote:
> I think there are many people who feel that $7,000 is a good budget for a
> database server, me being one.

The budget for a database server is usually some %age of the value of
the data within the database or the value of its availability. Big
budget hardware (well, going from $7k to $100k) often brings more
redundancy and reliability improvement than performance improvement.

If you're going to lose $100k in business because the database was
unavailable for 12 hours, then kick $75k into the hardware and call a
profit of $25k over 3 years (hardware lifetime is 3 years, catastrophic
failure happens once every 3 or so years...).

Ditto for backup systems. If the company depends on the data in the
database for its survival, where bankruptcy or worse would happen as a
result of complete data loss, then it would be a good idea to invest a
significant amount of the company's revenue into making damn sure that
doesn't happen. Call it an insurance policy.


Performance for me dictates which hardware is purchased and
configuration is used within $BUDGET, but $BUDGET itself is nearly
always defined by the value of the data stored.


>  * I agree with the threads that more disks are better.
>  * I also agree that SCSI is better, but can be hard to justify if your
> budget is tight, and I have great certainty that 2x SATA drives on a good
> controller are better than x SCSI drives for many workloads.
>  * I also feel that good database design and proper maintenance can be one
> of the single biggest performance enhancers available. This can be labor
> intensive, however, and sometimes throwing more hardware at a problem is
> cheaper than restructuring a db.
> 
> Either way, having a good hardware platform is an excellent place to start,
> as much of your tuning will depend on certain aspects of your hardware.
> 
> So if you need a db server, and you have $7k to spend, I'd say spend it.
> From this list, I've gathered that I/O and RAM are your two most important
> investments.
> 
> Once you get that figured out, you can still do some performance tuning on
> your new server using the excellent advice from this mailing list.
> 
> By the way, for all those who make this list work, I've rarely found such a
> thorough, helpful and considerate group of people as these on the
> performance list.
> 
-- 




Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-15 Thread Steve Poe
Tom,
People's opinions on pgbench may vary, so take what I say with a grain 
of salt. Here are my thoughts:

1) Test with no fewer than 200 transactions per client. I've heard that with 
fewer than this, your results will vary too much with the direction of 
the wind blowing. A high enough value will help rule out some "noise" 
factor. If I am wrong, please let me know.

2) How is the database going to be used?  What percentage will be 
read/write if you had to guess? Pgbench is like TPC-B, which will help 
gauge the potential throughput (tps) of your server. However, it may not 
stress the server enough to help you make key performance changes. Then 
again, benchmarks are like statistics...full of lies.

3) Run not just a couple of pgbench runs, but *many* (I do between 20 and 
40 runs) so you can rule out noise and gauge improvement on median results.

4) Find something with which you can test OLTP-type transactions. I used 
OSDB since it is simple to implement and use, although OSDL's OLTP testing 
will be closer to reality.

Steve Poe


Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-15 Thread Tom Lane
"Thomas F.O'Connell" <[EMAIL PROTECTED]> writes:
> http://www.sitening.com/pgbench.html

You need to run *many* more transactions than that to get pgbench
numbers that aren't mostly noise.  In my experience 1000 transactions
per client is a rock-bottom minimum to get repeatable numbers; 10000
per is better.

Also, in any run where #clients >= scaling factor, what you're measuring
is primarily contention to update the "branches" rows.  Which is not
necessarily a bad thing to check, but it's generally not the most
interesting performance domain (if your app is like that you need to
redesign the app...)
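
For example (assuming the contrib/pgbench that ships with 8.0; -s sets
the scaling factor at initialization time, -c the client count, -t the
transactions per client):

  pgbench -i -s 50 bench        # initialize at scaling factor 50
  pgbench -c 10 -t 1000 bench   # 10 clients, well under the scaling factor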

> To me, it looks like basic transactional performance is modestly 
> improved at 8.0 across a variety of metrics.

That's what I would expect --- we usually do some performance work in
every release cycle, but there was not a huge amount of it for 8.0.

However, these numbers don't prove much either way.

regards, tom lane



Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-15 Thread Tom Lane
Enrico Weigelt <[EMAIL PROTECTED]> writes:
> c) CREATE FUNCTION id2username(oid) RETURNS text 
> LANGUAGE 'SQL' IMMUTABLE AS '
>   SELECT username AS RESULT FROM users WHERE uid = $1';

This is simply dangerous.  The function is *NOT* immutable (it is
stable though).  When ... not if ... your application breaks because
you got the wrong answers, you'll get no sympathy from anyone.

The correct question to ask was "if I make a stable function like
this, is it likely to be faster than the join?".  The answer is
"probably not; at best it will be equal to the join".  The best the
planner is likely to be able to do with the function-based query
is equivalent to a nestloop with inner indexscan (assuming there is
an index on users.uid).  If that's the best plan then the join case
should find it too ... but if you are selecting a lot of rows
then it won't be the best plan.
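
Concretely, the safe declaration of the function upthread is the same
thing marked STABLE (a sketch; the index is the assumption noted above,
not something shown in the original post):

CREATE FUNCTION id2username(oid) RETURNS text
LANGUAGE 'SQL' STABLE AS '
SELECT username AS RESULT FROM users WHERE uid = $1';

CREATE INDEX users_uid_idx ON users (uid);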

regards, tom lane



[PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-15 Thread Thomas F . O'Connell
I'm in the fortunate position of having a newly built database server 
that's pre-production. I'm about to run it through the wringer with some 
simulations of business data and logic, but I wanted to post the 
results of some preliminary pgbench benchmarking.

http://www.sitening.com/pgbench.html
To me, it looks like basic transactional performance is modestly 
improved at 8.0 across a variety of metrics. I think this bodes well 
for more realistic loads, but I'll be curious to see the results of 
some of the simulations.

I've still got a little bit of preparatory time with this box, so I can 
continue to do some experimentation.

I'd be curious to see whether these numbers meet developer expectations 
and to see whether the developer and user community have insight into 
other pgbench options that would be useful to see.

Thanks!
-tfo
--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Strategic Open Source: Open Your i™
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005


[PERFORM] immutable functions vs. join for lookups ?

2005-04-15 Thread Enrico Weigelt

Hi folks,

I like to use (immutable) functions for looking up several 
(almost constant) things, e.g. fetching a username by id. 
This makes my queries clearer.

But does this really perform well?

Let's imagine: 

We've got a table with user accounts (uid,name,...). Then we've
got another one which contains some items assigned to users, and
so are linked to them by an uid field.
Now we want to view the items with usernames instead of just uids:

a) SELECT items.a, items.b, ..., users.username FROM items, users
WHERE items.uid = users.uid;

c) CREATE FUNCTION id2username(oid) RETURNS text 
LANGUAGE 'SQL' IMMUTABLE AS '
SELECT username AS RESULT FROM users WHERE uid = $1';

   SELECT items.a, items.b, ..., id2username(items.uid) FROM items;
   

Which one is faster with
a) only a few users (<50) 
b) many users ( >1k )
while we have several 10k of items?


thx
-- 
-
 Enrico Weigelt==   metux IT service

  phone: +49 36207 519931 www:   http://www.metux.de/
  fax:   +49 36207 519932 email: [EMAIL PROTECTED]
  cellphone: +49 174 7066481
-
 -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops --
-



Re: [PERFORM] clear function cache (WAS: SQL function inlining)

2005-04-15 Thread Enrico Weigelt
* Stephan Szabo <[EMAIL PROTECTED]> wrote:
> 
> On Thu, 24 Mar 2005, Enrico Weigelt wrote:
> 
> > * Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > > On Thu, Mar 24, 2005 at 02:32:48PM +0100, Enrico Weigelt wrote:
> > >
> > > > BTW: is it possible to explicitly clear the cache for immutable
> > > > functions ?
> > >
> > > What cache?  There is no caching of function results.
> >
> > No?  So what's immutable for?
> 
> For knowing that you can do things like use it in a functional index and
> I think for things like constant folding in a prepared plan.

So when can I expect the function to be reevaluated? 
Next query?  Next session?  Some random time?

cu
-- 
-
 Enrico Weigelt==   metux IT service

  phone: +49 36207 519931 www:   http://www.metux.de/
  fax:   +49 36207 519932 email: [EMAIL PROTECTED]
  cellphone: +49 174 7066481
-
 -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops --
-



Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How

2005-04-15 Thread Matthew Nuzum
I think there are many people who feel that $7,000 is a good budget for a
database server, me being one.

 * I agree with the threads that more disks are better.
 * I also agree that SCSI is better, but can be hard to justify if your
budget is tight, and I have great certainty that 2x SATA drives on a good
controller are better than x SCSI drives for many workloads.
 * I also feel that good database design and proper maintenance can be one
of the single biggest performance enhancers available. This can be labor
intensive, however, and sometimes throwing more hardware at a problem is
cheaper than restructuring a db.

Either way, having a good hardware platform is an excellent place to start,
as much of your tuning will depend on certain aspects of your hardware.

So if you need a db server, and you have $7k to spend, I'd say spend it.
From this list, I've gathered that I/O and RAM are your two most important
investments.

Once you get that figured out, you can still do some performance tuning on
your new server using the excellent advice from this mailing list.

By the way, for all those who make this list work, I've rarely found such a
thorough, helpful and considerate group of people as these on the
performance list.

-- 
Matthew Nuzum <[EMAIL PROTECTED]>
www.followers.net - Makers of "Elite Content Management System"
View samples of Elite CMS in action by visiting
http://www.followers.net/portfolio/






Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Mohan, Ross
Greg, et al. 

I never found any evidence of a "stop and get an intermediate request"
functionality in the TCQ protocol. 

IIRC, what is there is

1) Ordered
2) Head First
3) Simple

implemented as choices. *VERY* roughly, that'd be: (1) the disk
subsystem satisfies requests in the order submitted, (2) lets "this"
request be put at the very head of the disk's own queue, right after
the currently-running disk request completes, and (3) is "let the disk
itself and its onboard software reorder the requests on hand".  (N.B.
in the last, it's the DISK, not the controller, making those
decisions.  N.B. too that this last is essentially what NCQ (cf. TCQ)
is doing.)

I know we've been batting around a hypothetical case of SCSI
where it "stops and gets something on the way", but I can find
no proof (yet) that this is done, pro forma, by SCSI drives. 

In other words, SCSI is a necessary, but not sufficient, condition
for intermediate reading. 

FWIW

- Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Friday, April 15, 2005 2:02 PM
To: Tom Lane
Cc: Kevin Brown; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Tom Lane <[EMAIL PROTECTED]> writes:

> Yes, you can probably assume that blocks with far-apart numbers are 
> going to require a big seek, and you might even be right in supposing 
> that a block with an intermediate number should be read on the way. 
> But you have no hope at all of making the right decisions at a more 
> local level --- say, reading various sectors within the same cylinder 
> in an optimal fashion.  You don't know where the track boundaries are, 
> so you can't schedule in a way that minimizes rotational latency. 
> You're best off to throw all the requests at the drive together and 
> let the drive sort it out.

Consider for example three reads, one at the beginning of the disk, one at the 
very end, and one in the middle. If the three are performed in the logical 
order (assuming the head starts at the beginning), then the drive has to seek, 
say, 4ms to get to the middle and 4ms to get to the end.

But if the middle block requires a full rotation to reach it from when the head 
arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive).

Whereas the drive could have seeked over to the last block, then seeked back in 
8ms and gotten there just in time to perform the read for free.


I'm not entirely convinced this explains all of the SCSI drives' superior 
performance though. The above is a worst-case scenario that should really 
only have a small effect, and it's not like the drive firmware can really 
schedule things perfectly either.


I think most of the difference is that the drive manufacturers just don't 
package their high end drives with ATA interfaces. So there are no 10k RPM ATA 
drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but 
most of the manufacturers aren't even doing that.

-- 
greg




Re: [PERFORM] speed of querry?

2005-04-15 Thread Joel Fradkin


Joel Fradkin
 
Turning off merge joins seems to have done it, but what do I need to do so I
am not telling the system explicitly not to use them?  I must be missing
some setting.

On linux box.

explain analyze select * from viwassoclist where clientnum ='SAKS'

"Hash Join  (cost=988.25..292835.36 rows=15773 width=113) (actual
time=23.514..3024.064 rows=160593 loops=1)"
"  Hash Cond: ("outer".locationid = "inner".locationid)"
"  ->  Hash Left Join  (cost=185.57..226218.77 rows=177236 width=75) (actual
time=21.147..2221.098 rows=177041 loops=1)"
"Hash Cond: (("outer".jobtitleid = "inner".id) AND
(("outer".clientnum)::text = ("inner".clientnum)::text))"
"->  Seq Scan on tblassociate a  (cost=0.00..30851.25 rows=177236
width=53) (actual time=0.390..1095.385 rows=177041 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"->  Hash  (cost=152.55..152.55 rows=6604 width=37) (actual
time=20.609..20.609 rows=0 loops=1)"
"  ->  Seq Scan on tbljobtitle jt  (cost=0.00..152.55 rows=6604
width=37) (actual time=0.033..12.319 rows=6603 loops=1)"
"Filter: (1 = presentationid)"
"  ->  Hash  (cost=801.54..801.54 rows=454 width=49) (actual
time=2.196..2.196 rows=0 loops=1)"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..801.54 rows=454 width=49) (actual time=0.111..1.755 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"Total runtime: 3120.366 ms"

here are the table defs and view if that helps. I posted the config a while
back, but can do it again if you need to see it.

CREATE OR REPLACE VIEW viwassoclist AS 
 SELECT a.clientnum, a.associateid, a.associatenum, a.lastname, a.firstname,
jt.value AS jobtitle, l.name AS "location", l.locationid AS mainlocationid,
l.divisionid, l.regionid, l.districtid, (a.lastname::text || ', '::text) ||
a.firstname::text AS assocname, a.isactive, a.isdeleted
   FROM tblassociate a
   LEFT JOIN tbljobtitle jt ON a.jobtitleid = jt.id AND jt.clientnum::text =
a.clientnum::text AND 1 = jt.presentationid
   JOIN tbllocation l ON a.locationid = l.locationid AND l.clientnum::text =
a.clientnum::text;

CREATE TABLE tblassociate
(
  clientnum varchar(16) NOT NULL,
  associateid int4 NOT NULL,
  associatenum varchar(10),
  firstname varchar(50),
  middleinit varchar(5),
  lastname varchar(50),
  ssn varchar(18),
  dob timestamp,
  address varchar(100),
  city varchar(50),
  state varchar(50),
  country varchar(50),
  zip varchar(10),
  homephone varchar(14),
  cellphone varchar(14),
  pager varchar(14),
  associateaccount varchar(50),
  doh timestamp,
  dot timestamp,
  rehiredate timestamp,
  lastdayworked timestamp,
  staffexecid int4,
  jobtitleid int4,
  locationid int4,
  deptid int4,
  positionnum int4,
  worktypeid int4,
  sexid int4,
  maritalstatusid int4,
  ethnicityid int4,
  weight float8,
  heightfeet int4,
  heightinches int4,
  haircolorid int4,
  eyecolorid int4,
  isonalarmlist bool NOT NULL DEFAULT false,
  isactive bool NOT NULL DEFAULT true,
  ismanager bool NOT NULL DEFAULT false,
  issecurity bool NOT NULL DEFAULT false,
  createdbyid int4,
  isdeleted bool NOT NULL DEFAULT false,
  militarybranchid int4,
  militarystatusid int4,
  patrontypeid int4,
  identificationtypeid int4,
  workaddress varchar(200),
  testtypeid int4,
  testscore int4,
  pin int4,
  county varchar(50),
  CONSTRAINT pk_tblassociate PRIMARY KEY (clientnum, associateid),
  CONSTRAINT ix_tblassociate UNIQUE (clientnum, associatenum)
)
CREATE TABLE tbljobtitle
(
  clientnum varchar(16) NOT NULL,
  id int4 NOT NULL,
  value varchar(50),
  code varchar(16),
  isdeleted bool DEFAULT false,
  presentationid int4 NOT NULL DEFAULT 1,
  CONSTRAINT pk_tbljobtitle PRIMARY KEY (clientnum, id, presentationid)
)
CREATE TABLE tbllocation
(
  clientnum varchar(16) NOT NULL,
  locationid int4 NOT NULL,
  districtid int4 NOT NULL,
  regionid int4 NOT NULL,
  divisionid int4 NOT NULL,
  locationnum varchar(8),
  name varchar(50),
  clientlocnum varchar(50),
  address varchar(100),
  address2 varchar(100),
  city varchar(50),
  state varchar(2) NOT NULL DEFAULT 'zz'::character varying,
  zip varchar(10),
  countryid int4,
  phone varchar(15),
  fax varchar(15),
  payname varchar(40),
  contact char(36),
  active bool NOT NULL DEFAULT true,
  coiprogram text,
  coilimit text,
  coiuser varchar(255),
  coidatetime varchar(32),
  ec_note_field varchar(1050),
  locationtypeid int4,
  open_time timestamp,
  close_time timestamp,
  insurance_loc_id varchar(50),
  lpregionid int4,
  sic int4,
  CONSTRAINT pk_tbllocation PRIMARY KEY (clientnum, locationid),
  CONSTRAINT ix_tbllocation_1 UNIQUE (clientnum, locationnum, name),
  CONSTRAINT ix_tbllocation_unique_number UNIQUE (clientnum, divisionid,
regionid, districtid, locationnum)
)

It strikes me as odd that the thing isn't considering hash joins for
at least some of these steps.  Can you force it to (by setting
enable_mergejoin off)?  If not, what are the datat

Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How

2005-04-15 Thread Steve Poe
Ross,
I agree with you, but I'm the lowly integrator/analyst; I have to solve the 
problem without all the authority (sounds like a project manager). I 
originally started this thread since I had the $7k budget.
I am not a dba/developer, but I play one on t.v., so I can only assume that 
throwing money at the application code means one understands what the 
bottleneck in the code is and what it takes to fix it.
In this situation, the code is hidden by the vendor that connects to the 
database. So, besides persistent requests of the vendor to improve that area 
of the application, the balance of tuning lies with the hardware. The answer 
is *both* hardware and application code. Finding the right balance is key. 
Your mileage may vary.
Steve Poe
Steve Poe


> If, however, in the far-more-likely case that the application code
> or system/business process is the throttle point, it'd be a great
> use of money to have a test report showing that to the "higher ups". 
> That's where the best scalability bang-for-buck can be made. 
> 
> - Ross
> 
> p.s. having said this, and as has already been noted, "7K" ain't
> going to buy that much... maybe the ability to go RAID 10?
> 
> p.p.s  Why don't we start a PGSQL-7K listserv, to handle this EPIC thread?  :-) 



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Yes, you can probably assume that blocks with far-apart numbers are
> going to require a big seek, and you might even be right in supposing
> that a block with an intermediate number should be read on the way.
> But you have no hope at all of making the right decisions at a more
> local level --- say, reading various sectors within the same cylinder
> in an optimal fashion.  You don't know where the track boundaries are,
> so you can't schedule in a way that minimizes rotational latency.
> You're best off to throw all the requests at the drive together and
> let the drive sort it out.

Consider for example three reads, one at the beginning of the disk, one at the
very end, and one in the middle. If the three are performed in the logical
order (assuming the head starts at the beginning), then the drive has to seek,
say, 4ms to get to the middle and 4ms to get to the end.

But if the middle block requires a full rotation to reach it from when the
head arrives that adds another 8ms of rotational delay (assuming a 7200RPM
drive).

Whereas the drive could have seeked over to the last block, then seeked back
in 8ms and gotten there just in time to perform the read for free.


I'm not entirely convinced this explains all of the SCSI drives' superior
performance though. The above is a worst-case scenario that should really
only have a small effect, and it's not like the drive firmware can really
schedule things perfectly either.


I think most of the difference is that the drive manufacturers just don't
package their high end drives with ATA interfaces. So there are no 10k RPM ATA
drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but
most of the manufacturers aren't even doing that.

-- 
greg




[PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?

2005-04-15 Thread Mohan, Ross

Sorry to blend threads, but in my kinda longish, somewhat thankless, 
essentially anonymous, and quite average career as a dba, I have 
found that the 7K would be best spent on a definitive end-to-end
"application critical path" test (pretty easy to instrument apps
and lash on test harnesses these days). 


If it's "the disk subsystem", then by all means, spend the 7K there. 

If the "7K$" is for "hardware only", then disk is always a good choice. For
a really small shop, maybe it's an upgrade to a dual CPU opteron MOBO, eg. 
dunno.

If, however, in the far-more-likely case that the application code
or system/business process is the throttle point, it'd be a great
use of money to have a test report showing that to the "higher ups". 
That's where the best scalability bang-for-buck can be made. 


- Ross

p.s. having said this, and as has already been noted, "7K" ain't
 going to buy that much... maybe the ability to go RAID 10?
 
p.p.s  Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-)


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Friday, April 15, 2005 11:20 AM
To: Alex Turner
Cc: Dave Held; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Intel SRCS16 SATA raid?


This is a different thread than the $7k server thread.
Greg Stark started it and wrote:

 "I'm also wondering about whether I'm better off with one of these 
 SATA raid  
 controllers or just going with SCSI drives."   



Rick

[EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM:

> The original thread was how much can I get for $7k
>
> You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a
budget!
>
> 10k RPM SATA drives give acceptable performance at a good price, that's 
> really the point here.
>
> I have never really argued that SATA is going to match SCSI 
> performance on multidrive arrays for IO/sec.  But it's all about the 
> benjamins baby.  If I told my boss we need $25k for a database 
> machine, he'd tell me that was impossible, and I have $5k to do it. If 
> I tell him $7k - he will swallow that.  We don't _need_ the amazing 
> performance of a 15k RPM drive config.  Our biggest hit is reads, so 
> we can buy 3xSATA machines and load balance.  It's all about the 
> application, and buying what is appropriate.  I don't buy a Corvette 
> if all I need is a malibu.
>
> Alex Turner
> netEconomist
>
> On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > > -Original Message-
> > > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, April 14, 2005 6:15 PM
> > > To: Dave Held
> > > Cc: pgsql-performance@postgresql.org
> > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> > >
> > > Looking at the numbers, the raptor with TCQ enabled was close or 
> > > beat the Atlas III 10k drive on most benchmarks.
> >
> > And I would be willing to bet that the Atlas 10k is not using the 
> > same generation of technology as the Raptors.
> >
> > > Naturally a 15k drive is going to be faster in many areas, but it 
> > > is also much more expensive.  It was only 44% better on the server 
> > > tests than the raptor with TCQ, but it costs nearly 300% more 
> > > ($538 cdw.com, $180 newegg.com).
> >
> > State that in terms of cars.  Would you be willing to pay 300% more 
> > for a car that is 44% faster than your competitor's?  Of course you 
> > would, because we all recognize that the cost of speed/performance 
> > does not scale linearly.  Naturally, you buy the best speed that you 
> > can afford, but when it comes to hard drives, the only major feature 
> > whose price tends to scale anywhere close to linearly is capacity.
> >
> > > Note also that the 15k drive was the only drive that kept up with 
> > > the raptor on raw transfer speed, which is going to matter for 
> > > WAL.
> >
> > So get a Raptor for your WAL partition. ;)
> >
> > > [...]
> > > The Raptor drives can be had for as little as $180/ea, which is 
> > > quite a good price point considering they can keep up with their 
> > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled 
> > > (Note that 3ware controllers _don't_ support NCQ, although they 
> > > claim their HBA based queueing is 95% as good as NCQ on the 
> > > drive).
> >
> > Just keep in mind the points made by the Seagate article.  You're 
> > buying much more than just performance for that $500+.  You're also 
> > buying vibrational tolerance, high MTBF, better internal 
> > environmental controls, and a pretty significant margin on seek 
> > time, which is probably your most important feature for disks 
> > storing tables. An interesting test would be to stick several drives 
> > in a cabinet and graph how performance is affected at the different
> > price points/technologies/number of drives.

Re: [PERFORM] Intel SRCS16 SATA raid? (somewhat OT)

2005-04-15 Thread Marinos Yannikos
Alex Turner wrote:
> No offense to that review, but it really wasn't that good, and
> drew bad conclusions from the data.  I posted it originally and
> immediately regretted it.
>
> See http://www.tweakers.net/reviews/557/18
>
> Amazingly the controller with 1Gig cache manages a write throughput of
> 750MB/sec on a single drive.
>
> quote:
> "Floating high above the crowd, the ARC-1120 has a perfect view on the
> struggles of the other adapters. "
>
> It's because the adapter has 1Gig of RAM, nothing to do with the RAID
> architecture, it's clearly caching the entire dataset.  The drive
> can't physically run that fast.  These guys really don't know what they
> are doing.

Perhaps you didn't read the whole page. It says right at the beginning:

"Because of its simplicity and short test duration, the ATTO Disk 
Benchmark is used a lot for comparing the 'performance' of hard disks. 
The tool measures the sequential transfer rate of a partition using a 
test length of 32MB at most. Because of this small dataset, ATTO is 
unsuitable for measuring media transfer rates of intelligent 
RAID-adapters which are equipped with cache memory. The smart RAID 
adapters will serve the requested data directly from their cache, as a 
result of which the results have no relationship to the media transfer 
rates of these cards. For this reason ATTO is an ideal tool to test the 
cache transfer rates of intelligent RAID-adapters."

Therefore, the results on this page are valid - they're supposed to show 
the cache/transfer speed; the dataset is 32MB(!) and should fit in the 
caches of all cards.

> See also:
> http://www20.tomshardware.com/storage/20041227/areca-raid6-06.html
>
> I trust Tom's Hardware a little more to set up a good review, to be honest.

I don't, for many (historical) reasons.

> The 3ware trounces the Areca in all IO/sec tests.

Maybe, but with no mention of stripe size and other configuration 
details, this is somewhat suspicious. I'll be able to offer benchmarks 
for the 8506-8 vs. the 1120 shortly (1-2 weeks), if you're interested 
(pg_bench, for example, to be a bit more on-topic).

Regards,
 Marinos


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Alex Turner
I stand corrected!

Maybe I should re-evaluate our own config!

Alex T

(The dell PERC controllers do pretty much suck on linux)

On 4/15/05, Vivek Khera <[EMAIL PROTECTED]> wrote:
> 
> On Apr 15, 2005, at 11:01 AM, Alex Turner wrote:
> 
> > You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a
> > budget!
> >
> 
> I just bought a pair of Dual Opteron, 4GB RAM, LSI 320-2X RAID dual
> channel with 8 36GB 15kRPM seagate drives.  Each one of these boxes set
> me back just over $7k, including onsite warrantee.
> 
> They totally blow away the Dell Dual XEON with external 14 disk RAID
> (also 15kRPM drives, manufacturer unknown) which also has 4GB RAM and a
> Dell PERC 3/DC controller, the whole of which set me back over $15k.
> 
> Vivek Khera, Ph.D.
> +1-301-869-4449 x806
> 
> 
>



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Vivek Khera
On Apr 15, 2005, at 11:58 AM, Joshua D. Drake wrote:

>> ALL disks have bad blocks, even when you receive them.  Do you honestly
>> think that these large disks made today (18+ GB is the smallest now)
>> have no defects on the surfaces?
>
> That is correct. It is just that the HD makers will mark the bad blocks
> so that the OS knows not to use them. You can also run the bad blocks
> command to try and find new bad blocks.

my point was that you cannot assume a linear correlation between block 
number and physical location, since the bad blocks will be mapped all 
over the place.

Vivek Khera, Ph.D.
+1-301-869-4449 x806




Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Joshua D. Drake
Vivek Khera wrote:

> On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote:
>
>> Now, bad block remapping destroys that guarantee, but unless you've
>> got a LOT of bad blocks, it shouldn't destroy your performance, right?
>
> ALL disks have bad blocks, even when you receive them.  Do you honestly
> think that these large disks made today (18+ GB is the smallest now)
> have no defects on the surfaces?

That is correct. It is just that the HD makers will mark the bad blocks
so that the OS knows not to use them. You can also run the bad blocks
command to try and find new bad blocks.

Over time hard drives get bad blocks. It doesn't always mean you have to 
replace the drive, but it does mean you need to maintain it and usually
at least backup, low-level format (if scsi) and mark bad blocks. Then restore.

Sincerely,

Joshua D. Drake

> /me remembers trying to cram an old donated 5MB (yes M) disk into an old 
> 8088 Zenith PC in college...
>
> Vivek Khera, Ph.D.
> +1-301-869-4449 x806

--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Vivek Khera
On Apr 15, 2005, at 11:01 AM, Alex Turner wrote:

> You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a 
> budget!

I just bought a pair of Dual Opteron, 4GB RAM, LSI 320-2X RAID dual 
channel with 8 36GB 15kRPM seagate drives.  Each one of these boxes set 
me back just over $7k, including onsite warrantee.

They totally blow away the Dell Dual XEON with external 14 disk RAID 
(also 15kRPM drives, manufacturer unknown) which also has 4GB RAM and a 
Dell PERC 3/DC controller, the whole of which set me back over $15k.

Vivek Khera, Ph.D.
+1-301-869-4449 x806




Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Vivek Khera
On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote:

> Now, bad block remapping destroys that guarantee, but unless you've
> got a LOT of bad blocks, it shouldn't destroy your performance, right?

ALL disks have bad blocks, even when you receive them.  Do you honestly 
think that these large disks made today (18+ GB is the smallest now) 
have no defects on the surfaces?

/me remembers trying to cram an old donated 5MB (yes M) disk into an 
old 8088 Zenith PC in college...

Vivek Khera, Ph.D.
+1-301-869-4449 x806




Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Richard_D_Levine
This is a different thread than the $7k server thread.
Greg Stark started it and wrote:

 "I'm also wondering about whether I'm better off with one of these 
 SATA raid  
 controllers or just going with SCSI drives."   



Rick

[EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM:

> The original thread was how much can I get for $7k
>
> You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a
budget!
>
> 10k RPM SATA drives give acceptable performance at a good price, that's
> really the point here.
>
> I have never really argued that SATA is going to match SCSI
> performance on multidrive arrays for IO/sec.  But it's all about the
> benjamins baby.  If I told my boss we need $25k for a database
> machine, he'd tell me that was impossible, and I have $5k to do it.
> If I tell him $7k - he will swallow that.  We don't _need_ the amazing
> performance of a 15k RPM drive config.  Our biggest hit is reads, so
> we can buy 3xSATA machines and load balance.  It's all about the
> application, and buying what is appropriate.  I don't buy a Corvette
> if all I need is a malibu.
>
> Alex Turner
> netEconomist
>
> On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > > -Original Message-
> > > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, April 14, 2005 6:15 PM
> > > To: Dave Held
> > > Cc: pgsql-performance@postgresql.org
> > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> > >
> > > Looking at the numbers, the raptor with TCQ enabled was close or
> > > beat the Atlas III 10k drive on most benchmarks.
> >
> > And I would be willing to bet that the Atlas 10k is not using the
> > same generation of technology as the Raptors.
> >
> > > Naturally a 15k drive is going to be faster in many areas, but it
> > > is also much more expensive.  It was only 44% better on the server
> > > tests than the raptor with TCQ, but it costs nearly 300% more ($538
> > > cdw.com, $180 newegg.com).
> >
> > State that in terms of cars.  Would you be willing to pay 300% more
> > for a car that is 44% faster than your competitor's?  Of course you
> > would, because we all recognize that the cost of speed/performance
> > does not scale linearly.  Naturally, you buy the best speed that you
> > can afford, but when it comes to hard drives, the only major feature
> > whose price tends to scale anywhere close to linearly is capacity.
> >
> > > Note also that the 15k drive was the only drive that kept up with
> > > the raptor on raw transfer speed, which is going to matter for WAL.
> >
> > So get a Raptor for your WAL partition. ;)
> >
> > > [...]
> > > The Raptor drives can be had for as little as $180/ea, which is
> > > quite a good price point considering they can keep up with their
> > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled
> > > (Note that 3ware controllers _don't_ support NCQ, although they
> > > claim their HBA based queueing is 95% as good as NCQ on the drive).
> >
> > Just keep in mind the points made by the Seagate article.  You're
> > buying much more than just performance for that $500+.  You're also
> > buying vibrational tolerance, high MTBF, better internal
> > environmental controls, and a pretty significant margin on seek time,
> > which is probably your most important feature for disks storing tables.
> > An interesting test would be to stick several drives in a cabinet and
> > graph how performance is affected at the different price points/
> > technologies/number of drives.
> >
> > __
> > David B. Held
> > Software Engineer/Array Services Group
> > 200 14th Ave. East,  Sartell, MN 56377
> > 320.534.3637 320.253.7800 800.752.8129
> >




Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Dave Held
> -Original Message-
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Friday, April 15, 2005 9:44 AM
> To: Marinos Yannikos
> Cc: Joshua D. Drake; Mohan, Ross; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> 
> No offense to that review, but it really wasn't that good,
> and drew bad conclusions from the data.  I posted it originally
> and immediately regretted it.

I didn't read the whole thing, but it didn't seem that bad to me.

> See http://www.tweakers.net/reviews/557/18
> 
> Amazingly the controller with 1Gig cache manages a write throughput
> of 750MB/sec on a single drive.
> 
> quote:
> "Floating high above the crowd, the ARC-1120 has a perfect view on
> the struggles of the other adapters. "
> 
> It's because the adapter has 1Gig of RAM, nothing to do with the RAID
> architecture, it's clearly caching the entire dataset.  The drive
> can't physically run that fast. 

And that's pretty much exactly what the article says.  Even before the
part you quoted.  Not sure what the problem is there.

> These guys really don't know what they are doing.

They weren't pretending that the drive array was serving up data at
that rate directly from the physical media.  They clearly indicated
that they were testing controller cache speed with the small test.

> Curiously:
> http://www.tweakers.net/reviews/557/25
> 
> The 3ware does very well as a data drive for MySQL.
> [...]

If you take a close look, they pretty much outright say that the Areca
controller does very poorly on the random accesses typical of DB work.
They also specifically mention that the 3ware still dominates the
competition in this area.

Dave


__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Alex Turner
The original thread was how much can I get for $7k

You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a budget!

10k RPM SATA drives give acceptable performance at a good price, that's
really the point here.

I have never really argued that SATA is going to match SCSI
performance on multidrive arrays for IO/sec.  But it's all about the
benjamins baby.  If I told my boss we need $25k for a database
machine, he'd tell me that was impossible, and I have $5k to do it. 
If I tell him $7k - he will swallow that.  We don't _need_ the amazing
performance of a 15k RPM drive config.  Our biggest hit is reads, so
we can buy 3xSATA machines and load balance.  It's all about the
application, and buying what is appropriate.  I don't buy a Corvette
if all I need is a malibu.

Alex Turner
netEconomist

On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 6:15 PM
> > To: Dave Held
> > Cc: pgsql-performance@postgresql.org
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> > Looking at the numbers, the raptor with TCQ enabled was close or
> > beat the Atlas III 10k drive on most benchmarks.
> 
> And I would be willing to bet that the Atlas 10k is not using the
> same generation of technology as the Raptors.
> 
> > Naturally a 15k drive is going to be faster in many areas, but it
> > is also much more expensive.  It was only 44% better on the server
> > tests than the raptor with TCQ, but it costs nearly 300% more ($538
> > cdw.com, $180 newegg.com).
> 
> State that in terms of cars.  Would you be willing to pay 300% more
> for a car that is 44% faster than your competitor's?  Of course you
> would, because we all recognize that the cost of speed/performance
> does not scale linearly.  Naturally, you buy the best speed that you
> can afford, but when it comes to hard drives, the only major feature
> whose price tends to scale anywhere close to linearly is capacity.
> 
> > Note also that the 15k drive was the only drive that kept up with
> > the raptor on raw transfer speed, which is going to matter for WAL.
> 
> So get a Raptor for your WAL partition. ;)
> 
> > [...]
> > The Raptor drives can be had for as little as $180/ea, which is
> > quite a good price point considering they can keep up with their
> > SCSI 10k RPM counterparts on almost all tests with NCQ enabled
> > (Note that 3ware controllers _don't_ support NCQ, although they
> > claim their HBA based queueing is 95% as good as NCQ on the drive).
> 
> Just keep in mind the points made by the Seagate article.  You're
> buying much more than just performance for that $500+.  You're also
> buying vibrational tolerance, high MTBF, better internal
> environmental controls, and a pretty significant margin on seek time,
> which is probably your most important feature for disks storing tables.
> An interesting test would be to stick several drives in a cabinet and
> graph how performance is affected at the different price points/
> technologies/number of drives.
> 
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East,  Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
> 


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Alex Turner
No offense to that review, but it really wasn't that good, and
drew bad conclusions from the data.  I posted it originally and
immediately regretted it.

See http://www.tweakers.net/reviews/557/18

Amazingly the controller with 1Gig cache manages a write throughput of
750MB/sec on a single drive.

quote:
"Floating high above the crowd, the ARC-1120 has a perfect view on the
struggles of the other adapters. "

It's because the adapter has 1Gig of RAM, nothing to do with the RAID
architecture, it's clearly caching the entire dataset.  The drive
can't physically run that fast.  These guys really don't know what they
are doing.

Curiously:
http://www.tweakers.net/reviews/557/25

The 3ware does very well as a data drive for MySQL.

The size of your cache is going to _directly_ affect RAID 5
performance.  Put a gig of memory in a 3ware 9500S and benchmark it
against the Areca then.

Also - folks don't run data partitions on RAID 5 because the write
speed is too low.  When you look at the results for RAID 10, the 3ware
leads the pack.
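
For reference, the usual arithmetic behind the RAID 5 write penalty: a
small random write that misses the controller cache costs four disk
I/Os (read old data, read old parity, write new data, write new parity,
where new parity = old parity XOR old data XOR new data).  That's why
cache size and full-stripe writes matter so much for RAID 5 and far
less for RAID 10.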

See also:
http://www20.tomshardware.com/storage/20041227/areca-raid6-06.html

I trust Tom's Hardware a little more to set up a good review, to be honest.

The 3ware trounces the Areca in all IO/sec tests.

Alex Turner
netEconomist

On 4/15/05, Marinos Yannikos <[EMAIL PROTECTED]> wrote:
> Joshua D. Drake wrote:
> > Well I have never even heard of it. 3ware is the de facto authority on
> > reasonable SATA RAID.
> 
> no! 3ware was rather early in this business, but there are plenty of
> (IMHO, and some other people's opinion) better alternatives available.
> 3ware has good Linux drivers, but the performance of their current
> controllers isn't that good.
> 
> Have a look at this: http://www.tweakers.net/reviews/557/1
> 
> especially the sequential writes with RAID-5 on this page:
> 
> http://www.tweakers.net/reviews/557/19
> 
> We have been a long-time user of a 3ware 8506 controller (8 disks,
> RAID-5) and have purchased 2 Areca ARC-1120 now since we weren't
> satisfied with the performance and the 2TB per array limit...
> 
> -mjy
> 


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Richard_D_Levine
Dave wrote: "An interesting test would be to stick several drives in a
cabinet and graph how performance is affected at the different price
points/technologies/number of drives."

From the discussion on the $7k server thread, it seems the RAID
controller would be an important data point also.  And RAID level.
And application load/kind.

Hmmm.  I just talked myself out of it.  Seems like I'd end up with
something akin to those database benchmarks we all love to hate.

Rick

[EMAIL PROTECTED] wrote on 04/15/2005 08:40:13 AM:

> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 6:15 PM
> > To: Dave Held
> > Cc: pgsql-performance@postgresql.org
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> > Looking at the numbers, the raptor with TCQ enabled was close or
> > beat the Atlas III 10k drive on most benchmarks.
>
> And I would be willing to bet that the Atlas 10k is not using the
> same generation of technology as the Raptors.
>
> > Naturally a 15k drive is going to be faster in many areas, but it
> > is also much more expensive.  It was only 44% better on the server
> > tests than the raptor with TCQ, but it costs nearly 300% more ($538
> > cdw.com, $180 newegg.com).
>
> State that in terms of cars.  Would you be willing to pay 300% more
> for a car that is 44% faster than your competitor's?  Of course you
> would, because we all recognize that the cost of speed/performance
> does not scale linearly.  Naturally, you buy the best speed that you
> can afford, but when it comes to hard drives, the only major feature
> whose price tends to scale anywhere close to linearly is capacity.
>
> > Note also that the 15k drive was the only drive that kept up with
> > the raptor on raw transfer speed, which is going to matter for WAL.
>
> So get a Raptor for your WAL partition. ;)
>
> > [...]
> > The Raptor drives can be had for as little as $180/ea, which is
> > quite a good price point considering they can keep up with their
> > SCSI 10k RPM counterparts on almost all tests with NCQ enabled
> > (Note that 3ware controllers _don't_ support NCQ, although they
> > claim their HBA based queueing is 95% as good as NCQ on the drive).
>
> Just keep in mind the points made by the Seagate article.  You're
> buying much more than just performance for that $500+.  You're also
> buying vibrational tolerance, high MTBF, better internal
> environmental controls, and a pretty significant margin on seek time,
> which is probably your most important feature for disks storing tables.
> An interesting test would be to stick several drives in a cabinet and
> graph how performance is affected at the different price points/
> technologies/number of drives.
>
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East,  Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
>
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>   message can get through to the mailing list cleanly




Re: [PERFORM] plperl vs plpgsql

2005-04-15 Thread Christopher Browne
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] (Alex) belched out:
> i am thinking about switching to plperl as it seems to me much more
> flexible and easier to create functions.
>
> what is the recommended PL for postgres? or which one is most widely
> used / most popular?
> is there a performance difference between plpgsql and plperl ?

If what you're trying to do is "munge text," pl/perl will be a whole
lot more suitable than pl/pgsql because it has a rich set of text
munging tools and string functions which pl/pgsql lacks.

If you intend to do a lot of work involving reading unmunged tuples
from this table and that, pl/pgsql provides a much more natural
syntax, and will probably be a bit faster as the query processor may
even be able to expand some of the actions, rather than needing to
treat Perl code as an "opaque blob."

I would definitely be inclined to use the more natural language for
the given task...
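
To make the tradeoff concrete, here's a minimal sketch (the function
names, table and task are illustrative, and the $$ dollar quoting
assumes 8.0 or later):

  -- text munging: one line of Perl, awkward in pl/pgsql
  CREATE FUNCTION strip_nondigits(text) RETURNS text AS $$
      my $s = shift; $s =~ s/\D//g; return $s;
  $$ LANGUAGE plperl;

  -- reading tuples: natural in pl/pgsql, no opaque Perl layer
  CREATE FUNCTION orders_for(cust integer) RETURNS bigint AS $$
  DECLARE n bigint;
  BEGIN
      SELECT count(*) INTO n FROM orders WHERE customer_id = cust;
      RETURN n;
  END;
  $$ LANGUAGE plpgsql;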
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org').
http://linuxdatabases.info/info/internet.html
"If you want to talk with some experts about something, go to the bar
where they hang out, buy a round of beers, and they'll surely talk
your ear off, leaving you wiser than before.

If you, a stranger, show up at the bar, walk up to the table, and ask
them to fax you a position paper, they'll tell you to call their
office in the morning and ask for a rate sheet." -- Miguel Cruz



Re: [PERFORM] speed of querry?

2005-04-15 Thread Tom Lane
"Joel Fradkin" <[EMAIL PROTECTED]> writes:
> "Merge Join  (cost=49697.60..50744.71 rows=14987 width=113) (actual
> time=11301.160..12171.072 rows=160593 loops=1)"
> "  Merge Cond: ("outer".locationid = "inner".locationid)"
> "  ->  Sort  (cost=788.81..789.89 rows=432 width=49) (actual
> time=3.318..3.603 rows=441 loops=1)"
> "Sort Key: l.locationid"
> "->  Index Scan using ix_location on tbllocation l
> (cost=0.00..769.90 rows=432 width=49) (actual time=0.145..2.283 rows=441
> loops=1)"
> "  Index Cond: ('SAKS'::text = (clientnum)::text)"
> "  ->  Sort  (cost=48908.79..49352.17 rows=177352 width=75) (actual
> time=11297.774..11463.780 rows=160594 loops=1)"
> "Sort Key: a.locationid"
> "->  Merge Right Join  (cost=26247.95..28942.93 rows=177352
> width=75) (actual time=8357.010..9335.362 rows=177041 loops=1)"
> "  Merge Cond: ((("outer".clientnum)::text =
> "inner"."?column10?") AND ("outer".id = "inner".jobtitleid))"
> "  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt
> (cost=0.00..243.76 rows=6604 width=37) (actual time=0.122..12.049 rows=5690
> loops=1)"
> "Filter: (1 = presentationid)"
> "  ->  Sort  (cost=26247.95..26691.33 rows=177352 width=53)
> (actual time=8342.271..8554.943 rows=177041 loops=1)"
> "Sort Key: (a.clientnum)::text, a.jobtitleid"
> "->  Index Scan using ix_associate_clientnum on
> tblassociate a  (cost=0.00..10786.17 rows=177352 width=53) (actual
> time=0.166..1126.052 rows=177041 loops=1)"
> "  Index Cond: ((clientnum)::text = 'SAKS'::text)"
> "Total runtime: 12287.502 ms"

It strikes me as odd that the thing isn't considering hash joins for
at least some of these steps.  Can you force it to (by setting
enable_mergejoin off)?  If not, what are the datatypes of the join
columns exactly?
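
For reference, a minimal way to run that experiment (an illustrative
psql session, not output from your box):

  SET enable_mergejoin = off;
  EXPLAIN ANALYZE
  SELECT * FROM viwassoclist WHERE clientnum = 'SAKS';
  -- compare the plan and runtime against the merge-join version; if it
  -- still won't pick a hash join, check \d output for the join columns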

regards, tom lane



Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Richard van den Berg
Tom Lane wrote:
> You didn't show us any \timing.  The 94.109 ms figure is all server-side.
Whoops, my mistake. I had been retesting without the explain, just the
query. I re-ran the explain analyze a few times, and it only reports
90ms the first time. After that it reports 2ms even over the network
(the \timing on those is about 50ms, which includes the network latency).

Thanks,
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Tom Lane
Richard van den Berg <[EMAIL PROTECTED]> writes:
> Christopher Kings-Lynne wrote:
>> No explain analyze is done on the server...

> Yes, but the psql \timing is calculated on the client, right? That is 
> the value that PFC was refering to.

You didn't show us any \timing.  The 94.109 ms figure is all server-side.

As an example:

regression=# \timing
Timing is on.
regression=# explain analyze select * from tenk1;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.050..149.615 rows=10000 loops=1)
 Total runtime: 188.518 ms
(2 rows)

Time: 210.885 ms
regression=#

Here, 188.5 is at the server, 210.8 is at the client.  The difference is
not all network delay, either --- parse/plan overhead is in there too.

regards, tom lane



Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Dave Held
> -Original Message-
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 6:15 PM
> To: Dave Held
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> 
> Looking at the numbers, the raptor with TCQ enabled was close or
> beat the Atlas III 10k drive on most benchmarks.

And I would be willing to bet that the Atlas 10k is not using the
same generation of technology as the Raptors.

> Naturally a 15k drive is going to be faster in many areas, but it
> is also much more expensive.  It was only 44% better on the server
> tests than the raptor with TCQ, but it costs nearly 300% more ($538
> cdw.com, $180 newegg.com).

State that in terms of cars.  Would you be willing to pay 300% more
for a car that is 44% faster than your competitor's?  Of course you
would, because we all recognize that the cost of speed/performance
does not scale linearly.  Naturally, you buy the best speed that you
can afford, but when it comes to hard drives, the only major feature
whose price tends to scale anywhere close to linearly is capacity.

> Note also that the 15k drive was the only drive that kept up with
> the raptor on raw transfer speed, which is going to matter for WAL.

So get a Raptor for your WAL partition. ;)

> [...]
> The Raptor drives can be had for as little as $180/ea, which is
> quite a good price point considering they can keep up with their
> SCSI 10k RPM counterparts on almost all tests with NCQ enabled
> (Note that 3ware controllers _don't_ support NCQ, although they
> claim their HBA based queueing is 95% as good as NCQ on the drive).

Just keep in mind the points made by the Seagate article.  You're
buying much more than just performance for that $500+.  You're also
buying vibrational tolerance, high MTBF, better internal 
environmental controls, and a pretty significant margin on seek time,
which is probably your most important feature for disks storing tables.
An interesting test would be to stick several drives in a cabinet and
graph how performance is affected at the different price points/
technologies/number of drives.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] speed of querry?

2005-04-15 Thread Joel Fradkin
It is still slower on the Linux box. (Included is the explain with SET
enable_seqscan = off;
explain analyze select * from viwassoclist where clientnum = 'SAKS'.) See
below.

I did a few other tests (changing drive arrays helped by 1 second; it
was slower on my RAID 10 on the PowerVault).

Pulling just raw data is much faster on the Linux box.
"Seq Scan on tblresponse_line  (cost=1.00..100089717.78 rows=4032078
width=67) (actual time=0.028..4600.431 rows=4032078 loops=1)"
"Total runtime: 6809.399 ms"
Windows box
"Seq Scan on tblresponse_line  (cost=0.00..93203.68 rows=4031968 width=67)
(actual time=16.000..11316.000 rows=4031968 loops=1)"
"Total runtime: 16672.000 ms"

I am going to reload the databases, just to see what I get.
I am thinking I may have to flatten the files for postgres (eliminate joins
of any kind for reporting etc). It might make a good deal more data, but I
think from the app's point of view it is a good idea anyway; I'm just not
sure how to handle editing.
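
A minimal sketch of that flattening idea (the target table name is
illustrative; viwassoclist is the view from the explain below):

  -- materialize the join once, then report off the flat copy
  SELECT * INTO flat_assoclist FROM viwassoclist;
  CREATE INDEX ix_flat_assoclist_clientnum ON flat_assoclist (clientnum);
  -- reports read flat_assoclist; edits still go to the base tables,
  -- and the flat copy is rebuilt (or maintained by triggers) afterward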

Joel Fradkin
 
"Merge Join  (cost=49697.60..50744.71 rows=14987 width=113) (actual
time=11301.160..12171.072 rows=160593 loops=1)"
"  Merge Cond: ("outer".locationid = "inner".locationid)"
"  ->  Sort  (cost=788.81..789.89 rows=432 width=49) (actual
time=3.318..3.603 rows=441 loops=1)"
"Sort Key: l.locationid"
"->  Index Scan using ix_location on tbllocation l
(cost=0.00..769.90 rows=432 width=49) (actual time=0.145..2.283 rows=441
loops=1)"
"  Index Cond: ('SAKS'::text = (clientnum)::text)"
"  ->  Sort  (cost=48908.79..49352.17 rows=177352 width=75) (actual
time=11297.774..11463.780 rows=160594 loops=1)"
"Sort Key: a.locationid"
"->  Merge Right Join  (cost=26247.95..28942.93 rows=177352
width=75) (actual time=8357.010..9335.362 rows=177041 loops=1)"
"  Merge Cond: ((("outer".clientnum)::text =
"inner"."?column10?") AND ("outer".id = "inner".jobtitleid))"
"  ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt
(cost=0.00..243.76 rows=6604 width=37) (actual time=0.122..12.049 rows=5690
loops=1)"
"Filter: (1 = presentationid)"
"  ->  Sort  (cost=26247.95..26691.33 rows=177352 width=53)
(actual time=8342.271..8554.943 rows=177041 loops=1)"
"Sort Key: (a.clientnum)::text, a.jobtitleid"
"->  Index Scan using ix_associate_clientnum on
tblassociate a  (cost=0.00..10786.17 rows=177352 width=53) (actual
time=0.166..1126.052 rows=177041 loops=1)"
"  Index Cond: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 12287.502 ms"


> This is above and beyond toying with the column statistics.  You
> are basically telling the planner to use an index.  Try this,
> and post the EXPLAIN ANALYZE for the seqscan = off case on the
> slow box if it doesn't speed things up for you.
>
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East,  Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Alan Stange
PFC wrote:

>> My argument is that a sufficiently smart kernel scheduler *should*
>> yield performance results that are reasonably close to what you can
>> get with that feature.  Perhaps not quite as good, but reasonably
>> close.  It shouldn't be an orders-of-magnitude type difference.
>
> And a controller card (or drive) has a lot less RAM to use as a
> cache / queue for reordering stuff than the OS has; potentially the
> OS can use most of the available RAM, which can be gigabytes on a big
> server, whereas in the drive there are at most a few tens of
> megabytes...
>
> However, all this is a bit like looking at the problem through the
> wrong end.  The OS should provide a multi-read call for the
> applications to pass a list of blocks they'll need, then reorder them
> and read them the fastest possible way, clustering them with similar
> requests from other threads.
>
> Right now when a thread/process issues a read() it will block until
> the block is delivered to this thread.  The OS does not know if this
> thread will then need the next block (which can be had very cheaply
> if you know ahead of time you'll need it) or not.  Thus it must make
> guesses, read ahead (sometimes), etc...
All true.  Which is why high performance computing folks use 
aio_read()/aio_write() and load up the kernel with all the requests they 
expect to make. 

The kernels that I'm familiar with will do read ahead on files based on 
some heuristics:  when you read the first byte of a file the OS will 
typically load up several pages of the file (depending on file size, 
etc).  If you continue doing read() calls without a seek() on the file 
descriptor the kernel will get the hint that you're doing a sequential 
read and continue caching up the pages ahead of time, usually using the 
pages you just read to hold the new data so that one isn't bloating out 
memory with data that won't be needed again.  Throw in a seek() and the 
amount of read ahead caching may be reduced.

One point that is being missed in all this discussion is that the file 
system also imposes some constraints on how IOs can be done.  For 
example, simply doing one large write() doesn't emit a stream 
of sequential blocks to the drives.  Some file systems (UFS was one) 
would force portions of large files into other cylinder groups so that 
small files could be located near the inode data, thus avoiding/reducing 
the size of seeks.  Similarly, extents need to be allocated and the 
bitmaps recording this data usually need synchronous updates, which will 
require some seeks, etc.  Not to mention the need to update inode data, 
etc.  Anyway, my point is that the allocation policies of the file 
system can confuse the situation.

Also, the seek times one sees reported are an average.  One really needs 
to look at the track-to-track seek time and also the "full stroke" seek 
times.   It takes a *long* time to move the heads across the whole 
platter.  I've seen people partition drives to only use small regions of 
the drives to avoid long seeks and to better use the increased number of 
bits going under the head in one rotation.   A 15K drive doesn't 
necessarily have a faster seek time than a 10K drive just because the 
rotational speed is higher.  The average seek time might be faster just 
because the 15K drives are smaller, with fewer cylinders.

-- Alan


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Richard van den Berg
Christopher Kings-Lynne wrote:
> No explain analyze is done on the server...
Yes, but the psql \timing is calculated on the client, right? That is 
the value that PFC was referring to.

--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---


[PERFORM] plperl vs plpgsql

2005-04-15 Thread Alex
Hi,
i am thinking about switching to plperl as it seems to me much more 
flexible and easier to create functions.

what is the recommended PL for postgres? or which one is most widely 
used / most popular?
is there a performance difference between plpgsql and plperl?

porting to other systems is not a real issue as all my servers have perl 
installed.

Thanks for any advice
Alex



Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Christopher Kings-Lynne
> Am I correct in assuming that the timings are calculated locally by
> psql on my client, thus including network latency?
No explain analyze is done on the server...
Chris


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Richard van den Berg
PFC wrote:
> 94 ms for an index scan?
> this looks really slow...
That seems to be network latency. My psql client is connecting over 
ethernet to the database server. Retrying the command gives very 
different values, as low as 20ms. That 94ms was the highest I've seen. 
Running the same command locally (via a Unix socket) yields 2.5 ms every 
time.

Am I correct in assuming that the timings are calculated locally by psql 
on my client, thus including network latency?
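
A quick way to see the pure round-trip cost (an illustrative psql
session; the times will obviously vary):

  => \timing
  Timing is on.
  => SELECT 1;
  -- the Time: figure psql reports for this is almost entirely client
  -- and network overhead, since the server does essentially no work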

--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-15 Thread Marinos Yannikos
Joshua D. Drake wrote:
> Well I have never even heard of it. 3ware is the de facto authority
> on reasonable SATA RAID.
no! 3ware was rather early in this business, but there are plenty of 
(IMHO, and some other people's opinion) better alternatives available. 
3ware has good Linux drivers, but the performance of their current 
controllers isn't that good.

Have a look at this: http://www.tweakers.net/reviews/557/1
especially the sequential writes with RAID-5 on this page:
http://www.tweakers.net/reviews/557/19
We have been a long-time user of a 3ware 8506 controller (8 disks, 
RAID-5) and have purchased 2 Areca ARC-1120 now since we weren't 
satisfied with the performance and the 2TB per array limit...

-mjy


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread PFC

> Index Scan using ix_B on B (cost=0.04..3.06 rows=1 width=329) (actual
> time=93.824..93.826 rows=1 loops=1)
>     Index Cond: (id = $0)
>     InitPlan
>       ->  Limit  (cost=0.00..0.04 rows=1 width=4) (actual
> time=15.128..15.129 rows=1 loops=1)
>         ->  Seq Scan on A  (cost=0.00..47569.70 rows=1135570
> width=4) (actual time=15.121..15.121 rows=1 loops=1)
>   Total runtime: 94.109 ms

	94 ms for an index scan?
	This looks really slow...
	Was the index in the RAM cache? Does it fit? Is it faster the second  
time? If it's still that slow, something somewhere is severely screwed.

	B has 150K rows you say, so everything about B should fit in RAM, and  
you should get 0.2 ms for an index scan, not 90 ms!
	Try this:

	Locate the files on disk which are involved in table B (table +  
indexes) by looking at the system catalogs.
	Look at the size of the files. Is the index severely bloated? REINDEX?  
DROP/recreate the index?
	Load them into the RAM cache (just cat files | wc -c several times  
until it's almost instantaneous).
	Retry your query and your COPY.

	I know it's stupid... but it's a lot faster to load an index into the  
cache by plainly reading the file rather than accessing it randomly.
	(Even though, with this number of rows, it should not be THAT slow!)
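
	A sketch of that catalog lookup (table and index names are
illustrative; relpages is counted in 8KB blocks):

	SELECT relname, relfilenode, relpages, relpages * 8 AS approx_kb
	FROM pg_class
	WHERE relname IN ('b', 'ix_b');
	-- the underlying files live under $PGDATA/base/<database oid>/<relfilenode>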



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread PFC

> platter compared to the rotational speed, which would agree with the
> fact that you can read 70MB/sec, but it takes up to 13ms to seek.

	Actually:
	- the head has to be moved;
	this time depends on the distance: for instance, moving from one
cylinder to the next is very fast (it needs to be, to get good
throughput);
	- then you have to wait for the disk to spin until the information you  
want comes in front of the head... statistically you have to wait half a  
rotation. And this does not depend on the distance between the
cylinders; it depends on the position of the data in the cylinder.
	The more RPMs you have, the less you wait, which is why faster-RPM  
drives have faster seeks (they must also have faster actuators to move  
the head)...
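
	To put numbers on the rotational part: one revolution takes 60/RPM
seconds, and on average you wait half of it. At 10,000 RPM a revolution
takes 6 ms, so about 3 ms average rotational latency; at 15,000 RPM it
takes 4 ms, so about 2 ms. (Worked figures, not from the original post.)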



Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread PFC

> My argument is that a sufficiently smart kernel scheduler *should*
> yield performance results that are reasonably close to what you can
> get with that feature.  Perhaps not quite as good, but reasonably
> close.  It shouldn't be an orders-of-magnitude type difference.

	And a controller card (or drive) has a lot less RAM to use as a cache /  
queue for reordering stuff than the OS has; potentially the OS can use  
most of the available RAM, which can be gigabytes on a big server,
whereas in the drive there are at most a few tens of megabytes...

	However, all this is a bit like looking at the problem through the
wrong end. The OS should provide a multi-read call for the applications
to pass a  
list of blocks they'll need, then reorder them and read them the fastest  
possible way, clustering them with similar requests from other threads.

	Right now when a thread/process issues a read() it will block until the  
block is delivered to this thread. The OS does not know if this thread  
will then need the next block (which can be had very cheaply if you know  
ahead of time you'll need it) or not. Thus it must make guesses, read  
ahead (sometimes), etc...



Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Richard van den Berg
PFC wrote:
> You're using 7.4.5. It's possible that you have a type mismatch in
> your foreign keys which prevents use of the index on B.
I read about this pothole and made damn sure the types match. (Actually, 
I kinda hoped that was the problem, it would have been an easy fix.)

> First of all, be really sure it's THAT foreign key, ie. do your COPY
> with only ONE foreign key at a time if you have several, and see which
> one is the killer.
I took exactly this route, and the first FK I tried already hit the 
jackpot. The real table had 4 FKs.

> EXPLAIN ANALYZE the following:
> SELECT * FROM B WHERE id = (SELECT id FROM A LIMIT 1);
> It should use the index. Does it?
It sure looks like it:
Index Scan using ix_B on B (cost=0.04..3.06 rows=1 width=329) (actual 
time=93.824..93.826 rows=1 loops=1)
   Index Cond: (id = $0)
   InitPlan
 ->  Limit  (cost=0.00..0.04 rows=1 width=4) (actual 
time=15.128..15.129 rows=1 loops=1)
   ->  Seq Scan on A  (cost=0.00..47569.70 rows=1135570 
width=4) (actual time=15.121..15.121 rows=1 loops=1)
 Total runtime: 94.109 ms

The real problem seems to be what Chris and Stephen pointed out: even 
though the FK check is deferred, it is done on a per-row basis. With 1M 
rows, this just takes forever.
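
The usual workaround for bulk loads is to drop the constraint and
re-add it around the COPY; a sketch, with an illustrative constraint
name and data path:

  BEGIN;
  ALTER TABLE A DROP CONSTRAINT a_id_fkey;
  COPY A FROM '/path/to/data';
  -- re-adding the constraint validates all rows in one pass instead
  -- of firing the per-row check during the COPY
  ALTER TABLE A ADD CONSTRAINT a_id_fkey
      FOREIGN KEY (id) REFERENCES B (id);
  COMMIT;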

Thanks for the help.
--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a|  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands|  Fax  : +31 70 3603009
---