Re: [PERFORM] How to improve db performance with $7K?
Rosser Schwarz wrote: > while you weren't looking, Kevin Brown wrote: > > [reordering bursty reads] > > > In other words, it's a corner case that I strongly suspect > > isn't typical in situations where SCSI has historically made a big > > difference. > > [...] > > > But I rather doubt that has to be a huge penalty, if any. When a > > process issues an fsync (or even a sync), the kernel doesn't *have* to > > drop everything it's doing and get to work on it immediately. It > > could easily gather a few more requests, bundle them up, and then > > issue them. > > To make sure I'm following you here, are you or are you not suggesting > that the kernel could sit on -all- IO requests for some small handful > of ms before actually performing any IO to address what you "strongly > suspect" is a "corner case"? The kernel *can* do so. Whether or not it's a good idea depends on the activity in the system. You'd only consider doing this if you didn't already have a relatively large backlog of I/O requests to handle. You wouldn't do this for every I/O request. Consider this: I/O operations to a block device are so slow compared with the speed of other (non I/O) operations on the system that the system can easily wait for, say, a hundredth of the typical latency on the target device before issuing requests to it and not have any real negative impact on the system's I/O throughput. A process running on my test system, a 3 GHz Xeon, can issue a million read system calls per second (I've measured it. I can post the rather trivial source code if you're interested). That's the full round trip of issuing the system call and having the kernel return back. That means that in the span of a millisecond, the system could receive 1000 requests if the system were busy enough. 
If the average latency for a random read from the disk (including head movement and everything) is 10 milliseconds, and we decide to delay the issuance of the first I/O request for a tenth of a millisecond (a hundredth of the latency), then the system might receive 100 additional I/O requests, which it could then put into the queue and sort by block address before issuing the read request. As long as the system knows what the last block that was requested from that physical device was, it can order the requests properly and then begin issuing them. Since the latency on the target device is so high, this is likely to be a rather big win for overall throughput. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
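Kevin's batching argument is easy to sketch numerically. The following is a toy simulation, not what any real kernel does; the block range and queue depth are invented. It compares total head travel when queued requests are serviced in arrival order versus sorted by block address:

```python
import random

def head_travel(blocks, start=0):
    """Total distance the head moves servicing requests in the given order."""
    pos, total = start, 0
    for b in blocks:
        total += abs(b - pos)
        pos = b
    return total

random.seed(1)
# Pretend 100 random single-block reads arrived during the batching window
# on a disk with 160 million addressable blocks (an invented geometry).
queue = [random.randrange(160_000_000) for _ in range(100)]

fifo = head_travel(queue)            # service in arrival order
swept = head_travel(sorted(queue))   # one elevator-style sweep
print(swept < fifo)                  # the sweep moves the head far less
```

Servicing the sorted queue is a single sweep across the platter, so its travel is bounded by the disk size, while the arrival-order travel grows with every random jump.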
Re: [PERFORM] How to improve db performance with $7K?
Vivek Khera wrote: > > On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote: > > >Now, bad block remapping destroys that guarantee, but unless you've > >got a LOT of bad blocks, it shouldn't destroy your performance, right? > > > > ALL disks have bad blocks, even when you receive them. Do you honestly > think that these large disks made today (18+ GB is the smallest now) > have no defects on the surfaces? Oh, I'm not at all arguing that you won't have bad blocks. My argument is that the probability of any given block read or write operation actually dealing with a remapped block is going to be relatively small, unless the fraction of bad blocks to total blocks is large (in which case you basically have a bad disk). And so the ability to account for remapped blocks shouldn't itself represent a huge improvement in overall throughput. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] How to improve db performance with $7K?
Tom Lane wrote: > Kevin Brown <[EMAIL PROTECTED]> writes: > > In the case of pure random reads, you'll end up having to wait an > > average of half of a rotation before beginning the read. > > You're assuming the conclusion. The above is true if the disk is handed > one request at a time by a kernel that doesn't have any low-level timing > information. If there are multiple random requests on the same track, > the drive has an opportunity to do better than that --- if it's got all > the requests in hand. True, but see below. Actually, I suspect what matters is whether they're on the same cylinder (which may be what you're talking about here). And in the above, I was assuming randomly distributed single-sector reads. In that situation, we can't generically know the probability that more than one will appear on the same cylinder without knowing something about the drive geometry. That said, most modern drives have tens of thousands of cylinders (the Seagate ST380011a, an 80 gigabyte drive, has 94,600 tracks per inch according to its datasheet), but much, much smaller queue lengths (tens of entries, hundreds at most, I'd expect. Hard data on this would be appreciated). For purely random reads, the probability that two or more requests in the queue happen to be in the same cylinder is going to be quite small. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
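For what it's worth, the "quite small" claim is just the birthday problem. Here's a back-of-the-envelope calculation; the cylinder and queue counts are assumptions for illustration, not the ST380011a's real geometry:

```python
def p_shared_cylinder(queue_len, cylinders):
    """P(at least two of queue_len uniformly random requests hit the same
    cylinder), computed as 1 - P(all requests land on distinct cylinders)."""
    p_distinct = 1.0
    for i in range(queue_len):
        p_distinct *= (cylinders - i) / cylinders
    return 1.0 - p_distinct

# 32 queued requests against an assumed 100,000 cylinders
p = p_shared_cylinder(32, 100_000)
print(p < 0.01)   # under half a percent
```

With tens of thousands of cylinders and only tens of queue entries, the collision probability stays well below one percent, which is the point being made above.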
Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How
Rod Taylor wrote: > On Fri, 2005-04-15 at 15:43 -0500, Matthew Nuzum wrote: >> * I agree with the threads that more disks are better. >> * I also agree that SCSI is better, but can be hard to justify Here's another approach to spending $7000 that we're currently trying, but it'll only work for certain systems: ones where you can use load balancing and/or application-level partitioning of your software. For $859 you can buy a Dell SC1425 with (*see footnote) 2 Xeon 2.8GHz processors (*see footnote) 1 GB ram 1 80GB hard drive. (*see footnote) Doing the math, it seems I could get 8 of these systems for that $6870, giving me: 16 Xeon processors (*see footnote), 640 GB of disk space spread over 8 spindles 8 GB of ram 16 1Gbps network adapters. Despite the non-optimal hardware (* see footnote), the price of each system and extra redundancy may make up the difference for some applications. For example, I didn't see many other $7000 proposals that have nearly 10GB of ram, or over a dozen CPUs (even counting the raid controllers), or over half a terabyte of storage, or capable of 5-10 Gbit/sec of network traffic... The extra capacity would allow me to have redundancy that would somewhat make up for the flakier hardware, no raid, etc. Thoughts? Over the next couple months I'll be evaluating a cluster of 4 systems almost exactly as I described (but with cheaper dual hard drives in each system), for a GIS system that does lend itself well to application-level partitioning. Ron (* footnotes) Yeah, I know some reports here say that dual Xeons can suck; but Dell's throwing in the second one for free. Yeah, I know some reports here say Dells can suck, but it was easy to get a price quote online, and they're a nice business partner of ours. Yeah, I should get 2 hard drives in each system, but Dell wanting an additional $160 for an 80GB hard drive is not a good deal.
Yeah, I know I'd be better off with 2GB ram, but Dell wants $400 (half the price of an entire additional system) for the upgrade from 1GB to 2. I also realize that application level partitioning needed to take advantage of a loose cluster like this is not practical for many applications. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] speed of querry?
"Joel Fradkin" <[EMAIL PROTECTED]> writes: > Turning off merge joins seems to have done it, but what do I need to do so I am > not telling the system explicitly not to use them? I must be missing some > setting? > " -> Hash Left Join (cost=185.57..226218.77 rows=177236 width=75) (actual > time=21.147..2221.098 rows=177041 loops=1)" > "Hash Cond: (("outer".jobtitleid = "inner".id) AND > (("outer".clientnum)::text = ("inner".clientnum)::text))" It's overestimating the cost of this join for some reason ... and I think I see why. It's not accounting for the combined effect of the two hash clauses, only for the "better" one. What are the statistics for tbljobtitle.id and tbljobtitle.clientnum --- how many distinct values of each, and are the distributions skewed to a few popular values? regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How
On Fri, 2005-04-15 at 15:43 -0500, Matthew Nuzum wrote: > I think there are many people who feel that $7,000 is a good budget for a > database server, me being one. The budget for a database server is usually some %age of the value of the data within the database or the value of its availability. Big-budget hardware (well, going from $7k to $100k) often brings more redundancy and reliability improvement than performance improvement. If you're going to lose $100k in business because the database was unavailable for 12 hours, then kick $75k into the hardware and call a profit of $25k over 3 years (hardware lifetime is 3 years, catastrophic failure happens once every 3 or so years...). Ditto for backup systems. If the company depends on the data in the database for its survival, where bankruptcy or worse would happen as a result of complete data loss, then it would be a good idea to invest a significant amount of the company's revenue into making damn sure that doesn't happen. Call it an insurance policy. Performance for me dictates which hardware is purchased and configuration is used within $BUDGET, but $BUDGET itself is nearly always defined by the value of the data stored. > * I agree with the threads that more disks are better. > * I also agree that SCSI is better, but can be hard to justify if your > budget is tight, and I have great certainty that 2x SATA drives on a good > controller is better than x SCSI drives for many work loads. > * I also feel that good database design and proper maintenance can be one > of the single biggest performance enhancers available. This can be labor > intensive, however, and sometimes throwing more hardware at a problem is > cheaper than restructuring a db. > > Either way, having a good hardware platform is an excellent place to start, > as much of your tuning will depend on certain aspects of your hardware. > > So if you need a db server, and you have $7k to spend, I'd say spend it. 
> From this list, I've gathered that I/O and RAM are your two most important > investments. > > Once you get that figured out, you can still do some performance tuning on > your new server using the excellent advice from this mailing list. > > By the way, for all those who make this list work, I've rarely found such a > thorough, helpful and considerate group of people as these on the > performance list. > -- ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
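Rod's insurance arithmetic above can be written down explicitly. All figures are his illustrative numbers from the post, not real quotes:

```python
# Hedged sketch of the "insurance policy" math from the post above.
outage_cost_per_failure = 100_000  # business lost in a 12-hour outage
redundancy_spend = 75_000          # extra hardware beyond the baseline box
failures_per_lifetime = 1          # ~one catastrophe per 3-year hardware life

avoided_loss = outage_cost_per_failure * failures_per_lifetime
profit_over_lifetime = avoided_loss - redundancy_spend
print(profit_over_lifetime)  # 25000 over the 3-year hardware lifetime
```

The point is that the hardware budget falls out of the value of the data and its availability, not the other way around.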
Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2
Tom, People's opinions on pgbench may vary, so take what I say with a grain of salt. Here are my thoughts: 1) Test with no less than 200 transactions per client. I've heard that with less than this, your results will vary too much with the direction of the wind blowing. A high enough value will help rule out some "noise" factor. If I am wrong, please let me know. 2) How is the database going to be used? What percentage will be read/write if you had to guess? Pgbench is like TPC-B, which will help gauge the potential throughput of your tps. However, it may not stress the server enough to help you make key performance changes. Then again, benchmarks are like statistics...full of lies. 3) Run not just a couple pgbench runs, but *many* (I do between 20-40 runs) so you can rule out noise and gauge improvement on median results. 4) Find something with which you can test OLTP-type transactions. I used OSDB since it is simple to implement and use, although OSDL's OLTP testing will be closer to reality. Steve Poe ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
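Point 3 (compare medians across many runs, not single runs) looks like this in practice. The tps figures below are made up purely for illustration:

```python
import statistics

# Hypothetical tps results from repeated pgbench runs of two configurations.
# Individual runs are noisy; the medians are far more stable to compare.
runs_a = [118, 131, 102, 125, 99, 140, 122, 117, 130, 121]
runs_b = [129, 141, 120, 135, 118, 149, 133, 126, 138, 131]

median_a = statistics.median(runs_a)
median_b = statistics.median(runs_b)
print(median_a, median_b)  # compare these, not any single run
```

Note that the best and worst single runs of the two sets overlap, while the medians separate cleanly; that's the noise the 20-40 runs are meant to average out.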
Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2
"Thomas F.O'Connell" <[EMAIL PROTECTED]> writes: > http://www.sitening.com/pgbench.html You need to run *many* more transactions than that to get pgbench numbers that aren't mostly noise. In my experience 1000 transactions per client is a rock-bottom minimum to get repeatable numbers; 10000 per is better. Also, in any run where #clients >= scaling factor, what you're measuring is primarily contention to update the "branches" rows. Which is not necessarily a bad thing to check, but it's generally not the most interesting performance domain (if your app is like that you need to redesign the app...) > To me, it looks like basic transactional performance is modestly > improved at 8.0 across a variety of metrics. That's what I would expect --- we usually do some performance work in every release cycle, but there was not a huge amount of it for 8.0. However, these numbers don't prove much either way. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] immutable functions vs. join for lookups ?
Enrico Weigelt <[EMAIL PROTECTED]> writes: > c) CREATE FUNCTION id2username(oid) RETURNS text > LANGUAGE 'SQL' IMMUTABLE AS ' > SELECT username AS RESULT FROM users WHERE uid = $1'; This is simply dangerous. The function is *NOT* immutable (it is stable though). When ... not if ... your application breaks because you got the wrong answers, you'll get no sympathy from anyone. The correct question to ask was "if I make a stable function like this, is it likely to be faster than the join?". The answer is "probably not; at best it will be equal to the join". The best the planner is likely to be able to do with the function-based query is equivalent to a nestloop with inner indexscan (assuming there is an index on users.uid). If that's the best plan then the join case should find it too ... but if you are selecting a lot of rows then it won't be the best plan. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] pgbench Comparison of 7.4.7 to 8.0.2
I'm in the fortunate position of having a newly built database server that's pre-production. I'm about to run it through the ringer with some simulations of business data and logic, but I wanted to post the results of some preliminary pgbench marking. http://www.sitening.com/pgbench.html To me, it looks like basic transactional performance is modestly improved at 8.0 across a variety of metrics. I think this bodes well for more realistic loads, but I'll be curious to see the results of some of the simulations. I've still got a little bit of preparatory time with this box, so I can continue to do some experimentation. I'd be curious to see whether these numbers meet developer expectations and to see whether the developer and user community have insight into other pgbench options that would be useful to see. Thanks! -tfo -- Thomas F. O'Connell Co-Founder, Information Architect Sitening, LLC Strategic Open Source: Open Your i™ http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-260-0005 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[PERFORM] immutable functions vs. join for lookups ?
Hi folks, I like to use (immutable) functions for looking up several (almost constant) things, e.g. fetching a username by id. This makes my queries clearer. But is this really performant? Let's imagine: We've got a table with user accounts (uid,name,...). Then we've got another one which contains some items assigned to users, and so are linked to them by an uid field. Now we want to view the items with usernames instead of just uid: a) SELECT items.a, items.b, ..., users.username FROM items, users WHERE items.uid = users.uid; c) CREATE FUNCTION id2username(oid) RETURNS text LANGUAGE 'SQL' IMMUTABLE AS ' SELECT username AS RESULT FROM users WHERE uid = $1'; SELECT items.a, items.b, ..., id2username(users.uid); Which one is faster with a) only a few users (<50), b) many users (>1k), while we have several 10k of items? thx -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] clear function cache (WAS: SQL function inlining)
* Stephan Szabo <[EMAIL PROTECTED]> wrote: > > On Thu, 24 Mar 2005, Enrico Weigelt wrote: > > > * Alvaro Herrera <[EMAIL PROTECTED]> wrote: > > > On Thu, Mar 24, 2005 at 02:32:48PM +0100, Enrico Weigelt wrote: > > > > > > > BTW: is it possible to explicitly clear the cache for immutable > > > > functions ? > > > > > > What cache? There is no caching of function results. > > > > Not ? So what's immutable for ? > > For knowing that you can do things like use it in a functional index and > I think for things like constant folding in a prepared plan. So when can I expect the function to be reevaluated ? Next query ? Next session ? Random time ? cu -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How
I think there are many people who feel that $7,000 is a good budget for a database server, me being one. * I agree with the threads that more disks are better. * I also agree that SCSI is better, but can be hard to justify if your budget is tight, and I have great certainty that 2x SATA drives on a good controller is better than x SCSI drives for many work loads. * I also feel that good database design and proper maintenance can be one of the single biggest performance enhancers available. This can be labor intensive, however, and sometimes throwing more hardware at a problem is cheaper than restructuring a db. Either way, having a good hardware platform is an excellent place to start, as much of your tuning will depend on certain aspects of your hardware. So if you need a db server, and you have $7k to spend, I'd say spend it. From this list, I've gathered that I/O and RAM are your two most important investments. Once you get that figured out, you can still do some performance tuning on your new server using the excellent advice from this mailing list. By the way, for all those who make this list work, I've rarely found such a thorough, helpful and considerate group of people as these on the performance list. -- Matthew Nuzum <[EMAIL PROTECTED]> www.followers.net - Makers of "Elite Content Management System" View samples of Elite CMS in action by visiting http://www.followers.net/portfolio/ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] How to improve db performance with $7K?
Greg, et al. I never found any evidence of a "stop and get an intermediate request" functionality in the TCQ protocol. IIRC, what is there is 1) Ordered 2) Head First 3) Simple implemented as choices. *VERY* roughly, that'd be like (1) disk subsystem satisfies requests as submitted, (2) lets "this" request be put at the very head of the per se disk queue after the currently-running disk request is complete, and (3) is "let the per se disk and its software reorder the requests on-hand as per its onboard software". (N.B. in the last, it's the DISK not the controller making those decisions). (N.B. too, that this last is essentially what NCQ (cf. TCQ) is doing ) I know we've been batting around a hypothetical case of SCSI where it "stops and gets something on the way", but I can find no proof (yet) that this is done, pro forma, by SCSI drives. In other words, SCSI is a necessary, but not sufficient cause for intermediate reading. FWIW - Ross -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Friday, April 15, 2005 2:02 PM To: Tom Lane Cc: Kevin Brown; pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Tom Lane <[EMAIL PROTECTED]> writes: > Yes, you can probably assume that blocks with far-apart numbers are > going to require a big seek, and you might even be right in supposing > that a block with an intermediate number should be read on the way. > But you have no hope at all of making the right decisions at a more > local level --- say, reading various sectors within the same cylinder > in an optimal fashion. You don't know where the track boundaries are, > so you can't schedule in a way that minimizes rotational latency. > You're best off to throw all the requests at the drive together and > let the drive sort it out. Consider for example three reads, one at the beginning of the disk, one at the very end, and one in the middle. 
If the three are performed in the logical order (assuming the head starts at the beginning), then the drive has to seek, say, 4ms to get to the middle and 4ms to get to the end. But if the middle block requires a full rotation to reach it from when the head arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive). Whereas the drive could have seeked over to the last block, then seeked back in 8ms and gotten there just in time to perform the read for free. I'm not entirely convinced this explains all of the SCSI drives' superior performance though. The above is a worst-case scenario; it should really only have a small effect, and it's not like the drive firmware can really schedule things perfectly either. I think most of the difference is that the drive manufacturers just don't package their high end drives with ATA interfaces. So there are no 10k RPM ATA drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but most of the manufacturers aren't even doing that. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] speed of querry?
Joel Fradkin

Turning off merge joins seems to have done it, but what do I need to do so I am not telling the system explicitly not to use them? I must be missing some setting? On linux box.

explain analyze select * from viwassoclist where clientnum ='SAKS'

"Hash Join  (cost=988.25..292835.36 rows=15773 width=113) (actual time=23.514..3024.064 rows=160593 loops=1)"
"  Hash Cond: ("outer".locationid = "inner".locationid)"
"  ->  Hash Left Join  (cost=185.57..226218.77 rows=177236 width=75) (actual time=21.147..2221.098 rows=177041 loops=1)"
"        Hash Cond: (("outer".jobtitleid = "inner".id) AND (("outer".clientnum)::text = ("inner".clientnum)::text))"
"        ->  Seq Scan on tblassociate a  (cost=0.00..30851.25 rows=177236 width=53) (actual time=0.390..1095.385 rows=177041 loops=1)"
"              Filter: ((clientnum)::text = 'SAKS'::text)"
"        ->  Hash  (cost=152.55..152.55 rows=6604 width=37) (actual time=20.609..20.609 rows=0 loops=1)"
"              ->  Seq Scan on tbljobtitle jt  (cost=0.00..152.55 rows=6604 width=37) (actual time=0.033..12.319 rows=6603 loops=1)"
"                    Filter: (1 = presentationid)"
"  ->  Hash  (cost=801.54..801.54 rows=454 width=49) (actual time=2.196..2.196 rows=0 loops=1)"
"        ->  Index Scan using ix_location on tbllocation l  (cost=0.00..801.54 rows=454 width=49) (actual time=0.111..1.755 rows=441 loops=1)"
"              Index Cond: ('SAKS'::text = (clientnum)::text)"
"Total runtime: 3120.366 ms"

here are the table defs and view if that helps. I posted the config a while back, but can do it again if you need to see it. 
CREATE OR REPLACE VIEW viwassoclist AS
 SELECT a.clientnum, a.associateid, a.associatenum, a.lastname, a.firstname,
    jt.value AS jobtitle, l.name AS "location", l.locationid AS mainlocationid,
    l.divisionid, l.regionid, l.districtid,
    (a.lastname::text || ', '::text) || a.firstname::text AS assocname,
    a.isactive, a.isdeleted
   FROM tblassociate a
   LEFT JOIN tbljobtitle jt ON a.jobtitleid = jt.id
    AND jt.clientnum::text = a.clientnum::text AND 1 = jt.presentationid
   JOIN tbllocation l ON a.locationid = l.locationid
    AND l.clientnum::text = a.clientnum::text;

CREATE TABLE tblassociate (
  clientnum varchar(16) NOT NULL,
  associateid int4 NOT NULL,
  associatenum varchar(10),
  firstname varchar(50),
  middleinit varchar(5),
  lastname varchar(50),
  ssn varchar(18),
  dob timestamp,
  address varchar(100),
  city varchar(50),
  state varchar(50),
  country varchar(50),
  zip varchar(10),
  homephone varchar(14),
  cellphone varchar(14),
  pager varchar(14),
  associateaccount varchar(50),
  doh timestamp,
  dot timestamp,
  rehiredate timestamp,
  lastdayworked timestamp,
  staffexecid int4,
  jobtitleid int4,
  locationid int4,
  deptid int4,
  positionnum int4,
  worktypeid int4,
  sexid int4,
  maritalstatusid int4,
  ethnicityid int4,
  weight float8,
  heightfeet int4,
  heightinches int4,
  haircolorid int4,
  eyecolorid int4,
  isonalarmlist bool NOT NULL DEFAULT false,
  isactive bool NOT NULL DEFAULT true,
  ismanager bool NOT NULL DEFAULT false,
  issecurity bool NOT NULL DEFAULT false,
  createdbyid int4,
  isdeleted bool NOT NULL DEFAULT false,
  militarybranchid int4,
  militarystatusid int4,
  patrontypeid int4,
  identificationtypeid int4,
  workaddress varchar(200),
  testtypeid int4,
  testscore int4,
  pin int4,
  county varchar(50),
  CONSTRAINT pk_tblassociate PRIMARY KEY (clientnum, associateid),
  CONSTRAINT ix_tblassociate UNIQUE (clientnum, associatenum)
)

CREATE TABLE tbljobtitle (
  clientnum varchar(16) NOT NULL,
  id int4 NOT NULL,
  value varchar(50),
  code varchar(16),
  isdeleted bool DEFAULT false,
  presentationid int4 NOT NULL DEFAULT 1,
  CONSTRAINT pk_tbljobtitle PRIMARY KEY (clientnum, id, presentationid)
)

CREATE TABLE tbllocation (
  clientnum varchar(16) NOT NULL,
  locationid int4 NOT NULL,
  districtid int4 NOT NULL,
  regionid int4 NOT NULL,
  divisionid int4 NOT NULL,
  locationnum varchar(8),
  name varchar(50),
  clientlocnum varchar(50),
  address varchar(100),
  address2 varchar(100),
  city varchar(50),
  state varchar(2) NOT NULL DEFAULT 'zz'::character varying,
  zip varchar(10),
  countryid int4,
  phone varchar(15),
  fax varchar(15),
  payname varchar(40),
  contact char(36),
  active bool NOT NULL DEFAULT true,
  coiprogram text,
  coilimit text,
  coiuser varchar(255),
  coidatetime varchar(32),
  ec_note_field varchar(1050),
  locationtypeid int4,
  open_time timestamp,
  close_time timestamp,
  insurance_loc_id varchar(50),
  lpregionid int4,
  sic int4,
  CONSTRAINT pk_tbllocation PRIMARY KEY (clientnum, locationid),
  CONSTRAINT ix_tbllocation_1 UNIQUE (clientnum, locationnum, name),
  CONSTRAINT ix_tbllocation_unique_number UNIQUE (clientnum, divisionid, regionid, districtid, locationnum)
)

It strikes me as odd that the thing isn't considering hash joins for at least some of these steps. Can you force it to (by setting enable_mergejoin off)? If not, what are the datat
Re: [PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How
Ross, I agree with you, but I am the lowly integrator/analyst; I have to solve the problem without all the authority (sounds like a project manager). I originally started this thread since I had the $7k budget. I am not a dba/developer, but I play one on t.v., so I can only assume that throwing money at the application code means one understands what the bottleneck in the code is and what it takes to fix it. In this situation, the code is hidden by the vendor that connects to the database. So, besides persistent requests of the vendor to improve the area of the application, the balance of tuning lies with the hardware. The answer is *both* hardware and application code. Finding the right balance is key. Your mileage may vary. Steve Poe If, however, in the far-more-likely case that the application code or system/business process is the throttle point, it'd be a great use of money to have a test report showing that to the "higher ups". That's where the best scalability bang-for-buck can be made. - Ross p.s. having said this, and as has already been noted, "7K" ain't going to buy that much... maybe the ability to go RAID 10? p.p.s Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-) ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] How to improve db performance with $7K?
Tom Lane <[EMAIL PROTECTED]> writes: > Yes, you can probably assume that blocks with far-apart numbers are > going to require a big seek, and you might even be right in supposing > that a block with an intermediate number should be read on the way. > But you have no hope at all of making the right decisions at a more > local level --- say, reading various sectors within the same cylinder > in an optimal fashion. You don't know where the track boundaries are, > so you can't schedule in a way that minimizes rotational latency. > You're best off to throw all the requests at the drive together and > let the drive sort it out. Consider for example three reads, one at the beginning of the disk, one at the very end, and one in the middle. If the three are performed in the logical order (assuming the head starts at the beginning), then the drive has to seek, say, 4ms to get to the middle and 4ms to get to the end. But if the middle block requires a full rotation to reach it from when the head arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive). Whereas the drive could have seeked over to the last block, then seeked back in 8ms and gotten there just in time to perform the read for free. I'm not entirely convinced this explains all of the SCSI drives' superior performance though. The above is a worst-case scenario; it should really only have a small effect, and it's not like the drive firmware can really schedule things perfectly either. I think most of the difference is that the drive manufacturers just don't package their high end drives with ATA interfaces. So there are no 10k RPM ATA drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but most of the manufacturers aren't even doing that. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
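Greg's three-read example can be worked through numerically. The seek times are his assumed figures; a 7200 RPM platter takes 60,000/7200 ≈ 8.33 ms per rotation:

```python
rpm = 7200
rotation_ms = 60_000 / rpm   # ~8.33 ms for one full rotation

seek_third_ms = 4            # assumed seek across one third of the platter

# Logical order: seek to the middle (4 ms), wait nearly a full rotation for
# the middle sector to come around, then seek on to the end (4 ms).
logical_ms = seek_third_ms + rotation_ms + seek_third_ms

# Reordered: seek straight to the end (8 ms), read, then seek back to the
# middle (4 ms); after ~12 ms the middle sector has just rotated under the
# head, so its rotational delay is hidden behind the seeks.
reordered_ms = 2 * seek_third_ms + seek_third_ms

print(reordered_ms < logical_ms)  # reordering wins despite more head travel
```

This ignores the end block's own rotational delay (it is the same in both schedules), which is why only the middle block's hidden rotation shows up in the comparison.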
[PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?
Sorry to blend threads, but in my kinda longish, somewhat thankless, essentially anonymous, and quite average career as a dba, I have found that the 7K would be best spent on a definitive end-to-end "application critical path" test (pretty easy to instrument apps and lash on test harnesses these days). If it's "the disk subsystem", then by all means, spend the 7K there. If the "7K$" is for "hardware only", then disk is always a good choice. For a really small shop, maybe it's an upgrade to a dual-CPU Opteron MOBO; dunno. If, however, in the far-more-likely case that the application code or system/business process is the throttle point, it'd be a great use of money to have a test report showing that to the "higher ups". That's where the best scalability bang-for-buck can be made. - Ross p.s. having said this, and as has already been noted, "7K" ain't going to buy that much... maybe the ability to go RAID 10? p.p.s Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Friday, April 15, 2005 11:20 AM To: Alex Turner Cc: Dave Held; pgsql-performance@postgresql.org; [EMAIL PROTECTED] Subject: Re: [PERFORM] Intel SRCS16 SATA raid? This is a different thread that the $7k server thread. Greg Stark started it and wrote: "I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives." Rick [EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM: > The original thread was how much can I get for $7k > > You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are on a budget! > > 10k RPM SATA drives give acceptable performance at a good price, that's > really the point here. > > I have never really argued that SATA is going to match SCSI > performance on multidrive arrays for IO/sec. But it's all about the > benjamins baby. 
If I told my boss we need $25k for a database > machine, he'd tell me that was impossible, and I have $5k to do it. If > I tell him $7k - he will swallow that. We don't _need_ the amazing > performance of a 15k RPM drive config. Our biggest hit is reads, so > we can buy 3xSATA machines and load balance. It's all about the > application, and buying what is appropriate. I don't buy a Corvette > if all I need is a malibu. > > Alex Turner > netEconomist > > On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote: > > > -Original Message- > > > From: Alex Turner [mailto:[EMAIL PROTECTED] > > > Sent: Thursday, April 14, 2005 6:15 PM > > > To: Dave Held > > > Cc: pgsql-performance@postgresql.org > > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > > > > Looking at the numbers, the raptor with TCQ enabled was close or > > > beat the Atlas III 10k drive on most benchmarks. > > > > And I would be willing to bet that the Atlas 10k is not using the > > same generation of technology as the Raptors. > > > > > Naturaly a 15k drive is going to be faster in many areas, but it > > > is also much more expensive. It was only 44% better on the server > > > tests than the raptor with TCQ, but it costs nearly 300% more > > > ($538 cdw.com, $180 newegg.com). > > > > State that in terms of cars. Would you be willing to pay 300% more > > for a car that is 44% faster than your competitor's? Of course you > > would, because we all recognize that the cost of speed/performance > > does not scale linearly. Naturally, you buy the best speed that you > > can afford, but when it comes to hard drives, the only major feature > > whose price tends to scale anywhere close to linearly is capacity. > > > > > Note also that the 15k drive was the only drive that kept up with > > > the raptor on raw transfer speed, which is going to matter for > > > WAL. > > > > So get a Raptor for your WAL partition. ;) > > > > > [...] 
> > > The Raptor drives can be had for as little as $180/ea, which is > > > quite a good price point considering they can keep up with their > > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > > > (Note that 3ware controllers _don't_ support NCQ, although they > > > claim their HBA based queueing is 95% as good as NCQ on the > > > drive). > > > > Just keep in mind the points made by the Seagate article. You're > > buying much more than just performance for that $500+. You're also > > buying vibrational tolerance, high MTBF, better internal > > environmental controls, and a pretty significant margin on seek > > time, which is probably your most important feature for disks > > storing tables. An interesting test would be to stick several drives > > in a cabinet and graph how performance is affected at the different > > price points/technologies/number of drives.
Re: [PERFORM] Intel SRCS16 SATA raid? (somewhat OT)
Alex Turner wrote: No offense to that review, but it really wasn't that good, and drew bad conclusions from the data. I posted it originally and immediately regretted it. See http://www.tweakers.net/reviews/557/18 Amazingly the controller with 1Gig cache manages a write throughput of 750MB/sec on a single drive. quote: "Floating high above the crowd, the ARC-1120 has a perfect view on the struggles of the other adapters. " It's because the adapter has 1Gig of RAM, nothing to do with the RAID architecture, it's clearly caching the entire dataset. The drive can't physically run that fast. These guys really don't know what they are doing. Perhaps you didn't read the whole page. It says right at the beginning: "Because of its simplicity and short test duration, the ATTO Disk Benchmark is used a lot for comparing the 'performance' of hard disks. The tool measures the sequential transfer rate of a partition using a test length of 32MB at most. Because of this small dataset, ATTO is unsuitable for measuring media transfer rates of intelligent RAID-adapters which are equipped with cache memory. The smart RAID adapters will serve the requested data directly from their cache, as a result of which the results have no relationship to the media transfer rates of these cards. For this reason ATTO is an ideal tool to test the cache transfer rates of intelligent RAID-adapters." Therefore, the results on this page are valid - they're supposed to show the cache/transfer speed; the dataset is 32MB(!) and should fit in the caches of all cards. See also: http://www20.tomshardware.com/storage/20041227/areca-raid6-06.html I trust Tom's Hardware a little more to set up a good review, to be honest. I don't, for many (historical) reasons. The 3ware trounces the Areca in all IO/sec tests. Maybe, but with no mention of stripe size and other configuration details, this is somewhat suspicious. I'll be able to offer benchmarks for the 8506-8 vs. the 1120 shortly (1-2 weeks), if you're interested (pg_bench, for example, to be a bit more on-topic). Regards, Marinos
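The review's own caveat can be checked with a little arithmetic (a sketch; the 60 MB/s media rate is an assumed ballpark for a single SATA drive of that era, not a figure from the article):

```python
# The ATTO working set (at most 32 MB) is a small fraction of the
# ARC-1120's 1 GB of onboard RAM, so the 750 MB/s "single drive" number
# is a measurement of the host<->controller cache path, not the platters.

CACHE_MB = 1024
ATTO_DATASET_MB = 32
ASSUMED_MEDIA_RATE_MBPS = 60    # assumed sustained platter rate, for contrast
REPORTED_MBPS = 750             # figure quoted in the review

fits_in_cache = ATTO_DATASET_MB <= CACHE_MB        # whole dataset is cacheable
cache_fraction = ATTO_DATASET_MB / CACHE_MB        # ~3% of the cache
exceeds_platters = REPORTED_MBPS > ASSUMED_MEDIA_RATE_MBPS

print(fits_in_cache, cache_fraction, exceeds_platters)
```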
Re: [PERFORM] Intel SRCS16 SATA raid?
I stand corrected! Maybe I should re-evaluate our own config! Alex T (The dell PERC controllers do pretty much suck on linux) On 4/15/05, Vivek Khera <[EMAIL PROTECTED]> wrote: > > On Apr 15, 2005, at 11:01 AM, Alex Turner wrote: > > > You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are on a > > budget! > > > > I just bought a pair of Dual Opteron, 4GB RAM, LSI 320-2X RAID dual > channel with 8 36GB 15kRPM seagate drives. Each one of these boxes set > me back just over $7k, including onsite warrantee. > > They totally blow away the Dell Dual XEON with external 14 disk RAID > (also 15kRPM drives, manufacturer unknown) which also has 4GB RAM and a > Dell PERC 3/DC controller, the whole of which set me back over $15k. > > Vivek Khera, Ph.D. > +1-301-869-4449 x806 > > > ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] How to improve db performance with $7K?
On Apr 15, 2005, at 11:58 AM, Joshua D. Drake wrote: ALL disks have bad blocks, even when you receive them. you honestly think that these large disks made today (18+ GB is the smallest now) that there are no defects on the surfaces? That is correct. It is just that the HD makers will mark the bad blocks so that the OS knows not to use them. You can also run the badblocks command to try and find new bad blocks. My point was that you cannot assume a linear correlation between block number and physical location, since the bad blocks will be mapped all over the place. Vivek Khera, Ph.D. +1-301-869-4449 x806
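Vivek's point about remapping can be illustrated with a toy model (purely hypothetical sector numbers; real drives keep spare sectors per zone and the firmware hides the mapping entirely from the host):

```python
# A drive remaps bad sectors to a spare area, so sorting requests by
# logical block number no longer guarantees a monotonic sweep across the
# platter. Here two LBAs have been remapped into a spare region near the
# end of the (tiny, imaginary) disk.

remap = {120: 960, 570: 975}   # LBA -> physical spare sector

def physical(lba):
    return remap.get(lba, lba)

requests = sorted([100, 120, 140, 560, 570, 580])   # elevator-sorted by LBA
path = [physical(lba) for lba in requests]          # actual head movement

# The physical path jumps out to the spare area and back, so the
# "sorted" schedule is no longer a one-direction sweep.
monotonic = all(a <= b for a, b in zip(path, path[1:]))
print(path, monotonic)
```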
Re: [PERFORM] How to improve db performance with $7K?
Vivek Khera wrote: On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote: Now, bad block remapping destroys that guarantee, but unless you've got a LOT of bad blocks, it shouldn't destroy your performance, right? ALL disks have bad blocks, even when you receive them. you honestly think that these large disks made today (18+ GB is the smallest now) that there are no defects on the surfaces? That is correct. It is just that the HD makers will mark the bad blocks so that the OS knows not to use them. You can also run the bad blocks command to try and find new bad blocks. Over time hard drives get bad blocks. It doesn't always mean you have to replace the drive but it does mean you need to maintain it and usually at least backup, low level (if scsi) and mark bad blocks. Then restore. Sincerely, Joshua D. Drake /me remembers trying to cram an old donated 5MB (yes M) disk into an old 8088 Zenith PC in college... Vivek Khera, Ph.D. +1-301-869-4449 x806 -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Intel SRCS16 SATA raid?
On Apr 15, 2005, at 11:01 AM, Alex Turner wrote: You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are on a budget! I just bought a pair of Dual Opteron, 4GB RAM, LSI 320-2X RAID dual channel with 8 36GB 15kRPM Seagate drives. Each one of these boxes set me back just over $7k, including onsite warranty. They totally blow away the Dell Dual XEON with external 14 disk RAID (also 15kRPM drives, manufacturer unknown) which also has 4GB RAM and a Dell PERC 3/DC controller, the whole of which set me back over $15k. Vivek Khera, Ph.D. +1-301-869-4449 x806
Re: [PERFORM] How to improve db performance with $7K?
On Apr 14, 2005, at 10:03 PM, Kevin Brown wrote: Now, bad block remapping destroys that guarantee, but unless you've got a LOT of bad blocks, it shouldn't destroy your performance, right? ALL disks have bad blocks, even when you receive them. Do you honestly think that on these large disks made today (18+ GB is the smallest now) there are no defects on the surfaces? /me remembers trying to cram an old donated 5MB (yes M) disk into an old 8088 Zenith PC in college... Vivek Khera, Ph.D. +1-301-869-4449 x806
Re: [PERFORM] Intel SRCS16 SATA raid?
This is a different thread than the $7k server thread. Greg Stark started it and wrote: "I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives." Rick [EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM: > The original thread was how much can I get for $7k > > You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are on a budget! > > 10k RPM SATA drives give acceptable performance at a good price, that's > really the point here. > > I have never really argued that SATA is going to match SCSI > performance on multidrive arrays for IO/sec. But it's all about the > benjamins, baby. If I told my boss we need $25k for a database > machine, he'd tell me that was impossible, and I have $5k to do it. > If I tell him $7k - he will swallow that. We don't _need_ the amazing > performance of a 15k RPM drive config. Our biggest hit is reads, so > we can buy 3xSATA machines and load balance. It's all about the > application, and buying what is appropriate. I don't buy a Corvette > if all I need is a Malibu. > > Alex Turner > netEconomist > > On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote: > > > -Original Message- > > > From: Alex Turner [mailto:[EMAIL PROTECTED] > > > Sent: Thursday, April 14, 2005 6:15 PM > > > To: Dave Held > > > Cc: pgsql-performance@postgresql.org > > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > > > > Looking at the numbers, the raptor with TCQ enabled was close or > > > beat the Atlas III 10k drive on most benchmarks. > > > > And I would be willing to bet that the Atlas 10k is not using the > > same generation of technology as the Raptors. > > > > > Naturally a 15k drive is going to be faster in many areas, but it > > > is also much more expensive. It was only 44% better on the server > > > tests than the raptor with TCQ, but it costs nearly 300% more ($538 > > > cdw.com, $180 newegg.com). > > > > State that in terms of cars.
Would you be willing to pay 300% more > > for a car that is 44% faster than your competitor's? Of course you > > would, because we all recognize that the cost of speed/performance > > does not scale linearly. Naturally, you buy the best speed that you > > can afford, but when it comes to hard drives, the only major feature > > whose price tends to scale anywhere close to linearly is capacity. > > > > > Note also that the 15k drive was the only drive that kept up with > > > the raptor on raw transfer speed, which is going to matter for WAL. > > > > So get a Raptor for your WAL partition. ;) > > > > > [...] > > > The Raptor drives can be had for as little as $180/ea, which is > > > quite a good price point considering they can keep up with their > > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > > > (Note that 3ware controllers _don't_ support NCQ, although they > > > claim their HBA based queueing is 95% as good as NCQ on the drive). > > > > Just keep in mind the points made by the Seagate article. You're > > buying much more than just performance for that $500+. You're also > > buying vibrational tolerance, high MTBF, better internal > > environmental controls, and a pretty significant margin on seek time, > > which is probably your most important feature for disks storing tables. > > An interesting test would be to stick several drives in a cabinet and > > graph how performance is affected at the different price points/ > > technologies/number of drives. > > > > __ > > David B. Held > > Software Engineer/Array Services Group > > 200 14th Ave. 
East, Sartell, MN 56377 > > 320.534.3637 320.253.7800 800.752.8129 > > > > ---(end of broadcast)--- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to [EMAIL PROTECTED] so that your > > message can get through to the mailing list cleanly > > > > ---(end of broadcast)--- > TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Intel SRCS16 SATA raid?
> -Original Message- > From: Alex Turner [mailto:[EMAIL PROTECTED] > Sent: Friday, April 15, 2005 9:44 AM > To: Marinos Yannikos > Cc: Joshua D. Drake; Mohan, Ross; pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > No offense to that review, but it was really wasn't that good, > and drew bad conclusions from the data. I posted it originaly > and immediately regretted it. I didn't read the whole thing, but it didn't seem that bad to me. > See http://www.tweakers.net/reviews/557/18 > > Amazingly the controller with 1Gig cache manages a write throughput > of 750MB/sec on a single drive. > > quote: > "Floating high above the crowd, the ARC-1120 has a perfect view on > the struggles of the other adapters. " > > It's because the adapter has 1Gig of RAM, nothing to do with the RAID > architecture, it's clearly caching the entire dataset. The drive > can't physicaly run that fast. And that's pretty much exactly what the article says. Even before the part you quoted. Not sure what the problem is there. > These guys really don't know what they are doing. They weren't pretending that the drive array was serving up data at that rate directly from the physical media. They clearly indicated that they were testing controller cache speed with the small test. > Curiously: > http://www.tweakers.net/reviews/557/25 > > The 3ware does very well as a data drive for MySQL. > [...] If you take a close look, they pretty much outright say that the Areca controller does very poorly on the random accesses typical of DB work. They also specifically mention that the 3ware still dominates the competition in this area. Dave __ David B. Held Software Engineer/Array Services Group 200 14th Ave. East, Sartell, MN 56377 320.534.3637 320.253.7800 800.752.8129 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Intel SRCS16 SATA raid?
The original thread was how much can I get for $7k You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are on a budget! 10k RPM SATA drives give acceptable performance at a good price, that's really the point here. I have never really argued that SATA is going to match SCSI performance on multidrive arrays for IO/sec. But it's all about the benjamins, baby. If I told my boss we need $25k for a database machine, he'd tell me that was impossible, and I have $5k to do it. If I tell him $7k - he will swallow that. We don't _need_ the amazing performance of a 15k RPM drive config. Our biggest hit is reads, so we can buy 3xSATA machines and load balance. It's all about the application, and buying what is appropriate. I don't buy a Corvette if all I need is a Malibu. Alex Turner netEconomist On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote: > > -Original Message- > > From: Alex Turner [mailto:[EMAIL PROTECTED] > > Sent: Thursday, April 14, 2005 6:15 PM > > To: Dave Held > > Cc: pgsql-performance@postgresql.org > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > > Looking at the numbers, the raptor with TCQ enabled was close or > > beat the Atlas III 10k drive on most benchmarks. > > And I would be willing to bet that the Atlas 10k is not using the > same generation of technology as the Raptors. > > > Naturally a 15k drive is going to be faster in many areas, but it > > is also much more expensive. It was only 44% better on the server > > tests than the raptor with TCQ, but it costs nearly 300% more ($538 > > cdw.com, $180 newegg.com). > > State that in terms of cars. Would you be willing to pay 300% more > for a car that is 44% faster than your competitor's? Of course you > would, because we all recognize that the cost of speed/performance > does not scale linearly. Naturally, you buy the best speed that you > can afford, but when it comes to hard drives, the only major feature > whose price tends to scale anywhere close to linearly is capacity.
> > > Note also that the 15k drive was the only drive that kept up with > > the raptor on raw transfer speed, which is going to matter for WAL. > > So get a Raptor for your WAL partition. ;) > > > [...] > > The Raptor drives can be had for as little as $180/ea, which is > > quite a good price point considering they can keep up with their > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > > (Note that 3ware controllers _don't_ support NCQ, although they > > claim their HBA based queueing is 95% as good as NCQ on the drive). > > Just keep in mind the points made by the Seagate article. You're > buying much more than just performance for that $500+. You're also > buying vibrational tolerance, high MTBF, better internal > environmental controls, and a pretty significant margin on seek time, > which is probably your most important feature for disks storing tables. > An interesting test would be to stick several drives in a cabinet and > graph how performance is affected at the different price points/ > technologies/number of drives. > > __ > David B. Held > Software Engineer/Array Services Group > 200 14th Ave. East, Sartell, MN 56377 > 320.534.3637 320.253.7800 800.752.8129 > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly > ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Intel SRCS16 SATA raid?
No offense to that review, but it really wasn't that good, and drew bad conclusions from the data. I posted it originally and immediately regretted it. See http://www.tweakers.net/reviews/557/18 Amazingly the controller with 1Gig cache manages a write throughput of 750MB/sec on a single drive. quote: "Floating high above the crowd, the ARC-1120 has a perfect view on the struggles of the other adapters. " It's because the adapter has 1Gig of RAM, nothing to do with the RAID architecture, it's clearly caching the entire dataset. The drive can't physically run that fast. These guys really don't know what they are doing. Curiously: http://www.tweakers.net/reviews/557/25 The 3ware does very well as a data drive for MySQL. The size of your cache is going to _directly_ affect RAID 5 performance. Put a gig of memory in a 3ware 9500S and benchmark it against the Areca then. Also - folks don't run data partitions on RAID 5 because the write speed is too low. When you look at the results for RAID 10, the 3ware leads the pack. See also: http://www20.tomshardware.com/storage/20041227/areca-raid6-06.html I trust Tom's Hardware a little more to set up a good review, to be honest. The 3ware trounces the Areca in all IO/sec tests. Alex Turner netEconomist On 4/15/05, Marinos Yannikos <[EMAIL PROTECTED]> wrote: > Joshua D. Drake wrote: > > Well I have never even heard of it. 3ware is the defacto authority of > > reasonable SATA RAID. > > no! 3ware was rather early in this business, but there are plenty of > (IMHO, and some other people's opinion) better alternatives available. > 3ware has good Linux drivers, but the performance of their current > controllers isn't that good.
> > Have a look at this: http://www.tweakers.net/reviews/557/1 > > especially the sequential writes with RAID-5 on this page: > > http://www.tweakers.net/reviews/557/19 > > We have been a long-time user of a 3ware 8506 controller (8 disks, > RAID-5) and have purchased 2 Areca ARC-1120 now since we weren't > satisfied with the performance and the 2TB per array limit... > > -mjy > > ---(end of broadcast)--- > TIP 4: Don't 'kill -9' the postmaster > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
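The RAID-5 write complaint above comes down to the read-modify-write penalty on cache-miss random writes, which a bigger controller cache can only partly hide. A rough sketch (the per-drive IOPS figure is an assumed ballpark, not a measurement from the thread):

```python
# A small random write that misses the controller cache on RAID 5 costs a
# read-modify-write cycle: read old data + old parity, write new data +
# new parity = 4 disk I/Os. On RAID 10 the same write costs 2 (one per
# mirror side).

DRIVE_IOPS = 100      # assumed random IOPS for a single 7200 RPM SATA drive
N_DRIVES = 8

raid5_write_iops = DRIVE_IOPS * N_DRIVES / 4    # read-modify-write penalty
raid10_write_iops = DRIVE_IOPS * N_DRIVES / 2   # every write hits both mirrors

print(raid5_write_iops, raid10_write_iops)   # 200.0 400.0
```

Under these assumptions the same eight spindles deliver half the random-write throughput on RAID 5, which is why a write-back cache matters so much more there.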
Re: [PERFORM] Intel SRCS16 SATA raid?
Dave wrote "An interesting test would be to stick several drives in a cabinet and graph how performance is affected at the different price points/ technologies/number of drives." From the discussion on the $7k server thread, it seems the RAID controller would be an important data point also. And RAID level. And application load/kind. Hmmm. I just talked myself out of it. Seems like I'd end up with something akin to those database benchmarks we all love to hate. Rick [EMAIL PROTECTED] wrote on 04/15/2005 08:40:13 AM: > > -Original Message- > > From: Alex Turner [mailto:[EMAIL PROTECTED] > > Sent: Thursday, April 14, 2005 6:15 PM > > To: Dave Held > > Cc: pgsql-performance@postgresql.org > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > > Looking at the numbers, the raptor with TCQ enabled was close or > > beat the Atlas III 10k drive on most benchmarks. > > And I would be willing to bet that the Atlas 10k is not using the > same generation of technology as the Raptors. > > > Naturally a 15k drive is going to be faster in many areas, but it > > is also much more expensive. It was only 44% better on the server > > tests than the raptor with TCQ, but it costs nearly 300% more ($538 > > cdw.com, $180 newegg.com). > > State that in terms of cars. Would you be willing to pay 300% more > for a car that is 44% faster than your competitor's? Of course you > would, because we all recognize that the cost of speed/performance > does not scale linearly. Naturally, you buy the best speed that you > can afford, but when it comes to hard drives, the only major feature > whose price tends to scale anywhere close to linearly is capacity. > > > Note also that the 15k drive was the only drive that kept up with > > the raptor on raw transfer speed, which is going to matter for WAL. > > So get a Raptor for your WAL partition. ;) > > > [...]
> > The Raptor drives can be had for as little as $180/ea, which is > > quite a good price point considering they can keep up with their > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > > (Note that 3ware controllers _don't_ support NCQ, although they > > claim their HBA based queueing is 95% as good as NCQ on the drive). > > Just keep in mind the points made by the Seagate article. You're > buying much more than just performance for that $500+. You're also > buying vibrational tolerance, high MTBF, better internal > environmental controls, and a pretty significant margin on seek time, > which is probably your most important feature for disks storing tables. > An interesting test would be to stick several drives in a cabinet and > graph how performance is affected at the different price points/ > technologies/number of drives. > > __ > David B. Held > Software Engineer/Array Services Group > 200 14th Ave. East, Sartell, MN 56377 > 320.534.3637 320.253.7800 800.752.8129 > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] plperl vs plpgsql
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] (Alex) belched out: > i am thinking about swiching to plperl as it seems to me much more > flexible and easier to create functions. > > what is the recommended PL for postgres? or which one is most widely > used / most popular? > is there a performance difference between plpgsql and plperl ? If what you're trying to do is "munge text," pl/perl will be a whole lot more suitable than pl/pgsql because it has a rich set of text mungeing tools and string functions which pl/pgsql lacks. If you intend to do a lot of work involving reading unmunged tuples from this table and that, pl/pgsql provides a much more natural syntax, and will probably be a bit faster as the query processor may even be able to expand some of the actions, rather than needing to treat Perl code as an "opaque blob." I would definitely be inclined to use the more natural language for the given task... -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org'). http://linuxdatabases.info/info/internet.html "If you want to talk with some experts about something, go to the bar where they hang out, buy a round of beers, and they'll surely talk your ear off, leaving you wiser than before. If you, a stranger, show up at the bar, walk up to the table, and ask them to fax you a position paper, they'll tell you to call their office in the morning and ask for a rate sheet." -- Miguel Cruz ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] speed of querry?
"Joel Fradkin" <[EMAIL PROTECTED]> writes: > "Merge Join (cost=49697.60..50744.71 rows=14987 width=113) (actual > time=11301.160..12171.072 rows=160593 loops=1)" > " Merge Cond: ("outer".locationid = "inner".locationid)" > " -> Sort (cost=788.81..789.89 rows=432 width=49) (actual > time=3.318..3.603 rows=441 loops=1)" > "Sort Key: l.locationid" > "-> Index Scan using ix_location on tbllocation l > (cost=0.00..769.90 rows=432 width=49) (actual time=0.145..2.283 rows=441 > loops=1)" > " Index Cond: ('SAKS'::text = (clientnum)::text)" > " -> Sort (cost=48908.79..49352.17 rows=177352 width=75) (actual > time=11297.774..11463.780 rows=160594 loops=1)" > "Sort Key: a.locationid" > "-> Merge Right Join (cost=26247.95..28942.93 rows=177352 > width=75) (actual time=8357.010..9335.362 rows=177041 loops=1)" > " Merge Cond: ((("outer".clientnum)::text = > "inner"."?column10?") AND ("outer".id = "inner".jobtitleid))" > " -> Index Scan using ix_tbljobtitle_id on tbljobtitle jt > (cost=0.00..243.76 rows=6604 width=37) (actual time=0.122..12.049 rows=5690 > loops=1)" > "Filter: (1 = presentationid)" > " -> Sort (cost=26247.95..26691.33 rows=177352 width=53) > (actual time=8342.271..8554.943 rows=177041 loops=1)" > "Sort Key: (a.clientnum)::text, a.jobtitleid" > "-> Index Scan using ix_associate_clientnum on > tblassociate a (cost=0.00..10786.17 rows=177352 width=53) (actual > time=0.166..1126.052 rows=177041 loops=1)" > " Index Cond: ((clientnum)::text = 'SAKS'::text)" > "Total runtime: 12287.502 ms" It strikes me as odd that the thing isn't considering hash joins for at least some of these steps. Can you force it to (by setting enable_mergejoin off)? If not, what are the datatypes of the join columns exactly? regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Foreign key slows down copy/insert
Tom Lane wrote: You didn't show us any \timing. The 94.109 ms figure is all server-side. Whoop, my mistake. I had been retesting without the explain, just the query. I re-run the explain analyze a few times, and it only reports 90ms the first time. After that it reports 2ms even over the network (the \timing on those are about 50ms which includes the network latency). Thanks, -- Richard van den Berg, CISSP --- Trust Factory B.V. | www.dna-portal.net Bazarstraat 44a| www.trust-factory.com 2518AK The Hague | Phone: +31 70 3620684 The Netherlands| Fax : +31 70 3603009 --- ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Foreign key slows down copy/insert
Richard van den Berg <[EMAIL PROTECTED]> writes: > Christopher Kings-Lynne wrote: >> No explain analyze is done on the server... > Yes, but the psql \timing is calculated on the client, right? That is > the value that PFC was refering to. You didn't show us any \timing. The 94.109 ms figure is all server-side. As an example: regression=# \timing Timing is on. regression=# explain analyze select * from tenk1; QUERY PLAN - Seq Scan on tenk1 (cost=0.00..458.00 rows=1 width=244) (actual time=0.050..149.615 rows=1 loops=1) Total runtime: 188.518 ms (2 rows) Time: 210.885 ms regression=# Here, 188.5 is at the server, 210.8 is at the client. The difference is not all network delay, either --- parse/plan overhead is in there too. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
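Using Tom's numbers, the client/server split works out as follows (a trivial sketch; the residual lumps together parse/plan time and network delay, which this method cannot separate):

```python
# psql's \timing is measured at the client; EXPLAIN ANALYZE's
# "Total runtime" is measured at the server. The gap between them holds
# parse/plan work plus network round trips.

server_runtime_ms = 188.518   # EXPLAIN ANALYZE total runtime (server-side)
client_time_ms = 210.885      # psql \timing (client-side)

overhead_ms = client_time_ms - server_runtime_ms
print(f"parse/plan + network overhead: {overhead_ms:.3f} ms")
```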
Re: [PERFORM] Intel SRCS16 SATA raid?
> -Original Message- > From: Alex Turner [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 14, 2005 6:15 PM > To: Dave Held > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > Looking at the numbers, the raptor with TCQ enabled was close or > beat the Atlas III 10k drive on most benchmarks. And I would be willing to bet that the Atlas 10k is not using the same generation of technology as the Raptors. > Naturaly a 15k drive is going to be faster in many areas, but it > is also much more expensive. It was only 44% better on the server > tests than the raptor with TCQ, but it costs nearly 300% more ($538 > cdw.com, $180 newegg.com). State that in terms of cars. Would you be willing to pay 300% more for a car that is 44% faster than your competitor's? Of course you would, because we all recognize that the cost of speed/performance does not scale linearly. Naturally, you buy the best speed that you can afford, but when it comes to hard drives, the only major feature whose price tends to scale anywhere close to linearly is capacity. > Note also that the 15k drive was the only drive that kept up with > the raptor on raw transfer speed, which is going to matter for WAL. So get a Raptor for your WAL partition. ;) > [...] > The Raptor drives can be had for as little as $180/ea, which is > quite a good price point considering they can keep up with their > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > (Note that 3ware controllers _don't_ support NCQ, although they > claim their HBA based queueing is 95% as good as NCQ on the drive). Just keep in mind the points made by the Seagate article. You're buying much more than just performance for that $500+. You're also buying vibrational tolerance, high MTBF, better internal environmental controls, and a pretty significant margin on seek time, which is probably your most important feature for disks storing tables. 
An interesting test would be to stick several drives in a cabinet and graph how performance is affected at the different price points/ technologies/number of drives. __ David B. Held Software Engineer/Array Services Group 200 14th Ave. East, Sartell, MN 56377 320.534.3637 320.253.7800 800.752.8129 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
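Dave's nonlinearity point can be restated as dollars per unit of benchmark score (a sketch using the prices quoted in the thread, with the Raptor's server-test score normalized to 100 for illustration):

```python
# The thread's example: the 15k SCSI drive scored ~44% higher on the
# server tests but cost roughly 3x as much ($538 vs $180 in the quoted
# prices), so its cost per unit of performance is about twice as high.

raptor_price, raptor_score = 180.0, 100.0       # Raptor score normalized to 100
scsi15k_price, scsi15k_score = 538.0, 144.0     # 44% faster, ~300% of the price

raptor_cost_per_perf = raptor_price / raptor_score    # $ per score point
scsi_cost_per_perf = scsi15k_price / scsi15k_score

print(round(raptor_cost_per_perf, 2), round(scsi_cost_per_perf, 2))
```

Which is exactly Dave's argument: you pay a steep premium at the top end, and capacity is about the only hard-drive attribute whose price scales close to linearly.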
Re: [PERFORM] speed of querry?
It is still slower on the Linux box. (Included is the explain with SET enable_seqscan = off; explain analyze select * from viwassoclist where clientnum ='SAKS'.) See below. I did a few other tests (changing drive arrays helped by about 1 second; it was slower on my RAID 10 on the PowerVault). Pulling just raw data is much faster on the Linux box.

Linux box:
Seq Scan on tblresponse_line  (cost=1.00..100089717.78 rows=4032078 width=67) (actual time=0.028..4600.431 rows=4032078 loops=1)
Total runtime: 6809.399 ms

Windows box:
Seq Scan on tblresponse_line  (cost=0.00..93203.68 rows=4031968 width=67) (actual time=16.000..11316.000 rows=4031968 loops=1)
Total runtime: 16672.000 ms

I am going to reload the databases, just to see what I get. I am thinking I may have to flatten the files for Postgres (eliminate joins of any kind for reporting, etc.). It might make a good deal more data, but I think from the app's point of view it is a good idea anyway; I'm just not sure how to handle editing.

Joel Fradkin

Merge Join  (cost=49697.60..50744.71 rows=14987 width=113) (actual time=11301.160..12171.072 rows=160593 loops=1)
  Merge Cond: ("outer".locationid = "inner".locationid)
  ->  Sort  (cost=788.81..789.89 rows=432 width=49) (actual time=3.318..3.603 rows=441 loops=1)
        Sort Key: l.locationid
        ->  Index Scan using ix_location on tbllocation l  (cost=0.00..769.90 rows=432 width=49) (actual time=0.145..2.283 rows=441 loops=1)
              Index Cond: ('SAKS'::text = (clientnum)::text)
  ->  Sort  (cost=48908.79..49352.17 rows=177352 width=75) (actual time=11297.774..11463.780 rows=160594 loops=1)
        Sort Key: a.locationid
        ->  Merge Right Join  (cost=26247.95..28942.93 rows=177352 width=75) (actual time=8357.010..9335.362 rows=177041 loops=1)
              Merge Cond: ((("outer".clientnum)::text = "inner"."?column10?") AND ("outer".id = "inner".jobtitleid))
              ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt  (cost=0.00..243.76 rows=6604 width=37) (actual time=0.122..12.049 rows=5690 loops=1)
                    Filter: (1 = presentationid)
              ->  Sort  (cost=26247.95..26691.33 rows=177352 width=53) (actual time=8342.271..8554.943 rows=177041 loops=1)
                    Sort Key: (a.clientnum)::text, a.jobtitleid
                    ->  Index Scan using ix_associate_clientnum on tblassociate a  (cost=0.00..10786.17 rows=177352 width=53) (actual time=0.166..1126.052 rows=177041 loops=1)
                          Index Cond: ((clientnum)::text = 'SAKS'::text)
Total runtime: 12287.502 ms

This is above and beyond toying with the column statistics. You are basically telling the planner to use an index. Try this, and post the EXPLAIN ANALYZE for the seqscan = off case on the slow box if it doesn't speed things up for you.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [PERFORM] How to improve db performance with $7K?
PFC wrote:
> > My argument is that a sufficiently smart kernel scheduler *should* yield performance results that are reasonably close to what you can get with that feature. Perhaps not quite as good, but reasonably close. It shouldn't be an orders-of-magnitude type difference.
>
> And a controller card (or drive) has a lot less RAM to use as a cache / queue for reordering stuff than the OS has; potentially the OS can use most of the available RAM, which can be gigabytes on a big server, whereas in the drive there are at most a few tens of megabytes...
>
> However, all this is a bit like looking at the problem through the wrong end. The OS should provide a multi-read call for the applications to pass a list of blocks they'll need, then reorder them and read them the fastest possible way, clustering them with similar requests from other threads.
>
> Right now, when a thread/process issues a read() it will block until the block is delivered to this thread. The OS does not know if this thread will then need the next block (which can be had very cheaply if you know ahead of time you'll need it) or not. Thus it must make guesses, read ahead (sometimes), etc...

All true. Which is why high performance computing folks use aio_read()/aio_write() and load up the kernel with all the requests they expect to make.

The kernels that I'm familiar with will do read-ahead on files based on some heuristics: when you read the first byte of a file the OS will typically load up several pages of the file (depending on file size, etc.). If you continue doing read() calls without a seek() on the file descriptor, the kernel will get the hint that you're doing a sequential read and continue caching the pages ahead of time, usually reusing the pages you just read to hold the new data so that one isn't bloating out memory with data that won't be needed again. Throw in a seek() and the amount of read-ahead caching may be reduced.
One point that is being missed in all this discussion is that the file system also imposes some constraints on how I/Os can be done. For example, simply doing a write(fd, buf, 1) doesn't emit a stream of sequential blocks to the drives. Some file systems (UFS was one) would force portions of large files into other cylinder groups so that small files could be located near the inode data, thus avoiding/reducing the size of seeks. Similarly, extents need to be allocated, and the bitmaps recording this data usually need synchronous updates, which will require some seeks, etc. Not to mention the need to update inode data, etc. Anyway, my point is that the allocation policies of the file system can confuse the situation.

Also, the seek times one sees reported are an average. One really needs to look at the track-to-track seek time and also the "full stroke" seek time. It takes a *long* time to move the heads across the whole platter. I've seen people partition drives to use only small regions of the drives, to avoid long seeks and to better use the increased number of bits going under the head in one rotation.

A 15K drive doesn't necessarily have a faster seek time than a 10K drive just because the rotational speed is higher. The average seek time might be faster simply because the 15K drives are physically smaller, with fewer cylinders.

--
Alan
Re: [PERFORM] Foreign key slows down copy/insert
Christopher Kings-Lynne wrote:
> No, explain analyze is done on the server...

Yes, but the psql \timing is calculated on the client, right? That is the value that PFC was referring to.

--
Richard van den Berg, CISSP
---
Trust Factory B.V.  | www.dna-portal.net
Bazarstraat 44a     | www.trust-factory.com
2518AK The Hague    | Phone: +31 70 3620684
The Netherlands     | Fax:   +31 70 3603009
---
[PERFORM] plperl vs plpgsql
Hi,

I am thinking about switching to plperl, as it seems to me much more flexible and easier for creating functions. What is the recommended PL for Postgres? Or which one is most widely used / most popular? Is there a performance difference between plpgsql and plperl? Porting to other systems is not a real issue, as all my servers have Perl installed.

Thanks for any advice,
Alex
Re: [PERFORM] Foreign key slows down copy/insert
> Am I correct in assuming that the timings are calculated locally by psql on my client, thus including network latency?

No, explain analyze is done on the server...

Chris
Re: [PERFORM] Foreign key slows down copy/insert
PFC wrote:
> 94 ms for an index scan? This looks really slow...

That seems to be network latency. My psql client is connecting over ethernet to the database server. Retrying the command gives very different values, as low as 20 ms. That 94 ms was the highest I've seen. Running the same command locally (via a Unix socket) yields 2.5 ms every time.

Am I correct in assuming that the timings are calculated locally by psql on my client, thus including network latency?

--
Richard van den Berg, CISSP
Re: [PERFORM] Intel SRCS16 SATA raid?
Joshua D. Drake wrote:
> Well I have never even heard of it. 3ware is the de facto authority on reasonable SATA RAID.

No! 3ware was rather early in this business, but there are plenty of (IMHO, and in some other people's opinion) better alternatives available. 3ware has good Linux drivers, but the performance of their current controllers isn't that good.

Have a look at this: http://www.tweakers.net/reviews/557/1

Note especially the sequential writes with RAID-5 on this page: http://www.tweakers.net/reviews/557/19

We have been a long-time user of a 3ware 8506 controller (8 disks, RAID-5) and have now purchased two Areca ARC-1120s, since we weren't satisfied with the performance and the 2 TB per-array limit...

-mjy
Re: [PERFORM] Foreign key slows down copy/insert
> Index Scan using ix_B on B  (cost=0.04..3.06 rows=1 width=329) (actual time=93.824..93.826 rows=1 loops=1)
>   Index Cond: (id = $0)
>   InitPlan
>     ->  Limit  (cost=0.00..0.04 rows=1 width=4) (actual time=15.128..15.129 rows=1 loops=1)
>           ->  Seq Scan on A  (cost=0.00..47569.70 rows=1135570 width=4) (actual time=15.121..15.121 rows=1 loops=1)
> Total runtime: 94.109 ms

94 ms for an index scan? This looks really slow... Was the index in the RAM cache? Does it fit? Is it faster the second time? If it's still that slow, something somewhere is severely screwed. B has 150K rows you say, so everything about B should fit in RAM, and you should get 0.2 ms for an index scan, not 90 ms!

Try this:
- Locate the files on disk which are involved in table B (table + indexes) by looking at the system catalogs
- Look at the size of the files. Is the index severely bloated? REINDEX? DROP/recreate the index?
- Load them into the RAM cache (just cat the files through wc -c several times, until it's almost instantaneous)
- Retry your query and your COPY

I know it's stupid... but it's a lot faster to load an index into the cache by plainly reading the file rather than accessing it randomly. (Even though, with this number of rows, it should not be THAT slow!)
Re: [PERFORM] How to improve db performance with $7K?
> platter compared to the rotational speed, which would agree with the fact that you can read 70 MB/sec, but it takes up to 13 ms to seek.

Actually:
- The head has to be moved. This time depends on the distance; for instance, moving from one cylinder to the next is very fast (it needs to be, to get good throughput).
- Then you have to wait for the disk to spin until the information you want comes in front of the head... statistically you have to wait half a rotation. This does not depend on the distance between the cylinders; it depends on the position of the data within the cylinder.

The more RPMs you have, the less you wait, which is why faster-RPM drives have faster seeks (they must also have faster actuators to move the head)...
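The half-rotation argument is easy to make concrete (the drive speeds below are the usual 10K/15K examples, not figures from this thread). The average rotational latency is half the time of one revolution:

\[
t_{\text{rot}} = \frac{1}{2}\cdot\frac{60\,000\ \text{ms}}{\text{RPM}}
\]

so a 10,000 RPM drive waits on average \( \tfrac{1}{2}\cdot 6\ \text{ms} = 3\ \text{ms} \) and a 15,000 RPM drive \( \tfrac{1}{2}\cdot 4\ \text{ms} = 2\ \text{ms} \): going from 10K to 15K shaves only about 1 ms off the quoted ~13 ms worst-case access time, since head movement still dominates.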
Re: [PERFORM] How to improve db performance with $7K?
> My argument is that a sufficiently smart kernel scheduler *should* yield performance results that are reasonably close to what you can get with that feature. Perhaps not quite as good, but reasonably close. It shouldn't be an orders-of-magnitude type difference.

And a controller card (or drive) has a lot less RAM to use as a cache / queue for reordering stuff than the OS has; potentially the OS can use most of the available RAM, which can be gigabytes on a big server, whereas in the drive there are at most a few tens of megabytes...

However, all this is a bit like looking at the problem through the wrong end. The OS should provide a multi-read call for the applications to pass a list of blocks they'll need, then reorder them and read them the fastest possible way, clustering them with similar requests from other threads.

Right now, when a thread/process issues a read() it will block until the block is delivered to this thread. The OS does not know if this thread will then need the next block (which can be had very cheaply if you know ahead of time you'll need it) or not. Thus it must make guesses, read ahead (sometimes), etc...
Re: [PERFORM] Foreign key slows down copy/insert
PFC wrote:
> You're using 7.4.5. It's possible that you have a type mismatch in your foreign keys which prevents use of the index on B.

I read about this pothole and made damn sure the types match. (Actually, I kinda hoped that was the problem; it would have been an easy fix.)

> First of all, be really sure it's THAT foreign key, ie. do your COPY with only ONE foreign key at a time if you have several, and see which one is the killer.

I took exactly this route, and the first FK I tried already hit the jackpot. The real table had 4 FKs.

> EXPLAIN ANALYZE the following:
>
> SELECT * FROM B WHERE id = (SELECT id FROM A LIMIT 1);
>
> It should use the index. Does it?

It sure looks like it:

Index Scan using ix_B on B  (cost=0.04..3.06 rows=1 width=329) (actual time=93.824..93.826 rows=1 loops=1)
  Index Cond: (id = $0)
  InitPlan
    ->  Limit  (cost=0.00..0.04 rows=1 width=4) (actual time=15.128..15.129 rows=1 loops=1)
          ->  Seq Scan on A  (cost=0.00..47569.70 rows=1135570 width=4) (actual time=15.121..15.121 rows=1 loops=1)
Total runtime: 94.109 ms

The real problem seems to be what Chris and Stephen pointed out: even though the FK check is deferred, it is done on a per-row basis. With 1M rows, this just takes forever.

Thanks for the help.

--
Richard van den Berg, CISSP