Re: [PERFORM] How to improve db performance with $7K?
Kevin Brown <[EMAIL PROTECTED]> writes:
> In the case of pure random reads, you'll end up having to wait an
> average of half of a rotation before beginning the read.

You're assuming the conclusion. The above is true if the disk is handed one request at a time by a kernel that doesn't have any low-level timing information. If there are multiple random requests on the same track, the drive has an opportunity to do better than that --- if it's got all the requests in hand.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
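Tom's point can be put in numbers: with one outstanding request the head waits, on average, half a rotation; with k requests queued on the same track it only waits for the nearest of k uniformly placed sectors, which works out to about 1/(k+1) of a rotation. A small Monte Carlo sketch (not from the thread, just an illustration of the argument):

```python
import random

def expected_wait_fraction(k, trials=200_000):
    """Fraction of a rotation the head waits, on average, before the
    first of k uniformly placed target sectors arrives under it."""
    total = 0.0
    for _ in range(trials):
        total += min(random.random() for _ in range(k))
    return total / trials

# One request: ~1/2 a rotation.  k requests in hand: ~1/(k+1).
for k in (1, 2, 4, 8):
    print(k, round(expected_wait_fraction(k), 2))
```

With eight requests in hand the expected rotational wait drops from half a rotation to roughly a ninth of one, which is exactly the opportunity a drive holding the whole queue can exploit and a one-request-at-a-time kernel cannot.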
Re: [PERFORM] How to improve db performance with $7K?
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> The reason this is so much more of a win than it was when ATA was designed is that in modern drives the kernel has very little clue about the physical geometry of the disk. Variable-size tracks, bad-block sparing, and stuff like that make for a very hard-to-predict mapping from linear sector addresses to actual disk locations.
>
> > What I mean is that when it comes to scheduling disk activity, knowledge of the specific physical geometry of the disk isn't really important.
>
> Oh?
>
> Yes, you can probably assume that blocks with far-apart numbers are going to require a big seek, and you might even be right in supposing that a block with an intermediate number should be read on the way. But you have no hope at all of making the right decisions at a more local level --- say, reading various sectors within the same cylinder in an optimal fashion. You don't know where the track boundaries are, so you can't schedule in a way that minimizes rotational latency.

This is true, but it has to be examined in the context of the workload. If the workload is a sequential read, for instance, then the question becomes whether or not giving the controller a set of sequential blocks (in block ID order) will get you maximum read throughput. Given that the manufacturers all attempt to generate the biggest read throughput numbers, I think it's reasonable to assume that (a) the sectors are ordered within a cylinder such that reading block x + 1 immediately after block x will incur the smallest possible amount of delay if requested quickly enough, and (b) the same holds true when block x + 1 is on the next cylinder.

In the case of pure random reads, you'll end up having to wait an average of half of a rotation before beginning the read. Where SCSI buys you something here is when you have sequential chunks of reads that are randomly distributed. The SCSI drive can determine which block in the set to start with. But for that to really be a big win, the chunks themselves would have to span more than half a track at least, else you'd have a greater-than-half-a-track gap in the middle of your two sorted sector lists for that track (a really well-engineered SCSI disk could take advantage of the fact that there are multiple platters and fill the "gap" with reads from a different platter).

Admittedly, this can be quite a big win. With an average rotational latency of 4 milliseconds on a 7200 RPM disk, being able to begin the read at the earliest possible moment will shave at most 25% off the total average random-access latency, if the average seek time is 12 milliseconds.

> That might be the case with respect to decisions about long seeks, but not with respect to rotational latency. The kernel simply hasn't got the information.

True, but that should reduce the total latency by something like 17% (on average). Not trivial, to be sure, but not an order of magnitude, either.

--
Kevin Brown [EMAIL PROTECTED]
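The 25% and 17% figures above are straightforward arithmetic; here is a sketch of one plausible reading (the two-thirds factor in the last line is my assumption about how much of the rotational latency reordering recovers on average, not something stated in the thread):

```python
# Numbers used in the message above: ~4 ms average rotational latency
# (half of the 8.33 ms rotation of a 7200 RPM disk) and an assumed
# 12 ms average seek time.
avg_rot_ms = 4.0
seek_ms = 12.0
total_ms = seek_ms + avg_rot_ms          # 16 ms average random access

# Best case: the read starts the instant the seek completes, so the
# whole rotational latency is eliminated.
best_case = avg_rot_ms / total_ms
print(best_case)                          # 0.25 -> "at most 25%"

# If reordering recovers, say, two-thirds of the rotational latency
# on average, the overall saving is about 1/6 of total latency:
print(round((2 / 3) * avg_rot_ms / total_ms, 2))   # 0.17 -> "~17%"
```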
Re: [PERFORM] How to improve db performance with $7K?
Kevin Brown <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The reason this is so much more of a win than it was when ATA was designed is that in modern drives the kernel has very little clue about the physical geometry of the disk. Variable-size tracks, bad-block sparing, and stuff like that make for a very hard-to-predict mapping from linear sector addresses to actual disk locations.

> What I mean is that when it comes to scheduling disk activity, knowledge of the specific physical geometry of the disk isn't really important.

Oh?

Yes, you can probably assume that blocks with far-apart numbers are going to require a big seek, and you might even be right in supposing that a block with an intermediate number should be read on the way. But you have no hope at all of making the right decisions at a more local level --- say, reading various sectors within the same cylinder in an optimal fashion. You don't know where the track boundaries are, so you can't schedule in a way that minimizes rotational latency. You're best off to throw all the requests at the drive together and let the drive sort it out.

This is not to say that there's not a place for a kernel-side scheduler too. The drive will probably have a fairly limited number of slots in its command queue. The optimal thing is for those slots to be filled with requests that are in the same area of the disk. So you can still get some mileage out of an elevator algorithm that works on logical block numbers to give the drive requests for nearby block numbers at the same time. But there's also a lot of use in letting the drive do its own low-level scheduling.

> My argument is that a sufficiently smart kernel scheduler *should* yield performance results that are reasonably close to what you can get with that feature. Perhaps not quite as good, but reasonably close. It shouldn't be an orders-of-magnitude type difference.

That might be the case with respect to decisions about long seeks, but not with respect to rotational latency. The kernel simply hasn't got the information.

regards, tom lane
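The elevator algorithm Tom describes is simple to sketch: sort pending requests by logical block number and service them in one sweep, so the slots in the drive's command queue get filled with requests from the same region of the disk. A minimal illustration (not kernel code, just the idea):

```python
def elevator_order(pending, head_pos):
    """One sweep of a simple elevator (SCAN) over logical block
    numbers: service requests at or above the current head position
    in ascending order, then sweep back down through the rest."""
    up = sorted(b for b in pending if b >= head_pos)
    down = sorted((b for b in pending if b < head_pos), reverse=True)
    return up + down

print(elevator_order([91, 5, 70, 12, 88], head_pos=60))
# [70, 88, 91, 12, 5]
```

A scheduler working this way needs nothing but block numbers, which is exactly why it can handle long-seek ordering but not rotational latency: nothing in a block number says where the track boundaries are.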
Re: [PERFORM] How to improve db performance with $7K?
3ware claim that their 'software' implemented command queueing performs at 95% effectiveness compared to the hardware queueing on a SCSI drive, so I would say that they agree with you.

I'm still learning, but as I read it, the bits are split across the platters and there is only 'one' head, but it happens to be reading from multiple platters. The 'further' in linear distance the data is from the current position, the longer it's going to take to get there. This seems to be true based on a document that was circulated. A hard drive takes a considerable amount of time to 'find' a track on the platter compared to the rotational speed, which would agree with the fact that you can read 70MB/sec, but it takes up to 13ms to seek.

The ATA protocol is just how the HBA communicates with the drive; there is no reason why the HBA can't reschedule reads and writes just like the SCSI drive would do natively, and this is in fact what 3ware claims. I get the feeling, based on my own historical experience, that generally drives don't just have a bunch of bad blocks. This all leads me to believe that you can predict with pretty good accuracy how expensive it is to retrieve a given block knowing its linear increment.

Alex Turner
netEconomist

On 4/14/05, Kevin Brown <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > Kevin Brown <[EMAIL PROTECTED]> writes:
> > > I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. In the case of SCSI, the queueing happens on the disks (or at least on the controller). In the case of SATA, the queueing happens in the kernel.
> >
> > That's basically what it comes down to: SCSI lets the disk drive itself do the low-level I/O scheduling whereas the ATA spec prevents the drive from doing so (unless it cheats, ie, caches writes). Also, in SCSI it's possible for the drive to rearrange reads as well as writes --- which AFAICS is just not possible in ATA. (Maybe in the newest spec...)
> >
> > The reason this is so much more of a win than it was when ATA was designed is that in modern drives the kernel has very little clue about the physical geometry of the disk. Variable-size tracks, bad-block sparing, and stuff like that make for a very hard-to-predict mapping from linear sector addresses to actual disk locations.
>
> Yeah, but it's not clear to me, at least, that this is a first-order consideration. A second-order consideration, sure, I'll grant that.
>
> What I mean is that when it comes to scheduling disk activity, knowledge of the specific physical geometry of the disk isn't really important. What's important is whether or not the disk conforms to a certain set of expectations. Namely, that the general organization is such that addressing the blocks in block number order guarantees maximum throughput.
>
> Now, bad block remapping destroys that guarantee, but unless you've got a LOT of bad blocks, it shouldn't destroy your performance, right?
>
> > Combine that with the fact that the drive controller can be much smarter than it was twenty years ago, and you can see that the case for doing I/O scheduling in the kernel and not in the drive is pretty weak.
>
> Well, I certainly grant that allowing the controller to do the I/O scheduling is faster than having the kernel do it, as long as it can handle insertion of new requests into the list while it's in the middle of executing a request. The most obvious case is when the head is in motion and the new request can be satisfied by reading from the media between where the head is at the time of the new request and where the head is being moved to.
>
> My argument is that a sufficiently smart kernel scheduler *should* yield performance results that are reasonably close to what you can get with that feature. Perhaps not quite as good, but reasonably close. It shouldn't be an orders-of-magnitude type difference.
>
> --
> Kevin Brown [EMAIL PROTECTED]
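Alex's closing claim, that retrieval cost is predictable from the block's linear address alone, is easy to express as a toy cost model. The square-root curve and the constants below are my assumptions (real seek profiles are measured, not derived), but they capture the shape of the argument:

```python
def estimated_cost_ms(lba_from, lba_to, max_lba,
                      full_seek_ms=13.0, rot_latency_ms=4.2):
    """Toy model: seek cost grows with linear (LBA) distance, plus
    average rotational latency.  Short seeks are relatively cheap on
    real drives, so a square-root curve stands in for a measured
    seek profile."""
    frac = abs(lba_to - lba_from) / max_lba
    return full_seek_ms * frac ** 0.5 + rot_latency_ms

MAX_LBA = 300_000_000
near = estimated_cost_ms(1_000_000, 1_001_000, MAX_LBA)   # short hop
far = estimated_cost_ms(1_000_000, 250_000_000, MAX_LBA)  # near full stroke
print(round(near, 1), round(far, 1))
```

This is the same information a kernel elevator has to work with; what it cannot model, as discussed elsewhere in the thread, is where within a rotation the target sector will be.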
Re: [PERFORM] How to improve db performance with $7K?
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. In the case of SCSI, the queueing happens on the disks (or at least on the controller). In the case of SATA, the queueing happens in the kernel.
>
> That's basically what it comes down to: SCSI lets the disk drive itself do the low-level I/O scheduling whereas the ATA spec prevents the drive from doing so (unless it cheats, ie, caches writes). Also, in SCSI it's possible for the drive to rearrange reads as well as writes --- which AFAICS is just not possible in ATA. (Maybe in the newest spec...)
>
> The reason this is so much more of a win than it was when ATA was designed is that in modern drives the kernel has very little clue about the physical geometry of the disk. Variable-size tracks, bad-block sparing, and stuff like that make for a very hard-to-predict mapping from linear sector addresses to actual disk locations.

Yeah, but it's not clear to me, at least, that this is a first-order consideration. A second-order consideration, sure, I'll grant that.

What I mean is that when it comes to scheduling disk activity, knowledge of the specific physical geometry of the disk isn't really important. What's important is whether or not the disk conforms to a certain set of expectations. Namely, that the general organization is such that addressing the blocks in block number order guarantees maximum throughput.

Now, bad block remapping destroys that guarantee, but unless you've got a LOT of bad blocks, it shouldn't destroy your performance, right?

> Combine that with the fact that the drive controller can be much smarter than it was twenty years ago, and you can see that the case for doing I/O scheduling in the kernel and not in the drive is pretty weak.

Well, I certainly grant that allowing the controller to do the I/O scheduling is faster than having the kernel do it, as long as it can handle insertion of new requests into the list while it's in the middle of executing a request. The most obvious case is when the head is in motion and the new request can be satisfied by reading from the media between where the head is at the time of the new request and where the head is being moved to.

My argument is that a sufficiently smart kernel scheduler *should* yield performance results that are reasonably close to what you can get with that feature. Perhaps not quite as good, but reasonably close. It shouldn't be an orders-of-magnitude type difference.

--
Kevin Brown [EMAIL PROTECTED]
Re: [PERFORM] Intel SRCS16 SATA raid?
Alex Turner wrote:
> Looking at the numbers, the raptor with TCQ enabled was close or beat the Atlas III 10k drive on most benchmarks. Naturally a 15k drive is going to be faster in many areas, but it is also much more expensive. It was only 44% better on the server tests than the raptor with TCQ, but it costs nearly 300% more ($538 cdw.com, $180 newegg.com).

True, but that's a one-time expense (300%) for a 44% gain ALL the time. '44% better' is nothing to sneeze at. I'd easily pay the price for the gain in a large server env.

--
Until later, Geoffrey
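The trade-off both posts are weighing is easy to make explicit. (Note that the thread's "300% more" is really "3x the price", i.e. about 200% more; the drive prices are the ones quoted above.)

```python
raptor_price, atlas15k_price = 180, 538   # $, as quoted in the thread
server_gain = 0.44                        # Atlas 15k over Raptor w/TCQ

print(round(atlas15k_price / raptor_price, 2))   # 2.99 -> ~3x the price

# Relative performance per dollar (Raptor normalized to 1.0):
raptor_perf_per_dollar = 1.0 / raptor_price
atlas_perf_per_dollar = (1.0 + server_gain) / atlas15k_price
print(round(raptor_perf_per_dollar / atlas_perf_per_dollar, 1))  # ~2.1
```

On these numbers the Raptor delivers roughly twice the performance per dollar, while the Atlas has the higher absolute ceiling; which one wins depends on whether the budget or the ceiling matters, which is exactly the disagreement in the two posts above.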
Re: [PERFORM] Intel SRCS16 SATA raid?
Looking at the numbers, the raptor with TCQ enabled was close or beat the Atlas III 10k drive on most benchmarks. Naturally a 15k drive is going to be faster in many areas, but it is also much more expensive. It was only 44% better on the server tests than the raptor with TCQ, but it costs nearly 300% more ($538 cdw.com, $180 newegg.com). Note also that the 15k drive was the only drive that kept up with the raptor on raw transfer speed, which is going to matter for WAL.

For those of us on a budget, a quality controller card with lots of RAM is going to be our biggest friend because it can cache writes, and improve performance. The 3ware controllers seem to be universally benchmarked as the best SATA RAID 10 controllers where database performance is concerned. Even the crappy tweakers.net review had the 3ware as the fastest controller for a MySQL data partition in RAID 10.

The Raptor drives can be had for as little as $180/ea, which is quite a good price point considering they can keep up with their SCSI 10k RPM counterparts on almost all tests with NCQ enabled (note that 3ware controllers _don't_ support NCQ, although they claim their HBA-based queueing is 95% as good as NCQ on the drive).

Alex Turner
netEconomist

On 4/14/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 12:14 PM
> > To: [EMAIL PROTECTED]
> > Cc: Greg Stark; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> > I have put together a little head to head performance of a 15k SCSI, 10k SCSI, 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive comparison at storage review
> >
> > http://www.storagereview.com/php/benchmark/compare_rtg_2001.ph
> > p?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devI
> > D_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5
> > =248&devCnt=6
> >
> > It does illustrate some of the weaknesses of SATA drives, but all in all the Raptor drives put on a good show.
> [...]
>
> I think it's a little misleading that your tests show 0ms seek times for some of the write tests. The environmental test also selects a missing data point as the winner. Besides that, it seems to me that seek time is one of the most important features for a DB server, which means that the SCSI drives are the clear winners and the non-WD SATA drives are the embarrassing losers. Transfer rate is important, but perhaps less so because DBs tend to read/write small blocks rather than large files. On the server suite, which seems to me to be the most relevant for DBs, the Atlas 15k spanks the other drives by a fairly large margin (especially the lesser SATA drives). When you ignore the "consumer app" benchmarks, I wouldn't be so confident in saying that the Raptors "put on a good show".
>
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East, Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] 8.0.1 much slower than 7.4.2?
I would ask this on the jdbc mailing list. They might know.

--- anon permutation wrote:
> Hi,
>
> I have just upgraded our db from 7.4.2 to 8.0.1 and we are doing some testing. For some reason, we have discovered that our application performs much slower on 8.0.1.
>
> My initial reaction was to turn on log_min_duration_statement to see what's happening. However, log_min_duration_statement does not work for JDBC clients in 8.0.1.
>
> As a result, I modified log_statement to all. Without my application doing anything, I see the statements below being executed non-stop. Who is triggering these statements? Is this normal? What am I doing wrong?
>
> I am using Fedora Core 1 - Kernel: 2.4.22-1.2174.nptl
>
> Please help. Thanks.
>
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc LIKE '%nextval(%'
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc LIKE '%nextval(%'
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2
> 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum =
def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 
18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND attnum = $2 > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT def.adsrc FROM > pg_catalog.pg_class c JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid > ) LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND > a.attnum = def.adnum) WHERE c.oid = $1 and a.attnum = $2 AND def.adsrc L > IKE '%nextval(%' > 2005-04-04 18:05:00 CST PARSELOG: statement: SELECT attnotnull FROM > pg_catalog.pg_attribute WHERE attrelid = $1 AND at
Re: [PERFORM] Intel SRCS16 SATA raid?
Just to clarify: these are tests from http://www.storagereview.com, not my own. I guess they couldn't get numbers for those parts. I think everyone understands that a 0ms seek time is impossible, and indicates a missing data point.

Thanks,

Alex Turner
netEconomist

On 4/14/05, Dave Held <[EMAIL PROTECTED]> wrote:
> > -Original Message-
> > From: Alex Turner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 14, 2005 12:14 PM
> > To: [EMAIL PROTECTED]
> > Cc: Greg Stark; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
> > Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
> >
> > I have put together a little head to head performance of a 15k SCSI, 10k SCSI, 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive comparison at storage review
> >
> > http://www.storagereview.com/php/benchmark/compare_rtg_2001.ph
> > p?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devI
> > D_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5
> > =248&devCnt=6
> >
> > It does illustrate some of the weaknesses of SATA drives, but all in all the Raptor drives put on a good show.
> [...]
>
> I think it's a little misleading that your tests show 0ms seek times for some of the write tests. The environmental test also selects a missing data point as the winner. Besides that, it seems to me that seek time is one of the most important features for a DB server, which means that the SCSI drives are the clear winners and the non-WD SATA drives are the embarrassing losers. Transfer rate is important, but perhaps less so because DBs tend to read/write small blocks rather than large files. On the server suite, which seems to me to be the most relevant for DBs, the Atlas 15k spanks the other drives by a fairly large margin (especially the lesser SATA drives). When you ignore the "consumer app" benchmarks, I wouldn't be so confident in saying that the Raptors "put on a good show".
>
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East, Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] speed of querry?
> -Original Message-
> From: Joel Fradkin [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 14, 2005 11:39 AM
> To: 'Tom Lane'; 'Dawid Kuroczko'
> Cc: 'PERFORM'
> Subject: Re: [PERFORM] speed of querry?
>
> I did as described to alter table and did not see any difference in speed. I am trying to undo the symbolic link to the data array and set it up on raid 5 disks in the machine just to test if there is an issue with the config of the raid 10 array or a problem with the controller.
>
> I am kinda lame at Linux so not sure I have got it yet; still testing. Still kind of puzzled why it chose two different options, but one is running the Windows version of postgres, so maybe that has something to do with it.

That sounds like a plausible explanation. However, it could simply be that the statistics gathered on each box are sufficiently different to cause different plans.

> The data bases and configs (as far as page cost) are the same.

Did you do as Dawid suggested?

> [...]
> Then do a query couple of times (EXPLAIN ANALYZE also :)), then do:
> SET enable_seqscan = off;
> and rerun the query -- if it was significantly faster, you will want to do:
> SET enable_seqscan = on;
> and tweak:
> SET random_page_cost = 2.1;
> ...and play with values. When you reach the random_page_cost which suits your data, you will want to put it into postgresql.conf
> [...]

This is above and beyond toying with the column statistics. You are basically telling the planner to use an index. Try this, and post the EXPLAIN ANALYZE for the seqscan = off case on the slow box if it doesn't speed things up for you.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] speed of querry?
Well, so far I have 1.5 hours with commandprompt.com and 8+ hours with Dell and have not seen any changes in the speed of my query. I did move the database to the raid 5 drives and did see a 1 second improvement, from 13 secs to 12 secs (keep in mind it runs in 6 on the optiplex). The Dell guy ran Bonnie and found 40 meg per sec read/write speed for the arrays. He also installed version 8.0.2 (went fine on AS4; he had to uninstall 8.0.1 first). He is going to get a 6650 in his test lab to see what he can figure out.

I will say both commandprompt.com and Dell have been very professional, and I am impressed at the level of support available for Redhat from Dell and postgres. As always, I still feel this list has been my most useful asset, but I am glad there are folks to call on. I am trying to go live soon and need to get this resolved.

I told the guy from Dell it makes no sense that a Windows 2.4 GHz single proc with 750 meg of ram can go faster than a 4 proc (3.0 GHz) 8 gig machine. Both databases were restored from the same file. Same view etc. Config files are set the same except for amount of cached ram, although commandprompt.com had me adjust a few items that should help going into production, but planning stuff is basically the same. This view returns in 3 secs on MSSQL server on the optiplex (750 meg 2.4 box), 6 secs using postgres on windows, and 12-13 secs on the 4 processor box. Needless to say I am very frustrated. Maybe Dell will turn up something testing in their lab.

It took a bit of perseverance to get to the right guy at Dell (the first guy actually told me to load it all on a like machine, and if it was very much slower on my original they would pursue it; otherwise it was not an issue. I was like, the machine cost 30K; are you going to send me one to test that?). But seriously, I am open to trying anything (loading AS3, using postgres 7.4)?

The fellow at Dell does not think it is a hardware problem, so if it is Linux (could very well be, but he seemed very sharp and did not come up with anything yet) or postgres config (again, Josh at commandprompt.com was very sharp), then what do I do now to isolate the issue? At least they are loading one in the lab (in theory; I can't send them my database, so who knows what they will test). Dell changed the file system to ext2; is that going to bite me in the butt? It did not seem to change the speed of my explain analyze.

Joel Fradkin

Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values should it keep for statistics per column. The config default_statistics_target is the default (= used when creating table) and ALTER... is a way to change it later.

Not quite. default_statistics_target is the value used by ANALYZE for any column that hasn't had an explicit ALTER SET STATISTICS done on it. So you can change default_statistics_target and that will affect existing tables. (It used to work the way you are saying, but that was a few releases back...)

regards, tom lane
Re: [PERFORM] How to improve db performance with $7K?
Steve Poe wrote:
> If SATA drives don't have the ability to replace SCSI for a multi-user

I don't think it is a matter of not having the ability. SATA all in all is fine as long as it is battery backed. It isn't as high performing as SCSI, but who says it has to be? There are plenty of companies running databases on SATA without issue. Would I put it on a database that is expecting to have 500 connections at all times? No. Then again, if you have an application with that requirement, you have the money to buy a big fat SCSI array.

Sincerely,

Joshua D. Drake

> Postgres apps, but you needed to save on cost (ALWAYS an issue), could/would you implement SATA for your logs (pg_xlog) and keep the rest on SCSI?
>
> Steve Poe
>
> Mohan, Ross wrote:
> > I've been doing some reading up on this, trying to keep up here, and have found out that (experts, just yawn and cover your ears)
> >
> > 1) some SATA drives (just type II, I think?) have a "Phase Zero" implementation of Tagged Command Queueing (the special sauce for SCSI).
> > 2) This SATA "TCQ" is called NCQ and I believe it basically allows the disk software itself to do the reordering (this is called "simple" in TCQ terminology). It does not yet allow the TCQ "head of queue" command, allowing the current tagged request to go to head of queue, which is a simple way of manifesting a "high priority" request.
> > 3) SATA drives are not yet multi-initiator?
> >
> > Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives are likely to whomp SATA II drives for a while yet (read: a year or two) in multiuser PostGres applications.
> >
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
> > Sent: Thursday, April 14, 2005 2:04 PM
> > To: Kevin Brown
> > Cc: pgsql-performance@postgresql.org
> > Subject: Re: [PERFORM] How to improve db performance with $7K?
> >
> > Kevin Brown <[EMAIL PROTECTED]> writes:
> > > Greg Stark wrote:
> > > > I think you're being misled by analyzing the write case. Consider the read case. When a user process requests a block and that read makes its way down to the driver level, the driver can't just put it aside and wait until it's convenient. It has to go ahead and issue the read right away.
> > >
> > > Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest.
> >
> > But then the cure is worse than the disease. You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek...
> >
> > > Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in disk-layout order.
> > >
> > > I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens.
> >
> > And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied, they have to wait until that seek is finished and get acted on during the next large seek.
> >
> > If my theory is correct then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. Ie, while a busy SCSI drive can satisfy most requests in about 10ms a busy SATA drive would satisfy most requests in 15ms. (Add to that that 10k RPM and 15k RPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...)
> >
> > In reality higher latency feeds into a system feedback loop causing your application to run slower, causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems.
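Greg's 50%-of-seek-time estimate translates directly into per-drive request rates; a quick sketch of his arithmetic (the 10 ms service time is his assumption for a busy SCSI drive with tagged queueing):

```python
scsi_ms = 10.0                 # assumed busy-drive service time w/ TCQ
extra_ms = 0.5 * scsi_ms       # penalty: ~50% of the average seek time
sata_ms = scsi_ms + extra_ms
print(sata_ms)                 # 15.0 -> "most requests in 15ms"

# Random-read throughput per drive implied by those latencies:
print(round(1000 / scsi_ms, 1))   # 100.0 requests/sec
print(round(1000 / sata_ms, 1))   # 66.7 requests/sec
```

As Greg notes, in a real system the higher latency feeds back into lower offered load, so the measured gap is usually smaller than this worst-case figure suggests.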
---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [Fwd: Re: [PERFORM] Functionscan estimates]
On Thu, Apr 14, 2005 at 10:39:03AM -0700, elein wrote: > All functions could have a cost associated with them, set by the writer of > the function in order for the planner to reorder function calls. > The stonebraker airplane level example was: > select ... from ... where f(id) = 3 and expensive_image_function(img) > The idea, of course is to weight the expensive function so it was > pushed to the end of the execution. So there was only a constant cost associated with the function? No estimator function, for example? -- Alvaro Herrera (<[EMAIL PROTECTED]>) "If you have nothing to say, maybe you need just the right tool to help you not say it." (New York Times, about Microsoft PowerPoint) ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
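The reordering elein describes can be sketched in miniature: give each predicate a writer-assigned constant cost, have the "planner" evaluate cheapest-first, and let short-circuiting spare the expensive calls. This is an illustrative toy, not the Illustra or Postgres planner; the predicate names and cost numbers are invented.

```python
# Toy cost-based predicate reordering: cheap predicates run first, so
# short-circuiting spares the expensive ones. All names/costs invented.

def plan(predicates):
    """Order (predicate, cost) pairs by ascending declared cost."""
    return sorted(predicates, key=lambda p: p[1])

def matches(row, predicates):
    # Evaluate in planned order; stop at the first failing predicate.
    for pred, _cost in plan(predicates):
        if not pred(row):
            return False
    return True

calls = {"cheap": 0, "expensive": 0}

def cheap_filter(row):                 # stands in for f(id) = 3
    calls["cheap"] += 1
    return row["id"] == 3

def expensive_image_function(row):     # stands in for the airplane example
    calls["expensive"] += 1
    return True

preds = [(expensive_image_function, 1000.0), (cheap_filter, 1.0)]
rows = [{"id": i} for i in range(100)]
hits = [r for r in rows if matches(r, preds)]

# The expensive predicate only ran on the one row that passed the cheap one.
print(len(hits), calls["cheap"], calls["expensive"])  # 1 100 1
```

With only a constant cost per function (no estimator), this sort is all the planner could do, which is exactly the limitation Alvaro is probing.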
Re: [PERFORM] How to improve db performance with $7K?
> -Original Message- > From: Mohan, Ross [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 14, 2005 1:30 PM > To: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] How to improve db performance with $7K? > > Greg Stark wrote: > > > > Kevin Brown <[EMAIL PROTECTED]> writes: > > > > > Greg Stark wrote: > > > > > > > I think you're being misled by analyzing the write case. > > > > > > > > Consider the read case. When a user process requests a block > > > > and that read makes its way down to the driver level, the > > > > driver can't just put it aside and wait until it's convenient. > > > > It has to go ahead and issue the read right away. > > > > > > Well, strictly speaking it doesn't *have* to. It could delay > > > for a couple of milliseconds to see if other requests come in, > > > and then issue the read if none do. If there are already other > > > requests being fulfilled, then it'll schedule the request in > > > question just like the rest. > > > > But then the cure is worse than the disease. You're basically > > describing exactly what does happen anyways, only you're > > delaying more requests than necessary. That intervening time > > isn't really idle, it's filled with all the requests that > > were delayed during the previous large seek... > > [...] > > [...] > 1) some SATA drives (just type II, I think?) have a "Phase Zero" > implementation of Tagged Command Queueing (the special sauce > for SCSI). > [...] > Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives > are likely to whomp SATA II drives for a while yet (read: a > year or two) in multiuser PostGres applications. I would say it depends on the OS. What Kevin is describing sounds just like the Anticipatory I/O Scheduler in Linux 2.6: http://www.linuxjournal.com/article/6931 For certain application contexts, it looks like a big win. Not entirely sure if Postgres is one of them, though. If SCSI beats SATA, it sounds like it will be mostly due to better seek times. __ David B. 
Held Software Engineer/Array Services Group 200 14th Ave. East, Sartell, MN 56377 320.534.3637 320.253.7800 800.752.8129 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
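Kevin's delay-a-couple-of-milliseconds idea is essentially what the anticipatory scheduler bets on: briefly idling can beat immediately seeking away from a process that is about to request a nearby block. A minimal sketch of the payoff, with made-up block numbers (this is the idea only, not the Linux 2.6 implementation):

```python
# Sketch of the anticipatory idea: after finishing a request, the
# scheduler may idle briefly hoping the same process asks for a nearby
# block, instead of seeking to a distant pending request. Numbers are
# hypothetical; head movement stands in for service time.

def seek_distance(order, start=0):
    """Total head movement for servicing block numbers in the given order."""
    pos, total = start, 0
    for blk in order:
        total += abs(blk - pos)
        pos = blk
    return total

# A sequential reader wants blocks 10, 11, 12; an unrelated request
# waits at block 5000. Greedy service ping-pongs the head.
greedy = [10, 5000, 11, 12]          # serve whatever is queued next
anticipatory = [10, 11, 12, 5000]    # idle briefly, batch the reader

print(seek_distance(greedy), seek_distance(anticipatory))  # 9990 5000
```

The tradeoff, as the article notes, is that the idle windows are wasted whenever the anticipated request never arrives, which is why the win is workload-dependent.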
Re: [PERFORM] How to improve db performance with $7K?
If SATA drives don't have the ability to replace SCSI for a multi-user Postgres apps, but you needed to save on cost (ALWAYS an issue), could/would you implement SATA for your logs (pg_xlog) and keep the rest on SCSI? Steve Poe Mohan, Ross wrote: I've been doing some reading up on this, trying to keep up here, and have found out that (experts, just yawn and cover your ears) 1) some SATA drives (just type II, I think?) have a "Phase Zero" implementation of Tagged Command Queueing (the special sauce for SCSI). 2) This SATA "TCQ" is called NCQ and I believe it basically allows the disk software itself to do the reordering (this is called "simple" in TCQ terminology) It does not yet allow the TCQ "head of queue" command, allowing the current tagged request to go to head of queue, which is a simple way of manifesting a "high priority" request. 3) SATA drives are not yet multi-initiator? Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives are likely to whomp SATA II drives for a while yet (read: a year or two) in multiuser PostGres applications. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Thursday, April 14, 2005 2:04 PM To: Kevin Brown Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Kevin Brown <[EMAIL PROTECTED]> writes: Greg Stark wrote: I think you're being misled by analyzing the write case. Consider the read case. When a user process requests a block and that read makes its way down to the driver level, the driver can't just put it aside and wait until it's convenient. It has to go ahead and issue the read right away. Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest. But then the cure is worse than the disease. 
You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek... Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in disk-layout order. I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied they have to wait until that seek is finished and get acted on during the next large seek. If my theory is correct then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. Ie, while a busy SCSI drive can satisfy most requests in about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...) In reality higher latency feeds into a system feedback loop causing your application to run slower causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to improve db performance with $7K?
The real question is whether you choose the single 15kRPM drive or additional drives at 10kRPM... Additional spindles would give a much bigger bandwidth improvement but questionable latency improvement. And the bonus question. Expensive fast drives as a RAID for everything, or for the same price many more slower drives (even SATA) so you can put the transaction log, tables, indexes all on separate physical drives? Like put one very frequently used table on its own disk? For the same amount of money which one would be more interesting? ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to improve db performance with $7K?
I've been doing some reading up on this, trying to keep up here, and have found out that (experts, just yawn and cover your ears) 1) some SATA drives (just type II, I think?) have a "Phase Zero" implementation of Tagged Command Queueing (the special sauce for SCSI). 2) This SATA "TCQ" is called NCQ and I believe it basically allows the disk software itself to do the reordering (this is called "simple" in TCQ terminology) It does not yet allow the TCQ "head of queue" command, allowing the current tagged request to go to head of queue, which is a simple way of manifesting a "high priority" request. 3) SATA drives are not yet multi-initiator? Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives are likely to whomp SATA II drives for a while yet (read: a year or two) in multiuser PostGres applications. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Thursday, April 14, 2005 2:04 PM To: Kevin Brown Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Kevin Brown <[EMAIL PROTECTED]> writes: > Greg Stark wrote: > > > > I think you're being misled by analyzing the write case. > > > > Consider the read case. When a user process requests a block and > > that read makes its way down to the driver level, the driver can't > > just put it aside and wait until it's convenient. It has to go ahead > > and issue the read right away. > > Well, strictly speaking it doesn't *have* to. It could delay for a > couple of milliseconds to see if other requests come in, and then > issue the read if none do. If there are already other requests being > fulfilled, then it'll schedule the request in question just like the > rest. But then the cure is worse than the disease. You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. 
That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek... > Once the first request has been fulfilled, the driver can now schedule > the rest of the queued-up requests in disk-layout order. > > I really don't see how this is any different between a system that has > tagged queueing to the disks and one that doesn't. The only > difference is where the queueing happens. And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied they have to wait until that seek is finished and get acted on during the next large seek. If my theory is correct then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. Ie, while a busy SCSI drive can satisfy most requests in about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...) In reality higher latency feeds into a system feedback loop causing your application to run slower causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] speed of querry?
I did as described to alter table and did not see any difference in speed. I am trying to undo the symbolic link to the data array and set it up on raid 5 disks in the machine just to test if there is an issue with the config of the raid 10 array or a problem with the controller. I am kinda lame at Linux, so I'm not sure I have got it yet; still testing. Still kinda puzzled why it chose two different options, but one is running the Windows version of Postgres, so maybe that has something to do with it. The databases and configs (as far as page cost) are the same. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 [EMAIL PROTECTED] www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology & advanced tools. © 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Thursday, April 14, 2005 11:21 AM To: Dawid Kuroczko Cc: Joel Fradkin; PERFORM Subject: Re: [PERFORM] speed of querry? Dawid Kuroczko <[EMAIL PROTECTED]> writes: > Basically it tells postgres how many values should it keep for > statistics per column. The config default_statistics_target > is the default (= used when creating table) and ALTER... is > a way to change it later. Not quite. default_statistics_target is the value used by ANALYZE for any column that hasn't had an explicit ALTER SET STATISTICS done on it. So you can change default_statistics_target and that will affect existing tables. (It used to work the way you are saying, but that was a few releases back...) 
regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How to improve db performance with $7K?
> -Original Message- > From: Greg Stark [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 14, 2005 12:55 PM > To: [EMAIL PROTECTED] > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] How to improve db performance with $7K? > > "Matthew Nuzum" <[EMAIL PROTECTED]> writes: > > > drive A has capacity C and spins at 15K rpms, and > > drive B has capacity 2 x C and spins at 10K rpms and > > all other features are the same, the price is the same and > > C is enough disk space which would you choose? > > In this case you always choose the 15k RPM drive, at least > for Postgres. The 15kRPM reduces the latency which improves > performance when fsyncing transaction commits. I think drive B is clearly the best choice. Matt said "all other features are the same", including price. I take that to mean that the seek time and throughput are also identical. However, I think it's fairly clear that there is no such pair of actual devices. If Matt really meant that they have the same cache size, interface, etc, then I would agree with you. The 15k drive is likely to have the better seek time. > The real question is whether you choose the single 15kRPM > drive or additional drives at 10kRPM... Additional spindles > would give a much bigger bandwidth improvement but questionable > latency improvement. Under the assumption that the seek times and throughput are realistic rather than contrived as in the stated example, I would say the 15k drive is the likely winner. It probably has the better seek time, and it seems that latency is more important than bandwidth for DB apps. > > Would the increased data density of the higher capacity drive > > be of greater benefit than the faster spindle speed of drive > > A? > > actually a 2xC capacity drive probably just has twice as many > platters which means it would perform identically to the C > capacity drive. If it has denser platters that might improve > performance slightly. 
Well, according to the paper referenced by Richard, twice as many platters means that it probably has slightly worse seek time (because of the increased mass of the actuator/rw-head). Yet another reason why the smaller drive might be preferable. Of course, the data density is certainly a factor, as you say. But since the drives are within a factor of 2, it seems likely that real drives would have comparable densities. __ David B. Held Software Engineer/Array Services Group 200 14th Ave. East, Sartell, MN 56377 320.534.3637 320.253.7800 800.752.8129 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Intel SRCS16 SATA raid?
Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead of the 3ware one. Well I have never even heard of it. 3ware is the de facto authority on reasonable SATA RAID. If you were to go with a different brand, I would go with LSI. The LSI 150-6 is a nice card with a battery backup option as well. Oh and 3ware has BBU for certain models as well. Sincerely, Joshua D. Drake ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] How to improve db performance with $7K?
"Matthew Nuzum" <[EMAIL PROTECTED]> writes: > So if you all were going to choose between two hard drives where: > drive A has capacity C and spins at 15K rpms, and > drive B has capacity 2 x C and spins at 10K rpms and > all other features are the same, the price is the same and C is enough > disk space which would you choose? > I've noticed that on IDE drives, as the capacity increases the data > density increases and there is a pereceived (I've not measured it) > performance increase. > Would the increased data density of the higher capacity drive be of > greater benefit than the faster spindle speed of drive A? Depends how they got the 2x capacity increase. If they got it by increased bit density --- same number of tracks, but more sectors per track --- then drive B actually has a higher transfer rate, because in one rotation it can transfer twice as much data as drive A. More tracks per cylinder (ie, more platters) can also be a speed win since you can touch more data before you have to seek to another cylinder. Drive B will lose if the 2x capacity was all from adding cylinders (unless its seek-time spec is way better than A's ... which is unlikely but not impossible, considering the cylinders are probably closer together). Usually there's some-of-each involved, so it's hard to make any definite statement without more facts. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] recovery after long delete
Markus Bertheau <[EMAIL PROTECTED]> writes: > How does oracle do that? Has all this something to do with mvcc? Why > does it take oracle so long to recover? Postgres does "pessimistic MVCC" where it keeps the old versions where they are in the table. Only after it's committed can they be cleaned up and reused. So aborting is a noop but committing requires additional cleanup (which is put off until vacuum runs). Oracle does "optimistic MVCC" where it assumes most transactions will commit and most transactions will be reading mostly committed data. So it immediately does all the cleanup for the commit. It stores the old version in separate storage spaces called the rollback segment and redo logs. Committing is a noop (almost, there are some details, search for "delayed block cleanout") whereas rolling back requires copying back all that old data from the redo logs back to the table. Engineering is all about tradeoffs. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to improve db performance with $7K?
Kevin Brown <[EMAIL PROTECTED]> writes: > Greg Stark wrote: > > > > I think you're being misled by analyzing the write case. > > > > Consider the read case. When a user process requests a block and > > that read makes its way down to the driver level, the driver can't > > just put it aside and wait until it's convenient. It has to go ahead > > and issue the read right away. > > Well, strictly speaking it doesn't *have* to. It could delay for a > couple of milliseconds to see if other requests come in, and then > issue the read if none do. If there are already other requests being > fulfilled, then it'll schedule the request in question just like the > rest. But then the cure is worse than the disease. You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek... > Once the first request has been fulfilled, the driver can now schedule > the rest of the queued-up requests in disk-layout order. > > I really don't see how this is any different between a system that has > tagged queueing to the disks and one that doesn't. The only > difference is where the queueing happens. And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied they have to wait until that seek is finished and get acted on during the next large seek. If my theory is correct then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. Ie, while a busy SCSI drive can satisfy most requests in about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...) 
In reality higher latency feeds into a system feedback loop causing your application to run slower causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
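Greg's 10ms-vs-15ms estimate follows directly from his model: if the drive cannot accept new requests during a seek, a request that arrives mid-seek waits out, on average, about half of the in-progress seek before it can even be scheduled. In rough arithmetic (illustrative numbers only, not measurements):

```python
# Rough arithmetic behind the 50%-of-seek-time estimate. Illustrative
# numbers only: 10ms stands in for the busy-drive service time above.

avg_seek_ms = 10.0

scsi_latency = avg_seek_ms                       # request issued immediately,
                                                 # overlapped via tagged queueing
sata_latency = avg_seek_ms + 0.5 * avg_seek_ms   # plus ~half a seek of waiting
                                                 # for the current seek to finish

print(scsi_latency, sata_latency)  # 10.0 15.0
```

Bandwidth stays roughly equal in this model because the same requests are eventually serviced in the same order; only the per-request wait grows.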
Re: [PERFORM] How to improve db performance with $7K?
"Matthew Nuzum" <[EMAIL PROTECTED]> writes: > drive A has capacity C and spins at 15K rpms, and > drive B has capacity 2 x C and spins at 10K rpms and > all other features are the same, the price is the same and C is enough > disk space which would you choose? In this case you always choose the 15k RPM drive, at least for Postgres. The 15kRPM reduces the latency which improves performance when fsyncing transaction commits. The real question is whether you choose the single 15kRPM drive or additional drives at 10kRPM... Additional spindles would give a much bigger bandwidth improvement but questionable latency improvement. > Would the increased data density of the higher capacity drive be of > greater benefit than the faster spindle speed of drive A? actually a 2xC capacity drive probably just has twice as many platters which means it would perform identically to the C capacity drive. If it has denser platters that might improve performance slightly. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Intel SRCS16 SATA raid?
Nice research Alex. Your data strongly support the information in the paper. Your SCSI drives blew away the others in all of the server benchmarks. They're only marginally better in desktop use. I do find it somewhat amazing that a 15K SCSI 320 drive isn't going to help me play Unreal Tournament much faster. That's okay. I suck at it anyway. My kid has never lost to me. She enjoys seeing daddy as a bloody smear and bouncing body parts anyway. It promotes togetherness. Here's a quote from the paper: "[SCSI] interfaces support multiple initiators or hosts. The drive must keep track of separate sets of information for each host to which it is attached, e.g., maintaining the processor pointer sets for multiple initiators and tagged commands. The capability of SCSI/FC to efficiently process commands and tasks in parallel has also resulted in a higher overhead "kernel" structure for the firmware." Has anyone ever seen a system with multiple hosts or initiators on a SCSI bus? Seems like it would be a very cool thing in an SMP architecture, but I've not seen an example implemented. Rick Alex Turner <[EMAIL PROTECTED]> wrote on 04/14/2005 12:13:41 PM: > I have put together a little head to head performance of a 15k SCSI, > 10k SCSI 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive > comparison at storage review > > http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6 > > It does illustrate some of the weaknesses of SATA drives, but all in > all the Raptor drives put on a good show. > > Alex Turner > netEconomist > > On 4/14/05, Alex Turner <[EMAIL PROTECTED]> wrote: > > I have read a large chunk of this, and I would highly recommend it to > > anyone who has been participating in the drive discussions. It is > > most informative!! 
> > > > Alex Turner > > netEconomist > > > > On 4/14/05, [EMAIL PROTECTED] > <[EMAIL PROTECTED]> wrote: > > > Greg, > > > > > > I posted this link under a different thread (the $7k server > thread). It is > > > a very good read on why SCSI is better for servers than ATA. I > didn't note > > > bias, though it is from a drive manufacturer. YMMV. There is an > > > interesting, though dated appendix on different manufacturers' drive > > > characteristics. > > > > > > http://www.seagate. > com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > > > > > > Enjoy, > > > > > > Rick > > > > > > [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM: > > > > > > > > > > > Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller > > > > instead of the 3ware one. > > > > > > > > Poking around it seems this does come with Linux drivers and there is a > > > > battery backup option. So it doesn't seem to be completely insane. > > > > > > > > Anyone have any experience with these controllers? > > > > > > > > I'm also wondering about whether I'm better off with one of these SATA > > > raid > > > > controllers or just going with SCSI drives. > > > > > > > > -- > > > > greg > > > > > > > > > > > > ---(end of broadcast)--- > > > > TIP 8: explain analyze is your friend > > > > > > ---(end of broadcast)--- > > > TIP 9: the planner will ignore your desire to choose an index scan if your > > > joining column's datatypes do not match > > > > > ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [Fwd: Re: [PERFORM] Functionscan estimates]
I'm not subscribed to performance at this time. I reviewed the thread and owe everything I know about this to Wei Hong whose brilliance exceeds all others :) All misinterpretations are mine alone. I have not reviewed hellerstein's papers posted by neil, but I will. My understanding of this issue is at a very high user level. In Illustra SRF functions were not necessarily special functions. All functions could have a cost associated with them, set by the writer of the function in order for the planner to reorder function calls. The stonebraker airplane level example was: select ... from ... where f(id) = 3 and expensive_image_function(img) The idea, of course is to weight the expensive function so it was pushed to the end of the execution. The only difference I see with SRFs in Postgres is that you may want the cost represented as one row returned and another weighting representing the number of estimated rows. I think this conclusion has already been drawn. It seems to make sense, if the optimizer can use this information, to include wild and/or educated guesses for the costs of the SRF. I'm sure I haven't contributed here anything new, but perhaps phrased it differently. Copy me on replies and I'll participate as I can. --elein On Thu, Apr 14, 2005 at 08:36:38AM +0100, Simon Riggs wrote: > Elein, > > Any chance you could join this discussion on PERFORM ? > > I understand you did time with Illustra. I thought they had solved the > optimizer plug-in issue...how did they do it? > > Best Regards, Simon Riggs > > > Forwarded Message > From: Tom Lane <[EMAIL PROTECTED]> > To: Alvaro Herrera <[EMAIL PROTECTED]> > Cc: Josh Berkus , Michael Fuhr <[EMAIL PROTECTED]>, > > Subject: Re: [PERFORM] Functionscan estimates > Date: Sat, 09 Apr 2005 00:00:56 -0400 > Not too many releases ago, there were several columns in pg_proc that > were intended to support estimation of the runtime cost and number of > result rows of set-returning functions. 
I believe in fact that these > were the remains of Joe Hellerstein's thesis on expensive-function > evaluation, and are exactly what he was talking about here: > http://archives.postgresql.org/pgsql-hackers/2002-06/msg00085.php > > But with all due respect to Joe, I think the reason that stuff got > trimmed is that it didn't work very well. In most cases it's > *hard* to write an estimator for a SRF. Let's see you produce > one for dblink() for instance ... > > regards, tom lane > > ---(end of broadcast)--- > TIP 7: don't forget to increase your free space map settings > ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Intel SRCS16 SATA raid?
> -Original Message- > From: Alex Turner [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 14, 2005 12:14 PM > To: [EMAIL PROTECTED] > Cc: Greg Stark; pgsql-performance@postgresql.org; > [EMAIL PROTECTED] > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > I have put together a little head to head performance of a 15k SCSI, > 10k SCSI 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive > comparison at storage review > > http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6 > > It does illustrate some of the weaknesses of SATA drives, but all in > all the Raptor drives put on a good show. > [...] I think it's a little misleading that your tests show 0ms seek times for some of the write tests. The environmental test also selects a missing data point as the winner. Besides that, it seems to me that seek time is one of the most important features for a DB server, which means that the SCSI drives are the clear winners and the non-WD SATA drives are the embarrassing losers. Transfer rate is important, but perhaps less so because DBs tend to read/write small blocks rather than large files. On the server suite, which seems to me to be the most relevant for DBs, the Atlas 15k spanks the other drives by a fairly large margin (especially the lesser SATA drives). When you ignore the "consumer app" benchmarks, I wouldn't be so confident in saying that the Raptors "put on a good show". __ David B. Held Software Engineer/Array Services Group 200 14th Ave. East, Sartell, MN 56377 320.534.3637 320.253.7800 800.752.8129 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Foreign key slows down copy/insert
I have a table A with an int column ID that references table B column ID. Table B has about 150k rows, and has an index on B.ID. When trying to copy 1 million rows into A, I get the following \timings: You're using 7.4.5. It's possible that you have a type mismatch in your foreign keys which prevents use of the index on B. First of all, be really sure it's THAT foreign key, ie. do your COPY with only ONE foreign key at a time if you have several, and see which one is the killer. Then, supposing it's the column in A which REFERENCE's B(id) : SELECT id FROM A LIMIT 1; (check type) SELECT id FROM B LIMIT 1; (check type) EXPLAIN ANALYZE the following : SELECT * FROM B WHERE id = (SELECT id FROM A LIMIT 1); It should use the index. Does it ? ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Intel SRCS16 SATA raid?
I have put together a little head-to-head performance comparison of a 15k SCSI, 10k SCSI, 10K SATA w/TCQ, 10K SATA wo/TCQ and 7.2K SATA drive at StorageReview:

http://www.storagereview.com/php/benchmark/compare_rtg_2001.php?typeID=10&testbedID=3&osID=4&raidconfigID=1&numDrives=1&devID_0=232&devID_1=40&devID_2=259&devID_3=267&devID_4=261&devID_5=248&devCnt=6

It does illustrate some of the weaknesses of SATA drives, but all in all the Raptor drives put on a good show.

Alex Turner
netEconomist

On 4/14/05, Alex Turner <[EMAIL PROTECTED]> wrote:
> I have read a large chunk of this, and I would highly recommend it to
> anyone who has been participating in the drive discussions. It is
> most informative!!
>
> Alex Turner
> netEconomist
>
> On 4/14/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Greg,
> >
> > I posted this link under a different thread (the $7k server thread). It is
> > a very good read on why SCSI is better for servers than ATA. I didn't note
> > bias, though it is from a drive manufacturer. YMMV. There is an
> > interesting, though dated, appendix on different manufacturers' drive
> > characteristics.
> >
> > http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf
> >
> > Enjoy,
> >
> > Rick
> >
> > [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> > >
> > > Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> > > instead of the 3ware one.
> > >
> > > Poking around it seems this does come with Linux drivers and there is a
> > > battery backup option. So it doesn't seem to be completely insane.
> > >
> > > Anyone have any experience with these controllers?
> > >
> > > I'm also wondering about whether I'm better off with one of these SATA raid
> > > controllers or just going with SCSI drives.
> > >
> > > --
> > > greg
Re: [PERFORM] Intel SRCS16 SATA raid?
I have read a large chunk of this, and I would highly recommend it to anyone who has been participating in the drive discussions. It is most informative!!

Alex Turner
netEconomist

On 4/14/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Greg,
>
> I posted this link under a different thread (the $7k server thread). It is
> a very good read on why SCSI is better for servers than ATA. I didn't note
> bias, though it is from a drive manufacturer. YMMV. There is an
> interesting, though dated, appendix on different manufacturers' drive
> characteristics.
>
> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf
>
> Enjoy,
>
> Rick
>
> [EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> >
> > Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> > instead of the 3ware one.
> >
> > Poking around it seems this does come with Linux drivers and there is a
> > battery backup option. So it doesn't seem to be completely insane.
> >
> > Anyone have any experience with these controllers?
> >
> > I'm also wondering about whether I'm better off with one of these SATA raid
> > controllers or just going with SCSI drives.
> >
> > --
> > greg
Re: [PERFORM] recovery after long delete
Markus Bertheau <[EMAIL PROTECTED]> writes:
> Now if you pull the plug after 2, at startup, pg will go through the
> in-progress txns and mark them as aborted. That's all the recovery in
> this case. All rows are still there. O(1).

Right. (Actually it's O(checkpoint interval), because we have to make sure that everything we did since the last checkpoint actually got to disk --- but in principle, there's zero recovery effort.)

> How does oracle do that? Has all this something to do with mvcc? Why
> does it take oracle so long to recover?

Oracle doesn't do MVCC the same way we do. They update rows in place and put the previous version of a row into an "undo log". If the transaction aborts, they have to go back through the undo log and put back the previous version of the row. I'm not real clear on how that applies to deletions, but I suppose it's the same deal: the cost of undoing a transaction in Oracle is proportional to the number of rows it changed. There's also the little problem that the space available for UNDO logs is limited :-(

As against which, they don't have to VACUUM. So it's a tradeoff.

regards, tom lane
Re: [PERFORM] Foreign key slows down copy/insert
About foreign key performance: maybe foreign key checks could be delayed until the COMMIT phase. At that point, you could check whether there are lots of foreign key checks pending for each foreign key, and do the foreign key check for a range or for the whole table, whichever is faster.

I have heard that the database must be in a consistent state after COMMIT, but it does not necessarily have to be consistent inside a transaction.

1. COMMIT wanted.
2. If there are lots of foreign key checks pending, do either a range foreign key check (the join result must be 0 rows) or a full table join.
3. If all foreign key checks are okay, complete the COMMIT operation.
4. If a foreign key check fails, go into the ROLLBACK NEEDED state.

Maybe Tom Lane meant the same:

set option delayed_foreign_keys=true;
BEGIN;
insert 1000 rows.
COMMIT;

Regards,
Marko Ristola

Christopher Kings-Lynne wrote:
> My problem with this really is that in my database it is hard to predict
> which inserts will be huge (and thus need FKs disabled), so I would have
> to code around all inserts. Instead I can code my own integrity logic and
> avoid using FKs altogether.
>
> Just drop the FK and re-add it, until postgres gets more smarts.
>
> Chris
Re: [PERFORM] Foreign key slows down copy/insert
On Thu, 14 Apr 2005, Tom Lane wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > ... At some point, if we can work out how to do all the semantics
> > properly, it'd probably be possible to replace the insert type check with
> > a per-statement check which would be somewhere in between. That requires
> > access to the affected rows inside the trigger which I don't believe is
> > available currently.
>
> Not necessarily. It occurs to me that maybe what we need is "lossy
> storage" of the trigger events. If we could detect that the queue of
> pending checks for a particular FK is getting large, we could discard
> the whole queue and replace it with one entry that says "run the
> wholesale check again when we are ready to fire triggers".

Yeah, but there's a potentially large middle ground where neither our current plan nor checking the entire table is particularly good, and we might be able to handle that middle ground better. It'd be nice to also fall back to checking the entire table for even larger changes.
Re: [PERFORM] Intel SRCS16 SATA raid?
Greg,

I posted this link under a different thread (the $7k server thread). It is a very good read on why SCSI is better for servers than ATA. I didn't note bias, though it is from a drive manufacturer. YMMV. There is an interesting, though dated, appendix on different manufacturers' drive characteristics.

http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf

Enjoy,

Rick

[EMAIL PROTECTED] wrote on 04/14/2005 09:54:45 AM:
> Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller
> instead of the 3ware one.
>
> Poking around it seems this does come with Linux drivers and there is a
> battery backup option. So it doesn't seem to be completely insane.
>
> Anyone have any experience with these controllers?
>
> I'm also wondering about whether I'm better off with one of these SATA raid
> controllers or just going with SCSI drives.
>
> --
> greg
Re: [PERFORM] speed of querry?
Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> Basically it tells postgres how many values it should keep for
> statistics per column. The config default_statistics_target
> is the default (= used when creating a table) and ALTER... is
> a way to change it later.

Not quite. default_statistics_target is the value used by ANALYZE for any column that hasn't had an explicit ALTER SET STATISTICS done on it. So you can change default_statistics_target and that will affect existing tables. (It used to work the way you are saying, but that was a few releases back...)

regards, tom lane
Re: [PERFORM] speed of querry?
On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> Josh from commandprompt.com had me alter the config to have
> default_statistics_target = 250
>
> Is this somehow related to what you're asking me to do?
> I did do an analyze, but have only run the view a few times.

Well, he did suggest the right thing. However, this parameter applies to newly created tables, so either recreate the tables or do the ALTER TABLE I sent earlier.

Basically it tells postgres how many values it should keep for statistics per column. The config default_statistics_target is the default (= used when creating a table) and ALTER... is a way to change it later.

The more statistics PostgreSQL has, the better it can predict how much data will be returned -- and this directly leads to the choice of how to handle the data (the order in which tables should be read, whether to use an index or not, which algorithm to use for a join, etc.). The more statistics, the better PostgreSQL is able to predict; but also the more statistics, the slower the planner's analysis. So you have to find a value which is as high as needed to accurately predict the results, but not higher! PostgreSQL's default of 10 is a bit conservative, hence the suggestions to increase it. :)

[ The same goes for random_page_cost: some people have found that in their cases it is beneficial to reduce the value, even to below 2. ]

Hope this clarifies things a bit.

Regards,
Dawid
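As a hedged sketch of the two ways of raising the statistics target discussed above (the table and column names are taken from earlier in the thread; treat the whole session as illustrative, not as the poster's exact commands):

```sql
-- Raise the per-column target explicitly; the next ANALYZE then keeps up to
-- 250 most-common values / histogram buckets for this column.
ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 250;
ANALYZE tbljobtitle;

-- Or raise the default used by ANALYZE for every column that has no explicit
-- per-column setting (per Tom Lane's correction, this affects existing
-- tables too, not only newly created ones).
SET default_statistics_target = 250;   -- per-session; edit postgresql.conf to persist
ANALYZE tbljobtitle;

-- Compare the plans before and after:
EXPLAIN ANALYZE SELECT * FROM tbljobtitle WHERE clientnum = 'SOMEVALUE';
```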
Re: [PERFORM] How to improve db performance with $7K?
On 4/14/05, Tom Lane <[EMAIL PROTECTED]> wrote:
> That's basically what it comes down to: SCSI lets the disk drive itself
> do the low-level I/O scheduling whereas the ATA spec prevents the drive
> from doing so (unless it cheats, ie, caches writes). Also, in SCSI it's
> possible for the drive to rearrange reads as well as writes --- which
> AFAICS is just not possible in ATA. (Maybe in the newest spec...)
>
> The reason this is so much more of a win than it was when ATA was
> designed is that in modern drives the kernel has very little clue about
> the physical geometry of the disk. Variable-size tracks, bad-block
> sparing, and stuff like that make for a very hard-to-predict mapping
> from linear sector addresses to actual disk locations. Combine that
> with the fact that the drive controller can be much smarter than it was
> twenty years ago, and you can see that the case for doing I/O scheduling
> in the kernel and not in the drive is pretty weak.

So if you all were going to choose between two hard drives, where drive A has capacity C and spins at 15K rpm, drive B has capacity 2 x C and spins at 10K rpm, all other features are the same, the price is the same, and C is enough disk space: which would you choose?

I've noticed that on IDE drives, as the capacity increases the data density increases, and there is a perceived (I've not measured it) performance increase. Would the increased data density of the higher-capacity drive be of greater benefit than the faster spindle speed of drive A?

--
Matthew Nuzum <[EMAIL PROTECTED]>
www.followers.net - Makers of Elite Content Management System
View samples of Elite CMS in action by visiting http://www.followers.net/portfolio/
[PERFORM] recovery after long delete
Hi. Our professor told us the following story: Oracle. A client issued a selective DELETE statement on a big table. After two days he lost patience and pulled the plug. Unfortunately, while starting up, Oracle had to restore all the deleted rows, which took it another two days. He reasoned that one is better off copying all rows that are not to be deleted into another table and dropping the original table afterwards. (Concurrency, FKs and indexes are not the question here.)

So I wondered how this works in PostgreSQL. As I understand it, what's going on is the following:

1. Transaction 45 is started. It is recorded as in-progress.
2. The rows selected in the DELETE statement are one by one marked as to-be-deleted by txn 45. Among them row 27.
3. If a concurrently running read-committed txn 47 wants to update row 27, it blocks, awaiting whether txn 45 commits or aborts.
4.1. If txn 45 commits, it is marked as such.
5.1. txn 47 can continue, but as row 27 was deleted, it is not affected by txn 47's update statement.
4.2. If txn 45 aborts, it is marked as such. This means the same as not being marked at all.
5.2. txn 47 continues and updates row 27.

Now if you pull the plug after 2, at startup, pg will go through the in-progress txns and mark them as aborted. That's all the recovery in this case. All rows are still there. O(1).

How does Oracle do that? Has all this something to do with MVCC? Why does it take Oracle so long to recover?

Thanks

Markus

-- Markus Bertheau <[EMAIL PROTECTED]>
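The steps above can be sketched as two concurrent psql sessions (a hedged illustration; the table name, column names and values are hypothetical):

```sql
-- Session 1 (becomes txn 45):
BEGIN;
DELETE FROM big_table WHERE category = 'old';   -- rows marked as deleted by txn 45

-- Session 2 (becomes txn 47, default READ COMMITTED):
BEGIN;
UPDATE big_table SET note = 'touched' WHERE id = 27;
-- ...blocks here, waiting to see whether session 1 commits or aborts.

-- Session 1:
ROLLBACK;   -- txn 45 is marked aborted; its deletions simply become invisible

-- Session 2 now proceeds, updates row 27, and can COMMIT.
COMMIT;
```

If session 1 had committed instead, session 2's UPDATE would have matched zero rows, as in step 5.1. A crash between the DELETE and the COMMIT behaves like the ROLLBACK: the transaction is marked aborted at startup, with no row-by-row undo.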
Re: [PERFORM] Foreign key slows down copy/insert
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> No it certainly won't warn you. You have _avoided_ the check entirely.
> That's why I was warning you...
>
> If you wanted to be really careful, you could:

Probably the better bet is to drop and re-add the FK constraint.

regards, tom lane
Re: [PERFORM] Foreign key slows down copy/insert
My problem with this really is that in my database it is hard to predict which inserts will be huge (and thus need FKs disabled), so I would have to code around all inserts. Instead I can code my own integrity logic and avoid using FKs altogether.

Just drop the FK and re-add it, until postgres gets more smarts.

Chris
Re: [PERFORM] Foreign key slows down copy/insert
Christopher Kings-Lynne wrote:
> No it certainly won't warn you. You have _avoided_ the check entirely.
> That's why I was warning you...

I figured as much when I realized it was just a simple table update. I was thinking more of a DB2-style "set integrity" command.

> If you wanted to be really careful, you could:

So I will be re-checking my own FKs. That's not really what I'd expect from a FK.

My problem with this really is that in my database it is hard to predict which inserts will be huge (and thus need FKs disabled), so I would have to code around all inserts. Instead I can code my own integrity logic and avoid using FKs altogether.

Thanks,

--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a    | www.trust-factory.com
2518AK The Hague   | Phone: +31 70 3620684
The Netherlands    | Fax  : +31 70 3603009
--- Have you visited our new DNA Portal? ---
Re: [PERFORM] Foreign key slows down copy/insert
> Thanks for the pointer. I got this from the archives:
>
> update pg_class set reltriggers=0 where relname = 'YOUR_TABLE_NAME';
>
> To enable them after you are done, do
>
> update pg_class set reltriggers = count(*) from pg_trigger where
> pg_class.oid=tgrelid and relname='YOUR_TABLE_NAME';
>
> I assume the re-enabling will cause an error when the copy/insert added
> data that does not satisfy the FK. In that case I'll indeed end up with
> invalid data, but at least I will know about it.

No, it certainly won't warn you. You have _avoided_ the check entirely. That's why I was warning you...

If you wanted to be really careful, you could:

begin;
lock tables for writes...
turn off triggers
insert
delete where rows don't match fk constraint
turn on triggers
commit;

Chris
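The "delete where rows don't match fk constraint" step above can be sketched as an anti-join. This is a hedged example: the table names a and b and column id are taken from the thread's test case, and the exact form is an assumption, not the poster's code:

```sql
-- Report rows in a whose id has no match in b (i.e. rows that would
-- violate the foreign key):
SELECT a.id
FROM a LEFT JOIN b ON a.id = b.id
WHERE b.id IS NULL;

-- Or remove them before turning the triggers back on:
DELETE FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id);
```

This set-based join is essentially the same check ADD CONSTRAINT performs on the whole table, which is why it is so much faster than a million per-row trigger checks.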
Re: [PERFORM] Intel SRCS16 SATA raid?
Sorry, I don't remember whether it's SCSI or SATA II, but IIRC the Areca controllers are just stellar.

If you do get SATA for db stuff, especially multiuser, I still haven't seen anything to indicate an across-the-board primacy for SATA over SCSI. I'd go with SCSI, or if SATA for $$$ reasons, I'd be sure to have many spindles and RAID 10.

My 0.02. I'm surely not an expert of any kind.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 10:55 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Intel SRCS16 SATA raid?

Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a battery backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives.

--
greg
Re: [PERFORM] Foreign key slows down copy/insert
Christopher Kings-Lynne wrote:
>> But why then is the speed acceptable if I copy and then manually add
>> the FK? Is the check done by the FK so much different from when it is
>> done automatically using an active deferred FK?
>
> Yeah I think it uses a different query formulation... Actually I only
> assume that deferred fk's don't use that - I guess your experiment
> proves that.

In my tests, deferred or not deferred makes no difference in speed. I am still quite surprised by how huge the difference is; this makes FKs quite unusable when adding a lot of data to a table.

> Actually, you can just "disable" them if you want to be really dirty :)

Thanks for the pointer. I got this from the archives:

update pg_class set reltriggers=0 where relname = 'YOUR_TABLE_NAME';

To enable them after you are done, do

update pg_class set reltriggers = count(*) from pg_trigger where pg_class.oid=tgrelid and relname='YOUR_TABLE_NAME';

I assume the re-enabling will cause an error when the copy/insert added data that does not satisfy the FK. In that case I'll indeed end up with invalid data, but at least I will know about it.

Thanks,

--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a    | www.trust-factory.com
2518AK The Hague   | Phone: +31 70 3620684
The Netherlands    | Fax  : +31 70 3603009
---
Re: [PERFORM] How to improve db performance with $7K?
Imagine a system in "furious activity" with two (2) processes regularly occurring:

Process One: Long read (or write). Takes 20ms to do seek, latency, and stream off. Runs over and over.

Process Two: Single-block read (or write). Typical database row access. Optimally, could be submillisecond. Happens more or less randomly.

Let's say process one starts, and then process two. Assume, for the sake of this discussion, that P2's block lies within P1's swath. (But it doesn't have to...) Now, every time, process two has to wait at LEAST 20ms to complete. In a queue-reordering system, it could be a lot faster. And me, looking for disk service times on P2, I keep wondering "why does a single disk-block read keep taking >20ms?"

It doesn't need to be "a read" or "a write". It doesn't need to be "furious activity" (two processes is not furious, even for a single-user desktop). This is not a "corner case", and while it doesn't take into account kernel/drive-cache/UBC buffering issues, I think it shines a light on why command re-ordering might be useful. YMMV.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Greg Stark wrote:
> I think you're being misled by analyzing the write case.
>
> Consider the read case. When a user process requests a block and that
> read makes its way down to the driver level, the driver can't just put
> it aside and wait until it's convenient. It has to go ahead and issue
> the read right away.

Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write
> requests it just has to sit on them and wait. 10ms is a lifetime for a
> computer. In that time dozens of other processes could have been
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes from an essentially idle state to one of furious activity. In other words, it's a corner case that I strongly suspect isn't typical in situations where SCSI has historically made a big difference. Once the first request has been fulfilled, the driver can then schedule the rest of the queued-up requests in disk-layout order.

I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. In the case of SCSI, the queueing happens on the disks (or at least on the controller). In the case of SATA, the queueing happens in the kernel.

I suppose the tagged queueing setup could begin the head movement and, if another request comes in that requests a block on a cylinder between where the head currently is and where it's going, go ahead and read the block in question. But is that *really* what happens in a tagged queueing system? It's the only major advantage I can see it having.

> The same thing would happen if you had lots of processes issuing lots
> of small fsynced writes all over the place. Postgres doesn't really do
> that though. It sort of does with the WAL logs, but that shouldn't
> cause a lot of seeking. Perhaps it would mean that having your WAL
> share a spindle with other parts of the OS would have a bigger penalty
> on IDE drives than on SCSI drives though?

Perhaps. But I rather doubt that has to be a huge penalty, if any. When a process issues an fsync (or even a sync), the kernel doesn't *have* to drop everything it's doing and get to work on it immediately. It could easily gather a few more requests, bundle them up, and then issue them. If there's a lot of disk activity, it's probably smart to do just that. All fsync and sync require is that the caller block until the data hits the disk (from the point of view of the kernel). The specification doesn't require that the kernel act on the calls immediately or write only the blocks referred to by the call in question.

--
Kevin Brown [EMAIL PROTECTED]
Re: [PERFORM] Foreign key slows down copy/insert
Stephan Szabo <[EMAIL PROTECTED]> writes:
> ... At some point, if we can work out how to do all the semantics
> properly, it'd probably be possible to replace the insert type check with
> a per-statement check which would be somewhere in between. That requires
> access to the affected rows inside the trigger which I don't believe is
> available currently.

Not necessarily. It occurs to me that maybe what we need is "lossy storage" of the trigger events. If we could detect that the queue of pending checks for a particular FK is getting large, we could discard the whole queue and replace it with one entry that says "run the wholesale check again when we are ready to fire triggers". I'm not sure how to detect this efficiently, though --- the trigger manager doesn't presently know anything about FKs being different from any other kind of trigger.

regards, tom lane
[PERFORM] Intel SRCS16 SATA raid?
Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a battery backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives.

--
greg
Re: [PERFORM] How to improve db performance with $7K?
while you weren't looking, Kevin Brown wrote: [reordering bursty reads] > In other words, it's a corner case that I strongly suspect > isn't typical in situations where SCSI has historically made a big > difference. [...] > But I rather doubt that has to be a huge penalty, if any. When a > process issues an fsync (or even a sync), the kernel doesn't *have* to > drop everything it's doing and get to work on it immediately. It > could easily gather a few more requests, bundle them up, and then > issue them. To make sure I'm following you here, are you or are you not suggesting that the kernel could sit on -all- IO requests for some small handful of ms before actually performing any IO to address what you "strongly suspect" is a "corner case"? /rls -- :wq ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Foreign key slows down copy/insert
On Thu, 14 Apr 2005, Richard van den Berg wrote:
> Hello Chris,
>
> Thanks for your answers.
>
> Christopher Kings-Lynne wrote:
> > Deferring makes no difference to FK checking speed...
>
> But why then is the speed acceptable if I copy and then manually add the
> FK? Is the check done by the FK so much different from when it is done
> automatically using an active deferred FK?

Yes, because currently the check done by the FK on insert-type activity is a per-inserted-row check, while the check done when adding a FK acts on the entire table in one go, which allows better optimization of that case (while generally being worse for small numbers of inserts, especially on large tables). At some point, if we can work out how to do all the semantics properly, it'd probably be possible to replace the insert-type check with a per-statement check which would be somewhere in between. That requires access to the affected rows inside the trigger, which I don't believe is available currently.
Re: [PERFORM] speed of querry?
Josh from commandprompt.com had me alter the config to have default_statistics_target = 250.

Is this somehow related to what you're asking me to do? I did do an analyze, but have only run the view a few times.

Joel Fradkin

-Original Message-
From: Dawid Kuroczko [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 14, 2005 9:21 AM
To: Joel Fradkin
Cc: PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry?

On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> I have done a vacuum and a vacuum analyze.
> I can try again for kicks, but it is not in production so no new records are
> added and vacuum analyze is run after any mods to the indexes.
>
> I am still pursuing Dell on why the monster box is so much slower than the
> desktop as well.

First thing: Do something like:

ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50;

Do it for each column used; make it even higher than 50 for many-valued columns. THEN run VACUUM ANALYZE.

Then run the query a couple of times (EXPLAIN ANALYZE also :)), then do:

SET enable_seqscan = off;

and rerun the query -- if it was significantly faster, you will want to do:

SET enable_seqscan = on;

and tweak:

SET random_page_cost = 2.1;

...and play with values. When you reach the random_page_cost which suits your data, you will want to put it into postgresql.conf

I am sorry if this is already known to you. :) Also, it is a rather simplistic approach to tuning PostgreSQL, but it is worth doing. Especially the statistics part. :)

Regards,
Dawid
Re: [PERFORM] How to improve db performance with $7K?
Kevin Brown <[EMAIL PROTECTED]> writes:
> I really don't see how this is any different between a system that has
> tagged queueing to the disks and one that doesn't. The only
> difference is where the queueing happens. In the case of SCSI, the
> queueing happens on the disks (or at least on the controller). In the
> case of SATA, the queueing happens in the kernel.

That's basically what it comes down to: SCSI lets the disk drive itself do the low-level I/O scheduling whereas the ATA spec prevents the drive from doing so (unless it cheats, ie, caches writes). Also, in SCSI it's possible for the drive to rearrange reads as well as writes --- which AFAICS is just not possible in ATA. (Maybe in the newest spec...)

The reason this is so much more of a win than it was when ATA was designed is that in modern drives the kernel has very little clue about the physical geometry of the disk. Variable-size tracks, bad-block sparing, and stuff like that make for a very hard-to-predict mapping from linear sector addresses to actual disk locations. Combine that with the fact that the drive controller can be much smarter than it was twenty years ago, and you can see that the case for doing I/O scheduling in the kernel and not in the drive is pretty weak.

regards, tom lane
Re: [PERFORM] Foreign key slows down copy/insert
>> Deferring makes no difference to FK checking speed...
>
> But why then is the speed acceptable if I copy and then manually add
> the FK? Is the check done by the FK so much different from when it is
> done automatically using an active deferred FK?

Yeah, I think it uses a different query formulation... Actually I only assume that deferred fk's don't use that - I guess your experiment proves that.

>> Well, that's what people do - even pg_dump will restore data and add
>> the foreign key afterward...
>
> If I have to go this route, is there a way of automatically dropping
> and re-adding FKs? I can probably query pg_constraint and drop the
> appropriate ones, but how do I re-add them after the copy/insert?

Actually, you can just "disable" them if you want to be really dirty :) You have to be confident that the data you're inserting does satisfy the FK, however, otherwise you can end up with invalid data.

To see how to do that, try pg_dump with --disable-triggers mode enabled. Just do a data-only dump.

Chris
Re: [PERFORM] Foreign key slows down copy/insert
Hello Chris,

Thanks for your answers.

Christopher Kings-Lynne wrote:
> Deferring makes no difference to FK checking speed...

But why then is the speed acceptable if I copy and then manually add the FK? Is the check done by the FK so much different from when it is done automatically using an active deferred FK?

> Well, that's what people do - even pg_dump will restore data and add
> the foreign key afterward...

If I have to go this route, is there a way of automatically dropping and re-adding FKs? I can probably query pg_constraint and drop the appropriate ones, but how do I re-add them after the copy/insert?

Sincerely,

--
Richard van den Berg, CISSP
---
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a    | www.trust-factory.com
2518AK The Hague   | Phone: +31 70 3620684
The Netherlands    | Fax  : +31 70 3603009
---
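A hedged sketch of that drop/re-add cycle using the pg_constraint catalog. The table names a and b come from the thread's test case; the constraint name a_id_fkey is hypothetical, and pg_get_constraintdef() is used to capture each definition before dropping it:

```sql
-- Save the definitions of the FK constraints on table a before dropping them:
SELECT conname, pg_get_constraintdef(oid) AS condef
FROM pg_constraint
WHERE conrelid = 'a'::regclass AND contype = 'f';

-- Drop each one (example for a constraint named a_id_fkey):
ALTER TABLE a DROP CONSTRAINT a_id_fkey;

-- ...bulk COPY into a here...

-- Re-add using the saved definition; this runs the fast whole-table check:
ALTER TABLE a ADD CONSTRAINT a_id_fkey FOREIGN KEY (id) REFERENCES b(id);
```

The SELECT's condef column is the exact text to paste after `ALTER TABLE a ADD CONSTRAINT conname`, so the cycle can be scripted for any number of FKs.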
Re: [PERFORM] speed of querry?
On 4/14/05, Joel Fradkin <[EMAIL PROTECTED]> wrote:
> I have done a vacuum and a vacuum analyze.
> I can try again for kicks, but it is not in production so no new records are
> added and vacuum analyze is run after any mods to the indexes.
>
> I am still pursuing Dell on why the monster box is so much slower than the
> desktop as well.

First thing: Do something like:

ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50;

Do it for each column used; make it even higher than 50 for many-valued columns. THEN run VACUUM ANALYZE.

Then run the query a couple of times (EXPLAIN ANALYZE also :)), then do:

SET enable_seqscan = off;

and rerun the query -- if it was significantly faster, you will want to do:

SET enable_seqscan = on;

and tweak:

SET random_page_cost = 2.1;

...and play with values. When you reach the random_page_cost which suits your data, you will want to put it into postgresql.conf

I am sorry if this is already known to you. :) Also, it is a rather simplistic approach to tuning PostgreSQL, but it is worth doing. Especially the statistics part. :)

Regards,
Dawid
Re: [PERFORM] Foreign key slows down copy/insert
> I am new to cross references between tables, and I am trying to
> understand how they impact performance. From reading the
> documentation I was under the impression that deferring foreign keys
> would yield about the same performance as dropping them before a
> copy, and adding them after. However, I cannot see this in my test
> case.

Even if you defer them, it just defers the check, doesn't eliminate
it...

> I have a table A with an int column ID that references table B
> column ID. Table B has about 150k rows, and has an index on B.ID.
> When trying to copy 1 million rows into A, I get the following
> \timings:
>
> 1) drop FK, copy (200s), add FK (5s)
> 2) add FK deferrable initially deferred, copy (I aborted after 30min)
> 3) add FK deferrable initially deferred, begin, copy (200s), commit
>    (I aborted after 30min)
>
> How do I explain why test cases 2 and 3 do not come close to case 1?
> Am I missing something obvious?

Deferring makes no difference to FK checking speed...

> Since the database I am working on has many FKs, I would rather not
> have to drop/add them when I am loading large data sets.

Well, that's what people do - even pg_dump will restore data and add
the foreign key afterward...

Chris
Re: [PERFORM] Use of data within indexes
> To be more explicit, let's say I have a table with two fields a and
> b. If I have an index on (a, b) and I do a request like "SELECT b
> FROM table WHERE a = x", will PostgreSQL use only the index, or will
> it need to also read the table page for that (those) row(s)?

It must read the table because of visibility considerations.

> There might be a reason why this is not possible (I don't know if
> the indexes have all the necessary transaction ID information?) but
> otherwise this could possibly provide an interesting performance
> gain for some operations, in particular with some types of joins. Or
> maybe it already does it.

It's already been thought of :)

The four or so columns that store visibility information are not in
the indexes; to put them there would incur a significant write cost.

Chris
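This is easy to see with EXPLAIN. A minimal sketch (table, index, and data are made up; `generate_series()` assumes 8.0 or later):

```sql
CREATE TABLE t (a integer, b integer);
CREATE INDEX t_a_b_idx ON t (a, b);
INSERT INTO t SELECT i, i FROM generate_series(1, 100000) AS s(i);
ANALYZE t;

EXPLAIN SELECT b FROM t WHERE a = 42;
-- Expect an "Index Scan using t_a_b_idx". Even though b is stored in
-- the index, each matching row is still fetched from the heap so that
-- its visibility can be checked against the current transaction.
```

So the index narrows the search, but it cannot by itself answer the query.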
Re: [PERFORM] speed of querry?
I have done a vacuum and a vacuum analyze. I can try again for kicks,
but it is not in production so no new records are added and vacuum
analyze is run after any mods to the indexes.

I am still pursuing Dell on why the monster box is so much slower than
the desktop as well.

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
(C) 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 14, 2005 1:47 AM
To: Richard Huxton
Cc: Joel Fradkin; PostgreSQL Perform
Subject: Re: [PERFORM] speed of querry?

Richard Huxton writes:
> In the first, we match outer.clientnum to inner.clientnum; in the
> second it's "?column10?" - are you sure the query was identical in
> each case? I'm guessing the unidentified column in query 2 is the
> reason for the sort a couple of lines below it, which seems to take
> up a large chunk of time.

The "?column10?" is because EXPLAIN isn't excessively bright about
reporting references to outputs of lower plan nodes. (Gotta fix that
sometime.) The real point here is that the planner thought that a scan
plus sort would be faster than scanning an index that exactly matched
the sort order the Merge Join needed ... and it was wrong :-(

So this is just the usual sort of question of "are your stats up to
date, maybe you need to increase stats targets, or else play with
random_page_cost, etc" ...
regards, tom lane
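Concretely, the advice above might look like the sketch below. The statistics target and `tbljobtitle.clientnum` come from earlier in the thread; the join query and the second table are made-up stand-ins, since the actual query was not posted.

```sql
-- Raise the statistics target for the join column and re-analyze:
ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 100;
ANALYZE tbljobtitle;

-- Re-check the plan (hypothetical query):
EXPLAIN ANALYZE
SELECT *
FROM tbljobtitle j
JOIN tblemployee e ON e.clientnum = j.clientnum;

-- If the planner still prefers a seqscan + sort over the matching
-- index, try lowering random_page_cost for the session and re-run:
SET random_page_cost = 2;
```

If a lower `random_page_cost` consistently produces the better plan, it can be made permanent in postgresql.conf.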
[PERFORM] Foreign key slows down copy/insert
I am new to cross references between tables, and I am trying to
understand how they impact performance. From reading the documentation
I was under the impression that deferring foreign keys would yield
about the same performance as dropping them before a copy, and adding
them after. However, I cannot see this in my test case.

I have a table A with an int column ID that references table B column
ID. Table B has about 150k rows, and has an index on B.ID. When trying
to copy 1 million rows into A, I get the following \timings:

1) drop FK, copy (200s), add FK (5s)
2) add FK deferrable initially deferred, copy (I aborted after 30min)
3) add FK deferrable initially deferred, begin, copy (200s), commit
   (I aborted after 30min)

How do I explain why test cases 2 and 3 do not come close to case 1?
Am I missing something obvious? Since the database I am working on has
many FKs, I would rather not have to drop/add them when I am loading
large data sets.

If it would help I can write this out in a reproducible scenario. I am
using PostgreSQL 7.4.5 at the moment.

Sincerely,

--
Richard van den Berg, CISSP
-------------------------------------------
Trust Factory B.V. | www.dna-portal.net
Bazarstraat 44a    | www.trust-factory.com
2518 AK The Hague  | Phone: +31 70 3620684
The Netherlands    | Fax:   +31 70 3603009
-------------------------------------------
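A sketch of what such a reproducible scenario might look like (table names follow the description above; the data file path is hypothetical, and `generate_series()` assumes 8.0 rather than the poster's 7.4.5):

```sql
CREATE TABLE b (id integer PRIMARY KEY);
INSERT INTO b SELECT i FROM generate_series(1, 150000) AS s(i);

CREATE TABLE a (id integer);
\timing

-- Case 1: load with no FK, add it afterward.
COPY a FROM '/tmp/a.dat';            -- 1 million rows referencing b
ALTER TABLE a ADD CONSTRAINT a_id_fkey
  FOREIGN KEY (id) REFERENCES b (id);

-- Case 3: deferred FK, load inside one transaction.
ALTER TABLE a DROP CONSTRAINT a_id_fkey;
TRUNCATE a;
ALTER TABLE a ADD CONSTRAINT a_id_fkey
  FOREIGN KEY (id) REFERENCES b (id) DEFERRABLE INITIALLY DEFERRED;
BEGIN;
COPY a FROM '/tmp/a.dat';
COMMIT;                              -- per-row FK checks fire here
```

The deferred case still runs one check per inserted row at commit, whereas `ADD CONSTRAINT` after the load validates the whole table in a single pass, which is consistent with the timings reported above.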
[PERFORM] Use of data within indexes
Hi,

Just wondering... Is PostgreSQL able to use data present within
indexes without looking up the table data?

To be more explicit, let's say I have a table with two fields a and b.
If I have an index on (a, b) and I do a request like "SELECT b FROM
table WHERE a = x", will PostgreSQL use only the index, or will it
need to also read the table page for that (those) row(s)?

There might be a reason why this is not possible (I don't know if the
indexes have all the necessary transaction ID information?) but
otherwise this could possibly provide an interesting performance gain
for some operations, in particular with some types of joins. Or maybe
it already does it.

Any hint welcome!

Thanks,
Jacques.
Re: [PERFORM] How to improve db performance with $7K?
Greg Stark wrote:
> I think you're being misled by analyzing the write case.
>
> Consider the read case. When a user process requests a block and
> that read makes its way down to the driver level, the driver can't
> just put it aside and wait until it's convenient. It has to go ahead
> and issue the read right away.

Well, strictly speaking it doesn't *have* to. It could delay for a
couple of milliseconds to see if other requests come in, and then
issue the read if none do. If there are already other requests being
fulfilled, then it'll schedule the request in question just like the
rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write
> requests it just has to sit on them and wait. 10ms is a lifetime for
> a computer. In that time dozens of other processes could have been
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the
system goes from an essentially idle state to one of furious activity.
In other words, it's a corner case that I strongly suspect isn't
typical in situations where SCSI has historically made a big
difference.

Once the first request has been fulfilled, the driver can now schedule
the rest of the queued-up requests in disk-layout order. I really
don't see how this is any different between a system that has tagged
queueing to the disks and one that doesn't. The only difference is
where the queueing happens: in the case of SCSI, the queueing happens
on the disks (or at least on the controller), while in the case of
SATA, the queueing happens in the kernel.

I suppose the tagged queueing setup could begin the head movement and,
if another request comes in that requests a block on a cylinder
between where the head currently is and where it's going, go ahead and
read the block in question. But is that *really* what happens in a
tagged queueing system? It's the only major advantage I can see it
having.
> The same thing would happen if you had lots of processes issuing
> lots of small fsynced writes all over the place. Postgres doesn't
> really do that though. It sort of does with the WAL logs, but that
> shouldn't cause a lot of seeking. Perhaps it would mean that having
> your WAL share a spindle with other parts of the OS would have a
> bigger penalty on IDE drives than on SCSI drives though?

Perhaps. But I rather doubt that has to be a huge penalty, if any.
When a process issues an fsync (or even a sync), the kernel doesn't
*have* to drop everything it's doing and get to work on it
immediately. It could easily gather a few more requests, bundle them
up, and then issue them. If there's a lot of disk activity, it's
probably smart to do just that.

All fsync and sync require is that the caller block until the data
hits the disk (from the point of view of the kernel). The
specification doesn't require that the kernel act on the calls
immediately or write only the blocks referred to by the call in
question.

--
Kevin Brown [EMAIL PROTECTED]