Re: [PERFORM] Arguments Pro/Contra Software Raid
William Yu wrote:
> We upgraded our disk system for our main data processing server earlier this year. After pricing out all the components, basically we had the choice of: LSI MegaRAID 320-2 w/ 1GB RAM+BBU + 8 15K 150GB SCSI or Areca 1124 w/ 1GB RAM+BBU + 24 7200RPM 250GB SATA

My mistake -- I keep doing calculations and they don't add up. So I looked again on Pricewatch and it turns out the actual comparison was for 4 SCSI drives, not 8! ($600 for a 15K 145GB versus $90 for a 7200RPM 250GB.) No wonder our decision seemed so much more decisive back then.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Worsening performance with 7.4 on flash-based system
Usually when simple queries take a long time to run, it's the system tables (pg_*) that have become bloated and need vacuuming. But that's just a random guess on my part w/o more detailed info.

Greg Stumph wrote:
> Well, since I got no response at all to this message, I can only assume that I've asked the question in an insufficient way, or else that no one has anything to offer on our problem. This was my first post to the list, so if there's a better way I should be asking this, or different data I should provide, hopefully someone will let me know... Thanks, Greg
>
> Greg Stumph [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
>> We are experiencing gradually worsening performance in PostgreSQL 7.4.7, on a system with the following specs: Linux OS (Fedora Core 1, 2.4 kernel), flash file system (2 GB, about 80% full), 256 MB RAM, 566 MHz Celeron CPU. We use ORBit 2.9.8 to access Postgres. The database contains 62 tables. When the system is running with a fresh copy of the database, performance is fine. At its worst, we are seeing fairly simple SELECT queries taking up to 1 second to execute. When these queries are run in a loop, the loop can take up to 30 seconds to execute, instead of the 2 seconds or so that we would expect. VACUUM FULL ANALYZE helps, but doesn't seem to eliminate the problem. The following table shows average execution time in bad-performance mode in the first column, execution time after VACUUM ANALYZE in the second column, and % improvement (or degradation?) in the third. The fourth column shows the query that was executed.
bad (ms)  | after VACUUM (ms) | % change | query
 741.831  |  582.038  |  -21.5 | DECLARE table_cursor
 170.065  |   73.032  |  -57.1 | FETCH ALL in table_cursor
  41.953  |   45.513  |    8.5 | CLOSE table_cursor
  61.504  |   47.374  |  -23.0 | SELECT last_value FROM pm_id_seq
  39.651  |   46.454  |   17.2 | select id from la_looprunner
1202.170  |  265.316  |  -77.9 | select id from rt_tran
 700.669  |  660.746  |   -5.7 | DECLARE my_tran_load_cursor
1192.182  |   47.258  |  -96.0 | FETCH ALL in my_tran_load_cursor
 181.934  |   89.752  |  -50.7 | CLOSE my_tran_load_cursor
 487.285  |  873.474  |   79.3 | DECLARE my_get_router_cursor
  51.543  |   69.950  |   35.7 | FETCH ALL in my_get_router_cursor
  48.312  |   74.061  |   53.3 | CLOSE my_get_router_cursor
 814.051  | 1016.219  |   24.8 | SELECT $1 = 'INSERT'
  57.452  |   78.863  |   37.3 | select id from op_sched
  48.010  |  117.409  |  144.6 | select short_name, long_name from la_loopapp
  54.425  |   58.352  |    7.2 | select id from cd_range
  45.289  |   52.330  |   15.5 | SELECT last_value FROM rt_tran_id_seq
  39.658  |   82.949  |  109.2 | SELECT last_value FROM op_sched_id_seq
  42.158  |   68.189  |   61.7 | select card_id,router_id from rt_valid

Has anyone else seen gradual performance degradation like this? Would upgrading to Postgres 8 help? Any other thoughts on directions for troubleshooting this? Thanks...
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
[EMAIL PROTECTED] wrote:
> I have an Intel Pentium D 920, and an AMD X2 3800+. These are very close in performance. The retail price difference is: Intel Pentium D 920 is selling for $310 CDN; AMD X2 3800+ is selling for $347 CDN. Anybody who claims that Intel is 2X more expensive for the same performance isn't considering all factors. No question at all - the Opteron is good, and the Xeon isn't - but the original poster didn't ask about Opteron or Xeon, did he? For the desktop lines - X2 is not double Pentium D. Maybe 10%. Maybe not at all. Especially now that Intel is dropping its prices due to overstock.

There's part of the equation you are missing here. This is a PostgreSQL mailing list, which means we're usually talking about performance of just this specific server app. While in general there may not be that much of a % difference between the 2 chips, there's a huge gap in Postgres. For whatever reason, Postgres likes Opterons -- way more than Intel P4-architecture chips. (And it appears way more than IBM Power4 chips and a host of other chips also.) Here's one of the many discussions we had about this issue last year: http://qaix.com/postgresql-database-development/337-670-re-opteron-vs-xeon-was-what-to-do-with-6-disks-read.shtml

The exact reasons why Opteron runs PostgreSQL so much better than P4s, we're not 100% sure of. We have guesses -- lower memory latency, lack of a shared FSB, better 64-bit support, a 64-bit IOMMU, context-switch storms on P4, a better dual-core implementation and so on. Perhaps it's a combination of all the above factors, but somehow the general experience people have had is that equivalently priced Opteron servers run PostgreSQL 2X faster than P4 servers as the baseline, and the gap increases as you add more sockets and more cores.
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
David Boreham wrote:
> It isn't only Postgres. I work on a number of other server applications that also run much faster on Opterons than the published benchmark figures would suggest they should. They're all compiled with gcc4, so possibly there's a compiler issue. I don't run Windows on any of our Opteron boxes so I can't easily compare using the MS compiler.

Maybe it's just a fact that the majority of x86 64-bit development for open source software happens on Opteron/A64 machines. 64-bit AMD machines were selling a good year before 64-bit Intel machines were available. And even after Intel EM64T chips were available, anybody in their right mind would have picked AMD machines over Intel due to cost/heat/performance. So you end up with 64-bit OSS being developed/optimized for Opterons, with the 10% running Intel EM64T handling compatibility issues. Would be interesting to see a survey of what machines OSS developers use to write/test/optimize their code.
Re: [PERFORM] 3WARE Card performance boost?
Benjamin Arai wrote:
> Obviously, I have done this to improve write performance for the update each week. My question is if I install a 3ware or similar card to replace my current software RAID 1 configuration, am I going to see a very large improvement? If so, what would be a ball park figure?

The key is getting a card with the ability to upgrade the onboard RAM. Our previous setup was a LSI MegaRAID 320-1 (128MB), 4xRAID10, fsync=off. We replaced it with an ARC-1170 (1GB) w/ 24x7200RPM SATA2 drives (split into 3 8-drive RAID6 arrays) and performance for us is through the ceiling. For OLTP-type updates, we've gotten about an +80% increase. For massive 1-statement updates, the performance increase is in the +triple digits.
Re: [PERFORM] 3WARE Card performance boost?
Steinar H. Gunderson wrote:
> On Wed, Jan 18, 2006 at 01:58:09PM -0800, William Yu wrote:
>> The key is getting a card with the ability to upgrade the onboard RAM. Our previous setup was a LSI MegaRAID 320-1 (128MB), 4xRAID10, fsync=off. We replaced it with an ARC-1170 (1GB) w/ 24x7200RPM SATA2 drives (split into 3 8-drive RAID6 arrays) and performance for us is through the ceiling.
>
> Well, the fact that you went from four to 24 disks would perhaps be a bigger factor than the amount of RAM...
>
> /* Steinar */

Actually no. Our 2xOpteron 244 server is NOT fast enough to drive an array this large. That's why we had to split it up into 3 different arrays. I tried all different RAID configs, and once past about 8 drives, I got the same performance no matter what because the CPU was pegged at 100%. Right now, 2 of the arrays are just mirroring each other because we can't seem to utilize the performance right now. (This also protects against cabling/power supply issues as we're using 3 separate external enclosures.)

The 1GB RAM is a much bigger factor because it almost completely hides the write activity. Looking at iostat while all our jobs are running, there's almost no disk activity. If I manually type sync, I see 1 big 250MB-500MB write storm for 2 seconds but otherwise, writes just slowly dribble out to disk.
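The write-back behavior described above can be sketched as a toy model: writes land in the controller's RAM and are acknowledged immediately, and dirty blocks only hit disk on a flush (the "write storm" seen after typing sync). All names here are hypothetical illustration, not a driver.

```python
# Toy write-back cache: overwrites of a still-dirty block are absorbed in
# controller RAM for free; physical writes only happen at flush time.

class WriteBackCache:
    def __init__(self, capacity_mb: int):
        self.capacity_mb = capacity_mb
        self.dirty = {}           # block number -> latest data for that block
        self.flushed_blocks = 0   # running count of physical writes

    def write(self, block: int, data: bytes) -> None:
        # Re-dirtying the same block costs nothing extra at the disk level.
        self.dirty[block] = data

    def sync(self) -> int:
        """Flush all dirty blocks; return how many physical writes occur."""
        n = len(self.dirty)
        self.flushed_blocks += n
        self.dirty.clear()
        return n

cache = WriteBackCache(capacity_mb=1024)
for i in range(1000):
    cache.write(i % 100, b"x" * 8192)   # 1000 logical writes to 100 blocks
print(cache.sync())                      # -> 100 physical writes
```

The point of the sketch: 1000 logical writes collapse into 100 physical ones, which is why iostat shows almost nothing between syncs.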
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
David Lang wrote:
> raid 5 is bad for random writes as you state, but how does it do for sequential writes (for example data mining where you do a large import at one time, but seldom do other updates).

I'm assuming a controller with a reasonable amount of battery-backed cache.

Random write performance (small block that only writes to 1 drive): 1 write requires N-1 reads + N writes, for an efficiency of 1/(2N-1).

Sequential write performance (a block big enough to use all N drives): N-1 drives' worth of data requires N writes, for an efficiency of (N-1)/N.

This assumes enough cache so all reads/writes are done in 1 transaction, plus an onboard processor that calcs RAID parity fast enough to not cause an extra delay.
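The arithmetic above can be sketched in a few lines. This follows the simplified model in the post (a small random write reads the other N-1 disks and rewrites all N; a full stripe carries N-1 disks' worth of data on N physical writes); the function names are mine.

```python
# Write efficiency under the simplified RAID5 model described above.

def random_write_efficiency(n: int) -> float:
    """1 useful write out of (N-1 reads + N writes) = 2N-1 disk ops."""
    return 1 / (2 * n - 1)

def sequential_write_efficiency(n: int) -> float:
    """N-1 data blocks carried by N physical writes."""
    return (n - 1) / n

for n in (4, 8, 24):
    print(f"N={n:2d}: random {random_write_efficiency(n):.3f}, "
          f"sequential {sequential_write_efficiency(n):.3f}")
```

The gap is the whole story: at N=8, a random write is ~7% efficient while a full-stripe sequential write is ~88% efficient, which is why RAID5 is fine for bulk imports but bad for OLTP.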
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
Luke Lonergan wrote:
> Note that host-based SCSI raid cards from LSI, Adaptec, Intel, Dell, HP and others have proven to have worse performance than a single disk drive in many cases, whether for RAID0 or RAID5. In most circumstances

This is my own experience. Running a LSI MegaRAID in pure passthrough mode + Linux software RAID10 is a ton faster than configuring the RAID via the LSI card. One of the things I've noticed is that the card does not seem to be able to do parallel reads on mirrors. While looking at iostat under Linux, I can see software RAID1 reading all drives, with the MD number adding up to the sum of all drives.

The Areca SATA controller I just got, though, doesn't seem to exhibit these problems. Performance is a few % points above Linux software RAID at lower CPU usage. In fact, I'm getting better single-threaded bandwidth on a 4x7200RPM SATA config versus a 6x15K SCSI config on the LSI. The drives are bigger for the SATA config (300GB versus 36GB for the SCSI), so the heads don't have to move anywhere near as much and can stay on the fast portion of the disk. Haven't had a chance to test a multi-user DB between the two setups though.
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
Juan Casero wrote:
> Can you elaborate on the reasons the opteron is better than the Xeon when it comes to disk io? I have a PostgreSQL 7.4.8 box running a DSS. One of our

Opterons have a 64-bit IOMMU -- Xeons don't. That means in 64-bit mode, for DMA transfers to memory above 4GB, the OS must allocate a buffer below 4GB, DMA to that block, and then the CPU must do the extra work of copying the memory up above 4GB. On the Opteron, it's done by the IO adaptor using DMA in the background.
Re: [PERFORM] 15,000 tables - next step
Michael Riess wrote:
>> Well, I'd think that's where your problem is. Not only do you have a (relatively speaking) small server -- you also share it with other very-memory-hungry services! That's not a situation I'd like to be in. Try putting Apache and Tomcat elsewhere, and leave the bulk of the 1GB to Postgres.
>
> No can do. I can try to switch to a 2GB machine, but I will not use several machines. Not for a 5GB database. ;-)
>
>> With 1500 shared buffers you are not really going anywhere -- you should have ten times that at the very least.
>
> Like I said - I tried to double the buffers and the performance did not improve in the least. And I also tried this on a 2GB machine, and swapping was not a problem. If I used 10x more buffers, I would in essence remove the OS buffers.

Increasing buffers does improve performance -- if you have enough memory. You just don't have enough memory to play with. My servers run w/ 10K buffers (128MB on 64-bit FC4) and it definitely runs better at 10K versus 1500.

With that many tables, your system catalogs are probably huge. To keep your system catalog from continually cycling in and out of buffers/OS cache/disk, you need a lot more memory. Ordinarily, I'd say the 500MB you have available for Postgres to cache 5GB is a workable ratio. My servers all have similar ratios of ~1:10 and they perform pretty well -- *except* when the system catalogs bloated due to lack of vacuuming on system tables. My app regularly creates and drops thousands of temporary tables, leaving a lot of dead rows in the system catalogs. (Nearly the same situation as you -- instead of 15K live tables, I had 200 live tables and tens of thousands of dead table records.) Even with almost 8GB of RAM dedicated to Postgres, performance on every single query -- no matter how small the table was -- took forever because the query planner had to spend a significant period of time scanning through my huge system catalogs to build the execution plan.
While my situation was fixable by scheduling a nightly vacuum/analyze on the system catalogs to get rid of the bazillion dead table/index entries, you have no choice but to get more memory so you can stuff your entire system catalog into buffers/OS cache. Personally, w/ 1GB of ECC RAM at ~$85, it's a no-brainer. Get as much memory as your server can support.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alan Stange wrote:
> Luke Lonergan wrote:
>> The aka iowait is the problem here - iowait is not idle (otherwise it would be in the idle column). Iowait is time spent waiting on blocking io calls. As another poster pointed out, you have a two CPU system, and during your scan, as
>
> iowait time is idle time. Period. This point has been debated endlessly for Solaris and other OS's as well.

I'm sure the theory is nice, but here's my experience with iowait just a minute ago. I run Linux/XFce as my desktop -- I decided I wanted to look up some stuff in Wikipedia under Mozilla, and my computer system became completely unusable for nearly a minute while who knows what Mozilla was doing. (Probably loading all the language packs.) I could not even switch to IRC (already loaded) to chat with other people while Mozilla was chewing up all my disk I/O.

So I went to another computer, connected to mine remotely (slow...) and checked top. 90% was in the "wa" column, which I assume is the iowait column. It may be idle in theory but it's not a very useful idle -- I wasn't able to switch to any programs already running, and couldn't click on the XFce launchbar to run any new programs.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Welty, Richard wrote:
>> David Boreham wrote:
>>> I guess I've never bought into the vendor story that there are two reliability grades. Why would they bother making two different kinds of bearing, motor etc? Seems like it's more likely an excuse to justify higher prices.
>
> then how to account for the fact that bleeding edge SCSI drives turn at twice the rpms of bleeding edge consumer drives?

The motors spin twice as fast? I'm pretty sure the original comment was based on drives w/ similar specs -- e.g. 7200RPM enterprise drives versus 7200RPM consumer drives. Next time one of my 7200RPM SCSIs fails, I'll take it apart and compare the insides to an older 7200RPM IDE from roughly the same era.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alex Turner wrote:
> Opteron 242 - $178.00
> Opteron 242 - $178.00
> Tyan S2882 - $377.50
> Total: $733.50
>
> Opteron 265 - $719.00
> Tyan K8E - $169.00
> Total: $888.00

You're comparing the wrong CPUs. The 265 is the dual-core counterpart of the 244, so you'll have to bump up the price more, although not enough to make a difference. Looks like the prices of the 2P motherboards have dropped since I last looked. Just a few months back, Tyan duals were $450-$500, which is what I was basing my "priced less" statement on.

> Tyan K8E - doesn't have any PCI-X, so it's also not apples to apples. In fact I couldn't find a single CPU slot board that did, so you pretty much have to buy a dual CPU board to get PCI-X.

You can get single CPU boards w/ PCIe and use PCIe controller cards. Probably expensive right now because they're so bleeding-edge new, but definitely on the downswing.

> a) OLTP - probably IO bound, large number of queries/sec updating info on _disks_, not requiring much CPU activity except to retrieve item information which is well indexed and normalized.

Not in my experience. I find on our OLTP servers, we run 98% in RAM and hence are 100% CPU-bound. Our DB is about 50GB in size now; servers run w/ 8GB of RAM. We were *very* CPU limited running 2x244. During busy hours of the day, our avg user transaction times were jumping from 0.8sec to 1.3+sec. Did the 2x265 and now we're always in the 0.7sec to 0.9sec range.

>> DC also gives you a better upgrade path. Let's say you do testing and figure 2x246 is the right setup to handle the load. Well instead of getting 2x1P, use the same 2P motherboard but only populate 1 CPU w/ a DC/270. Now you have a server that can be upgraded to +80% more CPU by popping in another DC/270 versus throwing out the entire thing to get a 4x1P setup.
>
> No argument there. But it's pointless if you are IO bound.

Why would you just accept "we're IO bound, nothing we can do"?
I'd do everything in my power to make my app go from IO bound to CPU bound -- whether by optimizing my code or buying more hardware. I can tell you if our OLTP servers were IO bound, they would run like crap. Instead of 1 sec, we'd be looking at 5-10 seconds per user transaction and our users would be screaming bloody murder.

In theory, you can always convert an IO bound DB to CPU bound by stuffing more and more RAM into your server. (Or by partitioning the DB across multiple servers.) Whether it's cost effective depends on the DB and how much your users are paying you -- and that's a case-by-case analysis, not a global statement of "IO bound, pointless."

>> (2) Does a DC system perform better than its Nx1P cousin? My experience is yes. Did some rough tests with a drop-in-replacement 1x265 versus 2x244 and saw about +10% for DC. All the official benchmarks (Spec, Java, SAP, etc) from AMD/Sun/HP/IBM show DCs outperforming the Nx1P setups.
>
> Maybe true, but the 265 does have a 25% faster FSB than the 244, which might perhaps play a role.

Nope. There's no such thing as an FSB on Opterons. The on-die memory controller runs @ CPU speed and hence connects at whatever the memory runs at (rounded off to some multiplier math). There's the HT speed that controls the max IO bandwidth, but that's based on the motherboard, not the CPU. Plus the 265 and 244 both run at 1.8GHz, so the memory multiplier and HT IO are both the same.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Joshua Marsh wrote:
> On 11/17/05, William Yu [EMAIL PROTECTED] wrote:
>> No argument there. But it's pointless if you are IO bound. Why would you just accept "we're IO bound, nothing we can do"? I'd do everything in my power to make my app go from IO bound to CPU bound -- whether by optimizing my code or buying more hardware. I can tell you if our OLTP servers were IO bound, they would run like crap. Instead of 1 sec, we'd be looking at 5-10 seconds per user transaction and our users would be screaming bloody murder. In theory, you can always convert your IO bound DB to CPU bound by stuffing more and more RAM into your server. (Or partitioning the DB across multiple servers.) Whether it's cost effective depends on the DB and how much your users are paying you -- and that's a case-by-case analysis, not a global statement of "IO bound, pointless."
>
> We all want our systems to be CPU bound, but it's not always possible. Remember, he is managing a 5 TB database. That's quite a bit different than a 100 GB or even 500 GB database.

I did say "in theory". :) I'm pretty sure Google is more CPU bound than IO bound -- they just spread their DB over 50K servers or whatever. Not everybody is willing to pay for that, but it's always in the realm of plausibility. Plus we have to go back to the statement I was replying to, which was "I have yet to come across a DB system that wasn't IO bound."
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alex Turner wrote:
> Not at random access in RAID 10 they aren't, and anyone with their head screwed on right is using RAID 10. The 9500S will still beat the Areca cards at RAID 10 database access patterns.

The max 256MB onboard for 3ware cards is disappointing though. While good enough for 95% of cases, there's that 5% that could use a gig or two of onboard RAM for ultrafast updates. For example, I'm specing out an upgrade to our current data processing server. Instead of the traditional 6xFast-Server-HDs, we're gonna go for broke and do 32xConsumer-HDs. This will give us mega I/O bandwidth, but we're vulnerable on random access since consumer-grade HDs don't have the RPMs or the queueing smarts. That means we're very dependent on the controller using onboard RAM to do I/O scheduling. 256MB divided over 4/6/8 drives -- OK. 256MB divided over 32 drives -- ugh, each HD's own buffer is bigger than the RAM allotted to it.

At least this is how it seems it would work from thinking through all the factors. Unfortunately, I haven't found anybody else who has gone this route and reported their results, so I guess we're the guinea pig.
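The controller-RAM-per-drive arithmetic above is simple enough to sanity-check in a couple of lines. The 8MB per-drive figure that falls out is the point of comparison; the function name is mine.

```python
# Controller cache available per drive for a fixed 256MB of onboard RAM.

def cache_per_drive_mb(controller_ram_mb: int, drives: int) -> float:
    """Even split of controller RAM across the drives in the array."""
    return controller_ram_mb / drives

for drives in (4, 6, 8, 32):
    share = cache_per_drive_mb(256, drives)
    print(f"{drives:2d} drives: {share:6.2f} MB of controller RAM each")
```

At 32 drives each disk gets only 8 MB of controller RAM, comparable to (or smaller than) the buffer on the drive itself, which is the "ugh" in the post.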
Re: [PERFORM] Hardware/OS recommendations for large databases ( 5TB)
James Mello wrote:
> Unless there was a way to guarantee consistency, it would be hard at best to make this work. Convergence on large data sets across boxes is non-trivial, and diffing databases is difficult at best. Unless there was some form of automated way to ensure consistency, going 8 ways into separate boxes is *very* hard. I do suppose that if you have fancy storage (EMC, Hitachi) you could do BCV or Shadow copies. But in terms of commodity stuff, I'd have to agree with Merlin.

If you're talking about data consistency, I don't see why that's an issue in a bulk-load/read-only setup. Either bulk load on 1 server and then do a file copy to all the others -- or simultaneously bulk load on all servers.

If you're talking about consistency in routing queries to the appropriate servers, I agree that's a more complicated issue, but not insurmountable. If you don't use persistent connections, you can probably get pretty good routing using DNS -- monitor servers by looking at top/iostat/memory info/etc and continually change the DNS zonemaps to direct traffic to less busy servers. (I use this method for our global load balancers -- pretty easy to script via Perl/Python/etc.) Mind you, since you need a dual processor motherboard anyway to get PCI-X, every machine would be a 2xDual Core, so there's enough CPU power to handle the cases where 2 or 3 queries get sent to the same server back-to-back. Of course, I/O would take a hit in this case -- but I/O would take a hit in every case on a single 16-core mega system.

If you use persistent connections, it'll definitely require extra programming beyond simple scripting. Take one of the opensource projects like PgPool or SQLRelay and alter it so it monitors all servers to see which server is least busy before passing a query on.
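The "direct traffic to less busy servers" idea above boils down to polling a load metric per server and handing out the least-loaded one. A minimal sketch, with hypothetical server names and load numbers; a real setup would pull the metrics from monitoring and push the winner into DNS or a connection pooler.

```python
# Pick the server with the lowest current load metric.

def pick_least_busy(load_by_server: dict) -> str:
    """Return the name of the server whose load value is smallest."""
    return min(load_by_server, key=load_by_server.get)

loads = {"db1": 0.82, "db2": 0.35, "db3": 0.61}
print(pick_least_busy(loads))   # -> db2
```

With non-persistent connections this decision can be re-made per query (or per DNS TTL); with persistent connections it has to live in the middleman, which is the extra programming the post mentions.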
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alex Stapleton wrote:
> You're going to have to factor the increased failure rate into your cost measurements, including any downtime or performance degradation whilst rebuilding parts of your RAID array. It depends on how long you're planning for this system to be operational as well, of course.

If we go 32xRAID10, rebuild time should be the same as in a 4xRAID10 system. Only the hard drive that was replaced needs rebuilding -- not the entire array. And yes, we definitely need a bunch of drives lying around as spares.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alex Turner wrote:
> Spend a fortune on dual core CPUs and then buy crappy disks... I bet for most applications this system will be IO bound, and you will see a nice lot of drive failures in the first year of operation with consumer grade drives. Spend your money on better disks, and don't bother with dual core IMHO unless you can prove the need for it.

I would say the opposite -- you always want dual core nowadays. DC Opterons simply give you better bang for the buck than single-core Opterons. Price out a 1xDC system against a 2x1P system -- the 1xDC will be cheaper. Do the same for 2xDC versus 4x1P, 4xDC versus 8x1P, 8xDC versus 16x1P, etc. -- DC gets cheaper by wider and wider margins because those mega-CPU motherboards are astronomically expensive.

DC also gives you a better upgrade path. Let's say you do testing and figure 2x246 is the right setup to handle the load. Well, instead of getting 2x1P, use the same 2P motherboard but only populate 1 CPU w/ a DC/270. Now you have a server that can be upgraded to +80% more CPU by popping in another DC/270, versus throwing out the entire thing to get a 4x1P setup.

The only questions would be: (1) Do you need an SMP server at all? I'd claim yes -- you always need 2+ cores, whether it's DC or 2P, to avoid IO interrupts blocking other processes from running. (2) Does a DC system perform better than its Nx1P cousin? My experience is yes. Did some rough tests with a drop-in-replacement 1x265 versus 2x244 and saw about +10% for DC. All the official benchmarks (Spec, Java, SAP, etc) from AMD/Sun/HP/IBM show DCs outperforming the Nx1P setups. (3) Do you need an insane amount of memory? Well, here's the case where the more expensive motherboard will serve you better, since each CPU slot has its own bank of memory. Spend more money on memory, get cheaper single-core CPUs.

Of course, this doesn't apply if you are an Intel/Dell-only shop.
Xeon DCs, while cheaper than their corresponding single-core SMPs, don't have the same performance profile as Opteron DCs. Basically, you're paying a bit extra so your server can generate a ton more heat.
Re: [PERFORM] Hardware/OS recommendations for large databases (
David Boreham wrote:
>> Spend a fortune on dual core CPUs and then buy crappy disks... I bet for most applications this system will be IO bound, and you will see a nice lot of drive failures in the first year of operation with consumer grade drives.
>
> I guess I've never bought into the vendor story that there are two reliability grades. Why would they bother making two different kinds of bearing, motor etc? Seems like it's more likely an excuse to justify higher prices. In my experience the expensive SCSI drives I own break frequently while the cheapo desktop drives just keep chunking along (modulo certain products that have a specific known reliability problem). I'd expect that a larger number of hotter drives will give a less reliable system than a smaller number of cooler ones.

Our SCSI drives have failed maybe a little less often than our IDE drives. Hell, some of the SCSIs even arrived bad when we bought them. Of course, the IDE drive failure % is inflated by all the IBM "Deathstars" we got -- ugh.

Basically, I've found it's cooling that's most important. Packing the drives together into really small rackmounts? Good for your density, not good for the drives. Now we do larger rackmounts -- drives have more space between each other, plus fans in front of and behind the drives.
Re: [PERFORM] Hardware/OS recommendations for large databases ( 5TB)
Merlin Moncure wrote:
> You could instead buy 8 machines that total 16 cores, 128GB RAM and
> It's hard to say what would be better. My gut says the 5U box would be a lot better at handling high cpu/high concurrency problems... like your typical business ERP backend. This is pure speculation of course... I'll defer to the experts here.

In this specific case (a data warehouse app), multiple machines are the better bet. Load data on 1 machine, copy it to the other servers, and then use a middleman to spread out SQL statements to each machine. I was going to suggest pgpool as the middleman, but I believe it's limited to 2 machines max at this time. I suppose you could daisy-chain pgpools running on every machine.
Re: [PERFORM] shared buffers
Carlos Henrique Reimer wrote:
> I forgot to say that it's a 12GB database... Ok, I'll set shared buffers to 30,000 pages, but even so shouldn't meminfo and top show some shared pages? I heard something about Redhat 9 not being able to handle RAM higher than 2GB very well. Is that right? Thanks in advance!

RH9, like any 32-bit OS, is limited to a 2GB address space w/o special tricks. However, it can access more than 2GB for the OS disk cache using PAE if you are running the bigmem kernel.
Re: [PERFORM] Caching by Postgres
Donald Courtney wrote:
> I built PostgreSQL 8.1 64-bit on Solaris 10 a few months ago and, side by side with the 32-bit PostgreSQL build, saw no improvement. In fact the 64-bit result was slightly lower.

I'm not surprised 32-bit binaries running on a 64-bit OS would be faster than 64-bit/64-bit. 64-bit isn't some magical wand you wave and it's all ok. Programs compiled as 64-bit will only run faster if (1) you need the 64-bit address space and you've been using ugly hacks like PAE to get access to memory above 2GB, or (2) you need native 64-bit data types and you've been using ugly hacks to piece 32-bit ints together (for example, encryption/compression). In most cases, 64-bit will run slightly slower due to the extra overhead of using larger datatypes.

Since PostgreSQL hands off the majority of memory management/data caching to the OS, only the OS needs to be 64-bit to reap the benefits of better memory management. Since Postgres *ALREADY* reaps the 64-bit benefit, I'm not sure how the argument for moving caching/mm/fs into Postgres would apply. Yes, there's the point about possibly implementing better/smarter/more appropriate caching algorithms, but that has nothing to do with 64-bit.
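The "larger datatypes" overhead mentioned above is easy to see concretely: on an LP64 build, pointers are 8 bytes instead of 4, so any pointer-heavy structure roughly doubles its cache/memory footprint. A rough sketch; the entry count and pointers-per-entry are made-up illustration numbers, not Postgres internals.

```python
# Illustrate 32-bit vs 64-bit pointer overhead. struct.calcsize("P")
# reports the native pointer size of the running interpreter.
import struct

ptr_bytes = struct.calcsize("P")    # 4 on a 32-bit build, 8 on 64-bit
entries = 1_000_000                 # hypothetical pointer-heavy structure
ptrs_per_entry = 3                  # e.g. parent/left/right in a tree node

footprint_mb = entries * ptrs_per_entry * ptr_bytes / 2**20
print(f"{ptr_bytes}-byte pointers: ~{footprint_mb:.1f} MB just for pointers")
```

Twice the bytes per pointer means half as many nodes fit in each cache line and each page, which is the modest across-the-board slowdown the post describes.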
Re: [PERFORM] Caching by Postgres
Donald Courtney wrote:
> in that even if you ran postgreSQL on a 64 bit address space with larger number of CPUs you won't see much of a scale up and possibly even a drop. I am not alone in having the *expectation*

What's your basis for believing this is the case? Why would PostgreSQL's dependence on the OS's caching/filesystem limit scalability? I know when I went from 32-bit to 64-bit Linux, I got *HUGE* increases in performance using the same amount of memory. And when I went from 2x1P to 2xDC, my average CPU usage % dropped almost in half.

> that a database should have some cache size parameter and the option to skip the file system. If I use oracle, sybase, mysql and maxdb they all have the ability to size a data cache and move to 64 bits.

Josh Berkus has already mentioned this as conventional wisdom as written by Oracle. This may also be legacy wisdom. Oracle/Sybase/etc have been around for a long time; it was probably a clear performance win way back when. Nowadays, with how far open-source OS's have advanced, I'd take it with a grain of salt and do my own performance analysis. I suspect the big vendors wouldn't change their stance even if they knew it was no longer true, due to the support hassles.

My personal experience with PostgreSQL: dropping shared buffers from 2GB to 750MB improved performance on my OLTP DB a good 25%. Going down from 750MB to 150MB was another +10%.
Re: [PERFORM] extremly low memory usage
Ron wrote: PERC4eDC-PCI Express, 128MB Cache, 2-External Channels

Looks like they are using the LSI Logic MegaRAID SCSI 320-2E controller. IIUC, you have 2 of these, each with 2 external channels? A lot of people have mentioned that Dell's versions of the LSI cards can be WAY slower than the ones you buy from LSI. Why is this the case? Nobody knows for sure.

Here's a guess on my part. A while back, I was doing some googling. And instead of typing LSI MegaRAID xxx, I just typed MegaRAID xxx. Going beyond the initial pages, I saw Tekram -- a company that supposedly produces their own controllers -- listing products with the exact same model numbers and photos as cards from LSI and Areca. Seemed puzzling until I read a review about SATA RAID cards which mentioned that Tekram produces the Areca cards under their own name but using slower components, to avoid competing with them at the high end. So what may be happening is that while the logic circuitry on the Dell PERCs is the same as on the source LSI cards, the speed of the RAID processor/RAM/internal buffers/etc. is not as fast, so Dell can shave off a few bucks on every server. That would mean that while a true LSI card has the processing power to do the RAID calculations for X drives, the Dell version can probably only handle X*0.6 drives or so.

The 128MB buffer also looks suspiciously small, and I do not see any upgrade path for it on LSI Logic's site. Serious RAID controllers from companies like Xyratex, Engino, and Dot-Hill can have up to 1-2GB of cache.

The card is upgradable. If you look at the pic of the card, it shows an SDRAM DIMM versus integrated RAM chips. I've also read reviews a while back comparing benchmarks of the 320-2 w/ 128MB versus 512MB onboard RAM. Their product literature is just nebulous on the RAM upgrade part.
I'm sure if you opened up the PDF manuals, you could find the exact info.

That 128MB of buffer cache may very well be too small to keep the IO rate up, and/or there may be a more subtle problem with the LSI card, and/or you may have a configuration problem, but _something(s)_ need fixing since you are only getting raw sequential IO of ~100-150MB/s when it should be above 500MB/s.

I think it just might be the Dell hardware or the lack of a 64-bit IOMMU on Xeons. Here are my numbers on a 320-1 w/ 128MB paired up with Opterons, compared to Jeremiah's:

# time dd if=/dev/zero of=testfile bs=1024 count=100
100+0 records in
100+0 records out
real 0m8.885s  user 0m0.299s  sys 0m6.998s
2x15K RAID1: real 0m14.493s  user 0m0.255s  sys 0m11.712s
6x15K RAID10 (2x 320-1): real 0m9.986s  user 0m0.200s  sys 0m8.634s

# time dd of=/dev/null if=testfile bs=1024 count=100
100+0 records in
100+0 records out
real 0m1.654s  user 0m0.232s  sys 0m1.415s
2x15K RAID1: real 0m3.383s  user 0m0.176s  sys 0m3.207s
6x15K RAID10 (2x 320-1): real 0m2.427s  user 0m0.178s  sys 0m2.250s

If all 14 HDs are arranged in a RAID10 array, I'd say there's definitely something wrong with Jeremiah's hardware.
Re: [PERFORM] Performance problems on 4/8way Opteron (dualcore)
I've been running 2x265s on FC4 64-bit (2.6.11-1+) and it's been running perfectly. With NUMA enabled, it runs incrementally faster than with NUMA off. Performance is definitely better than the 2x244s they replaced -- how much faster, I can't measure since I don't have the transaction volume to compare to previous benchmarks. I do see more consistently low response times though, I can also run Apache on the same server for faster HTML generation times, and top seems to show in general twice as much CPU power idle on average (25% per 265 core versus 50% per 244). I haven't investigated the 2.6.12+ kernel updates yet -- I'll probably do our development servers first to give it a test.

The problem as I remember it boiled down to the Linux kernel handling memory/process management very badly on large dual-core systems -- pathological NUMA behavior. However, this problem has apparently been fixed in Linux 2.6.12+, and using the more recent kernel on large dual-core systems generated *massive* performance improvements for the individuals with this issue. Using the patched kernel, one gets the performance most people were expecting.
[PERFORM] Trying to figure out pgbench
My dual-core Opteron server came in last week. I tried to do some benchmarks with pgbench to get some numbers on the difference between 1x1 - 2x1 - 2x2, but no matter what I did, I kept getting the same TPS on all systems. Any hints on what pgbench parameters I should be using?

In terms of production use, it definitely can handle more load. Previously, Apache/Perl had to run on a separate server to avoid a ~50% penalty. Now, the numbers are +15% performance even with Apache/Perl running on the same box as PostgreSQL. How much more load, of course, is what I'd like to quantify.
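Identical TPS across configs usually means pgbench's defaults (scale factor 1, one client) are the bottleneck: every client fights over the single branches row and the test serializes regardless of CPU count. A hedged sketch of a more useful run (-i/-s/-c/-t are long-standing pgbench flags; the database name `bench` is hypothetical):

```shell
createdb bench
pgbench -i -s 100 bench      # scale 100: 10M accounts rows, so clients rarely collide
pgbench -c 25 -t 1000 bench  # 25 concurrent clients, 1000 transactions each
```

With enough clients to keep all cores busy and a scale factor well above the client count, the 1x1 / 2x1 / 2x2 numbers should start to diverge.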
Re: [PERFORM] Help specifying new web server/database machine
Rory Campbell-Lange wrote: Processor: First of all I noted that we were intending to use Opteron processors. I guess this isn't a straightforward choice because I believe Debian (our Linux of choice) doesn't have a stable AMD64 port. However some users on this list suggest that Opterons work very well even in a 32-bit environment. Some have suggested that a single dual-core processor is the way to go. The RAM needs to fit the CPU arrangement too; William points out that one needs 2 DIMMs per CPU.

Your summary here just pointed out the obvious to me. Start with a 2P MB but only populate a single DC Opteron. That'll give you a 2P system with room to expand to 4P in the future. Plus you only need to populate 1 memory bank, so you can do 2x1GB.

Disks: I'm somewhat confused here. I've followed the various notes about SATA vs SCSI and it seems that SCSI is the way to go. On a four-slot 1U server, would one do a single RAID10 over 4 10K RPM U320 disks? I would run the database in its own partition, separate from the rest of the OS, possibly on LVM. An LSI MegaRAID-2 appears to be the card of choice.

With only 4 disks, a MegaRAID U320-1 is good enough. It's quite a premium to go to the 2-channel MegaRAID. With 4 drives, I'd still do 2 big drives mirrored for the DB partition and 2 small drives for OS+WAL.
Re: [PERFORM] Help specifying new web server/database machine
We are considering two RAID1 system disks, and two RAID1 data disks. We've avoided buying Xeons. The machine we are looking at looks like this:

Rackmount Chassis - 500W PSU / 4 x SATA Disk Drive Bays
S2882-D - Dual Opteron / AMD 8111 Chipset / 5 x PCI Slots
2x - (Dual) AMD Opteron 246 Processors (2.0GHz) - 1MB L2 Cache/core (single core)

For about $1500 more, you could go 2x270 (dual core 2GHz) and get a 4X SMP system. (My DC 2x265 system just arrived -- can't wait to start testing it!!!)

2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)

This is a weird configuration. For a 2x Opteron server to operate at max performance, it needs 4 DIMMs minimum. Opterons use a 128-bit memory interface and hence require 2 DIMMs per CPU to run at full speed. With only 2 DIMMs, you either have both CPUs run @ 64-bit (this may not even be possible) or populate only 1 CPU's bank -- the other CPU must then request all memory access through the first CPU, which is a significant penalty. If you went 4x512MB, you'd limit your future upgrade options by having fewer slots available to add more memory. I'd definitely get 4x1GB right out of the chute.

4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache

Now this comes to the interesting part. We've had huge, gigantic threads (check the archives for the $7K server threads) about SCSI versus SATA in the past. 7200RPM SATAs just aren't fast/smart enough to cut it for most production uses in regular configs. If you are set on SATA, you will have to consider the following options: (1) use 10K Raptors for TCQ goodness, (2) put a huge amount of memory onto the SATA RAID card -- 1GB minimum, (3) use a ton of SATA drives to make a RAID10 array -- 8 drives minimum. Or you could go SCSI.
SCSI is cost prohibitive though at the larger disk sizes -- this is why I'm considering option #3 for my data processing server.
Re: [PERFORM] Forcing use of specific index
A pretty awful way is to mangle the SQL statement so the other fields' logical statements are like so: select * from mytable where 0+field = 100

Tobias Brox wrote: Is there any way to attempt to force the planner to use some specific index while creating the plan? Other than eventually dropping all the other indices (which is obviously not a solution in a production setting anyway)? I have one case where I have added 16 indices to a table, many of them being partial indices. The table itself has only 50k rows, but it is frequently used in heavy joins. I imagine there can be an exponential order on the number of alternative paths the planner must examine as a function of the number of indices? It seems to me that the planner is quite often not choosing the best index, so I wonder if there is any easy way for me to check out what the planner thinks about a specific index :-)
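The "0+field" hack works because the planner only matches a plain b-tree index when the indexed column appears bare in the qualifier; wrapping it in an expression hides the column from that index. A sketch with hypothetical table/index names:

```sql
-- hypothetical schema
CREATE INDEX mytable_field_idx ON mytable (field);
CREATE INDEX mytable_other_idx ON mytable (other);

-- the planner may choose either index here:
SELECT * FROM mytable WHERE field = 100 AND other = 5;

-- "0 + field" no longer matches mytable_field_idx, so only
-- mytable_other_idx remains a candidate:
SELECT * FROM mytable WHERE 0 + field = 100 AND other = 5;
```

The same trick on a text column is usually written `field || '' = 'value'`. EXPLAIN on both forms shows which index each plan picks.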
Re: [PERFORM] Adaptec/LSI/?? RAID
I've used LSI MegaRAIDs successfully in the following systems with both Redhat 9 and FC3 64-bit:

Arima HDAMA / 8GB RAM
Tyan S2850 / 4GB RAM
Tyan S2881 / 4GB RAM

I've previously stayed away from Adaptec because we used to run Solaris x86 and the driver was somewhat buggy. For Linux and FreeBSD, I'd be less worried, as open source development of drivers usually leads to better testing and bug-fixing.

Stacy White wrote: We're in the process of buying another Opteron server to run Postgres, and based on the suggestions in this list I've asked our IT director to get an LSI MegaRAID controller rather than one of the Adaptecs. But when we tried to place our order, our vendor (Penguin Computing) advised us: "we find LSI does not work well with 4GB of RAM. Our engineering find that LSI card could cause system crashes. One of our customer ... has found that Adaptec cards works well on PostGres SQL -- they're using it as a perforce server with xfs and post-gress." Any comments? Suggestions for other RAID controllers?
Re: [PERFORM] ok you all win what is best opteron (I dont want a
I say "most apps" because it's true. :) I would suggest that pretty much every app (other than video/audio streaming) people think is bandwidth-limited is actually latency-limited. Take the SpecFoo tests. Sure, I would have rather seen SAP/TPC/etc. that would be more relevant to Postgres, but there aren't any apples-to-apples comparisons available yet.

But there's something to consider here. What people in the past have believed is that memory bandwidth is the key to Spec numbers -- SpecFP isn't a test of floating point performance, it's a test of memory bandwidth. Or is it? Numbers for DC Opterons show lower latency/lower bandwidth beating higher latency/higher bandwidth in what was supposedly a bandwidth-limited benchmark. What may actually be happening is that the extra bandwidth isn't used directly by the app itself -- instead the CPU uses it for prefetching to hide latency.

Scrounging around for more numbers, I've found benchmarks at Anandtech that relate better to Postgres. He has an order-entry OLTP app running on MS-SQL. 1xDC beats 2x1 -- 2xDC beats 4x1.

order entry reads:              2x248 - 235113   1x175 - 257192   4x848 - 360014   2x275 - 392643
order entry writes:             2x248 - 235107   1x175 - 257184   4x848 - 360008   2x275 - 392634
order entry stored procedures:  2x248 - 2939     1x175 - 3215     4x848 - 4500     2x275 - 4908

Greg Stark wrote: William Yu [EMAIL PROTECTED] writes: It turns out the latency in a 2xDC setup is just so much lower and most apps like lower latency more than higher bandwidth. You haven't tested anything about most apps. You tested what the SpecFoo apps prefer. If you're curious about what Postgres prefers, you'll have to test with Postgres. I'm not sure whether it will change the conclusion, but I expect Postgres will like bandwidth better than random benchmarks do.
Re: [PERFORM] Postgresql Performance via the LSI MegaRAID 2x Card
I'm sure there's some corner case where more memory helps. If you consider that 1GB of RAM is about $100, I'd max out the memory on the controller just for the hell of it.

Josh Berkus wrote: Steve, Past recommendations for a good RAID card (for SCSI) have been the LSI MegaRAID 2x. This unit comes with 128MB of RAM on-board. Has anyone found that by increasing the on-board RAM, Postgresql performed better? My informal tests showed no difference between 64MB and 256MB.
Re: [PERFORM] Whence the Opterons?
Unfortunately, Anandtech only used Postgres a single time in his benchmarks. And what it did show back then was a huge performance advantage for the Opteron architecture over Xeon in this case. Where the fastest Opterons were just 15% faster in MySQL/MSSQL/DB2 than the fastest Xeons, they were 100%+ faster in Postgres. He probably got rid of Postgres from his benchmark suite since it favors Opteron too much. As a general hardware review site, it makes sense that he needs to use more neutral apps in order to get free systems to review and (ahem) ad dollars.

That being said, I wouldn't get a quad Opteron system anyway now that the dual-core Opterons are available. A DP+DC system would be faster and cheaper than a pure quad system. Unless, of course, I needed a QP+DC for 8-way SMP.

Anjan Dave wrote: Wasn't the context switching issue occurring in specific cases only? I haven't seen any benchmarks for a 50% performance difference. Neither have I seen any benchmarks of pure disk IO performance of specific models of Dell vs HP or Sun Opterons. Thanks, Anjan

With EMC you can file an RPQ via your sales contacts to get it approved, though I'm not sure how lengthy/painful that process might be, or if it's gonna be worth it. Read the article devoted to the v40z on anandtech.com. I am also trying to get a quad-Opteron versus the latest quad-Xeon from Dell (6850), but it's hard to justify the difference between a $15K Dell versus a $30K v40z for a 5-8% performance gain (read the Xeon vs. Opteron database comparo on anandtech.com)... Thanks, Anjan

$15K vs $30K is indeed a big difference. But also realize that Postgres has a specific benefit on Opterons versus Xeons. The context switching storm happens less on an Opteron for some reason. I would venture a much greater benefit than 5-8%, more like 10-50%.
Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)
I posted this link a few months ago and there was some surprise over the difference in postgresql compared to other DBs. (Not much surprise in Opteron stomping on Xeon in pgsql, as most people here have had that experience -- the surprise was in how much smaller the difference was in other DBs.) If it was across the board +100% in MS-SQL, MySQL, etc. -- you could chalk it up to overall better CPU architecture. Most of the time though, the numbers I've seen show +0-30% for [insert DB here] and a huge whopping + for pgsql. Why the pronounced preference for postgresql, I'm not sure; it was never explained fully.

BTW, the Anandtech test compares single-CPU systems w/ 1GB of RAM. Go to dual/quad and the SMP Xeon will suffer even more, since it has to share a fixed amount of FSB/memory bandwidth amongst all CPUs. Xeons also seem to suffer more from context-switch storms. Go to 4GB of RAM and the Xeon suffers another hit due to the lack of a 64-bit IOMMU. Devices cannot map to addresses above 4GB, which means the OS has to do extra work copying data from/to above-4GB addresses anytime you have IO. (Although this penalty might exist all the time in 64-bit mode for Xeon if Linux/Windows took the expedient and less buggy route of using a single method instead of checking whether target addresses are above or below 4GB.)

Jeff Frost wrote: On Tue, 19 Apr 2005, J. Andrew Rogers wrote: I don't know about 2.5x faster (perhaps on specific types of loads), but the reason Opterons rock for database applications is their insanely good memory bandwidth and latency, which scale much better than the Xeon's. Opterons also have a ccNUMA-esque I/O fabric and two dedicated on-die memory channels *per processor* -- no shared bus there, closer to real UNIX server iron than a glorified PC.

Thanks J! That's exactly what I was suspecting it might be. Actually, I found an anandtech benchmark that shows the Opteron coming in at close to 2.0x performance: http://www.anandtech.com/linux/showdoc.aspx?i=2163&p=2 It's an Opteron 150 (2.4GHz) vs. Xeon 3.6GHz from August. I wonder if the differences are more pronounced with the newer Opterons. -Jeff
Re: [PERFORM] How to improve db performance with $7K?
The Linux kernel is definitely headed this way. The 2.6 series allows for several different I/O scheduling algorithms. A brief overview of the different modes: http://nwc.serverpipeline.com/highend/60400768 Although a much older article from the beta-2.5 days, there's more in-depth info from one of the programmers who developed the AS scheduler and worked on the deadline scheduler: http://kerneltrap.org/node/657 I think I'm going to start testing the deadline scheduler on our data processing server for a few weeks before trying it on our production servers.

Alex Turner wrote: Whilst I admire your purist approach, I would say that if it is beneficial to performance that a kernel understand drive geometry, then it is worth investigating teaching it how to deal with that!

I was referring less to the kernel than to the controller. Let's say we invented a new protocol that included the drive telling the controller how it was laid out at initialization time, so that the controller could make better decisions about re-ordering seeks. It would be more cost effective to have that set of electronics just once in the controller than 8 times on each drive in an array, which would yield a better performance-to-cost ratio. Therefore I would suggest it is something that should be investigated. After all, why implement TCQ on each drive if it can be handled more efficiently at the other end by the controller for less money?!

Alex Turner netEconomist

On 4/19/05, Dave Held [EMAIL PROTECTED] wrote: -----Original Message----- From: Alex Turner [mailto:[EMAIL PROTECTED] Sent: Monday, April 18, 2005 5:50 PM To: Bruce Momjian Cc: Kevin Brown; pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Does it really matter at which end of the cable the queueing is done (assuming both ends know as much about drive geometry etc.)? [...] The parenthetical is an assumption I'd rather not make.
If my performance depends on my kernel knowing how my drive is laid out, I would always be wondering if a new drive is going to break any of the kernel's geometry assumptions. Drive geometry doesn't seem like a kernel's business any more than a kernel should be able to decode the CCD signal of an optical mouse. The kernel should queue requests at a level of abstraction that doesn't depend on intimate knowledge of drive geometry, and the drive should queue requests on the concrete level where geometry matters. A drive shouldn't guess whether a process is trying to read a file sequentially, and a kernel shouldn't guess whether sector 30 is contiguous with sector 31 or not.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
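For anyone wanting to try the deadline scheduler as discussed above: on 2.6 kernels the elevator is selected at boot, and later 2.6 kernels also expose a per-device sysfs knob (the device name sda below is illustrative):

```shell
# boot parameter (grub kernel line), applies to all block devices:
#   kernel /vmlinuz-2.6.x ... elevator=deadline

# per-device, on kernels that expose the runtime sysfs switch:
cat /sys/block/sda/queue/scheduler    # e.g. "noop anticipatory deadline [cfq]"
echo deadline > /sys/block/sda/queue/scheduler
```

The bracketed name in the sysfs file is the scheduler currently in effect for that device.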
Re: [PERFORM] What to do with 6 disks?
My experience:

1xRAID10 for postgres
1xRAID1 for OS + WAL

Jeff Frost wrote: Now that we've hashed out which drives are quicker and more money equals faster... Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid option would you use for a standalone postgres server?

a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
b) 1xRAID1 for OS/xlog, 1xRAID5 for data
c) 1xRAID10 for OS/xlog/data
d) 1xRAID1 for OS, 1xRAID10 for data
e) ...

I was initially leaning towards b, but after talking to Josh a bit, I suspect that with only 4 disks the raid5 might be a performance detriment vs 3 raid 1s or some sort of split raid10 setup.
Re: [PERFORM] How to improve db performance with $7K?
Problem with this strategy: you want battery-backed write caching for best performance and safety. (I've tried IDE for WAL before w/ write caching off -- the DB got crippled whenever I had to copy files from/to the drive on the WAL partition -- I ended up just moving WAL back onto the same SCSI drive as the main DB.) That means in addition to a $$$ SCSI caching controller, you also need a $$$ SATA caching controller. From my glance at prices, advanced SATA controllers seem to cost nearly as much as their SCSI counterparts.

This also looks to be the case for the drives themselves. Sure, you can get super cheap 7200RPM SATA drives, but they absolutely suck for database work. Believe me, I gave it a try once -- ugh. The high-end WD 10K Raptors look pretty good though -- the benchmarks @ storagereview seem to put these drives at about 90% of SCSI 10Ks for both single-user and multi-user. However, they're also priced like SCSIs -- here's what I found @ Mwave (going through pricewatch to find WD740GDs):

Seagate 7200 SATA -- 80GB -- $59
WD 10K SATA -- 72GB -- $182
Seagate 10K U320 -- 72GB -- $289

Using the above prices for a fixed budget for RAID-10, you could get:

SATA 7200 -- 680GB per $1000
SATA 10K -- 200GB per $1000
SCSI 10K -- 125GB per $1000

For a 99% read-only DB that requires lots of disk space (say something like Wikipedia or a blog host), using consumer-level SATA is probably ok. For anything else, I'd consider SATA 10K if (1) I do not need 15K RPM and (2) I don't already have SCSI infrastructure.

Steve Poe wrote: If SATA drives don't have the ability to replace SCSI for multi-user Postgres apps, but you needed to save on cost (ALWAYS an issue), could/would you implement SATA for your logs (pg_xlog) and keep the rest on SCSI? Steve Poe

Mohan, Ross wrote: I've been doing some reading up on this, trying to keep up here, and have found out that (experts, just yawn and cover your ears) 1) some SATA drives (just type II, I think?)
have a Phase Zero implementation of Tagged Command Queueing (the special sauce for SCSI). 2) This SATA TCQ is called NCQ and I believe it basically allows the disk software itself to do the reordering (this is called "simple" in TCQ terminology). It does not yet allow the TCQ "head of queue" command, which lets the current tagged request go to the head of the queue -- a simple way of manifesting a high-priority request. 3) SATA drives are not yet multi-initiator. Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives are likely to whomp SATA II drives for a while yet (read: a year or two) in multiuser PostGres applications.

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Thursday, April 14, 2005 2:04 PM To: Kevin Brown Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K?

Kevin Brown [EMAIL PROTECTED] writes: Greg Stark wrote: I think you're being misled by analyzing the write case. Consider the read case. When a user process requests a block and that read makes its way down to the driver level, the driver can't just put it aside and wait until it's convenient. It has to go ahead and issue the read right away.

Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest.

But then the cure is worse than the disease. You're basically describing exactly what does happen anyway, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek... Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in disk-layout order.
I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens.

And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied, they have to wait until that seek is finished and get acted on during the next large seek. If my theory is correct, then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. I.e., while a busy SCSI drive can satisfy most requests in about 10ms, a busy SATA drive would satisfy most requests in 15ms. (Add to that that 10K RPM and 15K RPM SCSI drives have even lower seek times, and no such IDE/SATA drives exist...) In reality, higher latency feeds into a system feedback loop causing your application to run slower, causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing
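The usable-capacity-per-$1000 arithmetic quoted earlier in the thread can be reproduced in a couple of lines; the key step is that RAID-10 mirroring halves raw capacity:

```python
# (capacity_gb, price_usd) from the drive prices quoted in the thread
drives = {
    "SATA 7200": (80, 59),
    "SATA 10K":  (72, 182),
    "SCSI 10K":  (72, 289),
}

def raid10_gb_per_1000(gb, price):
    n = 1000 / price        # drives a $1000 budget buys
    return n * gb / 2       # RAID-10: half the raw capacity is usable

for name, (gb, price) in drives.items():
    print(f"{name}: {raid10_gb_per_1000(gb, price):.0f} GB usable per $1000")
```

This lands on ~678 / ~198 / ~125 GB per $1000, matching the roughly 680 / 200 / 125 figures above.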
Re: [PERFORM] How to improve db performance with $7K?
Alex Turner wrote: I'm no drive expert, but it seems to me that our write performance is excellent. I think what most are concerned about is OLTP, where you are doing heavy write _and_ heavy read performance at the same time. Our system is mostly read during the day, but we do a full system update every night that is all writes, and it's very fast compared to the smaller SCSI system we moved off of. Nearly a 6x speed improvement, as fast as 900 rows/sec with a 48 byte record, one row per transaction.

I've started with SATA in a multi-read/multi-write environment. While it ran pretty well with 1 thread writing, the addition of a 2nd thread (whether reading or writing) would cause exponential slowdowns. I suffered through this for a week and then switched to SCSI. Single-threaded performance was pretty similar, but with the advanced command queueing SCSI has, I was able to do multiple reads/writes simultaneously with only a small performance hit for each thread.

Perhaps having a SATA caching RAID controller might help this situation. I don't know. It's pretty hard justifying buying a $$$ 3ware controller just to test it, when you could spend the same money on SCSI and have a guarantee it'll work well under multi-IO scenarios.
Re: [PERFORM] How to improve db performance with $7K?
It's the same money if you factor in the 3ware controller. Even without a caching controller, SCSI works well in multi-threaded IO (notwithstanding crappy shit from Dell or Compaq). You can get such cards from LSI for $75. And of course, many server MBs come with LSI controllers built-in. Our older 32-bit production servers all use Linux software RAID w/ SCSI, and there are no issues when multiple users/processes hit the DB.

*Maybe* a 3ware controller w/ onboard cache + battery backup might do much better for multi-threaded IO than plain-jane SATA. Unfortunately, I have not been able to find anything online that can confirm or deny this. Hence, the choice is: spend $$$ on the 3ware controller and hope it meets your needs -- or spend $$$ on SCSI drives and be sure. Now if you want to run such tests, we'd all be delighted to see the results so we have another option for building servers.

Alex Turner wrote: It's hardly the same money, the drives are twice as much. It's all about the controller baby, with any kind of drive. A bad SCSI controller will give sucky performance too, believe me. We had a Compaq Smart Array 5304, and its performance was _very_ sub-par. If someone has a simple benchmark test database to run, I would be happy to run it on our hardware here. Alex Turner

On Apr 6, 2005 3:30 AM, William Yu [EMAIL PROTECTED] wrote: Alex Turner wrote: I'm no drive expert, but it seems to me that our write performance is excellent. [...] Nearly a 6x speed improvement, as fast as 900 rows/sec with a 48 byte record, one row per transaction.

I've started with SATA in a multi-read/multi-write environment. While it ran pretty well with 1 thread writing, the addition of a 2nd thread (whether reading or writing) would cause exponential slowdowns. I suffered through this for a week and then switched to SCSI. Single-threaded performance was pretty similar, but with the advanced command queueing SCSI has, I was able to do multiple reads/writes simultaneously with only a small performance hit for each thread. Perhaps having a SATA caching RAID controller might help this situation. I don't know. It's pretty hard justifying buying a $$$ 3ware controller just to test it, when you could spend the same money on SCSI and have a guarantee it'll work well under multi-IO scenarios.
Re: [PERFORM] name search query speed
Jeremiah Jahn wrote: I have about 5M names stored in my DB. Currently the searches are very quick unless they are on a very common last name, i.e. SMITH. The index is always used, but I still hit 10-20 seconds on a SMITH or JONES search, and I average about 6 searches a second and max out at about 30/s. Any suggestions on how I could arrange things to make this search quicker? I have 4GB of mem on a RAID 5 w/ 3 drives. I'm hoping that I can increase this speed w/o a HW upgrade.

If it's just SMITH, the only fix is to throw more hardware at the problem. I've got my own database of medical providers/facilities in the millions, and anytime somebody tries to search for MEDICAL FACILITY, it takes forever. I've tried every optimization possible, but when you have 500K records with the word MEDICAL in them, what can you do? You've got to check all 500K records to see if they match your criteria.

For multi-word searches, what I've found does work is to periodically generate stats on word frequencies and use those stats to search the least common words first. For example, if somebody enters ALTABATES MEDICAL HOSPITAL, I can get the ~50 providers with ALTABATES in the name and then do a 2nd and 3rd pass to filter against MEDICAL and HOSPITAL.
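The least-common-word-first pass can be sketched in miniature -- a pure-Python stand-in for what would be successive SQL filters, with a hypothetical record list:

```python
from collections import Counter

records = [
    "ALTABATES MEDICAL HOSPITAL",
    "BAYSIDE MEDICAL GROUP",
    "GENERAL HOSPITAL OF MERCY",
    "ALTABATES IMAGING CENTER",
    "CITY MEDICAL HOSPITAL",
]

# periodically regenerated word-frequency stats
freq = Counter(word for name in records for word in name.split())

def search(query):
    # probe the rarest query word first, then filter the small result set
    words = sorted(query.split(), key=lambda w: freq[w])
    hits = [r for r in records if words[0] in r.split()]
    for w in words[1:]:
        hits = [r for r in hits if w in r.split()]
    return hits

print(search("ALTABATES MEDICAL HOSPITAL"))
```

The first pass touches only the ~2 ALTABATES rows instead of the ~3 MEDICAL rows; at 5M-record scale, that's ~50 rows instead of 500K.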
Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system
You can get 64-bit Xeons also but they take a hit in the I/O department due to the lack of a hardware I/O MMU, which limits DMA transfers to addresses below 4GB. This has a two-fold impact: 1) transferring data above 4GB requires first a DMA transfer to below 4GB and then a copy to the final destination. 2) You must allocate real memory at 2X the address space of the devices to act as bounce buffers. This is especially problematic for workstations because if you put a 512MB Nvidia card in your computer for graphics work -- you've just lost 1GB of memory. (I dunno how much the typical SCSI/NIC/etc take up.) I thought Intel was copying AMD's 64-bit API. Is Intel's implementation as poor as your description? Does Intel have any better 64-bit offering other than the Itanium/Itanic? Unfortunately, there's no easy way for Intel to have implemented a 64-bit IOMMU under their current restrictions. The memory controller resides on the chipset, and to upgrade the functionality significantly would probably require changing the bus protocol. It's not that they couldn't do it -- it would just require all Intel chipset/MB vendors/partners to go through the process of creating and validating totally new products. A way lengthier process than just producing 64-bit CPUs that drop into current motherboards. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system
Bruce Momjian wrote: William Yu wrote: You can get 64-bit Xeons also but they take a hit in the I/O department due to the lack of a hardware I/O MMU, which limits DMA transfers to addresses below 4GB. This has a two-fold impact: 1) transferring data above 4GB requires first a DMA transfer to below 4GB and then a copy to the final destination. 2) You must allocate real memory at 2X the address space of the devices to act as bounce buffers. This is especially problematic for workstations because if you put a 512MB Nvidia card in your computer for graphics work -- you've just lost 1GB of memory. (I dunno how much the typical SCSI/NIC/etc take up.) When you say allocate real memory at 2X, are you saying that if you have 16GB of RAM only 8GB is actually usable and the other 8GB is for bounce buffers, or is it just address space being used up? It's 2x the memory space of the devices. E.g. an Nvidia graphics card w/ 512MB of RAM would require 1GB of memory to act as bounce buffers. And it has to be real chunks of memory in 64-bit mode since a DMA transfer must drop the data into low memory before it can be copied above 4GB. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system
Jim C. Nasby wrote: On Tue, Feb 01, 2005 at 07:35:35AM +0100, Cosimo Streppone wrote: You might look at Opterons, which theoretically have a higher data bandwidth. If you're doing anything data intensive, like a sort in memory, this could make a difference. Would Opteron systems need a 64-bit postgresql (and os, gcc, ...) build to have that advantage? Well, that would give you the most benefit, but the memory bandwidth is still greater than on a Xeon. There's really no issue with 64 bit if you're using open source software; it all compiles for 64 bits and you're good to go. http://stats.distributed.net runs on a dual opteron box running FreeBSD and I've had no issues. You can get 64-bit Xeons also but they take a hit in the I/O department due to the lack of a hardware I/O MMU, which limits DMA transfers to addresses below 4GB. This has a two-fold impact: 1) transferring data above 4GB requires first a DMA transfer to below 4GB and then a copy to the final destination. 2) You must allocate real memory at 2X the address space of the devices to act as bounce buffers. This is especially problematic for workstations because if you put a 512MB Nvidia card in your computer for graphics work -- you've just lost 1GB of memory. (I dunno how much the typical SCSI/NIC/etc take up.)
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
Hervé Piedvache wrote: My point being is that there is no free solution. There simply isn't. I don't know why you insist on keeping all your data in RAM, but the mysql cluster requires that ALL data MUST fit in RAM all the time. I don't insist on having all data in RAM, but when you use PostgreSQL with a big database, you know that for quick access -- just for reading the index file, for example -- it's better to have as much RAM as possible ... I just want to be able to get quick access with a growing and growing database ... If it's an issue of RAM and not CPU power, think about this scenario. Let's just say you *COULD* partition your DB over multiple servers. What are your plans then? Are you going to buy 4 Dual Xeon servers? Ok, let's price that out. For a full-blown rackmount server w/ RAID, 6+ SCSI drives and so on, you are looking at roughly $4000 per machine. So now you have 4 machines -- a total of 16GB of RAM over the 4 machines. On the other hand, let's say you spent that money on a Quad Opteron instead. 4x Opteron 850 will cost you roughly $8000. 16GB of RAM using 1GB DIMMs is $3000. If you went with 2GB DIMMs, you could stuff 32GB of RAM onto that machine for $7500. Let's review the math: 4X server cluster, total 16GB RAM = $16K. 1 beefy server w/ 16GB RAM = $11K. 1 beefy server w/ 32GB RAM = $16K. I know what I would choose. I'd get the mega server w/ a ton of RAM and skip all the trickiness of partitioning a DB over multiple servers. Yes, your data will grow to a point where even the XXGB can't cache everything. On the other hand, memory prices drop just as fast. By that time, you can ebay your original 16/32GB and get 64/128GB.
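The cost comparison above can be checked with quick arithmetic; these are the rough 2005 prices quoted in the message, nothing more:

```python
# Back-of-the-envelope totals from the comparison above (2005 list prices).
cluster = 4 * 4000            # four dual-Xeon rackmount boxes @ ~$4000 each
beefy_16gb = 8000 + 3000      # quad Opteron 850 + 16GB via 1GB DIMMs
beefy_32gb = 8000 + 7500      # quad Opteron 850 + 32GB via 2GB DIMMs
print(cluster, beefy_16gb, beefy_32gb)  # 16000 11000 15500 (~$16K/$11K/$16K)
```

The 32GB single-server figure comes out to $15.5K, which the post rounds to $16K.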
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
Hervé Piedvache wrote: Sorry but I don't agree with this ... Slony is a replication solution ... I don't need replication ... what will I do when my database grows up to 50 Gb ... will I need more than 50 Gb of RAM on each server ??? This solution is not very realistic for me ... Have you confirmed you need a 1:1 RAM:data ratio? Of course more memory gets more speed, but often at a diminishing rate of return. Unless every record of your 50GB is used in every query, only the most commonly used elements of your DB need to be in RAM. This is the very idea of caching.
Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql
I inferred this from reading up on the compressed vm project. It can be higher or lower depending on what devices you have in your system -- however, I've read messages from kernel hackers saying Linux is very aggressive in reserving memory space for devices because it must be allocated at boot time. Josh Berkus wrote: William, The threshold for using PAE is actually far lower than 4GB. 4GB is the total memory address space -- split that in half for 2GB for userspace, 2GB for kernel. The OS cache resides in kernel space -- after you take away the memory allocation for devices, you're left with a window of roughly 900MB. I'm curious, how do you get 1.1GB for memory allocation for devices?
Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql
[EMAIL PROTECTED] wrote: Since the optimal state is to allocate a small amount of memory to Postgres and leave a huge chunk to the OS cache, this means you are already hitting the PAE penalty at 1.5GB of memory. How could I avoid this penalty? Upgrade to 64-bit processors + 64-bit linux.
Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql
My experience is RH9 auto-detected machines with >= 2GB of RAM and installed the PAE bigmem kernel by default. I'm pretty sure the FC2/3 installer will do the same. [EMAIL PROTECTED] wrote: I understand that the 2.6.* kernels are much better at large memory support (with respect to performance issues), so unless you have a 64-bit machine lying around - this is probably worth a try. You may need to build a new kernel with the various parameters: CONFIG_NOHIGHMEM, CONFIG_HIGHMEM4G, CONFIG_HIGHMEM64G set appropriately (or even upgrade to the latest 2.6.10). I would expect that some research and experimentation will be required to get the best out of it - (e.g. the 'bounce buffers' issue). In a standard rpm FC 2-3 newly installed server, would it automatically detect and configure this if I use the machine with 4 Gb [6Gb.], or should I manually configure it? Amrit, Thailand
Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql
Gavin Sherry wrote: There is no problem with free Linux distros handling 4 GB of memory. The problem is that 32-bit hardware must make use of some less than efficient mechanisms to be able to address the memory. The threshold for using PAE is actually far lower than 4GB. 4GB is the total memory address space -- split that in half for 2GB for userspace, 2GB for kernel. The OS cache resides in kernel space -- after you take away the memory allocation for devices, you're left with a window of roughly 900MB. Since the optimal state is to allocate a small amount of memory to Postgres and leave a huge chunk to the OS cache, this means you are already hitting the PAE penalty at 1.5GB of memory.
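The "roughly 900MB" figure above follows from simple arithmetic over the numbers given in the thread. The 1.1GB device reservation is the figure assumed in these messages, not a measurement, and it varies by hardware:

```python
# Where the ~900MB lowmem cache window comes from on 32-bit Linux:
# 4GB address space, split 2GB user / 2GB kernel by default, with
# device mappings carved out of the kernel half.
kernel_half_mb = 2048
device_reservation_mb = 1100   # assumed in the thread; hardware-dependent
cache_window_mb = kernel_half_mb - device_reservation_mb
print(cache_window_mb)  # 948, i.e. "roughly 900MB"
```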
Re: [PERFORM] Low Performance for big hospital server ..
[EMAIL PROTECTED] wrote: Now I turned hyperthreading off and readjusted the conf. I found the problem query: an update of one flag of the table [8 million records, which I think is not too much]. When I turned this query off, everything went fine. I don't know whether updating data is much slower than inserting [Postgresql 7.3.2] and how we could improve the update method. UPDATE is expensive. Under an MVCC setup, it's roughly the equivalent of DELETE + INSERT (i.e., the old record is invalidated and a new version of the record is created). Updating 8 million records would be very I/O intensive and probably flushes your OS cache, so all other queries hit disk versus superfast memory. And if this operation is run multiple times during the day, you may end up with a lot of dead tuples in the table which makes querying it deadly slow. If it's a dead tuples issue, you probably have to increase your free space map and vacuum analyze that specific table more often. If it's an I/O hit issue, a lazy updating procedure would help if the operation is not time critical (e.g. load the record keys that need updating and loop through the records with a time delay.)
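The "lazy updating procedure" suggested above might look like the sketch below. This is a hedged illustration, not the poster's code: the table and column names are invented, and `conn` stands for any DB-API connection:

```python
# Sketch of lazy updating: split the keys into batches and pause between
# batches so one giant UPDATE doesn't flush the OS cache in a single
# I/O burst. Table/column names are hypothetical.
import time

def batches(keys, size):
    """Split a key list into fixed-size chunks."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]

def lazy_update(conn, keys, batch_size=1000, delay_s=0.5):
    cur = conn.cursor()
    for batch in batches(keys, batch_size):
        cur.execute("UPDATE records SET flag = true WHERE id = ANY(%s)",
                    (batch,))
        conn.commit()        # short transactions; dead tuples accrue gradually
        time.sleep(delay_s)  # give other queries breathing room
```

Combined with regular VACUUM ANALYZE on the table, this spreads the dead-tuple and I/O cost across the day instead of concentrating it.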
Re: [PERFORM] Low Performance for big hospital server ..
[EMAIL PROTECTED] wrote: I will try to reduce shared buffers to 1536 [1.87 Mb]. 1536 is probably too low. I've tested a bunch of different settings on my 8GB Opteron server and 10K seems to be the best setting. Also, effective cache is the sum of kernel buffers + shared_buffers, so it should be bigger than shared buffers. I will also make the effective cache 2097152 [2 Gb]. I will give you the result, because tomorrow [4/12/05] will be the official day of my hospital [which has more than 1700 OPD patients/day]. To figure out your effective cache size, run top and add free+cached. Also, turning hyperthreading off may help; it is unlikely to be doing any good unless you are running a relatively new (2.6.x) kernel. Why? Could you give me the reason? Pre-2.6, the kernel does not know the difference between logical and physical CPUs. Hence, in a dual processor system with hyperthreading, it actually sees 4 CPUs. And when assigning processes to CPUs, it may assign to 2 logical CPUs in the same physical CPU. I presume you are vacuuming on a regular basis? Yes, vacuumdb daily. Do you vacuum table by table or the entire DB? I find over time, the system tables can get very bloated and cause a lot of slowdowns just due to schema queries/updates. You might want to try a VACUUM FULL ANALYZE just on the system tables.
Re: [PERFORM] Low Performance for big hospital server ..
Dave Cramer wrote: William Yu wrote: [EMAIL PROTECTED] wrote: I will try to reduce shared buffers to 1536 [1.87 Mb]. 1536 is probably too low. I've tested a bunch of different settings on my 8GB Opteron server and 10K seems to be the best setting. Be careful here, he is not using Opterons, which can access physical memory above 4G efficiently. Also he only has 4G; the 6-10% rule still applies. 10% of 4GB is 400MB. 10K buffers is 80MB. Easily less than the 6-10% rule. To figure out your effective cache size, run top and add free+cached. My understanding is that effective cache is the sum of shared buffers plus kernel buffers; not sure what free + cached gives you? Not true. Effective cache size is the free memory available that the OS can use for caching for Postgres. In a system that runs nothing but Postgres, it's free + cached.
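The "free + cached" estimate described above can be automated by reading `/proc/meminfo` on Linux instead of eyeballing `top`. A minimal sketch, assuming the `MemFree:`/`Cached:` field names used by 2.4/2.6-era kernels:

```python
# Estimate effective_cache_size (in kB) as MemFree + Cached, the
# "free + cached" rule from the post. Parses /proc/meminfo-style text.
def estimate_effective_cache_kb(meminfo_text):
    fields = {}
    for line in meminfo_text.splitlines():
        if ":" in line:
            name, rest = line.split(":", 1)
            fields[name.strip()] = int(rest.split()[0])  # values are in kB
    return fields.get("MemFree", 0) + fields.get("Cached", 0)

sample = "MemTotal: 4039356 kB\nMemFree: 201234 kB\nCached: 2987654 kB\n"
print(estimate_effective_cache_kb(sample))  # 3188888
```

On a live box you would pass `open("/proc/meminfo").read()` instead of the sample string; the caveat from the post still applies -- this only works on a machine running nothing but Postgres.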
Re: [PERFORM] Some Performance Advice Needed
Alex wrote: Hi, I recently ran pgbench against different servers and got some results I don't quite understand. A) EV1: Dual Xeon, 2GHz, 1GB Memory, SCSI 10Krpm, RHE3 B) Dual Pentium3 1.4GHz (Blade), SCSI Disk 10Krpm, 1GB Memory, Redhat 8 C) P4 3.2GHz, IDE 7.2Krpm, 1GB Mem, Fedora Core2 Running pgbench reported A) 220 tps B) 240 tps C) 510 tps Running hdparm reported A) 920mb/s (SCSI 10k) B) 270mb/s (SCSI 10k) C) 1750mb/s (IDE 7.2k) What I don't quite understand is why a P4 3.2 is twice as fast as a Dual Xeon with SCSI disks, a dual Xeon 2GHz is not faster than a dual P3 1.4GHz, and the hdparm results also don't make much sense. A few things to clear up about the P3/P4/Xeons. Xeons are P4s. Hence, a P4 2GHz will run at the same speed as a Xeon 2GHz assuming all other variables are the same. Of course they aren't, because your P4 is probably running unregistered memory, uses either a 533MHz or 800MHz FSB compared to the Xeon's 400MHz FSB shared among 2 CPUs, and is running a faster non-SMP kernel. Add all those variables up and it's definitely possible for a P4 3.2GHz to run twice as fast as a Dual Xeon 2GHz on a single-thread benchmark. (The corollary here is that in a multi-thread benchmark, the 2X Xeon can only hope to equal your P4 3.2.) P3s are faster than P4s at the same clock rate. By a lot. It's not really that surprising that a P3 1.4 is faster than a P4/Xeon 2.0. I've seen results like this many times over a wide range of applications. The only variable that is throwing off your comparisons is the hard drives. IDE drives have write caching on by default -- SCSI drives have it off. Use: hdparm -W0 /dev/hda to turn it off on the P4 system and rerun the tests then.
Re: [PERFORM] Some Performance Advice Needed
IDE disks lie about write completion (This can be disabled on some drives) whereas SCSI drives wait for the data to actually be written before they report success. It is quite easy to corrupt a PG (Or most any db really) on an IDE drive. Check the archives for more info. Do we have any real info on this? Specifically which drives? Is SATA the same way? What about SATA-II? I am not saying it isn't true (I know it is) but this is a blanket statement that may or may not be true with newer tech. From my experience with SATA controllers, write caching is controlled via the BIOS.
Re: [PERFORM] Some quick Opteron 32-bit/64-bit results
Greg Stark wrote: William Yu [EMAIL PROTECTED] writes: Biggest speedup I've found yet is the backup process (pg_dump | gzip). 100% faster in 64-bit mode. This drastic speedup might be more the result of 64-bit GZIP though, as I've seen benchmarks in the past showing encryption/compression running 2 or 3 times faster in 64-bit mode versus 32-bit. Isn't this a major kernel bump too? So a different scheduler, different IO scheduler, etc? I'm sure there's some speedup due to the kernel bump. I really didn't have the patience to even burn the FC2 32-bit CDs, much less install both 32-bit and 64-bit FC2, in order to have a more accurate baseline comparison. However, that being said -- when you see huge speed increases like 50%-100% for dump+gzip, it's doubtful the kernel/process scheduler/IO scheduler could have made that drastic of a difference. Maybe somebody else who has done a 2.4-to-2.6 upgrade can give us a baseline to subtract from my numbers.
[PERFORM] Some quick Opteron 32-bit/64-bit results
I just finished upgrading the OS on our Opteron 148 from Redhat9 to Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc. The verdict: a definite performance improvement. I tested just a few CPU intensive queries and many of them are a good 30%-50% faster. Transactional/batch jobs involving client machines (i.e. include fixed client/networking/odbc overhead) seem to be about 10%-20% faster although I will need to run more data through the system to get a better feel of the numbers.
Re: [PERFORM] Some quick Opteron 32-bit/64-bit results
Biggest speedup I've found yet is the backup process (pg_dump | gzip). 100% faster in 64-bit mode. This drastic speedup might be more the result of 64-bit GZIP though, as I've seen benchmarks in the past showing encryption/compression running 2 or 3 times faster in 64-bit mode versus 32-bit. William Yu wrote: I just finished upgrading the OS on our Opteron 148 from Redhat9 to Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc. The verdict: a definite performance improvement. I tested just a few CPU intensive queries and many of them are a good 30%-50% faster. Transactional/batch jobs involving client machines (i.e. include fixed client/networking/odbc overhead) seem to be about 10%-20% faster although I will need to run more data through the system to get a better feel of the numbers.
Re: [PERFORM] Some quick Opteron 32-bit/64-bit results
I gave -O3 a try with -funroll-loops, -fomit-frame-pointer and a few others. It seemed to perform about the same as the default -O2, so I just left it as -O2. Gustavo Franklin Nóbrega wrote: Hi William, Which GCC flags did you use to compile PostgreSQL? Best regards, Gustavo Franklin Nóbrega, Infraestrutura e Banco de Dados, Planae Tecnologia da Informação, (+55) 14 3224-3066 Ramal 209, www.planae.com.br I just finished upgrading the OS on our Opteron 148 from Redhat9 to Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc. The verdict: a definite performance improvement. I tested just a few CPU intensive queries and many of them are a good 30%-50% faster. Transactional/batch jobs involving client machines (i.e. include fixed client/networking/odbc overhead) seem to be about 10%-20% faster although I will need to run more data through the system to get a better feel of the numbers.
Re: [PERFORM] Caching of Queries
Josh Berkus wrote: 1) Query caching is not a single problem, but rather several different problems requiring several different solutions. 2) Of these several different solutions, any particular query result caching implementation (but particularly MySQL's) is rather limited in its applicability, partly due to the tradeoffs required. Per your explanation, Oracle has improved this by offering a number of configurable options. 3) Certain other caching problems would be solved in part by the ability to construct in-memory tables which would be non-durable and protected from cache-flushing. This is what I'm interested in chatting about. Just my 2 cents on this whole issue. I would lean towards having result caching in pgpool versus the main backend. I want every ounce of memory on a database server devoted to the database. Caching results would double the effect of cache flushing ... ie, now both the results and the pages used to build the results are in memory, pushing out other stuff to disk that may be just as important. If it was in pgpool or something similar, I could devote a separate machine just for caching results, leaving the db server untouched.
Re: [PERFORM] Table UPDATE is too slow
Ron St-Pierre wrote: Yes, I know that it's not a very good idea, however queries are allowed against all of those columns. One option is to disable some or all of the indexes when we update, run the update, and recreate the indexes, however it may slow down user queries. Because there are so many indexes, it is time consuming to recreate them after the update. Just because a query can run against any column does not mean all columns should be indexed. Take a good look at the column types and their value distribution. Let's say I have a table of addresses but every address I collect is in the 94116 zip code. That would mean indexes on city, state and zip are not only useless but could decrease performance. Also, if a search always includes a unique key (or a column with highly unique values), eliminating the other indexes would force the planner to always use that index first.
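The value-distribution check suggested above can be made concrete with a quick selectivity test. This is an illustrative sketch, not a PostgreSQL rule: the 1% threshold below is an invented example, and in practice you would look at `pg_stats` rather than pull the column into Python:

```python
# A column where nearly every row shares one value (e.g. zip = 94116)
# is a poor index candidate: the index is rarely selective enough to
# beat a sequential scan. Threshold is an illustrative assumption.
def worth_indexing(values, min_distinct_ratio=0.01):
    distinct = len(set(values))
    return distinct / len(values) >= min_distinct_ratio

zips = ["94116"] * 9990 + ["94117"] * 10   # nearly constant column
print(worth_indexing(zips))                # False: index likely useless
```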
Re: [PERFORM] Help specifying new machine
You're not getting much of a bump with this server. The CPU is incrementally faster -- in the absolute best case scenario where your queries are 100% cpu-bound, that's about ~25%-30% faster. What about using a Dual Athlon MP instead of a Xeon? It would be much less expensive, but have higher performance (I think). You're not going to be able to get a Dual Athlon MP for the same price as a single Xeon. A few years back, this was the case because Xeon CPUs/MBs had a huge premium over Athlon. This is no longer true, mainly because the number of vendors carrying Athlon MP motherboards has dropped drastically. Go to pricewatch.com and do a search for 760MPX -- you get a mere 8 entries. Not surprising, because who would not want to spend a few pennies more for a much superior Dual Opteron? The few sellers you see now just keep stuff in inventory for people who need replacement parts for emergencies and are willing to pay through the nose because it is an emergency.
Re: [PERFORM] Scaling further up
Anjan Dave wrote: We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running RH9, PG 7.4.0. There's an internal U320, 10K RPM RAID-10 setup on 4 drives. We are expecting a pretty high load, a few thousand 'concurrent' users executing either select, insert, or update statements. The quick and dirty method would be to upgrade to the recently announced 3GHz Xeon MPs with 4MB of L3. My semi-educated guess is that you'd get another +60% there due to the huge L3 hiding the Xeon's shared bus penalty.
Re: [PERFORM] cache whole data in RAM
David Teran wrote: Hi, we are trying to speed up a database which has about 3 GB of data. The server has 8 GB RAM and we wonder how we can ensure that the whole DB is read into RAM. We hope that this will speed up some queries. regards David Upon bootup, automatically run SELECT * FROM xyz on every table in your database.
[PERFORM] Update on putting WAL on ramdisk/
Some arbitrary data processing job: WAL on single drive: 7.990 rec/s. WAL on 2nd IDE drive: 8.329 rec/s. WAL on tmpfs: 13.172 rec/s. A huge jump in performance, but a bit scary having a WAL that can disappear at any time. I'm gonna work up an rsync script and do some power-off experiments to see how badly it gets mangled. This could be a good method though when you're dumping and restoring an entire DB. Make a tmpfs mount, restore, shut down the DB and then copy the WAL back to the HD. I checked out the SanDisk IDE FlashDrives. They have a write cycle life of 2 million. I'll explore more expensive solid state drives.
Re: [PERFORM] Update on putting WAL on ramdisk/
Russell Garrett wrote: WAL on single drive: 7.990 rec/s WAL on 2nd IDE drive: 8.329 rec/s WAL on tmpfs: 13.172 rec/s A huge jump in performance, but a bit scary having a WAL that can disappear at any time. I'm gonna work up an rsync script and do some power-off experiments to see how badly it gets mangled. Surely this is just equivalent to disabling fsync? If you put a WAL on a volatile file system, there's not a whole lot of point in having one at all. These tests were all with fsync off. And no, it's not equivalent to fsync off since the WAL is always written immediately regardless of the fsync setting.
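The rsync script mentioned in the exchange above might take roughly this shape. This is a hypothetical sketch, not the poster's script: all paths and the interval are made-up examples, and a power loss would still cost up to one interval's worth of WAL:

```python
# Periodically mirror a tmpfs-resident pg_xlog to real disk so a crash
# loses at most `interval_s` seconds of WAL. Paths are hypothetical.
import subprocess
import time

def rsync_cmd(src, dst):
    """Build the rsync invocation (archive mode, mirror deletions)."""
    return ["rsync", "-a", "--delete", src, dst]

def sync_wal_loop(src="/mnt/waltmpfs/pg_xlog/", dst="/var/backup/pg_xlog/",
                  interval_s=30, iterations=None):
    done = 0
    while iterations is None or done < iterations:
        subprocess.run(rsync_cmd(src, dst), check=True)
        done += 1
        time.sleep(interval_s)
```

As the reply notes, this only narrows the data-loss window; it does not restore the durability guarantee a disk-resident WAL provides.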
Re: [PERFORM] Hardware suggestions for Linux/PGSQL server
Jeff Bohmer wrote: We're willing to shell out extra bucks to get something that will undoubtedly handle the projected peak load in 12 months with excellent performance. But we're not familiar with PG's performance on Linux and don't like to waste money. Properly tuned, PG on Linux runs really nicely. A few people have mentioned the VM swapping algorithm on Linux is semi-dumb. I get around that problem by having a ton of memory and almost no swap. I've been thinking of this (overkill? not enough?): 2 Intel 32-bit CPUs, the lowest clock speed chip for the fastest available memory bus, 4 GB RAM (maybe we only need 3 GB to start with?), SCSI RAID 1 for the OS. For PostgreSQL data and logs ... 15k rpm SCSI disks, RAID 5, 7 disks, 256MB battery-backed write cache. (Should we save $ and get a 4-disk RAID 10 array?) I wonder about the 32bit+bigmem vs. 64bit question. At what database size will we need more than 4GB RAM? With 4GB of RAM, you're already running into bigmem. By default, Linux gives 2GB of address space to programs and 2GB to the kernel. I usually see people quote a 5%-15% penalty in general for using PAE versus a flat address space. I've seen simple MySQL benchmarks where 64-bit versions run 35%+ faster versus 32-bit+PAE, but how that translates to PG, I dunno yet. We'd like to always have enough RAM to cache the entire database. While 64bit is in our long-term future, we're willing to stick with 32bit Linux until 64bit Linux on Itanium/Opteron and 64bit PostgreSQL settle in to proven production quality. Well, if this is the case, you probably should get an Opteron server *now* and just run 32-bit Linux on it until you're sure about the software. No point in buying a Xeon and then throwing the machine away in a year when you decide you need 64-bit for more speed.
Re: [PERFORM] Hardware suggestions for Linux/PGSQL server
Jeff Bohmer wrote: It seems I don't fully understand the bigmem situation. I've searched the archives, googled, checked RedHat's docs, etc. But I'm getting conflicting, incomplete and/or out of date information. Does anyone have pointers to bigmem info or configuration for the 2.4 kernel? Bigmem is the name for Linux's PAE support. If Linux is set up with 2GB for kernel and 2GB for user, would that be OK with a DB size of 2-2.5 GB? I'm figuring the kernel will cache most/all of the DB in its 2GB and there's 2GB left for PG processes. Where do PG's SHM buffers live, kernel or user? (I don't plan on going crazy with buffers, but will guess we'd need about 128MB, 256MB at most.) PG's SHM buffers live in user space. As for whether Linux's OS cache lives in user or kernel space: I think it's in kernel, and I remember reading a max of ~950MB w/o bigmem, which means your 3.5GB of available OS memory will definitely have to be swapped in and out of kernel space using PAE. Well, if this is the case, you probably should get an Opteron server *now* and just run 32-bit Linux on it until you're sure about the software. No point in buying a Xeon and then throwing the machine away in a year when you decide you need 64-bit for more speed. That's a good point. I had forgotten about the option to run 32bit on an Opteron. If we had 3GB or 4GB initially on an Opteron, we'd need bigmem for 32bit Linux, right? This might work nicely since we'd factor in the penalty from PAE for now and have the performance boost from moving to 64bit available on demand. Not having to build another DB server in a year would also be nice. FYI, we need stability first and performance second. We ordered a 2x Opteron server the moment the CPU was released and it's been perfect -- except for one incident where the PCI riser card had drifted out of the PCI slot due to the heavy SCSI cables connected to the card.
I think most of the Opteron server MBs are pretty solid, but if you want extra peace of mind, you could get a server from Newisys as they pack in a cartload of extra monitoring features.
Re: [PERFORM] Has anyone run on the new G5 yet
Sean Shanny wrote: First question is, do we gain anything by moving to the RH Enterprise version of Linux in terms of performance, mainly in the IO realm, as we are not CPU bound at all? Second and more radical: has anyone run PostgreSQL on the new Apple G5 with an XRaid system? This seems like a great value combination. Fast CPU, wide bus, Fibre Channel IO, 2.5TB all for ~$17k. It seems like a great value, but until Apple produces a G5 that supports ECC, I'd pass on them.
Re: [PERFORM] Slow UPADTE, compared to INSERT
Ivar Zarans wrote: I am experiencing strange behaviour, where a simple UPDATE of one field is very slow, compared to INSERT into a table with multiple indexes. I have two tables - one with raw data records (about 24000), where one field In Postgres and any other DB that uses MVCC (multi-version concurrency), UPDATEs will always be slower than INSERTs. With MVCC, what the DB does is make a copy of the record, update that copy and then invalidate the previous record. This maintains a consistent view for anybody who's reading the DB and also avoids the requirement of row locks. If you have to use UPDATE, make sure (1) your UPDATE WHERE clause is properly indexed and (2) you are running ANALYZE/VACUUM periodically so the query planner can optimize for your UPDATE statements.
Re: [PERFORM] Maximum Possible Insert Performance?
Tom Lane wrote: William Yu [EMAIL PROTECTED] writes: I then tried to put the WAL directory onto a ramdisk. I turned off swapping, created a tmpfs mount point and copied the pg_xlog directory over. Everything looked fine as far as I could tell, but Postgres just panic'd with a file permissions error. Anybody have thoughts as to why tmpfs would not work? I'd say you got the file or directory ownership or permissions wrong. I did a mv instead of a cp, which duplicates ownership and permissions exactly.
Re: [PERFORM] Maximum Possible Insert Performance?
This is an intriguing thought which leads me to think about a similar solution for even a production server, and that's a solid state drive for just the WAL. What's the max disk space the WAL would ever take up? There are quite a few 512MB/1GB/2GB solid state drives available now in the ~$200-$500 range, and if you never hit those limits... When my current job batch is done, I'll save a copy of the dir and give the WAL-on-ramdrive a test. And perhaps even buy a SanDisk at the local store and run that through the hopper. Shridhar Daithankar wrote: Mount WAL on a RAM disk. WAL is the most often hit area for heavy updates/inserts. If you speed that up, things should be quite a bit faster.
Re: [PERFORM] Maximum Possible Insert Performance?
Josh Berkus wrote: William, When my current job batch is done, I'll save a copy of the dir and give the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local store and run that through the hopper. We'll be interested in the results. The SanDisk won't be much of a performance test; last I checked, their access speed was about 1/2 that of a fast SCSI drive. But it could be a feasibility test for the more expensive RAMdrive approach. The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could get 4 of these suckers and put them in a mega-RAID-0 stripe for 64MBps. You shouldn't need to do mirroring with a solid state drive. Time to Google up some more solid state drive vendors.
Re: [PERFORM] Maximum Possible Insert Performance?
Josh Berkus wrote: William, The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could get 4 of these suckers and put them in a mega-RAID-0 stripe for 64MBps. You shouldn't need to do mirroring with a solid state drive. I wouldn't count on RAID0 improving the speed of SanDisks much. How are you connecting to them? USB? USB doesn't support fast parallel data access. You can get ATA SanDisks up to 2GB. Another vendor I checked out -- BitMicro -- has solid state drives for SATA, SCSI and Fibre Channel. I definitely would not use USB SSDs -- USB performance would be so pokey as to be useless. Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB ramdisks, then that's worth considering. Looks like they're linear with size: a SanDisk Flashdrive 1GB is about $1000 while 256MB is $250. You're right, though, mirroring a solid state drive is pretty pointless; if power fails, both mirrors are dead. Actually, no. Solid state memory is non-volatile; it retains data even without power.
Re: [PERFORM] Pg+Linux swap use
Rob Sell wrote: Not being one to hijack threads, but I haven't heard of this performance hit when using HT. I have what should by all rights be a pretty fast server: dual 2.4 Xeons with HT, a 205GB RAID 5 array, 1 gig of memory. And it is only 50% as fast as my old server, which was a dual AMD MP 1400 with a 45GB RAID 5 array and 1GB of RAM. I have read everything I could find on Pg performance and tweaked all the variables that were suggested, and nothing. Which is why I subscribed to this list; I've just been lurking so far, but this caught my eye. Not to get into a big Intel vs AMD argument, but 50% sounds about right. Let's first assume that the QS rating for the MP1400 is relatively accurate and convert that to a 1.4GHz Xeon. 2.4/1.4 = +71%. Since processor performance does not increase linearly with clock speed, 50% is in line with expectations. Then you throw in the facts that (1) QS ratings for slower AMD chips are understated (but overstated for the fastest chips), (2) AMD uses a point-to-point CPU/memory interface (much better for SMP) versus the P4/Xeon's shared bus, and (3) the Athlon architecture is more suited for DB work compared to the P4, and I'd say you're lucky to see 50% more performance from a Xeon 2.4. As for HT, I've seen quite a few benchmarks where HT hurts performance. The problem is that it's not only app- and workload-specific but also system- and usage-specific. Since it involves the internal rescheduling of processes, adding more simultaneous processes could help to a point and then start hurting, or vice versa.
Re: [PERFORM] Tuning for mid-size server
So what is the ceiling on 32-bit processors for RAM? Most of the 64-bit vendors are pushing the Athlon64 and G5 as breaking the 4GB barrier, and even I can do the math on 2^32. All these 64-bit vendors, then, are talking about the limit on RAM *per application* and not per machine? 64-bit CPU on a 64-bit OS: up to the physical address limit for anything and everything. 64-bit CPU on a 32-bit OS: up to 4GB minus the kernel allocation -- which is usually 2GB on Windows and Linux. On Windows, you can up this to 3GB by using the /3GB switch; Linux requires a kernel recompile. PAE is then used to move the memory window to point to different areas of the physical memory.
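The arithmetic behind those numbers, as a quick sketch (the /3GB switch simply trades 1GB of kernel space for user space; PAE changes none of these per-process limits, it only remaps which physical memory the 4GB window points at):

```python
# 32-bit virtual address space: 2^32 bytes = 4 GB per process.
GB = 1024 ** 3
address_space = 2 ** 32
print(address_space // GB)  # 4

# Default split reserves 2 GB for the kernel, leaving 2 GB for the app.
user_space_default = address_space - 2 * GB

# Windows /3GB switch (or a Linux kernel recompile) shrinks the kernel
# reservation to 1 GB, leaving 3 GB for the app.
user_space_3gb = address_space - 1 * GB

print(user_space_default // GB, user_space_3gb // GB)  # 2 3
```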
Re: [PERFORM] PostgreSQL data on a NAS device ?
I have never worked with a XEON CPU before. Does anyone know how it performs running PostgreSQL 7.3.4 / 7.4 on RedHat 9? Is it faster than a Pentium 4? I believe the main difference is cache memory, right? Aside from cache mem, it's basically a Pentium 4, or am I wrong? Well, the problem is, of course, that there are so many flavors of P4s and Xeons that it's hard to tell which is faster unless you specify the exact model. And even then, it would depend on the workload. Would a Xeon/3GHz/2MB L3/400FSB be faster than a P4C/3GHz/800FSB? No idea, as no one has complete number breakdowns on these comparisons. Oh yeah, you could get a big round number that says on SPEC or something that one CPU is faster than the other, but whether that's faster for Postgres and your PG app is a totally different story. With that in mind, I wouldn't worry about it. The CPU is probably plenty fast for what you need to do. I'd look into two things in the server: memory and CPU expandability. I know you already plan on 4GB, but you may need even more in the future. Few things can improve performance more dramatically than moving disk access to disk cache. And if there's a 2nd socket where you can pop another CPU in, that would leave you extra room if your server becomes CPU limited.
Re: [PERFORM] Reading data in bulk - help?
1) Memory - clumsily adjusted shared_buffers - tried three values: 64, 128, 256, with no discernible change in performance. Also adjusted, clumsily, effective_cache_size to 1000, 2000, 4000 - with no discernible change in performance. I looked at the Admin manual and googled around for how to set these values and I confess I'm clueless here. I have no idea how many kernel disk page buffers are used, nor do I understand what the shared memory buffers are used for (although the postgresql.conf file hints that it's for communication between multiple connections). Any advice or pointers to articles/docs is appreciated. The standard procedure is 1/4 of your memory for shared_buffers. The easiest way to calculate it would be ###MB / 32 * 1000. E.g. if you have 256MB of memory, your shared_buffers should be 256 / 32 * 1000 = 8000. The remaining memory you have left over should be marked as OS cache via the effective_cache_size setting. I usually just multiply the shared_buffers value by 3 on systems with a lot of memory. With less memory, OS/Postgres/etc. take up a larger percentage of memory, so values of 2 or 2.5 would be more accurate.
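The rule of thumb above can be sketched as follows. (The /32*1000 shortcut works because shared_buffers in 7.x is counted in 8KB pages: a quarter of RAM is RAM_MB/4 megabytes, which is RAM_MB/4/0.008 ≈ RAM_MB/32*1000 pages. The function name is mine, not anything from Postgres.)

```python
def pg_memory_settings(ram_mb):
    """Rule of thumb from the post: 1/4 of RAM for shared_buffers
    (expressed as a count of 8KB pages), then roughly 3x that for
    effective_cache_size on a machine with plenty of memory."""
    shared_buffers = ram_mb // 32 * 1000      # ~ (ram_mb/4 MB) / 8 KB
    effective_cache_size = shared_buffers * 3
    return shared_buffers, effective_cache_size

# Matches the worked example in the post: 256MB of RAM -> 8000 pages.
print(pg_memory_settings(256))  # (8000, 24000)
```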
Re: [PERFORM] SELECT's take a long time compared to other DBMS
Relaxin wrote: I have a table with 102,384 records in it, each record is 934 bytes. Using the following select statement: SELECT * from table PG Info: version 7.3.4 under cygwin on Windows 2000 ODBC: version 7.3.100 Machine: 500 MHz / 512MB RAM / IDE HDD Under PG: Data is returned in 26 secs!! Under SQL Server: Data is returned in 5 secs. Under SQLBase: Data is returned in 6 secs. Under SAPDB: Data is returned in 7 secs. I created a similar table (934 bytes, 102K records) on a slightly faster machine: P3/800 + 512MB RAM + IDE HD. The server OS is Solaris 8 x86 and the version is 7.3.3. On the server (via the psql client): 7.5 seconds. Using ODBC under VFPW: 10.5 seconds. How that translates to what you should see, I'm not sure. Assuming it was just the CPU difference, you should see numbers of roughly 13 seconds. But the documentation says PG under Cygwin is significantly slower than PG under Unix, so your mileage may vary... Have you changed any of the settings yet in postgresql.conf, specifically the shared_buffers setting?
Re: [PERFORM] Hardware recommendations to scale to silly load
Shridhar Daithankar wrote: Be careful here; we've seen that with the P4 Xeons that are hyper-threaded, a system that has very high disk I/O becomes sluggish and slow. But after disabling the hyper-threading itself, our system flew. Anybody have an Opteron working? How's the performance? Yes. I'm using a 2x 1.8GHz Opteron system w/ 8GB of RAM. Right now, I'm still using 32-bit Linux -- I'm letting others be the 64-bit guinea pigs. :) I probably will get a cheapie 1x Opteron machine first and test the 64-bit kernel/libraries thoroughly before rolling it out to production. As for performance, the scaling is magnificent -- even when just using PAE instead of 64-bit addressing. At low transaction counts, it's only ~75% faster than the 2x Athlon 1800+ MP it replaced. But once the transactions start coming in, the gap is as high as 5x. My w-a-g: since each CPU has an integrated memory controller, you avoid memory bus contention, which is probably the major bottleneck as transaction load increases. (I've seen several Opteron vs Xeon comparisons where single-connection tests are par for both CPUs but heavy-load tests favor the Opteron by a wide margin.) I suspect the 4X comparisons would tilt even more in AMD's favor. We should see a boost when we move to 64-bit Linux, and hopefully another one when NUMA for Linux is production-stable.
Re: [PERFORM] Hardware recommendations to scale to silly load
Shridhar Daithankar wrote: Just a guess here, but does a precompiled postgresql for x86 versus an x86-64 optimized one make a difference? The Opteron is the one place on earth where you can watch the difference between 32/64 bit on the same machine. Can be handy at times.. I don't know yet. I tried building a 64-bit kernel and my eyes glazed over trying to figure out how to create the cross-platform GCC compiler that's first needed to build the kernel. Then I read that all the libraries and drivers also needed to be 64-bit compiled, and at that point I gave up the ghost. I'll wait until a 64-bit RedHat distro is available before I test the 64-bit capabilities. The preview SuSE 64-bit Linux used in most of the Opteron rollout tests has MySQL precompiled as 64-bit, and under that DB, 64-bit added an extra ~25% performance (compared to a 32-bit SuSE install). My guess is half of that performance comes from eliminating the PAE swapping. I am sure. But in a 64-bit environment, Xeon is not the competition. It's PA-RISC 8700, UltraSPARCs, the Power series and, if possible, Itanium. Well, just because the Opteron is 64-bit doesn't mean it's direct competition for the high-end RISC chips. Yes, if you're looking at the discrete CPU itself, it appears they could compete -- the SpecINT scores place the Opteron near the top of the list. But big companies also need the infrastructure, management tools and top-end scalability. If you just have to have the million-dollar machines (128x Itanium2 servers or whatever), AMD is nowhere close to competing unless Beowulf clusters fit your needs. In terms of infrastructure, scalability, mindshare and pricing, Xeon is most certainly Opteron's main competition. We're talking $10K servers versus $50K+ servers (assuming you actually want performance instead of having a single pokey UltraSPARC CPU in a box). And yes, just because Opteron is a better-performing server platform than Xeon doesn't mean a corporate fuddy-duddy still won't buy Xeon due to the $1B spent by Intel on marketing.
We should see a boost when we move to 64-bit Linux and hopefully another one when NUMA for Linux is production-stable. Getting 2.6 running now is the answer to making it stable fast.. :-) Of course, if you have spare hardware.. My office is a pigsty of spare hardware lying around. :) We're like pigs rolling around in the mud.