Re: [HACKERS] win32 performance - fsync question
On Thu, 17 Feb 2005, Andrew Dunstan wrote:

> (the results are interesting, though - with fsync off Windows and Linux
> are in the same performance ballpark.)

Some additional numbers:

WinXP  fsync = true    20-28 tps
WinXP  fsync = false   600 tps
Linux  fsync = true    800 tps
Linux  fsync = false   980 tps

Regards,
E.R.
_________________________________________________________________
Evgeny Rodichev
Sternberg Astronomical Institute, Moscow State University
email: [EMAIL PROTECTED]
Phone: 007 (095) 939 2383
Fax: 007 (095) 932 8841
http://www.sai.msu.su/~er
Re: [HACKERS] win32 performance - fsync question
There are two different concerns here:

1. loss of transactions because of unexpected power loss and/or system
   failure
2. inconsistent database state

For many applications (1) is fairly acceptable, and (2) is not.

So I'd like to put my questions another way.

- If PostgreSQL is running without fsync and a power loss occurs, which
  kind of damage is possible: 1, 2, or both?
- It looks like with a proper fwrite/fflush policy it is possible to
  guarantee that only transaction loss may occur, while the database
  stays in a consistent state as of before the last (several)
  transactions. Is this true for PostgreSQL?

Regards,
E.R.
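For readers following the fwrite/fflush question above, here is a minimal sketch, assuming a POSIX system, of the two layers of buffering involved. `fflush()` only moves data from the process's stdio buffer into the kernel's cache, which does not survive a power failure; `fsync()` asks the kernel to push that data to the storage device. The helper name `append_durably` and its arguments are hypothetical, and this is only an illustration, not PostgreSQL's actual code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#endif
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append one record and try to make it durable.
 * Returns 0 on success, -1 on any failure. */
int append_durably(const char *path, const char *record)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;

    /* fputs:  data sits in the process's stdio buffer.
     * fflush: data is handed to the kernel cache, still volatile.
     * fsync:  the kernel is asked to write it to the storage device. */
    if (fputs(record, f) == EOF ||
        fflush(f) != 0 ||
        fsync(fileno(f)) != 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Whether the data has really reached the media once `fsync()` returns still depends on how the drive treats its own write cache, which is the point debated in the rest of this thread.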
Re: [HACKERS] win32 performance - fsync question
On Thu, 17 Feb 2005, Tom Lane wrote:

> Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>>> WinXP  fsync = true    20-28 tps
>>> WinXP  fsync = false   600 tps
>>> Linux  fsync = true    800 tps
>>> Linux  fsync = false   980 tps
>>
>> Wow, that's terrible on Windows. If there's a solution, it'd be nice
>> to backport it...
>
> Actually, the number that's way out of line there is the Linux w/fsync
> one. I infer that he's got disk write cache enabled and therefore the
> transactions aren't really being synced to disk at all. Any claimed
> TPS rate exceeding your disk drive's rotation rate is a red flag.

Write cache has been enabled under Linux by default for as long as I
have dealt with it (since 1993). It doesn't interfere with fsync(), as
the Linux kernel uses a cache flush for fsync.

I have a 2.6.10 kernel running *without* any additional patches and
without any specific hdparm settings. fsync() really works fine, as I
switch off my notebook 2-3 times every day and have never had any data
loss :)

The related line from dmesg is

hda: cache flushes supported

Regards,
E.R.
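Tom Lane's rotation-rate rule of thumb can be made concrete with a little arithmetic. The sketch below assumes a single client committing serially, where each commit has to wait for the log sector to come around under the head again, so there is at most one synced commit per revolution. The function names and the 7200 rpm and 5400 rpm figures are illustrative assumptions; the thread does not state the actual disk speed.

```c
/* Upper bound on synchronous commits per second for ONE serial writer
 * on a rotating disk: at most one commit per platter revolution. */
double max_synced_tps(double rpm)
{
    return rpm / 60.0;  /* revolutions per second */
}

/* Returns 1 if a measured rate exceeds that bound (the "red flag"),
 * 0 otherwise. */
int tps_is_suspicious(double measured_tps, double rpm)
{
    return measured_tps > max_synced_tps(rpm);
}
```

Under these assumptions a 7200 rpm drive allows about 120 synced commits per second and a 5400 rpm notebook drive about 90, so `tps_is_suspicious(800.0, 7200.0)` returns 1 while the 20-28 tps WinXP figure is within the bound. Several concurrent clients whose commits share one flush can legitimately exceed this number, so the check is a hint and not a proof.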
Re: [HACKERS] win32 performance - fsync question
On Thu, 17 Feb 2005, Tom Lane wrote:

> Evgeny Rodichev [EMAIL PROTECTED] writes:
>>> Any claimed TPS rate exceeding your disk drive's rotation rate is a
>>> red flag.
>>
>> Write cache is enabled under Linux by default all the time I make deal
>> with it (since 1993).
>
> You're playing with fire.

Yes. I'm lucky in this play :)

More seriously, we (with Oleg Bartunov) have investigated many
platforms/OSes for commercial, scientific and other applications during
the past 10-12 years; I suppose virtually all of them, excluding modern
mainframes. For reliability, Linux + PostgreSQL was found to be the best
(including environments with very frequent unexpected power-offs, as at
some astronomical observatories in high mountains).

Hence, I'm lucky :)

>> fsync() really works fine as I switch off my notebook everyday 2-3
>> times, and never had any data loss :)
>
> Given that it's a notebook, it's possible that the hardware is smart
> enough not to power down the disk until the disk is done writing
> everything it's cached. Do you care to try some experiments with
> pulling out the battery while Postgres is busy making updates?

Yes, you are exactly right. All modern HDDs (not entry-level ones) have
a huge cache (on the device, not on the controller) and provide a safe
hardware flush of the cache *after* power-off (thanks to capacitors).
My HDD has a 16MB cache, and that is the reason for the excellent
performance.

Regards,
E.R.
Re: [HACKERS] win32 performance - fsync question
On Fri, 18 Feb 2005, Oliver Jowett wrote:

> Evgeny Rodichev wrote:
>> Write cache is enabled under Linux by default all the time I make deal
>> with it (since 1993). It doesn't interfere with fsync(), as linux
>> kernel uses cache flush for fsync.
>
> The problem is that most IDE drives lie (or perhaps you could say the
> specification is ambiguous) about completion of the cache-flush
> command -- they say "Yeah, I've flushed" when they have not actually
> written the data to the media and have no provision for making sure it
> will get there in the event of power failure.

Yes, I agree. But in my real SA practice I have 50-100 times met the
situation where HDDs were unexpectedly physically damaged (the heads
touched the surface), with no possibility of recovery. And I have never
met any corruption caused by a possible hardware lie.

> So Linux is indeed doing a cache flush on fsync, but the hardware is
> not behaving as expected. By turning off the write-cache on the disk
> via hdparm, you manage to get the hardware to behave better. The
> kernel is caching anyway, so the loss of the drive's write cache
> doesn't make a big difference.

Again, in practice it is different. FreeBSD had a true flush (at least
2-3 years ago; I'm not sure about the modern versions), and for
write-intensive applications it was a bit slower (compared with Linux),
but it was never more reliable (since 1996, at least).

Another practical example is Google :) Isn't it reliable?

> There was some work done for better IDE write-barrier support (related
> to TCQ/SATA support?) in the kernel, but I'm not sure how far that has
> progressed.

Yes, but IMHO it is not stable enough at the moment.

Regards,
E.R.
Re: [HACKERS] win32 performance - fsync question
On Fri, 17 Feb 2005, Greg Stark wrote:

> Oliver Jowett [EMAIL PROTECTED] writes:
>> So Linux is indeed doing a cache flush on fsync
>
> Actually I think the root of the problem was precisely that Linux does
> not issue any sort of cache flush commands to drives on fsync.

No, it does. Let's try the simplest test:

    for (i = 0; i < LEN; i++) {
        write (fd, buf, 512);
        if (sync) fsync (fd);
    }

with sync = 0 and 1, and you'll see the difference.

> There was some talk on linux-kernel of what how they could take
> advantage of new ATA features planned on new SATA drives coming out
> now to solve this. But they didn't seem to think it was urgent or
> worth the performance hit of doing a complete cache flush.

That was a slightly different topic.

Regards,
E.R.
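The loop in the message above leaves out the setup around it. A self-contained version might look like the sketch below, assuming a POSIX system; the function name `timed_writes`, the scratch-file handling and the timing code are additions for illustration and were not part of the original test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for fsync() and clock_gettime() */
#endif
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Writes `count` 512-byte blocks to a scratch file at `path`, calling
 * fsync() after each write when `sync` is nonzero. The file is removed
 * afterwards. Returns elapsed seconds, or -1.0 on error. */
double timed_writes(const char *path, int count, int sync)
{
    char buf[512];
    struct timespec start, end;
    int fd, i;

    memset(buf, 'x', sizeof buf);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf ||
            (sync && fsync(fd) != 0)) {
            close(fd);
            unlink(path);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd);
    unlink(path);
    return (double) (end.tv_sec - start.tv_sec) +
           (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling it once with `sync` set to 0 and once with 1 on the same path, then dividing `count` by each result, gives writes per second for the two cases. A large gap shows that `fsync()` is doing some work, but by itself it cannot tell whether the data reached the media or only the drive's cache.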