Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
On Thu, 17 Feb 2005, Andrew Dunstan wrote:
> (the results are interesting, though - with fsync off Windows and Linux are
> in the same performance ballpark.)
One addition:
WinXP  fsync = true    20-28 tps
WinXP  fsync = false   600 tps
Linux  fsync = true    800 tps
Linux  fsync = false   980 tps
Regards,
E.R.
_
Evgeny Rodichev  Sternberg Astronomical Institute
email: [EMAIL PROTECTED]  Moscow State University
Phone: 007 (095) 939 2383
Fax:   007 (095) 932 8841   http://www.sai.msu.su/~er
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
There are two different concerns here.
1. loss of transactions because of unexpected power loss and/or system failure
2. inconsistent database state
For many applications (1) is fairly acceptable, and (2) is not.
So I'd like to formulate my questions another way.
- if PostgreSQL is running without fsync and a power loss occurs, which kind
of damage is possible? 1, 2, or both?
- it looks like with a proper fwrite/fflush policy it is possible to
guarantee that only transaction loss may occur, while the database
stays in a consistent state, as it was before the last (several) transactions.
Is this true for PostgreSQL?
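
On the fwrite/fflush point, it may help to separate the buffering layers
involved. The sketch below is illustrative C only, not PostgreSQL's actual I/O
path, and the helper name append_durably is hypothetical. fflush() only moves
data from the stdio buffer into the kernel's page cache, so on its own it
protects against a process crash, not a power loss; fsync() is the call that
asks the kernel to push the data on to the storage device.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Append one record and push it through both buffering layers.
 * Hypothetical helper for illustration. Returns 0 on success, -1 on failure. */
int append_durably(FILE *fp, const void *rec, size_t len)
{
    assert(fp != NULL && rec != NULL);

    if (fwrite(rec, 1, len, fp) != len)   /* user data -> stdio buffer */
        return -1;
    if (fflush(fp) != 0)                  /* stdio buffer -> kernel page cache */
        return -1;
    if (fsync(fileno(fp)) != 0)           /* ask the kernel to write to the device */
        return -1;
    return 0;
}
```

Dropping the fsync() call leaves the record in kernel memory for some time,
which is the window in which a power loss can lose it.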
Regards,
E.R.


Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
On Thu, 17 Feb 2005, Tom Lane wrote:
> Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>>> WinXP  fsync = true    20-28 tps
>>> WinXP  fsync = false   600 tps
>>> Linux  fsync = true    800 tps
>>> Linux  fsync = false   980 tps
>>
>> Wow, that's terrible on Windows.  If there's a solution, it'd be nice to
>> backport it...
>
> Actually, the number that's way out of line there is the Linux w/fsync
> one.  I infer that he's got disk write cache enabled and therefore the
> transactions aren't really being synced to disk at all.
> Any claimed TPS rate exceeding your disk drive's rotation rate is a
> red flag.
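
The rule of thumb above can be put in numbers. This is a minimal sketch under
stated assumptions: a single session committing serially, each commit waiting
for one synchronous write to the same log location, and a hypothetical
7200 rpm drive (the thread does not give the drive's speed). The function
names are made up for illustration. Under those assumptions the platter passes
under the head at most rpm/60 times per second, which caps the commit rate.

```c
#include <assert.h>

/* Upper bound on serial synchronous commits per second for a drive
 * spinning at `rpm`, assuming one platter rotation per commit. */
double max_serial_commits_per_sec(double rpm)
{
    assert(rpm > 0.0);
    return rpm / 60.0;
}

/* Returns 1 if a claimed TPS figure exceeds that bound, 0 otherwise. */
int exceeds_rotation_rate(double claimed_tps, double rpm)
{
    return claimed_tps > max_serial_commits_per_sec(rpm);
}
```

For 7200 rpm the bound is 120 commits per second, so under these assumptions
800 tps with fsync on would suggest that writes are being acknowledged from a
cache, while 20-28 tps would not.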
Write cache has been enabled by default under Linux for as long as I have
dealt with it (since 1993).
It doesn't interfere with fsync(), as the Linux kernel uses a cache flush for
fsync.
I have a 2.6.10 kernel running *without* any additional patches, and without
any specific hdparm settings.
fsync() really works fine: I switch off my notebook 2-3 times every day,
and I have never had any data loss :)
The related line from dmesg is
hda: cache flushes supported
Regards,
E.R.


Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
On Thu, 17 Feb 2005, Tom Lane wrote:
> Evgeny Rodichev [EMAIL PROTECTED] writes:
>>> Any claimed TPS rate exceeding your disk drive's rotation rate is a
>>> red flag.
>>
>> Write cache has been enabled by default under Linux for as long as I have
>> dealt with it (since 1993).
>
> You're playing with fire.
Yes. I'm lucky in this game :)
More seriously, we (with Oleg Bartunov) have investigated many platforms/OSes
for commercial, scientific and other applications during the past 10-12
years: virtually all of them, I suppose, excluding modern mainframes.
For reliability, Linux + PostgreSQL was found to be the best (including
environments with very frequent unexpected power-offs, as at some astronomical
observatories in high mountains).
Hence, I'm lucky :)

>> fsync() really works fine: I switch off my notebook 2-3 times every day,
>> and I have never had any data loss :)
>
> Given that it's a notebook, it's possible that the hardware is smart
> enough not to power down the disk until the disk is done writing
> everything it's cached.  Do you care to try some experiments with
> pulling out the battery while Postgres is busy making updates?
Yes, you are exactly right. All modern HDDs (not entry-level ones) have
a huge cache (on the device, not on the controller), and provide a safe
hardware flush of the cache *after* power-off (thanks to capacitors). My HDD
has a 16MB cache, and that is the reason for the excellent performance.
Regards,
E.R.


Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
On Fri, 18 Feb 2005, Oliver Jowett wrote:
> Evgeny Rodichev wrote:
>> Write cache has been enabled by default under Linux for as long as I have
>> dealt with it (since 1993).
>> It doesn't interfere with fsync(), as the Linux kernel uses a cache flush for
>> fsync.
>
> The problem is that most IDE drives lie (or perhaps you could say the
> specification is ambiguous) about completion of the cache-flush command --
> they say "Yeah, I've flushed" when they have not actually written the data to
> the media and have no provision for making sure it will get there in the
> event of power failure.
Yes, I agree. But in my real practice as a system administrator I have seen
50-100 cases where an HDD was physically damaged without warning (the heads
touched the surface), with no possibility of recovery. And I have never seen
any corruption caused by a possible hardware lie.
> So Linux is indeed doing a cache flush on fsync, but the hardware is not
> behaving as expected. By turning off the write-cache on the disk via hdparm,
> you manage to get the hardware to behave better. The kernel is caching
> anyway, so the loss of the drive's write cache doesn't make a big difference.
Again, in practice, it is different. FreeBSD had a true flush (at least
2-3 years ago, I'm not sure about the modern versions), and for write-intensive
applications it was a bit slower (compared with Linux), but it was never
more reliable (since 1996, at least).
Another practical example is Google :) Isn't it reliable?
> There was some work done for better IDE write-barrier support (related to
> TCQ/SATA support?) in the kernel, but I'm not sure how far that has
> progressed.
Yes, but IMHO it is not stable enough yet.
Regards,
E.R.


Re: [HACKERS] win32 performance - fsync question

2005-02-17 Thread Evgeny Rodichev
On Fri, 17 Feb 2005, Greg Stark wrote:
> Oliver Jowett [EMAIL PROTECTED] writes:
>> So Linux is indeed doing a cache flush on fsync
>
> Actually I think the root of the problem was precisely that Linux does not
> issue any sort of cache flush commands to drives on fsync.
No, it does. Let's try the simplest test:
for (i = 0; i < LEN; i++) {
   write (fd, buf, 512);
   if (sync) fsync (fd);
}
with sync = 0 and 1, and you'll see the difference.
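
A fuller version of the loop above, as a minimal sketch: the function name
timed_writes, the scratch-file handling and the use of clock_gettime() are
assumptions added for illustration, not part of the original test. It returns
the elapsed time so that the sync = 0 and sync = 1 runs can be compared. A
large gap shows that fsync() waits on the storage stack; on its own it cannot
show whether the data reached the platters or only the drive's cache.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Write n_writes 512-byte blocks to `path`, optionally calling fsync()
 * after each one. The scratch file is removed afterwards.
 * Returns elapsed wall-clock seconds, or -1.0 on any I/O error. */
double timed_writes(const char *path, int n_writes, int do_sync)
{
    char buf[512];
    struct timespec t0, t1;
    int fd, i, failed = 0;

    assert(path != NULL && n_writes > 0);
    memset(buf, 'x', sizeof buf);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < n_writes && !failed; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf)
            failed = 1;                 /* short write or error */
        else if (do_sync && fsync(fd) != 0)
            failed = 1;                 /* fsync reported an error */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    unlink(path);
    if (failed)
        return -1.0;
    return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_writes(path, 1000, 0) and timed_writes(path, 1000, 1) on the
same file system and dividing 1000 by each result gives writes per second for
the two cases.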
> There was some talk
> on linux-kernel of how they could take advantage of new ATA features
> planned on new SATA drives coming out now to solve this. But they didn't seem
> to think it was urgent or worth the performance hit of doing a complete cache
> flush.
That was a slightly different topic.
Regards,
E.R.