Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
Bruce,

Are there remarks along these lines in the performance tuning section of the docs? Based on what's coming out of this, it would seem pretty important to stress leaving a notable (rule of thumb here?) amount of memory for general OS/kernel needs.

Greg

On Tue, 2002-10-08 at 09:50, Tom Lane wrote:
> (This is, BTW, one of the reasons for discouraging people from pushing
> Postgres' shared buffer cache up to a large fraction of total RAM;
> starving the kernel of disk buffers is just plain not a good idea.)
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
"Curtis Faith" <[EMAIL PROTECTED]> writes:
> Do you not think this is a potential performance problem to be explored?

I agree that there's a problem if the kernel runs short of buffer space. I am not sure whether that's really an issue in practical situations, nor whether we can do much about it at the application level if it is --- but by all means look for solutions if you are concerned.

(This is, BTW, one of the reasons for discouraging people from pushing Postgres' shared buffer cache up to a large fraction of total RAM; starving the kernel of disk buffers is just plain not a good idea.)

regards, tom lane
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
> So you think if I try to write a 1 gig file, it will write enough to
> fill up the buffers, then wait while the sync'er writes out a few blocks
> every second, free up some buffers, then write some more?
>
> Take a look at vfs_bio::getnewbuf() on *BSD and you will see that when
> it can't get a buffer, it will async write a dirty buffer to disk.

We've addressed this scenario before; if I recall, the point Greg made earlier is that buffers getting full means writes become synchronous.

What I was trying to point out was that it is very likely that the buffers will fill, even if they are large, and that the writes are going to be driven out not by efficient ganging but by something approaching LRU flushing, with an occasional, slightly more efficient, once-a-second write of 1/32nd of the buffers.

Once the buffers get full, all subsequent writes turn into synchronous writes. Even if the kernel writes asynchronously (meaning it can do other work), the writing process can't complete; it has to wait until the buffer has been flushed and is free for the copy. So the relatively poor implementation (for database inserts at least) of the syncer mechanism will cost a lot of performance if we get to this synchronous write mode due to a full buffer. It appears this scenario is much more likely than I had thought.

Do you not think this is a potential performance problem to be explored?

I'm only pursuing this as hard as I am because I feel like it's deja vu all over again. I've done this before and found a huge improvement (12X to 20X for bulk inserts). I'm not necessarily expecting that level of improvement here, but my gut tells me there is more here than seems obvious on the surface.

> As far as this AIO conversation is concerned, I want to see someone come
> up with some performance improvement that we can only do with AIO.
> Unless I see it, I am not interested in pursuing this thread.

If I come up with something via aio that helps, I'd be more than happy if someone else points out a non-aio way to accomplish the same thing. I'm by no means married to any particular solution; I care about getting problems solved. And I'll stop trying to sell anyone on aio.

- Curtis
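The full-buffer scenario argued above can be sketched with a toy model. This is only an illustration under stated assumptions, not a measurement of any kernel: the function name `simulate` and its parameters are hypothetical, writes and flushes are counted in whole buffers per second, and a write that finds no free buffer is simply counted as "blocked" (it has to wait for a dirty buffer to be flushed before it can proceed).

```python
def simulate(total_buffers, writes_per_sec, flush_per_sec, seconds):
    """Toy model of a fixed pool of kernel buffers (hypothetical, simplified).

    Returns one (fast, blocked) pair per simulated second:
      fast    - writes absorbed immediately by a free buffer
      blocked - writes that found no free buffer and had to wait for a flush
    """
    dirty = 0
    history = []
    for _ in range(seconds):
        # The background flusher cleans a bounded number of dirty buffers.
        dirty -= min(dirty, flush_per_sec)
        free = total_buffers - dirty
        # Writes land in free buffers while any remain.
        fast = min(writes_per_sec, free)
        # Each remaining write waits for one dirty buffer to be flushed and
        # then reuses it, so the pool stays completely dirty.
        blocked = writes_per_sec - fast
        dirty += fast
        history.append((fast, blocked))
    return history


if __name__ == "__main__":
    # 1000 buffers, 100 writes/s, flusher handles roughly 1/32 of the pool/s.
    for second, (fast, blocked) in enumerate(simulate(1000, 100, 31, 20), 1):
        print(second, fast, blocked)
```

In this model, as long as the incoming write rate exceeds the background flush rate, the pool eventually fills regardless of its size; a larger pool only postpones the point where writes start to block. After that point the writer is throttled to the flush rate.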
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
Curtis Faith wrote:
> > This is the trickle syncer. It prevents bursts of disk activity every
> > 30 seconds. It is for non-fsync writes, of course, and I assume if the
> > kernel buffers get low, it starts to flush faster.
>
> AFAICT, the syncer only speeds up when virtual memory paging fills the
> buffers past a threshold and even in that event it only speeds it up by
> a factor of two.
>
> I can't find any provision for speeding up flushing of the dirty buffers
> when they fill for normal file system writes, so I don't think that
> happens.

So you think if I try to write a 1 gig file, it will write enough to fill up the buffers, then wait while the sync'er writes out a few blocks every second, free up some buffers, then write some more?

Take a look at vfs_bio::getnewbuf() on *BSD and you will see that when it can't get a buffer, it will async write a dirty buffer to disk.

As far as this AIO conversation is concerned, I want to see someone come up with some performance improvement that we can only do with AIO. Unless I see it, I am not interested in pursuing this thread.

--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
> Greg Copeland <[EMAIL PROTECTED]> writes:
> > Doesn't this also increase the likelihood that people will be
> > running in a buffer-poor environment more frequently than I
> > previously asserted, especially in very heavily I/O bound
> > systems? Unless I'm mistaken, that opens the door for a
> > general case of why an aio implementation should be looked into.

Neil Conway replies:
> Well, at least for *this specific situation*, it doesn't really change
> anything -- since FreeBSD doesn't implement POSIX AIO as far as I
> know, we can't use that as an alternative.

I haven't tried it yet, but there does seem to be an aio implementation that conforms to POSIX in FreeBSD 4.6.2. It's part of the kernel and can be found in:

/usr/src/sys/kern/vfs_aio.c

> However, I'd suspect that the FreeBSD kernel allows for some way to
> tune the behavior of the syncer. If that's the case, we could do some
> research into what settings are more appropriate for FreeBSD, and
> recommend those in the docs. I don't run FreeBSD, however -- would
> someone like to volunteer to take a look at this?

I didn't see anything obvious in the docs, but I still believe there's some way to tune it. I'll let everyone know if I find some better settings.

> BTW Curtis, did you happen to check whether this behavior has been
> changed in FreeBSD 5.0?

I haven't checked but I will.
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
Greg Copeland <[EMAIL PROTECTED]> writes:
> Doesn't this also increase the likelihood that people will be running in
> a buffer-poor environment more frequently than I previously asserted,
> especially in very heavily I/O bound systems? Unless I'm mistaken, that
> opens the door for a general case of why an aio implementation should be
> looked into.

Well, at least for *this specific situation*, it doesn't really change anything -- since FreeBSD doesn't implement POSIX AIO as far as I know, we can't use that as an alternative.

However, I'd suspect that the FreeBSD kernel allows for some way to tune the behavior of the syncer. If that's the case, we could do some research into what settings are more appropriate for FreeBSD, and recommend those in the docs. I don't run FreeBSD, however -- would someone like to volunteer to take a look at this?

BTW Curtis, did you happen to check whether this behavior has been changed in FreeBSD 5.0?

> Also, on a side note, IIRC, linux kernel 2.5.x has a new priority
> elevator which is said to be MUCH better at saturating disks than ever
> before.

Yeah, there are lots of new & interesting features for database systems in the new kernel -- I'm looking forward to when 2.6 is widely deployed...

Cheers,

Neil

--
Neil Conway <[EMAIL PROTECTED]> || PGP Key ID: DB3C29FC
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
On Mon, 2002-10-07 at 15:28, Bruce Momjian wrote:
> This is the trickle syncer. It prevents bursts of disk activity every
> 30 seconds. It is for non-fsync writes, of course, and I assume if the
> kernel buffers get low, it starts to flush faster.

Doesn't this also increase the likelihood that people will be running in a buffer-poor environment more frequently than I previously asserted, especially in very heavily I/O bound systems? Unless I'm mistaken, that opens the door for a general case of why an aio implementation should be looked into.

Also, on a side note, IIRC, linux kernel 2.5.x has a new priority elevator which is said to be MUCH better at saturating disks than ever before. Once 2.6 (or whatever its number will be) is released, it may not be as much of a problem as it seems to be for FreeBSD (I think that's the one you're using).

Greg
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
> This is the trickle syncer. It prevents bursts of disk activity every
> 30 seconds. It is for non-fsync writes, of course, and I assume if the
> kernel buffers get low, it starts to flush faster.

AFAICT, the syncer only speeds up when virtual memory paging fills the buffers past a threshold and even in that event it only speeds it up by a factor of two.

I can't find any provision for speeding up flushing of the dirty buffers when they fill for normal file system writes, so I don't think that happens.

- Curtis
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
Curtis Faith wrote:
> Good points.
>
> Now for some surprising news (at least it surprised me).
>
> I researched the file system source on my system (FreeBSD 4.6) and found
> that the behavior was optimized for non-database access to eliminate
> unnecessary writes when temp files are created and deleted rapidly. It was
> not optimized to get data to the disk in the most efficient manner.
>
> The syncer on FreeBSD appears to place dirtied filesystem buffers into
> work queues that range from 1 to SYNCER_MAXDELAY. Each second the syncer
> processes one of the queues and increments a counter syncer_delayno.
>
> On my system the setting for SYNCER_MAXDELAY is 32. So each second 1/32nd
> of the writes that were buffered are processed. If the syncer gets behind
> and the writes for a given second exceed one second to process the syncer
> does not wait but begins processing the next queue.
>
> AFAICT this means that there is no opportunity to have writes combined by
> the disk since they are processed in buckets based on the time the writes
> came in.

This is the trickle syncer. It prevents bursts of disk activity every 30 seconds. It is for non-fsync writes, of course, and I assume if the kernel buffers get low, it starts to flush faster.

--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
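The work-queue behavior described in the quoted message can be sketched as a small time wheel. This is a simplified model written for illustration, not FreeBSD code: the class `SyncerWheel` and its methods are hypothetical names, only `SYNCER_MAXDELAY` (32) and the `syncer_delayno` counter come from the message, and the 30-second delay in the demo is taken from the "every 30 seconds" remark in the reply.

```python
from collections import deque

SYNCER_MAXDELAY = 32  # value reported for the poster's FreeBSD 4.6 system


class SyncerWheel:
    """Simplified model of per-second syncer work queues (hypothetical)."""

    def __init__(self, slots=SYNCER_MAXDELAY):
        self.slots = [deque() for _ in range(slots)]
        self.delayno = 0  # plays the role of syncer_delayno

    def schedule(self, block, delay):
        """Queue a dirtied block to be written `delay` seconds from now."""
        delay = max(1, min(delay, len(self.slots)))
        slot = (self.delayno + delay) % len(self.slots)
        self.slots[slot].append(block)

    def tick(self):
        """Advance one second and return the blocks in that second's queue."""
        self.delayno = (self.delayno + 1) % len(self.slots)
        queue = self.slots[self.delayno]
        flushed = list(queue)
        queue.clear()
        return flushed


def run_demo():
    """Dirty adjacent blocks 100 and 101 now, and block 102 a second later."""
    wheel = SyncerWheel()
    flushed_at = {}
    wheel.schedule(100, 30)
    wheel.schedule(101, 30)
    for second in range(1, 33):
        blocks = wheel.tick()
        if blocks:
            flushed_at[second] = blocks
        if second == 1:
            wheel.schedule(102, 30)
    return flushed_at


if __name__ == "__main__":
    print(run_demo())
```

In the demo, block 102 is adjacent on disk to 100 and 101 but was dirtied one second later, so it lands in a different queue and is written in a separate pass. That is the point made in the quoted message: blocks are grouped by when they were dirtied, not by where they live on disk.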