Re: PATCH reduce impact of FIFREEZE on userland processes
On Sat, Dec 08, 2012 at 08:47:34AM +, Alun wrote:
> On Sat, 8 Dec 2012 12:20:29 +1100
> Dave Chinner wrote:
>
> First off, thanks for the examples. I'll answer your one question and
> then I'll shut up!
>
> > > I'll try and chase this up by submitting patches to lvcreate and
> > > fsfreeze (in the former case, I think there's no reason not to run
> > > syncfs; in the latter perhaps it should be a command line option).
> >
> > Is that even necessary? users can issue the sync themselves if
> > necessary
>
> I think it's necessary for the issue to be better documented in LVM at
> the very least. I've dabbled with LVM for nearly 10 years, and used it
> in a busy production environment for around 6. For nearly 2 years I've
> been seeing, every now and then, these odd cases where taking a snapshot
> caused irrecoverable high load on the server.

Irrecoverable in what way?

> I've never seen any
> mention anywhere of the advisability of manually running sync prior to
> taking a snapshot on a busy system, and I had to get down to looking at
> the kernel sources before I got an inkling this might be the issue. I'd
> imagine that the vast majority of end users think the same way as I
> did, viz that taking a snapshot was designed to have minimal effect on
> any other users of the filesystem.

Right - minimal effect, not "no effect".

> There's also the issue that AFAIK there's no commonly distributed
> program which will allow you to call syncfs() on a filesystem. Running
> sync is a bit of a sledgehammer approach for a busy system with
> multiple large filesystems.

I have no doubt that you could write the 20 lines of C code needed
to use syncfs ;)

> I've submitted a patch to util-linux, adding a --sync option to
> fsfreeze which, if specified, will syncfs the requested filesystem
> prior to any freeze operation. Hopefully they'll accept this, though
> the only comment I've received so far suggested that I should be
> submitting a kernel patch rather than band aiding it in userland!

Perhaps that tells you something - that both sides are telling you
it's a band aid for your specific issue? :/

fsfreeze is a data integrity operation and some people rely on it to
take immediate effect as it currently does. IMO, that's the bar that
any generic freeze optimisation has to overcome.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
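The "20 lines of C code needed to use syncfs" mentioned in the reply above could look roughly like the following. This is only a minimal sketch, not the util-linux patch discussed in the thread: the function name sync_one_filesystem is hypothetical, and it assumes a glibc that exposes syncfs() (which requires _GNU_SOURCE). It opens any path on the target filesystem and flushes just that filesystem, instead of everything as a plain sync would.

```c
#define _GNU_SOURCE	/* syncfs() is a Linux-specific glibc extension */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush dirty data for the one filesystem that
 * contains `path`. Returns 0 on success, -1 on failure.
 */
int sync_one_filesystem(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0) {
		perror("open");
		return -1;
	}

	/* Unlike sync(), this only touches the filesystem behind fd. */
	ret = syncfs(fd);
	if (ret < 0)
		perror("syncfs");

	close(fd);
	return ret;
}
```

A small main() that passes argv[1] to this function would give a per-filesystem alternative to running sync on a box with several large filesystems.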
Re: PATCH reduce impact of FIFREEZE on userland processes
On Sat, Dec 08, 2012 at 07:12:04AM -0500, Christoph Hellwig wrote:
> On Fri, Dec 07, 2012 at 11:42:55AM +1100, Dave Chinner wrote:
> > The problem with doing this is that the sync can delay the freeze
> > process by quite some time under the exact conditions you describe.
> > If you want freeze to take effect immediately (i.e. instantly stop
> > new modifications), then adding a sync will break this semantic.
> > There are existing users of freeze that require this behaviour...
>
> But that's only because he uses the big hammer sync_filesystem() which
> actually waits for I/O completion. I agree that this is a bad idea,
> but if we'd just do a writeback_inodes_sb() call in this place that
> starts asynchronous writeout I think everyone would benefit.

The problem with that is that async writeback will block on IO
submission as soon as the disk backs up on congestion. It's
effectively still waiting on IO completions to occur, only now
indirectly through the request queue submission process.

Hence I think for the heavily loaded situations that are causing
freeze latency related issues, sync or async pre-flushes are going
to cause exactly the same delays to freezing writes.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: PATCH reduce impact of FIFREEZE on userland processes
On Fri, Dec 07, 2012 at 11:42:55AM +1100, Dave Chinner wrote:
> The problem with doing this is that the sync can delay the freeze
> process by quite some time under the exact conditions you describe.
> If you want freeze to take effect immediately (i.e. instantly stop
> new modifications), then adding a sync will break this semantic.
> There are existing users of freeze that require this behaviour...

But that's only because he uses the big hammer sync_filesystem() which
actually waits for I/O completion. I agree that this is a bad idea,
but if we'd just do a writeback_inodes_sb() call in this place that
starts asynchronous writeout I think everyone would benefit.
Re: PATCH reduce impact of FIFREEZE on userland processes
On Sat, 8 Dec 2012 12:20:29 +1100
Dave Chinner wrote:

First off, thanks for the examples. I'll answer your one question and
then I'll shut up!

> > I'll try and chase this up by submitting patches to lvcreate and
> > fsfreeze (in the former case, I think there's no reason not to run
> > syncfs; in the latter perhaps it should be a command line option).
>
> Is that even necessary? users can issue the sync themselves if
> necessary

I think it's necessary for the issue to be better documented in LVM at
the very least. I've dabbled with LVM for nearly 10 years, and used it
in a busy production environment for around 6. For nearly 2 years I've
been seeing, every now and then, these odd cases where taking a snapshot
caused irrecoverable high load on the server. I've never seen any
mention anywhere of the advisability of manually running sync prior to
taking a snapshot on a busy system, and I had to get down to looking at
the kernel sources before I got an inkling this might be the issue. I'd
imagine that the vast majority of end users think the same way as I
did, viz that taking a snapshot was designed to have minimal effect on
any other users of the filesystem.

There's also the issue that AFAIK there's no commonly distributed
program which will allow you to call syncfs() on a filesystem. Running
sync is a bit of a sledgehammer approach for a busy system with
multiple large filesystems.

I've submitted a patch to util-linux, adding a --sync option to
fsfreeze which, if specified, will syncfs the requested filesystem
prior to any freeze operation. Hopefully they'll accept this, though
the only comment I've received so far suggested that I should be
submitting a kernel patch rather than band aiding it in userland!

Looking at the LVM sources, it would appear that the freezing of
affected filesystems is done in the kernel side of device mapper. I'm
not going there!

Anyway, thanks for your time.

Cheers,
Alun.
Re: PATCH reduce impact of FIFREEZE on userland processes
On Fri, Dec 07, 2012 at 08:59:52AM +, Alun wrote:
> Dave Chinner said, in message
> 20121207004255.GC27172@dastard:
> >
> > The problem with doing this is that the sync can delay the freeze
> > process by quite some time under the exact conditions you describe.
> > If you want freeze to take effect immediately (i.e. instantly stop
> > new modifications), then adding a sync will break this semantic.
> > There are existing users of freeze that require this behaviour...
>
> Ahh, that would be the subtlety I was worried might exist! Thanks.
>
> The specific issue that brought me here was that, on a fairly heavily
> loaded file server (>1000 connected Windows clients), taking an LVM
> snapshot caused enough of an interruption to service that many of the
> Windows clients disconnected and reconnected, so causing a huge process
> load on the server - enough that we'd completely lose service and have
> to reboot. Chasing this down, I noticed that FIFREEZE does a filesystem
> sync, and it seemed to me that adding another one prior to blocking
> writes was an easy hit.

Yup, that's typical.

> I'm not trying to argue my case here - you've convinced me that this
> change in semantics is risky and removes flexibility.
>
> I'll try and chase this up by submitting patches to lvcreate and
> fsfreeze (in the former case, I think there's no reason not to run
> syncfs; in the latter perhaps it should be a command line option).

Is that even necessary? users can issue the sync themselves if
necessary

> > That, to me, is irrelevant, because something is normally done while
> > the filesystem is frozen. It's not uncommon for freeze periods to
> > extend to minutes while work is done by whatever required the
> > freeze. Hence the few seconds it takes to achieve the frozen state is
> > mostly irrelevant.
>
> You've referred twice to existing systems that would break in the
> presence of this change. I'm really having trouble thinking of a
> situation where it's critical to have writes suspended *NOW* and where
> it's valid to keep them suspended for minutes.

Say you get your filesystem reporting a read error in a directory.
There are people out there that will immediately freeze the
filesystem (to prevent potential damage from being propagated) while
they investigate the problem and determine their next action. This
may even involve running non-modifying fsck on the underlying block
device while the filesystem is frozen...

Then there are systems like HA servers that share a filesystem in a
primary/secondary setup - freezes are often used in failover
situations. This ensures all cached dirty data is written to disk in
preparation for the other node to mount it. Freezing the filesystem
ensures that spurious errors are not returned to
applications/clients while the failover takes place. Hence the
filesystem can remain frozen for some time while everything on the
new primary node is started up and fences/STONITHs the frozen node.

Then there's co-ordinating management operations on filesystems that
span multiple storage arrays (e.g. for hardware based snapshots,
cloning, etc), VM guest migration between two physical hosts, and so
on. Freeze is used for a lot more things than LVM snapshots...

> I'd have thought that,
> in the vast majority of cases, the critical thing was to minimise the
> time for which writes were suspended.

In the obvious use cases, yes. Once you look outside snapshots to
consider applications that need a stable, unchanging filesystem in
an application transparent manner, you'll find lots of interesting
uses for FIFREEZE.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: PATCH reduce impact of FIFREEZE on userland processes
Dave Chinner said, in message
20121207004255.GC27172@dastard:
>
> The problem with doing this is that the sync can delay the freeze
> process by quite some time under the exact conditions you describe.
> If you want freeze to take effect immediately (i.e. instantly stop
> new modifications), then adding a sync will break this semantic.
> There are existing users of freeze that require this behaviour...

Ahh, that would be the subtlety I was worried might exist! Thanks.

The specific issue that brought me here was that, on a fairly heavily
loaded file server (>1000 connected Windows clients), taking an LVM
snapshot caused enough of an interruption to service that many of the
Windows clients disconnected and reconnected, so causing a huge process
load on the server - enough that we'd completely lose service and have
to reboot. Chasing this down, I noticed that FIFREEZE does a filesystem
sync, and it seemed to me that adding another one prior to blocking
writes was an easy hit.

I'm not trying to argue my case here - you've convinced me that this
change in semantics is risky and removes flexibility.

I'll try and chase this up by submitting patches to lvcreate and
fsfreeze (in the former case, I think there's no reason not to run
syncfs; in the latter perhaps it should be a command line option).

> That, to me, is irrelevant, because something is normally done while
> the filesystem is frozen. It's not uncommon for freeze periods to
> extend to minutes while work is done by whatever required the
> freeze. Hence the few seconds it takes to achieve the frozen state is
> mostly irrelevant.

You've referred twice to existing systems that would break in the
presence of this change. I'm really having trouble thinking of a
situation where it's critical to have writes suspended *NOW* and where
it's valid to keep them suspended for minutes. I'd have thought that,
in the vast majority of cases, the critical thing was to minimise the
time for which writes were suspended.

Would you mind describing the use case you're thinking of?

Cheers,
Alun.
Re: PATCH reduce impact of FIFREEZE on userland processes
On Wed, Dec 05, 2012 at 09:17:07PM +, Alun wrote:
>
> This patch is against kernel version 3.7-rc7.
>
> The FIFREEZE ioctl blocks userland writes, then calls sync_filesystem.
> If there is a large amount of dirty data, this sync can take a
> substantial time to complete, with corresponding loss of responsiveness
> to any userland processes wishing to write.
>
> This patch simply adds an extra call to sync_filesystem prior to
> blocking writes, so that (hopefully) the majority of outstanding dirty
> data has been flushed before we impact on userland.

The problem with doing this is that the sync can delay the freeze
process by quite some time under the exact conditions you describe.
If you want freeze to take effect immediately (i.e. instantly stop
new modifications), then adding a sync will break this semantic.
There are existing users of freeze that require this behaviour...

> I'm a complete kernel newbie and have only done some pretty minimal
> testing on my own machine, but with the patch in place the impact of
> running "fsfreeze -f" immediately followed by "fsfreeze -u" on a
> moderately loaded filesystem (as measured by time taken for a write()
> to complete) was reduced from 2.5 to 0.2 seconds.

That, to me, is irrelevant, because something is normally done while
the filesystem is frozen. It's not uncommon for freeze periods to
extend to minutes while work is done by whatever required the
freeze. Hence the few seconds it takes to achieve the frozen state is
mostly irrelevant.

If you are really concerned by minimising the amount of time it
takes to freeze, then "syncfs; fsfreeze -f; fsfreeze -u" will get
you exactly the same result as your patch, without having any bad
side effects for other users.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
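The "syncfs; fsfreeze -f; fsfreeze -u" sequence suggested above could be sketched in C roughly as follows. This is an illustration under stated assumptions, not code from the thread: the function name presync_freeze_thaw is hypothetical, FIFREEZE and FITHAW are taken from <linux/fs.h>, syncfs() needs _GNU_SOURCE with glibc, and the freeze ioctls normally require root (CAP_SYS_ADMIN). The point it shows is that the flush happens in userland while writers can still make progress, so the kernel freeze semantics stay untouched.

```c
#define _GNU_SOURCE	/* for syncfs() */
#include <fcntl.h>
#include <linux/fs.h>	/* FIFREEZE, FITHAW */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: pre-flush, freeze, then thaw the filesystem
 * mounted at `mountpoint`. Returns 0 on success, -1 on failure.
 */
int presync_freeze_thaw(const char *mountpoint)
{
	int fd = open(mountpoint, O_RDONLY);
	int ret;

	if (fd < 0) {
		perror("open");
		return -1;
	}

	/* Flush first, while writers are not yet blocked. */
	if (syncfs(fd) < 0) {
		perror("syncfs");
		close(fd);
		return -1;
	}

	/* Block new writes; the kernel syncs whatever is still dirty. */
	if (ioctl(fd, FIFREEZE, 0) < 0) {
		perror("FIFREEZE");
		close(fd);
		return -1;
	}

	/* ... whatever needed the freeze (e.g. a snapshot) goes here ... */

	ret = ioctl(fd, FITHAW, 0);
	if (ret < 0)
		perror("FITHAW");

	close(fd);
	return ret < 0 ? -1 : 0;
}
```

Callers that need the freeze to take effect immediately would simply skip the syncfs() step, which is the flexibility the reply argues for keeping.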
PATCH reduce impact of FIFREEZE on userland processes
This patch is against kernel version 3.7-rc7.

The FIFREEZE ioctl blocks userland writes, then calls sync_filesystem.
If there is a large amount of dirty data, this sync can take a
substantial time to complete, with corresponding loss of responsiveness
to any userland processes wishing to write.

This patch simply adds an extra call to sync_filesystem prior to
blocking writes, so that (hopefully) the majority of outstanding dirty
data has been flushed before we impact on userland.

I'm a complete kernel newbie and have only done some pretty minimal
testing on my own machine, but with the patch in place the impact of
running "fsfreeze -f" immediately followed by "fsfreeze -u" on a
moderately loaded filesystem (as measured by time taken for a write()
to complete) was reduced from 2.5 to 0.2 seconds.

Hopefully there's no subtlety in how all this works, and that adding the
extra call has no scary implications...

Signed-off-by: Alun Jones

--- linux-3.7-rc7/fs/super.c.orig 2012-11-29 17:35:37.0 +
+++ linux-3.7-rc7/fs/super.c 2012-12-05 20:56:38.730631855 +
@@ -1314,6 +1314,11 @@ int freeze_super(struct super_block *sb)
 		return 0;
 	}
 
+	/* Sync before we block writes to reduce the amount of
+	 * work that has to be done afterwards.
+	 */
+	sync_filesystem(sb);
+
 	/* From now on, no new normal writers can start */
 	sb->s_writers.frozen = SB_FREEZE_WRITE;
 	smp_wmb();