Re: [RFC, PATCH] locks: remove posix deadlock detection

2007-10-29 Thread Alan Cox
On Sun, 28 Oct 2007 13:43:21 -0400 J. Bruce Fields [EMAIL PROTECTED] wrote: From: J. Bruce Fields [EMAIL PROTECTED] We currently attempt to return -EDEADLK to blocking fcntl() file locking requests that would create a cycle in the graph of tasks waiting on locks. This is inefficient: in
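
For readers unfamiliar with the check being discussed, the following is a minimal sketch of cycle detection in a wait-for graph. It is not the kernel's code: the function name `would_deadlock`, the `waits_for` dictionary, and the task labels are all hypothetical, and it assumes each blocked task waits on exactly one lock owner.

```python
def would_deadlock(waits_for, requester, owner):
    """Return True if blocking `requester` on `owner` would close a cycle.

    `waits_for` maps each currently blocked task to the task that holds
    the lock it is waiting on (an assumption of this sketch: one blocker
    per blocked task).
    """
    seen = set()
    current = owner
    while current is not None:
        if current == requester:
            # Following the chain of blockers led back to the requester.
            return True
        if current in seen:
            # A cycle that does not involve the requester; stop walking.
            return False
        seen.add(current)
        current = waits_for.get(current)
    return False


# Hypothetical tasks: B waits on C, and C waits on A.
blocked = {"B": "C", "C": "A"}

# If A now asks for a lock held by B, the chain B -> C -> A returns to A.
a_blocks_on_b = would_deadlock(blocked, "A", "B")

# A task D outside the chain can safely block on B.
d_blocks_on_b = would_deadlock(blocked, "D", "B")
```

The walk costs time proportional to the length of the chain of blocked tasks, and it has to run on every blocking request.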

Re: [RFC, PATCH] locks: remove posix deadlock detection

2007-10-29 Thread Alan Cox
And if posix file locks are to be useful to threaded applications, then we have to preserve the same no-false-positives requirement for them as well. It isn't useful to threaded applications. The specification requires this. Which is another reason for having an additional Linux (for now) flag

Re: [RFC, PATCH] locks: remove posix deadlock detection

2007-10-29 Thread Jiri Kosina
On Sun, 28 Oct 2007, Matthew Wilcox wrote: A potential for deadlock occurs if a process controlling a locked region is put to sleep by attempting to lock another process' locked region. If the system detects that sleeping until a locked region is unlocked would cause a

Re: [RFC, PATCH] locks: remove posix deadlock detection

2007-10-29 Thread Jiri Kosina
On Sun, 28 Oct 2007, J. Bruce Fields wrote: But, OK, if we can identify unshared current->files at the time we put a task to sleep, then a slight modification of our current algorithm might be sufficient to detect any deadlock that involves purely posix file locks and processes. And we can

Re: [patch 4/6][RFC] Attempt to plug race with truncate

2007-10-29 Thread Chris Mason
On Fri, 26 Oct 2007 16:37:36 -0700 Mike Waychison [EMAIL PROTECTED] wrote: Attempt to deal with races with truncate paths. I'm not really sure on the locking here, but these seem to be taken by the truncate path. BKL is left as some filesystem may(?) still require it. Signed-off-by:

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Chris Mason
On Sat, 27 Oct 2007 18:57:06 +0100 Anton Altaparmakov [EMAIL PROTECTED] wrote: Hi, ->bmap is ugly and horrible! If you have to do this at the very least please cause ->bmap64 to be able to return error values in case the file system failed to get the information or indeed such information

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Mike Waychison
Zach Brown wrote: And another of my pet peeves with ->bmap is that it uses 0 to mean sparse which causes a conflict on NTFS at least as block zero is part of the $Boot system file so it is a real, valid block... NTFS uses -1 to denote sparse blocks internally. Reiserfs and Btrfs also use 0
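
The ambiguity described above can be shown with a small sketch. This is an illustration only, not the kernel interface: `bmap_with_sentinel`, `bmap_with_optional`, and the `block_map` dictionary are hypothetical names, and the block numbers are made up.

```python
from typing import Dict, Optional

SPARSE_SENTINEL = 0  # the convention the message objects to


def bmap_with_sentinel(block_map: Dict[int, int], logical: int) -> int:
    """Return the physical block, or 0 when the logical block is a hole."""
    return block_map.get(logical, SPARSE_SENTINEL)


def bmap_with_optional(block_map: Dict[int, int], logical: int) -> Optional[int]:
    """Return the physical block, or None when the logical block is a hole."""
    return block_map.get(logical)


# Hypothetical file: logical block 0 really lives in physical block 0,
# logical block 1 is a hole, logical block 2 lives in physical block 57.
block_map = {0: 0, 2: 57}

# With the sentinel, a mapped block 0 and a hole are indistinguishable.
mapped_zero = bmap_with_sentinel(block_map, 0)
hole = bmap_with_sentinel(block_map, 1)

# With an out-of-band value, the two cases stay distinct.
mapped_zero_opt = bmap_with_optional(block_map, 0)
hole_opt = bmap_with_optional(block_map, 1)
```

Any out-of-band marker would serve the same purpose here; the -1 that NTFS is said to use internally is one such choice.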

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Mike Waychison
Chris Mason wrote: On Sat, 27 Oct 2007 18:57:06 +0100 Anton Altaparmakov [EMAIL PROTECTED] wrote: Hi, ->bmap is ugly and horrible! If you have to do this at the very least please cause ->bmap64 to be able to return error values in case the file system failed to get the information or indeed

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Zach Brown
But, we shouldn't inflict all of this on fibmap/fiemap; we'll get lost trying to make the one true interface for all operations. For grouping operations on files, I think a read_tree syscall with hints for what userland will do (read, stat, delete, list filenames), and a better cookie

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Zach Brown
Can you clarify what you mean above with an example? I don't really follow. Sure, take 'tar' as an example. It'll read files in the order that their names are returned from directory listing. This can produce bad IO patterns because the order in which the file names are returned doesn't

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 12:16 -0700, Mike Waychison wrote: Chris Mason wrote: Reiserfs and Btrfs also use 0 to mean packed. It would be nice if there was a way to indicate your-data-is-here-but-isn't-alone. But that's more of a feature for the FIEMAP stuff. I hadn't heard of FIEMAP, so I went

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
By request on #linuxfs, here is the FIEMAP spec that we used to implement the FIEMAP support for ext4. There was an ext4 patch posted on August 29 to linux-ext4 entitled [PATCH] FIEMAP ioctl. I've asked Kalpak to post an updated version of that patch along with the changes to the filefrag tool

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-29 Thread Hugh Dickins
On Sun, 28 Oct 2007, Erez Zadok wrote: I took your advice regarding ~(__GFP_FS|__GFP_IO), AOP_WRITEPAGE_ACTIVATE, and such. I revised my unionfs_writepage and unionfs_sync_page, and tested it under memory pressure: I have a couple of live CDs that use tmpfs and can deterministically

Re: [patch 0/6][RFC] Cleanup FIBMAP

2007-10-29 Thread Chris Mason
On Mon, 29 Oct 2007 12:18:22 -0700 Mike Waychison [EMAIL PROTECTED] wrote: Zach Brown wrote: And another of my pet peeves with ->bmap is that it uses 0 to mean sparse which causes a conflict on NTFS at least as block zero is part of the $Boot system file so it is a real, valid block...

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Mark Fasheh
Hi Andreas, Thanks for posting this. I believe that an interface such as FIEMAP would be very useful to Ocfs2 as well. (I added ocfs2-devel to the e-mail) My comments below are generally geared towards understanding the ioctl interface. On Mon, Oct 29, 2007 at 01:45:07PM -0600, Andreas

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 16:13 -0600, Andreas Dilger wrote: On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: I'm a little bit confused by fe_offset. Is it a physical offset, or a logical offset? The reason I ask is that your description above says FIEMAP ioctl will return the logical to physical

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Mark Fasheh
On Mon, Oct 29, 2007 at 04:29:07PM -0600, Andreas Dilger wrote: On Oct 29, 2007 16:13 -0600, Andreas Dilger wrote: On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: I'm a little bit confused by fe_offset. Is it a physical offset, or a logical offset? The reason I ask is that your

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread David Chinner
On Mon, Oct 29, 2007 at 01:45:07PM -0600, Andreas Dilger wrote: By request on #linuxfs, here is the FIEMAP spec that we used to implement the FIEMAP support for ext4. There was an ext4 patch posted on August 29 to linux-ext4 entitled [PATCH] FIEMAP ioctl. Link:

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: Thanks for posting this. I believe that an interface such as FIEMAP would be very useful to Ocfs2 as well. (I added ocfs2-devel to the e-mail) I tried to make it as Lustre-agnostic as possible... On Mon, Oct 29, 2007 at 01:45:07PM -0600,

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Andreas Dilger
On Oct 29, 2007 17:11 -0700, Mark Fasheh wrote: On Mon, Oct 29, 2007 at 04:13:02PM -0600, Andreas Dilger wrote: Btrfs, Ocfs2, and Gfs2 pack small amounts of user data directly in inode blocks. Hmm, but part of the issue would be how to request the extra data, and what offset it

Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-10-29 Thread Mark Fasheh
On Mon, Oct 29, 2007 at 04:13:02PM -0600, Andreas Dilger wrote: On Oct 29, 2007 13:57 -0700, Mark Fasheh wrote: Thanks for posting this. I believe that an interface such as FIEMAP would be very useful to Ocfs2 as well. (I added ocfs2-devel to the e-mail) I tried to make it as

Proposal to improve filesystem/block snapshot interaction

2007-10-29 Thread Greg Banks
G'day, A number of people have already seen this; I'm posting for wider comment and to move some interesting discussion to a public list. I'll apologise in advance for the talk about SGI technologies (including proprietary ones), but all the problems mentioned apply to in-tree technologies too.

Re: Proposal to improve filesystem/block snapshot interaction

2007-10-29 Thread Greg Banks
On Tue, Oct 30, 2007 at 12:51:47AM +0100, Arnd Bergmann wrote: On Monday 29 October 2007, Christoph Hellwig wrote: - Forwarded message from Greg Banks [EMAIL PROTECTED] - Date: Thu, 27 Sep 2007 16:31:13 +1000 From: Greg Banks [EMAIL PROTECTED] Subject: Proposal to improve

Re: Proposal to improve filesystem/block snapshot interaction

2007-10-29 Thread Neil Brown
On Tuesday October 30, [EMAIL PROTECTED] wrote: Of course snapshot cow elements may be part of more generic element trees. In general there may be more than one consumer of block usage hints in a given filesystem's element tree, and their locations in that tree are not predictable. This

Re: Proposal to improve filesystem/block snapshot interaction

2007-10-29 Thread Greg Banks
On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote: On Tuesday October 30, [EMAIL PROTECTED] wrote: Of course snapshot cow elements may be part of more generic element trees. In general there may be more than one consumer of block usage hints in a given filesystem's element