Re: [PATCH 1/7] xfs: always use DAX if mount option is used

2017-09-26 Thread Dave Chinner
On Tue, Sep 26, 2017 at 11:35:48AM +0200, Jan Kara wrote: > On Tue 26-09-17 09:38:12, Dave Chinner wrote: > > On Mon, Sep 25, 2017 at 05:13:58PM -0600, Ross Zwisler wrote: > > > Before support for the per-inode DAX flag was disabled the XFS the code > > > had &g

Re: false positive lockdep splat with loop device

2017-09-25 Thread Dave Chinner
On Thu, Sep 21, 2017 at 09:43:41AM +0300, Amir Goldstein wrote: > On Thu, Sep 21, 2017 at 1:22 AM, Dave Chinner <da...@fromorbit.com> wrote: > > [cc lkml, PeterZ and Byungchul] > ... > > The thing is, this IO completion has nothing to do with the lower > > files

Re: false positive lockdep splat with loop device

2017-09-25 Thread Dave Chinner
On Thu, Sep 21, 2017 at 09:43:41AM +0300, Amir Goldstein wrote: > On Thu, Sep 21, 2017 at 1:22 AM, Dave Chinner wrote: > > [cc lkml, PeterZ and Byungchul] > ... > > The thing is, this IO completion has nothing to do with the lower > > filesystem - it's the IO compl

Re: shared/298 lockdep splat?

2017-09-25 Thread Dave Chinner
On Thu, Sep 21, 2017 at 05:47:14PM +0900, Byungchul Park wrote: > On Thu, Sep 21, 2017 at 08:22:56AM +1000, Dave Chinner wrote: > > Peter, this is the sort of false positive I mentioned were likely to > > occur without some serious work to annotate the IO stack to prevent > &g

Re: shared/298 lockdep splat?

2017-09-25 Thread Dave Chinner
On Thu, Sep 21, 2017 at 05:47:14PM +0900, Byungchul Park wrote: > On Thu, Sep 21, 2017 at 08:22:56AM +1000, Dave Chinner wrote: > > Peter, this is the sort of false positive I mentioned were likely to > > occur without some serious work to annotate the IO stack to prevent > &g

Re: [PATCH 7/7] xfs: re-enable XFS per-inode DAX

2017-09-25 Thread Dave Chinner
under the i_mmap_rwsem, then we have a deadlock vector. Historically we've avoided any mm/ level interactions under the ILOCK_EXCL because of it's location in the page fault path locking order (e.g. lockdep will go nuts if we take a page fault with the ILOCK held). Hence I'm extremely wary of putting any other mm/ level locks under the ILOCK like this without a clear explanation of the locking orders and why it won't deadlock Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 7/7] xfs: re-enable XFS per-inode DAX

2017-09-25 Thread Dave Chinner
ve a deadlock vector. Historically we've avoided any mm/ level interactions under the ILOCK_EXCL because of it's location in the page fault path locking order (e.g. lockdep will go nuts if we take a page fault with the ILOCK held). Hence I'm extremely wary of putting any other mm/ level locks under the ILOCK like this without a clear explanation of the locking orders and why it won't deadlock Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/7] xfs: always use DAX if mount option is used

2017-09-25 Thread Dave Chinner
that didn't/did work with DAX correctly so they didn't need multiple filesystems on pmem to segregate the apps that did/didn't work with DAX... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/7] xfs: always use DAX if mount option is used

2017-09-25 Thread Dave Chinner
that didn't/did work with DAX correctly so they didn't need multiple filesystems on pmem to segregate the apps that did/didn't work with DAX... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 4/7] xfs: protect S_DAX transitions in XFS write path

2017-09-25 Thread Dave Chinner
re are lots of applications out there that rely on these semantics for performance. CHeers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/7] xfs: protect S_DAX transitions in XFS read path

2017-09-25 Thread Dave Chinner
return 0; /* skip atime */ > > file_accessed(iocb->ki_filp); > - > - xfs_ilock(ip, XFS_IOLOCK_SHARED); > - ret = iomap_dio_rw(iocb, to, _iomap_ops, NULL); > - xfs_iunlock(ip, XFS_IOLOCK_SHARED); > - > - return ret; > + return iomap_dio_r

Re: [PATCH 4/7] xfs: protect S_DAX transitions in XFS write path

2017-09-25 Thread Dave Chinner
re are lots of applications out there that rely on these semantics for performance. CHeers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/7] xfs: protect S_DAX transitions in XFS read path

2017-09-25 Thread Dave Chinner
file_accessed(iocb->ki_filp); > - > - xfs_ilock(ip, XFS_IOLOCK_SHARED); > - ret = iomap_dio_rw(iocb, to, _iomap_ops, NULL); > - xfs_iunlock(ip, XFS_IOLOCK_SHARED); > - > - return ret; > + return iomap_dio_rw(iocb, to, _iomap_ops, NULL); This puts file_accessed under the XFS_IOLOCK_SHARED now. Is that a safe/sane thing to do for DIO? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: shared/298 lockdep splat?

2017-09-20 Thread Dave Chinner
itive. Peter, this is the sort of false positive I mentioned were likely to occur without some serious work to annotate the IO stack to prevent them. We can nest multiple layers of IO completions and locking in the IO stack via things like loop and RAID devices. They can be nested to arbitrary depths, too (e.g. loop on fs on loop on fs on dm-raid on n * (loop on fs) on bdev) so this new completion lockdep checking is going to be a source of false positives until there is an effective (and simple!) way of providing context based completion annotations to avoid them... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: shared/298 lockdep splat?

2017-09-20 Thread Dave Chinner
itive. Peter, this is the sort of false positive I mentioned were likely to occur without some serious work to annotate the IO stack to prevent them. We can nest multiple layers of IO completions and locking in the IO stack via things like loop and RAID devices. They can be nested to arbitrary depths, too (e.g. loop on fs on loop on fs on dm-raid on n * (loop on fs) on bdev) so this new completion lockdep checking is going to be a source of false positives until there is an effective (and simple!) way of providing context based completion annotations to avoid them... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
On Mon, Sep 18, 2017 at 05:00:58PM -0500, Eric Sandeen wrote: > On 9/18/17 4:31 PM, Dave Chinner wrote: > > On Mon, Sep 18, 2017 at 09:28:55AM -0600, Jens Axboe wrote: > >> On 09/18/2017 09:27 AM, Christoph Hellwig wrote: > >>> On Mon, Sep 18, 2017 at 08:26:

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
On Mon, Sep 18, 2017 at 05:00:58PM -0500, Eric Sandeen wrote: > On 9/18/17 4:31 PM, Dave Chinner wrote: > > On Mon, Sep 18, 2017 at 09:28:55AM -0600, Jens Axboe wrote: > >> On 09/18/2017 09:27 AM, Christoph Hellwig wrote: > >>> On Mon, Sep 18, 2017 at 08:26:

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
should also have a comment like the post IO invalidation - the comment probably got dropped and not noticed when the changeover from internal XFS code to generic iomap code was made... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
should also have a comment like the post IO invalidation - the comment probably got dropped and not noticed when the changeover from internal XFS code to generic iomap code was made... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
g triggered. It needs to be on by default, bu tI'm sure we can wrap it with something like an xfs_alert_tag() type of construct so the tag can be set in /proc/fs/xfs/panic_mask to suppress it if testers so desire. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [linux-next][XFS][trinity] WARNING: CPU: 32 PID: 31369 at fs/iomap.c:993

2017-09-18 Thread Dave Chinner
g triggered. It needs to be on by default, bu tI'm sure we can wrap it with something like an xfs_alert_tag() type of construct so the tag can be set in /proc/fs/xfs/panic_mask to suppress it if testers so desire. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [GIT PULL] overlayfs update for 4.14

2017-09-17 Thread Dave Chinner
haring of multiply referenced data blocks. I don't see overlay being involved in this functionality at all Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [GIT PULL] overlayfs update for 4.14

2017-09-17 Thread Dave Chinner
haring of multiply referenced data blocks. I don't see overlay being involved in this functionality at all Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: iov_iter_pipe warning.

2017-09-12 Thread Dave Chinner
On Mon, Sep 11, 2017 at 09:07:13PM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 04:44:40PM +1000, Dave Chinner wrote: > > > > iov_iter_get_pages() for pipe-backed destination does page allocation > > > and inserts freshly allocated pages into pipe. > > > &

Re: iov_iter_pipe warning.

2017-09-12 Thread Dave Chinner
On Mon, Sep 11, 2017 at 09:07:13PM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 04:44:40PM +1000, Dave Chinner wrote: > > > > iov_iter_get_pages() for pipe-backed destination does page allocation > > > and inserts freshly allocated pages into pipe. > > > &

Re: iov_iter_pipe warning.

2017-09-11 Thread Dave Chinner
On Mon, Sep 11, 2017 at 04:32:22AM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote: > > > splice does not go down the direct IO path, so iomap_dio_actor() > > should never be handled a pipe as the destination for the IO data. > >

Re: iov_iter_pipe warning.

2017-09-11 Thread Dave Chinner
On Mon, Sep 11, 2017 at 04:32:22AM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote: > > > splice does not go down the direct IO path, so iomap_dio_actor() > > should never be handled a pipe as the destination for the IO data. > >

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
On Mon, Sep 11, 2017 at 12:07:23AM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 08:08:14AM +1000, Dave Chinner wrote: > > On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote: > > > On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote: > > > > On Sun, Se

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
On Mon, Sep 11, 2017 at 12:07:23AM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 08:08:14AM +1000, Dave Chinner wrote: > > On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote: > > > On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote: > > > > On Sun, Se

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote: > > On Sun, Sep 10, 2017 at 03:57:21AM +0100, Al Viro wrote: > > > On Sat, Sep 09, 2017 at 09:07:56PM -0400, Dave Jones wrote: > > > > >

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote: > On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote: > > On Sun, Sep 10, 2017 at 03:57:21AM +0100, Al Viro wrote: > > > On Sat, Sep 09, 2017 at 09:07:56PM -0400, Dave Jones wrote: > > > > >

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
arning in the logs. The usual vector is an app that mixes concurrent DIO with mmap access to the same file, which we explicitly say "don't do this because data corruption" in the open(2) man page Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: iov_iter_pipe warning.

2017-09-10 Thread Dave Chinner
arning in the logs. The usual vector is an app that mixes concurrent DIO with mmap access to the same file, which we explicitly say "don't do this because data corruption" in the open(2) man page Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: XFS mounted with 'discard' option - deleting fio test files slow

2017-09-07 Thread Dave Chinner
random write with direct IO. 5GB file. Probably got a million 4k extents in it. Which means XFS has sent a million tiny 4k discards to the device. Run 'xfs_bmap -vvp fio_test_file.*' to confirm. Don't use "-o discard" if you care about performance. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: XFS mounted with 'discard' option - deleting fio test files slow

2017-09-07 Thread Dave Chinner
random write with direct IO. 5GB file. Probably got a million 4k extents in it. Which means XFS has sent a million tiny 4k discards to the device. Run 'xfs_bmap -vvp fio_test_file.*' to confirm. Don't use "-o discard" if you care about performance. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/9] add ext4 per-inode DAX flag

2017-09-07 Thread Dave Chinner
On Thu, Sep 07, 2017 at 04:19:00PM -0600, Ross Zwisler wrote: > On Fri, Sep 08, 2017 at 08:12:01AM +1000, Dave Chinner wrote: > > On Thu, Sep 07, 2017 at 03:51:48PM -0600, Ross Zwisler wrote: > > > On Thu, Sep 07, 2017 at 03:26:10PM -0600, Andreas Dilger wrote: > &g

Re: [PATCH 0/9] add ext4 per-inode DAX flag

2017-09-07 Thread Dave Chinner
On Thu, Sep 07, 2017 at 04:19:00PM -0600, Ross Zwisler wrote: > On Fri, Sep 08, 2017 at 08:12:01AM +1000, Dave Chinner wrote: > > On Thu, Sep 07, 2017 at 03:51:48PM -0600, Ross Zwisler wrote: > > > On Thu, Sep 07, 2017 at 03:26:10PM -0600, Andreas Dilger wrote: > &g

Re: [PATCH 0/9] add ext4 per-inode DAX flag

2017-09-07 Thread Dave Chinner
ches. That's not an option for production machines. Neat idea, but one I'd already thought of and discarded as "not practical from an admin perspective". Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/9] add ext4 per-inode DAX flag

2017-09-07 Thread Dave Chinner
ches. That's not an option for production machines. Neat idea, but one I'd already thought of and discarded as "not practical from an admin perspective". Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: iov_iter_pipe warning.

2017-09-06 Thread Dave Chinner
already full before we try to read from the filesystem? That doesn't seem like an XFS problem - it indicates the pipe we are filling in generic_file_splice_read() is not being emptied by whatever we are splicing the file data to Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: iov_iter_pipe warning.

2017-09-06 Thread Dave Chinner
already full before we try to read from the filesystem? That doesn't seem like an XFS problem - it indicates the pipe we are filling in generic_file_splice_read() is not being emptied by whatever we are splicing the file data to Cheers, Dave. -- Dave Chinner da...@fromorbit.com

[PATCH] swapon: fix vfree() badness

2017-09-04 Thread Dave Chinner
From: Dave Chinner <dchin...@redhat.com> The cluster_info structure is allocated with kvzalloc(), which can return kmalloc'd or vmalloc'd memory. It must be paired with kvfree(), but sys_swapon uses vfree(), resultin in this warning from xfstests generic/357: [ 1985.294915] swapon: swapfi

[PATCH] swapon: fix vfree() badness

2017-09-04 Thread Dave Chinner
From: Dave Chinner The cluster_info structure is allocated with kvzalloc(), which can return kmalloc'd or vmalloc'd memory. It must be paired with kvfree(), but sys_swapon uses vfree(), resultin in this warning from xfstests generic/357: [ 1985.294915] swapon: swapfile has holes [ 1985.296012

Re: linux-next: build warning after merge of the xfs tree

2017-08-31 Thread Dave Chinner
dered; > > > + > > > + aborted = !!(lip->li_flags & XFS_LI_ABORTED); > > > + hold = !!(bip->bli_flags & XFS_BLI_HOLD); > > > + dirty = !!(bip->bli_flags & XFS_BLI_DIRTY); > > > + ordered = !!(bip->bli_flags & XFS_BLI_ORDERED); > &

Re: linux-next: build warning after merge of the xfs tree

2017-08-31 Thread Dave Chinner
dered; > > > + > > > + aborted = !!(lip->li_flags & XFS_LI_ABORTED); > > > + hold = !!(bip->bli_flags & XFS_BLI_HOLD); > > > + dirty = !!(bip->bli_flags & XFS_BLI_DIRTY); > > > + ordered = !!(bip->bli_flags & XFS_BLI_ORDERED); > &

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-30 Thread Dave Chinner
On Wed, Aug 30, 2017 at 12:14:03AM -0700, Christoph Hellwig wrote: > On Wed, Aug 30, 2017 at 07:51:57AM +1000, Dave Chinner wrote: > > Right, I've looked at btrees, too, but it's more complex than just > > using an rbtree. I originally looked at using Peter Z's old > >

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-30 Thread Dave Chinner
On Wed, Aug 30, 2017 at 12:14:03AM -0700, Christoph Hellwig wrote: > On Wed, Aug 30, 2017 at 07:51:57AM +1000, Dave Chinner wrote: > > Right, I've looked at btrees, too, but it's more complex than just > > using an rbtree. I originally looked at using Peter Z's old > >

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
went back and forth on that, and given all the things it touched, it > seemed like too large a CC list. :) I can explicitly add the xfs list > to the first three for any future versions. If you are touching multiple filesystems, you really should cc the entire patchset to linux-fsdevel, similar to how you sent the entire patchset to lkml. That way the entire series will end up on a list that almost all fs developers read. LKML is not a list you can rely on all filesystem developers reading (or developers in any other subsystem, for that matter)... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
too large a CC list. :) I can explicitly add the xfs list > to the first three for any future versions. If you are touching multiple filesystems, you really should cc the entire patchset to linux-fsdevel, similar to how you sent the entire patchset to lkml. That way the entire series will end up on a list that almost all fs developers read. LKML is not a list you can rely on all filesystem developers reading (or developers in any other subsystem, for that matter)... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
On Tue, Aug 29, 2017 at 05:45:36AM -0700, Christoph Hellwig wrote: > On Tue, Aug 29, 2017 at 10:31:26PM +1000, Dave Chinner wrote: > > Probably should. I've already been looking at killing the inline > > extents array to simplify the management of the extent list (much >

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
On Tue, Aug 29, 2017 at 05:45:36AM -0700, Christoph Hellwig wrote: > On Tue, Aug 29, 2017 at 10:31:26PM +1000, Dave Chinner wrote: > > Probably should. I've already been looking at killing the inline > > extents array to simplify the management of the extent list (much >

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
data would get rid of the other part of the union the inline data sits in. OTOH, if we're going to have to dynamically allocate the memory for the extent/inline data for the data fork, it may just be easier to make the entire data fork a dynamic allocation (like the attr fork). Cheers, Dave. --

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-29 Thread Dave Chinner
data would get rid of the other part of the union the inline data sits in. OTOH, if we're going to have to dynamically allocate the memory for the extent/inline data for the data fork, it may just be easier to make the entire data fork a dynamic allocation (like the attr fork). Cheers, Dave. --

Re: [PATCH] xfs: Drop setting redundant PF_KSWAPD in kswapd context

2017-08-24 Thread Dave Chinner
in a different context. So this patch > loses the kswapd context. Yup. That's what the code does, and removing the PF_KSWAPD from it will break it. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] xfs: Drop setting redundant PF_KSWAPD in kswapd context

2017-08-24 Thread Dave Chinner
in a different context. So this patch > loses the kswapd context. Yup. That's what the code does, and removing the PF_KSWAPD from it will break it. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-22 Thread Dave Chinner
On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote: > On Tue, Aug 22, 2017 at 03:46:03PM +1000, Dave Chinner wrote: > > Even if I ignore the fact that buffer completions are run on > > different workqueues, there seems to be a bigger problem with this > > sort o

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-22 Thread Dave Chinner
On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote: > On Tue, Aug 22, 2017 at 03:46:03PM +1000, Dave Chinner wrote: > > Even if I ignore the fact that buffer completions are run on > > different workqueues, there seems to be a bigger problem with this > > sort o

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-21 Thread Dave Chinner
waited for all in progress IO to complete. Hence while the truncate runs and blocks on metadata IO completions, no data IO can be in progress on that inode, so there is no completions being run on that inode in workqueues. And therefore the IO completion deadlock path reported by lockdep can not actually be executed during a truncate, and so it's a false positive. Back to the drawing board, I guess Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-21 Thread Dave Chinner
waited for all in progress IO to complete. Hence while the truncate runs and blocks on metadata IO completions, no data IO can be in progress on that inode, so there is no completions being run on that inode in workqueues. And therefore the IO completion deadlock path reported by lockdep can not actually be executed during a truncate, and so it's a false positive. Back to the drawing board, I guess Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 0/3] MAP_DIRECT and block-map sealed files

2017-08-15 Thread Dave Chinner
fd that points to a real file. Oh, even more problematic: Seals are a property of an inode. [] Furthermore, seals can never be removed, only added. That seems somewhat difficult to reconcile with how I need F_SEAL_IOMAP to operate. /me calls it a day and goes looking for the hard liquor. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 0/3] MAP_DIRECT and block-map sealed files

2017-08-15 Thread Dave Chinner
fd that points to a real file. Oh, even more problematic: Seals are a property of an inode. [] Furthermore, seals can never be removed, only added. That seems somewhat difficult to reconcile with how I need F_SEAL_IOMAP to operate. /me calls it a day and goes looking for the hard liquor. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap

2017-08-13 Thread Dave Chinner
en by the filesystem via the break_layouts() interface, and the break then blocks until the app releases the lease? So the seal lifetime is bounded by the lease? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap

2017-08-13 Thread Dave Chinner
en by the filesystem via the break_layouts() interface, and the break then blocks until the app releases the lease? So the seal lifetime is bounded by the lease? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 07:31:54PM -0700, Darrick J. Wong wrote: > On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote: > > On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote: > > > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner <da...@fromorbit.com> wrote

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 07:31:54PM -0700, Darrick J. Wong wrote: > On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote: > > On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote: > > > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner wrote: > > > > On T

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote: > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner <da...@fromorbit.com> wrote: > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote: > >> >From falloc.h: > >> > >> FALLOC_FL_S

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote: > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner wrote: > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote: > >> >From falloc.h: > >> > >> FALLOC_FL_SEAL_BLOCK_M

Re: [PATCH v3 6/6] mm, xfs: protect swapfile contents with immutable + unwritten extents

2017-08-11 Thread Dave Chinner
r downgrades their kernel the swapfile suddenly can not be used by the older kernel. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 6/6] mm, xfs: protect swapfile contents with immutable + unwritten extents

2017-08-11 Thread Dave Chinner
r downgrades their kernel the swapfile suddenly can not be used by the older kernel. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
- having the seal operation also modify the extent map means it's not useful for the use cases where we need the extent map to remain unmodified Thoughts? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
- having the seal operation also modify the extent map means it's not useful for the use cases where we need the extent map to remain unmodified Thoughts? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 1/5] fs, xfs: introduce S_IOMAP_IMMUTABLE

2017-08-06 Thread Dave Chinner
being shut down because "Christoph shouted nasty words at me but I still don't understand why?". Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 1/5] fs, xfs: introduce S_IOMAP_IMMUTABLE

2017-08-06 Thread Dave Chinner
being shut down because "Christoph shouted nasty words at me but I still don't understand why?". Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-04 Thread Dave Chinner
On Fri, Aug 04, 2017 at 04:43:50PM -0700, Dan Williams wrote: > On Fri, Aug 4, 2017 at 4:31 PM, Dave Chinner <da...@fromorbit.com> wrote: > > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: > >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c

Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-04 Thread Dave Chinner
On Fri, Aug 04, 2017 at 04:43:50PM -0700, Dan Williams wrote: > On Fri, Aug 4, 2017 at 4:31 PM, Dave Chinner wrote: > > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: > >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > >> index fe0f8f

Re: [PATCH v2 4/5] xfs: introduce XFS_DIFLAG2_IOMAP_IMMUTABLE

2017-08-04 Thread Dave Chinner
t; fallocate(FALLOC_FL_[UN]SEAL_BLOCK_MAP). Support for toggling this > > on-disk state is saved for a later patch. > > > > Cc: Jan Kara <j...@suse.cz> > > Cc: Jeff Moyer <jmo...@redhat.com> > > Cc: Christoph Hellwig <h...@lst.de> > > Cc: Ross

Re: [PATCH v2 4/5] xfs: introduce XFS_DIFLAG2_IOMAP_IMMUTABLE

2017-08-04 Thread Dave Chinner
t; fallocate(FALLOC_FL_[UN]SEAL_BLOCK_MAP). Support for toggling this > > on-disk state is saved for a later patch. > > > > Cc: Jan Kara > > Cc: Jeff Moyer > > Cc: Christoph Hellwig > > Cc: Ross Zwisler > > Suggested-by: Dave Chinner > > Sugges

Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-04 Thread Dave Chinner
so we've already guaranteed that it won't have holes in it. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-04 Thread Dave Chinner
so we've already guaranteed that it won't have holes in it. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/3] fs, xfs: introduce S_IOMAP_IMMUTABLE

2017-07-31 Thread Dave Chinner
t to turn a swap file back into a regular > file. > I haven't fully followed DAX, but I'd take your word for it if people want to > be able to remove the flag after. DAX isn't the driver of that functionality, it's the other use cases that need it, and why the proposed "only remove flag if len == 0" API is a non-starter Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/3] fs, xfs: introduce S_IOMAP_IMMUTABLE

2017-07-31 Thread Dave Chinner
t to turn a swap file back into a regular > file. > I haven't fully followed DAX, but I'd take your word for it if people want to > be able to remove the flag after. DAX isn't the driver of that functionality, it's the other use cases that need it, and why the proposed "only remove flag if len == 0" API is a non-starter Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/3] xfs: persist S_IOMAP_IMMUTABLE in di_flags2

2017-07-31 Thread Dave Chinner
e/zero anything... IOWs, this flag should be the last thing that is set on the inode once it's been fully allocated and zeroed. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/3] xfs: persist S_IOMAP_IMMUTABLE in di_flags2

2017-07-31 Thread Dave Chinner
e/zero anything... IOWs, this flag should be the last thing that is set on the inode once it's been fully allocated and zeroed. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/3] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-07-31 Thread Dave Chinner
not. Perhaps it would be better to start with a man page documenting the desired API? FWIW, the if/else if/else structure could be cleaned up with a simple "goto out_unlock" construct such as: /* don't make immutable if inode is currently mapped */ error = -EBUSY; if (mapping_mapped(mapping)) goto out_unlock; /* can't do anything if inode is already immutable */ error = -ETXTBSY; if (IS_IMMUTABLE(inode) || IS_IOMAP_IMMUTABLE(inode)) goto out_unlock; /* XFS only supports whole file extent immutability */ error = -EINVAL; if (len != i_size_read(inode)) goto out_unlock; /* all good to go */ error = 0; out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); i_mmap_unlock_read(mapping); if (error) return error; /* now unshare, allocate and add immutable flag */ Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/3] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-07-31 Thread Dave Chinner
ter to start with a man page documenting the desired API? FWIW, the if/else if/else structure could be cleaned up with a simple "goto out_unlock" construct such as: /* don't make immutable if inode is currently mapped */ error = -EBUSY; if (mapping_mapped(mapping)) goto out_unlock; /* can't do anything if inode is already immutable */ error = -ETXTBSY; if (IS_IMMUTABLE(inode) || IS_IOMAP_IMMUTABLE(inode)) goto out_unlock; /* XFS only supports whole file extent immutability */ error = -EINVAL; if (len != i_size_read(inode)) goto out_unlock; /* all good to go */ error = 0; out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); i_mmap_unlock_read(mapping); if (error) return error; /* now unshare, allocate and add immutable flag */ Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-18 Thread Dave Chinner
On Tue, Jul 18, 2017 at 05:28:14PM -0700, David Rientjes wrote: > On Tue, 18 Jul 2017, Dave Chinner wrote: > > > > Thanks for looking into this, Dave! > > > > > > The number of GFP_NOFS allocations that build up the deferred counts can > > > be unboun

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-18 Thread Dave Chinner
On Tue, Jul 18, 2017 at 05:28:14PM -0700, David Rientjes wrote: > On Tue, 18 Jul 2017, Dave Chinner wrote: > > > > Thanks for looking into this, Dave! > > > > > > The number of GFP_NOFS allocations that build up the deferred counts can > > > be unboun

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-17 Thread Dave Chinner
On Mon, Jul 17, 2017 at 01:37:35PM -0700, David Rientjes wrote: > On Mon, 17 Jul 2017, Dave Chinner wrote: > > > > This is a side effect of super_cache_count() returning the appropriate > > > count but super_cache_scan() refusing to do anything about it and >

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-17 Thread Dave Chinner
On Mon, Jul 17, 2017 at 01:37:35PM -0700, David Rientjes wrote: > On Mon, 17 Jul 2017, Dave Chinner wrote: > > > > This is a side effect of super_cache_count() returning the appropriate > > > count but super_cache_scan() refusing to do anything about it and >

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-16 Thread Dave Chinner
s, then we end up with filesystem caches being trashed in light memory pressure conditions. This is, generally speaking, bad for workloads that rely on filesystem caches for performance (e.g git, NFS servers, etc). What we have now is effectively a brute force solution that finds a decent middle ground most of the time. It's not perfect, but I'm yet to find a better solution Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [rfc] superblock shrinker accumulating excessive deferred counts

2017-07-16 Thread Dave Chinner
s, then we end up with filesystem caches being trashed in light memory pressure conditions. This is, generally speaking, bad for workloads that rely on filesystem caches for performance (e.g git, NFS servers, etc). What we have now is effectively a brute force solution that finds a decent middle ground most of the time. It's not perfect, but I'm yet to find a better solution Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-22 Thread Dave Chinner
On Wed, Jun 21, 2017 at 09:07:57PM -0700, Andy Lutomirski wrote: > On Wed, Jun 21, 2017 at 5:02 PM, Dave Chinner <da...@fromorbit.com> wrote: > > > > You seem to be calling the "fdatasync on every page fault" the > > It's the opposite of fdatasync()

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-22 Thread Dave Chinner
On Wed, Jun 21, 2017 at 09:07:57PM -0700, Andy Lutomirski wrote: > On Wed, Jun 21, 2017 at 5:02 PM, Dave Chinner wrote: > > > > You seem to be calling the "fdatasync on every page fault" the > > It's the opposite of fdatasync(). It needs to sync whatever metada

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-21 Thread Dave Chinner
On Tue, Jun 20, 2017 at 10:18:24PM -0700, Andy Lutomirski wrote: > On Tue, Jun 20, 2017 at 6:40 PM, Dave Chinner <da...@fromorbit.com> wrote: > >> A per-inode > >> count of the number of live DAX mappings or of the number of struct > >> file instances tha

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-21 Thread Dave Chinner
On Tue, Jun 20, 2017 at 10:18:24PM -0700, Andy Lutomirski wrote: > On Tue, Jun 20, 2017 at 6:40 PM, Dave Chinner wrote: > >> A per-inode > >> count of the number of live DAX mappings or of the number of struct > >> file instances that have requested DAX would work

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-21 Thread Dave Chinner
ut/ to grouse about this syscall, then realized that maybe it > /is/ useful to be able to check a specific alignment. Maybe not, since > I had something more permanent in mind anyway. In any case, just pass > in an opened fd if this sticks around. We can do all that via fallocate(), too... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-21 Thread Dave Chinner
ut/ to grouse about this syscall, then realized that maybe it > /is/ useful to be able to check a specific alignment. Maybe not, since > I had something more permanent in mind anyway. In any case, just pass > in an opened fd if this sticks around. We can do all that via fallocate(), too... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-20 Thread Dave Chinner
On Tue, Jun 20, 2017 at 06:24:03PM -0700, Darrick J. Wong wrote: > On Wed, Jun 21, 2017 at 09:53:46AM +1000, Dave Chinner wrote: > > On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote: > > > An immutable-extent DAX-file and a reflink-capable DAX-file are not > &

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-20 Thread Dave Chinner
On Tue, Jun 20, 2017 at 06:24:03PM -0700, Darrick J. Wong wrote: > On Wed, Jun 21, 2017 at 09:53:46AM +1000, Dave Chinner wrote: > > On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote: > > > An immutable-extent DAX-file and a reflink-capable DAX-file are not > &

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-20 Thread Dave Chinner
On Tue, Jun 20, 2017 at 09:14:24AM -0700, Andy Lutomirski wrote: > On Tue, Jun 20, 2017 at 3:11 AM, Dave Chinner <da...@fromorbit.com> wrote: > > On Mon, Jun 19, 2017 at 10:53:12PM -0700, Andy Lutomirski wrote: > >> On Mon, Jun 19, 2017 at 5:46 PM, Dave Chinner <

<    4   5   6   7   8   9   10   11   12   13   >