On Tue, Sep 26, 2017 at 11:35:48AM +0200, Jan Kara wrote:
> On Tue 26-09-17 09:38:12, Dave Chinner wrote:
> > On Mon, Sep 25, 2017 at 05:13:58PM -0600, Ross Zwisler wrote:
> > > Before support for the per-inode DAX flag was disabled, the XFS code
> > > had
On Thu, Sep 21, 2017 at 09:43:41AM +0300, Amir Goldstein wrote:
> On Thu, Sep 21, 2017 at 1:22 AM, Dave Chinner wrote:
> > [cc lkml, PeterZ and Byungchul]
> ...
> > The thing is, this IO completion has nothing to do with the lower
> > filesystem - it's the IO compl
On Thu, Sep 21, 2017 at 05:47:14PM +0900, Byungchul Park wrote:
> On Thu, Sep 21, 2017 at 08:22:56AM +1000, Dave Chinner wrote:
> > Peter, this is the sort of false positive I mentioned was likely to
> > occur without some serious work to annotate the IO stack to prevent
under the i_mmap_rwsem, then we have a deadlock vector.
Historically we've avoided any mm/ level interactions under the
ILOCK_EXCL because of its location in the page fault path locking
order (e.g. lockdep will go nuts if we take a page fault with the
ILOCK held). Hence I'm extremely wary of putting any other mm/ level
locks under the ILOCK like this without a clear explanation of the
locking orders and why it won't deadlock.
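A rough illustration of the inversion being described, written as
generic pthread code rather than the real XFS locking; the lock names
are purely illustrative:

/*
 * Illustrative only: the page fault path takes an mm/ level lock and
 * then the inode lock, so any path that takes the inode lock first
 * and then an mm/ level lock sets up the classic AB-BA deadlock.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t mm_level_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t inode_ilock   = PTHREAD_MUTEX_INITIALIZER;

static void page_fault_path(void)
{
        pthread_mutex_lock(&mm_level_lock);   /* mm/ level lock first... */
        pthread_mutex_lock(&inode_ilock);     /* ...then the inode lock */
        pthread_mutex_unlock(&inode_ilock);
        pthread_mutex_unlock(&mm_level_lock);
}

static void mm_lock_under_ilock_path(void)
{
        pthread_mutex_lock(&inode_ilock);     /* inode lock first... */
        pthread_mutex_lock(&mm_level_lock);   /* ...then mm/ level: inverted order */
        pthread_mutex_unlock(&mm_level_lock);
        pthread_mutex_unlock(&inode_ilock);
}

int main(void)
{
        /*
         * Serialised here so the demo terminates; run the two paths
         * concurrently and they can block on each other forever.
         */
        page_fault_path();
        mm_lock_under_ilock_path();
        printf("two lock orders for the same pair of locks == deadlock vector\n");
        return 0;
}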
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
that didn't/did work with DAX correctly so they
didn't need multiple filesystems on pmem to segregate the apps that
did/didn't work with DAX...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
re are lots of applications out there that rely on
these semantics for performance.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> 		return 0; /* skip atime */
>
> 	file_accessed(iocb->ki_filp);
> -
> -	xfs_ilock(ip, XFS_IOLOCK_SHARED);
> -	ret = iomap_dio_rw(iocb, to, &xfs_iomap_ops, NULL);
> -	xfs_iunlock(ip, XFS_IOLOCK_SHARED);
> -
> -	return ret;
> +	return iomap_dio_rw(iocb, to, &xfs_iomap_ops, NULL);
This puts file_accessed under the XFS_IOLOCK_SHARED now. Is that a
safe/sane thing to do for DIO?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
itive.
Peter, this is the sort of false positive I mentioned was likely to
occur without some serious work to annotate the IO stack to prevent
them. We can nest multiple layers of IO completions and locking in
the IO stack via things like loop and RAID devices. They can be
nested to arbitrary depths, too (e.g. loop on fs on loop on fs on
dm-raid on n * (loop on fs) on bdev) so this new completion lockdep
checking is going to be a source of false positives until there is
an effective (and simple!) way of providing context based completion
annotations to avoid them...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Sep 18, 2017 at 05:00:58PM -0500, Eric Sandeen wrote:
> On 9/18/17 4:31 PM, Dave Chinner wrote:
> > On Mon, Sep 18, 2017 at 09:28:55AM -0600, Jens Axboe wrote:
> >> On 09/18/2017 09:27 AM, Christoph Hellwig wrote:
> >>> On Mon, Sep 18, 2017 at 08:26:
should also have a comment like the post
IO invalidation - the comment probably got dropped and not noticed
when the changeover from internal XFS code to generic iomap code was
made...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
g triggered.
It needs to be on by default, but I'm sure we can wrap it with
something like an xfs_alert_tag() type of construct so the tag can
be set in /proc/fs/xfs/panic_mask to suppress it if testers so
desire.
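A purely hypothetical sketch of what such a construct could look like;
the tag bit, mask variable and suppress-on-set semantics below are
invented for illustration and are not the existing
xfs_alert_tag()/xfs_panic_mask code:

/*
 * Hypothetical illustration of a mask-gated warning: the message fires
 * by default, and a tester can set the corresponding bit in a runtime
 * mask (e.g. via /proc/fs/xfs/panic_mask) to suppress it. All names
 * here are invented.
 */
#include <stdio.h>

#define EXAMPLE_PTAG_WARN	(1u << 7)	/* invented tag bit */

static unsigned int example_suppress_mask;	/* stands in for the sysctl */

static void example_alert_tag(unsigned int tag, const char *msg)
{
        if (example_suppress_mask & tag)
                return;				/* tester opted out */
        fprintf(stderr, "XFS alert: %s\n", msg);	/* on by default */
}

int main(void)
{
        example_alert_tag(EXAMPLE_PTAG_WARN, "example warning fires by default");
        example_suppress_mask |= EXAMPLE_PTAG_WARN;	/* tester sets the bit */
        example_alert_tag(EXAMPLE_PTAG_WARN, "this one is suppressed");
        return 0;
}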
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
haring of multiply
referenced data blocks. I don't see overlay being involved in this
functionality at all
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Sep 11, 2017 at 09:07:13PM +0100, Al Viro wrote:
> On Mon, Sep 11, 2017 at 04:44:40PM +1000, Dave Chinner wrote:
>
> > > iov_iter_get_pages() for pipe-backed destination does page allocation
> > > and inserts freshly allocated pages into pipe.
> >
On Mon, Sep 11, 2017 at 04:32:22AM +0100, Al Viro wrote:
> On Mon, Sep 11, 2017 at 10:31:13AM +1000, Dave Chinner wrote:
>
> > splice does not go down the direct IO path, so iomap_dio_actor()
> > should never be handed a pipe as the destination for the IO data.
> >
On Mon, Sep 11, 2017 at 12:07:23AM +0100, Al Viro wrote:
> On Mon, Sep 11, 2017 at 08:08:14AM +1000, Dave Chinner wrote:
> > On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote:
> > > On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote:
> > > > On Sun, Se
On Sun, Sep 10, 2017 at 10:19:07PM +0100, Al Viro wrote:
> On Mon, Sep 11, 2017 at 07:11:10AM +1000, Dave Chinner wrote:
> > On Sun, Sep 10, 2017 at 03:57:21AM +0100, Al Viro wrote:
> > > On Sat, Sep 09, 2017 at 09:07:56PM -0400, Dave Jones wrote:
> > >
> >
arning in the logs. The usual vector is an app that mixes
concurrent DIO with mmap access to the same file, which we
explicitly say "don't do this because data corruption" in the
open(2) man page
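For reference, a sketch of the kind of mixed access that warning is
about, i.e. the anti-pattern the man page warns against rather than
anything to copy; the file name and sizes are arbitrary:

/* Compiles and runs, but deliberately demonstrates the "don't do this"
 * combination: the same file range touched via O_DIRECT and via a
 * writable shared mmap. Interleave the two from separate threads or
 * processes and the page cache and on-disk copies can disagree. */
#define _GNU_SOURCE		/* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        const char *path = "testfile";	/* illustrative path */
        int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

        /* writable shared mapping of the file... */
        char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        /* ...while direct IO (page-aligned buffer) hits the same range */
        void *buf;
        if (posix_memalign(&buf, 4096, 4096)) return 1;
        memset(buf, 'A', 4096);
        if (pwrite(fd, buf, 4096, 0) < 0)
                perror("pwrite(O_DIRECT)");

        map[0] = 'B';			/* dirty the mapped page as well */
        msync(map, 4096, MS_SYNC);

        munmap(map, 4096);
        free(buf);
        close(fd);
        return 0;
}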
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
random write with direct IO. 5GB file. Probably got a million 4k
extents in it. Which means XFS has sent a million tiny 4k discards
to the device. Run 'xfs_bmap -vvp fio_test_file.*' to confirm.
Don't use "-o discard" if you care about performance.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Sep 07, 2017 at 04:19:00PM -0600, Ross Zwisler wrote:
> On Fri, Sep 08, 2017 at 08:12:01AM +1000, Dave Chinner wrote:
> > On Thu, Sep 07, 2017 at 03:51:48PM -0600, Ross Zwisler wrote:
> > > On Thu, Sep 07, 2017 at 03:26:10PM -0600, Andreas Dilger wrote:
ches. That's not an option for
production machines.
Neat idea, but one I'd already thought of and discarded as "not
practical from an admin perspective".
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
already full before we
try to read from the filesystem?
That doesn't seem like an XFS problem - it indicates the pipe we are
filling in generic_file_splice_read() is not being emptied by
whatever we are splicing the file data to
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
From: Dave Chinner
The cluster_info structure is allocated with kvzalloc(), which can
return kmalloc'd or vmalloc'd memory. It must be paired with
kvfree(), but sys_swapon uses vfree(), resulting in this warning
from xfstests generic/357:
[ 1985.294915] swapon: swapfile has holes
[ 1985.296012
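A minimal sketch of the allocation/free pairing described above;
kernel-side code with invented helper names, and the real fix amounts
to calling kvfree() where vfree() was being used:

#include <linux/mm.h>
#include <linux/slab.h>

/*
 * kvzalloc() may return either kmalloc'd or vmalloc'd memory depending
 * on the size and on allocator fragmentation, so the only safe release
 * is kvfree(), which picks the matching free for whichever case it got.
 */
static void *alloc_cluster_info_array(size_t nr, size_t entry_size)
{
        return kvzalloc(nr * entry_size, GFP_KERNEL);
}

static void free_cluster_info_array(void *ci)
{
        kvfree(ci);	/* not vfree(): ci may be kmalloc'd memory */
}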
dered;
> > > +
> > > + aborted = !!(lip->li_flags & XFS_LI_ABORTED);
> > > + hold = !!(bip->bli_flags & XFS_BLI_HOLD);
> > > + dirty = !!(bip->bli_flags & XFS_BLI_DIRTY);
> > > + ordered = !!(bip->bli_flags & XFS_BLI_ORDERED);
On Wed, Aug 30, 2017 at 12:14:03AM -0700, Christoph Hellwig wrote:
> On Wed, Aug 30, 2017 at 07:51:57AM +1000, Dave Chinner wrote:
> > Right, I've looked at btrees, too, but it's more complex than just
> > using an rbtree. I originally looked at using Peter Z's old
> >
went back and forth on that, and given all the things it touched, it
> seemed like too large a CC list. :) I can explicitly add the xfs list
> to the first three for any future versions.
If you are touching multiple filesystems, you really should cc the
entire patchset to linux-fsdevel, similar to how you sent the entire
patchset to lkml. That way the entire series will end up on a list
that almost all fs developers read. LKML is not a list you can rely
on all filesystem developers reading (or developers in any other
subsystem, for that matter)...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Aug 29, 2017 at 05:45:36AM -0700, Christoph Hellwig wrote:
> On Tue, Aug 29, 2017 at 10:31:26PM +1000, Dave Chinner wrote:
> > Probably should. I've already been looking at killing the inline
> > extents array to simplify the management of the extent list (much
>
data would get rid of the other
part of the union the inline data sits in.
OTOH, if we're going to have to dynamically allocate the memory for
the extent/inline data for the data fork, it may just be easier to
make the entire data fork a dynamic allocation (like the attr fork).
Cheers,
Dave.
--
in a different context. So this patch
> loses the kswapd context.
Yup. That's what the code does, and removing the PF_KSWAPD from it
will break it.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 22, 2017 at 03:46:03PM +1000, Dave Chinner wrote:
> > Even if I ignore the fact that buffer completions are run on
> > different workqueues, there seems to be a bigger problem with this
> > sort o
waited for all in progress IO to complete. Hence while the truncate
runs and blocks on metadata IO completions, no data IO can be in
progress on that inode, so there is no completions being run on that
inode in workqueues.
And therefore the IO completion deadlock path reported by lockdep
can not actually be executed during a truncate, and so it's a false
positive.
Back to the drawing board, I guess
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
fd that
points to a real file.
Oh, even more problematic:
Seals are a property of an inode. [] Furthermore, seals
can never be removed, only added.
That seems somewhat difficult to reconcile with how I need
F_SEAL_IOMAP to operate.
/me calls it a day and goes looking for the hard liquor.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
en
by the filesystem via the break_layouts() interface, and the break
then blocks until the app releases the lease? So the seal lifetime
is bounded by the lease?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Fri, Aug 11, 2017 at 07:31:54PM -0700, Darrick J. Wong wrote:
> On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote:
> > On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> > > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner <da...@fromorbit.com> wrote
On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner <da...@fromorbit.com> wrote:
> > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> >> From falloc.h:
> >>
> >> FALLOC_FL_S
r
downgrades their kernel the swapfile suddenly can not be used by the
older kernel.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
- having the seal operation also modify the extent map
means it's not useful for the use cases where we need the extent map
to remain unmodified
Thoughts?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
being shut down because "Christoph shouted
nasty words at me but I still don't understand why?".
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Fri, Aug 04, 2017 at 04:43:50PM -0700, Dan Williams wrote:
> On Fri, Aug 4, 2017 at 4:31 PM, Dave Chinner <da...@fromorbit.com> wrote:
> > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote:
> >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
t; fallocate(FALLOC_FL_[UN]SEAL_BLOCK_MAP). Support for toggling this
> > on-disk state is saved for a later patch.
> >
> > Cc: Jan Kara
> > Cc: Jeff Moyer
> > Cc: Christoph Hellwig
> > Cc: Ross Zwisler
> > Suggested-by: Dave Chinner
> > Sugges
so we've already guaranteed that it won't have holes
in it.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
t to turn a swap file back into a regular
> file.
> I haven't fully followed DAX, but I'd take your word for it if people want to
> be able to remove the flag after.
DAX isn't the driver of that functionality, it's the other use cases
that need it, and why the proposed "only remove flag if len == 0"
API is a non-starter
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
e/zero anything...
IOWs, this flag should be the last thing that is set on the inode
once it's been fully allocated and zeroed.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
not. Perhaps it would be better to start with a man page
documenting the desired API?
FWIW, the if/else if/else structure could be cleaned up with a
simple "goto out_unlock" construct such as:
        /* don't make immutable if inode is currently mapped */
        error = -EBUSY;
        if (mapping_mapped(mapping))
                goto out_unlock;

        /* can't do anything if inode is already immutable */
        error = -ETXTBSY;
        if (IS_IMMUTABLE(inode) || IS_IOMAP_IMMUTABLE(inode))
                goto out_unlock;

        /* XFS only supports whole file extent immutability */
        error = -EINVAL;
        if (len != i_size_read(inode))
                goto out_unlock;

        /* all good to go */
        error = 0;

out_unlock:
        xfs_iunlock(ip, XFS_ILOCK_EXCL);
        i_mmap_unlock_read(mapping);
        if (error)
                return error;

        /* now unshare, allocate and add immutable flag */
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Jul 18, 2017 at 05:28:14PM -0700, David Rientjes wrote:
> On Tue, 18 Jul 2017, Dave Chinner wrote:
>
> > > Thanks for looking into this, Dave!
> > >
> > > The number of GFP_NOFS allocations that build up the deferred counts can
> > > be unboun
On Mon, Jul 17, 2017 at 01:37:35PM -0700, David Rientjes wrote:
> On Mon, 17 Jul 2017, Dave Chinner wrote:
>
> > > This is a side effect of super_cache_count() returning the appropriate
> > > count but super_cache_scan() refusing to do anything about it and
>
s, then we end up with filesystem caches being trashed in light
memory pressure conditions. This is, generally speaking, bad for
workloads that rely on filesystem caches for performance (e.g. git,
NFS servers, etc).
What we have now is effectively a brute force solution that finds
a decent middle ground most of the time. It's not perfect, but I'm
yet to find a better solution
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Jun 21, 2017 at 09:07:57PM -0700, Andy Lutomirski wrote:
> On Wed, Jun 21, 2017 at 5:02 PM, Dave Chinner wrote:
> >
> > You seem to be calling the "fdatasync on every page fault" the
>
> It's the opposite of fdatasync(). It needs to sync whatever metada
On Tue, Jun 20, 2017 at 10:18:24PM -0700, Andy Lutomirski wrote:
> On Tue, Jun 20, 2017 at 6:40 PM, Dave Chinner wrote:
> >> A per-inode
> >> count of the number of live DAX mappings or of the number of struct
> >> file instances that have requested DAX would work
ut/ to grouse about this syscall, then realized that maybe it
> /is/ useful to be able to check a specific alignment. Maybe not, since
> I had something more permanent in mind anyway. In any case, just pass
> in an opened fd if this sticks around.
We can do all that via fallocate(), too...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Jun 20, 2017 at 06:24:03PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 21, 2017 at 09:53:46AM +1000, Dave Chinner wrote:
> > On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote:
> > > An immutable-extent DAX-file and a reflink-capable DAX-file are not
On Tue, Jun 20, 2017 at 09:14:24AM -0700, Andy Lutomirski wrote:
> On Tue, Jun 20, 2017 at 3:11 AM, Dave Chinner <da...@fromorbit.com> wrote:
> > On Mon, Jun 19, 2017 at 10:53:12PM -0700, Andy Lutomirski wrote:
> >> On Mon, Jun 19, 2017 at 5:46 PM, Dave Chinner <