Re: [PATCH RFC tip/core/rcu] Add shrinker to shift to fast/inefficient GP mode

2020-05-12 Thread Dave Chinner
ould be trying to expedite kfree_rcu() unless there is a good reason to do so (e.g. at unmount to ensure everything allocated by a filesystem has actually been freed). Hence I'd much prefer the decision to expedite callbacks is made by the RCU subsystem based on its known callback load and some indication of how close memory reclaim is to declaring OOM... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH RFC 2/8] selftests: add stress testing tool for dcache

2020-05-12 Thread Dave Chinner
dirstress, metaperf, etc) for exercising name-based operations like this, so it would fit right in. That way it would get run by just about every filesystem developer and distro QE department automatically and extremely frequently... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup

2020-05-05 Thread Dave Chinner
On Wed, Apr 29, 2020 at 12:25:40PM +0200, Jan Kara wrote: > On Wed 29-04-20 07:47:34, Dave Chinner wrote: > > On Tue, Apr 28, 2020 at 12:13:46PM -0400, Dan Schatzberg wrote: > > > The loop device runs all i/o to the backing file on a separate kworker > > > thread whi

Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup

2020-05-05 Thread Dave Chinner
On Tue, Apr 28, 2020 at 10:27:32PM -0400, Johannes Weiner wrote: > On Wed, Apr 29, 2020 at 07:47:34AM +1000, Dave Chinner wrote: > > On Tue, Apr 28, 2020 at 12:13:46PM -0400, Dan Schatzberg wrote: > > > This patch series does some > > > minor modification to the loop dri

Re: [PATCH] fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag()

2020-05-04 Thread Dave Chinner
che.c > @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag( > struct xfs_mount *mp = ip->i_mount; > struct xfs_perag *pag; > > + spin_lock(&ip->i_flags_lock); > pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); > spin_lock(>

Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup

2020-04-28 Thread Dave Chinner
not trigger an OOM kill that shoots some innocent bystander in the head. That's worse than using BUG() to report errors... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Reply: Re: [RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink

2020-04-28 Thread Dave Chinner
On Tue, Apr 28, 2020 at 08:37:32AM -0700, Darrick J. Wong wrote: > On Tue, Apr 28, 2020 at 09:24:41PM +1000, Dave Chinner wrote: > > On Tue, Apr 28, 2020 at 04:16:36AM -0700, Matthew Wilcox wrote: > > > On Tue, Apr 28, 2020 at 05:32:41PM +0800, Ruan Shiyang wrote: > > >

Re: Reply: Re: [RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink

2020-04-28 Thread Dave Chinner
On Tue, Apr 28, 2020 at 04:16:36AM -0700, Matthew Wilcox wrote: > On Tue, Apr 28, 2020 at 05:32:41PM +0800, Ruan Shiyang wrote: > > On 2020/4/28 2:43 PM, Dave Chinner wrote: > > > On Tue, Apr 28, 2020 at 06:09:47AM +, Ruan, Shiyang wrote: > > > > On 2020/4/27

Re: [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag

2019-10-21 Thread Dave Chinner
On Mon, Oct 21, 2019 at 03:49:31PM -0700, Ira Weiny wrote: > On Mon, Oct 21, 2019 at 11:45:36AM +1100, Dave Chinner wrote: > > On Sun, Oct 20, 2019 at 08:59:35AM -0700, ira.we...@intel.com wrote: > > > @@ -1232,12 +1233,10 @@ xfs_diflags_to_linux( > > >

Re: [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag

2019-10-20 Thread Dave Chinner
k(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL); > + > + if (i_size_read(inode) != 0) { > + error = -EOPNOTSUPP; > + goto out_unlock; > + } Wrong error. Should be the same as whatever is returned when we try to change the extent size hint and can't because the file is non-zero in length (-EINVAL, I think). Also needs a comment explaining why this check exists, and probably better written as i_size_read() > 0 Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/5] fs/xfs: Isolate the physical DAX flag from effective

2019-10-20 Thread Dave Chinner
ption setting, giving applications a way of guaranteeing they aren't using DAX to access the data. So if the mount option is going to live on, I suspect that we want to keep this code as it stands. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/14] iomap: lift the xfs writeback code to iomap

2019-10-17 Thread Dave Chinner
n > ioend, and cancel a page that encountered an error before it was added to > an ioend. > > Signed-off-by: Christoph Hellwig With Darrick's renaming of the .submit_ioend method, this looks fine. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 01/14] iomap: iomap that extends beyond EOF should be marked dirty

2019-10-17 Thread Dave Chinner
On Thu, Oct 17, 2019 at 11:39:17AM -0700, Darrick J. Wong wrote: > On Thu, Oct 17, 2019 at 07:56:11PM +0200, Christoph Hellwig wrote: > > From: Dave Chinner > > > > When doing a direct IO that spans the current EOF, and there are > > written blocks beyond EOF that exte

Re: [PATCH 12/12] iomap: cleanup iomap_ioend_compare

2019-10-15 Thread Dave Chinner
On Tue, Oct 15, 2019 at 05:43:45PM +0200, Christoph Hellwig wrote: > Move the initialization of ia and ib to the declaration line and remove > a superfluous else. > > Signed-off-by: Christoph Hellwig nice little cleanup. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 11/12] iomap: move struct iomap_page out of iomap.h

2019-10-15 Thread Dave Chinner
> --- > fs/iomap/buffered-io.c | 17 + > include/linux/iomap.h | 17 - > 2 files changed, 17 insertions(+), 17 deletions(-) Sensible, nothing should be playing around with internal iomap per-page state. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/12] iomap: warn on inline maps in iomap_writepage_map

2019-10-15 Thread Dave Chinner
INLINE)) > + continue; > if (wpc->iomap.type == IOMAP_HOLE) > continue; > iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc, looks fine. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 09/12] iomap: lift the xfs writeback code to iomap

2019-10-15 Thread Dave Chinner
, 0, 0); > + > + /* > + * Refuse to write the page out if we are called from reclaim context. > + * > + * This avoids stack overflows when called from deeply used stacks in > + * random callers for direct reclaim or memcg reclaim. We explicitly > + * allow reclaim from kswapd as the stack usage there is relatively low. > + * > + * This should never happen except in the case of a VM regression so > + * warn about it. > + */ > + if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) == > + PF_MEMALLOC)) > + goto redirty; > + > + /* > + * Given that we do not allow direct reclaim to call us, we should > + * never be called while in a filesystem transaction. > + */ never be called in a recursive filesystem reclaim context. > + if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS)) > + goto redirty; > + Cheers, Dave. -- Dave Chinner da...@fromorbit.com
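The two checks quoted above can be modelled outside the kernel. What follows is only a userspace sketch, not the iomap code: the SK_* bit values are illustrative stand-ins rather than the kernel's real PF_* values, and sk_must_redirty() is a hypothetical helper name. It shows why the first test masks both bits and compares against PF_MEMALLOC alone: that matches direct reclaim (PF_MEMALLOC set, PF_KSWAPD clear) while still letting kswapd through.

```c
#include <stdbool.h>

/*
 * Stand-in flag bits for illustration only. The real PF_* values are
 * defined by the kernel and are not reproduced here.
 */
#define SK_PF_MEMALLOC      0x1u
#define SK_PF_KSWAPD        0x2u
#define SK_PF_MEMALLOC_NOFS 0x4u

/*
 * Hypothetical helper: returns true when a writepage-style function
 * would refuse to write and redirty the page instead.
 */
bool sk_must_redirty(unsigned int task_flags)
{
	/* Direct reclaim: PF_MEMALLOC is set but PF_KSWAPD is not. */
	if ((task_flags & (SK_PF_MEMALLOC | SK_PF_KSWAPD)) == SK_PF_MEMALLOC)
		return true;

	/* Recursive filesystem reclaim context (NOFS scope is active). */
	if (task_flags & SK_PF_MEMALLOC_NOFS)
		return true;

	return false;
}
```

With these stand-in bits, a task with both SK_PF_MEMALLOC and SK_PF_KSWAPD set (the kswapd case) is allowed to proceed, while SK_PF_MEMALLOC on its own is not.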

Re: [PATCH 08/12] iomap: lift the xfs readpage / readpages tracing to iomap

2019-10-15 Thread Dave Chinner
On Tue, Oct 15, 2019 at 05:43:41PM +0200, Christoph Hellwig wrote: > Lift the xfs code for tracing address space operations to the iomap > layer. > > Signed-off-by: Christoph Hellwig OK. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 07/12] iomap: zero newly allocated mapped blocks

2019-10-15 Thread Dave Chinner
lso zero out mapped blocks with the new flag. > > Signed-off-by: Christoph Hellwig > Reviewed-by: Darrick J. Wong Sensible. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 06/12] xfs: remove the fork fields in the writepage_ctx and ioend

2019-10-15 Thread Dave Chinner
On Tue, Oct 15, 2019 at 05:43:39PM +0200, Christoph Hellwig wrote: > In preparation for moving the writeback code to iomap.c, replace the > XFS-specific COW fork concept with the iomap IOMAP_F_SHARED flag. > > Signed-off-by: Christoph Hellwig no problems I can spot. Reviewed-by:

Re: [PATCH 05/12] xfs: turn io_append_trans into an io_private void pointer

2019-10-15 Thread Dave Chinner
ig looks good. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 04/12] xfs: refactor the ioend merging code

2019-10-15 Thread Dave Chinner
s ok. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 03/12] xfs: use a struct iomap in xfs_writepage_ctx

2019-10-15 Thread Dave Chinner
ino > Reviewed-by: Darrick J. Wong Pretty straightforward. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/12] xfs: set IOMAP_F_NEW more carefully

2019-10-15 Thread Dave Chinner
> the iomap code to fully support file systems that don't do delayed > allocations or use unwritten extents. > > Signed-off-by: Christoph Hellwig looks fine. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 01/12] xfs: initialize iomap->flags in xfs_bmbt_to_iomap

2019-10-15 Thread Dave Chinner
de. Replace the shared parameter with a set of initial > flags and thus ensures the flags field is always reinitialized. > > Signed-off-by: Christoph Hellwig Looks fine. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: Lease semantic proposal

2019-10-10 Thread Dave Chinner
On Tue, Oct 01, 2019 at 02:01:57PM -0700, Ira Weiny wrote: > On Mon, Sep 30, 2019 at 06:42:33PM +1000, Dave Chinner wrote: > > On Wed, Sep 25, 2019 at 04:46:03PM -0700, Ira Weiny wrote: > > > On Tue, Sep 24, 2019 at 08:26:20AM +1000, Dave Chinner wrote: > > > > Hence

Re: [RFC PATCH 0/7] xfs: add reflink & dedupe support for fsdax.

2019-10-10 Thread Dave Chinner
new reflink copies to the same page... ISTR a couple of other solutions were thrown around, but I don't think anyone came up with a simple solution... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/11] iomap: copy the xfs writeback code to iomap.c

2019-10-08 Thread Dave Chinner
On Tue, Oct 08, 2019 at 08:34:36AM +0200, Christoph Hellwig wrote: > On Tue, Oct 08, 2019 at 08:43:53AM +1100, Dave Chinner wrote: > > > +static int > > > +iomap_ioend_compare(void *priv, struct list_head *a, struct list_head *b) > > > +{ > > > + struct i

Re: [PATCH] cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups

2019-10-07 Thread Dave Chinner
ve a hundred million cached inodes these days, often on a single filesystem. Anything that requires a brute-force system wide inode scan, especially without conditional reschedule points, is largely a non-starter. Also, inode_switch_wbs() is not guaranteed to move the inode to the destination wb.

Re: [PATCH 02/15] fs: Introduce i_blocks_per_page

2019-10-07 Thread Dave Chinner
On Fri, Oct 04, 2019 at 12:28:12PM -0700, Matthew Wilcox wrote: > On Wed, Sep 25, 2019 at 06:36:50PM +1000, Dave Chinner wrote: > > I'm actually working on abstracting this code from both block size > > and page size via the helpers below. We have need to support block > > siz

Re: [PATCH 05/11] iomap: zero newly allocated mapped blocks

2019-10-07 Thread Dave Chinner
On Tue, Oct 08, 2019 at 08:46:32AM +1100, Dave Chinner wrote: > On Sun, Oct 06, 2019 at 05:46:02PM +0200, Christoph Hellwig wrote: > > File systems like gfs2 don't support delayed allocations or unwritten > > extents and thus allocate normal mapped blocks to fill holes. To >

Re: [PATCH 09/11] xfs: remove the fork fields in the writepage_ctx and ioend

2019-10-07 Thread Dave Chinner
e; > if ((ioend->io_type == IOMAP_UNWRITTEN) ^ > (next->io_type == IOMAP_UNWRITTEN)) These probably should be indented too, as they are continuations, not separate logic statements. > @@ -768,7 +769,8 @@ xfs_add_to_ioend( > boolmerged, same_page = fal

Re: [PATCH 08/11] xfs: use a struct iomap in xfs_writepage_ctx

2019-10-07 Thread Dave Chinner
error; > > + if (whichfork == XFS_COW_FORK) > + flags |= IOMAP_F_SHARED; That seems out of place - I don't see anywhere in this patch that moves/removes setting the IOMAP_F_SHARED flag. i.e. this looks like a change of behaviour Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 05/11] iomap: zero newly allocated mapped blocks

2019-10-07 Thread Dave Chinner
a change of logic - why is the IOMAP_F_NEW check added here and what bug does it fix? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/11] iomap: copy the xfs writeback code to iomap.c

2019-10-07 Thread Dave Chinner
d friends... > +iomap_writepage(struct page *page, struct writeback_control *wbc, > + struct iomap_writepage_ctx *wpc, > + const struct iomap_writeback_ops *ops) > +{ > + int ret; > + > + wpc->ops = ops; > + ret = iomap_do_writepage(page, wbc

Re: Lease semantic proposal

2019-09-30 Thread Dave Chinner
On Wed, Sep 25, 2019 at 04:46:03PM -0700, Ira Weiny wrote: > On Tue, Sep 24, 2019 at 08:26:20AM +1000, Dave Chinner wrote: > > Hence, AFAICT, the above definition of a F_RDLCK|F_LAYOUT lease > > doesn't appear to be compatible with the semantics required by > > existing u

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-25 Thread Dave Chinner
On Wed, Sep 25, 2019 at 11:15:30AM +0300, Konstantin Khlebnikov wrote: > On 25/09/2019 10.18, Dave Chinner wrote: > > On Tue, Sep 24, 2019 at 12:00:17PM +0300, Konstantin Khlebnikov wrote: > > > On 24/09/2019 10.39, Dave Chinner wrote: > > > > On Mon, Sep 23, 2019 a

Re: [PATCH 02/15] fs: Introduce i_blocks_per_page

2019-09-25 Thread Dave Chinner
age. > + * > + * Context: Any context. > + * Return: The number of filesystem blocks covered by this page. > + */ > +static inline > +unsigned int i_blocks_per_page(struct inode *inode, struct page *page) > +{ > + return page_size(page) >> inode->i_blkbits; > +}
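The arithmetic in the quoted helper is just a right shift of the page size by the block-size shift. A minimal userspace sketch follows, assuming a power-of-two block size expressed as a bit count; sk_blocks_per_page() is a hypothetical stand-in and takes plain integers rather than the kernel's struct inode and struct page.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the quoted helper: the number of
 * filesystem blocks covered by a page of page_size bytes when the
 * block size is (1 << blkbits) bytes.
 */
unsigned int sk_blocks_per_page(size_t page_size, unsigned int blkbits)
{
	return (unsigned int)(page_size >> blkbits);
}
```

For example, a 4096-byte page with 1024-byte blocks (blkbits of 10) covers 4 blocks, and a 2 MiB page with 4096-byte blocks (blkbits of 12) covers 512.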

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-25 Thread Dave Chinner
On Tue, Sep 24, 2019 at 12:08:04PM -0700, Linus Torvalds wrote: > On Tue, Sep 24, 2019 at 12:39 AM Dave Chinner wrote: > > > > Stupid question: how is this any different to simply winding down > > our dirty writeback and throttling thresholds like so: > > > > # ec

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-25 Thread Dave Chinner
On Tue, Sep 24, 2019 at 12:00:17PM +0300, Konstantin Khlebnikov wrote: > On 24/09/2019 10.39, Dave Chinner wrote: > > On Mon, Sep 23, 2019 at 06:06:46PM +0300, Konstantin Khlebnikov wrote: > > > On 23/09/2019 17.52, Tejun Heo wrote: > > > > Hello, Konstantin. > >

Re: [PATCH v2 2/2] mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)

2019-09-24 Thread Dave Chinner
can remove the workaround from XFS. Users don't care how we fix the problem, they just want it fixed. If that means we have to route around dysfunctional developer groups, then we'll just have to do that Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-24 Thread Dave Chinner
substantial amount of dirty data to be cached for writeback for fragmentation minimisation algorithms to be able to do their job Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Lease semantic proposal

2019-09-23 Thread Dave Chinner
ause write() requires breaking of leases - WRLCK is open to abuse simply by not using a layout lease to do a "no change" layout modification - RDLCK|F_UNBREAK is entirely unusable - WRLCK|F_UNBREAK will be what every application uses because everything else either doesn't work or is too easy to abuse. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/5] hugetlbfs: Disable PMD sharing for large systems

2019-09-12 Thread Dave Chinner
lock is protecting needs fixing. Adding timeouts to locks and sysctls to tune them is not a viable solution to address latencies caused by algorithm scalability issues. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [xfs] 610125ab1e: fsmark.app_overhead -71.2% improvement

2019-09-09 Thread Dave Chinner
On Mon, Sep 09, 2019 at 02:06:54PM +0800, Rong Chen wrote: > Hi Dave, > > On 9/9/19 1:32 PM, Dave Chinner wrote: > > On Mon, Sep 09, 2019 at 09:58:49AM +0800, kernel test robot wrote: > > > Greeting, > > > > > > FYI, we noticed a -71.2% improv

Re: [xfs] 610125ab1e: fsmark.app_overhead -71.2% improvement

2019-09-08 Thread Dave Chinner
s is a negative improvement - it's a large positive improvement. I suspect that you need to change the metric classifications for this workload... Cheers, Dave. -- Dave Chinner dchin...@redhat.com
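The point about metric classification comes down to the sign convention for a lower-is-better figure. A small sketch, assuming an overhead-style metric where a drop is good: the function names are hypothetical and the sample values in the note below are made up to land near -71.2%, they are not taken from the report.

```c
#include <stdbool.h>

/* Relative change from before to after, in percent. */
double sk_percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

/*
 * Classify a change. For a lower-is-better metric (an assumption made
 * here for overhead figures) a negative change is an improvement.
 */
bool sk_is_improvement(double pct_change, bool lower_is_better)
{
	return lower_is_better ? pct_change < 0.0 : pct_change > 0.0;
}
```

With made-up values of 1000 before and 288 after, the change is about -71.2%, which classifies as an improvement only when the metric is marked lower-is-better.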

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-09-02 Thread Dave Chinner
On Wed, Aug 28, 2019 at 07:02:31PM -0700, Ira Weiny wrote: > On Mon, Aug 26, 2019 at 03:55:10PM +1000, Dave Chinner wrote: > > On Fri, Aug 23, 2019 at 10:08:36PM -0700, Ira Weiny wrote: > > > On Sat, Aug 24, 2019 at 10:11:24AM +1000, Dave Chinner wrote: > > > > On F

Re: [PATCH] drivers/staging/exfat - by default, prohibit mount of fat/vfat

2019-09-01 Thread Dave Chinner
On Sat, Aug 31, 2019 at 11:37:27PM -0400, Valdis Klētnieks wrote: > On Sun, 01 Sep 2019 11:07:21 +1000, Dave Chinner said: > > Totally irrelevant to the issue at hand. You can easily co-ordinate > > out of tree contributions through a github tree, or a tree on > > kernel.org,

Re: [PATCH] drivers/staging/exfat - by default, prohibit mount of fat/vfat

2019-08-31 Thread Dave Chinner
perienced filesystem developer review as you are getting now. That's the choice you have to make now: listen to the reviewers saying "resolve the fundamental issues before going any further", or you can ignore that and have it rejected after another year of work because the fundamental issues haven't been resolved while it sits in staging Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] staging: exfat: add exfat filesystem code to staging

2019-08-31 Thread Dave Chinner
On Sat, Aug 31, 2019 at 06:31:45AM -0400, Valdis Klētnieks wrote: > On Sat, 31 Aug 2019 07:54:10 +1000, Dave Chinner said: > > > The correct place for new filesystem review is where all the > > experienced filesystem developers hang out - that's linux-fsdevel, > > not

Re: [PATCH] staging: exfat: add exfat filesystem code to staging

2019-08-30 Thread Dave Chinner
. As a result, the quality and stability standard for merging a new filesystem needs to be far higher than what is acceptable for merging a new driver. The correct place for new filesystem review is where all the experienced filesystem developers hang out - that's linux-fsdevel, not the driver staging tree. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] xfs: Initialize label array properly

2019-08-30 Thread Dave Chinner
ctor here? I can't see any, mostly because this is the "set label" function and that doesn't return anything to userspace. We also zero the on-disk label before we copy the user label into it, so I don't see that anything can leak onto disk, either... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 01/19] dax: remove block device dependencies

2019-08-28 Thread Dave Chinner
, so on those grounds alone I'd suggest this is a dead end approach. Hence I think that if the dax device needs a physical offset from the start of the block device the filesystem sits on, it should be set up at dax device instantiation time and so the filesystem/bdev never needs to be queried again for this information. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-25 Thread Dave Chinner
On Fri, Aug 23, 2019 at 10:08:36PM -0700, Ira Weiny wrote: > On Sat, Aug 24, 2019 at 10:11:24AM +1000, Dave Chinner wrote: > > On Fri, Aug 23, 2019 at 09:04:29AM -0300, Jason Gunthorpe wrote: > > > On Fri, Aug 23, 2019 at 01:23:45PM +1000, Dave Chinner wrote: > >

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-23 Thread Dave Chinner
On Fri, Aug 23, 2019 at 10:15:04AM -0700, Ira Weiny wrote: > On Fri, Aug 23, 2019 at 10:59:14AM +1000, Dave Chinner wrote: > > On Wed, Aug 21, 2019 at 11:02:00AM -0700, Ira Weiny wrote: > > > On Tue, Aug 20, 2019 at 08:55:15AM -0300, Jason Gunthorpe wrote: > > > >

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-23 Thread Dave Chinner
On Fri, Aug 23, 2019 at 09:04:29AM -0300, Jason Gunthorpe wrote: > On Fri, Aug 23, 2019 at 01:23:45PM +1000, Dave Chinner wrote: > > > > But the fact that RDMA, and potentially others, can "pass the > > > pins" to other processes is something I spent a lot

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-22 Thread Dave Chinner
s for to another process, then the destination process already have a valid, active layout lease that covers the range of the pins being passed to it via the RDMA handle. i.e. as the pins pass from one process to another, they pass from the protection of the lease process A holds to the protection that the lease process B holds. This can probably even be done by duplicating the lease fd and passing it by SCM_RIGHTS first. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-22 Thread Dave Chinner
On Wed, Aug 21, 2019 at 11:02:00AM -0700, Ira Weiny wrote: > On Tue, Aug 20, 2019 at 08:55:15AM -0300, Jason Gunthorpe wrote: > > On Tue, Aug 20, 2019 at 11:12:10AM +1000, Dave Chinner wrote: > > > On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote: > > >

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Dave Chinner
On Thu, Aug 22, 2019 at 02:19:04PM +0200, Vlastimil Babka wrote: > On 8/22/19 2:07 PM, Dave Chinner wrote: > > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: > > > > No, the problem is this (using kmalloc as a general term for > > alloca

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Dave Chinner
On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: > On 8/22/19 12:14 PM, Dave Chinner wrote: > > On Thu, Aug 22, 2019 at 11:10:57AM +0200, Peter Zijlstra wrote: > >> > >> Ah, current_gfp_context() already seems to transfer PF_MEMALLOC_NOFS > >>

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Dave Chinner
On Thu, Aug 22, 2019 at 11:10:57AM +0200, Peter Zijlstra wrote: > On Thu, Aug 22, 2019 at 10:51:30AM +0200, Peter Zijlstra wrote: > > On Thu, Aug 22, 2019 at 12:59:48AM -0700, Christoph Hellwig wrote: > > > On Thu, Aug 22, 2019 at 10:31:32AM +1000, Dave Chinner wrote: > >

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-19 Thread Dave Chinner
On Mon, Aug 19, 2019 at 08:09:33PM -0700, John Hubbard wrote: > On 8/19/19 6:20 PM, Dave Chinner wrote: > > On Mon, Aug 19, 2019 at 05:05:53PM -0700, John Hubbard wrote: > > > On 8/19/19 2:24 AM, Dave Chinner wrote: > > > > On Mon, Aug 19, 2019 at 08:34:12AM +0200, Ja

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-19 Thread Dave Chinner
On Mon, Aug 19, 2019 at 05:05:53PM -0700, John Hubbard wrote: > On 8/19/19 2:24 AM, Dave Chinner wrote: > > On Mon, Aug 19, 2019 at 08:34:12AM +0200, Jan Kara wrote: > > > On Sat 17-08-19 12:26:03, Dave Chinner wrote: > > > > On Fri, Aug 16, 2019 at 12:

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-19 Thread Dave Chinner
On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote: > On Mon, Aug 19, 2019 at 07:24:09PM +1000, Dave Chinner wrote: > > > So that leaves just the normal close() syscall exit case, where the > > application has full control of the order in which resources are

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-19 Thread Dave Chinner
On Mon, Aug 19, 2019 at 08:34:12AM +0200, Jan Kara wrote: > On Sat 17-08-19 12:26:03, Dave Chinner wrote: > > On Fri, Aug 16, 2019 at 12:05:28PM -0700, Ira Weiny wrote: > > > On Thu, Aug 15, 2019 at 03:05:58PM +0200, Jan Kara wrote: > > > > On Wed 14-0

[BUG 5.3-rc5] rwsem: use after free on task_struct if task exits with rwsem held

2019-08-19 Thread Dave Chinner
d much prefer that leaked rwsems just hang and we do not add the potential for random memory corruption into these situations as well - a lock hang is much easier to debug than a memory corruption Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-16 Thread Dave Chinner
access to the file any more, and so the lease should be reclaimed at that point. I'm of a mind to make the last close() on a file block if there's an active layout lease to prevent processes from zombie-ing layout leases like this. i.e. you can't close the fd until resources that pin the lease have been released. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v5 01/18] xfs: compat_ioctl: use compat_ptr()

2019-08-15 Thread Dave Chinner
s. :) It can easily go before or after Arnd's patch, and the merge conflict either way would be minor, so I'm not really fussed either way this gets sorted out... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 02/19] fs/locks: Add Exclusive flag to user Layout lease

2019-08-14 Thread Dave Chinner
access semantics (i.e other ops fail rather than block waiting for lease recall) and hence the API shouldn't need a new flag to specify them. i.e. the primary difference between F_RDLCK and F_WRLCK layout leases is that the F_RDLCK is a shared, co-operative lease model where only delays in operations will be seen, while F_WRLCK is a "guarantee exclusive access and I don't care what it breaks" model... :) Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v5 01/18] xfs: compat_ioctl: use compat_ptr()

2019-08-14 Thread Dave Chinner
en up front you can do: void __user *arg; p = compat_ptr_mask(p); arg = (void __user *)p; and then the rest of the code remains unchanged but now uses p correctly instead of having to change all the code to cast arg back to an unsigned long... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space

2019-08-14 Thread Dave Chinner
On Wed, Aug 14, 2019 at 07:21:34AM -0400, Jeff Layton wrote: > On Wed, 2019-08-14 at 18:05 +1000, Dave Chinner wrote: > > On Mon, Aug 12, 2019 at 10:36:26AM -0700, Ira Weiny wrote: > > > On Sat, Aug 10, 2019 at 09:52:31AM +1000, Dave Chinner wrote: > > > > On Fri, Au

Re: [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space

2019-08-14 Thread Dave Chinner
On Mon, Aug 12, 2019 at 10:36:26AM -0700, Ira Weiny wrote: > On Sat, Aug 10, 2019 at 09:52:31AM +1000, Dave Chinner wrote: > > On Fri, Aug 09, 2019 at 03:58:15PM -0700, ira.we...@intel.com wrote: > > > + /* > > > + * NOTE on F_LAYOUT lease > > > + * >

Re: [RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page()

2019-08-14 Thread Dave Chinner
On Mon, Aug 12, 2019 at 11:05:51AM -0700, Ira Weiny wrote: > On Sat, Aug 10, 2019 at 09:30:37AM +1000, Dave Chinner wrote: > > On Fri, Aug 09, 2019 at 03:58:21PM -0700, ira.we...@intel.com wrote: > > > From: Ira Weiny > > > > > > dax_layout_busy_p

Re: [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space

2019-08-09 Thread Dave Chinner
he physical layout of the file if it so desires, this will block while F_RDLCK holders are notified and release their leases before the modification will take place. We need to define the semantics we expose to userspace first. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page()

2019-08-09 Thread Dave Chinner
eyond EOF on a truncate down. i.e. when we use preallocation, the extent map extends beyond EOF, and layout leases need to be able to extend beyond the current EOF to allow the lease owner to do extending writes, extending truncate, preallocation beyond EOF, etc safely without having to get a new lease to cover the new region in the extended file... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 08/19] fs/xfs: Fail truncate if page lease can't be broken

2019-08-09 Thread Dave Chinner
a file that is built in. Its only external dependency is on the break_layout() function, and XFS already has other unconditional direct calls to break_layout()... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: XFS segementation fault with new linux 4.19.63

2019-08-06 Thread Dave Chinner
: finobt AG reserves don't consider last AG can be a runt") Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 0/7] xfs: add reflink & dedupe support for fsdax.

2019-08-04 Thread Dave Chinner
which moves the iomap code to the different directory. > > I will build the dax patches on top of that. > > However, we are making a big dependency chain here > Don't worry. It's fine for me. I'll follow your updates. Hi Shiyang, I'll wait for you to update your patche

Re: [PATCH v3 0/2] mm,thp: Add filemap_huge_fault() for THP

2019-07-31 Thread Dave Chinner
On Wed, Jul 31, 2019 at 04:32:21AM -0700, Matthew Wilcox wrote: > On Wed, Jul 31, 2019 at 08:20:53PM +1000, Dave Chinner wrote: > > On Wed, Jul 31, 2019 at 02:25:11AM -0600, William Kucharski wrote: > > > This set of patches is the first step towards a mechanism for

Re: [PATCH v3 0/2] mm,thp: Add filemap_huge_fault() for THP

2019-07-31 Thread Dave Chinner
bio chain? Once you can answer that question, you should be able to easily convert the iomap_readpage/iomap_readpage_actor code to support THP pages without having to care about much else as iomap_readpage() is already coded in a way that will iterate IO over the entire THP for you Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-30 Thread Dave Chinner
nt to expose kernel memory reclaim capabilities to userspace... It would be misleading, too, because we still want to allow reclaim to occur, just not have reclaim recurse into other filesystems Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-28 Thread Dave Chinner
On Sat, Jul 27, 2019 at 02:59:59AM +, Damien Le Moal wrote: > On 2019/07/27 7:55, Theodore Y. Ts'o wrote: > > On Sat, Jul 27, 2019 at 08:44:23AM +1000, Dave Chinner wrote: > >>> > >>> This looks like something that could hit every file systems, so > >

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-26 Thread Dave Chinner
er-mapping gfp_mask. I think it has to be the entire IO path - any allocation from the underlying filesystem could recurse into the top level filesystem and then deadlock if the memory reclaim submits IO or blocks on IO completion from the upper filesystem. That's a bloody big hammer for something that is only necessary when there are stacked filesystems like this Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: pagecache locking

2019-07-09 Thread Dave Chinner
On Mon, Jul 08, 2019 at 03:31:14PM +0200, Jan Kara wrote: > On Sat 06-07-19 09:31:57, Dave Chinner wrote: > > On Wed, Jul 03, 2019 at 03:04:45AM +0300, Boaz Harrosh wrote: > > > On 20/06/2019 01:37, Dave Chinner wrote: > > > <> > > > > > > >

Re: pagecache locking

2019-07-07 Thread Dave Chinner
On Sun, Jul 07, 2019 at 06:05:16PM +0300, Boaz Harrosh wrote: > On 06/07/2019 02:31, Dave Chinner wrote: > > > > > As long as the IO ranges to the same file *don't overlap*, it should > > be perfectly safe to take separate range locks (in read or write > > mode)

Re: pagecache locking

2019-07-05 Thread Dave Chinner
On Wed, Jul 03, 2019 at 03:04:45AM +0300, Boaz Harrosh wrote: > On 20/06/2019 01:37, Dave Chinner wrote: > <> > > > > I'd prefer it doesn't get lifted to the VFS because I'm planning on > > getting rid of it in XFS with range locks. i.e. the XFS_MMAPLOCK is > >

Re: [PATCH 11/12] iomap: move the xfs writeback code to iomap.c

2019-07-01 Thread Dave Chinner
On Mon, Jul 01, 2019 at 08:43:33AM +0200, Christoph Hellwig wrote: > On Mon, Jul 01, 2019 at 10:08:59AM +1000, Dave Chinner wrote: > > > Why do you assume you have to test it? Back when we shared > > > generic_file_read with everyone you also didn't test odd change to > >

Re: [PATCH 11/12] iomap: move the xfs writeback code to iomap.c

2019-06-30 Thread Dave Chinner
On Fri, Jun 28, 2019 at 07:33:20AM +0200, Christoph Hellwig wrote: > On Fri, Jun 28, 2019 at 10:45:42AM +1000, Dave Chinner wrote: > > You've already mentioned two new users you want to add. I don't even > > have zone capable hardware here to test one of the users you are > >

Re: [PATCH 11/12] iomap: move the xfs writeback code to iomap.c

2019-06-27 Thread Dave Chinner
On Tue, Jun 25, 2019 at 12:10:20PM +0200, Christoph Hellwig wrote: > On Tue, Jun 25, 2019 at 09:43:04AM +1000, Dave Chinner wrote: > > I'm a little concerned this is going to limit what we can do > > with the XFS IO path because now we can't change this code without > > c

Re: [PATCH 12/12] iomap: add tracing for the address space operations

2019-06-27 Thread Dave Chinner
On Tue, Jun 25, 2019 at 12:15:15PM +0200, Christoph Hellwig wrote: > On Tue, Jun 25, 2019 at 09:49:21AM +1000, Dave Chinner wrote: > > > +#undef TRACE_SYSTEM > > > +#define TRACE_SYSTEM iomap > > > > Can you add a comment somewhere here that says these tracepoint

Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size updates

2019-06-27 Thread Dave Chinner
On Tue, Jun 25, 2019 at 12:25:07PM +0200, Christoph Hellwig wrote: > On Tue, Jun 25, 2019 at 09:15:23AM +1000, Dave Chinner wrote: > > > So, uh, how much of a hit do we take for having to allocate a > > > transaction for a file size extension? Particularly since we can >

Re: [PATCH 12/12] iomap: add tracing for the address space operations

2019-06-24 Thread Dave Chinner
YSTEM > +#define TRACE_SYSTEM iomap Can you add a comment somewhere here that says these tracepoints are volatile and we reserve the right to change them at any time so they don't form any sort of persistent UAPI that we have to maintain? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 11/12] iomap: move the xfs writeback code to iomap.c

2019-06-24 Thread Dave Chinner
fs/iomap-util.c for all the miscellaneous one-off functions like fiemap, etc? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size updates

2019-06-24 Thread Dave Chinner
ervation/free. If we are out of log space, then we sleep waiting for space - the issue really comes down to where it is better to sleep in that case Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 06/12] xfs: remove XFS_TRANS_NOFS

2019-06-24 Thread Dave Chinner
while we are changing over to a different GFP_NOFS allocation context mechanism Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: pagecache locking (was: bcachefs status update) merged)

2019-06-19 Thread Dave Chinner
n getting rid of it in XFS with range locks. i.e. the XFS_MMAPLOCK is likely to go away in the near term because a range lock can be taken on either side of the mmap_sem in the page fault path.
> That being said as Dave said we use those fs-private locks also for
> serializing against equivalent issues arising for DAX. So the problem is
> not only about page cache but generally about doing IO and caching
> block mapping information for a file range. So the solution should not be
> too tied to page cache.
Yup, that was the point I was trying to make when Linus started shouting at me about how caches work and how essential they are. I guess the fact that DAX doesn't use the page cache isn't as widely known as I assumed it was...
Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: pagecache locking (was: bcachefs status update) merged)

2019-06-17 Thread Dave Chinner
the completely ambiguous behaviours defined in the older specs are still just as important these days as the completely ambiguous behaviours defined in the new specifications. :/ Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: pagecache locking (was: bcachefs status update) merged)

2019-06-14 Thread Dave Chinner
On Thu, Jun 13, 2019 at 04:30:36PM -1000, Linus Torvalds wrote: > On Thu, Jun 13, 2019 at 1:56 PM Dave Chinner wrote: > > > > That said, the page cache is still far, far slower than direct IO, > > Bullshit, Dave. > > You've made that claim before, and it's been com

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-13 Thread Dave Chinner
On Thu, Jun 13, 2019 at 01:34:06PM -0700, Ira Weiny wrote: > On Thu, Jun 13, 2019 at 10:55:52AM +1000, Dave Chinner wrote: > > On Wed, Jun 12, 2019 at 04:30:24PM -0700, Ira Weiny wrote: > > > On Wed, Jun 12, 2019 at 05:37:53AM -0700, Matthew Wilcox wrote: > > > > On S

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-13 Thread Dave Chinner
On Thu, Jun 13, 2019 at 07:31:07PM -0700, Matthew Wilcox wrote: > On Fri, Jun 14, 2019 at 12:09:21PM +1000, Dave Chinner wrote: > > If the lease holder modifies the mapping in a way that causes it's > > own internal state to screw up, then that's a bug in the lease > >
