On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote:
> On Fri, Feb 26, 2021 at 12:51 PM Dave Chinner wrote:
> >
> > On Fri, Feb 26, 2021 at 11:24:53AM -0800, Dan Williams wrote:
> > > On Fri, Feb 26, 2021 at 11:05 AM Darrick J. Wong
> > > wrote:
> >
Then when userspace tries to access the
mapped DAX pages we get a new page fault. In processing the fault, the
filesystem will try to get direct access to the pmem from the block
device. This will get an ENODEV error from the block device because
the backing store (pmem) has been unplugged and is no longer
there...
AFAICT, as long as pmem removal invalidates all the active ptes that
point at the pmem being removed, the filesystem doesn't need to
care about device removal at all, DAX or no DAX...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
fix it in mainline that I know of.
> As I said, some vendors have tried to fix it in their NAS products,
> but I don't know where to find that patch any more.
It's not supportable from a disaster recovery perspective. I recently
saw a 14TB filesystem with billions of hardlinks in it require 240GB
of RAM to run xfs_repair. We just can't support large filesystems
on 32 bit systems, and it has nothing to do with simple stuff like
page cache index sizes...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
fset for such systems to 16TB so sparse files can't be larger
than what the kernel supports. See xfs_sb_validate_fsb_count() call
and the file offset checks against MAX_LFS_FILESIZE in
xfs_fs_fill_super()...
FWIW, XFS has been doing this for roughly 20 years now - >16TB on 32
bit machines w
n care at this
point about cross-device XCOPY?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
On Fri, Feb 12, 2021 at 03:54:48PM -0800, Darrick J. Wong wrote:
> On Sat, Feb 13, 2021 at 10:27:26AM +1100, Dave Chinner wrote:
> > On Fri, Feb 12, 2021 at 03:07:39PM -0800, Ian Lance Taylor wrote:
> > > On Fri, Feb 12, 2021 at 3:03 PM Dave Chinner wrote:
> > > >
>
On Fri, Feb 12, 2021 at 03:07:39PM -0800, Ian Lance Taylor wrote:
> On Fri, Feb 12, 2021 at 3:03 PM Dave Chinner wrote:
> >
> > On Fri, Feb 12, 2021 at 04:45:41PM +0100, Greg KH wrote:
> > > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > >
ly breaking? What changed in
> > the kernel that caused this? Procfs has been around for a _very_ long
> > time :)
>
> That would be because of (v5.3):
>
> 5dae222a5ff0 vfs: allow copy_file_range to copy across devices
>
> The intention of this change (series) was to
It is not
intended as a copy mechanism for copying data from one random file
descriptor to another.
The use of it as a general file copy mechanism in the Go system
library is incorrect and wrong. It is a userspace bug. Userspace
has done the wrong thing, userspace needs to be fixed.
-Dave.
--
Dave Chinner
da...@fromorbit.com
back. It's likely to be too much work for a bound
workqueue, too, especially when you consider that the workqueue
completion code will merge sequential ioends into one ioend, hence
making the IO completion loop counts bigger and latency problems worse
rather than better...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
to list the requested attributes of all
directories and files in the tree...
So, yeah, we do indeed do thousands of these fsxattr based
operations a second, sometimes tens of thousands a second or more,
and sometimes they are issued in bulk in performance critical paths
for container build/deployment operations
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Feb 01, 2021 at 06:14:21PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 22, 2021 at 09:20:51AM +1100, Dave Chinner wrote:
> > Hi btrfs-gurus,
> >
> > I'm running a simple reflink/snapshot/COW scalability test at the
> > moment. It is just a loop that d
On Fri, Jan 29, 2021 at 06:25:50PM -0500, Zygo Blaxell wrote:
> On Mon, Jan 25, 2021 at 09:36:55AM +1100, Dave Chinner wrote:
> > On Sat, Jan 23, 2021 at 04:42:33PM +0800, Qu Wenruo wrote:
> > >
> > >
> > > On 2021/1/22 上午6:20, Dave Chinner wrote:
> >
mechanisms. Of course, with these
special zero length files that contain ephemeral data, userspace can't
actually tell that they contain data from userspace using stat(). So
as far as userspace is concerned, copy_file_range() correctly
returned zero bytes copied from a zero byte long file and there's
nothing more to do.
This zero length file behaviour is, fundamentally, a kernel
filesystem implementation bug, not a copy_file_range() bug.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Jan 26, 2021 at 11:50:50AM +0800, Nicolas Boichat wrote:
> On Tue, Jan 26, 2021 at 9:34 AM Dave Chinner wrote:
> >
> > On Mon, Jan 25, 2021 at 03:54:31PM +0800, Nicolas Boichat wrote:
> > > Hi copy_file_range experts,
> > >
> > > We hit this in
't check the file
size and just attempts to read unconditionally from the file. Hence
it happily returns non-existent stale data from busted filesystem
implementations that allow data to be read from beyond EOF...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Sat, Jan 23, 2021 at 04:42:33PM +0800, Qu Wenruo wrote:
>
>
> On 2021/1/22 上午6:20, Dave Chinner wrote:
> > Hi btrfs-gurus,
> >
> > I'm running a simple reflink/snapshot/COW scalability test at the
> > moment. It is just a loop that does "fio overwri
On Sat, Jan 23, 2021 at 07:19:03PM -0500, Zygo Blaxell wrote:
> On Fri, Jan 22, 2021 at 09:20:51AM +1100, Dave Chinner wrote:
> > Hi btrfs-gurus,
> >
> > I'm running a simple reflink/snapshot/COW scalability test at the
> > moment. It is just a loop that d
workload, I suspect the issues I note above are
btrfs issues, not expected behaviour. I'm not sure what the
expected scalability of btrfs file clones and snapshots are though,
so I'm interested to hear if these results are expected or not.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
JOBS=4
IODEPTH=4
IOCOUNT=$((1 / $JOBS))
FILESIZE=4g
cat >$fio_config <
and so
provide the same benefit to all the filesystems that use it.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
On Fri, Jan 08, 2021 at 11:56:57AM -0500, Brian Foster wrote:
> On Fri, Jan 08, 2021 at 08:54:44AM +1100, Dave Chinner wrote:
> > e.g. we run the first transaction into the CIL, it steals the space
> > needed for the cil checkpoint headers for the transaction. Then if
> > the
On Mon, Jan 11, 2021 at 11:38:48AM -0500, Brian Foster wrote:
> On Fri, Jan 08, 2021 at 11:56:57AM -0500, Brian Foster wrote:
> > On Fri, Jan 08, 2021 at 08:54:44AM +1100, Dave Chinner wrote:
> > > On Mon, Jan 04, 2021 at 11:23:53AM -0500, Brian Foster wrote:
> > > >
ourse, it will do if you crash or even just unmount/mount a
filesystem that doesn't persist it.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
and that should also allow the work skipped on each memcg to be
accounted across multiple calls to the shrinkers for the same
memcg. Hence as memory pressure within the memcg goes up, the
repeated calls to direct reclaim within that memcg will result in
all of the freeable items in each cache eventually being freed...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Fri, Jan 08, 2021 at 03:59:22PM +0800, Ming Lei wrote:
> On Thu, Jan 07, 2021 at 09:21:11AM +1100, Dave Chinner wrote:
> > On Wed, Jan 06, 2021 at 04:45:48PM +0800, Ming Lei wrote:
> > > On Tue, Jan 05, 2021 at 07:39:38PM +0100, Christoph Hellwig wrote:
> > > > A
On Sun, Jan 03, 2021 at 05:03:33PM +0100, Donald Buczek wrote:
> On 02.01.21 23:44, Dave Chinner wrote:
> > On Sat, Jan 02, 2021 at 08:12:56PM +0100, Donald Buczek wrote:
> > > On 31.12.20 22:59, Dave Chinner wrote:
> > > > On Thu, Dec 31, 2020 at 12:48:5
On Mon, Jan 04, 2021 at 11:23:53AM -0500, Brian Foster wrote:
> On Thu, Dec 31, 2020 at 09:16:11AM +1100, Dave Chinner wrote:
> > On Wed, Dec 30, 2020 at 12:56:27AM +0100, Donald Buczek wrote:
> > > If the value goes below the limit while some threads are
> > > already
rything we need to
determine whether we should do a large or small bio vec allocation
in the iomap writeback path...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Sat, Jan 02, 2021 at 08:12:56PM +0100, Donald Buczek wrote:
> On 31.12.20 22:59, Dave Chinner wrote:
> > On Thu, Dec 31, 2020 at 12:48:56PM +0100, Donald Buczek wrote:
> > > On 30.12.20 23:16, Dave Chinner wrote:
> > One could argue that, but one should al
lifts of the context setting up into
xfs_trans_alloc() back into the patchset before adding the
current->journal functionality patch.
Also, you need to test XFS code with CONFIG_XFS_DEBUG=y so that
asserts are actually built into the code and exercised, because this
ASSERT should have fired on the first rolling transaction that the
kernel executes...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Dec 31, 2020 at 12:48:56PM +0100, Donald Buczek wrote:
> On 30.12.20 23:16, Dave Chinner wrote:
> > On Wed, Dec 30, 2020 at 12:56:27AM +0100, Donald Buczek wrote:
> > > Threads, which committed items to the CIL, wait in the
> > > xc_push_wait waitqueue when use
> wake_up_all(&cil->xc_push_wait);
That just smells wrong to me. It *might* be correct, but this
condition should pair with the sleep condition, as space used by a
CIL context should never actually decrease
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
is
> related to that, because the md block devices themselves are
> responsive (`xxd /dev/md0` )
My bet is that the OOT driver/hardware had dropped a log IO on the
floor - XFS is waiting for the CIL push to complete, and I'm betting
that is stuck waiting for iclog IO completion while writing the CIL
to the journal. The sysrq output will tell us if this is the case,
so that's the first place to look.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
inspection. But I'm
> not a VFS expert so I'm not quite sure.
Uh, if you have a shrinker racing to register and unregister, you've
got a major bug in your object initialisation/teardown code. i.e.
calling register/unregister at the same time for the same shrinker
is a bug, pure and simple.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
reservation recursion is used by XFS only, we can
> move the check into xfs_vm_writepage(s), per Dave.
>
> Cc: Darrick J. Wong
> Cc: Matthew Wilcox (Oracle)
> Cc: Christoph Hellwig
> Cc: Dave Chinner
> Cc: Michal Hocko
> Cc: David Howells
> Cc: Jeff Layton
> Sig
On Thu, Dec 17, 2020 at 03:06:27PM -0800, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 09:15:09AM +1100, Dave Chinner wrote:
> > The obvious solution: we've moved the saved process state to a
> > different context, so it is no longer needed for the current
> > t
	if (tp->t_pflags)
		memalloc_nofs_restore(tp->t_pflags);
}
and the problem is solved. The NOFS state will follow the active
transaction and not be reset until the entire transaction chain is
completed.
In the next patch you can go and introduce current->journal_info
into just the wrapper functions, maintaining the same overall
logic.
-Dave.
--
Dave Chinner
da...@fromorbit.com
--
Linux-cachefs mailing list
Linux-cachefs@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cachefs
n kswapd -- the only time we reach this code is when we're
> exiting and the task_struct is about to be destroyed anyway.
>
> Cc: Dave Chinner
> Acked-by: Michal Hocko
> Reviewed-by: Darrick J. Wong
> Reviewed-by: Christoph Hellwig
> Signed-off-by: Matthew Wilcox (O
way.
So, AFAICT, the dax_lock() stuff is only necessary when the
filesystem can't be used to resolve the owner of physical page that
went bad
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Dec 15, 2020 at 02:27:18PM -0800, Yang Shi wrote:
> On Mon, Dec 14, 2020 at 6:46 PM Dave Chinner wrote:
> >
> > On Mon, Dec 14, 2020 at 02:37:19PM -0800, Yang Shi wrote:
> > > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
>
Combine that with the proposed "watch_sb()" syscall for reporting
such errors in a generic manner to interested listeners, and we've
got a fairly solid generic path for reporting data loss events to
userspace for an appropriate user-defined action to be taken...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
u still have a user data
recovery process to perform after this...
> And how does it help in dealing with page faults upon poisoned
> dax page?
It doesn't. If the page is poisoned, the same behaviour will occur
as does now. This is simply error reporting infrastructure, not
error handling.
Future work might change how we correct the faults found in the
storage, but I think the user visible behaviour is going to be "kill
apps mapping corrupted data" for a long time yet
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
AP_DIO_NEED_SYNC))
> - ret = generic_write_sync(iocb, ret);
> + ret = generic_write_sync(dio->iocb, ret);
>
> kfree(dio);
>
> return ret;
> }
> -EXPORT_SYMBOL_GPL(iomap_dio_complete);
> +
NACK.
If you don't want iomap_dio_comple
On Tue, Dec 15, 2020 at 02:53:48PM +0100, Johannes Weiner wrote:
> On Tue, Dec 15, 2020 at 01:09:57PM +1100, Dave Chinner wrote:
> > On Mon, Dec 14, 2020 at 02:37:15PM -0800, Yang Shi wrote:
> > > Since memcg_shrinker_map_size just can be changed under holding
>
return;
>
> kfree(shrinker->nr_deferred);
> shrinker->nr_deferred = NULL;
e.g. then this function can simply do:

{
	if (shrinker->flags & SHRINKER_MEMCG_AWARE)
		return unregister_memcg_shrinker(shrinker);

	kfree(shrinker->nr_deferred);
	shrinker->nr_deferred = NULL;
}
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
acd..693a41e89969 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -201,7 +201,7 @@ DECLARE_RWSEM(shrinker_rwsem);
> #define SHRINKER_REGISTERING ((struct shrinker *)~0UL)
>
> static DEFINE_IDR(shrinker_idr);
> -static int shrinker_nr_max;
> +int shrinker_nr_max;
Then we don't need to make yet another variable global...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
ile it may help your specific corner case,
it's likely to significantly change the reclaim balance of slab
caches, especially under GFP_NOFS intensive workloads where we can
only defer the work to kswapd.
Hence I think this is still a problematic approach as it doesn't
address the reason why deferred counts are increasing out of
control in the first place
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
r will do that for static functions automatically if it makes
sense.
Ok, so you only do the memcg nr_deferred thing if NUMA_AWARE &&
sc->memcg is true. so
static long shrink_slab_set_nr_deferred_memcg(...)
{
	int nid = sc->nid;

	deferred = rcu_dereference_protected(
			memcg->nodeinfo[nid]->shrinker_deferred, true);
	return atomic_long_add_return(nr, &deferred->nr_deferred[id]);
}

static long shrink_slab_set_nr_deferred(...)
{
	int nid = sc->nid;

	if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
		nid = 0;
	else if (sc->memcg)
		return shrink_slab_set_nr_deferred_memcg(..., nid);

	return atomic_long_add_return(nr, &shrinker->nr_deferred[nid]);
}
And now there's no duplicated code.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
nd
nr_deferred pointers to the correct offset in the allocated range.
Then this patch is really only changes to the size of the chunk
being allocated, setting up the pointers and copying the relevant
data from the old to new.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
is a good idea. This couples the shrinker
infrastructure to internal details of how cgroups are initialised
and managed. Sure, certain operations might be done in certain
shrinker lock contexts, but that doesn't mean we should share global
locks across otherwise independent subsystems
Chee
up
that the barriers enforce.
IOWs, these memory barriers belong inside the cgroup code to
guarantee anything that sees an online cgroup will always see the
fully initialised cgroup structures. They do not belong in the
shrinker infrastructure...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Dec 15, 2020 at 01:03:45AM +, Pavel Begunkov wrote:
> On 15/12/2020 00:56, Dave Chinner wrote:
> > On Tue, Dec 15, 2020 at 12:20:23AM +, Pavel Begunkov wrote:
> >> As reported, we must not do pressure stall information accounting for
> >> direct IO, beca
On Tue, Dec 15, 2020 at 08:42:08AM +0800, Yafang Shao wrote:
> On Tue, Dec 15, 2020 at 5:08 AM Dave Chinner wrote:
> > On Sun, Dec 13, 2020 at 05:09:02PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 10, 2020 at 3:52 AM Darrick J. Wong
> > > wrote:
> > > > O
On Tue, Dec 15, 2020 at 12:00:23PM +1100, Dave Chinner wrote:
> On Tue, Dec 15, 2020 at 12:20:24AM +, Pavel Begunkov wrote:
> > A preparation patch. It adds a simple helper which abstracts out number
> > of segments we're allocating for a bio from iov_iter_npages().
>
io_iov_vecs_to_alloc(struct iov_iter *iter, int max_segs)
> {
> +	/* reuse iter->bvec */
> +	if (iov_iter_is_bvec(iter))
> +		return 0;
> 	return iov_iter_npages(iter, max_segs);
Ah, I'm a blind idiot... :/
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
de this specific patch, so it's not clear what it's
actually needed for...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
for paging IO */
> + bio_clear_flag(bio, BIO_WORKINGSET);
Why only do this for the old direct IO path? Why isn't this
necessary for the iomap DIO path?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> > This patch is based on Darrick's work to fix the issue in xfs/141 in the
> > > earlier version. [1]
> > >
> > > 1. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia
> > >
> > > Cc: Darrick J. Wong
>
trans_context_active
> To check whether current is in an fs transaction or not
> - xfs_trans_context_swap
> Transfer the transaction context when rolling a permanent transaction
>
> These two new helpers are introduced in xfs_trans.h.
>
> Cc: Darrick J. Wong
> Cc: Matt
On Wed, Dec 02, 2020 at 03:12:20PM +0800, Ruan Shiyang wrote:
> Hi Dave,
>
> On 2020/11/30 上午6:47, Dave Chinner wrote:
> > On Mon, Nov 23, 2020 at 08:41:10AM +0800, Shiyang Ruan wrote:
> > >
> > > The call trace is like this:
> > > memory_fail
& (PF_MEMALLOC|PF_KSWAPD)) ==
> PF_MEMALLOC))
> goto redirty;
>
> [2]. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia/
>
> Cc: Darrick J. Wong
> Cc: Matthew Wilcox (Oracle)
> Cc: Christoph Hellwig
> Cc: Dave Chinner
> Cc: M
saction context is a bug in XFS.
IOWs, we are waiting on a new version of this patchset to be posted:
https://lore.kernel.org/linux-xfs/20201103131754.94949-1-laoar.s...@gmail.com/
so that we can get rid of this from iomap and check the transaction
recursion case directly in the XFS code. Then your pr
On Wed, Dec 02, 2020 at 10:04:17PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Dec 03, 2020 at 07:40:45AM +1100, Dave Chinner wrote:
> > On Wed, Dec 02, 2020 at 08:06:01PM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Dec 02, 2020 at 06:41:43PM +0100, Miklos Szeredi wrote:
>
orrect regressions in fixes before they get propagated to users.
It also creates a clear demarcation between fixes and cc: stable for
maintainers and developers: only patches with a cc: stable will be
backported immediately to stable. Developers know what patches need
urgent backports and, unlike developers, the automated fixes scan
does not have the subject matter expertise or background to make
that judgement
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
r that filesystem instance then,
by definition, it does not support DAX and the bit should never be
set.
e.g. We don't talk about kernels that support reflink - what matters
to userspace is whether the filesystem instance supports reflink.
Think of the useless mess that xfs_info would be if it reported
kernel capabilities instead of filesystem instance capabilities.
i.e. we don't report that a filesystem supports reflink just because
the kernel supports it - it reports whether the filesystem instance
being queried supports reflink. And that also implies the kernel
supports it, because the kernel has to support it to mount the
filesystem...
So, yeah, I think it really does need to be conditional on the
filesystem instance being queried to be actually useful to users
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
is cached then we can
try to re-write it to disk to fix the bad data, otherwise we treat
it like a writeback error and report it on the next
write/fsync/close operation done on that file.
This gets rid of the mf_recover_controller altogether and allows
the interface to be used by any sort of block device for any sort
of bottom-up reporting of media/device failures.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Nov 25, 2020 at 06:46:54PM -0500, Sasha Levin wrote:
> On Thu, Nov 26, 2020 at 08:52:47AM +1100, Dave Chinner wrote:
> > We've already had one XFS upstream kernel regression in this -rc
> > cycle propagated to the stable kernels in 5.9.9 because the stable
> > pr
On Wed, Nov 25, 2020 at 10:35:50AM -0500, Sasha Levin wrote:
> From: Dave Chinner
>
> [ Upstream commit 883a790a84401f6f55992887fd7263d808d4d05d ]
>
> Jens has reported a situation where partial direct IOs can be issued
> and completed yet still return -EAGAIN. We don't
TX_ATTR_DAX in statx for either the
attributes or attributes_mask field because the filesystem is not
DAX capable. And given that we have filesystems with multiple block
devices that can have different DAX capabilities, I think this
statx() attr state (and mask) really has to come from the
filesystem, not VFS...
> Extra question: should we only set this in the attributes mask if
> CONFIG_FS_DAX=y ?
IMO, yes, because it will always be false on CONFIG_FS_DAX=n and so
it may as well not be emitted as a supported bit in the mask.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Nov 11, 2020 at 11:28:48AM +0100, Michal Suchánek wrote:
> On Tue, Nov 10, 2020 at 08:08:23AM +1100, Dave Chinner wrote:
> > On Mon, Nov 09, 2020 at 09:27:05PM +0100, Michal Suchánek wrote:
> > > On Mon, Nov 09, 2020 at 11:24:19AM -0800, Darrick J. Wong wrote:
> >
storing its data on a different
filesystem that isn't mounted at install time, so the installer
has no chance of detecting that the application is going to use
DAX enabled storage.
IOWs, the installer cannot make decisions based on DAX state on
behalf of applications because it does not know what environment the
application is going to be configured to run in. DAX can only be
detected reliably by the application at runtime inside its
production execution environment.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
passing work off to worker threads, duplicating
the current creds will capture this information and won't leave
random landmines where stuff doesn't work as it should because the
worker thread is unaware of the userns that it is supposed to be
doing filesystem operations under...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
quire() so that people who have no clue what the
hell smp_acquire__after_ctrl_dep() means or does have some hope of
understanding what objects the ordering semantics in the function
actually apply to
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
're going completely in the wrong direction. The problem
that needs solving is integrating shrinker scanning control state
with memcgs more tightly, not force every memcg aware shrinker to
use list_lru for their subsystem shrinker implementations
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Tue, Sep 22, 2020 at 12:46:05PM -0400, Mikulas Patocka wrote:
> Thanks for reviewing NVFS.
Not a review - I've just had a cursory look and not looked any
deeper after I'd noticed various red flags...
> On Tue, 22 Sep 2020, Dave Chinner wrote:
> > IOWs, extent based tre
ch...
I can see how "almost in place" modification can be done by having
two copies side by side and updating one while the other is the
active copy and switching atomically between the two objects. That
way a traditional soft-update algorithm would work because the
exposure of the changes is via ordering the active copy switches.
That would come at a cost, though, both in metadata footprint and
CPU overhead.
So, what have I missed about the way metadata is updated in the pmem
that allows non-atomic updates to work reliably?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Sep 17, 2020 at 12:47:10AM -0700, Hugh Dickins wrote:
> On Thu, 17 Sep 2020, Dave Chinner wrote:
> > On Wed, Sep 16, 2020 at 07:04:46PM -0700, Hugh Dickins wrote:
> > > On Thu, 17 Sep 2020, Dave Chinner wrote:
> > > >
On Thu, Sep 17, 2020 at 05:12:08PM -0700, Yang Shi wrote:
> On Wed, Sep 16, 2020 at 7:37 PM Dave Chinner wrote:
> > On Wed, Sep 16, 2020 at 11:58:21AM -0700, Yang Shi wrote:
> > It clamps the worst case freeing to half the cache, and that is
> > exactly what you are seeing
e running millions of IOPS through the AIO subsystem, then the
cost of doing millions of extra atomic ops every second is going to
be noticeable...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
thread. There are quite a few custom enterprise apps around that
rely on this POSIX behaviour, especially stuff that has come from
different Unixes that actually provided Posix compliant behaviour.
IOWs, from an upstream POV, POSIX atomic write behaviour doesn't
matter very much. From an enter
On Wed, Sep 16, 2020 at 07:04:46PM -0700, Hugh Dickins wrote:
> On Thu, 17 Sep 2020, Dave Chinner wrote:
> >
> > So
> >
> > P0 p1
> >
> > hole punch starts
> > takes XFS_MMAPLOCK_EXCL
> > truncate_pagec
031234618.15403-1-da...@fromorbit.com/
Unfortunately, none of the MM developers showed any interest in
these patches, so when I found a different solution to the XFS
problem it got dropped on the ground.
> So why do we have to still keep it around?
Because we need a feedback mechanism to allow us to maintain control
of the size of filesystem caches that grow via GFP_NOFS allocations.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Sep 16, 2020 at 05:58:51PM +0200, Jan Kara wrote:
> On Sat 12-09-20 09:19:11, Amir Goldstein wrote:
> > On Tue, Jun 23, 2020 at 8:21 AM Dave Chinner wrote:
> > >
> > > From: Dave Chinner
> > >
> > > The page fault-around path ->map_pages i
ct. Or if it's a stupid idea,
> someone can point out why.
I think it's pretty straight forward to do it in the iomap layer...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
format
> - Add Fixes tag in commit message
>
> fs/inode.c | 4 +++-
> include/linux/fs.h | 3 +--
> 2 files changed, 4 insertions(+), 3 deletions(-)
Looks good.
Reviewed-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
the statement.
i.e.

	if (!drop &&
	    !(inode->i_state & I_DONTCACHE) &&
	    (sb->s_flags & SB_ACTIVE)) {

Which gives a clear indication that these are all at the same
precedence and are separate logic statements...
Otherwise the change looks good.
Probably best to resend with the fixes tag :)
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com