[4.8] btrfs heats my room with lock contention

2016-08-03 Thread Dave Chinner
, they only drop to ~1500-2000MB/s as they hit internal limits. Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org

Re: [4.8] btrfs heats my room with lock contention

2016-08-04 Thread Dave Chinner
On Thu, Aug 04, 2016 at 10:28:44AM -0400, Chris Mason wrote: > > > On 08/04/2016 02:41 AM, Dave Chinner wrote: > > > >Simple test. 8GB pmem device on a 16p machine: > > > ># mkfs.btrfs /dev/pmem1 > ># mount /dev/pmem1 /mnt/scratch > ># dbench -t 60

Re: [PATCH 13/17] xfs: test swapext with reflink

2016-08-08 Thread Dave Chinner
entire file and counts lines. > Seeing as XFS records the extent count in the inode, we might as well use it. perhaps put a special xfs case in _count_extents() that does this rather than FIEMAP? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Will Btrfs have an official command to "uncow" existing files?

2016-08-22 Thread Dave Chinner
> So... if the btrfs folks really want an unshare flag I can trivially > re-add it to the VFS headers and re-enable it in the XFS > implementation but y'all better speak up now and hammer out an > acceptable definition. I don't think XFS needs a new flag. It's not urgen

Re: [PATCH] fstests: common: Enhance _exclude_scratch_mount_option to handle multiply options and generic fs type

2016-09-06 Thread Dave Chinner
> { > for opt in $*; do > if echo $MOUNT_OPTIONS | grep -qw "$opt"; then > _notrun "mount option \"$opt\" not allowed in this > test" > fi > done > } > > (Note that the c

Re: [PATCH] fstests: common: Enhance _exclude_scratch_mount_option to handle multiply options and generic fs type

2016-09-06 Thread Dave Chinner
hy we review changes. If it's not obvious to the reviewer why the mount option is excluded, or it's not documented in the commit message, then the reviewer should be asking for it to be added. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/3] ioctl_xfs_ioc_getfsmap.2: document XFS_IOC_GETFSMAP ioctl

2016-09-08 Thread Dave Chinner
k I like this better. Everyone else, please chime in. :) That's pretty much the structure I was going to suggest - it matches the fiemap pattern. i.e. control parameters are separated from record data. I'd dump a bit more reserved space in the structure, though; we've got heaps o

Re: [PATCH 3/3] ioctl_xfs_ioc_getfsmap.2: document XFS_IOC_GETFSMAP ioctl

2016-09-09 Thread Dave Chinner
On Thu, Sep 08, 2016 at 11:07:16PM -0700, Darrick J. Wong wrote: > On Fri, Sep 09, 2016 at 09:38:06AM +1000, Dave Chinner wrote: > > On Tue, Aug 30, 2016 at 12:09:49PM -0700, Darrick J. Wong wrote: > > > > I recall for FIEMAP that some filesystems may not have files alig

Re: [PATCH 2/3] writeback: allow for dirty metadata accounting

2016-09-11 Thread Dave Chinner
how tracking of information such as the global amount of dirty metadata is useful for diagnostics, but I'm not convinced we should be using it for globally scoped external control of deeply integrated and highly specific internal filesystem functionality. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/3] writeback: allow for dirty metadata accounting

2016-09-12 Thread Dave Chinner
> >On Mon 12-09-16 10:46:56, Dave Chinner wrote: > >>On Fri, Sep 09, 2016 at 10:17:43AM +0200, Jan Kara wrote: > >>>On Mon 22-08-16 13:35:01, Josef Bacik wrote: > >>>>Provide a mechanism for file systems to indicate how much dirty metadata > >>>>

Re: [PATCH 2/3] writeback: allow for dirty metadata accounting

2016-09-12 Thread Dave Chinner
ly hide memcg from the writeback implementations similar to the way memcg is completely hidden from the shrinker reclaim implementations... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v6 1/6] fstests: common: Introduce _post_mount_hook for btrfs

2016-09-14 Thread Dave Chinner
ex 23c007a..631397f 100644 > --- a/common/rc > +++ b/common/rc > @@ -321,6 +321,27 @@ _overlay_scratch_unmount() > $UMOUNT_PROG $SCRATCH_MNT > } > > +_run_btrfs_post_mount_hook() > +{ > + mnt_point=$1 > + for n in $ALWAYS_ENABLE_BTRFS_FEATURE; do What

Re: [RFC] Preliminary BTRFS Encryption

2016-09-15 Thread Dave Chinner
panded to address specific threat models should you then implement something that is unique to btrfs. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Making statfs return qgroup info (for container use case etc.)

2016-10-06 Thread Dave Chinner
> (e.g. by passing in an option at mount time - such as qgroup level > maybe?) , instead of the global filesystem data in f_bfree f_blocks etc. XFS does this with directory tree quotas. It was implemented 10 years ago or so, IIRC... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Is is possible to submit binary image as fstest test case?

2016-10-07 Thread Dave Chinner
works or debug-only sysfs hooks are for. The XFS kernel code has both, xfstests use both, and they pretty much do away with the need for custom binary filesystem images for such testing... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Is is possible to submit binary image as fstest test case?

2016-10-07 Thread Dave Chinner
On Fri, Oct 07, 2016 at 05:26:27PM +0800, Qu Wenruo wrote: > > > At 10/07/2016 05:18 PM, Dave Chinner wrote: > >On Thu, Oct 06, 2016 at 04:12:56PM +0800, Qu Wenruo wrote: > >>Hi, > >> > >>Just as the title says, for some case(OK, btrfs again) we need to

Re: [PATCH] generic: make 17[1-4] work well when btrfs compression is enabled

2016-10-07 Thread Dave Chinner
1 > +dd if=/dev/zero of=$testdir/eat_my_space >> $seqres.full 2>&1 Please don't replace xfs_io writes using a specific data pattern with dd calls that write zeros. Indeed, we don't use dd for new tests anymore - xfs_io should be used. Write a function that fills all the re

Re: Making statfs return qgroup info (for container use case etc.)

2016-10-09 Thread Dave Chinner
On Fri, Oct 07, 2016 at 06:58:47PM +0200, David Sterba wrote: > On Fri, Oct 07, 2016 at 09:40:11AM +1100, Dave Chinner wrote: > > XFS does this with directory tree quotas. It was implemented 10 years > > ago or so, IIRC... > > Sometimes, the line between a historical remark

Re: Is is possible to submit binary image as fstest test case?

2016-10-09 Thread Dave Chinner
On Fri, Oct 07, 2016 at 06:05:51PM +0200, David Sterba wrote: > On Fri, Oct 07, 2016 at 08:18:38PM +1100, Dave Chinner wrote: > > On Thu, Oct 06, 2016 at 04:12:56PM +0800, Qu Wenruo wrote: > > > So I'm wondering if I can just upload a zipped raw image as part

Re: [PATCH] generic/175: disable inline data feature for btrfs

2016-10-10 Thread Dave Chinner
needs to be avoided, then add an option to filter them out. e.g. something like this: _scratch_options_filter btrfs compress so that it removes any compression option from the btrfs mount/mkfs that is run for that test. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2] generic: make 17[1-4] work well when btrfs compression is enabled

2016-10-10 Thread Dave Chinner
ant size, especially as the data will compress down to nearly nothing. Trying to hack around compression artifacts by inflating the size of the file just doesn't work reliably. The way to fix this is to either use one of the "fill filesystem" functions that isn't affected by co

Re: [PATCH 5/5] fs: don't set *REFERENCED unless we are on the lru list

2016-10-25 Thread Dave Chinner
he file systems. Less than 1% for XFS and ~1.5% for ext4 is well within the run-to-run variation of fsmark. It looks like it might be slightly faster, but it's not a cut-and-dried win for anything other than btrfs. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 5/5] fs: don't set *REFERENCED unless we are on the lru list

2016-10-25 Thread Dave Chinner
On Wed, Oct 26, 2016 at 09:01:13AM +1100, Dave Chinner wrote: > On Tue, Oct 25, 2016 at 02:41:44PM -0400, Josef Bacik wrote: > > With anything that populates the inode/dentry cache with a lot of one time > > use > > inodes we can really put a lot of pressure on the syste

Re: [PATCH 5/5] fs: don't set *REFERENCED unless we are on the lru list

2016-10-26 Thread Dave Chinner
On Wed, Oct 26, 2016 at 04:03:54PM -0400, Josef Bacik wrote: > On 10/25/2016 07:36 PM, Dave Chinner wrote: > >So, 2-way has not improved. If changing referenced behaviour was an > >obvious win for btrfs, we'd expect to see that here as well. > >however, because 4-way im

Re: [PATCH] fs: push file_update_time into ->page_mkwrite

2011-12-20 Thread Dave Chinner
and then change the generic fault code to only update the file times if the filesystem doesn't implement page_mkwrite... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7

2011-12-28 Thread Dave Chinner
> machine2: > 3.2rc6 http://pastebin.com/khD0wGXx > 3.2rc7 (not crashed yet) These don't have XFS in the picture, but also appear to be hung waiting on IO completion with MD stuck in make_request()->get_active_stripe(). That, to me, indicates an MD problem. Cheers, Dave. -

Re: [3.2-rc7] slowdown, warning + oops creating lots of files

2012-01-04 Thread Dave Chinner
On Thu, Jan 05, 2012 at 08:44:45AM +1100, Dave Chinner wrote: > Hi there buttery folks, > > I just hit this warning and oops running a parallel fs_mark create > workload on a test VM using a 17TB btrfs filesystem (12 disk dm > RAID0) using default mkfs and mount parameters, mo

Re: [3.2-rc7] slowdown, warning + oops creating lots of files

2012-01-04 Thread Dave Chinner
On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote: > On 05/01/12 09:11, Dave Chinner wrote: > > > Looks to be reproducible. > > Does this happen with rc6 ? I haven't tried. All I'm doing is running some benchmarks to get numbers for a talk I'm

Re: [3.2-rc7] slowdown, warning + oops creating lots of files

2012-01-04 Thread Dave Chinner
On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote: > On 01/04/2012 06:01 PM, Dave Chinner wrote: > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote: > >> On 05/01/12 09:11, Dave Chinner wrote: > >> > >>> Looks to be reproducible. >

Re: [3.2-rc7] slowdown, warning + oops creating lots of files

2012-01-05 Thread Dave Chinner
On Thu, Jan 05, 2012 at 02:11:31PM -0500, Liu Bo wrote: > On 01/04/2012 09:26 PM, Dave Chinner wrote: > > On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote: > >> On 01/04/2012 06:01 PM, Dave Chinner wrote: > >>> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Sa

Re: [3.2-rc7] slowdown, warning + oops creating lots of files

2012-01-05 Thread Dave Chinner
On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote: > On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote: > > On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote: > > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote: > > > > O

Re: [RFC PATCH v2 0/3] Btrfs: apply the Probabilistic Skiplist on btrfs

2012-01-12 Thread Dave Chinner
b/btree.c code and look to making that RCU safe. IIRC, the implementation was based on a RCU-btree prototype so maybe you might want to read up on that first: http://programming.kicks-ass.net/kernel-patches/vma_lookup/btree.patch FWIW, I'm mentioning this out of self interest - I need a

Re: [PATCH] Btrfs: return EUCLEAN rather than ENXIO once internal error has occurred for SEEK_DATA/SEEK_HOLE inquiry

2012-02-08 Thread Dave Chinner
er more accurate error info, which is better? Return the internal error unchanged - a failure to read the extent list (EIO) is different to a corruption detected in the extent map read from disk (EUCLEAN). Having a user report the appropriate error makes our life much simpler when it comes to

Re: [RFC PATCH 2/2] Btrfs: fix deadlock on umount by umount_prepare interface

2012-03-21 Thread Dave Chinner
down_read_trylock(), not down_read(). Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/2] Btrfs: fix deadlock on umount by umount_prepare interface

2012-03-21 Thread Dave Chinner
On Thu, Mar 22, 2012 at 01:25:26PM +0800, Miao Xie wrote: > On Thu, 22 Mar 2012 15:39:36 +1100, Dave Chinner wrote: > > On Thu, Mar 22, 2012 at 11:13:17AM +0800, Miao Xie wrote: > >> The reason the deadlock is that: > >> Task

Re: [PATCH 00/19 v5] Fix filesystem freezing deadlocks

2012-04-16 Thread Dave Chinner
correctly. i.e. you can still freeze a filesystem with inodes in this state successfully and have everything behave as you'd expect. I'm not sure how other filesystems handle this problem, but perhaps pushing this check down into filesystem specific code or adding a superb

Re: [PATCH 1/4] vfs: introduce try_to_writeback_inodes_sb(_nr)

2012-04-25 Thread Dave Chinner
in writeback_inodes_[nr]_sb_if_idle() with a trylock and use that. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/8] mm: push vm_fault into the page fault handlers

2018-09-25 Thread Dave Chinner
struct siginfo si; > @@ -493,7 +494,8 @@ static int __kprobes do_page_fault(unsigned long addr, > unsigned int esr, > #endif > } > > - fault = __do_page_fault(mm, addr, mm_flags, vm_flags, tsk); > + vm_fault_init(&vmf, NULL, addr, mm_flags); > + fault = __do_page_fault(mm, vmf, vm_flags, tsk); I'm betting this doesn't compile, either. /me stops looking. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 8/8] btrfs: drop mmap_sem in mkwrite for btrfs

2018-09-25 Thread Dave Chinner
rea_struct > *vma, > + int flags) > +{ > + return NULL; > +} This doesn't compile either. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 00/15] fs: fixes for serious clone/dedupe problems

2018-10-04 Thread Dave Chinner
if (pos_out + len < i_size_read(inode_out)) { + ret = -EINVAL; + goto out_unlock; + } + } It might be better to put these in with the eof-zeroing patch then add all the other changes on top? Let me post them separately, as they may be candidates for 4.19-rc7 along with the eof zeroing. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/15] xfs: refactor clonerange preparation into a separate helper

2018-10-04 Thread Dave Chinner
so moves the invalidation of the destination range to the prep function so that it is done before the range is remapped. This ensures that nobody can access the data in range being remapped until the remap is complete. -- Sound OK? Otherwise this looks fine. Reviewed-by: Dave Chinner -Dave. >

Re: [PATCH 03/15] xfs: zero posteof blocks when cloning above eof

2018-10-04 Thread Dave Chinner
id: https://bugzilla.kernel.org/show_bug.cgi?id=201259 > Signed-off-by: Darrick J. Wong > --- > fs/xfs/xfs_reflink.c | 33 + > 1 file changed, 25 insertions(+), 8 deletions(-) Looks good. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 04/15] xfs: update ctime and remove suid before cloning files

2018-10-04 Thread Dave Chinner
-- > fs/xfs/xfs_reflink.c | 25 + > 1 file changed, 25 insertions(+) Looks good. Reviewed-by: Dave Chinner Because this fixes a security related problem, I'm going to push this with the data corruption fixes. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/15] xfs: refactor clonerange preparation into a separate helper

2018-10-05 Thread Dave Chinner
g to do and < 0 for an error and catch it in this code. I note that later patches in the series change the vfs_clone_file_prep_inodes() behaviour so this behaviour is probably masked by those later changes. It's still a nasty bisect landmine, though, so I'll fix it here. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 02/15] xfs: refactor clonerange preparation into a separate helper

2018-10-05 Thread Dave Chinner
On Fri, Oct 05, 2018 at 05:02:28PM +1000, Dave Chinner wrote: > On Thu, Oct 04, 2018 at 05:44:47PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong > > > > Refactor all the reflink preparation steps into a separate helper that > > we'll use to land all the

Re: [PATCH 02/15] xfs: refactor clonerange preparation into a separate helper

2018-10-05 Thread Dave Chinner
On Fri, Oct 05, 2018 at 10:21:59AM -0700, Darrick J. Wong wrote: > On Fri, Oct 05, 2018 at 07:02:42PM +1000, Dave Chinner wrote: > > On Fri, Oct 05, 2018 at 05:02:28PM +1000, Dave Chinner wrote: > > > On Thu, Oct 04, 2018 at 05:44:47PM -0700, Darrick J. Wong wrote: > > >

Re: [PATCH 01/25] xfs: add a per-xfs trace_printk macro

2018-10-09 Thread Dave Chinner
calling this trace point is not committed? If we decide to add this, it needs to be a CONFIG_XFS_DEBUG=y only definition because trace_printk() is only for temporary debugging code and has substantial performance overheads even when these trace points are not being traced. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 00/25] fs: fixes for serious clone/dedupe problems

2018-10-09 Thread Dave Chinner
cently sent to fstests > exercises the fixes in this series. Tests are in [2]. Can you rebase this on the for-next branch on the xfs tree which already contains some of the initial fixes in the series and a couple of other reflink/dedupe data corruption fixes? I'm planning on pushi

Re: [PATCH 05/25] vfs: check file ranges before cloning files

2018-10-10 Thread Dave Chinner
ing all the way to the end? */ > isize = i_size_read(inode_in); > - if (isize == 0) > - return 0; This looks like a change of behaviour. Instead of skipping zero length source files and returning success, this will now return -EINVAL as other checks fail? That needs to be documented in the commit message if it's intentional and a valid change to make... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 05/25] vfs: avoid problematic remapping requests into partial EOF block

2018-10-11 Thread Dave Chinner
e entire request. A subsequent > patch will enable us to shorten dedupe requests correctly. Ok, so this patch rejects whole file dedupe requests, and then a later patch adds support back in for it? Doesn't that leave a bisect landmine behind? Why separate the functionality like this? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 23/25] xfs: fix pagecache truncation prior to reflink

2018-10-11 Thread Dave Chinner
> + truncate_inode_pages_range(&inode_out->i_data, > + round_down(pos_out, PAGE_SIZE), > + round_up(pos_out + *len, PAGE_SIZE) - 1); Looks good. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com
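
A minimal userspace sketch of the range arithmetic in the quoted hunk, assuming a 4096-byte page. The `sketch_*` names and `SKETCH_PAGE_SIZE` are illustrative stand-ins, not kernel symbols; the point is only that the truncated range is widened to whole pages and that the end offset is inclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel's PAGE_SIZE is per-architecture. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round x down or up to a multiple of a, where a is a power of two. */
static uint64_t sketch_round_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static uint64_t sketch_round_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Page-aligned byte range; 'end' is the last byte covered (inclusive). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Mirror the arguments of the quoted call: start at the page containing
 * pos_out, end at the last byte of the page containing pos_out + len - 1.
 */
static struct sketch_range sketch_truncate_range(uint64_t pos_out, uint64_t len)
{
	struct sketch_range r;

	assert(len > 0);
	r.start = sketch_round_down(pos_out, SKETCH_PAGE_SIZE);
	r.end = sketch_round_up(pos_out + len, SKETCH_PAGE_SIZE) - 1;
	return r;
}
```

With these assumptions, a 100-byte request at offset 5000 covers the single page spanning bytes 4096 to 8191.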

Re: [PATCH 24/25] xfs: support returning partial reflink results

2018-10-11 Thread Dave Chinner
ped, pos_out + len); > + remapped = min_t(int64_t, len, XFS_FSB_TO_B(mp, remapped)); So remapped is returned as a block count, then immediately converted to a byte count? Can we return it as byte count so that we don't have this weird unit conversion? Cheers, Dave. -- Dave Chinner da...@fromorbit.com
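
One way to read the suggestion above, as a hedged sketch rather than the actual XFS change: do the block-to-byte conversion inside the helper so callers only ever see bytes. `sketch_fsb_to_b()` is a hypothetical stand-in for `XFS_FSB_TO_B()`, and `blocklog` (log2 of the block size) is an assumed parameter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for XFS_FSB_TO_B(): convert a filesystem block
 * count to bytes for a block size of (1 << blocklog) bytes.
 */
static int64_t sketch_fsb_to_b(unsigned int blocklog, int64_t blocks)
{
	return blocks * ((int64_t)1 << blocklog);
}

/*
 * Return the remapped length directly in bytes, clamped to the requested
 * length, so the caller never has to handle a block count.
 */
static int64_t sketch_remapped_bytes(unsigned int blocklog,
				     int64_t remapped_blocks, int64_t len)
{
	int64_t bytes;

	assert(remapped_blocks >= 0 && len >= 0);
	bytes = sketch_fsb_to_b(blocklog, remapped_blocks);
	return bytes < len ? bytes : len;
}
```

The clamp matters because the last remapped block may extend past the end of the byte range that was asked for.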

Re: [PATCH 25/25] xfs: remove redundant remap partial EOF block checks

2018-10-11 Thread Dave Chinner
On Wed, Oct 10, 2018 at 09:15:26PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong > > Now that we've moved the partial EOF block checks to the VFS helpers, we > can remove the redundant functionality from XFS. > > Signed-off-by: Darrick J. Wong looks fine. Re

Re: [PATCH 24/25] xfs: support returning partial reflink results

2018-10-14 Thread Dave Chinner
flink_remap_range at this point? Yeah, that seems like a good idea to me - pulling all the vfs/generic code interactions back up into xfs_file.c would match how the rest of the file operations are layered w.r.t. external and internal XFS code... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 05/25] vfs: avoid problematic remapping requests into partial EOF block

2018-10-14 Thread Dave Chinner
committed, are you going to update it and repost as it clearly had value? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/7] mm: drop mmap_sem for page cache read IO submission

2018-10-18 Thread Dave Chinner
retry; > + } > + } else > + __lock_page(page); > } > > /* Did it get truncated? */ > @@ -2607,6 +2655,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) > /* Things didn't work out. Return zero to tell the mm layer so. */ > shrink_readahead_size_eio(file, ra); > return VM_FAULT_SIGBUS; > + > +out_retry_wait: > + if (page) { > + if (flags & FAULT_FLAG_KILLABLE) and here. Any reason for this discrepancy? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 5/7] mm: add a flag to indicate we used a cached page

2018-10-18 Thread Dave Chinner
alised by a prior fault attempt, not that "we used a cached page". "cached page" is a horribly overloaded term - I suspect we should not overload it more, especially as the flag gets cleared if the cached page is not up to date (i.e. the data on it hasn't been fully initialised). FAULT_FLAG_PAGE_INITIALISED? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 6/7] mm: allow ->page_mkwrite to do retries

2018-10-18 Thread Dave Chinner
) Mess. #define __FAIL_FLAGS (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY) if (ret & __FAIL_FLAGS) Should kill the unlikely() at the same time. -Dave. -- Dave Chinner da...@fromorbit.com
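
The mask suggested above can be sketched in standalone C as follows. The `SK_*` bit values are made up for illustration and do not match the kernel's `VM_FAULT_*` constants (in the kernel, `VM_FAULT_ERROR` is itself a mask of several bits); the sketch only shows the shape of the cleanup: name the combined mask once, then test against it.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's VM_FAULT_* definitions. */
#define SK_VM_FAULT_ERROR	0x1u
#define SK_VM_FAULT_NOPAGE	0x2u
#define SK_VM_FAULT_RETRY	0x4u
#define SK_VM_FAULT_LOCKED	0x8u	/* a non-failure result bit */

/* One named mask instead of repeating the OR at every call site. */
#define SK_FAIL_FLAGS \
	(SK_VM_FAULT_ERROR | SK_VM_FAULT_NOPAGE | SK_VM_FAULT_RETRY)

/* Guard against a non-failure bit being folded into the mask by accident. */
static_assert((SK_FAIL_FLAGS & SK_VM_FAULT_LOCKED) == 0,
	      "non-failure bit must stay out of the failure mask");

/* True when the fault handler result carries any failure bit. */
static bool sketch_fault_failed(unsigned int ret)
{
	return (ret & SK_FAIL_FLAGS) != 0;
}
```

Note the space between the macro name and the opening parenthesis in the `#define`: without it the preprocessor would try to parse a function-like macro.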

Re: [PATCH 3/7] mm: drop the mmap_sem in all read fault cases

2018-10-18 Thread Dave Chinner
unlock_mmap_for_io(vmf->vma, vmf->flags); > + > /* >* Umm, take care of errors if the page isn't up-to-date. >* Try to re-read it _once_. We do this synchronously, Same here. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 4/7] mm: use the cached page for filemap_fault

2018-10-18 Thread Dave Chinner
} > + unlock_page(cached_page); > + put_page(cached_page); > + } > + Can you factor this out so the main code path doesn't get any more complex than it already is? i.e. something like: error = vmf_has_cached_page(vmf, &page); if (error) goto out_retry; if (page) goto have_cached_page; -dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 7/7] btrfs: drop mmap_sem in mkwrite for btrfs

2018-10-18 Thread Dave Chinner
eneric_file_read_iter(struct kiocb *iocb, struct > iov_iter *iter) > EXPORT_SYMBOL(generic_file_read_iter); > > #ifdef CONFIG_MMU > -static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int > flags) > +struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int flags) > { > if ((flags & (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) == > FAULT_FLAG_ALLOW_RETRY) { > struct file *file; > @@ -2377,6 +2377,7 @@ static struct file *maybe_unlock_mmap_for_io(struct > vm_area_struct *vma, int fla > } > return NULL; > } > +EXPORT_SYMBOL_GPL(maybe_unlock_mmap_for_io); These API mods (if this functionality remains in the filesystem code) belong in whatever patch introduced this function. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 24/28] xfs: clean up xfs_reflink_remap_blocks call site

2018-10-21 Thread Dave Chinner
On Sun, Oct 21, 2018 at 09:17:50AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong > > Move the offset <-> blocks unit conversions into > xfs_reflink_remap_blocks to make the call site less ugly. > > Signed-off-by: Darrick J. Wong Looks fine. Reviewed-by: Dave C

Re: [PATCH 25/28] xfs: support returning partial reflink results

2018-10-21 Thread Dave Chinner
egular write. > > Signed-off-by: Darrick J. Wong Looks ok to me. remap_file_range() still returns the full length, so there's no change of behaviour there. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 27/28] xfs: remove xfs_reflink_remap_range

2018-10-21 Thread Dave Chinner
> > Signed-off-by: Darrick J. Wong Sensible enough. Reviewed-by: Dave Chinner -- Dave Chinner da...@fromorbit.com

Re: [PATCH 28/28] xfs: remove [cm]time update from reflink calls

2018-10-21 Thread Dave Chinner
On Sun, Oct 21, 2018 at 09:18:17AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong > > Now that the vfs remap helper dirties the inode [cm]time for us, xfs no > longer needs to do that on its own. > > Signed-off-by: Darrick J. Wong looks good. Reviewed-by: Dave

Re: [PATCH v6 00/28] fs: fixes for serious clone/dedupe problems

2018-10-21 Thread Dave Chinner
we want to merge this? I can take it through the XFS tree given that there is a bit of XFS changes that needs to be co-ordinated with it, or should it go through some other tree? The other question I have is who reviews ocfs2 changes these days? Do they get reviewed, or just shepherded in via akpm's tree? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v6 00/28] fs: fixes for serious clone/dedupe problems

2018-10-21 Thread Dave Chinner
On Mon, Oct 22, 2018 at 01:21:12PM +1100, Dave Chinner wrote: > On Sun, Oct 21, 2018 at 09:15:03AM -0700, Darrick J. Wong wrote: > > Hi all, > > > > Dave, Eric, and I have been chasing a stale data exposure bug in the XFS > > reflink implementation, and tracked it down

Re: [PATCH v6 00/28] fs: fixes for serious clone/dedupe problems

2018-10-21 Thread Dave Chinner
On Mon, Oct 22, 2018 at 05:52:49AM +0100, Al Viro wrote: > On Mon, Oct 22, 2018 at 03:37:41PM +1100, Dave Chinner wrote: > > > Ok, this is a bit of a mess. the patches do not merge cleanly to a > > 4.19-rc1 base kernel because of all the changes to > > include/linux/fs.

Re: [PATCH v6 00/28] fs: fixes for serious clone/dedupe problems

2018-10-21 Thread Dave Chinner
On Mon, Oct 22, 2018 at 08:42:29AM +0300, Amir Goldstein wrote: > On Mon, Oct 22, 2018 at 8:09 AM Dave Chinner wrote: > > > > On Mon, Oct 22, 2018 at 05:52:49AM +0100, Al Viro wrote: > > > On Mon, Oct 22, 2018 at 03:37:41PM +1100, Dave Chinner wrote: > > > > >

Re: [PATCH 7/7] btrfs: drop mmap_sem in mkwrite for btrfs

2018-10-22 Thread Dave Chinner
On Mon, Oct 22, 2018 at 01:56:54PM -0400, Josef Bacik wrote: > On Fri, Oct 19, 2018 at 02:48:47PM +1100, Dave Chinner wrote: > > On Thu, Oct 18, 2018 at 04:23:18PM -0400, Josef Bacik wrote: > > > ->page_mkwrite is extremely expensive in btrfs. We have to reserve > >

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-14 Thread Dave Chinner
e create time doesn't really help, because once you've broken into a system, this makes it really easy to cover tracks (e.g. we can't find files that were created and unlinked during the break in window anymore) and lay false trails. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-14 Thread Dave Chinner
On Thu, Feb 14, 2019 at 03:14:29PM -0800, Omar Sandoval wrote: > On Fri, Feb 15, 2019 at 09:06:26AM +1100, Dave Chinner wrote: > > On Thu, Feb 14, 2019 at 02:00:07AM -0800, Omar Sandoval wrote: > > > From: Omar Sandoval > > > > > > Hi, > > > >

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-15 Thread Dave Chinner
all the metadata goes to the software raided pmem block devices that aren't DAX capable. Problem already solved, yes? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-15 Thread Dave Chinner
On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > > (This is a joint proposal with Hannes Reinecke) > > > > Servers with NV-DIMM are slowly emerging in data centers but one key feature > >

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-16 Thread Dave Chinner
On Sat, Feb 16, 2019 at 09:05:31AM -0800, Dan Williams wrote: > On Fri, Feb 15, 2019 at 9:40 PM Dave Chinner wrote: > > > > On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > > > On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > >

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-17 Thread Dave Chinner
st of the various discard > commands - how painful is it for modern SSD's? AIUI, it still depends on the SSD implementation, unfortunately. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-17 Thread Dave Chinner
On Sun, Feb 17, 2019 at 06:42:59PM -0500, Ric Wheeler wrote: > On 2/17/19 4:09 PM, Dave Chinner wrote: > >On Sun, Feb 17, 2019 at 03:36:10PM -0500, Ric Wheeler wrote: > >>One proposal for btrfs was that we should look at getting discard > >>out of the synchronous pa

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-04 Thread Dave Chinner
louts is completely the wrong approach to be taking. We need to do these things in a generic manner so that all filesystems (and block devices!) that use the iomap infrastructure can take advantage of them, not just one of them. Quite frankly, I don't care if it takes more time and work up

Re: [PATCH 02/13] iomap: Read page from srcmap for IOMAP_COW

2019-08-04 Thread Dave Chinner
Darrick on CONFIG_IOMAP_DEBUG here - we need to start locking down invalid behaviour and invalid combinations with asserts that tell developers they've broken something. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 05/13] btrfs: Add CoW in iomap based writes

2019-08-04 Thread Dave Chinner
iomap->type = IOMAP_DELALLOC; > + } > + > iomap->addr = IOMAP_NULL_ADDR; > iomap->type = IOMAP_DELALLOC; The iomap->type is overwritten here and so IOMAP_COW will never be seen by the iomap infrastructure... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 04/13] btrfs: Add a simple buffered iomap write

2019-08-04 Thread Dave Chinner
en; > + if (iocb->ki_pos > i_size_read(inode)) > + i_size_write(inode, iocb->ki_pos); > + return written; Looks like it fails to handle O_[D]SYNC writes. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 01/13] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O

2019-08-04 Thread Dave Chinner
((name)[IOMAP_SOURCE_MAP]) And now we only have to pass a single iomap parameter to each function as "struct iomap **iomap". This makes the code somewhat simpler, and we only ever need to use IOMAP_S(iomap) when IOMAP_B(iomap)->type == IOMAP_COW. The other advantage of this is that if we ever need new functionality that requires 2 (or more) iomaps, we don't have to change APIs again. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-05 Thread Dave Chinner
On Mon, Aug 05, 2019 at 04:08:43PM +, Goldwyn Rodrigues wrote: > On Mon, 2019-08-05 at 09:43 +1000, Dave Chinner wrote: > > On Fri, Aug 02, 2019 at 05:00:45PM -0500, Goldwyn Rodrigues wrote: > > > From: Goldwyn Rodrigues > > > > > > This helps filesyste

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-08 Thread Dave Chinner
now if there's hardware encryption below or software encryption on top becomes problematic... So really, from a filesystem and iomap perspective, what Eric says is right - it's the only order that makes sense... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-04 Thread Dave Chinner
hat skips the compression/decompression code and sets a few extra flags in the iocb that is passed down to the direct IO code. We don't need a whole new IO path just to skip a data transformation step in the direct IO path Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-06 Thread Dave Chinner
On Thu, Sep 05, 2019 at 02:16:37PM +0200, Johannes Thumshirn wrote: > On 05/09/2019 04:10, Dave Chinner wrote: > > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote: > >> From: Omar Sandoval > >> > >> This adds an API for writing compressed data

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-06 Thread Dave Chinner
On Fri, Sep 06, 2019 at 11:19:49AM -0700, Omar Sandoval wrote: > On Thu, Sep 05, 2019 at 12:10:12PM +1000, Dave Chinner wrote: > > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote: > > > From: Omar Sandoval > > > > > > This adds an API for wri

Re: [PATCH 15/15] xfs: Use the new iomap infrastructure for CoW

2019-09-06 Thread Dave Chinner
; > This now at least survives xfstests -g quick on a 4k xfs file system > for. Here is my current tree: > > http://git.infradead.org/users/hch/xfs.git/shortlog/refs/heads/xfs-cow-iomap That looks somewhat reasonable. The XFS mapping function is turning into spaghetti and getting really hard to follow again, though. Perhaps we should consider splitting the shared/COW path out of it... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data

2019-09-25 Thread Dave Chinner
checked, so why should the format that the data is supplied to the kernel in suddenly require new privilege checks? i.e. writing encoded data to a file requires exactly the same access permissions as writing cleartext data to the file. The only extra information here is whether the _filesystem_ supports encoded data, and that doesn't change regardless of what the open file gets passed to. Hence the capability is either there or it isn't, it doesn't transform no matter what privilege boundary the file is passed across. Similarly, we have permission to access the data or we don't through the struct file, it doesn't transform either. Hence I don't see why CAP_SYS_ADMIN or any special permissions are needed for an application with access permissions to file data to use these RWF_ENCODED IO interfaces. I am inclined to think the permission check here is wrong and should be dropped, and then all these issues go away. Yes, the app that is going to use this needs root perms because it accesses all data in the fs (it's a backup app!), but that doesn't mean you can only use RWF_ENCODED if you have root perms. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data

2019-09-25 Thread Dave Chinner
On Wed, Sep 25, 2019 at 08:07:12AM -0400, Colin Walters wrote: > > > On Wed, Sep 25, 2019, at 3:11 AM, Dave Chinner wrote: > > > > We're talking about user data read/write access here, not some > > special security capability. Access to the data has already bee

Re: [PATCH v2 3/3] fstests: generic: Check the fs after each FUA writes

2018-03-28 Thread Dave Chinner
as *completed*. If we've only replayed up to the FUA write with 1:63 in it, then no metadata writes should have been *issued* with 1:396 in it as the LSN that is stamped into metadata is only updated on log IO completion. On first glance, this implies a bug in the underlying de

Re: [PATCH] fstests: generic test for fsync after fallocate

2018-04-09 Thread Dave Chinner
$SCRATCH_MNT/baz You also cannot assume that two separate preallocations beyond EOF are going to be contiguous (i.e. it could be two separate extents). What you should just be checking is that there are extents allocated covering EOF to 3MB, not the exact size, shape and type of the extents that are allocated. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
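
The coverage check suggested above can be sketched as a small C function. This is only the logic, not fstests code: the `struct extent` layout and the function name `range_covered()` are hypothetical, and the extents are assumed to be sorted by start offset (as an extent listing normally is).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent record: byte offset and length within the file. */
struct extent {
    unsigned long long start;
    unsigned long long len;
};

/*
 * Return true if the extents (sorted by start offset) cover every byte of
 * [start, end), however many extents that takes and whatever their sizes.
 */
bool range_covered(const struct extent *ext, size_t n,
                   unsigned long long start, unsigned long long end)
{
    assert(start <= end);

    unsigned long long pos = start;   /* first byte not yet known covered */

    for (size_t i = 0; i < n && pos < end; i++) {
        unsigned long long ext_end = ext[i].start + ext[i].len;

        if (ext[i].start > pos)
            return false;             /* hole before this extent */
        if (ext_end > pos)
            pos = ext_end;
    }
    return pos >= end;
}
```

One extent spanning the whole range and two back-to-back extents both pass, which is the point of the review comment: the test should accept either layout and fail only when part of the range has no extent at all.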

Re: [PATCH] fstests: generic test for fsync after fallocate

2018-04-09 Thread Dave Chinner
On Mon, Apr 09, 2018 at 11:00:52AM +0100, Filipe Manana wrote: > On Mon, Apr 9, 2018 at 10:51 AM, Dave Chinner wrote: > > On Sun, Apr 08, 2018 at 10:07:54AM +0800, Eryu Guan wrote: > >> On Thu, Apr 05, 2018 at 10:56:14PM +0100, fdman...@kernel.org wrote: > > You als

Re: [PATCH 2/2] generic/427: used mixed mode for Btrfs

2018-04-11 Thread Dave Chinner
> _scratch_mkfs_sized $((256 * 1024 * 1024)) >>$seqres.full 2>&1 But this uses a filesystem larger than the mixed mode threshold in _scratch_mkfs_sized(). Please update the generic threshold rather than special case this test. Cheers,

Re: [PATCH v2 2/2] common/rc: raise mixed mode threshold to 1GB

2018-04-11 Thread Dave Chinner
s_sized() > ;; > btrfs) > local mixed_opt= > - (( fssize <= 100 * 1024 * 1024 )) && mixed_opt='--mixed' > + (( fssize <= 1024 * 1024 * 1024 )) && mixed_opt='--mixed' > $MKFS_BTRFS_PROG $MKFS_OPTIONS $mixed

Re: Symlink not persisted even after fsync

2018-04-13 Thread Dave Chinner
metadata recovery semantics, so it should behave the same way as ext4 and XFS in tests like these. If it doesn't, then there are filesystem bugs that need fixing... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Symlink not persisted even after fsync

2018-04-13 Thread Dave Chinner
as an ordering dependency with the symlink inode, not whatever is found by resolving the path in the symlink data. IOWs, there is no ordering relationship between the symlink's parent directory and whatever the symlink points at. i.e. it's a one-way relationship, and so there is no revers

Re: Symlink not persisted even after fsync

2018-04-14 Thread Dave Chinner
on a symlink" may, in fact, run a fsync method of a completely different filesystem or subsystem. There is no way this could possibly trigger a directory fsync of the symlink parent, because the object being fsync()d may not even know what a filesystem is... If you want a symlink to have ordering behaviour like a dirent pointing to a regular file, then use hard links. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Symlink not persisted even after fsync

2018-04-16 Thread Dave Chinner
but it also means fsync() hasn't actually guaranteed inode changes made prior to the fsync to be persistent on disk. i.e. that's a violation of ordered metadata semantics and probably a bug. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 72/73] xfs: Convert mru cache to XArray

2017-12-05 Thread Dave Chinner
nd operations in the MRU structure, not just the radix tree operations. Turning that around so that a larger XFS structure and algorithm is now protected by an opaque internal lock from a generic storage structure that forms part of the larger structure seems like a bad design pattern to me... Cheers,
