, they only drop to ~1500-2000MB/s as they hit internal
limits.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 04, 2016 at 10:28:44AM -0400, Chris Mason wrote:
>
>
> On 08/04/2016 02:41 AM, Dave Chinner wrote:
> >
> >Simple test. 8GB pmem device on a 16p machine:
> >
> ># mkfs.btrfs /dev/pmem1
> ># mount /dev/pmem1 /mnt/scratch
> ># dbench -t 60
entire file and counts lines.
> Seeing as XFS records the extent count in the inode, we might as well use it.
Perhaps put a special XFS case in _count_extents() that does this
rather than FIEMAP?
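
A rough sketch of what such a special case could parse, assuming the
extent count is taken from the "fsxattr.nextents" line that
"xfs_io -c stat" prints. The helper name _xfs_nextents_from_stat is made
up for illustration; it is not an existing fstests function, and how it
would be wired into _count_extents() is left open.

```shell
# Hypothetical helper: read "xfs_io -c stat" style output on stdin and
# print the data fork extent count recorded in the inode.
_xfs_nextents_from_stat()
{
	awk -F' = ' '$1 == "fsxattr.nextents" { print $2 }'
}

# Intended use inside a test (not executed here):
#   $XFS_IO_PROG -c stat "$file" | _xfs_nextents_from_stat
```

This avoids walking every extent record just to count them, which is
the point of using the inode's own counter.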
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> So... if the btrfs folks really want an unshare flag I can trivially
> re-add it to the VFS headers and re-enable it in the XFS
> implementation but y'all better speak up now and hammer out an
> acceptable definition. I don't think XFS needs a new flag.
It's not urgen
> {
> for opt in $*; do
> if echo $MOUNT_OPTIONS | grep -qw "$opt"; then
> _notrun "mount option \"$opt\" not allowed in this
> test"
> fi
> done
> }
>
> (Note that the c
hy we review changes. If it's not obvious to the reviewer
why the mount option is excluded, or it's not documented in the
commit message, then the reviewer should be asking for it to be
added.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
k I like this better. Everyone else, please chime in. :)
That's pretty much the structure I was going to suggest - it matches
the fiemap pattern. i.e control parameters are separated from record
data. I'd dump a bit more reserved space in the structure, though;
we've got heaps o
On Thu, Sep 08, 2016 at 11:07:16PM -0700, Darrick J. Wong wrote:
> On Fri, Sep 09, 2016 at 09:38:06AM +1000, Dave Chinner wrote:
> > On Tue, Aug 30, 2016 at 12:09:49PM -0700, Darrick J. Wong wrote:
> > > > I recall for FIEMAP that some filesystems may not have files alig
how tracking of information such as the global amount of
dirty metadata is useful for diagnostics, but I'm not convinced we
should be using it for globally scoped external control of deeply
integrated and highly specific internal filesystem functionality.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> >On Mon 12-09-16 10:46:56, Dave Chinner wrote:
> >>On Fri, Sep 09, 2016 at 10:17:43AM +0200, Jan Kara wrote:
> >>>On Mon 22-08-16 13:35:01, Josef Bacik wrote:
> >>>>Provide a mechanism for file systems to indicate how much dirty metadata
> >>>&g
ly hide memcg from the
writeback implementations similar to the way memcg is completely
hidden from the shrinker reclaim implementations...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
ex 23c007a..631397f 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -321,6 +321,27 @@ _overlay_scratch_unmount()
> $UMOUNT_PROG $SCRATCH_MNT
> }
>
> +_run_btrfs_post_mount_hook()
> +{
> + mnt_point=$1
> + for n in $ALWAYS_ENABLE_BTRFS_FEATURE; do
What
panded to address specific threat
models should you then implement something that is unique to
btrfs
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> (e.g. by passing in an option at mount time - such as qgroup level
> maybe?) , instead of the global filesystem data in f_bfree f_blocks etc.
XFS does this with directory tree quotas. It was implemented 10 years
ago or so, IIRC...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
works or
debug-only sysfs hooks are for. The XFS kernel code has both,
xfstests use both, and they pretty much do away with the need for
custom binary filesystem images for such testing...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Fri, Oct 07, 2016 at 05:26:27PM +0800, Qu Wenruo wrote:
>
>
> At 10/07/2016 05:18 PM, Dave Chinner wrote:
> >On Thu, Oct 06, 2016 at 04:12:56PM +0800, Qu Wenruo wrote:
> >>Hi,
> >>
> >>Just as the title says, for some case(OK, btrfs again) we need to
1
> +dd if=/dev/zero of=$testdir/eat_my_space >> $seqres.full 2>&1
Please don't replace xfs_io writes using a specific data pattern
with dd calls that write zeros. Indeed, we don't use dd for new
tests anymore - xfs_io should be used.
Write a function that fills all the re
On Fri, Oct 07, 2016 at 06:58:47PM +0200, David Sterba wrote:
> On Fri, Oct 07, 2016 at 09:40:11AM +1100, Dave Chinner wrote:
> > XFS does this with directory tree quotas. It was implmented 10 years
> > ago or so, IIRC...
>
> Sometimes, the line between a historical remark
On Fri, Oct 07, 2016 at 06:05:51PM +0200, David Sterba wrote:
> On Fri, Oct 07, 2016 at 08:18:38PM +1100, Dave Chinner wrote:
> > On Thu, Oct 06, 2016 at 04:12:56PM +0800, Qu Wenruo wrote:
> > > So I'm wondering if I can just upload a zipped raw image as part
needs to be avoided, then add an option to filter them out. e.g.
something like this:
_scratch_options_filter btrfs compress
so that it removes any compression option from the btrfs mount/mkfs
that is run for that test.
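
A minimal sketch of the filtering step, assuming the options arrive as
a bare comma-separated list. _scratch_options_filter is only a proposed
name in this mail, and filter_mount_option below is a hypothetical
stand-in for the core of it, not existing fstests code:

```shell
# Hypothetical helper: drop one option (with or without "=value") from a
# comma-separated option list and print what is left.
filter_mount_option()
{
	local opts="$1" name="$2" out="" o
	local old_ifs="$IFS"

	IFS=,
	for o in $opts; do
		case "$o" in
		"$name"|"$name"=*)
			# unwanted option: skip it
			;;
		*)
			out="${out:+$out,}$o"
			;;
		esac
	done
	IFS="$old_ifs"
	echo "$out"
}

filter_mount_option "noatime,compress=zlib,ssd" compress
```

Note that this only matches the exact option name, so a variant such as
compress-force would have to be filtered with a second call.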
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
ant size, especially as the
data will compress down to nearly nothing.
Trying to hack around compression artifacts by inflating the size of
the file just doesn't work reliably. The way to fix this is to
either use one of the "fill filesystem" functions that isn't
affected by co
he file systems.
Less than 1% for XFS and ~1.5% for ext4 is well within the
run-to-run variation of fsmark. It looks like it might be slightly
faster, but it's not a cut-and-dried win for anything other than
btrfs.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Oct 26, 2016 at 09:01:13AM +1100, Dave Chinner wrote:
> On Tue, Oct 25, 2016 at 02:41:44PM -0400, Josef Bacik wrote:
> > With anything that populates the inode/dentry cache with a lot of one time
> > use
> > inodes we can really put a lot of pressure on the syste
On Wed, Oct 26, 2016 at 04:03:54PM -0400, Josef Bacik wrote:
> On 10/25/2016 07:36 PM, Dave Chinner wrote:
> >So, 2-way has not improved. If changing referenced behaviour was an
> >obvious win for btrfs, we'd expect to see that here as well.
> >however, because 4-way im
and then change the generic
fault code to only update the file times if the filesystem doesn't
implement page_mkwrite...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> machine2:
> 3.2rc6 http://pastebin.com/khD0wGXx
> 3.2rc7 (not crashed yet)
These don't have XFS in the picture, but also appear to be hung
waiting on IO completion with MD stuck in
make_request()->get_active_stripe(). That, to me, indicates an MD
problem.
Cheers,
Dave.
-
On Thu, Jan 05, 2012 at 08:44:45AM +1100, Dave Chinner wrote:
> Hi there buttery folks,
>
> I just hit this warning and oops running a parallel fs_mark create
> workload on a test VM using a 17TB btrfs filesystem (12 disk dm
> RAID0) using default mkfs and mount parmeters, mo
On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> On 05/01/12 09:11, Dave Chinner wrote:
>
> > Looks to be reproducable.
>
> Does this happen with rc6 ?
I haven't tried. All I'm doing is running some benchmarks to get
numbers for a talk I'm
On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
> On 01/04/2012 06:01 PM, Dave Chinner wrote:
> > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> >> On 05/01/12 09:11, Dave Chinner wrote:
> >>
> >>> Looks to be reproducable.
>
On Thu, Jan 05, 2012 at 02:11:31PM -0500, Liu Bo wrote:
> On 01/04/2012 09:26 PM, Dave Chinner wrote:
> > On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
> >> On 01/04/2012 06:01 PM, Dave Chinner wrote:
> >>> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Sa
On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote:
> On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
> > On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > > > O
b/btree.c code and look to making that RCU safe. IIRC, the
implementation was based on a RCU-btree prototype so maybe you might
want to read up on that first:
http://programming.kicks-ass.net/kernel-patches/vma_lookup/btree.patch
FWIW, I'm mentioning this out of self interest - I need a
er more accurate error info, which is better?
Return the internal error unchanged - a failure to read the extent
list (EIO) is different to a corruption detected in the extent
map read from disk (EUCLEAN). Having a user report the appropriate
error makes our life much simpler when it comes to
down_read_trylock(), not down_read().
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Mar 22, 2012 at 01:25:26PM +0800, Miao Xie wrote:
> On Thu, 22 Mar 2012 15:39:36 +1100, Dave Chinner wrote:
> > On Thu, Mar 22, 2012 at 11:13:17AM +0800, Miao Xie wrote:
> >> The reason the deadlock is that:
> >> Task
correctly. i.e. you can still freeze a filesystem with
inodes in this state successfully and have everything behave as
you'd expect.
I'm not sure how other filesystems handle this problem, but perhaps
pushing this check down into filesystem specific code or adding a
superb
in writeback_inodes_[nr]_sb_if_idle()
with a trylock and use that.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
struct siginfo si;
> @@ -493,7 +494,8 @@ static int __kprobes do_page_fault(unsigned long addr,
> unsigned int esr,
> #endif
> }
>
> - fault = __do_page_fault(mm, addr, mm_flags, vm_flags, tsk);
> + vm_fault_init(&vmf, NULL, addr, mm_flags);
> + fault = __do_page_fault(mm, vmf, vm_flags, tsk);
I'm betting this doesn't compile, either.
/me stops looking.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
rea_struct
> *vma,
> + int flags)
> +{
> + return NULL;
> +}
This doesn't compile either.
-Dave.
--
Dave Chinner
da...@fromorbit.com
if (pos_out + len < i_size_read(inode_out)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ }
It might be better to put these in with the eof-zeroing patch then
add all the other changes on top? Let me post them separately,
as they may be candidates for 4.19-rc7 along with the eof zeroing.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
so moves the invalidation of the destination range to
the prep function so that it is done before the range is remapped.
This ensures that nobody can access the data in the range being remapped
until the remap is complete.
--
Sound OK?
Otherwise this looks fine.
Reviewed-by: Dave Chinner
-Dave.
>
id: https://bugzilla.kernel.org/show_bug.cgi?id=201259
> Signed-off-by: Darrick J. Wong
> ---
> fs/xfs/xfs_reflink.c | 33 +
> 1 file changed, 25 insertions(+), 8 deletions(-)
Looks good.
Reviewed-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
--
> fs/xfs/xfs_reflink.c | 25 +
> 1 file changed, 25 insertions(+)
Looks good.
Reviewed-by: Dave Chinner
Because this fixes a security related problem, I'm going to push
this with the data corruption fixes.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
g to do and < 0 for an error and catch it in this code.
I note that later patches in the series change the
vfs_clone_file_prep_inodes() behaviour so this behaviour is probably
masked by those later changes. It's still a nasty bisect landmine,
though, so I'll fix it here.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Fri, Oct 05, 2018 at 05:02:28PM +1000, Dave Chinner wrote:
> On Thu, Oct 04, 2018 at 05:44:47PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong
> >
> > Refactor all the reflink preparation steps into a separate helper that
> > we'll use to land all the
On Fri, Oct 05, 2018 at 10:21:59AM -0700, Darrick J. Wong wrote:
> On Fri, Oct 05, 2018 at 07:02:42PM +1000, Dave Chinner wrote:
> > On Fri, Oct 05, 2018 at 05:02:28PM +1000, Dave Chinner wrote:
> > > On Thu, Oct 04, 2018 at 05:44:47PM -0700, Darrick J. Wong wrote:
> > &
calling this trace point is not committed?
If we decide to add this, it needs to be a CONFIG_XFS_DEBUG=y only
definition because trace_printk() is only for temporary debugging
code and has substantial performance overheads even when these trace
points are not being traced.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
cently sent to fstests
> exercises the fixes in this series. Tests are in [2].
Can you rebase this on the for-next branch on the xfs tree which
already contains some of the initial fixes in the series and a
couple of other reflink/dedupe data corruption fixes? I'm planning
on pushi
ing all the way to the end? */
> isize = i_size_read(inode_in);
> - if (isize == 0)
> - return 0;
This looks like a change of behaviour. Instead of skipping zero
length source files and returning success, this will now return
-EINVAL as other checks fail? That needs to be documented in the
commit message if it's intentional and a valid change to make...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
e entire request. A subsequent
> patch will enable us to shorten dedupe requests correctly.
Ok, so this patch rejects whole file dedupe requests, and then a
later patch adds support back in for it?
Doesn't that leave a bisect landmine behind? Why separate the
functionality like this?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
> + truncate_inode_pages_range(&inode_out->i_data,
> + round_down(pos_out, PAGE_SIZE),
> + round_up(pos_out + *len, PAGE_SIZE) - 1);
Looks good.
Reviewed-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
ped, pos_out + len);
> + remapped = min_t(int64_t, len, XFS_FSB_TO_B(mp, remapped));
So remapped is returned as a block count, then immediately converted
to a byte count? Can we return it as a byte count so that we don't
have this weird unit conversion?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Oct 10, 2018 at 09:15:26PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Now that we've moved the partial EOF block checks to the VFS helpers, we
> can remove the redundantn functionality from XFS.
>
> Signed-off-by: Darrick J. Wong
looks fine.
Re
flink_remap_range at this point?
Yeah, that seems like a good idea to me - pulling all the
vfs/generic code interactions back up into xfs_file.c would match
how the rest of the file operations are layered w.r.t. external and
internal XFS code...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
committed, are you going
to update it and repost as it clearly had value
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
retry;
> + }
> + } else
> + __lock_page(page);
> }
>
> /* Did it get truncated? */
> @@ -2607,6 +2655,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> /* Things didn't work out. Return zero to tell the mm layer so. */
> shrink_readahead_size_eio(file, ra);
> return VM_FAULT_SIGBUS;
> +
> +out_retry_wait:
> + if (page) {
> + if (flags & FAULT_FLAG_KILLABLE)
and here.
Any reason for this discrepancy?
-Dave.
--
Dave Chinner
da...@fromorbit.com
alised by a prior
fault attempt, not that "we used a cached page". "cached page" is a
horribly overloaded term - I suspect we should not overload it more,
especially as the flag gets cleared if the cached page is not up to
date (i.e. the data on it hasn't been fully initialised).
FAULT_FLAG_PAGE_INITIALISED?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
)
Mess.
#define __FAIL_FLAGS (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)
if (ret & __FAIL_FLAGS)
Should kill the unlikely() at the same time.
-Dave.
--
Dave Chinner
da...@fromorbit.com
unlock_mmap_for_io(vmf->vma, vmf->flags);
> +
> /*
>* Umm, take care of errors if the page isn't up-to-date.
>* Try to re-read it _once_. We do this synchronously,
Same here.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
}
> + unlock_page(cached_page);
> + put_page(cached_page);
> + }
> +
Can you factor this out so the main code path doesn't get any more
complex than it already is? i.e. something like:
	error = vmf_has_cached_page(vmf, &page);
	if (error)
		goto out_retry;
	if (page)
		goto have_cached_page;
-dave.
--
Dave Chinner
da...@fromorbit.com
eneric_file_read_iter(struct kiocb *iocb, struct
> iov_iter *iter)
> EXPORT_SYMBOL(generic_file_read_iter);
>
> #ifdef CONFIG_MMU
> -static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int
> flags)
> +struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma, int flags)
> {
> if ((flags & (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) ==
> FAULT_FLAG_ALLOW_RETRY) {
> struct file *file;
> @@ -2377,6 +2377,7 @@ static struct file *maybe_unlock_mmap_for_io(struct
> vm_area_struct *vma, int fla
> }
> return NULL;
> }
> +EXPORT_SYMBOL_GPL(maybe_unlock_mmap_for_io);
These API mods (if this functionality remains in the filesystem
code) belong in whatever patch introduced this function.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Sun, Oct 21, 2018 at 09:17:50AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Move the offset <-> blocks unit conversions into
> xfs_reflink_remap_blocks to make the call site less ugly.
>
> Signed-off-by: Darrick J. Wong
Looks fine.
Reviewed-by: Dave C
egular write.
>
> Signed-off-by: Darrick J. Wong
Looks ok to me. remap_file_range() still returns the full length,
so there's no change of behaviour there.
Reviewed-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
>
> Signed-off-by: Darrick J. Wong
Sensible enough.
Reviewed-by: Dave Chinner
--
Dave Chinner
da...@fromorbit.com
On Sun, Oct 21, 2018 at 09:18:17AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Now that the vfs remap helper dirties the inode [cm]time for us, xfs no
> longer needs to do that on its own.
>
> Signed-off-by: Darrick J. Wong
looks good.
Reviewed-by: Dave
we want to merge this? I can take it through the
XFS tree given that there is a bit of XFS changes that needs to be
co-ordinated with it, or should it go through some other tree?
The other question I have is who reviews ocfs2 changes these days?
Do they get reviewed, or just shepherded in via akpm's tree?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Oct 22, 2018 at 01:21:12PM +1100, Dave Chinner wrote:
> On Sun, Oct 21, 2018 at 09:15:03AM -0700, Darrick J. Wong wrote:
> > Hi all,
> >
> > Dave, Eric, and I have been chasing a stale data exposure bug in the XFS
> > reflink implementation, and tracked it down
On Mon, Oct 22, 2018 at 05:52:49AM +0100, Al Viro wrote:
> On Mon, Oct 22, 2018 at 03:37:41PM +1100, Dave Chinner wrote:
>
> > Ok, this is a bit of a mess. the patches do not merge cleanly to a
> > 4.19-rc1 base kernel because of all the changes to
> > include/linux/fs.
On Mon, Oct 22, 2018 at 08:42:29AM +0300, Amir Goldstein wrote:
> On Mon, Oct 22, 2018 at 8:09 AM Dave Chinner wrote:
> >
> > On Mon, Oct 22, 2018 at 05:52:49AM +0100, Al Viro wrote:
> > > On Mon, Oct 22, 2018 at 03:37:41PM +1100, Dave Chinner wrote:
> > >
> &g
On Mon, Oct 22, 2018 at 01:56:54PM -0400, Josef Bacik wrote:
> On Fri, Oct 19, 2018 at 02:48:47PM +1100, Dave Chinner wrote:
> > On Thu, Oct 18, 2018 at 04:23:18PM -0400, Josef Bacik wrote:
> > > ->page_mkwrite is extremely expensive in btrfs. We have to reserve
> >
e create time doesn't really help,
because once you've broken into a system, this makes it really easy
to cover tracks (e.g. we can't find files that were created and
unlinked during the break in window anymore) and lay false
trails
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Feb 14, 2019 at 03:14:29PM -0800, Omar Sandoval wrote:
> On Fri, Feb 15, 2019 at 09:06:26AM +1100, Dave Chinner wrote:
> > On Thu, Feb 14, 2019 at 02:00:07AM -0800, Omar Sandoval wrote:
> > > From: Omar Sandoval
> > >
> > > Hi,
> > >
>
all the metadata goes to the software
raided pmem block devices that aren't DAX capable.
Problem already solved, yes?
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote:
> On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote:
> > (This is a joint proposal with Hannes Reinecke)
> >
> > Servers with NV-DIMM are slowly emerging in data centers but one key feature
> &g
On Sat, Feb 16, 2019 at 09:05:31AM -0800, Dan Williams wrote:
> On Fri, Feb 15, 2019 at 9:40 PM Dave Chinner wrote:
> >
> > On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote:
> > > On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote:
> >
st of the various discard
> commands - how painful is it for modern SSD's?
AIUI, it still depends on the SSD implementation, unfortunately.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Sun, Feb 17, 2019 at 06:42:59PM -0500, Ric Wheeler wrote:
> On 2/17/19 4:09 PM, Dave Chinner wrote:
> >On Sun, Feb 17, 2019 at 03:36:10PM -0500, Ric Wheeler wrote:
> >>One proposal for btrfs was that we should look at getting discard
> >>out of the synchronous pa
louts is
completely the wrong approach to be taking. We need to do these
things in a generic manner so that all filesystems (and block
devices!) that use the iomap infrastructure can take advantage of
them, not just one of them.
Quite frankly, I don't care if it takes more time and work up
Darrick on CONFIG_IOMAP_DEBUG here - we need to start
locking down invalid behaviour and invalid combinations with asserts
that tell developers they've broken something.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
iomap->type = IOMAP_DELALLOC;
> + }
> +
> iomap->addr = IOMAP_NULL_ADDR;
> iomap->type = IOMAP_DELALLOC;
The iomap->type is overwritten here and so IOMAP_COW will never be
seen by the iomap infrastructure...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
en;
> + if (iocb->ki_pos > i_size_read(inode))
> + i_size_write(inode, iocb->ki_pos);
> + return written;
Looks like it fails to handle O_[D]SYNC writes.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
((name)[IOMAP_SOURCE_MAP])
And now we only have to pass a single iomap parameter to each
function as "struct iomap **iomap". This makes the code somewhat
simpler, and we only ever need to use IOMAP_S(iomap) when
IOMAP_B(iomap)->type == IOMAP_COW.
The other advantage of this is that if we ever need new
functionality that requires 2 (or more) iomaps, we don't have to
change APIs again
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Aug 05, 2019 at 04:08:43PM +, Goldwyn Rodrigues wrote:
> On Mon, 2019-08-05 at 09:43 +1000, Dave Chinner wrote:
> > On Fri, Aug 02, 2019 at 05:00:45PM -0500, Goldwyn Rodrigues wrote:
> > > From: Goldwyn Rodrigues
> > >
> > > This helps filesyste
now if there's hardware encryption
below or software encryption on top becomes problematic...
So really, from a filesystem and iomap perspective, what Eric says
is right - it's the only order that makes sense...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
hat skips the compression/decompression code
and sets a few extra flags in the iocb that is passed down to the
direct IO code.
We don't need a whole new IO path just to skip a data transformation
step in the direct IO path
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Thu, Sep 05, 2019 at 02:16:37PM +0200, Johannes Thumshirn wrote:
> On 05/09/2019 04:10, Dave Chinner wrote:
> > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote:
> >> From: Omar Sandoval
> >>
> >> This adds an API for writing compressed data
On Fri, Sep 06, 2019 at 11:19:49AM -0700, Omar Sandoval wrote:
> On Thu, Sep 05, 2019 at 12:10:12PM +1000, Dave Chinner wrote:
> > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval
> > >
> > > This adds an API for wri
>
> This now at least survives xfstests -g quick on a 4k xfs file system
> for. Here is my current tree:
>
> http://git.infradead.org/users/hch/xfs.git/shortlog/refs/heads/xfs-cow-iomap
That looks somewhat reasonable. The XFS mapping function is turning
into spaghetti and getting really hard to follow again, though.
Perhaps we should consider splitting the shared/COW path out of
it...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
checked, so why should the format that the data is
supplied to the kernel in suddenly require new privilege checks?
i.e. writing encoded data to a file requires exactly the same
access permissions as writing cleartext data to the file. The only
extra information here is whether the _filesystem_ supports encoded
data, and that doesn't change regardless of what the open file gets
passed to. Hence the capability is either there or it isn't, it
doesn't transform no matter what privilege boundary the file is
passed across. Similarly, we have permission to access the data
or we don't through the struct file, it doesn't transform either.
Hence I don't see why CAP_SYS_ADMIN or any special permissions are
needed for an application with access permissions to file data to
use these RWF_ENCODED IO interfaces. I am inclined to think the
permission check here is wrong and should be dropped, and then all
these issues go away.
Yes, the app that is going to use this needs root perms because it
accesses all data in the fs (it's a backup app!), but that doesn't
mean you can only use RWF_ENCODED if you have root perms.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Wed, Sep 25, 2019 at 08:07:12AM -0400, Colin Walters wrote:
>
>
> On Wed, Sep 25, 2019, at 3:11 AM, Dave Chinner wrote:
> >
> > We're talking about user data read/write access here, not some
> > special security capability. Access to the data has already bee
as *completed*. If we've only replayed up to the
FUA write with 1:63 in it, then no metadata writes should have been
*issued* with 1:396 in it as the LSN that is stamped into metadata
is only updated on log IO completion
On first glance, this implies a bug in the underlying de
$SCRATCH_MNT/baz
You also cannot assume that two separate preallocations beyond EOF
are going to be contiguous (i.e. it could be two separate extents).
What you should just be checking is that there are extents allocated
covering EOF to 3MB, not the exact size, shape and type of extents
that are allocated.
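
One way to express that kind of check, as a sketch only: it assumes the
test has already reduced the extent map to sorted "start end" byte
ranges (end exclusive), one per line. That input format and the name
extents_cover_range are assumptions for illustration, not something
fstests provides.

```shell
# Hypothetical helper: succeed if the sorted "start end" byte ranges on
# stdin cover [lo, hi) without a hole, however many extents that takes.
extents_cover_range()
{
	awk -v lo="$1" -v hi="$2" '
		BEGIN { pos = lo }
		$2 <= pos { next }	# extent lies entirely before the cursor
		$1 > pos  { exit }	# hole at the cursor, stop scanning
		{ pos = $2 }		# extent overlaps the cursor, advance it
		END { exit !(pos >= hi) }
	'
}

# One 2MB extent or two 1MB extents both cover 1MB..3MB.
printf '1048576 3145728\n' | extents_cover_range 1048576 3145728 && echo covered
printf '1048576 2097152\n2097152 3145728\n' | extents_cover_range 1048576 3145728 && echo covered
```

Because only coverage is tested, the result does not change when the
allocator splits or merges the preallocated space differently.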
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
On Mon, Apr 09, 2018 at 11:00:52AM +0100, Filipe Manana wrote:
> On Mon, Apr 9, 2018 at 10:51 AM, Dave Chinner wrote:
> > On Sun, Apr 08, 2018 at 10:07:54AM +0800, Eryu Guan wrote:
> >> On Thu, Apr 05, 2018 at 10:56:14PM +0100, fdman...@kernel.org wrote:
> > You als
> _scratch_mkfs_sized $((256 * 1024 * 1024)) >>$seqres.full 2>&1
But this uses a filesystem larger than the mixed mode threshold in
_scratch_mkfs_sized(). Please update the generic threshold rather
than special case this test.
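
As a sketch of what keeping the decision in one generic place means:
the function name btrfs_mixed_opt and the 1GiB default below are
illustrative assumptions (the value mirrors the quoted diff to
_scratch_mkfs_sized() elsewhere in these results), not the merged code.

```shell
# Hypothetical helper: print "--mixed" when the requested filesystem size
# is at or below the threshold, and nothing otherwise.
btrfs_mixed_opt()
{
	local fssize="$1"
	local threshold="${2:-$((1024 * 1024 * 1024))}"

	if [ "$fssize" -le "$threshold" ]; then
		echo "--mixed"
	fi
}

# A 256MB scratch filesystem falls under the 1GiB threshold.
btrfs_mixed_opt $((256 * 1024 * 1024))
```

With the threshold held in the common helper, individual tests only
pass a size and never need to know about mixed mode.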
Cheers,
s_sized()
> ;;
> btrfs)
> local mixed_opt=
> - (( fssize <= 100 * 1024 * 1024 )) && mixed_opt='--mixed'
> + (( fssize <= 1024 * 1024 * 1024 )) && mixed_opt='--mixed'
> $MKFS_BTRFS_PROG $MKFS_OPTIONS $mixed
metadata
recovery semantics, so it should behave the same way as ext4 and
XFS in tests like these. If it doesn't, then there's filesystem bugs
that need fixing...
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
as a ordering
dependency with the symlink inode, not whatever is found by
resolving the path in the symlink data. IOWs, there is no ordering
relationship between the symlink's parent directory and whatever the
symlink points at. i.e. it's a one-way relationship, and so there is
no revers
on a symlink" may, in fact, run a fsync
method of a completely different filesystem or subsystem. There is
no way this could possibly trigger a directory fsync of the symlink
parent, because the object being fsync()d may not even know what a
filesystem is...
If you want a symlink to have ordering behaviour like a dirent
pointing to a regular file, then use hard links
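
The difference is easy to see from userspace. The following is only an
illustration in a temporary directory (inode_of is a throwaway helper
defined here): a hard link is a second dirent for the same inode, while
a symlink is a separate inode whose data is a path.

```shell
# Print the inode number of a path without following a trailing symlink.
inode_of()
{
	ls -di "$1" | awk '{ print $1 }'
}

d=$(mktemp -d)
echo data > "$d/file"
ln "$d/file" "$d/hardlink"	# second dirent for the same inode
ln -s file "$d/symlink"		# new inode holding the path "file"

echo "file:     $(inode_of "$d/file")"
echo "hardlink: $(inode_of "$d/hardlink")"
echo "symlink:  $(inode_of "$d/symlink")"
```

The first two numbers match and the third differs, which is why only
the hard link shares the target inode's ordering dependencies.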
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
but it also means fsync() hasn't actually guaranteed inode changes
made prior to the fsync to be persistent on disk. i.e. that's a
violation of ordered metadata semantics and probably a bug.
Cheers,
Dave.
--
Dave Chinner
da...@fromorbit.com
nd operations in the MRU
structure, not just the radix tree operations. Turning that around
so that a larger XFS structure and algorithm is now protected by an
opaque internal lock from a generic storage structure that forms part
of the larger structure seems like a bad design pattern to me...
Cheers,