[GIT PULL] xfs: updates for 4.9-rc5

2016-11-09 Thread Dave Chinner
ystem shutdown. Darrick J. Wong (1): xfs: defer should abort intent items if the trans roll fails fs/xfs/libxfs/xfs_defer.c | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) -- Dave Chinner da...@fromorbit.com

Re: [PATCH 7/8] blk-wbt: add general throttling mechanism

2016-11-09 Thread Dave Chinner
until the write cache is filled (can be GB in size) and by then it's way too late to fix up with OS level queuing... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC 0/6] vfs: Add timestamp range check support

2016-11-03 Thread Dave Chinner
On Thu, Nov 03, 2016 at 04:43:57PM -0400, Theodore Ts'o wrote: > On Thu, Nov 03, 2016 at 09:48:27AM +1100, Dave Chinner wrote: > > > > We're going to need regression tests for this to ensure that it > > works properly and that we don't inadvertently break

Re: [PATCH v9 00/16] re-enable DAX PMD support

2016-11-03 Thread Dave Chinner
On Thu, Nov 03, 2016 at 11:51:02AM -0600, Ross Zwisler wrote: > On Thu, Nov 03, 2016 at 12:58:26PM +1100, Dave Chinner wrote: > > On Tue, Nov 01, 2016 at 01:54:02PM -0600, Ross Zwisler wrote: > > > DAX PMDs have been disabled since Jan Kara introduced DAX radix tree based >

Re: [PATCH v9 00/16] re-enable DAX PMD support

2016-11-02 Thread Dave Chinner
ies that is likely to be merged through the ext4 tree, so it needs a stable branch. There's iomap direct IO patches for XFS pending, and they conflict with this patchset. i.e. we need a stable git base to work from... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC 0/6] vfs: Add timestamp range check support

2016-11-02 Thread Dave Chinner
and validate that the mount behaviour, clamping and range limiting is working as intended? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: tmpfs returns incorrect data on concurrent pread() and truncate()

2016-11-01 Thread Dave Chinner
On Tue, Nov 01, 2016 at 06:38:26PM -0700, Hugh Dickins wrote: > On Wed, 2 Nov 2016, Dave Chinner wrote: > > On Tue, Nov 01, 2016 at 04:51:30PM -0700, Hugh Dickins wrote: > > > On Wed, 26 Oct 2016, Jakob Unterwurzacher wrote: > > > > > > > tmpfs seems t

Re: tmpfs returns incorrect data on concurrent pread() and truncate()

2016-11-01 Thread Dave Chinner
truncate, so again we need filesystem level serialisation for this. Put simply: page locks are insufficient as a generic mechanism for serialising filesystem operations. The locking required for this is generally deeply filesystem implementation specific, so it's fine that the VFS doesn'

Re: [PATCH 2/4] fs: remove the never implemented aio_fsync file operation

2016-10-31 Thread Dave Chinner
On Mon, Oct 31, 2016 at 02:07:54PM +0100, Christoph Hellwig wrote: > On Mon, Oct 31, 2016 at 10:23:31AM +1100, Dave Chinner wrote: > > This doesn't belong in this patchset. > > It does. I can't fix up the calling conventions for a methods that > was never implemented.

Re: [PATCH 2/4] fs: remove the never implemented aio_fsync file operation

2016-10-30 Thread Dave Chinner
e so when all the bikeshedding stops we can convert it to the One True AIO Interface that is decided on. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

[GIT PULL] xfs: fixes for 4.9-rc3

2016-10-27 Thread Dave Chinner
| 4 +- include/linux/iomap.h | 17 +- 17 files changed, 640 insertions(+), 645 deletions(-) -- Dave Chinner da...@fromorbit.com

Re: bio linked list corruption.

2016-10-26 Thread Dave Chinner
ing to write as hard as they can concurrently and to all slip through the ENOSPC detection without the correct metadata reservations and all require multiple metadata blocks to be allocated during writeback... If you've got a way to trigger it quickly and reliably, that would be helpful... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/3] iopmem : A block device for PCIe memory

2016-10-25 Thread Dave Chinner
On Tue, Oct 25, 2016 at 05:50:43AM -0600, Stephen Bates wrote: > Hi Dave and Christoph > > On Fri, Oct 21, 2016 at 10:12:53PM +1100, Dave Chinner wrote: > > On Fri, Oct 21, 2016 at 02:57:14AM -0700, Christoph Hellwig wrote: > > > On Fri, Oct 21, 2016 at 10:22:39AM +

Re: [PATCH] shmem: avoid huge pages for small files

2016-10-24 Thread Dave Chinner
On Mon, Oct 24, 2016 at 01:34:53PM -0700, Dave Hansen wrote: > On 10/21/2016 03:50 PM, Dave Chinner wrote: > > On Fri, Oct 21, 2016 at 06:00:07PM +0300, Kirill A. Shutemov wrote: > >> On Fri, Oct 21, 2016 at 04:01:18PM +1100, Dave Chinner wrote: > >> To me, most of thi

Re: [RFC] put more pressure on proc/sysfs slab shrink

2016-10-21 Thread Dave Chinner
o, I don't think s_shrink.batch = 0 does what you think it does. The superblock batch size default of 1024 is more efficient than setting sb->s_shrink.batch = 0 as that makes the shrinker use SHRINK_BATCH: #define SHRINK_BATCH 128 i.e. it does less work per batch so has more overhead Cheers, Dave. -- Dave Chinner da...@fromorbit.com
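The batch-size point in the message above can be illustrated with a small userspace sketch. This is not the kernel code: `struct demo_shrinker`, `effective_batch()` and `scan_calls()` are hypothetical names invented for the illustration, and only the values 128 and 1024 and the "zero falls back to SHRINK_BATCH" behaviour come from the message itself.

```c
#include <assert.h>

/* Fallback batch size quoted in the message above. */
#define SHRINK_BATCH 128

/* Hypothetical stand-in for a shrinker; only the batch field
 * matters for this sketch. */
struct demo_shrinker {
    long batch; /* 0 means "no explicit batch size configured" */
};

/* A zero batch falls back to SHRINK_BATCH, as the message describes. */
static long effective_batch(const struct demo_shrinker *s)
{
    return s->batch ? s->batch : SHRINK_BATCH;
}

/* Number of scan passes needed to cover nr_to_scan objects when each
 * pass handles at most `batch` of them (rounded up). */
static long scan_calls(long nr_to_scan, long batch)
{
    assert(batch > 0);
    return (nr_to_scan + batch - 1) / batch;
}
```

Under these assumptions, covering 4096 objects takes 32 passes with the 128-object fallback but only 4 passes with a 1024-object batch, which is the "less work per batch, more overhead" effect being described.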

Re: [PATCH] shmem: avoid huge pages for small files

2016-10-21 Thread Dave Chinner
On Fri, Oct 21, 2016 at 06:00:07PM +0300, Kirill A. Shutemov wrote: > On Fri, Oct 21, 2016 at 04:01:18PM +1100, Dave Chinner wrote: > > On Thu, Oct 20, 2016 at 07:01:16PM -0700, Andi Kleen wrote: > > > > Ugh, no, please don't use mount options for file specific behavio

Re: [RFC] fs/proc/meminfo: introduce Unaccounted statistic

2016-10-21 Thread Dave Chinner
On Fri, Oct 21, 2016 at 09:25:10AM +0200, Vlastimil Babka wrote: > On 10/21/2016 12:59 AM, Dave Chinner wrote: > >On Thu, Oct 20, 2016 at 03:33:58PM +0200, Michal Hocko wrote: > >>On Thu 20-10-16 14:11:49, Vlastimil Babka wrote: > >>[...] > >>> Hi, I'

Re: [PATCH 0/3] iopmem : A block device for PCIe memory

2016-10-21 Thread Dave Chinner
On Fri, Oct 21, 2016 at 02:57:14AM -0700, Christoph Hellwig wrote: > On Fri, Oct 21, 2016 at 10:22:39AM +1100, Dave Chinner wrote: > > You do realise that local filesystems can silently change the > > location of file data at any point in time, so there is no such > > thing

Re: [PATCH] shmem: avoid huge pages for small files

2016-10-20 Thread Dave Chinner
s have been modified by the admin... > That would imply that every application wanting to use large pages > would need to be especially enabled. That would seem awfully limiting > to me and needlessly deny benefits to most existing code. No change to applications will be necessary (see above), though there's no reason why they couldn't directly use the VFS interfaces to explicitly ask for such behaviour themselves Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/3] iopmem : A block device for PCIe memory

2016-10-20 Thread Dave Chinner
" of file data to block device addresses in userspace? If you want remote access to the blocks owned and controlled by a filesystem, then you need to use a filesystem with a remote locking mechanism to allow co-ordinated, coherent access to the data in those blocks. Anything else is just asking for ongoing, unfixable filesystem corruption or data leakage problems (i.e. security issues). Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC] fs/proc/meminfo: introduce Unaccounted statistic

2016-10-20 Thread Dave Chinner
led) can grow to gigabytes in size under various metadata intensive workloads, there's every chance that such reporting will make users incorrectly think they have a massive memory leak Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] shmem: avoid huge pages for small files

2016-10-20 Thread Dave Chinner
ge=within_size" mounts only. > > Well, you're right that I tried originally address the issue with > huge=within_size, but this option makes much more sense for filesystem > with persistent storage. For ext4, it would be pretty usable option. Ugh, no, please don't use mount options for file specific behaviours in filesystems like ext4 and XFS. This is exactly the sort of behaviour that should either just work automatically (i.e. be completely controlled by the filesystem) or only be applied to files specifically configured with persistent hints to reliably allocate extents in a way that can be easily mapped to huge pages. e.g. on XFS you will need to apply extent size hints to get large page sized/aligned extent allocation to occur, and so this persistent extent size hint should trigger the filesystem to use large pages if supported, the hint is correctly sized and aligned, and there are large pages available for allocation. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/5] lib: radix-tree: native accounting and tracking of special entries

2016-10-20 Thread Dave Chinner
/* 0 1 */ unsigned char offset; /* 1 1 */ /* XXX 2 bytes hole, try to pack */ unsigned int count;/* 4 4 */ . Cheers, Dave. -- Dave Chinner da...@fromorbit.com
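The 2-byte hole reported in the layout dump above can be reproduced with a minimal sketch. `struct demo_node` is a hypothetical struct, not the kernel structure under discussion; the name of its first member is an assumption because the quoted dump is truncated, and the offsets assume a common ABI where `unsigned int` is 4 bytes with 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct mirroring the quoted layout: two 1-byte members
 * followed by a 4-byte member. */
struct demo_node {
    unsigned char first;   /* offset 0, size 1 (name is an assumption) */
    unsigned char offset;  /* offset 1, size 1 */
    /* compiler inserts padding here so that count is aligned */
    unsigned int count;    /* offset 4, size 4 on the assumed ABI */
};

/* Size of the padding hole between `offset` and `count`. */
static size_t hole_before_count(void)
{
    size_t end_of_offset = offsetof(struct demo_node, offset)
                         + sizeof(unsigned char);

    assert(offsetof(struct demo_node, count) >= end_of_offset);
    return offsetof(struct demo_node, count) - end_of_offset;
}
```

On such an ABI `hole_before_count()` reports the same 2 bytes the dump flags with "try to pack"; two more 1-byte members could be placed there without growing the struct.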

[4.9-rc1, selinux/audit/netlink, regression?] Warning at kernel/softirq.c:161

2016-10-20 Thread Dave Chinner
don't tend to sit idle for 5 hours like this one did before tripping this. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

[regression, 4.9-rc1] blk-mq: list corruption in request queue

2016-10-18 Thread Dave Chinner
't seen it before, hence it's probably a regression. I haven't tried to reproduce it yet, so I don't know if it's easy to trip over. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-17 Thread Dave Chinner
On Mon, Oct 17, 2016 at 10:22:56AM +0200, Michal Hocko wrote: > On Mon 17-10-16 07:49:59, Dave Chinner wrote: > > On Thu, Oct 13, 2016 at 01:04:56PM +0200, Michal Hocko wrote: > > > On Thu 13-10-16 09:39:47, Michal Hocko wrote: > > > > On Thu 13-10-16 11:29:24, Dave C

Re: [RFC PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-16 Thread Dave Chinner
On Thu, Oct 13, 2016 at 01:04:56PM +0200, Michal Hocko wrote: > On Thu 13-10-16 09:39:47, Michal Hocko wrote: > > On Thu 13-10-16 11:29:24, Dave Chinner wrote: > > > On Fri, Oct 07, 2016 at 03:18:14PM +0200, Michal Hocko wrote: > > [...] > > > > Unpatched ker

Re: [PATCH] DAX: enable iostat for read/write

2016-10-15 Thread Dave Chinner
tart = jiffies; > + part_round_stats(cpu, &disk->part0); > + part_stat_inc(cpu, &disk->part0, ios[rw]); > + part_stat_add(cpu, &disk->part0, sectors[rw], sec); > + part_inc_in_flight(&disk->part0, rw); > + part_stat_unlock(); > +} Why reimplement generic_start_io_acct() and generic_end_io_acct()? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-12 Thread Dave Chinner
On Fri, Oct 07, 2016 at 03:18:14PM +0200, Michal Hocko wrote: > On Thu 06-10-16 13:11:42, Dave Chinner wrote: > > On Wed, Oct 05, 2016 at 01:38:45PM +0200, Michal Hocko wrote: > > > On Wed 05-10-16 07:32:02, Dave Chinner wrote: > > > > On Tue, Oct 04, 2016 at 10:12:

Re: [PATCH v2] z3fold: add shrinker

2016-10-12 Thread Dave Chinner
On Wed, Oct 12, 2016 at 10:26:34AM +0200, Vitaly Wool wrote: > On Wed, 12 Oct 2016 09:52:06 +1100 > Dave Chinner wrote: > > > > > > > +static unsigned long z3fold_shrink_scan(struct shrinker *shrink, > > > + struct shrink_control *sc)

[GIT PULL] xfs: shared data extents support for 4.9-rc1

2016-10-12 Thread Dave Chinner
644 fs/xfs/xfs_refcount_item.c create mode 100644 fs/xfs/xfs_refcount_item.h create mode 100644 fs/xfs/xfs_reflink.c create mode 100644 fs/xfs/xfs_reflink.h create mode 100644 fs/xfs/xfs_trans_bmap.c create mode 100644 fs/xfs/xfs_trans_refcount.c -- Dave Chinner da...@fromorbit.com

[regression, 4.9, pmem] memmap= command line, pmem device creation behaviour changed

2016-10-11 Thread Dave Chinner
nt regions - persistent memory device setup cannot be allowed to change from kernel to kernel. Change in mapping and device setup like this will cause the corruption of and/or loss of data in the persistent memory devices that have changed shape, size or disappeared Cheers, Dave. -- Dave

Re: [PATCH v2] z3fold: add shrinker

2016-10-11 Thread Dave Chinner
+ pool->no_shrinker = true; > + } Just fail creation of the pool. If you can't register a shrinker, then much bigger problems are about to happen to your system, and running a new memory consumer that /can't be shrunk/ is not going to help anyone. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] z3fold: add shrinker

2016-10-11 Thread Dave Chinner
d and higher compression ratio therefore. > > Signed-off-by: Vitaly Wool This seems to implement the shrinker API we removed ~3 years ago (commit a0b02131c5fc ("shrinker: Kill old ->shrink API.")). Forward porting and testing required, perhaps? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

2016-10-11 Thread Dave Chinner
On Mon, Oct 10, 2016 at 02:34:34AM -0700, Christoph Hellwig wrote: > On Mon, Oct 10, 2016 at 05:07:45PM +1100, Dave Chinner wrote: > > > > *However*, the DAX IO path locking in XFS has changed in 4.9-rc1 to > > > > match the buffered IO single writer POSIX semantics

Re: [PATCH] dax: correct dax iomap code namespace

2016-10-10 Thread Dave Chinner
of the "dax" namespace and not the "iomap" namespace. > Rename them to dax_iomap_rw(), dax_iomap_fault() and dax_iomap_actor() > respectively. > > Signed-off-by: Ross Zwisler > Suggested-by: Dave Chinner > Reviewed-by: Christoph Hellwig > Reviewed-by: Jan

Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

2016-10-09 Thread Dave Chinner
On Sun, Oct 09, 2016 at 08:17:48AM -0700, Christoph Hellwig wrote: > On Fri, Oct 07, 2016 at 08:47:51AM +1100, Dave Chinner wrote: > > Except that it's DAX, and in 4.7-rc1 that used shared locking at the > > XFS level and never took exclusive locks. > > > > *Howe

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-09 Thread Dave Chinner
On Sun, Oct 09, 2016 at 06:14:57PM +0200, Oleg Nesterov wrote: > On 10/08, Dave Chinner wrote: > > > > On Fri, Oct 07, 2016 at 07:15:18PM +0200, Oleg Nesterov wrote: > > > > > > > > > > --- x/fs/xfs/xfs_trans.c > > > > > +++ x/fs/x

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-07 Thread Dave Chinner
On Fri, Oct 07, 2016 at 07:15:18PM +0200, Oleg Nesterov wrote: > On 10/07, Dave Chinner wrote: > > > > On Thu, Oct 06, 2016 at 07:17:58PM +0200, Oleg Nesterov wrote: > > > Probably false positive? Although when I look at the comment above > > > xfs_sync_sb() >

Re: lockdep splat due to reclaim recursion detected

2016-10-07 Thread Dave Chinner
this either. ISTR this same issue triggered a whole long discussion about how to move memory allocation to task based context flags or to push more context specific information into the shrinkers so they could decide if they needed to avoid deadlocks or not. That was about 6 months ago, IIRC, and there's been no followup from the mm side of things... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-06 Thread Dave Chinner
S_NOFS, not change the implementation to make XFS_TRANS_NO_WRITECOUNT flag to also mean XFS_TRANS_NOFS. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

2016-10-06 Thread Dave Chinner
can't be used as a regression test across multiple kernels. If you want to stress concurrent access to a single file, please use direct IO, not DAX or buffered IO. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-05 Thread Dave Chinner
On Wed, Oct 05, 2016 at 01:38:45PM +0200, Michal Hocko wrote: > On Wed 05-10-16 07:32:02, Dave Chinner wrote: > > On Tue, Oct 04, 2016 at 10:12:15AM +0200, Michal Hocko wrote: > > > From: Michal Hocko > > > > > > compaction has been disabled for GFP_NOFS a

Re: BUG_ON() in workingset_node_shadows_dec() triggers

2016-10-05 Thread Dave Chinner
d CONFIG_XFS_WARN=y to do this - it was a 20 line change to add XFS_CONFIG_WARN instead of having to audit and modify ~1800 call sites to do something differently. And because we know that ASSERT() is not present in all kernels, it isn't ever used as a replacement for error handling. Perhaps that's the simplest solution here as well Just my 2c worth. -Dave. -- Dave Chinner da...@fromorbit.com

[GIT PULL] xfs: updates for 4.9-rc1

2016-10-05 Thread Dave Chinner
n a btree xfs: defer should allow ->finish_item to request a new transaction xfs: set up per-AG free space reservations Dave Chinner (9): xfs: fix superblock inprogress check xfs: change mailing list address xfs: remote attribute blocks aren't really userdata x

Re: [PATCH] fs/block_dev.c: return the right error in thaw_bdev()

2016-10-05 Thread Dave Chinner
b; > > > int error = 0; > > > > > > mutex_lock(&bdev->bd_fsfreeze_mutex); > > > if (++bdev->bd_fsfreeze_count > 1) { > > > > No limit is put in place so in principle this will eventually turn negative. > >

Re: [RFC PATCH] mm, compaction: allow compaction for GFP_NOFS requests

2016-10-04 Thread Dave Chinner
to see lots of 65kB allocations being requested in GFP_NOFS context by the xfs-cil-worker context doing journal checkpoint formatting Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-04 Thread Dave Chinner
evice... Put your TEST_DIR and SCRATCH_MNT mount points outside the xfstests directory, and this should go away. Most people use /mnt/test and /mnt/scratch for these Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-04 Thread Dave Chinner
On Tue, Oct 04, 2016 at 01:43:43PM +0200, Oleg Nesterov wrote: > On 10/03, Oleg Nesterov wrote: > > > > On 10/03, Dave Chinner wrote: > > > > > > On Fri, Sep 30, 2016 at 07:14:34PM +0200, Oleg Nesterov wrote: > > > > On 09/27, Oleg Nes

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-10-02 Thread Dave Chinner
t of tests that should complete successfully with minimal failures and without crashing the machine. If you're running this group and there's failures, hangs and crashes all over the place, then you need to start reporting bugs because that should not be happening on any kernel Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 00/12] re-enable DAX PMD support

2016-09-29 Thread Dave Chinner
On Thu, Sep 29, 2016 at 09:03:43PM -0600, Ross Zwisler wrote: > On Fri, Sep 30, 2016 at 09:43:45AM +1000, Dave Chinner wrote: > > Finally: none of the patches in your tree have reviewed-by tags. > > That says to me that none of this code has been reviewed yet. > > Rev

Re: [PATCH v4 00/12] re-enable DAX PMD support

2016-09-29 Thread Dave Chinner
h the one tree to avoid issues like this. > > Changes since v3: > - Corrected dax iomap code namespace for functions defined in fs/dax.c. >(Dave Chinner) > - Added leading "dax" namespace to new static functions in fs/dax.c. >(Dave Chinner) > - Made all DA

Re: [PATCH v3 00/11] re-enable DAX PMD support

2016-09-27 Thread Dave Chinner
eview or testing of the DAX changes (apart from the cursory comments I've already made) because of the huge pile of XFS reflink changes I've got to get through first. However, I've already got the iomap dax bits in the XFS tree so I can pull everything through there if review and testing is covered otherwise.. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 09/11] dax: add struct iomap based DAX PMD support

2016-09-27 Thread Dave Chinner
struct address_space *mapping = vma->vm_file->f_mapping; > + unsigned long pmd_addr = address & PMD_MASK; > + bool write = flags & FAULT_FLAG_WRITE; > + struct inode *inode = mapping->host; > + struct iomap iomap = { 0 }; > + int error, result = 0

Re: [PATCH v3 04/11] ext2: remove support for DAX PMD faults

2016-09-27 Thread Dave Chinner
it be better to put a comment mentioning this here? So as the years go by, this reminds people not to bother trying to implement it? /* * .pmd_fault is not supported for DAX because allocation in ext2 * cannot be reliably aligned to huge page sizes and so pmd faults * will always fail and fall back to regular faults. */ -- Dave Chinner da...@fromorbit.com

Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

2016-09-27 Thread Dave Chinner
> > SCRATCH_MNT=SCRATCH \ > > ./check `grep -il freeze tests/*/???` > > You can run either: > > ./check -g freeze > > to check just the freezing tests or > > ./check Better for regression testing is: check -g auto so that is skips all the tests that are broken or likely to crash the machine on some debug check. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] [4.8-rc7, regression] fault_in_multipages_readable() throws set-but-unused error

2016-09-25 Thread Dave Chinner
On Sun, Sep 25, 2016 at 06:21:05PM -0700, Linus Torvalds wrote: > Thanks, applied. > > I did happen to notice: > > On Sun, Sep 25, 2016 at 4:57 PM, Dave Chinner wrote: > > > > ./include/linux/pagemap.h: In function 'fault_in_multipages_readable': > > ./incl

[PATCH] [4.8-rc7, regression] fault_in_multipages_readable() throws set-but-unused error

2016-09-25 Thread Dave Chinner
From: Dave Chinner When building XFS with -Werror, it now fails with: ./include/linux/pagemap.h: In function 'fault_in_multipages_readable': ./include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable] volatile char c; ^ This is a

Re: [BUG, 4.8-rc7] perf: oops in intel_pmu_enable_all

2016-09-25 Thread Dave Chinner
On Mon, Sep 26, 2016 at 09:22:45AM +1000, Dave Chinner wrote: > Hi Folks, > > I just upgraded a test VM from 4.8-rc6 to 4.8-rc7, and went to run: > > # perf_4.7 top -g -U > > inside the VM - the kernel oops with the trace below. The perf > binary was built from a 4.7

[BUG, 4.8-rc7] perf: oops in intel_pmu_enable_all

2016-09-25 Thread Dave Chinner
5147] CR2: 0018 [ 16.535644] ---[ end trace 61a930b5078051b0 ]--- [ 16.535644] Kernel panic - not syncing: Fatal exception in interrupt [ 16.535833] Kernel Offset: disabled Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Linux 4.8: Reported regressions as of Sunday, 2016-09-18

2016-09-18 Thread Dave Chinner
frastructure, and nobody has been able to reproduce it exactly outside of the reaim benchmark. We've reproduced other, similar issues, and the fixes for those are queued for the 4.9 window. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs

2016-09-15 Thread Dave Chinner
On Thu, Sep 15, 2016 at 07:04:27PM -0700, Dan Williams wrote: > On Thu, Sep 15, 2016 at 6:24 PM, Dave Chinner wrote: > > On Thu, Sep 15, 2016 at 05:16:42PM -0700, Dan Williams wrote: > >> On Thu, Sep 15, 2016 at 4:07 PM, Dave Chinner wrote: > >> > On Thu, Sep 15,

Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs

2016-09-15 Thread Dave Chinner
On Thu, Sep 15, 2016 at 05:16:42PM -0700, Dan Williams wrote: > On Thu, Sep 15, 2016 at 4:07 PM, Dave Chinner wrote: > > On Thu, Sep 15, 2016 at 10:01:03AM -0700, Dan Williams wrote: > >> On Thu, Sep 15, 2016 at 1:26 AM, Christoph Hellwig wrote: > >> > On Wed, Se

Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs

2016-09-15 Thread Dave Chinner
xfs/086 @@ -96,7 +96,8 @@ _scratch_mount echo "+ modify files" for x in `seq 1 64`; do - $XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full + $XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" \ + >> $seqres.full 2>&1 done umount "${SCRATCH_MNT}" Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-15 Thread Dave Chinner
On Thu, Sep 15, 2016 at 09:42:22PM +1000, Nicholas Piggin wrote: > On Thu, 15 Sep 2016 20:32:10 +1000 > Dave Chinner wrote: > > > > You still haven't described anything about what a per-block flag > > design is supposed to look like :/ > > For the API, or

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-15 Thread Dave Chinner
On Thu, Sep 15, 2016 at 01:49:45PM +1000, Nicholas Piggin wrote: > On Thu, 15 Sep 2016 12:31:33 +1000 > Dave Chinner wrote: > > > On Wed, Sep 14, 2016 at 08:19:36PM +1000, Nicholas Piggin wrote: > > > On Wed, 14 Sep 2016 17:39:02 +1000 > > Sure, but one first has t

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-14 Thread Dave Chinner
On Wed, Sep 14, 2016 at 10:55:03PM -0700, Darrick J. Wong wrote: > On Mon, Sep 12, 2016 at 11:40:35AM +1000, Dave Chinner wrote: > > On Thu, Sep 08, 2016 at 04:56:36PM -0600, Ross Zwisler wrote: > > > On Wed, Sep 07, 2016 at 09:32:36PM -0700, Dan Williams wrote: > > > >

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-14 Thread Dave Chinner
On Wed, Sep 14, 2016 at 08:19:36PM +1000, Nicholas Piggin wrote: > On Wed, 14 Sep 2016 17:39:02 +1000 > Dave Chinner wrote: > > Ok, looking back over your example, you seem to be suggesting a new > > page fault behaviour is required from filesystems that has not been > >

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-14 Thread Dave Chinner
On Tue, Sep 13, 2016 at 11:53:11AM +1000, Nicholas Piggin wrote: > On Tue, 13 Sep 2016 07:34:36 +1000 > Dave Chinner wrote: > But let me understand your example in the absence of that. > > - Application mmaps a file, faults in block 0 > - FS allocates block, creates mappin

Re: [PATCH] xfs: fix signed integer overflow

2016-09-12 Thread Dave Chinner
<< (end_bit - bit)) - 1) << bit; > *wordp |= mask; > wordp++; > bits_set = end_bit - bit; This patch is whitespace damaged and fails to apply. I've fixed it up as this is a trivial change. However, please fix the problem before you submit more patches. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
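The quoted fragment builds a mask covering a run of bits inside one word. The patch itself is truncated here, so the following is only a sketch of the general technique of doing that arithmetic in an unsigned type so that no shift can overflow a signed int; `bit_range_mask()` is a hypothetical helper and is not taken from the XFS source or from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with bits [bit, end_bit) set in a 32-bit word.
 * All shifts are done on an unsigned 32-bit value, so setting the
 * top bit is well defined, unlike shifting a signed int 1 into or
 * past the sign bit. */
static uint32_t bit_range_mask(unsigned int bit, unsigned int end_bit)
{
    unsigned int width;

    assert(bit < end_bit && end_bit <= 32);
    width = end_bit - bit;

    /* A shift by the full word width is undefined, so handle the
     * whole-word case (bit == 0, end_bit == 32) separately. */
    if (width == 32)
        return UINT32_MAX;

    return (uint32_t)(((UINT32_C(1) << width) - 1) << bit);
}
```

For example, `bit_range_mask(28, 32)` yields `0xF0000000`; computing the same value from a signed `1` would shift a bit into the sign position, which is the kind of overflow the thread subject refers to.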

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-12 Thread Dave Chinner
te frankly, it's far easier to change the broken PMEM programming model assumptions than it is to implement what you are suggesting. Or to do what Christoph suggested and just use a wrapper around something like device mapper to hand out chunks of unchanging, static pmem to applications... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] logfs: remove from tree

2016-09-11 Thread Dave Chinner
fs/logfs/readwrite.c| 2298 > --- > fs/logfs/segment.c | 961 --- > fs/logfs/super.c| 653 -- Wasn't the lib/btree.c implementation introduced with and only used by logfs? Should that go as well? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in /proc/self/smaps)

2016-09-11 Thread Dave Chinner
not currently mapped and caller has CAP_LINUX_IMMUTABLE. A flag like this /should/ make it possible to avoid fsync/msync() on a file for existing filesystems, but it also means that such files have significant management issues (hence the need for CAP_LINUX_IMMUTABLE to cover its use). Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-09-01 Thread Dave Chinner
On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote: > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote: > > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote: > > > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote: > > > > >

[GIT PULL] xfs, iomap: fixes for 4.8-rc5

2016-09-01 Thread Dave Chinner
log done items directly in the deferred pending work item Dave Chinner (1): xfs: fix superblock inprogress check fs/iomap.c | 5 - fs/xfs/libxfs/xfs_alloc.c | 2 ++ fs/xfs/libxfs/xfs_btree.c | 14 +- fs/xfs/libxfs/xfs_defer.c | 17 - f

Re: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes

2016-08-28 Thread Dave Chinner
" comment in your >mail, the current PMD logic requires us to know the size of the hole. This >The current XFS code in the v4.8 tree tells me the size of the hole, and I >think we need to keep this functionality. IOMAP_HOLE extents. It's a requirement of the iomap infrastructure that the filesystem reports hole extents in full for the range being mapped. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] Make __xfs_xattr_put_listen preperly report errors.

2016-08-28 Thread Dave Chinner
On Fri, Aug 26, 2016 at 10:59:28AM +0200, Artem Savkov wrote: > On Fri, Aug 26, 2016 at 08:42:15AM +1000, Dave Chinner wrote: > > So when I look at the fix, and see that it doesn't reproduce on my > > systems, it's clear that it's either not yet fully understood or >

Re: [PATCH] Make __xfs_xattr_put_listen preperly report errors.

2016-08-25 Thread Dave Chinner
On Thu, Aug 25, 2016 at 10:21:09AM +0200, Artem Savkov wrote: > On Thu, Aug 25, 2016 at 10:24:08AM +1000, Dave Chinner wrote: > > On Wed, Aug 24, 2016 at 10:08:33AM +0200, Artem Savkov wrote: > > > On Wed, Aug 24, 2016 at 11:55:51AM +1000, Dave Chinner wrote: > > > >

Re: [PATCH] Make __xfs_xattr_put_listen preperly report errors.

2016-08-25 Thread Dave Chinner
On Wed, Aug 24, 2016 at 10:08:33AM +0200, Artem Savkov wrote: > On Wed, Aug 24, 2016 at 11:55:51AM +1000, Dave Chinner wrote: > > On Tue, Aug 23, 2016 at 05:54:13PM +0200, Artem Savkov wrote: > > > Commit "xfs: only return -errno or success from attr ->put_l

Re: [PATCH] Make __xfs_xattr_put_listen preperly report errors.

2016-08-23 Thread Dave Chinner
n + 1; > if (arraytop > context->firstu) { > context->count = -1;/* insufficient space */ > + context->seen_enough = 1; > return 0; > } > offset = (char *)context->alist + context->count; Looks sane, though I don't know how to test it yet Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3.10 090/180] xfs: xfs_iflush_cluster fails to abort on error

2016-08-22 Thread Dave Chinner
On Mon, Aug 22, 2016 at 07:18:26AM +0200, Willy Tarreau wrote: > Hi Dave, > > On Mon, Aug 22, 2016 at 02:21:08PM +1000, Dave Chinner wrote: > > > - if (error || !bp) { > > > + if (error == -EAGAIN) { > > > > Wrong. Errors changed sign in XFS in 3.17. >

Re: [PATCH 3.10 090/180] xfs: xfs_iflush_cluster fails to abort on error

2016-08-21 Thread Dave Chinner
On Sun, Aug 21, 2016 at 05:30:20PM +0200, Willy Tarreau wrote: > From: Dave Chinner > > commit b1438f477934f5a4d5a44df26f3079a7575d5946 upstream. > > When a failure due to an inode buffer occurs, the error handling > fails to abort the inode writeback correctly. This can res

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-19 Thread Dave Chinner
sense for data that is really hot... I think the underlying principle here is that the faster the backing device, the less we should cache and buffer the device in the OS. I suspect a good initial approximation of "stickiness" for the page cache would the speed of writeback as measured by the BDI underlying the mapping Cheers, Dave. -- Dave Chinner da...@fromorbit.com

[GIT PULL] xfs, iomap: fixes for 4.8-rc3

2016-08-18 Thread Dave Chinner
ve OWN_AG rmap when allocating a block from the AGFL Dave Chinner (4): xfs: don't invalidate whole file on DAX read/write iomap: fiemap should honor the FIEMAP_FLAG_SYNC flag iomap: prepare iomap_fiemap for attribute mappings Merge branch 'iomap-fixes-4.8-rc3

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-18 Thread Dave Chinner
On Thu, Aug 18, 2016 at 10:55:01AM -0700, Linus Torvalds wrote: > On Thu, Aug 18, 2016 at 6:24 AM, Mel Gorman > wrote: > > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote: > >> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode. > >> > >

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-18 Thread Dave Chinner
/* > - * At this point, we have no other references and there is > - * no way to pick any more up (removed from LRU, removed > - * from pagecache). Can use non-atomic bitops now (and > - * we obviously don't have to worry about waking up a process > - * waiting on the page lock, because there are no references. > - */ > - __ClearPageLocked(page); > + list_add(&page->lru, &mapping_pages); > + if (ret == SWAP_LZFREE) > + count_vm_event(PGLAZYFREED); > + continue; > + > free_it: > if (ret == SWAP_LZFREE) > count_vm_event(PGLAZYFREED); > @@ -1251,6 +1320,7 @@ static unsigned long shrink_page_list(struct list_head > *page_list, > VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page); > } > > + remove_mapping_list(&mapping_pages, &free_pages, &ret_pages); > mem_cgroup_uncharge_list(&free_pages); > try_to_unmap_flush(); > free_hot_cold_page_list(&free_pages, true); > -- Dave Chinner da...@fromorbit.com

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-17 Thread Dave Chinner
I think 3 is a good possibility if contended locks result in expensive > exiting and reentry of the guest. I have a vague recollection that a > spinning vcpu exits the guest but I did not confirm that. I don't think anything like that has been implemented in the pv spinlocks yet. They just spin right now - it's the same lock implementation as the host. Also, context switch rates measured on the host are not significantly higher than what is measured in the guest, so there doesn't appear to be any extra scheduling on the host side occurring. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3.16 102/305] xfs: xfs_iflush_cluster fails to abort on error

2016-08-16 Thread Dave Chinner
On Tue, Aug 16, 2016 at 08:45:02PM +0100, Ben Hutchings wrote: > On Sun, 2016-08-14 at 09:36 +1000, Dave Chinner wrote: > > On Sat, Aug 13, 2016 at 06:42:51PM +0100, Ben Hutchings wrote: > > > > > > 3.16.37-rc1 review patch.  If anyone has any objections

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-16 Thread Dave Chinner
irtual 16-core system on a physical machine > that then doesn't consistently give 16 cores to the virtual machine, > you'll get no end of hiccups. I learnt that lesson 6-7 years ago when I first started doing baseline benchmarking to compare bare metal to virtualised IO performance. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
ack. > > We actually have some very recent changes that I didn't even think > about that went into this very merge window. > Mel? The issue is that Dave Chinner is seeing some nasty spinlock > contention on "mapping->tree_lock": > > > 31.18% [kern

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
On Mon, Aug 15, 2016 at 04:20:55PM -0700, Linus Torvalds wrote: > On Mon, Aug 15, 2016 at 3:42 PM, Dave Chinner wrote: > > > > 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath > >9.90% [kernel] [k] copy_user_generic_string >

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
_set_page_dirty -Dave. -- Dave Chinner da...@fromorbit.com

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
On Mon, Aug 15, 2016 at 04:01:00PM -0700, Linus Torvalds wrote: > On Mon, Aug 15, 2016 at 3:22 PM, Dave Chinner wrote: > > > > Right, but that does not make the profile data useless, > > Yes it does. Because it basically hides everything that happens inside > the lock,

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
On Mon, Aug 15, 2016 at 10:22:43AM -0700, Huang, Ying wrote: > Hi, Chinner, > > Dave Chinner writes: > > > On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote: > >> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying wrote: > >> > > >> >

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
On Tue, Aug 16, 2016 at 08:22:11AM +1000, Dave Chinner wrote: > On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote: > > On Aug 14, 2016 10:00 PM, "Dave Chinner" wrote: > > > > > > > What does it say if you annotate that _r

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote: > On Aug 14, 2016 10:00 PM, "Dave Chinner" wrote: > > > > > What does it say if you annotate that _raw_spin_unlock_irqrestore() > function? > > > >¿ &

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Dave Chinner
g_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read Significant increase in blocking delays in the journal during atime updates. There's nothing in Christoph's tree that would affect that behaviour. This smells like either a mount option change or individual tests not being 100% isolated and the previous test run is affecting this one? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-14 Thread Dave Chinner
On Sun, Aug 14, 2016 at 07:53:40PM -0700, Linus Torvalds wrote: > On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner wrote: > >> > >> Maybe your symbol table came from a old kernel, and functions moved > >> around enough that the profile attributions ended up bogus.

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-14 Thread Dave Chinner
On Sun, Aug 14, 2016 at 06:37:33PM -0700, Linus Torvalds wrote: > On Sun, Aug 14, 2016 at 5:48 PM, Dave Chinner wrote: > >> > >> Does this attached patch help your contention numbers? > > > > No. If anything, it makes it worse. Without the pat

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-14 Thread Dave Chinner
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote: > On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner wrote: > > On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote: > >> > >> I don't recall having ever seen the mapping tree_lock as a conten
