RE: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

2017-01-22 Thread Matthew Wilcox
From: Christoph Hellwig [mailto:h...@lst.de] > On Sat, Jan 21, 2017 at 04:28:52PM +0000, Matthew Wilcox wrote: > > Of course, there may not be a backing device either! > > s/backing device/block device/ ? If so fully agreed. I like the dax_ops > scheme, but we should go all th

RE: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

2017-01-22 Thread Matthew Wilcox
From: Christoph Hellwig [mailto:h...@lst.de] > On Sun, Jan 22, 2017 at 06:19:24PM +0000, Matthew Wilcox wrote: > > No, I mean a network filesystem like 9p or cifs or nfs. If the memcpy > > is supposed to be performed by the backing device > > struct backing_dev has no rel

RE: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

2017-01-22 Thread Matthew Wilcox
From: Christoph Hellwig [mailto:h...@lst.de] > On Sun, Jan 22, 2017 at 03:43:09PM +0000, Matthew Wilcox wrote: > > In the case of a network filesystem being used to communicate with > > a different VM on the same physical machine, there is no backing > > device, just a network

RE: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

2017-01-21 Thread Matthew Wilcox
From: Dan Williams [mailto:dan.j.willi...@intel.com] > A couple weeks back, in the course of reviewing the memcpy_nocache() > proposal from Brian, Linus subtly suggested that the pmem specific > memcpy_to_pmem() routine be moved to be implemented at the driver > level [1]: Of course, there may

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
which lacks PCI. -Original Message- From: Dan Williams [mailto:dan.j.willi...@intel.com] Sent: Thursday, August 18, 2016 4:17 PM To: Matthew Wilcox <mawil...@microsoft.com> Cc: X86 ML <x...@kernel.org>; linux-ker...@vger.kernel.org; linux-nvdimm@lists.01.org Subject: Re: [PATCH 1

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-19 Thread Matthew Wilcox
we want too) In short, not calling these functions is a bug. We just haven't noticed before now because PCI is so omnipresent. -Original Message- From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com] Sent: Friday, August 19, 2016 12:07 PM To: Matthew Wilcox <ma

[PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
From: Matthew Wilcox <mawil...@microsoft.com> The Generation 2 Hyper-V virtual machine does not emulate PCI. This check causes the call to pcibios_resource_survey() to be skipped, and pcibios_resource_survey() calls e820_reserve_resources_late(), which is where PMEM resources are

[PATCH 2/2] x86: Turn down the scary 'no PCI' warning

2016-08-18 Thread Matthew Wilcox
From: Matthew Wilcox <mawil...@microsoft.com> Hyper-V Generation 2 guests do not have PCI hardware. It is therefore not 'Fatal' to lack PCI config space, and those booting with the 'quiet' option do not need to see a message at all. I deem this to be 'informational' rather than '

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
This is the memmap= kernel command line option. -Original Message- From: Dan Williams [mailto:dan.j.willi...@intel.com] Sent: Thursday, August 18, 2016 1:32 PM To: Matthew Wilcox <mawil...@linuxonhyperv.com> Cc: X86 ML <x...@kernel.org>; linux-ker...@vger.kernel.org; linux-

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
[mailto:dan.j.willi...@intel.com] Sent: Thursday, August 18, 2016 2:03 PM To: Matthew Wilcox <mawil...@microsoft.com> Cc: Matthew Wilcox <mawil...@linuxonhyperv.com>; X86 ML <x...@kernel.org>; linux-ker...@vger.kernel.org; linux-nvdimm@lists.01.org Subject: Re: [PATCH 1/2] x86: Set up r

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
an.j.willi...@intel.com] Sent: Thursday, August 18, 2016 2:15 PM To: Matthew Wilcox <mawil...@microsoft.com> Cc: Matthew Wilcox <mawil...@linuxonhyperv.com>; X86 ML <x...@kernel.org>; linux-ker...@vger.kernel.org; linux-nvdimm@lists.01.org Subject: Re: [PATCH 1/2] x86: Set up re

RE: [PATCH 4/5] mm: fix cache mode of dax pmd mappings

2016-09-06 Thread Matthew Wilcox
[mailto:dan.j.willi...@intel.com] Sent: Tuesday, September 6, 2016 12:50 PM To: linux-nvdimm@lists.01.org Cc: Toshi Kani <toshi.k...@hpe.com>; Matthew Wilcox <mawil...@microsoft.com>; Nilesh Choudhury <nilesh.choudh...@oracle.com>; linux-ker...@vger.kernel.org; sta...@vger.kernel.org;

RE: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes

2016-09-10 Thread Matthew Wilcox
From: Christoph Hellwig [mailto:h...@infradead.org] > On Sat, Sep 10, 2016 at 07:33:18AM +0000, Matthew Wilcox wrote: > > > caller specific is unacceptable. That being said your idea doesn't > > > sound unreasonable, but will require a bit more work and has no >

RE: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes

2016-09-10 Thread Matthew Wilcox
From: Dan Williams [mailto:dan.j.willi...@intel.com] > /me grumbles about top-posting... Let's see if this does any better .. there's lots of new features, but I don't see a 'wrap lines at 80 columns' option. Unfortunately. > On Fri, Sep 9, 2016 at 1:35 PM, Matthew Wilcox

RE: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes

2016-09-10 Thread Matthew Wilcox
From: Christoph Hellwig [mailto:h...@infradead.org] > Either way we need to get rid of buffer_heads, and another aop that is > entirely > caller specific is unacceptable. I finally figured out what you actually meant by this. You mean that instead of having an aop->populate_pfn, you want to

RE: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes

2016-09-10 Thread Matthew Wilcox
From: Theodore Ts'o [mailto:ty...@mit.edu] > On Sat, Sep 10, 2016 at 12:31:51AM -0700, Christoph Hellwig wrote: > > I've mentioned this before, but I'd like to repeat it. With all the > > work required in the file system I would prefer to drop DAX support > > in ext2 (and if people really cry

RE: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

2016-08-18 Thread Matthew Wilcox
Compared to a patch which removes 5 lines of code, almost any additional work is ocean-boiling. -Original Message- From: Dan Williams [mailto:dan.j.willi...@intel.com] Sent: Thursday, August 18, 2016 3:12 PM To: Matthew Wilcox <mawil...@microsoft.com> Cc: X86 ML <x...@kernel.or

[PATCH] nvdimm: Remove minimum size requirement

2017-07-10 Thread Matthew Wilcox
From: Matthew Wilcox <mawil...@microsoft.com> There was no need to have a minimum size of 4MB for NV-DIMMs; it was just a sanity check. Keep a check that it's at least one page in size because we really can't add less than a page to the memory map. Promote the print statement from 'debug'

Re: [PATCH] nvdimm: Remove minimum size requirement

2017-07-10 Thread Matthew Wilcox
On Mon, Jul 10, 2017 at 11:41:08AM -0700, Dan Williams wrote: > > size = nvdimm_namespace_capacity(ndns); > > - if (size < ND_MIN_NAMESPACE_SIZE) { > > - dev_dbg(&ndns->dev, "%pa, too small must be at least %#x\n", > > - &size, ND_MIN_NAMESPACE_SIZE);

Re: [PATCH] nvdimm: Remove minimum size requirement

2017-07-12 Thread Matthew Wilcox
> You're right. It should return -ENODEV regardless, but the warning > should be for non-zero too small namespaces. OK, try this: 8< ---- From: Matthew Wilcox <mawil...@microsoft.com> Date: Mon, 10 Jul 2017 14:26:59 -0400 Subject: [PATCH] nvdimm: Remove minimum s

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-07-28 Thread Matthew Wilcox
On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wrote: > Dan Williams and Christoph Hellwig have recently expressed doubt about > whether the rw_page() interface made sense for synchronous memory drivers > [1][2]. It's unclear whether this interface has any performance benefit > for these

Re: [PATCH v1 2/6] fs: use on-stack-bio if backing device has BDI_CAP_SYNC capability

2017-08-08 Thread Matthew Wilcox
On Wed, Aug 09, 2017 at 10:51:13AM +0900, Minchan Kim wrote: > On Tue, Aug 08, 2017 at 06:29:04AM -0700, Matthew Wilcox wrote: > > On Tue, Aug 08, 2017 at 05:49:59AM -0700, Matthew Wilcox wrote: > > > + struct bio sbio; > > > + struct bio_vec sbvec; > > >

Re: [PATCH v1 2/6] fs: use on-stack-bio if backing device has BDI_CAP_SYNC capability

2017-08-08 Thread Matthew Wilcox
On Tue, Aug 08, 2017 at 03:50:20PM +0900, Minchan Kim wrote: > There is no need to use dynamic bio allocation for BDI_CAP_SYNC > devices. They can with on-stack-bio without concern about waiting > bio allocation from mempool under heavy memory pressure. This seems ... more complex than necessary?

Re: [PATCH v1 2/6] fs: use on-stack-bio if backing device has BDI_CAP_SYNC capability

2017-08-08 Thread Matthew Wilcox
On Tue, Aug 08, 2017 at 05:49:59AM -0700, Matthew Wilcox wrote: > + struct bio sbio; > + struct bio_vec sbvec; ... this needs to be sbvec[nr_pages], of course. > - bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9), > + if (bdi_cap_synchrono

Re: [PATCH v1 2/6] fs: use on-stack-bio if backing device has BDI_CAP_SYNC capability

2017-08-09 Thread Matthew Wilcox
On Wed, Aug 09, 2017 at 11:41:50AM +0900, Minchan Kim wrote: > On Tue, Aug 08, 2017 at 07:31:22PM -0700, Matthew Wilcox wrote: > > On Wed, Aug 09, 2017 at 10:51:13AM +0900, Minchan Kim wrote: > > > On Tue, Aug 08, 2017 at 06:29:04AM -0700, Matthew Wilcox wrote: > > >

Re: [PATCH v1 5/6] zram: remove zram_rw_page

2017-08-08 Thread Matthew Wilcox
On Tue, Aug 08, 2017 at 05:23:50PM +0900, Sergey Senozhatsky wrote: > Hello Minchan, > > On (08/08/17 17:13), Minchan Kim wrote: > > Hi Sergey, > > > > On Tue, Aug 08, 2017 at 04:02:26PM +0900, Sergey Senozhatsky wrote: > > > On (08/08/17 15:50), Minchan Kim wrote: > > > > With on-stack-bio,

[PATCH] Fix mpage_writepage() for pages with buffers

2017-10-06 Thread Matthew Wilcox
it from within bdev_write_page(). Reported-by: Toshi Kani <toshi.k...@hpe.com> Reported-by: OGAWA Hirofumi <hirof...@mail.parknet.co.jp> Tested-by: Toshi Kani <toshi.k...@hpe.com> Signed-off-by: Matthew Wilcox <mawil...@microsoft.com> Cc: sta...@vger.kernel.org diff --g

Re: [PATCH] Fix mpage_writepage() for pages with buffers

2017-10-11 Thread Matthew Wilcox
On Tue, Oct 10, 2017 at 01:31:44PM -0700, Linus Torvalds wrote: > On Tue, Oct 10, 2017 at 12:44 PM, Andrew Morton > wrote: > > > > This is all pretty mature code (isn't it?). Any idea why this bug > > popped up now? I have no idea why it's suddenly popped up. It

Re: [PATCH 01/18] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags

2017-11-22 Thread Matthew Wilcox
On Wed, Nov 22, 2017 at 08:52:37AM -0800, Dan Williams wrote: > On Wed, Nov 22, 2017 at 4:02 AM, Vlastimil Babka wrote: > > On 11/01/2017 04:36 PM, Jan Kara wrote: > >> From: Dan Williams > >> > >> The mmap(2) syscall suffers from the ABI anti-pattern of

Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT

2017-12-20 Thread Matthew Wilcox
On Mon, Dec 18, 2017 at 01:35:47PM -0700, Ross Zwisler wrote: > What I'm hoping to do with this series is to just provide a sysfs > representation of the HMAT so that applications can know which NUMA nodes to > select with existing utilities like numactl. This series does not currently > alter

Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-07 Thread Matthew Wilcox
On Mon, May 07, 2018 at 11:57:10AM -0700, Dan Williams wrote: > I think adding yet one more mm-zone is the wrong direction. Instead, > what we have been considering is a mechanism to allow a device-dax > instance to be given back to the kernel as a distinct numa node > managed by the VM. It seems

Re: [External] Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-07 Thread Matthew Wilcox
On Tue, May 08, 2018 at 02:59:40AM +, Huaisheng HS1 Ye wrote: > Currently in our mind, an ideal use scenario is that, we put all page caches > to > zone_nvm, without any doubt, page cache is an efficient and common cache > implement, but it has a disadvantage that all dirty data within it

Re: [PATCH v3] dax: Change return type to vm_fault_t

2018-04-27 Thread Matthew Wilcox
On Fri, Apr 27, 2018 at 01:37:02AM -0700, David Rientjes wrote: > On Fri, 27 Apr 2018, Souptick Joarder wrote: > > > Hi Matthew/ Ross, > > > > There are two changes exist in mm/huge_memory.c as part of this > > patch. vmf_insert_pfn_pmd() and vmf_insert_pfn_pud() functions are > > invoked from

Re: [External] Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-10 Thread Matthew Wilcox
On Wed, May 09, 2018 at 04:47:54AM +, Huaisheng HS1 Ye wrote: > > On Tue, May 08, 2018 at 02:59:40AM +, Huaisheng HS1 Ye wrote: > > > Currently in our mind, an ideal use scenario is that, we put all page > > > caches to > > > zone_nvm, without any doubt, page cache is an efficient and

Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-07 Thread Matthew Wilcox
On Mon, May 07, 2018 at 10:50:21PM +0800, Huaisheng Ye wrote: > Traditionally, NVDIMMs are treated by mm(memory management) subsystem as > DEVICE zone, which is a virtual zone and both its start and end of pfn > are equal to 0, mm wouldn’t manage NVDIMM directly as DRAM, kernel uses >

Re: [External] Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-15 Thread Matthew Wilcox
On Tue, May 15, 2018 at 04:07:28PM +, Huaisheng HS1 Ye wrote: > > From: owner-linux...@kvack.org [mailto:owner-linux...@kvack.org] On Behalf > > Of Matthew > > Wilcox > > No. In the current situation, the user knows that either the entire > > page was written b

[PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox
I plucked this patch from my XArray work. It seems self-contained enough that it could go into the DAX tree for merging this cycle. From 8cb56f4ba36af38814ca7b8ba030a66384e59a21 Mon Sep 17 00:00:00 2001 From: Matthew Wilcox <mawil...@microsoft.com> Date: Thu, 29 Mar 2018 22:41:18 -040

[PATCH] dax: dax_insert_mapping_entry always succeeds

2018-05-17 Thread Matthew Wilcox
Another bugfix from the XArray work; please queue for merge. From 4a53cab1968d2a1022f35d00b29519970ef624e9 Mon Sep 17 00:00:00 2001 From: Matthew Wilcox <mawil...@microsoft.com> Date: Thu, 29 Mar 2018 22:47:50 -0400 Subject: [PATCH] dax: dax_insert_mapping_entry always succeeds

Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox
On Thu, May 17, 2018 at 01:29:10PM -0600, Ross Zwisler wrote: > On Thu, May 17, 2018 at 01:24:00PM -0600, Ross Zwisler wrote: > > On Thu, May 17, 2018 at 11:37:11AM -0700, Matthew Wilcox wrote: > > > > > > I plucked this patch from my XArray work. It

Re: [PATCH] dax: Fix use of zero page

2018-05-17 Thread Matthew Wilcox
On Thu, May 17, 2018 at 12:32:07PM -0700, Dan Williams wrote: > On Thu, May 17, 2018 at 11:37 AM, Matthew Wilcox <wi...@infradead.org> wrote: > > > > I plucked this patch from my XArray work. It seems self-contained enough > > that it could go into the DAX

Re: [External] Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

2018-05-15 Thread Matthew Wilcox
On Wed, May 16, 2018 at 02:05:05AM +, Huaisheng HS1 Ye wrote: > > From: Matthew Wilcox [mailto:wi...@infradead.org] > > Sent: Wednesday, May 16, 2018 12:20 AM> > > > > > > Then there's the problem of reconnecting the page cache (which is > > > >

[RFC] ktest: add pmem support

2018-06-15 Thread Matthew Wilcox
I haven't got a filesystem mounted using DAX yet, but I thought I might as well send out the current patch against ktest for enabling persistent memory. It's clearly *wrong*, but I do get a /dev/pmem0 device under Linux once I add all the right config options. So it's a start, and people who

Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT

2017-12-29 Thread Matthew Wilcox
On Wed, Dec 27, 2017 at 10:10:34AM +0100, Brice Goglin wrote: > > Perhaps we can enlist /proc/iomem or a similar enumeration interface > > to tell userspace the NUMA node and whether the kernel thinks it has > > better or worse performance characteristics relative to base > > system-RAM, i.e. new

Re: [PATCH v4 11/18] fs, dax: introduce DEFINE_FSDAX_AOPS

2017-12-26 Thread Matthew Wilcox
On Sat, Dec 23, 2017 at 04:56:59PM -0800, Dan Williams wrote: > +int dax_set_page_dirty(struct page *page) > +{ > + /* > + * Unlike __set_page_dirty_no_writeback, dax does all dirty > + * tracking in the radix in response to mkwrite faults. Please stop saying "in the radix". I

Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT

2017-12-20 Thread Matthew Wilcox
On Wed, Dec 20, 2017 at 12:22:21PM -0800, Dave Hansen wrote: > On 12/20/2017 10:19 AM, Matthew Wilcox wrote: > > I don't know what the right interface is, but my laptop has a set of > > /sys/devices/system/memory/memoryN/ directories. Perhaps this is the > > right place to

Re: [PATCH] ipc/shm.c add ->pagesize function to shm_vm_ops

2018-07-28 Thread Matthew Wilcox
On Fri, Jul 27, 2018 at 03:17:27PM -0600, Jane Chu wrote: > +++ b/include/linux/mm.h > @@ -387,6 +387,13 @@ enum page_entry_size { > * These are the virtual MM functions - opening of an area, closing and > * unmapping it (needed to keep files on disk up-to-date etc), pointer > * to the

Re: [PATCH v2] mm: disallow mapping that conflict for devm_memremap_pages()

2018-07-18 Thread Matthew Wilcox
On Wed, Jul 18, 2018 at 02:27:31PM -0400, Jeff Moyer wrote: > Hi, Dave, > > Dave Jiang writes: > > > When pmem namespaces created are smaller than section size, this can cause > > issue during removal and gpf was observed: > > > > Add code to check whether we have mapping already in the same

Re: [PATCH 1/5] radix tree test suite: fix mapshift build target

2018-07-15 Thread Matthew Wilcox
On Thu, May 03, 2018 at 01:24:26PM -0600, Ross Zwisler wrote: > The following commit > > commit c6ce3e2fe3da ("radix tree test suite: Add config option for map > shift") > > Introduced a phony makefile target called 'mapshift' that ends up > generating the file generated/map-shift.h. This

Re: [PATCH 1/5] radix tree test suite: fix mapshift build target

2018-07-16 Thread Matthew Wilcox
On Mon, Jul 16, 2018 at 10:07:10AM -0600, Ross Zwisler wrote: > Incidentally, in the current linux/master the radix tree test suite again > fails to build: > > $ make > sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < > ../../../lib/radix-tree.c > radix-tree.c > sed -e

Re: [PATCH 1/5] radix tree test suite: fix mapshift build target

2018-07-16 Thread Matthew Wilcox
On Mon, Jul 16, 2018 at 03:08:20PM -0600, Ross Zwisler wrote: > On Mon, Jul 16, 2018 at 12:52:49PM -0700, Matthew Wilcox wrote: > > On Mon, Jul 16, 2018 at 10:07:10AM -0600, Ross Zwisler wrote: > <> > > OK ... what version of make are you using? Because this works fin

[PATCH] dax: Fix use of zero page

2018-08-30 Thread Matthew Wilcox
This patch got caught in the crossfire and didn't make it into 4.18 as planned. Can it make 4.19 please? >8 Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it works on MIPS and s390. Signed-off-by: Matthew Wilcox Reviewed-by: Ross Zwisler --- fs/dax.c |

Re: [PATCH 13/13] libnvdimm, namespace: Publish page structure init state / control

2018-07-05 Thread Matthew Wilcox
On Thu, Jul 05, 2018 at 07:46:05AM -0700, Dan Williams wrote: > On Thu, Jul 5, 2018 at 1:29 AM, Johannes Thumshirn wrote: > > On Wed, Jul 04, 2018 at 11:50:13PM -0700, Dan Williams wrote: > >> +static ssize_t memmap_state_store(struct device *dev, > >> + struct device_attribute *attr,

Re: [PATCH 13/13] libnvdimm, namespace: Publish page structure init state / control

2018-07-05 Thread Matthew Wilcox
On Thu, Jul 05, 2018 at 01:24:55PM -0700, Andrew Morton wrote: > On Thu, 5 Jul 2018 16:49:41 +0200 Johannes Thumshirn > wrote: > > > On Thu, Jul 05, 2018 at 07:46:05AM -0700, Dan Williams wrote: > > > ...but that also allows 'echo "syncAndThenSomeGarbage" > > > > /sys/.../memmap_state' to
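
The quoted concern is that a write such as "syncAndThenSomeGarbage" would be accepted. The snippet is truncated, so the following is only a sketch under the assumption that the comparison being discussed is a length-limited prefix match; the helper names prefix_match() and exact_match_nl() are hypothetical and only illustrate the difference from an exact comparison that tolerates the newline appended by echo (similar in spirit to the kernel's sysfs_streq()).

```c
#include <stdbool.h>
#include <string.h>

/* Prefix match: accepts any input that merely starts with 'keyword'. */
static bool prefix_match(const char *buf, const char *keyword)
{
	return strncmp(buf, keyword, strlen(keyword)) == 0;
}

/*
 * Hypothetical exact match: the input must be 'keyword', optionally
 * followed by a single trailing newline and nothing else.
 */
static bool exact_match_nl(const char *buf, const char *keyword)
{
	size_t n = strlen(keyword);

	if (strncmp(buf, keyword, n) != 0)
		return false;
	/* buf has at least n characters here, so buf[n] is readable. */
	return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}
```

With these helpers, "syncAndThenSomeGarbage\n" passes prefix_match() against "sync" but is rejected by exact_match_nl(), while "sync\n" is accepted by both.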

Re: [PATCH] dax: Change return type to vm_fault_t

2018-04-16 Thread Matthew Wilcox
On Mon, Apr 16, 2018 at 11:00:26AM -0700, Dan Williams wrote: > On Mon, Apr 16, 2018 at 10:47 AM, Matthew Wilcox <wi...@infradead.org> wrote: > > On Mon, Apr 16, 2018 at 09:14:48AM -0700, Dan Williams wrote: > >> > - rc = vm_insert_mixed(

Re: [PATCH] dax: Change return type to vm_fault_t

2018-04-16 Thread Matthew Wilcox
On Mon, Apr 16, 2018 at 09:14:48AM -0700, Dan Williams wrote: > > - rc = vm_insert_mixed(vmf->vma, vmf->address, pfn); > > - > > - if (rc == -ENOMEM) > > - return VM_FAULT_OOM; > > - if (rc < 0 && rc != -EBUSY) > > - return VM_FAULT_SIGBUS; > > - > > -
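
The removed lines quoted above translate an errno-style return from vm_insert_mixed() into VM_FAULT_* codes by hand. A minimal userspace sketch of that translation follows. The numeric values, the VM_FAULT_NOPAGE fall-through (the quoted hunk is cut off before its final return) and the helper name errno_to_vm_fault() are assumptions for illustration, not the kernel's definitions.

```c
#include <errno.h>

/* Illustrative stand-ins; the real definitions live in the kernel headers. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM    0x0001u
#define VM_FAULT_SIGBUS 0x0002u
#define VM_FAULT_NOPAGE 0x0100u

/*
 * Hypothetical helper: map an errno-style result (0 or a negative errno)
 * onto a fault code, mirroring the open-coded checks in the quoted hunk.
 * Treating success and -EBUSY as VM_FAULT_NOPAGE is an assumption here.
 */
static vm_fault_t errno_to_vm_fault(int rc)
{
	if (rc == -ENOMEM)
		return VM_FAULT_OOM;
	if (rc < 0 && rc != -EBUSY)
		return VM_FAULT_SIGBUS;
	return VM_FAULT_NOPAGE;
}
```

Having the insertion helper return a vm_fault_t directly, which appears to be what the thread's subject proposes, would let each caller drop this kind of repeated mapping.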

Re: [PATCH v2] dax: Change return type to vm_fault_t

2018-04-16 Thread Matthew Wilcox
On Mon, Apr 16, 2018 at 05:02:00PM -0600, Ross Zwisler wrote: > I'm not sure what you mean by "vm_fault_t will become a distinct type"? Do > you mean you'll make it into an enum, i.e.: > > enum vm_fault_t { > VM_FAULT_OOM= 0x0001, > VM_FAULT_SIGBUS = 0x0002, >

Re: [PATCH] dax: Change return type to vm_fault_t

2018-04-16 Thread Matthew Wilcox
On Mon, Apr 16, 2018 at 08:14:22PM -0400, Theodore Y. Ts'o wrote: > On Mon, Apr 16, 2018 at 09:14:48AM -0700, Dan Williams wrote: > > Ugh, so this change to vmf_insert_mixed() went upstream without fixing > > the users? This changelog is now misleading as it does not mention > > that is now an

Re: fsdax memory error handling regression

2018-11-10 Thread Matthew Wilcox
On Wed, Nov 07, 2018 at 06:01:19AM +, Williams, Dan J wrote: > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > > > Hi Willy, > > > > > > I'm seeing the following warning with v

Re: fsdax memory error handling regression

2018-11-13 Thread Matthew Wilcox
On Sat, Nov 10, 2018 at 09:08:10AM -0800, Dan Williams wrote: > On Sat, Nov 10, 2018 at 12:29 AM Matthew Wilcox wrote: > > On Wed, Nov 07, 2018 at 06:01:19AM +, Williams, Dan J wrote: > > > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > > > On T

Re: fsdax memory error handling regression

2018-11-06 Thread Matthew Wilcox
On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > Hi Willy, > > I'm seeing the following warning with v4.20-rc1 and the "dax.sh" test > from the ndctl repository: I'll try to run this myself later today. > I tried to get this test going on -next before the merge window, but >

Re: [PATCH] dax: Fix deadlock in dax_lock_mapping_entry()

2018-10-04 Thread Matthew Wilcox
On Thu, Oct 04, 2018 at 06:57:52PM -0700, Dan Williams wrote: > On Thu, Oct 4, 2018 at 9:27 AM Jan Kara wrote: > > > > On Thu 27-09-18 11:22:22, Dan Williams wrote: > > > On Thu, Sep 27, 2018 at 6:41 AM Jan Kara wrote: > > > > > > > >

Re: [PATCH] dax: Fix deadlock in dax_lock_mapping_entry()

2018-09-27 Thread Matthew Wilcox
On Thu, Sep 27, 2018 at 01:23:32PM +0200, Jan Kara wrote: > When dax_lock_mapping_entry() has to sleep to obtain entry lock, it will > fail to unlock mapping->i_pages spinlock and thus immediately deadlock > against itself when retrying to grab the entry lock again. Fix the > problem by unlocking

Re: FAILED: patch "[PATCH] dax: Don't access a freed inode" failed to apply to 4.19-stable tree

2018-12-11 Thread Matthew Wilcox
> id to . I have only compile-tested this backport. Dan, do you want to run it through some tests before Greg applies it? From e01d37913b5577acaa2e8e35200081eae2795087 Mon Sep 17 00:00:00 2001 From: Matthew Wilcox Date: Tue, 11 Dec 2018 10:16:45 -0500 Subject: [PATCH 2/2] dax: Don't ac

Re: [PATCH] dax: Fix Xarray conversion of dax_unlock_mapping_entry()

2018-11-30 Thread Matthew Wilcox
On Thu, Nov 29, 2018 at 04:13:46PM -0800, Dan Williams wrote: > Internal to dax_unlock_mapping_entry(), dax_unlock_entry() is used to > store a replacement entry in the Xarray at the given xas-index with the > DAX_LOCKED bit clear. When called, dax_unlock_entry() expects the unlocked > value of

Re: [PATCH] dax: Fix Xarray conversion of dax_unlock_mapping_entry()

2018-11-30 Thread Matthew Wilcox
On Fri, Nov 30, 2018 at 07:54:49AM -0800, Dan Williams wrote: > Looks good to me, although can we make that cookie an actual type? I > think it's mostly ok to pass around (void *) for 'entry' inside of > fs/dax.c, but once an entry leaves that file I'd like it to have an > explicit type to catch

Re: [PATCH] dax: Fix Xarray conversion of dax_unlock_mapping_entry()

2018-11-30 Thread Matthew Wilcox
On Fri, Nov 30, 2018 at 09:01:07AM -0800, Dan Williams wrote: > On Fri, Nov 30, 2018 at 8:33 AM Dan Williams wrote: > > > > On Fri, Nov 30, 2018 at 8:24 AM Matthew Wilcox wrote: > > > > > > On Fri, Nov 30, 2018 at 07:54:49AM -0800, Dan Williams wrote: > >

Re: [RFC v3 11/19] kunit: add Python libraries for handing KUnit config and kernel

2018-12-06 Thread Matthew Wilcox
On Thu, Dec 06, 2018 at 12:32:47PM +, Kieran Bingham wrote: > On 04/12/2018 20:47, Luis Chamberlain wrote: > > On Mon, Dec 03, 2018 at 03:48:15PM -0800, Brendan Higgins wrote: > >> On Thu, Nov 29, 2018 at 5:54 AM Kieran Bingham > >> wrote: > >>> > >>> Hi Brendan, > >>> > >>> Thanks again for

Re: [GIT PULL] dax fixes for 4.20-rc6

2018-12-09 Thread Matthew Wilcox
On Sun, Dec 09, 2018 at 10:26:54AM -0800, Dan Williams wrote: > [ add Willy and Jan ] > > On Sun, Dec 9, 2018 at 10:02 AM Linus Torvalds > wrote: > > > > On Sat, Dec 8, 2018 at 10:26 PM Williams, Dan J > > wrote: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm > > >

Re: [PATCH] dax: Fix Xarray conversion of dax_unlock_mapping_entry()

2018-12-04 Thread Matthew Wilcox
oad the responsibility of conveying > that this is not the traditional page lock to the fact that a > dax_entry is returned and passed back to the unlock routine. From: Matthew Wilcox Date: Fri, 30 Nov 2018 11:05:06 -0500 Subject: [PATCH] dax: Change lock/unlock API Return the unlock

Re: dax_lock_mapping_entry was never safe

2018-11-27 Thread Matthew Wilcox
On Mon, Nov 26, 2018 at 12:36:26PM -0800, Dan Williams wrote: > On Mon, Nov 26, 2018 at 9:11 AM Jan Kara wrote: > > The code looks good. Maybe can we call this wait_entry_unlocked() to stress > > that entry is not really usable after this function returns? And comment > > before the function that

[PATCH 1/2] dax: Check page->mapping isn't NULL

2018-11-27 Thread Matthew Wilcox
.org Reported-by: Jan Kara Signed-off-by: Matthew Wilcox --- fs/dax.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dax.c b/fs/dax.c index 9bcce89ea18e..e69fc231833b 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -365,7 +365,7 @@ bool dax_lock_mapping_entry(stru

[PATCH 2/2] dax: Don't access a freed inode

2018-11-27 Thread Matthew Wilcox
: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()") Cc: sta...@vger.kernel.org Signed-off-by: Matthew Wilcox --- fs/dax.c | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index e69fc231833b..cf1805645

[PATCH 0/2] Two DAX fixes for 4.20

2018-11-27 Thread Matthew Wilcox
These both fix race conditions in dax_lock_mapping_entry(). I've tagged them both for 4.19 backport, which will fail and I'll do the equivalent patch for it. Dan, do you want to take these through your tree? Matthew Wilcox (2): dax: Check page->mapping isn't NULL dax: Don't access a fr

Re: FAILED: patch "[PATCH] dax: Avoid losing wakeup in dax_lock_mapping_entry" failed to apply to 4.19-stable tree

2018-11-26 Thread Matthew Wilcox
On Mon, Nov 26, 2018 at 04:22:23AM -0800, Matthew Wilcox wrote: > On Mon, Nov 26, 2018 at 11:40:20AM +0100, gre...@linuxfoundation.org wrote: > > > > The patch below does not apply to the 4.19-stable tree. > > If someone wants it applied there, or to any other stable

dax_lock_mapping_entry was never safe

2018-11-26 Thread Matthew Wilcox
I noticed this path while I was doing the 4.19 backport of dax: Avoid losing wakeup in dax_lock_mapping_entry xa_unlock_irq(&mapping->i_pages); revalidate = wait_fn(); finish_wait(wq, &ewait.wait); xa_lock_irq(&mapping->i_pages); It's not safe to call

Re: [PATCH 2/2] dax: Don't access a freed inode

2018-11-28 Thread Matthew Wilcox
On Wed, Nov 28, 2018 at 09:08:40AM -0800, Dan Williams wrote: > > Can we add a comment here like: > > > > /* > > * Entry lock waits are exclusive. Wake up the next waiter since we > > * aren't sure we will acquire the entry lock and thus wake the > > * next

Re: FAILED: patch "[PATCH] dax: Avoid losing wakeup in dax_lock_mapping_entry" failed to apply to 4.19-stable tree

2018-11-26 Thread Matthew Wilcox
. > > Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()") > Cc: sta...@vger.kernel.org > Signed-off-by: Matthew Wilcox > > diff --git a/fs/dax.c b/fs/dax.c > index cf2394e2bf4b..9bcce89ea18e 100644 > --- a/fs/da

Re: [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-13 Thread Matthew Wilcox
On Mon, Jan 14, 2019 at 10:29:02AM +1100, Dave Chinner wrote: > Until you have images (and hence host page cache) shared between > multiple guests. People will want to do this, because it means they > only need a single set of pages in host memory for executable > binaries rather than a set of

Re: [RFC PATCH 2/5] lib/memregion: Uplevel the pmem "region" ida to a global allocator

2019-04-04 Thread Matthew Wilcox
On Thu, Apr 04, 2019 at 12:08:38PM -0700, Dan Williams wrote: > +++ b/lib/Kconfig > @@ -318,6 +318,12 @@ config DECOMPRESS_LZ4 > config GENERIC_ALLOCATOR > bool > > +# > +# Generic IDA for memory regions > +# Leaky abstraction -- nobody needs know that it's implemented as an IDA.

Re: find_get_entries_tag regression bisected

2019-02-26 Thread Matthew Wilcox
On Mon, Feb 25, 2019 at 09:03:00PM -0800, Dan Williams wrote: > On Sat, Feb 16, 2019 at 1:11 PM Matthew Wilcox wrote: > > > > On Sat, Feb 16, 2019 at 09:29:48AM -0800, Matthew Wilcox wrote: > > > On Sat, Feb 16, 2019 at 07:35:11AM -0800, Matthew Wilcox wrote: > > >

Re: find_get_entries_tag regression bisected

2019-02-26 Thread Matthew Wilcox
On Tue, Feb 26, 2019 at 04:08:21AM -0800, Matthew Wilcox wrote: > On Mon, Feb 25, 2019 at 09:03:00PM -0800, Dan Williams wrote: > > On Sat, Feb 16, 2019 at 1:11 PM Matthew Wilcox wrote: > > > > > > On Sat, Feb 16, 2019 at 09:29:48AM -0800, Matthew Wilcox wrote: >

[PATCH] dax: Flush partial PMDs correctly

2019-02-28 Thread Matthew Wilcox
range. Fixes: c1901cd33cf4 "page cache: Convert find_get_entries_tag to XArray" Reported-by: Piotr Balcer Tested-by: Dan Williams Signed-off-by: Matthew Wilcox --- fs/dax.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/d

[PATCH] dax: Flush partial PMDs correctly

2019-03-01 Thread Matthew Wilcox
range. Fixes: c1901cd33cf4 "page cache: Convert find_get_entries_tag to XArray" Reported-by: Piotr Balcer Tested-by: Dan Williams Signed-off-by: Matthew Wilcox --- fs/dax.c | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/fs/dax.c b/fs/d

Re: [PATCH] dax: Flush partial PMDs correctly

2019-03-01 Thread Matthew Wilcox
On Fri, Mar 01, 2019 at 02:49:54PM +0100, Jan Kara wrote: > > pfn = dax_to_pfn(entry); > > - size = PAGE_SIZE << dax_entry_order(entry); > > + count = 1UL << dax_entry_order(entry); > > + index = xas->xa_index &~ (count - 1); > > Hum, why do you compute 'index' here when you actually
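
The two added lines in the quoted hunk derive a page count from the entry's order and round the index down to the start of that entry. A small userspace sketch of the arithmetic is below, assuming a hypothetical helper name entry_start_index() and, for the example values, an order of 9 for a PMD-sized entry (512 pages of 4 KiB).

```c
/*
 * Hypothetical helper: round 'index' down to the first page covered by
 * an entry of the given order. 'count' is the number of pages in the
 * entry and is always a power of two, so (count - 1) is a low-bit mask.
 */
static unsigned long entry_start_index(unsigned long index, unsigned int order)
{
	unsigned long count = 1UL << order;

	return index & ~(count - 1);
}
```

With order 9, every index from 512 through 1023 maps to 512, so an operation that is handed an index partway into the entry can still start from the entry's first page, which seems to be the point of the fix named in the subject line.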

Re: Hang / zombie process from Xarray page-fault conversion (bisected)

2019-03-11 Thread Matthew Wilcox
On Thu, Mar 07, 2019 at 10:16:17PM -0800, Dan Williams wrote: > Hi Willy, > > We're seeing a case where RocksDB hangs and becomes defunct when > trying to kill the process. v4.19 succeeds and v4.20 fails. Robert was > able to bisect this to commit b15cd800682f "dax: Convert page fault > handlers

Re: find_get_entries_tag regression bisected

2019-02-16 Thread Matthew Wilcox
On Fri, Feb 15, 2019 at 06:08:15PM -0800, Dan Williams wrote: > Hi Willy, > > Piotr reports the following crash can be triggered on latest mainline: > > EXT4-fs (pmem5): recovery complete > EXT4-fs (pmem5): mounted filesystem with ordered data mode. Opts: dax > [ cut here

Re: find_get_entries_tag regression bisected

2019-02-16 Thread Matthew Wilcox
On Sat, Feb 16, 2019 at 09:29:48AM -0800, Matthew Wilcox wrote: > On Sat, Feb 16, 2019 at 07:35:11AM -0800, Matthew Wilcox wrote: > > Another way to fix this would be to mask the address in dax_entry_mkclean(), > > but I think this is cleaner. > > That's clearly rubbish, dax_

Re: find_get_entries_tag regression bisected

2019-02-16 Thread Matthew Wilcox
On Sat, Feb 16, 2019 at 07:35:11AM -0800, Matthew Wilcox wrote: > Another way to fix this would be to mask the address in dax_entry_mkclean(), > but I think this is cleaner. That's clearly rubbish, dax_entry_mkclean() can't possibly mask the address. It might be mis-aligned in another p

Re: [Lsf-pc] [LSF/MM TOPIC] The end of the DAX experiment

2019-02-14 Thread Matthew Wilcox
On Thu, Feb 14, 2019 at 11:31:24AM -0800, Dan Williams wrote: > On Thu, Feb 14, 2019 at 11:10 AM Jerome Glisse wrote: > > I am just again working on my struct page mapping patchset as well as > > the generic page write protection that sits on top. I hope to be able > > to post the v2 in couple

Re: [PATCH v3 07/10] lib/memregion: Uplevel the pmem "region" ida to a global allocator

2019-06-07 Thread Matthew Wilcox
On Fri, Jun 07, 2019 at 12:27:50PM -0700, Dan Williams wrote: > diff --git a/lib/memregion.c b/lib/memregion.c > new file mode 100644 > index ..f6c6a94c7921 > --- /dev/null > +++ b/lib/memregion.c > @@ -0,0 +1,15 @@ > +#include > + > +static DEFINE_IDA(region_ids); > + > +int

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-13 Thread Matthew Wilcox
On Thu, Jun 13, 2019 at 02:36:49PM +1000, Dave Chinner wrote: > On Wed, Jun 12, 2019 at 08:23:20PM -0700, Matthew Wilcox wrote: > > On Thu, Jun 13, 2019 at 10:25:55AM +1000, Dave Chinner wrote: > > > On Wed, Jun 12, 2019 at 05:37:53AM -0700, Matthew Wilcox wrote: > > >

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-12 Thread Matthew Wilcox
On Sat, Jun 08, 2019 at 10:10:36AM +1000, Dave Chinner wrote: > On Fri, Jun 07, 2019 at 11:25:35AM -0700, Ira Weiny wrote: > > Are you suggesting that we have something like this from user space? > > > > fcntl(fd, F_SETLEASE, F_LAYOUT | F_UNBREAKABLE); > > Rather than "unbreakable", perhaps

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-13 Thread Matthew Wilcox
On Thu, Jun 13, 2019 at 10:25:55AM +1000, Dave Chinner wrote: > e.g. Process A has an exclusive layout lease on file F. It does an > IO to file F. The filesystem IO path checks that Process A owns the > lease on the file and so skips straight through layout breaking > because it owns the lease and

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-13 Thread Matthew Wilcox
On Fri, Jun 14, 2019 at 12:09:21PM +1000, Dave Chinner wrote: > On Thu, Jun 13, 2019 at 08:45:30PM -0300, Jason Gunthorpe wrote: > > On Thu, Jun 13, 2019 at 02:13:21PM -0700, Ira Weiny wrote: > > > On Thu, Jun 13, 2019 at 08:27:55AM -0700, Matthew Wilcox wrote: > > > >

Re: [PATCH] mm/sparsemem: Cleanup 'section number' data types

2019-06-21 Thread Matthew Wilcox
On Thu, Jun 20, 2019 at 05:06:46PM -0700, Dan Williams wrote: > David points out that there is a mixture of 'int' and 'unsigned long' > usage for section number data types. Update the memory hotplug path to > use 'unsigned long' consistently for section numbers. ... because we're seriously

Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal

2019-06-12 Thread Matthew Wilcox
On Thu, Jun 13, 2019 at 10:25:55AM +1000, Dave Chinner wrote: > On Wed, Jun 12, 2019 at 05:37:53AM -0700, Matthew Wilcox wrote: > > That's rather different from the normal meaning of 'exclusive' in the > > context of locks, which is "only one user can have access t

Re: [PATCH, RFC 2/2] Implement sharing/unsharing of PMDs for FS/DAX

2019-05-09 Thread Matthew Wilcox
On Thu, May 09, 2019 at 09:05:33AM -0700, Larry Bassel wrote: > This is based on (but somewhat different from) what hugetlbfs > does to share/unshare page tables. Wow, that worked out far more cleanly than I was expecting to see. > @@ -4763,6 +4763,19 @@ void

Re: Hang / zombie process from Xarray page-fault conversion (bisected)

2019-05-04 Thread Matthew Wilcox
> > On Mon, Mar 11, 2019 at 8:10 AM Matthew Wilcox > > > > wrote: > > > > > > > > > > On Thu, Mar 07, 2019 at 10:16:17PM -0800, Dan Williams wrote: > > > > > > Hi Willy, > > > > > > > > > > > >

Re: [PATCH] dax: Fix missed PMD wakeups

2019-07-04 Thread Matthew Wilcox
On Thu, Jul 04, 2019 at 06:54:50PM +0200, Jan Kara wrote: > On Wed 03-07-19 20:27:28, Matthew Wilcox wrote: > > So I think we're good for all current users. > > Agreed but it is an ugly trap. As I already said, I'd rather pay the > unnecessary cost of waiting for pte entr

Re: [PATCH] dax: Fix missed PMD wakeups

2019-07-04 Thread Matthew Wilcox
On Thu, Jul 04, 2019 at 04:00:00PM +0300, Boaz Harrosh wrote: > On 04/07/2019 06:27, Matthew Wilcox wrote: > > On Wed, Jul 03, 2019 at 02:28:41PM -0700, Dan Williams wrote: > >>> +#ifdef CONFIG_XARRAY_MULTI > >>> + unsigned int sibs = xas->xa_sib
