Re: [PATCH 04/13] dax: Create local variable for VMA in dax_iomap_pte_fault()

2017-08-18 Thread Ross Zwisler
On Thu, Aug 17, 2017 at 06:08:06PM +0200, Jan Kara wrote: > There are already two users and more are coming. > > Signed-off-by: Jan Kara <j...@suse.cz> Sure. Reviewed-by: Ross Zwisler <ross.zwis...@linux.intel.com> ___ Linux-nvdi

Re: [PATCH 03/13] dax: Factor out getting of pfn out of iomap

2017-08-18 Thread Ross Zwisler
On Thu, Aug 17, 2017 at 06:08:05PM +0200, Jan Kara wrote: > Factor out code to get pfn out of iomap that is shared between PTE and > PMD fault path. > > Signed-off-by: Jan Kara <j...@suse.cz> Yep, this looks correct to me. Reviewed-by: Ross Zwisler <ross.zw

Re: [PATCH] dax: explain how read(2)/write(2) addresses are validated

2017-08-17 Thread Ross Zwisler
On Thu, Aug 17, 2017 at 10:53:32AM +0200, Jan Kara wrote: > On Wed 16-08-17 11:36:15, Ross Zwisler wrote: > > Add a comment explaining how the user addresses provided to read(2) and > > write(2) are validated in the DAX I/O path. We call dax_copy_from_iter() > > or copy_to_it

[PATCH] dax: explain how read(2)/write(2) addresses are validated

2017-08-16 Thread Ross Zwisler
to arbitrary kernel addresses with this path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- Adding a comment instead of adding redundant access_ok() calls in the DAX code. If this is the wrong path to take, please let me know. fs/dax.c | 5 + 1 file changed, 5 insertions(+)

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-04 Thread Ross Zwisler
On Fri, Aug 04, 2017 at 11:01:08AM -0700, Dan Williams wrote: > [ adding Dave who is working on a blk-mq + dma offload version of the > pmem driver ] > > On Fri, Aug 4, 2017 at 1:17 AM, Minchan Kim wrote: > > On Fri, Aug 04, 2017 at 12:54:41PM +0900, Minchan Kim wrote: > [..]

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-03 Thread Ross Zwisler
On Thu, Aug 03, 2017 at 09:13:15AM +0900, Minchan Kim wrote: > Hi Ross, > > On Wed, Aug 02, 2017 at 04:13:59PM -0600, Ross Zwisler wrote: > > On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote: > > > On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wro

Re: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-03 Thread Ross Zwisler
dir = DMA_FROM_DEVICE; > + else > + dir = DMA_TO_DEVICE; > + dma_unmap_sg(dev, cmd->sg, cmd->sg_nents, dir); > + dmaengine_unmap_put(unmap); > +err: > + blk_mq_end_request(cmd->rq, -ENXIO); Should this be: blk_mq_end_request(cmd->rq, rc); ? Ot

Re: [PATCH v2 4/5] libnvdimm: Adding blk-mq support to the pmem driver

2017-08-03 Thread Ross Zwisler
> module parameter queue_mode=1. > > Signed-off-by: Dave Jiang <dave.ji...@intel.com> One small nit with error handling. With that addressed you can add: Reviewed-by: Ross Zwisler <ross.zwis...@linux.intel.com> > @@ -303,17 +369,47 @@ static int pmem_attach_disk(struct de

Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-08-02 Thread Ross Zwisler
On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote: > On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wrote: > > Dan Williams and Christoph Hellwig have recently expressed doubt about > > whether the rw_page() interface made sense for synchronous memory drivers

Re: [PATCH] nvdimm: avoid bogus -Wmaybe-uninitialized warning

2017-08-01 Thread Ross Zwisler
On Tue, Aug 01, 2017 at 02:45:34PM -0700, Andrew Morton wrote: > On Tue, 1 Aug 2017 13:48:48 +0200 Arnd Bergmann wrote: > > > Removing the btt_rw_page/pmem_rw_page functions had a surprising > > side-effect of introducing a false-positive warning in another > > function, due to

Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq

2017-08-01 Thread Ross Zwisler
On Mon, Jul 31, 2017 at 03:24:46PM -0700, Dave Jiang wrote: > Adding DMA support for pmem blk reads. This provides significant CPU > reduction with large memory reads with good performance. DMAs are triggered > with test against bio_multiple_segment(), so the small I/Os (4k or less?) > are still

Re: [PATCH 4/5] libnvdimm: Adding blk-mq support to the pmem driver

2017-08-01 Thread Ross Zwisler
On Mon, Jul 31, 2017 at 03:24:40PM -0700, Dave Jiang wrote: > Adding blk-mq support to the pmem driver in addition to the direct bio > support. This allows for hardware offloading via DMA engines. By default > the bio method will be enabled. The blk-mq support can be turned on via > module

Re: [PATCH] nvdimm: avoid bogus -Wmaybe-uninitialized warning

2017-08-01 Thread Ross Zwisler
riable. In those cases that gcc successfully > tracks the state of the variable, this will have no effect. > > Fixes: 503a5e89b1de ("drivers/nvdimm/btt.c: remove btt_rw_page()") > Fixes: 58100d6e735e ("drivers/nvdimm/pmem.c: remove pmem_rw_page()") > Signed-of

Re: QEMU NVDIMM as type 7 in e820 table

2017-07-31 Thread Ross Zwisler
On Sat, Jul 29, 2017 at 06:49:33PM +0800, Haozhong Zhang wrote: > On 07/28/17 13:45 -0600, Ross Zwisler wrote: > > On Fri, Jul 28, 2017 at 11:11:10AM -0700, Dan Williams wrote: > > > On Fri, Jul 28, 2017 at 11:04 AM, Ross Zwisler > > > <ross.zwis...@linux.intel.com>

Re: QEMU NVDIMM as type 7 in e820 table

2017-07-28 Thread Ross Zwisler
On Fri, Jul 28, 2017 at 11:11:10AM -0700, Dan Williams wrote: > On Fri, Jul 28, 2017 at 11:04 AM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > I've been using the virtualized NVDIMM support in QEMU for testing, and I > > noticed that the physical addresses use

QEMU NVDIMM as type 7 in e820 table

2017-07-28 Thread Ross Zwisler
I've been using the virtualized NVDIMM support in QEMU for testing, and I noticed that the physical addresses used by the virtual NVDIMMs aren't present in the guest's e820 table. Here is the e820 table on my QEMU instance where I have one 32 GiB virtual NVDIMM: [0.00] e820:

[PATCH 3/3] brd: remove brd_rw_page()

2017-07-28 Thread Ross Zwisler
The rw_page() interface doesn't provide a clear performance benefit for BRD and has had a nonzero maintenance burden, so remove it. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Suggested-by: Dan Williams <dan.j.willi...@intel.com> Suggested-by: Christoph Hellwig <h..

[PATCH 1/3] btt: remove btt_rw_page()

2017-07-28 Thread Ross Zwisler
The rw_page() interface doesn't provide a clear performance benefit for the BTT and has had a nonzero maintenance burden, so remove it. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Suggested-by: Dan Williams <dan.j.willi...@intel.com> Suggested-by: Christop

[PATCH 0/3] remove rw_page() from brd, pmem and btt

2017-07-28 Thread Ross Zwisler
://lists.01.org/pipermail/linux-nvdimm/2017-July/011389.html [2] https://www.mail-archive.com/linux-block@vger.kernel.org/msg11170.html Ross Zwisler (3): btt: remove btt_rw_page() pmem: remove pmem_rw_page() brd: remove brd_rw_page() drivers/block/brd.c | 10 -- drivers/nvdimm/btt.c

Re: [PATCH 6/7] dax: Implement dax_pfn_mkwrite()

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 04:53:22PM -0600, Ross Zwisler wrote: > On Thu, Jul 27, 2017 at 03:12:44PM +0200, Jan Kara wrote: > > Implement a function that marks existing page table entry (PTE or PMD) > > as writeable and takes care of marking it dirty in the radix tree. This > >

Re: [PATCH 7/7] ext4: Support for synchronous DAX faults

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 03:12:45PM +0200, Jan Kara wrote: > We return IOMAP_F_NEEDDSYNC flag from ext4_iomap_begin() for a > synchronous write fault when inode has some uncommitted metadata > changes. In the fault handler ext4_dax_fault() we then detect this case, > call vfs_fsync_range() to make

Re: [PATCH 6/7] dax: Implement dax_pfn_mkwrite()

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 03:12:44PM +0200, Jan Kara wrote: > Implement a function that marks existing page table entry (PTE or PMD) > as writeable and takes care of marking it dirty in the radix tree. This > function will be used to finish synchronous page fault where the page > table entry is

Re: [PATCH 5/7] dax, iomap: Add support for synchronous faults

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 03:12:43PM +0200, Jan Kara wrote: > Add a flag to iomap interface informing the caller that inode needs > fdatasync(2) for returned extent to become persistent and use it in DAX > fault code so that we map such extents only read only. We propagate the > information that the

Re: [PATCH 3/7] dax: Simplify arguments of dax_insert_mapping()

2017-07-27 Thread Ross Zwisler
the same arguments as dax_pmd_insert_mapping(). Yay, this is much better! > Signed-off-by: Jan Kara <j...@suse.cz> Reviewed-by: Ross Zwisler <ross.zwis...@linux.intel.com> ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org http

Re: [PATCH 2/7] dax: Add sync argument to dax_iomap_fault()

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 03:12:40PM +0200, Jan Kara wrote: > Add 'sync' argument to dax_iomap_fault(). It will be used to communicate > the fact that synchronous fault is requested. I don't actually think you need to pass this 'sync' parameter around. I think you can completely rely on

Re: [PATCH 1/7] mm: Remove VM_FAULT_HWPOISON_LARGE_MASK

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 03:12:39PM +0200, Jan Kara wrote: > It is unused. > > Signed-off-by: Jan Kara <j...@suse.cz> Reviewed-by: Ross Zwisler <ross.zwis...@linux.intel.com>

Re: [RFC PATCH 0/7] dax, ext4: Synchronous page faults

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 10:09:07AM -0400, Jeff Moyer wrote: > Jan Kara writes: > > Hi, Jan, > > Thanks for looking into this! > > > There are couple of open questions with this implementation: > > > > 1) Is it worth the hassle? > > 2) Is S_SYNC good flag to use or should we use a

Re: FIle copy to FAT FS on NVDIMM hits BUG_ON at fs/buffer.c:3305!

2017-07-27 Thread Ross Zwisler
On Thu, Jul 27, 2017 at 04:12:18PM +, Kani, Toshimitsu wrote: > On Wed, 2017-07-26 at 10:11 -0700, Matthew Wilcox wrote: > > On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote: > > > The locking of this path seems to be broken. The guy familiar to > > > bdev_write_page() path will

Re: FIle copy to FAT FS on NVDIMM hits BUG_ON at fs/buffer.c:3305!

2017-07-26 Thread Ross Zwisler
On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote: > "Kani, Toshimitsu" writes: > > > kernel BUG at fs/buffer.c:3305! > > invalid opcode: [#1] SMP > > : > > Workqueue: writeback wb_workfn (flush-259:0) > > task: 8d02595b8000 task.stack:

Re: FIle copy to FAT FS on NVDIMM hits BUG_ON at fs/buffer.c:3305!

2017-07-25 Thread Ross Zwisler
On Tue, Jul 25, 2017 at 09:37:38PM +, Kani, Toshimitsu wrote: > Hi, > > Copying files to vfat FS on an NVDIMM device hits > BUG_ON(!PageLocked(page)) in try_to_free_buffers(). It happens on > 4.13-rc1, and happens on older kernels as well. > > A simple reproducer is shown below. It is 100%

[PATCH v5 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-24 Thread Ross Zwisler
call path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Reviewed-by: Jan Kara <j...@suse.cz> --- include/linux/mm.h | 2 ++ mm/memory.c| 50 +++--- 2 files changed, 45 insertions(+), 7 deletions(-) diff --git a/include

[PATCH v5 3/5] dax: use common 4k zero page for dax mmap reads

2017-07-24 Thread Ross Zwisler
kwrite' flag. If 'mkwrite' is set insert_pfn() will do the work that was previously done by wp_page_reuse() as part of the dax_pfn_mkwrite() call path. *** Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Reviewed-by: Jan Kara <j...@suse.cz> --- Documentation/filesystems/da

[PATCH v5 2/5] dax: relocate some dax functions

2017-07-24 Thread Ross Zwisler
-by: Ross Zwisler <ross.zwis...@linux.intel.com> Reviewed-by: Jan Kara <j...@suse.cz> --- fs/dax.c | 138 +++ 1 file changed, 69 insertions(+), 69 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 306c2b6..197067f 100644 --

[PATCH v5 0/5] DAX common 4k zero page

2017-07-24 Thread Ross Zwisler
e pages in the radix tree made the DAX code more complex. This series solves these issues by following the lead of the DAX PMD code and using a common 4k zero page instead. This reduces memory usage and decreases latencies for some workloads, and it simplifies the DAX code, removing over 100 lines in total
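The idea of a common zero page can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the names `demo_mapping`, `demo_read_fault()` and `demo_write_fault()` are hypothetical and do not appear in fs/dax.c, and the write path allocates ordinary memory where real DAX would map storage blocks. It shows why read faults on holes need no per-hole allocation when every hole points at one shared page of zeroes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096
#define DEMO_NR_SLOTS  8

/* One shared, read-only page of zeroes used for every hole (toy model). */
static const unsigned char demo_zero_page[DEMO_PAGE_SIZE];

/* Hypothetical stand-in for a file mapping; not a kernel structure. */
struct demo_mapping {
	const unsigned char *ro[DEMO_NR_SLOTS];	/* what a reader sees */
	unsigned char *rw[DEMO_NR_SLOTS];	/* private page once written */
	int pages_allocated;			/* allocations made so far */
};

/* Read fault on a hole: point at the shared zero page, allocate nothing. */
static const unsigned char *demo_read_fault(struct demo_mapping *m, int slot)
{
	if (!m->ro[slot])
		m->ro[slot] = demo_zero_page;
	return m->ro[slot];
}

/* Write fault: replace the shared zero page with real, writable storage. */
static unsigned char *demo_write_fault(struct demo_mapping *m, int slot)
{
	if (!m->rw[slot]) {
		m->rw[slot] = calloc(1, DEMO_PAGE_SIZE);
		if (!m->rw[slot])
			return NULL;
		m->pages_allocated++;
		m->ro[slot] = m->rw[slot];
	}
	return m->rw[slot];
}
```

In this model any number of read faults leave `pages_allocated` at zero, and only the first write to a slot costs an allocation.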

[PATCH v5 4/5] dax: remove DAX code from page_cache_tree_insert()

2017-07-24 Thread Ross Zwisler
Now that we no longer insert struct page pointers in DAX radix trees we can remove the special casing for DAX in page_cache_tree_insert(). This also allows us to make dax_wake_mapping_entry_waiter() local to fs/dax.c, removing it from dax.h. Signed-off-by: Ross Zwisler <ross.z

Re: [PATCH v4 1/5] mm: add mkwrite param to vm_insert_mixed()

2017-07-24 Thread Ross Zwisler
On Mon, Jul 24, 2017 at 01:25:30PM +0200, Jan Kara wrote: > > @@ -1658,14 +1658,28 @@ static int insert_pfn(struct vm_area_struct *vma, > > unsigned long addr, > > if (!pte) > > goto out; > > retval = -EBUSY; > > - if (!pte_none(*pte)) > > - goto out_unlock; > > +

Re: [PATCH v4 1/5] mm: add mkwrite param to vm_insert_mixed()

2017-07-24 Thread Ross Zwisler
On Mon, Jul 24, 2017 at 01:15:31PM +0200, Jan Kara wrote: > On Sat 22-07-17 09:21:31, Dan Williams wrote: > > On Fri, Jul 21, 2017 at 3:39 PM, Ross Zwisler > > <ross.zwis...@linux.intel.com> wrote: > > > To be able to use the common 4k zero page in DAX we need to have

[PATCH v4 3/5] dax: use common 4k zero page for dax mmap reads

2017-07-21 Thread Ross Zwisler
insert_pfn() will do the work that was previously done by wp_page_reuse() as part of the dax_pfn_mkwrite() call path. *** Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- Documentation/filesystems/dax.txt | 5 +- fs/dax.c | 240

[PATCH v4 5/5] dax: move all DAX radix tree defs to fs/dax.c

2017-07-21 Thread Ross Zwisler
Now that we no longer insert struct page pointers in DAX radix trees the page cache code no longer needs to know anything about DAX exceptional entries. Move all the DAX exceptional entry definitions from dax.h to fs/dax.c. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Sug

[PATCH v4 0/5] DAX common 4k zero page

2017-07-21 Thread Ross Zwisler
ads, and it simplifies the DAX code, removing over 100 lines in total. This series has passed my targeted testing and a full xfstests run on both XFS and ext4. Ross Zwisler (5): mm: add mkwrite param to vm_insert_mixed() dax: relocate some dax functions dax: use common 4k zero page for dax mmap reads

[PATCH v4 1/5] mm: add mkwrite param to vm_insert_mixed()

2017-07-21 Thread Ross Zwisler
call path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/dax/device.c| 2 +- drivers/gpu/drm/exynos/exynos_drm_gem.c | 3 ++- drivers/gpu/drm/gma500/framebuffer.c| 2 +- drivers/gpu/drm/msm/msm_gem.c | 3 ++- drivers/gpu/drm/oma

Re: [PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-21 Thread Ross Zwisler
On Thu, Jul 20, 2017 at 09:59:22AM -0600, Ross Zwisler wrote: > On Thu, Jul 20, 2017 at 11:26:16AM -0400, Vivek Goyal wrote: <> > > Hi Ross, > > > > vm_insert_mixed_mkwrite() is same as vm_insert_mixed() except this sets > > write parameter to inser_pfn() tru

Re: [PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-21 Thread Ross Zwisler
On Wed, Jul 19, 2017 at 03:58:31PM -0600, Ross Zwisler wrote: > On Wed, Jul 19, 2017 at 11:51:12AM -0600, Ross Zwisler wrote: > > On Wed, Jul 19, 2017 at 04:16:59PM +0200, Jan Kara wrote: > > > On Wed 28-06-17 16:01:48, Ross Zwisler wrote: > > > > To be able to use

Re: [PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-20 Thread Ross Zwisler
On Thu, Jul 20, 2017 at 11:26:16AM -0400, Vivek Goyal wrote: > On Wed, Jun 28, 2017 at 04:01:48PM -0600, Ross Zwisler wrote: > > To be able to use the common 4k zero page in DAX we need to have our PTE > > fault path look more like our PMD fault path where a PTE entry can be >

Re: [PATCH v3 3/5] dax: use common 4k zero page for dax mmap reads

2017-07-20 Thread Ross Zwisler
On Thu, Jul 20, 2017 at 12:27:23PM +0200, Jan Kara wrote: > On Wed 19-07-17 10:26:45, Ross Zwisler wrote: > > On Wed, Jul 19, 2017 at 05:33:14PM +0200, Jan Kara wrote: > > > On Wed 28-06-17 16:01:50, Ross Zwisler wrote: > > > > Another major change is that we rem

Re: [PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-19 Thread Ross Zwisler
On Wed, Jul 19, 2017 at 11:51:12AM -0600, Ross Zwisler wrote: > On Wed, Jul 19, 2017 at 04:16:59PM +0200, Jan Kara wrote: > > On Wed 28-06-17 16:01:48, Ross Zwisler wrote: > > > To be able to use the common 4k zero page in DAX we need to have our PTE > > > fault path l

Re: [PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-07-19 Thread Ross Zwisler
On Wed, Jul 19, 2017 at 04:16:59PM +0200, Jan Kara wrote: > On Wed 28-06-17 16:01:48, Ross Zwisler wrote: > > To be able to use the common 4k zero page in DAX we need to have our PTE > > fault path look more like our PMD fault path where a PTE entry can be > > marked a

Re: [PATCH v3 3/5] dax: use common 4k zero page for dax mmap reads

2017-07-19 Thread Ross Zwisler
On Wed, Jul 19, 2017 at 05:33:14PM +0200, Jan Kara wrote: > On Wed 28-06-17 16:01:50, Ross Zwisler wrote: > > Another major change is that we remove dax_pfn_mkwrite() from our fault > > flow, and instead rely on the page fault itself to make the PTE dirty and > > writ

Re: [RFC v2 3/5] hmem: add heterogeneous memory sysfs support

2017-07-07 Thread Ross Zwisler
On Thu, Jul 06, 2017 at 10:53:39PM -0700, John Hubbard wrote: > On 07/06/2017 02:52 PM, Ross Zwisler wrote: > [...] > > diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile > > index b1aacfc..31e3f20 100644 > > --- a/drivers/acpi/Makefile > > +++ b/drivers/ac

Re: [RFC v2 0/5] surface heterogeneous memory performance information

2017-07-07 Thread Ross Zwisler
On Thu, Jul 06, 2017 at 10:30:46PM -0700, John Hubbard wrote: > On 07/06/2017 02:52 PM, Ross Zwisler wrote: > [...] > > > > The naming collision between Jerome's "Heterogeneous Memory Management > > (HMM)" and this "Heterogeneous Memory (HMEM)" series i

Re: [RFC v2 0/5] surface heterogeneous memory performance information

2017-07-07 Thread Ross Zwisler
On Fri, Jul 07, 2017 at 04:27:16PM +1000, Balbir Singh wrote: > On Thu, 2017-07-06 at 15:52 -0600, Ross Zwisler wrote: > > Quick Summary > > > > Platforms in the very near future will have multiple types of memory > > attached to a single CPU. These dispar

Re: [RFC v2 2/5] acpi: HMAT support in acpi_parse_entries_array()

2017-07-06 Thread Ross Zwisler
On Fri, Jul 07, 2017 at 12:13:54AM +0200, Rafael J. Wysocki wrote: > On Thu, Jul 6, 2017 at 11:52 PM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > The current implementation of acpi_parse_entries_array() assumes that each > > subtable has a standard ACPI subta

[RFC v2 3/5] hmem: add heterogeneous memory sysfs support

2017-07-06 Thread Ross Zwisler
ike this: # tree mem_tgt2/ mem_tgt2/ ├── firmware_id ├── is_cached ├── is_enabled ├── is_isolated ├── node2 -> ../../node/node2 ├── phys_addr_base ├── phys_length_bytes ├── power │   ├── async │   ... ├── subsystem -> ../../../../bus/hmem └── uevent Signed-off

[RFC v2 0/5] surface heterogeneous memory performance information

2017-07-06 Thread Ross Zwisler
rom previous RFC (https://lwn.net/Articles/724562/): - Allow multiple initiators to be local to a given memory target, as long as they all have the same performance characteristics. (Dan Williams) - A few small fixes to the ACPI parsing to allow for configurations I hadn't previously considere

[RFC v2 4/5] sysfs: add sysfs_add_group_link()

2017-07-06 Thread Ross Zwisler
nce we're now looking at it from the initiator's perspective, and because a given initiator can have multiple local memory targets: # ls -l mem_init0/mem_tgt2 lrwxrwxrwx. 1 root root 0 Jul 5 14:38 mem_init0/mem_tgt2 -> ../mem_tgt2/local_init Signed-off-by: Ross Zwisler <ross.zwis...@

[RFC v2 2/5] acpi: HMAT support in acpi_parse_entries_array()

2017-07-06 Thread Ross Zwisler
they have subtable headers of type struct acpi_hmat_structure which has a 2 byte type and a 4 byte length. Enhance the subtable parsing in acpi_parse_entries_array() so that it can handle these new HMAT subtables. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/
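A rough userspace sketch of walking subtables with two different header layouts follows. It is not the kernel's acpi_parse_entries_array(): every `demo_` name is hypothetical, the two reserved bytes between the HMAT type and length fields are an assumption about the layout, and fields are read in host byte order via memcpy() (ACPI tables are little-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layouts only; these are not the kernel's definitions. */
enum demo_header_kind {
	DEMO_HDR_STANDARD,	/* 1-byte type, 1-byte length */
	DEMO_HDR_HMAT,		/* 2-byte type, 2 reserved bytes (assumed), 4-byte length */
};

/* Size of the header itself for the given layout. */
static size_t demo_header_size(enum demo_header_kind kind)
{
	return kind == DEMO_HDR_HMAT ? 8 : 2;
}

/* Read the type field; memcpy() avoids unaligned loads. */
static uint16_t demo_entry_type(const uint8_t *entry, enum demo_header_kind kind)
{
	uint16_t type;

	if (kind == DEMO_HDR_STANDARD)
		return entry[0];
	memcpy(&type, entry, sizeof(type));
	return type;
}

/* Read the length field, which covers the header plus the payload. */
static uint32_t demo_entry_length(const uint8_t *entry, enum demo_header_kind kind)
{
	uint32_t length;

	if (kind == DEMO_HDR_STANDARD)
		return entry[1];
	memcpy(&length, entry + 4, sizeof(length));
	return length;
}

/*
 * Count the subtables of wanted_type in buf.
 * Returns -1 if a header is truncated or a length is implausible.
 */
static int demo_count_entries(const uint8_t *buf, size_t len,
			      enum demo_header_kind kind, uint16_t wanted_type)
{
	size_t hdr = demo_header_size(kind);
	size_t off = 0;
	int count = 0;

	assert(buf != NULL);
	while (off < len) {
		uint32_t entry_len;

		if (len - off < hdr)
			return -1;
		entry_len = demo_entry_length(buf + off, kind);
		if (entry_len < hdr || entry_len > len - off)
			return -1;	/* would loop forever or overrun */
		if (demo_entry_type(buf + off, kind) == wanted_type)
			count++;
		off += entry_len;
	}
	return count;
}
```

Keeping the header layout behind three small accessors means the walking loop itself stays identical for both formats, which is the general shape of the change the patch description suggests.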

[RFC v2 1/5] acpi: add missing include in acpi_numa.h

2017-07-06 Thread Ross Zwisler
Right now if a file includes acpi_numa.h and they don't happen to include linux/numa.h before it, they get the following warning: ./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef] #if MAX_NUMNODES > 256 ^~~~ Signed-off-by: Ross Zwisl
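The warning quoted above comes from the preprocessor evaluating an identifier that has not been defined. A minimal sketch of the pattern, with stand-in names: the `#define MAX_NUMNODES` line plays the role of including linux/numa.h first, its value is illustrative, and `DEMO_MAX_DOMAINS` and `demo_max_domains()` are hypothetical rather than taken from acpi_numa.h.

```c
#include <assert.h>

/*
 * Stand-in for <linux/numa.h>. Without this definition the #if below
 * would still compile, because the preprocessor treats an undefined
 * identifier as 0, but gcc -Wundef would warn and the #else branch
 * would be taken silently.
 */
#define MAX_NUMNODES 1024	/* illustrative value, not the kernel's */

/* Stand-in for the kind of check that triggered the warning. */
#if MAX_NUMNODES > 256
#define DEMO_MAX_DOMAINS MAX_NUMNODES
#else
#define DEMO_MAX_DOMAINS 256
#endif

/* Expose the selected value so it can be inspected at run time. */
static int demo_max_domains(void)
{
	return DEMO_MAX_DOMAINS;
}
```

Removing the `#define` and compiling with `gcc -Wundef` should reproduce the same class of warning, and `demo_max_domains()` would then return 256.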

[RFC v2 5/5] hmem: add performance attributes

2017-07-06 Thread Ross Zwisler
for local (initiator,target) pairings, we reduce the number of sysfs entries to O(num_targets). Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/hmem/Makefile | 2 +- drivers/acpi/hmem/core.c| 268 +++- drivers/acpi/

Re: [PATCH v3 0/5] DAX common 4k zero page

2017-06-30 Thread Ross Zwisler
On Wed, Jun 28, 2017 at 04:01:47PM -0600, Ross Zwisler wrote: > When servicing mmap() reads from file holes the current DAX code allocates > a page cache page of all zeroes and places the struct page pointer in the > mapping->page_tree radix tree. This has three major drawb

[PATCH v3 2/5] dax: relocate some dax functions

2017-06-28 Thread Ross Zwisler
-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- fs/dax.c | 138 +++ 1 file changed, 69 insertions(+), 69 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 9187f3b..e850837 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -122,6 +

[PATCH v3 4/5] dax: remove DAX code from page_cache_tree_insert()

2017-06-28 Thread Ross Zwisler
Now that we no longer insert struct page pointers in DAX radix trees we can remove the special casing for DAX in page_cache_tree_insert(). This also allows us to make dax_wake_mapping_entry_waiter() local to fs/dax.c, removing it from dax.h. Signed-off-by: Ross Zwisler <ross.z

[PATCH v3 5/5] dax: move all DAX radix tree defs to fs/dax.c

2017-06-28 Thread Ross Zwisler
Now that we no longer insert struct page pointers in DAX radix trees the page cache code no longer needs to know anything about DAX exceptional entries. Move all the DAX exceptional entry definitions from dax.h to fs/dax.c. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Sug

[PATCH v3 0/5] DAX common 4k zero page

2017-06-28 Thread Ross Zwisler
ity with get_unlocked_mapping_entry(). (Jan) - Remove DAX special casing in page_cache_tree_insert(), move now-private definitions from dax.h to dax.c. (Jan) Ross Zwisler (5): mm: add vm_insert_mixed_mkwrite() dax: relocate some dax functions dax: use common 4k zero page for dax mmap re

[PATCH v3 1/5] mm: add vm_insert_mixed_mkwrite()

2017-06-28 Thread Ross Zwisler
nsert_pfn() needs to follow the lead of insert_pfn_pmd() and allow us to pass in a 'mkwrite' flag. If 'mkwrite' is set insert_pfn() will do the work that was previously done by wp_page_reuse() as part of the dax_pfn_mkwrite() call path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com>

[PATCH v3 3/5] dax: use common 4k zero page for dax mmap reads

2017-06-28 Thread Ross Zwisler
() as part of the dax_pfn_mkwrite() call path. *** Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- Documentation/filesystems/dax.txt | 5 +- fs/dax.c | 243 -- fs/ext2/file.c| 25 +-

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-20 Thread Ross Zwisler
On Mon, Jun 19, 2017 at 10:22:14PM -0700, Darrick J. Wong wrote: <> > Fourth, the VFS entry points for things like read, write, truncate, > utimes, fallocate, etc. all just bail out if S_IOMAP_FROZEN is set on a > file, so that the block map cannot be modified. mmap is still allowed, > as we've

Re: [PATCH] dax: Fix inefficiency in dax_writeback_mapping_range()

2017-06-19 Thread Ross Zwisler
nt and prone to livelocks. Update index properly. > > CC: sta...@vger.kernel.org > Fixes: 9973c98ecfda3a1dfcab981665b5f1e39bcde64a > Signed-off-by: Jan Kara <j...@suse.cz> Yep, this seems good, thanks. Reviewed-by: Ross Zwisler <ross.zwis...@linux.intel.com>

Re: [PATCH v2 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-16 Thread Ross Zwisler
On Thu, Jun 15, 2017 at 04:42:04PM +0200, Jan Kara wrote: > On Wed 14-06-17 11:22:09, Ross Zwisler wrote: > > To be able to use the common 4k zero page in DAX we need to have our PTE > > fault path look more like our PMD fault path where a PTE entry can be > > marked a

Re: [PATCH v2 3/3] dax: use common 4k zero page for dax mmap reads

2017-06-16 Thread Ross Zwisler
On Thu, Jun 15, 2017 at 04:58:56PM +0200, Jan Kara wrote: > On Wed 14-06-17 11:22:11, Ross Zwisler wrote: > > @@ -216,17 +217,6 @@ static void dax_unlock_mapping_entry(struct > > address_space *mapping, > > dax_wake_mapping_entry_waiter(mapping, index, entry, false); >

Re: [PATCH v2 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-16 Thread Ross Zwisler
On Thu, Jun 15, 2017 at 04:42:04PM +0200, Jan Kara wrote: > On Wed 14-06-17 11:22:09, Ross Zwisler wrote: > > To be able to use the common 4k zero page in DAX we need to have our PTE > > fault path look more like our PMD fault path where a PTE entry can be > > marked a

[PATCH v2 2/3] dax: relocate dax_load_hole()

2017-06-14 Thread Ross Zwisler
dax_load_hole() will soon need to call dax_insert_mapping_entry(), so it needs to be moved lower in dax.c so the definition exists. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- fs/dax.c | 88 1 file chang

[PATCH v2 3/3] dax: use common 4k zero page for dax mmap reads

2017-06-14 Thread Ross Zwisler
() as part of the dax_pfn_mkwrite() call path. *** Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- Documentation/filesystems/dax.txt | 5 +- fs/dax.c | 247 -- fs/ext2/file.c| 25 +--- fs/ext4/file.c

[PATCH v2 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-14 Thread Ross Zwisler
nsert_pfn() needs to follow the lead of insert_pfn_pmd() and allow us to pass in a 'mkwrite' flag. If 'mkwrite' is set insert_pfn() will do the work that was previously done by wp_page_reuse() as part of the dax_pfn_mkwrite() call path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com>

[PATCH v2 0/3] DAX common 4k zero page

2017-06-14 Thread Ross Zwisler
ree, so if you could spare some review cycles I'd be grateful. --- Changes since v1: - Leave vm_insert_mixed() intact with previous functionality and add vm_insert_mixed_mkwrite() as a peer so it is more readable/greppable. (Dan) Ross Zwisler (3): mm: add vm_insert_mixed_mkwrite()

Re: [PATCH 2/2] mm: always enable thp for dax mappings

2017-06-13 Thread Ross Zwisler
On Mon, Jun 12, 2017 at 07:47:19AM -0700, Dan Williams wrote: > On Mon, Jun 12, 2017 at 5:07 AM, Kirill A. Shutemov > wrote: > > On Sat, Jun 10, 2017 at 02:49:37PM -0700, Dan Williams wrote: > >> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h > >>

Re: [PATCH 1/2] mm: improve readability of transparent_hugepage_enabled()

2017-06-13 Thread Ross Zwisler
On Tue, Jun 13, 2017 at 02:16:49PM -0700, Dan Williams wrote: > On Tue, Jun 13, 2017 at 2:06 PM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > So, if the VM_NOHUGEPAGE flag is set or if the vma is for a temporary stack, > > we always bail. Also, we only care ab

Re: [PATCH 1/2] mm: improve readability of transparent_hugepage_enabled()

2017-06-13 Thread Ross Zwisler
On Sat, Jun 10, 2017 at 02:49:31PM -0700, Dan Williams wrote: > Turn the macro into a static inline and rewrite the condition checks for > better readability in preparation for adding another condition. > > Cc: Jan Kara <j...@suse.cz> > Cc: Andrew Morton <a...@linux-fo

Re: [PATCH 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-11 Thread Ross Zwisler
On Fri, Jun 09, 2017 at 08:35:08PM -0700, Dan Williams wrote: > On Fri, Jun 9, 2017 at 8:03 PM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > And vm_insert_mixed_mkwrite() and vm_insert_mixed() are redundant with only > > the insert_pfn() line differing?

Re: [PATCH 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-09 Thread Ross Zwisler
On Fri, Jun 09, 2017 at 02:23:51PM -0700, Dan Williams wrote: > On Wed, Jun 7, 2017 at 1:48 PM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > To be able to use the common 4k zero page in DAX we need to have our PTE > > fault path look more like our PMD fault pa

[PATCH 3/3] dax: use common 4k zero page for dax mmap reads

2017-06-07 Thread Ross Zwisler
() as part of the dax_pfn_mkwrite() call path. *** Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- This series applies cleanly to v4.12-rc4 and has passed all my xfstesting using ext4, ext2 and XFS. It has also passed all my targeted testing. --- Documentation/filesystems/dax.txt

[PATCH 1/3] mm: add vm_insert_mixed_mkwrite()

2017-06-07 Thread Ross Zwisler
nsert_pfn() needs to follow the lead of insert_pfn_pmd() and allow us to pass in a 'mkwrite' flag. If 'mkwrite' is set insert_pfn() will do the work that was previously done by wp_page_reuse() as part of the dax_pfn_mkwrite() call path. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com>

[PATCH 2/3] dax: relocate dax_load_hole()

2017-06-07 Thread Ross Zwisler
dax_load_hole() will soon need to call dax_insert_mapping_entry(), so it needs to be moved lower in dax.c so the definition exists. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- fs/dax.c | 88 1 file chang

Re: [resend RFC 1/6] ACPICA: add HMAT table definitions

2017-06-05 Thread Ross Zwisler
On Mon, Jun 05, 2017 at 10:44:11PM +0200, Rafael J. Wysocki wrote: > On Mon, Jun 5, 2017 at 9:50 PM, Ross Zwisler > <ross.zwis...@linux.intel.com> wrote: > > Import HMAT table definitions from the ACPICA codebase. > > > > This kernel patch was generated using a

[resend RFC 0/6] Add support for Heterogeneous Memory Attribute Table

2017-06-05 Thread Ross Zwisler
be listed as "local" if they all have the same performance, even if the HMAT's Memory Subsystem Address Range Structure table only defines a single local initiator. Do others agree? What other things should we consider, or what needs do you have that aren't being addressed? Ross Z

[resend RFC 1/6] ACPICA: add HMAT table definitions

2017-06-05 Thread Ross Zwisler
update. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- include/acpi/actbl1.h | 119 ++ 1 file changed, 119 insertions(+) diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h index b4ce55c..a5df3f3 100644 --- a/inclu

[resend RFC 4/6] hmem: add heterogeneous memory sysfs support

2017-06-05 Thread Ross Zwisler
ike this: # tree mem_tgt2/ mem_tgt2/ ├── firmware_id ├── is_cached ├── is_enabled ├── is_isolated ├── node2 -> ../../node/node2 ├── phys_addr_base ├── phys_length_bytes ├── power │   ├── async │   ... ├── subsystem -> ../../../../bus/hmem └── uevent Signed-off

[resend RFC 5/6] sysfs: add sysfs_add_group_link()

2017-06-05 Thread Ross Zwisler
tX" since we're now looking at it from the initiator's perspective: # ls -l mem_init0/via_mem_tgt2 lrwxrwxrwx. 1 root root 0 Jun 1 10:00 mem_init0/via_mem_tgt2 -> ../mem_tgt2/via_mem_init0 Signed-off-by: Ross Zwisler <ross.zwis...@linux.inte

[resend RFC 6/6] hmem: add performance attributes

2017-06-05 Thread Ross Zwisler
itiator,target) pairings, we reduce the number of sysfs entries to O(num_targets). Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/hmem/Makefile | 2 +- drivers/acpi/hmem/core.c| 134 +- drivers/acpi/

[resend RFC 3/6] acpi: HMAT support in acpi_parse_entries_array()

2017-06-05 Thread Ross Zwisler
they have subtable headers of type struct acpi_hmat_structure which has a 2 byte type and a 4 byte length. Enhance the subtable parsing in acpi_parse_entries_array() so that it can handle these new HMAT subtables. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/

[resend RFC 2/6] acpi: add missing include in acpi_numa.h

2017-06-05 Thread Ross Zwisler
Right now if a file includes acpi_numa.h and they don't happen to include linux/numa.h before it, they get the following warning: ./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef] #if MAX_NUMNODES > 256 ^~~~ Signed-off-by: Ross Zwisl

[RFC 0/6] Add support for Heterogeneous Memory Attribute Table

2017-06-02 Thread Ross Zwisler
ess Range Structure table only defines a single local initiator. Do others agree? What other things should we consider, or what needs do you have that aren't being addressed? Ross Zwisler (6): ACPICA: add HMAT table definitions acpi: add missing include in acpi_numa.h acpi: HMAT support in acp

[RFC 4/6] hmem: add heterogeneous memory sysfs support

2017-06-02 Thread Ross Zwisler
ike this: # tree mem_tgt2/ mem_tgt2/ ├── firmware_id ├── is_cached ├── is_enabled ├── is_isolated ├── node2 -> ../../node/node2 ├── phys_addr_base ├── phys_length_bytes ├── power │   ├── async │   ... ├── subsystem -> ../../../../bus/hmem └── uevent Signed-off

[RFC 6/6] hmem: add performance attributes

2017-06-02 Thread Ross Zwisler
itiator,target) pairings, we reduce the number of sysfs entries to O(num_targets). Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/hmem/Makefile | 2 +- drivers/acpi/hmem/core.c| 134 +- drivers/acpi/

[RFC 5/6] sysfs: add sysfs_add_group_link()

2017-06-02 Thread Ross Zwisler
tX" since we're now looking at it from the initiator's perspective: # ls -l mem_init0/via_mem_tgt2 lrwxrwxrwx. 1 root root 0 Jun 1 10:00 mem_init0/via_mem_tgt2 -> ../mem_tgt2/via_mem_init0 Signed-off-by: Ross Zwisler <ross.zwis...@linux.inte

[RFC 2/6] acpi: add missing include in acpi_numa.h

2017-06-02 Thread Ross Zwisler
Right now if a file includes acpi_numa.h and they don't happen to include linux/numa.h before it, they get the following warning: ./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef] #if MAX_NUMNODES > 256 ^~~~ Signed-off-by: Ross Zwisl

[RFC 1/6] ACPICA: add HMAT table definitions

2017-06-02 Thread Ross Zwisler
update. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- include/acpi/actbl1.h | 119 ++ 1 file changed, 119 insertions(+) diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h index b4ce55c..a5df3f3 100644 --- a/inclu

[RFC 3/6] acpi: HMAT support in acpi_parse_entries_array()

2017-06-02 Thread Ross Zwisler
they have subtable headers of type struct acpi_hmat_structure which has a 2 byte type and a 4 byte length. Enhance the subtable parsing in acpi_parse_entries_array() so that it can handle these new HMAT subtables. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- drivers/acpi/

Re: [PATCH -mm 05/13] block, THP: Make block_device_operations.rw_page support THP

2017-06-01 Thread Ross Zwisler
minc...@kernel.org> > Cc: Dan Williams <dan.j.willi...@intel.com> > Cc: Ross Zwisler <ross.zwis...@intel.com> > Cc: Vishal L Verma <vishal.l.ve...@intel.com> > Cc: Jens Axboe <ax...@kernel.dk> > Cc: linux-nvdimm@lists.01.org > --- > drivers/

[PATCH] dax: improve fix for colliding PMD & PTE entries

2017-05-26 Thread Ross Zwisler
code in the fault handlers and just retry. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> Cc: sta...@vger.kernel.org --- For both the -mm tree and for stable, feel free to squash this with the original commit if you think that is appropriate. This has passed targeted testing and an xf

[PATCH 3/3] dax: add fallback reason to dax_pmd_load_hole()

2017-05-23 Thread Ross Zwisler
ing us why dax_pmd_load_hole() failed. Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com> --- fs/dax.c | 15 +++ include/trace/events/fs_dax.h | 16 ++-- 2 files changed, 21 insertions(+), 10 deletions(-) diff --git a/fs/dax.c b/fs/dax
