Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-17 Thread Matthew Wilcox
On Sat, Apr 17, 2021 at 12:31:37PM +0200, Arnd Bergmann wrote: > On Fri, Apr 16, 2021 at 5:27 PM Matthew Wilcox wrote: > > diff --git a/include/net/page_pool.h b/include/net/page_pool.h > > index b5b195305346..db7c7020746a 100644 > > --- a/include/net/page_pool.h > > ++

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-16 Thread Matthew Wilcox
On Fri, Apr 16, 2021 at 07:08:23PM +0200, Jesper Dangaard Brouer wrote: > On Fri, 16 Apr 2021 16:27:55 +0100 > Matthew Wilcox wrote: > > > On Thu, Apr 15, 2021 at 08:08:32PM +0200, Jesper Dangaard Brouer wrote: > > > See below patch. Where I swap32 the dma addre

Re: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-16 Thread Matthew Wilcox
Replacement patch to fix compiler warning. From: "Matthew Wilcox (Oracle)" Date: Fri, 16 Apr 2021 16:34:55 -0400 Subject: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems To: bro...@redhat.com Cc: linux-ker...@vger.kernel.org, linux...@kvack.org, net...@vger.

[PATCH 2/2] mm: Indicate pfmemalloc pages in compound_head

2021-04-16 Thread Matthew Wilcox (Oracle)
loc(). Since page_pool doesn't want to set its magic value on pages which are pfmemalloc, we can use bit 1 of compound_head to indicate that the page came from the memory reserves. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/mm.h | 12 +++- include/linux/mm_types.h |
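
The idea of flagging a page through a low bit of compound_head can be sketched in user space. This is a minimal illustration, assuming a single pointer-sized word in which bit 0 marks a tail page and bit 1 marks pfmemalloc; the struct and the demo_* helpers are hypothetical and are not the code from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the one word of struct page discussed here. */
struct demo_page {
	uintptr_t compound_head;  /* bit 0: tail page; bit 1: pfmemalloc (in this sketch) */
};

#define DEMO_TAIL_BIT       ((uintptr_t)1 << 0)
#define DEMO_PFMEMALLOC_BIT ((uintptr_t)1 << 1)

static void demo_set_pfmemalloc(struct demo_page *page)
{
	/* Only meaningful while the word does not hold a tail-page pointer. */
	assert(!(page->compound_head & DEMO_TAIL_BIT));
	page->compound_head = DEMO_PFMEMALLOC_BIT;
}

static void demo_clear_pfmemalloc(struct demo_page *page)
{
	page->compound_head = 0;
}

static bool demo_is_pfmemalloc(const struct demo_page *page)
{
	return (page->compound_head & DEMO_PFMEMALLOC_BIT) != 0;
}

static bool demo_is_tail(const struct demo_page *page)
{
	return (page->compound_head & DEMO_TAIL_BIT) != 0;
}
```

Because bit 0 stays clear, a reader that only tests for a tail page would not mistake the flagged word for a compound_head pointer.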

[PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-16 Thread Matthew Wilcox (Oracle)
get_user_pages_fast() could dereference a bogus compound_head(). Fixes: c25fff7171be ("mm: add dma_addr_t to struct page") Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/mm_types.h | 4 ++-- include/net/page_pool.h | 12 +++- net/core/page_pool.c | 12 +++

[PATCH 0/2] Change struct page layout for page_pool

2021-04-16 Thread Matthew Wilcox (Oracle)
functionality. It is much less urgent. I'd really like to see Mel & Michal's thoughts on it. I have only compile-tested these patches. Matthew Wilcox (Oracle) (2): mm: Fix struct page layout on 32-bit systems mm: Indicate pfmemalloc pages in compound_head include/linux/mm.h

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-16 Thread Matthew Wilcox
On Thu, Apr 15, 2021 at 08:08:32PM +0200, Jesper Dangaard Brouer wrote: > See below patch. Where I swap32 the dma address to satisfy > page->compound having bit zero cleared. (It is the simplest fix I could > come up with). I think this is slightly simpler, and as a bonus code that assumes the
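
The constraint being discussed is that the word aliasing compound_head must keep bit zero clear. A minimal sketch of one way to satisfy it, assuming the DMA address is page-aligned and is stored as two 32-bit words with the low half first; the names are hypothetical and this is not claimed to be the patch posted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: two 32-bit words, the first of which aliases compound_head. */
struct demo_pp_page {
	uint32_t dma_addr[2];
};

static void demo_set_dma_addr(struct demo_pp_page *page, uint64_t addr)
{
	/* Assumption: addresses are page-aligned, so the low half has bit 0 clear. */
	assert((addr & 1) == 0);
	page->dma_addr[0] = (uint32_t)addr;          /* low 32 bits: bit 0 stays clear */
	page->dma_addr[1] = (uint32_t)(addr >> 32);  /* high 32 bits: may well be odd */
}

static uint64_t demo_get_dma_addr(const struct demo_pp_page *page)
{
	return ((uint64_t)page->dma_addr[1] << 32) | page->dma_addr[0];
}
```

Storing the halves explicitly, rather than as one 64-bit value, makes the placement of the low word independent of endianness.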

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-16 Thread Matthew Wilcox
On Fri, Apr 16, 2021 at 07:32:35AM +, David Laight wrote: > From: Matthew Wilcox > > Sent: 15 April 2021 23:22 > > > > On Thu, Apr 15, 2021 at 09:11:56PM +, David Laight wrote: > > > Isn't it possible to move the field down one long? > > >

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-15 Thread Matthew Wilcox
On Thu, Apr 15, 2021 at 09:11:56PM +, David Laight wrote: > Isn't it possible to move the field down one long? > This might require an explicit zero - but this is not a common > code path - the extra write will be noise. Then it overlaps page->mapping. See emails passim.

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-15 Thread Matthew Wilcox
On Thu, Apr 15, 2021 at 08:08:32PM +0200, Jesper Dangaard Brouer wrote: > +static inline > +dma_addr_t page_pool_dma_addr_read(dma_addr_t dma_addr) > +{ > + /* Workaround for storing 64-bit DMA-addr on 32-bit machines in struct > + * page. The page->dma_addr share area with

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-14 Thread Matthew Wilcox
On Wed, Apr 14, 2021 at 09:13:22PM +0200, Jesper Dangaard Brouer wrote: > (If others want to reproduce). First I could not reproduce on ARM32. > Then I found out that enabling CONFIG_XEN on ARCH=arm was needed to > cause the issue by enabling CONFIG_ARCH_DMA_ADDR_T_64BIT. hmmm ... you should be

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-14 Thread Matthew Wilcox
On Wed, Apr 14, 2021 at 10:10:44AM +0200, Jesper Dangaard Brouer wrote: > Yes, indeed! - And very frustrating. It's keeping me up at night. > I'm dreaming about 32 vs 64 bit data structures. My fitbit stats tell > me that I don't sleep well with these kind of dreams ;-) Then you're going to love

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-12 Thread Matthew Wilcox
> have to be kept intact. In above, I'm unsure @index is untouched. Well, I tried three different approaches. Here's the one I hated the least. From: "Matthew Wilcox (Oracle)" Date: Sat, 10 Apr 2021 16:12:06 -0400 Subject: [PATCH] mm: Fix struct page layout on 32-bit systems 32-bit

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-11 Thread Matthew Wilcox
On Sun, Apr 11, 2021 at 11:33:18AM +0100, Matthew Wilcox wrote: > Basically, we have three aligned dwords here. We can either alias with > @flags and the first word of @lru, or the second word of @lru and @mapping, > or @index and @private. @flags is a non-starter. If we use

Re: Bogus struct page layout on 32-bit

2021-04-11 Thread Matthew Wilcox
On Sat, Apr 10, 2021 at 09:10:47PM +0200, Arnd Bergmann wrote: > On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox wrote: > > + dma_addr_t dma_addr __packed; > > }; > > struct {/* slab, slob and slub */ > >

Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-11 Thread Matthew Wilcox
On Sun, Apr 11, 2021 at 11:43:07AM +0200, Jesper Dangaard Brouer wrote: > On Sat, 10 Apr 2021 21:52:45 +0100 > "Matthew Wilcox (Oracle)" wrote: > > > 32-bit architectures which expect 8-byte alignment for 8-byte integers > > and need 64-bit DMA addresses (arc, a
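
The layout problem quoted above can be reproduced in user space. A minimal sketch, assuming uint32_t as a stand-in for a 32-bit unsigned long; the structs and field names are hypothetical and far smaller than struct page. The first layout shows the union being pushed away from flags wherever uint64_t needs more than 4-byte alignment; the second shows one possible arrangement, with flags inside the union and three words of padding, that keeps lru in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout 1: a 64-bit member raises the union's alignment, so padding may follow flags. */
struct demo_page_shifted {
	uint32_t flags;
	union {
		struct { uint32_t lru_next, lru_prev; } lru;
		uint64_t dma_addr;
	} u;
};

/* Layout 2: flags lives inside each union member; padding places dma_addr at byte 16. */
struct demo_page_flags_in_union {
	union {
		struct { uint32_t flags; uint32_t lru_next, lru_prev; } lru;
		struct { uint32_t flags; uint32_t pad[3]; uint64_t dma_addr; } pp;
	} u;
};

/* In layout 2, lru sits directly after flags whatever uint64_t's alignment is. */
static_assert(offsetof(struct demo_page_flags_in_union, u.lru.lru_next) == sizeof(uint32_t),
	      "lru directly follows flags");

static size_t demo_shifted_lru_offset(void)
{
	return offsetof(struct demo_page_shifted, u.lru);
}

static size_t demo_fixed_dma_offset(void)
{
	return offsetof(struct demo_page_flags_in_union, u.pp.dma_addr);
}
```

On an ABI that aligns uint64_t to 8 bytes, demo_shifted_lru_offset() reports 8 rather than 4, which is the kind of shift a static_assert on offsetof(struct page, lru) would catch.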

[PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-10 Thread Matthew Wilcox (Oracle)
this, insert three words of padding and use the same bits as ->index and ->private, neither of which have to be cleared on free. Fixes: c25fff7171be ("mm: add dma_addr_t to struct page") Signed-off-by: Matthew Wilcox (Oracle) --- include/linux

[PATCH 0/1] Fix struct page layout on 32-bit systems

2021-04-10 Thread Matthew Wilcox (Oracle)
I'd really appreciate people testing this, particularly on arm32/mips32/ppc32 systems with a 64-bit dma_addr_t. Matthew Wilcox (Oracle) (1): mm: Fix struct page layout on 32-bit systems include/linux/mm_types.h | 38 ++ 1 file changed, 26 insertions(+), 12

Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Matthew Wilcox
How about moving the flags into the union? A bit messy, but we don't have to play games with __packed__. diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1210a8e41fad..f374d2f06255 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -68,16 +68,22 @@

Bogus struct page layout on 32-bit

2021-04-09 Thread Matthew Wilcox
On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote: > >> include/linux/mm_types.h:274:1: error: static_assert failed due to > >> requirement '__builtin_offsetof(struct page, lru) == > >> __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == > >> offsetof(struct

[PATCH v2 1/4] mm/vmalloc: Change the 'caller' type to unsigned long

2021-03-24 Thread Matthew Wilcox (Oracle)
of the explicit function name. Signed-off-by: Matthew Wilcox (Oracle) --- arch/arm/include/asm/io.h | 6 +-- arch/arm/include/asm/mach/map.h | 3 -- arch/arm/kernel/module.c | 4 +- arch/arm/mach-imx/mm-imx3.c | 2 +- arch/arm/mach-ixp4xx/common.c

Re: make alloc_anon_inode more useful

2021-03-09 Thread Matthew Wilcox
On Tue, Mar 09, 2021 at 04:53:39PM +0100, Christoph Hellwig wrote: > this series first renames the existing alloc_anon_inode to > alloc_anon_inode_sb to clearly mark it as requiring a superblock. > > It then adds a new alloc_anon_inode that works on the anon_inode > file system super block, thus

Re: Freeing page tables through RCU

2021-02-26 Thread Matthew Wilcox
On Fri, Feb 26, 2021 at 10:42:00AM -0400, Jason Gunthorpe wrote: > On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote: > > > I'd like to hear better ideas than this. > > You didn't like my suggestion to put a sleepable lock around the > freeing of page tables du

Freeing page tables through RCU

2021-02-25 Thread Matthew Wilcox
In order to walk the page tables without the mmap semaphore, it must be possible to prevent them from being freed and reused (eg if munmap() races with viewing /proc/$pid/smaps). There is various commentary within the mm on how to prevent this. One way is to disable interrupts, relying on that

Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting

2020-12-03 Thread Matthew Wilcox
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote: > This code compiles, but I haven't even tried to boot it. The earlier > part of the series isn't terribly interesting -- it's a handful of > cleanups that remove all reads of ->active_mm from arch/x86. I've > been meaning to do

Re: [PATCH v2 3/6] perf/core: Fix arch_perf_get_page_size()

2020-11-26 Thread Matthew Wilcox
On Thu, Nov 26, 2020 at 02:06:19PM +0100, Peter Zijlstra wrote: > On Thu, Nov 26, 2020 at 12:56:06PM +0000, Matthew Wilcox wrote: > > On Thu, Nov 26, 2020 at 01:42:07PM +0100, Peter Zijlstra wrote: > > > + pgdp = pgd_offset(mm, addr); > > > + pgd = READ_ONCE(*pgdp); >

Re: [PATCH v2 3/6] perf/core: Fix arch_perf_get_page_size()

2020-11-26 Thread Matthew Wilcox
On Thu, Nov 26, 2020 at 01:42:07PM +0100, Peter Zijlstra wrote: > + pgdp = pgd_offset(mm, addr); > + pgd = READ_ONCE(*pgdp); I forget how x86-32-PAE maps to Linux's PGD/P4D/PUD/PMD scheme, but according to volume 3, section 4.4.2, PAE paging uses a 64-bit PDE, so whether a PDE is a PGD or

Re: [PATCH v2 2/6] mm: Introduce pXX_leaf_size()

2020-11-26 Thread Matthew Wilcox
> > Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Matthew Wilcox (Oracle)

Re: [PATCH v2 1/6] mm/gup: Provide gup_get_pte() more generic

2020-11-26 Thread Matthew Wilcox
On Thu, Nov 26, 2020 at 01:01:15PM +0100, Peter Zijlstra wrote: > +#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH > +/* > + * WARNING: only to be used in the get_user_pages_fast() implementation. > + * With get_user_pages_fast(), we walk down the pagetables without taking any > + * locks. For this we would

Re: [PATCH v2 3/6] perf/core: Fix arch_perf_get_page_size()

2020-11-26 Thread Matthew Wilcox
On Thu, Nov 26, 2020 at 01:01:17PM +0100, Peter Zijlstra wrote: > The (new) page-table walker in arch_perf_get_page_size() is broken in > various ways. Specifically while it is used in a lockless manner, it > doesn't depend on CONFIG_HAVE_FAST_GUP nor uses the proper _lockless > offset methods,

Re: [PATCH 0/5] perf/mm: Fix PERF_SAMPLE_*_PAGE_SIZE

2020-11-16 Thread Matthew Wilcox
On Mon, Nov 16, 2020 at 08:28:23AM -0800, Dave Hansen wrote: > On 11/16/20 7:54 AM, Matthew Wilcox wrote: > > It gets even more complicated with CPUs with multiple levels of TLB > > which support different TLB entry sizes. My CPU reports: > > > > TLB info > > In

Re: [PATCH 0/5] perf/mm: Fix PERF_SAMPLE_*_PAGE_SIZE

2020-11-16 Thread Matthew Wilcox
On Mon, Nov 16, 2020 at 06:43:57PM +0300, Kirill A. Shutemov wrote: > On Fri, Nov 13, 2020 at 12:19:01PM +0100, Peter Zijlstra wrote: > > Hi, > > > > These patches provide generic infrastructure to determine TLB page size from > > page table entries alone. Perf will use this (for either data or

Re: [patch V2 00/18] mm/highmem: Preemptible variant of kmap_atomic & friends

2020-10-30 Thread Matthew Wilcox
On Thu, Oct 29, 2020 at 11:18:06PM +0100, Thomas Gleixner wrote: > This series provides kmap_local.* iomap_local variants which only disable > migration to keep the virtual mapping address stable across preemption, > but do neither disable pagefaults nor preemption. The new functions can be >

Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Matthew Wilcox
On Thu, Oct 22, 2020 at 04:35:17PM +, David Laight wrote: > Wait... > readv(2) defines: > ssize_t readv(int fd, const struct iovec *iov, int iovcnt); It doesn't really matter what the manpage says. What does the AOSP libc header say? > But the syscall is defined as: > >

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-12 Thread Matthew Wilcox
On Mon, Oct 12, 2020 at 12:53:54PM -0700, Ira Weiny wrote: > On Mon, Oct 12, 2020 at 05:44:38PM +0100, Matthew Wilcox wrote: > > On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote: > > > kmap_atomic() is always preferred over kmap()/kmap_thread(). > > > k

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-12 Thread Matthew Wilcox
On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote: > kmap_atomic() is always preferred over kmap()/kmap_thread(). > kmap_atomic() is _much_ more lightweight since its TLB invalidation is > always CPU-local and never broadcast. > > So, basically, unless you *must* sleep while the mapping

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-09 Thread Matthew Wilcox
On Fri, Oct 09, 2020 at 02:34:34PM -0700, Eric Biggers wrote: > On Fri, Oct 09, 2020 at 12:49:57PM -0700, ira.we...@intel.com wrote: > > The kmap() calls in this FS are localized to a single thread. To avoid > > the overhead of global PKRS updates use the new kmap_thread() call. > > > > @@

Re: Where is the declaration of buffer used in kernel_param_ops .get functions?

2020-10-03 Thread Matthew Wilcox
On Sat, Oct 03, 2020 at 06:19:18PM -0700, Joe Perches wrote: > These patches came up because I was looking for > the location of the declaration of the buffer used > in kernel/params.c struct kernel_param_ops .get > functions. > > I didn't find it. > > I want to see if it's appropriate to

Re: [PATCH 02/11] mm: call import_iovec() instead of rw_copy_check_uvector() in process_vm_rw()

2020-09-21 Thread Matthew Wilcox
On Mon, Sep 21, 2020 at 04:34:25PM +0200, Christoph Hellwig wrote: > { > - WARN_ON(direction & ~(READ | WRITE)); > + WARN_ON(direction & ~(READ | WRITE | CHECK_IOVEC_ONLY)); This is now a no-op because: include/linux/fs.h:#define CHECK_IOVEC_ONLY -1 I'd suggest we renumber it to 2?
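
The arithmetic behind the no-op can be checked in isolation. A minimal sketch, assuming READ is 0 and WRITE is 1 and taking CHECK_IOVEC_ONLY as -1 from the line quoted above; the _OLD/_NEW macro names and the helper functions are hypothetical.

```c
#include <assert.h>

/* Assumed values for READ/WRITE; CHECK_IOVEC_ONLY as quoted from include/linux/fs.h. */
#define READ 0
#define WRITE 1
#define CHECK_IOVEC_ONLY_OLD (-1)  /* all bits set */
#define CHECK_IOVEC_ONLY_NEW 2     /* the suggested renumbering */

/* With -1 in the mask, every bit is "allowed", so the check can never fire. */
static_assert(~(READ | WRITE | CHECK_IOVEC_ONLY_OLD) == 0, "mask inverts to zero");

/* Hypothetical helper: the bits of 'direction' that fall outside 'allowed'. */
static int unexpected_bits(int direction, int allowed)
{
	return direction & ~allowed;
}

/* Nonzero if the WARN_ON() condition would be true under each definition. */
static int warns_with_old(int direction)
{
	return unexpected_bits(direction, READ | WRITE | CHECK_IOVEC_ONLY_OLD) != 0;
}

static int warns_with_new(int direction)
{
	return unexpected_bits(direction, READ | WRITE | CHECK_IOVEC_ONLY_NEW) != 0;
}
```

With the value 2, a stray bit such as 4 is caught again while READ, WRITE and CHECK_IOVEC_ONLY themselves still pass.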

Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

2020-09-20 Thread Matthew Wilcox
On Sun, Sep 20, 2020 at 08:10:31PM +0100, Al Viro wrote: > IMO it's much saner to mark those and refuse to touch them from io_uring... Simpler solution is to remove io_uring from the 32-bit syscall list. If you're a 32-bit process, you don't get to use io_uring. Would any real users actually

Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

2020-09-20 Thread Matthew Wilcox
On Sun, Sep 20, 2020 at 07:07:42PM +0100, Al Viro wrote: > 2) a few drivers are really fucked in head. They use different > *DATA* layouts for reads/writes, depending upon the calling process. > IOW, if you fork/exec a 32bit binary and your stdin is one of those, > reads from stdin in

Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

2020-09-20 Thread Matthew Wilcox
On Fri, Sep 18, 2020 at 02:45:25PM +0200, Christoph Hellwig wrote: > Add a flag to force processing a syscall as a compat syscall. This is > required so that in_compat_syscall() works for I/O submitted by io_uring > helper threads on behalf of compat syscalls. Al doesn't like this much, but my

Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-19 Thread Matthew Wilcox
On Sat, Sep 19, 2020 at 10:18:54AM -0700, Linus Torvalds wrote: > On Sat, Sep 19, 2020 at 2:50 AM Thomas Gleixner wrote: > > > > this provides a preemptible variant of kmap_atomic & related > > interfaces. This is achieved by: > > Ack. This looks really nice, even apart from the new capability.

Re: [PATCH 3/9] fs: explicitly check for CHECK_IOVEC_ONLY in rw_copy_check_uvector

2020-09-18 Thread Matthew Wilcox
On Fri, Sep 18, 2020 at 02:45:27PM +0200, Christoph Hellwig wrote: > } > - if (type >= 0 > - && unlikely(!access_ok(buf, len))) { > + if (type != CHECK_IOVEC_ONLY && unlikely(!access_ok(buf, len))) > { drop the unlikely() at the same time?

Re: [5.9.0-rc5-20200914] Kernel crash while running LTP(mlock201)

2020-09-15 Thread Matthew Wilcox
On Tue, Sep 15, 2020 at 09:24:38PM +1000, Michael Ellerman wrote: > Sachin Sant writes: > > While running LTP tests (specifically mlock201) against next-20200914 tree > > on a POWER9 LPAR results in following crash. > > Looks the same as: > >

Re: [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped.

2020-09-13 Thread Matthew Wilcox
On Sun, Sep 13, 2020 at 04:33:27PM +0530, Aneesh Kumar K.V wrote: > With highmem, pte_alloc_map() keep the level4 page table mapped using Linux terminology numbers the page tables in the opposite direction.

Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2

2020-09-01 Thread Matthew Wilcox
On Tue, Sep 01, 2020 at 06:25:12PM +0100, Al Viro wrote: > On Tue, Sep 01, 2020 at 07:13:00PM +0200, Christophe Leroy wrote: > > > 10.92% dd [kernel.kallsyms] [k] iov_iter_zero > > Interesting... Could you get an instruction-level profile inside > iov_iter_zero(), > along with the

Flushing transparent hugepages

2020-08-18 Thread Matthew Wilcox
If your arch does not support HAVE_ARCH_TRANSPARENT_HUGEPAGE, you can stop reading now. Although maybe you're curious about adding support. $ git grep -w HAVE_ARCH_TRANSPARENT_HUGEPAGE arch arch/Kconfig:config HAVE_ARCH_TRANSPARENT_HUGEPAGE arch/arc/Kconfig:config HAVE_ARCH_TRANSPARENT_HUGEPAGE

Re: [PATCH 4/8] asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()

2020-06-27 Thread Matthew Wilcox
On Sat, Jun 27, 2020 at 05:34:49PM +0300, Mike Rapoport wrote: > More elaborate versions on arm64 and x86 account memory for the user page > tables and call to pgtable_pmd_page_ctor() as the part of PMD page > initialization. > > Move the arm64 version to include/asm-generic/pgalloc.h and use the

[PATCH 9/8] mm: Account PMD tables like PTE tables

2020-06-27 Thread Matthew Wilcox
it used to be so the inaccuracy is starting to matter. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/mm.h | 24 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index dc7b87310c10..b283e25fcffa 100644

Re: [PATCH 0/8] mm: cleanup usage of

2020-06-27 Thread Matthew Wilcox
ctions where > appropriate. For the series: Reviewed-by: Matthew Wilcox (Oracle)

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 01:31:57PM +0200, Michal Hocko wrote: > On Wed 17-06-20 04:08:20, Matthew Wilcox wrote: > > If you call vfree() under > > a spinlock, you're in trouble. in_atomic() only knows if we hold a > > spinlock for CONFIG_PREEMPT, so it's not safe t

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 09:12:12AM +0200, Michal Hocko wrote: > On Tue 16-06-20 17:37:11, Matthew Wilcox wrote: > > Not just performance critical, but correctness critical. Since kvfree() > > may allocate from the vmalloc allocator, I really think that kvfree() > > sh

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 01:01:30AM +0200, David Sterba wrote: > On Tue, Jun 16, 2020 at 11:53:50AM -0700, Joe Perches wrote: > > On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote: > > > v4: > > > - Break out the memzero_explicit() change as suggested by Dan Carpenter > > > so that it can

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Matthew Wilcox
On Tue, Jun 16, 2020 at 11:53:50AM -0700, Joe Perches wrote: > To this larger audience and last week without reply: > https://lore.kernel.org/lkml/573b3fbd5927c643920e1364230c296b23e7584d.ca...@perches.com/ > > Are there _any_ fastpath uses of kfree or vfree? I worked on adding a 'free' a couple

Re: [linux-next RFC] mm/gup.c: Convert to use get_user_pages_fast_only()

2020-05-23 Thread Matthew Wilcox
On Sat, May 23, 2020 at 10:11:12PM +0530, Souptick Joarder wrote: > Renaming the API __get_user_pages_fast() to get_user_pages_ > fast_only() to align with pin_user_pages_fast_only(). Please don't split a function name across lines. That messes up people who are grepping for the function name in

Re: [PATCH 08/12] mm: pgtable: add shortcuts for accessing kernel PMD and PTE

2020-05-12 Thread Matthew Wilcox
On Tue, May 12, 2020 at 09:44:18PM +0300, Mike Rapoport wrote: > +++ b/include/linux/pgtable.h > @@ -28,6 +28,24 @@ > #define USER_PGTABLES_CEILING0UL > #endif > > +/* FIXME: */ Fix you what? Add documentation? > +static inline pmd_t *pmd_off(struct mm_struct *mm, unsigned long va)

Re: [PATCH 03/12] mm: reorder includes after introduction of linux/pgtable.h

2020-05-12 Thread Matthew Wilcox
On Tue, May 12, 2020 at 09:44:13PM +0300, Mike Rapoport wrote: > diff --git a/arch/alpha/kernel/proto.h b/arch/alpha/kernel/proto.h > index a093cd45ec79..701a05090141 100644 > --- a/arch/alpha/kernel/proto.h > +++ b/arch/alpha/kernel/proto.h > @@ -2,8 +2,6 @@ > #include > #include > >

Re: [PATCH v2 4/4] mm/vmalloc: Hugepage vmalloc mappings

2020-04-14 Thread Matthew Wilcox
On Tue, Apr 14, 2020 at 02:28:35PM +0200, Christophe Leroy wrote: > Le 13/04/2020 à 15:41, Matthew Wilcox a écrit : > > On Mon, Apr 13, 2020 at 10:53:03PM +1000, Nicholas Piggin wrote: > > > +static int vmap_pages_range_noflush(unsigned long start, unsign

Re: [PATCH v2 4/4] mm/vmalloc: Hugepage vmalloc mappings

2020-04-13 Thread Matthew Wilcox
On Mon, Apr 13, 2020 at 10:53:03PM +1000, Nicholas Piggin wrote: > +static int vmap_pages_range_noflush(unsigned long start, unsigned long end, > + pgprot_t prot, struct page **pages, > + unsigned int page_shift) > +{ > + if

Re: [PATCH v2 1/4] mm/vmalloc: fix vmalloc_to_page for huge vmap mappings

2020-04-13 Thread Matthew Wilcox
On Mon, Apr 13, 2020 at 10:53:00PM +1000, Nicholas Piggin wrote: > vmalloc_to_page returns NULL for addresses mapped by larger pages[*]. > Whether or not a vmap is huge depends on the architecture details, > alignments, boot options, etc., which the caller can not be expected > to know. Therefore

Re: [PATCH 10/28] mm: only allow page table mappings for built-in zsmalloc

2020-04-08 Thread Matthew Wilcox
On Wed, Apr 08, 2020 at 05:12:03PM +0200, Peter Zijlstra wrote: > On Wed, Apr 08, 2020 at 08:01:00AM -0700, Randy Dunlap wrote: > > Hi, > > > > On 4/8/20 4:59 AM, Christoph Hellwig wrote: > > > diff --git a/mm/Kconfig b/mm/Kconfig > > > index 36949a9425b8..614cc786b519 100644 > > > ---

Re: [PATCH 6/6] exec: open code copy_string_kernel

2020-04-06 Thread Matthew Wilcox
On Mon, Apr 06, 2020 at 02:03:12PM +0200, Christoph Hellwig wrote: > + int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating null */; If you end up doing another version of this, it's a terminating NUL, not null. I almost wonder if we shouldn't have #define TERMINATING_NUL 1 in
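
The NUL/NULL distinction and the "+ 1" can be shown with a small user-space sketch. It assumes a deliberately tiny limit in place of MAX_ARG_STRLEN; DEMO_MAX_ARG_STRLEN and arg_copy_len() are hypothetical, and the handling of over-long strings is simplified compared with the kernel.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; the kernel's MAX_ARG_STRLEN is much larger. */
#define DEMO_MAX_ARG_STRLEN 16
#define TERMINATING_NUL 1        /* the '\0' byte, not a NULL pointer */

/*
 * Bytes needed to copy 'arg' including its terminator, or 0 if no NUL byte
 * was found within the limit.
 */
static size_t arg_copy_len(const char *arg)
{
	size_t len;

	assert(arg != NULL);     /* NULL is a pointer value; NUL is a character */
	len = strnlen(arg, DEMO_MAX_ARG_STRLEN);
	if (len == DEMO_MAX_ARG_STRLEN)
		return 0;        /* not terminated within the limit */
	return len + TERMINATING_NUL;
}
```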

Re: [PATCH v4 10/25] nvdimm: Add driver for OpenCAPI Persistent Memory

2020-03-28 Thread Matthew Wilcox
On Sat, Mar 28, 2020 at 07:56:17PM -0700, Matthew Wilcox wrote: > On Fri, Mar 27, 2020 at 06:11:47PM +1100, Alastair D'Silva wrote: > > +static struct mutex minors_idr_lock; > > +static struct idr minors_idr; > ... > > + mutex_lock(&minors_idr_lock); > > + minor

Re: [PATCH v4 10/25] nvdimm: Add driver for OpenCAPI Persistent Memory

2020-03-28 Thread Matthew Wilcox
On Fri, Mar 27, 2020 at 06:11:47PM +1100, Alastair D'Silva wrote: > +static struct mutex minors_idr_lock; > +static struct idr minors_idr; ... > + mutex_lock(&minors_idr_lock); > + minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS, GFP_KERNEL); > + mutex_unlock(&minors_idr_lock); ... > +

Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices

2020-02-23 Thread Matthew Wilcox
On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote: > V3: > - Rebase against next/next-20200220 > - Move driver to arch/powerpc/platforms/powernv, we now expect this > driver to go upstream via the powerpc tree That's rather the opposite direction of normal; mostly drivers

Re: [PATCH] powerpc: add support for folded p4d page tables

2019-12-09 Thread Matthew Wilcox
On Mon, Dec 09, 2019 at 06:46:36PM +0100, Christophe Leroy wrote: > > > Le 09/12/2019 à 16:09, Mike Rapoport a écrit : > > From: Mike Rapoport > > > > Implement primitives necessary for the 4th level folding, add walks of p4d > > level where appropriate and replace 5level-fixup.h with

Re: [PATCH v2 00/27] Add support for OpenCAPI SCM devices

2019-12-03 Thread Matthew Wilcox
On Tue, Dec 03, 2019 at 03:01:17PM +1100, Alastair D'Silva wrote: > On Mon, 2019-12-02 at 19:50 -0800, Matthew Wilcox wrote: > > On Tue, Dec 03, 2019 at 02:46:28PM +1100, Alastair D'Silva wrote: > > > This series adds support for OpenCAPI SCM devices, exposing > > > >

Re: [PATCH v2 00/27] Add support for OpenCAPI SCM devices

2019-12-02 Thread Matthew Wilcox
On Tue, Dec 03, 2019 at 02:46:28PM +1100, Alastair D'Silva wrote: > This series adds support for OpenCAPI SCM devices, exposing Could we _not_ introduce yet another term for persistent memory?

Re: [RFC V2 0/1] mm/debug: Add tests for architecture exported page table helpers

2019-08-26 Thread Matthew Wilcox
On Mon, Aug 26, 2019 at 08:07:13AM +0530, Anshuman Khandual wrote: > On 08/09/2019 07:22 PM, Matthew Wilcox wrote: > > On Fri, Aug 09, 2019 at 04:05:07PM +0530, Anshuman Khandual wrote: > >> On 08/09/2019 03:46 PM, Matthew Wilcox wrote: > >>> On Fri, Aug 09, 2019

Re: [RFC V2 0/1] mm/debug: Add tests for architecture exported page table helpers

2019-08-09 Thread Matthew Wilcox
On Fri, Aug 09, 2019 at 04:05:07PM +0530, Anshuman Khandual wrote: > On 08/09/2019 03:46 PM, Matthew Wilcox wrote: > > On Fri, Aug 09, 2019 at 01:03:17PM +0530, Anshuman Khandual wrote: > >> Should alloc_gigantic_page() be made available as an interface for general > >>

Re: [RFC V2 0/1] mm/debug: Add tests for architecture exported page table helpers

2019-08-09 Thread Matthew Wilcox
On Fri, Aug 09, 2019 at 01:03:17PM +0530, Anshuman Khandual wrote: > Should alloc_gigantic_page() be made available as an interface for general > use in the kernel. The test module here uses very similar implementation from > HugeTLB to allocate a PUD aligned memory block. Similar for mm_alloc()

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-11 Thread Matthew Wilcox
On Thu, Jul 11, 2019 at 11:25:44PM +, Hoan Tran OS wrote: > In NUMA layout which nodes have memory ranges that span across other nodes, > the mm driver can detect the memory node id incorrectly. > > For example, with layout below > Node 0 address: > Node 1 address:

Re: [RFC V3] mm: Generalize and rename notify_page_fault() as kprobe_page_fault()

2019-06-07 Thread Matthew Wilcox
Before: > @@ -46,23 +46,6 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr) > return 0; > } > > -static nokprobe_inline int kprobes_fault(struct pt_regs *regs) > -{ > - if (!kprobes_built_in()) > - return 0; > - if (user_mode(regs)) > - return 0; >

Re: [RFC V2] mm: Generalize notify_page_fault()

2019-06-05 Thread Matthew Wilcox
On Wed, Jun 05, 2019 at 09:19:22PM +1000, Michael Ellerman wrote: > Anshuman Khandual writes: > > Similar notify_page_fault() definitions are being used by architectures > > duplicating much of the same code. This attempts to unify them into a > > single implementation, generalize it and then

Re: [RFC V2] mm: Generalize notify_page_fault()

2019-06-04 Thread Matthew Wilcox
On Tue, Jun 04, 2019 at 12:04:06PM +0530, Anshuman Khandual wrote: > +++ b/arch/x86/mm/fault.c > @@ -46,23 +46,6 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr) > return 0; > } > > -static nokprobe_inline int kprobes_fault(struct pt_regs *regs) > -{ ... > -} > diff --git

Re: [RFC] mm: Generalize notify_page_fault()

2019-05-31 Thread Matthew Wilcox
On Fri, May 31, 2019 at 02:17:43PM +0530, Anshuman Khandual wrote: > On 05/30/2019 07:09 PM, Matthew Wilcox wrote: > > On Thu, May 30, 2019 at 05:31:15PM +0530, Anshuman Khandual wrote: > >> On 05/30/2019 04:36 PM, Matthew Wilcox wrote: > >>> The two handle preem

Re: [RFC] mm: Generalize notify_page_fault()

2019-05-30 Thread Matthew Wilcox
On Thu, May 30, 2019 at 05:31:15PM +0530, Anshuman Khandual wrote: > On 05/30/2019 04:36 PM, Matthew Wilcox wrote: > > The two handle preemption differently. Why is x86 wrong and this one > > correct? > > Here it expects context to be already non-preemptible where as t

Re: [RFC] mm: Generalize notify_page_fault()

2019-05-30 Thread Matthew Wilcox
On Thu, May 30, 2019 at 11:25:13AM +0530, Anshuman Khandual wrote: > Similar notify_page_fault() definitions are being used by architectures > duplicating much of the same code. This attempts to unify them into a > single implementation, generalize it and then move it to a common place. >

Re: [PATCH 1/2] open: add close_range()

2019-05-21 Thread Matthew Wilcox
On Tue, May 21, 2019 at 08:20:09PM +0100, Al Viro wrote: > On Tue, May 21, 2019 at 05:30:27PM +0100, David Howells wrote: > > > If we can live with close_from(int first) rather than close_range(), then > > this > > can perhaps be done a lot more efficiently by: > > > > new =

Re: [PATCH v2] mm: Fix modifying of page protection by insert_pfn_pmd()

2019-04-25 Thread Matthew Wilcox
On Thu, Apr 25, 2019 at 05:33:04PM -0700, Dan Williams wrote: > On Thu, Apr 25, 2019 at 12:32 AM Jan Kara wrote: > > > > We also call vmf_insert_pfn_pmd() in dax_insert_pfn_mkwrite() -- does > > > > that need to change too? > > > > > > It wasn't clear to me that it was a problem. I think that one

Re: [PATCH v2] mm: Fix modifying of page protection by insert_pfn_pmd()

2019-04-24 Thread Matthew Wilcox
On Wed, Apr 24, 2019 at 10:13:15AM -0700, Dan Williams wrote: > I think unaligned addresses have always been passed to > vmf_insert_pfn_pmd(), but nothing cared until this patch. I *think* > the only change needed is the following, thoughts? > > diff --git a/fs/dax.c b/fs/dax.c > index

Re: [PATCH v12 07/31] mm: make pte_unmap_same compatible with SPF

2019-04-23 Thread Matthew Wilcox
On Tue, Apr 16, 2019 at 03:44:58PM +0200, Laurent Dufour wrote: > +static inline vm_fault_t pte_unmap_same(struct vm_fault *vmf) > { > - int same = 1; > + int ret = 0; Surely 'ret' should be of type vm_fault_t? > + ret = VM_FAULT_RETRY; ... this should have thrown a

Re: [PATCH v12 00/31] Speculative page faults

2019-04-23 Thread Matthew Wilcox
On Tue, Apr 23, 2019 at 12:47:07PM +0200, Michal Hocko wrote: > On Mon 22-04-19 14:29:16, Michel Lespinasse wrote: > [...] > > I want to add a note about mmap_sem. In the past there has been > > discussions about replacing it with an interval lock, but these never > > went anywhere because,

Re: [PATCH v3 1/2] mm: add probe_user_read()

2019-02-07 Thread Matthew Wilcox
On Wed, Jan 16, 2019 at 04:59:27PM +, Christophe Leroy wrote: > v3: Moved 'Returns:" comment after description. > Explained in the commit log why the function is defined static inline > > v2: Added "Returns:" comment and removed probe_user_address() The correct spelling is 'Return:',

Re: [PATCH V2] mm: Introduce GFP_PGTABLE

2019-01-16 Thread Matthew Wilcox
On Wed, Jan 16, 2019 at 02:47:16PM +0100, Christophe Leroy wrote: > Le 16/01/2019 à 14:18, Matthew Wilcox a écrit : > > I disagree with your objective. Making more code common is a great idea, > > but this patch is too unambitious. We should be heading towards one or >

Re: [PATCH V2] mm: Introduce GFP_PGTABLE

2019-01-16 Thread Matthew Wilcox
On Wed, Jan 16, 2019 at 06:42:22PM +0530, Anshuman Khandual wrote: > On 01/16/2019 06:00 PM, Matthew Wilcox wrote: > > On Wed, Jan 16, 2019 at 07:57:03AM +0100, Michal Hocko wrote: > >> On Wed 16-01-19 11:51:32, Anshuman Khandual wrote: > >>> All architecture

Re: [PATCH V2] mm: Introduce GFP_PGTABLE

2019-01-16 Thread Matthew Wilcox
On Wed, Jan 16, 2019 at 07:57:03AM +0100, Michal Hocko wrote: > On Wed 16-01-19 11:51:32, Anshuman Khandual wrote: > > All architectures have been defining their own PGALLOC_GFP as (GFP_KERNEL | > > __GFP_ZERO) and using it for allocating page table pages. This causes some > > code duplication

Re: [PATCH] mm: Introduce GFP_PGTABLE

2019-01-12 Thread Matthew Wilcox
On Sat, Jan 12, 2019 at 02:49:29PM +0100, Christophe Leroy wrote: > As far as I can see, > > #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT) > > So what's the difference between: > > (GFP_KERNEL_ACCOUNT | __GFP_ZERO) & ~__GFP_ACCOUNT > > and > > (GFP_KERNEL | __GFP_ZERO) &
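The mask arithmetic in the question above can be checked with a small standalone sketch. The bit values below are illustrative placeholders, not the kernel's actual definitions, and every `DEMO_`/`demo_` name is hypothetical. The only relationship taken from the message is that GFP_KERNEL_ACCOUNT is (GFP_KERNEL | __GFP_ACCOUNT); the sketch further assumes that GFP_KERNEL itself does not contain the __GFP_ACCOUNT bit.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel's real flag values differ. */
#define DEMO_GFP_KERNEL   0x0cc0u
#define DEMO_GFP_ZERO     0x0100u
#define DEMO_GFP_ACCOUNT  0x1000u

/* Same shape as the definition quoted in the message. */
#define DEMO_GFP_KERNEL_ACCOUNT (DEMO_GFP_KERNEL | DEMO_GFP_ACCOUNT)

/* First expression from the question: set the accounting bit, then mask it off. */
static unsigned int demo_masked_account(void)
{
	return (DEMO_GFP_KERNEL_ACCOUNT | DEMO_GFP_ZERO) & ~DEMO_GFP_ACCOUNT;
}

/* The plain combination with no accounting bit involved at all. */
static unsigned int demo_plain(void)
{
	return DEMO_GFP_KERNEL | DEMO_GFP_ZERO;
}
```

Under those assumptions both helpers return the same mask, because ORing a bit in and then clearing it leaves the remaining bits untouched.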

Re: [PATCH] mm: Introduce GFP_PGTABLE

2019-01-12 Thread Matthew Wilcox
On Sat, Jan 12, 2019 at 03:56:38PM +0530, Anshuman Khandual wrote: > All architectures have been defining their own PGALLOC_GFP as (GFP_KERNEL | > __GFP_ZERO) and using it for allocating page table pages. Except that's not true. > +++ b/arch/x86/mm/pgtable.c > @@ -13,19 +13,17 @@ phys_addr_t

Re: [PATCH RFC 7/7] mm: better document PG_reserved

2018-12-05 Thread Matthew Wilcox
On Wed, Dec 05, 2018 at 04:05:12PM +0100, David Hildenbrand wrote: > On 05.12.18 15:35, Matthew Wilcox wrote: > > On Wed, Dec 05, 2018 at 01:28:51PM +0100, David Hildenbrand wrote: > >> I don't see a reason why we have to document "Some of them might not even > >>

Re: [PATCH RFC 7/7] mm: better document PG_reserved

2018-12-05 Thread Matthew Wilcox
On Wed, Dec 05, 2018 at 01:28:51PM +0100, David Hildenbrand wrote: > I don't see a reason why we have to document "Some of them might not even > exist". If there is a user, we should document it. E.g. for balloon > drivers we now use PG_offline to indicate that a page might currently > not be

Re: dad4f140ed ("Merge branch 'xarray' of .."): Mem-Info:

2018-11-23 Thread Matthew Wilcox
On Sat, Nov 24, 2018 at 09:20:38AM +0800, kernel test robot wrote: > Greetings, > > 0day kernel testing robot got the below dmesg and the first bad commit is > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master > > commit dad4f140edaa3f6bb452b6913d41af1ffd672e45 I

Re: [LKP] dad4f140ed [ 7.709376] WARNING:suspicious_RCU_usage

2018-11-19 Thread Matthew Wilcox
On Sun, Nov 18, 2018 at 05:19:04PM -0800, Matthew Wilcox wrote: > On Mon, Nov 19, 2018 at 09:08:20AM +0800, kernel test robot wrote: > > Greetings, > > > > 0day kernel testing robot got the below dmesg and the first bad commit is > > Umm. I don't see a 'suspicious RCU

Re: [LKP] dad4f140ed [ 7.709376] WARNING:suspicious_RCU_usage

2018-11-18 Thread Matthew Wilcox
On Mon, Nov 19, 2018 at 09:08:20AM +0800, kernel test robot wrote: > Greetings, > > 0day kernel testing robot got the below dmesg and the first bad commit is Umm. I don't see a 'suspicious RCU usage' message in here. I see a lot of vmalloc warnings ... ? > [7.699777] swapper: vmalloc:

Re: [PATCH] Documentation: fix spelling mistake, EACCESS -> EACCES

2018-10-26 Thread Matthew Wilcox
On Fri, Oct 26, 2018 at 08:20:12PM +0200, Miguel Ojeda wrote: > On Fri, Oct 26, 2018 at 7:27 PM Colin King wrote: > > > > From: Colin Ian King > > > > Trivial fix to a spelling mistake of the error access name EACCESS, > > rename to EACCES > > ? It is not a typo, it is the name of the error

Re: [PATCH v7 1/4] gpiolib: Pass bitmaps, not integer arrays, to get/set array

2018-09-02 Thread Matthew Wilcox
> +++ b/drivers/auxdisplay/hd44780.c > @@ -62,17 +62,12 @@ static void hd44780_strobe_gpio(struct hd44780 *hd) > /* write to an LCD panel register in 8 bit GPIO mode */ > static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs) > { > - int values[10]; /* for DATA[0-7],

Re: [PATCH resend] powerpc/64s: fix page table fragment refcount race vs speculative references

2018-07-27 Thread Matthew Wilcox
On Sat, Jul 28, 2018 at 12:29:06AM +1000, Nicholas Piggin wrote: > On Fri, 27 Jul 2018 06:41:56 -0700 > Matthew Wilcox wrote: > > > On Fri, Jul 27, 2018 at 09:48:17PM +1000, Nicholas Piggin wrote: > > > The page table fragment allocator uses the main page refcount r

Re: [PATCH resend] powerpc/64s: fix page table fragment refcount race vs speculative references

2018-07-27 Thread Matthew Wilcox
On Fri, Jul 27, 2018 at 09:48:17PM +1000, Nicholas Piggin wrote: > The page table fragment allocator uses the main page refcount racily > with respect to speculative references. A customer observed a BUG due > to page table page refcount underflow in the fragment allocator. This > can be caused by

Re: [PATCH 15/26] ppc: Convert vas ID allocation to new IDA API

2018-07-05 Thread Matthew Wilcox
On Thu, Jun 21, 2018 at 02:28:24PM -0700, Matthew Wilcox wrote: > Removes a custom spinlock and simplifies the code. I took a closer look at this patch as part of fixing the typo *ahem*. The original code is buggy at the limit: - if (winid > VAS_WINDOWS_PER_CHIP) { -
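A minimal sketch of the kind of off-by-one that a `>` comparison against a limit can hide. It assumes the ID is later used as an index into a table with exactly that many slots; the snippet above is truncated before it says so, so treat that as an assumption. `DEMO_WINDOWS_PER_CHIP` and both helper names are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit: a table of this size has valid indexes 0..7. */
#define DEMO_WINDOWS_PER_CHIP 8

/* Same comparison as the quoted check: only ids above the limit are rejected. */
static bool demo_rejected_original(int winid)
{
	return winid > DEMO_WINDOWS_PER_CHIP;
}

/* Stricter comparison: the limit itself is rejected as well. */
static bool demo_rejected_strict(int winid)
{
	return winid >= DEMO_WINDOWS_PER_CHIP;
}
```

With the `>` form, an ID equal to the limit is accepted even though it would index one slot past the end of such a table; the `>=` form rejects it.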
