Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-09 Thread Dave Chinner
On Fri, May 08, 2015 at 11:02:28PM -0400, Rik van Riel wrote: > On 05/08/2015 09:14 PM, Linus Torvalds wrote: > > On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote: > >> > >> However, for persistent memory, all of the files will be "in memory". > > > > Yes. However, I doubt you will find a very

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Linus Torvalds
On Fri, May 8, 2015 at 8:02 PM, Rik van Riel wrote: > > The TLB performance bonus of accessing the large files with > large pages may make it worthwhile to solve that hard problem. Very few people can actually measure that TLB advantage on systems with good TLB's. It's largely a myth, fed by

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Rik van Riel
On 05/08/2015 09:14 PM, Linus Torvalds wrote: > On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote: >> >> However, for persistent memory, all of the files will be "in memory". > > Yes. However, I doubt you will find a very sane rw filesystem that > then also makes them contiguous and aligns them

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Linus Torvalds
On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote: > > However, for persistent memory, all of the files will be "in memory". Yes. However, I doubt you will find a very sane rw filesystem that then also makes them contiguous and aligns them at 2MB boundaries. Anything is possible, I guess, but

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread John Stoffel
> "Linus" == Linus Torvalds writes: Linus> On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote: >> >> Now go and look at your /home or /data/ or /work areas, where the >> endusers are actually keeping their day to day work. Photos, mp3, >> design files, source code, object code littered

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Rik van Riel
On 05/08/2015 11:54 AM, Linus Torvalds wrote: > On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote: >> >> Now go and look at your /home or /data/ or /work areas, where the >> endusers are actually keeping their day to day work. Photos, mp3, >> design files, source code, object code littered

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Al Viro
On Fri, May 08, 2015 at 08:54:06AM -0700, Linus Torvalds wrote: > However, the big files in that list are almost immaterial from a > caching standpoint. .git/objects/pack/* caching matters a lot, though...

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Linus Torvalds
On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote: > > Now go and look at your /home or /data/ or /work areas, where the > endusers are actually keeping their day to day work. Photos, mp3, > design files, source code, object code littered around, etc. However, the big files in that list are

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Rik van Riel
On 05/08/2015 10:05 AM, Ingo Molnar wrote: > * Rik van Riel wrote: >> Memory trends point in one direction, file size trends in another. >> >> For persistent memory, we would not need 4kB page struct pages >> unless memory from a particular area was in small files AND those >> files were being

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread John Stoffel
> "Ingo" == Ingo Molnar writes: Ingo> * Rik van Riel wrote: >> The disadvantage is pretty obvious too: 4kB pages would no longer be >> the fast case, with an indirection. I do not know how much of an >> issue that would be, or whether it even makes sense for 4kB pages to >> continue

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Ingo Molnar
* Rik van Riel wrote: > The disadvantage is pretty obvious too: 4kB pages would no longer be > the fast case, with an indirection. I do not know how much of an > issue that would be, or whether it even makes sense for 4kB pages to > continue being the fast case going forward. I strongly

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Rik van Riel
On 05/07/2015 03:11 PM, Ingo Molnar wrote: > Stable, global page-struct descriptors are a given for real RAM, where > we allocate a struct page for every page in nice, large, mostly linear > arrays. > > We'd really need that for pmem too, to get the full power of struct > page: and that means

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Al Viro
On Fri, May 08, 2015 at 11:26:01AM +0200, Ingo Molnar wrote: > > * Al Viro wrote: > > > On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote: > > > > > So if code does iov_iter_get_pages_alloc() on a user address that > > > has a real struct page behind it - and some other code does a

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Ingo Molnar
* Al Viro wrote: > On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote: > > > So if code does iov_iter_get_pages_alloc() on a user address that > > has a real struct page behind it - and some other code does a > > regular get_user_pages() on it, we'll have two sets of struct page >

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-08 Thread Al Viro
On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote: > same as iov_iter_get_pages(), except that pages array is allocated > (kmalloc if possible, vmalloc if that fails) and left for caller to > free. Lustre and NFS ->direct_IO() switched to it. > > Signed-off-by: Al

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
Al, I was wondering about the struct page rules of iov_iter_get_pages_alloc(), used in various places. There's no documentation whatsoever in lib/iov_iter.c, nor in include/linux/uio.h, and the changelog that introduced it only says: commit 91f79c43d1b54d7154b118860d81b39bad07dfff Author:

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Jerome Glisse
On Thu, May 07, 2015 at 09:53:13PM +0200, Ingo Molnar wrote: > > * Ingo Molnar wrote: > > > > Is handling kernel pagefault on the vmemmap completely out of the > > > picture? So we would carve out a chunk of kernel address space > > > for those pfn and use it for vmemmap and handle pagefault

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 10:43 AM, Linus Torvalds wrote: > On Thu, May 7, 2015 at 9:03 AM, Dan Williams wrote: >> >> Ok, I'll keep thinking about this and come back when we have a better >> story about passing mmap'd persistent memory around in userspace. > > Ok. And if we do decide to go with

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Ingo Molnar wrote: > > Is handling kernel pagefault on the vmemmap completely out of the > > picture? So we would carve out a chunk of kernel address space > > for those pfn and use it for vmemmap and handle pagefault on it. > > That's pretty clever. The page fault doesn't even have to do

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Jerome Glisse wrote: > > So I think the main value of struct page is if everyone on the > > system sees the same struct page for the same pfn - not just the > > temporary IO instance. > > > > The idea of having very temporary struct page arrays misses the > > point I think: if struct page

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 11:40 AM, Ingo Molnar wrote: > > * Dan Williams wrote: > >> On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote: >> > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote: >> >> What is the primary thing that is driving this need? Do we have a very >> >>

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Jerome Glisse
On Thu, May 07, 2015 at 09:11:07PM +0200, Ingo Molnar wrote: > > * Dave Hansen wrote: > > > On 05/07/2015 10:42 AM, Dan Williams wrote: > > > On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote: > > >> * Dan Williams wrote: > > >> > > >> So is there anything fundamentally wrong about creating

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Dave Hansen wrote: > On 05/07/2015 10:42 AM, Dan Williams wrote: > > On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote: > >> * Dan Williams wrote: > >> > >> So is there anything fundamentally wrong about creating struct > >> page backing at mmap() time (and making sure aliased mmaps share

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Dan Williams wrote: > On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote: > > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote: > >> What is the primary thing that is driving this need? Do we have a very > >> concrete example? > > > > FYI, I plan to implement RAID

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dave Hansen
On 05/07/2015 10:42 AM, Dan Williams wrote: > On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote: >> * Dan Williams wrote: >> So is there anything fundamentally wrong about creating struct page >> backing at mmap() time (and making sure aliased mmaps share struct >> page arrays)? > > Something

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Dan Williams wrote: > > That looks like a layering violation and a mistake to me. If we > > want to do direct (sector_t -> sector_t) IO, with no serialization > > worries, it should have its own (simple) API - which things like > > hierarchical RAID or RDMA APIs could use. > > I'm wrapped

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Linus Torvalds
On Thu, May 7, 2015 at 9:03 AM, Dan Williams wrote: > > Ok, I'll keep thinking about this and come back when we have a better > story about passing mmap'd persistent memory around in userspace. Ok. And if we do decide to go with your kind of "__pfn" type, I'd probably prefer that we encode the

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote: > > * Dan Williams wrote: > >> > Anyway, I did want to say that while I may not be convinced about >> > the approach, I think the patches themselves don't look horrible. >> > I actually like your "__pfn_t". So while I (very obviously) have >> >

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Dan Williams wrote: > > Anyway, I did want to say that while I may not be convinced about > > the approach, I think the patches themselves don't look horrible. > > I actually like your "__pfn_t". So while I (very obviously) have > > some doubts about this approach, it may be that the most

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Jerome Glisse
On Thu, May 07, 2015 at 06:18:07PM +0200, Christoph Hellwig wrote: > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote: > > What is the primary thing that is driving this need? Do we have a very > > concrete example? > > FYI, I plan to implement RAID acceleration using nvdimms,

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote: > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote: >> What is the primary thing that is driving this need? Do we have a very >> concrete example? > > FYI, I plan to implement RAID acceleration using nvdimms, and I plan to

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Christoph Hellwig
On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote: > What is the primary thing that is driving this need? Do we have a very > concrete example? FYI, I plan to implement RAID acceleration using nvdimms, and I plan to use pages for that. The code just merged for 4.1 can easily

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 8:58 AM, Linus Torvalds wrote: > On Thu, May 7, 2015 at 8:40 AM, Dan Williams wrote: >> >> blkdev_get(FMODE_EXCL) is the protection in this case. > > Ugh. That looks like a horrible nasty big hammer that will bite us > badly some day. Since you'd have to hold it for the

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Linus Torvalds
On Thu, May 7, 2015 at 8:40 AM, Dan Williams wrote: > > blkdev_get(FMODE_EXCL) is the protection in this case. Ugh. That looks like a horrible nasty big hammer that will bite us badly some day. Since you'd have to hold it for the whole IO. But I guess it at least works. Anyway, I did want to

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 7:42 AM, Ingo Molnar wrote: > > * Ingo Molnar wrote: > >> [...] >> >> For anything more complex, that maps any of this storage to >> user-space, or exposes it to higher level struct page based APIs, >> etc., where references matter and it's more of a cache with >>

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Dan Williams
On Thu, May 7, 2015 at 8:00 AM, Linus Torvalds wrote: > On Wed, May 6, 2015 at 7:36 PM, Dan Williams wrote: >> >> My pet concrete example is covered by __pfn_t. Referencing persistent >> memory in an md/dm hierarchical storage configuration. Setting aside >> the thrash to get existing block

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Linus Torvalds
On Wed, May 6, 2015 at 7:36 PM, Dan Williams wrote: > > My pet concrete example is covered by __pfn_t. Referencing persistent > memory in an md/dm hierarchical storage configuration. Setting aside > the thrash to get existing block users to do "bvec_set_page(page)" > instead of "bvec->page =

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Ingo Molnar wrote: > [...] > > For anything more complex, that maps any of this storage to > user-space, or exposes it to higher level struct page based APIs, > etc., where references matter and it's more of a cache with > potentially multiple users, not an IO space, the natural API is >

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar
* Dan Williams wrote: > > What is the primary thing that is driving this need? Do we have a > > very concrete example? > > My pet concrete example is covered by __pfn_t. Referencing > persistent memory in an md/dm hierarchical storage configuration. > Setting aside the thrash to get

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Dan Williams
On Wed, May 6, 2015 at 5:19 PM, Linus Torvalds wrote: > On Wed, May 6, 2015 at 4:47 PM, Dan Williams wrote: >> >> Conceptually better, but certainly more difficult to audit if the fake >> struct page is initialized in a subtle way that breaks when/if it >> leaks to some unwitting context. > >

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Linus Torvalds
On Wed, May 6, 2015 at 4:47 PM, Dan Williams wrote: > > Conceptually better, but certainly more difficult to audit if the fake > struct page is initialized in a subtle way that breaks when/if it > leaks to some unwitting context. Maybe. It could go either way, though. In particular, with the

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Dan Williams
On Wed, May 6, 2015 at 3:10 PM, Linus Torvalds wrote: > On Wed, May 6, 2015 at 1:04 PM, Dan Williams wrote: >> >> The motivation for this change is persistent memory and the desire to >> use it not only via the pmem driver, but also as a memory target for I/O >> (DAX, O_DIRECT, DMA, RDMA, etc)

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Linus Torvalds
On Wed, May 6, 2015 at 1:04 PM, Dan Williams wrote: > > The motivation for this change is persistent memory and the desire to > use it not only via the pmem driver, but also as a memory target for I/O > (DAX, O_DIRECT, DMA, RDMA, etc) in other parts of the kernel. I detest this approach. I'd

Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Al Viro
On Wed, May 06, 2015 at 04:04:53PM -0400, Dan Williams wrote: > Changes since v1 [1]: > > 1/ added include/asm-generic/pfn.h for the __pfn_t definition and helpers. > > 2/ added kmap_atomic_pfn_t() > > 3/ rebased on v4.1-rc2 > > [1]: http://marc.info/?l=linux-kernel&m=142653770511970&w=2 > > ---

[PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-06 Thread Dan Williams
Changes since v1 [1]: 1/ added include/asm-generic/pfn.h for the __pfn_t definition and helpers. 2/ added kmap_atomic_pfn_t() 3/ rebased on v4.1-rc2 [1]: http://marc.info/?l=linux-kernel&m=142653770511970&w=2 --- A lead in note, this looks scarier than it is. Most of the code thrash is
