Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-02-07 Thread Sakari Ailus
On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
> backing pages") introduced the sg_page_iter_dma_address() function without
> providing a way to use it in the general case. If the sg_dma_len is not
> equal to the dma_length callers cannot safely use the
> for_each_sg_page/sg_page_iter_dma_address combination.
> 
> Resolve this API mistake by providing a DMA specific iterator,
> for_each_sg_dma_page(), that uses the right length so
> sg_page_iter_dma_address() works as expected with all sglists. A new
> iterator type is introduced to provide compile-time safety against wrongly
> mixing accessors and iterators.
> 
> Signed-off-by: Jason Gunthorpe 
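A minimal sketch of the new iterator in use, assuming a scatterlist already
mapped with dma_map_sg(); the helper name is invented and "nents" is the
count dma_map_sg() returned:

	#include <linux/scatterlist.h>

	static void program_device_pages(struct sg_table *sgt, int nents)
	{
		struct sg_dma_page_iter dma_iter;

		for_each_sg_dma_page(sgt->sgl, &dma_iter, nents, 0) {
			dma_addr_t addr = sg_page_iter_dma_address(&dma_iter);

			/* hand addr to the hardware one PAGE_SIZE page at a
			 * time; sg_page_iter_page() must not be called on
			 * this iterator type
			 */
		}
	}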

Acked-by: Sakari Ailus  (ipu3-cio2)

-- 
Sakari Ailus
sakari.ai...@linux.intel.com


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-18 Thread Jason Gunthorpe
On Thu, Jan 17, 2019 at 10:30:01AM +0100, h...@lst.de wrote:
> On Wed, Jan 16, 2019 at 10:24:36AM -0700, Jason Gunthorpe wrote:
> > The fact is there is 0 industry interest in using RDMA on platforms
> > that can't do HW DMA cache coherency - the kernel syscalls required to
> > do the cache flushing on the IO path would just destroy performance to
> > the point of making RDMA pointless. Better to use netdev on those
> > platforms.
> 
> In general there is no syscall required for doing cache flushing, you
> just issue the proper instructions directly from userspace. 

At least on the ARM/MIPS systems like this that I've worked with, the
cache manipulation instructions are privileged and cannot be executed
from userspace. So the general case requires a syscall.

> In that case we just need to block userspace DMA access entirely.
> Which given the amount of problems it creates sounds like a pretty
> good idea anyway.

I doubt there is any support for that idea...

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-17 Thread Thomas Hellstrom
On Thu, 2019-01-17 at 10:30 +0100, h...@lst.de wrote:
> On Wed, Jan 16, 2019 at 10:24:36AM -0700, Jason Gunthorpe wrote:
> > The fact is there is 0 industry interest in using RDMA on platforms
> > that can't do HW DMA cache coherency - the kernel syscalls required
> > to
> > do the cache flushing on the IO path would just destroy performance
> > to
> > the point of making RDMA pointless. Better to use netdev on those
> > platforms.
> 
> In general there is no syscall required for doing cache flushing, you
> just issue the proper instructions directly from userspace.

But what if there are other coherence issues? Like bounce-buffers?
I'd like to +1 on what Jason says about industry interest: FWIW, vmwgfx
is probably one of the graphics drivers that would lend itself best to
do a fully-dma-interface compliant graphics stack experiment. But being
a paravirtual driver, all platforms we can ever run on are fully
coherent unless someone introduces a fake incoherency by forcing
swiotlb. Putting many man-months of effort into supporting systems we
would never run on and can never test on can never make more than
academic sense.

>  
> 
> > The reality is that *all* the subsystems doing DMA kernel bypass are
> > ignoring the DMA mapping rules, I think we should support this
> > better,
> > and just accept that user space DMA will not be using syncing.
> > Block
> > access in cases when this is required, otherwise let it work as is
> > today.
> 
> In that case we just need to block userspace DMA access entirely.
> Which given the amount of problems it creates sounds like a pretty
> good idea anyway.

I'm not sure I'm following you here. Are you suggesting scrapping
support for all (GP)GPU and RDMA drivers?

Thanks,
Thomas




Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-17 Thread Jason Gunthorpe
On Wed, Jan 16, 2019 at 05:11:34PM +0100, h...@lst.de wrote:
> On Tue, Jan 15, 2019 at 02:25:01PM -0700, Jason Gunthorpe wrote:
> > RDMA needs something similar as well, in this case drivers take a
> > struct page * from get_user_pages() and need to have the DMA map fail
> > if the platform can't DMA map in a way that does not require any
> > additional DMA API calls to ensure coherence. (think Userspace RDMA
> > MR's)
> 
> Any time you dma map pages you need to do further DMA API calls to
> ensure coherence; that is the way it is implemented.  These calls
> just happen to be no-ops sometimes.
> 
> > Today we just do the normal DMA map and when it randomly doesn't work
> > and corrupts data tell those people their platforms don't support RDMA
> > - it would be nice to have a safer API base solution..
> 
> Now that all these drivers are consolidated in rdma-core you can fix
> the code to actually do the right thing.  It isn't that userspace DMA
> coherence is any harder than in-kernel DMA coherence.  It just is
> that no one bothered to do it properly.

If I recall we actually can't.. libverbs presents an API to the user
that does not consider this possibility. 

ie consider post_recv - the driver has no idea what user buffers
received data and can't possibly flush them transparently. The user
would have to call some special DMA syncing API, which we don't have.

It is the same reason the kernel API makes the ULP handle dma sync,
not the driver.
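A sketch of that split, with invented names: the in-kernel ULP that owns the
buffer does the sync around a receive completion, using the standard
streaming-DMA calls, while the device driver never touches the cache:

	/* sketch: ULP-side handling of a receive completion */
	static void ulp_recv_done(struct device *dma_dev, dma_addr_t buf_addr,
				  size_t len, void *cpu_buf)
	{
		/* give the buffer to the CPU before parsing it */
		dma_sync_single_for_cpu(dma_dev, buf_addr, len, DMA_FROM_DEVICE);
		/* ... parse cpu_buf ... */
		/* hand it back to the device for the next receive */
		dma_sync_single_for_device(dma_dev, buf_addr, len, DMA_FROM_DEVICE);
	}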

The fact is there is 0 industry interest in using RDMA on platforms
that can't do HW DMA cache coherency - the kernel syscalls required to
do the cache flushing on the IO path would just destroy performance to
the point of making RDMA pointless. Better to use netdev on those
platforms.

VFIO is in a similar boat. Their user API can't handle cache syncing
either, so they would use the same API too.

.. and the GPU-compute systems (ie OpenCL/CUDA) are like verbs, they
were never designed with incoherent DMA in mind, and don't have the
API design to support it.

The reality is that *all* the subsystems doing DMA kernel bypass are
ignoring the DMA mapping rules, I think we should support this better,
and just accept that user space DMA will not be using syncing. Block
access in cases when this is required, otherwise let it work as is
today.

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Shiraz Saleem
On Mon, Jan 14, 2019 at 03:16:54PM -0700, Jason Gunthorpe wrote:
> On Sat, Jan 12, 2019 at 01:03:05PM -0600, Shiraz Saleem wrote:
> > On Sat, Jan 12, 2019 at 06:37:58PM +, Jason Gunthorpe wrote:
> > > On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote:
> > > > On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> > > > > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists 
> > > > > w/o
> > > > > backing pages") introduced the sg_page_iter_dma_address() function 
> > > > > without
> > > > > providing a way to use it in the general case. If the sg_dma_len is 
> > > > > not
> > > > > equal to the dma_length callers cannot safely use the
> > > > > for_each_sg_page/sg_page_iter_dma_address combination.
> > > > > 
> > > > > Resolve this API mistake by providing a DMA specific iterator,
> > > > > for_each_sg_dma_page(), that uses the right length so
> > > > > sg_page_iter_dma_address() works as expected with all sglists. A new
> > > > > iterator type is introduced to provide compile-time safety against 
> > > > > wrongly
> > > > > mixing accessors and iterators.
> > > > [..]
> > > > 
> > > > >  
> > > > > +/*
> > > > > + * sg page iterator for DMA addresses
> > > > > + *
> > > > > + * This is the same as sg_page_iter however you can call
> > > > > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA
> > > > > + * address. sg_page_iter_page() cannot be called on this iterator.
> > > > > + */
> > > > Does it make sense to have a variant of sg_page_iter_page() to get the
> > > > page descriptor with this dma_iter? This can be used when walking 
> > > > DMA-mapped
> > > > SG lists with for_each_sg_dma_page.
> > > 
> > > I think that would be a complicated calculation to find the right
> > > offset into the page sg list to get the page pointer back. We can't
> > > just naively use the existing iterator location.
> > > 
> > > Probably places that need this are better to run with two iterators,
> > > less computationally expensive.
> > > 
> > > Did you find a need for this? 
> > > 
> > 
> > Well I was trying to convert the RDMA drivers to use your new iterator variant
> > and saw the need for it in locations where we need the virtual addresses of
> > the pages contained in the SGEs.
> > 
> > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c 
> > b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> > index 59eeac5..7d26903 100644
> > --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
> > +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> > @@ -85,7 +85,7 @@ static void __free_pbl(struct pci_dev *pdev, struct 
> > bnxt_qplib_pbl *pbl,
> >  static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
> >struct scatterlist *sghead, u32 pages, u32 pg_size)
> >  {
> > -   struct scatterlist *sg;
> > +   struct sg_dma_page_iter sg_iter;
> > bool is_umem = false;
> > int i;
> > 
> > @@ -116,12 +116,13 @@ static int __alloc_pbl(struct pci_dev *pdev, struct 
> > bnxt_qplib_pbl *pbl,
> > } else {
> > i = 0;
> > is_umem = true;
> > -   for_each_sg(sghead, sg, pages, i) {
> > -   pbl->pg_map_arr[i] = sg_dma_address(sg);
> > -   pbl->pg_arr[i] = sg_virt(sg);
> > +   for_each_sg_dma_page(sghead, &sg_iter, pages, 0) {
> > +   pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter);
> > +   pbl->pg_arr[i] = page_address(sg_page_iter_page(&sg_iter.base)); ???
> > ^
> 
> I concur with CH, pg_arr only looks used in the !umem case, so set to
> NULL here. Check with Selvin & Devesh ?

That makes sense to me. Any concerns with that Devesh?
 
> > @@ -191,16 +190,16 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
> > goto err1;
> > }
> > 
> > -   mem->page_shift = umem->page_shift;
> > -   mem->page_mask  = BIT(umem->page_shift) - 1;
> > +   mem->page_shift = PAGE_SHIFT;
> > +   mem->page_mask  = PAGE_SIZE - 1;
> > 
> > num_buf = 0;
> > map = mem->map;
> > if (length > 0) {
> > buf = map[0]->buf;
> > 
> > -   for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
> > -   vaddr = page_address(sg_page(sg));
> > +   for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
> > +   vaddr = page_address(sg_page_iter_page(&sg_iter.base));   ?
> > 
> 
> rxe doesn't use DMA addresses, so just leave as for_each_sg_page
> 

OK.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Daniel Stone
On Wed, 16 Jan 2019 at 16:06, h...@lst.de  wrote:
> On Wed, Jan 16, 2019 at 07:28:13AM +, Koenig, Christian wrote:
> > To summarize once more: We have an array of struct pages and want to
> > coherently map that to a device.
>
> And the answer to that is very simple: you can't.  What is so hard
> to understand about that?  If you want to map arbitrary memory it simply
> can't be done in a coherent way on about half of our platforms.
>
> > If that is not possible for whatever reason, we want to get an
> > error code or even not load the driver from the beginning.
>
> That is a bullshit attitude.  Just like everyone else makes their
> drivers work you should not be lazy.

Can you not talk to people like that? Even if you think that is an OK
way to treat anyone - which it isn't, certainly not on dri-devel@ with
the fd.o Code of Conduct, and not according to the kernel's either - I
have absolutely no idea how you can look at the work the AMD people
have put in over many years and conclude that they're 'lazy'.

If this makes you so angry, step back from the keyboard for a few
minutes, and if you still can't participate in reasonable discussion
like an adult, maybe step out of the thread entirely.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread h...@lst.de
On Tue, Jan 15, 2019 at 02:25:01PM -0700, Jason Gunthorpe wrote:
> RDMA needs something similar as well, in this case drivers take a
> struct page * from get_user_pages() and need to have the DMA map fail
> if the platform can't DMA map in a way that does not require any
> additional DMA API calls to ensure coherence. (think Userspace RDMA
> MR's)

Any time you dma map pages you need to do further DMA API calls to
ensure coherence; that is the way it is implemented.  These calls
just happen to be no-ops sometimes.

> Today we just do the normal DMA map and when it randomly doesn't work
> and corrupts data tell those people their platforms don't support RDMA
> - it would be nice to have a safer API base solution..

Now that all these drivers are consolidated in rdma-core you can fix
the code to actually do the right thing.  It isn't that userspace DMA
coherence is any harder than in-kernel DMA coherence.  It just is
that no one bothered to do it properly.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread h...@lst.de
On Wed, Jan 16, 2019 at 07:28:13AM +, Koenig, Christian wrote:
> To summarize once more: We have an array of struct pages and want to 
> coherently map that to a device.

And the answer to that is very simple: you can't.  What is so hard
to understand about that?  If you want to map arbitrary memory it simply
can't be done in a coherent way on about half of our platforms.

> If that is not possible for whatever reason, we want to get an
> error code or even not load the driver from the beginning.

That is a bullshit attitude.  Just like everyone else makes their
drivers work you should not be lazy.

> > bool dma_streaming_is_coherent(const struct device *)
> >
> > API to help us decide when to load or not.
> 
> Yes, please.

Hell no.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Christian König

On 15.01.19 22:25, Jason Gunthorpe wrote:

On Tue, Jan 15, 2019 at 02:17:26PM +, Thomas Hellstrom wrote:

Hi, Christoph,

On Mon, 2019-01-14 at 10:48 +0100, Christoph Hellwig wrote:

On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote:

Changes since the RFC:
- Rework vmwgfx too [CH]
- Use a distinct type for the DMA page iterator [CH]
- Do not have a #ifdef [CH]

ChristophH: Will you ack?

This looks generally fine.


Are you still OK with the vmwgfx reworking, or should we go back to
the original version that didn't have the type safety so this
driver
can be left broken?

I think the map method in vmgfx that just does virt_to_phys is
pretty broken.  Thomas, can you check if you see any performance
difference with just doing the proper dma mapping, as that gets the
driver out of interface abuse land?

The performance difference is not really the main problem here. The
problem is that even though we utilize the streaming DMA interface, we
use it only since we have to for DMA-Remapping and assume that the
memory is coherent. To be able to be as compliant as possible and ditch
the virt-to-phys mode, we *need* a DMA interface flag that tells us
when the dma_sync_for_xxx are no-ops. If they aren't we'll refuse to
load for now. I'm not sure, but I think also nouveau and radeon suffer
from the same issue.

RDMA needs something similar as well, in this case drivers take a
struct page * from get_user_pages() and need to have the DMA map fail
if the platform can't DMA map in a way that does not require any
additional DMA API calls to ensure coherence. (think Userspace RDMA
MR's)

Today we just do the normal DMA map and when it randomly doesn't work
and corrupts data tell those people their platforms don't support RDMA
- it would be nice to have a safer API base solution..


Oh, yes really good point. We have to support get_user_pages (or HMM) in 
a similar manner on GPUs as well.


Regards,
Christian.



Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Daniel Vetter
On Wed, Jan 16, 2019 at 07:28:13AM +, Koenig, Christian wrote:
> On 16.01.19 08:09, Thomas Hellstrom wrote:
> > On Tue, 2019-01-15 at 21:58 +0100, h...@lst.de wrote:
> >> On Tue, Jan 15, 2019 at 07:13:11PM +, Koenig, Christian wrote:
> >>> Thomas is correct that the interface you propose here doesn't work
> >>> at
> >>> all for GPUs.
> >>>
> >>> The kernel driver is not informed of flush/sync, but rather just
> >>> setups
> >>> coherent mappings between system memory and devices.
> >>>
> >>> In other words you have an array of struct pages and need to map
> >>> that to
> >>> a specific device and so create dma_addresses for the mappings.
> >> If you want a coherent mapping you need to use dma_alloc_coherent
> >> and dma_mmap_coherent and you are done, that is not the problem.
> >> That actually is one of the vmgfx modes, so I don't understand what
> >> problem we are trying to solve if you don't actually want a non-
> >> coherent mapping.
> > For vmwgfx, not making dma_alloc_coherent default has a couple of
> > reasons:
> > 1) Memory is associated with a struct device. It has not been clear
> > that it is exportable to other devices.
> > 2) There seem to be restrictions on the allowable system pages. GPUs
> > generally prefer highmem pages but dma_alloc_coherent returns a virtual
> > address implying GFP_KERNEL? While not used by vmwgfx, TTM typically
> > prefers HIGHMEM pages to facilitate caching mode switching without
> > having to touch the kernel map.
> > 3) Historically we had APIs to allow coherent access to user-space
> > defined pages. That has gone away now, but the infrastructure was built
> > around it.
> >
> > dma_mmap_coherent isn't used because as the data moves between system
> > memory, swap and VRAM, PTEs of user-space mappings are adjusted
> > accordingly, meaning user-space doesn't have to unmap when an operation
> > is initiated that might mean the data is moved.
> 
> To summarize once more: We have an array of struct pages and want to 
> coherently map that to a device.
> 
> If that is not possible for whatever reason, we want to get an
> error code or even not load the driver from the beginning.

I guess to make this work we'd also need information about how we're
allowed to mmap this on the cpu side, both from the kernel (kmap or vmap)
and for userspace. At least for i915 we use all kinds of combinations,
e.g. cpu mmap ptes as cached w/ coherent device transactions, or
cached+clflush on the cpu side, and non-coherent device transactions (the
no-snoop thing), or wc mode in the cpu ptes and non-coherent device
transactions.

Plus some debug mode so we catch abuse, because the reality is that most of
the gpu driver work happens on x86, where all of this just works. Even if
you do some really serious layering violations (which is why this isn't
that high a priority for gpu folks).
-Daniel

> 
> >
> >
> >> Although last time I had that discussion with Daniel Vetter
> >> I was under the impression that GPUs really wanted non-coherent
> >> mappings.
> > Intel historically has done things a bit differently. And it's also
> > possible that embedded platforms and ARM prefer this mode of operation,
> > but I haven't caught up on that discussion.
> >
> >> But if you want a coherent mapping you can't go to a struct page,
> >> because on many systems you can't just map arbitrary memory as
> >> uncachable.  It might either come from very special limited pools,
> >> or might need other magic applied to it so that it is not visible
> >> in the normal direct mapping, or at least not accessed through it.
> >
> > The TTM subsystem has been relied on to provide coherent memory with
> > the option to switch caching mode of pages. But only on selected and
> > well tested platforms. On other platforms we simply do not load, and
> > that's fine for now.
> >
> > But as mentioned multiple times, to make GPU drivers more compliant,
> > we'd really want that
> >
> > bool dma_streaming_is_coherent(const struct device *)
> >
> > API to help us decide when to load or not.
> 
> Yes, please.
> 
> Christian.
> 
> >
> > Thanks,
> > Thomas
> >
> >
> >
> >
> >
> >
> >
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-16 Thread Jason Gunthorpe
On Tue, Jan 15, 2019 at 02:17:26PM +, Thomas Hellstrom wrote:
> Hi, Christoph,
> 
> On Mon, 2019-01-14 at 10:48 +0100, Christoph Hellwig wrote:
> > On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote:
> > > > Changes since the RFC:
> > > > - Rework vmwgfx too [CH]
> > > > - Use a distinct type for the DMA page iterator [CH]
> > > > - Do not have a #ifdef [CH]
> > > 
> > > ChristophH: Will you ack?
> > 
> > This looks generally fine.
> > 
> > > Are you still OK with the vmwgfx reworking, or should we go back to
> > > the original version that didn't have the type safety so this
> > > driver
> > > can be left broken?
> > 
> > I think the map method in vmgfx that just does virt_to_phys is
> > pretty broken.  Thomas, can you check if you see any performance
> > difference with just doing the proper dma mapping, as that gets the
> > driver out of interface abuse land?
> 
> The performance difference is not really the main problem here. The
> problem is that even though we utilize the streaming DMA interface, we
> use it only since we have to for DMA-Remapping and assume that the
> memory is coherent. To be able to be as compliant as possible and ditch
> the virt-to-phys mode, we *need* a DMA interface flag that tells us
> when the dma_sync_for_xxx are no-ops. If they aren't we'll refuse to
> load for now. I'm not sure, but I think also nouveau and radeon suffer
> from the same issue.

RDMA needs something similar as well, in this case drivers take a
struct page * from get_user_pages() and need to have the DMA map fail
if the platform can't DMA map in a way that does not require any
additional DMA API calls to ensure coherence. (think Userspace RDMA
MR's)

Today we just do the normal DMA map and when it randomly doesn't work
and corrupts data tell those people their platforms don't support RDMA
- it would be nice to have a safer API base solution..

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Koenig, Christian
On 16.01.19 08:09, Thomas Hellstrom wrote:
> On Tue, 2019-01-15 at 21:58 +0100, h...@lst.de wrote:
>> On Tue, Jan 15, 2019 at 07:13:11PM +, Koenig, Christian wrote:
>>> Thomas is correct that the interface you propose here doesn't work
>>> at
>>> all for GPUs.
>>>
>>> The kernel driver is not informed of flush/sync, but rather just
>>> setups
>>> coherent mappings between system memory and devices.
>>>
>>> In other words you have an array of struct pages and need to map
>>> that to
>>> a specific device and so create dma_addresses for the mappings.
>> If you want a coherent mapping you need to use dma_alloc_coherent
>> and dma_mmap_coherent and you are done, that is not the problem.
>> That actually is one of the vmgfx modes, so I don't understand what
>> problem we are trying to solve if you don't actually want a non-
>> coherent mapping.
> For vmwgfx, not making dma_alloc_coherent default has a couple of
> reasons:
> 1) Memory is associated with a struct device. It has not been clear
> that it is exportable to other devices.
> 2) There seem to be restrictions on the allowable system pages. GPUs
> generally prefer highmem pages but dma_alloc_coherent returns a virtual
> address implying GFP_KERNEL? While not used by vmwgfx, TTM typically
> prefers HIGHMEM pages to facilitate caching mode switching without
> having to touch the kernel map.
> 3) Historically we had APIs to allow coherent access to user-space
> defined pages. That has gone away now, but the infrastructure was built
> around it.
>
> dma_mmap_coherent isn't used because as the data moves between system
> memory, swap and VRAM, PTEs of user-space mappings are adjusted
> accordingly, meaning user-space doesn't have to unmap when an operation
> is initiated that might mean the data is moved.

To summarize once more: We have an array of struct pages and want to 
coherently map that to a device.

If that is not possible for whatever reason, we want to get an
error code or even not load the driver from the beginning.

>
>
>> Although last time I had that discussion with Daniel Vetter
>> I was under the impression that GPUs really wanted non-coherent
>> mappings.
> Intel historically has done things a bit differently. And it's also
> possible that embedded platforms and ARM prefer this mode of operation,
> but I haven't caught up on that discussion.
>
>> But if you want a coherent mapping you can't go to a struct page,
>> because on many systems you can't just map arbitrary memory as
>> uncachable.  It might either come from very special limited pools,
>> or might need other magic applied to it so that it is not visible
>> in the normal direct mapping, or at least not accessed through it.
>
> The TTM subsystem has been relied on to provide coherent memory with
> the option to switch caching mode of pages. But only on selected and
> well tested platforms. On other platforms we simply do not load, and
> that's fine for now.
>
> But as mentioned multiple times, to make GPU drivers more compliant,
> we'd really want that
>
> bool dma_streaming_is_coherent(const struct device *)
>
> API to help us decide when to load or not.

Yes, please.

Christian.

>
> Thanks,
> Thomas
>
>
>
>
>
>
>



Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Thomas Hellstrom
On Tue, 2019-01-15 at 21:58 +0100, h...@lst.de wrote:
> On Tue, Jan 15, 2019 at 07:13:11PM +, Koenig, Christian wrote:
> > Thomas is correct that the interface you propose here doesn't work
> > at 
> > all for GPUs.
> > 
> > The kernel driver is not informed of flush/sync, but rather just
> > setups 
> > coherent mappings between system memory and devices.
> > 
> > In other words you have an array of struct pages and need to map
> > that to 
> > a specific device and so create dma_addresses for the mappings.
> 
> If you want a coherent mapping you need to use dma_alloc_coherent
> and dma_mmap_coherent and you are done, that is not the problem.
> That actually is one of the vmgfx modes, so I don't understand what
> problem we are trying to solve if you don't actually want a non-
> coherent mapping. 

For vmwgfx, not making dma_alloc_coherent default has a couple of
reasons:
1) Memory is associated with a struct device. It has not been clear
that it is exportable to other devices.
2) There seem to be restrictions on the allowable system pages. GPUs
generally prefer highmem pages but dma_alloc_coherent returns a virtual
address implying GFP_KERNEL? While not used by vmwgfx, TTM typically
prefers HIGHMEM pages to facilitate caching mode switching without
having to touch the kernel map. 
3) Historically we had APIs to allow coherent access to user-space
defined pages. That has gone away now, but the infrastructure was built
around it.

dma_mmap_coherent isn't used because as the data moves between system
memory, swap and VRAM, PTEs of user-space mappings are adjusted
accordingly, meaning user-space doesn't have to unmap when an operation
is initiated that might mean the data is moved.


> Although last time I had that discussion with Daniel Vetter
> I was under the impressions that GPUs really wanted non-coherent
> mappings.

Intel historically has done things a bit differently. And it's also
possible that embedded platforms and ARM prefer this mode of operation,
but I haven't caught up on that discussion.

> 
> But if you want a coherent mapping you can't go to a struct page,
> because on many systems you can't just map arbitrary memory as
> uncachable.  It might either come from very special limited pools,
> or might need other magic applied to it so that it is not visible
> in the normal direct mapping, or at least not access through it.


The TTM subsystem has been relied on to provide coherent memory with
the option to switch caching mode of pages. But only on selected and
well tested platforms. On other platforms we simply do not load, and
that's fine for now.

But as mentioned multiple times, to make GPU drivers more compliant,
we'd really want that

bool dma_streaming_is_coherent(const struct device *)

API to help us decide when to load or not.
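A sketch of how such a flag would be consumed, with the caveat that
dma_streaming_is_coherent() is only the proposed name and does not exist
in-tree:

	/* hypothetical: refuse to probe on platforms where the streaming
	 * dma_sync_* calls are not no-ops */
	static int vmw_check_coherency(struct device *dev)
	{
		if (!dma_streaming_is_coherent(dev))	/* proposed API */
			return -ENODEV;			/* do not load */
		return 0;
	}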

Thanks,
Thomas









Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread h...@lst.de
On Tue, Jan 15, 2019 at 07:13:11PM +, Koenig, Christian wrote:
> Thomas is correct that the interface you propose here doesn't work at 
> all for GPUs.
> 
> The kernel driver is not informed of flush/sync, but rather just setups 
> coherent mappings between system memory and devices.
> 
> In other words you have an array of struct pages and need to map that to 
> a specific device and so create dma_addresses for the mappings.

If you want a coherent mapping you need to use dma_alloc_coherent
and dma_mmap_coherent and you are done, that is not the problem.
That actually is one of the vmgfx modes, so I don't understand what
problem we are trying to solve if you don't actually want a non-coherent
mapping.  Although last time I had that discussion with Daniel Vetter
I was under the impression that GPUs really wanted non-coherent
mappings.

But if you want a coherent mapping you can't go to a struct page,
because on many systems you can't just map arbitrary memory as
uncachable.  It might either come from very special limited pools,
or might need other magic applied to it so that it is not visible
in the normal direct mapping, or at least not accessed through it.
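A sketch of the dma_alloc_coherent()/dma_mmap_coherent() path mentioned
above, with invented names; those two calls are the real entry points:

	#include <linux/dma-mapping.h>
	#include <linux/mm.h>

	/* allocate from the coherent pool and map that same allocation to
	 * userspace; no sync calls are ever needed on this memory */
	static int drv_mmap_coherent(struct device *dev, struct vm_area_struct *vma)
	{
		size_t size = vma->vm_end - vma->vm_start;
		dma_addr_t dma_handle;
		void *cpu_addr;

		cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
		if (!cpu_addr)
			return -ENOMEM;

		return dma_mmap_coherent(dev, vma, cpu_addr, dma_handle, size);
	}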


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Koenig, Christian
Am 15.01.19 um 19:31 schrieb h...@lst.de:
> On Tue, Jan 15, 2019 at 06:03:39PM +, Thomas Hellstrom wrote:
>> In the graphics case, it's probably because it doesn't fit the graphics
>> use-cases:
>>
>> 1) Memory typically needs to be mappable by another device. (the "dma-
>> buf" interface)
> And there is nothing preventing dma-buf sharing of these buffers.
> Unlike the get_sgtable mess it can actually work reliably on
> architectures that have virtually tagged caches and/or don't
> guarantee cache coherency with mixed attribute mappings.
>
>> 2) DMA buffers are exported to user-space and are sub-allocated by it.
>> Mostly there are no GPU user-space kernel interfaces to sync / flush
>> subregions and these syncs may happen on a smaller-than-cache-line
>> granularity.
> I know of no architectures that can do cache maintenance on a less
> than cache line basis.  Either the instructions require you to
> specify cache lines, or they do sometimes more, sometimes less
> intelligent rounding up.
>
> Note that as long as DMA non-coherent buffers are device owned it
> is up to the device and the user space driver to take care of
> coherency; the kernel is very much out of the picture.

Thomas is correct that the interface you propose here doesn't work at 
all for GPUs.

The kernel driver is not informed of flush/sync, but rather just setups 
coherent mappings between system memory and devices.

In other words you have an array of struct pages and need to map that to 
a specific device and so create dma_addresses for the mappings.

Regards,
Christian.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread h...@lst.de
On Tue, Jan 15, 2019 at 06:03:39PM +, Thomas Hellstrom wrote:
> In the graphics case, it's probably because it doesn't fit the graphics
> use-cases:
> 
> 1) Memory typically needs to be mappable by another device. (the "dma-
> buf" interface)

And there is nothing preventing dma-buf sharing of these buffers.
Unlike the get_sgtable mess it can actually work reliably on
architectures that have virtually tagged caches and/or don't
guarantee cache coherency with mixed attribute mappings.

> 2) DMA buffers are exported to user-space and are sub-allocated by it.
> Mostly there are no GPU user-space kernel interfaces to sync / flush
> subregions and these syncs may happen on a smaller-than-cache-line
> granularity.

I know of no architectures that can do cache maintenance on a less
than cache line basis.  Either the instructions require you to
specify cache lines, or they do sometimes more, sometimes less
intelligent rounding up.

Note that as long as DMA non-coherent buffers are device owned
it is up to the device and the user space driver to take care of
coherency; the kernel is very much out of the picture.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Thomas Hellstrom
On Tue, 2019-01-15 at 16:20 +0100, h...@lst.de wrote:
> On Tue, Jan 15, 2019 at 03:24:55PM +0100, Christian König wrote:
> > Yeah, indeed. Bounce buffers are an absolute no-go for GPUs.
> > 
> > If the DMA API finds that a piece of memory is not directly accessible by
> > the GPU we need to return an error and not try to use bounce buffers
> > behind the scenes.
> > 
> > That is something which always annoyed me with the DMA API, which is
> > otherwise rather cleanly defined.
> 
> That is exactly what I want to fix with my series to make
> DMA_ATTR_NON_CONSISTENT more useful and always available:
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031985.html
> 
> With that you allocate the memory using dma_alloc_attrs with
> DMA_ATTR_NON_CONSISTENT, and use dma_sync_single_* to transfer
> ownership to the cpu and back to the device, with a guarantee that
> there won't be any bouncing.  So far the interest by the parties that
> requested the feature has been rather lacklustre, though.

In the graphics case, it's probably because it doesn't fit the graphics
use-cases:

1) Memory typically needs to be mappable by another device. (the "dma-
buf" interface)
2) DMA buffers are exported to user-space and are sub-allocated by it.
Mostly there are no GPU user-space kernel interfaces to sync / flush
subregions and these syncs may happen on a smaller-than-cache-line
granularity.

So to help the graphics driver, that coherent flag would be much more
useful.

/Thomas



Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread h...@lst.de
On Tue, Jan 15, 2019 at 03:24:55PM +0100, Christian König wrote:
> Yeah, indeed. Bounce buffers are an absolute no-go for GPUs.
>
> If the DMA API finds that a piece of memory is not directly accessible by 
> the GPU we need to return an error and not try to use bounce buffers behind
> the scenes.
>
> That is something which always annoyed me with the DMA API, which is 
> otherwise rather cleanly defined.

That is exactly what I want to fix with my series to make
DMA_ATTR_NON_CONSISTENT more useful and always available:

https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031985.html

With that you allocate the memory using dma_alloc_attrs with
DMA_ATTR_NON_CONSISTENT, and use dma_sync_single_* to transfer
ownership to the cpu and back to the device, with a guarantee that
there won't be any bouncing.  So far the interest by the parties that
requested the feature has been rather lacklustre, though.
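A sketch of the flow described above, with error handling mostly elided and
names invented:

	#include <linux/dma-mapping.h>

	static void noncoherent_example(struct device *dev, size_t size)
	{
		dma_addr_t handle;
		void *buf = dma_alloc_attrs(dev, size, &handle, GFP_KERNEL,
					    DMA_ATTR_NON_CONSISTENT);
		if (!buf)
			return;

		/* explicit ownership transfers, guaranteed bounce-free */
		dma_sync_single_for_cpu(dev, handle, size, DMA_BIDIRECTIONAL);
		/* ... CPU reads/writes buf ... */
		dma_sync_single_for_device(dev, handle, size, DMA_BIDIRECTIONAL);

		dma_free_attrs(dev, size, buf, handle, DMA_ATTR_NON_CONSISTENT);
	}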


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Christian König

On 15.01.19 15:17, Thomas Hellstrom wrote:

Hi, Christoph,

On Mon, 2019-01-14 at 10:48 +0100, Christoph Hellwig wrote:

On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote:

Changes since the RFC:
- Rework vmwgfx too [CH]
- Use a distinct type for the DMA page iterator [CH]
- Do not have a #ifdef [CH]

ChristophH: Will you ack?

This looks generally fine.


Are you still OK with the vmwgfx reworking, or should we go back to
the original version that didn't have the type safety so this
driver
can be left broken?

I think the map method in vmgfx that just does virt_to_phys is
pretty broken.  Thomas, can you check if you see any performance
difference with just doing the proper dma mapping, as that gets the
driver out of interface abuse land?

The performance difference is not really the main problem here. The
problem is that even though we utilize the streaming DMA interface, we
use it only since we have to for DMA-Remapping and assume that the
memory is coherent. To be able to be as compliant as possible and ditch
the virt-to-phys mode, we *need* a DMA interface flag that tells us
when the dma_sync_for_xxx are no-ops. If they aren't we'll refuse to
load for now. I'm not sure, but I think also nouveau and radeon suffer
from the same issue.


Yeah, indeed. Bounce buffers are an absolute no-go for GPUs.

If the DMA API finds that a piece of memory is not directly accessible 
by the GPU we need to return an error and not try to use bounce buffers 
behind the scenes.


That is something which always annoyed me with the DMA API, which is 
otherwise rather cleanly defined.


Christian.




While we're at it I think we need to merge my series in this area
for 5.0, because without that the driver is already broken.  Where
should we merge it?

I can merge it through vmwgfx/drm-fixes. There is an outstanding issue
with patch 3. Do you want me to fix that up?

Thanks,
Thomas




Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-15 Thread Thomas Hellstrom
Hi, Christoph,

On Mon, 2019-01-14 at 10:48 +0100, Christoph Hellwig wrote:
> On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote:
> > > Changes since the RFC:
> > > - Rework vmwgfx too [CH]
> > > - Use a distinct type for the DMA page iterator [CH]
> > > - Do not have a #ifdef [CH]
> > 
> > ChristophH: Will you ack?
> 
> This looks generally fine.
> 
> > Are you still OK with the vmwgfx reworking, or should we go back to
> > the original version that didn't have the type safety so this
> > driver
> > can be left broken?
> 
> I think the map method in vmgfx that just does virt_to_phys is
> pretty broken.  Thomas, can you check if you see any performance
> difference with just doing the proper dma mapping, as that gets the
> driver out of interface abuse land?

The performance difference is not really the main problem here. The
problem is that even though we utilize the streaming DMA interface, we
use it only since we have to for DMA-Remapping and assume that the
memory is coherent. To be able to be as compliant as possible and ditch
the virt-to-phys mode, we *need* a DMA interface flag that tells us
when the dma_sync_for_xxx are no-ops. If they aren't we'll refuse to
load for now. I'm not sure, but I think also nouveau and radeon suffer
from the same issue.

> 
> While we're at it I think we need to merge my series in this area
> for 5.0, because without that the driver is already broken.  Where
> should we merge it?

I can merge it through vmwgfx/drm-fixes. There is an outstanding issue
with patch 3. Do you want me to fix that up?

Thanks,
Thomas




Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-14 Thread Jason Gunthorpe
On Sat, Jan 12, 2019 at 01:03:05PM -0600, Shiraz Saleem wrote:
> On Sat, Jan 12, 2019 at 06:37:58PM +, Jason Gunthorpe wrote:
> > On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote:
> > > On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> > > > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists 
> > > > w/o
> > > > backing pages") introduced the sg_page_iter_dma_address() function 
> > > > without
> > > > providing a way to use it in the general case. If the sg_dma_len is not
> > > > equal to the dma_length callers cannot safely use the
> > > > for_each_sg_page/sg_page_iter_dma_address combination.
> > > > 
> > > > Resolve this API mistake by providing a DMA specific iterator,
> > > > for_each_sg_dma_page(), that uses the right length so
> > > > sg_page_iter_dma_address() works as expected with all sglists. A new
> > > > iterator type is introduced to provide compile-time safety against 
> > > > wrongly
> > > > mixing accessors and iterators.
> > > [..]
> > > 
> > > >  
> > > > +/*
> > > > + * sg page iterator for DMA addresses
> > > > + *
> > > > + * This is the same as sg_page_iter however you can call
> > > > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA
> > > > + * address. sg_page_iter_page() cannot be called on this iterator.
> > > > + */
> > > Does it make sense to have a variant of sg_page_iter_page() to get the
> > > page descriptor with this dma_iter? This can be used when walking 
> > > DMA-mapped
> > > SG lists with for_each_sg_dma_page.
> > 
> > I think that would be a complicated calculation to find the right
> > offset into the page sg list to get the page pointer back. We can't
> > just naively use the existing iterator location.
> > 
> > Probably places that need this are better to run with two iterators,
> > less computationally expensive.
> > 
> > Did you find a need for this? 
> > 
> 
> Well I was trying to convert the RDMA drivers to use your new iterator variant
> and saw the need for it in locations where we need the virtual addresses of
> the pages contained in the SGEs.
> 
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c 
> b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> index 59eeac5..7d26903 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> @@ -85,7 +85,7 @@ static void __free_pbl(struct pci_dev *pdev, struct 
> bnxt_qplib_pbl *pbl,
>  static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
>struct scatterlist *sghead, u32 pages, u32 pg_size)
>  {
> -   struct scatterlist *sg;
> +   struct sg_dma_page_iter sg_iter;
> bool is_umem = false;
> int i;
> 
> @@ -116,12 +116,13 @@ static int __alloc_pbl(struct pci_dev *pdev, struct 
> bnxt_qplib_pbl *pbl,
> } else {
> i = 0;
> is_umem = true;
> -   for_each_sg(sghead, sg, pages, i) {
> -   pbl->pg_map_arr[i] = sg_dma_address(sg);
> -   pbl->pg_arr[i] = sg_virt(sg);
> +   for_each_sg_dma_page(sghead, &sg_iter, pages, 0) {
> +   pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter);
> +   pbl->pg_arr[i] = page_address(sg_page_iter_page(&sg_iter.base)); ???
>   ^

I concur with CH, pg_arr only looks used in the !umem case, so set to
NULL here. Check with Selvin & Devesh ?
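With that suggestion applied, the umem branch would reduce to something
like this (untested sketch):

	i = 0;
	is_umem = true;
	for_each_sg_dma_page(sghead, &sg_iter, pages, 0) {
		pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter);
		pbl->pg_arr[i] = NULL;	/* only consumed in the !umem case */
		pbl->pg_count++;
		i++;
	}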

> @@ -191,16 +190,16 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
> goto err1;
> }
> 
> -   mem->page_shift = umem->page_shift;
> -   mem->page_mask  = BIT(umem->page_shift) - 1;
> +   mem->page_shift = PAGE_SHIFT;
> +   mem->page_mask  = PAGE_SIZE - 1;
> 
> num_buf = 0;
> map = mem->map;
> if (length > 0) {
> buf = map[0]->buf;
> 
> -   for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
> -   vaddr = page_address(sg_page(sg));
> +   for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
> +   vaddr = page_address(sg_page_iter_page(&sg_iter.base));   ?
>   

rxe doesn't use DMA addresses, so just leave as for_each_sg_page
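i.e. keep the CPU-page iterator there, something like this sketch:

	struct sg_page_iter sg_iter;

	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
		vaddr = page_address(sg_page_iter_page(&sg_iter));
		/* ... */
	}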

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-14 Thread Christoph Hellwig
On Thu, Jan 10, 2019 at 04:42:18PM -0700, Jason Gunthorpe wrote:
> > Changes since the RFC:
> > - Rework vmwgfx too [CH]
> > - Use a distinct type for the DMA page iterator [CH]
> > - Do not have a #ifdef [CH]
> 
> ChristophH: Will you ack?

This looks generally fine.

> 
> Are you still OK with the vmwgfx reworking, or should we go back to
> the original version that didn't have the type safety so this driver
> can be left broken?

I think the map method in vmgfx that just does virt_to_phys is
pretty broken.  Thomas, can you check if you see any performance
difference with just doing the proper dma mapping, as that gets the
driver out of interface abuse land?
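For contrast, the shortcut being criticized versus a proper streaming
mapping, sketched with invented variable names:

	/* the abuse: pretend a physical address is a DMA address */
	dma_addr_t addr = (dma_addr_t)virt_to_phys(page_address(page));

	/* the DMA API path, which may legitimately fail */
	dma_addr_t dma = dma_map_page(dev, page, 0, PAGE_SIZE,
				      DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, dma))
		return -ENOMEM;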

While we're at it I think we need to merge my series in this area
for 5.0, because without that the driver is already broken.  Where
should we merge it?


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-14 Thread Christoph Hellwig
On Sat, Jan 12, 2019 at 01:03:05PM -0600, Shiraz Saleem wrote:
> Well I was trying to convert the RDMA drivers to use your new iterator variant
> and saw the need for it in locations where we need the virtual addresses of
> the pages contained in the SGEs.

As far as I can tell that pg_arr[i] value is only ever used for
the case where we do an explicit dma coherent allocation, so you
should not have to fill it out at all.


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-13 Thread Jason Gunthorpe
On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote:
> On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
> > backing pages") introduced the sg_page_iter_dma_address() function without
> > providing a way to use it in the general case. If the sg_dma_len is not
> > equal to the dma_length callers cannot safely use the
> > for_each_sg_page/sg_page_iter_dma_address combination.
> > 
> > Resolve this API mistake by providing a DMA specific iterator,
> > for_each_sg_dma_page(), that uses the right length so
> > sg_page_iter_dma_address() works as expected with all sglists. A new
> > iterator type is introduced to provide compile-time safety against wrongly
> > mixing accessors and iterators.
> [..]
> 
> >  
> > +/*
> > + * sg page iterator for DMA addresses
> > + *
> > + * This is the same as sg_page_iter however you can call
> > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA
> > + * address. sg_page_iter_page() cannot be called on this iterator.
> > + */
> Does it make sense to have a variant of sg_page_iter_page() to get the
> page descriptor with this dma_iter? This can be used when walking DMA-mapped
> SG lists with for_each_sg_dma_page.

I think that would be a complicated calculation to find the right
offset into the page sg list to get the page pointer back. We can't
just naively use the existing iterator location.

Probably places that need this are better to run with two iterators,
less computationally expensive.
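A sketch of the two-iterator idea, assuming the page-aligned case where
both walks yield the same number of pages (names invented):

	struct sg_page_iter cpu_iter;
	struct sg_dma_page_iter dma_iter;

	__sg_page_iter_start(&cpu_iter, sgt->sgl, sgt->orig_nents, 0);
	for_each_sg_dma_page(sgt->sgl, &dma_iter, sgt->nents, 0) {
		if (!__sg_page_iter_next(&cpu_iter))
			break;
		/* sg_page_iter_page(&cpu_iter) and
		 * sg_page_iter_dma_address(&dma_iter) now describe
		 * the same page
		 */
	}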

Did you find a need for this? 

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-12 Thread Shiraz Saleem
On Sat, Jan 12, 2019 at 06:37:58PM +, Jason Gunthorpe wrote:
> On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote:
> > On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> > > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
> > > backing pages") introduced the sg_page_iter_dma_address() function without
> > > providing a way to use it in the general case. If the sg_dma_len is not
> > > equal to the dma_length callers cannot safely use the
> > > for_each_sg_page/sg_page_iter_dma_address combination.
> > > 
> > > Resolve this API mistake by providing a DMA specific iterator,
> > > for_each_sg_dma_page(), that uses the right length so
> > > sg_page_iter_dma_address() works as expected with all sglists. A new
> > > iterator type is introduced to provide compile-time safety against wrongly
> > > mixing accessors and iterators.
> > [..]
> > 
> > >  
> > > +/*
> > > + * sg page iterator for DMA addresses
> > > + *
> > > + * This is the same as sg_page_iter however you can call
> > > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA
> > > + * address. sg_page_iter_page() cannot be called on this iterator.
> > > + */
> > Does it make sense to have a variant of sg_page_iter_page() to get the
> > page descriptor with this dma_iter? This can be used when walking DMA-mapped
> > SG lists with for_each_sg_dma_page.
> 
> I think that would be a complicated calculation to find the right
> offset into the page sg list to get the page pointer back. We can't
> just naively use the existing iterator location.
> 
> Probably places that need this are better to run with two iterators,
> less computationally expensive.
> 
> Did you find a need for this? 
> 

Well I was trying to convert the RDMA drivers to use your new iterator variant
and saw the need for it in locations where we need the virtual addresses of the
pages contained in the SGEs.

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c 
b/drivers/infiniband/hw/bnxt_re/qplib_res.c
index 59eeac5..7d26903 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
@@ -85,7 +85,7 @@ static void __free_pbl(struct pci_dev *pdev, struct 
bnxt_qplib_pbl *pbl,
 static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
   struct scatterlist *sghead, u32 pages, u32 pg_size)
 {
-   struct scatterlist *sg;
+   struct sg_dma_page_iter sg_iter;
bool is_umem = false;
int i;

@@ -116,12 +116,13 @@ static int __alloc_pbl(struct pci_dev *pdev, struct 
bnxt_qplib_pbl *pbl,
} else {
i = 0;
is_umem = true;
-   for_each_sg(sghead, sg, pages, i) {
-   pbl->pg_map_arr[i] = sg_dma_address(sg);
-   pbl->pg_arr[i] = sg_virt(sg);
+   for_each_sg_dma_page(sghead, &sg_iter, pages, 0) {
+   pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter);
+   pbl->pg_arr[i] = page_address(sg_page_iter_page(&sg_iter.base)); ???
^

if (!pbl->pg_arr[i])
goto fail;

+   i++;
pbl->pg_count++;
}
}
--

@@ -191,16 +190,16 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
goto err1;
}

-   mem->page_shift = umem->page_shift;
-   mem->page_mask  = BIT(umem->page_shift) - 1;
+   mem->page_shift = PAGE_SHIFT;
+   mem->page_mask  = PAGE_SIZE - 1;

num_buf = 0;
map = mem->map;
if (length > 0) {
buf = map[0]->buf;

-   for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
-   vaddr = page_address(sg_page(sg));
+   for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
+   vaddr = page_address(sg_page_iter_page(&sg_iter.base));   ?



if (!vaddr) {
pr_warn("null vaddr\n");
err = -ENOMEM;
@@ -208,7 +207,7 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
}

buf->addr = (uintptr_t)vaddr;
-   buf->size = BIT(umem->page_shift);
+   buf->size = PAGE_SIZE;

1.8.3.1


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-12 Thread Shiraz Saleem
On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote:
> Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
> backing pages") introduced the sg_page_iter_dma_address() function without
> providing a way to use it in the general case. If the sg_dma_len is not
> equal to the dma_length callers cannot safely use the
> for_each_sg_page/sg_page_iter_dma_address combination.
> 
> Resolve this API mistake by providing a DMA specific iterator,
> for_each_sg_dma_page(), that uses the right length so
> sg_page_iter_dma_address() works as expected with all sglists. A new
> iterator type is introduced to provide compile-time safety against wrongly
> mixing accessors and iterators.
[..]

>  
> +/*
> + * sg page iterator for DMA addresses
> + *
> + * This is the same as sg_page_iter however you can call
> + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA
> + * address. sg_page_iter_page() cannot be called on this iterator.
> + */
Does it make sense to have a variant of sg_page_iter_page() to get the
page descriptor with this dma_iter? This can be used when walking DMA-mapped
SG lists with for_each_sg_dma_page.

> +struct sg_dma_page_iter {
> + struct sg_page_iter base;
> +};
> +
> 


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-11 Thread Jason Gunthorpe
On Fri, Jan 04, 2019 at 03:35:31PM -0700, Jason Gunthorpe wrote:
> Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
> backing pages") introduced the sg_page_iter_dma_address() function without
> providing a way to use it in the general case. If the sg_dma_len is not
> equal to the dma_length callers cannot safely use the
> for_each_sg_page/sg_page_iter_dma_address combination.
> 
> Resolve this API mistake by providing a DMA specific iterator,
> for_each_sg_dma_page(), that uses the right length so
> sg_page_iter_dma_address() works as expected with all sglists. A new
> iterator type is introduced to provide compile-time safety against wrongly
> mixing accessors and iterators.
> 
> Signed-off-by: Jason Gunthorpe 
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h| 26 ++--
>  drivers/gpu/drm/vmwgfx/vmwgfx_mob.c| 26 +++-
>  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 42 +--
>  drivers/media/pci/intel/ipu3/ipu3-cio2.c   |  4 +-
>  include/linux/scatterlist.h| 49 ++
>  lib/scatterlist.c  | 26 
>  6 files changed, 134 insertions(+), 39 deletions(-)
> 
> I'd like to run this patch through the RDMA tree as we have another
> series in the works that wants to use the for_each_sg_dma_page() API.
> 
> The changes to vmwgfx make me nervous, it would be great if someone
> could test and ack them?
> 
> Changes since the RFC:
> - Rework vmwgfx too [CH]
> - Use a distinct type for the DMA page iterator [CH]
> - Do not have a #ifdef [CH]

ChristophH: Will you ack?

Are you still OK with the vmwgfx reworking, or should we go back to
the original version that didn't have the type safety so this driver
can be left broken?

Thanks,
Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-07 Thread Jason Gunthorpe
On Sat, Jan 05, 2019 at 10:37:17AM +0800, kbuild test robot wrote:
> Hi Jason,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.20 next-20190103]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Jason-Gunthorpe/lib-scatterlist-Provide-a-DMA-page-iterator/20190105-081739
> config: x86_64-randconfig-x017-201900 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> All error/warnings (new ones prefixed by >>):
> 
>In file included from lib/scatterlist.c:9:0:
> >> include/linux/export.h:81:20: error: redefinition of '__kstrtab___sg_page_iter_next'
>  static const char __kstrtab_##sym[]\
>^
>include/linux/export.h:120:25: note: in expansion of macro '___EXPORT_SYMBOL'
> #define __EXPORT_SYMBOL ___EXPORT_SYMBOL
> ^~~~
>include/linux/export.h:124:2: note: in expansion of macro '__EXPORT_SYMBOL'
>  __EXPORT_SYMBOL(sym, "")
>  ^~~
> >> lib/scatterlist.c:652:1: note: in expansion of macro 'EXPORT_SYMBOL'
> EXPORT_SYMBOL(__sg_page_iter_next);
> ^

Woops, should be __sg_page_dma_iter_next.. Will resend after getting
some feedback.

Jason


Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

2019-01-04 Thread kbuild test robot
Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20 next-20190103]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Jason-Gunthorpe/lib-scatterlist-Provide-a-DMA-page-iterator/20190105-081739
config: x86_64-randconfig-x017-201900 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from lib/scatterlist.c:9:0:
>> include/linux/export.h:81:20: error: redefinition of '__kstrtab___sg_page_iter_next'
 static const char __kstrtab_##sym[]\
   ^
   include/linux/export.h:120:25: note: in expansion of macro '___EXPORT_SYMBOL'
#define __EXPORT_SYMBOL ___EXPORT_SYMBOL
^~~~
   include/linux/export.h:124:2: note: in expansion of macro '__EXPORT_SYMBOL'
 __EXPORT_SYMBOL(sym, "")
 ^~~
>> lib/scatterlist.c:652:1: note: in expansion of macro 'EXPORT_SYMBOL'
EXPORT_SYMBOL(__sg_page_iter_next);
^
include/linux/export.h:81:20: note: previous definition of '__kstrtab___sg_page_iter_next' was here
 static const char __kstrtab_##sym[]\
   ^
   include/linux/export.h:120:25: note: in expansion of macro '___EXPORT_SYMBOL'
#define __EXPORT_SYMBOL ___EXPORT_SYMBOL
^~~~
   include/linux/export.h:124:2: note: in expansion of macro '__EXPORT_SYMBOL'
 __EXPORT_SYMBOL(sym, "")
 ^~~
   lib/scatterlist.c:626:1: note: in expansion of macro 'EXPORT_SYMBOL'
EXPORT_SYMBOL(__sg_page_iter_next);
^

vim +/__kstrtab___sg_page_iter_next +81 include/linux/export.h

7290d580 Ard Biesheuvel  2018-08-21  76  
f5016932 Paul Gortmaker  2011-05-23  77  /* For every exported symbol, place a struct in the __ksymtab section */
f2355416 Nicolas Pitre   2016-01-22  78  #define ___EXPORT_SYMBOL(sym, sec) \
f5016932 Paul Gortmaker  2011-05-23  79  	extern typeof(sym) sym; \
f5016932 Paul Gortmaker  2011-05-23  80  	__CRC_SYMBOL(sym, sec) \
f5016932 Paul Gortmaker  2011-05-23 @81  	static const char __kstrtab_##sym[] \
7290d580 Ard Biesheuvel  2018-08-21  82  	__attribute__((section("__ksymtab_strings"), used, aligned(1))) \
94e58e0a Masahiro Yamada 2018-05-09  83  	= #sym; \
7290d580 Ard Biesheuvel  2018-08-21  84  	__KSYMTAB_ENTRY(sym, sec)
f5016932 Paul Gortmaker  2011-05-23  85  

:: The code at line 81 was first introduced by commit
:: f50169324df4ad942e544386d136216c8617636a module.h: split out the EXPORT_SYMBOL into export.h

:: TO: Paul Gortmaker 
:: CC: Paul Gortmaker 

---
0-DAY kernel test infrastructure   Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation

