Re: [Intel-gfx] [PATCH 02/22] nvmet: Make use of the new sg_map helper function

2017-04-18 Thread Christoph Hellwig
On Thu, Apr 13, 2017 at 11:06:16PM -0600, Logan Gunthorpe wrote:
> Or maybe I'll just send a patch for that
> separately seeing it doesn't depend on anything and is pretty simple. I
> can do that next week.

Yes, please just send that patch to linux-nvme, we should be able to get
it into 4.12.


Re: [Intel-gfx] [PATCH 03/22] libiscsi: Make use of the new sg_map helper function

2017-04-18 Thread Christoph Hellwig
On Thu, Apr 13, 2017 at 04:05:16PM -0600, Logan Gunthorpe wrote:
> Convert the kmap and kmap_atomic uses to the sg_map function. We now
> store the flags for the kmap instead of a boolean to indicate
> atomicity. We also propagate a possible kmap error down and create
> a new ISCSI_TCP_INTERNAL_ERR error type for this.

Can you split out the new error handling into a separate prep patch
which should go to the iscsi maintainers ASAP?


Re: [Intel-gfx] [PATCH 01/22] scatterlist: Introduce sg_map helper functions

2017-04-18 Thread Christoph Hellwig
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 0007b79..b95934b 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -37,6 +37,9 @@
>  
>  #include 
>  
> +/* Prevent the highmem.h macro from aliasing ops->kunmap_atomic */
> +#undef kunmap_atomic
> +
>  static inline int is_dma_buf_file(struct file *);
>  
>  struct dma_buf_list {

I think the right fix here is to rename the operation to unmap_atomic
and send out a little patch for that ASAP.
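
A rough sketch of the rename being suggested, so the callback no longer
collides with the kunmap_atomic() macro from <linux/highmem.h>.  The member
layout here is illustrative only, not the exact dma_buf_ops definition:

struct dma_buf;

struct dma_buf_ops {
        /* ... */
        void *(*map_atomic)(struct dma_buf *dmabuf, unsigned long pgnum);
        void (*unmap_atomic)(struct dma_buf *dmabuf, unsigned long pgnum,
                             void *vaddr);
        /* ... */
};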

> + *   Flags can be any of:
> + *   * SG_KMAP - Use kmap to create the mapping
> + *   * SG_KMAP_ATOMIC - Use kmap_atomic to map the page atomically.
> + *  Thus, the rules of that function apply: the cpu
> + *  may not sleep until it is unmapped.
> + *
> + *   Also, consider carefully whether this function is appropriate. It is
> + *   largely not recommended for new code and if the sgl came from another
> + *   subsystem and you don't know what kind of memory might be in the list
> + *   then you definitely should not call it. Non-mappable memory may be in
> + *   the sgl and thus this function may fail unexpectedly.
> + **/
> +static inline void *sg_map_offset(struct scatterlist *sg, size_t offset,
> +int flags)

I'd rather have separate functions for kmap vs kmap_atomic instead of
the flags parameter.  And while you're at it just always pass the 0
offset parameter instead of adding a wrapper..

Otherwise this looks good to me.
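
A minimal sketch of the split interface suggested above (hypothetical names
and signatures, not part of the posted series):

#include <linux/highmem.h>
#include <linux/scatterlist.h>

static inline void *sg_kmap(struct scatterlist *sg)
{
        return kmap(sg_page(sg)) + sg->offset;
}

static inline void sg_kunmap(struct scatterlist *sg)
{
        kunmap(sg_page(sg));
}

static inline void *sg_kmap_atomic(struct scatterlist *sg)
{
        return kmap_atomic(sg_page(sg)) + sg->offset;
}

static inline void sg_kunmap_atomic(struct scatterlist *sg, void *addr)
{
        kunmap_atomic(addr - sg->offset);
}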


Re: [Intel-gfx] [PATCH 02/22] nvmet: Make use of the new sg_map helper function

2017-04-18 Thread Christoph Hellwig
On Thu, Apr 13, 2017 at 04:05:15PM -0600, Logan Gunthorpe wrote:
> This is a straightforward conversion in two places. Should kmap fail,
> the code will return an INVALID_DATA error in the completion.

It really should be using nvmet_copy_from_sgl to make things safer,
as we don't want to rely on any particular SG list layout.  In fact
I'm pretty sure I did the conversion at some point, but it must never
have made it upstream.
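
A rough sketch of the pattern being pointed to, assuming the usual
nvmet_copy_from_sgl(req, offset, buf, len) helper that returns an NVMe
status code; the function name below is made up for illustration:

#include <linux/nvme.h>
#include "nvmet.h"      /* for struct nvmet_req and nvmet_copy_from_sgl() */

/* Copy DSM range descriptor i out of the SG list instead of kmapping
 * whatever page happens to be first in the list. */
static u16 nvmet_read_one_range(struct nvmet_req *req, int i,
                                struct nvme_dsm_range *range)
{
        return nvmet_copy_from_sgl(req, i * sizeof(*range),
                                   range, sizeof(*range));
}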


Re: [Intel-gfx] [PATCH 09/22] dm-crypt: Make use of the new sg_map helper in 4 call sites

2017-04-18 Thread Christoph Hellwig
On Thu, Apr 13, 2017 at 04:05:22PM -0600, Logan Gunthorpe wrote:
> Very straightforward conversion to the new function in all four spots.

I think the right fix here is to switch dm-crypt to the ahash API
that takes a scatterlist.


Re: [Intel-gfx] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-26 Thread Christoph Hellwig
On Tue, Apr 25, 2017 at 12:20:48PM -0600, Logan Gunthorpe wrote:
> This patch introduces functions which kmap the pages inside an sgl.
> These functions replace a common pattern of kmap(sg_page(sg)) that is
> used in more than 50 places within the kernel.
> 
> The motivation for this work is to eventually safely support sgls that
> contain io memory. In order for that to work, any access to the contents
> of an iomem SGL will need to be done with iomemcpy or hit some warning.
> (The exact details of how this will work have yet to be worked out.)

I think we'll at least need a draft of those to make sense of these
patches.  Otherwise they just look very clumsy.

> + *   Use this function to map a page in the scatterlist at the specified
> + *   offset. sg->offset is already added for you. Note: the semantics of
> + *   this function are that it may fail. Thus, its output should be checked
> + *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
> + *   in the mapped page is returned.
> + *
> + *   Flags can be any of:
> + *   * SG_KMAP   - Use kmap to create the mapping
> + *   * SG_KMAP_ATOMIC - Use kmap_atomic to map the page atomically.
> + * Thus, the rules of that function apply: the
> + * cpu may not sleep until it is unmapped.
> + *   * SG_MAP_MUST_NOT_FAIL  - Indicate that sg_map must not fail.
> + * If it does, it will issue a BUG_ON instead.
> + * This is intended for legacy code only, it
> + * is not to be used in new code.

I'm sorry but this API is just a trainwreck.  Right now we have the
nice little kmap_atomic API, which never fails and has a very nice
calling convention where we just pass back the return address, but does
not support sleeping inside the critical section.

And kmap, which may fail and requires the original page to be passed
back.  Anything that mixes these two concepts up is simply a non-starter.
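
For reference, the two calling conventions being contrasted look like this
(illustrative only; the copy is done twice purely to show both forms):

#include <linux/highmem.h>
#include <linux/string.h>

static void copy_first_bytes(struct page *page, void *dst, size_t len)
{
        void *p;

        p = kmap(page);                 /* sleepable context */
        memcpy(dst, p, len);
        kunmap(page);                   /* unmap takes the original page */

        p = kmap_atomic(page);          /* atomic: no sleeping until unmap */
        memcpy(dst, p, len);
        kunmap_atomic(p);               /* unmap takes the returned address */
}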


Re: [Intel-gfx] [PATCH v2 02/21] libiscsi: Add an internal error code

2017-04-26 Thread Christoph Hellwig
On Tue, Apr 25, 2017 at 12:20:49PM -0600, Logan Gunthorpe wrote:
> This is a prep patch to add a new error code to libiscsi. We want to
> rework some kmap calls to be able to fail. When we do, we'd like to
> use this error code.

The kmap case in iscsi_tcp_segment_map can already fail.  Please add
handling of that failure to this patch, and we should get it merged
ASAP.


Re: [Intel-gfx] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-27 Thread Christoph Hellwig
On Wed, Apr 26, 2017 at 12:11:33PM -0600, Logan Gunthorpe wrote:
> Ok, well for starters I think you are mistaken about kmap being able to
> fail. I'm having a hard time finding many users of that function that
> bother to check for an error when calling it.

A quick audit of the arch code shows you're right - kmap can't fail
anywhere anymore.

> The main difficulty we
> have now is that neither of those functions are expected to fail and we
> need them to be able to in cases where the page doesn't map to system
> RAM. This patch series is trying to address it for users of scatterlist.
> I'm certainly open to other suggestions.

I think you'll need to follow the existing kmap semantics and never
fail the iomem version either.  Otherwise you'll have a special case
that's almost never used that has a different error path.

> There are a fair number of cases in the kernel that do something like:
> 
> if (something)
> x = kmap(page);
> else
> x = kmap_atomic(page);
> ...
> if (something)
> kunmap(page)
> else
> kunmap_atomic(x)
> 
> Which just seems cumbersome to me.

Passing a different flag based on something isn't really much better.

> In any case, if you can accept an sg_kmap and sg_kmap_atomic api just
> say so and I'll make the change. But I'll still need a flags variable
> for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path
> and both of those functions will need to be pretty nearly replicas of
> each other.

Again, wrong way.  Suddenly making things that normally don't fail start
failing for your special case is a recipe for bugs.


Re: [Intel-gfx] [PATCH v2 3/5] ACPI / bus: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
> + in_params[0].buffer.pointer = (u8 *)

Any idea why the pointer is defined as a u8 * in union acpi_object
instead of a void?


Re: [Intel-gfx] [PATCH v2 1/5] acpi, nfit: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
>   for (i = 0; i < NFIT_UUID_MAX; i++)
> - if (memcmp(to_nfit_uuid(i), spa->range_guid, 16) == 0)
> + if (!guid_equal(to_nfit_uuid(i), (guid_t *)&spa->range_guid))
>   return i;

I think this should be guid_equal without the "!"


Re: [Intel-gfx] [PATCH v2 2/5] ACPI / APEI: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
> - if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
> -  CPER_SEC_PLATFORM_MEM)) {
> + if (!guid_equal(sec_type, &CPER_SEC_PLATFORM_MEM)) {

The "!" here seems incorrect.

>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> - else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
> -   CPER_SEC_PCIE)) {
> + else if (!guid_equal(sec_type, &CPER_SEC_PCIE)) {

Same here.


Re: [Intel-gfx] [PATCH v2 4/5] ACPI / extlog: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
> @@ -165,11 +165,11 @@ static int extlog_print(struct notifier_block *nb, 
> unsigned long val,
>   err_seq++;
>   gdata = (struct acpi_hest_generic_data *)(tmp + 1);
>   if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
> - fru_id = (uuid_le *)gdata->fru_id;
> + fru_id = (guid_t *)gdata->fru_id;
>   if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT)
>   fru_text = gdata->fru_text;
> - sec_type = (uuid_le *)gdata->section_type;
> - if (!uuid_le_cmp(*sec_type, CPER_SEC_PLATFORM_MEM)) {
> + sec_type = (guid_t *)gdata->section_type;

From a quick look over the tree it seems like both fru_id and
section_type should be declared as guid_t in
struct acpi_hest_generic_data.

> + if (!guid_equal(sec_type, &CPER_SEC_PLATFORM_MEM)) {

The "!" seems incorrect here.


Re: [Intel-gfx] [PATCH v2 2/5] ACPI / APEI: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
On Mon, Jun 05, 2017 at 07:19:43PM +0300, Andy Shevchenko wrote:
> Thanks!
> Are you going to fix up when applying, or is it better for me to send an
> updated version?

I'd prefer a resend so that all the maintainers can carefully re-review
the patches.


Re: [Intel-gfx] [PATCH v2 5/5] ACPI: Switch to use generic guid_t in acpi_evaluate_dsm()

2017-06-01 Thread Christoph Hellwig
On Thu, Jun 01, 2017 at 12:42:30AM +0200, Rafael J. Wysocki wrote:
> On Wednesday, May 31, 2017 10:41:52 PM Andy Shevchenko wrote:
> > acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> > bytes. Instead we convert them to use guid_t type. At the same time we
> > convert current users.
> > 
> > acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> > get rid of it.
> 
> Acked-by: Rafael J. Wysocki 
> 
> with one caveat.
> 
> I have a pending patch that will use acpi_evaluate_dsm(), so I'd like this to
> be made available in an immutable branch once applied.

I hope to make the current uuid-types branch immutable soon, I just
want to collect a few more reviews.


Re: [Intel-gfx] [PATCH v2 0/5] ACPI et al: convert to use new UUID API

2017-05-31 Thread Christoph Hellwig
On Wed, May 31, 2017 at 10:41:47PM +0300, Andy Shevchenko wrote:
> This series converts ACPI and users of acpi_evaluate_dsm() to new UUID
> API which includes new types and methods.
> 
> Patches are based on uuid tree [1] from Christoph Hellwig and supposed to
> go through it.
> 
> (Christoph, I think it would be nice to attach them to your stuff)

I'd be fine with that in general, as long as we have enough reviewers.


Re: [Intel-gfx] [PATCH v2 3/5] ACPI / bus: Switch to use new generic UUID API

2017-06-06 Thread Christoph Hellwig
On Mon, Jun 05, 2017 at 10:37:41PM +0200, Rafael J. Wysocki wrote:
> On Mon, Jun 5, 2017 at 6:20 PM, Andy Shevchenko
> <andriy.shevche...@linux.intel.com> wrote:
> > On Mon, 2017-06-05 at 18:03 +0200, Christoph Hellwig wrote:
> >> > +   in_params[0].buffer.pointer = (u8 *)
> >>
> >> Any idea why the pointer is defined as a u8 * in union acpi_object
> >> instead of a void?
> >
> > I guess this question to Rafael.
> 
> That data type is defined in upstream ACPICA and the reason why it is
> defined the way it is has something to do with history I suppose.

Ok, I'll send a patch to fix it - just about every user of the field
currently requires a cast, which is not a good thing.


Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Christoph Hellwig
On Fri, May 05, 2017 at 10:06:06AM +0300, Amir Goldstein wrote:
> I'd much rather that you sort out the uuid helpers in a way that will
> satisfy the filesystem needs (just provide the helpers, no need to
> convert the filesystem code).

Yeah.

> IMO, you should acknowledge that the common use case for filesystems is
> to handle an opaque char[16] which most likely holds a uuid_be and you
> should provide 'neutral' helpers to satisfy this use case.
> 
> The simplest would be to typedef uuid_t to struct uuid_be and to name the
> 'neutral' helpers uuid_cmp/uuid_copy(uuid_t *, uuid_t *), similar to my
> proposal.

It's not just neutral, it's the right thing to do.  The Apollo / DCE
uuids (later also specified in RFC 4122) are big endian, so we should
use the name there.

> Christoph also suggested a similar treatment to typedef guid_t to
> struct uuid_le.

Exactly.  The whole idea of "little endian UUIDs" comes from the
Wintel world, and if you look at the relevant specs they are almost
exclusively referred to as GUIDs.
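
To make the distinction concrete, here is the same textual value expressed
both ways with the init macros from this rework (illustrative only;
GUID_INIT() stores the first three fields little endian, UUID_INIT()
stores everything big endian):

#include <linux/uuid.h>

static const guid_t example_guid =
        GUID_INIT(0x00112233, 0x4455, 0x6677,
                  0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff);

static const uuid_t example_uuid =
        UUID_INIT(0x00112233, 0x4455, 0x6677,
                  0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff);

/* example_guid.b[]: 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff */
/* example_uuid.b[]: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff */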

The magic XFS and AFS types for specific interpretations of one of
the RFC 4122 formats don't really help, but I'll just send a patch
to kill them off for XFS ASAP to at least get that out, and we
probably should revert at least

"afs: Move UUID struct to linux/uuid.h"

which moved the AFS mess to common code as a start.  Then we should
also clean up the way we generate random UUIDs, where we currently
have an le helper, a be helper and then also generate_random_uuid,
just to confuse the heck out of people, with no description of
any of them.


Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Christoph Hellwig
On Fri, May 05, 2017 at 12:50:31PM +0300, Amir Goldstein wrote:
> To complete the picture for folks not cc'ed on my patches,
> xfs use case suggests there is also justification for the additional helpers:
> 
> uuid_is_null() / uuid_equal()
> guid_is_null() / guid_equal()

The is_null is useful and I bet we'll find other instances.  I'm
not sure _equals really adds much value over the existing _cmp
helpers, but on the other hand they are so trivial that we might as
well add them.

The other thing XFS has is uuid_copy.
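
For context, the helpers in question are all one-liners; roughly like this
(a sketch, with uuid_null standing in for an all-zeroes constant):

#include <linux/string.h>
#include <linux/uuid.h>

static inline bool uuid_equal(const uuid_t *u1, const uuid_t *u2)
{
        return memcmp(u1, u2, sizeof(uuid_t)) == 0;
}

static inline bool uuid_is_null(const uuid_t *uuid)
{
        return uuid_equal(uuid, &uuid_null);    /* all-zero constant */
}

static inline void uuid_copy(uuid_t *dst, const uuid_t *src)
{
        memcpy(dst, src, sizeof(uuid_t));
}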


Re: [Intel-gfx] [PATCH v3 1/5] acpi, nfit: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
On Mon, Jun 05, 2017 at 08:10:42PM +0300, Andy Shevchenko wrote:
> I hope Christoph can replace old version of this series with new one in
> his uuid branch. URL in cover letter, repeating for your convenience:
> 
> [1]: git://git.infradead.org/users/hch/uuid.git

Yeah, but I had to drop it again after noticing the guid_equals bits
and that the GUID -> GUID_INIT rename had not been fixed up.

for-next in the above repo should have the proper base now, and acpi
has the ACPI bits for Dan to test.


Re: [Intel-gfx] [PATCH v2 4/5] ACPI / extlog: Switch to use new generic UUID API

2017-06-05 Thread Christoph Hellwig
On Mon, Jun 05, 2017 at 07:23:05PM +0300, Andy Shevchenko wrote:
> > From a quick look over the tree it seems like both fru_id and
> > section_type should be declared as guid_t in
> > struct acpi_hest_generic_data.
> 
> 
> They are arrays of 16 u8s.

As is the guid_t.

> And since it's defined in ACPI table definition I'm not sure we can
> change it.

If we always use 16 u8s as guid_t we might as well define the structure
as such, instead of all the casts that remove type safety.
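
A sketch of the idea (not the actual ACPICA definition; only a few fields
are shown):

#include <linux/uuid.h>

struct acpi_hest_generic_data {
        guid_t section_type;    /* instead of u8 section_type[16] */
        /* ... */
        guid_t fru_id;          /* instead of u8 fru_id[16] */
        char fru_text[20];
        /* ... */
};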


Re: [Intel-gfx] linux-next: build failure after merge of the drm-intel tree

2018-05-08 Thread Christoph Hellwig
On Wed, May 09, 2018 at 03:02:55PM +1000, Stephen Rothwell wrote:
> > -   ret = of_dma_configure(dev, NULL);
> > +   ret = of_dma_configure(dev, NULL, true);
> > if (ret < 0) {
> > DRM_ERROR("Cannot setup DMA ops, ret %d", ret);
> > return ret;
> > -- 
> > 2.17.0
> 
> This is now needed for the merge of the drm and dma-mapping trees.

FYI, because the dma_configure change touches so much code and the author
wants to base more work on it, it actually is in a guaranteed stable
branch with just those patches:

git://git.infradead.org/users/hch/dma-mapping.git dma-configure

Gitweb:


http://git.infradead.org/users/hch/dma-mapping.git/shortlog/refs/heads/dma-configure
  
Feel free to pull this into the drm tree.


Re: [Intel-gfx] [PATCH] swiotlb: Suppress "Out of SW-IOMMU" errors for NO_WARN

2018-05-02 Thread Christoph Hellwig
On Wed, May 02, 2018 at 11:05:01AM +0100, Chris Wilson wrote:
> This extends the warning suppression from commit d0bc0c2a31c9 ("swiotlb:
> suppress warning when __GFP_NOWARN is set") to suppress the warnings
> when DMA_ATTR_NO_WARN is given by caller. In such cases the caller wants
> to handle the error themselves.

I have patches that remove swiotlb_full entirely to be posted soon.
I'll defer the patch for now.


Re: [Intel-gfx] [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 02:15:44PM +, David Laight wrote:
> Does this still guarantee that requests for 16k will not cross a 16k boundary?
> It looks like you are losing the alignment parameter.

The DMA API never gave you alignment guarantees to start with,
and you can get not naturally aligned memory from many of our
current implementations.

Re: [Intel-gfx] [PATCH 12/16] staging/comedi: mark as broken

2019-06-14 Thread Christoph Hellwig
On Fri, Jun 14, 2019 at 04:02:39PM +0200, Greg KH wrote:
> Perhaps a hint as to how we can fix this up?  This is the first time
> I've heard of the comedi code not handling dma properly.

It can be fixed by:

 a) never calling virt_to_page (or vmalloc_to_page for that matter)
on dma allocation
 b) never remapping dma allocation with conflicting cache modes
(no remapping should be doable after a) anyway).
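
A minimal sketch of that approach (hypothetical names): one coherent
allocation per buffer, accessed through the vaddr the DMA API returns, with
no virt_to_page()/vmap() remapping afterwards:

#include <linux/dma-mapping.h>

struct acq_buffer {
        void *vaddr;
        dma_addr_t dma;
        size_t size;
};

static int acq_buffer_alloc(struct device *dev, struct acq_buffer *buf,
                            size_t size)
{
        buf->vaddr = dma_alloc_coherent(dev, size, &buf->dma, GFP_KERNEL);
        if (!buf->vaddr)
                return -ENOMEM;
        buf->size = size;
        return 0;
}

static void acq_buffer_free(struct device *dev, struct acq_buffer *buf)
{
        dma_free_coherent(dev, buf->size, buf->vaddr, buf->dma);
}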

Re: [Intel-gfx] [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 03:01:22PM +, David Laight wrote:
> I'm pretty sure there is a lot of code out there that makes that assumption.
> Without it many drivers will have to allocate almost double the
> amount of memory they actually need in order to get the required alignment.
> So instead of saving memory you'll actually make more be used.

That code would already be broken on a lot of Linux platforms.

[Intel-gfx] [PATCH 06/16] drm: don't pass __GFP_COMP to dma_alloc_coherent in drm_pci_alloc

2019-06-14 Thread Christoph Hellwig
The memory returned from dma_alloc_coherent is opaque to the user,
thus the exact way of page refcounting should not matter either.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/drm_bufs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index b640437ce90f..7a79a16a055c 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -70,7 +70,7 @@ drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, 
size_t size, size_t ali
 	dmah->size = size;
 	dmah->vaddr = dma_alloc_coherent(&dev->pdev->dev, size,
 					 &dmah->busaddr,
-					 GFP_KERNEL | __GFP_COMP);
+					 GFP_KERNEL);
 
if (dmah->vaddr == NULL) {
kfree(dmah);
-- 
2.20.1


[Intel-gfx] [PATCH 11/16] s390/ism: stop passing bogus gfp flags arguments to dma_alloc_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent is not just the page allocator.  The only valid
arguments to pass are either GFP_ATOMIC or GFP_KERNEL with possible
modifiers of __GFP_NORETRY or __GFP_NOWARN.

Signed-off-by: Christoph Hellwig 
---
 drivers/s390/net/ism_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c
index 4fc2056bd227..4ff5506fa4c6 100644
--- a/drivers/s390/net/ism_drv.c
+++ b/drivers/s390/net/ism_drv.c
@@ -241,7 +241,8 @@ static int ism_alloc_dmb(struct ism_dev *ism, struct 
smcd_dmb *dmb)
 
dmb->cpu_addr = dma_alloc_coherent(>pdev->dev, dmb->dmb_len,
   >dma_addr,
-  GFP_KERNEL | __GFP_NOWARN | 
__GFP_NOMEMALLOC | __GFP_COMP | __GFP_NORETRY);
+  GFP_KERNEL | __GFP_NOWARN |
+  __GFP_NORETRY);
if (!dmb->cpu_addr)
clear_bit(dmb->sba_idx, ism->sba_bitmap);
 
-- 
2.20.1


[Intel-gfx] [PATCH 09/16] cnic: stop passing bogus gfp flags arguments to dma_alloc_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent is not just the page allocator.  The only valid
arguments to pass are either GFP_ATOMIC or GFP_KERNEL with possible
modifiers of __GFP_NORETRY or __GFP_NOWARN.

Signed-off-by: Christoph Hellwig 
---
 drivers/net/ethernet/broadcom/cnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/cnic.c 
b/drivers/net/ethernet/broadcom/cnic.c
index 57dc3cbff36e..bd1c993680e5 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -1028,7 +1028,7 @@ static int __cnic_alloc_uio_rings(struct cnic_uio_dev 
*udev, int pages)
udev->l2_ring_size = pages * CNIC_PAGE_SIZE;
udev->l2_ring = dma_alloc_coherent(>pdev->dev, udev->l2_ring_size,
   >l2_ring_map,
-  GFP_KERNEL | __GFP_COMP);
+  GFP_KERNEL);
if (!udev->l2_ring)
return -ENOMEM;
 
@@ -1036,7 +1036,7 @@ static int __cnic_alloc_uio_rings(struct cnic_uio_dev 
*udev, int pages)
udev->l2_buf_size = CNIC_PAGE_ALIGN(udev->l2_buf_size);
udev->l2_buf = dma_alloc_coherent(>pdev->dev, udev->l2_buf_size,
  >l2_buf_map,
- GFP_KERNEL | __GFP_COMP);
+ GFP_KERNEL);
if (!udev->l2_buf) {
__cnic_free_uio_rings(udev);
return -ENOMEM;
-- 
2.20.1


[Intel-gfx] [PATCH 08/16] IB/qib: stop passing bogus gfp flags arguments to dma_alloc_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent is not just the page allocator.  The only valid
arguments to pass are either GFP_ATOMIC or GFP_KERNEL with possible
modifiers of __GFP_NORETRY or __GFP_NOWARN.

Signed-off-by: Christoph Hellwig 
---
 drivers/infiniband/hw/qib/qib_iba6120.c |  2 +-
 drivers/infiniband/hw/qib/qib_init.c| 20 +++-
 2 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_iba6120.c 
b/drivers/infiniband/hw/qib/qib_iba6120.c
index 531d8a1db2c3..d8a0b8993d22 100644
--- a/drivers/infiniband/hw/qib/qib_iba6120.c
+++ b/drivers/infiniband/hw/qib/qib_iba6120.c
@@ -2076,7 +2076,7 @@ static void alloc_dummy_hdrq(struct qib_devdata *dd)
dd->cspec->dummy_hdrq = dma_alloc_coherent(>pcidev->dev,
dd->rcd[0]->rcvhdrq_size,
>cspec->dummy_hdrq_phys,
-   GFP_ATOMIC | __GFP_COMP);
+   GFP_ATOMIC);
if (!dd->cspec->dummy_hdrq) {
qib_devinfo(dd->pcidev, "Couldn't allocate dummy hdrq\n");
/* fallback to just 0'ing */
diff --git a/drivers/infiniband/hw/qib/qib_init.c 
b/drivers/infiniband/hw/qib/qib_init.c
index d4fd8a6cff7b..072885a6684d 100644
--- a/drivers/infiniband/hw/qib/qib_init.c
+++ b/drivers/infiniband/hw/qib/qib_init.c
@@ -1547,18 +1547,13 @@ int qib_create_rcvhdrq(struct qib_devdata *dd, struct 
qib_ctxtdata *rcd)
 
if (!rcd->rcvhdrq) {
dma_addr_t phys_hdrqtail;
-   gfp_t gfp_flags;
-
amt = ALIGN(dd->rcvhdrcnt * dd->rcvhdrentsize *
sizeof(u32), PAGE_SIZE);
-   gfp_flags = (rcd->ctxt >= dd->first_user_ctxt) ?
-   GFP_USER : GFP_KERNEL;
 
old_node_id = dev_to_node(>pcidev->dev);
set_dev_node(>pcidev->dev, rcd->node_id);
rcd->rcvhdrq = dma_alloc_coherent(
-   >pcidev->dev, amt, >rcvhdrq_phys,
-   gfp_flags | __GFP_COMP);
+   >pcidev->dev, amt, >rcvhdrq_phys, GFP_KERNEL);
set_dev_node(>pcidev->dev, old_node_id);
 
if (!rcd->rcvhdrq) {
@@ -1578,7 +1573,7 @@ int qib_create_rcvhdrq(struct qib_devdata *dd, struct 
qib_ctxtdata *rcd)
set_dev_node(>pcidev->dev, rcd->node_id);
rcd->rcvhdrtail_kvaddr = dma_alloc_coherent(
>pcidev->dev, PAGE_SIZE, _hdrqtail,
-   gfp_flags);
+   GFP_KERNEL);
set_dev_node(>pcidev->dev, old_node_id);
if (!rcd->rcvhdrtail_kvaddr)
goto bail_free;
@@ -1622,17 +1617,8 @@ int qib_setup_eagerbufs(struct qib_ctxtdata *rcd)
struct qib_devdata *dd = rcd->dd;
unsigned e, egrcnt, egrperchunk, chunk, egrsize, egroff;
size_t size;
-   gfp_t gfp_flags;
int old_node_id;
 
-   /*
-* GFP_USER, but without GFP_FS, so buffer cache can be
-* coalesced (we hope); otherwise, even at order 4,
-* heavy filesystem activity makes these fail, and we can
-* use compound pages.
-*/
-   gfp_flags = __GFP_RECLAIM | __GFP_IO | __GFP_COMP;
-
egrcnt = rcd->rcvegrcnt;
egroff = rcd->rcvegr_tid_base;
egrsize = dd->rcvegrbufsize;
@@ -1664,7 +1650,7 @@ int qib_setup_eagerbufs(struct qib_ctxtdata *rcd)
rcd->rcvegrbuf[e] =
dma_alloc_coherent(>pcidev->dev, size,
   >rcvegrbuf_phys[e],
-  gfp_flags);
+  GFP_KERNEL);
set_dev_node(>pcidev->dev, old_node_id);
if (!rcd->rcvegrbuf[e])
goto bail_rcvegrbuf_phys;
-- 
2.20.1


[Intel-gfx] [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread Christoph Hellwig
Many architectures (e.g. arm, m68k and sh) have always used exact
allocation in their dma coherent allocator, which avoids a lot of
memory waste, especially for larger allocations.  Lift this behavior
into the generic allocator so that dma-direct and the generic IOMMU
code benefit from it as well.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-contiguous.h |  8 +---
 kernel/dma/contiguous.c| 17 +++--
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index c05d4e661489..2e542e314acf 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -161,15 +161,17 @@ static inline struct page *dma_alloc_contiguous(struct 
device *dev, size_t size,
gfp_t gfp)
 {
int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
-   size_t align = get_order(PAGE_ALIGN(size));
+   void *cpu_addr = alloc_pages_exact_node(node, size, gfp);
 
-   return alloc_pages_node(node, gfp, align);
+   if (!cpu_addr)
+   return NULL;
+   return virt_to_page(cpu_addr);
 }
 
 static inline void dma_free_contiguous(struct device *dev, struct page *page,
size_t size)
 {
-   __free_pages(page, get_order(size));
+   free_pages_exact(page_address(page), size);
 }
 
 #endif
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index bfc0c17f2a3d..84f41eea2741 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -232,9 +232,8 @@ struct page *dma_alloc_contiguous(struct device *dev, 
size_t size, gfp_t gfp)
 {
int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   size_t align = get_order(PAGE_ALIGN(size));
-   struct page *page = NULL;
struct cma *cma = NULL;
+   void *cpu_addr;
 
if (dev && dev->cma_area)
cma = dev->cma_area;
@@ -243,14 +242,20 @@ struct page *dma_alloc_contiguous(struct device *dev, 
size_t size, gfp_t gfp)
 
/* CMA can be used only in the context which permits sleeping */
if (cma && gfpflags_allow_blocking(gfp)) {
+   size_t align = get_order(PAGE_ALIGN(size));
+   struct page *page;
+
align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
+   if (page)
+   return page;
}
 
/* Fallback allocation of normal pages */
-   if (!page)
-   page = alloc_pages_node(node, gfp, align);
-   return page;
+   cpu_addr = alloc_pages_exact_node(node, size, gfp);
+   if (!cpu_addr)
+   return NULL;
+   return virt_to_page(cpu_addr);
 }
 
 /**
@@ -267,7 +272,7 @@ struct page *dma_alloc_contiguous(struct device *dev, 
size_t size, gfp_t gfp)
 void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
 {
if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
-   __free_pages(page, get_order(size));
+   free_pages_exact(page_address(page), size);
 }
 
 /*
-- 
2.20.1


[Intel-gfx] [PATCH 07/16] IB/hfi1: stop passing bogus gfp flags arguments to dma_alloc_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent is not just the page allocator.  The only valid
arguments to pass are either GFP_ATOMIC or GFP_KERNEL with possible
modifiers of __GFP_NORETRY or __GFP_NOWARN.

Signed-off-by: Christoph Hellwig 
---
 drivers/infiniband/hw/hfi1/init.c | 22 +++---
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c 
b/drivers/infiniband/hw/hfi1/init.c
index 71cb9525c074..ff9d106ee784 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1846,17 +1846,10 @@ int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct 
hfi1_ctxtdata *rcd)
u64 reg;
 
if (!rcd->rcvhdrq) {
-   gfp_t gfp_flags;
-
amt = rcvhdrq_size(rcd);
 
-   if (rcd->ctxt < dd->first_dyn_alloc_ctxt || rcd->is_vnic)
-   gfp_flags = GFP_KERNEL;
-   else
-   gfp_flags = GFP_USER;
rcd->rcvhdrq = dma_alloc_coherent(>pcidev->dev, amt,
- >rcvhdrq_dma,
- gfp_flags | __GFP_COMP);
+ >rcvhdrq_dma, 
GFP_KERNEL);
 
if (!rcd->rcvhdrq) {
dd_dev_err(dd,
@@ -1870,7 +1863,7 @@ int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct 
hfi1_ctxtdata *rcd)
rcd->rcvhdrtail_kvaddr = 
dma_alloc_coherent(>pcidev->dev,
PAGE_SIZE,

>rcvhdrqtailaddr_dma,
-   gfp_flags);
+   GFP_KERNEL);
if (!rcd->rcvhdrtail_kvaddr)
goto bail_free;
}
@@ -1926,19 +1919,10 @@ int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd)
 {
struct hfi1_devdata *dd = rcd->dd;
u32 max_entries, egrtop, alloced_bytes = 0;
-   gfp_t gfp_flags;
u16 order, idx = 0;
int ret = 0;
u16 round_mtu = roundup_pow_of_two(hfi1_max_mtu);
 
-   /*
-* GFP_USER, but without GFP_FS, so buffer cache can be
-* coalesced (we hope); otherwise, even at order 4,
-* heavy filesystem activity makes these fail, and we can
-* use compound pages.
-*/
-   gfp_flags = __GFP_RECLAIM | __GFP_IO | __GFP_COMP;
-
/*
 * The minimum size of the eager buffers is a groups of MTU-sized
 * buffers.
@@ -1969,7 +1953,7 @@ int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd)
dma_alloc_coherent(>pcidev->dev,
   rcd->egrbufs.rcvtid_size,
   >egrbufs.buffers[idx].dma,
-  gfp_flags);
+  GFP_KERNEL);
if (rcd->egrbufs.buffers[idx].addr) {
rcd->egrbufs.buffers[idx].len =
rcd->egrbufs.rcvtid_size;
-- 
2.20.1


Re: [Intel-gfx] [PATCH 12/16] staging/comedi: mark as broken

2019-06-14 Thread Christoph Hellwig
On Fri, Jun 14, 2019 at 05:30:32PM +0200, Greg KH wrote:
> On Fri, Jun 14, 2019 at 04:48:57PM +0200, Christoph Hellwig wrote:
> > On Fri, Jun 14, 2019 at 04:02:39PM +0200, Greg KH wrote:
> > > Perhaps a hint as to how we can fix this up?  This is the first time
> > > I've heard of the comedi code not handling dma properly.
> > 
> > It can be fixed by:
> > 
> >  a) never calling virt_to_page (or vmalloc_to_page for that matter)
> > on dma allocation
> >  b) never remapping dma allocation with conflicting cache modes
> > (no remapping should be doable after a) anyway).
> 
> Ok, fair enough, have any pointers of drivers/core code that does this
> correctly?  I can put it on my todo list, but might take a week or so...

Just about everyone else.  They just need to remove the vmap and
either do one large allocation, or live with the fact that they need
helpers to access multiple array elements instead of one net vmap,
which most of the users already seem to do anyway, with just a few
using the vmap (which might explain why we didn't see blowups yet).

[Intel-gfx] [PATCH 01/16] media: videobuf-dma-contig: use dma_mmap_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent does not return a physical address, but a DMA
address, which might be remapped or have an offset.  Passing this
DMA address to vm_iomap_memory is completely bogus.  Use the proper
dma_mmap_coherent helper instead, and stop passing __GFP_COMP
to dma_alloc_coherent, as the memory management inside the DMA
allocator is hidden from the callers.

Fixes: a8f3c203e19b ("[media] videobuf-dma-contig: add cache support")
Signed-off-by: Christoph Hellwig 
---
 drivers/media/v4l2-core/videobuf-dma-contig.c | 23 +++
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c 
b/drivers/media/v4l2-core/videobuf-dma-contig.c
index e1bf50df4c70..a5942ea38f1f 100644
--- a/drivers/media/v4l2-core/videobuf-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
@@ -39,11 +39,11 @@ struct videobuf_dma_contig_memory {
 
 static int __videobuf_dc_alloc(struct device *dev,
   struct videobuf_dma_contig_memory *mem,
-  unsigned long size, gfp_t flags)
+  unsigned long size)
 {
mem->size = size;
-   mem->vaddr = dma_alloc_coherent(dev, mem->size,
-   >dma_handle, flags);
+   mem->vaddr = dma_alloc_coherent(dev, mem->size, >dma_handle,
+   GFP_KERNEL);
 
if (!mem->vaddr) {
dev_err(dev, "memory alloc size %ld failed\n", mem->size);
@@ -260,8 +260,7 @@ static int __videobuf_iolock(struct videobuf_queue *q,
return videobuf_dma_contig_user_get(mem, vb);
 
/* allocate memory for the read() method */
-   if (__videobuf_dc_alloc(q->dev, mem, PAGE_ALIGN(vb->size),
-   GFP_KERNEL))
+   if (__videobuf_dc_alloc(q->dev, mem, PAGE_ALIGN(vb->size)))
return -ENOMEM;
break;
case V4L2_MEMORY_OVERLAY:
@@ -280,7 +279,6 @@ static int __videobuf_mmap_mapper(struct videobuf_queue *q,
struct videobuf_dma_contig_memory *mem;
struct videobuf_mapping *map;
int retval;
-   unsigned long size;
 
dev_dbg(q->dev, "%s\n", __func__);
 
@@ -298,23 +296,18 @@ static int __videobuf_mmap_mapper(struct videobuf_queue 
*q,
BUG_ON(!mem);
MAGIC_CHECK(mem->magic, MAGIC_DC_MEM);
 
-   if (__videobuf_dc_alloc(q->dev, mem, PAGE_ALIGN(buf->bsize),
-   GFP_KERNEL | __GFP_COMP))
+   if (__videobuf_dc_alloc(q->dev, mem, PAGE_ALIGN(buf->bsize)))
goto error;
 
-   /* Try to remap memory */
-   size = vma->vm_end - vma->vm_start;
-   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-
/* the "vm_pgoff" is just used in v4l2 to find the
 * corresponding buffer data structure which is allocated
 * earlier and it does not mean the offset from the physical
 * buffer start address as usual. So set it to 0 to pass
-* the sanity check in vm_iomap_memory().
+* the sanity check in dma_mmap_coherent().
 */
vma->vm_pgoff = 0;
-
-   retval = vm_iomap_memory(vma, mem->dma_handle, size);
+   retval = dma_mmap_coherent(q->dev, vma, mem->vaddr, mem->dma_handle,
+   vma->vm_end - vma->vm_start);
if (retval) {
dev_err(q->dev, "mmap: remap failed with error %d. ",
retval);
-- 
2.20.1


[Intel-gfx] use exact allocation for dma coherent memory

2019-06-14 Thread Christoph Hellwig
Hi all,

various architectures have used exact memory allocations for dma
allocations for a long time, but x86 and thus the common code based
on it kept using our normal power of two allocator, which tends to
waste a lot of memory for certain allocations.

Switching to a slightly cleaned up alloc_pages_exact is pretty easy,
but it turns out that because we didn't filter valid gfp_t flags
on the DMA allocator, a bunch of drivers were passing __GFP_COMP
to it, which is rather bogus in too many ways to explain.  Arm has
been filtering it for a while, but this series instead tries to fix
the drivers and warn when __GFP_COMP is passed, which makes it much
larger than just adding the functionality.
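
As a concrete example of the waste (a sketch, not from the series itself): a
129 KiB buffer needs an order-6, 256 KiB block from the power-of-two
allocator, while alloc_pages_exact() hands the unused tail pages back and
keeps only 33 pages (132 KiB).  The function names below are hypothetical:

#include <linux/gfp.h>

static void *alloc_fw_buffer(void)
{
        /* alloc_pages(gfp, get_order(129 << 10)) would pin 256 KiB */
        return alloc_pages_exact(129 << 10, GFP_KERNEL);
}

static void free_fw_buffer(void *buf)
{
        free_pages_exact(buf, 129 << 10);
}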

[Intel-gfx] [PATCH 02/16] drm/ati_pcigart: stop using drm_pci_alloc

2019-06-14 Thread Christoph Hellwig
Remove usage of the legacy drm PCI DMA wrappers, and with that the
incorrect usage cocktail of __GFP_COMP, virt_to_page on DMA allocation
and SetPageReserved.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/ati_pcigart.c | 27 +++
 include/drm/ati_pcigart.h |  5 -
 2 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index 2362f07fe1fc..f66d4fccd812 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -41,21 +41,14 @@
 static int drm_ati_alloc_pcigart_table(struct drm_device *dev,
   struct drm_ati_pcigart_info *gart_info)
 {
-   gart_info->table_handle = drm_pci_alloc(dev, gart_info->table_size,
-   PAGE_SIZE);
-   if (gart_info->table_handle == NULL)
+   gart_info->table_vaddr = dma_alloc_coherent(>pdev->dev,
+   gart_info->table_size, _info->table_handle,
+   GFP_KERNEL);
+   if (!gart_info->table_vaddr)
return -ENOMEM;
-
return 0;
 }
 
-static void drm_ati_free_pcigart_table(struct drm_device *dev,
-  struct drm_ati_pcigart_info *gart_info)
-{
-   drm_pci_free(dev, gart_info->table_handle);
-   gart_info->table_handle = NULL;
-}
-
 int drm_ati_pcigart_cleanup(struct drm_device *dev, struct 
drm_ati_pcigart_info *gart_info)
 {
struct drm_sg_mem *entry = dev->sg;
@@ -87,8 +80,10 @@ int drm_ati_pcigart_cleanup(struct drm_device *dev, struct 
drm_ati_pcigart_info
}
 
if (gart_info->gart_table_location == DRM_ATI_GART_MAIN &&
-   gart_info->table_handle) {
-   drm_ati_free_pcigart_table(dev, gart_info);
+   gart_info->table_vaddr) {
+   dma_free_coherent(>pdev->dev, gart_info->table_size,
+   gart_info->table_vaddr, 
gart_info->table_handle);
+   gart_info->table_vaddr = NULL;
}
 
return 1;
@@ -127,9 +122,9 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
goto done;
}
 
-   pci_gart = gart_info->table_handle->vaddr;
-   address = gart_info->table_handle->vaddr;
-   bus_address = gart_info->table_handle->busaddr;
+   pci_gart = gart_info->table_vaddr;
+   address = gart_info->table_vaddr;
+   bus_address = gart_info->table_handle;
} else {
address = gart_info->addr;
bus_address = gart_info->bus_addr;
diff --git a/include/drm/ati_pcigart.h b/include/drm/ati_pcigart.h
index a728a1364e66..2ffe278ba3b3 100644
--- a/include/drm/ati_pcigart.h
+++ b/include/drm/ati_pcigart.h
@@ -18,7 +18,10 @@ struct drm_ati_pcigart_info {
void *addr;
dma_addr_t bus_addr;
dma_addr_t table_mask;
-   struct drm_dma_handle *table_handle;
+
+   dma_addr_t table_handle;
+   void *table_vaddr;
+
struct drm_local_map mapping;
int table_size;
 };
-- 
2.20.1


[Intel-gfx] [PATCH 15/16] dma-mapping: clear __GFP_COMP in dma_alloc_attrs

2019-06-14 Thread Christoph Hellwig
Lift the code to clear __GFP_COMP from arm into the common DMA
allocator path.  For one this fixes the various other patches that
call alloc_pages_exact or split_page in case a bogus driver passes
the argument, and it also prepares for doing exact allocation in
the generic dma-direct allocator.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/mm/dma-mapping.c | 17 -
 kernel/dma/mapping.c  |  9 +
 2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0a75058c11f3..86135feb2c05 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -759,14 +759,6 @@ static void *__dma_alloc(struct device *dev, size_t size, 
dma_addr_t *handle,
if (mask < 0xULL)
gfp |= GFP_DMA;
 
-   /*
-* Following is a work-around (a.k.a. hack) to prevent pages
-* with __GFP_COMP being passed to split_page() which cannot
-* handle them.  The real problem is that this flag probably
-* should be 0 on ARM as it is not supported on this
-* platform; see CONFIG_HUGETLBFS.
-*/
-   gfp &= ~(__GFP_COMP);
args.gfp = gfp;
 
*handle = DMA_MAPPING_ERROR;
@@ -1527,15 +1519,6 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, 
size_t size,
return __iommu_alloc_simple(dev, size, gfp, handle,
coherent_flag, attrs);
 
-   /*
-* Following is a work-around (a.k.a. hack) to prevent pages
-* with __GFP_COMP being passed to split_page() which cannot
-* handle them.  The real problem is that this flag probably
-* should be 0 on ARM as it is not supported on this
-* platform; see CONFIG_HUGETLBFS.
-*/
-   gfp &= ~(__GFP_COMP);
-
pages = __iommu_alloc_buffer(dev, size, gfp, attrs, coherent_flag);
if (!pages)
return NULL;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index f7afdadb6770..4b618e1abbc1 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -252,6 +252,15 @@ void *dma_alloc_attrs(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
/* let the implementation decide on the zone to allocate from: */
flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
+   /*
+* __GFP_COMP interacts badly with splitting up a larger order
+* allocation.  But as our allocations might not even come from the
+* page allocator, the callers can't rely on the fact that they
+* even get pages, never mind which kind.
+*/
+   if (WARN_ON_ONCE(flag & __GFP_COMP))
+   flag &= ~__GFP_COMP;
+
if (dma_is_direct(ops))
cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
else if (ops->alloc)
-- 
2.20.1


[Intel-gfx] [PATCH 14/16] mm: use alloc_pages_exact_node to implement alloc_pages_exact

2019-06-14 Thread Christoph Hellwig
No need to duplicate the logic over two functions that are almost the
same.

Signed-off-by: Christoph Hellwig 
---
 include/linux/gfp.h |  5 +++--
 mm/page_alloc.c | 39 +++
 2 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4274ea6bc72b..c616a23a3f81 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -530,9 +530,10 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int 
order,
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
-void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
 void free_pages_exact(void *virt, size_t size);
-void * __meminit alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask);
+void *alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask);
+#define alloc_pages_exact(size, gfp_mask) \
+   alloc_pages_exact_node(NUMA_NO_NODE, size, gfp_mask)
 
 #define __get_free_page(gfp_mask) \
__get_free_pages((gfp_mask), 0)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dd2fed66b656..dec68bd21a71 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4859,34 +4859,6 @@ static void *make_alloc_exact(unsigned long addr, 
unsigned int order,
return (void *)addr;
 }
 
-/**
- * alloc_pages_exact - allocate an exact number physically-contiguous pages.
- * @size: the number of bytes to allocate
- * @gfp_mask: GFP flags for the allocation, must not contain __GFP_COMP
- *
- * This function is similar to alloc_pages(), except that it allocates the
- * minimum number of pages to satisfy the request.  alloc_pages() can only
- * allocate memory in power-of-two pages.
- *
- * This function is also limited by MAX_ORDER.
- *
- * Memory allocated by this function must be released by free_pages_exact().
- *
- * Return: pointer to the allocated area or %NULL in case of error.
- */
-void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
-{
-   unsigned int order = get_order(size);
-   unsigned long addr;
-
-   if (WARN_ON_ONCE(gfp_mask & __GFP_COMP))
-   gfp_mask &= ~__GFP_COMP;
-
-   addr = __get_free_pages(gfp_mask, order);
-   return make_alloc_exact(addr, order, size);
-}
-EXPORT_SYMBOL(alloc_pages_exact);
-
 /**
  * alloc_pages_exact_node - allocate an exact number of physically-contiguous
  *pages on a node.
@@ -4894,12 +4866,15 @@ EXPORT_SYMBOL(alloc_pages_exact);
  * @size: the number of bytes to allocate
  * @gfp_mask: GFP flags for the allocation, must not contain __GFP_COMP
  *
- * Like alloc_pages_exact(), but try to allocate on node nid first before 
falling
- * back.
+ * This function is similar to alloc_pages_node(), except that it allocates the
+ * minimum number of pages to satisfy the request while alloc_pages() can only
+ * allocate memory in power-of-two pages.  This function is also limited by
+ * MAX_ORDER.
  *
- * Return: pointer to the allocated area or %NULL in case of error.
+ * Returns a pointer to the allocated area or %NULL in case of error, memory
+ * allocated by this function must be released by free_pages_exact().
  */
-void * __meminit alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask)
+void *alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask)
 {
unsigned int order = get_order(size);
struct page *p;
-- 
2.20.1


[Intel-gfx] [PATCH 04/16] drm: move drm_pci_{alloc, free} to drm_legacy

2019-06-14 Thread Christoph Hellwig
These functions are rather broken in that they try to pass __GFP_COMP
to dma_alloc_coherent, call virt_to_page on the return value and
mess with PageReserved.  They are also not actually used by any modern
driver.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/drm_bufs.c | 85 
 drivers/gpu/drm/drm_pci.c  | 89 --
 2 files changed, 85 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index bfc419ed9d6c..7418872d87c6 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -38,6 +38,91 @@
 
 #include 
 
+/**
+ * drm_pci_alloc - Allocate a PCI consistent memory block, for DMA.
+ * @dev: DRM device
+ * @size: size of block to allocate
+ * @align: alignment of block
+ *
+ * FIXME: This is a needless abstraction of the Linux dma-api and should be
+ * removed.
+ *
+ * Return: A handle to the allocated memory block on success or NULL on
+ * failure.
+ */
+drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t 
align)
+{
+   drm_dma_handle_t *dmah;
+   unsigned long addr;
+   size_t sz;
+
+   /* pci_alloc_consistent only guarantees alignment to the smallest
+* PAGE_SIZE order which is greater than or equal to the requested size.
+* Return NULL here for now to make sure nobody tries for larger 
alignment
+*/
+   if (align > size)
+   return NULL;
+
+   dmah = kmalloc(sizeof(drm_dma_handle_t), GFP_KERNEL);
+   if (!dmah)
+   return NULL;
+
+   dmah->size = size;
+   dmah->vaddr = dma_alloc_coherent(>pdev->dev, size,
+>busaddr,
+GFP_KERNEL | __GFP_COMP);
+
+   if (dmah->vaddr == NULL) {
+   kfree(dmah);
+   return NULL;
+   }
+
+   /* XXX - Is virt_to_page() legal for consistent mem? */
+   /* Reserve */
+   for (addr = (unsigned long)dmah->vaddr, sz = size;
+sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
+   SetPageReserved(virt_to_page((void *)addr));
+   }
+
+   return dmah;
+}
+
+/*
+ * Free a PCI consistent memory block without freeing its descriptor.
+ *
+ * This function is for internal use in the Linux-specific DRM core code.
+ */
+void __drm_legacy_pci_free(struct drm_device * dev, drm_dma_handle_t * dmah)
+{
+   unsigned long addr;
+   size_t sz;
+
+   if (dmah->vaddr) {
+   /* XXX - Is virt_to_page() legal for consistent mem? */
+   /* Unreserve */
+   for (addr = (unsigned long)dmah->vaddr, sz = dmah->size;
+sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
+   ClearPageReserved(virt_to_page((void *)addr));
+   }
+   dma_free_coherent(>pdev->dev, dmah->size, dmah->vaddr,
+ dmah->busaddr);
+   }
+}
+
+/**
+ * drm_pci_free - Free a PCI consistent memory block
+ * @dev: DRM device
+ * @dmah: handle to memory block
+ *
+ * FIXME: This is a needless abstraction of the Linux dma-api and should be
+ * removed.
+ */
+void drm_pci_free(struct drm_device * dev, drm_dma_handle_t * dmah)
+{
+   __drm_legacy_pci_free(dev, dmah);
+   kfree(dmah);
+}
+
 static struct drm_map_list *drm_find_matching_map(struct drm_device *dev,
  struct drm_local_map *map)
 {
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index 693748ad8b88..77a215f2a8e4 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -31,95 +31,6 @@
 #include "drm_internal.h"
 #include "drm_legacy.h"
 
-/**
- * drm_pci_alloc - Allocate a PCI consistent memory block, for DMA.
- * @dev: DRM device
- * @size: size of block to allocate
- * @align: alignment of block
- *
- * FIXME: This is a needless abstraction of the Linux dma-api and should be
- * removed.
- *
- * Return: A handle to the allocated memory block on success or NULL on
- * failure.
- */
-drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t 
align)
-{
-   drm_dma_handle_t *dmah;
-   unsigned long addr;
-   size_t sz;
-
-   /* pci_alloc_consistent only guarantees alignment to the smallest
-* PAGE_SIZE order which is greater than or equal to the requested size.
-* Return NULL here for now to make sure nobody tries for larger 
alignment
-*/
-   if (align > size)
-   return NULL;
-
-   dmah = kmalloc(sizeof(drm_dma_handle_t), GFP_KERNEL);
-   if (!dmah)
-   return NULL;
-
-   dmah->size = size;
-   dmah->vaddr = dma_alloc_coherent(>pdev->dev, size,
->busaddr,
-GFP_KERNEL | __GFP_COMP);
-

[Intel-gfx] [PATCH 05/16] drm: don't mark pages returned from drm_pci_alloc reserved

2019-06-14 Thread Christoph Hellwig
We are not allowed to call virt_to_page on pages returned from
dma_alloc_coherent, as in many cases the virtual address returned
is not actually part of the kernel direct mapping.  Also there
generally is no need to mark dma memory as reserved.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/drm_bufs.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index 7418872d87c6..b640437ce90f 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -77,13 +77,6 @@ drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, 
size_t size, size_t ali
return NULL;
}
 
-   /* XXX - Is virt_to_page() legal for consistent mem? */
-   /* Reserve */
-   for (addr = (unsigned long)dmah->vaddr, sz = size;
-sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
-   SetPageReserved(virt_to_page((void *)addr));
-   }
-
return dmah;
 }
 
@@ -97,16 +90,9 @@ void __drm_legacy_pci_free(struct drm_device * dev, 
drm_dma_handle_t * dmah)
unsigned long addr;
size_t sz;
 
-   if (dmah->vaddr) {
-   /* XXX - Is virt_to_page() legal for consistent mem? */
-   /* Unreserve */
-   for (addr = (unsigned long)dmah->vaddr, sz = dmah->size;
-sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
-   ClearPageReserved(virt_to_page((void *)addr));
-   }
+   if (dmah->vaddr)
dma_free_coherent(>pdev->dev, dmah->size, dmah->vaddr,
  dmah->busaddr);
-   }
 }
 
 /**
-- 
2.20.1


[Intel-gfx] [PATCH 10/16] iwlwifi: stop passing bogus gfp flags arguments to dma_alloc_coherent

2019-06-14 Thread Christoph Hellwig
dma_alloc_coherent is not just the page allocator.  The only valid
arguments to pass are either GFP_ATOMIC or GFP_KERNEL with possible
modifiers of __GFP_NORETRY or __GFP_NOWARN.

Signed-off-by: Christoph Hellwig 
---
 drivers/net/wireless/intel/iwlwifi/fw/dbg.c | 3 +--
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c 
b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c
index 5f52e40a2903..323dc5d5ee88 100644
--- a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c
+++ b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c
@@ -2361,8 +2361,7 @@ iwl_fw_dbg_buffer_allocation(struct iwl_fw_runtime *fwrt, 
u32 size)
 
virtual_addr =
dma_alloc_coherent(fwrt->trans->dev, size, _addr,
-  GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO |
-  __GFP_COMP);
+  GFP_KERNEL | __GFP_NOWARN);
 
/* TODO: alloc fragments if needed */
if (!virtual_addr)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c 
b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index 803fcbac4152..22a47f928dc8 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -210,8 +210,7 @@ static void iwl_pcie_alloc_fw_monitor_block(struct 
iwl_trans *trans,
for (power = max_power; power >= min_power; power--) {
size = BIT(power);
cpu_addr = dma_alloc_coherent(trans->dev, size, ,
- GFP_KERNEL | __GFP_NOWARN |
- __GFP_ZERO | __GFP_COMP);
+ GFP_KERNEL | __GFP_NOWARN);
if (!cpu_addr)
continue;
 
-- 
2.20.1


[Intel-gfx] [PATCH 12/16] staging/comedi: mark as broken

2019-06-14 Thread Christoph Hellwig
comedi_buf.c abuses the DMA API in gravely broken ways, as it assumes it
can call virt_to_page on the result and then just remaps it as uncached
using vmap.  Disable the driver until this API abuse has been fixed.

Signed-off-by: Christoph Hellwig 
---
 drivers/staging/comedi/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/comedi/Kconfig b/drivers/staging/comedi/Kconfig
index 049b659fa6ad..e7c021d76cfa 100644
--- a/drivers/staging/comedi/Kconfig
+++ b/drivers/staging/comedi/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 config COMEDI
tristate "Data acquisition support (comedi)"
+   depends on BROKEN
help
  Enable support for a wide range of data acquisition devices
  for Linux.
-- 
2.20.1


[Intel-gfx] [PATCH 13/16] mm: rename alloc_pages_exact_nid to alloc_pages_exact_node

2019-06-14 Thread Christoph Hellwig
This fits in with the naming scheme used by alloc_pages_node.

Signed-off-by: Christoph Hellwig 
---
 include/linux/gfp.h | 2 +-
 mm/page_alloc.c | 4 ++--
 mm/page_ext.c   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fb07b503dc45..4274ea6bc72b 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -532,7 +532,7 @@ extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
 void free_pages_exact(void *virt, size_t size);
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
+void * __meminit alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask);
 
 #define __get_free_page(gfp_mask) \
__get_free_pages((gfp_mask), 0)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d66bc8abe0af..dd2fed66b656 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4888,7 +4888,7 @@ void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
 EXPORT_SYMBOL(alloc_pages_exact);
 
 /**
- * alloc_pages_exact_nid - allocate an exact number of physically-contiguous
+ * alloc_pages_exact_node - allocate an exact number of physically-contiguous
  *pages on a node.
  * @nid: the preferred node ID where memory should be allocated
  * @size: the number of bytes to allocate
@@ -4899,7 +4899,7 @@ EXPORT_SYMBOL(alloc_pages_exact);
  *
  * Return: pointer to the allocated area or %NULL in case of error.
  */
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
+void * __meminit alloc_pages_exact_node(int nid, size_t size, gfp_t gfp_mask)
 {
unsigned int order = get_order(size);
struct page *p;
diff --git a/mm/page_ext.c b/mm/page_ext.c
index d8f1aca4ad43..bca6bb316714 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -215,7 +215,7 @@ static void *__meminit alloc_page_ext(size_t size, int nid)
gfp_t flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN;
void *addr = NULL;
 
-   addr = alloc_pages_exact_nid(nid, size, flags);
+   addr = alloc_pages_exact_node(nid, size, flags);
if (addr) {
kmemleak_alloc(addr, size, 1, flags);
return addr;
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 03/16] drm/i915: stop using drm_pci_alloc

2019-06-14 Thread Christoph Hellwig
Remove usage of the legacy drm PCI DMA wrappers, and with that the
incorrect usage cocktail of __GFP_COMP, virt_to_page on DMA allocation
and SetPageReserved.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/i915/i915_gem.c| 30 +-
 drivers/gpu/drm/i915/i915_gem_object.h |  3 ++-
 drivers/gpu/drm/i915/intel_display.c   |  2 +-
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ad01c92aaf74..8f2053c91aff 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -228,7 +228,6 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void 
*data,
 static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 {
struct address_space *mapping = obj->base.filp->f_mapping;
-   drm_dma_handle_t *phys;
struct sg_table *st;
struct scatterlist *sg;
char *vaddr;
@@ -242,13 +241,13 @@ static int i915_gem_object_get_pages_phys(struct 
drm_i915_gem_object *obj)
 * to handle all possible callers, and given typical object sizes,
 * the alignment of the buddy allocation will naturally match.
 */
-   phys = drm_pci_alloc(obj->base.dev,
-roundup_pow_of_two(obj->base.size),
-roundup_pow_of_two(obj->base.size));
-   if (!phys)
+   obj->phys_vaddr = dma_alloc_coherent(>base.dev->pdev->dev,
+   roundup_pow_of_two(obj->base.size),
+   >phys_handle, GFP_KERNEL);
+   if (!obj->phys_vaddr)
return -ENOMEM;
 
-   vaddr = phys->vaddr;
+   vaddr = obj->phys_vaddr;
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
struct page *page;
char *src;
@@ -286,18 +285,17 @@ static int i915_gem_object_get_pages_phys(struct 
drm_i915_gem_object *obj)
sg->offset = 0;
sg->length = obj->base.size;
 
-   sg_dma_address(sg) = phys->busaddr;
+   sg_dma_address(sg) = obj->phys_handle;
sg_dma_len(sg) = obj->base.size;
 
-   obj->phys_handle = phys;
-
__i915_gem_object_set_pages(obj, st, sg->length);
 
return 0;
 
 err_phys:
-   drm_pci_free(obj->base.dev, phys);
-
+   dma_free_coherent(>base.dev->pdev->dev,
+   roundup_pow_of_two(obj->base.size), obj->phys_vaddr,
+   obj->phys_handle);
return err;
 }
 
@@ -335,7 +333,7 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object 
*obj,
 
if (obj->mm.dirty) {
struct address_space *mapping = obj->base.filp->f_mapping;
-   char *vaddr = obj->phys_handle->vaddr;
+   char *vaddr = obj->phys_vaddr;
int i;
 
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
@@ -363,7 +361,9 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object 
*obj,
sg_free_table(pages);
kfree(pages);
 
-   drm_pci_free(obj->base.dev, obj->phys_handle);
+   dma_free_coherent(>base.dev->pdev->dev,
+   roundup_pow_of_two(obj->base.size), obj->phys_vaddr,
+   obj->phys_handle);
 }
 
 static void
@@ -603,7 +603,7 @@ i915_gem_phys_pwrite(struct drm_i915_gem_object *obj,
 struct drm_i915_gem_pwrite *args,
 struct drm_file *file)
 {
-   void *vaddr = obj->phys_handle->vaddr + args->offset;
+   void *vaddr = obj->phys_vaddr + args->offset;
char __user *user_data = u64_to_user_ptr(args->data_ptr);
 
/* We manually control the domain here and pretend that it
@@ -1431,7 +1431,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
ret = i915_gem_gtt_pwrite_fast(obj, args);
 
if (ret == -EFAULT || ret == -ENOSPC) {
-   if (obj->phys_handle)
+   if (obj->phys_vaddr)
ret = i915_gem_phys_pwrite(obj, args, file);
else
ret = i915_gem_shmem_pwrite(obj, args);
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h 
b/drivers/gpu/drm/i915/i915_gem_object.h
index ca93a40c0c87..14bd2d61d0f6 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -290,7 +290,8 @@ struct drm_i915_gem_object {
};
 
/** for phys allocated objects */
-   struct drm_dma_handle *phys_handle;
+   dma_addr_t phys_handle;
+   void *phys_vaddr;
 
struct reservation_object __builtin_resv;
 };
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 5098228f1302..4f8b368ac4e2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10066,7 +1006

Re: [Intel-gfx] [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 04:05:33PM +0100, Robin Murphy wrote:
> That said, I don't believe this particular patch should make any 
> appreciable difference - alloc_pages_exact() is still going to give back 
> the same base address as the rounded up over-allocation would, and 
> PAGE_ALIGN()ing the size passed to get_order() already seemed to be 
> pointless.

True, we actually do get the right alignment just about everywhere.
Not 100% sure about the various static pool implementations, but we
can make sure that if any didn't, we'll do the right thing once those
get consolidated.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] use exact allocation for dma coherent memory

2019-06-20 Thread Christoph Hellwig
On Wed, Jun 19, 2019 at 01:29:03PM -0300, Jason Gunthorpe wrote:
> > Yes.  This will blow up badly on many platforms, as sq->queue
> > might be vmapped, ioremapped, come from a pool without page backing.
> 
> Gah, this addr gets fed into io_remap_pfn_range/remap_pfn_range too..
> 
> Potnuri, you should fix this.. 
> 
> You probably need to use dma_mmap_from_dev_coherent() in the mmap ?

The function to use is dma_mmap_coherent; dma_mmap_from_dev_coherent is
just an internal helper.

That being said, the drivers/infiniband code has a lot of
*remap_pfn_range calls, and a lot of them look like they might be for
DMA memory.
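
For illustration, a minimal sketch of what a conversion could look like
for the alloc_host_sq() case quoted earlier in this thread, assuming the
buffer really comes straight from dma_alloc_coherent; the structure
members are taken from that snippet, the mmap helper itself is made up:

#include <linux/dma-mapping.h>
#include <linux/mm.h>

/*
 * Sketch only: let the DMA API map the coherent SQ buffer into user
 * space instead of feeding virt_to_phys()/remap_pfn_range() an address
 * that may not have page backing at all.
 */
static int example_mmap_sq(struct c4iw_rdev *rdev, struct t4_sq *sq,
                           struct vm_area_struct *vma)
{
        size_t len = vma->vm_end - vma->vm_start;

        if (len > sq->memsize)
                return -EINVAL;

        return dma_mmap_coherent(&rdev->lldi.pdev->dev, vma,
                                 sq->queue, sq->dma_addr, len);
}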
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] use exact allocation for dma coherent memory

2019-06-17 Thread Christoph Hellwig
> drivers/infiniband/hw/cxgb4/qp.c
>129  static int alloc_host_sq(struct c4iw_rdev *rdev, struct t4_sq *sq)
>130  {
>131  sq->queue = dma_alloc_coherent(&(rdev->lldi.pdev->dev), 
> sq->memsize,
>132 &(sq->dma_addr), GFP_KERNEL);
>133  if (!sq->queue)
>134  return -ENOMEM;
>135  sq->phys_addr = virt_to_phys(sq->queue);
>136  dma_unmap_addr_set(sq, mapping, sq->dma_addr);
>137  return 0;
>138  }
> 
> Is this a bug?

Yes.  This will blow up badly on many platforms, as sq->queue
might be vmapped, ioremapped, or come from a pool without page backing.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [patch V2 12/29] dma/debug: Simplify stracktrace retrieval

2019-04-19 Thread Christoph Hellwig
Please fix up the > 80 char line.  Otherwise:

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] use exact allocation for dma coherent memory

2019-07-01 Thread Christoph Hellwig
On Fri, Jun 14, 2019 at 03:47:10PM +0200, Christoph Hellwig wrote:
> Switching to a slightly cleaned up alloc_pages_exact is pretty easy,
> but it turns out that because we didn't filter valid gfp_t flags
> on the DMA allocator, a bunch of drivers were passing __GFP_COMP
> to it, which is rather bogus in too many ways to explain.  Arm has
> been filtering it for a while, but this series instead tries to fix
> the drivers and warn when __GFP_COMP is passed, which makes it much
> larger than just adding the functionality.

Dear driver maintainers,

can you look over the patches touching your drivers, please?  I'd
like to get as many of the driver patches as possible into this
merge window, so that they can go in through your maintainer trees.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] use exact allocation for dma coherent memory

2019-07-08 Thread Christoph Hellwig
On Tue, Jul 02, 2019 at 11:48:44AM +0200, Arend Van Spriel wrote:
> You made me look ;-) Actually not touching my drivers so I'm off the hook. 
> However, I was wondering if drivers could know so I decided to look into 
> the DMA-API.txt documentation which currently states:
>
> """
> The flag parameter (dma_alloc_coherent() only) allows the caller to
> specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
> implementation may choose to ignore flags that affect the location of
> the returned memory, like GFP_DMA).
> """
>
> I do expect you are going to change that description as well now that you 
> are going to issue a warning on __GFP_COMP. Maybe include that in patch 
> 15/16 where you introduce that warning.

Yes, that description needs an update, even without this series.
I'll make sure it is more clear.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] Revert "nvme-pci: use host managed power state for suspend"

2019-08-16 Thread Christoph Hellwig
On Fri, Aug 16, 2019 at 01:30:29PM +0100, Chris Wilson wrote:
> Quoting Christoph Hellwig (2019-08-16 13:26:44)
> > Please, report the actual problem.  Blindly reverting a patch without
> > even an explanation of your regressions is not the way to do it.
> 
> As stated, the system doesn't suspend.
> 
> If you would like to wait, you will get test results from our CI
> giving the current failed state and the outcome of the patch.

Platform type, SSD vendor and type, firmware version?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] Revert "nvme-pci: use host managed power state for suspend"

2019-08-16 Thread Christoph Hellwig
Please, report the actual problem.  Blindly reverting a patch without
even an explanation of your regressions is not the way to do it.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCHv2 2/2] i915: do not leak module ref counter

2019-08-19 Thread Christoph Hellwig
On Tue, Aug 20, 2019 at 12:13:59PM +0900, Sergey Senozhatsky wrote:
> Always put_filesystem() in i915_gemfs_init().
> 
> Signed-off-by: Sergey Senozhatsky 
> ---
>  - v2: rebased (i915 does not remount gemfs anymore)

Which means it really doesn't need its mount anymore, and thus can use
plain old shmem_file_setup and doesn't need to mess with file system
types at all.

Assuming we find a legitimate reason why a driver should be able
to create a kernel mount of a file system type where it doesn't have
access to the struct file_system_type, an API that mounts by file system
name and thus hides get_fs_type and put_filesystem would be a much
better API than adding this random export.
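
For illustration, a minimal sketch of the plain shmem_file_setup() path;
shmem_file_setup() and VM_NORESERVE are the existing interface, the
wrapper itself is only an example:

#include <linux/err.h>
#include <linux/mm.h>
#include <linux/shmem_fs.h>

static struct file *example_create_backing_file(const char *name, loff_t size)
{
        /* no private mount, no get_fs_type()/put_filesystem() dance needed */
        return shmem_file_setup(name, size, VM_NORESERVE);
}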
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCHv2 2/3] i915: convert to new mount API

2019-08-07 Thread Christoph Hellwig
On Tue, Aug 06, 2019 at 12:50:10AM -0700, Hugh Dickins wrote:
> Though personally I'm averse to managing "f"objects through
> "m"interfaces, which can get ridiculous (notably, MADV_HUGEPAGE works
> on the virtual address of a mapping, but the huge-or-not alignment of
> that mapping must have been decided previously).  In Google we do use
> fcntls F_HUGEPAGE and F_NOHUGEPAGE to override on a per-file basis -
> one day I'll get to upstreaming those.

Such an interface seems very useful, although the two fcntls seem a bit
odd.

But I think the point here is that the i915 has its own somewhat odd
instance of tmpfs.  If we could pass the equivalent of the huge=*
options to shmem_file_setup all that garbage (including the
shmem_file_setup_with_mnt function) could go away.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct

2019-08-08 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm'

2019-08-08 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release

2019-08-08 Thread Christoph Hellwig
On Tue, Aug 06, 2019 at 08:15:48PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> mmu_notifier_unregister_no_release() and mmu_notifier_call_srcu() no
> longer have any users, they have all been converted to use
> mmu_notifier_put().
> 
> So delete this difficult to use interface.
> 
> Signed-off-by: Jason Gunthorpe 

Looks good:

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm

2019-08-08 Thread Christoph Hellwig
On Tue, Aug 06, 2019 at 08:15:39PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> A prior commit e0f3c3f78da2 ("mm/mmu_notifier: init notifier if necessary")
> made an attempt at doing this, but had to be reverted as calling
> the GFP_KERNEL allocator under the i_mmap_mutex causes deadlock, see
> commit 35cfa2b0b491 ("mm/mmu_notifier: allocate mmu_notifier in advance").
> 
> However, we can avoid that problem by doing the allocation only under
> the mmap_sem, which is already happening.
> 
> Since all writers to mm->mmu_notifier_mm hold the write side of the
> mmap_sem reading it under that sem is deterministic and we can use that to
> decide if the allocation path is required, without speculation.
> 
> The actual update to mmu_notifier_mm must still be done under the
> mm_take_all_locks() to ensure read-side coherency.
> 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller

2019-08-08 Thread Christoph Hellwig
On Tue, Aug 06, 2019 at 08:15:38PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> This simplifies the code to not have so many one line functions and extra
> logic. __mmu_notifier_register() simply becomes the entry point to
> register the notifier, and the other one calls it under lock.
> 
> Also add a lockdep_assert to check that the callers are holding the lock
> as expected.
> 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Jason Gunthorpe 

Looks good:

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH V6 6/6] docs: sample driver to demonstrate how to implement virtio-mdev framework

2019-10-30 Thread Christoph Hellwig
On Wed, Oct 30, 2019 at 02:44:44PM +0800, Jason Wang wrote:
> This sample driver creates mdev device that simulate virtio net device
> over virtio mdev transport. The device is implemented through vringh
> and workqueue. A device specific dma ops is to make sure HVA is used
> directly as the IOVA. This should be sufficient for kernel virtio
> driver to work.
> 
> Only 'virtio' type is supported right now. I plan to add 'vhost' type
> on top which requires some virtual IOMMU implemented in this sample
> driver.

Can we please submit a real driver for it?  A more or less useless
sample driver doesn't really satisfy our normal kernel requirement
that infrastructure should have a real user.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-12 Thread Christoph Hellwig
On Thu, Mar 12, 2020 at 11:14:22AM +0100, Christian König wrote:
> > > The page pointer is set to NULL and only the DMA address,
> > > length and offset values are valid.
> > NAK.  The only valid way to fill DMA address in scatterlists is
> > dma_map_sg / dma_map_sg_attr.
> 
> How can we then map PCIe BARs into an scatterlist which are not backed by
> struct pages?

You can't.  scatterlists by definition map memory backed by a struct
page.  If you want to map something else, struct scatterlist is the
wrong structure and you need to use something else (which you should
anyway, as struct scatterlist is a bad design pattern, and the above
is only one of the many issues with it).
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-12 Thread Christoph Hellwig
On Thu, Mar 12, 2020 at 11:31:35AM +0100, Christian König wrote:
> But how should we then deal with all the existing interfaces which already
> take a scatterlist/sg_table ?
>
> The whole DMA-buf design and a lot of drivers are build around
> scatterlist/sg_table and to me that actually makes quite a lot of sense.
> 

Replace them with a saner interface that doesn't take a scatterlist.
At the very least for new functionality like peer to peer DMA, but
especially this code would also benefit from a general move away
from the scatterlist.

> For TTM I'm also trying for quite a while to just nuke the manual
> dma_address arrays we have and switch over to scatterlist/sg_table.

Which is a move in the wrong direction.

> I mean we could come up with a new structure for this, but to me that just
> looks like reinventing the wheel. Especially since drivers need to be able
> to handle both I/O to system memory and I/O to PCIe BARs.

The structure for holding the struct page side of the scatterlist is
called struct bio_vec, so far mostly used by the block and networking
code.  The structure for holding dma addresses doesn't really exist
in a generic form, but would be an array of these structures:

struct dma_sg {
dma_addr_t  addr;
u32 len;
};

Keeping them separate is important as most IOMMU drivers will return
fewer entries than you can feed them.  E.g. if your input boundaries
are 4k aligned you will usually just get a single IOVA entry back.
I will soon also have a dma mapping interface that will take advantage
of that fact.
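
To make the split concrete, a minimal sketch of a consumer written purely
against such an array; struct dma_sg is the structure above, while
example_hw_desc and example_fill_descs are made-up stand-ins for whatever
descriptor format a real device uses:

#include <linux/types.h>
#include <asm/byteorder.h>

struct dma_sg {
        dma_addr_t      addr;
        u32             len;
};

/* made-up hardware descriptor layout, for illustration only */
struct example_hw_desc {
        __le64 iova;
        __le32 len;
        __le32 flags;
};

/*
 * The device only ever sees IOVAs, so the driver walks the nr_dma
 * entries the mapping produced, which with a capable IOMMU may be far
 * fewer than the number of pages that were fed in.
 */
static void example_fill_descs(struct example_hw_desc *descs,
                               const struct dma_sg *sgs, int nr_dma)
{
        int i;

        for (i = 0; i < nr_dma; i++) {
                descs[i].iova  = cpu_to_le64(sgs[i].addr);
                descs[i].len   = cpu_to_le32(sgs[i].len);
                descs[i].flags = 0;
        }
}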
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-12 Thread Christoph Hellwig
On Wed, Mar 11, 2020 at 02:51:53PM +0100, Christian König wrote:
> This can be used by drivers to setup P2P DMA between device
> memory which is not backed by struct pages.
> 
> The drivers of the involved devices are responsible for
> setting up and tearing down DMA addresses as necessary
> using dma_map_resource().
> 
> The page pointer is set to NULL and only the DMA address,
> length and offset values are valid.

NAK.  The only valid way to fill DMA address in scatterlists is
dma_map_sg / dma_map_sg_attr.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/6] i915/gvt: remove unused xen bits

2020-04-13 Thread Christoph Hellwig
On Wed, Apr 08, 2020 at 09:44:37AM +0800, Zhenyu Wang wrote:
> On 2020.04.04 11:40:58 +0200, Christoph Hellwig wrote:
> > No Xen support anywhere here.  Remove a dead declaration and an unused
> > include.
> > 
> > Signed-off-by: Christoph Hellwig 
> > ---
> 
> We'll keep that off-tree.
> 
> Acked-by: Zhenyu Wang 

Can you pick this up through the i915 tree?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/6] i915/gvt/kvm: a NULL ->mm does not mean a thread is a kthread

2020-04-13 Thread Christoph Hellwig
On Mon, Apr 06, 2020 at 11:08:46PM -0400, Yan Zhao wrote:
> hi
> we were removing this code. see
> https://lore.kernel.org/kvm/20200313031109.7989-1-yan.y.z...@intel.com/

This didn't make 5.7-rc1.

> The implementation of vfio_dma_rw() has been in vfio next tree.
> https://github.com/awilliam/linux-vfio/commit/8d46c0cca5f4dc0538173d62cd36b1119b5105bc


This made 5.7-rc1, so I'll update the series to take it into account.

> in vfio_dma_rw(),  we still use
> bool kthread = current->mm == NULL.
> because if current->mm != NULL and current->flags & PF_KTHREAD, instead
> of calling use_mm(), we first check if (current->mm == mm) and allow 
> copy_to_user() if it's true.
> 
> Do you think it's all right?

I can't think of another way for a kernel thread to have a mm indeed.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 3/3] kernel: set USER_DS in kthread_use_mm

2020-04-15 Thread Christoph Hellwig
Some architectures like arm64 and s390 require USER_DS to be set for
kernel threads to access user address space, which is the whole purpose
of kthread_use_mm, but others like x86 don't.  That has led to a huge
mess where some callers are fixed up once they are tested on said
architectures, while others linger around, and yet others like io_uring
try to do "clever" optimizations for what usually is just a trivial
assignment to a member in the thread_struct for most architectures.

Make kthread_use_mm set USER_DS, and kthread_unuse_mm restore to the
previous value instead.

Signed-off-by: Christoph Hellwig 
Acked-by: Michael S. Tsirkin  [vhost]
---
 drivers/usb/gadget/function/f_fs.c | 4 
 drivers/vhost/vhost.c  | 3 ---
 fs/io-wq.c | 8 ++--
 fs/io_uring.c  | 4 
 kernel/kthread.c   | 6 ++
 5 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index d9e48bd7c692..a1198f4c527c 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -824,13 +824,9 @@ static void ffs_user_copy_worker(struct work_struct *work)
bool kiocb_has_eventfd = io_data->kiocb->ki_flags & IOCB_EVENTFD;
 
if (io_data->read && ret > 0) {
-   mm_segment_t oldfs = get_fs();
-
-   set_fs(USER_DS);
kthread_use_mm(io_data->mm);
ret = ffs_copy_to_iter(io_data->buf, ret, _data->data);
kthread_unuse_mm(io_data->mm);
-   set_fs(oldfs);
}
 
io_data->kiocb->ki_complete(io_data->kiocb, ret, ret);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 17d598e74780..b2abfbdf3cb2 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -329,9 +329,7 @@ static int vhost_worker(void *data)
struct vhost_dev *dev = data;
struct vhost_work *work, *work_next;
struct llist_node *node;
-   mm_segment_t oldfs = get_fs();
 
-   set_fs(USER_DS);
kthread_use_mm(dev->mm);
 
for (;;) {
@@ -361,7 +359,6 @@ static int vhost_worker(void *data)
}
}
kthread_unuse_mm(dev->mm);
-   set_fs(oldfs);
return 0;
 }
 
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 748621f7391e..a5e90ac39e4d 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -169,7 +169,6 @@ static bool __io_worker_unuse(struct io_wqe *wqe, struct 
io_worker *worker)
dropped_lock = true;
}
__set_current_state(TASK_RUNNING);
-   set_fs(KERNEL_DS);
kthread_unuse_mm(worker->mm);
mmput(worker->mm);
worker->mm = NULL;
@@ -421,14 +420,11 @@ static void io_wq_switch_mm(struct io_worker *worker, 
struct io_wq_work *work)
mmput(worker->mm);
worker->mm = NULL;
}
-   if (!work->mm) {
-   set_fs(KERNEL_DS);
+   if (!work->mm)
return;
-   }
+
if (mmget_not_zero(work->mm)) {
kthread_use_mm(work->mm);
-   if (!worker->mm)
-   set_fs(USER_DS);
worker->mm = work->mm;
/* hang on to this mm */
work->mm = NULL;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8a8148512da7..40f90b98a18a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5908,15 +5908,12 @@ static int io_sq_thread(void *data)
struct io_ring_ctx *ctx = data;
struct mm_struct *cur_mm = NULL;
const struct cred *old_cred;
-   mm_segment_t old_fs;
DEFINE_WAIT(wait);
unsigned long timeout;
int ret = 0;
 
complete(>completions[1]);
 
-   old_fs = get_fs();
-   set_fs(USER_DS);
old_cred = override_creds(ctx->creds);
 
timeout = jiffies + ctx->sq_thread_idle;
@@ -6023,7 +6020,6 @@ static int io_sq_thread(void *data)
if (current->task_works)
task_work_run();
 
-   set_fs(old_fs);
if (cur_mm) {
kthread_unuse_mm(cur_mm);
mmput(cur_mm);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 8ed4b4fbec7c..86357cd38eb2 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -52,6 +52,7 @@ struct kthread {
unsigned long flags;
unsigned int cpu;
void *data;
+   mm_segment_t oldfs;
struct completion parked;
struct completion exited;
 #ifdef CONFIG_BLK_CGROUP
@@ -1235,6 +1236,9 @@ void kthread_use_mm(struct mm_struct *mm)
 
if (active_mm != mm)
mmdrop(active_mm);
+
+   to_kthread(tsk)->oldfs = get_fs();
+   set_fs(USER_DS);
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 
@@ -1249,6 +1253,8 @@ void kthread_unuse_mm(struct mm_struct *mm)
  

[Intel-gfx] improve use_mm / unuse_mm v2

2020-04-15 Thread Christoph Hellwig
Hi all,

this series improves the use_mm / unuse_mm interface by better
documenting the assumptions, and by taking the set_fs manipulations
spread over the callers into the core API.

Changes since v1:
 - drop a few patches
 - fix a comment typo
 - cover the newly merged use_mm/unuse_mm caller in vfio
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/3] kernel: better document the use_mm/unuse_mm API contract

2020-04-15 Thread Christoph Hellwig
Switch the function documentation to kerneldoc comments, and add
WARN_ON_ONCE asserts that the calling thread is a kernel thread and
does not have ->mm set (or has ->mm set in the case of unuse_mm).

Also give the functions a kthread_ prefix to better document the
use case.

Signed-off-by: Christoph Hellwig 
Acked-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  4 +--
 drivers/gpu/drm/i915/gvt/kvmgt.c   |  4 +--
 drivers/usb/gadget/function/f_fs.c |  4 +--
 drivers/usb/gadget/legacy/inode.c  |  4 +--
 drivers/vfio/vfio_iommu_type1.c|  4 +--
 drivers/vhost/vhost.c  |  4 +--
 fs/io-wq.c |  6 ++--
 fs/io_uring.c  |  6 ++--
 include/linux/kthread.h|  4 +--
 kernel/kthread.c   | 33 +++---
 mm/oom_kill.c  |  6 ++--
 mm/vmacache.c  |  4 +--
 12 files changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index b820c8fc689f..b063bd7f41d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -192,9 +192,9 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *s
if ((mmptr) == current->mm) {   \
valid = !get_user((dst), (wptr));   \
} else if (current->mm == NULL) {   \
-   use_mm(mmptr);  \
+   kthread_use_mm(mmptr);  \
valid = !get_user((dst), (wptr));   \
-   unuse_mm(mmptr);\
+   kthread_unuse_mm(mmptr);\
}   \
pagefault_enable(); \
}   \
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index ca1dd6e6f395..f2927575b793 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -2048,7 +2048,7 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned 
long gpa,
if (kthread) {
if (!mmget_not_zero(kvm->mm))
return -EFAULT;
-   use_mm(kvm->mm);
+   kthread_use_mm(kvm->mm);
}
 
idx = srcu_read_lock(>srcu);
@@ -2057,7 +2057,7 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned 
long gpa,
srcu_read_unlock(>srcu, idx);
 
if (kthread) {
-   unuse_mm(kvm->mm);
+   kthread_unuse_mm(kvm->mm);
mmput(kvm->mm);
}
 
diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index c57b1b2507c6..d9e48bd7c692 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -827,9 +827,9 @@ static void ffs_user_copy_worker(struct work_struct *work)
mm_segment_t oldfs = get_fs();
 
set_fs(USER_DS);
-   use_mm(io_data->mm);
+   kthread_use_mm(io_data->mm);
ret = ffs_copy_to_iter(io_data->buf, ret, _data->data);
-   unuse_mm(io_data->mm);
+   kthread_unuse_mm(io_data->mm);
set_fs(oldfs);
}
 
diff --git a/drivers/usb/gadget/legacy/inode.c 
b/drivers/usb/gadget/legacy/inode.c
index 8b5233888bf8..a05552bc2ff8 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -462,9 +462,9 @@ static void ep_user_copy_worker(struct work_struct *work)
struct kiocb *iocb = priv->iocb;
size_t ret;
 
-   use_mm(mm);
+   kthread_use_mm(mm);
ret = copy_to_iter(priv->buf, priv->actual, >to);
-   unuse_mm(mm);
+   kthread_unuse_mm(mm);
if (!ret)
ret = -EFAULT;
 
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5f50866a8b01..2eb105aa9723 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2333,7 +2333,7 @@ static int vfio_iommu_type1_dma_rw_chunk(struct 
vfio_iommu *iommu,
return -EPERM;
 
if (kthread)
-   use_mm(mm);
+   kthread_use_mm(mm);
else if (current->mm != mm)
goto out;
 
@@ -2351,7 +2351,7 @@ static int vfio_iommu_type1_dma_rw_chunk(struct 
vfio_iommu *iommu,
*copied = __copy_from_user(data, (void __user *)vaddr,
   count) ? 0 : count;
if (kthread)
-  

[Intel-gfx] [PATCH 1/3] kernel: move use_mm/unuse_mm to kthread.c

2020-04-15 Thread Christoph Hellwig
These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward.  Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.

Signed-off-by: Christoph Hellwig 
Acked-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  1 -
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 -
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  2 +-
 drivers/usb/gadget/function/f_fs.c|  2 +-
 drivers/usb/gadget/legacy/inode.c |  2 +-
 drivers/vfio/vfio_iommu_type1.c   |  2 +-
 drivers/vhost/vhost.c |  1 -
 fs/aio.c  |  1 -
 fs/io-wq.c|  1 -
 fs/io_uring.c |  1 -
 include/linux/kthread.h   |  5 ++
 include/linux/mmu_context.h   |  5 --
 kernel/kthread.c  | 56 
 mm/Makefile   |  2 +-
 mm/mmu_context.c  | 64 ---
 19 files changed, 67 insertions(+), 86 deletions(-)
 delete mode 100644 mm/mmu_context.c

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 13feb313e9b3..b820c8fc689f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -27,6 +27,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index 6529caca88fe..35d4a5ab0228 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 4ec6d0c03201..b1655054b919 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
-#include 
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gc/gc_10_1_0_offset.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 0b7e78748540..7d01420c0c85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -20,8 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "cikd.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index ccd635b812b5..635cd1a26bed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -20,8 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gfx_v8_0.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index df841c2ac5e7..c7fd0c47b254 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -19,8 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gc/gc_9_0_offset.h"
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 074c4efb58eb..ca1dd6e6f395 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -31,7 +31,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index c81023b195c3..c57b1b2507c6 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -32,7 +32,7 @@
 #include 
 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/drivers/usb/gadget/legacy/inode.c 
b/drivers/usb/gadget/legacy/inode.c
index aa0de9e35afa..8b5233888bf8 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -21,7 +21,7 @@
 #inclu

Re: [Intel-gfx] [PATCH 2/6] i915/gvt/kvm: a NULL ->mm does not mean a thread is a kthread

2020-04-14 Thread Christoph Hellwig
On Mon, Apr 13, 2020 at 08:04:10PM -0400, Yan Zhao wrote:
> > I can't think of another way for a kernel thread to have a mm indeed.
> for example, before calling to vfio_dma_rw(), a kernel thread has already
> called use_mm(), then its current->mm is not null, and it has flag
> PF_KTHREAD.
> in this case, we just want to allow the copy_to_user() directly if
> current->mm == mm, rather than call another use_mm() again.
> 
> do you think it makes sense?

I mean no other way than using use_mm.  That being said, nesting
potential use_mm callers sounds like a rather bad idea, and we
should avoid that.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] improve use_mm / unuse_mm v2

2020-04-17 Thread Christoph Hellwig
On Thu, Apr 16, 2020 at 08:17:44PM -0700, Matthew Wilcox wrote:
> On Thu, Apr 16, 2020 at 07:31:55AM +0200, Christoph Hellwig wrote:
> > this series improves the use_mm / unuse_mm interface by better
> > documenting the assumptions, and my taking the set_fs manipulations
> > spread over the callers into the core API.
> 
> I appreciate all the work you're doing here.
> 
> Do you have plans to introduce a better-named API than set_fs() / get_fs()?

Eventually.  For now I just plan to kill as many as possible.

> Also, having set_fs() return the previous value of 'fs' would simplify
> a lot of the callers.

One thing that should go relatively soon is the need to store the
previous value because we'll have so few callers left that we know we can't
recurse. We should be able to get there around 5.9 / 5.10.
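
For context, roughly what that difference looks like for a caller; the
first form is today's pattern, the second assumes the suggested (and so
far hypothetical) set_fs() that returns the previous segment:

#include <linux/errno.h>
#include <linux/uaccess.h>

/* today: the caller has to fetch and stash the old segment itself */
static int example_today(void __user *dst, const void *src, size_t len)
{
        mm_segment_t old_fs = get_fs();
        int ret;

        set_fs(USER_DS);
        ret = copy_to_user(dst, src, len) ? -EFAULT : 0;
        set_fs(old_fs);
        return ret;
}

/* if set_fs() returned the previous value (hypothetical signature) */
static int example_suggested(void __user *dst, const void *src, size_t len)
{
        mm_segment_t old_fs = set_fs(USER_DS);
        int ret;

        ret = copy_to_user(dst, src, len) ? -EFAULT : 0;
        set_fs(old_fs);
        return ret;
}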
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-16 Thread Christoph Hellwig
On Mon, Mar 16, 2020 at 10:41:42AM +0100, Christian König wrote:
> Well I would prefer if the drivers can somehow express their requirements
> and get IOVA structures already in the form they need.
> 
> Converting the IOVA data from one form to another is sometimes quite costly.
> Especially when it is only temporarily needed.

We basically have two ways to generate the IOVA:

  - a linear translation for the direct mapping case or some dumb IOMMU
drivers - in that case case there is a 1:1 mapping between input
segments and output segments in DMA mapping
  - a non-trivial IOMMU where all aligned segments are merged into
a single IOVA range

So I don't really see how the dma layer could help much with any
limitation beyond existing max size and dma boundary ones.
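
A toy model of the resulting entry counts, purely to illustrate those two
cases; it deliberately ignores the max segment size and dma boundary
limits mentioned above and does not correspond to any existing helper:

#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: how many DMA address ranges a mapping of nr_segs input
 * segments produces.  The direct-mapping case is always 1:1; a capable
 * IOMMU that gets fully page-aligned segments can merge everything
 * into a single IOVA range.
 */
static int toy_nr_dma_ents(const size_t *seg_len, int nr_segs, bool have_iommu)
{
        int i;

        if (!have_iommu)
                return nr_segs;         /* linear translation, 1:1 */

        for (i = 0; i < nr_segs; i++)
                if (seg_len[i] % 4096)
                        return nr_segs; /* unaligned: no merging in this toy */

        return 1;                       /* one contiguous IOVA range */
}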
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-16 Thread Christoph Hellwig
On Thu, Mar 12, 2020 at 11:19:28AM -0300, Jason Gunthorpe wrote:
> The non-page scatterlist is also a big concern for RDMA as we have
> drivers that want the page list, so even if we did as this series
> contemplates I'd have still have to split the drivers and create the
> notion of a dma-only SGL.

The drivers I looked at want a list of IOVA addresses, aligned to the
device "page size".  What other data do drivers want?  Except for the
software protocol stack drivers, which of course need pages for the
stack further down.

> I haven't used bio_vecs before, do they support chaining like SGL so
> they can be very big? RDMA dma maps gigabytes of memory

bio_vecs themselves don't have the chaining, but the bios built around them
do.  But each entry can map a huge pile.  If needed we could use the
same chaining scheme we use for scatterlists for bio_vecs as well, but
let's see if we really end up needing that.

> So I'm guessing the path forward is something like
> 
>  - Add some generic dma_sg data structure and helper
>  - Add dma mapping code to go from pages to dma_sg

That has been on my todo list for a while.  All the DMA consolidation
is to prepare for that and we're finally getting close.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] lib/scatterlist: add sg_set_dma_addr() function

2020-03-16 Thread Christoph Hellwig
On Fri, Mar 13, 2020 at 09:17:42AM -0300, Jason Gunthorpe wrote:
> On Fri, Mar 13, 2020 at 04:21:39AM -0700, Christoph Hellwig wrote:
> > On Thu, Mar 12, 2020 at 11:19:28AM -0300, Jason Gunthorpe wrote:
> > > The non-page scatterlist is also a big concern for RDMA as we have
> > > drivers that want the page list, so even if we did as this series
> > > contemplates I'd have still have to split the drivers and create the
> > > notion of a dma-only SGL.
> > 
> > The drivers I looked at want a list of IOVA address, aligned to the
> > device "page size".  What other data do drivers want?  Execept for the
> > software protocol stack drivers, which of couse need pages for the
> > stack futher down.
> 
> In principle it is possible to have just an aligned page list -
> however the page size is variable, following certain rules, and today
> the drivers still determine the correct page size largely on their
> own.  
> 
> Some progress was made recently to consolidate this, but more is
> needed.
> 
> If the common code doesn't know the device page size in advance then
> today's approach of sending largest possible dma mapped SGLs into the
> device driver is best. The driver only has to do splitting.

The point was that drivers don't need pages, drivers need IOVAs.  In
what form they are stuffed into the hardware is the driver's problem.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 4/6] kernel: move use_mm/unuse_mm to kthread.c

2020-04-04 Thread Christoph Hellwig
These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward.  Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  1 -
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 -
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  2 +-
 drivers/usb/gadget/function/f_fs.c|  2 +-
 drivers/usb/gadget/legacy/inode.c |  2 +-
 drivers/vhost/vhost.c |  1 -
 fs/aio.c  |  1 -
 fs/io-wq.c|  1 -
 fs/io_uring.c |  1 -
 include/linux/kthread.h   |  5 ++
 include/linux/mmu_context.h   |  5 --
 kernel/kthread.c  | 56 
 mm/Makefile   |  2 +-
 mm/mmu_context.c  | 64 ---
 18 files changed, 66 insertions(+), 85 deletions(-)
 delete mode 100644 mm/mmu_context.c

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 4db143c19dcc..bce5e93fefc8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -27,6 +27,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index 6529caca88fe..35d4a5ab0228 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 4ec6d0c03201..b1655054b919 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
-#include 
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gc/gc_10_1_0_offset.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 0b7e78748540..7d01420c0c85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -20,8 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "cikd.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index ccd635b812b5..635cd1a26bed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -20,8 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gfx_v8_0.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index df841c2ac5e7..c7fd0c47b254 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -19,8 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
-#include 
-
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "gc/gc_9_0_offset.h"
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 5848400620b4..dee01c371bf5 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -31,7 +31,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index c81023b195c3..c57b1b2507c6 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -32,7 +32,7 @@
 #include 
 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/drivers/usb/gadget/legacy/inode.c 
b/drivers/usb/gadget/legacy/inode.c
index aa0de9e35afa..8b5233888bf8 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -21,7 +21,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 

[Intel-gfx] [PATCH 1/6] amdgpu: a NULL ->mm does not mean a thread is a kthread

2020-04-04 Thread Christoph Hellwig
Use the proper API instead.

Fixes: 70539bd795002 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 13feb313e9b3..4db143c19dcc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -190,7 +190,7 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *s
pagefault_disable();\
if ((mmptr) == current->mm) {   \
valid = !get_user((dst), (wptr));   \
-   } else if (current->mm == NULL) {   \
+   } else if (current->flags & PF_KTHREAD) {   \
use_mm(mmptr);  \
valid = !get_user((dst), (wptr));   \
unuse_mm(mmptr);\
-- 
2.25.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 5/6] kernel: better document the use_mm/unuse_mm API contract

2020-04-04 Thread Christoph Hellwig
Switch the function documentation to kerneldoc comments, and add
WARN_ON_ONCE asserts that the calling thread is a kernel thread and
does not have ->mm set (or has ->mm set in the case of unuse_mm).

Also give the functions a kthread_ prefix to better document the
use case.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  4 +--
 drivers/gpu/drm/i915/gvt/kvmgt.c   |  4 +--
 drivers/usb/gadget/function/f_fs.c |  4 +--
 drivers/usb/gadget/legacy/inode.c  |  4 +--
 drivers/vhost/vhost.c  |  4 +--
 fs/io-wq.c |  6 ++--
 fs/io_uring.c  |  6 ++--
 include/linux/kthread.h|  4 +--
 kernel/kthread.c   | 33 +++---
 mm/oom_kill.c  |  6 ++--
 mm/vmacache.c  |  4 +--
 11 files changed, 39 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index bce5e93fefc8..63db84e09408 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -192,9 +192,9 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *s
if ((mmptr) == current->mm) {   \
valid = !get_user((dst), (wptr));   \
} else if (current->flags & PF_KTHREAD) {   \
-   use_mm(mmptr);  \
+   kthread_use_mm(mmptr);  \
valid = !get_user((dst), (wptr));   \
-   unuse_mm(mmptr);\
+   kthread_unuse_mm(mmptr);\
}   \
pagefault_enable(); \
}   \
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index dee01c371bf5..92e9b340dbc2 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -2048,7 +2048,7 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned 
long gpa,
if (kthread) {
if (!mmget_not_zero(kvm->mm))
return -EFAULT;
-   use_mm(kvm->mm);
+   kthread_use_mm(kvm->mm);
}
 
idx = srcu_read_lock(>srcu);
@@ -2057,7 +2057,7 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned 
long gpa,
srcu_read_unlock(>srcu, idx);
 
if (kthread) {
-   unuse_mm(kvm->mm);
+   kthread_unuse_mm(kvm->mm);
mmput(kvm->mm);
}
 
diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index c57b1b2507c6..d9e48bd7c692 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -827,9 +827,9 @@ static void ffs_user_copy_worker(struct work_struct *work)
mm_segment_t oldfs = get_fs();
 
set_fs(USER_DS);
-   use_mm(io_data->mm);
+   kthread_use_mm(io_data->mm);
ret = ffs_copy_to_iter(io_data->buf, ret, _data->data);
-   unuse_mm(io_data->mm);
+   kthread_unuse_mm(io_data->mm);
set_fs(oldfs);
}
 
diff --git a/drivers/usb/gadget/legacy/inode.c 
b/drivers/usb/gadget/legacy/inode.c
index 8b5233888bf8..a05552bc2ff8 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -462,9 +462,9 @@ static void ep_user_copy_worker(struct work_struct *work)
struct kiocb *iocb = priv->iocb;
size_t ret;
 
-   use_mm(mm);
+   kthread_use_mm(mm);
ret = copy_to_iter(priv->buf, priv->actual, >to);
-   unuse_mm(mm);
+   kthread_unuse_mm(mm);
if (!ret)
ret = -EFAULT;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 4e9ce54869af..1787d426a956 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -336,7 +336,7 @@ static int vhost_worker(void *data)
mm_segment_t oldfs = get_fs();
 
set_fs(USER_DS);
-   use_mm(dev->mm);
+   kthread_use_mm(dev->mm);
 
for (;;) {
/* mb paired w/ kthread_stop */
@@ -364,7 +364,7 @@ static int vhost_worker(void *data)
schedule();
}
}
-   unuse_mm(dev->mm);
+   kthread_unuse_mm(dev->mm);
set_fs(oldfs);
return 0;
 }
diff --git a/fs/io-wq.c b/fs/io-wq.c
index c49c2bdbafb5..83c2868eff2a 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -169,7 +16

[Intel-gfx] [PATCH 3/6] i915/gvt: remove unused xen bits

2020-04-04 Thread Christoph Hellwig
No Xen support anywhere here.  Remove a dead declaration and an unused
include.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/i915/gvt/gvt.c   | 1 -
 drivers/gpu/drm/i915/gvt/hypercall.h | 2 --
 2 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
index 9e1787867894..c7c561237883 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.c
+++ b/drivers/gpu/drm/i915/gvt/gvt.c
@@ -31,7 +31,6 @@
  */
 
 #include 
-#include 
 #include 
 
 #include "i915_drv.h"
diff --git a/drivers/gpu/drm/i915/gvt/hypercall.h 
b/drivers/gpu/drm/i915/gvt/hypercall.h
index b17c4a1599cd..b79da5124f83 100644
--- a/drivers/gpu/drm/i915/gvt/hypercall.h
+++ b/drivers/gpu/drm/i915/gvt/hypercall.h
@@ -79,6 +79,4 @@ struct intel_gvt_mpt {
bool (*is_valid_gfn)(unsigned long handle, unsigned long gfn);
 };
 
-extern struct intel_gvt_mpt xengt_mpt;
-
 #endif /* _GVT_HYPERCALL_H_ */
-- 
2.25.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/6] i915/gvt/kvm: a NULL ->mm does not mean a thread is a kthread

2020-04-04 Thread Christoph Hellwig
Use the proper API instead.

Fixes: f440c8a572d7 ("drm/i915/gvt/kvmgt: read/write GPA via KVM API")
Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 074c4efb58eb..5848400620b4 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -2037,7 +2037,7 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned 
long gpa,
struct kvmgt_guest_info *info;
struct kvm *kvm;
int idx, ret;
-   bool kthread = current->mm == NULL;
+   bool kthread = (current->flags & PF_KTHREAD);
 
if (!handle_valid(handle))
return -ESRCH;
-- 
2.25.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] improve use_mm / unuse_mm

2020-04-04 Thread Christoph Hellwig
Hi all,

this series improves the use_mm / unuse_mm interface by better
documenting the assumptions, and by taking the set_fs manipulations
spread over the callers into the core API.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 6/6] kernel: set USER_DS in kthread_use_mm

2020-04-04 Thread Christoph Hellwig
Some architectures like arm64 and s390 require USER_DS to be set for
kernel threads to access user address space, which is the whole purpose
of kthread_use_mm, but others like x86 don't.  That has led to a huge
mess where some callers are fixed up once they are tested on said
architectures, while others linger around, and yet others like io_uring
try to do "clever" optimizations for what usually is just a trivial
assignment to a member in the thread_struct for most architectures.

Make kthread_use_mm set USER_DS, and kthread_unuse_mm restore to the
previous value instead.

Signed-off-by: Christoph Hellwig 
---
 drivers/usb/gadget/function/f_fs.c | 4 
 drivers/vhost/vhost.c  | 3 ---
 fs/io-wq.c | 8 ++--
 fs/io_uring.c  | 4 
 kernel/kthread.c   | 6 ++
 5 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index d9e48bd7c692..a1198f4c527c 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -824,13 +824,9 @@ static void ffs_user_copy_worker(struct work_struct *work)
bool kiocb_has_eventfd = io_data->kiocb->ki_flags & IOCB_EVENTFD;
 
if (io_data->read && ret > 0) {
-   mm_segment_t oldfs = get_fs();
-
-   set_fs(USER_DS);
kthread_use_mm(io_data->mm);
ret = ffs_copy_to_iter(io_data->buf, ret, _data->data);
kthread_unuse_mm(io_data->mm);
-   set_fs(oldfs);
}
 
io_data->kiocb->ki_complete(io_data->kiocb, ret, ret);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 1787d426a956..b5229ae01d3b 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -333,9 +333,7 @@ static int vhost_worker(void *data)
struct vhost_dev *dev = data;
struct vhost_work *work, *work_next;
struct llist_node *node;
-   mm_segment_t oldfs = get_fs();
 
-   set_fs(USER_DS);
kthread_use_mm(dev->mm);
 
for (;;) {
@@ -365,7 +363,6 @@ static int vhost_worker(void *data)
}
}
kthread_unuse_mm(dev->mm);
-   set_fs(oldfs);
return 0;
 }
 
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 83c2868eff2a..75cc2f31816d 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -168,7 +168,6 @@ static bool __io_worker_unuse(struct io_wqe *wqe, struct 
io_worker *worker)
dropped_lock = true;
}
__set_current_state(TASK_RUNNING);
-   set_fs(KERNEL_DS);
kthread_unuse_mm(worker->mm);
mmput(worker->mm);
worker->mm = NULL;
@@ -420,14 +419,11 @@ static void io_wq_switch_mm(struct io_worker *worker, 
struct io_wq_work *work)
mmput(worker->mm);
worker->mm = NULL;
}
-   if (!work->mm) {
-   set_fs(KERNEL_DS);
+   if (!work->mm)
return;
-   }
+
if (mmget_not_zero(work->mm)) {
kthread_use_mm(work->mm);
-   if (!worker->mm)
-   set_fs(USER_DS);
worker->mm = work->mm;
/* hang on to this mm */
work->mm = NULL;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 367406381044..c332a34e8b34 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5871,15 +5871,12 @@ static int io_sq_thread(void *data)
struct io_ring_ctx *ctx = data;
struct mm_struct *cur_mm = NULL;
const struct cred *old_cred;
-   mm_segment_t old_fs;
DEFINE_WAIT(wait);
unsigned long timeout;
int ret = 0;
 
complete(&ctx->completions[1]);
 
-   old_fs = get_fs();
-   set_fs(USER_DS);
old_cred = override_creds(ctx->creds);
 
timeout = jiffies + ctx->sq_thread_idle;
@@ -5985,7 +5982,6 @@ static int io_sq_thread(void *data)
if (current->task_works)
task_work_run();
 
-   set_fs(old_fs);
if (cur_mm) {
kthread_unuse_mm(cur_mm);
mmput(cur_mm);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 316db17f6b4f..9e27d01b6d78 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -52,6 +52,7 @@ struct kthread {
unsigned long flags;
unsigned int cpu;
void *data;
+   mm_segment_t oldfs;
struct completion parked;
struct completion exited;
 #ifdef CONFIG_BLK_CGROUP
@@ -1235,6 +1236,9 @@ void kthread_use_mm(struct mm_struct *mm)
 
if (active_mm != mm)
mmdrop(active_mm);
+
+   to_kthread(tsk)->oldfs = get_fs();
+   set_fs(USER_DS);
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 
@@ -1249,6 +1253,8 @@ void kthread_unuse_mm(struct mm_struct *mm)
WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));

Re: [Intel-gfx] [RFC 00/17] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-04-28 Thread Christoph Hellwig
On Tue, Apr 28, 2020 at 03:19:48PM +0200, Marek Szyprowski wrote:
> 1. introduce a dma_{map,sync,unmap}_sgtable() wrappers, which will use
>    a proper sg_table entries and call respective DMA-mapping functions
>    and adapt current code to it

That sounds reasonable to me.  Those could be pretty trivial wrappers.

>
> 
> 2. rename nents and orig_nents to nr_pages, nr_dmas to clearly state
>    which one refers to which part of the scatterlist; I'm open for
>    other names for those entries

nr_cpu_ents and nr_dma_ents might be better names, but it still would be
a whole lot of churn for little gain.  I think just good wrappers like
suggested above might be more helpful.
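For illustration, such a wrapper could be as simple as the sketch below;
the dma_map_sgtable() name and the error convention are assumptions here,
not a settled API:

static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
                                  enum dma_data_direction dir,
                                  unsigned long attrs)
{
        int nents;

        /* Hand the full CPU-side list to the DMA API ... */
        nents = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
        if (nents <= 0)
                return -EINVAL;

        /* ... and record the possibly smaller DMA-side entry count. */
        sgt->nents = nents;
        return 0;
}

That keeps the nents vs. orig_nents bookkeeping in one place instead of
spread over every driver.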
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm 2/5] mm/hmm: make hmm_range_fault return 0 or -1

2020-04-21 Thread Christoph Hellwig
On Tue, Apr 21, 2020 at 09:21:43PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> hmm_vma_walk->last is supposed to be updated after every write to the
> pfns, so that it can be returned by hmm_range_fault(). However, this is
> not done consistently. Fortunately nothing checks the return code of
> hmm_range_fault() for anything other than error.
> 
> More importantly last must be set before returning -EBUSY as it is used to
> prevent reading an output pfn as an input flags when the loop restarts.
> 
> For clarity and simplicity make hmm_range_fault() return 0 or -ERRNO. Only
> set last when returning -EBUSY.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  Documentation/vm/hmm.rst|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  4 ++--
>  drivers/gpu/drm/nouveau/nouveau_svm.c   |  6 +++---
>  include/linux/hmm.h |  2 +-
>  mm/hmm.c| 25 +
>  5 files changed, 16 insertions(+), 23 deletions(-)
> 
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index 4e3e9362afeb10..9924f2caa0184c 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -161,7 +161,7 @@ device must complete the update before the driver callback returns.
>  When the device driver wants to populate a range of virtual addresses, it can
>  use::
>  
> -  long hmm_range_fault(struct hmm_range *range);
> +  int hmm_range_fault(struct hmm_range *range);
>  
>  It will trigger a page fault on missing or read-only entries if write access is
>  requested (see below). Page faults use the generic mm page fault code path just
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 6309ff72bd7876..efc1329a019127 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -852,12 +852,12 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>   down_read(&mm->mmap_sem);
>   r = hmm_range_fault(range);
>   up_read(&mm->mmap_sem);
> - if (unlikely(r <= 0)) {
> + if (unlikely(r)) {
>   /*
>* FIXME: This timeout should encompass the retry from
>* mmu_interval_read_retry() as well.
>*/
> - if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
> + if ((r == -EBUSY) && !time_after(jiffies, timeout))

Please also kill the superfluous inner braces here.
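That is, the check would then simply read:

        if (r == -EBUSY && !time_after(jiffies, timeout))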

> + * Return: 0 or -ERRNO with one of the following status codes:

Maybe say something like:

* Returns 0 on success or one of the following error codes:

Otherwise this looks good:

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm 4/5] mm/hmm: remove HMM_PFN_SPECIAL

2020-04-21 Thread Christoph Hellwig
On Tue, Apr 21, 2020 at 09:21:45PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> This is just an alias for HMM_PFN_ERROR, nothing cares that the error was
> because of a special page vs any other error case.

Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm 5/5] mm/hmm: remove the customizable pfn format from hmm_range_fault

2020-04-23 Thread Christoph Hellwig
On Wed, Apr 22, 2020 at 09:39:11AM -0300, Jason Gunthorpe wrote:
> > Nice callchain from hell..  Unfortunately such "code listings" tend to
> > get out of date very quickly, so I'm not sure it is worth keeping in
> > the code.  What would be really worthwhile is consolidating the two
> > different sets of defines (NVIF_VMM_PFNMAP_V0_ vs NVKM_VMM_PFN_)
> > to make the code a little easier to follow.
> 
> I was mainly concerned that this function is using hmm properly,
> because it sure looks like it is just forming the CPU physical address
> into a HW specific data. But it turns out it is just an internal data
> for some other code and the dma_map is impossibly far away
> 
> It took forever to find, I figured I'd leave a hint for the next poor
> soul that has to look at this.. 
> 
> Also, I think it shows there is no 'performance' argument here, if
> this path needs more performance the above should be cleaned
> before we abuse hmm_range_fault.
> 
> Put it in the commit message instead?

Yes, the graph itself sounds reasonable for the commit log as
point-in-time information.

> > >   npages = (range->end - range->start) >> PAGE_SHIFT;
> > >   for (i = 0; i < npages; ++i) {
> > >   struct page *page;
> > >  
> > > + if (!(range->hmm_pfns[i] & HMM_PFN_VALID)) {
> > > + ioctl_addr[i] = 0;
> > >   continue;
> > > + }
> > 
> > Can't we rely on the caller pre-zeroing the array?
> 
> This ends up as args.phys in nouveau_svm_fault - I didn't see a
> zeroing?
> 
> I think it makes sense that this routine fully sets the output array
> and does not assume pre-initialize

Ok.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm 5/5] mm/hmm: remove the customizable pfn format from hmm_range_fault

2020-04-22 Thread Christoph Hellwig



On Tue, Apr 21, 2020 at 09:21:46PM -0300, Jason Gunthorpe wrote:
> +void nouveau_hmm_convert_pfn(struct nouveau_drm *drm, struct hmm_range *range,
> +  u64 *ioctl_addr)
>  {
>   unsigned long i, npages;
>  
> + /*
> +  * The ioctl_addr prepared here is passed through nvif_object_ioctl()
> +  * to an eventual DMA map on some call chain like:
> +  *nouveau_svm_fault():
> +  *  args.i.m.method = NVIF_VMM_V0_PFNMAP
> +  *  nouveau_range_fault()
> +  *   nvif_object_ioctl()
> +  *client->driver->ioctl()
> +  *   struct nvif_driver nvif_driver_nvkm:
> +  * .ioctl = nvkm_client_ioctl
> +  *nvkm_ioctl()
> +  * nvkm_ioctl_path()
> +  *   nvkm_ioctl_v0[type].func(..)
> +  *   nvkm_ioctl_mthd()
> +  *nvkm_object_mthd()
> +  *   struct nvkm_object_func nvkm_uvmm:
> +  * .mthd = nvkm_uvmm_mthd
> +  *nvkm_uvmm_mthd()
> +  * nvkm_uvmm_mthd_pfnmap()
> +  *  nvkm_vmm_pfn_map()
> +  *   nvkm_vmm_ptes_get_map()
> +  *func == gp100_vmm_pgt_pfn
> +  * struct nvkm_vmm_desc_func gp100_vmm_desc_spt:
> +  *   .pfn = gp100_vmm_pgt_pfn
> +  *  nvkm_vmm_iter()
> +  *   REF_PTES == func == gp100_vmm_pgt_pfn()
> +  *dma_map_page()
> +  *
> +  * This is all just encoding the internal hmm reprensetation into a
> +  * different nouveau internal representation.
> +  */

Nice callchain from hell..  Unfortunately such "code listings" tend to
get out of date very quickly, so I'm not sure it is worth keeping in
the code.  What would be really worthwhile is consolidating the two
different sets of defines (NVIF_VMM_PFNMAP_V0_ vs NVKM_VMM_PFN_)
to make the code a little easier to follow.

>   npages = (range->end - range->start) >> PAGE_SHIFT;
>   for (i = 0; i < npages; ++i) {
>   struct page *page;
>  
> + if (!(range->hmm_pfns[i] & HMM_PFN_VALID)) {
> + ioctl_addr[i] = 0;
>   continue;
> + }

Can't we rely on the caller pre-zeroing the array?

> + page = hmm_pfn_to_page(range->hmm_pfns[i]);
> + if (is_device_private_page(page))
> + ioctl_addr[i] = nouveau_dmem_page_addr(page) |
> + NVIF_VMM_PFNMAP_V0_V |
> + NVIF_VMM_PFNMAP_V0_VRAM;
> + else
> + ioctl_addr[i] = page_to_phys(page) |
> + NVIF_VMM_PFNMAP_V0_V |
> + NVIF_VMM_PFNMAP_V0_HOST;
> + if (range->hmm_pfns[i] & HMM_PFN_WRITE)
> + ioctl_addr[i] |= NVIF_VMM_PFNMAP_V0_W;

Now that this routine isn't really device memory specific any more, I
wonder if it should move to nouveau_svm.c.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm 1/5] mm/hmm: make CONFIG_DEVICE_PRIVATE into a select

2020-04-21 Thread Christoph Hellwig
On Tue, Apr 21, 2020 at 09:21:42PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> There is no reason for a user to select this or not directly - it should
> be selected by drivers that are going to use the feature, similar to how
> CONFIG_HMM_MIRROR works.
> 
> Currently all drivers provide a feature kconfig that will disable use of
> DEVICE_PRIVATE in that driver, allowing users to avoid enabling this if
> they don't want the overhead.
> 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 04:10:16PM -0600, Jason A. Donenfeld wrote:
> Sometimes it's not okay to use SIMD registers, the conditions for which
> have changed subtly from kernel release to kernel release. Usually the
> pattern is to check for may_use_simd() and then fallback to using
> something slower in the unlikely case SIMD registers aren't available.
> So, this patch fixes up i915's accelerated memcpy routines to fallback
> to boring memcpy if may_use_simd() is false.

Err, why does i915 implement its own uncached memcpy instead of relying
on core functionality to start with?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm v2 5/5] mm/hmm: remove the customizable pfn format from hmm_range_fault

2020-05-04 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD

2020-05-04 Thread Christoph Hellwig
On Sun, May 03, 2020 at 09:20:19PM +0100, Chris Wilson wrote:
> > Err, why does i915 implements its own uncached memcpy instead of relying
> > on core functionality to start with?
> 
> What is this core functionality that provides movntqda?

A sensible name might be memcpy_uncached or memcpy_nontemporal.
But the important point is that this should be arch code with a common
fallback rather than hacking it up in drivers.
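As a rough sketch of that shape (the memcpy_nontemporal() name is purely
hypothetical, no such helper exists in the tree), the generic side would be
little more than:

#ifndef ARCH_HAS_MEMCPY_NONTEMPORAL
/* Generic fallback: a plain cached copy is always correct, just slower. */
static inline void memcpy_nontemporal(void *dst, const void *src, size_t len)
{
        memcpy(dst, src, len);
}
#endif

with x86 then free to provide a MOVNTDQA-based override guarded by
may_use_simd().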
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-09 Thread Christoph Hellwig
On Wed, Sep 09, 2020 at 09:43:09AM +0800, Lu Baolu wrote:
> +   /*
> +* The Intel graphic device driver is used to assume that the
> returned
> +* sg list is not combound. This blocks the efforts of converting
> the

This adds pointless overly long lines.

> +* Intel IOMMU driver to dma-iommu api's. Add this quirk to make the
> +* device driver work and should be removed once it's fixed in i915
> +* driver.
> +*/
> +   if (dev_is_pci(dev) &&
> +   to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL &&
> +   (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
> +   for_each_sg(sg, s, nents, i) {
> +   unsigned int s_iova_off = sg_dma_address(s);
> +   unsigned int s_length = sg_dma_len(s);
> +   unsigned int s_iova_len = s->length;
> +
> +   s->offset += s_iova_off;
> +   s->length = s_length;
> +   sg_dma_address(s) = dma_addr + s_iova_off;
> +   sg_dma_len(s) = s_length;
> +   dma_addr += s_iova_len;
> +   }
> +
> +   return nents;
> +   }

This wants an IS_ENABLED() check.  And probably a pr_once reminding
of the workaround.
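Roughly along these lines (taking pr_warn_once() as the reminder and the
i915 config symbol as the guard; both are assumptions about what the final
patch should do, not code from this series):

        if (IS_ENABLED(CONFIG_DRM_I915) && dev_is_pci(dev) &&
            to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL &&
            (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
                pr_warn_once("i915 quirk: not merging sg segments\n");
                /* ... un-merging loop from the hunk above ... */
                return nents;
        }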
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-08 Thread Christoph Hellwig
On Thu, Sep 03, 2020 at 09:18:37PM +0100, Tom Murphy wrote:
> Disable combining sg segments in the dma-iommu api.
> Combining the sg segments exposes a bug in the intel i915 driver which
> causes visual artifacts and the screen to freeze. This is most likely
> because of how the i915 handles the returned list. It probably doesn't
> respect the returned value specifying the number of elements in the list
> and instead depends on the previous behaviour of the intel iommu driver
> which would return the same number of elements in the output list as in
> the input list.

So what is the state of addressing this properly in i915?  If we can't
get it done ASAP, I wonder if we need a runtime quirk to disable
merging instead of blocking this conversion.
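For reference, the contract a driver has to honour after dma_map_sg() is to
walk the mapped list with the returned count, roughly as below (sgt, dev and
program_gtt_entry() are illustrative names, not the actual i915 code):

        struct scatterlist *sg;
        int i, count;

        count = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
        if (!count)
                return -ENOMEM;

        /* Walk the returned number of DMA segments, not orig_nents. */
        for_each_sg(sgt->sgl, sg, count, i)
                program_gtt_entry(sg_dma_address(sg), sg_dma_len(sg));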
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-08 Thread Christoph Hellwig
On Tue, Sep 08, 2020 at 06:36:19AM +0100, Christoph Hellwig wrote:
> On Mon, Sep 07, 2020 at 09:18:50PM +0100, Tom Murphy wrote:
> > Yeah we talked about passing an attr to map_sg to disable merging at
> > the following microconference:
> > https://linuxplumbersconf.org/event/7/contributions/846/
> > As far as I can remember everyone seemed happy with that solution. I
> > won't be working on this though as I don't have any more time to
> > dedicate to this. It seems Lu Baolu will take over this.
> 
> I'm absolutely against passing a flag.  That just invites further
> abuse.  We need a PCI ID based quirk or something else that can't
> be as easily abused.

Also, I looked at i915 and there are just three dma_map_sg callers.
The two dmabuf-related ones are fixed by Marek in his series, leaving
just the one in i915_gem_gtt_prepare_pages, which does indeed look
very fishy.  But if that one is so hard to fix, it can just be replaced
by an open-coded for_each_sg loop that contains manual dma_map_page
calls.
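A sketch of that fallback, assuming sgt is the object's sg_table and dev the
DMA device (illustrative names only, error unwinding elided):

        struct scatterlist *sg;
        int i;

        for_each_sg(sgt->sgl, sg, sgt->orig_nents, i) {
                dma_addr_t addr = dma_map_page(dev, sg_page(sg), 0,
                                               sg->length, DMA_BIDIRECTIONAL);

                if (dma_mapping_error(dev, addr))
                        return -ENOMEM;
                sg_dma_address(sg) = addr;
                sg_dma_len(sg) = sg->length;
        }

Mapping each page individually keeps the strict one-DMA-entry-per-CPU-entry
layout the driver currently assumes.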
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-08 Thread Christoph Hellwig
On Tue, Sep 08, 2020 at 02:04:53PM +0800, Lu Baolu wrote:
> Do you mind telling where can I find Marek's series?

[PATCH v10 00/30] DRM: fix struct sg_table nents vs. orig_nents misuse

on various lists including the iommu one.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-08 Thread Christoph Hellwig
On Mon, Sep 07, 2020 at 09:18:50PM +0100, Tom Murphy wrote:
> Yeah we talked about passing an attr to map_sg to disable merging at
> the following microconference:
> https://linuxplumbersconf.org/event/7/contributions/846/
> As far as I can remember everyone seemed happy with that solution. I
> won't be working on this though as I don't have any more time to
> dedicate to this. It seems Lu Baolu will take over this.

I'm absolutely against passing a flag.  That just invites further
abuse.  We need a PCI ID based quirk or something else that can't
be as easily abused.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 03/11] mm: add a vmap_pfn function

2020-10-02 Thread Christoph Hellwig
Add a proper helper to remap PFNs into kernel virtual space so that
drivers don't have to abuse alloc_vm_area and open coded PTE
manipulation for it.

Signed-off-by: Christoph Hellwig 
---
 include/linux/vmalloc.h |  1 +
 mm/Kconfig  |  3 +++
 mm/vmalloc.c| 45 +
 3 files changed, 49 insertions(+)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index b899681e3ff9f0..c77efeac242514 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -122,6 +122,7 @@ extern void vfree_atomic(const void *addr);
 
 extern void *vmap(struct page **pages, unsigned int count,
unsigned long flags, pgprot_t prot);
+void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot);
 extern void vunmap(const void *addr);
 
 extern int remap_vmalloc_range_partial(struct vm_area_struct *vma,
diff --git a/mm/Kconfig b/mm/Kconfig
index 6c974888f86f97..6fa7ba1199eb1e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -815,6 +815,9 @@ config DEVICE_PRIVATE
  memory; i.e., memory that is only accessible from the device (or
  group of devices). You likely also want to select HMM_MIRROR.
 
+config VMAP_PFN
+   bool
+
 config FRAME_VECTOR
bool
 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ffad65f052c3f9..e2a2ded8d93478 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2413,6 +2413,51 @@ void *vmap(struct page **pages, unsigned int count,
 }
 EXPORT_SYMBOL(vmap);
 
+#ifdef CONFIG_VMAP_PFN
+struct vmap_pfn_data {
+   unsigned long   *pfns;
+   pgprot_tprot;
+   unsigned intidx;
+};
+
+static int vmap_pfn_apply(pte_t *pte, unsigned long addr, void *private)
+{
+   struct vmap_pfn_data *data = private;
+
+   if (WARN_ON_ONCE(pfn_valid(data->pfns[data->idx])))
+   return -EINVAL;
+   *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
+   return 0;
+}
+
+/**
+ * vmap_pfn - map an array of PFNs into virtually contiguous space
+ * @pfns: array of PFNs
+ * @count: number of pages to map
+ * @prot: page protection for the mapping
+ *
+ * Maps @count PFNs from @pfns into contiguous kernel virtual space and returns
+ * the start address of the mapping.
+ */
+void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot)
+{
+   struct vmap_pfn_data data = { .pfns = pfns, .prot = pgprot_nx(prot) };
+   struct vm_struct *area;
+
+   area = get_vm_area_caller(count * PAGE_SIZE, VM_IOREMAP,
+   __builtin_return_address(0));
+   if (!area)
+   return NULL;
+   if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
+   count * PAGE_SIZE, vmap_pfn_apply, &data)) {
+   free_vm_area(area);
+   return NULL;
+   }
+   return area->addr;
+}
+EXPORT_SYMBOL_GPL(vmap_pfn);
+#endif /* CONFIG_VMAP_PFN */
+
 static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 pgprot_t prot, int node)
 {
-- 
2.28.0
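As an illustration of how a driver would consume the new helper (mmio_base
and the write-combined protection are made-up example choices; the mapping
is torn down with vunmap()):

        unsigned long pfns[4];
        void *vaddr;
        int i;

        for (i = 0; i < ARRAY_SIZE(pfns); i++)
                pfns[i] = (mmio_base >> PAGE_SHIFT) + i;

        vaddr = vmap_pfn(pfns, ARRAY_SIZE(pfns),
                         pgprot_writecombine(PAGE_KERNEL));
        if (!vaddr)
                return -ENOMEM;

        /* ... access the linearly mapped I/O pages ... */

        vunmap(vaddr);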

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 06/11] drm/i915: use vmap in shmem_pin_map

2020-10-02 Thread Christoph Hellwig
shmem_pin_map somewhat awkwardly reimplements vmap using
alloc_vm_area and manual pte setup.  The only practical difference
is that alloc_vm_area prefaults the vmalloc area PTEs, which doesn't
seem to be required here (and could be added to vmap using a flag if
actually required).  Switch to use vmap, and use vfree to free both the
vmalloc mapping and the page array, as well as dropping the references
to each page.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 76 +++
 1 file changed, 18 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 43c7acbdc79dea..f011ea42487e11 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -49,80 +49,40 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
return file;
 }
 
-static size_t shmem_npte(struct file *file)
-{
-   return file->f_mapping->host->i_size >> PAGE_SHIFT;
-}
-
-static void __shmem_unpin_map(struct file *file, void *ptr, size_t n_pte)
-{
-   unsigned long pfn;
-
-   vunmap(ptr);
-
-   for (pfn = 0; pfn < n_pte; pfn++) {
-   struct page *page;
-
-   page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
-  GFP_KERNEL);
-   if (!WARN_ON(IS_ERR(page))) {
-   put_page(page);
-   put_page(page);
-   }
-   }
-}
-
 void *shmem_pin_map(struct file *file)
 {
-   const size_t n_pte = shmem_npte(file);
-   pte_t *stack[32], **ptes, **mem;
-   struct vm_struct *area;
-   unsigned long pfn;
-
-   mem = stack;
-   if (n_pte > ARRAY_SIZE(stack)) {
-   mem = kvmalloc_array(n_pte, sizeof(*mem), GFP_KERNEL);
-   if (!mem)
-   return NULL;
-   }
+   struct page **pages;
+   size_t n_pages, i;
+   void *vaddr;
 
-   area = alloc_vm_area(n_pte << PAGE_SHIFT, mem);
-   if (!area) {
-   if (mem != stack)
-   kvfree(mem);
+   n_pages = file->f_mapping->host->i_size >> PAGE_SHIFT;
+   pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL);
+   if (!pages)
return NULL;
-   }
 
-   ptes = mem;
-   for (pfn = 0; pfn < n_pte; pfn++) {
-   struct page *page;
-
-   page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
-  GFP_KERNEL);
-   if (IS_ERR(page))
+   for (i = 0; i < n_pages; i++) {
+   pages[i] = shmem_read_mapping_page_gfp(file->f_mapping, i,
+  GFP_KERNEL);
+   if (IS_ERR(pages[i]))
goto err_page;
-
-   **ptes++ = mk_pte(page,  PAGE_KERNEL);
}
 
-   if (mem != stack)
-   kvfree(mem);
-
+   vaddr = vmap(pages, n_pages, VM_MAP_PUT_PAGES, PAGE_KERNEL);
+   if (!vaddr)
+   goto err_page;
mapping_set_unevictable(file->f_mapping);
-   return area->addr;
-
+   return vaddr;
 err_page:
-   if (mem != stack)
-   kvfree(mem);
-
-   __shmem_unpin_map(file, area->addr, pfn);
+   while (--i >= 0)
+   put_page(pages[i]);
+   kvfree(pages);
return NULL;
 }
 
 void shmem_unpin_map(struct file *file, void *ptr)
 {
mapping_clear_unevictable(file->f_mapping);
-   __shmem_unpin_map(file, ptr, shmem_npte(file));
+   vfree(ptr);
 }
 
 static int __shmem_rw(struct file *file, loff_t off,
-- 
2.28.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 01/11] mm: update the documentation for vfree

2020-10-02 Thread Christoph Hellwig
From: "Matthew Wilcox (Oracle)" 

 * Document that you can call vfree() on an address returned from vmap()
 * Remove the note about the minimum size -- the minimum size of a vmalloc
   allocation is one page
 * Add a Context: section
 * Fix capitalisation
 * Reword the prohibition on calling from NMI context to avoid a double
   negative

Signed-off-by: Matthew Wilcox (Oracle) 
Signed-off-by: Christoph Hellwig 
---
 mm/vmalloc.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index be4724b916b3e7..8770260419af06 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2321,20 +2321,21 @@ static void __vfree(const void *addr)
 }
 
 /**
- * vfree - release memory allocated by vmalloc()
- * @addr:  memory base address
+ * vfree - Release memory allocated by vmalloc()
+ * @addr:  Memory base address
  *
- * Free the virtually continuous memory area starting at @addr, as
- * obtained from vmalloc(), vmalloc_32() or __vmalloc(). If @addr is
- * NULL, no operation is performed.
+ * Free the virtually continuous memory area starting at @addr, as obtained
+ * from one of the vmalloc() family of APIs.  This will usually also free the
+ * physical memory underlying the virtual allocation, but that memory is
+ * reference counted, so it will not be freed until the last user goes away.
  *
- * Must not be called in NMI context (strictly speaking, only if we don't
- * have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling
- * conventions for vfree() arch-depenedent would be a really bad idea)
+ * If @addr is NULL, no operation is performed.
  *
+ * Context:
  * May sleep if called *not* from interrupt context.
- *
- * NOTE: assumes that the object at @addr has a size >= sizeof(llist_node)
+ * Must not be called in NMI context (strictly speaking, it could be
+ * if we have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling
+ * conventions for vfree() arch-depenedent would be a really bad idea).
  */
 void vfree(const void *addr)
 {
-- 
2.28.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 04/11] mm: allow a NULL fn callback in apply_to_page_range

2020-10-02 Thread Christoph Hellwig
Besides calling the callback on each page, apply_to_page_range also has
the effect of pre-faulting all PTEs for the range.  To support callers
that only need the pre-faulting, make the callback optional.

Based on a patch from Minchan Kim .

Signed-off-by: Christoph Hellwig 
---
 mm/memory.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index fcfc4ca36eba80..dcf2bb69fbf847 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2420,13 +2420,15 @@ static int apply_to_pte_range(struct mm_struct *mm, 
pmd_t *pmd,
 
arch_enter_lazy_mmu_mode();
 
-   do {
-   if (create || !pte_none(*pte)) {
-   err = fn(pte++, addr, data);
-   if (err)
-   break;
-   }
-   } while (addr += PAGE_SIZE, addr != end);
+   if (fn) {
+   do {
+   if (create || !pte_none(*pte)) {
+   err = fn(pte++, addr, data);
+   if (err)
+   break;
+   }
+   } while (addr += PAGE_SIZE, addr != end);
+   }
*mask |= PGTBL_PTE_MODIFIED;
 
arch_leave_lazy_mmu_mode();
-- 
2.28.0
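For a caller that only wants the pre-faulting, the pattern reduces to
something like this sketch (area and size are assumed to come from the
caller; nothing here is lifted from a specific patch in this series):

        struct vm_struct *area;

        area = get_vm_area(size, VM_IOREMAP);
        if (!area)
                return NULL;

        /* Allocate the page tables now; no PTEs are populated yet. */
        if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
                                size, NULL, NULL)) {
                free_vm_area(area);
                return NULL;
        }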

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] remove alloc_vm_area v4

2020-10-02 Thread Christoph Hellwig
Hi Andrew,

this series removes alloc_vm_area, which was left over from the big
vmalloc interface rework.  It is a rather arcane interface, basically
the equivalent of get_vm_area + actually faulting in all PTEs in
the allocated area.  It was originally added for Xen (which isn't
modular to start with), and then grew users in zsmalloc and i915,
which seem to mostly qualify as abuses of the interface, especially
for i915, as a random driver should not set up PTE bits directly.

A git tree is also available here:

git://git.infradead.org/users/hch/misc.git alloc_vm_area

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/alloc_vm_area

Changes since v3:
 - rebased to include the two conflicting i915 changes in drm-tip /
   linux-next again.  No changes outside of drivers/gpu/drm/i915/

Changes since v2:
 - add another missing i initialization
 - rebased to mainline instead of drm-tip again

Changes since v1:
 - fix a bug in the zsmalloc changes
 - fix a bug and rebase to include the recent changes in i915
 - add a new vmap flag that allows to free the page array and pages
   using vfree
 - add a vfree documentation updated from Matthew

Diffstat:
 arch/x86/xen/grant-table.c|   27 --
 drivers/gpu/drm/i915/Kconfig  |1 
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  131 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c |   76 -
 drivers/xen/xenbus/xenbus_client.c|   30 +++---
 include/linux/vmalloc.h   |7 -
 mm/Kconfig|3 
 mm/memory.c   |   16 ++-
 mm/nommu.c|7 -
 mm/vmalloc.c  |  123 ++--
 mm/zsmalloc.c |   10 +-
 11 files changed, 200 insertions(+), 231 deletions(-)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 08/11] drm/i915: use vmap in i915_gem_object_map

2020-10-02 Thread Christoph Hellwig
i915_gem_object_map implements fairly low-level vmap functionality in
a driver.  Split it into two helpers, one for remapping kernel memory
which can use vmap, and one for I/O memory that uses vmap_pfn.

The only practical difference is that alloc_vm_area prefaults the
vmalloc area PTEs, which doesn't seem to be required here for the
kernel memory case (and could be added to vmap using a flag if actually
required).

Signed-off-by: Christoph Hellwig 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/Kconfig  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 127 ++
 2 files changed, 60 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 9afa5c4a6bf006..1e1cb245fca778 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -25,6 +25,7 @@ config DRM_I915
select CRC32
select SND_HDA_I915 if SND_HDA_CORE
select CEC_CORE if CEC_NOTIFIER
+   select VMAP_PFN
help
  Choose this option if you have a system that has "Intel Graphics
  Media Accelerator" or "HD Graphics" integrated graphics,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 6550c0bc824ea2..f60ca6dc911f29 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -232,34 +232,21 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
return err;
 }
 
-static inline pte_t iomap_pte(resource_size_t base,
- dma_addr_t offset,
- pgprot_t prot)
-{
-   return pte_mkspecial(pfn_pte((base + offset) >> PAGE_SHIFT, prot));
-}
-
 /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
-enum i915_map_type type)
+static void *i915_gem_object_map_page(struct drm_i915_gem_object *obj,
+   enum i915_map_type type)
 {
-   unsigned long n_pte = obj->base.size >> PAGE_SHIFT;
-   struct sg_table *sgt = obj->mm.pages;
-   pte_t *stack[32], **mem;
-   struct vm_struct *area;
+   unsigned long n_pages = obj->base.size >> PAGE_SHIFT, i;
+   struct page *stack[32], **pages = stack, *page;
+   struct sgt_iter iter;
pgprot_t pgprot;
+   void *vaddr;
 
-   if (!i915_gem_object_has_struct_page(obj) && type != I915_MAP_WC)
-   return NULL;
-
-   if (GEM_WARN_ON(type == I915_MAP_WC &&
-   !static_cpu_has(X86_FEATURE_PAT)))
-   return NULL;
-
-   /* A single page can always be kmapped */
-   if (n_pte == 1 && type == I915_MAP_WB) {
-   struct page *page = sg_page(sgt->sgl);
-
+   switch (type) {
+   default:
+   MISSING_CASE(type);
+   fallthrough;/* to use PAGE_KERNEL anyway */
+   case I915_MAP_WB:
/*
 * On 32b, highmem using a finite set of indirect PTE (i.e.
 * vmap) to provide virtual mappings of the high pages.
@@ -277,30 +264,8 @@ static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
 * So if the page is beyond the 32b boundary, make an explicit
 * vmap.
 */
-   if (!PageHighMem(page))
-   return page_address(page);
-   }
-
-   mem = stack;
-   if (n_pte > ARRAY_SIZE(stack)) {
-   /* Too big for stack -- allocate temporary array instead */
-   mem = kvmalloc_array(n_pte, sizeof(*mem), GFP_KERNEL);
-   if (!mem)
-   return NULL;
-   }
-
-   area = alloc_vm_area(obj->base.size, mem);
-   if (!area) {
-   if (mem != stack)
-   kvfree(mem);
-   return NULL;
-   }
-
-   switch (type) {
-   default:
-   MISSING_CASE(type);
-   fallthrough;/* to use PAGE_KERNEL anyway */
-   case I915_MAP_WB:
+   if (n_pages == 1 && !PageHighMem(sg_page(obj->mm.pages->sgl)))
+   return page_address(sg_page(obj->mm.pages->sgl));
pgprot = PAGE_KERNEL;
break;
case I915_MAP_WC:
@@ -308,30 +273,50 @@ static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
break;
}
 
-   if (i915_gem_object_has_struct_page(obj)) {
-   struct sgt_iter iter;
-   struct page *page;
-   pte_t **ptes = mem;
+   if (n_pages > ARRAY_SIZE(stack)) {
+   /* Too big for stack -- allocate temporary array instead */
+   pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL);
+   if (!pages)
+   return NULL;
+
