Re: [Intel-gfx] [PATCH v2] drm/i915: Replace kmap() with kmap_local_page()

2023-06-23 Thread Fabio M. De Francesco
On sabato 17 giugno 2023 20:04:20 CEST Sumitra Sharma wrote:
> kmap() has been deprecated in favor of the kmap_local_page()
> due to high cost, restricted mapping space, the overhead of a
> global lock for synchronization, and making the process sleep
> in the absence of free slots.
> 
> kmap_local_page() is faster than kmap() and offers thread-local
> and CPU-local mappings, take pagefaults in a local kmap region

NIT: _can_ take pagefaults in a local kmap region

> and preserves preemption by saving the mappings of outgoing tasks
> and restoring those of the incoming one during a context switch.
> 
> The mapping is kept thread local in the function
> “i915_vma_coredump_create” in i915_gpu_error.c
> 
> Therefore, replace kmap() with kmap_local_page().
> 
> Suggested-by: Ira Weiny 
> 
> Signed-off-by: Sumitra Sharma 
> ---
> 
> Changes in v2:
>   - Replace kmap() with kmap_local_page().
>   - Change commit subject and message.

With the changes that Ira suggested and the minor fix I'm proposing to the 
commit message, it looks good to me too, so this patch is... 

Reviewed-by: Fabio M. De Francesco 

However, as far as I'm concerned, our nits don't necessarily require any newer 
version, especially because Tvrtko has already sent this patch for their CI.

Thanks,

Fabio

P.S.: As Sumitra says both kmap() and kmap_local_page() allows preemption in 
non atomic context. 

Furthermore, Tvrtko confirmed that the pages can come from HIGHMEM, therefore 
kmap_local_page for local temporary mapping is unavoidable.

Last thing... Thomas thinks he wants to make it run atomically (if I 
understood one of his messages correctly). As I already responded, nothing 
prevents someone does another patch just to disable preemption (or to enter 
atomic context by other means) around the code marked by kmap_local_page() / 
kunmap_local() because these functions work perfectly _also_ in atomic context 
(including interrupts). But this is not something that Sumitra should be 
worried about.

> 
>  drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
> b/drivers/gpu/drm/i915/i915_gpu_error.c index f020c0086fbc..bc41500eedf5
> 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1164,9 +1164,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
> 
>   drm_clflush_pages(, 1);
> 
> - s = kmap(page);
> + s = kmap_local_page(page);
>   ret = compress_page(compress, s, dst, false);
> - kunmap(page);
> + kunmap_local(s);
> 
>   drm_clflush_pages(, 1);
> 
> --
> 2.25.1






Re: [Intel-gfx] [PATCH v2] drm/i915: Replace kmap() with kmap_local_page()

2023-06-21 Thread Fabio M. De Francesco
On mercoledì 21 giugno 2023 11:06:51 CEST Thomas Hellström (Intel) wrote:
> 
> I think one thing worth mentioning in the context of this patch is that
> IIRC kmap_local_page() will block offlining of the mapping CPU until
> kunmap_local(),

Migration is disabled.

> so while I haven't seen any guidelines around the usage
> of this api for long-held mappings,

It would be advisable to not use it for long term mappings, if possible. These 
"local" mappings should better be help for not too long duration. 

> I figure it's wise to keep the
> mapping duration short, or at least avoid sleeping with a
> kmap_local_page() map active.

Nothing prevents a call of preempt_disable() around the section of code 
between kmap_local_page() / kunmap_local(). If it is needed, local mappings 
could also be acquired under spinlocks and/or with disabled interrupts.

I don't know the code, however, everything cited above could be the subject of 
a subsequent patch.

Regards,

Fabio

> I figured, while page compression is probably to be considered "slow"
> it's probably not slow enough to motivate kmap() instead of
> kmap_local_page(), but if anyone feels differently, perhaps it should be
> considered.
> 
> With that said, my Reviewed-by: still stands.
> 
> /Thomas
> 





[Intel-gfx] [PATCH v2 3/3] drm/i915/gem: Replace kmap() with kmap_local_page()

2023-04-17 Thread Fabio M. De Francesco
kmap() s been deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Obviously, thread locality implies that the kernel virtual addresses are
only valid in the context of the callers. The kmap_local_page() use in
i915/gem doesn't break the above-mentioned constraint, so it should be
preferred to kmap().

Therefore, replace kmap() with kmap_local_page() in i915/gem and use
memcpy_to_page() where it is possible to avoid the open coding of
mapping + memcpy() + un-mapping.

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..8856a6409e83 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -657,16 +657,14 @@ i915_gem_object_create_shmem_from_data(struct 
drm_i915_private *dev_priv,
do {
unsigned int len = min_t(typeof(size), size, PAGE_SIZE);
struct page *page;
-   void *pgdata, *vaddr;
+   void *pgdata;

err = aops->write_begin(file, file->f_mapping, offset, len,
, );
if (err < 0)
goto fail;

-   vaddr = kmap(page);
-   memcpy(vaddr, data, len);
-   kunmap(page);
+   memcpy_to_page(page, 0, data, len);

err = aops->write_end(file, file->f_mapping, offset, len, len,
  page, pgdata);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 56279908ed30..5fd9e1ee2340 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -155,7 +155,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
intel_gt_flush_ggtt_writes(to_gt(i915));

p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, 
row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to 
page (%lu + %u [0x%lx]) of 0x%x, found 0x%x\n",
@@ -173,7 +173,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);

 out:
i915_gem_object_lock(obj, NULL);
@@ -251,7 +251,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
intel_gt_flush_ggtt_writes(to_gt(i915));

p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u 
[%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected 
write to page (%lu + %u [0x%lx]) of 0x%x, found 0x%x\n",
@@ -269,7 +269,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);
if (err)
return err;

--
2.40.0



[Intel-gfx] [PATCH v2 2/3] drm/i915/gt: Replace kmap() with kmap_local_page()

2023-04-17 Thread Fabio M. De Francesco
kmap() s been deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Obviously, thread locality implies that the kernel virtual addresses are
only valid in the context of the callers. The use of kmap_local_page() in
i915/gt doesn't break the above-mentioned constraint, so it should be
preferred to kmap().

Therefore, replace kmap() with kmap_local_page() in i915/gt.

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c| 11 ---
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 37d0b0fe791d..89295c6921d6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -749,7 +749,7 @@ static void swizzle_page(struct page *page)
char *vaddr;
int i;

-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);

for (i = 0; i < PAGE_SIZE; i += 128) {
memcpy(temp, [i], 64);
@@ -757,7 +757,7 @@ static void swizzle_page(struct page *page)
memcpy([i + 64], temp, 64);
}

-   kunmap(page);
+   kunmap_local(vaddr);
 }

 /**
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 449c9ed44382..be809839a241 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -101,22 +101,19 @@ static int __shmem_rw(struct file *file, loff_t off,
unsigned int this =
min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
struct page *page;
-   void *vaddr;

page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
   GFP_KERNEL);
if (IS_ERR(page))
return PTR_ERR(page);

-   vaddr = kmap(page);
if (write) {
-   memcpy(vaddr + offset_in_page(off), ptr, this);
+   memcpy_to_page(page, offset_in_page(off), ptr, this);
set_page_dirty(page);
} else {
-   memcpy(ptr, vaddr + offset_in_page(off), this);
+   memcpy_from_page(ptr, page, offset_in_page(off), this);
}
mark_page_accessed(page);
-   kunmap(page);
put_page(page);

len -= this;
@@ -143,11 +140,11 @@ int shmem_read_to_iosys_map(struct file *file, loff_t off,
if (IS_ERR(page))
return PTR_ERR(page);

-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
iosys_map_memcpy_to(map, map_off, vaddr + offset_in_page(off),
this);
mark_page_accessed(page);
-   kunmap(page);
+   kunmap_local(vaddr);
put_page(page);

len -= this;
--
2.40.0



[Intel-gfx] [PATCH v2 1/3] drm/i915: Replace kmap() with kmap_local_page()

2023-04-17 Thread Fabio M. De Francesco
kmap() has been deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Obviously, thread locality implies that the kernel voirtual addresses
are valid only in the context of the callers. kmap_local_page() use in
i915_gem.c doesn't break the above-mentioned constraint, so it should be
preferred to kmap().

Therefore, replace kmap() with kmap_local_page() in i915_gem.c

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/i915_gem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 35950fa91406..c35248555e42 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -212,14 +212,14 @@ shmem_pread(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;

-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);

if (needs_clflush)
drm_clflush_virt_range(vaddr + offset, len);

ret = __copy_to_user(user_data, vaddr + offset, len);

-   kunmap(page);
+   kunmap_local(vaddr);

return ret ? -EFAULT : 0;
 }
@@ -643,7 +643,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;

-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);

if (needs_clflush_before)
drm_clflush_virt_range(vaddr + offset, len);
@@ -652,7 +652,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
if (!ret && needs_clflush_after)
drm_clflush_virt_range(vaddr + offset, len);

-   kunmap(page);
+   kunmap_local(vaddr);

return ret ? -EFAULT : 0;
 }
--
2.40.0



[Intel-gfx] [PATCH v2 0/3] drm/i915: Replace kmap() with kmap_local_page()

2023-04-17 Thread Fabio M. De Francesco
kmap() has been deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).

The tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and so they are still valid.

Furthermore, kmap_local_page() is faster than kmap() in kernels with
HIGHMEM enabled.

Thread locality implies that the kernel virtual addresses returned by
kmap_local_page() are only valid in the context of the callers. This
constraint is never violated with the conversions in this series,
because the pointers are never handed to other threads, so the local
mappings are allowed and preferred.

Therefore, replace kmap() with kmap_local_page() in drm/i915/,
drm/i915/gem/, drm/i915/gt/.

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 

v1->v2: Do some changes in the text of the cover letter and in the
commit messages. There are no changes in the code of any of the three
patches.

Fabio M. De Francesco (3):
  drm/i915: Replace kmap() with kmap_local_page()
  drm/i915/gt: Replace kmap() with kmap_local_page()
  drm/i915/gem: Replace kmap() with kmap_local_page()

 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  |  6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |  8 
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c   |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c  | 11 ---
 drivers/gpu/drm/i915/i915_gem.c|  8 
 5 files changed, 16 insertions(+), 21 deletions(-)

--
2.40.0



Re: [Intel-gfx] [PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-03-31 Thread Fabio M. De Francesco
On venerdì 31 marzo 2023 13:30:20 CEST Tvrtko Ursulin wrote:
> On 31/03/2023 05:18, Ira Weiny wrote:
> > Zhao Liu wrote:
> >> From: Zhao Liu 
> >> 
> >> The use of kmap_atomic() is being deprecated in favor of
> >> kmap_local_page()[1], and this patch converts the calls from
> >> kmap_atomic() to kmap_local_page().
> >> 
> >> The main difference between atomic and local mappings is that local
> >> mappings doesn't disable page faults or preemption (the preemption is
> >> disabled for !PREEMPT_RT case, otherwise it only disables migration).
> >> 
> >> With kmap_local_page(), we can avoid the often unwanted side effect of
> >> unnecessary page faults and preemption disables.
> >> 
> >> In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
> >> kmap_atomic() in eb_relocate_entry(), and is unmapped by
> >> kunmap_atomic() in reloc_cache_reset().
> > 
> > First off thanks for the series and sticking with this.  That said this
> > patch kind of threw me for a loop because tracing the map/unmap calls did
> > not make sense to me.  See below.
> > 
> >> And this mapping/unmapping occurs in two places: one is in
> >> eb_relocate_vma(), and another is in eb_relocate_vma_slow().
> >> 
> >> The function eb_relocate_vma() or eb_relocate_vma_slow() doesn't
> >> need to disable pagefaults and preemption during the above mapping/
> >> unmapping.
> >> 
> >> So it can simply use kmap_local_page() / kunmap_local() that can
> >> instead do the mapping / unmapping regardless of the context.
> >> 
> >> Convert the calls of kmap_atomic() / kunmap_atomic() to
> >> kmap_local_page() / kunmap_local().
> >> 
> >> [1]:
> >> https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> >> 
> >> v2: No code change since v1. Added description of the motivation of
> >> 
> >>  using kmap_local_page() and "Suggested-by" tag of Fabio.
> >> 
> >> Suggested-by: Ira Weiny 
> >> Suggested-by: Fabio M. De Francesco 
> >> Signed-off-by: Zhao Liu 
> >> ---
> >> 
> >> Suggested by credits:
> >>Ira: Referred to his task document, review comments.
> >>Fabio: Referred to his boiler plate commit message and his description
> >>
> >>   about why kmap_local_page() should be preferred.
> >> 
> >> ---
> >> 
> >>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 10 +-
> >>   1 file changed, 5 insertions(+), 5 deletions(-)
> >> 

[snip]
 
> However I am unsure if disabling pagefaulting is needed or not. Thomas,
> Matt, being the last to touch this area, perhaps you could have a look?
> Because I notice we have a fallback iomap path which still uses
> io_mapping_map_atomic_wc. So if kmap_atomic to kmap_local conversion is
> safe, does the iomap side also needs converting to
> io_mapping_map_local_wc? Or they have separate requirements?

AFAIK, the requirements for io_mapping_map_local_wc() are the same as for 
kmap_local_page(): the kernel virtual address is _only_ valid in the caller 
context, and map/unmap nesting must be done in stack-based ordering (LIFO).

I think a follow up patch could safely switch to io_mapping_map_local_wc() / 
io_mapping_unmap_local_wc since the address is local to context.

However, not being an expert, reading your note now I suspect that I'm missing 
something. Can I ask why you think that page-faults disabling might be 
necessary? 

Thanks,

Fabio

> Regards,
> 
> Tvrtko





Re: [Intel-gfx] [PATCH v2 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-03-29 Thread Fabio M. De Francesco
On mercoledì 29 marzo 2023 09:32:11 CEST Zhao Liu wrote:
> From: Zhao Liu 
> 
> Hi list,
> 
> Sorry for a long delay since v1 [1]. This patchset is based on 197b6b6
> (Linux 6.3-rc4).
> 
> Welcome and thanks for your review and comments!
> 
> 
> # Purpose of this patchset
> 
> The purpose of this pacthset is to replace all uses of kmap_atomic() in
> i915 with kmap_local_page() because the use of kmap_atomic() is being
> deprecated in favor of kmap_local_page()[1]. And 92b64bd (mm/highmem:
> add notes about conversions from kmap{,_atomic}()) has declared the
> deprecation of kmap_atomic().
> 
> 
> # Motivation for deprecating kmap_atomic() and using kmap_local_page()
> 
> The main difference between atomic and local mappings is that local
> mappings doesn't disable page faults or preemption (the preemption is
> disabled for !PREEMPT_RT case, otherwise it only disables migration).
> 
> With kmap_local_page(), we can avoid the often unwanted side effect of
> unnecessary page faults and preemption disables.
> 
> 
> # Patch summary
> 
> Patch 1, 4-6 and 8-9 replace kamp_atomic()/kunmap_atomic() with
> kmap_local_page()/kunmap_local() directly. With thses local
> mappings, the page faults and preemption are allowed.
> 
> Patch 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
> kamp_atomic()/kunmap_atomic(). These two variants of memcpy()
> are based on the local mapping, so page faults and preemption
> are also allowed in these two interfaces.
> 
> Patch 3 replaces kamp_atomic()/kunmap_atomic() with kmap_local_page()/
> kunmap_local() and also diable page fault since the for special
> handling (pls see the commit message).
> 
> 
> # Changes since v1
> 
> * Dropped hot plug related description in commit message since it has
>   nothing to do with kmap_local_page().
> * Emphasized the motivation for using kmap_local_page() in commit
>   message.
> * Rebased patch 1 on f47e630 (drm/i915/gem: Typecheck page lookups) to
>   keep the "idx" variable of type pgoff_t here.
> * Used memcpy_from_page() and memcpy_to_page() to replace
>   kmap_local_page() + memcpy() in patch 2.
> 
> 
> # Reference
> 
> [1]:
> https://lore.kernel.org/lkml/20221017093726.2070674-1-zhao1.liu@linux.intel.c
> om/ [1]:
> https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com ---
> Zhao Liu (9):
>   drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
>   drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_pyhs.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
>   drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
>   drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
>   drm/i915: Use kmap_local_page() in i915_cmd_parser.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c
> 

I _think_ that the "long delay" you mentioned in the first sentence has paid 
off in full. 

I don't see things to improve (except all those "kamp_atomic()" typo in the 
patches summary; however, typos are only in the cover so I'm sure they won't 
hurt anybody). 

Each of the nine patches listed above looks good to me, so they are all…

Reviewed-by: Fabio M. De Francesco 

Thanks!

Fabio

PS: Obviously there was no need to reconfirm my tag for patch 3/9. A single 
tag that catches all patches is easier for a lazy person like me :-)

>
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
>  drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
>  .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
>  .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
>  drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
>  9 files changed, 28 insertions(+), 41 deletions(-)
> 
> --
> 2.34.1






Re: [Intel-gfx] [PATCH 3/9] drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c

2022-11-03 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 11:37:19 CET Zhao Liu wrote:
> From: Zhao Liu 
> 
> The use of kmap_atomic() is being deprecated in favor of
> kmap_local_page()[1].
> 
> The main difference between atomic and local mappings is that local
> mappings doesn't disable page faults or preemption.
> 
> In drm/i915/gem/i915_gem_shmem.c, the function shmem_pwrite() need to
> disable pagefault to eliminate the potential recursion fault[2]. But
> here __copy_from_user_inatomic() doesn't need to disable preemption and
> local mapping is valid for sched in/out.
> So it can use kmap_local_page() / kunmap_local() with
> pagefault_disable() / pagefault_enable() to replace atomic mapping.
> 
> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> [2]: https://patchwork.freedesktop.org/patch/295840/
> 
> Suggested-by: Ira Weiny 
> Signed-off-by: Zhao Liu 
> ---
> Suggested by credits:
>   Ira: Referred to his suggestions about keeping pagefault_disable().
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

The commit message explains clearly and the changes to the code look good. The 
necessity to reuse pagefault_disable() / pagefault_enable() from the main 
kmap_atomic() side effect is a nice catch.

Reviewed-by: Fabio M. De Francesco 

Thanks!

> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index f42ca1179f37..e279a3e30c02 
100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -472,11 +472,13 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
>   if (err < 0)
>   return err;
> 
> - vaddr = kmap_atomic(page);
> + vaddr = kmap_local_page(page);
> + pagefault_disable();
>   unwritten = __copy_from_user_inatomic(vaddr + pg,
> user_data,
> len);
> - kunmap_atomic(vaddr);
> + pagefault_enable();
> + kunmap_local(vaddr);
> 
>   err = aops->write_end(obj->base.filp, mapping, offset, 
len,
> len - unwritten, page, data);





Re: [Intel-gfx] [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-11-03 Thread Fabio M. De Francesco
On giovedì 3 novembre 2022 17:51:23 CET Ira Weiny wrote:
> On Sat, Oct 29, 2022 at 01:17:03PM +0200, Fabio M. De Francesco wrote:
> > On lunedì 17 ottobre 2022 11:37:17 CEST Zhao Liu wrote:
> > > From: Zhao Liu 
> > > 
> > > The use of kmap_atomic() is being deprecated in favor of
> > > kmap_local_page()[1].
> > > 
> > > The main difference between atomic and local mappings is that local
> > > mappings doesn't disable page faults or preemption.
> > 
> > You are right about about page faults which are never disabled by
> > kmap_local_page(). However kmap_atomic might not disable preemption. It
> > depends on CONFIG_PREEMPT_RT.
> > 
> > Please refer to how kmap_atomic_prot() works (this function is called by
> > kmap_atomic() when kernels have HIGHMEM enabled).
> > 
> > > There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> > > need to disable pagefaults and preemption for mapping:
> > > 
> > > 1. The flush operation is safe for CPU hotplug when preemption is not
> > > disabled.
> > 
> > I'm confused here. Why are you talking about CPU hotplug?
> 
> I agree with Fabio here.  I'm not making the connection between cpu hotplug 
and
> this code path.
> 
> Ira

@Zhao,

I'd like to add that I was about to put my reviewed-by tag. The other things I 
objected are minor nits. Please just clarify this connection.

Your code is good and deserves to be applied.

Fabio

> 
> > In any case, developers should never rely on implicit calls of
> > preempt_disable() for the reasons said above. Therefore, flush operations
> > should be allowed regardless that kmap_atomic() potential side effect.
> > 
> > > In drm/i915/gem/i915_gem_object.c, the function
> > > i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range()
> > 
> > If I recall correctly, drm_clflush_virt_range() can always be called with 
page
> > faults and preemption enabled. If so, this is enough to say that the
> > conversion is safe.
> > 
> > Is this code explicitly related to flushing the cache lines before 
removing /
> > adding CPUs? If I recall correctly, there are several other reasons behind 
the
> > need to issue cache lines flushes. Am I wrong about this?
> > 
> > Can you please say more about what I'm missing here?
> > 
> > > to
> > > use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
> > > and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush
> > > operation is global and any issue with cpu's being added or removed
> > > can be handled safely.
> > 
> > Again your main concern is about CPU hotplug.
> > 
> > Even if I'm missing something, do we really need all these details about 
the
> > inner workings of drm_clflush_virt_range()?
> > 
> > I'm not an expert, so may be that I'm wrong about all I wrote above.
> > 
> > Therefore, can you please elaborate a little more for readers with very 
little
> > knowledge of these kinds of things (like me and perhaps others)?
> > 
> > > 2. Any context switch caused by preemption or sleep (pagefault may
> > > cause sleep) doesn't affect the validity of local mapping.
> > 
> > I'd replace "preemption or sleep" with "preemption and page faults" since
> > yourself then added that page faults lead to tasks being put to sleep.
> > 
> > > Therefore, i915_gem_object_read_from_page_kmap() is a function where
> > > the use of kmap_local_page() in place of kmap_atomic() is correctly
> > > suited.
> > > 
> > > Convert the calls of kmap_atomic() / kunmap_atomic() to
> > > kmap_local_page() / kunmap_local().
> > > 
> > > And remove the redundant variable that stores the address of the mapped
> > > page since kunmap_local() can accept any pointer within the page.
> > > 
> > > [1]: 
> > > https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> > > 
> > > Suggested-by: Dave Hansen 
> > > Suggested-by: Ira Weiny 
> > > Suggested-by: Fabio M. De Francesco 
> > > Signed-off-by: Zhao Liu 
> > > ---
> > > 
> > > Suggested by credits:
> > >   Dave: Referred to his explanation about cache flush.
> > >   Ira: Referred to his task document, review comments and explanation 
about
> > >   
> > >cache flush.
> > >   
> > >   Fabio: Referred to his boiler plate commit message.
> > > 
> > > ---
> > > 
> > >  dri

Re: [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Replace kmap() with kmap_local_page() (rev2)

2022-11-01 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 22:23:30 CET Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Replace kmap() with kmap_local_page() (rev2)
> URL   : https://patchwork.freedesktop.org/series/107277/
> State : warning
> 
> == Summary ==
> 
> Error: dim sparse failed
> Sparse version: v0.6.2
> Fast mode used, each commit won't be checked separately.

I received this and other messages with regards to the same series, however, 
after due inspection of my patches, I haven't yet been able to understand what 
these reports may mean or even if they are reporting bugs.

Can anyone please provide any help?

Thanks,

Fabio

P.S.: Similar requests for the other messages will follow ASAP.





Re: [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Replace kmap() with kmap_local_page() (rev2)

2022-11-01 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 22:23:30 CET Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Replace kmap() with kmap_local_page() (rev2)
> URL   : https://patchwork.freedesktop.org/series/107277/
> State : warning
> 
> == Summary ==
> 
> Error: dim sparse failed
> Sparse version: v0.6.2
> Fast mode used, each commit won't be checked separately.

I received this and other messages with regards to the same series, however, 
after due inspection of my patches, I haven't yet been able to understand what 
these reports may mean or even if they are reporting bugs.

Can anyone please provide any help?

Thanks,

Fabio

P.S.: Similar requests for the other messages will follow ASAP.






Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Replace kmap() with kmap_local_page() (rev2)

2022-11-01 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 22:34:44 CET Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Replace kmap() with kmap_local_page() (rev2)
> URL   : https://patchwork.freedesktop.org/series/107277/
> State : success
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_12252 -> Patchwork_107277v2
> 
> 
> Summary
> ---
> 
>   **SUCCESS**
> 
>   No regressions found.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107277v2/
index.html
> 
> Participating hosts (46 -> 43)
> --
> 
>   Additional (1): fi-icl-u2
>   Missing(4): fi-bxt-dsi fi-cfl-8700k bat-atsm-1 fi-bdw-samus
> 
> Known issues
> 
> 
>   Here are the changes found in Patchwork_107277v2 that come from known 
issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@gem_exec_gttfill@basic:
> - fi-pnv-d510:[PASS][1] -> [FAIL][2] ([i915#7229])
>[1]:
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12252/fi-pnv-d510/
igt@gem_exec_gttfill@
> basic.html [2]:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107277v2/fi-pnv-d510/
igt@gem_exec_gt
> tf...@basic.html
> 
>   * igt@gem_linear_blits@basic:
> - fi-pnv-d510:[PASS][3] -> [SKIP][4] ([fdo#109271])
>[3]:
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12252/fi-pnv-d510/
igt@gem_linear_blits@
> basic.html [4]:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107277v2/fi-pnv-d510/
igt@gem_linear_
> bl...@basic.html
> 
>   * igt@runner@aborted:
> - fi-icl-u2:  NOTRUN -> [FAIL][5] ([i915#7220])
>[5]:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107277v2/fi-icl-u2/
igt@runner@aborte
> d.html
> 
> 
>   [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
>   [i915#7220]: https://gitlab.freedesktop.org/drm/intel/issues/7220
>   [i915#7229]: https://gitlab.freedesktop.org/drm/intel/issues/7229
> 
> 
> Build changes
> -
> 
>   * Linux: CI_DRM_12252 -> Patchwork_107277v2
> 
>   CI-20190529: 20190529
>   CI_DRM_12252: 14867082ef288af10c90732e31b633af30e304c0 @
> git://anongit.freedesktop.org/gfx-ci/linux IGT_7017:
> 193c8bdd7df32556129482c52011e1b7a233b699 @
> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_107277v2:
> 14867082ef288af10c90732e31b633af30e304c0 @ git://anongit.freedesktop.org/
gfx-ci/linux
> 
> 
> ### Linux commits
> 
> 83c34dd3b2a2 drm/i915/gem: Replace kmap() with kmap_local_page()
> ade39e0c9963 drm/i915/gt: Replace kmap() with kmap_local_page()
> cd89efceb13c drm/i915: Replace kmap() with kmap_local_page()
> 
> == Logs ==
> 
> For more details see:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_107277v2/index.html

This is the second of the three messages with regards to the same series.
My kind request is the same of the previous...

Can anyone please provide any help?

Thanks,

Fabio







Re: [Intel-gfx] [PATCH 2/9] drm/i915: Use kmap_local_page() in gem/i915_gem_pyhs.c

2022-10-29 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 11:37:18 CEST Zhao Liu wrote:
> From: Zhao Liu 
> 
> The use of kmap_atomic() is being deprecated in favor of
> kmap_local_page()[1].
> 
> The main difference between atomic and local mappings is that local
> mappings doesn't disable page faults or preemption.
> 
> In drm/i915/gem/i915_gem_phys.c, the functions
> i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
> don't need to disable pagefaults and preemption for mapping because of
> these 2 reasons:
> 
> 1. The flush operation is safe for CPU hotplug when preemption is not
> disabled. In drm/i915/gem/i915_gem_object.c, the functions
> i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
> calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
> Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
> drm_clflush_virt_range(), the flush operation is global and any issue
> with cpu's being added or removed can be handled safely.
> 
> 2. Any context switch caused by preemption or sleep (pagefault may
> cause sleep) doesn't affect the validity of local mapping.
> 
> Therefore, i915_gem_object_get_pages_phys() and
> i915_gem_object_put_pages_phys() are two functions where the use of
> kmap_local_page() in place of kmap_atomic() is correctly suited.
> 
> Convert the calls of kmap_atomic() / kunmap_atomic() to
> kmap_local_page() / kunmap_local().
> 

I have here the same questions as in 1/9.

> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> 
> Suggested-by: Dave Hansen 
> Suggested-by: Ira Weiny 
> Suggested-by: Fabio M. De Francesco 
> Signed-off-by: Zhao Liu 
> ---
> Suggested by credits:
>   Dave: Referred to his explanation about cache flush.
>   Ira: Referred to his task document, review comments and explanation about
>cache flush.
>   Fabio: Referred to his boiler plate commit message.
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> b/drivers/gpu/drm/i915/gem/i915_gem_phys.c index 0d0e46dae559..d602ba19ecb2 
100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> @@ -66,10 +66,10 @@ static int i915_gem_object_get_pages_phys(struct 
drm_i915_gem_object
> *obj) if (IS_ERR(page))
>   goto err_st;
> 
> - src = kmap_atomic(page);
> + src = kmap_local_page(page);
>   memcpy(dst, src, PAGE_SIZE);
>   drm_clflush_virt_range(dst, PAGE_SIZE);
> - kunmap_atomic(src);
> + kunmap_local(src);

Please use memcpy_from_page() instead of open coding mapping + memcpy() + 
unmapping.

> 
>   put_page(page);
>   dst += PAGE_SIZE;
> @@ -114,10 +114,10 @@ i915_gem_object_put_pages_phys(struct 
drm_i915_gem_object *obj,
>   if (IS_ERR(page))
>   continue;
> 
> - dst = kmap_atomic(page);
> + dst = kmap_local_page(page);
>   drm_clflush_virt_range(src, PAGE_SIZE);
>   memcpy(dst, src, PAGE_SIZE);
> - kunmap_atomic(dst);
> + kunmap_local(dst);

For the same reasons said above, memcpy_to_page() should be used here and 
avoid open coding of three functions.

Using those helpers forces you to move drm_clflush_virt_range() out of the 
mapping / un-mapping region. I may be wrong, however I'm pretty sure that the 
relative positions of each of those call sites is something that cannot be 
randomly chosen.

Thanks,

Fabio

> 
>   set_page_dirty(page);
>   if (obj->mm.madv == I915_MADV_WILLNEED)





Re: [Intel-gfx] [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-10-29 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 11:37:17 CEST Zhao Liu wrote:
> From: Zhao Liu 
> 
> The use of kmap_atomic() is being deprecated in favor of
> kmap_local_page()[1].
> 
> The main difference between atomic and local mappings is that local
> mappings doesn't disable page faults or preemption.

You are right about about page faults which are never disabled by 
kmap_local_page(). However kmap_atomic might not disable preemption. It 
depends on CONFIG_PREEMPT_RT.

Please refer to how kmap_atomic_prot() works (this function is called by 
kmap_atomic() when kernels have HIGHMEM enabled).

> 
> There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> need to disable pagefaults and preemption for mapping:
> 
> 1. The flush operation is safe for CPU hotplug when preemption is not
> disabled. 

I'm confused here. Why are you talking about CPU hotplug?
In any case, developers should never rely on implicit calls of 
preempt_disable() for the reasons said above. Therefore, flush operations 
should be allowed regardless that kmap_atomic() potential side effect.

> In drm/i915/gem/i915_gem_object.c, the function
> i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range()

If I recall correctly, drm_clflush_virt_range() can always be called with page 
faults and preemption enabled. If so, this is enough to say that the 
conversion is safe. 

Is this code explicitly related to flushing the cache lines before removing / 
adding CPUs? If I recall correctly, there are several other reasons behind the 
need to issue cache lines flushes. Am I wrong about this?

Can you please say more about what I'm missing here?

> to
> use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
> and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush
> operation is global and any issue with cpu's being added or removed
> can be handled safely.

Again your main concern is about CPU hotplug.

Even if I'm missing something, do we really need all these details about the 
inner workings of drm_clflush_virt_range()? 

I'm not an expert, so may be that I'm wrong about all I wrote above.

Therefore, can you please elaborate a little more for readers with very little 
knowledge of these kinds of things (like me and perhaps others)?
 
> 2. Any context switch caused by preemption or sleep (pagefault may
> cause sleep) doesn't affect the validity of local mapping.

I'd replace "preemption or sleep" with "preemption and page faults" since 
yourself then added that page faults lead to tasks being put to sleep.  

> Therefore, i915_gem_object_read_from_page_kmap() is a function where
> the use of kmap_local_page() in place of kmap_atomic() is correctly
> suited.
> 
> Convert the calls of kmap_atomic() / kunmap_atomic() to
> kmap_local_page() / kunmap_local().
> 
> And remove the redundant variable that stores the address of the mapped
> page since kunmap_local() can accept any pointer within the page.
> 
> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> 
> Suggested-by: Dave Hansen 
> Suggested-by: Ira Weiny 
> Suggested-by: Fabio M. De Francesco 
> Signed-off-by: Zhao Liu 
> ---
> Suggested by credits:
>   Dave: Referred to his explanation about cache flush.
>   Ira: Referred to his task document, review comments and explanation about
>cache flush.
>   Fabio: Referred to his boiler plate commit message.
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 
369006c5317f..a0072abed75e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -413,17 +413,15 @@ void __i915_gem_object_invalidate_frontbuffer(struct
> drm_i915_gem_object *obj, static void
>  i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 
offset, void
> *dst, int size) {
> - void *src_map;
>   void *src_ptr;
> 
> - src_map = kmap_atomic(i915_gem_object_get_page(obj, offset >> 
PAGE_SHIFT));
> -
> - src_ptr = src_map + offset_in_page(offset);
> + src_ptr = kmap_local_page(i915_gem_object_get_page(obj, offset >> 
PAGE_SHIFT))
> +   + offset_in_page(offset);
>   if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
>   drm_clflush_virt_range(src_ptr, size);
>   memcpy(dst, src_ptr, size);
> 
> - kunmap_atomic(src_map);
> + kunmap_local(src_ptr);
>  }
> 
>  static void

The changes look good, but I'd like to better understand the commit message.

Thanks,

Fabio 




Re: [Intel-gfx] [PATCH 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2022-10-29 Thread Fabio M. De Francesco
On lunedì 17 ottobre 2022 11:37:16 CEST Zhao Liu wrote:
> From: Zhao Liu 
> 
> The use of kmap_atomic() is being deprecated in favor of
> kmap_local_page()[1].

Some words to explain why kmap_atomic was deprecated won't hurt. Many 
maintainers and reviewers, and also casual readers might not yet be aware of 
the reasons behind that deprecation.
 
> In the following patches, we can convert the calls of kmap_atomic() /
> kunmap_atomic() to kmap_local_page() / kunmap_local(), which can
> instead do the mapping / unmapping regardless of the context.

Readers are probably much more interested in what you did in the following 
patches and why you did it, instead of being informed about what "we can" do.

I would suggest something like "The following patches convert the calls to 
kmap_atomic() to kmap_local_page() [the rest looks OK]".

This could also be the place to say something about why we prefer 
kmap_local_page() to kmap_atomic(). 

Are you sure that the reasons that motivates your conversions are merely 
summarized to kmap_local_page() being able to do mappings regardless of 
context? I think you are missing the real reasons why. 

What about avoiding the often unwanted side effect of unnecessary page faults 
disables?

> 
> With kmap_local_page(), the mapping is per thread, CPU local and not
> globally visible.

No news here. kmap_atomic() is "per thread, CPU local and not glocally 
visible". I cannot see any difference here between kmap_atomic() and 
kmap_local_page().

> 
> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> ---
> Zhao Liu (9):
>   drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_pyhs.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
>   drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
>   drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
>   drm/i915: Use kmap_local_page() in i915_cmd_parser.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c
> 
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c |  8 
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
>  drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
>  .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
>  .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
>  drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
>  9 files changed, 30 insertions(+), 37 deletions(-)

Thanks for helping with kmap_atomic() conversions to kmap_local_page().

Fabio




[Intel-gfx] [RESEND PATCH 1/3] drm/i915: Replace kmap() with kmap_local_page()

2022-10-17 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915_gem.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915_gem.c

Cc: "Venkataramanan, Anirudh" 
Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/i915_gem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 702e5b89be22..43effce60e1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -212,14 +212,14 @@ shmem_pread(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
if (needs_clflush)
drm_clflush_virt_range(vaddr + offset, len);
 
ret = __copy_to_user(user_data, vaddr + offset, len);
 
-   kunmap(page);
+   kunmap_local(vaddr);
 
return ret ? -EFAULT : 0;
 }
@@ -634,7 +634,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
if (needs_clflush_before)
drm_clflush_virt_range(vaddr + offset, len);
@@ -643,7 +643,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
if (!ret && needs_clflush_after)
drm_clflush_virt_range(vaddr + offset, len);
 
-   kunmap(page);
+   kunmap_local(vaddr);
 
return ret ? -EFAULT : 0;
 }
-- 
2.37.1



[Intel-gfx] [RESEND PATCH 3/3] drm/i915/gem: Replace kmap() with kmap_local_page()

2022-10-17 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915/gem is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gem. Instead of
open-coding local map + memcpy + local unmap, use memcpy_to_page() in a
suited call site.

Cc: "Venkataramanan, Anirudh" 
Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 4eed3dd90ba8..2bc6ab9964ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -640,16 +640,14 @@ i915_gem_object_create_shmem_from_data(struct 
drm_i915_private *dev_priv,
do {
unsigned int len = min_t(typeof(size), size, PAGE_SIZE);
struct page *page;
-   void *pgdata, *vaddr;
+   void *pgdata;
 
err = aops->write_begin(file, file->f_mapping, offset, len,
, );
if (err < 0)
goto fail;
 
-   vaddr = kmap(page);
-   memcpy(vaddr, data, len);
-   kunmap(page);
+   memcpy_to_page(page, 0, data, len);
 
err = aops->write_end(file, file->f_mapping, offset, len, len,
  page, pgdata);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ced9948a331..bb25b50b5688 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -153,7 +153,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
intel_gt_flush_ggtt_writes(to_gt(i915));
 
p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, 
row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to 
page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -171,7 +171,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);
 
 out:
i915_gem_object_lock(obj, NULL);
@@ -249,7 +249,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
intel_gt_flush_ggtt_writes(to_gt(i915));
 
p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u 
[%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected 
write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -267,7 +267,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);
if (err)
return err;
 
-- 
2.37.1



[Intel-gfx] [RESEND PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page()

2022-10-17 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in drm/i915 is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in drm/i915.

These changes should be tested in an 32 bits system, booting a kernel
with HIGHMEM enabled. Unfortunately I have no i915 based hardware,
therefore any help with testing would be greatly appreciated.

I'm resending this little series because I suspect that it has been
lost, since it was submitted on Aug 11, 2022. In the meantime I'm
adding one more recipient (Anirudh) who is helping, along with others, Ira
and me with these conversions / removals of kmap() and kmap_atomic() 

Cc: "Venkataramanan, Anirudh" 
Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 

Fabio M. De Francesco (3):
  drm/i915: Replace kmap() with kmap_local_page()
  drm/i915/gt: Replace kmap() with kmap_local_page()
  drm/i915/gem: Replace kmap() with kmap_local_page()

 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  |  6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |  8 
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c   |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c  | 11 ---
 drivers/gpu/drm/i915/i915_gem.c|  8 
 5 files changed, 16 insertions(+), 21 deletions(-)

-- 
2.37.1



[Intel-gfx] [RESEND PATCH 2/3] drm/i915/gt: Replace kmap() with kmap_local_page()

2022-10-17 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915/gt is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gt. Instead of
open-coding local mappings + memcpy() + local unmappings, use
the memcpy_{from,to}_page() helpers where these are better suited.

Cc: "Venkataramanan, Anirudh" 
Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c| 11 ---
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 6ebda3d65086..21d8ce40b897 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -747,7 +747,7 @@ static void swizzle_page(struct page *page)
char *vaddr;
int i;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
for (i = 0; i < PAGE_SIZE; i += 128) {
memcpy(temp, [i], 64);
@@ -755,7 +755,7 @@ static void swizzle_page(struct page *page)
memcpy([i + 64], temp, 64);
}
 
-   kunmap(page);
+   kunmap_local(vaddr);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 402f085f3a02..48edbb8a33e5 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -98,22 +98,19 @@ static int __shmem_rw(struct file *file, loff_t off,
unsigned int this =
min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
struct page *page;
-   void *vaddr;
 
page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
   GFP_KERNEL);
if (IS_ERR(page))
return PTR_ERR(page);
 
-   vaddr = kmap(page);
if (write) {
-   memcpy(vaddr + offset_in_page(off), ptr, this);
+   memcpy_to_page(page, offset_in_page(off), ptr, this);
set_page_dirty(page);
} else {
-   memcpy(ptr, vaddr + offset_in_page(off), this);
+   memcpy_from_page(ptr, page, offset_in_page(off), this);
}
mark_page_accessed(page);
-   kunmap(page);
put_page(page);
 
len -= this;
@@ -140,11 +137,11 @@ int shmem_read_to_iosys_map(struct file *file, loff_t off,
if (IS_ERR(page))
return PTR_ERR(page);
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
iosys_map_memcpy_to(map, map_off, vaddr + offset_in_page(off),
this);
mark_page_accessed(page);
-   kunmap(page);
+   kunmap_local(vaddr);
put_page(page);
 
len -= this;
-- 
2.37.1



[Intel-gfx] [PATCH 2/3] drm/i915/gt: Replace kmap() with kmap_local_page()

2022-08-24 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915/gt is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gt

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c| 11 ---
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 6ebda3d65086..21d8ce40b897 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -747,7 +747,7 @@ static void swizzle_page(struct page *page)
char *vaddr;
int i;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
for (i = 0; i < PAGE_SIZE; i += 128) {
memcpy(temp, [i], 64);
@@ -755,7 +755,7 @@ static void swizzle_page(struct page *page)
memcpy([i + 64], temp, 64);
}
 
-   kunmap(page);
+   kunmap_local(vaddr);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 402f085f3a02..48edbb8a33e5 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -98,22 +98,19 @@ static int __shmem_rw(struct file *file, loff_t off,
unsigned int this =
min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
struct page *page;
-   void *vaddr;
 
page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
   GFP_KERNEL);
if (IS_ERR(page))
return PTR_ERR(page);
 
-   vaddr = kmap(page);
if (write) {
-   memcpy(vaddr + offset_in_page(off), ptr, this);
+   memcpy_to_page(page, offset_in_page(off), ptr, this);
set_page_dirty(page);
} else {
-   memcpy(ptr, vaddr + offset_in_page(off), this);
+   memcpy_from_page(ptr, page, offset_in_page(off), this);
}
mark_page_accessed(page);
-   kunmap(page);
put_page(page);
 
len -= this;
@@ -140,11 +137,11 @@ int shmem_read_to_iosys_map(struct file *file, loff_t off,
if (IS_ERR(page))
return PTR_ERR(page);
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
iosys_map_memcpy_to(map, map_off, vaddr + offset_in_page(off),
this);
mark_page_accessed(page);
-   kunmap(page);
+   kunmap_local(vaddr);
put_page(page);
 
len -= this;
-- 
2.37.1



[Intel-gfx] [PATCH 1/3] drm/i915: Replace kmap() with kmap_local_page()

2022-08-15 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915_gem.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915_gem.c

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/i915_gem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 702e5b89be22..43effce60e1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -212,14 +212,14 @@ shmem_pread(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
if (needs_clflush)
drm_clflush_virt_range(vaddr + offset, len);
 
ret = __copy_to_user(user_data, vaddr + offset, len);
 
-   kunmap(page);
+   kunmap_local(vaddr);
 
return ret ? -EFAULT : 0;
 }
@@ -634,7 +634,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
char *vaddr;
int ret;
 
-   vaddr = kmap(page);
+   vaddr = kmap_local_page(page);
 
if (needs_clflush_before)
drm_clflush_virt_range(vaddr + offset, len);
@@ -643,7 +643,7 @@ shmem_pwrite(struct page *page, int offset, int len, char 
__user *user_data,
if (!ret && needs_clflush_after)
drm_clflush_virt_range(vaddr + offset, len);
 
-   kunmap(page);
+   kunmap_local(vaddr);
 
return ret ? -EFAULT : 0;
 }
-- 
2.37.1



[Intel-gfx] [PATCH 3/3] drm/i915/gem: Replace kmap() with kmap_local_page()

2022-08-15 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915/gem is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gem.

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 4eed3dd90ba8..2bc6ab9964ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -640,16 +640,14 @@ i915_gem_object_create_shmem_from_data(struct 
drm_i915_private *dev_priv,
do {
unsigned int len = min_t(typeof(size), size, PAGE_SIZE);
struct page *page;
-   void *pgdata, *vaddr;
+   void *pgdata;
 
err = aops->write_begin(file, file->f_mapping, offset, len,
, );
if (err < 0)
goto fail;
 
-   vaddr = kmap(page);
-   memcpy(vaddr, data, len);
-   kunmap(page);
+   memcpy_to_page(page, 0, data, len);
 
err = aops->write_end(file, file->f_mapping, offset, len, len,
  page, pgdata);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ced9948a331..bb25b50b5688 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -153,7 +153,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
intel_gt_flush_ggtt_writes(to_gt(i915));
 
p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, 
row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to 
page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -171,7 +171,7 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);
 
 out:
i915_gem_object_lock(obj, NULL);
@@ -249,7 +249,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
intel_gt_flush_ggtt_writes(to_gt(i915));
 
p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-   cpu = kmap(p) + offset_in_page(offset);
+   cpu = kmap_local_page(p) + offset_in_page(offset);
drm_clflush_virt_range(cpu, sizeof(*cpu));
if (*cpu != (u32)page) {
pr_err("Partial view for %lu [%u] (offset=%llu, size=%u 
[%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected 
write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -267,7 +267,7 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
}
*cpu = 0;
drm_clflush_virt_range(cpu, sizeof(*cpu));
-   kunmap(p);
+   kunmap_local(cpu);
if (err)
return err;
 
-- 
2.37.1



[Intel-gfx] [PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page()

2022-08-15 Thread Fabio M. De Francesco
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in drm/i915 is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in drm/i915.

These changes should be tested in an 32 bits system, booting a kernel
with HIGHMEM enabled. Unfortunately I have no i915 based hardware,
therefore any help with testing would be greatly appreciated.

Suggested-by: Ira Weiny 
Signed-off-by: Fabio M. De Francesco 

Fabio M. De Francesco (3):
  drm/i915: Replace kmap() with kmap_local_page()
  drm/i915/gt: Replace kmap() with kmap_local_page()
  drm/i915/gem: Replace kmap() with kmap_local_page()

 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  |  6 ++
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |  8 
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c   |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c  | 11 ---
 drivers/gpu/drm/i915/i915_gem.c|  8 
 5 files changed, 16 insertions(+), 21 deletions(-)

-- 
2.37.1