Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-10-18 Thread Zhao Liu
Hi Rodrigo and Tvrtko,

It seems this series missed v6.5.
This work should not be forgotten, so let me rebase and refresh it.

Regards,
Zhao

On Mon, Apr 17, 2023 at 10:53:28AM -0400, Rodrigo Vivi wrote:
> 
> On Mon, Apr 17, 2023 at 12:24:45PM +0100, Tvrtko Ursulin wrote:
> > 
> > On 14/04/2023 11:45, Zhao Liu wrote:
> > > Hi Tvrtko,
> > > 
> > > On Wed, Apr 12, 2023 at 04:45:13PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > [snip]
> > > 
> > > > > 
> > > > > [snip]
> > > > > > > However I am unsure if disabling pagefaulting is needed or not.
> > > > > > > Thomas, Matt, being the last to touch this area, perhaps you could
> > > > > > > have a look? Because I notice we have a fallback iomap path which
> > > > > > > still uses io_mapping_map_atomic_wc. So if kmap_atomic to kmap_local
> > > > > > > conversion is safe, does the iomap side also need converting to
> > > > > > > io_mapping_map_local_wc? Or do they have separate requirements?
> > > > > 
> > > > > > AFAIK, the requirements for io_mapping_map_local_wc() are the same as
> > > > > > for kmap_local_page(): the kernel virtual address is _only_ valid in
> > > > > > the caller context, and map/unmap nesting must be done in stack-based
> > > > > > ordering (LIFO).
> > > > > > 
> > > > > > I think a follow up patch could safely switch to
> > > > > > io_mapping_map_local_wc() / io_mapping_unmap_local() since the
> > > > > > address is local to context.
> > > > > > 
> > > > > > However, not being an expert, reading your note now I suspect that
> > > > > > I'm missing something. Can I ask why you think that page-faults
> > > > > > disabling might be necessary?
> > > > 
> > > > I am not saying it is, was just unsure and wanted some people who 
> > > > worked on this code most recently to take a look and confirm.
> > > > 
> > > > I guess it will work since the copying is done like this anyway:
> > > > 
> > > > /*
> > > >  * This is the fast path and we cannot handle a 
> > > > pagefault
> > > >  * whilst holding the struct mutex lest the user pass 
> > > > in the
> > > >  * relocations contained within a mmaped bo. For in 
> > > > such a case
> > > >  * we, the page fault handler would call 
> > > > i915_gem_fault() and
> > > >  * we would try to acquire the struct mutex again. 
> > > > Obviously
> > > >  * this is bad and so lockdep complains vehemently.
> > > >  */
> > > > pagefault_disable();
> > > > copied = __copy_from_user_inatomic(r, urelocs, count * 
> > > > sizeof(r[0]));
> > > > pagefault_enable();
> > > > if (unlikely(copied)) {
> > > > remain = -EFAULT;
> > > > goto out;
> > > > }
> > > > 
> > > > > Comment is a bit outdated since we don't use that global "struct mutex"
> > > > > any longer, but in any case, if there is a page fault on the mapping
> > > > > where we need to recurse into i915 again to satisfy it, we seem to have
> > > > > code already to handle it. So kmap_local conversion I *think* can't
> > > > > regress anything.
> > > 
> > > Thanks for your explanation!
> > > 
> > > > 
> > > > Patch to convert the io_mapping_map_atomic_wc can indeed come later.
> > > 
> > > Okay, I will also look at this.
> > > 
> > > > 
> > > > > In terms of logistics - if we landed this series to our branch it would
> > > > > be queued only for 6.5. Would that work for you?
> > > 
> > > Yeah, it's ok for me. But could I ask, did I miss the 6.4 merge time?
> > 
> > Yes, but just because we failed to review and merge in time, not because you
> > did not provide patches in time.
> 
> It is worth mentioning that under drm we close the merge window earlier.
> Around -rc5.
> 
> So, Linus' merge window for 6.4 didn't happen yet. But our drm-next that
> is going to be sent there is already closed.
> 
> > 
> > Regards,
> > 
> > Tvrtko
> > 
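
Editorial illustration of the LIFO rule discussed in the quoted thread
(the kernel virtual address is only valid in the caller context, and
map/unmap nesting must follow stack ordering). This is a minimal userspace
sketch, not kernel code: every sketch_* name and the fixed-depth stack are
hypothetical and only model the ordering requirement, not real mappings.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NESTING 16

/* Hypothetical per-thread stack of live local mappings. */
struct sketch_map_stack {
	const void *slot[SKETCH_MAX_NESTING];
	size_t depth;
};

/* "Map" a page: push it and hand back its address. */
static const void *sketch_map_local(struct sketch_map_stack *s,
				    const void *page)
{
	assert(s->depth < SKETCH_MAX_NESTING);
	s->slot[s->depth++] = page;
	return page;
}

/*
 * "Unmap" an address. Returns 0 on success and -1 when the address is
 * not the most recently mapped one, i.e. when LIFO order is violated.
 */
static int sketch_unmap_local(struct sketch_map_stack *s, const void *addr)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != addr)
		return -1;
	s->depth--;
	return 0;
}
```

With two nested mappings a and b (mapped in that order), unmapping a first
is rejected by the sketch; unmapping b and then a succeeds.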


[PATCH 7/9] drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gt/uc/intel_uc_fw.c, the function intel_uc_fw_copy_rsa()
just uses the mapping to do a memory copy, so it doesn't need to disable
page faults and preemption for the mapping. Thus a local mapping without
atomic context (page faults and preemption not disabled) is enough.

Therefore, intel_uc_fw_copy_rsa() is a function where the use of
memcpy_from_page() with kmap_local_page() in place of memcpy() with
kmap_atomic() is correctly suited.

Convert the calls of memcpy() with kmap_atomic() / kunmap_atomic() to
memcpy_from_page() which uses local mapping to copy.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com/T/#u

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Ira: Referred to his task document and suggestions about using
   memcpy_from_page() directly.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index b91ad4aede1f..64d56f175d32 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -962,16 +962,13 @@ size_t intel_uc_fw_copy_rsa(struct intel_uc_fw *uc_fw, void *dst, u32 max_len)
 
for_each_sgt_page(page, iter, uc_fw->obj->mm.pages) {
u32 len = min_t(u32, size, PAGE_SIZE - offset);
-   void *vaddr;
 
if (idx > 0) {
idx--;
continue;
}
 
-   vaddr = kmap_atomic(page);
-   memcpy(dst, vaddr + offset, len);
-   kunmap_atomic(vaddr);
+   memcpy_from_page(dst, page, offset, len);
 
offset = 0;
dst += len;
-- 
2.34.1
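
Editorial illustration of the page-wise copy loop touched by the patch
above. The sketch below is userspace C under stated assumptions: pages are
modelled as plain buffers, and sketch_memcpy_from_page() is a hypothetical
stand-in for memcpy_from_page() that needs no mapping at all. It only shows
how len = min(size, PAGE_SIZE - offset) keeps each copy inside one page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for a struct page: just a page-sized buffer. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

/* Copy len bytes starting at offset within a single page. */
static void sketch_memcpy_from_page(void *dst, const struct sketch_page *page,
				    size_t offset, size_t len)
{
	/* A single copy must never run past the end of the page. */
	assert(offset <= SKETCH_PAGE_SIZE && len <= SKETCH_PAGE_SIZE - offset);
	memcpy(dst, page->data + offset, len);
}

/*
 * Copy size bytes beginning at byte position start of a page array,
 * one page-bounded chunk at a time. Returns the number of bytes copied.
 */
static size_t sketch_copy_range(void *dst, const struct sketch_page *pages,
				size_t npages, size_t start, size_t size)
{
	unsigned char *out = dst;
	size_t idx = start / SKETCH_PAGE_SIZE;
	size_t offset = start % SKETCH_PAGE_SIZE;
	size_t copied = 0;

	while (size && idx < npages) {
		size_t room = SKETCH_PAGE_SIZE - offset;
		size_t len = size < room ? size : room;

		sketch_memcpy_from_page(out, &pages[idx], offset, len);

		offset = 0;	/* later pages are read from their start */
		out += len;
		size -= len;
		copied += len;
		idx++;
	}
	return copied;
}
```

A request that starts four bytes before a page boundary and asks for eight
bytes is served by two chunks of four bytes each, one from each page.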



[PATCH 8/9] drm/i915: Use kmap_local_page() in i915_cmd_parser.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

There are two reasons why copy_batch() doesn't need to disable
page faults and preemption for the mapping:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. In i915_cmd_parser.c, the function copy_batch() calls
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each CPU
in drm_clflush_virt_range(), the flush operation is global and any
issue with CPUs being added or removed can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, copy_batch() is a function where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index f93e6122f247..1a56000d7476 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1211,11 +1211,11 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
for (n = offset >> PAGE_SHIFT; remain; n++) {
int len = min(remain, PAGE_SIZE - x);
 
-   src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
+   src = kmap_local_page(i915_gem_object_get_page(src_obj, n));
if (src_needs_clflush)
drm_clflush_virt_range(src + x, len);
memcpy(ptr, src + x, len);
-   kunmap_atomic(src);
+   kunmap_local(src);
 
ptr += len;
remain -= len;
-- 
2.34.1



[PATCH 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
kmap_atomic() in eb_relocate_entry(), and is unmapped by
kunmap_atomic() in reloc_cache_reset().

And this mapping/unmapping occurs in two places: one is in
eb_relocate_vma(), and another is in eb_relocate_vma_slow().

The functions eb_relocate_vma() and eb_relocate_vma_slow() don't
need to disable page faults and preemption during the above
mapping / unmapping.

So they can simply use kmap_local_page() / kunmap_local(), which can
do the mapping / unmapping regardless of the context.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Ira Weiny 
Signed-off-by: Zhao Liu 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 845023c14eb3..8263d4e6620a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1110,7 +1110,7 @@ static void reloc_cache_unmap(struct reloc_cache *cache)
 
vaddr = unmask_page(cache->vaddr);
if (cache->vaddr & KMAP)
-   kunmap_atomic(vaddr);
+   kunmap_local(vaddr);
else
io_mapping_unmap_atomic((void __iomem *)vaddr);
 }
@@ -1126,7 +1126,7 @@ static void reloc_cache_remap(struct reloc_cache *cache,
if (cache->vaddr & KMAP) {
struct page *page = i915_gem_object_get_page(obj, cache->page);
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
cache->vaddr = unmask_flags(cache->vaddr) |
(unsigned long)vaddr;
} else {
@@ -1156,7 +1156,7 @@ static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer
if (cache->vaddr & CLFLUSH_AFTER)
mb();
 
-   kunmap_atomic(vaddr);
+   kunmap_local(vaddr);
i915_gem_object_finish_access(obj);
} else {
struct i915_ggtt *ggtt = cache_to_ggtt(cache);
@@ -1188,7 +1188,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
struct page *page;
 
if (cache->vaddr) {
-   kunmap_atomic(unmask_page(cache->vaddr));
+   kunmap_local(unmask_page(cache->vaddr));
} else {
unsigned int flushes;
int err;
@@ -1210,7 +1210,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
if (!obj->mm.dirty)
set_page_dirty(page);
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
cache->vaddr = unmask_flags(cache->vaddr) | (unsigned long)vaddr;
cache->page = pageno;
 
-- 
2.34.1



[PATCH 6/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gem/selftests/i915_gem_context.c, the functions cpu_fill()
and cpu_check() mainly use the mapping to flush the cache and
check/assign the value.

There are two reasons why cpu_fill() and cpu_check() don't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. cpu_fill() and cpu_check() call drm_clflush_virt_range() to
use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
and WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global and any issue with CPUs being added or removed
can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_fill() and cpu_check() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index c6ad67b90e8a..736337f23f78 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -466,12 +466,12 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
for (n = 0; n < real_page_count(obj); n++) {
u32 *map;
 
-   map = kmap_atomic(i915_gem_object_get_page(obj, n));
+   map = kmap_local_page(i915_gem_object_get_page(obj, n));
for (m = 0; m < DW_PER_PAGE; m++)
map[m] = value;
if (!has_llc)
drm_clflush_virt_range(map, PAGE_SIZE);
-   kunmap_atomic(map);
+   kunmap_local(map);
}
 
i915_gem_object_finish_access(obj);
@@ -496,7 +496,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
for (n = 0; n < real_page_count(obj); n++) {
u32 *map;
 
-   map = kmap_atomic(i915_gem_object_get_page(obj, n));
+   map = kmap_local_page(i915_gem_object_get_page(obj, n));
if (needs_flush & CLFLUSH_BEFORE)
drm_clflush_virt_range(map, PAGE_SIZE);
 
@@ -522,7 +522,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
}
 
 out_unmap:
-   kunmap_atomic(map);
+   kunmap_local(map);
if (err)
break;
}
-- 
2.34.1



[PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

There are two reasons why i915_gem_object_read_from_page_kmap() doesn't
need to disable page faults and preemption for the mapping:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. In drm/i915/gem/i915_gem_object.c, the function
i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range() to
use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
and WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global and any issue with CPUs being added or removed
can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_read_from_page_kmap() is a function where
the use of kmap_local_page() in place of kmap_atomic() is correctly
suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

And remove the redundant variable that stores the address of the mapped
page since kunmap_local() can accept any pointer within the page.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 369006c5317f..a0072abed75e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -413,17 +413,15 @@ void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 static void
 i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size)
 {
-   void *src_map;
void *src_ptr;
 
-   src_map = kmap_atomic(i915_gem_object_get_page(obj, offset >> PAGE_SHIFT));
-
-   src_ptr = src_map + offset_in_page(offset);
+   src_ptr = kmap_local_page(i915_gem_object_get_page(obj, offset >> PAGE_SHIFT))
+ + offset_in_page(offset);
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
drm_clflush_virt_range(src_ptr, size);
memcpy(dst, src_ptr, size);
 
-   kunmap_atomic(src_map);
+   kunmap_local(src_ptr);
 }
 
 static void
-- 
2.34.1
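
Editorial illustration of why the patch above can drop src_map: the commit
message notes that kunmap_local() accepts any pointer within the page. The
userspace sketch below models that with plain address arithmetic. All
sketch_* names are hypothetical, addresses are ordinary integers rather
than real mappings, and a 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round an address down to the start of its page. */
static uintptr_t sketch_page_base(uintptr_t addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Byte offset of a 64-bit object offset within its page. */
static uintptr_t sketch_offset_in_page(uint64_t offset)
{
	return (uintptr_t)(offset & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Model of an unmap that accepts any address inside the mapped page:
 * it recovers the page base itself, so the caller does not have to keep
 * the pointer originally returned by the map call.
 */
static uintptr_t sketch_unmap(uintptr_t mapped_base, uintptr_t addr)
{
	uintptr_t base = sketch_page_base(addr);

	assert(base == mapped_base);	/* addr must lie in the mapped page */
	return base;
}
```

Because the base can always be recovered from an interior pointer, keeping
only src_ptr (base plus offset in page) is sufficient in this model.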



[PATCH 5/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gem/selftests/i915_gem_coherency.c, the functions cpu_set()
and cpu_get() mainly use the mapping to flush the cache and assign the
value. There are two reasons why cpu_set() and cpu_get() don't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. cpu_set() and cpu_get() call drm_clflush_virt_range() to use
CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and
WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global and any issue with CPUs being added or removed
can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_set() and cpu_get() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index a666d7e610f5..b12402c74424 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -24,7 +24,6 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
 {
unsigned int needs_clflush;
struct page *page;
-   void *map;
u32 *cpu;
int err;
 
@@ -34,8 +33,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
goto out;
 
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-   map = kmap_atomic(page);
-   cpu = map + offset_in_page(offset);
+   cpu = kmap_local_page(page) + offset_in_page(offset);
 
if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));
@@ -45,7 +43,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
if (needs_clflush & CLFLUSH_AFTER)
drm_clflush_virt_range(cpu, sizeof(*cpu));
 
-   kunmap_atomic(map);
+   kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);
 
 out:
@@ -57,7 +55,6 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
 {
unsigned int needs_clflush;
struct page *page;
-   void *map;
u32 *cpu;
int err;
 
@@ -67,15 +64,14 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
goto out;
 
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-   map = kmap_atomic(page);
-   cpu = map + offset_in_page(offset);
+   cpu = kmap_local_page(page) + offset_in_page(offset);
 
if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));
 
*v = *cpu;
 
-   kunmap_atomic(map);
+   kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);
 
 out:
-- 
2.34.1



[PATCH v3] x86/hyperv: Replace kmap() with kmap_local_page()

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

kmap() is being deprecated in favor of kmap_local_page()[1].

There are two main problems with kmap(): (1) It comes with an overhead as 
mapping space is restricted and protected by a global lock for synchronization 
and (2) it also requires global TLB invalidation when the kmap's pool wraps and 
it might block when the mapping space is fully utilized until a slot becomes 
available.

With kmap_local_page() the mappings are per thread, CPU local, can take page 
faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the 
tasks can be preempted and, when they are scheduled to run again, the kernel 
virtual addresses are restored and are still valid.

Since its use in hyperv/hv_init.c is safe, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in hyperv/hv_init.c.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 

---
Suggested by credits.
Ira: Referred to his task documentation and review comments.
Fabio: Stole some of his boiler plate commit message.

---
Changelog:
v2:
- Fix wrong incoming parameters in kunmap_local();
- Add Fabio as suggester since I quoted his commit message.

---
 arch/x86/hyperv/hv_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 3de6d8b53367..72fe46eb183f 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -459,13 +459,13 @@ void __init hyperv_init(void)
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

pg = vmalloc_to_page(hv_hypercall_pg);
-   dst = kmap(pg);
+   dst = kmap_local_page(pg);
src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
MEMREMAP_WB);
BUG_ON(!(src && dst));
memcpy(dst, src, HV_HYP_PAGE_SIZE);
memunmap(src);
-   kunmap(pg);
+   kunmap_local(dst);
} else {
hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
--
2.34.1



[PATCH 3/9] drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gem/i915_gem_shmem.c, the function shmem_pwrite() needs to
disable page faults to eliminate the potential recursive fault[2]. But
here __copy_from_user_inatomic() doesn't need preemption to be disabled,
and the local mapping remains valid across scheduling in/out.

So it can use kmap_local_page() / kunmap_local() with
pagefault_disable() / pagefault_enable() to replace the atomic mapping.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
[2]: https://patchwork.freedesktop.org/patch/295840/

Suggested-by: Ira Weiny 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Ira: Referred to his suggestions about keeping pagefault_disable().
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index f42ca1179f37..e279a3e30c02 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -472,11 +472,13 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
if (err < 0)
return err;
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
+   pagefault_disable();
unwritten = __copy_from_user_inatomic(vaddr + pg,
  user_data,
  len);
-   kunmap_atomic(vaddr);
+   pagefault_enable();
+   kunmap_local(vaddr);
 
err = aops->write_end(obj->base.filp, mapping, offset, len,
  len - unwritten, page, data);
-- 
2.34.1
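
Editorial illustration of the "unwritten" accounting in the hunk above.
__copy_from_user_inatomic() returns the number of bytes it could NOT copy,
and the caller passes len - unwritten onward. The userspace sketch below
models only that return convention: sketch_copy_inatomic() and its
readable parameter (how many source bytes can be read before a simulated
fault) are hypothetical and do not involve real page faults.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a copy that may stop early: at most
 * `readable` bytes of src can be read before a simulated fault.
 * Returns the number of bytes that were NOT copied (0 on full success).
 */
static size_t sketch_copy_inatomic(void *dst, const void *src,
				   size_t len, size_t readable)
{
	size_t done = len < readable ? len : readable;

	memcpy(dst, src, done);
	return len - done;
}

/* Bytes actually written, as derived by the caller from the return value. */
static size_t sketch_bytes_written(size_t len, size_t unwritten)
{
	assert(unwritten <= len);
	return len - unwritten;
}
```

For an 8-byte request where only 5 bytes are readable, the copy reports 3
bytes unwritten and the caller accounts for 5 bytes written.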



Re: [PATCH v3] x86/hyperv: Replace kmap() with kmap_local_page()

2022-10-18 Thread Zhao Liu
Sorry, please ignore the last hyperv patch, I made a mistake.

Zhao

On Mon, Oct 17, 2022 at 05:37:26PM +0800, Zhao Liu wrote:
> 
> From: Zhao Liu 
> 
> kmap() is being deprecated in favor of kmap_local_page()[1].
> 
> There are two main problems with kmap(): (1) It comes with an overhead as 
> mapping space is restricted and protected by a global lock for 
> synchronization and (2) it also requires global TLB invalidation when the 
> kmap's pool wraps and it might block when the mapping space is fully utilized 
> until a slot becomes available.
> 
> With kmap_local_page() the mappings are per thread, CPU local, can take page 
> faults, and can be called from any context (including interrupts).
> It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the 
> tasks can be preempted and, when they are scheduled to run again, the kernel 
> virtual addresses are restored and are still valid.
> 
> Since its use in hyperv/hv_init.c is safe, it should be preferred.
> 
> Therefore, replace kmap() with kmap_local_page() in hyperv/hv_init.c.
> 
> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> 
> Suggested-by: Ira Weiny 
> Suggested-by: Fabio M. De Francesco 
> Signed-off-by: Zhao Liu 
> 
> ---
> Suggested by credits.
> Ira: Referred to his task documentation and review comments.
> Fabio: Stole some of his boiler plate commit message.
> 
> ---
> Changelog:
> v2:
> - Fix wrong incoming parameters in kunmap_local();
> - Add Fabio as suggester since I quoted his commit message.
> 
> ---
>  arch/x86/hyperv/hv_init.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 3de6d8b53367..72fe46eb183f 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -459,13 +459,13 @@ void __init hyperv_init(void)
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> 
> pg = vmalloc_to_page(hv_hypercall_pg);
> -   dst = kmap(pg);
> +   dst = kmap_local_page(pg);
> src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
> MEMREMAP_WB);
> BUG_ON(!(src && dst));
> memcpy(dst, src, HV_HYP_PAGE_SIZE);
> memunmap(src);
> -   kunmap(pg);
> +   kunmap_local(dst);
> } else {
> hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> --
> 2.34.1
> 


[PATCH 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The following patches convert the calls of kmap_atomic() /
kunmap_atomic() to kmap_local_page() / kunmap_local(), which can
do the mapping / unmapping regardless of the context.

With kmap_local_page(), the mapping is per thread, CPU local and not
globally visible.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
---
Zhao Liu (9):
  drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_phys.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
  drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
  drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
  drm/i915: Use kmap_local_page() in i915_cmd_parser.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c       | 10 +++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.c           |  8 +++-----
 drivers/gpu/drm/i915/gem/i915_gem_phys.c             |  8 ++++----
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c            |  6 ++++--
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c      |  6 +++---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 ++++--------
 .../gpu/drm/i915/gem/selftests/i915_gem_context.c    |  8 ++++----
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c             |  5 +----
 drivers/gpu/drm/i915/i915_cmd_parser.c               |  4 ++--
 9 files changed, 30 insertions(+), 37 deletions(-)

-- 
2.34.1



[PATCH 2/9] drm/i915: Use kmap_local_page() in gem/i915_gem_phys.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gem/i915_gem_phys.c, the functions
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
don't need to disable page faults and preemption for the mapping, for
these two reasons:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. In drm/i915/gem/i915_gem_phys.c, the functions
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
call drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global and any issue
with CPUs being added or removed can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_get_pages_phys() and
i915_gem_object_put_pages_phys() are two functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 0d0e46dae559..d602ba19ecb2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -66,10 +66,10 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
if (IS_ERR(page))
goto err_st;
 
-   src = kmap_atomic(page);
+   src = kmap_local_page(page);
memcpy(dst, src, PAGE_SIZE);
drm_clflush_virt_range(dst, PAGE_SIZE);
-   kunmap_atomic(src);
+   kunmap_local(src);
 
put_page(page);
dst += PAGE_SIZE;
@@ -114,10 +114,10 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
if (IS_ERR(page))
continue;
 
-   dst = kmap_atomic(page);
+   dst = kmap_local_page(page);
drm_clflush_virt_range(src, PAGE_SIZE);
memcpy(dst, src, PAGE_SIZE);
-   kunmap_atomic(dst);
+   kunmap_local(dst);
 
set_page_dirty(page);
if (obj->mm.madv == I915_MADV_WILLNEED)
-- 
2.34.1



[PATCH 4/9] drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c

2022-10-18 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

In drm/i915/gem/selftests/huge_pages.c, __cpu_check_shmem() mainly
uses the mapping to flush the cache and check the value. There are two
reasons why __cpu_check_shmem() doesn't need to disable page faults
and preemption for the mapping:

1. The flush operation is safe for CPU hotplug when preemption is not
disabled. __cpu_check_shmem() calls drm_clflush_virt_range(), which
flushes with CLFLUSHOPT or WBINVD. Since CLFLUSHOPT is global on x86
and WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global and any issue with CPUs being added or removed
can be handled safely.

2. Any context switch caused by preemption or sleep (a page fault may
cause sleep) doesn't affect the validity of the local mapping.

Therefore, __cpu_check_shmem() is a function where kmap_local_page()
is a suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation about
   cache flush.
  Fabio: Referred to his boiler plate commit message.
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index c570cf780079..6f4efe905105 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1022,7 +1022,7 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
goto err_unlock;
 
for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
-   u32 *ptr = kmap_atomic(i915_gem_object_get_page(obj, n));
+   u32 *ptr = kmap_local_page(i915_gem_object_get_page(obj, n));
 
if (needs_flush & CLFLUSH_BEFORE)
drm_clflush_virt_range(ptr, PAGE_SIZE);
@@ -1030,12 +1030,12 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
if (ptr[dword] != val) {
pr_err("n=%lu ptr[%u]=%u, val=%u\n",
   n, dword, ptr[dword], val);
-   kunmap_atomic(ptr);
+   kunmap_local(ptr);
err = -EINVAL;
break;
}
 
-   kunmap_atomic(ptr);
+   kunmap_local(ptr);
}
 
i915_gem_object_finish_access(obj);
-- 
2.34.1



Re: [PATCH 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2022-11-05 Thread Zhao Liu
On Sat, Oct 29, 2022 at 09:12:27AM +0200, Fabio M. De Francesco wrote:
> Date: Sat, 29 Oct 2022 09:12:27 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH 0/9] drm/i915: Replace kmap_atomic() with
>  kmap_local_page()

Hi Fabio, thanks for your review!! (I'm sorry I missed the previous mails).

> 
> On Monday 17 October 2022 11:37:16 CEST Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > The use of kmap_atomic() is being deprecated in favor of
> > kmap_local_page()[1].
> 
> Some words to explain why kmap_atomic was deprecated won't hurt. Many 
> maintainers and reviewers, and also casual readers might not yet be aware of 
> the reasons behind that deprecation.
>  
> > In the following patches, we can convert the calls of kmap_atomic() /
> > kunmap_atomic() to kmap_local_page() / kunmap_local(), which can
> > instead do the mapping / unmapping regardless of the context.
> 
> Readers are probably much more interested in what you did in the following 
> patches and why you did it, instead of being informed about what "we can" do.
> 
> I would suggest something like "The following patches convert the calls to 
> kmap_atomic() to kmap_local_page() [the rest looks OK]".
> 
> This could also be the place to say something about why we prefer 
> kmap_local_page() to kmap_atomic(). 
> 
> Are you sure that the reasons that motivate your conversions can be 
> summarized merely as kmap_local_page() being able to do mappings regardless of 
> context? I think you are missing the real reasons why. 

Thanks for your reminder, I'll emphasize the motivation here.

> What about avoiding the often unwanted side effect of unnecessary page faults 
> disables?

Good suggestion! I'll add this to the cover letter.

I think we have two reasons to do the replacement work:
1. (main motivation) Avoid unnecessarily disabling page faults and
preemption, to gain performance benefits.
2. We are trying to deprecate the old kmap/kmap_atomic interface. One
maintainer said this is also a good reason, especially where
performance is not critical [1].

In addition, also from [1], I find that in some cases people choose
kmap_atomic() because they want the atomic context. So, is the
explanation of why the atomic context is not needed also a reason? I
understand that I need to give a specific explanation in each commit,
depending on the situation (in which case it is not suitable to
describe it in the cover letter?).

[1]: https://lore.kernel.org/lkml/YzRVaJA0EyfcVisW@liuwe-devbox-debian-v2/#t

> 
> > 
> > With kmap_local_page(), the mapping is per thread, CPU local and not
> > globally visible.
> 
> No news here. kmap_atomic() is "per thread, CPU local and not globally 
> visible". I cannot see any difference here between kmap_atomic() and 
> kmap_local_page().

What about the description below, which refers to your doc?
"kmap_atomic() in the kernel creates a non-preemptible section
and disables page faults. This can be a source of unwanted latency.
kmap_local_page() avoids this issue because it doesn't
disable page faults or preemption."

Thanks,
Zhao
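
As a rough illustration of the description quoted above, the side effects
can be modelled in plain userspace C. This is only a sketch: the model_*
names and the struct are hypothetical and are not kernel APIs, and the
branch on preempt_rt follows my reading of kmap_atomic_prot() (migration
disabled on PREEMPT_RT, preemption disabled otherwise, page faults disabled
in both cases). Any migrate_disable() that kmap_local_page() may do for
highmem pages is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	int preempt_disabled;   /* nesting depth of preempt_disable() */
	int migrate_disabled;   /* nesting depth of migrate_disable() */
	int pagefault_disabled; /* nesting depth of pagefault_disable() */
};

/* Models the side effects of kmap_atomic(): an atomic section begins. */
static void model_kmap_atomic(struct model_task *t, bool preempt_rt)
{
	if (preempt_rt)
		t->migrate_disabled++;
	else
		t->preempt_disabled++;
	t->pagefault_disabled++;
}

/* Models kunmap_atomic(): undo the side effects in reverse order. */
static void model_kunmap_atomic(struct model_task *t, bool preempt_rt)
{
	t->pagefault_disabled--;
	if (preempt_rt)
		t->migrate_disabled--;
	else
		t->preempt_disabled--;
}

/* Models kmap_local_page(): neither counter modelled here is touched. */
static void model_kmap_local_page(struct model_task *t)
{
	(void)t;
}

/* True if a page fault taken now could be handled by sleeping. */
static bool model_fault_can_sleep(const struct model_task *t)
{
	return t->preempt_disabled == 0 && t->pagefault_disabled == 0;
}
```

In this model a task inside model_kmap_atomic() can never take a sleeping
fault, while a task that only called model_kmap_local_page() still can.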



Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-11-05 Thread Zhao Liu
On Thu, Nov 03, 2022 at 08:22:04PM +0100, Fabio M. De Francesco wrote:
> Date: Thu, 03 Nov 2022 20:22:04 +0100
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_object.c
> 
> On Thursday 3 November 2022 17:51:23 CET Ira Weiny wrote:
> > On Sat, Oct 29, 2022 at 01:17:03PM +0200, Fabio M. De Francesco wrote:
> > > On Monday 17 October 2022 11:37:17 CEST Zhao Liu wrote:
> > > > From: Zhao Liu 
> > > > 
> > > > The use of kmap_atomic() is being deprecated in favor of
> > > > kmap_local_page()[1].
> > > > 
> > > > The main difference between atomic and local mappings is that local
> > > > mappings doesn't disable page faults or preemption.
> > > 
> > > You are right about page faults which are never disabled by
> > > kmap_local_page(). However kmap_atomic might not disable preemption. It
> > > depends on CONFIG_PREEMPT_RT.
> > > 
> > > Please refer to how kmap_atomic_prot() works (this function is called by
> > > kmap_atomic() when kernels have HIGHMEM enabled).
> > > 
> > > > There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> > > > need to disable pagefaults and preemption for mapping:
> > > > 
> > > > 1. The flush operation is safe for CPU hotplug when preemption is not
> > > > disabled.
> > > 
> > > I'm confused here. Why are you talking about CPU hotplug?
> > 
> > I agree with Fabio here.  I'm not making the connection between CPU hotplug
> > and this code path.
> > 
> > Ira
> 
> @Zhao,
> 
> I'd like to add that I was about to add my Reviewed-by tag. The other things
> I objected to are minor nits. Please just clarify this connection.

Thanks Fabio for your comments! Sorry I missed the mails that day. This
connection was my misunderstanding. For my other thoughts, please refer to
my reply to your first email in this thread.

Thanks,
Zhao



Re: [PATCH 2/9] drm/i915: Use kmap_local_page() in gem/i915_gem_pyhs.c

2022-11-05 Thread Zhao Liu
On Sat, Oct 29, 2022 at 03:32:08PM +0200, Fabio M. De Francesco wrote:
> Date: Sat, 29 Oct 2022 15:32:08 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH 2/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_pyhs.c
> 
> On Monday 17 October 2022 11:37:18 CEST Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > The use of kmap_atomic() is being deprecated in favor of
> > kmap_local_page()[1].
> > 
> > The main difference between atomic and local mappings is that local
> > mappings doesn't disable page faults or preemption.
> > 
> > In drm/i915/gem/i915_gem_phys.c, the functions
> > i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
> > don't need to disable pagefaults and preemption for mapping because of
> > these 2 reasons:
> > 
> > 1. The flush operation is safe for CPU hotplug when preemption is not
> > disabled. In drm/i915/gem/i915_gem_object.c, the functions
> > i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
> > calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
> > Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
> > drm_clflush_virt_range(), the flush operation is global and any issue
> > with cpu's being added or removed can be handled safely.
> > 
> > 2. Any context switch caused by preemption or sleep (pagefault may
> > cause sleep) doesn't affect the validity of local mapping.
> > 
> > Therefore, i915_gem_object_get_pages_phys() and
> > i915_gem_object_put_pages_phys() are two functions where the use of
> > kmap_local_page() in place of kmap_atomic() is correctly suited.
> > 
> > Convert the calls of kmap_atomic() / kunmap_atomic() to
> > kmap_local_page() / kunmap_local().
> > 
> 
> I have here the same questions as in 1/9.
> 
> > [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> > 
> > Suggested-by: Dave Hansen 
> > Suggested-by: Ira Weiny 
> > Suggested-by: Fabio M. De Francesco 
> > Signed-off-by: Zhao Liu 
> > ---
> > Suggested by credits:
> >   Dave: Referred to his explanation about cache flush.
> >   Ira: Referred to his task document, review comments and explanation about
> >cache flush.
> >   Fabio: Referred to his boiler plate commit message.
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> > index 0d0e46dae559..d602ba19ecb2 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> > @@ -66,10 +66,10 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
> > if (IS_ERR(page))
> > goto err_st;
> > 
> > -   src = kmap_atomic(page);
> > +   src = kmap_local_page(page);
> > memcpy(dst, src, PAGE_SIZE);
> > drm_clflush_virt_range(dst, PAGE_SIZE);
> > -   kunmap_atomic(src);
> > +   kunmap_local(src);
> 
> Please use memcpy_from_page() instead of open coding mapping + memcpy() + 
> unmapping.

Ok.

> 
> > 
> > put_page(page);
> > dst += PAGE_SIZE;
> > @@ -114,10 +114,10 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
> > if (IS_ERR(page))
> > continue;
> > 
> > -   dst = kmap_atomic(page);
> > +   dst = kmap_local_page(page);
> > drm_clflush_virt_range(src, PAGE_SIZE);
> > memcpy(dst, src, PAGE_SIZE);
> > -   kunmap_atomic(dst);
> > +   kunmap_local(dst);
> 
> For the same reasons said above, memcpy_to_page() should be used here and 
> avoid open coding of three functions.
> 
> Using those helpers forces you to move drm_clflush_virt_range() out of the 
> mapping / un-mapping region. I may be wrong, however I'm pretty sure that the 
> relative positions of each of those call sites is something that cannot be 
> randomly chosen.

I agree. Will use memcpy_to_page().

Thanks,
Zhao

> 
> Thanks,
> 
> Fabio
> 
> > 
> > set_page_dirty(page);
> > if (obj->mm.madv == I915_MADV_WILLNEED)
> 
> 
> 


Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-11-05 Thread Zhao Liu
On Sat, Oct 29, 2022 at 01:17:03PM +0200, Fabio M. De Francesco wrote:
> Date: Sat, 29 Oct 2022 13:17:03 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_object.c
> 
> On Monday 17 October 2022 11:37:17 CEST Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > The use of kmap_atomic() is being deprecated in favor of
> > kmap_local_page()[1].
> > 
> > The main difference between atomic and local mappings is that local
> > mappings doesn't disable page faults or preemption.
> 
> You are right about page faults which are never disabled by 
> kmap_local_page(). However kmap_atomic might not disable preemption. It 
> depends on CONFIG_PREEMPT_RT.
> 
> Please refer to how kmap_atomic_prot() works (this function is called by 
> kmap_atomic() when kernels have HIGHMEM enabled).

Yes, there is some ambiguity here. What about "The main difference between
atomic and local mappings is that local mappings never disable page faults
or preemption"?

> 
> > 
> > There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> > need to disable pagefaults and preemption for mapping:
> > 
> > 1. The flush operation is safe for CPU hotplug when preemption is not
> > disabled. 
> 
> I'm confused here. Why are you talking about CPU hotplug?
> In any case, developers should never rely on implicit calls of 
> preempt_disable() for the reasons said above. Therefore, flush operations 
> should be allowed regardless of that potential kmap_atomic() side effect.

Sorry, it's my fault; I misunderstood the connection between hotplug
and flush here. While a mapping exists, the CPU cannot be unplugged via
CPU hotplug. But plugging or unplugging has nothing to do with the flush.
I will delete this wrong description.

My initial concern was that this flush interface might require an atomic
context, so I wanted to explain, from the details of its implementation,
that cache consistency can be guaranteed without an atomic context. Is this
consideration redundant?
Also, do I need to state that migration is still OK for this flush interface
here (since __kmap_local_page_prot() doesn't always disable migration)?

> > In drm/i915/gem/i915_gem_object.c, the function
> > i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range()
> 
> If I recall correctly, drm_clflush_virt_range() can always be called with
> page faults and preemption enabled. If so, this is enough to say that the
> conversion is safe.
> 
> Is this code explicitly related to flushing the cache lines before removing /
> adding CPUs? If I recall correctly, there are several other reasons behind
> the need to issue cache line flushes. Am I wrong about this?
> 
> Can you please say more about what I'm missing here?
> 
> > to
> > use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
> > and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush
> > operation is global and any issue with cpu's being added or removed
> > can be handled safely.
> 
> Again your main concern is about CPU hotplug.
> 
> Even if I'm missing something, do we really need all these details about the 
> inner workings of drm_clflush_virt_range()? 
> 
> I'm not an expert, so it may be that I'm wrong about all I wrote above.
> 
> Therefore, can you please elaborate a little more for readers with very
> little knowledge of these kinds of things (like me and perhaps others)?
>  
> > 2. Any context switch caused by preemption or sleep (pagefault may
> > cause sleep) doesn't affect the validity of local mapping.
> 
> I'd replace "preemption or sleep" with "preemption and page faults" since 
> you yourself then added that page faults lead to tasks being put to sleep.  

Thanks, good advice.

Zhao



Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2022-11-05 Thread Zhao Liu
On Thu, Nov 03, 2022 at 09:51:23AM -0700, Ira Weiny wrote:
> Date: Thu, 3 Nov 2022 09:51:23 -0700
> From: Ira Weiny 
> Subject: Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_object.c
> 
> On Sat, Oct 29, 2022 at 01:17:03PM +0200, Fabio M. De Francesco wrote:
> > On Monday 17 October 2022 11:37:17 CEST Zhao Liu wrote:
> > > From: Zhao Liu 
> > > 
> > > The use of kmap_atomic() is being deprecated in favor of
> > > kmap_local_page()[1].
> > > 
> > > The main difference between atomic and local mappings is that local
> > > mappings doesn't disable page faults or preemption.
> > 
> > You are right about page faults which are never disabled by 
> > kmap_local_page(). However kmap_atomic might not disable preemption. It 
> > depends on CONFIG_PREEMPT_RT.
> > 
> > Please refer to how kmap_atomic_prot() works (this function is called by 
> > kmap_atomic() when kernels have HIGHMEM enabled).
> > 
> > > 
> > > There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> > > need to disable pagefaults and preemption for mapping:
> > > 
> > > 1. The flush operation is safe for CPU hotplug when preemption is not
> > > disabled. 
> > 
> > I'm confused here. Why are you talking about CPU hotplug?
> 
> I agree with Fabio here.  I'm not making the connection between CPU hotplug
> and this code path.

Sorry, my misunderstanding. Will delete this wrong explanation.

Thanks,
Zhao


Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-04-14 Thread Zhao Liu
Hi Tvrtko,

On Wed, Apr 12, 2023 at 04:45:13PM +0100, Tvrtko Ursulin wrote:

[snip]

> > 
> > [snip]
> > > However I am unsure if disabling pagefaulting is needed or not. Thomas,
> > > Matt, being the last to touch this area, perhaps you could have a look?
> > > Because I notice we have a fallback iomap path which still uses
> > > io_mapping_map_atomic_wc. So if kmap_atomic to kmap_local conversion is
> > > safe, does the iomap side also needs converting to
> > > io_mapping_map_local_wc? Or they have separate requirements?
> > 
> > AFAIK, the requirements for io_mapping_map_local_wc() are the same as for
> > kmap_local_page(): the kernel virtual address is _only_ valid in the caller
> > context, and map/unmap nesting must be done in stack-based ordering (LIFO).
> > 
> > I think a follow up patch could safely switch to io_mapping_map_local_wc() /
> > io_mapping_unmap_local_wc since the address is local to context.
> > 
> > However, not being an expert, reading your note now I suspect that I'm
> > missing something. Can I ask why you think that page-faults disabling
> > might be necessary?
> 
> I am not saying it is, was just unsure and wanted some people who worked on 
> this code most recently to take a look and confirm.
> 
> I guess it will work since the copying is done like this anyway:
> 
>   /*
>* This is the fast path and we cannot handle a pagefault
>* whilst holding the struct mutex lest the user pass in the
>* relocations contained within a mmaped bo. For in such a case
>* we, the page fault handler would call i915_gem_fault() and
>* we would try to acquire the struct mutex again. Obviously
>* this is bad and so lockdep complains vehemently.
>*/
>   pagefault_disable();
>   copied = __copy_from_user_inatomic(r, urelocs, count * sizeof(r[0]));
>   pagefault_enable();
>   if (unlikely(copied)) {
>   remain = -EFAULT;
>   goto out;
>   }
> 
> Comment is a bit outdated since we don't use that global "struct mutex" any 
> longer, but in any case, if there is a page fault on the mapping where we 
> need to recurse into i915 again to satisfy it, we seem to have code already 
> to handle it. So kmap_local conversion I *think* can't regress anything.

Thanks for your explanation!

> 
> Patch to convert the io_mapping_map_atomic_wc can indeed come later.

Okay, I will also look at this.

> 
> In terms of logistics - if we landed this series to our branch it would be 
> queued only for 6.5. Would that work for you?

Yes, that works for me. But could I ask, did I miss the 6.4 merge window?

Thanks,
Zhao

> 
> Regards,
> 
> Tvrtko


Re: [PATCH v2 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-03-30 Thread Zhao Liu
Hi Fabio,

On Wed, Mar 29, 2023 at 06:03:38PM +0200, Fabio M. De Francesco wrote:
> Date: Wed, 29 Mar 2023 18:03:38 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH v2 0/9] drm/i915: Replace kmap_atomic() with
>  kmap_local_page()
> 
> On Wednesday 29 March 2023 09:32:11 CEST Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > Hi list,
> > 
> > Sorry for a long delay since v1 [1]. This patchset is based on 197b6b6
> > (Linux 6.3-rc4).
> > 
> > Welcome and thanks for your review and comments!
> > 
> > 
> > # Purpose of this patchset
> > 
> > The purpose of this pacthset is to replace all uses of kmap_atomic() in
> > i915 with kmap_local_page() because the use of kmap_atomic() is being
> > deprecated in favor of kmap_local_page()[1]. And 92b64bd (mm/highmem:
> > add notes about conversions from kmap{,_atomic}()) has declared the
> > deprecation of kmap_atomic().
> > 
> > 
> > # Motivation for deprecating kmap_atomic() and using kmap_local_page()
> > 
> > The main difference between atomic and local mappings is that local
> > mappings doesn't disable page faults or preemption (the preemption is
> > disabled for !PREEMPT_RT case, otherwise it only disables migration).
> > 
> > With kmap_local_page(), we can avoid the often unwanted side effect of
> > unnecessary page faults and preemption disables.
> > 
> > 
> > # Patch summary
> > 
> > Patch 1, 4-6 and 8-9 replace kamp_atomic()/kunmap_atomic() with
> > kmap_local_page()/kunmap_local() directly. With thses local
> > mappings, the page faults and preemption are allowed.
> > 
> > Patch 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
> > kamp_atomic()/kunmap_atomic(). These two variants of memcpy()
> > are based on the local mapping, so page faults and preemption
> > are also allowed in these two interfaces.
> > 
> > Patch 3 replaces kamp_atomic()/kunmap_atomic() with kmap_local_page()/
> > kunmap_local() and also diable page fault since the for special
> > handling (pls see the commit message).
> > 
> > 
> > # Changes since v1
> > 
> > * Dropped hot plug related description in commit message since it has
> >   nothing to do with kmap_local_page().
> > * Emphasized the motivation for using kmap_local_page() in commit
> >   message.
> > * Rebased patch 1 on f47e630 (drm/i915/gem: Typecheck page lookups) to
> >   keep the "idx" variable of type pgoff_t here.
> > * Used memcpy_from_page() and memcpy_to_page() to replace
> >   kmap_local_page() + memcpy() in patch 2.
> > 
> > 
> > # Reference
> > 
> > [1]: https://lore.kernel.org/lkml/20221017093726.2070674-1-zhao1.liu@linux.intel.com/
> > [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> > ---
> > Zhao Liu (9):
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
> >   drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_pyhs.c
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
> >   drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
> >   drm/i915: Use kmap_local_page() in i915_cmd_parser.c
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c
> > 
> 
> I _think_ that the "long delay" you mentioned in the first sentence has paid 
> off in full. 
> 
> I don't see things to improve (except all those "kamp_atomic()" typo in the 
> patches summary; however, typos are only in the cover so I'm sure they won't 
> hurt anybody). 

Thanks a lot for your patience and your help! :-)

> 
> Each of the nine patches listed above looks good to me, so they are all…
> 
> Reviewed-by: Fabio M. De Francesco 
> 
> Thanks!
> 
> Fabio
> 
> PS: Obviously there was no need to reconfirm my tag for patch 3/9. A single 
> tag that catches all patches is easier for a lazy person like me :-)

The typos and this description can still be improved. I'll pay more
attention in the future!

Thanks,
Zhao

> 
> >
> >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
> >  drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
> >  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
> >  drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
> >  .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
> >  .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
> >  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
> >  drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
> >  9 files changed, 28 insertions(+), 41 deletions(-)
> > 
> > --
> > 2.34.1
> 
> 
> 
> 


Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-04-10 Thread Zhao Liu
Thanks all for your review!

On Fri, Mar 31, 2023 at 05:32:17PM +0200, Fabio M. De Francesco wrote:
> Date: Fri, 31 Mar 2023 17:32:17 +0200
> From: "Fabio M. De Francesco" 
> Subject: Re: [PATCH v2 9/9] drm/i915: Use kmap_local_page() in
>  gem/i915_gem_execbuffer.c
> 
> On Friday 31 March 2023 13:30:20 CEST Tvrtko Ursulin wrote:
> > On 31/03/2023 05:18, Ira Weiny wrote:
> 

[snip]

>  
> > However I am unsure if disabling pagefaulting is needed or not. Thomas,
> > Matt, being the last to touch this area, perhaps you could have a look?
> > Because I notice we have a fallback iomap path which still uses
> > io_mapping_map_atomic_wc. So if kmap_atomic to kmap_local conversion is
> > safe, does the iomap side also needs converting to
> > io_mapping_map_local_wc? Or they have separate requirements?
> 
> AFAIK, the requirements for io_mapping_map_local_wc() are the same as for 
> kmap_local_page(): the kernel virtual address is _only_ valid in the caller 
> context, and map/unmap nesting must be done in stack-based ordering (LIFO).
> 
> I think a follow up patch could safely switch to io_mapping_map_local_wc() / 
> io_mapping_unmap_local_wc since the address is local to context.
> 
> However, not being an expert, reading your note now I suspect that I'm
> missing something. Can I ask why you think that page-faults disabling
> might be necessary?


About the disabling of page faults here, could you please say more about
it? :-)

From previous discussions and the commit history, I didn't find relevant
information, and I lack background knowledge about it...

If there is a reason to disable page faults, I will fix this and send a
new version.

Thanks,
Zhao

> 
> Thanks,
> 
> Fabio
> 
> > Regards,
> > 
> > Tvrtko
> 
> 
> 
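
The stack-based (LIFO) nesting rule mentioned in the quoted text can be
sketched with a small userspace model. This is a minimal sketch, not the
kernel implementation: struct model_kmap_stack, model_map_local() and
model_unmap_local() are hypothetical names, and the fixed depth of 16 is an
arbitrary assumption. The only point is that an unmap is accepted when it
targets the most recent live mapping and rejected otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NESTING 16 /* arbitrary depth for this sketch */

/* Hypothetical userspace model of a per-task stack of local mappings. */
struct model_kmap_stack {
	const void *addr[MODEL_MAX_NESTING];
	size_t depth;
};

/* Records a mapping; returns false if the model's stack is full. */
static bool model_map_local(struct model_kmap_stack *s, const void *addr)
{
	if (s->depth == MODEL_MAX_NESTING)
		return false;
	s->addr[s->depth++] = addr;
	return true;
}

/* Accepts an unmap only if addr is the most recent live mapping (LIFO). */
static bool model_unmap_local(struct model_kmap_stack *s, const void *addr)
{
	if (s->depth == 0 || s->addr[s->depth - 1] != addr)
		return false;
	s->depth--;
	return true;
}
```

With two nested mappings A then B, the model only accepts the order
unmap(B), unmap(A); trying unmap(A) first is rejected and leaves the stack
unchanged.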


[PATCH v2 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables
migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

There are two reasons why i915_gem_object_read_from_page_kmap() doesn't
need to disable page faults and preemption for the mapping:

1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c,
i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range(),
which flushes with CLFLUSHOPT or WBINVD. Since CLFLUSHOPT is global on
x86 and WBINVD is called on each CPU in drm_clflush_virt_range(), the
flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_read_from_page_kmap() is a function where
kmap_local_page() is a suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

Also remove the redundant variable that stores the address of the mapped
page, since kunmap_local() accepts any pointer within the page.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* Rebased on f47e630 (drm/i915/gem: Typecheck page lookups) to keep
  the "idx" variable of type pgoff_t here.
* Added description of the motivation of using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index e6d4efde4fc5..c0bfdd7784f7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -428,17 +428,15 @@ static void
 i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size)
 {
pgoff_t idx = offset >> PAGE_SHIFT;
-   void *src_map;
void *src_ptr;
 
-   src_map = kmap_atomic(i915_gem_object_get_page(obj, idx));
-
-   src_ptr = src_map + offset_in_page(offset);
+   src_ptr = kmap_local_page(i915_gem_object_get_page(obj, idx))
+ + offset_in_page(offset);
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
drm_clflush_virt_range(src_ptr, size);
memcpy(dst, src_ptr, size);
 
-   kunmap_atomic(src_map);
+   kunmap_local(src_ptr);
 }
 
 static void
-- 
2.34.1



[PATCH v2 2/9] drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_pyhs.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() + memcpy() to memcpy_[from/to]_page(), which use
kmap_local_page() to build a local mapping and then do the memcpy().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables
migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/i915_gem_phys.c, the functions
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
don't need to disable page faults and preemption for the mapping, for
two reasons:

1. The flush operation is safe. i915_gem_object_get_pages_phys() and
i915_gem_object_put_pages_phys() call drm_clflush_virt_range(), which
flushes with CLFLUSHOPT or WBINVD. Since CLFLUSHOPT is global on x86
and WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_get_pages_phys() and
i915_gem_object_put_pages_phys() are two functions where local
mappings are a suitable replacement for atomic mappings.

Convert the calls of kmap_atomic() / kunmap_atomic() + memcpy() to
memcpy_from_page() and memcpy_to_page().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Used memcpy_from_page() and memcpy_to_page() to replace
  kmap_local_page() + memcpy().
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* Added description of the motivation of using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred. Also based on
 his suggestion to use memcpy_[from/to]_page() directly.
---
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 76efe98eaa14..4c6d3f07260a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -64,16 +64,13 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	dst = vaddr;
 	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
 		struct page *page;
-		void *src;
 
 		page = shmem_read_mapping_page(mapping, i);
 		if (IS_ERR(page))
 			goto err_st;
 
-		src = kmap_atomic(page);
-		memcpy(dst, src, PAGE_SIZE);
+		memcpy_from_page(dst, page, 0, PAGE_SIZE);
 		drm_clflush_virt_range(dst, PAGE_SIZE);
-		kunmap_atomic(src);
 
 		put_page(page);
 		dst += PAGE_SIZE;
@@ -112,16 +109,13 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 
 		for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
 			struct page *page;
-			char *dst;
 
 			page = shmem_read_mapping_page(mapping, i);
 			if (IS_ERR(page))
 				continue;
 
-			dst = kmap_atomic(page);
 			drm_clflush_virt_range(src, PAGE_SIZE);
-			memcpy(dst, src, PAGE_SIZE);
-			kunmap_atomic(dst);
+			memcpy_to_page(page, 0, src, PAGE_SIZE);
 
 			set_page_dirty(page);
 			if (obj->mm.madv == I915_MADV_WILLNEED)
-- 
2.34.1



[PATCH v2 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

Hi list,

Sorry for the long delay since v1 [1]. This patchset is based on 197b6b6
(Linux 6.3-rc4).

Reviews and comments are welcome, thanks!


# Purpose of this patchset

The purpose of this patchset is to replace all uses of kmap_atomic() in
i915 with kmap_local_page(), because kmap_atomic() is being deprecated
in favor of kmap_local_page()[2]. Commit 92b64bd (mm/highmem: add notes
about conversions from kmap{,_atomic}()) has declared the deprecation
of kmap_atomic().


# Motivation for deprecating kmap_atomic() and using kmap_local_page()

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.


# Patch summary

Patches 1, 4-6 and 8-9 replace kmap_atomic()/kunmap_atomic() with
kmap_local_page()/kunmap_local() directly. With these local
mappings, page faults and preemption are allowed.

Patches 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
kmap_atomic()/kunmap_atomic() + memcpy(). These two variants of
memcpy() are based on the local mapping, so page faults and preemption
are also allowed in these two interfaces.

Patch 3 replaces kmap_atomic()/kunmap_atomic() with kmap_local_page()/
kunmap_local() and also disables page faults explicitly, since this
case needs special handling (please see the commit message).


# Changes since v1

* Dropped hot plug related description in commit message since it has
  nothing to do with kmap_local_page().
* Emphasized the motivation for using kmap_local_page() in commit
  message.
* Rebased patch 1 on f47e630 (drm/i915/gem: Typecheck page lookups) to
  keep the "idx" variable of type pgoff_t here.
* Used memcpy_from_page() and memcpy_to_page() to replace
  kmap_local_page() + memcpy() in patch 2.


# Reference

[1]: https://lore.kernel.org/lkml/20221017093726.2070674-1-zhao1@linux.intel.com/
[2]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
---
Zhao Liu (9):
  drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
  drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_phys.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
  drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
  drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
  drm/i915: Use kmap_local_page() in i915_cmd_parser.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c       | 10 +++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.c           |  8 +++-----
 drivers/gpu/drm/i915/gem/i915_gem_phys.c             | 10 ++--------
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c            |  6 ++++--
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c      |  6 +++---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 ++++--------
 .../gpu/drm/i915/gem/selftests/i915_gem_context.c    |  8 ++++----
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c             |  5 +----
 drivers/gpu/drm/i915/i915_cmd_parser.c               |  4 ++--
 9 files changed, 28 insertions(+), 41 deletions(-)

-- 
2.34.1



[PATCH v2 5/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/selftests/i915_gem_coherency.c, the functions cpu_set()
and cpu_get() mainly use the mapping to flush the cache and to write or
read the value. There are two reasons why cpu_set() and cpu_get() don't
need to disable page faults and preemption for the mapping:

1. The flush operation is safe. cpu_set() and cpu_get() call
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. A context switch caused by preemption or a page fault (a page fault
may sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_set() and cpu_get() are functions where
kmap_local_page() is a suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
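
In the converted cpu_set() and cpu_get(), kunmap_local() is given "cpu",
an address inside the mapped page rather than the page base. The
kernel-doc for kunmap_local() describes its argument as an address within
the mapped page, so the base can be recovered by masking. A user-space
sketch of that arithmetic follows; the 4 KiB page size and the helper
names offset_in_page_model() and page_base() are assumptions made up for
this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumed: 4 KiB pages */
#define MODEL_PAGE_SIZE  ((uintptr_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (~(MODEL_PAGE_SIZE - 1))

/* Byte offset of an address (or object offset) within its page. */
static uintptr_t offset_in_page_model(uintptr_t p)
{
	return p & ~MODEL_PAGE_MASK;
}

/* Page-aligned base of the page that contains addr. */
static uintptr_t page_base(uintptr_t addr)
{
	return addr & MODEL_PAGE_MASK;
}
```

With a mapping at 0x5000 and an object offset of 0x2a04, the in-page
offset is 0xa04, "cpu" is 0x5a04, and page_base(cpu) gives back 0x5000.
This is why the separate "map" variable can be dropped.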

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* No code change since v1, and added description of the motivation of
  using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 3bef1beec7cb..beeb3e12eccc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -24,7 +24,6 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
 {
 	unsigned int needs_clflush;
 	struct page *page;
-	void *map;
 	u32 *cpu;
 	int err;
 
@@ -34,8 +33,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
 		goto out;
 
 	page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-	map = kmap_atomic(page);
-	cpu = map + offset_in_page(offset);
+	cpu = kmap_local_page(page) + offset_in_page(offset);
 
 	if (needs_clflush & CLFLUSH_BEFORE)
 		drm_clflush_virt_range(cpu, sizeof(*cpu));
@@ -45,7 +43,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
 	if (needs_clflush & CLFLUSH_AFTER)
 		drm_clflush_virt_range(cpu, sizeof(*cpu));
 
-	kunmap_atomic(map);
+	kunmap_local(cpu);
 	i915_gem_object_finish_access(ctx->obj);
 
 out:
@@ -57,7 +55,6 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
 {
 	unsigned int needs_clflush;
 	struct page *page;
-	void *map;
 	u32 *cpu;
 	int err;
 
@@ -67,15 +64,14 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
 		goto out;
 
 	page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-	map = kmap_atomic(page);
-	cpu = map + offset_in_page(offset);
+	cpu = kmap_local_page(page) + offset_in_page(offset);
 
 	if (needs_clflush & CLFLUSH_BEFORE)
 		drm_clflush_virt_range(cpu, sizeof(*cpu));
 
 	*v = *cpu;
 
-	kunmap_atomic(map);
+	kunmap_local(cpu);
 	i915_gem_object_finish_access(ctx->obj);
 
 out:
-- 
2.34.1



[PATCH v2 7/9] drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() + memcpy() to memcpy_from_page(), which uses
kmap_local_page() internally.

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gt/uc/intel_uc_fw.c, the function intel_uc_fw_copy_rsa()
only uses the mapping to copy memory, so it doesn't need to disable
page faults and preemption for the mapping. A local mapping without
atomic context (page faults and preemption left enabled) is enough.

Therefore, intel_uc_fw_copy_rsa() is a function where
memcpy_from_page() with kmap_local_page() is a suitable replacement for
memcpy() with kmap_atomic().

Convert the calls of memcpy() with kmap_atomic() / kunmap_atomic() to
memcpy_from_page(), which uses a local mapping to copy.
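
intel_uc_fw_copy_rsa() copies at most one page's worth of data per
iteration, which is why memcpy_from_page() fits: no single copy crosses a
page boundary. A user-space sketch of that chunking is given here, under
the assumptions that pages are plain 4 KiB buffers in an array and that
copy_across_pages() and model_memcpy_from_page() are hypothetical names
(the real function walks an sg-table with for_each_sgt_page()).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Stand-in for memcpy_from_page(): the copy must stay inside one page. */
static void model_memcpy_from_page(unsigned char *to,
				   const struct model_page *page,
				   size_t offset, size_t len)
{
	assert(offset + len <= MODEL_PAGE_SIZE);
	memcpy(to, page->data + offset, len);
}

/*
 * Copy `size` bytes starting at byte `start` of a page array into dst,
 * one page-bounded chunk at a time. Returns the number of bytes copied,
 * which is less than `size` if the page array runs out first.
 */
static size_t copy_across_pages(unsigned char *dst,
				const struct model_page *pages, size_t npages,
				size_t start, size_t size)
{
	size_t idx = start / MODEL_PAGE_SIZE;
	size_t offset = start % MODEL_PAGE_SIZE;
	size_t copied = 0;

	while (size && idx < npages) {
		size_t room = MODEL_PAGE_SIZE - offset;
		size_t len = size < room ? size : room;

		model_memcpy_from_page(dst, &pages[idx], offset, len);

		offset = 0; /* later chunks start at the top of the page */
		dst += len;
		size -= len;
		copied += len;
		idx++;
	}
	return copied;
}
```

Only the first chunk can start at a non-zero offset, which matches the
"offset = 0" reset after the copy in the diff below.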

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com/T/#u

v2: No code change since v1, and added description of the motivation of
using kmap_local_page().

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Reviewed-by: Ira Weiny 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Ira: Referred to his task document and suggestions about using
   memcpy_from_page() directly.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 65672ff82605..5bbde4abd565 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -1152,16 +1152,13 @@ size_t intel_uc_fw_copy_rsa(struct intel_uc_fw *uc_fw, void *dst, u32 max_len)
 
 	for_each_sgt_page(page, iter, uc_fw->obj->mm.pages) {
 		u32 len = min_t(u32, size, PAGE_SIZE - offset);
-		void *vaddr;
 
 		if (idx > 0) {
 			idx--;
 			continue;
 		}
 
-		vaddr = kmap_atomic(page);
-		memcpy(dst, vaddr + offset, len);
-		kunmap_atomic(vaddr);
+		memcpy_from_page(dst, page, offset, len);
 
 		offset = 0;
 		dst += len;
-- 
2.34.1



[PATCH v2 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
kmap_atomic() in eb_relocate_entry(), and is unmapped by
kunmap_atomic() in reloc_cache_reset().

This mapping/unmapping is reached from two places: eb_relocate_vma()
and eb_relocate_vma_slow().

Neither eb_relocate_vma() nor eb_relocate_vma_slow() needs page faults
or preemption to be disabled during the above mapping/unmapping.

So they can simply use kmap_local_page() / kunmap_local(), which do
the mapping / unmapping regardless of the context.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
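
The reloc cache keeps the mapped address and a few flag bits in a single
unsigned long, which works because a page mapping is page-aligned and so
leaves the low bits free. The sketch below models that packing in user
space. The flag value chosen for MODEL_KMAP, the 4 KiB page size, the
simplified bodies of unmask_page() / unmask_flags(), and the remap()
helper are assumptions for illustration, not copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE ((uintptr_t)4096) /* assumed page size */
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))
#define MODEL_KMAP      ((uintptr_t)0x4)  /* assumed flag bit in the low bits */

/* Recover the page-aligned mapping address from the packed word. */
static uintptr_t unmask_page(uintptr_t packed)
{
	return packed & MODEL_PAGE_MASK;
}

/* Recover only the flag bits stored in the low part of the word. */
static uintptr_t unmask_flags(uintptr_t packed)
{
	return packed & ~MODEL_PAGE_MASK;
}

/* Swap in a new mapping address while keeping the existing flags. */
static uintptr_t remap(uintptr_t packed, uintptr_t new_vaddr)
{
	assert((new_vaddr & ~MODEL_PAGE_MASK) == 0); /* must be page-aligned */
	return unmask_flags(packed) | new_vaddr;
}
```

Because the flags travel with the address, the unmap side can tell a
page mapping (KMAP set) from an io-mapping without extra state, as the
"cache->vaddr & KMAP" checks in the diff below do.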

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2: No code change since v1. Added description of the motivation of
using kmap_local_page() and "Suggested-by" tag of Fabio.

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Ira: Referred to his task document, review comments.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9dce2957b4e5..805565edd148 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1151,7 +1151,7 @@ static void reloc_cache_unmap(struct reloc_cache *cache)
 
 	vaddr = unmask_page(cache->vaddr);
 	if (cache->vaddr & KMAP)
-		kunmap_atomic(vaddr);
+		kunmap_local(vaddr);
 	else
 		io_mapping_unmap_atomic((void __iomem *)vaddr);
 }
@@ -1167,7 +1167,7 @@ static void reloc_cache_remap(struct reloc_cache *cache,
 	if (cache->vaddr & KMAP) {
 		struct page *page = i915_gem_object_get_page(obj, cache->page);
 
-		vaddr = kmap_atomic(page);
+		vaddr = kmap_local_page(page);
 		cache->vaddr = unmask_flags(cache->vaddr) |
 			(unsigned long)vaddr;
 	} else {
@@ -1197,7 +1197,7 @@ static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer
 		if (cache->vaddr & CLFLUSH_AFTER)
 			mb();
 
-		kunmap_atomic(vaddr);
+		kunmap_local(vaddr);
 		i915_gem_object_finish_access(obj);
 	} else {
 		struct i915_ggtt *ggtt = cache_to_ggtt(cache);
@@ -1229,7 +1229,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
 	struct page *page;
 
 	if (cache->vaddr) {
-		kunmap_atomic(unmask_page(cache->vaddr));
+		kunmap_local(unmask_page(cache->vaddr));
 	} else {
 		unsigned int flushes;
 		int err;
@@ -1251,7 +1251,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
 	if (!obj->mm.dirty)
 		set_page_dirty(page);
 
-	vaddr = kmap_atomic(page);
+	vaddr = kmap_local_page(page);
 	cache->vaddr = unmask_flags(cache->vaddr) | (unsigned long)vaddr;
 	cache->page = pageno;
 
-- 
2.34.1



[PATCH v2 8/9] drm/i915: Use kmap_local_page() in i915_cmd_parser.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

There are two reasons why copy_batch() doesn't need to disable page
faults and preemption for the mapping:

1. The flush operation is safe. In i915_cmd_parser.c, copy_batch() calls
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each CPU
in drm_clflush_virt_range(), the flush operation is global.

2. A context switch caused by preemption or a page fault (a page fault
may sleep) doesn't affect the validity of the local mapping.

Therefore, copy_batch() is a function where kmap_local_page() is a
suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* No code change since v1, and added description of the motivation of
  using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index ddf49c2dbb91..2905df83e180 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1211,11 +1211,11 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 		for (n = offset >> PAGE_SHIFT; remain; n++) {
 			int len = min(remain, PAGE_SIZE - x);
 
-			src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
+			src = kmap_local_page(i915_gem_object_get_page(src_obj, n));
 			if (src_needs_clflush)
 				drm_clflush_virt_range(src + x, len);
 			memcpy(ptr, src + x, len);
-			kunmap_atomic(src);
+			kunmap_local(src);
 
 			ptr += len;
 			remain -= len;
-- 
2.34.1



[PATCH v2 6/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/selftests/i915_gem_context.c, the functions cpu_fill()
and cpu_check() mainly use the mapping to flush the cache and to assign
or check the value.

There are two reasons why cpu_fill() and cpu_check() don't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe. cpu_fill() and cpu_check() call
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. A context switch caused by preemption or a page fault (a page fault
may sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_fill() and cpu_check() are functions where
kmap_local_page() is a suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* No code change since v1, and added description of the motivation of
  using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index a81fa6a20f5a..dcbc0b8e3323 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -481,12 +481,12 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
 	for (n = 0; n < real_page_count(obj); n++) {
 		u32 *map;
 
-		map = kmap_atomic(i915_gem_object_get_page(obj, n));
+		map = kmap_local_page(i915_gem_object_get_page(obj, n));
 		for (m = 0; m < DW_PER_PAGE; m++)
 			map[m] = value;
 		if (!has_llc)
 			drm_clflush_virt_range(map, PAGE_SIZE);
-		kunmap_atomic(map);
+		kunmap_local(map);
 	}
 
 	i915_gem_object_finish_access(obj);
@@ -512,7 +512,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
 	for (n = 0; n < real_page_count(obj); n++) {
 		u32 *map, m;
 
-		map = kmap_atomic(i915_gem_object_get_page(obj, n));
+		map = kmap_local_page(i915_gem_object_get_page(obj, n));
 		if (needs_flush & CLFLUSH_BEFORE)
 			drm_clflush_virt_range(map, PAGE_SIZE);
 
@@ -538,7 +538,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
 		}
 
 out_unmap:
-		kunmap_atomic(map);
+		kunmap_local(map);
 		if (err)
 			break;
 	}
-- 
2.34.1



[PATCH v2 3/9] drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/i915_gem_shmem.c, the function shmem_pwrite() needs to
disable page faults to eliminate a potential recursive fault[2]. But
__copy_from_user_inatomic() doesn't need preemption to be disabled, and
the local mapping stays valid across scheduling in and out.

So it can use kmap_local_page() / kunmap_local() with
pagefault_disable() / pagefault_enable() to replace the atomic mapping.
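
The resulting sequence in shmem_pwrite() is: map locally, disable page
faults, do the non-faulting copy, re-enable page faults, unmap. A
user-space sketch of that ordering follows. Every function in it is a
stand-in written for this example (the counters merely model state that
the kernel tracks per task), so it illustrates the pairing rules rather
than the real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model state: how many pagefault_disable() calls are outstanding. */
static int pagefault_disabled_depth;
/* Model state: how many local mappings are currently held. */
static int local_map_depth;

static void *model_kmap_local(void *page_buf)
{
	local_map_depth++;
	return page_buf;
}

static void model_kunmap_local(void *addr)
{
	(void)addr;
	assert(local_map_depth > 0);
	local_map_depth--;
}

static void model_pagefault_disable(void)
{
	pagefault_disabled_depth++;
}

static void model_pagefault_enable(void)
{
	assert(pagefault_disabled_depth > 0);
	pagefault_disabled_depth--;
}

/*
 * Stand-in for __copy_from_user_inatomic(): it must run with page
 * faults disabled and returns the number of bytes NOT copied.
 */
static size_t model_copy_inatomic(void *to, const void *from, size_t len)
{
	assert(pagefault_disabled_depth > 0);
	memcpy(to, from, len);
	return 0;
}

/* The order used by the converted write path. */
static size_t write_chunk(void *page_buf, size_t pg,
			  const void *user_data, size_t len)
{
	char *vaddr = model_kmap_local(page_buf);
	size_t unwritten;

	model_pagefault_disable();
	unwritten = model_copy_inatomic(vaddr + pg, user_data, len);
	model_pagefault_enable();
	model_kunmap_local(vaddr);

	return unwritten;
}
```

The disable/enable pair nests strictly inside the map/unmap pair, so
page faults are only disabled for the duration of the copy itself.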

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
[2]: https://patchwork.freedesktop.org/patch/295840/

v2: No code change since v1, and added description of the motivation of
using kmap_local_page().

Suggested-by: Ira Weiny 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Ira: Referred to his suggestions about keeping pagefault_disable().
  Fabio: Referred to his description about why kmap_local_page() should
 be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..ad69a79c8b31 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -475,11 +475,13 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
 		if (err < 0)
 			return err;
 
-		vaddr = kmap_atomic(page);
+		vaddr = kmap_local_page(page);
+		pagefault_disable();
 		unwritten = __copy_from_user_inatomic(vaddr + pg,
 						      user_data,
 						      len);
-		kunmap_atomic(vaddr);
+		pagefault_enable();
+		kunmap_local(vaddr);
 
 		err = aops->write_end(obj->base.filp, mapping, offset, len,
 				      len - unwritten, page, data);
-- 
2.34.1



[PATCH v2 4/9] drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c

2023-03-29 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/selftests/huge_pages.c, the function
__cpu_check_shmem() mainly uses the mapping to flush the cache and check
the value. There are two reasons why __cpu_check_shmem() doesn't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe. __cpu_check_shmem() calls
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. A context switch caused by preemption or a page fault (a page fault
may sleep) doesn't affect the validity of the local mapping.

Therefore, __cpu_check_shmem() is a function where kmap_local_page()
is a suitable replacement for kmap_atomic().

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

v2:
* Dropped hot plug related description since it has nothing to do with
  kmap_local_page().
* No code change since v1, and added description of the motivation of
  using kmap_local_page().

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index defece0bcb81..3f9ea48a48d0 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1026,7 +1026,7 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 		goto err_unlock;
 
 	for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
-		u32 *ptr = kmap_atomic(i915_gem_object_get_page(obj, n));
+		u32 *ptr = kmap_local_page(i915_gem_object_get_page(obj, n));
 
 		if (needs_flush & CLFLUSH_BEFORE)
 			drm_clflush_virt_range(ptr, PAGE_SIZE);
@@ -1034,12 +1034,12 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 		if (ptr[dword] != val) {
 			pr_err("n=%lu ptr[%u]=%u, val=%u\n",
 			       n, dword, ptr[dword], val);
-			kunmap_atomic(ptr);
+			kunmap_local(ptr);
 			err = -EINVAL;
 			break;
 		}
 
-		kunmap_atomic(ptr);
+		kunmap_local(ptr);
 	}
 
 	i915_gem_object_finish_access(obj);
-- 
2.34.1



Re: [PATCH 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-02-14 Thread Zhao Liu
On Tue, Feb 14, 2023 at 08:25:08PM -0800, Ira Weiny wrote:
> Date: Tue, 14 Feb 2023 20:25:08 -0800
> From: Ira Weiny 
> Subject: Re: [PATCH 0/9] drm/i915: Replace kmap_atomic() with
>  kmap_local_page()
> 
> Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > The use of kmap_atomic() is being deprecated in favor of
> > kmap_local_page()[1].
> 
> Zhao,
> 
> Was there ever a v2 of this series?  I'm not finding it on Lore.

Sorry Ira, the delay has been too long; I was busy with other patch
work. I will refresh v2 soon and push this forward!

Best Regards,
Zhao

> 
> Thanks,
> Ira
> 
> > 
> > In the following patches, we can convert the calls of kmap_atomic() /
> > kunmap_atomic() to kmap_local_page() / kunmap_local(), which can
> > instead do the mapping / unmapping regardless of the context.
> > 
> > With kmap_local_page(), the mapping is per thread, CPU local and not
> > globally visible.
> > 
> > [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> > ---
> > Zhao Liu (9):
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_pyhs.c
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
> >   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
> >   drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
> >   drm/i915: Use kmap_local_page() in i915_cmd_parser.c
> >   drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c
> > 
> >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
> >  drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
> >  drivers/gpu/drm/i915/gem/i915_gem_phys.c |  8 
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
> >  drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
> >  .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
> >  .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
> >  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
> >  drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
> >  9 files changed, 30 insertions(+), 37 deletions(-)
> > 
> > -- 
> > 2.34.1
> > 
> 
> 


Re: [PATCH v3 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-12-14 Thread Zhao Liu
On Thu, Dec 14, 2023 at 02:35:26PM +, Tvrtko Ursulin wrote:
> Date: Thu, 14 Dec 2023 14:35:26 +
> From: Tvrtko Ursulin 
> Subject: Re: [PATCH v3 0/9] drm/i915: Replace kmap_atomic() with
>  kmap_local_page()
> 
> 
> On 14/12/2023 13:45, Tvrtko Ursulin wrote:
> > 
> > Hi Zhao,
> > 
> > On 14/12/2023 13:19, Zhao Liu wrote:
> > > Hi maintainers,
> > > 
> > > Just kindly ping.
> > > May I ask if this refresh version could be merged into the next tree of
> > > the i915?
> > 
> > I certainly spotted your series last week or so but then it slipped my
> > mind to go through it. Should be able to go through it today or
> > tomorrow.
> 
> It all looks good to me. I only needed to queue a re-test in our CI since v3
> failed BAT, but pretty sure it wasn't at fault. Once I am satisfied with the
> results I will merge the series. Thanks for the cleanups and your patience!
> 
> Regards,
> 
> Tvrtko
> 

Thanks for your review!

Regards,
Zhao

> 
> > Regards,
> > 
> > Tvrtko
> > 
> > > 
> > > Thanks,
> > > Zhao
> > > 
> > > On Sun, Dec 03, 2023 at 09:29:38PM +0800, Zhao Liu wrote:
> > > > Date: Sun, 3 Dec 2023 21:29:38 +0800
> > > > From: Zhao Liu 
> > > > Subject: [PATCH v3 0/9] drm/i915: Replace kmap_atomic() with
> > > >   kmap_local_page()
> > > > X-Mailer: git-send-email 2.34.1
> > > > 
> > > > From: Zhao Liu 
> > > > 
> > > > Hi all,
> > > > 
> > > > I refreshed this v3 by rebasing v2 [1] on the commit 968f35f4ab1c
> > > > ("Merge tag 'v6.7-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/
> > > > cifs-2.6").
> > > > 
> > > > Based on the current code, I rechecked the substitutions in v2 and they
> > > > still stand and are valid, so no code change in v3.
> > > > 
> > > > Thanks for all the review! And sorry v2 was missed, I'll pay more
> > > > attention to this v3.
> > > > 
> > > > 
> > > > Purpose of This Patchset
> > > > 
> > > > 
> > > > The purpose of this pacthset is to replace all uses of kmap_atomic() in
> > > > i915 with kmap_local_page() because the use of kmap_atomic() is being
> > > > deprecated in favor of kmap_local_page()[2]. And 92b64bd (mm/highmem:
> > > > add notes about conversions from kmap{,_atomic}()) has declared the
> > > > deprecation of kmap_atomic().
> > > > 
> > > > 
> > > > Motivation for Deprecating kmap_atomic() and Using kmap_local_page()
> > > > 
> > > > 
> > > > The main difference between atomic and local mappings is that local
> > > > mappings doesn't disable page faults or preemption (the preemption is
> > > > disabled for !PREEMPT_RT case, otherwise it only disables migration).
> > > > 
> > > > With kmap_local_page(), we can avoid the often unwanted side effect of
> > > > unnecessary page faults and preemption disables.
> > > > 
> > > > 
> > > > Patch summary
> > > > =============
> > > > 
> > > > Patches 1, 4-6 and 8-9 replace kmap_atomic()/kunmap_atomic() with
> > > >  kmap_local_page()/kunmap_local() directly. With these local
> > > >  mappings, page faults and preemption are allowed.
> > > > 
> > > > Patches 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
> > > >  kmap_atomic()/kunmap_atomic(). These two variants of memcpy()
> > > >  are based on the local mapping, so page faults and preemption
> > > >  are also allowed in these two interfaces.
> > > > 
> > > > Patch 3 replaces kmap_atomic()/kunmap_atomic() with kmap_local_page()/
> > > >  kunmap_local() and also disables page faults, since shmem_pwrite()
> > > >  needs that special handling (please see the commit message).
> > > > 
> > > > 
> > > > Reference
> > > > =========
> > > > 
> > > > [1]: https://lore.kernel.org/all/20230329073220.3982460-1-zhao1@linux.intel.com/
> > > > [2]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> > > > 
> > > > 
> > > > Thanks a

Re: [PATCH v3 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-12-14 Thread Zhao Liu
Hi maintainers,

Just a gentle ping.
May I ask whether this refreshed version could be merged into the i915
next tree?

Thanks,
Zhao

On Sun, Dec 03, 2023 at 09:29:38PM +0800, Zhao Liu wrote:
> Date: Sun, 3 Dec 2023 21:29:38 +0800
> From: Zhao Liu 
> Subject: [PATCH v3 0/9] drm/i915: Replace kmap_atomic() with
>  kmap_local_page()
> X-Mailer: git-send-email 2.34.1
> 
> From: Zhao Liu 
> 
> Hi all,
> 
> I refreshed this v3 by rebasing v2 [1] on the commit 968f35f4ab1c
> ("Merge tag 'v6.7-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/
> cifs-2.6").
> 
> Based on the current code, I rechecked the substitutions in v2 and they
> still stand and are valid, so no code change in v3.
> 
> Thanks for all the review! And sorry v2 was missed, I'll pay more
> attention to this v3.
> 
> 
> Purpose of This Patchset
> ========================
> 
> The purpose of this patchset is to replace all uses of kmap_atomic() in
> i915 with kmap_local_page() because the use of kmap_atomic() is being
> deprecated in favor of kmap_local_page()[2]. Commit 92b64bd ("mm/highmem:
> add notes about conversions from kmap{,_atomic}()") has declared the
> deprecation of kmap_atomic().
> 
> 
> Motivation for Deprecating kmap_atomic() and Using kmap_local_page()
> ====================================================================
> 
> The main difference between atomic and local mappings is that local
> mappings don't disable page faults or preemption (kmap_atomic() disables
> preemption in the !PREEMPT_RT case; otherwise it only disables migration).
> 
> With kmap_local_page(), we can avoid the often unwanted side effect of
> unnecessarily disabling page faults and preemption.
> 
> 
> Patch summary
> =============
> 
> Patches 1, 4-6 and 8-9 replace kmap_atomic()/kunmap_atomic() with
> kmap_local_page()/kunmap_local() directly. With these local
> mappings, page faults and preemption are allowed.
> 
> Patches 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
> kmap_atomic()/kunmap_atomic(). These two variants of memcpy()
> are based on the local mapping, so page faults and preemption
> are also allowed in these two interfaces.
> 
> Patch 3 replaces kmap_atomic()/kunmap_atomic() with kmap_local_page()/
> kunmap_local() and also disables page faults, since shmem_pwrite()
> needs that special handling (please see the commit message).
> 
> 
> Reference
> =========
> 
> [1]: https://lore.kernel.org/all/20230329073220.3982460-1-zhao1@linux.intel.com/
> [2]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
> 
> 
> Thanks and Best Regards,
> Zhao
> 
> ---
> Changelog:
> 
> Changes since v2:
> * Rebased on 968f35f4ab1c ("Merge tag 'v6.7-rc3-smb3-client-fixes' of
>   git://git.samba.org/sfrench/cifs-2.6").
> * Removed changelog (of v2) in commit message.
> * Fixed typo in cover letter (Fabio).
> * Added Reviewed-by tags from Ira and Fabio.
> 
> Changes since v1:
> * Dropped hot plug related description in commit message since it has
>   nothing to do with kmap_local_page().
> * Emphasized the motivation for using kmap_local_page() in commit
>   message.
> * Rebased patch 1 on f47e630 (drm/i915/gem: Typecheck page lookups) to
>   keep the "idx" variable of type pgoff_t here.
> * Used memcpy_from_page() and memcpy_to_page() to replace
>   kmap_local_page() + memcpy() in patch 2.
> 
> ---
> Zhao Liu (9):
>   drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
>   drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_phys.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
>   drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
>   drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
>   drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
>   drm/i915: Use kmap_local_page() in i915_cmd_parser.c
>   drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c
> 
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
>  drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
>  .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
>  .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
>  drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
>  9 files changed, 28 insertions(+), 41 deletions(-)
> 
> -- 
> 2.34.1
> 


[PATCH v3 5/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults or preemption.

In drm/i915/gem/selftests/i915_gem_coherency.c, the functions cpu_set()
and cpu_get() mainly use the mapping to flush the cache and assign the
value. There are two reasons why cpu_set() and cpu_get() don't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe. cpu_set() and cpu_get() call
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_set() and cpu_get() are functions where the use of
kmap_local_page() in place of kmap_atomic() is well suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 3bef1beec7cb..beeb3e12eccc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -24,7 +24,6 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
 {
unsigned int needs_clflush;
struct page *page;
-   void *map;
u32 *cpu;
int err;
 
@@ -34,8 +33,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
goto out;
 
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-   map = kmap_atomic(page);
-   cpu = map + offset_in_page(offset);
+   cpu = kmap_local_page(page) + offset_in_page(offset);
 
if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));
@@ -45,7 +43,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
if (needs_clflush & CLFLUSH_AFTER)
drm_clflush_virt_range(cpu, sizeof(*cpu));
 
-   kunmap_atomic(map);
+   kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);
 
 out:
@@ -57,7 +55,6 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
 {
unsigned int needs_clflush;
struct page *page;
-   void *map;
u32 *cpu;
int err;
 
@@ -67,15 +64,14 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
goto out;
 
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
-   map = kmap_atomic(page);
-   cpu = map + offset_in_page(offset);
+   cpu = kmap_local_page(page) + offset_in_page(offset);
 
if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));
 
*v = *cpu;
 
-   kunmap_atomic(map);
+   kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);
 
 out:
-- 
2.34.1



[PATCH v3 8/9] drm/i915: Use kmap_local_page() in i915_cmd_parser.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

There are two reasons why the function copy_batch() doesn't need to
disable page faults and preemption for the mapping:

1. The flush operation is safe. In i915_cmd_parser.c, copy_batch() calls
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each CPU
in drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, copy_batch() is a function where the use of
kmap_local_page() in place of kmap_atomic() is well suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().
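
For readers unfamiliar with the loop being converted, the following is a
minimal userspace sketch of the per-page copy pattern that copy_batch()
follows in the hunk below. It is not the driver code: struct toy_object,
toy_copy_from_object() and the TOY_* constants are hypothetical, a plain
pointer lookup stands in for the kmap_local_page() / kunmap_local() pair,
the bound check on nr_pages is added for the toy only, and the reset of x
after the first page is an assumption about lines the hunk does not show.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT/PAGE_SIZE play this role. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE ((size_t)1 << TOY_PAGE_SHIFT)

/*
 * Hypothetical stand-in for an object backed by discontiguous pages:
 * pages[n] plays the role of the address a per-page mapping would return.
 */
struct toy_object {
	unsigned char **pages;
	size_t nr_pages;
};

/*
 * Copy 'length' bytes starting at byte 'offset' of the object into 'dst',
 * touching one page at a time. Returns the number of bytes copied.
 */
size_t toy_copy_from_object(const struct toy_object *obj, size_t offset,
			    size_t length, unsigned char *dst)
{
	size_t x = offset & (TOY_PAGE_SIZE - 1); /* offset inside first page */
	size_t remain = length;
	size_t n;

	for (n = offset >> TOY_PAGE_SHIFT; remain && n < obj->nr_pages; n++) {
		size_t room = TOY_PAGE_SIZE - x;
		size_t len = remain < room ? remain : room;
		const unsigned char *src = obj->pages[n]; /* "map" page n */

		memcpy(dst, src + x, len);
		/* the real code would unmap page n here */

		dst += len;
		remain -= len;
		x = 0; /* later pages are read from their start */
	}

	return length - remain;
}
```

Because each page is mapped and unmapped within a single loop iteration,
at most one mapping is live at a time, so the map/unmap calls are
trivially nested.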

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index ddf49c2dbb91..2905df83e180 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1211,11 +1211,11 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
for (n = offset >> PAGE_SHIFT; remain; n++) {
int len = min(remain, PAGE_SIZE - x);
 
-   src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
+   src = kmap_local_page(i915_gem_object_get_page(src_obj, n));
if (src_needs_clflush)
drm_clflush_virt_range(src + x, len);
memcpy(ptr, src + x, len);
-   kunmap_atomic(src);
+   kunmap_local(src);
 
ptr += len;
remain -= len;
-- 
2.34.1



[PATCH v3 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

There are two reasons why i915_gem_object_read_from_page_kmap() doesn't
need to disable page faults and preemption for the mapping:

1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c,
i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range(),
which flushes with CLFLUSHOPT or WBINVD. Since CLFLUSHOPT is global on x86
and WBINVD is called on each CPU in drm_clflush_virt_range(), the flush
operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_read_from_page_kmap() is a function where
the use of kmap_local_page() in place of kmap_atomic() is well
suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

And remove the redundant variable that stores the address of the mapped
page since kunmap_local() can accept any pointer within the page.
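
As a rough userspace illustration of why an interior pointer is enough:
any address inside a page can be rounded down to the page start with a
mask, so the base pointer does not need to be kept around separately.
The sketch below assumes 4 KiB pages; toy_page_base(),
toy_offset_in_page() and the TOY_* constants are hypothetical names and
are not the kernel helpers.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE/PAGE_MASK play this role. */
#define TOY_PAGE_SIZE ((uintptr_t)4096)
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))

/* Hypothetical helper: round an address down to the start of its page. */
uintptr_t toy_page_base(uintptr_t addr)
{
	return addr & TOY_PAGE_MASK;
}

/* Hypothetical helper: byte offset of an address within its page. */
uintptr_t toy_offset_in_page(uintptr_t addr)
{
	return addr & ~TOY_PAGE_MASK;
}
```

With these helpers, toy_page_base(base + toy_offset_in_page(off)) gives
back base for any page-aligned base, which mirrors how src_ptr alone can
identify the mapping to release.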

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index c26d87555825..a2a7e5005415 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -493,17 +493,15 @@ static void
 i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size)
 {
pgoff_t idx = offset >> PAGE_SHIFT;
-   void *src_map;
void *src_ptr;
 
-   src_map = kmap_atomic(i915_gem_object_get_page(obj, idx));
-
-   src_ptr = src_map + offset_in_page(offset);
+   src_ptr = kmap_local_page(i915_gem_object_get_page(obj, idx))
+ + offset_in_page(offset);
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
drm_clflush_virt_range(src_ptr, size);
memcpy(dst, src_ptr, size);
 
-   kunmap_atomic(src_map);
+   kunmap_local(src_ptr);
 }
 
 static void
-- 
2.34.1



[PATCH v3 4/9] drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults or preemption.

In drm/i915/gem/selftests/huge_pages.c, the function __cpu_check_shmem()
mainly uses the mapping to flush the cache and check the value. There are
two reasons why __cpu_check_shmem() doesn't need to disable page faults
and preemption for the mapping:

1. The flush operation is safe. The function __cpu_check_shmem() calls
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, __cpu_check_shmem() is a function where the use of
kmap_local_page() in place of kmap_atomic() is well suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 6b9f6cf50bf6..c9e6d77abab0 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1082,7 +1082,7 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
goto err_unlock;
 
for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
-   u32 *ptr = kmap_atomic(i915_gem_object_get_page(obj, n));
+   u32 *ptr = kmap_local_page(i915_gem_object_get_page(obj, n));
 
if (needs_flush & CLFLUSH_BEFORE)
drm_clflush_virt_range(ptr, PAGE_SIZE);
@@ -1090,12 +1090,12 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
if (ptr[dword] != val) {
pr_err("n=%lu ptr[%u]=%u, val=%u\n",
   n, dword, ptr[dword], val);
-   kunmap_atomic(ptr);
+   kunmap_local(ptr);
err = -EINVAL;
break;
}
 
-   kunmap_atomic(ptr);
+   kunmap_local(ptr);
}
 
i915_gem_object_finish_access(obj);
-- 
2.34.1



[PATCH v3 6/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption.

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults or preemption.

In drm/i915/gem/selftests/i915_gem_context.c, the functions cpu_fill() and
cpu_check() mainly use the mapping to flush the cache and check/assign the
value.

There are two reasons why cpu_fill() and cpu_check() don't need to disable
page faults and preemption for the mapping:

1. The flush operation is safe. cpu_fill() and cpu_check() call
drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, cpu_fill() and cpu_check() are functions where the use of
kmap_local_page() in place of kmap_atomic() is well suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 7021b6e9b219..89d4dc8b60c6 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -489,12 +489,12 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
for (n = 0; n < real_page_count(obj); n++) {
u32 *map;
 
-   map = kmap_atomic(i915_gem_object_get_page(obj, n));
+   map = kmap_local_page(i915_gem_object_get_page(obj, n));
for (m = 0; m < DW_PER_PAGE; m++)
map[m] = value;
if (!has_llc)
drm_clflush_virt_range(map, PAGE_SIZE);
-   kunmap_atomic(map);
+   kunmap_local(map);
}
 
i915_gem_object_finish_access(obj);
@@ -520,7 +520,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
for (n = 0; n < real_page_count(obj); n++) {
u32 *map, m;
 
-   map = kmap_atomic(i915_gem_object_get_page(obj, n));
+   map = kmap_local_page(i915_gem_object_get_page(obj, n));
if (needs_flush & CLFLUSH_BEFORE)
drm_clflush_virt_range(map, PAGE_SIZE);
 
@@ -546,7 +546,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
}
 
 out_unmap:
-   kunmap_atomic(map);
+   kunmap_local(map);
if (err)
break;
}
-- 
2.34.1



[PATCH v3 0/9] drm/i915: Replace kmap_atomic() with kmap_local_page()

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

Hi all,

I refreshed this v3 by rebasing v2 [1] on the commit 968f35f4ab1c
("Merge tag 'v6.7-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/
cifs-2.6").

Based on the current code, I rechecked the substitutions in v2 and they
still stand and are valid, so no code change in v3.

Thanks for all the review! And sorry v2 was missed, I'll pay more
attention to this v3.


Purpose of This Patchset
========================

The purpose of this patchset is to replace all uses of kmap_atomic() in
i915 with kmap_local_page() because the use of kmap_atomic() is being
deprecated in favor of kmap_local_page()[2]. Commit 92b64bd ("mm/highmem:
add notes about conversions from kmap{,_atomic}()") has declared the
deprecation of kmap_atomic().


Motivation for Deprecating kmap_atomic() and Using kmap_local_page()
====================================================================

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.


Patch summary
=============

Patches 1, 4-6 and 8-9 replace kmap_atomic()/kunmap_atomic() with
kmap_local_page()/kunmap_local() directly. With these local
mappings, page faults and preemption are allowed.

Patches 2 and 7 use memcpy_from_page() and memcpy_to_page() to replace
kmap_atomic()/kunmap_atomic(). These two variants of memcpy()
are based on the local mapping, so page faults and preemption
are also allowed in these two interfaces.

Patch 3 replaces kmap_atomic()/kunmap_atomic() with kmap_local_page()/
kunmap_local() and also disables page faults, since shmem_pwrite()
needs that special handling (please see the commit message).


Reference
=========

[1]: https://lore.kernel.org/all/20230329073220.3982460-1-zhao1@linux.intel.com/
[2]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com


Thanks and Best Regards,
Zhao

---
Changelog:

Changes since v2:
* Rebased on 968f35f4ab1c ("Merge tag 'v6.7-rc3-smb3-client-fixes' of
  git://git.samba.org/sfrench/cifs-2.6").
* Removed changelog (of v2) in commit message.
* Fixed typo in cover letter (Fabio).
* Added Reviewed-by tags from Ira and Fabio.

Changes since v1:
* Dropped hot plug related description in commit message since it has
  nothing to do with kmap_local_page().
* Emphasized the motivation for using kmap_local_page() in commit
  message.
* Rebased patch 1 on f47e630 (drm/i915/gem: Typecheck page lookups) to
  keep the "idx" variable of type pgoff_t here.
* Used memcpy_from_page() and memcpy_to_page() to replace
  kmap_local_page() + memcpy() in patch 2.

---
Zhao Liu (9):
  drm/i915: Use kmap_local_page() in gem/i915_gem_object.c
  drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_phys.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c
  drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c
  drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c
  drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c
  drm/i915: Use kmap_local_page() in i915_cmd_parser.c
  drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   | 10 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c   |  8 +++-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c|  6 --
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  6 +++---
 .../gpu/drm/i915/gem/selftests/i915_gem_coherency.c  | 12 
 .../gpu/drm/i915/gem/selftests/i915_gem_context.c|  8 
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c |  5 +
 drivers/gpu/drm/i915/i915_cmd_parser.c   |  4 ++--
 9 files changed, 28 insertions(+), 41 deletions(-)

-- 
2.34.1



[PATCH v3 3/9] drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1].

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults or preemption.

In drm/i915/gem/i915_gem_shmem.c, the function shmem_pwrite() needs to
disable page faults to eliminate the potential recursion fault[2]. But
here __copy_from_user_inatomic() doesn't need preemption to be disabled,
and the local mapping remains valid across scheduling in/out.

So it can use kmap_local_page() / kunmap_local() with
pagefault_disable() / pagefault_enable() to replace atomic mapping.
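
The bracket order described above can be sketched in userspace as
follows. This is only a toy model under stated assumptions: struct
toy_task, struct toy_user_buf and every toy_*() function are
hypothetical, counters stand in for the real per-task state, and a
"non-resident" user buffer models a copy that would have needed a page
fault and therefore reports all bytes as uncopied.

```c
#include <stddef.h>
#include <string.h>

/* Toy per-task bookkeeping; every name here is hypothetical. */
struct toy_task {
	int local_maps;         /* live local mappings */
	int pagefault_disabled; /* nesting depth of toy_pagefault_disable() */
};

/* Toy user buffer: resident == 0 means touching it would need a fault. */
struct toy_user_buf {
	const unsigned char *data;
	int resident;
};

void *toy_kmap_local(struct toy_task *t, void *page)
{
	t->local_maps++;
	return page;
}

void toy_kunmap_local(struct toy_task *t, const void *vaddr)
{
	(void)vaddr;
	t->local_maps--;
}

void toy_pagefault_disable(struct toy_task *t) { t->pagefault_disabled++; }
void toy_pagefault_enable(struct toy_task *t)  { t->pagefault_disabled--; }

/*
 * Stand-in for an "inatomic" user copy: it never sleeps to fault data in
 * and instead returns the number of bytes it could NOT copy.
 */
size_t toy_copy_from_user_inatomic(void *dst, const struct toy_user_buf *src,
				   size_t len)
{
	if (!src->resident)
		return len;
	memcpy(dst, src->data, len);
	return 0;
}

/* Same bracket order as the converted shmem_pwrite() hunk. */
size_t toy_pwrite_page(struct toy_task *t, unsigned char *page, size_t pg,
		       const struct toy_user_buf *user, size_t len)
{
	unsigned char *vaddr;
	size_t unwritten;

	vaddr = toy_kmap_local(t, page);
	toy_pagefault_disable(t);
	unwritten = toy_copy_from_user_inatomic(vaddr + pg, user, len);
	toy_pagefault_enable(t);
	toy_kunmap_local(t, vaddr);

	return len - unwritten; /* bytes actually written */
}
```

The point of the model is that the pagefault bracket sits strictly
inside the mapping bracket, and that the caller learns about a short
copy from the return value instead of from a fault.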

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com
[2]: https://patchwork.freedesktop.org/patch/295840/

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Ira: Referred to his suggestions about keeping pagefault_disable().
  Fabio: Referred to his description about why kmap_local_page() should
 be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 73a4a4eb29e0..38b72d86560f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -485,11 +485,13 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
if (err < 0)
return err;
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
+   pagefault_disable();
unwritten = __copy_from_user_inatomic(vaddr + pg,
  user_data,
  len);
-   kunmap_atomic(vaddr);
+   pagefault_enable();
+   kunmap_local(vaddr);
 
err = aops->write_end(obj->base.filp, mapping, offset, len,
  len - unwritten, page, data);
-- 
2.34.1



[PATCH v3 2/9] drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_phys.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() + memcpy() to memcpy_[from/to]_page(), which use
kmap_local_page() to build a local mapping and then do the memcpy().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In drm/i915/gem/i915_gem_phys.c, the functions
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
don't need to disable page faults and preemption for the mapping, for
two reasons:

1. The flush operation is safe. In drm/i915/gem/i915_gem_phys.c,
i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys()
call drm_clflush_virt_range(), which flushes with CLFLUSHOPT or WBINVD.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each CPU in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (a page fault
may cause sleep) doesn't affect the validity of the local mapping.

Therefore, i915_gem_object_get_pages_phys() and
i915_gem_object_put_pages_phys() are two functions where the use of
local mappings in place of atomic mappings is well suited.

Convert the calls of kmap_atomic() / kunmap_atomic() + memcpy() to
memcpy_from_page() and memcpy_to_page().
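
To make the shape of the two helpers concrete, here is a simplified
userspace model of the "map, copy, unmap" pattern described above. It is
a sketch, not the kernel implementation: struct toy_page and all toy_*()
functions are hypothetical, the toy page is always addressable so the
map/unmap steps are no-ops, and anything else the real helpers may do is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical stand-in for struct page: the page's bytes live inline. */
struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
};

/* Toy "local mapping": in this model the page is always addressable. */
unsigned char *toy_map_local(struct toy_page *page)
{
	return page->data;
}

void toy_unmap_local(const void *addr)
{
	(void)addr; /* nothing to undo in the userspace model */
}

/* Modeled on memcpy_from_page(): map, copy out of the page, unmap. */
void toy_memcpy_from_page(void *to, struct toy_page *page,
			  size_t offset, size_t len)
{
	unsigned char *from = toy_map_local(page);

	assert(offset + len <= TOY_PAGE_SIZE);
	memcpy(to, from + offset, len);
	toy_unmap_local(from);
}

/* Modeled on memcpy_to_page(): map, copy into the page, unmap. */
void toy_memcpy_to_page(struct toy_page *page, size_t offset,
			const void *from, size_t len)
{
	unsigned char *to = toy_map_local(page);

	assert(offset + len <= TOY_PAGE_SIZE);
	memcpy(to + offset, from, len);
	toy_unmap_local(to);
}
```

Folding the mapping into the helper is what lets the callers in this
patch drop their local src/dst mapping variables.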

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Dave Hansen 
Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Dave: Referred to his explanation about cache flush.
  Ira: Referred to his task document, review comments and explanation
   about cache flush.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 5df128e2f4dc..ef85c6dc9fd5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -65,16 +65,13 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
dst = vaddr;
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
struct page *page;
-   void *src;
 
page = shmem_read_mapping_page(mapping, i);
if (IS_ERR(page))
goto err_st;
 
-   src = kmap_atomic(page);
-   memcpy(dst, src, PAGE_SIZE);
+   memcpy_from_page(dst, page, 0, PAGE_SIZE);
drm_clflush_virt_range(dst, PAGE_SIZE);
-   kunmap_atomic(src);
 
put_page(page);
dst += PAGE_SIZE;
@@ -113,16 +110,13 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 
for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
struct page *page;
-   char *dst;
 
page = shmem_read_mapping_page(mapping, i);
if (IS_ERR(page))
continue;
 
-   dst = kmap_atomic(page);
drm_clflush_virt_range(src, PAGE_SIZE);
-   memcpy(dst, src, PAGE_SIZE);
-   kunmap_atomic(dst);
+   memcpy_to_page(page, 0, src, PAGE_SIZE);
 
set_page_dirty(page);
if (obj->mm.madv == I915_MADV_WILLNEED)
-- 
2.34.1



[PATCH v3 9/9] drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the calls from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic() disables
preemption in the !PREEMPT_RT case; otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults and preemption.

In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by
kmap_atomic() in eb_relocate_entry(), and is unmapped by
kunmap_atomic() in reloc_cache_reset().

This mapping/unmapping occurs in two places: one is in
eb_relocate_vma(), and the other is in eb_relocate_vma_slow().

Neither eb_relocate_vma() nor eb_relocate_vma_slow() needs to
disable page faults and preemption during the above mapping/
unmapping.

So they can simply use kmap_local_page() / kunmap_local(), which
can do the mapping / unmapping regardless of the context.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Ira: Referred to his task document, review comments.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 683fd8d3151c..18b0f3117074 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1156,7 +1156,7 @@ static void reloc_cache_unmap(struct reloc_cache *cache)
 
vaddr = unmask_page(cache->vaddr);
if (cache->vaddr & KMAP)
-   kunmap_atomic(vaddr);
+   kunmap_local(vaddr);
else
io_mapping_unmap_atomic((void __iomem *)vaddr);
 }
@@ -1172,7 +1172,7 @@ static void reloc_cache_remap(struct reloc_cache *cache,
if (cache->vaddr & KMAP) {
struct page *page = i915_gem_object_get_page(obj, cache->page);
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
cache->vaddr = unmask_flags(cache->vaddr) |
(unsigned long)vaddr;
} else {
@@ -1202,7 +1202,7 @@ static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer
if (cache->vaddr & CLFLUSH_AFTER)
mb();
 
-   kunmap_atomic(vaddr);
+   kunmap_local(vaddr);
i915_gem_object_finish_access(obj);
} else {
struct i915_ggtt *ggtt = cache_to_ggtt(cache);
@@ -1234,7 +1234,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
struct page *page;
 
if (cache->vaddr) {
-   kunmap_atomic(unmask_page(cache->vaddr));
+   kunmap_local(unmask_page(cache->vaddr));
} else {
unsigned int flushes;
int err;
@@ -1256,7 +1256,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
if (!obj->mm.dirty)
set_page_dirty(page);
 
-   vaddr = kmap_atomic(page);
+   vaddr = kmap_local_page(page);
cache->vaddr = unmask_flags(cache->vaddr) | (unsigned long)vaddr;
cache->page = pageno;
 
-- 
2.34.1
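
For readers following the hunks above: they show cache->vaddr being
tested against KMAP, passed through unmask_page() before unmapping, and
rebuilt as unmask_flags(cache->vaddr) | (unsigned long)vaddr. That
suggests a page-aligned address sharing one word with flag bits in its
low bits. The following is only a userspace sketch of that packing, not
the driver's code; every model_* name and MODEL_* value is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; sizes and flag values are assumptions. */
#define MODEL_PAGE_SIZE     ((uintptr_t)4096)
#define MODEL_PAGE_MASK     (~(MODEL_PAGE_SIZE - 1))
#define MODEL_CLFLUSH_AFTER ((uintptr_t)0x2)
#define MODEL_KMAP          ((uintptr_t)0x4)

/* Recover the page-aligned mapping address from the packed word. */
static inline void *model_unmask_page(uintptr_t packed)
{
	return (void *)(packed & MODEL_PAGE_MASK);
}

/* Recover the flag bits stored below the page boundary. */
static inline uintptr_t model_unmask_flags(uintptr_t packed)
{
	return packed & ~MODEL_PAGE_MASK;
}

/* Swap in a new mapping address while keeping the existing flags. */
static inline uintptr_t model_remap(uintptr_t packed, void *new_vaddr)
{
	/* The scheme only works if the address has no low bits set. */
	assert(((uintptr_t)new_vaddr & ~MODEL_PAGE_MASK) == 0);
	return model_unmask_flags(packed) | (uintptr_t)new_vaddr;
}
```

Because the flags live only in the low bits, switching the mapping call
from kmap_atomic() to kmap_local_page() leaves this bookkeeping
untouched as long as both return page-aligned addresses.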



[PATCH v3 7/9] drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.c

2023-12-03 Thread Zhao Liu
From: Zhao Liu 

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings don't disable page faults or preemption (kmap_atomic()
disables preemption in the !PREEMPT_RT case; otherwise it only
disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessarily disabling page faults or preemption.

In drm/i915/gt/uc/intel_uc_fw.c, the function intel_uc_fw_copy_rsa()
just uses the mapping to do a memory copy, so it doesn't need to
disable pagefaults and preemption for the mapping. Thus a local mapping
without atomic context (i.e. without disabling pagefaults / preemption)
is enough.

Therefore, intel_uc_fw_copy_rsa() is a function where the use of
memcpy_from_page() with kmap_local_page() in place of memcpy() with
kmap_atomic() is correctly suited.

Convert the calls of memcpy() with kmap_atomic() / kunmap_atomic() to
memcpy_from_page() which uses local mapping to copy.

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.we...@intel.com/T/#u

Suggested-by: Ira Weiny 
Suggested-by: Fabio M. De Francesco 
Signed-off-by: Zhao Liu 
Reviewed-by: Ira Weiny 
Reviewed-by: Fabio M. De Francesco 
---
Suggested by credits:
  Ira: Referred to his task document and suggestions about using
   memcpy_from_page() directly.
  Fabio: Referred to his boiler plate commit message and his description
 about why kmap_local_page() should be preferred.
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 362639162ed6..756093eaf2ad 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -1343,16 +1343,13 @@ size_t intel_uc_fw_copy_rsa(struct intel_uc_fw *uc_fw, void *dst, u32 max_len)
 
for_each_sgt_page(page, iter, uc_fw->obj->mm.pages) {
u32 len = min_t(u32, size, PAGE_SIZE - offset);
-   void *vaddr;
 
if (idx > 0) {
idx--;
continue;
}
 
-   vaddr = kmap_atomic(page);
-   memcpy(dst, vaddr + offset, len);
-   kunmap_atomic(vaddr);
+   memcpy_from_page(dst, page, offset, len);
 
offset = 0;
dst += len;
-- 
2.34.1
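
As a rough illustration of why the three removed lines collapse into a
single memcpy_from_page() call: the helper maps the page locally,
copies from the given offset, and unmaps again. The sketch below models
that sequence in userspace. It is not the kernel implementation; the
model_* names, struct model_page and the mapping counter are all
hypothetical and exist only to make the sequence visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct page: the data plus a map counter. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	int mapped;
};

/* Stand-in for kmap_local_page(): hand back an address for the page. */
static void *model_kmap_local_page(struct model_page *page)
{
	page->mapped++;
	return page->data;
}

/* Stand-in for kunmap_local(): takes the address, not the page. */
static void model_kunmap_local(void *vaddr)
{
	/* data is the first member, so the address identifies the page. */
	struct model_page *page = (struct model_page *)vaddr;

	assert(page->mapped > 0);
	page->mapped--;
}

/* Map, copy len bytes starting at offset within the page, then unmap. */
static void model_memcpy_from_page(void *to, struct model_page *page,
				   size_t offset, size_t len)
{
	unsigned char *from = model_kmap_local_page(page);

	assert(offset + len <= MODEL_PAGE_SIZE);
	memcpy(to, from + offset, len);
	model_kunmap_local(from);
}
```

The mapping never outlives the helper, so the caller has no vaddr to
keep track of, which is why the local variable disappears in the hunk
above.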