Re: [PATCH] drm/i915/gem: Calculate object page offset for partial memory mapping

2024-03-28 Thread Andi Shyti
Hi Nirmoy,

On Tue, Mar 26, 2024 at 01:05:37PM +0100, Nirmoy Das wrote:
> On 3/26/2024 12:12 PM, Andi Shyti wrote:

> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > index a2195e28b625..57a2dda2c3cc 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > @@ -276,7 +276,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
> > > >  	/* PTEs are revoked in obj->ops->put_pages() */
> > > >  	err = remap_io_sg(area,
> > > >  			  area->vm_start, area->vm_end - area->vm_start,
> > > > -			  obj->mm.pages->sgl, iomap);
> > > > +			  obj->mm.pages->sgl, 0, iomap);
> > > Why don't we need partial mmap for CPU but only for GTT?
> > As far as I understood we don't. I have a version with the CPU
> > offset as well in trybot[*]
> > 
> > But without support for segmented buffer objects, I don't know
> > how much this has any effect.
> 
> You confused me more :) Why is a segmented buffer object needed for
> partial CPU mmap but not for GTT?

Actually, segmented BOs were introduced to support single DMA
buffers instead of fragmented buffers. But this goes beyond the
scope of this patch.

> From a high level, both GTT and CPU should support partial mmap
> unless I'm missing something here.

But yes, we could take the patch I linked, which adds the offset
to the CPU memory as well (sketched below). I will add it in v2.

> > 
> > > Sounds like this also needs to be covered by IGT tests.
> > Yes, it does need some IGT work; I'm working on it.
> > 
> > > Don't we need a "Fixes" tag for this?
> > Why should we? I'm not fixing anything here,
> 
> If userspace expects partial mmap to work, then this is a bug/gap in
> i915, so we need to backport this as far as possible. I need some
> information about the requirement: why do we suddenly need this patch?

But a gap is not a bug. Theoretically, we are adding a feature.

On the other hand, it would be a bug if the API promises to apply
the offset but in reality doesn't. I will check whether this is the
case; either way it needs to be well described in the commit message.

Thanks,
Andi
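
For reference, wiring the same offset into the CPU fault path could look
roughly like the sketch below. This is only an illustration derived from
the GTT-path arithmetic in this patch, not the actual trybot version:

/*
 * Hypothetical sketch: mirror the GTT path's fake-offset arithmetic in
 * vm_fault_cpu() and pass it to remap_io_sg() instead of the hardcoded 0.
 * Abridged; only the relevant lines are shown.
 */
static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
{
	struct vm_area_struct *area = vmf->vma;
	struct i915_mmap_offset *mmo = area->vm_private_data;
	struct drm_i915_gem_object *obj = mmo->obj;
	unsigned long obj_offset;
	...
	/* vm_pgoff carries the fake offset; subtracting the vma node
	 * start leaves the page offset into the object */
	obj_offset = area->vm_pgoff - drm_vma_node_start(&mmo->vma_node);

	err = remap_io_sg(area,
			  area->vm_start, area->vm_end - area->vm_start,
			  obj->mm.pages->sgl, obj_offset, iomap);
	...
}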


Re: [PATCH] drm/i915/gem: Calculate object page offset for partial memory mapping

2024-03-26 Thread Nirmoy Das

Hi Andi,

On 3/26/2024 12:12 PM, Andi Shyti wrote:
> Hi Nirmoy,
> 
> ...
> 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > index a2195e28b625..57a2dda2c3cc 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > @@ -276,7 +276,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
> > >  	/* PTEs are revoked in obj->ops->put_pages() */
> > >  	err = remap_io_sg(area,
> > >  			  area->vm_start, area->vm_end - area->vm_start,
> > > -			  obj->mm.pages->sgl, iomap);
> > > +			  obj->mm.pages->sgl, 0, iomap);
> > 
> > Why don't we need partial mmap for CPU but only for GTT?
> 
> As far as I understood we don't. I have a version with the CPU
> offset as well in trybot[*]
> 
> But without support for segmented buffer objects, I don't know
> how much this has any effect.


You confused me more :) Why is a segmented buffer object needed for
partial CPU mmap but not for GTT?

From a high level, both GTT and CPU should support partial mmap unless
I'm missing something here.

> > Sounds like this also needs to be covered by IGT tests.
> 
> Yes, it does need some IGT work; I'm working on it.
> 
> > Don't we need a "Fixes" tag for this?
> 
> Why should we? I'm not fixing anything here,


If userspace expects partial mmap to work, then this is a bug/gap in
i915, so we need to backport this as far as possible. I need some
information about the requirement: why do we suddenly need this patch?

Regards,

Nirmoy


> I'm just
> recalculating the mapping, not starting from the beginning of the
> scatter page.
> 
> Andi
> 
> [*] https://patchwork.freedesktop.org/patch/584474/?series=131539&rev=2


Re: [PATCH] drm/i915/gem: Calculate object page offset for partial memory mapping

2024-03-26 Thread Andi Shyti
Hi Nirmoy,

...

> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > index a2195e28b625..57a2dda2c3cc 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > @@ -276,7 +276,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
> >  	/* PTEs are revoked in obj->ops->put_pages() */
> >  	err = remap_io_sg(area,
> >  			  area->vm_start, area->vm_end - area->vm_start,
> > -			  obj->mm.pages->sgl, iomap);
> > +			  obj->mm.pages->sgl, 0, iomap);
> 
> Why don't we need partial mmap for CPU but only for GTT?

As far as I understood we don't. I have a version with the CPU
offset as well in trybot[*]

But without support for segmented buffer objects, I don't know
how much this has any effect.

> Sounds like this also needs to be covered by IGT tests.

Yes, it does need some IGT work; I'm working on it.

> Don't we need a "Fixes" tag for this?

Why should we? I'm not fixing anything here, I'm just
recalculating the mapping, not starting from the beginning of the
scatter page.

Andi

[*] https://patchwork.freedesktop.org/patch/584474/?series=131539&rev=2


Re: [PATCH] drm/i915/gem: Calculate object page offset for partial memory mapping

2024-03-25 Thread Nirmoy Das

Hi Andi,

I have too many questions :) I think the patch makes sense, but I need
more context; see below.


On 3/25/2024 2:40 PM, Andi Shyti wrote:
> To enable partial memory mapping of GPU virtual memory, it's
> necessary to introduce an offset to the object's memory
> (obj->mm.pages) scatterlist. This adjustment compensates for
> instances when userspace mappings do not start from the beginning
> of the object.
> 
> Based on a patch by Chris Wilson.
> 
> Signed-off-by: Andi Shyti 
> Cc: Chris Wilson 
> Cc: Lionel Landwerlin 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 +++++---
>  drivers/gpu/drm/i915/i915_mm.c           | 12 +++++++++++-
>  drivers/gpu/drm/i915/i915_mm.h           |  3 ++-
>  3 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index a2195e28b625..57a2dda2c3cc 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -276,7 +276,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
>  	/* PTEs are revoked in obj->ops->put_pages() */
>  	err = remap_io_sg(area,
>  			  area->vm_start, area->vm_end - area->vm_start,
> -			  obj->mm.pages->sgl, iomap);
> +			  obj->mm.pages->sgl, 0, iomap);

Why don't we need partial mmap for CPU but only for GTT?

Sounds like this also needs to be covered by IGT tests. Don't we need a
"Fixes" tag for this?


Regards,

Nirmoy

  
>  
>  	if (area->vm_flags & VM_WRITE) {
>  		GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
> @@ -302,14 +302,16 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
>  	struct i915_ggtt *ggtt = to_gt(i915)->ggtt;
>  	bool write = area->vm_flags & VM_WRITE;
>  	struct i915_gem_ww_ctx ww;
> +	unsigned long obj_offset;
>  	intel_wakeref_t wakeref;
>  	struct i915_vma *vma;
>  	pgoff_t page_offset;
>  	int srcu;
>  	int ret;
>  
> -	/* We don't use vmf->pgoff since that has the fake offset */
> +	obj_offset = area->vm_pgoff - drm_vma_node_start(&mmo->vma_node);
>  	page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT;
> +	page_offset += obj_offset;
>  
>  	trace_i915_gem_object_fault(obj, page_offset, true, write);
>  
> @@ -404,7 +406,7 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
>  
>  	/* Finally, remap it using the new GTT offset */
>  	ret = remap_io_mapping(area,
> -			       area->vm_start + (vma->gtt_view.partial.offset << PAGE_SHIFT),
> +			       area->vm_start + ((vma->gtt_view.partial.offset - obj_offset) << PAGE_SHIFT),
>  			       (ggtt->gmadr.start + i915_ggtt_offset(vma)) >> PAGE_SHIFT,
>  			       min_t(u64, vma->size, area->vm_end - area->vm_start),
>  			       &ggtt->iomap);
> diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c
> index 7998bc74ab49..f5c97a620962 100644
> --- a/drivers/gpu/drm/i915/i915_mm.c
> +++ b/drivers/gpu/drm/i915/i915_mm.c
> @@ -122,13 +122,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
>   * @addr: target user address to start at
>   * @size: size of map area
>   * @sgl: Start sg entry
> + * @offset: offset from the start of the page
>   * @iobase: Use stored dma address offset by this address or pfn if -1
>   *
>   *  Note: this is only safe if the mm semaphore is held when called.
>   */
>  int remap_io_sg(struct vm_area_struct *vma,
>  		unsigned long addr, unsigned long size,
> -		struct scatterlist *sgl, resource_size_t iobase)
> +		struct scatterlist *sgl, unsigned long offset,
> +		resource_size_t iobase)
>  {
>  	struct remap_pfn r = {
>  		.mm = vma->vm_mm,
> @@ -141,6 +143,14 @@ int remap_io_sg(struct vm_area_struct *vma,
>  	/* We rely on prevalidation of the io-mapping to skip track_pfn(). */
>  	GEM_BUG_ON((vma->vm_flags & EXPECTED_FLAGS) != EXPECTED_FLAGS);
>  
> +	while (offset >= sg_dma_len(r.sgt.sgp) >> PAGE_SHIFT) {
> +		offset -= sg_dma_len(r.sgt.sgp) >> PAGE_SHIFT;
> +		r.sgt = __sgt_iter(__sg_next(r.sgt.sgp), use_dma(iobase));
> +		if (!r.sgt.sgp)
> +			return -EINVAL;
> +	}
> +	r.sgt.curr = offset << PAGE_SHIFT;
> +
>  	if (!use_dma(iobase))
>  		flush_cache_range(vma, addr, size);
>  
> diff --git a/drivers/gpu/drm/i915/i915_mm.h b/drivers/gpu/drm/i915/i915_mm.h
> index 04c8974d822b..69f9351b1a1c 100644
> --- a/drivers/gpu/drm/i915/i915_mm.h
> +++ b/drivers/gpu/drm/i915/i915_mm.h
> @@ -30,6 +30,7 @@ int remap_io_mapping(struct vm_area_struct *vma,
>  
>  int remap_io_sg(struct vm_area_struct *vma,
>  		unsigned long addr, unsigned long size,
> -		struct scatterlist *sgl, resource_size_t iobase);
> +		struct scatterlist *sgl, unsigned long offset,
> +		resource_size_t iobase);
>  
>  #endif /* __I915_MM_H__ */

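As a side note, the offset walk that the patch adds to remap_io_sg() can
be modeled in isolation. The following standalone userspace sketch
(hypothetical helper names; segment lengths expressed in pages) mirrors
that loop: it skips whole scatterlist entries until the remaining offset
falls inside one, then records the intra-entry offset, just as the patch
does with r.sgt.curr = offset << PAGE_SHIFT:

#include <stddef.h>
#include <stdio.h>

/* Model of the sg-entry walk: seg_pages[] stands in for the DMA
 * lengths of the scatterlist entries, expressed in pages. */
static int walk(const unsigned long *seg_pages, size_t nsegs,
		unsigned long offset, size_t *seg, unsigned long *curr)
{
	size_t i = 0;

	/* Skip whole entries, like the __sg_next() loop in the patch. */
	while (i < nsegs && offset >= seg_pages[i]) {
		offset -= seg_pages[i];
		i++;
	}
	if (i == nsegs)
		return -1;	/* offset past the end: -EINVAL in the patch */

	*seg = i;		/* entry the mapping starts in */
	*curr = offset;		/* intra-entry page offset (r.sgt.curr) */
	return 0;
}

int main(void)
{
	unsigned long segs[] = { 4, 2, 8 };	/* entries of 4, 2 and 8 pages */
	unsigned long curr;
	size_t seg;

	if (!walk(segs, 3, 5, &seg, &curr))
		printf("page 5 lives in entry %zu at page %lu\n", seg, curr);

	return 0;
}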

[PATCH] drm/i915/gem: Calculate object page offset for partial memory mapping

2024-03-25 Thread Andi Shyti
To enable partial memory mapping of GPU virtual memory, it's
necessary to introduce an offset to the object's memory
(obj->mm.pages) scatterlist. This adjustment compensates for
instances when userspace mappings do not start from the beginning
of the object.

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 +++++---
 drivers/gpu/drm/i915/i915_mm.c           | 12 +++++++++++-
 drivers/gpu/drm/i915/i915_mm.h           |  3 ++-
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index a2195e28b625..57a2dda2c3cc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -276,7 +276,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 	/* PTEs are revoked in obj->ops->put_pages() */
 	err = remap_io_sg(area,
 			  area->vm_start, area->vm_end - area->vm_start,
-			  obj->mm.pages->sgl, iomap);
+			  obj->mm.pages->sgl, 0, iomap);
 
 	if (area->vm_flags & VM_WRITE) {
 		GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
@@ -302,14 +302,16 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
 	struct i915_ggtt *ggtt = to_gt(i915)->ggtt;
 	bool write = area->vm_flags & VM_WRITE;
 	struct i915_gem_ww_ctx ww;
+	unsigned long obj_offset;
 	intel_wakeref_t wakeref;
 	struct i915_vma *vma;
 	pgoff_t page_offset;
 	int srcu;
 	int ret;
 
-	/* We don't use vmf->pgoff since that has the fake offset */
+	obj_offset = area->vm_pgoff - drm_vma_node_start(&mmo->vma_node);
 	page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT;
+	page_offset += obj_offset;
 
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
@@ -404,7 +406,7 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
 
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
-			       area->vm_start + (vma->gtt_view.partial.offset << PAGE_SHIFT),
+			       area->vm_start + ((vma->gtt_view.partial.offset - obj_offset) << PAGE_SHIFT),
 			       (ggtt->gmadr.start + i915_ggtt_offset(vma)) >> PAGE_SHIFT,
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->iomap);
diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c
index 7998bc74ab49..f5c97a620962 100644
--- a/drivers/gpu/drm/i915/i915_mm.c
+++ b/drivers/gpu/drm/i915/i915_mm.c
@@ -122,13 +122,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
  * @addr: target user address to start at
  * @size: size of map area
  * @sgl: Start sg entry
+ * @offset: offset from the start of the page
  * @iobase: Use stored dma address offset by this address or pfn if -1
  *
  *  Note: this is only safe if the mm semaphore is held when called.
  */
 int remap_io_sg(struct vm_area_struct *vma,
 		unsigned long addr, unsigned long size,
-		struct scatterlist *sgl, resource_size_t iobase)
+		struct scatterlist *sgl, unsigned long offset,
+		resource_size_t iobase)
 {
 	struct remap_pfn r = {
 		.mm = vma->vm_mm,
@@ -141,6 +143,14 @@ int remap_io_sg(struct vm_area_struct *vma,
 	/* We rely on prevalidation of the io-mapping to skip track_pfn(). */
 	GEM_BUG_ON((vma->vm_flags & EXPECTED_FLAGS) != EXPECTED_FLAGS);
 
+	while (offset >= sg_dma_len(r.sgt.sgp) >> PAGE_SHIFT) {
+		offset -= sg_dma_len(r.sgt.sgp) >> PAGE_SHIFT;
+		r.sgt = __sgt_iter(__sg_next(r.sgt.sgp), use_dma(iobase));
+		if (!r.sgt.sgp)
+			return -EINVAL;
+	}
+	r.sgt.curr = offset << PAGE_SHIFT;
+
 	if (!use_dma(iobase))
 		flush_cache_range(vma, addr, size);
 
diff --git a/drivers/gpu/drm/i915/i915_mm.h b/drivers/gpu/drm/i915/i915_mm.h
index 04c8974d822b..69f9351b1a1c 100644
--- a/drivers/gpu/drm/i915/i915_mm.h
+++ b/drivers/gpu/drm/i915/i915_mm.h
@@ -30,6 +30,7 @@ int remap_io_mapping(struct vm_area_struct *vma,
 
 int remap_io_sg(struct vm_area_struct *vma,
 		unsigned long addr, unsigned long size,
-		struct scatterlist *sgl, resource_size_t iobase);
+		struct scatterlist *sgl, unsigned long offset,
+		resource_size_t iobase);
 
 #endif /* __I915_MM_H__ */
-- 
2.43.0
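
For completeness, here is a minimal userspace illustration of what a
partial mapping looks like from the other side. It is a hypothetical
example (render-node path assumed, error handling trimmed): the
sub-range is selected by adding a page offset to the fake offset
returned by DRM_IOCTL_I915_GEM_MMAP_OFFSET, and it is exactly that
added offset which obj_offset recovers in the fault handler:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <drm/i915_drm.h>

int main(void)
{
	/* assumed render node; adjust for your system */
	int fd = open("/dev/dri/renderD128", O_RDWR);
	if (fd < 0)
		return 1;

	/* create an 8-page GEM object */
	struct drm_i915_gem_create create = { .size = 8 * 4096 };
	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create))
		return 1;

	/* ask the kernel for the object's fake mmap offset */
	struct drm_i915_gem_mmap_offset mmo = {
		.handle = create.handle,
		.flags = I915_MMAP_OFFSET_WB,
	};
	if (ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &mmo))
		return 1;

	/* partial mapping: map only pages 2..5 of the object by
	 * offsetting into the fake-offset range */
	void *ptr = mmap(NULL, 4 * 4096, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, mmo.offset + 2 * 4096);
	if (ptr == MAP_FAILED)
		return 1;

	printf("pages 2..5 mapped at %p\n", ptr);
	munmap(ptr, 4 * 4096);
	close(fd);
	return 0;
}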