Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-07 Thread Michael Cheng

Ah Thanks for the great feedback!

@Lucas or @Matt, could you please chime in?

Michael Cheng

On 2022-03-02 11:10 a.m., Robin Murphy wrote:

On 2022-03-02 15:55, Michael Cheng wrote:

Thanks for the feedback Robin!

Sorry my choices of word weren't that great, but what I meant is to 
understand how ARM flushes a range of dcache for device drivers, and 
not an equal to x86 clflush.


I believe the concern is if the CPU writes an update, that update 
might only be sitting in the CPU cache and never make it to device 
memory where the device can see it; there are specific places that we 
are supposed to flush the CPU caches to make sure our updates are 
visible to the hardware.


Ah, OK, if it's more about ordering, and it's actually write buffers 
rather than caches that you care about flushing, then we might be a 
lot safer, phew!


For a very simple overview, in a case where the device itself needs to 
observe memory writes in the correct order, e.g.:


data_descriptor.valid = 1;

clflush(_descriptor);

command_descriptor.data = _descriptor

writel(/* control register to read command to then read data */)

then dma_wmb() between the first two writes should be the right tool 
to ensure that the command does not observe the command update while 
the data update is still sat somewhere in a CPU write buffer.


If you want a slightly stronger notion that, at a given point, all 
prior writes have actually been issued and should now be visible 
(rather than just that they won't become visible in the wrong order 
whenever they do), then wmb() should suffice on arm64.


Note that wioth arm64 memory types, a Non-Cacheable mapping of DRAM 
for a non-coherent DMA mapping, or of VRAM in a prefetchable BAR, can 
still be write-buffered, so barriers still matter even when actual 
cache maintenance ops don't (and as before if you're trying to perform 
cache maintenance outside the DMA API then you've already lost 
anyway). MMIO registers should be mapped as Device memory via 
ioremap(), which is not bufferable, hence the barrier implicit in 
writel() effectively pushes out any prior buffered writes ahead of a 
register write, which is why we don't need to worry about this most of 
the time.


This is only a very rough overview, though, and I'm not familiar 
enough with x86 semantics, your hardware, or the exact use-case to be 
able to say whether barriers alone are anywhere near the right answer 
or not.


Robin.



+Matt Roper

Matt, Lucas, any feed back here?

On 2022-03-02 4:49 a.m., Robin Murphy wrote:

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope 
them

    in sooner.

Why does i915 need to ensure the CPU's instruction cache is 
coherent with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seem to be the equivalently to what
    x86 is doing for dcache flushing, but it was giving me build 
errors

    since its not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would 
you (or
    someone in the ARM community) suggest we do? Could it be 
possible to

    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric 
cache maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a 
single vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. 
Whether the host CPU architecture happens to be AArch64, RISC-V, or 
whatever doesn't really matter, you're at the mercy of 3rd-party 
PCIe and interconnect IP vendors, and SoC integrators. You'll find 
yourself in systems where PCIe simply cannot snoop any caches, where 
you'd better have the correct DMA API calls in place to have any 
hope of even the most basic functionality working properly; you'll 
find yourself in systems where even if the PCIe root complex claims 
to support No Snoop, your uncached traffic will still end up 
snooping stale data that got prefetched back into caches you thought 
you'd invalidated; you'll find yourself in systems where your memory 
attributes may or may not get forcibly rewritten by an IOMMU 
depending on the kernel config and/or command line.


It's not about simply finding a substitute for clflush, it's that 
the reasons you have for using clflush in the first place can no 
longer be assumed to be valid.


Robin.


On 2022-02-25 

Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Robin Murphy

On 2022-03-02 15:55, Michael Cheng wrote:

Thanks for the feedback Robin!

Sorry my choices of word weren't that great, but what I meant is to 
understand how ARM flushes a range of dcache for device drivers, and not 
an equal to x86 clflush.


I believe the concern is if the CPU writes an update, that update might 
only be sitting in the CPU cache and never make it to device memory 
where the device can see it; there are specific places that we are 
supposed to flush the CPU caches to make sure our updates are visible to 
the hardware.


Ah, OK, if it's more about ordering, and it's actually write buffers 
rather than caches that you care about flushing, then we might be a lot 
safer, phew!


For a very simple overview, in a case where the device itself needs to 
observe memory writes in the correct order, e.g.:


data_descriptor.valid = 1;

clflush(_descriptor);

command_descriptor.data = _descriptor

writel(/* control register to read command to then read data */)

then dma_wmb() between the first two writes should be the right tool to 
ensure that the command does not observe the command update while the 
data update is still sat somewhere in a CPU write buffer.


If you want a slightly stronger notion that, at a given point, all prior 
writes have actually been issued and should now be visible (rather than 
just that they won't become visible in the wrong order whenever they 
do), then wmb() should suffice on arm64.


Note that wioth arm64 memory types, a Non-Cacheable mapping of DRAM for 
a non-coherent DMA mapping, or of VRAM in a prefetchable BAR, can still 
be write-buffered, so barriers still matter even when actual cache 
maintenance ops don't (and as before if you're trying to perform cache 
maintenance outside the DMA API then you've already lost anyway). MMIO 
registers should be mapped as Device memory via ioremap(), which is not 
bufferable, hence the barrier implicit in writel() effectively pushes 
out any prior buffered writes ahead of a register write, which is why we 
don't need to worry about this most of the time.


This is only a very rough overview, though, and I'm not familiar enough 
with x86 semantics, your hardware, or the exact use-case to be able to 
say whether barriers alone are anywhere near the right answer or not.


Robin.



+Matt Roper

Matt, Lucas, any feed back here?

On 2022-03-02 4:49 a.m., Robin Murphy wrote:

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope them
    in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seem to be the equivalently to what
    x86 is doing for dcache flushing, but it was giving me build errors
    since its not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would you (or
    someone in the ARM community) suggest we do? Could it be possible to
    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric 
cache maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a single 
vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. 
Whether the host CPU architecture happens to be AArch64, RISC-V, or 
whatever doesn't really matter, you're at the mercy of 3rd-party PCIe 
and interconnect IP vendors, and SoC integrators. You'll find yourself 
in systems where PCIe simply cannot snoop any caches, where you'd 
better have the correct DMA API calls in place to have any hope of 
even the most basic functionality working properly; you'll find 
yourself in systems where even if the PCIe root complex claims to 
support No Snoop, your uncached traffic will still end up snooping 
stale data that got prefetched back into caches you thought you'd 
invalidated; you'll find yourself in systems where your memory 
attributes may or may not get forcibly rewritten by an IOMMU depending 
on the kernel config and/or command line.


It's not about simply finding a substitute for clflush, it's that the 
reasons you have for using clflush in the first place can no longer be 
assumed to be valid.


Robin.


On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


On 

Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Alex Deucher
On Wed, Mar 2, 2022 at 10:55 AM Michael Cheng  wrote:
>
> Thanks for the feedback Robin!
>
> Sorry my choices of word weren't that great, but what I meant is to
> understand how ARM flushes a range of dcache for device drivers, and not
> an equal to x86 clflush.
>
> I believe the concern is if the CPU writes an update, that update might
> only be sitting in the CPU cache and never make it to device memory
> where the device can see it; there are specific places that we are
> supposed to flush the CPU caches to make sure our updates are visible to
> the hardware.
>
> +Matt Roper
>
> Matt, Lucas, any feed back here?

MMIO (e.g., PCI BARs, etc.) should be mapped uncached.  If it's not
you'll have a lot of problems using a GPU on that architecture.  One
thing that you may want to check is if your device has its own caches
or write queues on the BAR aperture.  You may have to flush them after
CPU access to the BAR to make sure CPU updates land in device memory.
For system memory, PCI, per the spec, should be cache coherent with
the CPU.  If it's not, you'll have a lot of trouble using a GPU on
that platform.

Alex

>
> On 2022-03-02 4:49 a.m., Robin Murphy wrote:
> > On 2022-02-25 19:27, Michael Cheng wrote:
> >> Hi Robin,
> >>
> >> [ +arm64 maintainers for their awareness, which would have been a
> >> good thing to do from the start ]
> >>
> >>   * Thanks for adding the arm64 maintainer and sorry I didn't rope them
> >> in sooner.
> >>
> >> Why does i915 need to ensure the CPU's instruction cache is coherent
> >> with its data cache? Is it a self-modifying driver?
> >>
> >>   * Also thanks for pointing this out. Initially I was using
> >> dcache_clean_inval_poc, which seem to be the equivalently to what
> >> x86 is doing for dcache flushing, but it was giving me build errors
> >> since its not on the global list of kernel symbols. And after
> >> revisiting the documentation for caches_clean_inval_pou, it won't
> >> fly for what we are trying to do. Moving forward, what would you (or
> >> someone in the ARM community) suggest we do? Could it be possible to
> >> export dcache_clean_inval_poc as a global symbol?
> >
> > Unlikely, unless something with a legitimate need for CPU-centric
> > cache maintenance like kexec or CPU hotplug ever becomes modular.
> >
> > In the case of a device driver, it's not even the basic issues of
> > assuming to find direct equivalents to x86 semantics in other CPU
> > architectures, or effectively reinventing parts of the DMA API, it's
> > even bigger than that. Once you move from being integrated in a single
> > vendor's system architecture to being on a discrete card, you
> > fundamentally *no longer have any control over cache coherency*.
> > Whether the host CPU architecture happens to be AArch64, RISC-V, or
> > whatever doesn't really matter, you're at the mercy of 3rd-party PCIe
> > and interconnect IP vendors, and SoC integrators. You'll find yourself
> > in systems where PCIe simply cannot snoop any caches, where you'd
> > better have the correct DMA API calls in place to have any hope of
> > even the most basic functionality working properly; you'll find
> > yourself in systems where even if the PCIe root complex claims to
> > support No Snoop, your uncached traffic will still end up snooping
> > stale data that got prefetched back into caches you thought you'd
> > invalidated; you'll find yourself in systems where your memory
> > attributes may or may not get forcibly rewritten by an IOMMU depending
> > on the kernel config and/or command line.
> >
> > It's not about simply finding a substitute for clflush, it's that the
> > reasons you have for using clflush in the first place can no longer be
> > assumed to be valid.
> >
> > Robin.
> >
> >> On 2022-02-25 10:24 a.m., Robin Murphy wrote:
> >>> [ +arm64 maintainers for their awareness, which would have been a
> >>> good thing to do from the start ]
> >>>
> >>> On 2022-02-25 03:24, Michael Cheng wrote:
>  Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
>  performs a flush by first performing a clean, follow by an
>  invalidation
>  operation.
> 
>  v2 (Michael Cheng): Use correct macro for cleaning and invalidation
>  the
>  dcache. Thanks Tvrtko for the suggestion.
> 
>  v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h
> 
>  v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
>  symbol that could be use by other modules, thus use
>  caches_clean_inval_pou instead. Also this version
>  removes include for cacheflush, since its already
>  included base on architecture type.
> 
>  Signed-off-by: Michael Cheng 
>  Reviewed-by: Matt Roper 
>  ---
>    drivers/gpu/drm/drm_cache.c | 5 +
>    1 file changed, 5 insertions(+)
> 
>  diff --git a/drivers/gpu/drm/drm_cache.c 

Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Michael Cheng

Thanks for the feedback Robin!

Sorry my choices of word weren't that great, but what I meant is to 
understand how ARM flushes a range of dcache for device drivers, and not 
an equal to x86 clflush.


I believe the concern is if the CPU writes an update, that update might 
only be sitting in the CPU cache and never make it to device memory 
where the device can see it; there are specific places that we are 
supposed to flush the CPU caches to make sure our updates are visible to 
the hardware.


+Matt Roper

Matt, Lucas, any feed back here?

On 2022-03-02 4:49 a.m., Robin Murphy wrote:

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope them
    in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seem to be the equivalently to what
    x86 is doing for dcache flushing, but it was giving me build errors
    since its not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would you (or
    someone in the ARM community) suggest we do? Could it be possible to
    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric 
cache maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a single 
vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. 
Whether the host CPU architecture happens to be AArch64, RISC-V, or 
whatever doesn't really matter, you're at the mercy of 3rd-party PCIe 
and interconnect IP vendors, and SoC integrators. You'll find yourself 
in systems where PCIe simply cannot snoop any caches, where you'd 
better have the correct DMA API calls in place to have any hope of 
even the most basic functionality working properly; you'll find 
yourself in systems where even if the PCIe root complex claims to 
support No Snoop, your uncached traffic will still end up snooping 
stale data that got prefetched back into caches you thought you'd 
invalidated; you'll find yourself in systems where your memory 
attributes may or may not get forcibly rewritten by an IOMMU depending 
on the kernel config and/or command line.


It's not about simply finding a substitute for clflush, it's that the 
reasons you have for using clflush in the first place can no longer be 
assumed to be valid.


Robin.


On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an 
invalidation

operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation 
the

    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do 
actually have half an idea of what you're trying to do here and why 
it won't fly, but I'd like to at least assume you've read the 
documentation of the function you decided was OK to use)



+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Robin Murphy

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope them
    in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seem to be the equivalently to what
    x86 is doing for dcache flushing, but it was giving me build errors
    since its not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would you (or
    someone in the ARM community) suggest we do? Could it be possible to
    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric cache 
maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a single 
vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. Whether 
the host CPU architecture happens to be AArch64, RISC-V, or whatever 
doesn't really matter, you're at the mercy of 3rd-party PCIe and 
interconnect IP vendors, and SoC integrators. You'll find yourself in 
systems where PCIe simply cannot snoop any caches, where you'd better 
have the correct DMA API calls in place to have any hope of even the 
most basic functionality working properly; you'll find yourself in 
systems where even if the PCIe root complex claims to support No Snoop, 
your uncached traffic will still end up snooping stale data that got 
prefetched back into caches you thought you'd invalidated; you'll find 
yourself in systems where your memory attributes may or may not get 
forcibly rewritten by an IOMMU depending on the kernel config and/or 
command line.


It's not about simply finding a substitute for clflush, it's that the 
reasons you have for using clflush in the first place can no longer be 
assumed to be valid.


Robin.


On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do 
actually have half an idea of what you're trying to do here and why it 
won't fly, but I'd like to at least assume you've read the 
documentation of the function you decided was OK to use)



+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Michael Cheng

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


 * Thanks for adding the arm64 maintainer and sorry I didn't rope them
   in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


 * Also thanks for pointing this out. Initially I was using
   dcache_clean_inval_poc, which seem to be the equivalently to what
   x86 is doing for dcache flushing, but it was giving me build errors
   since its not on the global list of kernel symbols. And after
   revisiting the documentation for caches_clean_inval_pou, it won't
   fly for what we are trying to do. Moving forward, what would you (or
   someone in the ARM community) suggest we do? Could it be possible to
   export dcache_clean_inval_poc as a global symbol?

On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do 
actually have half an idea of what you're trying to do here and why it 
won't fly, but I'd like to at least assume you've read the 
documentation of the function you decided was OK to use)



+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif

Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Matthew Wilcox
On Fri, Feb 25, 2022 at 06:42:37PM +, Tvrtko Ursulin wrote:
> Matthew, what do you think fix for this build warning on h8300 and s390 
> should be? Or perhaps a build environment issue with kernel test robot?

I'd suggest this should do the job:

+++ b/include/linux/cacheflush.h
@@ -4,6 +4,8 @@

 #include 

+struct folio;
+
 #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
 #ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
 void flush_dcache_folio(struct folio *folio);



Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Tvrtko Ursulin




On 25/02/2022 18:23, Michael Cheng wrote:
These seem to be pretty old arch and are day0 warnings, please refer to 
[1] to see the warnings. Also I am not sure why my patch series didn't 
append to the old one.


[1] https://patchwork.freedesktop.org/patch/475829/?series=99450=11



include/linux/cacheflush.h:12:46: warning: declaration of 'struct folio' will 
not be visible outside of this function [-Wvisibility]


That?

Looks like the #else path needs to forward declare struct folio or include the 
relevant header.

+Matthew Wilcox

Matthew, what do you think fix for this build warning on h8300 and s390 should 
be? Or perhaps a build environment issue with kernel test robot?

Regards,

Tvrtko
 

2022-02-25 10:19 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 17:40, Michael Cheng wrote:
Ah, thanks for pointing that out, when I do include it though, it 
causes a few warning other systems such as h8300 and s390.


Errors look like? I haven't heard that kernel code is not allowed to 
include something from linux/ on some arch yet.


Since it is already pulled is, would it be OK to leave it out for 
this case? Or we could use something like !IS_H8300 and !IS_S390


around the header file?


Unlikely, now you made me curious why it does not work.

Regards,

Tvrtko



On 2022-02-25 9:33 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 16:52, Michael Cheng wrote:

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for 
arm64 or x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am 
still

trying to understand how things works!


Probably:

drm_cache.c:

#include 

linux/highmem.h:

#include 

But it is more correct to explicitly include what you use. So if 
drm_cache.c uses stuff declared in cacheflush.h, it should include it.


Regards,

Tvrtko


Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an 
invalidation

operation.

v2 (Michael Cheng): Use correct macro for cleaning and 
invalidation the

    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc 
as a

    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? 
Some of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c 
b/drivers/gpu/drm/drm_cache.c

index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned 
long)end);

+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Robin Murphy
[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
symbol that could be use by other modules, thus use
caches_clean_inval_pou instead. Also this version
removes include for cacheflush, since its already
included base on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
  
  	if (wbinvd_on_all_cpus())

pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+   void *end = addr + length;
+   caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do actually 
have half an idea of what you're trying to do here and why it won't fly, 
but I'd like to at least assume you've read the documentation of the 
function you decided was OK to use)



+
  #else
WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Michael Cheng
These seem to be pretty old arch and are day0 warnings, please refer to 
[1] to see the warnings. Also I am not sure why my patch series didn't 
append to the old one.


[1] https://patchwork.freedesktop.org/patch/475829/?series=99450=11

2022-02-25 10:19 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 17:40, Michael Cheng wrote:
Ah, thanks for pointing that out, when I do include it though, it 
causes a few warning other systems such as h8300 and s390.


Errors look like? I haven't heard that kernel code is not allowed to 
include something from linux/ on some arch yet.


Since it is already pulled is, would it be OK to leave it out for 
this case? Or we could use something like !IS_H8300 and !IS_S390


around the header file?


Unlikely, now you made me curious why it does not work.

Regards,

Tvrtko



On 2022-02-25 9:33 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 16:52, Michael Cheng wrote:

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for 
arm64 or x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am 
still

trying to understand how things works!


Probably:

drm_cache.c:

#include 

linux/highmem.h:

#include 

But it is more correct to explicitly include what you use. So if 
drm_cache.c uses stuff declared in cacheflush.h, it should include it.


Regards,

Tvrtko


Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an 
invalidation

operation.

v2 (Michael Cheng): Use correct macro for cleaning and 
invalidation the

    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc 
as a

    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? 
Some of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c 
b/drivers/gpu/drm/drm_cache.c

index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned 
long)end);

+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Tvrtko Ursulin



On 25/02/2022 17:40, Michael Cheng wrote:
Ah, thanks for pointing that out, when I do include it though, it causes 
a few warning other systems such as h8300 and s390.


Errors look like? I haven't heard that kernel code is not allowed to 
include something from linux/ on some arch yet.


Since it is already pulled is, would it be OK to leave it out for this 
case? Or we could use something like !IS_H8300 and !IS_S390


around the header file?


Unlikely, now you made me curious why it does not work.

Regards,

Tvrtko



On 2022-02-25 9:33 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 16:52, Michael Cheng wrote:

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for arm64 
or x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am 
still

trying to understand how things works!


Probably:

drm_cache.c:

#include 

linux/highmem.h:

#include 

But it is more correct to explicitly include what you use. So if 
drm_cache.c uses stuff declared in cacheflush.h, it should include it.


Regards,

Tvrtko


Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an 
invalidation

operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation 
the

    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? 
Some of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);
+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Michael Cheng
Ah, thanks for pointing that out, when I do include it though, it causes 
a few warning other systems such as h8300 and s390.


Since it is already pulled is, would it be OK to leave it out for this 
case? Or we could use something like !IS_H8300 and !IS_S390


around the header file?

On 2022-02-25 9:33 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 16:52, Michael Cheng wrote:

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for arm64 
or x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am 
still

trying to understand how things works!


Probably:

drm_cache.c:

#include 

linux/highmem.h:

#include 

But it is more correct to explicitly include what you use. So if 
drm_cache.c uses stuff declared in cacheflush.h, it should include it.


Regards,

Tvrtko


Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an 
invalidation

operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation 
the

    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? 
Some of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);
+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Tvrtko Ursulin



On 25/02/2022 16:52, Michael Cheng wrote:

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for arm64 or 
x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am still
trying to understand how things works!


Probably:

drm_cache.c:

#include 

linux/highmem.h:

#include 

But it is more correct to explicitly include what you use. So if 
drm_cache.c uses stuff declared in cacheflush.h, it should include it.


Regards,

Tvrtko


Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? Some 
of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);
+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Michael Cheng

Hi Tvrtko,

It seems without cacheflush.h being included, when I build for arm64 or 
x86, it stills pulls in cacheflush.h:


./.drm_cache.o.cmd:838: include/linux/cacheflush.h \
./.drm_cache.o.cmd:839: arch/x86/include/asm/cacheflush.h \
./.drm_cache.o.cmd:920: include/asm-generic/cacheflush.h \
./.drm_cache.o.cmd:830: include/linux/cacheflush.h \
./.drm_cache.o.cmd:831: arch/arm64/include/asm/cacheflush.h \
./.drm_cache.o.cmd:1085: include/asm-generic/cacheflush.h \
So it seems without including it, cacheflush.h stills get pulled in,
I think its because its a required kernel source to build the kernel
per specific architecture, but please correct if I am wrong,as I am still
trying to understand how things works!
Michael Cheng
On 2022-02-25 8:28 a.m., Tvrtko Ursulin wrote:


On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be use by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes include for cacheflush, since its already
    included base on architecture type.


What does it mean that it is included based on architecture type? Some 
of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);
+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif

Re: [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-02-25 Thread Tvrtko Ursulin



On 25/02/2022 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, follow by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidation the
dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
symbol that could be use by other modules, thus use
caches_clean_inval_pou instead. Also this version
removes include for cacheflush, since its already
included base on architecture type.


What does it mean that it is included based on architecture type? Some 
of the other header already pulls it in?


Regards,

Tvrtko


Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
  
  	if (wbinvd_on_all_cpus())

pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+   void *end = addr + length;
+   caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);
+
  #else
WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif