Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-30 Thread Oleksandr Andrushchenko
Dropped in favor of https://lkml.org/lkml/2019/1/29/910

On 1/24/19 5:02 PM, Julien Grall wrote:
>
>
> On 24/01/2019 14:34, Oleksandr Andrushchenko wrote:
>> Hello, Julien!
>
> Hi,
>
>> On 1/22/19 1:44 PM, Julien Grall wrote:
>>>
>>>
>>> On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:
 Hello, Julien!
>>>
>>> Hi,
>>>
 On 1/21/19 7:09 PM, Julien Grall wrote:
 Well, I didn't get the attributes of pages at the backend side, but 
 IMO
 those
 do not matter in my use-case (for simplicity I am not using
 zero-copying at
 backend side):
>>>
>>> They are actually important no matter what is your use case. If you
>>> access the same physical page with different attributes, then you are
>>> asking for trouble.
>> So, we have:
>>
>> DomU: frontend side
>> 
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> I still don't understand how you came up with MT_NORMAL when you seem 
> to confirm...
>
>>
>> DomD: backend side
>> 
>> PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
>> PTE_UXN + PTE_ATTRINDX(MT_NORMAL)
>>
>>   From the above it seems that I don't violate cached/non-cached
>> agreement here
>>>
>>> This is why Xen imposes all the pages shared to have their memory
>>> attributes following some rules. Actually, speaking with Mark R., we
>>> may want to tight a bit more the attributes.
>>>

 1. Frontend device allocates display buffer pages which come from 
 shmem
 and have these attributes:
 !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
 PTE_ATTRINDX(MT_NORMAL)
>>>
>>> My knowledge of Xen DRM is inexistent. However, looking at the code in
>>> 5.0-rc2, I don't seem to find the same attributes. For instance
>>> xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
>>> pgprot_writecombine. So it looks like, the mapping will be
>>> non-cacheable on Arm64.
>>>
>>> Can you explain how you came up to these attributes?
>> pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
>> applicable here? [1]
>
> ... that MT_NORMAL_NC is used for the frontend pages.
>
> MT_NORMAL_NC is different from MT_NORMAL. The use of the former will 
> result to non-cacheable memory while the latter will result to 
> cacheable memory.
>
> To me, this looks like the exact reason why you see artifact on the 
> display buffer. As the author of this code, can you explain why you 
> decided to use pgprot_writecombine here instead of relying on the 
> default VMA prot?
>
> [...]
>
>>> We actually never required to use cache flush in other PV protocol, so
>>> I still don't understand why the PV DRM should be different here.
>> Well, you are right. But at the same time not flushing the buffer makes
>> troubles,
>> so this is why I am trying to figure out what is wrong here.
>
> The cache flush is likely hiding the real problem rather than solving it.
>
>>>
>>> To me, it looks like that you are either missing some barriers
>> Barriers for the buffer? Not sure what you mean here.
>
> If you share information between two entities, you may need some 
> ordering so the information are seen consistently by the consumer 
> side. This can be achieved by using barriers.
>
>> Even more, we have
>> a use case
>> when the buffer is not touched by CPU in DomD and is solely owned by 
>> the HW.
>
> Memory ordering issues are subtle. The fact that one of your use-case 
> works does not imply that barriers are not necessary. I am not saying 
> there are a missing barriers, I am only pointed out potential reasons.
>
> Anyway, I don't think your problem is a missing barriers here. It is 
> more likely because of mismatch memory attributes (see above).
>
> Cheers,
>


Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-29 Thread Oleksandr Andrushchenko
On 1/24/19 5:02 PM, Julien Grall wrote:
>
>
> On 24/01/2019 14:34, Oleksandr Andrushchenko wrote:
>> Hello, Julien!
>
> Hi,
>
>> On 1/22/19 1:44 PM, Julien Grall wrote:
>>>
>>>
>>> On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:
 Hello, Julien!
>>>
>>> Hi,
>>>
 On 1/21/19 7:09 PM, Julien Grall wrote:
 Well, I didn't get the attributes of pages at the backend side, but 
 IMO
 those
 do not matter in my use-case (for simplicity I am not using
 zero-copying at
 backend side):
>>>
>>> They are actually important no matter what is your use case. If you
>>> access the same physical page with different attributes, then you are
>>> asking for trouble.
>> So, we have:
>>
>> DomU: frontend side
>> 
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> I still don't understand how you came up with MT_NORMAL when you seem 
> to confirm...
>
>>
>> DomD: backend side
>> 
>> PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
>> PTE_UXN + PTE_ATTRINDX(MT_NORMAL)
>>
>>   From the above it seems that I don't violate cached/non-cached
>> agreement here
>>>
>>> This is why Xen imposes all the pages shared to have their memory
>>> attributes following some rules. Actually, speaking with Mark R., we
>>> may want to tight a bit more the attributes.
>>>

 1. Frontend device allocates display buffer pages which come from 
 shmem
 and have these attributes:
 !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
 PTE_ATTRINDX(MT_NORMAL)
>>>
>>> My knowledge of Xen DRM is inexistent. However, looking at the code in
>>> 5.0-rc2, I don't seem to find the same attributes. For instance
>>> xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
>>> pgprot_writecombine. So it looks like, the mapping will be
>>> non-cacheable on Arm64.
>>>
>>> Can you explain how you came up to these attributes?
>> pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
>> applicable here? [1]
>
> ... that MT_NORMAL_NC is used for the frontend pages.
>
> MT_NORMAL_NC is different from MT_NORMAL. The use of the former will 
> result to non-cacheable memory while the latter will result to 
> cacheable memory.
>
> To me, this looks like the exact reason why you see artifact on the 
> display buffer. As the author of this code, can you explain why you 
> decided to use pgprot_writecombine here instead of relying on the 
> default VMA prot?
>
> [...]
Sorry for the late reply, it took me quite some time to re-check all the
use-cases that we have with PV DRM.
First of all, I found a bug in my debug code which reported page attributes,
and now I can confirm that all the use-cases use MT_NORMAL, on both the
backend and the frontend side.
You are right: there is no reason to have pgprot_writecombine, and indeed
I have to rely on the (almost) default VMA prot. I came up with the
following (used by omap drm, for example):

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index ae28ad4b4254..867800a2ed42 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -238,8 +238,7 @@ static int gem_mmap_obj(struct xen_gem_object *xen_obj,
     vma->vm_flags &= ~VM_PFNMAP;
     vma->vm_flags |= VM_MIXEDMAP;
     vma->vm_pgoff = 0;
-   vma->vm_page_prot =
- pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);

     {
     int cnt = xen_obj->num_pages > 5 ? 5 : xen_obj->num_pages;
@@ -295,7 +294,7 @@ void *xen_drm_front_gem_prime_vmap(struct drm_gem_object *gem_obj)
     return NULL;

     return vmap(xen_obj->pages, xen_obj->num_pages,
-   VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+   VM_MAP, PAGE_KERNEL);
  }

With the above, all the artifacts are gone now, as the page attributes are
the same across all domains. So I consider this the fix and will either
send it as v3 or, better, drop this patch and send a new one.

THANK YOU for helping to figure this out!
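
For completeness, the page-attribute check mentioned above boiled down to
something along these lines (a minimal sketch; the helper name and the exact
print format are illustrative, not code taken from the driver):

#include <linux/mm.h>
#include <linux/printk.h>

/*
 * Dump the protection bits actually used for a display buffer VMA so the
 * MT_* attribute index can be compared between the mapping in use and the
 * default one derived from the VMA flags.
 */
static void xen_drm_dbg_report_prot(struct vm_area_struct *vma)
{
	pr_info("xen-drm-front: vm_page_prot=0x%llx default=0x%llx\n",
		(unsigned long long)pgprot_val(vma->vm_page_prot),
		(unsigned long long)pgprot_val(vm_get_page_prot(vma->vm_flags)));
}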
>
>>> We actually never required to use cache flush in other PV protocol, so
>>> I still don't understand why the PV DRM should be different here.
>> Well, you are right. But at the same time not flushing the buffer makes
>> troubles,
>> so this is why I am trying to figure out what is wrong here.
>
> The cache flush is likely hiding the real problem rather than solving it.
>
It does hide the real issue, and I have confirmed this.
>>>
>>> To me, it looks like that you are either missing some barriers
>> Barriers for the buffer? Not sure what you mean here.
>
> If you share information between two entities, you may need some 
> ordering so the information are seen consistently by the consumer 
> side. This can be achieved by using barriers.
>
>> Even more, we have
>> a use case
>> when the buffer is not touched by CPU in DomD and is solely owned by 
>> the HW.
>

Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-24 Thread Julien Grall




On 24/01/2019 14:34, Oleksandr Andrushchenko wrote:

Hello, Julien!


Hi,


On 1/22/19 1:44 PM, Julien Grall wrote:



On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:

Hello, Julien!


Hi,


On 1/21/19 7:09 PM, Julien Grall wrote:
Well, I didn't get the attributes of pages at the backend side, but IMO
those
do not matter in my use-case (for simplicity I am not using
zero-copying at
backend side):


They are actually important no matter what is your use case. If you
access the same physical page with different attributes, then you are
asking for trouble.

So, we have:

DomU: frontend side

!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)


I still don't understand how you came up with MT_NORMAL when you seem to 
confirm...



DomD: backend side

PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
PTE_UXN + PTE_ATTRINDX(MT_NORMAL)

  From the above it seems that I don't violate cached/non-cached
agreement here


This is why Xen imposes all the pages shared to have their memory
attributes following some rules. Actually, speaking with Mark R., we
may want to tight a bit more the attributes.



1. Frontend device allocates display buffer pages which come from shmem
and have these attributes:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)


My knowledge of Xen DRM is inexistent. However, looking at the code in
5.0-rc2, I don't seem to find the same attributes. For instance
xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
pgprot_writecombine. So it looks like, the mapping will be
non-cacheable on Arm64.

Can you explain how you came up to these attributes?

pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
applicable here? [1]


... that MT_NORMAL_NC is used for the frontend pages.

MT_NORMAL_NC is different from MT_NORMAL. The use of the former will 
result in non-cacheable memory while the latter will result in 
cacheable memory.


To me, this looks like the exact reason why you see artifacts on the 
display buffer. As the author of this code, can you explain why you 
decided to use pgprot_writecombine here instead of relying on the 
default VMA prot?


[...]


We actually never required to use cache flush in other PV protocol, so
I still don't understand why the PV DRM should be different here.

Well, you are right. But at the same time not flushing the buffer makes
troubles,
so this is why I am trying to figure out what is wrong here.


The cache flush is likely hiding the real problem rather than solving it.



To me, it looks like that you are either missing some barriers

Barriers for the buffer? Not sure what you mean here.


If you share information between two entities, you may need some 
ordering so the information is seen consistently by the consumer 
side. This can be achieved by using barriers.



Even more, we have
a use case
when the buffer is not touched by CPU in DomD and is solely owned by the HW.


Memory ordering issues are subtle. The fact that one of your use-cases works does 
not imply that barriers are not necessary. I am not saying there are missing 
barriers, I am only pointing out potential reasons.


Anyway, I don't think your problem is a missing barrier here. It is more likely 
because of mismatched memory attributes (see above).


Cheers,

--
Julien Grall
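
For illustration, the ordering referred to above is the usual producer/consumer
pattern used by the Xen PV ring protocols. A minimal sketch follows; the ring
layout is simplified and is not the xen-front request ring, and the virt_*
barriers are the ones the generic ring code uses because the consumer may live
in another domain:

#include <linux/compiler.h>
#include <linux/types.h>
#include <asm/barrier.h>

/* Simplified shared ring: the producer publishes the data, then the index. */
struct demo_ring {
	unsigned int prod;		/* written by the producer only */
	unsigned int cons;		/* written by the consumer only */
	unsigned long data[16];
};

static void demo_produce(struct demo_ring *r, unsigned long v)
{
	r->data[r->prod & 15] = v;
	/* Make the payload visible before the index that announces it. */
	virt_wmb();
	WRITE_ONCE(r->prod, r->prod + 1);
}

static bool demo_consume(struct demo_ring *r, unsigned long *v)
{
	if (READ_ONCE(r->prod) == r->cons)
		return false;
	/* Read the index before the payload it covers. */
	virt_rmb();
	*v = r->data[r->cons & 15];
	r->cons++;
	return true;
}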


Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-24 Thread Oleksandr Andrushchenko
Hello, Julien!
Sorry for the late reply - it took quite some time to collect the data 
requested.

On 1/22/19 1:44 PM, Julien Grall wrote:
>
>
> On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:
>> Hello, Julien!
>
> Hi,
>
>> On 1/21/19 7:09 PM, Julien Grall wrote:
>> Well, I didn't get the attributes of pages at the backend side, but IMO
>> those
>> do not matter in my use-case (for simplicity I am not using 
>> zero-copying at
>> backend side):
>
> They are actually important no matter what is your use case. If you 
> access the same physical page with different attributes, then you are 
> asking for trouble.
So, we have:

DomU: frontend side

!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN + 
PTE_ATTRINDX(MT_NORMAL)

DomD: backend side

PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT + 
PTE_UXN + PTE_ATTRINDX(MT_NORMAL)

 From the above it seems that I don't violate cached/non-cached 
agreement here
>
> This is why Xen imposes all the pages shared to have their memory 
> attributes following some rules. Actually, speaking with Mark R., we 
> may want to tight a bit more the attributes.
>
>>
>> 1. Frontend device allocates display buffer pages which come from shmem
>> and have these attributes:
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> My knowledge of Xen DRM is inexistent. However, looking at the code in 
> 5.0-rc2, I don't seem to find the same attributes. For instance 
> xen_drm_front_gem_prime_vmap and gem_mmap_obj are using 
> pgprot_writecombine. So it looks like, the mapping will be 
> non-cacheable on Arm64.
>
> Can you explain how you came up to these attributes?
pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be 
applicable here? [1]
>
>>
>> 2. Frontend grants references to these pages and shares those with the
>> backend
>>
>> 3. Backend is a user-space application (Weston client), so it uses
>> gntdev kernel
>> driver to mmap the pages. The pages, which are used by gntdev, are those
>> coming
>> from the Xen balloon driver and I believe they are all normal memory and
>> shouldn't be non-cached.
>>
>> 4. Once the frontend starts displaying it flips the buffers and backend
>> does *memcpy*
>> from the frontend-backend shared buffer into Weston's buffer. This means
>> no HW at the backend side touches the shared buffer.
>>
>> 5. I can see distorted picture.
>>
>> Previously I used setup with zero-copying, so then the picture becomes
>> more complicated
>> in terms of buffers and how those used by the backed, but anyways it
>> seems that the
>> very basic scenario with memory copying doesn't work for me.
>>
>> Using DMA API on frontend's side does help - no artifacts are seen.
>> This is why I'm thinking that this is related to frontend/kernel side
>> rather then to
>> the backend side. This is why I'm thinking this is related to caches and
>> trying to figure
>> out what can be done here instead of using DMA API.
>
> We actually never required to use cache flush in other PV protocol, so 
> I still don't understand why the PV DRM should be different here.
Well, you are right. But at the same time not flushing the buffer causes 
trouble, so this is why I am trying to figure out what is wrong here.
>
> To me, it looks like that you are either missing some barriers
Barriers for the buffer? Not sure what you mean here. Even more, we have 
a use case when the buffer is not touched by the CPU in DomD and is solely 
owned by the HW.
> or the memory attributes are not correct.
Please see above - I do need your advice here
>
> Cheers,
>
Thank you very much for your time,
Oleksandr

[1] 
https://elixir.bootlin.com/linux/v5.0-rc2/source/arch/arm64/include/asm/pgtable.h#L414
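
For reference, this is roughly the definition behind [1] on arm64 around v5.0
(reproduced from memory of arch/arm64/include/asm/pgtable.h, so treat the exact
form as approximate); it overrides the attribute index with MT_NORMAL_NC, which
is why the resulting mapping ends up non-cacheable even though the underlying
shmem pages themselves are MT_NORMAL:

/* arch/arm64/include/asm/pgtable.h (approximate, v5.0 era) */
#define pgprot_writecombine(prot) \
	__pgprot_modify(prot, PTE_ATTRINDX_MASK, \
			PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)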

Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-22 Thread Julien Grall




On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:

Hello, Julien!


Hi,


On 1/21/19 7:09 PM, Julien Grall wrote:
Well, I didn't get the attributes of pages at the backend side, but IMO
those
do not matter in my use-case (for simplicity I am not using zero-copying at
backend side):


They are actually important no matter what your use case is. If you 
access the same physical page with different attributes, then you are 
asking for trouble.


This is why Xen requires all shared pages to have memory attributes that 
follow some rules. Actually, speaking with Mark R., we may want to tighten 
the attributes a bit more.




1. Frontend device allocates display buffer pages which come from shmem
and have these attributes:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)


My knowledge of Xen DRM is nonexistent. However, looking at the code in 
5.0-rc2, I don't seem to find the same attributes. For instance, 
xen_drm_front_gem_prime_vmap and gem_mmap_obj are using 
pgprot_writecombine. So it looks like the mapping will be non-cacheable 
on Arm64.


Can you explain how you came up with these attributes?



2. Frontend grants references to these pages and shares those with the
backend

3. Backend is a user-space application (Weston client), so it uses
gntdev kernel
driver to mmap the pages. The pages, which are used by gntdev, are those
coming
from the Xen balloon driver and I believe they are all normal memory and
shouldn't be non-cached.

4. Once the frontend starts displaying it flips the buffers and backend
does *memcpy*
from the frontend-backend shared buffer into Weston's buffer. This means
no HW at the backend side touches the shared buffer.

5. I can see distorted picture.

Previously I used setup with zero-copying, so then the picture becomes
more complicated
in terms of buffers and how those used by the backed, but anyways it
seems that the
very basic scenario with memory copying doesn't work for me.

Using DMA API on frontend's side does help - no artifacts are seen.
This is why I'm thinking that this is related to frontend/kernel side
rather then to
the backend side. This is why I'm thinking this is related to caches and
trying to figure
out what can be done here instead of using DMA API.


We have actually never required cache flushes in other PV protocols, so I 
still don't understand why PV DRM should be different here.


To me, it looks like you are either missing some barriers or the 
memory attributes are not correct.


Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-22 Thread Oleksandr Andrushchenko
Hello, Julien!

On 1/21/19 7:09 PM, Julien Grall wrote:
> Hello,
>
> On 21/01/2019 12:43, Oleksandr Andrushchenko wrote:
>> On 1/18/19 1:43 PM, Julien Grall wrote:
>>> On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:
 On 1/17/19 11:18 AM, Christoph Hellwig wrote:
> On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko
> wrote:
>>> This whole issue keeps getting more and more confusing.
>> Well, I don't really do DMA here, but instead the buffers in
>> question are shared with other Xen domain, so effectively it
>> could be thought of some sort of DMA here, where the "device" is
>> that remote domain. If the buffers are not flushed then the
>> remote part sees some inconsistency which in my case results
>> in artifacts on screen while displaying the buffers.
>> When buffers are allocated via DMA API then there are no artifacts;
>> if buffers are allocated with shmem + DMA mapping then there are no
>> artifacts as well.
>> The only offending use-case is when I use shmem backed buffers,
>> but do not flush them
> The right answer would be to implement cache maintainance hooks for
> this case in the Xen arch code.  These would basically look the same
> as the low-level cache maintainance used by the DMA ops, but without
> going through the DMA mapping layer, in fact they should probably
> reuse the same low-level assembly routines.
>
> I don't think this is the first usage of such Xen buffer sharing, so
> what do the other users do?
 I'll have to get even deeper into it. Initially I
 looked at the code, but didn't find anything useful.
 Or maybe I have just overlooked obvious things there
>>>  From Xen on Arm ABI:
>>>
>>> "All memory which is shared with other entities in the system
>>> (including the hypervisor and other guests) must reside in memory
>>> which is mapped as Normal Inner Write-Back Outer Write-Back
>>> Inner-Shareable.
>>> This applies to:
>>>    - hypercall arguments passed via a pointer to guest memory.
>>>    - memory shared via the grant table mechanism (including PV I/O
>>>  rings etc).
>>>    - memory shared with the hypervisor (struct shared_info, struct
>>>  vcpu_info, the grant table, etc).
>>> "
>>>
>>> So you should not need any cache maintenance here. Can you provide
>>> more details on the memory attribute you use for memory shared in both
>>> the backend and frontend?
>>>
>> It takes quite some time to collect this (because many components are
>> involved in the
>> use-case), but for now the pages in the guest domain are:
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> So that's the attribute for the page mapped in the frontend, right? 
> How about the backend side?
Please see below
>
> Also, could that page be handed to the graphic card correctly?
Yes, if we use zero-copying. But please see below
> If so, is your graphic card coherent?
Yes, it is
>
> If one of your components is mapping with non-cacheable attributes 
> then you are already not following the Xen Arm ABI. In that case, we 
> would need to discuss how to extend the ABI.
>
> Cheers,
>
Well, I didn't get the attributes of the pages at the backend side, but IMO 
those do not matter in my use-case (for simplicity I am not using zero-copying 
at the backend side):

1. Frontend device allocates display buffer pages which come from shmem
and have these attributes:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN + 
PTE_ATTRINDX(MT_NORMAL)

2. Frontend grants references to these pages and shares those with the 
backend

3. Backend is a user-space application (Weston client), so it uses 
gntdev kernel
driver to mmap the pages. The pages, which are used by gntdev, are those 
coming
from the Xen balloon driver and I believe they are all normal memory and
shouldn't be non-cached.

4. Once the frontend starts displaying it flips the buffers and backend 
does *memcpy*
from the frontend-backend shared buffer into Weston's buffer. This means
no HW at the backend side touches the shared buffer.

5. I can see distorted picture.

Previously I used a setup with zero-copying, so the picture becomes 
more complicated in terms of buffers and how those are used by the 
backend, but anyway it seems that even the very basic scenario with 
memory copying doesn't work for me.

Using the DMA API on the frontend's side does help - no artifacts are seen.
This is why I'm thinking that this is related to the frontend/kernel side 
rather than to the backend side. This is why I'm thinking this is related 
to caches, and I am trying to figure out what can be done here instead of 
using the DMA API.
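
For reference, step 3 above (the backend mapping the granted pages through
gntdev) typically looks something like the following from user space. This is
a minimal sketch built on the gntdev UAPI (single grant, error handling
omitted, domid/gref are placeholders); it is not the actual Weston/backend
code:

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <xen/gntdev.h>

/* Map a single granted page from domain 'domid' into this process. */
void *map_one_grant(uint32_t domid, uint32_t gref)
{
	struct ioctl_gntdev_map_grant_ref map;
	int fd = open("/dev/xen/gntdev", O_RDWR);

	memset(&map, 0, sizeof(map));
	map.count = 1;
	map.refs[0].domid = domid;
	map.refs[0].ref = gref;
	ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, &map);

	/* gntdev returns an offset to be used with mmap(). */
	return mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
		    fd, map.index);
}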

Thank you,
Oleksandr

Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-21 Thread Julien Grall

Hello,

On 21/01/2019 12:43, Oleksandr Andrushchenko wrote:

On 1/18/19 1:43 PM, Julien Grall wrote:

On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:

On 1/17/19 11:18 AM, Christoph Hellwig wrote:

On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko
wrote:

This whole issue keeps getting more and more confusing.

Well, I don't really do DMA here, but instead the buffers in
question are shared with other Xen domain, so effectively it
could be thought of some sort of DMA here, where the "device" is
that remote domain. If the buffers are not flushed then the
remote part sees some inconsistency which in my case results
in artifacts on screen while displaying the buffers.
When buffers are allocated via DMA API then there are no artifacts;
if buffers are allocated with shmem + DMA mapping then there are no
artifacts as well.
The only offending use-case is when I use shmem backed buffers,
but do not flush them

The right answer would be to implement cache maintainance hooks for
this case in the Xen arch code.  These would basically look the same
as the low-level cache maintainance used by the DMA ops, but without
going through the DMA mapping layer, in fact they should probably
reuse the same low-level assembly routines.

I don't think this is the first usage of such Xen buffer sharing, so
what do the other users do?

I'll have to get even deeper into it. Initially I
looked at the code, but didn't find anything useful.
Or maybe I have just overlooked obvious things there

 From Xen on Arm ABI:

"All memory which is shared with other entities in the system
(including the hypervisor and other guests) must reside in memory
which is mapped as Normal Inner Write-Back Outer Write-Back
Inner-Shareable.
This applies to:
   - hypercall arguments passed via a pointer to guest memory.
   - memory shared via the grant table mechanism (including PV I/O
     rings etc).
   - memory shared with the hypervisor (struct shared_info, struct
     vcpu_info, the grant table, etc).
"

So you should not need any cache maintenance here. Can you provide
more details on the memory attribute you use for memory shared in both
the backend and frontend?


It takes quite some time to collect this (because many components are
involved in the
use-case), but for now the pages in the guest domain are:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
PTE_ATTRINDX(MT_NORMAL)


So that's the attribute for the page mapped in the frontend, right? How about 
the backend side?


Also, could that page be handed to the graphics card correctly? If so, is your 
graphics card coherent?


If one of your components is mapping with non-cacheable attributes then you are 
already not following the Xen Arm ABI. In that case, we would need to discuss 
how to extend the ABI.


Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-21 Thread Oleksandr Andrushchenko
On 1/18/19 1:43 PM, Julien Grall wrote:
> (+ Stefano)
>
> Hi,
>
> Sorry for jumping late in the conversation.
>
> On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:
>> On 1/17/19 11:18 AM, Christoph Hellwig wrote:
>>> On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko 
>>> wrote:
> This whole issue keeps getting more and more confusing.
 Well, I don't really do DMA here, but instead the buffers in
 question are shared with other Xen domain, so effectively it
 could be thought of some sort of DMA here, where the "device" is
 that remote domain. If the buffers are not flushed then the
 remote part sees some inconsistency which in my case results
 in artifacts on screen while displaying the buffers.
 When buffers are allocated via DMA API then there are no artifacts;
 if buffers are allocated with shmem + DMA mapping then there are no
 artifacts as well.
 The only offending use-case is when I use shmem backed buffers,
 but do not flush them
>>> The right answer would be to implement cache maintainance hooks for
>>> this case in the Xen arch code.  These would basically look the same
>>> as the low-level cache maintainance used by the DMA ops, but without
>>> going through the DMA mapping layer, in fact they should probably
>>> reuse the same low-level assembly routines.
>>>
>>> I don't think this is the first usage of such Xen buffer sharing, so
>>> what do the other users do?
>> I'll have to get even deeper into it. Initially I
>> looked at the code, but didn't find anything useful.
>> Or maybe I have just overlooked obvious things there
> From Xen on Arm ABI:
>
> "All memory which is shared with other entities in the system
> (including the hypervisor and other guests) must reside in memory
> which is mapped as Normal Inner Write-Back Outer Write-Back 
> Inner-Shareable.
> This applies to:
>   - hypercall arguments passed via a pointer to guest memory.
>   - memory shared via the grant table mechanism (including PV I/O
>     rings etc).
>   - memory shared with the hypervisor (struct shared_info, struct
>     vcpu_info, the grant table, etc).
> "
>
> So you should not need any cache maintenance here. Can you provide 
> more details on the memory attribute you use for memory shared in both 
> the backend and frontend?
>
It takes quite some time to collect this (because many components are 
involved in the
use-case), but for now the pages in the guest domain are:
!PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN + 
PTE_ATTRINDX(MT_NORMAL)

> Cheers,
>
>>
>> Thank you,
>> Oleksandr
>> ___
>> Xen-devel mailing list
>> xen-de...@lists.xenproject.org
>> https://lists.xenproject.org/mailman/listinfo/xen-devel
>>
>


Re: [Xen-devel] [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-18 Thread Julien Grall

(+ Stefano)

Hi,

Sorry for jumping into the conversation late.

On 18/01/2019 09:40, Oleksandr Andrushchenko wrote:

On 1/17/19 11:18 AM, Christoph Hellwig wrote:

On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko wrote:

This whole issue keeps getting more and more confusing.

Well, I don't really do DMA here, but instead the buffers in
question are shared with other Xen domain, so effectively it
could be thought of some sort of DMA here, where the "device" is
that remote domain. If the buffers are not flushed then the
remote part sees some inconsistency which in my case results
in artifacts on screen while displaying the buffers.
When buffers are allocated via DMA API then there are no artifacts;
if buffers are allocated with shmem + DMA mapping then there are no
artifacts as well.
The only offending use-case is when I use shmem backed buffers,
but do not flush them

The right answer would be to implement cache maintainance hooks for
this case in the Xen arch code.  These would basically look the same
as the low-level cache maintainance used by the DMA ops, but without
going through the DMA mapping layer, in fact they should probably
reuse the same low-level assembly routines.

I don't think this is the first usage of such Xen buffer sharing, so
what do the other users do?

I'll have to get even deeper into it. Initially I
looked at the code, but didn't find anything useful.
Or maybe I have just overlooked obvious things there

From Xen on Arm ABI:

"All memory which is shared with other entities in the system
(including the hypervisor and other guests) must reside in memory
which is mapped as Normal Inner Write-Back Outer Write-Back Inner-Shareable.
This applies to:
  - hypercall arguments passed via a pointer to guest memory.
  - memory shared via the grant table mechanism (including PV I/O
rings etc).
  - memory shared with the hypervisor (struct shared_info, struct
vcpu_info, the grant table, etc).
"

So you should not need any cache maintenance here. Can you provide more details 
on the memory attributes you use for the memory shared in both the backend and the frontend?


Cheers,



Thank you,
Oleksandr
___
Xen-devel mailing list
xen-de...@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel



--
Julien Grall
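
For context, the default arm64 page protections already satisfy this rule.
Roughly, as defined in arch/arm64/include/asm/pgtable-prot.h around v5.0
(reproduced from memory, so treat the exact form as approximate), PAGE_KERNEL
and the protections derived from it use MT_NORMAL plus the shareable bit:

/* arch/arm64/include/asm/pgtable-prot.h (approximate, v5.0 era) */
#define _PROT_DEFAULT	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
#define PROT_DEFAULT	(_PROT_DEFAULT | PTE_MAYBE_NG)
#define PROT_NORMAL	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | \
			 PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
#define PAGE_KERNEL	__pgprot(PROT_NORMAL)

MT_NORMAL is configured in MAIR_EL1 as Normal Write-Back, so mappings using
these protections match the "Normal Inner Write-Back Outer Write-Back
Inner-Shareable" requirement above; helpers such as pgprot_writecombine()
are what override the attribute index.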


Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-18 Thread Oleksandr Andrushchenko
On 1/17/19 11:18 AM, Christoph Hellwig wrote:
> On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko wrote:
>>> This whole issue keeps getting more and more confusing.
>> Well, I don't really do DMA here, but instead the buffers in
>> question are shared with other Xen domain, so effectively it
>> could be thought of some sort of DMA here, where the "device" is
>> that remote domain. If the buffers are not flushed then the
>> remote part sees some inconsistency which in my case results
>> in artifacts on screen while displaying the buffers.
>> When buffers are allocated via DMA API then there are no artifacts;
>> if buffers are allocated with shmem + DMA mapping then there are no
>> artifacts as well.
>> The only offending use-case is when I use shmem backed buffers,
>> but do not flush them
> The right answer would be to implement cache maintainance hooks for
> this case in the Xen arch code.  These would basically look the same
> as the low-level cache maintainance used by the DMA ops, but without
> going through the DMA mapping layer, in fact they should probably
> reuse the same low-level assembly routines.
>
> I don't think this is the first usage of such Xen buffer sharing, so
> what do the other users do?
I'll have to dig even deeper into it. Initially I
looked at the code, but didn't find anything useful.
Or maybe I have just overlooked something obvious there.

Thank you,
Oleksandr

Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-17 Thread Christoph Hellwig
On Wed, Jan 16, 2019 at 06:43:29AM +, Oleksandr Andrushchenko wrote:
> > This whole issue keeps getting more and more confusing.
> Well, I don't really do DMA here, but instead the buffers in
> question are shared with other Xen domain, so effectively it
> could be thought of some sort of DMA here, where the "device" is
> that remote domain. If the buffers are not flushed then the
> remote part sees some inconsistency which in my case results
> in artifacts on screen while displaying the buffers.
> When buffers are allocated via DMA API then there are no artifacts;
> if buffers are allocated with shmem + DMA mapping then there are no
> artifacts as well.
> The only offending use-case is when I use shmem backed buffers,
> but do not flush them

The right answer would be to implement cache maintenance hooks for
this case in the Xen arch code.  These would basically look the same
as the low-level cache maintenance used by the DMA ops, but without
going through the DMA mapping layer; in fact, they should probably
reuse the same low-level assembly routines.

I don't think this is the first usage of such Xen buffer sharing, so
what do the other users do?
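
A rough sketch of what such an arch-level hook could look like on arm64
follows (the function name and where it would be called from are hypothetical,
not an existing Xen interface; __flush_dcache_area is the existing arm64
clean-and-invalidate helper that the DMA ops also end up using):

#include <linux/mm.h>
#include <asm/cacheflush.h>

/*
 * Hypothetical helper, not an existing Xen API: clean and invalidate the
 * data cache for pages shared with another domain, without going through
 * the DMA mapping layer.
 */
static void xen_flush_shared_pages(struct page **pages, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		__flush_dcache_area(page_address(pages[i]), PAGE_SIZE);
}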


Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-15 Thread Oleksandr Andrushchenko
On 1/16/19 8:36 AM, Christoph Hellwig wrote:
> On Wed, Jan 16, 2019 at 07:30:02AM +0100, Gerd Hoffmann wrote:
>>Hi,
>>
>>> +   if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
>>> +   DMA_BIDIRECTIONAL)) {
>>> +   ret = -EFAULT;
>>> +   goto fail_free_sgt;
>>> +   }
>> Hmm, so it seems the arm guys could not come up with a suggestion how to
>> solve that one in a better way.  Ok, lets go with this then.
>>
>> But didn't we agree that this deserves a comment exmplaining the purpose
>> of the dma_map_sg() call is to flush caches and that there is no actual
>> DMA happening here?
> Using a dma mapping call to flush caches is complete no-go.  But the
> real question is why you'd even want to flush cashes if you do not
> want a dma mapping?
>
> This whole issue keeps getting more and more confusing.
Well, I don't really do DMA here; instead, the buffers in
question are shared with another Xen domain, so effectively it
could be thought of as some sort of DMA, where the "device" is
that remote domain. If the buffers are not flushed then the
remote part sees some inconsistency, which in my case results
in artifacts on screen while displaying the buffers.
When buffers are allocated via the DMA API then there are no artifacts;
if buffers are allocated with shmem + DMA mapping then there are no
artifacts either.
The only offending use-case is when I use shmem-backed buffers
but do not flush them.

Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-15 Thread Oleksandr Andrushchenko
On 1/16/19 8:30 AM, Gerd Hoffmann wrote:
>Hi,
>
>> +if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
>> +DMA_BIDIRECTIONAL)) {
>> +ret = -EFAULT;
>> +goto fail_free_sgt;
>> +}
> Hmm, so it seems the arm guys could not come up with a suggestion how to
> solve that one in a better way.  Ok, lets go with this then.
>
> But didn't we agree that this deserves a comment exmplaining the purpose
> of the dma_map_sg() call is to flush caches and that there is no actual
> DMA happening here?
My bad, will add the comment
> cheers,
>Gerd
>
Thank you,
Oleksandr

Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-15 Thread Christoph Hellwig
On Wed, Jan 16, 2019 at 07:30:02AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > +   if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
> > +   DMA_BIDIRECTIONAL)) {
> > +   ret = -EFAULT;
> > +   goto fail_free_sgt;
> > +   }
> 
> Hmm, so it seems the arm guys could not come up with a suggestion how to
> solve that one in a better way.  Ok, lets go with this then.
> 
> But didn't we agree that this deserves a comment exmplaining the purpose
> of the dma_map_sg() call is to flush caches and that there is no actual
> DMA happening here?

Using a dma mapping call to flush caches is a complete no-go.  But the
real question is why you'd even want to flush caches if you do not
want a dma mapping?

This whole issue keeps getting more and more confusing.


Re: [PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-15 Thread Gerd Hoffmann
  Hi,

> + if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
> + DMA_BIDIRECTIONAL)) {
> + ret = -EFAULT;
> + goto fail_free_sgt;
> + }

Hmm, so it seems the arm guys could not come up with a suggestion on how to
solve that one in a better way.  Ok, let's go with this then.

But didn't we agree that this deserves a comment explaining that the purpose
of the dma_map_sg() call is to flush caches and that there is no actual
DMA happening here?

cheers,
  Gerd
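
The comment being asked for could look something like this above the mapping
call (a sketch of possible wording, not the final patch text):

	/*
	 * No actual DMA happens here: dma_map_sg() is used only to clean the
	 * CPU caches for the shmem-backed pages, so that the other Xen domain
	 * sharing this display buffer always sees consistent contents.
	 */
	if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
			DMA_BIDIRECTIONAL)) {
		ret = -EFAULT;
		goto fail_free_sgt;
	}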



[PATCH v2] drm/xen-front: Make shmem backed display buffer coherent

2019-01-15 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

When GEM backing storage is allocated with drm_gem_get_pages
the backing pages may be cached, thus making it possible that
the backend sees only partial content of the buffer which may
lead to screen artifacts. Make sure that the frontend's
memory is coherent and the backend always sees correct display
buffer content.

Fixes: c575b7eeb89f ("drm/xen-front: Add support for Xen PV display frontend")

Signed-off-by: Oleksandr Andrushchenko 

---
Changes since v1:
 - Remove GFP_USER|__GFP_DMA32 mapping flags (Gerd)
 - Use drm_prime_pages_to_sg directly (Noralf)

 drivers/gpu/drm/xen/xen_drm_front_gem.c | 38 ++---
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 28bc501af450..0b0d9b4f97dc 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -32,8 +32,11 @@ struct xen_gem_object {
/* set for buffers allocated by the backend */
bool be_alloc;
 
-   /* this is for imported PRIME buffer */
-   struct sg_table *sgt_imported;
+   /*
+* this is for imported PRIME buffer or the one allocated via
+* drm_gem_get_pages.
+*/
+   struct sg_table *sgt;
 };
 
 static inline struct xen_gem_object *
@@ -124,8 +127,28 @@ static struct xen_gem_object *gem_create(struct drm_device *dev, size_t size)
goto fail;
}
 
+   xen_obj->sgt = drm_prime_pages_to_sg(xen_obj->pages,
+xen_obj->num_pages);
+   if (IS_ERR_OR_NULL(xen_obj->sgt)) {
+   ret = PTR_ERR(xen_obj->sgt);
+   xen_obj->sgt = NULL;
+   goto fail_put_pages;
+   }
+
+   if (!dma_map_sg(dev->dev, xen_obj->sgt->sgl, xen_obj->sgt->nents,
+   DMA_BIDIRECTIONAL)) {
+   ret = -EFAULT;
+   goto fail_free_sgt;
+   }
+
return xen_obj;
 
+fail_free_sgt:
+   sg_free_table(xen_obj->sgt);
+   xen_obj->sgt = NULL;
+fail_put_pages:
+   drm_gem_put_pages(&xen_obj->base, xen_obj->pages, true, false);
+   xen_obj->pages = NULL;
 fail:
DRM_ERROR("Failed to allocate buffer with size %zu\n", size);
return ERR_PTR(ret);
@@ -148,7 +171,7 @@ void xen_drm_front_gem_free_object_unlocked(struct drm_gem_object *gem_obj)
struct xen_gem_object *xen_obj = to_xen_gem_obj(gem_obj);
 
if (xen_obj->base.import_attach) {
-   drm_prime_gem_destroy(&xen_obj->base, xen_obj->sgt_imported);
+   drm_prime_gem_destroy(&xen_obj->base, xen_obj->sgt);
gem_free_pages_array(xen_obj);
} else {
if (xen_obj->pages) {
@@ -157,6 +180,13 @@ void xen_drm_front_gem_free_object_unlocked(struct drm_gem_object *gem_obj)
xen_obj->pages);
gem_free_pages_array(xen_obj);
} else {
+   if (xen_obj->sgt) {
+   dma_unmap_sg(xen_obj->base.dev->dev,
+xen_obj->sgt->sgl,
+xen_obj->sgt->nents,
+DMA_BIDIRECTIONAL);
+   sg_free_table(xen_obj->sgt);
+   }
drm_gem_put_pages(&xen_obj->base,
  xen_obj->pages, true, false);
}
@@ -202,7 +232,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev,
if (ret < 0)
return ERR_PTR(ret);
 
-   xen_obj->sgt_imported = sgt;
+   xen_obj->sgt = sgt;
 
ret = drm_prime_sg_to_page_addr_arrays(sgt, xen_obj->pages,
   NULL, xen_obj->num_pages);
-- 
2.20.1