Re: [PATCH v9 2/3] drm/amdgpu: Enable clear page functionality

2024-04-03 Thread Michel Dänzer
On 2024-04-02 10:17, Christian König wrote:
> Am 26.03.24 um 15:53 schrieb Alex Deucher:
>> On Tue, Mar 26, 2024 at 10:01 AM Alex Deucher  wrote:
>>> On Tue, Mar 26, 2024 at 9:59 AM Paneer Selvam, Arunpravin
>>>>>> @@ -501,6 +502,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
>>>>>>   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
>>>>>>   vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;
>>>>>>
>>>>>> +   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED)
>>>>>> +   vres->flags |= DRM_BUDDY_CLEAR_ALLOCATION;
>>>>> Is there any reason to not always do this?
>>>> Here we are trying to keep a pool of cleared memory, which is cleared on
>>>> the free path and can be given to applications that require zeroed memory.
>>>> If we always clear the memory, we could exceed the limit of cleared memory
>>>> in that particular instance, and the application would have to wait until
>>>> the hardware clears the memory, which might impact overall performance.
>>> I'd like to have the driver always deliver cleared memory.
>> Actually, I think we can just always set the flag in the allocation IOCTLs.
> 
> We have use cases where that hurts us. Especially during boot, when the 
> backing VRAM isn't cleared yet.
> 
> That's one of the reasons why we never always cleared the memory.

Any such performance gain was only valid in the first place if the kernel 
delivering non-cleared memory to user space was considered acceptable, which it 
quite clearly isn't.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 1/3] kci-gitlab: Introducing GitLab-CI Pipeline for Kernel Testing

2024-03-05 Thread Michel Dänzer
On 2024-02-29 21:21, Linus Torvalds wrote:
> On Thu, 29 Feb 2024 at 01:23, Nikolai Kondrashov  wrote:
>>
>> However, I think a better approach would be *not* to add the .gitlab-ci.yaml
>> file in the root of the source tree, but instead change the very same repo
>> setting to point to a particular entry YAML, *inside* the repo (somewhere
>> under "ci" directory) instead.
> 
> I really don't want some kind of top-level CI for the base kernel project.
> 
> We already have the situation that the drm people have their own ci
> model. I'm ok with that, partly because then at least the maintainers
> of that subsystem can agree on the rules for that one subsystem.
> 
> I'm not at all interested in having something that people will then
> either fight about, or - more likely - ignore, at the top level
> because there isn't some global agreement about what the rules are.
> 
> For example, even just running checkpatch is often a stylistic thing,
> and not everybody agrees about all the checkpatch warnings.
> 
> I would suggest the CI project be separate from the kernel.

That would be missing a lot of the point / benefit of CI.

A CI system which is separate from the kernel will tend to be out of sync, so 
it can't gate the merging of changes and thus can't prevent regressions from 
propagating.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-26 Thread Michel Dänzer
On 2024-02-26 17:53, Christian König wrote:
> Am 26.02.24 um 17:50 schrieb Michel Dänzer:
>> On 2024-02-26 17:46, Michel Dänzer wrote:
>>> On 2024-02-26 17:34, Christian König wrote:
>>>
>>>> My question is why has it worked so far? I mean we are not doing this 
>>>> since yesterday and the problem only shows up now?
>>> Yes, Wayland compositors are only starting to try and make use of this now.
>> To expand on this, mutter will want to do something like this as well sooner 
>> or later. I suspect it's the same for others like kwin, sway etc.
> 
> Yeah, but we have done similar things with X decades before. E.g. basically 
> the client sends a BO to the server for displaying it.

The scenario in https://gitlab.freedesktop.org/mesa/mesa/-/issues/10635 is 
direct scanout of a client buffer on a secondary dGPU, while the compositor 
uses the APU as the primary compositing GPU.

AFAIR Xorg has never supported direct scanout of client buffers in this 
scenario. Secondary GPU scanout is always done from a separate local buffer 
allocated by Xorg / the driver.

This is Wayland compositors pushing the envelope.


> Why do we suddenly have to juggle with the fact that it is DMA-buf shared with 
> another device?

The problematic case is if the Wayland compositor has to produce a composited 
frame (e.g. due to a notification or other window popping up over the 
fullscreen window), but the client hasn't attached a new buffer to the 
fullscreen surface, so the compositor has to use the contents of the same 
client buffer which is still being scanned out by the dGPU for compositing with 
the APU.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-26 Thread Michel Dänzer
On 2024-02-26 17:46, Michel Dänzer wrote:
> On 2024-02-26 17:34, Christian König wrote:
> 
>> My question is why has it worked so far? I mean we are not doing this since 
>> yesterday and the problem only shows up now?
> 
> Yes, Wayland compositors are only starting to try and make use of this now.

To expand on this, mutter will want to do something like this as well sooner or 
later. I suspect it's the same for others like kwin, sway etc.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-26 Thread Michel Dänzer
On 2024-02-26 17:34, Christian König wrote:
> Am 26.02.24 um 17:27 schrieb Michel Dänzer:
>> On 2024-02-26 16:58, Christian König wrote:
>>> Am 23.02.24 um 17:43 schrieb Michel Dänzer:
>>>> On 2024-02-23 11:04, Michel Dänzer wrote:
>>>>> On 2024-02-23 10:34, Christian König wrote:
>>>>>> Am 23.02.24 um 09:11 schrieb Michel Dänzer:
>>>>>>> On 2024-02-23 08:06, Christian König wrote:
>>>>>>>> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>>>>>>>>> From: Michel Dänzer 
>>>>>>>>>
>>>>>>>>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>>>>>>>>> to non-P2P dma-buf importers.
>>>>>>>> Thinking more about it I don't think we can do this.
>>>>>>>>
>>>>>>>> Using the BO in a ping/pong fashion for scanout and DMA-buf is 
>>>>>>>> actually valid, you just can't do both at the same time.
>>>>>>>>
>>>>>>>> And if I'm not completely mistaken we actually have use cases for this 
>>>>>>>> at the moment,
>>>>>>> Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?
>>>>>> Nope, we are basically talking about unit tests and examples for inter 
>>>>>> device operations.
>>>>> Sounds like the "no user-space regressions" rule might not apply then.
>>>> To clarify what I mean by that:
>>>>
>>>> "We can't fix this issue, because it would break some unit tests and 
>>>> examples" is similar to saying "We can't fix this KMS bug, because there 
>>>> are IGT tests expecting the buggy behaviour". In practice, the latter do 
>>>> get fixed, along with the IGT tests.
>>> The problem here is that this is not a bug, but intentional behavior. 
>>> Exporting BOs and using them in scanout in a ping/pong fashion is perfectly 
>>> valid.
>> The problem with the status quo is that it requires amdgpu-specific 
>> knowledge and worst-case pessimization in user space.
> 
> Yeah, I'm perfectly aware of that. But that approach here really doesn't 
> seem to be valid.

It's the only known sane approach at this point. There's no other proposal 
which allows using both BO sharing and scanout without pessimizing or hitting 
seemingly random CS ioctl failures.


>>> We have use cases which depend on this behavior and I'm not going to break 
>>> those to fix your use case.
>> It's not "my" use case, it's a Wayland compositor trying to make use of BO 
>> sharing and scanout without always pessimizing for the worst case.
>>
>> That's surely more of a real-world use case than unit tests and examples.
> 
> I've looked a bit deeper into this and we have told customers for the last 
> ~10 years or so that this is how it is supposed to work and that they can use 
> this approach.
> 
> So this is much more than unit tests and examples, we are changing existing 
> behavior which is clearly not a bug but intentional and have a very high 
> chance to break valid use cases.

"Very high chance" sounds like you still don't know of any actual real-world 
use case relying on it though?


> My question is why has it worked so far? I mean we are not doing this since 
> yesterday and the problem only shows up now?

Yes, Wayland compositors are only starting to try and make use of this now.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-26 Thread Michel Dänzer
On 2024-02-26 16:58, Christian König wrote:
> Am 23.02.24 um 17:43 schrieb Michel Dänzer:
>> On 2024-02-23 11:04, Michel Dänzer wrote:
>>> On 2024-02-23 10:34, Christian König wrote:
>>>> Am 23.02.24 um 09:11 schrieb Michel Dänzer:
>>>>> On 2024-02-23 08:06, Christian König wrote:
>>>>>> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>>>>>>> From: Michel Dänzer 
>>>>>>>
>>>>>>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>>>>>>> to non-P2P dma-buf importers.
>>>>>> Thinking more about it I don't think we can do this.
>>>>>>
>>>>>> Using the BO in a ping/pong fashion for scanout and DMA-buf is actually 
>>>>>> valid, you just can't do both at the same time.
>>>>>>
>>>>>> And if I'm not completely mistaken we actually have use cases for this 
>>>>>> at the moment,
>>>>> Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?
>>>> Nope, we are basically talking about unit tests and examples for inter 
>>>> device operations.
>>> Sounds like the "no user-space regressions" rule might not apply then.
>> To clarify what I mean by that:
>>
>> "We can't fix this issue, because it would break some unit tests and 
>> examples" is similar to saying "We can't fix this KMS bug, because there are 
>> IGT tests expecting the buggy behaviour". In practice, the latter do get 
>> fixed, along with the IGT tests.
> 
> The problem here is that this is not a bug, but intentional behavior. 
> Exporting BOs and using them in scanout in a ping/pong fashion is perfectly 
> valid.

The problem with the status quo is that it requires amdgpu-specific knowledge 
and worst-case pessimization in user space.


> We have use cases which depend on this behavior and I'm not going to break 
> those to fix your use case.

It's not "my" use case, it's a Wayland compositor trying to make use of BO 
sharing and scanout without always pessimizing for the worst case.

That's surely more of a real-world use case than unit tests and examples.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-23 Thread Michel Dänzer
On 2024-02-23 11:04, Michel Dänzer wrote:
> On 2024-02-23 10:34, Christian König wrote:
>> Am 23.02.24 um 09:11 schrieb Michel Dänzer:
>>> On 2024-02-23 08:06, Christian König wrote:
>>>> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>>>>> From: Michel Dänzer 
>>>>>
>>>>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>>>>> to non-P2P dma-buf importers.
>>>> Thinking more about it I don't think we can do this.
>>>>
>>>> Using the BO in a ping/pong fashion for scanout and DMA-buf is actually 
>>>> valid, you just can't do both at the same time.
>>>>
>>>> And if I'm not completely mistaken we actually have use cases for this at 
>>>> the moment,
>>> Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?
>>
>> Nope, we are basically talking about unit tests and examples for inter 
>> device operations.
> 
> Sounds like the "no user-space regressions" rule might not apply then.

To clarify what I mean by that:

"We can't fix this issue, because it would break some unit tests and examples" 
is similar to saying "We can't fix this KMS bug, because there are IGT tests 
expecting the buggy behaviour". In practice, the latter do get fixed, along 
with the IGT tests.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-23 Thread Michel Dänzer
On 2024-02-23 10:34, Christian König wrote:
> Am 23.02.24 um 09:11 schrieb Michel Dänzer:
>> On 2024-02-23 08:06, Christian König wrote:
>>> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>>>> From: Michel Dänzer 
>>>>
>>>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>>>> to non-P2P dma-buf importers.
>>> Thinking more about it I don't think we can do this.
>>>
>>> Using the BO in a ping/pong fashion for scanout and DMA-buf is actually 
>>> valid, you just can't do both at the same time.
>>>
>>> And if I'm not completely mistaken we actually have use cases for this at 
>>> the moment,
>> Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?
> 
> Nope, we are basically talking about unit tests and examples for inter device 
> operations.

Sounds like the "no user-space regressions" rule might not apply then.


> Those render into a shared buffer and then display it to check if the content 
> was rendered/transferred correctly.

That can be fixed by dropping the dma-buf attachments from other GPUs before 
creating the KMS FB.

Conversely, tests / examples which do scanout first can be fixed by destroying 
KMS FBs before sharing the BO with another GPU.


> I'm not sure if we still do those test cases, the last time I looked into it 
> was before P2P was even supported, but I also can't rule it out.

Sounds too vague to block this series.


>>> So rejecting things during CS and atomic commit is the best thing we can do.
>> It's problematic for a Wayland compositor:
>>
>> The CS ioctl failing is awkward. With GL, I'm pretty sure it means the 
>> compositor would have to destroy the context and create a new one. Not sure 
>> about Vulkan, but I suspect it's painful there as well.
>>
>> Similarly for the KMS atomic commit ioctl. The compositor can't know why 
>> exactly it failed, all it gets is an error code.
> 
> Yeah, but that is not because the kernel is doing anything wrong.
> 
> Sharing, rendering and then doing an atomic commit is a perfectly valid use 
> case.
> 
> You just can't do scanout and sharing at the same time.

Per my later follow-up, Xwayland can't really avoid it.


>> And there's no other way for the compositor to detect when both things can 
>> actually work concurrently.
> 
> That I totally agree with. And IIRC we already have at least a query for the 
> buffer placement. E.g. you can already check if the BO is in GTT or VRAM and 
> shared.
> 
> What's missing is exposing if the device can scanout from GTT or not.

Requiring Wayland compositors to have driver-specific knowledge like that baked 
in isn't great either.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-23 Thread Michel Dänzer
On 2024-02-23 09:11, Michel Dänzer wrote:
> On 2024-02-23 08:06, Christian König wrote:
>>
>> So rejecting things during CS and atomic commit is the best thing we can do.
> 
> It's problematic for a Wayland compositor:
> 
> The CS ioctl failing is awkward. With GL, I'm pretty sure it means the 
> compositor would have to destroy the context and create a new one. Not sure 
> about Vulkan, but I suspect it's painful there as well.
> 
> Similarly for the KMS atomic commit ioctl. The compositor can't know why 
> exactly it failed, all it gets is an error code.
> 
> And there's no other way for the compositor to detect when both things can 
> actually work concurrently.
> 
> Together, this means the compositor always has to assume the worst case, and 
> do workarounds such as using the scanout GPU to copy from the scanout buffer 
> to another buffer for access from another GPU. Even when direct access from 
> the other GPU would actually work fine.

It's worse for Xwayland:

It can't know if/when the Wayland compositor uses a buffer for scanout. It 
would have to assume the worst case whenever a buffer is owned by the 
compositor.

Except Xwayland can't know which GPU a buffer is originally from. So it can't 
know when the workaround is actually needed, or which GPU it has to use for the 
workaround.





Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-23 Thread Michel Dänzer
On 2024-02-23 08:06, Christian König wrote:
> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>> From: Michel Dänzer 
>>
>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>> to non-P2P dma-buf importers.
> 
> Thinking more about it I don't think we can do this.
> 
> Using the BO in a ping/pong fashion for scanout and DMA-buf is actually 
> valid, you just can't do both at the same time.
> 
> And if I'm not completely mistaken we actually have use cases for this at the 
> moment,

Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?

(As discussed on the GitLab issue, AFAICT P2P could be made to work even 
without CONFIG_DMABUF_MOVE_NOTIFY, by pinning to VRAM instead of GTT for 
dma-buf sharing)


> only as fallback but it would still break existing userspace and that is a 
> no-go.

I'm obviously aware of this general rule. There are exceptions though, and this 
might be one.


> So rejecting things during CS and atomic commit is the best thing we can do.

It's problematic for a Wayland compositor:

The CS ioctl failing is awkward. With GL, I'm pretty sure it means the 
compositor would have to destroy the context and create a new one. Not sure 
about Vulkan, but I suspect it's painful there as well.

Similarly for the KMS atomic commit ioctl. The compositor can't know why 
exactly it failed, all it gets is an error code.

And there's no other way for the compositor to detect when both things can 
actually work concurrently.

Together, this means the compositor always has to assume the worst case, and do 
workarounds such as using the scanout GPU to copy from the scanout buffer to 
another buffer for access from another GPU. Even when direct access from the 
other GPU would actually work fine.





[PATCH 3/3] drm/amdgpu: Refuse non-P2P dma-buf attachments for BOs with KMS FBs

2024-02-22 Thread Michel Dänzer
From: Michel Dänzer 

Pinning the BO storage to VRAM for scanout would make it inaccessible
to non-P2P dma-buf importers.

Also keep file_priv->prime.lock locked until after bumping bo->num_fbs
in amdgpu_display_user_framebuffer_create, so that the checks there and
in amdgpu_dma_buf_attach are always consistent with each other.

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 20 
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 11 +++
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 89bda2a2baf58..2afe5558ba895 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1251,6 +1251,7 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
 {
struct amdgpu_framebuffer *amdgpu_fb;
struct drm_gem_object *obj;
+   bool prime_locked = false;
struct amdgpu_bo *bo;
uint32_t domains;
int ret;
@@ -1270,6 +1271,7 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
		bool can_pin = true;
 
		mutex_lock(&file_priv->prime.lock);
+		prime_locked = true;
 
		/* Handle is imported dma-buf, so cannot be migrated to VRAM for scanout */
		if (obj->import_attach) {
@@ -1293,29 +1295,31 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
			dma_resv_unlock(dmabuf->resv);
		}
 
-		mutex_unlock(&file_priv->prime.lock);
-
if (!can_pin) {
-   drm_gem_object_put(obj);
-   return ERR_PTR(-EINVAL);
+   amdgpu_fb = ERR_PTR(-EINVAL);
+   goto out;
}
}
 
amdgpu_fb = kzalloc(sizeof(*amdgpu_fb), GFP_KERNEL);
if (amdgpu_fb == NULL) {
-   drm_gem_object_put(obj);
-   return ERR_PTR(-ENOMEM);
+   amdgpu_fb = ERR_PTR(-ENOMEM);
+   goto out;
}
 
ret = amdgpu_display_gem_fb_verify_and_init(dev, amdgpu_fb, file_priv,
mode_cmd, obj);
if (ret) {
kfree(amdgpu_fb);
-   drm_gem_object_put(obj);
-   return ERR_PTR(ret);
+   amdgpu_fb = ERR_PTR(ret);
+   goto out;
}
 
	atomic_inc(&bo->num_fbs);
+
+out:
+	if (prime_locked)
+		mutex_unlock(&file_priv->prime.lock);
	drm_gem_object_put(obj);
	return &amdgpu_fb->base;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index decbbe3d4f06e..275d34898284d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -63,6 +63,17 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
if (pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0)
attach->peer2peer = false;
 
+	if ((!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) || !attach->peer2peer) &&
+	    atomic_read(&bo->num_fbs) > 0) {
+		uint32_t domains = amdgpu_display_supported_domains(adev, bo->flags);
+
+		if (!(domains & AMDGPU_GEM_DOMAIN_GTT)) {
+			drm_dbg_prime(adev_to_drm(adev),
+				      "Cannot attach to BO with KMS FBs without P2P\n");
+			return -EINVAL;
+		}
+	}
+   }
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
trace_amdgpu_runpm_reference_dumps(1, __func__);
if (r < 0)
-- 
2.43.0



[PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

2024-02-22 Thread Michel Dänzer
From: Michel Dänzer 

Pinning the BO storage to VRAM for scanout would make it inaccessible
to non-P2P dma-buf importers.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10635
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 38 ++---
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index b8fbe97efe1d3..514a5b2159815 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1255,13 +1255,41 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
return ERR_PTR(-ENOENT);
}
 
-	/* Handle is imported dma-buf, so cannot be migrated to VRAM for scanout */
	bo = gem_to_amdgpu_bo(obj);
	domains = amdgpu_display_supported_domains(drm_to_adev(dev), bo->flags);
-	if (obj->import_attach && !(domains & AMDGPU_GEM_DOMAIN_GTT)) {
-		drm_dbg_kms(dev, "Cannot create framebuffer from imported dma_buf\n");
-		drm_gem_object_put(obj);
-		return ERR_PTR(-EINVAL);
+	if (!(domains & AMDGPU_GEM_DOMAIN_GTT)) {
+   bool can_pin = true;
+
+		mutex_lock(&file_priv->prime.lock);
+
+		/* Handle is imported dma-buf, so cannot be migrated to VRAM for scanout */
+		if (obj->import_attach) {
+			drm_dbg_kms(dev, "Cannot create framebuffer from imported dma_buf\n");
+   can_pin = false;
+   } else if (obj->dma_buf) {
+   struct dma_buf *dmabuf = obj->dma_buf;
+   struct dma_buf_attachment *attachment;
+
+   dma_resv_lock(dmabuf->resv, NULL);
+
+			list_for_each_entry(attachment, &dmabuf->attachments, node) {
+				if (IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && attachment->peer2peer)
+					continue;
+
+				drm_dbg_kms(dev, "Cannot create framebuffer from non-P2P exported dma_buf\n");
+   can_pin = false;
+   break;
+   }
+
+   dma_resv_unlock(dmabuf->resv);
+   }
+
+		mutex_unlock(&file_priv->prime.lock);
+
+   if (!can_pin) {
+   drm_gem_object_put(obj);
+   return ERR_PTR(-EINVAL);
+   }
}
 
amdgpu_fb = kzalloc(sizeof(*amdgpu_fb), GFP_KERNEL);
-- 
2.43.0



[PATCH 2/3] drm/amdgpu: Keep track of the number of KMS FBs using an amdgpu_bo

2024-02-22 Thread Michel Dänzer
From: Michel Dänzer 

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 14 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |  3 +++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 514a5b2159815..89bda2a2baf58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -531,6 +531,15 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector,
return true;
 }
 
+static void
+amdgpu_fb_destroy(struct drm_framebuffer *fb)
+{
+   struct amdgpu_bo *bo = gem_to_amdgpu_bo(fb->obj[0]);
+
+	atomic_dec(&bo->num_fbs);
+   drm_gem_fb_destroy(fb);
+}
+
 static int amdgpu_dirtyfb(struct drm_framebuffer *fb, struct drm_file *file,
  unsigned int flags, unsigned int color,
  struct drm_clip_rect *clips, unsigned int num_clips)
@@ -544,12 +553,12 @@ static int amdgpu_dirtyfb(struct drm_framebuffer *fb, struct drm_file *file,
 }
 
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
-   .destroy = drm_gem_fb_destroy,
+   .destroy = amdgpu_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
 };
 
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs_atomic = {
-   .destroy = drm_gem_fb_destroy,
+   .destroy = amdgpu_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
.dirty = amdgpu_dirtyfb
 };
@@ -1306,6 +1315,7 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
return ERR_PTR(ret);
}
 
+	atomic_inc(&bo->num_fbs);
	drm_gem_object_put(obj);
	return &amdgpu_fb->base;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index a3ea8a82db23a..47579a6eb2ae8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -117,6 +117,9 @@ struct amdgpu_bo {
 * for memory accounting.
 */
int8_t  xcp_id;
+
+   /* Number of KMS FBs using this BO */
+   atomic_tnum_fbs;
 };
 
 struct amdgpu_bo_user {
-- 
2.43.0



Re: [PATCH 0/2] drm/atomic: Allow drivers to write their own plane check for async

2024-01-17 Thread Michel Dänzer
On 2024-01-17 13:57, Xaver Hugl wrote:
> Am Mi., 17. Jan. 2024 um 09:55 Uhr schrieb Pekka Paalanen 
> :
>> Is it important enough to be special-cased, e.g. to be always allowed
>> with async commits?
> 
> I thought so, and sent a patch to dri-devel to make it happen, but
> there are some
> concerns about untested driver paths.
> https://lists.freedesktop.org/archives/dri-devel/2024-January/437511.html
> 
>> Now that I think of it, if userspace needs to wait for the in-fence
>> itself before kicking KMS async, that would defeat much of the async's
>> point, right? And cases where in-fence is not necessary are so rare
>> they might not even exist?
>>
>> So if driver/hardware cannot do IN_FENCE_FD with async, is there any
>> use of supporting async to begin with?
> 
> KWin never commits a buffer where IN_FENCE_FD would actually delay the
> pageflip; it's really only used to disable implicit sync, as there are some edge
> cases where it can wrongly delay the pageflip. The waiting for buffers to 
> become
> readable on the compositor side isn't really significant in terms of latency.
> 
> If hardware doesn't support IN_FENCE_FD with async commits, checking if the
> fence is already signaled at commit time would thus still make things work, at
> least for KWin.

That's how IN_FENCE_FD (and implicit sync) is handled anyway, in common code: 
It waits for all fences to signal before calling into the driver to apply the 
atomic commit.

I can't see why this wouldn't work with async commits, the same as with 
synchronous ones, with any driver.





Re: Rework TTMs busy handling

2024-01-10 Thread Michel Dänzer
On 2024-01-09 09:34, Christian König wrote:
> Am 09.01.24 um 09:14 schrieb Thomas Hellström:
>> On Tue, 2024-01-09 at 08:47 +0100, Christian König wrote:
>>>
>>> I'm trying to make this functionality a bit more useful for years now,
>>> since we got multiple reports that the behavior of drivers can be
>>> suboptimal when multiple placements are given.
>>>
>>> So basically instead of hacking around the TTM behavior in the driver
>>> once more I've gone ahead and changed the idle/busy placement list
>>> into idle/busy placement flags. This not only saves a bunch of code,
>>> but also allows setting some placements as fallback which are used if
>>> allocating from the preferred ones didn't work.
>>
>> I also have some doubts about the naming "idle" vs "busy", since an
>> elaborate eviction mechanism would probably at some point want to check
>> for gpu idle vs gpu busy, and this might create some confusion moving
>> forward for people confusing busy as in memory overcommit with busy as
>> in gpu activity.
>>
>> I can't immediately think of something better, though.
> 
> Yeah, I was wondering about that as well. Especially since I wanted to add 
> some more flags in the future when for example a bandwidth quota how much 
> memory can be moved in/out is exceeded.
> 
> Something like phase1, phase2, phase3 etc..., but that's also not very 
> descriptive either.

Maybe something like "desired" vs "fallback"?





Re: [1/3] drm: property: One function call less in drm_property_create() after error detection

2024-01-04 Thread Michel Dänzer
On 2024-01-03 19:08, Markus Elfring wrote:
> 
>> Without seeing the actual Coccinelle report,
> 
> There is no “official” report according to the discussed patch which is 
> triggered
> by known advices for the application of labels in goto chains.

The commit log says:

 This issue was detected by using the Coccinelle software.

Either that's inaccurate then, or you should be able to provide the 
corresponding output from Coccinelle.





Re: [PATCH 1/3] drm: property: One function call less in drm_property_create() after error detection

2024-01-03 Thread Michel Dänzer
On 2024-01-03 17:24, Markus Elfring wrote:
> 
>> Out of curiosity, what exactly did Coccinelle report?
> 
> Some SmPL scripts from my own selection tend to point questionable 
> implementation details out.

That doesn't answer my question.

Without seeing the actual Coccinelle report, I'll assume that it didn't 
actually call for this change.





Re: [PATCH 1/3] drm: property: One function call less in drm_property_create() after error detection

2024-01-03 Thread Michel Dänzer
On 2023-12-26 10:38, Markus Elfring wrote:
> From: Markus Elfring 
> Date: Tue, 26 Dec 2023 08:44:37 +0100
> 
> The kfree() function was called in one case by the
> drm_property_create() function during error handling
> even if the passed data structure member contained a null pointer.
> This issue was detected by using the Coccinelle software.
> 
> Thus use another label.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/gpu/drm/drm_property.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_property.c b/drivers/gpu/drm/drm_property.c
> index 596272149a35..3440f4560e6e 100644
> --- a/drivers/gpu/drm/drm_property.c
> +++ b/drivers/gpu/drm/drm_property.c
> @@ -117,7 +117,7 @@ struct drm_property *drm_property_create(struct 
> drm_device *dev,
>   property->values = kcalloc(num_values, sizeof(uint64_t),
>  GFP_KERNEL);
>   if (!property->values)
> - goto fail;
> + goto free_property;
>   }
> 
>   ret = drm_mode_object_add(dev, &property->base,
>  DRM_MODE_OBJECT_PROPERTY);
> @@ -135,6 +135,7 @@ struct drm_property *drm_property_create(struct 
> drm_device *dev,
>   return property;
>  fail:
>   kfree(property->values);
> +free_property:
>   kfree(property);
>   return NULL;
>  }
> --
> 2.43.0
> 

This change is pointless at best, kfree(NULL) works fine.


Out of curiosity, what exactly did Coccinelle report?


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 2/2] drm/amdgpu: Enable clear page functionality

2023-12-14 Thread Michel Dänzer
On 2023-12-14 11:31, Christian König wrote:
> On 2023-12-13 16:46, Michel Dänzer wrote:
>> From a security PoV, the kernel should never return uncleared memory to (at 
>> least unprivileged) user space. This series seems like a big step in that 
>> direction.
> 
> Well please take a look at the MAP_UNINITIALIZED flag for mmap().

   MAP_UNINITIALIZED (since Linux 2.6.33)
  Don't  clear  anonymous pages.  This flag is intended to improve
  performance on embedded devices.  This flag is honored  only  if
  the  kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIAL‐
  IZED option.  Because of the security implications, that  option
  is  normally  enabled  only  on  embedded devices (i.e., devices
  where one has complete control of the contents of user memory).


> We even have the functionality to return uninitialized system memory when the 
> kernel compile option for this is set

From mm/Kconfig:

config MMAP_ALLOW_UNINITIALIZED 
bool "Allow mmapped anonymous memory to be uninitialized"
depends on EXPERT && !MMU
default n
help
  Normally, and according to the Linux spec, anonymous memory obtained
  from mmap() has its contents cleared before it is passed to
  userspace.  Enabling this config option allows you to request that
  mmap() skip that if it is given an MAP_UNINITIALIZED flag, thus
  providing a huge performance boost.  If this option is not enabled,
  then the flag will be ignored.
  
  This is taken advantage of by uClibc's malloc(), and also by
  ELF-FDPIC binfmt's brk and stack allocator.
  
  Because of the obvious security issues, this option should only be
  enabled on embedded devices where you control what is run in
  userspace.  Since that isn't generally a problem on no-MMU systems,
  it is normally safe to say Y here.

  See Documentation/admin-guide/mm/nommu-mmap.rst for more information.


Both look consistent with what I wrote.


> since this is an important optimization for many use cases.

Per above, it's available only on platforms without MMU.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 2/2] drm/amdgpu: Enable clear page functionality

2023-12-13 Thread Michel Dänzer
On 2023-12-13 16:39, Felix Kuehling wrote:
> On 2023-12-13 9:20, Christian König wrote:
>> On 2023-12-12 00:32, Felix Kuehling wrote:
>>> On 2023-12-11 04:50, Christian König wrote:
>>>> On 2023-12-08 20:53, Alex Deucher wrote:
>>>>> [SNIP]
>>>>>> You also need a functionality which resets all cleared blocks to
>>>>>> uncleared after suspend/resume.
>>>>>>
>>>>>> No idea how to do this, maybe Alex knows of hand.
>>>>> Since the buffers are cleared on creation, is there actually anything to 
>>>>> do?
>>>>
>>>> Well exactly that's the problem, the buffers are no longer always cleared 
>>>> on creation with this patch.
>>>>
>>>> Instead we clear on free, track which areas are cleared and clear only the 
>>>> ones which aren't cleared yet on creation.
>>>
>>> The code I added for clearing-on-free a long time ago, does not clear to 0, 
>>> but to a non-0 poison value. That was meant to make it easier to catch 
>>> applications incorrectly relying on 0-initialized memory. Is that being 
>>> changed? I didn't see it in this patch series.
>>
>> Yeah, Arun stumbled over that as well. Any objections that we fill with 
>> zeros instead or is that poison value something necessary for debugging?
> 
> I don't think it's strictly necessary. But it may encourage sloppy user mode 
> programming to rely on 0-initialized memory that ends up breaking in corner 
> cases or on older kernels.

From a security PoV, the kernel should never return uncleared memory to (at 
least unprivileged) user space. This series seems like a big step in that 
direction.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Michel Dänzer
On 11/13/23 10:47, Simon Ser wrote:
> On Monday, November 13th, 2023 at 10:41, Michel Dänzer 
>  wrote:
> 
>> On 11/13/23 10:18, Simon Ser wrote:
>>
>>> On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr wrote:
>>>
>>>>>>>>>>>>> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is 
>>>>>>>>>>>>> allowed to
>>>>>>>>>>>>> +effectively change only the FB_ID property on any planes. 
>>>>>>>>>>>>> No-operation changes
>>>>>>>>>>>>> +are ignored as always. [...]
>>>>>>>>>>>>> During the hackfest in Brno, it was mentioned that a commit which 
>>>>>>>>>>>>> re-sets the same FB_ID could actually have an effect with VRR: It 
>>>>>>>>>>>>> could trigger scanout of the next frame before vertical blank has 
>>>>>>>>>>>>> reached its maximum duration. Some kind of mechanism is required 
>>>>>>>>>>>>> for this in order to allow user space to perform low frame rate 
>>>>>>>>>>>>> compensation.
>>>>>>>>>>>
>>>>>>>>>>> Xaver tested this hypothesis in a flipping the same fb in a VRR 
>>>>>>>>>>> monitor
>>>>>>>>>>> and it worked as expected, so this shouldn't be a concern.
>>>>>>>>>>> Right, so it must have some effect. It cannot be simply ignored 
>>>>>>>>>>> like in
>>>>>>>>>>> the proposed doc wording. Do we special-case re-setting the same 
>>>>>>>>>>> FB_ID
>>>>>>>>>>> as "not a no-op" or "not ignored" or some other way?
>>>>>>>>>>> There's an effect in the refresh rate, the image won't change but it
>>>>>>>>>>> will report that a flip had happened asynchronously so the reported
>>>>>>>>>>> framerate will be increased. Maybe an additional wording could be 
>>>>>>>>>>> like:
>>>>>>>>>
>>>>>>>>> Flipping to the same FB_ID will result in a immediate flip as if it 
>>>>>>>>> was
>>>>>>>>> changing to a different one, with no effect on the image but effecting
>>>>>>>>> the reported frame rate.
>>>>>>>>
>>>>>>>> Re-setting FB_ID to its current value is a special case regardless of
>>>>>>>> PAGE_FLIP_ASYNC, is it not?
>>>>>>>
>>>>>>> No. The rule has so far been that all side effects are observed
>>>>>>> even if you flip to the same fb. And that is one of my annoyances
>>>>>>> with this proposal. The rules will now be different for async flips
>>>>>>> vs. everything else.
>>>>>>
>>>>>> Well with the patches the async page-flip case is exactly the same as
>>>>>> the non-async page-flip case. In both cases, if a FB_ID is included in
>>>>>> an atomic commit then the side effects are triggered even if the property
>>>>>> value didn't change. The rules are the same for everything.
>>>>>
>>>>> I see it only checking if FB_ID changes or not. If it doesn't
>>>>> change then the implication is that the side effects will in
>>>>> fact be skipped as not all planes may even support async flips.
>>>>
>>>> Hm right. So the problem is that setting any prop = same value as
>>>> previous one will result in a new page-flip for asynchronous page-flips,
>>>> but will not result in any side-effect for asynchronous page-flips.
>>>>
>>>> Does it actually matter though? For async page-flips, I don't think this
>>>> would result in any actual difference in behavior?
>>>
>>> To sum this up, here is a matrix of behavior as seen by user-space:
>>>
>>> - Sync atomic page-flip
>>> - Set FB_ID to different value: programs hw for page-flip, sends uevent
>>> - Set FB_ID to same value: same (important for VRR)
>>> - Set another plane prop to same value: same
>>
>> A page flip is programmed even if FB_ID isn't touched?
> 
> I believe so. Set CRTC_X on a plane to the same value as before, and the
> CRTC gets implicitly included in the atomic commit?
> 
>>> - Set another plane prop to different value: maybe rejected if modeset 
>>> required
>>> - Async atomic page-flip
>>> - Set FB_ID to different value: updates hw with new FB address, sends
>>> immediate uevent
>>> - Set FB_ID to same value: same (no-op for the hw)
>>
>> No-op implies it doesn't trigger scanning out a frame with VRR, if
>> scanout is currently in vertical blank. Is that the case? If so, async
>> flips can't reliably trigger scanning out a frame with VRR.
> 
> By no-op I mean that the hw is programmed for an immediate async flip
> with the same buffer addr as the previous one. So this doesn't actually
> change anything.

If a flip is programmed to the HW, it's not a no-op any more than in the sync 
case, in particular not with VRR.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Michel Dänzer
On 11/13/23 10:18, Simon Ser wrote:
> On Monday, October 23rd, 2023 at 10:25, Simon Ser  wrote:
> 
>>>>>>>>>>> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed 
>>>>>>>>>>> to
>>>>>>>>>>> +effectively change only the FB_ID property on any planes. 
>>>>>>>>>>> No-operation changes
>>>>>>>>>>> +are ignored as always. [...]
>>>>>>>>>>> During the hackfest in Brno, it was mentioned that a commit which 
>>>>>>>>>>> re-sets the same FB_ID could actually have an effect with VRR: It 
>>>>>>>>>>> could trigger scanout of the next frame before vertical blank has 
>>>>>>>>>>> reached its maximum duration. Some kind of mechanism is required 
>>>>>>>>>>> for this in order to allow user space to perform low frame rate 
>>>>>>>>>>> compensation.
>>>>>>>>>
>>>>>>>>> Xaver tested this hypothesis in a flipping the same fb in a VRR 
>>>>>>>>> monitor
>>>>>>>>> and it worked as expected, so this shouldn't be a concern.
>>>>>>>>> Right, so it must have some effect. It cannot be simply ignored like 
>>>>>>>>> in
>>>>>>>>> the proposed doc wording. Do we special-case re-setting the same FB_ID
>>>>>>>>> as "not a no-op" or "not ignored" or some other way?
>>>>>>>>> There's an effect in the refresh rate, the image won't change but it
>>>>>>>>> will report that a flip had happened asynchronously so the reported
>>>>>>>>> framerate will be increased. Maybe an additional wording could be 
>>>>>>>>> like:
>>>>>>>
>>>>>>> Flipping to the same FB_ID will result in a immediate flip as if it was
>>>>>>> changing to a different one, with no effect on the image but effecting
>>>>>>> the reported frame rate.
>>>>>>
>>>>>> Re-setting FB_ID to its current value is a special case regardless of
>>>>>> PAGE_FLIP_ASYNC, is it not?
>>>>>
>>>>> No. The rule has so far been that all side effects are observed
>>>>> even if you flip to the same fb. And that is one of my annoyances
>>>>> with this proposal. The rules will now be different for async flips
>>>>> vs. everything else.
>>>>
>>>> Well with the patches the async page-flip case is exactly the same as
>>>> the non-async page-flip case. In both cases, if a FB_ID is included in
>>>> an atomic commit then the side effects are triggered even if the property
>>>> value didn't change. The rules are the same for everything.
>>>
>>> I see it only checking if FB_ID changes or not. If it doesn't
>>> change then the implication is that the side effects will in
>>> fact be skipped as not all planes may even support async flips.
>>
>> Hm right. So the problem is that setting any prop = same value as
>> previous one will result in a new page-flip for asynchronous page-flips,
>> but will not result in any side-effect for asynchronous page-flips.
>>
>> Does it actually matter though? For async page-flips, I don't think this
>> would result in any actual difference in behavior?
> 
> To sum this up, here is a matrix of behavior as seen by user-space:
> 
> - Sync atomic page-flip
>   - Set FB_ID to different value: programs hw for page-flip, sends uevent
>   - Set FB_ID to same value: same (important for VRR)
>   - Set another plane prop to same value: same

A page flip is programmed even if FB_ID isn't touched?


>   - Set another plane prop to different value: maybe rejected if modeset 
> required
> - Async atomic page-flip
>   - Set FB_ID to different value: updates hw with new FB address, sends
> immediate uevent
>   - Set FB_ID to same value: same (no-op for the hw)

No-op implies it doesn't trigger scanning out a frame with VRR, if scanout is 
currently in vertical blank. Is that the case? If so, async flips can't 
reliably trigger scanning out a frame with VRR.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 5/6] drm/amdgpu: Add flag to disable implicit sync for GEM operations.

2023-10-31 Thread Michel Dänzer
On 10/31/23 15:34, Christian König wrote:
> On 10/31/23 15:14, Michel Dänzer wrote:
> 
>> FWIW, RADV will also want explicit sync in the CS ioctl.
> You can replace that with the DMA-buf IOCTLs like Faith is planning to do for 
> NVK. 

Those ioctls cannot disable implicit sync for the CS ioctl. They can be used 
for making implicit sync work correctly for individual BOs though, once 
implicit sync is disabled for the CS ioctl.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 5/6] drm/amdgpu: Add flag to disable implicit sync for GEM operations.

2023-10-31 Thread Michel Dänzer
On 10/31/23 14:40, Tatsuyuki Ishi wrote:
> In Vulkan, it is the application's responsibility to perform adequate
> synchronization before a sparse unmap, replace or BO destroy operation.
> Until now, the kernel applied the same rule as implicitly-synchronized
> APIs like OpenGL, which with per-VM BOs made page table updates stall the
> queue completely. The newly added AMDGPU_VM_EXPLICIT_SYNC flag allows
> drivers to opt-out of this behavior, while still ensuring adequate implicit
> sync happens for kernel-initiated updates (e.g. BO moves).
> 
> We record whether to use implicit sync or not for each freed mapping. To
> avoid increasing the mapping struct's size, this is union-ized with the
> interval tree field which is unused after the unmap.
> 
> The reason this is done with a GEM ioctl flag, instead of being a VM /
> context global setting, is that the current libdrm implementation shares
> the DRM handle even between different kind of drivers (radeonsi vs radv).

Different drivers always use separate contexts though, even with the same DRM 
file description, don't they?

FWIW, RADV will also want explicit sync in the CS ioctl.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color pipeline is needed

2023-10-27 Thread Michel Dänzer
On 10/26/23 21:25, Alex Goins wrote:
> On Thu, 26 Oct 2023, Sebastian Wick wrote:
>> On Thu, Oct 26, 2023 at 11:57:47AM +0300, Pekka Paalanen wrote:
>>> On Wed, 25 Oct 2023 15:16:08 -0500 (CDT)
>>> Alex Goins  wrote:
>>>
>>>> Despite being programmable, the LUTs are updated in a manner that is less
>>>> efficient as compared to e.g. the non-static "degamma" LUT. Would it be 
>>>> helpful
>>>> if there was some way to tag operations according to their performance,
>>>> for example so that clients can prefer a high performance one when they
>>>> intend to do an animated transition? I recall from the XDC HDR workshop
>>>> that this is also an issue with AMD's 3DLUT, where updates can be too
>>>> slow to animate.
>>>
>>> I can certainly see such information being useful, but then we need to
>>> somehow quantize the performance.
> 
> Right, which wouldn't even necessarily be universal, could depend on the given
> host, GPU, etc. It could just be a relative performance indication, to give an
> order of preference. That wouldn't tell you if it can or can't be animated, 
> but
> when choosing between two LUTs to animate you could prefer the higher
> performance one.
> 
>>>
>>> What I was left puzzled about after the XDC workshop is that is it
>>> possible to pre-load configurations in the background (slow), and then
>>> quickly switch between them? Hardware-wise I mean.
> 
> This works fine for our "fast" LUTs, you just point them to a surface in video
> memory and they flip to it. You could keep multiple surfaces around and flip
> between them without having to reprogram them in software. We can easily do 
> that
> with enumerated curves, populating them when the driver initializes instead of
> waiting for the client to request them. You can even point multiple hardware
> LUTs to the same video memory surface, if they need the same curve.
> 
>>
>> We could define that pipelines with a lower ID are to be preferred over
>> higher IDs.
> 
> Sure, but this isn't just an issue with a pipeline as a whole, but the
> individual elements within it and how to use them in a given context.
> 
>>
>> The issue is that if programming a pipeline becomes too slow to be
>> useful it probably should just not be made available to user space.
> 
> It's not that programming the pipeline is overall too slow. The LUTs we have
> that are relatively slow to program are meant to be set infrequently, or even
> just once, to allow the scaler and tone mapping operator to operate in fixed
> point PQ space. You might still want the tone mapper, so you would choose a
> pipeline that includes them, but when it comes to e.g. animating a night 
> light,
> you would want to choose a different LUT for that purpose.
> 
>>
>> The prepare-commit idea for blob properties would help to make the
>> pipelines usable again, but until then it's probably a good idea to just
>> not expose those pipelines.
> 
> The prepare-commit idea actually wouldn't work for these LUTs, because they 
> are
> programmed using methods instead of pointing them to a surface. I'm actually 
> not
> sure how slow it actually is, would need to benchmark it. I think not exposing
> them at all would be overkill, since it would mean you can't use the 
> preblending
> scaler or tonemapper, and animation isn't necessary for that.
> 
> The AMD 3DLUT is another example of a LUT that is slow to update, and it would
> obviously be a major loss if that wasn't exposed. There just needs to be some
> way for clients to know if they are going to kill performance by trying to
> change it every frame.

Might a first step be to require the ALLOW_MODESET flag to be set when changing 
the values for a colorop which is too slow to be updated per refresh cycle?

This would tell the compositor: You can use this colorop, but you can't change 
its values on the fly.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v7 4/6] drm: Refuse to async flip with atomic prop changes

2023-10-23 Thread Michel Dänzer
On 10/23/23 10:27, Simon Ser wrote:
> On Sunday, October 22nd, 2023 at 12:12, Michel Dänzer 
>  wrote:
>> On 10/17/23 14:16, Simon Ser wrote:
>>
>>> After discussing with André it seems like we missed a plane type check
>>> here. We need to make sure FB_ID changes are only allowed on primary
>>> planes.
>>
>> Can you elaborate why that's needed?
> 
> Current drivers are in general not prepared to perform async page-flips
> on planes other than primary. For instance I don't think i915 has logic
> to perform async page-flip on an overlay plane FB_ID change.

That should be handled in the driver's atomic_check then?

Async flips of overlay planes would be useful e.g. for presenting a windowed 
application with tearing, while the rest of the desktop is tear-free.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v7 4/6] drm: Refuse to async flip with atomic prop changes

2023-10-22 Thread Michel Dänzer
On 10/17/23 14:16, Simon Ser wrote:
> After discussing with André it seems like we missed a plane type check
> here. We need to make sure FB_ID changes are only allowed on primary
> planes.

Can you elaborate why that's needed?


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-10-13 Thread Michel Dänzer
On 10/13/23 11:41, Daniel Vetter wrote:
> On Thu, Oct 12, 2023 at 02:19:41PM -0400, Ray Strode wrote:
>> On Mon, Oct 09, 2023 at 02:36:17PM +0200, Christian König wrote:
>>>>>> To be clear, my take is, if driver code is running in process context
>>>>>> and needs to wait for periods of time on the order of or in excess of
>>>>>> a typical process time slice it should be sleeping during the waiting.
>>>>>> If the operation is at a point where it can be cancelled without side
>>>>>> effects, the sleeping should be INTERRUPTIBLE. If it's past the point
>>>>>> of no return, sleeping should be UNINTERRUPTIBLE. At no point, in my
>>>>>> opinion, should kernel code busy block a typical process for dozens of
>>>>>> milliseconds while keeping the process RUNNING. I don't think this is
>>>>>> a controversial take.
>>>>> Exactly that's what I completely disagree on.
>>
>> Okay if we can't agree that it's not okay for user space (or the
>> kernel running in the context of user space) to busy loop a cpu core
>> at 100% utilization throughout and beyond the process's entire
>> scheduled time slice then we really are at an impasse. I gotta say i'm
>> astonished that this seemingly indefensible behavior is somehow a
>> point of contention, but I'm not going to keep arguing about it beyond
>> this email.
>>
>> I mean we're not talking about scientific computing, or code
>> compilation, or seti@home. We're talking about nearly the equivalent
>> of `while (1) __asm__ ("nop");`
> 
> I don't think anyone said this shouldn't be fixed or improved.
> 
> What I'm saying is that the atomic ioctl is not going to make guarantees
> that it will not take up to much cpu time (for some extremely vague value
> of "too much") to the point that userspace can configure it's compositor
> in a way that it _will_ get killed if we _ever_ violate this rule.
> 
> We should of course try to do as good as job as possible, but that's not
> what you're asking for. You're asking for a hard real-time guarantee with
> the implication if we ever break it, it's a regression, and the kernel has
> to bend over backwards with tricks like in your patch to make it work.

I don't think mutter really needs or wants such a hard real-time guarantee. 
What it needs is a fighting chance to react before the kernel kills its process.

The intended mechanism for this is SIGXCPU, but that can't work if the kernel 
is stuck in a busy-loop. Ray's patch seems like one way to avoid that.

That said, as long as SIGXCPU can work as intended with the non-blocking 
commits mutter uses for everything except modesets, mutter's workaround of 
dropping RT priority for the blocking commits seems acceptable for the time 
being.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-10-08 Thread Michel Dänzer
On 10/6/23 20:48, Ray Strode wrote:
> 
> Note, a point that I don't think has been brought up yet, too, is
> the system unbound workqueue doesn't run with real time priority.
> Given the lion's share of mutter's drmModeAtomicCommit calls are
> nonblock, and so are using the system unbound workqueue for handling
> the commits, even without this patch, that somewhat diminishes the
> utility of using a realtime thread anyway. I believe the original
> point of the realtime thread was to make sure mouse cursor motion
> updates are as responsive as possible, because any latency there is
> something the user really feels.

Mutter's KMS thread needs to be real-time so that it can reliably schedule its 
work building up to calling the atomic commit ioctl for minimal input → output 
latency. That some of the ioctl's own work doesn't run at the same priority 
doesn't necessarily matter for this, as long as it can hit the next vertical 
blank period.

BTW, I understand kwin has or is planning to get a real-time KMS thread as 
well, for the same reasons.


> Maybe there needs to be a different mechanism in place to make sure
> user perceived interactivity is given priority.

The only alternative I'm aware of having been discussed so far is allowing 
atomic commits to be amended. I don't think that would be a great solution for 
this issue though, as it would result in Wayland compositors wasting CPU cycles 
(in other words, energy) for constant amendments of atomic commits, in the hope 
that one of them results in good latency.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] Revert "drm/amd/display: Check all enabled planes in dm_check_crtc_cursor"

2023-10-02 Thread Michel Dänzer
On 10/2/23 12:48, Michel Dänzer wrote:
> On 10/2/23 12:05, Michel Dänzer wrote:
>> On 9/29/23 22:41, Hamza Mahfooz wrote:
>>> From: Ivan Lipski 
>>>
>>> This reverts commit 45e1ade04b4d60fe5df859076005779f27c4c9be.
>>>
>>> Since, it causes the following IGT tests to fail:
>>> kms_cursor_legacy@cursor-vs-flip.*
>>> kms_cursor_legacy@flip-vs-cursor.*
>>
>> Any information about how those tests fail? Maybe they accidentally rely on 
>> the broken behaviour?
> 
> I was able to reproduce, that doesn't seem to be the case. They just rely on 
> multiple legacy cursor ioctl calls being able to complete between consecutive 
> flips, which I suppose is broken by always pulling in non-cursor plane state 
> with any cursor plane state changes.
> 
> I'll see if I can find a better solution. Meanwhile,
> 
> Acked-by: Michel Dänzer 
> 
> for the revert.

Alternatively, here's an incremental fix:

https://patchwork.freedesktop.org/series/124527/


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] Revert "drm/amd/display: Check all enabled planes in dm_check_crtc_cursor"

2023-10-02 Thread Michel Dänzer
On 10/2/23 12:05, Michel Dänzer wrote:
> On 9/29/23 22:41, Hamza Mahfooz wrote:
>> From: Ivan Lipski 
>>
>> This reverts commit 45e1ade04b4d60fe5df859076005779f27c4c9be.
>>
>> Since, it causes the following IGT tests to fail:
>> kms_cursor_legacy@cursor-vs-flip.*
>> kms_cursor_legacy@flip-vs-cursor.*
> 
> Any information about how those tests fail? Maybe they accidentally rely on 
> the broken behaviour?

I was able to reproduce, that doesn't seem to be the case. They just rely on 
multiple legacy cursor ioctl calls being able to complete between consecutive 
flips, which I suppose is broken by always pulling in non-cursor plane state 
with any cursor plane state changes.

I'll see if I can find a better solution. Meanwhile,

Acked-by: Michel Dänzer 

for the revert.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] Revert "drm/amd/display: Check all enabled planes in dm_check_crtc_cursor"

2023-10-02 Thread Michel Dänzer
On 9/29/23 22:41, Hamza Mahfooz wrote:
> From: Ivan Lipski 
> 
> This reverts commit 45e1ade04b4d60fe5df859076005779f27c4c9be.
> 
> Since, it causes the following IGT tests to fail:
> kms_cursor_legacy@cursor-vs-flip.*
> kms_cursor_legacy@flip-vs-cursor.*

Any information about how those tests fail? Maybe they accidentally rely on the 
broken behaviour?


FWIW, something like the reverted commit is definitely needed, see 
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3177#note_1829068 . That 
MR is blocked by the reverted fix.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-09-28 Thread Michel Dänzer
On 9/28/23 16:51, Christian König wrote:
> On 9/28/23 15:37, Michel Dänzer wrote:
>> On 9/28/23 14:59, Ray Strode wrote:
>>> On Thu, Sep 28, 2023 at 5:43 AM Michel Dänzer
>>>  wrote:
>>>>>>> When it's really not desirable to account the CPU overhead to the
>>>>>>> process initiating it then you probably rather want to use an non
>>>>>>> blocking commit plus a dma_fence to wait for the work to end from 
>>>>>>> userspace.
>>>>>> Well, first I don't think that's very convenient. You're talking about
>>>>>> a per-plane property, so there would need to be a separate file
>>>>>> descriptor allocated for every plane, right? and user-space would have
>>>>>> to block on all of them before proceeding?
>>>> OUT_FENCE_PTR is a per-CRTC property, not per-plane.
>>> Okay, sure.
>>>
>>>> Also, at least in this particular case, a single sync file (not dma_fence) 
>>>> for any CRTC might suffice.
>>> I don't see how we could rely on that given the provided api and
>>> multitude of drivers. It might work and then break randomly.
>> If it's supposed to work from the KMS API PoV, any bugs to the contrary 
>> should be fixed.
>>
>> I'm not really seeing the big difference between using a single fence or 
>> multiple, anyway.
> 
> The big difference is that a standard modeset can take some time, e.g. 
> setting up power levels, waiting for PLLs to settle, waiting for a vblank 
> etc..

Right (starting a game shouldn't involve such a modeset though in a Wayland 
session).

What I mean is there's no significant difference for user space between using a 
single out fence or multiple. Just slightly different bookkeeping.


> But in the case of turning thing off, what should we wait for?

The atomic commit to finish, so it's safe to submit another one without hitting 
EBUSY. (What we use the completion events for with page flips)


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-09-28 Thread Michel Dänzer
On 9/28/23 15:23, Christian König wrote:
> 
> What you need to do here is to report those problems to the driver teams and 
> not try to hide them this way.

See the linked issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2861 
(BTW, the original reporter of that issue isn't hitting it with DPMS off but 
just starting a game, so there seem to be at least two separate causes of 
exceeding 200ms for an atomic commit in DC)

It's not like GitLab issues get much if any attention by DC developers 
though... So Ray tried to come up with some kind of solution.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-09-28 Thread Michel Dänzer
On 9/28/23 14:59, Ray Strode wrote:
> On Thu, Sep 28, 2023 at 5:43 AM Michel Dänzer
>  wrote:
>>>>> When it's really not desirable to account the CPU overhead to the
>>>>> process initiating it then you probably rather want to use an non
>>>>> blocking commit plus a dma_fence to wait for the work to end from 
>>>>> userspace.
>>>> Well, first I don't think that's very convenient. You're talking about
>>>> a per-plane property, so there would need to be a separate file
>>>> descriptor allocated for every plane, right? and user-space would have
>>>> to block on all of them before proceeding?
>>
>> OUT_FENCE_PTR is a per-CRTC property, not per-plane.
> 
> Okay, sure.
> 
>> Also, at least in this particular case, a single sync file (not dma_fence) 
>> for any CRTC might suffice.
> 
> I don't see how we could rely on that given the provided api and
> multitude of drivers. It might work and then break randomly.

If it's supposed to work from the KMS API PoV, any bugs to the contrary should 
be fixed.

I'm not really seeing the big difference between using a single fence or 
multiple, anyway.


I do wonder if there might be a time window where the out fences have 
signalled, but the atomic commit ioctl will still fail with EBUSY. If there is 
though, I'd expect it to affect the flip completion events as well.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-09-28 Thread Michel Dänzer
On 9/28/23 08:56, Christian König wrote:
> Am 27.09.23 um 22:25 schrieb Ray Strode:
>> On Wed, Sep 27, 2023 at 4:05 AM Christian König
>>  wrote:
> 
>>> When it's really not desirable to account the CPU overhead to the
>>> process initiating it then you probably rather want to use an non
>>> blocking commit plus a dma_fence to wait for the work to end from userspace.
>> Well, first I don't think that's very convenient. You're talking about
>> a per-plane property, so there would need to be a separate file
>> descriptor allocated for every plane, right? and user-space would have
>> to block on all of them before proceeding?

OUT_FENCE_PTR is a per-CRTC property, not per-plane. Also, at least in this 
particular case, a single sync file (not dma_fence) for any CRTC might suffice.


>> Also, are you sure that works in all cases? The problematic case we're 
>> facing right >> now is when all planes and all crtcs are getting disabled. 
>> I'm just skimming
>> over the code here, but I see this:
>>
>> →   for_each_new_crtc_in_state(state, crtc, crtc_state, i) {•
>> ...
>> →   →   if (arg->flags & DRM_MODE_PAGE_FLIP_EVENT || fence_ptr) {•
>> ...
>> →   →   →   e = create_vblank_event(crtc, arg->user_data);•
>> ...
>> →   →   →   crtc_state->event = e;•
>> →   →   }•
>> ...
>> →   →   if (fence_ptr) {•
>> ...
>> →   →   →   fence = drm_crtc_create_fence(crtc);•
>> ...
>> →   →   →   ret = setup_out_fence([(*num_fences)++], fence);•
>> ...
>> →   →   →   crtc_state->event->base.fence = fence;•
>> →   →   }•
>>
>> Is the code really going to allocate a vblank_event when all the
>> crtc's are disabled ? I guess it's possible, but that's
>> counterintuitive to me. I really don't know. You're confident a set of
>> fences will actually work?
> 
> No when you disable everything there is of course no fence allocated.

I don't see why not. "new_crtc_in_state" in this code means the atomic commit 
contains some state for the CRTC (such as setting the OUT_FENCE_PTR property), 
it could disable or leave it disabled though. I can't see any other code which 
would prevent this from working with a disabled CRTC, I could be missing 
something though.


> But then you also should never see anything waiting for to long to actually 
> be able to trigger the RLIMIT_RTTIME.

amdgpu DC exceeds 200ms on some setups, see the linked issue.


>> Regardless, this seems like a roundabout way to address a problem that
>> we could just ... fix.

Handling modesets asynchronously does seem desirable in mutter to me. E.g. it 
would allow doing modesets in parallel with modesets or other atomic commits on 
other GPUs.


> From what I know RLIMIT_RTTIME usually works in units of multiple seconds. 

RealtimeKit seems to allow 200ms maximum.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/atomic-helper: prevent uaf in wait_for_vblanks

2023-09-19 Thread Michel Dänzer
0
> [ 3935.326614] page dumped because: kasan: bad access detected
> 
> [ 3935.326614] Memory state around the buggy address:
> [ 3935.326614]  88818a6f7f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
> fc fc
> [ 3935.326614]  88818a6f7f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
> fc fc
> [ 3935.326614] >88818a6f8000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb 
> fb fb
> [ 3935.326772]   ^
> [ 3935.326772]  88818a6f8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
> fb fb
> [ 3935.326772]  88818a6f8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
> fb fb
> [ 3935.326772] 
> ==
> 
> This suggest there may be some situation where a
> struct drm_crtc_state is referenced after already
> being freed by drm_atomic_state_default_clear. This
> patch will check the new_crtc_state is not null before
> using it.
> 
> Signed-off-by: José Pekkarinen 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index 292e38eb6218..cc75d387a542 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -1647,7 +1647,7 @@ drm_atomic_helper_wait_for_vblanks(struct drm_device 
> *dev,
>   return;
>  
>   for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, 
> new_crtc_state, i) {
> - if (!new_crtc_state->active)
> + if (new_crtc_state && !new_crtc_state->active)
>   continue;
>  
>   ret = drm_crtc_vblank_get(crtc);

I'm not quite seeing the connection between this change and the KASAN report.

If new_crtc_state was NULL, I would have expected a normal NULL pointer 
dereference oops, not a KASAN report.

The KASAN report instead indicates that new_crtc_state isn't NULL, but points 
to memory which has already been freed. This could be e.g. due to incorrect 
reference counting somewhere else.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v3] drm/plane: Add documentation about software color conversion.

2023-09-13 Thread Michel Dänzer
On 9/13/23 10:14, Jocelyn Falempe wrote:
> On 12/09/2023 17:57, Michel Dänzer wrote:
>> On 9/11/23 10:38, Pekka Paalanen wrote:
>>> On Fri, 8 Sep 2023 17:10:46 +0200
>>> Thomas Zimmermann  wrote:
>>>> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
>>>>> On Fri, 8 Sep 2023 15:56:51 +0200
>>>>> Thomas Zimmermann  wrote:
>>>>>>
>>>>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>>>>> the display thread, but the CPU could do something else in the meantime.
>>>>>> Doing format conversion on the CPU prevents that, hence affecting other
>>>>>> parts of the system negatively. Of course, that's more of a gut feeling
>>>>>> than hard data.
>>
>> Jocelyn, have you measured if the XRGB8888 -> RGB888 conversion copy takes 
>> longer than a straight RGB888 -> RGB888 copy in the kernel?
> 
> At least on my Matrox system, the conversion is really negligible compared to 
> the copy, due to slow memory bandwidth. I wasn't able to see a difference, 
> using ktime_get_ns().
> 
> The CPU is an old Intel(R) Core(TM) i3 CPU 540  @ 3.07GHz
> in 1280x1024, the RGB888 copy takes 95ms.
> and the XRGB8888->RGB888 takes also 95ms.
> (and the full XRGB8888 copy takes 125ms)

Then any XRGB8888 → RGB888 conversion in user space has to result in worse 
performance.


> But let's admit that this discussion is going nowhere, and that I failed to 
> reach a consensus, so I'm now focusing on other things. 

I see consensus, with one person disagreeing.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v3] drm/plane: Add documentation about software color conversion.

2023-09-12 Thread Michel Dänzer
On 9/11/23 10:38, Pekka Paalanen wrote:
> On Fri, 8 Sep 2023 17:10:46 +0200
> Thomas Zimmermann  wrote:
>> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
>>> On Fri, 8 Sep 2023 15:56:51 +0200
>>> Thomas Zimmermann  wrote:
>>>> 
>>>> I have a number of concerns. My point it not that we shouldn't optimize.
>>>> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
>>>> for userspace to use.
>>>>
>>>> AFAICT the main argument against userspace is that Mesa doesn't like
>>>> 3-byte pixels. But I don't see how this conversion cannot be a
>>>> post-processing step within Mesa: do the rendering in RGB32 and then
>>>> convert to a framebuffer in RGB24.

Even assuming the conversion could be handled transparently in Mesa, it would 
still require the KMS client to pick RGB888 instead of XRGB8888. Most (all?) 
KMS clients support XRGB8888, many of them will realistically never support 
RGB888. (Or even if they did, they might prefer XRGB8888 anyway, if RGB888 
requires a final conversion step)


>>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>>> the display thread, but the CPU could do something else in the meantime.
>>>> Doing format conversion on the CPU prevents that, hence affecting other
>>>> parts of the system negatively. Of course, that's more of a gut feeling
>>>> than hard data.

Jocelyn, have you measured if the XRGB8888 -> RGB888 conversion copy takes 
longer than a straight RGB888 -> RGB888 copy in the kernel?


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v11] drm: Add initial ci/ subdirectory

2023-09-11 Thread Michel Dänzer
On 9/11/23 14:51, Maxime Ripard wrote:
> On Mon, Sep 11, 2023 at 02:13:43PM +0200, Michel Dänzer wrote:
>> On 9/11/23 11:34, Maxime Ripard wrote:
>>> On Thu, Sep 07, 2023 at 01:40:02PM +0200, Daniel Stone wrote:
>>>>
>>>> Secondly, we will never be there. If we could pause for five years and sit
>>>> down making all the current usecases for all the current hardware on the
>>>> current kernel run perfectly, we'd probably get there. But we can't: 
>>>> there's
>>>> new hardware, new userspace, and hundreds of new kernel trees.
>>>
>>> [...]
>>> 
>>> I'm not sure it's actually an argument, really. 10 years ago, we would
>>> never have been at "every GPU on the market has an open-source driver"
>>> here. 5 years ago, we would never have been at this-series-here. That
>>> didn't stop anyone making progress, everyone involved in that thread
>>> included.
>>
>> Even assuming perfection is achievable at all (which is very doubtful,
>> given the experience from the last few years of CI in Mesa and other
>> projects), if you demand perfection before even taking the first step,
>> it will never get off the ground.
> 
> Perfection and scale from the get-go isn't reasonable, yes. Building a
> small, "perfect" (your words, not mine) system that you can later expand
> is doable.

I mean "perfect" as in every single available test runs, is reliable and gates 
CI. Which seems to be what you're asking for. The only possible expansion of 
such a system would be adding new 100% reliable tests.

What is being proposed here is an "imperfect" system which takes into account 
the reality that some tests are not 100% reliable, and can be improved 
gradually while already preventing some regressions from getting merged.


>>> How are we even supposed to detect those failures in the first
>>> place if tests are flagged as unreliable?
>>
>> Based on experience with Mesa, only a relatively small minority of
>> tests should need to be marked as flaky / not run at all. The majority
>> of tests are reliable and can catch regressions even while some tests
>> are not yet.
> 
> I understand and acknowledge that it worked with Mesa. That's great for
> Mesa. That still doesn't mean that it's the panacea and is for every
> project.

Not sure what you're referring to by panacea, or how it relates to "some tests 
can be useful even while others aren't yet".


>>> No matter what we do here, what you describe will always happen. Like,
>>> if we do flag those tests as unreliable, what exactly prevents another
>>> issue to come on top undetected, and what will happen when we re-enable
>>> testing?
>>
>> Any issues affecting a test will need to be fixed before (re-)enabling
>> the test for CI.
> 
> If that underlying issue is never fixed, at which point do we consider
> that it's a failure and should never be re-enabled? Who has that role?

Not sure what you're asking. Anybody can (re-)enable a test in CI, they just 
need to make sure first that it is reliable. Until somebody does that work, 
it'll stay disabled in CI.


>>> It might or might not be an issue for Linus' release, but I can
>>> definitely see the trouble already for stable releases where fixes will
>>> be backported, but the test state list certainly won't be updated.
>>
>> If the stable branch maintainers want to take advantage of CI for the
>> stable branches, they may need to hunt for corresponding state list
>> commits sometimes. They'll need to take that into account for their
>> decision.
> 
> So we just expect the stable maintainers to track each and every patches
> involved in a test run, make sure that they are in a stable tree, and
> then update the test list? Without having consulted them at all?

I don't expect them to do anything. See the If at the start of what I wrote.


>>>> By keeping those sets of expectations, we've been able to keep Mesa pretty
>>>> clear of regressions, whilst having a very clear set of things that should
>>>> be fixed to point to. It would be great if those set of things were zero,
>>>> but it just isn't. Having that is far better than the two alternatives:
>>>> either not testing at all (obviously bad), or having the test always be red
>>>> so it's always ignored (might as well just not test).
>>>
>>> Isn't that what happens with flaky tests anyway?
>>
>> For a small minority of tests. Daniel was referring to whole test suites.
>>
>>> Even more so since we have 0 context when updating that l

Re: [PATCH v11] drm: Add initial ci/ subdirectory

2023-09-11 Thread Michel Dänzer
 provide whatever context is needed.


> I've asked a couple of times, I'll ask again. In that other series, on
> the MT8173, kms_hdmi_inject@inject-4k is setup as flaky (which is a KMS
> test btw).
> 
> I'm a maintainer for that part of the kernel, I'd like to look into it,
> because it's seriously something that shouldn't fail, ever, the hardware
> isn't involved.
> 
> How can I figure out now (or worse, let's say in a year) how to
> reproduce it? What kernel version was affected? With what board? After
> how many occurences?
> 
> Basically, how can I see that the bug is indeed there (or got fixed
> since), and how to start fixing it?

Many of those things should be documented in the commit log of the state list 
change.

How the CI works in general should be documented in some appropriate place in 
tree.
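
As an illustration only (file and test names here are invented, loosely following the Mesa-style expectation files this kind of CI uses), a per-board state list might look like:

```
# mediatek-mt8173-fails.txt: known failures; a new failure or an
# unexpected pass gates the pipeline
kms_3d,Fail

# mediatek-mt8173-flakes.txt: unreliable tests; results are ignored
kms_hdmi_inject@inject-4k
```

The commit adding or changing such an entry is where the kernel version, board, and observed failure rate can be recorded.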


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v3 0/7] GPU workload hints for better performance

2023-08-30 Thread Michel Dänzer
On 8/28/23 17:02, Lazar, Lijo wrote:
> As mentioned with an older version of this series, this is an 'abuse' of 
> power profile interface.
> 
> This series is oversimplifying what PMFW algorithms are supposed to be doing. 
> Whatever this series is doing, FW can do it better.
> 
> To explain in simpler terms - it just tries to boost a profile based on ring 
> type without even knowing how much of activity a job can trigger on a 
> particular ring. A job scheduled to a GFX ring doesn't deserve a profile 
> boost unless it can create a certain level of activity. In CPU terms, a job 
> scheduled to a processor doesn't mean it deserves a frequency boost of that 
> CPU.  At minimum it depends on more details like whether that job is compute 
> bound or memory bound. 
> 
> While FW algorithms are designed to do that, this series tries to trivialise 
> all such things.
> 
> Unless you are able to show the tangible benefits in some terms like 
> performance, power, or performance per watt,  I don't think this should be 
> the default behaviour where driver tries to override FW just based on job 
> submissions to rings.

I know at least one tangible benefit this would have: a snappier GNOME desktop 
with lower input → output latency on many laptops. The bootup default profile 
doesn't work well for that IME.

It should also help for issues like
https://gitlab.freedesktop.org/drm/amd/-/issues/1500 .

That said, I agree this approach is very aggressive. I think it might be 
acceptable with AC power, not sure about on battery though. (There might be 
better performance/power profile mechanisms to hook into than AC vs battery)


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v2 19/34] drm/amd/display: decouple steps for mapping CRTC degamma to DC plane

2023-08-30 Thread Michel Dänzer
ng dc_stream_state's input transfer function (ITF) 
>>>> in
>>>> + * preparation for hardware commit. The transfer function used depends on
>>>> + * the preparation done on the stream for color management.
>>>> + *
>>>> + * Returns:
>>>> + * 0 on success. -ENOMEM if mem allocation fails.
>>>> + */
>>>> +int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
>>>> +struct dc_plane_state *dc_plane_state)
>>>> +{
>>>> +  bool has_crtc_cm_degamma;
>>>> +  int ret;
>>>> +
>>>> +  has_crtc_cm_degamma = (crtc->cm_has_degamma || 
>>>> crtc->cm_is_degamma_srgb);
>>>> +  if (has_crtc_cm_degamma){
>>>> +      /* AMD HW doesn't have post-blending degamma caps. When DRM
>>>> +   * CRTC atomic degamma is set, we maps it to DPP degamma block
>>>> +   * (pre-blending) or, on legacy gamma, we use DPP degamma to
>>>> +   * linearize (implicit degamma) from sRGB/BT709 according to
>>>> +   * the input space.  
>>>
>>> Uhh, you can't just move degamma before blending if KMS userspace
>>> wants it after blending. That would be incorrect behaviour. If you
>>> can't implement it correctly, reject it.
>>>
>>> I hope that magical unexpected linearization is not done with atomic,
>>> either.
>>>
>>> Or maybe this is all a lost cause, and only the new color-op pipeline
>>> UAPI will actually work across drivers.  
>>
>> I agree that crtc degamma is an optional property and should be not
>> exposed if not available.  I did something in this line for DCE that has
>> no degamma block[1].  Then, AMD DDX driver stopped to advertise atomic
>> API for DCE, that was not correct too[2].
> 
> Did AMD go through all the trouble of making their Xorg DDX use KMS
> atomic, even after the kernel took it away from X due to modesetting
> DDX screwing it up?

No, I think Melissa meant the KMS properties for advanced colour transforms, 
which xf86-video-amdgpu uses, not with atomic KMS though.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH (set 1) 00/20] Rid W=1 warnings from GPU

2023-08-28 Thread Michel Dänzer
On 8/24/23 14:07, Lee Jones wrote:
> On Thu, 24 Aug 2023, Jani Nikula wrote:
>> On Thu, 24 Aug 2023, Lee Jones  wrote:
>>> This set is part of a larger effort attempting to clean-up W=1
>>> kernel builds, which are currently overwhelmingly riddled with
>>> niggly little warnings.
>>
>> The next question is, how do we keep it W=1 clean going forward?
> 
> My plan was to fix them all, then move each warning to W=0.
> 
> Arnd recently submitted a set doing just that for a bunch of them.
> 
> https://lore.kernel.org/all/20230811140327.3754597-1-a...@kernel.org/
> 
> I like to think a bunch of this is built on top of my previous efforts.
> 
> GPU is particularly tricky though - the warnings seem to come in faster
> than I can squash them.  Maybe the maintainers can find a way to test
> new patches on merge?

One approach for this which has proved effective in Mesa and other projects is 
to make warnings fatal in CI which must pass for any changes to be merged. 
There is ongoing work toward introducing this for the DRM subsystem, using 
gitlab.freedesktop.org CI.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-08-22 Thread Michel Dänzer
On 8/21/23 22:02, André Almeida wrote:
> Em 17/08/2023 07:37, Michel Dänzer escreveu:
>> On 8/15/23 20:57, André Almeida wrote:
>>> From: Pekka Paalanen 
>>>
>>> Specify how the atomic state is maintained between userspace and
>>> kernel, plus the special case for async flips.
>>>
>>> Signed-off-by: Pekka Paalanen 
>>> Signed-off-by: André Almeida 
>>
>> [...]
>>
>>> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
>>> +effectively change only the FB_ID property on any planes. No-operation 
>>> changes
>>> +are ignored as always. [...]
>>
>> During the hackfest in Brno, it was mentioned that a commit which re-sets 
>> the same FB_ID could actually have an effect with VRR: It could trigger 
>> scanout of the next frame before vertical blank has reached its maximum 
>> duration. Some kind of mechanism is required for this in order to allow user 
>> space to perform low frame rate compensation.
>>
> 
> I believe the documentation already addresses that sending redundant 
> information may not lead to the desired behavior during an async flip. Do you 
> think adding a note about using the same FB_ID would be helpful?

Maybe not.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-08-17 Thread Michel Dänzer
On 8/17/23 12:37, Michel Dänzer wrote:
> On 8/15/23 20:57, André Almeida wrote:
>> From: Pekka Paalanen 
>>
>> Specify how the atomic state is maintained between userspace and
>> kernel, plus the special case for async flips.
>>
>> Signed-off-by: Pekka Paalanen 
>> Signed-off-by: André Almeida 
> 
> [...]
> 
>> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
>> +effectively change only the FB_ID property on any planes. No-operation 
>> changes
>> +are ignored as always. [...]
> 
> During the hackfest in Brno, it was mentioned that a commit which re-sets the 
> same FB_ID could actually have an effect with VRR: It could trigger scanout 
> of the next frame before vertical blank has reached its maximum duration. 
> Some kind of mechanism is required for this in order to allow user space to 
> perform low frame rate compensation.

That said, it doesn't make too much sense to use DRM_MODE_PAGE_FLIP_ASYNC for 
this, since it won't have any effect outside of vertical blank anyway.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-08-17 Thread Michel Dänzer
On 8/15/23 20:57, André Almeida wrote:
> From: Pekka Paalanen 
> 
> Specify how the atomic state is maintained between userspace and
> kernel, plus the special case for async flips.
> 
> Signed-off-by: Pekka Paalanen 
> Signed-off-by: André Almeida 

[...]

> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> +effectively change only the FB_ID property on any planes. No-operation 
> changes
> +are ignored as always. [...]

During the hackfest in Brno, it was mentioned that a commit which re-sets the 
same FB_ID could actually have an effect with VRR: It could trigger scanout of 
the next frame before vertical blank has reached its maximum duration. Some 
kind of mechanism is required for this in order to allow user space to perform 
low frame rate compensation.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6] drm/doc: Document DRM device reset expectations

2023-08-14 Thread Michel Dänzer
On 8/11/23 20:55, André Almeida wrote:
> Create a section that specifies how to deal with DRM device resets for
> kernel and userspace drivers.
> 
> Signed-off-by: André Almeida 

[...]

> +Robustness
> +--
> +
> +The only way to try to keep an application working after a reset is if it
> +complies with the robustness aspects of the graphical API that it is using.

"The only way to try to keep a graphical API context working after a device 
reset [...]"

If a graphical API context stops working properly, it doesn't necessarily mean 
that the application as a whole stops working.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-10 Thread Michel Dänzer
On 8/9/23 21:15, Marek Olšák wrote:
> On Wed, Aug 9, 2023 at 3:35 AM Michel Dänzer  
> wrote:
>> On 8/8/23 19:03, Marek Olšák wrote:
>>> It's the same situation as SIGSEGV. A process can catch the signal,
>>> but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
>>> to catch the GPU error and prevent the process termination. If you
>>> don't use the API, you'll get undefined behavior, which means anything
>>> can happen, including process termination.
>>
>> Got a spec reference for that?
>>
>> I know the spec allows process termination in response to e.g. out of bounds 
>> buffer access by the application (which corresponds to SIGSEGV). There are 
>> other causes for GPU hangs though, e.g. driver bugs. The ARB_robustness spec 
>> says:
>>
>> If the reset notification behavior is NO_RESET_NOTIFICATION_ARB,
>> then the implementation will never deliver notification of reset
>> events, and GetGraphicsResetStatusARB will always return
>> NO_ERROR[fn1].
>>[fn1: In this case it is recommended that implementations should
>> not allow loss of context state no matter what events occur.
>> However, this is only a recommendation, and cannot be relied
>> upon by applications.]
>>
>> No mention of process termination, that rather sounds to me like the GL 
>> implementation should do its best to keep the application running.
> 
> It basically says that we can do anything.

Not really? If program termination is a possible outcome, the spec otherwise 
mentions that explicitly, ala "including program termination".


> A frozen window or flipping between 2 random frames can't be described
> as "keeping the application running".

This assumes that an application which uses OpenGL cannot have any other 
purpose than using OpenGL.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-09 Thread Michel Dänzer
On 8/8/23 19:03, Marek Olšák wrote:
> It's the same situation as SIGSEGV. A process can catch the signal,
> but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
> to catch the GPU error and prevent the process termination. If you
> don't use the API, you'll get undefined behavior, which means anything
> can happen, including process termination.

Got a spec reference for that?

I know the spec allows process termination in response to e.g. out of bounds 
buffer access by the application (which corresponds to SIGSEGV). There are 
other causes for GPU hangs though, e.g. driver bugs. The ARB_robustness spec 
says:

If the reset notification behavior is NO_RESET_NOTIFICATION_ARB,
then the implementation will never deliver notification of reset
events, and GetGraphicsResetStatusARB will always return
NO_ERROR[fn1].
   [fn1: In this case it is recommended that implementations should
not allow loss of context state no matter what events occur.
However, this is only a recommendation, and cannot be relied
upon by applications.]

No mention of process termination, that rather sounds to me like the GL 
implementation should do its best to keep the application running.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v4 03/17] drm/imagination/uapi: Add PowerVR driver UAPI

2023-08-08 Thread Michel Dänzer
On 7/14/23 16:25, Sarah Walker wrote:
> 
> +/**
> + * DOC: PowerVR IOCTL CREATE_BO interface
> + */
> +
> +/**
> + * DOC: Flags for CREATE_BO
> + *
> + * The  drm_pvr_ioctl_create_bo_args.flags field is 64 bits wide and 
> consists
> + * of three groups of flags: creation, device mapping and CPU mapping.
> + *
> + * We use "device" to refer to the GPU here because of the ambiguity between
> + * CPU and GPU in some fonts.
> + *
> + * Creation options
> + *These use the prefix ``DRM_PVR_BO_CREATE_``.
> + *
> + *:ZEROED: Require the allocated buffer to be zeroed before returning. 
> Note
> + *  that this is an active operation, and is never zero cost. Unless it 
> is
> + *  explicitly required, this option should not be set.

Making this optional is kind of problematic from a security standpoint 
(information leak, at least if the memory was previously used by a different 
process). See e.g. the discussion starting at 
https://gitlab.freedesktop.org/mesa/mesa/-/issues/9189#note_1972986 .

AFAICT the approach I suggested there (Clear freed memory in the background, 
and make it available for allocation again only once the clear has finished) 
isn't really possible with gem_shmem in its current state though. There seems 
to be ongoing work to do something like that for __GFP_ZERO in general, maybe 
gem_shmem could take advantage of that when it lands. I'm afraid this series 
can't depend on that though.


> +/**
> + * DOC: PowerVR IOCTL VM_MAP and VM_UNMAP interfaces
> + *
> + * The VM UAPI allows userspace to create buffer object mappings in GPU 
> virtual address space.
> + *
> + * The client is responsible for managing GPU address space. It should 
> allocate mappings within
> + * the heaps returned by %DRM_PVR_DEV_QUERY_HEAP_INFO_GET.
> + *
> + * %DRM_IOCTL_PVR_VM_MAP creates a new mapping. The client provides the 
> target virtual address for
> + * the mapping. Size and offset within the mapped buffer object can be 
> specified, so the client can
> + * partially map a buffer.
> + *
> + * %DRM_IOCTL_PVR_VM_UNMAP removes a mapping. The entire mapping will be 
> removed from GPU address
> + * space. For this reason only the start address is provided by the client.
> + */

FWIW, the amdgpu driver uses a single ioctl for VM map & unmap (plus two 
additional operations for partial residency). Maybe this would make sense for 
the PowerVR driver as well, in particular if it might support partial residency 
in the future.

(amdgpu also uses similar multiplexer ioctls for other things such as context 
create/destroy/...)

Just an idea, feel free to ignore.


> +/**
> + * DOC: Flags for SUBMIT_JOB ioctl geometry command.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_FIRST
> + *
> + *Indicates if this the first command to be issued for a render.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_LAST

Does user space really need to pass in the FIRST/LAST flags, can't the kernel 
driver determine this implicitly? What happens if user space sets these 
incorrectly?


> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_FRAG_CMD_PREVENT_CDM_OVERLAP
> + *
> + *Disallow compute overlapped with this render.

Does this affect only compute from the same context, or also from other 
contexts?

(Similar question for DRM_PVR_SUBMIT_JOB_COMPUTE_CMD_PREVENT_ALL_OVERLAP)


P.S. I mostly just skimmed the other patches of the series, but my impression 
is that the patches and code are cleanly structured and well-documented.

-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-08-02 Thread Michel Dänzer
On 8/2/23 09:38, Marek Olšák wrote:
> 
> The precedent from the CPU land is pretty strong here. There is
> SIGSEGV for invalid CPU memory access and SIGILL for invalid CPU
> instructions, yet we do nothing for invalid GPU memory access and
> invalid GPU instructions. Sending a terminating signal from the kernel
> would be the most natural thing to do.

After an unhandled SIGSEGV or SIGILL, the process is in an inconsistent state 
and cannot safely continue executing. That's why the process is terminated by 
default in those cases.

The same is not true when an OpenGL context stops working. Any threads / other 
parts of the process not using that OpenGL context continue working normally. 
And any attempts to use that OpenGL context can be safely ignored (or the 
OpenGL implementation couldn't support the robustness extensions).


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-07-26 Thread Michel Dänzer
On 7/25/23 15:02, André Almeida wrote:
> Em 25/07/2023 05:03, Michel Dänzer escreveu:
>> On 7/25/23 04:55, André Almeida wrote:
>>> Hi everyone,
>>>
>>> It's not clear what we should do about non-robust OpenGL apps after GPU 
>>> resets, so I'll try to summarize the topic, show some options and my 
>>> proposal to move forward on that.
>>>
>>> Em 27/06/2023 10:23, André Almeida escreveu:
>>>> +Robustness
>>>> +--
>>>> +
>>>> +The only way to try to keep an application working after a reset is if it
>>>> +complies with the robustness aspects of the graphical API that it is 
>>>> using.
>>>> +
>>>> +Graphical APIs provide ways to applications to deal with device resets. 
>>>> However,
>>>> +there is no guarantee that the app will use such features correctly, and 
>>>> the
>>>> +UMD can implement policies to close the app if it is a repeating offender,
>>>> +likely in a broken loop. This is done to ensure that it does not keep 
>>>> blocking
>>>> +the user interface from being correctly displayed. This should be done 
>>>> even if
>>>> +the app is correct but happens to trigger some bug in the hardware/driver.
>>>> +
>>> Depending on the OpenGL version, there are different robustness API 
>>> available:
>>>
>>> - OpenGL ABR extension [0]
>>> - OpenGL KHR extension [1]
>>> - OpenGL ES extension  [2]
>>>
>>> Apps written in OpenGL should use whatever version is available for them to 
>>> make the app robust for GPU resets. That usually means calling 
>>> GetGraphicsResetStatusARB(), checking the status, and if it encounter 
>>> something different from NO_ERROR, that means that a reset has happened, 
>>> the context is considered lost and should be recreated. If an app follow 
>>> this, it will likely succeed recovering a reset.
>>>
>>> What should non-robustness apps do then? They certainly will not be 
>>> notified if a reset happens, and thus can't recover if their context is 
>>> lost. OpenGL specification does not explicitly define what should be done 
>>> in such situations[3], and I believe that usually when the spec mandates to 
>>> close the app, it would explicitly note it.
>>>
>>> However, in reality there are different types of device resets, causing 
>>> different results. A reset can be precise enough to damage only the guilty 
>>> context, and keep others alive.
>>>
>>> Given that, I believe drivers have the following options:
>>>
>>> a) Kill all non-robust apps after a reset. This may lead to lose work from 
>>> innocent applications.
>>>
>>> b) Ignore all non-robust apps OpenGL calls. That means that applications 
>>> would still be alive, but the user interface would be freeze. The user 
>>> would need to close it manually anyway, but in some corner cases, the app 
>>> could autosave some work or the user might be able to interact with it 
>>> using some alternative method (command line?).
>>>
>>> c) Kill just the affected non-robust applications. To do that, the driver 
>>> need to be 100% sure on the impact of its resets.
>>>
>>> RadeonSI currently implements a), as can be seen at [4], while Iris 
>>> implements what I think it's c)[5].
>>>
>>> For the user experience point-of-view, c) is clearly the best option, but 
>>> it's the hardest to achieve. There's not much gain on having b) over a), 
>>> perhaps it could be an optional env var for such corner case applications.
>>
>> I disagree on these conclusions.
>>
>> c) is certainly better than a), but it's not "clearly the best" in all 
>> cases. The OpenGL UMD is not a privileged/special component and is in no 
>> position to decide whether or not the process as a whole (only some 
>> thread(s) of which may use OpenGL at all) gets to continue running or not.
>>
> 
> Thank you for the feedback. How do you think the documentation should look 
> like for this part?

The initial paragraph about robustness should say "keep OpenGL working" instead 
of "keep an application working". If an OpenGL context stops working, it 
doesn't necessarily mean the application stops working altogether.


If the application doesn't use the robustness extensions, your option b) is 
what should happen by default whenever possible. And it really has to be 
possible if the robustness extensions are supported.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-07-25 Thread Michel Dänzer
On 7/25/23 17:05, Marek Olšák wrote:
> On Tue, Jul 25, 2023 at 4:03 AM Michel Dänzer
>  wrote:
>> On 7/25/23 04:55, André Almeida wrote:
>>> Hi everyone,
>>>
>>> It's not clear what we should do about non-robust OpenGL apps after GPU 
>>> resets, so I'll try to summarize the topic, show some options and my 
>>> proposal to move forward on that.
>>>
>>> Em 27/06/2023 10:23, André Almeida escreveu:
>>>> +Robustness
>>>> +--
>>>> +
>>>> +The only way to try to keep an application working after a reset is if it
>>>> +complies with the robustness aspects of the graphical API that it is 
>>>> using.
>>>> +
>>>> +Graphical APIs provide ways to applications to deal with device resets. 
>>>> However,
>>>> +there is no guarantee that the app will use such features correctly, and 
>>>> the
>>>> +UMD can implement policies to close the app if it is a repeating offender,
>>>> +likely in a broken loop. This is done to ensure that it does not keep 
>>>> blocking
>>>> +the user interface from being correctly displayed. This should be done 
>>>> even if
>>>> +the app is correct but happens to trigger some bug in the hardware/driver.
>>>> +
>>> Depending on the OpenGL version, there are different robustness API 
>>> available:
>>>
>>> - OpenGL ABR extension [0]
>>> - OpenGL KHR extension [1]
>>> - OpenGL ES extension  [2]
>>>
>>> Apps written in OpenGL should use whatever version is available for them to 
>>> make the app robust for GPU resets. That usually means calling 
>>> GetGraphicsResetStatusARB(), checking the status, and if it encounter 
>>> something different from NO_ERROR, that means that a reset has happened, 
>>> the context is considered lost and should be recreated. If an app follow 
>>> this, it will likely succeed recovering a reset.
>>>
>>> What should non-robustness apps do then? They certainly will not be 
>>> notified if a reset happens, and thus can't recover if their context is 
>>> lost. OpenGL specification does not explicitly define what should be done 
>>> in such situations[3], and I believe that usually when the spec mandates to 
>>> close the app, it would explicitly note it.
>>>
>>> However, in reality there are different types of device resets, causing 
>>> different results. A reset can be precise enough to damage only the guilty 
>>> context, and keep others alive.
>>>
>>> Given that, I believe drivers have the following options:
>>>
>>> a) Kill all non-robust apps after a reset. This may lead to lose work from 
>>> innocent applications.
>>>
>>> b) Ignore all non-robust apps OpenGL calls. That means that applications 
>>> would still be alive, but the user interface would be freeze. The user 
>>> would need to close it manually anyway, but in some corner cases, the app 
>>> could autosave some work or the user might be able to interact with it 
>>> using some alternative method (command line?).
>>>
>>> c) Kill just the affected non-robust applications. To do that, the driver 
>>> need to be 100% sure on the impact of its resets.
>>>
>>> RadeonSI currently implements a), as can be seen at [4], while Iris 
>>> implements what I think it's c)[5].
>>>
>>> For the user experience point-of-view, c) is clearly the best option, but 
>>> it's the hardest to achieve. There's not much gain on having b) over a), 
>>> perhaps it could be an optional env var for such corner case applications.
>>
>> I disagree on these conclusions.
>>
>> c) is certainly better than a), but it's not "clearly the best" in all 
>> cases. The OpenGL UMD is not a privileged/special component and is in no 
>> position to decide whether or not the process as a whole (only some 
>> thread(s) of which may use OpenGL at all) gets to continue running or not.
> 
> That's not true.

Which part of what I wrote are you referring to?


> I recommend that you enable b) with your driver and then hang the GPU under 
> different scenarios and see the result.

I've been doing GPU driver development for over two decades, I'm perfectly 
aware what it means. It doesn't change what I wrote above.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-07-25 Thread Michel Dänzer
On 7/25/23 04:55, André Almeida wrote:
> Hi everyone,
> 
> It's not clear what we should do about non-robust OpenGL apps after GPU 
> resets, so I'll try to summarize the topic, show some options and my proposal 
> to move forward on that.
> 
> Em 27/06/2023 10:23, André Almeida escreveu:
>> +Robustness
>> +--
>> +
>> +The only way to try to keep an application working after a reset is if it
>> +complies with the robustness aspects of the graphical API that it is using.
>> +
>> +Graphical APIs provide ways to applications to deal with device resets. 
>> However,
>> +there is no guarantee that the app will use such features correctly, and the
>> +UMD can implement policies to close the app if it is a repeating offender,
>> +likely in a broken loop. This is done to ensure that it does not keep 
>> blocking
>> +the user interface from being correctly displayed. This should be done even 
>> if
>> +the app is correct but happens to trigger some bug in the hardware/driver.
>> +
> Depending on the OpenGL version, there are different robustness API available:
> 
> - OpenGL ABR extension [0]
> - OpenGL KHR extension [1]
> - OpenGL ES extension  [2]
> 
> Apps written in OpenGL should use whatever version is available for them to 
> make the app robust for GPU resets. That usually means calling 
> GetGraphicsResetStatusARB(), checking the status, and if it encounter 
> something different from NO_ERROR, that means that a reset has happened, the 
> context is considered lost and should be recreated. If an app follow this, it 
> will likely succeed recovering a reset.
> 
> What should non-robustness apps do then? They certainly will not be notified 
> if a reset happens, and thus can't recover if their context is lost. OpenGL 
> specification does not explicitly define what should be done in such 
> situations[3], and I believe that usually when the spec mandates to close the 
> app, it would explicitly note it.
> 
> However, in reality there are different types of device resets, causing 
> different results. A reset can be precise enough to damage only the guilty 
> context, and keep others alive.
> 
> Given that, I believe drivers have the following options:
> 
> a) Kill all non-robust apps after a reset. This may lead to lose work from 
> innocent applications.
> 
> b) Ignore all non-robust apps OpenGL calls. That means that applications 
> would still be alive, but the user interface would be freeze. The user would 
> need to close it manually anyway, but in some corner cases, the app could 
> autosave some work or the user might be able to interact with it using some 
> alternative method (command line?).
> 
> c) Kill just the affected non-robust applications. To do that, the driver 
> need to be 100% sure on the impact of its resets.
> 
> RadeonSI currently implements a), as can be seen at [4], while Iris 
> implements what I think it's c)[5].
> 
> For the user experience point-of-view, c) is clearly the best option, but 
> it's the hardest to achieve. There's not much gain on having b) over a), 
> perhaps it could be an optional env var for such corner case applications.

I disagree on these conclusions.

c) is certainly better than a), but it's not "clearly the best" in all cases. 
The OpenGL UMD is not a privileged/special component and is in no position to 
decide whether or not the process as a whole (only some thread(s) of which may 
use OpenGL at all) gets to continue running or not.


> [0] https://registry.khronos.org/OpenGL/extensions/ARB/ARB_robustness.txt
> [1] https://registry.khronos.org/OpenGL/extensions/KHR/KHR_robustness.txt
> [2] https://registry.khronos.org/OpenGL/extensions/EXT/EXT_robustness.txt
> [3] https://registry.khronos.org/OpenGL/specs/gl/glspec46.core.pdf
> [4] 
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.1/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c#L1657
> [5] 
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.1/src/gallium/drivers/iris/iris_batch.c#L842

-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer
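[As a reference for the recovery flow described in the quoted summary (poll GetGraphicsResetStatusARB(), and on anything other than NO_ERROR treat the context as lost and recreate it), here is a minimal sketch. The enum values and the callbacks are hypothetical stand-ins for the real GL entry points and the application's own context-management code, so the control flow is runnable without a GPU:]

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL robustness enums (GL_NO_ERROR,
 * GL_GUILTY_CONTEXT_RESET, GL_INNOCENT_CONTEXT_RESET,
 * GL_UNKNOWN_CONTEXT_RESET). */
enum reset_status {
	RESET_NO_ERROR = 0,
	RESET_GUILTY,
	RESET_INNOCENT,
	RESET_UNKNOWN,
};

/*
 * Once per frame, ask the driver whether a reset happened.  If it did,
 * the context and every GL object in it are lost: tear the context down,
 * create a fresh one and re-upload buffers/textures/shaders.
 *
 * get_status stands in for glGetGraphicsResetStatusARB();
 * recreate_context stands in for the application's context-rebuild path.
 * Returns true if recovery ran.
 */
static bool check_and_recover(enum reset_status (*get_status)(void),
			      void (*recreate_context)(void))
{
	if (get_status() == RESET_NO_ERROR)
		return false;	/* context still valid, keep rendering */

	recreate_context();	/* context lost, rebuild it */
	return true;
}

/* Stubs simulating a driver, so the sketch runs without a GPU. */
static int recreations;
static enum reset_status simulated_status = RESET_NO_ERROR;

static enum reset_status get_status_stub(void)
{
	return simulated_status;
}

static void recreate_stub(void)
{
	recreations++;
	simulated_status = RESET_NO_ERROR;	/* fresh context, no reset pending */
}
```

[A real application would additionally have to create its context with the ROBUST_ACCESS flag and the LOSE_CONTEXT_ON_RESET reset notification strategy, otherwise glGetGraphicsResetStatusARB() never reports a reset.]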



Re: [PATCH AUTOSEL 6.1 28/41] Revert "drm/amd/display: Do not set drr on pipe commit"

2023-07-24 Thread Michel Dänzer
On 7/24/23 03:21, Sasha Levin wrote:
> From: Michel Dänzer 
> 
> [ Upstream commit 8e1b45c578b799510f9a01a9745a737e74f43cb1 ]
> 
> This reverts commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be.

The reverted commit is the same as patch 1 in this series...

Same issue with the autosel patches for 6.4.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/ast: Fix default resolution when no monitor is connected on DP

2023-07-13 Thread Michel Dänzer
On 7/13/23 11:09, Thomas Zimmermann wrote:
> Am 13.07.23 um 10:53 schrieb Michel Dänzer:
>> On 7/13/23 10:49, Thomas Zimmermann wrote:
>>> Am 13.07.23 um 10:32 schrieb Michel Dänzer:
>>>> On 7/12/23 17:25, Jocelyn Falempe wrote:
>>>>> On 12/07/2023 17:05, Sui Jingfeng wrote:
>>>>>> On 2023/7/10 16:07, Jocelyn Falempe wrote:
>>>>>>
>>>>>>> On the other hand, are there any drawback to present a BMC connector 
>>>>>>> even when the hardware doesn't have it ?
>>>>>>
>>>>>> If not properly setting up, I think you will create two encoder and two 
>>>>>> connector in the system.
>>>>>
>>>>> Yes, but I think it won't have any visible effect for the end-user.
>>>>
>>>> I'm afraid user-space display servers would waste effort producing content 
>>>> for a non-existing BMC (assuming its connector status is connected or 
>>>> unknown).
>>>
>>> Right now, the BMC output works because the VGA status is always connected. 
>>> So nothing really changes.
>>
>> User-space display servers would generally produce different contents by 
>> default for the VGA & BMC connectors.
> 
> Can you elaborate? How would the output differ?

Per the other sub-thread, I didn't realize there's only a single CRTC.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/ast: Fix default resolution when no monitor is connected on DP

2023-07-13 Thread Michel Dänzer
On 7/13/23 10:53, Jocelyn Falempe wrote:
> On 13/07/2023 10:32, Michel Dänzer wrote:
>> On 7/12/23 17:25, Jocelyn Falempe wrote:
>>> On 12/07/2023 17:05, Sui Jingfeng wrote:
>>>> On 2023/7/10 16:07, Jocelyn Falempe wrote:
>>>>
>>>>> On the other hand, are there any drawback to present a BMC connector even 
>>>>> when the hardware doesn't have it ?
>>>>
>>>> If not properly setting up, I think you will create two encoder and two 
>>>> connector in the system.
>>>
>>> Yes, but I think it won't have any visible effect for the end-user.
>>
>> I'm afraid user-space display servers would waste effort producing content 
>> for a non-existing BMC (assuming its connector status is connected or 
>> unknown).
>>
>>
> I think it's already the case, as AST's DP and VGA connectors currently 
> always report "connected" status. And they all share the same crtc, so there 
> is only one framebuffer, that is always active.

"Single CRTC" is the piece of information I was missing. Never mind then.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/ast: Fix default resolution when no monitor is connected on DP

2023-07-13 Thread Michel Dänzer
On 7/13/23 10:49, Thomas Zimmermann wrote:
> Am 13.07.23 um 10:32 schrieb Michel Dänzer:
>> On 7/12/23 17:25, Jocelyn Falempe wrote:
>>> On 12/07/2023 17:05, Sui Jingfeng wrote:
>>>> On 2023/7/10 16:07, Jocelyn Falempe wrote:
>>>>
>>>>> On the other hand, are there any drawback to present a BMC connector even 
>>>>> when the hardware doesn't have it ?
>>>>
>>>> If not properly setting up, I think you will create two encoder and two 
>>>> connector in the system.
>>>
>>> Yes, but I think it won't have any visible effect for the end-user.
>>
>> I'm afraid user-space display servers would waste effort producing content 
>> for a non-existing BMC (assuming its connector status is connected or 
>> unknown).
> 
> Right now, the BMC output works because the VGA status is always connected. 
> So nothing really changes.

User-space display servers would generally produce different contents by 
default for the VGA & BMC connectors.


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/ast: Fix default resolution when no monitor is connected on DP

2023-07-13 Thread Michel Dänzer
On 7/12/23 17:25, Jocelyn Falempe wrote:
> On 12/07/2023 17:05, Sui Jingfeng wrote:
>> On 2023/7/10 16:07, Jocelyn Falempe wrote:
>>
>>> On the other hand, are there any drawback to present a BMC connector even 
>>> when the hardware doesn't have it ?
>>
>> If not properly setting up, I think you will create two encoder and two 
>> connector in the system.
> 
> Yes, but I think it won't have any visible effect for the end-user.

I'm afraid user-space display servers would waste effort producing content for 
a non-existing BMC (assuming its connector status is connected or unknown).


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v4 1/6] drm: allow DRM_MODE_PAGE_FLIP_ASYNC for atomic commits

2023-07-05 Thread Michel Dänzer
On 7/4/23 19:06, Sebastian Wick wrote:
> On Sat, Jul 1, 2023 at 4:09 AM André Almeida  wrote:
>>
>> @@ -949,6 +949,15 @@ struct hdr_output_metadata {
>>   * Request that the page-flip is performed as soon as possible, ie. with no
>>   * delay due to waiting for vblank. This may cause tearing to be visible on
>>   * the screen.
>> + *
>> + * When used with atomic uAPI, the driver will return an error if the 
>> hardware
>> + * doesn't support performing an asynchronous page-flip for this update.
>> + * User-space should handle this, e.g. by falling back to a regular 
>> page-flip.
>> + *
>> + * Note, some hardware might need to perform one last synchronous page-flip
>> + * before being able to switch to asynchronous page-flips. As an exception,
>> + * the driver will return success even though that first page-flip is not
>> + * asynchronous.
> 
> What would happen if one commits another async KMS update before the
> first page flip? Does one receive EAGAIN, does it amend the previous
> commit?

Should be the former. DRM_MODE_PAGE_FLIP_ASYNC just means the flip may complete 
outside of vertical blank, it doesn't change anything else.
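[For illustration, the fallback the quoted uAPI documentation expects from user space could look like the sketch below. The commit callback stands in for drmModeAtomicCommit() so the logic runs without a DRM device, and treating -EINVAL as the rejection code is an assumption here; real code should handle whatever error the driver actually returns:]

```c
#include <errno.h>

#define FLIP_ASYNC 0x1	/* stand-in for DRM_MODE_PAGE_FLIP_ASYNC */

/* commit stands in for drmModeAtomicCommit() with a given set of flags. */
typedef int (*commit_fn)(unsigned int flags);

/*
 * Try a tearing (async) page-flip first; if the driver rejects the flag
 * for this particular update, retry as a regular flip synchronized to
 * vertical blank.
 */
static int flip_with_async_fallback(commit_fn commit, unsigned int flags)
{
	int ret = commit(flags | FLIP_ASYNC);

	if (ret == -EINVAL)	/* assumed rejection code; check your driver */
		ret = commit(flags & ~FLIP_ASYNC);

	return ret;
}

/* Example stub: a device/update combination that only allows sync flips. */
static int sync_only_commit(unsigned int flags)
{
	return (flags & FLIP_ASYNC) ? -EINVAL : 0;
}
```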


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-05 Thread Michel Dänzer
On 7/5/23 08:30, Marek Olšák wrote:
> On Tue, Jul 4, 2023, 03:55 Michel Dänzer  wrote:
> On 7/4/23 04:34, Marek Olšák wrote:
> > On Mon, Jul 3, 2023, 03:12 Michel Dänzer  
> wrote:
> >     On 6/30/23 22:32, Marek Olšák wrote:
> >     > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer 
>  wrote:
> >     >> On 6/30/23 16:59, Alex Deucher wrote:
> >     >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
> wrote:
>  wrote:
> >     >>>>>
> >     >>>>> +Robustness
> >     >>>>> +--
> >     >>>>> +
> >     >>>>> +The only way to try to keep an application working after a 
> reset is if it
> >     >>>>> +complies with the robustness aspects of the graphical API 
> that it is using.
> >     >>>>> +
> >     >>>>> +Graphical APIs provide ways to applications to deal with 
> device resets. However,
> >     >>>>> +there is no guarantee that the app will use such features 
> correctly, and the
> >     >>>>> +UMD can implement policies to close the app if it is a 
> repeating offender,
> >     >>>>> +likely in a broken loop. This is done to ensure that it does 
> not keep blocking
> >     >>>>> +the user interface from being correctly displayed. This 
> should be done even if
> >     >>>>> +the app is correct but happens to trigger some bug in the 
> hardware/driver.
> >     >>>>
> >     >>>> I still don't think it's good to let the kernel arbitrarily 
> kill
> >     >>>> processes that it thinks are not well-behaved based on some 
> heuristics
> >     >>>> and policy.
> >     >>>>
> >     >>>> Can't this be outsourced to user space? Expose the information 
> about
> >     >>>> processes causing a device and let e.g. systemd deal with 
> coming up
> >     >>>> with a policy and with killing stuff.
> >     >>>
> >     >>> I don't think it's the kernel doing the killing, it would be 
> the UMD.
> >     >>> E.g., if the app is guilty and doesn't support robustness the 
> UMD can
> >     >>> just call exit().
> >     >>
> >     >> It would be safer to just ignore API calls[0], similarly to what 
> is done until the application destroys the context with robustness. Calling 
> exit() likely results in losing any unsaved work, whereas at least some 
> applications might otherwise allow saving the work by other means.
> >     >
> >     > That's a terrible idea. Ignoring API calls would be identical to 
> a freeze. You might as well disable GPU recovery because the result would be 
> the same.
> >
> >     No GPU recovery would affect everything using the GPU, whereas this 
> affects only non-robust applications.
> >
> > which is currently the majority.
> 
> Not sure where you're going with this. Applications need to use 
> robustness to be able to recover from a GPU hang, and the GPU needs to be 
> reset for that. So disabling GPU reset is not the same as what we're 
> discussing here.
> 
> 
> >     > - non-robust contexts: call exit(1) immediately, which is the 
> best way to recover
> >
> >     That's not the UMD's call to make.
> >
> > That's absolutely the UMD's call to make because that's mandated by the 
> hw and API design
> 
> Can you point us to a spec which mandates that the process must be killed 
> in this case?
> 
> 
> > and only driver devs know this, which this thread is a proof of. The 
> default behavior is to skip all command submission if a non-robust context is 
> lost, which looks like a freeze. That's required to prevent infinite hangs 
> from the same context and can be caused by the side effects of the GPU reset 
> itself, not by the cause of the previous hang. The only way out of that is 
> killing the process.
> 
> The UMD killing the process is not the only way out of that, and doing so 
> is overreach on its part. The UMD is but one out of many components in a 
> process, not the main one or a special one. It doesn't get to decide when the 
> process must die, certainly not under cir

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-04 Thread Michel Dänzer
On 7/4/23 04:34, Marek Olšák wrote:
> On Mon, Jul 3, 2023, 03:12 Michel Dänzer wrote:
> On 6/30/23 22:32, Marek Olšák wrote:
> > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer wrote:
> >> On 6/30/23 16:59, Alex Deucher wrote:
> >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick wrote:
> >>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida wrote:
> >>>>>
> >>>>> +Robustness
> >>>>> +--
> >>>>> +
> >>>>> +The only way to try to keep an application working after a reset 
> is if it
> >>>>> +complies with the robustness aspects of the graphical API that it 
> is using.
> >>>>> +
> >>>>> +Graphical APIs provide ways to applications to deal with device 
> resets. However,
> >>>>> +there is no guarantee that the app will use such features 
> correctly, and the
> >>>>> +UMD can implement policies to close the app if it is a repeating 
> offender,
> >>>>> +likely in a broken loop. This is done to ensure that it does not 
> keep blocking
> >>>>> +the user interface from being correctly displayed. This should be 
> done even if
> >>>>> +the app is correct but happens to trigger some bug in the 
> hardware/driver.
> >>>>
> >>>> I still don't think it's good to let the kernel arbitrarily kill
> >>>> processes that it thinks are not well-behaved based on some 
> heuristics
> >>>> and policy.
> >>>>
> >>>> Can't this be outsourced to user space? Expose the information about
> >>>> processes causing a device and let e.g. systemd deal with coming up
> >>>> with a policy and with killing stuff.
> >>>
> >>> I don't think it's the kernel doing the killing, it would be the UMD.
> >>> E.g., if the app is guilty and doesn't support robustness the UMD can
> >>> just call exit().
> >>
> >> It would be safer to just ignore API calls[0], similarly to what is 
> done until the application destroys the context with robustness. Calling 
> exit() likely results in losing any unsaved work, whereas at least some 
> applications might otherwise allow saving the work by other means.
> >
> > That's a terrible idea. Ignoring API calls would be identical to a 
> freeze. You might as well disable GPU recovery because the result would be 
> the same.
> 
> No GPU recovery would affect everything using the GPU, whereas this 
> affects only non-robust applications.
> 
> which is currently the majority.

Not sure where you're going with this. Applications need to use robustness to 
be able to recover from a GPU hang, and the GPU needs to be reset for that. So 
disabling GPU reset is not the same as what we're discussing here.


> > - non-robust contexts: call exit(1) immediately, which is the best way 
> to recover
> 
> That's not the UMD's call to make.
> 
> That's absolutely the UMD's call to make because that's mandated by the hw 
> and API design

Can you point us to a spec which mandates that the process must be killed in 
this case?


> and only driver devs know this, which this thread is a proof of. The default 
> behavior is to skip all command submission if a non-robust context is lost, 
> which looks like a freeze. That's required to prevent infinite hangs from the 
> same context and can be caused by the side effects of the GPU reset itself, 
> not by the cause of the previous hang. The only way out of that is killing 
> the process.

The UMD killing the process is not the only way out of that, and doing so is 
overreach on its part. The UMD is but one out of many components in a process, 
not the main one or a special one. It doesn't get to decide when the process 
must die, certainly not under circumstances where it must be able to continue 
while ignoring API calls (that's required for robustness).


> >>     [0] Possibly accompanied by a one-time message to stderr along the 
> lines of "GPU reset detected but robustness not enabled in context, ignoring 
> OpenGL API calls".


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-03 Thread Michel Dänzer
On 6/30/23 22:32, Marek Olšák wrote:
> On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer wrote:
>> On 6/30/23 16:59, Alex Deucher wrote:
>>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick wrote:
>>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida wrote:
>>>>>
>>>>> +Robustness
>>>>> +--
>>>>> +
>>>>> +The only way to try to keep an application working after a reset is if it
>>>>> +complies with the robustness aspects of the graphical API that it is 
>>>>> using.
>>>>> +
>>>>> +Graphical APIs provide ways to applications to deal with device resets. 
>>>>> However,
>>>>> +there is no guarantee that the app will use such features correctly, and 
>>>>> the
>>>>> +UMD can implement policies to close the app if it is a repeating 
>>>>> offender,
>>>>> +likely in a broken loop. This is done to ensure that it does not keep 
>>>>> blocking
>>>>> +the user interface from being correctly displayed. This should be done 
>>>>> even if
>>>>> +the app is correct but happens to trigger some bug in the 
>>>>> hardware/driver.
>>>>
>>>> I still don't think it's good to let the kernel arbitrarily kill
>>>> processes that it thinks are not well-behaved based on some heuristics
>>>> and policy.
>>>>
>>>> Can't this be outsourced to user space? Expose the information about
>>>> processes causing a device and let e.g. systemd deal with coming up
>>>> with a policy and with killing stuff.
>>>
>>> I don't think it's the kernel doing the killing, it would be the UMD.
>>> E.g., if the app is guilty and doesn't support robustness the UMD can
>>> just call exit().
>>
>> It would be safer to just ignore API calls[0], similarly to what is done 
>> until the application destroys the context with robustness. Calling exit() 
>> likely results in losing any unsaved work, whereas at least some 
>> applications might otherwise allow saving the work by other means.
> 
> That's a terrible idea. Ignoring API calls would be identical to a freeze. 
> You might as well disable GPU recovery because the result would be the same.

No GPU recovery would affect everything using the GPU, whereas this affects 
only non-robust applications.


> - non-robust contexts: call exit(1) immediately, which is the best way to 
> recover

That's not the UMD's call to make.


>> [0] Possibly accompanied by a one-time message to stderr along the lines 
>> of "GPU reset detected but robustness not enabled in context, ignoring 
>> OpenGL API calls".


-- 
Earthling Michel Dänzer  |  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-30 Thread Michel Dänzer
On 6/30/23 16:59, Alex Deucher wrote:
> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
>  wrote:
>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida  wrote:
>>>
>>> +Robustness
>>> +--
>>> +
>>> +The only way to try to keep an application working after a reset is if it
>>> +complies with the robustness aspects of the graphical API that it is using.
>>> +
>>> +Graphical APIs provide ways to applications to deal with device resets. 
>>> However,
>>> +there is no guarantee that the app will use such features correctly, and 
>>> the
>>> +UMD can implement policies to close the app if it is a repeating offender,
>>> +likely in a broken loop. This is done to ensure that it does not keep 
>>> blocking
>>> +the user interface from being correctly displayed. This should be done 
>>> even if
>>> +the app is correct but happens to trigger some bug in the hardware/driver.
>>
>> I still don't think it's good to let the kernel arbitrarily kill
>> processes that it thinks are not well-behaved based on some heuristics
>> and policy.
>>
>> Can't this be outsourced to user space? Expose the information about
>> processes causing a device and let e.g. systemd deal with coming up
>> with a policy and with killing stuff.
> 
> I don't think it's the kernel doing the killing, it would be the UMD.
> E.g., if the app is guilty and doesn't support robustness the UMD can
> just call exit().

It would be safer to just ignore API calls[0], similarly to what is done until 
the application destroys the context with robustness. Calling exit() likely 
results in losing any unsaved work, whereas at least some applications might 
otherwise allow saving the work by other means.


[0] Possibly accompanied by a one-time message to stderr along the lines of 
"GPU reset detected but robustness not enabled in context, ignoring OpenGL API 
calls".

-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: How to fetch the implicit sync fence for a GPU buffer?

2023-06-02 Thread Michel Dänzer
On 6/1/23 23:15, Hoosier, Matt wrote:
> Hi,
> 
>  
> 
> In the past there has been writing about Wayland’s model using implicit 
> synchronization of GPU renderbuffers and KMS commits [1] [2].
> 
>  
> 
> It would sometimes be nice to do that waiting explicitly in a compositor 
> before enqueueing a KMS pageflip that references a buffer who may go on 
> rendering for some time still. This stalls the pipeline.
> 
>  
> 
> I’m wondering whether there’s an API -- maybe something analogous to 
> drmPrimeHandleToFD() – that allows userspace to fetch a waitable fence fd for 
> a dmabuf?

A dma-buf fd itself can serve this purpose; it polls as readable when the GPU 
has finished drawing to the buffer.

See https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 for how I used 
this to implement what you describe above in mutter. Note that this involves 
some Wayland state management challenges for correct operation in all cases 
though.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH AUTOSEL 6.1 4/9] drm/amd/display: Do not set drr on pipe commit

2023-05-15 Thread Michel Dänzer
On 5/11/23 21:39, Sasha Levin wrote:
> From: Wesley Chalmers 
> 
> [ Upstream commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be ]
> 
> [WHY]
> Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
> pipe commit can cause underflow.
> 
> [HOW]
> Move DMUB p-state delegate into optimze_bandwidth; enabling FAMS sets
> optimized_required.
> 
> This change expects that Freesync requests are blocked when
> optimized_required is true.

This change caused a regression, see 
https://patchwork.freedesktop.org/patch/532240/?series=116487=1#comment_972234
 / 9deeb132-a317-7419-e9da-cbc0a379c...@daenzer.net .


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-26 Thread Michel Dänzer
On 4/25/23 21:11, Marek Olšák wrote:
> The last 3 comments in this thread contain arguments that are false and were 
> specifically pointed out as false 6 comments ago: Soft resets are just as 
> fatal as hard resets. There is nothing better about soft resets. If the VRAM 
> is lost completely, that's a different story, and if the hard reset is 100% 
> unreliable, that's also a different story, but other than those two outliers, 
> there is no difference between the two from the user point view. Both can 
> repeatedly hang if you don't prevent the app that caused the hang from using 
> the GPU even if the app is not robust. The robustness context type doesn't 
> matter here. By definition, no guilty app can continue after a reset, and no 
> innocent apps affected by a reset can continue either because those can now 
> hang too. That's how destructive all resets are. Personal anecdotes that the 
> soft reset is better are just that, anecdotes.

You're trying to frame the situation as black or white, but reality is shades 
of grey.


There's a similar situation with kernel Oopsen: In principle it's not safe to 
continue executing the kernel after it hits an Oops, since it might be in an 
inconsistent state, which could result in any kind of misbehaviour. Still, the 
default behaviour is to continue executing, and in most cases it turns out 
fine. Users who cannot accept the residual risk can choose to make the kernel 
panic when it hits an Oops (either via CONFIG_PANIC_ON_OOPS at build time, or 
via oops=panic on the kernel command line). A kernel panic means that the 
machine basically freezes from a user PoV, which would be worse as the default 
behaviour for most users (because it would e.g. incur a higher risk of losing 
filesystem data).


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Michel Dänzer
On 4/25/23 14:08, Christian König wrote:
> Well signaling that something happened is not the question. We do this for 
> both soft as well as hard resets.
> 
> The question is if errors result in blocking further submissions with the 
> same context or not.
> 
> In case of a hard reset and potential loss of state we have to kill the 
> context, otherwise a follow up submission would just lockup the hardware once 
> more.
> 
> In case of a soft reset I think we can keep the context alive, this way even 
> applications without robustness handling can keep work.
> 
> You potentially still get some corruption, but at least not your compositor 
> killed.

Right, and if there is corruption, the user can restart the session.


Maybe a possible compromise could be making soft resets fatal if user space 
enabled robustness for the context, and non-fatal if not.


> Am 25.04.23 um 13:07 schrieb Marek Olšák:
>> That supposedly depends on the compositor. There may be compositors for very 
>> specific cases (e.g. Steam Deck) that handle resets very well, and those 
>> would like to be properly notified of all resets because that's how they get 
>> the best outcome, e.g. no corruption. A soft reset that is unhandled by 
>> userspace may result in persistent corruption.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Michel Dänzer
On 4/24/23 18:45, Marek Olšák wrote:
> Soft resets are fatal just as hard resets, but no reset is "always fatal". 
> There are cases when apps keep working depending on which features are being 
> used. It's still unsafe.

Agreed, in theory.

In practice, from a user PoV, right now there's pretty much 0 chance of the 
user session surviving if the GPU context in certain critical processes (e.g. 
the Wayland compositor or Xwayland) hits a fatal reset. There's a > 0 chance of 
it surviving after a soft reset. There's ongoing work towards making user-space 
components more robust against fatal resets, but it's taking time. Meanwhile, I 
suspect most users would take the > 0 chance.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-24 Thread Michel Dänzer
On 4/24/23 15:26, André Almeida wrote:
>>
>> Additional to that I currently didn't considered soft-recovered submissions 
>> as fatal and continue accepting submissions from that context, but already 
>> wanted to talk with Marek about that behavior.
>>
> 
> Interesting. I will try to test and validate this approach to see if the 
> contexts keep working as expected on soft-resets.

FWIW, on this Thinkpad E595 with a Picasso APU, I've hit soft-resets (usually 
either in Firefox or gnome-shell) a number of times, and usually continued 
using the GNOME session for a few days without any issues.

(Interestingly, Firefox reacts to the soft-resets by falling back to software 
rendering, even when it's not guilty itself)


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] fbdev: Don't spam dmesg on bad userspace ioctl input

2023-04-11 Thread Michel Dänzer
On 4/11/23 11:10, Geert Uytterhoeven wrote:
> Hi Michel,
> 
> On Wed, Apr 5, 2023 at 10:50 AM Michel Dänzer
>  wrote:
>> On 4/4/23 14:36, Daniel Vetter wrote:
>>> There's a few reasons the kernel should not spam dmesg on bad
>>> userspace ioctl input:
>>> - at warning level it results in CI false positives
>>> - it allows userspace to drown dmesg output, potentially hiding real
>>>   issues.
>>>
>>> None of the other generic EINVAL checks report in the
>>> FBIOPUT_VSCREENINFO ioctl do this, so it's also inconsistent.
>>>
>>> I guess the intent of the patch which introduced this warning was that
>>> the drivers ->fb_check_var routine should fail in that case. Reality
>>> is that there's too many fbdev drivers and not enough people
>>> maintaining them by far, and so over the past few years we've simply
>>> handled all these validation gaps by tightening the checks in the core,
>>> because that's realistically really all that will ever happen.
>>>
>>> Reported-by: syzbot+20dcf81733d43ddff...@syzkaller.appspotmail.com
>>> Link: 
>>> https://syzkaller.appspot.com/bug?id=c5faf983bfa4a607de530cd3bb00bf06cefc
>>> Fixes: 6c11df58fd1a ("fbmem: Check virtual screen sizes in fb_set_var()")
>>> Cc: Helge Deller 
>>> Cc: Geert Uytterhoeven 
>>> Cc: sta...@vger.kernel.org # v5.4+
>>> Cc: Daniel Vetter 
>>> Cc: Javier Martinez Canillas 
>>> Cc: Thomas Zimmermann 
>>> Signed-off-by: Daniel Vetter 
>>> ---
>>>  drivers/video/fbdev/core/fbmem.c | 4 
>>>  1 file changed, 4 deletions(-)
>>>
>>> diff --git a/drivers/video/fbdev/core/fbmem.c 
>>> b/drivers/video/fbdev/core/fbmem.c
>>> index 875541ff185b..9757f4bcdf57 100644
>>> --- a/drivers/video/fbdev/core/fbmem.c
>>> +++ b/drivers/video/fbdev/core/fbmem.c
>>> @@ -1021,10 +1021,6 @@ fb_set_var(struct fb_info *info, struct 
>>> fb_var_screeninfo *var)
>>>   /* verify that virtual resolution >= physical resolution */
>>>   if (var->xres_virtual < var->xres ||
>>>   var->yres_virtual < var->yres) {
>>> - pr_warn("WARNING: fbcon: Driver '%s' missed to adjust virtual 
>>> screen size (%ux%u vs. %ux%u)\n",
>>> - info->fix.id,
>>> - var->xres_virtual, var->yres_virtual,
>>> - var->xres, var->yres);
>>>   return -EINVAL;
>>>   }
>>>
>>
>> Make it pr_warn_once? 99.9...% of the benefit, without spam.
> 
> Except that it should be pr_warn_once_per_fb_info, [...]

Not really, that's what I mean by 99.9...% of the benefit. Even if a broken 
driver is masked on some systems, eventually the other driver masking it should 
get fixed, at which point the previously masked driver will be reported again.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] fbdev: Don't spam dmesg on bad userspace ioctl input

2023-04-05 Thread Michel Dänzer
On 4/4/23 14:36, Daniel Vetter wrote:
> There's a few reasons the kernel should not spam dmesg on bad
> userspace ioctl input:
> - at warning level it results in CI false positives
> - it allows userspace to drown dmesg output, potentially hiding real
>   issues.
> 
> None of the other generic EINVAL checks report in the
> FBIOPUT_VSCREENINFO ioctl do this, so it's also inconsistent.
> 
> I guess the intent of the patch which introduced this warning was that
> the drivers ->fb_check_var routine should fail in that case. Reality
> is that there's too many fbdev drivers and not enough people
> maintaining them by far, and so over the past few years we've simply
> handled all these validation gaps by tightening the checks in the core,
> because that's realistically really all that will ever happen.
> 
> Reported-by: syzbot+20dcf81733d43ddff...@syzkaller.appspotmail.com
> Link: 
> https://syzkaller.appspot.com/bug?id=c5faf983bfa4a607de530cd3bb00bf06cefc
> Fixes: 6c11df58fd1a ("fbmem: Check virtual screen sizes in fb_set_var()")
> Cc: Helge Deller 
> Cc: Geert Uytterhoeven 
> Cc: sta...@vger.kernel.org # v5.4+
> Cc: Daniel Vetter 
> Cc: Javier Martinez Canillas 
> Cc: Thomas Zimmermann 
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/video/fbdev/core/fbmem.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c 
> b/drivers/video/fbdev/core/fbmem.c
> index 875541ff185b..9757f4bcdf57 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1021,10 +1021,6 @@ fb_set_var(struct fb_info *info, struct 
> fb_var_screeninfo *var)
>   /* verify that virtual resolution >= physical resolution */
>   if (var->xres_virtual < var->xres ||
>   var->yres_virtual < var->yres) {
> - pr_warn("WARNING: fbcon: Driver '%s' missed to adjust virtual 
> screen size (%ux%u vs. %ux%u)\n",
> - info->fix.id,
> - var->xres_virtual, var->yres_virtual,
> - var->xres, var->yres);
>   return -EINVAL;
>   }
>  

Make it pr_warn_once? 99.9...% of the benefit, without spam.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v10 01/15] dma-buf/dma-fence: Add deadline awareness

2023-03-17 Thread Michel Dänzer
On 3/16/23 23:22, Sebastian Wick wrote:
> On Thu, Mar 16, 2023 at 5:29 PM Rob Clark  wrote:
>> On Thu, Mar 16, 2023 at 2:26 AM Jonas Ådahl  wrote:
>>> On Wed, Mar 15, 2023 at 09:19:49AM -0700, Rob Clark wrote:
>>>> On Wed, Mar 15, 2023 at 6:53 AM Jonas Ådahl  wrote:
>>>>> On Fri, Mar 10, 2023 at 09:38:18AM -0800, Rob Clark wrote:
>>>>>> On Fri, Mar 10, 2023 at 7:45 AM Jonas Ådahl  wrote:
>>>>>>>
>>>>>>>> + *
>>>>>>>> + * To this end, deadline hint(s) can be set on a &dma_fence via 
>>>>>>>> &dma_fence_set_deadline.
>>>>>>>> + * The deadline hint provides a way for the waiting driver, or 
>>>>>>>> userspace, to
>>>>>>>> + * convey an appropriate sense of urgency to the signaling driver.
>>>>>>>> + *
>>>>>>>> + * A deadline hint is given in absolute ktime (CLOCK_MONOTONIC for 
>>>>>>>> userspace
>>>>>>>> + * facing APIs).  The time could either be some point in the future 
>>>>>>>> (such as
>>>>>>>> + * the vblank based deadline for page-flipping, or the start of a 
>>>>>>>> compositor's
>>>>>>>> + * composition cycle), or the current time to indicate an immediate 
>>>>>>>> deadline
>>>>>>>> + * hint (Ie. forward progress cannot be made until this fence is 
>>>>>>>> signaled).
>>>>>>>
>>>>>>> Is it guaranteed that a GPU driver will use the actual start of the
>>>>>>> vblank as the effective deadline? I have some memories of seing
>>>>>>> something about vblank evasion browsing driver code, which I might have
>>>>>>> misunderstood, but I have yet to find whether this is something
>>>>>>> userspace can actually expect to be something it can rely on.
>>>>>>
>>>>>> I guess you mean s/GPU driver/display driver/ ?  It makes things more
>>>>>> clear if we talk about them separately even if they happen to be the
>>>>>> same device.
>>>>>
>>>>> Sure, sorry about being unclear about that.
>>>>>
>>>>>>
>>>>>> Assuming that is what you mean, nothing strongly defines what the
>>>>>> deadline is.  In practice there is probably some buffering in the
>>>>>> display controller.  For ex, block based (including bandwidth
>>>>>> compressed) formats, you need to buffer up a row of blocks to
>>>>>> efficiently linearize for scanout.  So you probably need to latch some
>>>>>> time before you start sending pixel data to the display.  But details
>>>>>> like this are heavily implementation dependent.  I think the most
>>>>>> reasonable thing to target is start of vblank.
>>>>>
>>>>> The driver exposing those details would be quite useful for userspace
>>>>> though, so that it can delay committing updates to late, but not too
>>>>> late. Setting a deadline to be the vblank seems easy enough, but it
>>>>> isn't enough for scheduling the actual commit.
>>>>
>>>> I'm not entirely sure how that would even work.. but OTOH I think you
>>>> are talking about something on the order of 100us?  But that is a bit
>>>> of another topic.
>>>
>>> Yes, something like that. But yea, it's not really related. Scheduling
>>> commits closer to the deadline has more complex behavior than that too,
>>> e.g. the need for real time scheduling, and knowing how long it usually
>>> takes to create and commit and for the kernel to process.
> 
> Vblank can be really long, especially with VRR where the additional
> time you get to finish the frame comes from making vblank longer.
> Using the start of vblank as a deadline makes VRR useless.

Not really. We normally still want to aim for start of vblank with VRR, which 
would result in the maximum refresh rate. Missing that target just incurs less 
of a penalty than with fixed refresh rate.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] Change the meaning of the fields in the ttm_place structure from pfn to bytes

2023-03-03 Thread Michel Dänzer
On 3/3/23 08:16, Somalapuram Amaranath wrote:
> Change the ttm_place structure member fpfn, lpfn, mem_type to
> res_start, res_end, res_type.
> Change the unsigned to u64.
> Fix the dependence in all the DRM drivers and
> clean up PAGE_SHIFT operation.
> 
> Signed-off-by: Somalapuram Amaranath 
> 
> [...]
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index 44367f03316f..5b5104e724e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -131,11 +131,12 @@ static int amdgpu_gtt_mgr_new(struct 
> ttm_resource_manager *man,
>   goto err_free;
>   }
>  
> - if (place->lpfn) {
> + if (place->res_end) {
>   spin_lock(&mgr->lock);
>   r = drm_mm_insert_node_in_range(&mgr->mm, &node->mm_nodes[0],
> - num_pages, tbo->page_alignment,
> - 0, place->fpfn, place->lpfn,
> + num_pages, tbo->page_alignment, 
> 0,
> + place->res_start << PAGE_SHIFT,
> + place->res_end << PAGE_SHIFT,
>   DRM_MM_INSERT_BEST);

This should be >> or no shift instead of <<, shouldn't it? Multiplying a value 
in bytes by the page size doesn't make sense.


I didn't check the rest of the patch in detail, but it's easy to introduce subtle 
regressions with this kind of change. It'll require a lot of review & testing 
scrutiny.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 06/10] drm/amd/display: Fix implicit enum conversion

2023-02-14 Thread Michel Dänzer
On 2/13/23 21:49, Arthur Grillo wrote:
> Make implicit enum conversion to avoid -Wenum-conversion warning, such
> as:
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:4109:88:
>  warning: implicit conversion from ‘enum ’ to ‘enum 
> odm_combine_mode’ [-Wenum-conversion]
>  4109 | 
> locals->ODMCombineEnablePerState[i][k] = true;
>   |   
>  ^
> 
> [...]
> 
> @@ -3897,14 +3898,14 @@ void 
> dml20_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
>   
> mode_lib->vba.PlaneRequiredDISPCLKWithODMCombine = 
> mode_lib->vba.PixelClock[k] / 2
>   * (1 + 
> mode_lib->vba.DISPCLKDPPCLKDSCCLKDownSpreading / 100.0);
>  
> - locals->ODMCombineEnablePerState[i][k] = false;
> + locals->ODMCombineEnablePerState[i][k] = (enum 
> odm_combine_mode)false;

dm_odm_combine_mode_disabled would seem clearer than (enum 
odm_combine_mode)false.


> - 
> locals->ODMCombineEnablePerState[i][k] = true;
> + 
> locals->ODMCombineEnablePerState[i][k] = (enum odm_combine_mode)true;

I'm not sure which enum value (enum odm_combine_mode)true will be converted to, 
probably dm_odm_combine_mode_2to1? Is that really appropriate everywhere true 
is used? If so, again 
dm_odm_combine_mode_2to1 would seem clearer.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [RFC PATCH] drm: Create documentation about device resets

2023-02-07 Thread Michel Dänzer
On 2/7/23 14:30, Pekka Paalanen wrote:
> On Mon, 23 Jan 2023 21:38:11 +0100
> Christian König  wrote:
>> Am 23.01.23 um 21:26 schrieb André Almeida:
>>>
>>> diff --git a/Documentation/gpu/drm-reset.rst 
>>> b/Documentation/gpu/drm-reset.rst
>>> new file mode 100644
>>> index ..0dd11a469cf9
>>> --- /dev/null
>>> +++ b/Documentation/gpu/drm-reset.rst
>>> @@ -0,0 +1,51 @@
>>> +
>>> +DRM Device Reset
>>> +
>>> +
>>> +The GPU stack is really complex and is prone to errors, from hardware bugs,
>>> +faulty applications and everything in the many layers in between. To 
>>> recover
>>> +from this kind of state, sometimes is needed to reset the GPU. Unproper 
>>> handling
>>> +of GPU resets can lead to an unstable userspace. This page describes 
>>> what's the
>>> +expected behaviour from DRM drivers to do in those situations, from 
>>> usermode
>>> +drivers and compositors as well.
>>> +
>>> +Robustness
>>> +--
>>> +
>>> +First of all, application robust APIs, when available, should be used. This
>>> +allows the application to correctly recover and continue to run after a 
>>> reset.
>>> +Apps that don't use this should be promptly killed when the kernel driver
>>> +detects that it's in broken state. Specifically guidelines for some APIs:
>>> +  
>>
>>> +- OpenGL: During a reset, KMD kill processes that haven't ARB Robustness
>>> +  enabled, assuming they can't recover.  
>>
>> This is a pretty clear NAK from my side to this approach. The KMD should 
>> never mess with an userspace process directly in such a way.
>>
>> Instead use something like this "OpenGL: KMD signals the abortion of 
>> submitted commands and the UMD should then react accordingly and abort 
>> the application.".
> 
> what Christian said, plus I would not assume that robust programs will
> always respond by creating a new context. They could also switch
> to a software renderer, [...]

That is indeed what Firefox does.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/vkms: Add a DRM render node to vkms

2023-01-09 Thread Michel Dänzer
On 1/5/23 16:16, Yi Xie wrote:
> On Thu, Jan 5, 2023 at 11:16 PM Daniel Vetter  wrote:
>>
>> Ah ok. For next time around, copy a link to the comment you want, e.g.
>>
>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5830#note_582477
>>
>> The 3 dots menu on each comments has an option to copy that link tag. That
>> also highlights the right comment.
> 
> Thanks for the tips! Actually you need to sign in to reveal that 3 dots menu.

The date after the username at the top of each GitLab comment is a clickable 
link for the comment as well.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 2/2] drm/rockchip: vop: Leave vblank enabled in self-refresh

2023-01-06 Thread Michel Dänzer
On 1/6/23 02:40, Brian Norris wrote:
> If we disable vblank when entering self-refresh, vblank APIs (like
> DRM_IOCTL_WAIT_VBLANK) no longer work. But user space is not aware when
> we enter self-refresh, so this appears to be an API violation -- that
> DRM_IOCTL_WAIT_VBLANK fails with EINVAL whenever the display is idle and
> enters self-refresh.
> 
> The downstream driver used by many of these systems never used to
> disable vblank for PSR, and in fact, even upstream, we didn't do that
> until radically redesigning the state machine in commit 6c836d965bad
> ("drm/rockchip: Use the helpers for PSR").
> 
> Thus, it seems like a reasonable API fix to simply restore that
> behavior, and leave vblank enabled.
> 
> Note that this appears to potentially unbalance the
> drm_crtc_vblank_{off,on}() calls in some cases, but:
> (a) drm_crtc_vblank_on() documents this as OK and
> (b) if I do the naive balancing, I find state machine issues such that
> we're not in sync properly; so it's easier to take advantage of (a).
> 
> Backporting notes:
> Marking as 'Fixes' commit 6c836d965bad ("drm/rockchip: Use the helpers
> for PSR"), but it probably depends on commit bed030a49f3e
> ("drm/rockchip: Don't fully disable vop on self refresh") as well.
> 
> We also need the previous patch ("drm/atomic: Allow vblank-enabled +
> self-refresh "disable""), of course.
> 
> Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR")
> Cc: 
> Link: https://lore.kernel.org/dri-devel/y5itf0+yniqa6...@sirena.org.uk/
> Reported-by: "kernelci.org bot" 
> Signed-off-by: Brian Norris 
> ---
> 
>  drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c 
> b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> index fa1f4ee6d195..c541967114b4 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> @@ -719,11 +719,11 @@ static void vop_crtc_atomic_disable(struct drm_crtc 
> *crtc,
>  
>   mutex_lock(&vop->vop_lock);
>  
> - drm_crtc_vblank_off(crtc);
> -
>   if (crtc->state->self_refresh_active)
>   goto out;
>  
> + drm_crtc_vblank_off(crtc);
> +
>   /*
>* Vop standby will take effect at end of current frame,
>* if dsp hold valid irq happen, it means standby complete.

The out label immediately unlocks vop->vop_lock again, seems a bit pointless. :)

AFAICT the self_refresh_active case should just return immediately, the out 
label would also send any pending atomic commit completion event, which seems 
wrong now that the vblank interrupt is left enabled. (I also wonder if 
drm_crtc_vblank_off doesn't take care of that already, at least amdgpu doesn't 
do this explicitly AFAICT)


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2023-01-04 Thread Michel Dänzer
On 12/23/22 20:10, Harry Wentland wrote:
> On 12/14/22 04:01, Pekka Paalanen wrote:
>> On Tue, 13 Dec 2022 18:20:59 +0100
>> Michel Dänzer  wrote:
>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>> This will let us pass kms_hdr.bpc_switch.
>>>>
>>>> I don't see any good reasons why we still need to
>>>> limit bpc to 8 bpc and doing so is problematic when
>>>> we enable HDR.
>>>>
>>>> If I remember correctly there might have been some
>>>> displays out there where the advertised link bandwidth
>>>> was not large enough to drive the default timing at
>>>> max bpc. This would leave to an atomic commit/check
>>>> failure which should really be handled in compositors
>>>> with some sort of fallback mechanism.
>>>>
>>>> If this somehow turns out to still be an issue I
>>>> suggest we add a module parameter to allow users to
>>>> limit the max_bpc to a desired value.  
>>>
>>> While leaving the fallback for user space to handle makes some sense
>>> in theory, in practice most KMS display servers likely won't handle
>>> it.
>>>
>>> Another issue is that if mode validation is based on the maximum bpc
>>> value, it may reject modes which would work with lower bpc.
>>>
>>>
>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>> doing already) is that the driver should do mode validation based on
>>> the *minimum* bpc, and automatically make the effective bpc lower
>>> than the maximum as needed to make the rest of the atomic state work.
>>
>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>> very well should do so when necessary due to *known* hardware etc.
>> limitations.
>>
> 
> I spent a bunch of time to figure out how this actually pans out in
> amdgpu and it looks like we're doing the right thing, i.e. if bandwidth
> limitations require it we'll downgrade bpc appropriately. These changes
> happened over the last couple years or so. So while raising the default
> max_bpc wasn't safe in amdgpu years ago it is completely fine now.
> 
> As for the relevant code it's mostly handled in 
> create_validate_stream_for_sink
> in amdgpu_dm.c where we iterate over a stream's mode validation with
> decreasing bpc if it fails (down to a bpc of 6).
> 
> For HDMI we also have a separate adjust_colour_depth_from_display_info
> function that downgrades bpc in order to fit within the max_tmds_clock.
> 
> So, in short, this change should not lead to displays not lighting up
> because we no longer force a given bpc.

Thanks for double-checking! This patch is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin

2023-01-04 Thread Michel Dänzer
On 1/4/23 03:11, Brian Norris wrote:
> On Tue, Jan 03, 2023 at 07:04:00PM +0100, Michel Dänzer wrote:
>> On 12/21/22 23:02, Brian Norris wrote:
> 
>>> 3. leave vblank enabled even in the presence of PSR
> 
> I'm leaning toward this.

If this means vblank interrupts will arrive as expected even while in PSR, that 
may be the ideal solution indeed.


>> 5. Go/stay out of PSR while vblank interrupts are enabled (probably want to 
>> make sure vblankoffdelay is set up such that vblank interrupts are disabled 
>> ASAP)
> 
> That seems not extremely nice, to waste power. Based on the earlier
> downstream implementation (which left vblank interrupts enabled), I'd
> wager there's a much larger power win from PSR (on the display-transport
> and memory-bandwidth costs), relative to the power cost of vblank
> interrupts.
> 
> Also, messing with vblankoffdelay sounds likely to trigger the races
> mentioned in the drm_vblank.c kerneldoc.

Not sure how likely that is, quite a few drivers are setting 
dev->vblank_disable_immediate = true.

With that, vblank interrupts should generally be enabled only while there are 
screen updates as well[0], in which case PSR shouldn't be possible anyway.

[0] There may be user space which uses the vblank ioctls even while there are 
no screen updates though, which would prevent PSR in this case.


>>> [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI
>>> contract in the presence of PSR, but I believe the upstream PSR
>>> support has always worked this way. I could imagine
>>> DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here
>>> though.
>>
>> Yeah, that's pretty likely to cause issues with existing real-world user 
>> space.
> 
> OK. Any hints as to what kind of user space uses this?

I don't have any specific example, just thinking about how user space could 
respond to the vblank ioctls returning an error, and it would seem to be 
generally either of:

* Just run non-throttled, which might negate any energy savings from PSR
* Fail to work altogether


> I'm in part simply curious, but I'm also wondering what the
> error-handling and reliability (e.g., what if vblanks don't come?)
> expectations might be in practice.

If vblank events never come, user space might freeze.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin

2023-01-03 Thread Michel Dänzer
On 12/21/22 23:02, Brian Norris wrote:
> Hi Mark, Sean, (and dri-devel)
> 
> On Wed, Dec 14, 2022 at 07:04:37PM -0800, Brian Norris wrote:
>> On Tue, Dec 13, 2022 at 04:51:11PM +, Mark Brown wrote:
>>> On Tue, Dec 13, 2022 at 05:56:30AM -0800, KernelCI bot wrote:
>>>
>>> The KernelCI bisection bot found regressions in at least two KMS tests
>>> in the Renesas tree on rk3399-gru-kevin just after the Renesas tree
>>> merged up mainline:
>>>
>>>igt-kms-rockchip.kms_vblank.pipe-A-wait-forked 
>>>igt-kms-rockchip.kms_vblank.pipe-A-query-busy
>>>
>>> which it bisected to ca871659ec16 ("drm/bridge: analogix_dp: Support
>>> PSR-exit to disable transition").  I'm not *100%* sure I trust the
>>> bisection but it sure is suspicous that two separate bisects for related
>>> issues landed on the same commit.
> 
> ...
> 
>>> | IGT-Version: 1.26-gf8a4a0b (aarch64) (Linux: 6.1.0 aarch64)
>>> | <14>[   35.48] [IGT] drm_read: starting subtest short-buffer-wakeup
>>> | Starting subtest: short-buffer-wakeup
>>> | 
>>> | (| drm_read:350) CRITICAL: Test assertion failure function 
>>> generate_event, file ../tests/drm_read.c:65:
>>> | (drm_read:350) CRITICAL: <14>[   36.155642] [IGT] drm_read: exiting, 
>>> ret=98
>>> | Failed assertion: kmstest_get_vblank(fd, pipe, DRM_VBLANK_EVENT)
>>> | 
>>> | (drm_read:350) CRITICAL: Last errno: 22, Invalid argument
>>> | Stack trace:
>>> | 
>>> |   #0 ../lib/igt_core.c:1933 __igt_fail_assert()
>>> |   #1 [+0xd5362770]
>>> |   #2 [+0xd536193c]
>>> |   #3 [__libc_start_main+0xe8]
>>> |   #4 [+0xd5361974]
>>> |   #5 [[   36.162851] Console: switching to colour frame buffer 
>>> device 300x100>+0xd5361974]
>>> | Subtest short-buffer-wakeup failed.
> 
> ...
> 
>> I'll look some more, but this might be a difference of test setup, such
>> that my setup has the issue before and after that commit, but your setup
>> sees a regression.
> 
> I believe this is the case; things are broken both before and after, but
> depending on test sequencing, etc., they might have seemed less broken
> before.
> 
> When we're failing, we're getting EINVAL from drm_vblank_get(). That
> essentially means that vblank has been disabled (drm_crtc_vblank_off()).
> As it happens, this is exactly what we do when we enter PSR (Panel Self
> Refresh).
> 
> Now, these test cases (especially 'kms_vblank.pipe-A-wait-forked') seem
> to display a static image, and then wait for a bunch of vblank events.
> This is exactly the sort of case where we should enter PSR, and so we're
> likely to disable vblank events. Thus, if everything is working right,
> we will often hit EINVAL in ioctl(DRM_IOCTL_WAIT_VBLANK) ... ->
> drm_vblank_get(), and fail the test.
> 
> As to why this appears to be a regression: like mentioned in my previous
> mail, these tests tend to hose the Analogix PSR state before my patch
> set. When PSR is hosed, we tend to stay with PSR disabled, and so
> drm_vblank_get() doesn't return EINVAL, and the test is happy. (Never
> mind that the display is likely non-functional.) [0]
> 
> So how to fix this?
> 
> 1. ignore it; it's "just" a test failure, IIUC [1]
> 2. change the test, to skip this test if the device has PSR
> 3. leave vblank enabled even in the presence of PSR
> 4. otherwise modify the vblank ioctl interface, such that one can "wait"
>for a vblank that won't come (because the display is in self-refresh
>/ there are no new frames or vblanks)

FWIW, a couple more alternatives:

5. Go/stay out of PSR while vblank interrupts are enabled (probably want to 
make sure vblankoffdelay is set up such that vblank interrupts are disabled 
ASAP)
6. Use fallback timers for vblank events while in PSR (there might be some 
infrastructure for this, since some drivers don't have real vblank interrupts)


> [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI
> contract in the presence of PSR, but I believe the upstream PSR
> support has always worked this way. I could imagine
> DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here
> though.

Yeah, that's pretty likely to cause issues with existing real-world user space.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: DRM scheduler & amdgpu splats followed by GPU hang

2022-12-18 Thread Michel Dänzer
On 12/17/22 13:12, Michel Dänzer wrote:
> 
> With the drm-next-2022-12-13 changes for 6.2 merged on top of a 6.1.0 kernel, 
> I hit a GPU (Picasso APU) hang in the menu of Trackmania (free-to-play 
> Windows game, running via Wine).

It happened again when starting Return to Monkey Island, which is Linux native 
and uses Vulkan. It seems to be quite easy to hit with games using Vulkan 
(specifically Mesa's RADV).

Curiously though, I haven't hit it with games using Vulkan on a different 
machine with Navi 21 dGPU.


> After rebooting, I noticed the attached splats in the journal. I can't be 
> sure that the GPU hang was directly related to these, but it seems plausible.

Given the similar reports by Borislav Petkov and Mikhail Gavrilov, it seems 
likely.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-16 Thread Michel Dänzer
On 12/15/22 10:07, Michel Dänzer wrote:
> On 12/14/22 16:46, Alex Deucher wrote:
>> On Wed, Dec 14, 2022 at 4:01 AM Pekka Paalanen  wrote:
>>> On Tue, 13 Dec 2022 18:20:59 +0100
>>> Michel Dänzer  wrote:
>>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>>> This will let us pass kms_hdr.bpc_switch.
>>>>>
>>>>> I don't see any good reasons why we still need to
>>>>> limit bpc to 8 bpc and doing so is problematic when
>>>>> we enable HDR.
>>>>>
>>>>> If I remember correctly there might have been some
>>>>> displays out there where the advertised link bandwidth
>>>>> was not large enough to drive the default timing at
>>>>> max bpc. This would lead to an atomic commit/check
>>>>> failure which should really be handled in compositors
>>>>> with some sort of fallback mechanism.
>>>>>
>>>>> If this somehow turns out to still be an issue I
>>>>> suggest we add a module parameter to allow users to
>>>>> limit the max_bpc to a desired value.
>>>>
>>>> While leaving the fallback for user space to handle makes some sense
>>>> in theory, in practice most KMS display servers likely won't handle
>>>> it.
>>>>
>>>> Another issue is that if mode validation is based on the maximum bpc
>>>> value, it may reject modes which would work with lower bpc.
>>>>
>>>>
>>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>>> doing already) is that the driver should do mode validation based on
>>>> the *minimum* bpc, and automatically make the effective bpc lower
>>>> than the maximum as needed to make the rest of the atomic state work.
>>>
>>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>>> very well should do so when necessary due to *known* hardware etc.
>>> limitations.
>>>
>>
>> In the amdgpu case, it's more of a preference thing.  The driver would
>> enable higher bpcs at the expense of refresh rate and it seemed most
>> users want higher refresh rates than higher bpc. 
> 
> I wrote the above because I thought that this patch might result in some 
> modes getting pruned because they can't work with the max bpc. However, I see 
> now that cbd14ae7ea93 ("drm/amd/display: Fix incorrectly pruned modes with 
> deep color") should prevent that AFAICT.
> 
> The question then is: What happens if user space tries to use a mode which 
> doesn't work with the max bpc? Does the driver automatically lower the 
> effective bpc as needed, or does the atomic commit (check) fail? The latter 
> would seem bad.

Per my previous post in the other sub-thread, cbd14ae7ea93 ("drm/amd/display: 
Fix incorrectly pruned modes with deep color") seems to do the former. The 
commit log of this patch should probably be changed to reflect that.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amd/display: fix dp_retrieve_lttpr_cap return code

2022-12-15 Thread Michel Dänzer
On 12/15/22 17:37, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The dp_retrieve_lttpr_cap() return type changed from 'bool'
> to 'enum dc_status', so now the early 'return' uses the wrong
> type:
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c: In function 
> 'dp_retrieve_lttpr_cap':
> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:5075:24: error: 
> implicit conversion from 'enum <anonymous>' to 'enum dc_status' 
> [-Werror=enum-conversion]
>  5075 | return false;
>   |^
> 
> Convert it to return 'DC_OK', which is the same value as 'false'.
> 
> Fixes: b473bd5fc333 ("drm/amd/display: refine wake up aux in retrieve link 
> caps")
> Signed-off-by: Arnd Bergmann 
> ---
> No idea if DC_OK is the intended return code, but it leaves the behavior
> unchanged and fixes the warning.
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> index af9411ee3c74..95dbfa4e996a 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> @@ -5095,7 +5095,7 @@ enum dc_status dp_retrieve_lttpr_cap(struct dc_link 
> *link)
>   bool vbios_lttpr_interop = link->dc->caps.vbios_lttpr_aware;
>  
>   if (!vbios_lttpr_interop || 
> !link->dc->caps.extended_aux_timeout_support)
> - return false;
> + return DC_OK;

    return status;

seems more appropriate. (Otherwise the status = DC_ERROR_UNEXPECTED 
initialization has no effect)


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-15 Thread Michel Dänzer
On 12/15/22 18:42, Michel Dänzer wrote:
> On 12/15/22 18:33, Alex Deucher wrote:
>> On Thu, Dec 15, 2022 at 4:17 AM Pekka Paalanen  wrote:
>>>
>>> On Wed, 14 Dec 2022 10:46:55 -0500
>>> Alex Deucher  wrote:
>>>
>>>> On Wed, Dec 14, 2022 at 4:01 AM Pekka Paalanen  wrote:
>>>>>
>>>>> On Tue, 13 Dec 2022 18:20:59 +0100
>>>>> Michel Dänzer  wrote:
>>>>>
>>>>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>>>>> This will let us pass kms_hdr.bpc_switch.
>>>>>>>
>>>>>>> I don't see any good reasons why we still need to
>>>>>>> limit bpc to 8 bpc and doing so is problematic when
>>>>>>> we enable HDR.
>>>>>>>
>>>>>>> If I remember correctly there might have been some
>>>>>>> displays out there where the advertised link bandwidth
>>>>>>> was not large enough to drive the default timing at
>>>>>>> max bpc. This would lead to an atomic commit/check
>>>>>>> failure which should really be handled in compositors
>>>>>>> with some sort of fallback mechanism.
>>>>>>>
>>>>>>> If this somehow turns out to still be an issue I
>>>>>>> suggest we add a module parameter to allow users to
>>>>>>> limit the max_bpc to a desired value.
>>>>>>
>>>>>> While leaving the fallback for user space to handle makes some sense
>>>>>> in theory, in practice most KMS display servers likely won't handle
>>>>>> it.
>>>>>>
>>>>>> Another issue is that if mode validation is based on the maximum bpc
>>>>>> value, it may reject modes which would work with lower bpc.
>>>>>>
>>>>>>
>>>>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>>>>> doing already) is that the driver should do mode validation based on
>>>>>> the *minimum* bpc, and automatically make the effective bpc lower
>>>>>> than the maximum as needed to make the rest of the atomic state work.
>>>>>
>>>>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>>>>> very well should do so when necessary due to *known* hardware etc.
>>>>> limitations.
>>>>>
>>>>
>>>> In the amdgpu case, it's more of a preference thing.  The driver would
>>>> enable higher bpcs at the expense of refresh rate and it seemed most
>>>> users want higher refresh rates than higher bpc.
>>>
>>> Hi Alex,
>>>
>>> we already have userspace in explicit control of the refresh rate.
>>> Userspace picks the refresh rate first, then the driver silently checks
>>> what bpc is possible. Userspace preference wins, so bpc is chosen to
>>> produce the desired refresh rate.
>>>
>>> In what cases does the driver really pick a refresh rate on its own?
>>
>> It didn't, but IIRC, at the time the driver filtered out modes that it
>> could not support with the max bpc.  It looks like that has been fixed
>> now, so maybe this whole discussion is moot.
> 
> That depends on the answer to the follow-up question in my previous post.

Never mind, looks like cbd14ae7ea93 ("drm/amd/display: Fix incorrectly pruned 
modes with deep color") does lower bpc as needed both for mode validation and 
atomic commit (check).


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-15 Thread Michel Dänzer
On 12/15/22 18:33, Alex Deucher wrote:
> On Thu, Dec 15, 2022 at 4:17 AM Pekka Paalanen  wrote:
>>
>> On Wed, 14 Dec 2022 10:46:55 -0500
>> Alex Deucher  wrote:
>>
>>> On Wed, Dec 14, 2022 at 4:01 AM Pekka Paalanen  wrote:
>>>>
>>>> On Tue, 13 Dec 2022 18:20:59 +0100
>>>> Michel Dänzer  wrote:
>>>>
>>>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>>>> This will let us pass kms_hdr.bpc_switch.
>>>>>>
>>>>>> I don't see any good reasons why we still need to
>>>>>> limit bpc to 8 bpc and doing so is problematic when
>>>>>> we enable HDR.
>>>>>>
>>>>>> If I remember correctly there might have been some
>>>>>> displays out there where the advertised link bandwidth
>>>>>> was not large enough to drive the default timing at
>>>>>> max bpc. This would lead to an atomic commit/check
>>>>>> failure which should really be handled in compositors
>>>>>> with some sort of fallback mechanism.
>>>>>>
>>>>>> If this somehow turns out to still be an issue I
>>>>>> suggest we add a module parameter to allow users to
>>>>>> limit the max_bpc to a desired value.
>>>>>
>>>>> While leaving the fallback for user space to handle makes some sense
>>>>> in theory, in practice most KMS display servers likely won't handle
>>>>> it.
>>>>>
>>>>> Another issue is that if mode validation is based on the maximum bpc
>>>>> value, it may reject modes which would work with lower bpc.
>>>>>
>>>>>
>>>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>>>> doing already) is that the driver should do mode validation based on
>>>>> the *minimum* bpc, and automatically make the effective bpc lower
>>>>> than the maximum as needed to make the rest of the atomic state work.
>>>>
>>>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>>>> very well should do so when necessary due to *known* hardware etc.
>>>> limitations.
>>>>
>>>
>>> In the amdgpu case, it's more of a preference thing.  The driver would
>>> enable higher bpcs at the expense of refresh rate and it seemed most
>>> users want higher refresh rates than higher bpc.
>>
>> Hi Alex,
>>
>> we already have userspace in explicit control of the refresh rate.
>> Userspace picks the refresh rate first, then the driver silently checks
>> what bpc is possible. Userspace preference wins, so bpc is chosen to
>> produce the desired refresh rate.
>>
>> In what cases does the driver really pick a refresh rate on its own?
> 
> It didn't, but IIRC, at the time the driver filtered out modes that it
> could not support with the max bpc.  It looks like that has been fixed
> now, so maybe this whole discussion is moot.

That depends on the answer to the follow-up question in my previous post.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-15 Thread Michel Dänzer
On 12/14/22 16:46, Alex Deucher wrote:
> On Wed, Dec 14, 2022 at 4:01 AM Pekka Paalanen  wrote:
>> On Tue, 13 Dec 2022 18:20:59 +0100
>> Michel Dänzer  wrote:
>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>> This will let us pass kms_hdr.bpc_switch.
>>>>
>>>> I don't see any good reasons why we still need to
>>>> limit bpc to 8 bpc and doing so is problematic when
>>>> we enable HDR.
>>>>
>>>> If I remember correctly there might have been some
>>>> displays out there where the advertised link bandwidth
>>>> was not large enough to drive the default timing at
>>>> max bpc. This would lead to an atomic commit/check
>>>> failure which should really be handled in compositors
>>>> with some sort of fallback mechanism.
>>>>
>>>> If this somehow turns out to still be an issue I
>>>> suggest we add a module parameter to allow users to
>>>> limit the max_bpc to a desired value.
>>>
>>> While leaving the fallback for user space to handle makes some sense
>>> in theory, in practice most KMS display servers likely won't handle
>>> it.
>>>
>>> Another issue is that if mode validation is based on the maximum bpc
>>> value, it may reject modes which would work with lower bpc.
>>>
>>>
>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>> doing already) is that the driver should do mode validation based on
>>> the *minimum* bpc, and automatically make the effective bpc lower
>>> than the maximum as needed to make the rest of the atomic state work.
>>
>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>> very well should do so when necessary due to *known* hardware etc.
>> limitations.
>>
> 
> In the amdgpu case, it's more of a preference thing.  The driver would
> enable higher bpcs at the expense of refresh rate and it seemed most
> users want higher refresh rates than higher bpc. 

I wrote the above because I thought that this patch might result in some modes 
getting pruned because they can't work with the max bpc. However, I see now 
that cbd14ae7ea93 ("drm/amd/display: Fix incorrectly pruned modes with deep 
color") should prevent that AFAICT.

The question then is: What happens if user space tries to use a mode which 
doesn't work with the max bpc? Does the driver automatically lower the 
effective bpc as needed, or does the atomic commit (check) fail? The latter 
would seem bad.


> I guess the driver can select a lower bpc at its discretion to produce
> what it thinks is the best default, but what about users that don't want
> the default?

They can choose the lower refresh rate?


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-13 Thread Michel Dänzer
On 12/12/22 19:21, Harry Wentland wrote:
> This will let us pass kms_hdr.bpc_switch.
> 
> I don't see any good reasons why we still need to
> limit bpc to 8 bpc and doing so is problematic when
> we enable HDR.
> 
> If I remember correctly there might have been some
> displays out there where the advertised link bandwidth
> was not large enough to drive the default timing at
> max bpc. This would lead to an atomic commit/check
> failure which should really be handled in compositors
> with some sort of fallback mechanism.
> 
> If this somehow turns out to still be an issue I
> suggest we add a module parameter to allow users to
> limit the max_bpc to a desired value.

While leaving the fallback for user space to handle makes some sense in theory, 
in practice most KMS display servers likely won't handle it.

Another issue is that if mode validation is based on the maximum bpc value, it 
may reject modes which would work with lower bpc.


What Ville (CC'd) suggested before instead (and what i915 seems to be doing 
already) is that the driver should do mode validation based on the *minimum* 
bpc, and automatically make the effective bpc lower than the maximum as needed 
to make the rest of the atomic state work.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH AUTOSEL 6.0 38/44] drm/amdgpu: Unlock bo_list_mutex after error handling

2022-11-21 Thread Michel Dänzer
On 11/19/22 03:11, Sasha Levin wrote:
> From: Philip Yang 
> 
> [ Upstream commit 64f65135c41a75f933d3bca236417ad8e9eb75de ]
> 
> Get below kernel WARNING backtrace when pressing ctrl-C to kill kfdtest
> application.
> 
> If amdgpu_cs_parser_bos returns error after taking bo_list_mutex, as
> caller amdgpu_cs_ioctl will not unlock bo_list_mutex, this generates the
> kernel WARNING.
> 
> Add unlock bo_list_mutex after amdgpu_cs_parser_bos error handling to
> cleanup bo_list userptr bo.
> 
>  WARNING: kfdtest/2930 still has locks held!
>  1 lock held by kfdtest/2930:
>   (&p->bo_list->bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 
> [amdgpu]
>   stack backtrace:
>dump_stack_lvl+0x44/0x57
>get_signal+0x79f/0xd00
>arch_do_signal_or_restart+0x36/0x7b0
>exit_to_user_mode_prepare+0xfd/0x1b0
>syscall_exit_to_user_mode+0x19/0x40
>do_syscall_64+0x40/0x80
> 
> Signed-off-by: Philip Yang 
> Reviewed-by: Christian König 
> Signed-off-by: Alex Deucher 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index b7bae833c804..9d59f83c8faa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -655,6 +655,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
>   }
>   mutex_unlock(&p->bo_list->bo_list_mutex);
>   }
> + mutex_unlock(&p->bo_list->bo_list_mutex);
>   return r;
>  }
>  

Looks doubtful that this is a correct backport — there's an identical 
mutex_unlock call just above.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/amdgpu: add mb for si

2022-11-18 Thread Michel Dänzer
On 11/18/22 09:01, Christian König wrote:
> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>> During reboot tests on the arm64 platform, it may fail on boot,
>> so add this mb in smc.
>>
>> The error message are as follows:
>> [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>     late_init of IP block  failed -22
>> [    7.006919][ 7] [  T295] amdgpu :04:00.0: amdgpu_device_ip_late_init 
>> failed
>> [    7.014224][ 7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
> 
> Memory barriers are not supposed to be sprinkled around like this; you need to 
> give a detailed explanation why this is necessary.
> 
> Regards,
> Christian.
> 
>>
>> Signed-off-by: Zhenneng Li 
>> ---
>>   drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c 
>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>> index 8f994ffa9cd1..c7656f22278d 100644
>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>   u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>   u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>   +    mb();
>> +
>>   if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>   return true;

In particular, it makes no sense in this specific place, since it cannot 
directly affect the values of rst & clk.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/simpledrm: Only advertise formats that are supported

2022-10-28 Thread Michel Dänzer
On 2022-10-27 12:53, Hector Martin wrote:
> 
> Q: Why not just add a conversion from XRGB2101010 to XRGB8888?
> A: Because that would only fix KDE, and would make it slower vs. not
> advertising XRGB2101010 at all (double conversions, plus kernel
> conversion can be slower). Plus, it doesn't make any sense as it only
> fills in one entry in the conversion matrix. If we wanted to actually
> fill out the conversion matrix, and thus support everything simpledrm
> has advertised to day correctly, we would need helpers for:
> 
> rgb565->rgb888
> rgb888->rgb565
> rgb565->xrgb2101010
> rgb888->xrgb2101010
> xrgb2101010->rgb565
> xrgb2101010->rgb888
> xrgb2101010->xrgb8888
> 
> That seems like overkill and unlikely to actually help anyone, it'd just
> give userspace more options to shoot itself in the foot with a
> sub-optimal format choice. And it's a pile of code.

In addition to everything you mentioned, converting from XRGB2101010 to 
XRGB8888 loses the additional information, defeating the only point of using 
XRGB2101010 instead of XRGB8888 in the first place.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH AUTOSEL 5.4 3/5] drm/amdgpu: use dirty framebuffer helper

2022-09-22 Thread Michel Dänzer
On 2022-09-21 17:54, Sasha Levin wrote:
> From: Hamza Mahfooz 
> 
> [ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]
> 
> Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
> drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
> struct.
> 
> Signed-off-by: Hamza Mahfooz 
> Acked-by: Alex Deucher 
> Signed-off-by: Alex Deucher 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index b588e0e409e7..d8687868407d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -495,6 +496,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector 
> *amdgpu_connector,
>  static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
>   .destroy = drm_gem_fb_destroy,
>   .create_handle = drm_gem_fb_create_handle,
> + .dirty = drm_atomic_helper_dirtyfb,
>  };
>  
>  uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,

This patch has issues, see https://patchwork.freedesktop.org/patch/503749/ .


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH V3 46/47] drm/amd/display: Fix failures of disabling primary plans

2022-09-16 Thread Michel Dänzer
On 2022-09-15 22:44, Rodrigo Siqueira Jordao wrote:
> 
> First of all, thanks a lot for your review. I want to take this opportunity 
> to discuss this topic in more depth and learn more from you and others.

That really should have happened before submitting this patch. It was luck that 
I happened to notice it fly by.


> On 2022-09-15 04:55, Michel Dänzer wrote:
>> On 2022-09-14 22:08, Alex Hung wrote:
>>> On 2022-09-14 10:55, Michel Dänzer wrote:
>>>> On 2022-09-14 18:30, Alex Hung wrote:
>>>>> On 2022-09-14 07:40, Michel Dänzer wrote:
>>>>>> On 2022-09-14 15:31, Michel Dänzer wrote:
>>>>>>> On 2022-09-14 07:10, Wayne Lin wrote:
>>>>>>>> From: Alex Hung 
>>>>>>>>
>>>>>>>> [Why & How]
>>>>>>>> This fixes kernel errors when IGT disables primary planes during the
>>>>>>>> tests kms_universal_plane::functional_test_pipe/pageflip_test_pipe.
>>>>>>>
>>>>>>> NAK.
>>>>>>>
>>>>>>> This essentially reverts commit b836a274b797 ("drm/amdgpu/dc: Require 
>>>>>>> primary plane to be enabled whenever the CRTC is") (except that it goes 
>>>>>>> even further and completely removes the requirement for any HW plane to 
>>>>>>> be enabled when the HW cursor is), so it would reintroduce the issues 
>>>>>>> described in that commit log.
>>>>>>
>>>>>> Actually not exactly the same issues, due to going even further than 
>>>>>> reverting my fix.
>>>>>>
>>>>>> Instead, the driver will claim that an atomic commit which enables the 
>>>>>> CRTC and the cursor plane, while leaving all other KMS planes disabled, 
>>>>>> succeeds. But the HW cursor will not actually be visible.
>>>>>
>>>>> I did not observe problems with cursors (GNOME 42.4 - desktop and 
>>>>> youtube/mpv video playback: windowed/fullscreen). Are there steps to 
>>>>> reproduce cursor problems?
>>>>
>>>> As described in my last follow-up e-mail: An atomic KMS commit which 
>>>> enables the CRTC and the cursor plane, but disables all other KMS planes 
>>>> for the CRTC. The commit will succeed, but will not result in the HW 
>>>> cursor being actually visible. (I don't know of any specific application 
>>>> or test which exercises this)
>>>
>>> Did you mean cursor plane depends on primary plane (i.e. no primary plane = 
>>> no visible HW cursor)?
>>
>> Sort of. I understand the HW cursor isn't an actual separate plane in AMD 
>> HW. Instead, the HW cursor can be displayed as part of any other HW plane. 
>> This means that the HW cursor can only be visible if any other plane is 
>> enabled.
> 
> The commit that you mentioned (b836a274b797) was created to address some 
> issues reported by the user. Please, correct me if I'm wrong, but the 
> original issue could be reproduced by following these steps on Gnome with 
> Ellesmere:
> 
> 1. Lock the screen and wait for suspending;
> 2. Wake up the system a few minutes later;
> 3. See two cursors, one that can be used and another one that is not working.

The primary symptom (which affected me as a user as well; that was the 
motivation for addressing it) was that mutter fell back to SW cursor, because 
the legacy KMS cursor ioctl returned an error. There were corresponding 
messages from mutter, visible e.g. in the journal.

That the HW cursor remained visible for some users was a secondary symptom, 
probably due to a separate bug. I don't remember ever hitting that myself.

Note that current versions of mutter use atomic KMS and probably don't hit this 
issue anymore by default. You'd need to set the environment variable 
MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 or use an older version of mutter.

> It is not evident to me why we cannot reproduce this problem. Do you have 
> some suggestions? If we find a case showing this bug, we can add it as part 
> of our tests.

Something like this should work:

1. Call drmModeRmFB for the FB currently assigned to the primary plane. This 
implicitly disables the primary plane.
2. Call drmModeSetCursor(2) with a non-0 BO handle and position & dimensions 
such that the cursor would be visible on the CRTC.

Before b836a274b797, this resulted in drmModeSetCursor(2) returning EINVAL due 
to

if (state->enable && state->active &&
does_crtc_have_active_cursor(state) &&
dm_crtc_state->a

Re: [PATCH 2/3] drm/gma500: Fix crtc_vblank reference leak when userspace queues multiple events

2022-09-06 Thread Michel Dänzer
On 2022-09-05 15:37, Hans de Goede wrote:
> The gma500 page-flip code kinda assumes that userspace never queues more
> than 1 vblank event. So basically it assumes that userspace does:
> 
> - page-flip
> - wait for vblank event
> - render
> - page-flip
> - etc.
> 
> In the case where userspace would submit 2 page-flips without waiting
> for the first to finish, the current code will just overwrite
> gma_crtc->page_flip_event with the event from the 2nd page-flip.

This cannot happen. Common code returns -EBUSY for an attempt to submit a page 
flip while another one is still pending.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: drm: document uAPI page-flip flags

2022-08-30 Thread Michel Dänzer
On 2022-08-30 10:16, Pekka Paalanen wrote:
> On Mon, 29 Aug 2022 15:37:52 +
> Simon Ser  wrote:
>> On Friday, August 26th, 2022 at 10:53, Pekka Paalanen  
>> wrote:
>>
>>>> +/**
>>>> + * DRM_MODE_PAGE_FLIP_ASYNC
>>>> + *
>>>> + * Request that the page-flip is performed as soon as possible, ie. with 
>>>> no
>>>> + * delay due to waiting for vblank. This may cause tearing to be visible 
>>>> on
>>>> + * the screen.  
>>>
>>> Btw. does the kernel fail the flip if the driver does not support async?
>>> Or does it silently fall back to sync flip?
>>> Asking for both legacy and atomic APIs.  
>>
>> Atomic doesn't support this yet, so it's pretty much TBD (see Ville's reply 
>> [1]).
> 
> So atomic commit ioctl will fail with EINVAL because userspace is
> giving it an unrecognized flag?
> 
> [1] is interesting. Apparently the atomic async flag will only be a
> hint, "make it tear if possible". That seems fine to me.

To me it would seem better for the driver to return an error if the async flag 
is passed in but the driver can't actually do an async update.

Otherwise it's tricky for a Wayland compositor to decide if it should use an 
async commit (which needs to be done ASAP to serve the intended purpose) or not 
(in which case the compositor may want to delay the commit as long as possible 
for minimal latency).


>> For legacy, it seems like drivers do what they want. AFAIU, i915 will reject
>> (see intel_async_flip_check_uapi etc), and amdgpu will silently fall back to
>> vsync (see the `acrtc_state->update_type == UPDATE_TYPE_FAST` check in
>> amdgpu_dm_commit_planes).
>>
>> Maybe one of the drivers is wrong here and should be fixed?
>>
>> [1]: https://lore.kernel.org/dri-devel/ywib%2fxqf6z6sc...@intel.com
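Given the divergent driver behaviours quoted above, a compositor wanting the error-on-unsupported semantics might probe per flip: try the async flag first, and on -EINVAL fall back to a vsynced flip. A sketch, with the flip ioctl abstracted behind a hypothetical callback:

```c
#include <errno.h>

typedef int (*toy_flip_fn)(int async);	/* stand-in for the flip ioctl */

/* Returns 0 on success and records whether the flip was actually
 * async, so the caller knows whether to submit future commits ASAP
 * (async) or delay them as long as possible for latency (sync). */
static int toy_submit_flip(toy_flip_fn flip, int *was_async)
{
	int ret = flip(1);		/* request PAGE_FLIP_ASYNC */

	if (ret == -EINVAL) {		/* driver rejects async flips */
		*was_async = 0;
		ret = flip(0);		/* fall back to a vsynced flip */
	} else {
		*was_async = (ret == 0);
	}
	return ret;
}
```

This probing only works if drivers consistently reject the flag rather than silently falling back, which is the point being argued above.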




Re: [PATCH] drm/sched: Add FIFO policy for scheduler rq

2022-08-24 Thread Michel Dänzer
On 2022-08-22 22:09, Andrey Grodzovsky wrote:
> Problem: Given many entities competing for the same rq on the
> same scheduler, an unacceptably long wait time for some
> jobs stuck waiting in the rq before being picked up is
> observed (seen using GPUVis).
> The issue is due to the Round Robin policy used by the scheduler
> to pick the next entity for execution. Under stress
> of many entities and long job queues within an entity, some
> jobs can be stuck for a very long time in their entity's
> queue before being popped from the queue and executed,
> while for other entities with smaller job queues a job
> might execute earlier even though that job arrived later
> than the job in the long queue.
> 
> Fix:
> Add a FIFO selection policy to entities in the RQ; choose the next entity
> on the rq in such an order that if a job on one entity arrived
> earlier than a job on another entity, the first job will start
> executing earlier regardless of the length of the entity's job
> queue.

Instead of ordering based on when jobs are added, might it be possible to order 
them based on when they become ready to run?

Otherwise it seems possible to e.g. submit a large number of inter-dependent 
jobs at once, and they would all run before any jobs from another queue get a 
chance.
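The difference between the two orderings can be seen in a toy model. The fields are made up for illustration; the real scheduler tracks state per drm_sched_entity:

```c
/* Head-of-queue job per entity, carrying both timestamps. */
struct toy_job {
	long submit_ts;	/* when userspace queued the job */
	long ready_ts;	/* when its dependencies were satisfied */
};

/* Pick the entity whose head job has the smallest timestamp,
 * ordering either by submission time or by ready time. */
static int toy_pick_entity(const struct toy_job *heads, int n, int by_ready)
{
	int best = -1;
	long best_ts = 0;

	for (int i = 0; i < n; i++) {
		long ts = by_ready ? heads[i].ready_ts : heads[i].submit_ts;

		if (best < 0 || ts < best_ts) {
			best = i;
			best_ts = ts;
		}
	}
	return best;
}
```

With entity 0 holding a chain of jobs submitted early but only becoming ready late, submission-time ordering keeps picking entity 0, while ready-time ordering lets entity 1 run first.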





Re: [PATCH] amdgpu: add context creation flags in CS IOCTL

2022-08-02 Thread Michel Dänzer
On 2022-08-02 15:55, Shashank Sharma wrote:
> This patch adds:
> - A new input parameter "flags" in the amdgpu_ctx_create2 call.
> - Some new flags defining workload type hints.
> - Some change in the caller function of amdgpu_ctx_create2, to
>   accommodate this new parameter.
> 
> The idea is to pass the workload hints at context creation, so
> that the kernel GPU scheduler can pass this information to the GPU FW, which in
> turn can adjust the GPU characteristics as per the workload type.
> 
> Signed-off-by: Shashank Sharma 
> Cc: Alex Deucher 
> Cc: Marek Olsak 
> Cc: Christian Koenig 
> Cc: Amarnath Somalapuram 
> ---
>  amdgpu/amdgpu.h  |  2 ++
>  amdgpu/amdgpu_cs.c   |  5 -
>  include/drm/amdgpu_drm.h | 10 +-

See include/drm/README on how to handle changes to include/drm/*_drm.h .


Also, libdrm changes are now reviewed and merged as GitLab merge requests: 
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests





Re: [PATCH libdrm v2 04/10] util: Add missing big-endian RGB16 frame buffer formats

2022-07-12 Thread Michel Dänzer
On 2022-07-11 14:34, Geert Uytterhoeven wrote:
> Hi Ville,
> 
> On Mon, Jul 11, 2022 at 2:17 PM Ville Syrjälä
>  wrote:
>> On Fri, Jul 08, 2022 at 08:21:43PM +0200, Geert Uytterhoeven wrote:
>>> Signed-off-by: Geert Uytterhoeven 
>>> ---
>>> Any better suggestion than appending "be"?
>>>
>>> v2:
>>>   - New.
> 
>>> --- a/tests/util/format.c
>>> +++ b/tests/util/format.c
>>> @@ -76,6 +76,9 @@ static const struct util_format_info format_info[] = {
>>>   { DRM_FORMAT_BGRX5551, "BX15", MAKE_RGB_INFO(5, 1, 5, 6, 5, 11, 0, 0) 
>>> },
>>>   { DRM_FORMAT_RGB565, "RG16", MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) },
>>>   { DRM_FORMAT_BGR565, "BG16", MAKE_RGB_INFO(5, 0, 6, 5, 5, 11, 0, 0) },
>>> + /* Big-endian RGB16 */
>>> + { DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN, "XR15be", 
>>> MAKE_RGB_INFO(5, 10, 5, 5, 5, 0, 0, 0) },
>>> + { DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN, "RG16be", 
>>> MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) },
>>
>> How about just stripping the BE bit in util_format_info_find()
>> so we don't have to duplicate the entries in the table?
> 
> There is no need to support big-endian variants of all formats.
> E.g. big-endian [AX]RGB just map to little-endian BGR[AX].
> 
> XRGB1555 and RGB565 are probably the only RGB formats we care about.
> Or perhaps some of the *30 formats?

I'd say all of those, and any other formats with components straddling byte 
boundaries (including formats with multi-byte components).
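The rule of thumb above — any format with a component crossing a byte boundary is endianness-sensitive — can be expressed with the same (length, offset) bit-field pairs that MAKE_RGB_INFO uses. This is a toy predicate, not modetest code:

```c
/* One colour component, as a (length, offset) bit-field pair. */
struct toy_comp {
	unsigned int len;
	unsigned int offs;
};

/* A format needs a distinct big-endian variant if any component
 * occupies bits in more than one byte of the pixel word; this also
 * catches components wider than 8 bits, e.g. the *30 formats. */
static int toy_endian_sensitive(const struct toy_comp *c, int n)
{
	for (int i = 0; i < n; i++) {
		if (!c[i].len)
			continue;
		if (c[i].offs / 8 != (c[i].offs + c[i].len - 1) / 8)
			return 1;
	}
	return 0;
}
```

By this test RGB565 is sensitive (green spans bits 5-10, straddling the byte boundary), while XRGB8888 is not, since each of its byte-aligned components survives a byte swap intact.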


P.S. libdrm changes are now reviewed and merged as GitLab merge requests: 
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests

I guess CONTRIBUTING.rst could make this clearer.



Re: [PATCH 1/3] drm/fourcc: Add missing big-endian XRGB1555 and RGB565 formats

2022-07-11 Thread Michel Dänzer
On 2022-07-08 20:21, Geert Uytterhoeven wrote:
> As of commit eae06120f1974e1a ("drm: refuse ADDFB2 ioctl for broken
> bigendian drivers"), drivers must set the
> quirk_addfb_prefer_host_byte_order quirk to make the drm_mode_addfb()
> compat code work correctly on big-endian machines.
> 
> While that works fine for big-endian XRGB and ARGB, which are
> mapped to the existing little-endian BGRX and BGRA formats, it
> does not work for big-endian XRGB1555 and RGB565, as the latter are not
> listed in the format database.
> 
> Fix this by adding the missing formats.  Limit this to big-endian
> platforms, as there is currently no need to support these formats on
> little-endian platforms.
> 
> Fixes: 6960e6da9cec3f66 ("drm: fix drm_mode_addfb() on big endian machines.")
> Signed-off-by: Geert Uytterhoeven 
> ---
> Cirrus is the only driver setting quirk_addfb_prefer_host_byte_order
> and supporting RGB565 or XRGB1555, but no one tried that on big-endian?
> Cirrus does not support DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN
> in cirrus_fb_create, so you cannot get a graphical text console.
> 
> Do we need these definitions on little-endian platforms, too?
> Would it be better to use "DRM_FORMAT_{XRGB1555,RGB565} |
> DRM_FORMAT_BIG_ENDIAN" instead of "DRM_FORMAT_HOST_{XRGB1555,RGB565}" in
> formats[]?

The intention of DRM_FORMAT_HOST_* is that they are macros in 
include/drm/drm_fourcc.h which just map to little endian formats defined in 
drivers/gpu/drm/drm_fourcc.c. Since this is not possible for big endian hosts 
for XRGB1555 or RGB565 (or any other format with non-8-bit components), this 
isn't applicable here.

It's also doubtful that Cirrus hardware would access these formats as big 
endian (drivers/gpu/drm/tiny/cirrus.c has no endianness references at all, and 
the hardware was surely designed for x86 first and foremost).

Instead, fbcon (and user space) needs to convert to little endian when using 
DRM_FORMAT_HOST_{XRGB1555,RGB565} with the cirrus driver on big endian hosts.
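The 32bpp case, where aliasing to a little-endian format does work, can be checked mechanically. With toy packing helpers (not kernel code): the in-memory bytes of a big-endian XRGB8888 pixel equal those of a little-endian BGRX8888 pixel with the same components, an equivalence with no counterpart for RGB565, whose green bits straddle a byte boundary.

```c
#include <stdint.h>

static uint32_t toy_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00) |
	       ((v << 8) & 0x00ff0000) | (v << 24);
}

/* Pack XRGB8888: X in bits 31:24 down to B in bits 7:0. */
static uint32_t toy_xrgb(uint8_t x, uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)x << 24) | ((uint32_t)r << 16) |
	       ((uint32_t)g << 8) | b;
}

/* Pack BGRX8888: B in bits 31:24 down to X in bits 7:0. */
static uint32_t toy_bgrx(uint8_t b, uint8_t g, uint8_t r, uint8_t x)
{
	return ((uint32_t)b << 24) | ((uint32_t)g << 16) |
	       ((uint32_t)r << 8) | x;
}
```

Storing toy_xrgb(x,r,g,b) big-endian produces the same bytes as storing toy_bgrx(b,g,r,x) little-endian, i.e. toy_bgrx(b,g,r,x) == toy_bswap32(toy_xrgb(x,r,g,b)). That is exactly the aliasing the DRM_FORMAT_HOST_* macros rely on, and it breaks down as soon as a component crosses a byte boundary.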



