[PATCH] drm/omap: dsi: fix unsigned expression compared with zero

2021-03-11 Thread angkery
From: Junlin Yang 

"r" is a u32 and therefore always >= 0, but mipi_dsi_create_packet() may
return a value less than zero, so the r < 0 condition is never true.

Fixes coccicheck warnings:
./drivers/gpu/drm/omapdrm/dss/dsi.c:2155:5-6:
WARNING: Unsigned expression compared with zero: r < 0

Signed-off-by: Junlin Yang 
---
 drivers/gpu/drm/omapdrm/dss/dsi.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/dss/dsi.c b/drivers/gpu/drm/omapdrm/dss/dsi.c
index 8e11612..b31d750 100644
--- a/drivers/gpu/drm/omapdrm/dss/dsi.c
+++ b/drivers/gpu/drm/omapdrm/dss/dsi.c
@@ -2149,11 +2149,12 @@ static int dsi_vc_send_short(struct dsi_data *dsi, int vc,
 const struct mipi_dsi_msg *msg)
 {
struct mipi_dsi_packet pkt;
+   int ret;
u32 r;
 
-   r = mipi_dsi_create_packet(&pkt, msg);
-   if (r < 0)
-   return r;
+   ret = mipi_dsi_create_packet(&pkt, msg);
+   if (ret < 0)
+   return ret;
 
WARN_ON(!dsi_bus_is_locked(dsi));
 
-- 
1.9.1


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk

2021-03-11 Thread Alexandre Desnoyers
On Thu, Mar 11, 2021 at 2:49 PM Daniel Gomez  wrote:
>
> On Thu, 11 Mar 2021 at 10:09, Daniel Gomez  wrote:
> >
> > On Wed, 10 Mar 2021 at 18:06, Alex Deucher  wrote:
> > >
> > > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez  wrote:
> > > >
> > > > Disabling GFXOFF via the quirk list fixes a hardware lockup in
> > > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
> > > >
> > > > Signed-off-by: Daniel Gomez 
> > > > ---
> > > >
> > > > This patch is a continuation of the work here:
> > > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was 
> > > > discussed and
> > > > a dma_fence deadlock was provoked as a side effect. To reproduce the 
> > > > issue
> > > > please refer to the above link.
> > > >
> > > > The hardware lockup was introduced in 5.6-rc1 for our particular 
> > > > revision as it
> > > > wasn't part of the new blacklist. Before that, in kernel v5.5, this 
> > > > hardware was
> > > > working fine without any hardware lock because the GFXOFF was actually 
> > > > disabled
> > > > by the if condition for the CHIP_RAVEN case. So this patch adds the 
> > > > 'Radeon
> > > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to disable 
> > > > the GFXOFF.
> > > >
> > > > But besides the fix, I'd like to ask where this revision comes 
> > > > from. Is it
> > > > an ASIC revision or is it hardcoded in the VBIOS from our vendor? From 
> > > > what I
> > > > can see, it comes from the ASIC and I wonder if somehow we can get an 
> > > > APU in the
> > > > future, 'not blacklisted', with the same problem. Then, should this 
> > > > table only
> > > > filter for the vendor and device and not the revision? Do you know if 
> > > > there are
> > > > any revisions for the 1002:15dd validated, tested and functional?
> > >
> > > The pci revision id (RID) is used to specify the specific SKU within a
> > > family.  GFXOFF is supposed to be working on all raven variants.  It
> > > was tested and functional on all reference platforms and any OEM
> > > platforms that launched with Linux support.  There are a lot of
> > > dependencies on sbios in the early raven variants (0x15dd), so it's
> > > likely more of a specific platform issue, but there is not a good way
> > > to detect this so we use the DID/SSID/RID as a proxy.  The newer raven
> > > variants (0x15d8) have much better GFXOFF support since they all
> > > shipped with newer firmware and sbios.
> >
> > We took one of the first reference platform boards to design our
> > custom board based on the V1605B and I assume it has one of the early 
> > 'unstable'
> > raven variants with RID 0x83. Also, as OEM we are in control of the bios
> > (provided by insyde) but I wasn't sure about the RID so, thanks for the
> > clarification. Is there anything we can do with the bios to have the GFXOFF
> > enabled and 'stable' for this particular revision? Otherwise we'd need to 
> > add
> > the 0x83 RID to the table. Also, there is an extra ']' in the patch
> > subject. Sorry
> > for that. Would you need a new patch in case you accept it with the ']' 
> > removed?
> >
> > Good to hear that the newer raven versions have better GFXOFF support.
>
> Adding Alex Desnoyer to the loop as he is the electronic/hardware and
> bios responsible so, he can
> provide more information about this.

Hello everyone,

We, Qtechnology, are the OEM of the hardware platform where we
originally discovered the bug.  Our platform is based on the AMD
Dibbler V-1000 reference design, with the latest Insyde BIOS release
available for the (now unsupported) Dibbler platform.  We have the
Insyde BIOS source code internally, so we can make some modifications
as needed.

The last test that Daniel and myself performed was on a standard
Dibbler PCB rev.B1 motherboard (NOT our platform), and using the
corresponding latest AMD released BIOS "RDB1109GA".  As Daniel wrote,
the hardware lockup can be reproduced on the Dibbler, even if it has a
different RID than our V1605B APU.

We also have a Neousys Technology POC-515 embedded computer (V-1000,
V1605B) in our office.  The Neousys PC also uses Insyde BIOS.  This
computer is also locking-up in the test.
https://www.neousys-tech.com/en/product/application/rugged-embedded/poc-500-amd-ryzen-ultra-compact-embedded-computer


Digging into the BIOS source code, the only reference to GFXOFF is in
the SMU and PSP firmware release notes, where some bug fixes have been
mentioned for previous SMU/PSP releases.  After a quick "git grep -i
gfx | grep -i off", there seems to be no mention of GFXOFF in the
Insyde UEFI (including AMD PI) code base.  I would appreciate any
information regarding BIOS modification needed to make the GFXOFF
feature stable.  As you (Alex Deucher) mentioned, it should be
functional on all AMD Raven reference platforms.


Regards,

Alexandre Desnoyers


>
> I've now done a test on the reference platform (dibbler) with the
> latest bios available
> and the hw lockup can be also reproduced with the same steps.
>
> For reference, I'm 

[PATCH v6] drm/loongson: Add DRM Driver for Loongson 7A1000 bridge chip

2021-03-11 Thread lichenyang
This patch adds an initial DRM driver for the Loongson LS7A1000
bridge chip (LS7A). The LS7A bridge chip contains two display
controllers and supports dual display output. Each display channel
supports a maximum resolution of 1920x1080@60Hz.
At present, DC device detection and DRM driver registration are
completed, and the crtc/plane/encoder/connector objects have been
implemented.
On the Loongson 3A4000 CPU and 7A1000 system, we have brought up
dual-screen output, with support for clone mode and extend mode.

v6:
- Remove spin_lock in mmio reg read and write.
- Replace TO_UNCAC with ioremap.
- Fix wrong arguments in crtc_atomic_enable/disable/mode_valid.

v5:
- Change the name of the chip to LS7A.
- Change magic values in crtc to macros.
- Correct misspelled words.
- Change the register operation function prefix to ls7a.

v4:
- Move the mode_valid function to the crtc.

v3:
- Move the mode_valid function to the connector and optimize it.
- Fix num_crtc calculation method.

v2:
- Complete the case of 32-bit color in CRTC.

Signed-off-by: Chenyang Li 
---
 drivers/gpu/drm/Kconfig   |   2 +
 drivers/gpu/drm/Makefile  |   1 +
 drivers/gpu/drm/loongson/Kconfig  |  14 +
 drivers/gpu/drm/loongson/Makefile |  14 +
 drivers/gpu/drm/loongson/loongson_connector.c |  48 
 drivers/gpu/drm/loongson/loongson_crtc.c  | 243 
 drivers/gpu/drm/loongson/loongson_device.c|  47 +++
 drivers/gpu/drm/loongson/loongson_drv.c   | 270 ++
 drivers/gpu/drm/loongson/loongson_drv.h   | 139 +
 drivers/gpu/drm/loongson/loongson_encoder.c   |  37 +++
 drivers/gpu/drm/loongson/loongson_plane.c | 102 +++
 11 files changed, 917 insertions(+)
 create mode 100644 drivers/gpu/drm/loongson/Kconfig
 create mode 100644 drivers/gpu/drm/loongson/Makefile
 create mode 100644 drivers/gpu/drm/loongson/loongson_connector.c
 create mode 100644 drivers/gpu/drm/loongson/loongson_crtc.c
 create mode 100644 drivers/gpu/drm/loongson/loongson_device.c
 create mode 100644 drivers/gpu/drm/loongson/loongson_drv.c
 create mode 100644 drivers/gpu/drm/loongson/loongson_drv.h
 create mode 100644 drivers/gpu/drm/loongson/loongson_encoder.c
 create mode 100644 drivers/gpu/drm/loongson/loongson_plane.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0973f408d75f..6ed1b6dc2f25 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -374,6 +374,8 @@ source "drivers/gpu/drm/xen/Kconfig"
 
 source "drivers/gpu/drm/vboxvideo/Kconfig"
 
+source "drivers/gpu/drm/loongson/Kconfig"
+
 source "drivers/gpu/drm/lima/Kconfig"
 
 source "drivers/gpu/drm/panfrost/Kconfig"
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index fefaff4c832d..f87da730ea6d 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -119,6 +119,7 @@ obj-$(CONFIG_DRM_PL111) += pl111/
 obj-$(CONFIG_DRM_TVE200) += tve200/
 obj-$(CONFIG_DRM_XEN) += xen/
 obj-$(CONFIG_DRM_VBOXVIDEO) += vboxvideo/
+obj-$(CONFIG_DRM_LOONGSON) += loongson/
 obj-$(CONFIG_DRM_LIMA)  += lima/
 obj-$(CONFIG_DRM_PANFROST) += panfrost/
 obj-$(CONFIG_DRM_ASPEED_GFX) += aspeed/
diff --git a/drivers/gpu/drm/loongson/Kconfig b/drivers/gpu/drm/loongson/Kconfig
new file mode 100644
index ..3cf42a4cca08
--- /dev/null
+++ b/drivers/gpu/drm/loongson/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config DRM_LOONGSON
+   tristate "DRM support for LS7A bridge chipset"
+   depends on DRM && PCI
+   depends on CPU_LOONGSON64
+   select DRM_KMS_HELPER
+   select DRM_VRAM_HELPER
+   select DRM_TTM
+   select DRM_TTM_HELPER
+   default n
+   help
+ Support the display controllers found on the Loongson LS7A
+ bridge.
diff --git a/drivers/gpu/drm/loongson/Makefile b/drivers/gpu/drm/loongson/Makefile
new file mode 100644
index ..22d063953b78
--- /dev/null
+++ b/drivers/gpu/drm/loongson/Makefile
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for loongson drm drivers.
+# This driver provides support for the
+# Direct Rendering Infrastructure (DRI)
+
+ccflags-y := -Iinclude/drm
+loongson-y := loongson_drv.o \
+   loongson_crtc.o \
+   loongson_plane.o \
+   loongson_device.o \
+   loongson_connector.o \
+   loongson_encoder.o
+obj-$(CONFIG_DRM_LOONGSON) += loongson.o
diff --git a/drivers/gpu/drm/loongson/loongson_connector.c b/drivers/gpu/drm/loongson/loongson_connector.c
new file mode 100644
index ..6b1f0ffa33bd
--- /dev/null
+++ b/drivers/gpu/drm/loongson/loongson_connector.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include "loongson_drv.h"
+
+static int loongson_get_modes(struct drm_connector *connector)
+{
+   int count;
+
+   count = drm_add_modes_noedid(connector, 1920, 1080);
+   drm_set_preferred_mode(connector, 1024, 768);
+
+   return count;
+}
+
+static const 

Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

2021-03-11 Thread Christian König




Am 11.03.21 um 14:17 schrieb Daniel Vetter:

[SNIP]

So I did the following quick experiment on vmwgfx, and it turns out that
with it,
fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds

I should probably craft an RFC formalizing this.

Yeah I think that would be good. Maybe even more formalized if we also
switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or
something like that.

Otoh your description of when it only sometimes succeeds would indicate my
understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

My understanding from reading the vmf_insert_mixed() code is that iff
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
not consistent with the vm_normal_page() doc. For architectures without
pte_special, VM_PFNMAP must be used, and then we must also block COW
mappings.

If we can get someone can commit to verify that the potential PAT WC
performance issue is gone with PFNMAP, I can put together a series with
that included.

Iirc when I checked there's not much archs without pte_special, so I
guess that's why we luck out. Hopefully.


I still need to read up a bit on what you guys are discussing here, but 
it starts to make a picture. Especially my understanding of what 
VM_MIXEDMAP means seems to have been slightly off.


I would say just go ahead and provide patches to always use VM_PFNMAP in 
TTM and we can test it and see if there are still some issues.
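For reference while such patches are drafted, a rough, untested sketch of the VM_PFNMAP-style mmap path being discussed. vmf_insert_pfn(), is_cow_mapping() and the vm_flags are real kernel APIs, but drv_lookup_pfn() and the surrounding structure are hypothetical:

```c
/* Untested sketch, not a patch -- drv_lookup_pfn() is hypothetical. */
static vm_fault_t drv_vm_fault(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;
	unsigned long pfn = drv_lookup_pfn(vma, vmf->address);

	/* With VM_PFNMAP, vm_normal_page() treats these PTEs as special,
	 * so GUP cannot pin the underlying pages. */
	return vmf_insert_pfn(vma, vmf->address, pfn);
}

static const struct vm_operations_struct drv_vm_ops = {
	.fault = drv_vm_fault,
};

static int drv_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* Also lock down COW/MAP_PRIVATE mappings, as discussed above. */
	if (is_cow_mapping(vma->vm_flags))
		return -EINVAL;

	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &drv_vm_ops;
	return 0;
}
```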



As for existing userspace using COW TTM mappings, I once had a couple of
test cases to verify that it actually worked, in particular together
with huge PMDs and PUDs where breaking COW would imply splitting those,
but I can't think of anything else actually wanting to do that other
than by mistake.

Yeah disallowing MAP_PRIVATE mappings would be another good thing to
lock down. Really doesn't make much sense.


Completely agree. That sounds like something we should try to avoid.

Regards,
Christian.


-Daniel





Re: [PATCH v5 0/7] drm: add simpledrm driver

2021-03-11 Thread nerdopolis
On Wednesday, March 10, 2021 4:10:35 AM EST Thomas Zimmermann wrote:
> Hi
> 
> Am 10.03.21 um 03:50 schrieb nerdopolis:
> > On Friday, September 2, 2016 4:22:38 AM EST David Herrmann wrote:
> >> Hey
> >>
> >> On request of Noralf, I picked up the patches and prepared v5. Works fine 
> >> with
> >> Xorg, if configured according to:
> >>  
> >> https://lists.freedesktop.org/archives/dri-devel/2014-January/052777.html
> >> If anyone knows how to make Xorg pick it up dynamically without such a 
> >> static
> >> configuration, please let me know.
> >>
> >>
> >>
> > Hi
> > 
> > I am kind of curious as I do have interest in seeing this merged as well.
> 
> Please take a look at [1]. It's not the same driver, but something to 
> the same effect. I know it's been almost a year, but I do work on this 
> and intend to come back with a new version during 2021.
> 
> I currently work on fastboot support for the new driver. But it's a 
> complicated matter and takes time. If there's interest, we could talk 
> about merging what's already there.
> 
> Best regards
> Thomas
> 
> [1] 
> https://lore.kernel.org/dri-devel/20200625120011.16168-1-tzimmerm...@suse.de/
> 
> > 
> > There is an email in this thread from 2018, but when I tried to import an 
> > mbox
> > file from the whole month for August 2018, for some reason, kmail doesn't 
> > see
> > the sender and mailing list recipient in that one, so I will reply to this 
> > one,
> > because I was able to import this into my mail client.
> > https://www.spinics.net/lists/dri-devel/msg185519.html
> > 
> > I was able to get this to build against Linux 4.8, but not against a newer
> > version, some headers seem to have been split, and some things are off by 8
> > and other things. I could NOT find a git repo, but I was able to find the
> > newest patches I could find, and import those with git am against 4.8 with
> > some tweaks. If that is needed, I can link it, but only if you want.
> > 
> > However in QEMU I wasn't able to figure out how to make it create a
> > /dev/dri/card0 device, even after blacklisting the other modules for qxl,
> > cirrus, etc, and then modprobe-ing simpledrm
> > 
> > In my view something like this would be useful. There still could be
> > hardware devices that don't have modesetting support (like vmvga in
> > qemu/virt-manager as an example). And most wayland servers need a
> > /dev/dri/card0 device as well as a potential user-mode TTY replacement would
> > also need /dev/dri/card0
> > 
> > I will admit I unfortunately failed to get it to build against master. I
> > couldn't figure out some of the changes, where some new structs were off by
> > a factor of 8.
> > 
> > 
> > Thanks
> > 
> > 
> > 
> > 
> 
> 
Hi

I tried simplekms against v5.9, and it built, and it runs, and is pretty neat.

I tried using the qxl, cirrus, and vmware card in QEMU. Weston starts on all
of them. And I did ensure that the simplekms driver was being used

That is, it works after adding GRUB_GFXPAYLOAD_LINUX=keep, to avoid having to
set a VGA option (though I'm not sure of the syslinux equivalent yet).
 

Thanks.




Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-11 Thread Boris Brezillon
On Thu, 11 Mar 2021 12:11:48 -0600
Jason Ekstrand  wrote:

> > > > > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > > > > explicit/implicit deps between them), and Vulkan asks that the 
> > > > > > out
> > > > > > fence be signaled when all jobs are done. Timeline syncobjs are 
> > > > > > a
> > > > > > good match for that use case. All we need to do is pass the same
> > > > > > fence syncobj to all jobs being attached to a single QueueSubmit
> > > > > > request, but a different point on the timeline. The syncobj
> > > > > > timeline wait does the rest and guarantees that we've reached a
> > > > > > given timeline point (IOW, all jobs before that point are done)
> > > > > > before declaring the fence as signaled.
> > > > > > One alternative would be to have dummy 'synchronization' jobs 
> > > > > > that
> > > > > > don't actually execute anything on the GPU but declare a 
> > > > > > dependency
> > > > > > on all other jobs that are part of the QueueSubmit request, and
> > > > > > signal the out fence (the scheduler would do most of the work 
> > > > > > for
> > > > > > us, all we have to do is support NULL job heads and signal the
> > > > > > fence directly when that happens instead of queueing the job).  
> > > > >
> > > > > I have to admit to being rather hazy on the details of timeline
> > > > > syncobjs, but I thought there was a requirement that the timeline 
> > > > > moves
> > > > > monotonically. I.e. if you have multiple jobs signalling the same
> > > > > syncobj just with different points, then AFAIU the API requires that 
> > > > > the
> > > > > points are triggered in order.  
> > > >
> > > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so I
> > > > might be wrong, but my understanding is that queuing fences (addition
> > > > of new points in the timeline) should happen in order, but signaling
> > > > can happen in any order. When checking for a signaled fence the
> > > > fence-chain logic starts from the last point (or from an explicit point
> > > > if you use the timeline wait flavor) and goes backward, stopping at the
> > > > first un-signaled node. If I'm correct, that means that fences that
> > > > are part of a chain can be signaled in any order.  
> > >
> > > You don't even need a timeline for this.  Just have a single syncobj
> > > per-queue and make each submit wait on it and then signal it.
> > > Alternatively, you can just always hang on to the out-fence from the
> > > previous submit and make the next one wait on that.  
> >
> > That's what I have right now, but it forces the serialization of all
> > jobs that are pushed during a submit (and there can be more than one
> > per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd
> > be better to not force this serialization if we can avoid it.  
> 
> I'm not familiar with panfrost's needs and I don't work on a tiler and
> I know there are different issues there.  But...
> 
> The Vulkan spec requires that everything that all the submits that
> happen on a given vkQueue happen in-order.  Search the spec for
> "Submission order" for more details.

Duh, looks like I completely overlooked the "Submission order"
guarantees. This being said, even after reading this chapter multiple
times I'm not sure what kind of guarantee this gives us, given the
execution itself can be out-of-order. My understanding is that
submission order matters for implicit deps, say you have 2 distinct
VkSubmitInfo, the first one (in submission order) writing to a buffer
and the second one reading from it, you really want the first one to
be submitted first and the second one to wait on the implicit BO fence
created by the first one. If we were to submit out-of-order, this
guarantee wouldn't be met. OTOH, if we have 2 completely independent
submits, I don't really see what submission order gives us if execution
is out-of-order.

In our case, the kernel driver takes care of the submission
serialization (gathering implicit and explicit deps, queuing the job and
assigning the "done" fence to the output sync objects). Once things
are queued, it's the scheduler (drm_sched) deciding the execution
order.

> 
> So, generally speaking, there are some in-order requirements there.
> Again, not having a lot of tiler experience, I'm not the one to weigh
> in.
> 
> > > Timelines are overkill here, IMO.  
> >
> > Mind developing why you think this is overkill? After looking at the
> > kernel implementation I thought using timeline syncobjs would be
> > pretty cheap compared to the other options I considered.  
> 
> If you use a regular syncobj, every time you wait on it it inserts a
> dependency between the current submit and the last thing to signal it
> on the CPU timeline.  The internal dma_fences will hang around
> as-needed to ensure those dependencies.  If you use a timeline, you
> have to also track a uint64_t to reference the current time point.
> This 

RE: [PATCH] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-11 Thread Zhang, Jack (Jian)
[AMD Official Use Only - Internal Distribution Only]

Hi, Andrey,

ok, I have changed it and uploaded V2 patch.

Thanks,
Jack
-Original Message-
From: Grodzovsky, Andrey 
Sent: Friday, March 12, 2021 1:04 PM
To: Alex Deucher ; Zhang, Jack (Jian) 
; Maling list - DRI developers 

Cc: amd-gfx list ; Koenig, Christian 
; Liu, Monk ; Deng, Emily 

Subject: Re: [PATCH] drm/scheduler re-insert Bailing job to avoid memleak

Check panfrost driver at panfrost_scheduler_stop, and panfrost_job_timedout - 
they also terminate prematurely in both places so probably worth adding this 
there too.

Andrey

On 2021-03-11 11:13 p.m., Alex Deucher wrote:
> +dri-devel
>
> Please be sure to cc dri-devel when you send out gpu scheduler patches.
>
> On Thu, Mar 11, 2021 at 10:57 PM Jack Zhang  wrote:
>>
>> re-insert Bailing jobs to avoid memory leak.
>>
>> Signed-off-by: Jack Zhang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
>>   drivers/gpu/drm/scheduler/sched_main.c | 8 +++-
>>   include/drm/gpu_scheduler.h| 1 +
>>   4 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 79b9cc73763f..86463b0f936e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
>> *adev,
>>  job ? job->base.id : -1);
>>
>>  /* even we skipped this reset, still need to set the job to 
>> guilty */
>> -   if (job)
>> +   if (job) {
>>  drm_sched_increase_karma(&job->base);
>> +   r = DRM_GPU_SCHED_STAT_BAILING;
>> +   }
>>  goto skip_recovery;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index 759b34799221..41390bdacd9e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>  struct amdgpu_job *job = to_amdgpu_job(s_job);
>>  struct amdgpu_task_info ti;
>>  struct amdgpu_device *adev = ring->adev;
>> +   int ret;
>>
>>  memset(&ti, 0, sizeof(struct amdgpu_task_info));
>>
>> @@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>ti.process_name, ti.tgid, ti.task_name, ti.pid);
>>
>>  if (amdgpu_device_should_recover_gpu(ring->adev)) {
>> -   amdgpu_device_gpu_recover(ring->adev, job);
>> -   return DRM_GPU_SCHED_STAT_NOMINAL;
>> +   ret = amdgpu_device_gpu_recover(ring->adev, job);
>> +   if (ret == DRM_GPU_SCHED_STAT_BAILING)
>> +   return DRM_GPU_SCHED_STAT_BAILING;
>> +   else
>> +   return DRM_GPU_SCHED_STAT_NOMINAL;
>>  } else {
>>  drm_sched_suspend_timeout(&ring->sched);
>>  if (amdgpu_sriov_vf(adev)) diff --git
>> a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 92d8de24d0a1..a44f621fb5c4 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct 
>> *work)
>>   {
>>  struct drm_gpu_scheduler *sched;
>>  struct drm_sched_job *job;
>> +   int ret;
>>
>>  sched = container_of(work, struct drm_gpu_scheduler,
>> work_tdr.work);
>>
>> @@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct 
>> *work)
>>  list_del_init(&job->list);
>>  spin_unlock(&sched->job_list_lock);
>>
>> -   job->sched->ops->timedout_job(job);
>> +   ret = job->sched->ops->timedout_job(job);
>>
>> +   if (ret == DRM_GPU_SCHED_STAT_BAILING) {
>> +   spin_lock(&sched->job_list_lock);
>> +   list_add(&job->node, &sched->ring_mirror_list);
>> +   spin_unlock(&sched->job_list_lock);
>> +   }

Problem here: since you already dropped the reset locks you are racing here 
now against other recovery threads as they process the same mirror list. And 
yet, I think this solution makes things better than they are now with the leak, 
but still, it's only a temporary band-aid until the full solution is 
implemented. Probably worth mentioning here with a comment that it's a 
temporary fix and that races are possible.

Andrey

>>  /*
>>   * Guilty job did complete and hence needs to be manually 
>> removed
>>   * See drm_sched_stop doc.
>> diff --git 

[PATCH v2] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-11 Thread Jack Zhang
re-insert Bailing jobs to avoid memory leak.

Signed-off-by: Jack Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
 drivers/gpu/drm/panfrost/panfrost_job.c| 2 +-
 drivers/gpu/drm/scheduler/sched_main.c | 8 +++-
 include/drm/gpu_scheduler.h| 1 +
 5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 79b9cc73763f..86463b0f936e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
job ? job->base.id : -1);
 
/* even we skipped this reset, still need to set the job to 
guilty */
-   if (job)
+   if (job) {
drm_sched_increase_karma(&job->base);
+   r = DRM_GPU_SCHED_STAT_BAILING;
+   }
goto skip_recovery;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 759b34799221..41390bdacd9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
struct amdgpu_job *job = to_amdgpu_job(s_job);
struct amdgpu_task_info ti;
struct amdgpu_device *adev = ring->adev;
+   int ret;
 
memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
  ti.process_name, ti.tgid, ti.task_name, ti.pid);
 
if (amdgpu_device_should_recover_gpu(ring->adev)) {
-   amdgpu_device_gpu_recover(ring->adev, job);
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   ret = amdgpu_device_gpu_recover(ring->adev, job);
+   if (ret == DRM_GPU_SCHED_STAT_BAILING)
+   return DRM_GPU_SCHED_STAT_BAILING;
+   else
+   return DRM_GPU_SCHED_STAT_NOMINAL;
} else {
drm_sched_suspend_timeout(&ring->sched);
if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 6003cfeb1322..c372f4a38736 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -456,7 +456,7 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct drm_sched_job
 
/* Scheduler is already stopped, nothing to do. */
if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   return DRM_GPU_SCHED_STAT_BAILING;
 
/* Schedule a reset if there's no reset in progress. */
if (!atomic_xchg(>reset.pending, 1))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 92d8de24d0a1..a44f621fb5c4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+   int ret;
 
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
	list_del_init(&job->list);
	spin_unlock(&sched->job_list_lock);
 
-   job->sched->ops->timedout_job(job);
+   ret = job->sched->ops->timedout_job(job);
 
+   if (ret == DRM_GPU_SCHED_STAT_BAILING) {
+   spin_lock(&sched->job_list_lock);
+   list_add(&job->node, &sched->ring_mirror_list);
+   spin_unlock(&sched->job_list_lock);
+   }
/*
 * Guilty job did complete and hence needs to be manually 
removed
 * See drm_sched_stop doc.
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..8093ac2427ef 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -210,6 +210,7 @@ enum drm_gpu_sched_stat {
DRM_GPU_SCHED_STAT_NONE, /* Reserve 0 */
DRM_GPU_SCHED_STAT_NOMINAL,
DRM_GPU_SCHED_STAT_ENODEV,
+   DRM_GPU_SCHED_STAT_BAILING,
 };
 
 /**
-- 
2.25.1



Re: [PATCH] i915: Drop legacy execbuffer support

2021-03-11 Thread Dixit, Ashutosh
On Thu, 11 Mar 2021 20:31:33 -0800, Jason Ekstrand wrote:
> On March 11, 2021 20:26:06 "Dixit, Ashutosh"  wrote:
>  On Wed, 10 Mar 2021 13:00:49 -0800, Jason Ekstrand wrote:
>
>  libdrm has supported the newer execbuffer2 ioctl and using it by default
>  when it exists since libdrm commit b50964027bef249a0cc3d511de05c2464e0a1e22
>  which landed Mar 2, 2010.  The i915 and i965 drivers in Mesa at the time
>  both used libdrm and so did the Intel X11 back-end.  The SNA back-end
>  for X11 has always used execbuffer2.
>
>  Signed-off-by: Jason Ekstrand 
>  ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 100 --
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 -
>  drivers/gpu/drm/i915/i915_drv.c   |   2 +-
>  3 files changed, 1 insertion(+), 103 deletions(-)
>
>  Don't we want to clean up references to legacy execbuffer in
>  include/uapi/drm/i915_drm.h too?
>
> I thought about that but Daniel said we should leave them. Maybe a
> comment is in order?

No, should be ok since we are using drm_invalid_op(). If we want to delete
the unused 'struct drm_i915_gem_execbuffer' we can do that by converting
from DRM_IOW to DRM_IO in the DRM_IOCTL_I915_GEM_EXECBUFFER #define.


Re: [PATCH v2 01/14] opp: Add devres wrapper for dev_pm_opp_set_clkname

2021-03-11 Thread Viresh Kumar
On 11-03-21, 22:20, Dmitry Osipenko wrote:
> +struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char 
> *name)
> +{
> + struct opp_table *opp_table;
> + int err;
> +
> + opp_table = dev_pm_opp_set_clkname(dev, name);
> + if (IS_ERR(opp_table))
> + return opp_table;
> +
> + err = devm_add_action_or_reset(dev, devm_pm_opp_clkname_release, 
> opp_table);
> + if (err)
> + opp_table = ERR_PTR(err);
> +
> + return opp_table;
> +}

I wonder if we still need to return opp_table from here, or a simple
integer is fine.. The callers shouldn't be required to use the OPP
table directly anymore I believe and so better simplify the return
part of this and all other routines you are adding here..

If there is a user which needs the opp_table, let it use the regular
non-devm variant.

-- 
viresh
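For the plain-int shape suggested above, a minimal hypothetical sketch (the exact signature and the use of `dev_pm_opp_put_clkname()` as the release hook are assumptions to be settled in review, not the merged code):

```c
/*
 * Hypothetical int-returning variant: callers that never touch the
 * returned opp_table just get 0 or a negative error code, like most
 * other devm_* helpers.
 */
static void devm_pm_opp_clkname_release(void *data)
{
	dev_pm_opp_put_clkname(data);
}

int devm_pm_opp_set_clkname(struct device *dev, const char *name)
{
	struct opp_table *opp_table;

	opp_table = dev_pm_opp_set_clkname(dev, name);
	if (IS_ERR(opp_table))
		return PTR_ERR(opp_table);

	return devm_add_action_or_reset(dev, devm_pm_opp_clkname_release,
					opp_table);
}
```

A caller that does need the opp_table would keep using the non-devm variant, as suggested.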


Re: [PATCH v2 05/14] opp: Add devres wrapper for dev_pm_opp_register_notifier

2021-03-11 Thread Viresh Kumar
On 11-03-21, 22:20, Dmitry Osipenko wrote:
> From: Yangtao Li 
> 
> Add devres wrapper for dev_pm_opp_register_notifier() to simplify driver
> code.
> 
> Signed-off-by: Yangtao Li 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/opp/core.c | 38 ++
>  include/linux/pm_opp.h |  6 ++
>  2 files changed, 44 insertions(+)

As I said in the previous version, I am not sure if we need this patch
at all. This has only one user.

-- 
viresh


Re: [PATCH] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-11 Thread Andrey Grodzovsky

Check panfrost driver at panfrost_scheduler_stop,
and panfrost_job_timedout - they also terminate prematurely
in both places so probably worth adding this there too.

Andrey

On 2021-03-11 11:13 p.m., Alex Deucher wrote:

+dri-devel

Please be sure to cc dri-devel when you send out gpu scheduler patches.

On Thu, Mar 11, 2021 at 10:57 PM Jack Zhang  wrote:


re-insert Bailing jobs to avoid memory leak.

Signed-off-by: Jack Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
  drivers/gpu/drm/scheduler/sched_main.c | 8 +++-
  include/drm/gpu_scheduler.h| 1 +
  4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 79b9cc73763f..86463b0f936e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 job ? job->base.id : -1);

 /* even we skipped this reset, still need to set the job to 
guilty */
-   if (job)
+   if (job) {
 drm_sched_increase_karma(&job->base);
+   r = DRM_GPU_SCHED_STAT_BAILING;
+   }
 goto skip_recovery;
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 759b34799221..41390bdacd9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
 struct amdgpu_job *job = to_amdgpu_job(s_job);
 struct amdgpu_task_info ti;
 struct amdgpu_device *adev = ring->adev;
+   int ret;

 memset(&ti, 0, sizeof(struct amdgpu_task_info));

@@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
   ti.process_name, ti.tgid, ti.task_name, ti.pid);

 if (amdgpu_device_should_recover_gpu(ring->adev)) {
-   amdgpu_device_gpu_recover(ring->adev, job);
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   ret = amdgpu_device_gpu_recover(ring->adev, job);
+   if (ret == DRM_GPU_SCHED_STAT_BAILING)
+   return DRM_GPU_SCHED_STAT_BAILING;
+   else
+   return DRM_GPU_SCHED_STAT_NOMINAL;
 } else {
 drm_sched_suspend_timeout(&ring->sched);
 if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 92d8de24d0a1..a44f621fb5c4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
  {
 struct drm_gpu_scheduler *sched;
 struct drm_sched_job *job;
+   int ret;

 sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);

@@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
 list_del_init(&job->list);
 spin_unlock(&sched->job_list_lock);

-   job->sched->ops->timedout_job(job);
+   ret = job->sched->ops->timedout_job(job);

+   if (ret == DRM_GPU_SCHED_STAT_BAILING) {
+   spin_lock(&sched->job_list_lock);
+   list_add(&job->node, &sched->ring_mirror_list);
+   spin_unlock(&sched->job_list_lock);
+   }


The problem here is that since you already dropped the reset locks you are
now racing against other recovery threads as they process the same
mirror list. And yet, I think this solution makes things better than
they are now with the leak, but still, it's only a temporary band-aid until
the full solution is implemented. It's probably worth mentioning here
with a comment that it's a temporary fix and that races are possible.

Andrey


 /*
  * Guilty job did complete and hence needs to be manually 
removed
  * See drm_sched_stop doc.
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..8093ac2427ef 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -210,6 +210,7 @@ enum drm_gpu_sched_stat {
 DRM_GPU_SCHED_STAT_NONE, /* Reserve 0 */
 DRM_GPU_SCHED_STAT_NOMINAL,
 DRM_GPU_SCHED_STAT_ENODEV,
+   DRM_GPU_SCHED_STAT_BAILING,
  };

  /**
--
2.25.1
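Andrey's suggestion above — mark the re-insert with a comment flagging the race — could look like this hedged sketch of the same hunk (illustrative only, not the actual fix):

```c
ret = job->sched->ops->timedout_job(job);

if (ret == DRM_GPU_SCHED_STAT_BAILING) {
	/*
	 * Temporary band-aid: job_list_lock was dropped before calling
	 * timedout_job(), so another recovery thread may already be
	 * walking the same mirror list. Re-inserting here only fixes
	 * the leak; the race remains until the full solution lands.
	 */
	spin_lock(&sched->job_list_lock);
	list_add(&job->node, &sched->ring_mirror_list);
	spin_unlock(&sched->job_list_lock);
}
```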


Re: [PATCH v5 1/8] mm: Remove special swap entry functions

2021-03-11 Thread Alistair Popple
On Tuesday, 9 March 2021 11:49:49 PM AEDT Matthew Wilcox wrote:
> On Tue, Mar 09, 2021 at 11:14:58PM +1100, Alistair Popple wrote:
> > -static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > -{
> > -   struct page *p = pfn_to_page(swp_offset(entry));
> > -   /*
> > -* Any use of migration entries may only occur while the
> > -* corresponding page is locked
> > -*/
> > -   BUG_ON(!PageLocked(compound_head(p)));
> > -   return p;
> > -}
> 
> > +static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
> > +{
> > +   struct page *p = pfn_to_page(swp_offset(entry));
> > +
> > +   /*
> > +* Any use of migration entries may only occur while the
> > +* corresponding page is locked
> > +*/
> > +   BUG_ON(is_migration_entry(entry) && !PageLocked(compound_head(p)));
> > +
> > +   return p;
> > +}
> 
> I appreciate you're only moving this code, but PageLocked includes an
> implicit compound_head():

I am happy to clean this up at the same time. It did seem a bit odd when I added 
it and I had meant to follow up on it some more.

> 1. __PAGEFLAG(Locked, locked, PF_NO_TAIL)
> 
> 2. #define __PAGEFLAG(uname, lname, policy)\
> TESTPAGEFLAG(uname, lname, policy)  \
> 
> 3. #define TESTPAGEFLAG(uname, lname, policy)  \
> static __always_inline int Page##uname(struct page *page)   \
> { return test_bit(PG_##lname, &policy(page, 0)->flags); }
> 
> 4. #define PF_NO_TAIL(page, enforce) ({\
> VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page); \
> PF_POISONED_CHECK(compound_head(page)); })
> 
> 5. #define PF_POISONED_CHECK(page) ({  \
> VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);\
> page; })
> 
> 
> This macrology isn't easy to understand the first time you read it (nor,
> indeed, the tenth time), so let me decode it:
> 
> Substitute 5 into 4 and remove irrelevancies:
> 
> 6. #define PF_NO_TAIL(page, enforce) compound_head(page)
> 
> Expand 1 in 2:
> 
> 7.TESTPAGEFLAG(Locked, locked, PF_NO_TAIL)
> 
> Expand 7 in 3:
> 
> 8. static __always_inline int PageLocked(struct page *page)
>   { return test_bit(PG_locked, &PF_NO_TAIL(page, 0)->flags); }
> 
> Expand 6 in 8:
> 
> 9. static __always_inline int PageLocked(struct page *page)
>   { return test_bit(PG_locked, &compound_head(page)->flags); }

Thanks for expanding that out, makes sense and matches my reading as well. 
Will remove the redundant compound_head() call in PageLocked() for the next 
revision.

> (in case it's not clear, compound_head() is idempotent.  that is:
>   f(f(a)) == f(a))







Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Peter Stuge
Ilia Mirkin wrote:
> XRGB means that the memory layout should match a 32-bit integer,
> stored as LE, with the low bits being B, next bits being G, etc. This
> translates to byte 0 = B, byte 1 = G, etc. If you're on a BE system,
> and you're handed a XRGB buffer, it still expects that byte 0 = B,
> etc (except as I outlined, some drivers which are from before these
> formats were a thing, sort of do their own thing). Thankfully this is
> equivalent to BGRX (big-endian packing), so you can just munge the
> format.

I understand! Thanks a lot for clarifying.

It makes much more sense to me that the format indeed describes
what is in memory rather than how pixels look to software.


> > > I'm not sure why you guys were talking about BE in the first place,
> >
> > I was worried that the translation didn't consider endianess.
> 
> The translation in gud_xrgb_to_color definitely seems suspect.

So to me this means that the gud_pipe translations from XRGB to the
1-bit formats *do* have to adjust for the reversed order on BE.


> There's also a gud_is_big_endian, but I'm guessing this applies to the
> downstream device rather than the host system.

gud_is_big_endian() is a static bool wrapper around defined(__BIG_ENDIAN)
so yes, it applies to the host.

With memory layout being constant I again think gud_xrgb_to_color()
needs to take further steps to work correctly also on BE hosts. (Maybe
that's le32_to_cpu(*pix32), maybe drm_fb_swab(), maybe something else?)


> I didn't check if dev->mode_config.quirk_addfb_prefer_host_byte_order
> is set

I can't tell if that's helpful, probably Noralf can.


Thanks a lot

//Peter


Re: [PATCH] i915: Drop legacy execbuffer support

2021-03-11 Thread Jason Ekstrand


On March 11, 2021 20:26:06 "Dixit, Ashutosh"  wrote:


On Wed, 10 Mar 2021 13:00:49 -0800, Jason Ekstrand wrote:


libdrm has supported the newer execbuffer2 ioctl and using it by default
when it exists since libdrm commit b50964027bef249a0cc3d511de05c2464e0a1e22
which landed Mar 2, 2010.  The i915 and i965 drivers in Mesa at the time
both used libdrm and so did the Intel X11 back-end.  The SNA back-end
for X11 has always used execbuffer2.

Signed-off-by: Jason Ekstrand 
---
.../gpu/drm/i915/gem/i915_gem_execbuffer.c| 100 --
drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 -
drivers/gpu/drm/i915/i915_drv.c   |   2 +-
3 files changed, 1 insertion(+), 103 deletions(-)


Don't we want to clean up references to legacy execbuffer in
include/uapi/drm/i915_drm.h too?


I thought about that but Daniel said we should leave them. Maybe a comment 
is in order?


--Jason



Re: [PATCH] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-11 Thread Alex Deucher
+dri-devel

Please be sure to cc dri-devel when you send out gpu scheduler patches.

On Thu, Mar 11, 2021 at 10:57 PM Jack Zhang  wrote:
>
> re-insert Bailing jobs to avoid memory leak.
>
> Signed-off-by: Jack Zhang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
>  drivers/gpu/drm/scheduler/sched_main.c | 8 +++-
>  include/drm/gpu_scheduler.h| 1 +
>  4 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 79b9cc73763f..86463b0f936e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> job ? job->base.id : -1);
>
> /* even we skipped this reset, still need to set the job to 
> guilty */
> -   if (job)
> +   if (job) {
> drm_sched_increase_karma(&job->base);
> +   r = DRM_GPU_SCHED_STAT_BAILING;
> +   }
> goto skip_recovery;
> }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 759b34799221..41390bdacd9e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
> drm_sched_job *s_job)
> struct amdgpu_job *job = to_amdgpu_job(s_job);
> struct amdgpu_task_info ti;
> struct amdgpu_device *adev = ring->adev;
> +   int ret;
>
> memset(&ti, 0, sizeof(struct amdgpu_task_info));
>
> @@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
> drm_sched_job *s_job)
>   ti.process_name, ti.tgid, ti.task_name, ti.pid);
>
> if (amdgpu_device_should_recover_gpu(ring->adev)) {
> -   amdgpu_device_gpu_recover(ring->adev, job);
> -   return DRM_GPU_SCHED_STAT_NOMINAL;
> +   ret = amdgpu_device_gpu_recover(ring->adev, job);
> +   if (ret == DRM_GPU_SCHED_STAT_BAILING)
> +   return DRM_GPU_SCHED_STAT_BAILING;
> +   else
> +   return DRM_GPU_SCHED_STAT_NOMINAL;
> } else {
> drm_sched_suspend_timeout(&ring->sched);
> if (amdgpu_sriov_vf(adev))
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 92d8de24d0a1..a44f621fb5c4 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>  {
> struct drm_gpu_scheduler *sched;
> struct drm_sched_job *job;
> +   int ret;
>
> sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>
> @@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
> list_del_init(&job->list);
> spin_unlock(&sched->job_list_lock);
>
> -   job->sched->ops->timedout_job(job);
> +   ret = job->sched->ops->timedout_job(job);
>
> +   if (ret == DRM_GPU_SCHED_STAT_BAILING) {
> +   spin_lock(&sched->job_list_lock);
> +   list_add(&job->node, &sched->ring_mirror_list);
> +   spin_unlock(&sched->job_list_lock);
> +   }
> /*
>  * Guilty job did complete and hence needs to be manually 
> removed
>  * See drm_sched_stop doc.
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 4ea8606d91fe..8093ac2427ef 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -210,6 +210,7 @@ enum drm_gpu_sched_stat {
> DRM_GPU_SCHED_STAT_NONE, /* Reserve 0 */
> DRM_GPU_SCHED_STAT_NOMINAL,
> DRM_GPU_SCHED_STAT_ENODEV,
> +   DRM_GPU_SCHED_STAT_BAILING,
>  };
>
>  /**
> --
> 2.25.1
>
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/3] drm/bridge: ti-sn65dsi86: Properly get the EDID, but only if refclk

2021-03-11 Thread Bjorn Andersson
On Thu 04 Mar 17:52 CST 2021, Douglas Anderson wrote:

> In commit 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over
> DDC") we attempted to make the ti-sn65dsi86 bridge properly read the
> EDID from the panel. That commit kinda worked but it had some serious
> problems.
> 
> The problems all stem from the fact that userspace wants to be able to
> read the EDID before it explicitly enables the panel. For eDP panels,
> though, we don't actually power the panel up until the pre-enable
> stage and the pre-enable call happens right before the enable call
> with no way to interject in-between. For eDP panels, you can't read
> the EDID until you power the panel. The result was that
> ti_sn_bridge_connector_get_modes() was always failing to read the EDID
> (falling back to what drm_panel_get_modes() returned) until _after_
> the EDID was needed.
> 
> To make it concrete, on my system I saw this happen:
> 1. We'd attach the bridge.
> 2. Userspace would ask for the EDID (several times). We'd try but fail
>to read the EDID over and over again and fall back to the hardcoded
>modes.
> 3. Userspace would decide on a mode based only on the hardcoded modes.
> 4. Userspace would ask to turn the panel on.
> 5. Userspace would (eventually) check the modes again (in Chrome OS
>this happens on the handoff from the boot splash screen to the
>browser). Now we'd read them properly and, if they were different,
>userspace would request to change the mode.
> 
> The fact that userspace would always end up using the hardcoded modes
> at first significantly decreases the benefit of the EDID
> reading. Also: if the modes were even a tiny bit different we'd end up
> doing a wasteful modeset and at boot.
> 
> As a side note: at least early EDID read failures were relatively
> fast. Though the old ti_sn_bridge_connector_get_modes() did call
> pm_runtime_get_sync() it didn't program the important "HPD_DISABLE"
> bit. That meant that all the AUX transfers failed pretty quickly.
> 
> In any case, enough about the problem. How are we fixing it? Obviously
> we need to power the panel on _before_ reading the EDID, but how? It
> turns out that there's really no problem with just doing all the work
> of our pre_enable() function right at attach time and reading the EDID
> right away. We'll do that. It's not as easy as it sounds, though,
> because:
> 
> 1. Powering the panel up and down is a pretty expensive operation. Not
>only do we need to wait for the HPD line which seems to take up to
>200 ms on most panels, but also most panels say that once you power
>them off you need to wait at least 500 ms before powering them on
>again. We really don't want to incur 700 ms of time here.
> 
> 2. If we happen not to have a fixed "refclk" we've got a
>chicken-and-egg problem. We seem to need the clock setup to read
>the EDID. Without a fixed "refclk", though, the bridge runs with
>the MIPI pixel clock which means you've got to use a hardcoded mode
>for the MIPI pixel clock.
> 
> We'll solve problem #1 above by leaving the panel powered on for a
> little while after we read the EDID. If enough time passes and nobody
> turns the panel on then we'll undo our work. NOTE: there are no
> functional problems if someone turns the panel on after a long delay,
> it just might take a little longer to turn on.
> 
> We'll solve problem #2 by simply _always_ using a hardcoded mode (not
> reading the EDID) if a "refclk" wasn't provided. While it might be
> possible to fudge something together to support this, it's my belief
> that nobody is using this mode in real life since it's really
> inflexible. I saw it used for some really early prototype hardware
> that was thrown in the e-waste bin years ago when we realized how
> inflexible it was. In any case, if someone is using this they're in no
> worse shape than they were before the (fairly recent) commit
> 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC").
> 
> NOTE: while this patch feels a bit hackish, I'm not sure there's much
> we can do better without a more fundamental DRM API change. After
> looking at it a bunch, it also doesn't feel as hacky to me as I first
> thought. The things that pre-enable does are well defined and well
> understood and there should be no problems with doing them early nor
> with doing them before userspace requests anything.
> 
> Fixes: 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC")
> Signed-off-by: Douglas Anderson 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
> 
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 98 ---
>  1 file changed, 88 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 491c9c4f32d1..af3fb4657af6 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> 

Re: [PULL] drm-intel-fixes

2021-03-11 Thread Rodrigo Vivi
On Fri, Mar 12, 2021 at 11:36:51AM +1000, Dave Airlie wrote:
> On Thu, 11 Mar 2021 at 21:28, Rodrigo Vivi  wrote:
> >
> > Hi Dave and Daniel,
> >
> > Things are very quiet. Only 1 fix this round.
> > Since I will be out next week, if this trend continues I will
> > accumulate 2 weeks and send when in -rc4.
> >
> > Here goes drm-intel-fixes-2021-03-11:
> >
> > - Wedge the GPU if command parser setup fails (Tvrtko)
> >
> > Thanks,
> > Rodrigo.
> >
> > The following changes since commit fe07bfda2fb9cdef8a4d4008a409bb02f35f1bd8:
> >
> >   Linux 5.12-rc1 (2021-02-28 16:05:19 -0800)
> 
> This was based on 5.12-rc1 against my request earlier in the week to
> not do that. but since it was a single patch I just cherry-picked it
> across.

I'm really sorry about that! It is so unusual to have such a low influx
of patches this round that I'm not used to it and ended up forgetting to rebase.

> 
> Can we make sure no fixes or next based on rc1 arrive please.

Sure thing.

> 
> Dave.


Re: [PATCH 2/3] drm/bridge: ti-sn65dsi86: Move code in prep for EDID read fix

2021-03-11 Thread Bjorn Andersson
On Thu 04 Mar 17:52 CST 2021, Douglas Anderson wrote:

> This patch is _only_ code motion to prepare for the patch
> ("drm/bridge: ti-sn65dsi86: Properly get the EDID, but only if
> refclk") and make it easier to understand.
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Douglas Anderson 
> ---
> 
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 196 +-
>  1 file changed, 98 insertions(+), 98 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 942019842ff4..491c9c4f32d1 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -345,6 +345,104 @@ static int ti_sn_bridge_parse_regulators(struct 
> ti_sn_bridge *pdata)
>  pdata->supplies);
>  }
>  
> +static u32 ti_sn_bridge_get_dsi_freq(struct ti_sn_bridge *pdata)
> +{
> + u32 bit_rate_khz, clk_freq_khz;
> + struct drm_display_mode *mode =
> + &pdata->bridge.encoder->crtc->state->adjusted_mode;
> +
> + bit_rate_khz = mode->clock *
> + mipi_dsi_pixel_format_to_bpp(pdata->dsi->format);
> + clk_freq_khz = bit_rate_khz / (pdata->dsi->lanes * 2);
> +
> + return clk_freq_khz;
> +}
> +
> +/* clk frequencies supported by bridge in Hz in case derived from REFCLK pin 
> */
> +static const u32 ti_sn_bridge_refclk_lut[] = {
> + 12000000,
> + 19200000,
> + 26000000,
> + 27000000,
> + 38400000,
> +};
> +
> +/* clk frequencies supported by bridge in Hz in case derived from DACP/N pin 
> */
> +static const u32 ti_sn_bridge_dsiclk_lut[] = {
> + 468000000,
> + 384000000,
> + 416000000,
> + 486000000,
> + 460800000,
> +};
> +
> +static void ti_sn_bridge_set_refclk_freq(struct ti_sn_bridge *pdata)
> +{
> + int i;
> + u32 refclk_rate;
> + const u32 *refclk_lut;
> + size_t refclk_lut_size;
> +
> + if (pdata->refclk) {
> + refclk_rate = clk_get_rate(pdata->refclk);
> + refclk_lut = ti_sn_bridge_refclk_lut;
> + refclk_lut_size = ARRAY_SIZE(ti_sn_bridge_refclk_lut);
> + clk_prepare_enable(pdata->refclk);
> + } else {
> + refclk_rate = ti_sn_bridge_get_dsi_freq(pdata) * 1000;
> + refclk_lut = ti_sn_bridge_dsiclk_lut;
> + refclk_lut_size = ARRAY_SIZE(ti_sn_bridge_dsiclk_lut);
> + }
> +
> + /* for i equals to refclk_lut_size means default frequency */
> + for (i = 0; i < refclk_lut_size; i++)
> + if (refclk_lut[i] == refclk_rate)
> + break;
> +
> + regmap_update_bits(pdata->regmap, SN_DPPLL_SRC_REG, REFCLK_FREQ_MASK,
> +REFCLK_FREQ(i));
> +}
> +
> +static void ti_sn_bridge_post_disable(struct drm_bridge *bridge)
> +{
> + struct ti_sn_bridge *pdata = bridge_to_ti_sn_bridge(bridge);
> +
> + clk_disable_unprepare(pdata->refclk);
> +
> + pm_runtime_put_sync(pdata->dev);
> +}
> +
> +static void ti_sn_bridge_pre_enable(struct drm_bridge *bridge)
> +{
> + struct ti_sn_bridge *pdata = bridge_to_ti_sn_bridge(bridge);
> +
> + pm_runtime_get_sync(pdata->dev);
> +
> + /* configure bridge ref_clk */
> + ti_sn_bridge_set_refclk_freq(pdata);
> +
> + /*
> +  * HPD on this bridge chip is a bit useless.  This is an eDP bridge
> +  * so the HPD is an internal signal that's only there to signal that
> +  * the panel is done powering up.  ...but the bridge chip debounces
> +  * this signal by between 100 ms and 400 ms (depending on process,
> +  * voltage, and temperature--I measured it at about 200 ms).  One
> +  * particular panel asserted HPD 84 ms after it was powered on meaning
> +  * that we saw HPD 284 ms after power on.  ...but the same panel said
> +  * that instead of looking at HPD you could just hardcode a delay of
> +  * 200 ms.  We'll assume that the panel driver will have the hardcoded
> +  * delay in its prepare and always disable HPD.
> +  *
> +  * If HPD somehow makes sense on some future panel we'll have to
> +  * change this to be conditional on someone specifying that HPD should
> +  * be used.
> +  */
> + regmap_update_bits(pdata->regmap, SN_HPD_DISABLE_REG, HPD_DISABLE,
> +HPD_DISABLE);
> +
> + drm_panel_prepare(pdata->panel);
> +}
> +
>  static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>  enum drm_bridge_attach_flags flags)
>  {
> @@ -443,64 +541,6 @@ static void ti_sn_bridge_disable(struct drm_bridge 
> *bridge)
>   drm_panel_unprepare(pdata->panel);
>  }
>  
> -static u32 ti_sn_bridge_get_dsi_freq(struct ti_sn_bridge *pdata)
> -{
> - u32 bit_rate_khz, clk_freq_khz;
> - struct drm_display_mode *mode =
> - &pdata->bridge.encoder->crtc->state->adjusted_mode;
> -
> - bit_rate_khz = mode->clock *
> - 

Re: [PATCH 1/3] drm/bridge: ti-sn65dsi86: Simplify refclk handling

2021-03-11 Thread Bjorn Andersson
On Thu 04 Mar 17:51 CST 2021, Douglas Anderson wrote:

> The clock framework makes it simple to deal with an optional clock.
> You can call clk_get_optional() and if the clock isn't specified it'll
> just return NULL without complaint. It's valid to pass NULL to
> enable/disable/prepare/unprepare. Let's make use of this to simplify
> things a tiny bit.
> 
> NOTE: this makes things look a tad bit asymmetric now since we check
> for NULL before clk_prepare_enable() but not for
> clk_disable_unprepare(). This seemed OK to me. We already have to
> check for NULL in the enable case anyway so why not avoid the extra
> call?
> 

Reviewed-by: Bjorn Andersson 

> Signed-off-by: Douglas Anderson 
> ---
> 
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index f27306c51e4d..942019842ff4 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -1261,14 +1261,9 @@ static int ti_sn_bridge_probe(struct i2c_client 
> *client,
>   return ret;
>   }
>  
> - pdata->refclk = devm_clk_get(pdata->dev, "refclk");
> - if (IS_ERR(pdata->refclk)) {
> - ret = PTR_ERR(pdata->refclk);
> - if (ret == -EPROBE_DEFER)
> - return ret;
> - DRM_DEBUG_KMS("refclk not found\n");
> - pdata->refclk = NULL;
> - }
> + pdata->refclk = devm_clk_get_optional(pdata->dev, "refclk");
> + if (IS_ERR(pdata->refclk))
> + return PTR_ERR(pdata->refclk);
>  
>   ret = ti_sn_bridge_parse_dsi_host(pdata);
>   if (ret)
> -- 
> 2.30.1.766.gb4fecdf3b7-goog
> 


Re: [PATCH] i915: Drop legacy execbuffer support

2021-03-11 Thread Dixit, Ashutosh
On Wed, 10 Mar 2021 13:00:49 -0800, Jason Ekstrand wrote:
>
> libdrm has supported the newer execbuffer2 ioctl and using it by default
> when it exists since libdrm commit b50964027bef249a0cc3d511de05c2464e0a1e22
> which landed Mar 2, 2010.  The i915 and i965 drivers in Mesa at the time
> both used libdrm and so did the Intel X11 back-end.  The SNA back-end
> for X11 has always used execbuffer2.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 100 --
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 -
>  drivers/gpu/drm/i915/i915_drv.c   |   2 +-
>  3 files changed, 1 insertion(+), 103 deletions(-)

Don't we want to clean up references to legacy execbuffer in
include/uapi/drm/i915_drm.h too?


Re: [git pull] drm fixes for 5.12-rc3

2021-03-11 Thread pr-tracker-bot
The pull request you sent on Fri, 12 Mar 2021 11:35:33 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-03-12-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f78d76e72a4671ea52d12752d92077788b4f5d50

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PULL] drm-intel-fixes

2021-03-11 Thread Dave Airlie
On Thu, 11 Mar 2021 at 21:28, Rodrigo Vivi  wrote:
>
> Hi Dave and Daniel,
>
> Things are very quiet. Only 1 fix this round.
> Since I will be out next week, if this trend continues I will
> accumulate 2 weeks and send when in -rc4.
>
> Here goes drm-intel-fixes-2021-03-11:
>
> - Wedge the GPU if command parser setup fails (Tvrtko)
>
> Thanks,
> Rodrigo.
>
> The following changes since commit fe07bfda2fb9cdef8a4d4008a409bb02f35f1bd8:
>
>   Linux 5.12-rc1 (2021-02-28 16:05:19 -0800)

This was based on 5.12-rc1 against my request earlier in the week to
not do that, but since it was a single patch I just cherry-picked it
across.

Can we make sure no fixes or next based on rc1 arrive please.

Dave.


[git pull] drm fixes for 5.12-rc3

2021-03-11 Thread Dave Airlie
Hi Linus,

Regular fixes for rc3. The i915 pull was based on the rc1 tag so I
just cherry-picked the single fix from there to avoid it. The misc and
amd trees seem to be on okay bases.

It's a bunch of fixes across the tree, amdgpu has most of them a few
ttm fixes around qxl, and nouveau.

Dave.

drm-fixes-2021-03-12-1:
drm fixes for 5.12-rc3

core:
- Clear holes when converting compat ioctl's between 32-bits and 64-bits.

docs:
- Use gitlab for drm bugzilla now.

ttm:
- Fix ttm page pool accounting.

fbdev:
- Fix oops in drm_fbdev_cleanup()

shmem:
- Assorted fixes for shmem helpers.

qxl:
- unpin qxl bos created as pinned when freeing them,
  and make ttm only warn once on this behavior.
- Zero head.surface_id correctly in qxl.

atyfb:
- Use LCD management for atyfb on PPC_MAC.

meson:
- Shutdown kms poll helper in meson correctly.

nouveau:
- fix regression in bo syncing

i915:
- Wedge the GPU if command parser setup fails

amdgpu:
- Fix aux backlight control
- Add a backlight override parameter
- Various display fixes
- PCIe DPM fix for vega
- Polaris watermark fixes
- Additional S0ix fix

radeon:
- Fix GEM regression
- Fix AGP dependency handling

The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:

  Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-03-12-1

for you to fetch changes up to 4042160c2e5433e0759782c402292a90b5bf458d:

  drm/nouveau: fix dma syncing for loops (v2) (2021-03-12 11:21:47 +1000)



Alex Deucher (4):
  drm/amdgpu/display: simplify backlight setting
  drm/amdgpu/display: don't assert in set backlight function
  drm/amdgpu/display: handle aux backlight in backlight_get_brightness
  drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m

Anthony DeRossi (1):
  drm/ttm: Fix TTM page pool accounting

Artem Lapkin (1):
  drm: meson_drv add shutdown function

Christian König (3):
  drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table
  drm/radeon: fix AGP dependency
  drm/ttm: soften TTM warnings

Colin Ian King (1):
  qxl: Fix uninitialised struct field head.surface_id

Daniel Vetter (1):
  drm/compat: Clear bounce structures

Dave Airlie (3):
  Merge tag 'drm-misc-fixes-2021-03-11' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
  Merge tag 'amd-drm-fixes-5.12-2021-03-10' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  drm/nouveau: fix dma syncing for loops (v2)

Dillon Varone (1):
  drm/amd/display: Enabled pipe harvesting in dcn30

Evan Quan (1):
  drm/amd/pm: correct the watermark settings for Polaris

Gerd Hoffmann (2):
  drm/qxl: unpin release objects
  drm/qxl: fix lockdep issue in qxl_alloc_release_reserved

Holger Hoffstätte (2):
  drm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth()
  drm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp()

Kenneth Feng (1):
  drm/amd/pm: bug fix for pcie dpm

Neil Roberts (2):
  drm/shmem-helper: Check for purged buffers in fault handler
  drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

Nirmoy Das (1):
  drm/amdgpu: fb BO should be ttm_bo_type_device

Noralf Trønnes (1):
  drm/shmem-helpers: vunmap: Don't put pages for dma-buf

Pavel Turinský (1):
  MAINTAINERS: update drm bug reporting URL

Qingqing Zhuo (1):
  drm/amd/display: Enable pflip interrupt upon pipe enable

Randy Dunlap (2):
  fbdev: atyfb: always declare aty_{ld,st}_lcd()
  fbdev: atyfb: use LCD management functions for PPC_PMAC also

Sung Lee (1):
  drm/amd/display: Revert dram_clock_change_latency for DCN2.1

Takashi Iwai (1):
  drm/amd/display: Add a backlight module option

Thomas Zimmermann (1):
  drm: Use USB controller's DMA mask when importing dmabufs

Tong Zhang (1):
  drm/fb-helper: only unmap if buffer not null

Tvrtko Ursulin (1):
  drm/i915: Wedge the GPU if command parser setup fails

Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Ilia Mirkin
On Thu, Mar 11, 2021 at 5:58 PM Peter Stuge  wrote:
>
> Ilia Mirkin wrote:
> > > > #define DRM_FORMAT_XRGB8888   fourcc_code('X', 'R', '2', '4') /* [31:0]
> > > > x:R:G:B 8:8:8:8 little endian */
> > >
> > > Okay, "[31:0] x:R:G:B 8:8:8:8" can certainly mean
> > > [31:24]=x [23:16]=R [15:8]=G [7:0]=B, which when stored "little endian"
> > > becomes B G R X in memory, for which your pix32 code is correct.
> > >
> > > That's the reverse *memory* layout of what the name says :)
> >
> > The definition of the formats is memory layout in little endian.
>
> To clarify, my new (hopefully correct?) understanding is this:
>
> XRGB8888 does *not* mean that address 0=X, 1=R, 2=G, 3=B, but that
> the most significant byte in a packed XRGB8888 32-bit integer is X
> and the least significant byte is B, and that this is the case both
> on LE and BE machines.

Not quite.

XRGB8888 means that the memory layout should match a 32-bit integer,
stored as LE, with the low bits being B, next bits being G, etc. This
translates to byte 0 = B, byte 1 = G, etc. If you're on a BE system,
and you're handed an XRGB8888 buffer, it still expects that byte 0 = B,
etc (except as I outlined, some drivers which are from before these
formats were a thing, sort of do their own thing). Thankfully this is
equivalent to BGRX8888 (big-endian packing), so you can just munge the
format. Not so with e.g. RGB565 though (since the components don't
fall on byte boundaries).

> I previously thought that XRGB indicated the memory byte order of
> components being X R G B regardless of machine endianess, but now
> understand XRGB to mean the MSB..LSB order of component bytes within
> the 32-bit integer, as seen by software, not the order of bytes in memory.

There are about 100 conventions, and they all manage to be different
from each other. Packed vs array. BE vs LE. If you're *not* confused,
that should be a red flag.

[...]

> > I'm not sure why you guys were talking about BE in the first place,
>
> I was worried that the translation didn't consider endianess.

The translation in gud_xrgb8888_to_color definitely seems suspect.
There's also a gud_is_big_endian, but I'm guessing this applies to the
downstream device rather than the host system. I didn't check if
dev->mode_config.quirk_addfb_prefer_host_byte_order is set -- that
setting dictates whether these formats are in host-byte-order (and
AddFB2 is disabled, so buffers can only be specified with depth/bpp
and ambiguous component orders) or in LE byte order (and userspace can
use AddFB2 which allows precise formats for these buffers). Not
100% sure what something like Xorg's modesetting driver does, TBH.
This is a very poorly-tested scenario.

  -ilia
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: vmwgfx leaking bo pins?

2021-03-11 Thread Zack Rusin


> On Mar 11, 2021, at 17:35, Thomas Hellström (Intel)  
> wrote:
> 
> Hi, Zack
> 
> On 3/11/21 10:07 PM, Zack Rusin wrote:
>>> On Mar 11, 2021, at 05:46, Thomas Hellström (Intel) 
>>>  wrote:
>>> 
>>> Hi,
>>> 
>>> I tried latest drm-fixes today and saw a lot of these: Fallout from ttm 
>>> rework?
>> Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in 
>> drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 
>> 5.12.
>> 
>> z
>> 
> Hmm, yes but doesn't that fix trip the ttm_bo_unpin() 
> dma_resv_assert_held(bo->base.resv)?

No, doesn’t seem to. TBH I’m not sure why myself, but it seems to be working 
fine.

> Taking the reservation to unpin at TTM bo free has always been awkward and 
> that's why vmwgfx and I guess other TTM drivers have been sloppy doing that 
> as TTM never cared. Perhaps TTM could change the pin_count to an atomic and 
> allow unlocked unpinning? still requiring the reservation lock for pin_count 
> transition 0->1, though.

Yea, that’d probably make sense. I think in general just making sure the 
requirements are consistent and well documented would be great.

> Also, pinning at bo creation in vmwgfx has been to do the equivalent of 
> ttm_bo_init_reserved() (which api was added later). Creating pinned would 
> make the object isolated and allowing the reserve trylock that followed to 
> always succeed. With the introduction of the TTM pin_count, it seems 
> ttm_bo_init_reserved() is used to enable pinned creation which is used to 
> emulate ttm_bo_init_reserved() :)

Yea, we should probably port the vmwgfx code to ttm_bo_init_reserved just to
match the newly established semantics.
z


Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Peter Stuge
Ilia Mirkin wrote:
> > > #define DRM_FORMAT_XRGB8888   fourcc_code('X', 'R', '2', '4') /* [31:0]
> > > x:R:G:B 8:8:8:8 little endian */
> >
> > Okay, "[31:0] x:R:G:B 8:8:8:8" can certainly mean
> > [31:24]=x [23:16]=R [15:8]=G [7:0]=B, which when stored "little endian"
> > becomes B G R X in memory, for which your pix32 code is correct.
> >
> > That's the reverse *memory* layout of what the name says :)
> 
> The definition of the formats is memory layout in little endian.

To clarify, my new (hopefully correct?) understanding is this:

XRGB8888 does *not* mean that address 0=X, 1=R, 2=G, 3=B, but that
the most significant byte in a packed XRGB8888 32-bit integer is X
and the least significant byte is B, and that this is the case both
on LE and BE machines.

I previously thought that XRGB8888 indicated the memory byte order of
components being X R G B regardless of machine endianness, but now
understand XRGB8888 to mean the MSB..LSB order of component bytes within
the 32-bit integer, as seen by software, not the order of bytes in memory.


> The definition you see is of a 32-bit packed little-endian integer,
> which is a fixed memory layout.

In the header definition I'm not completely sure what the "little endian"
means - I guess it refers to how the 32-bit integer will be stored in memory,
but it could also refer to the order of component packing within.

Noralf's code and testing, and also what fbset tells me, seem to support
this understanding, at least on LE machines.


> Now, if you're on an actual big-endian platform, and you want to
> accept big-endian-packed formats, there's a bit of unpleasantness that
> goes on.

In the particular case of XRGB8888 that Noralf has implemented and
I've tested, every pixel is translated "manually" anyway; each
component byte is downconverted to a single bit, but this use case
is mostly for smaller resolutions, so not too big a deal.


> I'm not sure why you guys were talking about BE in the first place,

I was worried that the translation didn't consider endianess.

Noralf, looking at the 3/3 patch again now, drm_fb_swab() gets called
on BE when format == fb->format, but does it also need to be called
on BE when they are different, or will circumstances be such that it's
never necessary then?


Thanks and sorry if I'm confusing things needlessly

//Peter


Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-11 Thread Alyssa Rosenzweig
> I'm not familiar with panfrost's needs and I don't work on a tiler and
> I know there are different issues there.  But...

The primary issue is we submit vertex+compute and fragment for each
batch as two disjoint jobs (with a dependency of course), reflecting the
internal hardware structure as parallel job slots. That we actually
require two ioctls() and a roundtrip for this is a design wart inherited
from early days of the kernel driver. The downstream Mali driver handles
this by allowing multiple jobs to be submitted with a single ioctl, as
Boris's patch enables. In every other respect I believe our needs are
similar to other renderonly drivers. (What does turnip do?)


Re: vmwgfx leaking bo pins?

2021-03-11 Thread Intel

Hi, Zack

On 3/11/21 10:07 PM, Zack Rusin wrote:

On Mar 11, 2021, at 05:46, Thomas Hellström (Intel)  
wrote:

Hi,

I tried latest drm-fixes today and saw a lot of these: Fallout from ttm rework?

Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in 
drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12.

z

Hmm, yes but doesn't that fix trip the ttm_bo_unpin() 
dma_resv_assert_held(bo->base.resv)?


Taking the reservation to unpin at TTM bo free has always been awkward 
and that's why vmwgfx and I guess other TTM drivers have been sloppy 
doing that as TTM never cared. Perhaps TTM could change the pin_count to 
an atomic and allow unlocked unpinning? still requiring the reservation 
lock for pin_count transition 0->1, though.


Also, pinning at bo creation in vmwgfx has been to do the equivalent of 
ttm_bo_init_reserved() (which api was added later). Creating pinned 
would make the object isolated and allowing the reserve trylock that 
followed to always succeed. With the introduction of the TTM pin_count, 
it seems ttm_bo_init_reserved() is used to enable pinned creation which 
is used to emulate ttm_bo_init_reserved() :)


/Thomas





Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Ilia Mirkin
On Thu, Mar 11, 2021 at 3:02 PM Peter Stuge  wrote:
> > > Hence the question: What does DRM promise about the XRGB8888 mode?
> >
> > That it's a 32-bit value. From include/uapi/drm/drm_fourcc.h:
> >
> > /* 32 bpp RGB */
> > #define DRM_FORMAT_XRGB8888   fourcc_code('X', 'R', '2', '4') /* [31:0]
> > x:R:G:B 8:8:8:8 little endian */
>
> Okay, "[31:0] x:R:G:B 8:8:8:8" can certainly mean
> [31:24]=x [23:16]=R [15:8]=G [7:0]=B, which when stored "little endian"
> becomes B G R X in memory, for which your pix32 code is correct.
>
> That's the reverse *memory* layout of what the name says :) but yes,
> the name then matches the representation seen by software. That's the
> "abstracted" case that I didn't expect, because I thought the name was
> refering to memory layout and because I was thinking about how traditional
> graphics adapter video memory has the R component at the lower
> address, at least in early linear modes.

The definition of the formats is memory layout in little endian. The
definition you see is of a 32-bit packed little-endian integer, which
is a fixed memory layout.

Now, if you're on an actual big-endian platform, and you want to
accept big-endian-packed formats, there's a bit of unpleasantness that
goes on. Basically there are two options:

1. Ignore the above definition and interpret the formats as
*big-endian* layouts. This is what nouveau and radeon do. They also
don't support AddFB2 (which is what allows supplying a format) -- only
AddFB which just has depth (and bpp). That's fine for nouveau and
radeon because the relevant userspace just uses AddFB, and knows what
the drivers want, so it all works out.

2. Comply with the above definition and set
dev->mode_config.quirk_addfb_prefer_host_byte_order to false. This
loses you native host packing of RGB565/etc, since they're just not
defined as formats. There's a DRM_FORMAT_BIG_ENDIAN bit but it's not
properly supported for anything but the 8888 formats.

I'm not sure why you guys were talking about BE in the first place,
but since this is a topic I've looked into (in the context of moving
nouveau from 1 to 2 - but that can't happen due to the reduced format
availability), figured I'd share some of the current sad state.

Cheers,

  -ilia


Re: vmwgfx leaking bo pins?

2021-03-11 Thread Zack Rusin

> On Mar 11, 2021, at 05:46, Thomas Hellström (Intel)  
> wrote:
> 
> Hi,
> 
> I tried latest drm-fixes today and saw a lot of these: Fallout from ttm 
> rework?

Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in 
drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12.

z



Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Peter Stuge
Noralf Trønnes wrote:
> > Endianness matters because parts of pix32 are used.
> 
> This code:
..
> prints:
> 
> xrgb8888=aabbccdd
> 32-bit access:
> r=bb
> g=cc
> b=dd
> Byte access on LE:
> r=cc
> g=bb
> b=aa

As expected, and:

xrgb8888=aabbccdd
32-bit access:
r=bb
g=cc
b=dd
Byte access on BE:
r=bb
g=cc
b=dd

I've done similar tests in the past and did another before my last mail.

We agree about endian effects. Apologies if I came across as overbearing!


> > Hence the question: What does DRM promise about the XRGB8888 mode?
> 
> That it's a 32-bit value. From include/uapi/drm/drm_fourcc.h:
> 
> /* 32 bpp RGB */
> #define DRM_FORMAT_XRGB8888   fourcc_code('X', 'R', '2', '4') /* [31:0]
> x:R:G:B 8:8:8:8 little endian */

Okay, "[31:0] x:R:G:B 8:8:8:8" can certainly mean
[31:24]=x [23:16]=R [15:8]=G [7:0]=B, which when stored "little endian"
becomes B G R X in memory, for which your pix32 code is correct.

That's the reverse *memory* layout of what the name says :) but yes,
the name then matches the representation seen by software. That's the
"abstracted" case that I didn't expect, because I thought the name was
refering to memory layout and because I was thinking about how traditional
graphics adapter video memory has the R component at the lower
address, at least in early linear modes.

I also didn't pay attention to the fbset output:

rgba 8/16,8/8,8/0,0/0


With drm format describing software pixel representation and per the
fbset rgba description my test file was incorrect. I've recreated it
with B G R X bytes and it shows correctly with your pix32 code.

Sending data directly to the device without the gud driver uses
different data, so isn't actually a fair comparison, but I didn't
change the device at all now, and that still works.


> If a raw buffer was passed from a BE to an LE machine, there would be
> problems because of how the value is stored,

And swab would be required on a LE machine with a graphics adapter in
a mode with X R G B memory layout, or that system would just never
present XRGB8888 for that adapter/mode but perhaps something called
BGRX8888 instead? I see.


> but here it's the same endianness in userspace and kernel space.

Ack.


> There is code in gud_prep_flush() that handles a BE host with a
> multibyte format:
> 
>   } else if (gud_is_big_endian() && format->cpp[0] > 1) {
>   drm_fb_swab(buf, vaddr, fb, rect, !import_attach);
> 
> In this case we can't just pass on the raw buffer to the device since
> the protocol is LE, and thus have to swap the bytes to match up how
> they're stored in memory on the device.

Ack.


> I'm not losing any of the colors when running modetest. This is the
> test image that modetest uses and it comes through just like that:
> https://commons.wikimedia.org/wiki/File:SMPTE_Color_Bars.svg

So your destination rgb565 buffer has a [15:11]=R [10:5]=G [4:0]=B
pixel format, which stores as B+G G+R in memory, as opposed to R+G G+B.
All right.


Thanks a lot for clearing up my misunderstanding of drm format names
and my endianess concerns!


//Peter


Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk

2021-03-11 Thread Daniel Gomez
On Thu, 11 Mar 2021 at 17:10, Alex Deucher  wrote:
>
> On Thu, Mar 11, 2021 at 10:02 AM Alexandre Desnoyers  wrote:
> >
> > On Thu, Mar 11, 2021 at 2:49 PM Daniel Gomez  wrote:
> > >
> > > On Thu, 11 Mar 2021 at 10:09, Daniel Gomez  wrote:
> > > >
> > > > On Wed, 10 Mar 2021 at 18:06, Alex Deucher  
> > > > wrote:
> > > > >
> > > > > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez  wrote:
> > > > > >
> > > > > > Disabling GFXOFF via the quirk list fixes a hardware lockup in
> > > > > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
> > > > > >
> > > > > > Signed-off-by: Daniel Gomez 
> > > > > > ---
> > > > > >
> > > > > > This patch is a continuation of the work here:
> > > > > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was 
> > > > > > discussed and
> > > > > > a dma_fence deadlock was provoke as a side effect. To reproduce the 
> > > > > > issue
> > > > > > please refer to the above link.
> > > > > >
> > > > > > The hardware lockup was introduced in 5.6-rc1 for our particular 
> > > > > > revision as it
> > > > > > wasn't part of the new blacklist. Before that, in kernel v5.5, this 
> > > > > > hardware was
> > > > > > working fine without any hardware lock because the GFXOFF was 
> > > > > > actually disabled
> > > > > > by the if condition for the CHIP_RAVEN case. So this patch, adds 
> > > > > > the 'Radeon
> > > > > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to 
> > > > > > disable the GFXOFF.
> > > > > >
> > > > > > But besides the fix, I'd like to ask from where this revision comes 
> > > > > > from. Is it
> > > > > > an ASIC revision or is it hardcoded in the VBIOS from our vendor? 
> > > > > > From what I
> > > > > > can see, it comes from the ASIC and I wonder if somehow we can get 
> > > > > > an APU in the
> > > > > > future, 'not blacklisted', with the same problem. Then, should this 
> > > > > > table only
> > > > > > filter for the vendor and device and not the revision? Do you know 
> > > > > > if there are
> > > > > > any revisions for the 1002:15dd validated, tested and functional?
> > > > >
> > > > > The pci revision id (RID) is used to specify the specific SKU within a
> > > > > family.  GFXOFF is supposed to be working on all raven variants.  It
> > > > > was tested and functional on all reference platforms and any OEM
> > > > > platforms that launched with Linux support.  There are a lot of
> > > > > dependencies on sbios in the early raven variants (0x15dd), so it's
> > > > > likely more of a specific platform issue, but there is not a good way
> > > > > to detect this so we use the DID/SSID/RID as a proxy.  The newer raven
> > > > > variants (0x15d8) have much better GFXOFF support since they all
> > > > > shipped with newer firmware and sbios.
> > > >
> > > > We took one of the first reference platform boards to design our
> > > > custom board based on the V1605B and I assume it has one of the early 
> > > > 'unstable'
> > > > raven variants with RID 0x83. Also, as OEM we are in control of the bios
> > > > (provided by insyde) but I wasn't sure about the RID so, thanks for the
> > > > clarification. Is there anything we can do with the bios to have the 
> > > > GFXOFF
> > > > enabled and 'stable' for this particular revision? Otherwise we'd need 
> > > > to add
> > > > the 0x83 RID to the table. Also, there is an extra ']' in the patch
> > > > subject. Sorry
> > > > for that. Would you need a new patch in case you accept it with the ']' 
> > > > removed?
> > > >
> > > > Good to hear that the newer raven versions have better GFXOFF support.
> > >
> > > Adding Alex Desnoyer to the loop as he is the electronic/hardware and
> > > bios responsible so, he can
> > > provide more information about this.
> >
> > Hello everyone,
> >
> > We, Qtechnology, are the OEM of the hardware platform where we
> > originally discovered the bug.  Our platform is based on the AMD
> > Dibbler V-1000 reference design, with the latest Insyde BIOS release
> > available for the (now unsupported) Dibbler platform.  We have the
> > Insyde BIOS source code internally, so we can make some modifications
> > as needed.
> >
> > The last test that Daniel and myself performed was on a standard
> > Dibbler PCB rev.B1 motherboard (NOT our platform), and using the
> > corresponding latest AMD released BIOS "RDB1109GA".  As Daniel wrote,
> > the hardware lockup can be reproduced on the Dibbler, even if it has a
> > different RID that our V1605B APU.
> >
> > We also have a Neousys Technology POC-515 embedded computer (V-1000,
> > V1605B) in our office.  The Neousys PC also uses Insyde BIOS.  This
> > computer is also locking-up in the test.
> > https://www.neousys-tech.com/en/product/application/rugged-embedded/poc-500-amd-ryzen-ultra-compact-embedded-computer
> >
> >
> > Digging into the BIOS source code, the only reference to GFXOFF is in
> > the SMU and PSP firmware release notes, where some bug fixes have been
> > mentioned for previous SMU/PSP releases.  After a quick "git grep -i
> > 

Re: [PATCH v2 07/14] spi: spi-geni-qcom: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
11.03.2021 22:44, Mark Brown wrote:
> On Thu, Mar 11, 2021 at 10:20:58PM +0300, Dmitry Osipenko wrote:
> 
>> Acked-by: Mark brown 
> 
> Typo there.
> 

Good catch! Although, that should be a patchwork fault since it
auto-added acks when I downloaded v1 patches and I haven't changed them.
I'll fix it in v3 or, if there won't be anything else to improve, then
maybe Viresh could fix it up while applying patches.


Re: [PATCH v2 07/14] spi: spi-geni-qcom: Convert to use resource-managed OPP API

2021-03-11 Thread Mark Brown
On Thu, Mar 11, 2021 at 10:20:58PM +0300, Dmitry Osipenko wrote:

> Acked-by: Mark brown 

Typo there.




[PATCH v2 14/14] memory: samsung: exynos5422-dmc: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed API to simplify code.

Signed-off-by: Yangtao Li 
Reviewed-by: Krzysztof Kozlowski 
Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/samsung/exynos5422-dmc.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/memory/samsung/exynos5422-dmc.c 
b/drivers/memory/samsung/exynos5422-dmc.c
index 1dabb509dec3..56f6e65d40cd 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -343,7 +343,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
int idx;
unsigned long freq;
 
-   ret = dev_pm_opp_of_add_table(dmc->dev);
+   ret = devm_pm_opp_of_add_table(dmc->dev);
if (ret < 0) {
dev_err(dmc->dev, "Failed to get OPP table\n");
return ret;
@@ -354,7 +354,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
dmc->opp = devm_kmalloc_array(dmc->dev, dmc->opp_count,
  sizeof(struct dmc_opp_table), GFP_KERNEL);
if (!dmc->opp)
-   goto err_opp;
+   return -ENOMEM;
 
idx = dmc->opp_count - 1;
for (i = 0, freq = ULONG_MAX; i < dmc->opp_count; i++, freq--) {
@@ -362,7 +362,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
 
opp = dev_pm_opp_find_freq_floor(dmc->dev, &freq);
if (IS_ERR(opp))
-   goto err_opp;
+   return PTR_ERR(opp);
 
dmc->opp[idx - i].freq_hz = freq;
dmc->opp[idx - i].volt_uv = dev_pm_opp_get_voltage(opp);
@@ -371,11 +371,6 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
}
 
return 0;
-
-err_opp:
-   dev_pm_opp_of_remove_table(dmc->dev);
-
-   return -EINVAL;
 }
 
 /**
@@ -1567,8 +1562,6 @@ static int exynos5_dmc_remove(struct platform_device 
*pdev)
clk_disable_unprepare(dmc->mout_bpll);
clk_disable_unprepare(dmc->fout_bpll);
 
-   dev_pm_opp_remove_table(dmc->dev);
-
return 0;
 }
 
-- 
2.29.2



[PATCH v2 11/14] drm/lima: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/lima/lima_devfreq.c | 43 -
 drivers/gpu/drm/lima/lima_devfreq.h |  2 --
 2 files changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
b/drivers/gpu/drm/lima/lima_devfreq.c
index 5686ad4aaf7c..e3ccaf269892 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -99,13 +99,6 @@ void lima_devfreq_fini(struct lima_device *ldev)
devm_devfreq_remove_device(ldev->dev, devfreq->devfreq);
devfreq->devfreq = NULL;
}
-
-   dev_pm_opp_of_remove_table(ldev->dev);
-
-   dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
-   dev_pm_opp_put_clkname(devfreq->clkname_opp_table);
-   devfreq->regulators_opp_table = NULL;
-   devfreq->clkname_opp_table = NULL;
 }
 
 int lima_devfreq_init(struct lima_device *ldev)
@@ -125,40 +118,31 @@ int lima_devfreq_init(struct lima_device *ldev)
 
spin_lock_init(&ldevfreq->lock);
 
-   opp_table = dev_pm_opp_set_clkname(dev, "core");
-   if (IS_ERR(opp_table)) {
-   ret = PTR_ERR(opp_table);
-   goto err_fini;
-   }
-
-   ldevfreq->clkname_opp_table = opp_table;
+   opp_table = devm_pm_opp_set_clkname(dev, "core");
+   if (IS_ERR(opp_table))
+   return PTR_ERR(opp_table);
 
-   opp_table = dev_pm_opp_set_regulators(dev,
- (const char *[]){ "mali" },
- 1);
+   opp_table = devm_pm_opp_set_regulators(dev, (const char *[]){ "mali" },
+  1);
if (IS_ERR(opp_table)) {
ret = PTR_ERR(opp_table);
 
/* Continue if the optional regulator is missing */
if (ret != -ENODEV)
-   goto err_fini;
-   } else {
-   ldevfreq->regulators_opp_table = opp_table;
+   return ret;
}
 
-   ret = dev_pm_opp_of_add_table(dev);
+   ret = devm_pm_opp_of_add_table(dev);
if (ret)
-   goto err_fini;
+   return ret;
 
lima_devfreq_reset(ldevfreq);
 
cur_freq = clk_get_rate(ldev->clk_gpu);
 
opp = devfreq_recommended_opp(dev, &cur_freq, 0);
-   if (IS_ERR(opp)) {
-   ret = PTR_ERR(opp);
-   goto err_fini;
-   }
+   if (IS_ERR(opp))
+   return PTR_ERR(opp);
 
lima_devfreq_profile.initial_freq = cur_freq;
dev_pm_opp_put(opp);
@@ -167,8 +151,7 @@ int lima_devfreq_init(struct lima_device *ldev)
  DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
if (IS_ERR(devfreq)) {
dev_err(dev, "Couldn't initialize GPU devfreq\n");
-   ret = PTR_ERR(devfreq);
-   goto err_fini;
+   return PTR_ERR(devfreq);
}
 
ldevfreq->devfreq = devfreq;
@@ -180,10 +163,6 @@ int lima_devfreq_init(struct lima_device *ldev)
ldevfreq->cooling = cooling;
 
return 0;
-
-err_fini:
-   lima_devfreq_fini(ldev);
-   return ret;
 }
 
 void lima_devfreq_record_busy(struct lima_devfreq *devfreq)
diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
b/drivers/gpu/drm/lima/lima_devfreq.h
index 2d9b3008ce77..c3bcae76ca07 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.h
+++ b/drivers/gpu/drm/lima/lima_devfreq.h
@@ -15,8 +15,6 @@ struct lima_device;
 
 struct lima_devfreq {
struct devfreq *devfreq;
-   struct opp_table *clkname_opp_table;
-   struct opp_table *regulators_opp_table;
struct thermal_cooling_device *cooling;
 
ktime_t busy_time;
-- 
2.29.2



[PATCH v2 13/14] media: venus: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/media/platform/qcom/venus/pm_helpers.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c 
b/drivers/media/platform/qcom/venus/pm_helpers.c
index 43c4e3d9e281..14fa27f26a7d 100644
--- a/drivers/media/platform/qcom/venus/pm_helpers.c
+++ b/drivers/media/platform/qcom/venus/pm_helpers.c
@@ -860,45 +860,33 @@ static int core_get_v4(struct device *dev)
if (legacy_binding)
return 0;
 
-   core->opp_table = dev_pm_opp_set_clkname(dev, "core");
+   core->opp_table = devm_pm_opp_set_clkname(dev, "core");
if (IS_ERR(core->opp_table))
return PTR_ERR(core->opp_table);
 
if (core->res->opp_pmdomain) {
-   ret = dev_pm_opp_of_add_table(dev);
+   ret = devm_pm_opp_of_add_table(dev);
if (!ret) {
core->has_opp_table = true;
} else if (ret != -ENODEV) {
dev_err(dev, "invalid OPP table in device tree\n");
-   dev_pm_opp_put_clkname(core->opp_table);
return ret;
}
}
 
ret = vcodec_domains_get(dev);
-   if (ret) {
-   if (core->has_opp_table)
-   dev_pm_opp_of_remove_table(dev);
-   dev_pm_opp_put_clkname(core->opp_table);
+   if (ret)
return ret;
-   }
 
return 0;
 }
 
 static void core_put_v4(struct device *dev)
 {
-   struct venus_core *core = dev_get_drvdata(dev);
-
if (legacy_binding)
return;
 
vcodec_domains_put(dev);
-
-   if (core->has_opp_table)
-   dev_pm_opp_of_remove_table(dev);
-   dev_pm_opp_put_clkname(core->opp_table);
-
 }
 
 static int core_power_v4(struct device *dev, int on)
-- 
2.29.2



[PATCH v2 12/14] drm/panfrost: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Reviewed-by: Steven Price 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/panfrost/panfrost_devfreq.c | 33 +
 drivers/gpu/drm/panfrost/panfrost_devfreq.h |  1 -
 2 files changed, 8 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c 
b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
index 56b3f5935703..eeb50c55c472 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
@@ -93,25 +93,23 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
struct thermal_cooling_device *cooling;
	struct panfrost_devfreq *pfdevfreq = &pfdev->pfdevfreq;
 
-   opp_table = dev_pm_opp_set_regulators(dev, pfdev->comp->supply_names,
- pfdev->comp->num_supplies);
+   opp_table = devm_pm_opp_set_regulators(dev, pfdev->comp->supply_names,
+  pfdev->comp->num_supplies);
if (IS_ERR(opp_table)) {
ret = PTR_ERR(opp_table);
/* Continue if the optional regulator is missing */
if (ret != -ENODEV) {
DRM_DEV_ERROR(dev, "Couldn't set OPP regulators\n");
-   goto err_fini;
+   return ret;
}
-   } else {
-   pfdevfreq->regulators_opp_table = opp_table;
}
 
-   ret = dev_pm_opp_of_add_table(dev);
+   ret = devm_pm_opp_of_add_table(dev);
if (ret) {
/* Optional, continue without devfreq */
if (ret == -ENODEV)
ret = 0;
-   goto err_fini;
+   return ret;
}
pfdevfreq->opp_of_table_added = true;
 
@@ -122,10 +120,8 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
cur_freq = clk_get_rate(pfdev->clock);
 
	opp = devfreq_recommended_opp(dev, &cur_freq, 0);
-   if (IS_ERR(opp)) {
-   ret = PTR_ERR(opp);
-   goto err_fini;
-   }
+   if (IS_ERR(opp))
+   return PTR_ERR(opp);
 
panfrost_devfreq_profile.initial_freq = cur_freq;
dev_pm_opp_put(opp);
@@ -134,8 +130,7 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
  DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
if (IS_ERR(devfreq)) {
DRM_DEV_ERROR(dev, "Couldn't initialize GPU devfreq\n");
-   ret = PTR_ERR(devfreq);
-   goto err_fini;
+   return PTR_ERR(devfreq);
}
pfdevfreq->devfreq = devfreq;
 
@@ -146,10 +141,6 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
pfdevfreq->cooling = cooling;
 
return 0;
-
-err_fini:
-   panfrost_devfreq_fini(pfdev);
-   return ret;
 }
 
 void panfrost_devfreq_fini(struct panfrost_device *pfdev)
@@ -160,14 +151,6 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev)
devfreq_cooling_unregister(pfdevfreq->cooling);
pfdevfreq->cooling = NULL;
}
-
-   if (pfdevfreq->opp_of_table_added) {
-   dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
-   pfdevfreq->opp_of_table_added = false;
-   }
-
-   dev_pm_opp_put_regulators(pfdevfreq->regulators_opp_table);
-   pfdevfreq->regulators_opp_table = NULL;
 }
 
 void panfrost_devfreq_resume(struct panfrost_device *pfdev)
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h 
b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
index db6ea48e21f9..a51854cc8c06 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
@@ -15,7 +15,6 @@ struct panfrost_device;
 
 struct panfrost_devfreq {
struct devfreq *devfreq;
-   struct opp_table *regulators_opp_table;
struct thermal_cooling_device *cooling;
bool opp_of_table_added;
 
-- 
2.29.2



[PATCH v2 10/14] drm/msm: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 24 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  2 --
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 31 ++---
 drivers/gpu/drm/msm/dp/dp_ctrl.h|  1 -
 drivers/gpu/drm/msm/dp/dp_display.c |  5 +---
 drivers/gpu/drm/msm/dsi/dsi_host.c  | 14 ---
 9 files changed, 25 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 7e553d3efeb2..caf747ba8d5b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1705,7 +1705,7 @@ static void check_speed_bin(struct device *dev)
nvmem_cell_put(cell);
}
 
-   dev_pm_opp_set_supported_hw(dev, &val, 1);
+   devm_pm_opp_set_supported_hw(dev, &val, 1);
 }
 
 struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 91cf46f84025..232940b41720 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1340,7 +1340,7 @@ static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
 * The GMU handles its own frequency switching so build a list of
 * available frequencies to send during initialization
 */
-   ret = dev_pm_opp_of_add_table(gmu->dev);
+   ret = devm_pm_opp_of_add_table(gmu->dev);
if (ret) {
DRM_DEV_ERROR(gmu->dev, "Unable to set the OPP table for the GMU\n");
return ret;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 0f184c3dd9d9..dfd3cac50f7f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -841,7 +841,7 @@ static void adreno_get_pwrlevels(struct device *dev,
if (!of_find_property(dev->of_node, "operating-points-v2", NULL))
ret = adreno_get_legacy_pwrlevels(dev);
else {
-   ret = dev_pm_opp_of_add_table(dev);
+   ret = devm_pm_opp_of_add_table(dev);
if (ret)
DRM_DEV_ERROR(dev, "Unable to set the OPP table\n");
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 5a8e3e1fc48c..8344a3314133 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1082,27 +1082,28 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
struct msm_drm_private *priv = ddev->dev_private;
struct dpu_kms *dpu_kms;
struct dss_module_power *mp;
+   struct opp_table *opp_table;
int ret = 0;
 
	dpu_kms = devm_kzalloc(&pdev->dev, sizeof(*dpu_kms), GFP_KERNEL);
if (!dpu_kms)
return -ENOMEM;
 
-   dpu_kms->opp_table = dev_pm_opp_set_clkname(dev, "core");
-   if (IS_ERR(dpu_kms->opp_table))
-   return PTR_ERR(dpu_kms->opp_table);
+   opp_table = devm_pm_opp_set_clkname(dev, "core");
+   if (IS_ERR(opp_table))
+   return PTR_ERR(opp_table);
/* OPP table is optional */
-   ret = dev_pm_opp_of_add_table(dev);
+   ret = devm_pm_opp_of_add_table(dev);
if (ret && ret != -ENODEV) {
dev_err(dev, "invalid OPP table in device tree\n");
-   goto put_clkname;
+   return ret;
}
 
	mp = &dpu_kms->mp;
ret = msm_dss_parse_clock(pdev, mp);
if (ret) {
DPU_ERROR("failed to parse clocks, ret=%d\n", ret);
-   goto err;
+   return ret;
}
 
platform_set_drvdata(pdev, dpu_kms);
@@ -1110,7 +1111,7 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
	ret = msm_kms_init(&dpu_kms->base, &kms_funcs);
if (ret) {
DPU_ERROR("failed to init kms, ret=%d\n", ret);
-   goto err;
+   return ret;
}
dpu_kms->dev = ddev;
dpu_kms->pdev = pdev;
@@ -1119,11 +1120,7 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
dpu_kms->rpm_enabled = true;
 
	priv->kms = &dpu_kms->base;
-   return ret;
-err:
-   dev_pm_opp_of_remove_table(dev);
-put_clkname:
-   dev_pm_opp_put_clkname(dpu_kms->opp_table);
+
return ret;
 }
 
@@ -1139,9 +1136,6 @@ static void dpu_unbind(struct device *dev, struct device 
*master, void *data)
 
if (dpu_kms->rpm_enabled)
		pm_runtime_disable(&pdev->dev);
-
-   dev_pm_opp_of_remove_table(dev);
-   dev_pm_opp_put_clkname(dpu_kms->opp_table);
 }
 
 static const struct 

[PATCH v2 09/14] mmc: sdhci-msm: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/mmc/host/sdhci-msm.c | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 5e1da4df096f..af3f7bd764e8 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -264,7 +264,6 @@ struct sdhci_msm_host {
struct clk_bulk_data bulk_clks[5];
unsigned long clk_rate;
struct mmc_host *mmc;
-   struct opp_table *opp_table;
bool use_14lpp_dll_reset;
bool tuning_done;
bool calibration_done;
@@ -2483,6 +2482,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
const struct sdhci_msm_offset *msm_offset;
const struct sdhci_msm_variant_info *var_info;
struct device_node *node = pdev->dev.of_node;
+   struct opp_table *opp_table;
 
	host = sdhci_pltfm_init(pdev, &sdhci_msm_pdata, sizeof(*msm_host));
if (IS_ERR(host))
@@ -2551,17 +2551,17 @@ static int sdhci_msm_probe(struct platform_device *pdev)
if (ret)
goto bus_clk_disable;
 
-   msm_host->opp_table = dev_pm_opp_set_clkname(&pdev->dev, "core");
-   if (IS_ERR(msm_host->opp_table)) {
-   ret = PTR_ERR(msm_host->opp_table);
+   opp_table = devm_pm_opp_set_clkname(&pdev->dev, "core");
+   if (IS_ERR(opp_table)) {
+   ret = PTR_ERR(opp_table);
goto bus_clk_disable;
}
 
/* OPP table is optional */
-   ret = dev_pm_opp_of_add_table(&pdev->dev);
+   ret = devm_pm_opp_of_add_table(&pdev->dev);
if (ret && ret != -ENODEV) {
dev_err(>dev, "Invalid OPP table in Device tree\n");
-   goto opp_put_clkname;
+   goto bus_clk_disable;
}
 
/* Vote for maximum clock rate for maximum performance */
@@ -2587,7 +2587,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
ret = clk_bulk_prepare_enable(ARRAY_SIZE(msm_host->bulk_clks),
  msm_host->bulk_clks);
if (ret)
-   goto opp_cleanup;
+   goto bus_clk_disable;
 
/*
 * xo clock is needed for FLL feature of cm_dll.
@@ -2732,10 +2732,6 @@ static int sdhci_msm_probe(struct platform_device *pdev)
 clk_disable:
clk_bulk_disable_unprepare(ARRAY_SIZE(msm_host->bulk_clks),
   msm_host->bulk_clks);
-opp_cleanup:
-   dev_pm_opp_of_remove_table(&pdev->dev);
-opp_put_clkname:
-   dev_pm_opp_put_clkname(msm_host->opp_table);
 bus_clk_disable:
if (!IS_ERR(msm_host->bus_clk))
clk_disable_unprepare(msm_host->bus_clk);
@@ -2754,8 +2750,6 @@ static int sdhci_msm_remove(struct platform_device *pdev)
 
sdhci_remove_host(host, dead);
 
-   dev_pm_opp_of_remove_table(&pdev->dev);
-   dev_pm_opp_put_clkname(msm_host->opp_table);
	pm_runtime_get_sync(&pdev->dev);
	pm_runtime_disable(&pdev->dev);
	pm_runtime_put_noidle(&pdev->dev);
-- 
2.29.2



[PATCH v2 08/14] spi: spi-qcom-qspi: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Acked-by: Mark Brown 
Signed-off-by: Dmitry Osipenko 
---
 drivers/spi/spi-qcom-qspi.c | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
index 1dbcc410cd35..f14801dd5120 100644
--- a/drivers/spi/spi-qcom-qspi.c
+++ b/drivers/spi/spi-qcom-qspi.c
@@ -142,7 +142,6 @@ struct qcom_qspi {
struct clk_bulk_data *clks;
struct qspi_xfer xfer;
struct icc_path *icc_path_cpu_to_qspi;
-   struct opp_table *opp_table;
unsigned long last_speed;
/* Lock to protect data accessed by IRQs */
spinlock_t lock;
@@ -459,6 +458,7 @@ static int qcom_qspi_probe(struct platform_device *pdev)
struct device *dev;
struct spi_master *master;
struct qcom_qspi *ctrl;
+   struct opp_table *opp_table;
 
	dev = &pdev->dev;
 
@@ -530,14 +530,14 @@ static int qcom_qspi_probe(struct platform_device *pdev)
master->handle_err = qcom_qspi_handle_err;
master->auto_runtime_pm = true;
 
-   ctrl->opp_table = dev_pm_opp_set_clkname(&pdev->dev, "core");
-   if (IS_ERR(ctrl->opp_table))
-   return PTR_ERR(ctrl->opp_table);
+   opp_table = devm_pm_opp_set_clkname(&pdev->dev, "core");
+   if (IS_ERR(opp_table))
+   return PTR_ERR(opp_table);
/* OPP table is optional */
-   ret = dev_pm_opp_of_add_table(&pdev->dev);
+   ret = devm_pm_opp_of_add_table(&pdev->dev);
	if (ret && ret != -ENODEV) {
		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-   goto exit_probe_put_clkname;
+   return ret;
}
 
pm_runtime_use_autosuspend(dev);
@@ -549,10 +549,6 @@ static int qcom_qspi_probe(struct platform_device *pdev)
return 0;
 
pm_runtime_disable(dev);
-   dev_pm_opp_of_remove_table(&pdev->dev);
-
-exit_probe_put_clkname:
-   dev_pm_opp_put_clkname(ctrl->opp_table);
 
return ret;
 }
@@ -560,14 +556,11 @@ static int qcom_qspi_probe(struct platform_device *pdev)
 static int qcom_qspi_remove(struct platform_device *pdev)
 {
struct spi_master *master = platform_get_drvdata(pdev);
-   struct qcom_qspi *ctrl = spi_master_get_devdata(master);
 
/* Unregister _before_ disabling pm_runtime() so we stop transfers */
spi_unregister_master(master);
 
	pm_runtime_disable(&pdev->dev);
-   dev_pm_opp_of_remove_table(&pdev->dev);
-   dev_pm_opp_put_clkname(ctrl->opp_table);
 
return 0;
 }
-- 
2.29.2



[PATCH v2 07/14] spi: spi-geni-qcom: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Acked-by: Mark Brown 
Signed-off-by: Dmitry Osipenko 
---
 drivers/spi/spi-geni-qcom.c  | 17 +++--
 include/linux/qcom-geni-se.h |  2 --
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 881f645661cc..20cc29ea198b 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -666,6 +666,7 @@ static int spi_geni_probe(struct platform_device *pdev)
void __iomem *base;
struct clk *clk;
	struct device *dev = &pdev->dev;
+   struct opp_table *opp_table;
 
irq = platform_get_irq(pdev, 0);
if (irq < 0)
@@ -691,14 +692,15 @@ static int spi_geni_probe(struct platform_device *pdev)
mas->se.wrapper = dev_get_drvdata(dev->parent);
mas->se.base = base;
mas->se.clk = clk;
-   mas->se.opp_table = dev_pm_opp_set_clkname(&pdev->dev, "se");
-   if (IS_ERR(mas->se.opp_table))
-   return PTR_ERR(mas->se.opp_table);
+
+   opp_table = devm_pm_opp_set_clkname(&pdev->dev, "se");
+   if (IS_ERR(opp_table))
+   return PTR_ERR(opp_table);
/* OPP table is optional */
-   ret = dev_pm_opp_of_add_table(&pdev->dev);
+   ret = devm_pm_opp_of_add_table(&pdev->dev);
	if (ret && ret != -ENODEV) {
		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-   goto put_clkname;
+   return ret;
}
 
spi->bus_num = -1;
@@ -750,9 +752,6 @@ static int spi_geni_probe(struct platform_device *pdev)
free_irq(mas->irq, spi);
 spi_geni_probe_runtime_disable:
pm_runtime_disable(dev);
-   dev_pm_opp_of_remove_table(&pdev->dev);
-put_clkname:
-   dev_pm_opp_put_clkname(mas->se.opp_table);
return ret;
 }
 
@@ -766,8 +765,6 @@ static int spi_geni_remove(struct platform_device *pdev)
 
free_irq(mas->irq, spi);
	pm_runtime_disable(&pdev->dev);
-   dev_pm_opp_of_remove_table(&pdev->dev);
-   dev_pm_opp_put_clkname(mas->se.opp_table);
return 0;
 }
 
diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
index ec2ad4b0fe14..cddef864a760 100644
--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -47,7 +47,6 @@ struct geni_icc_path {
  * @num_clk_levels:Number of valid clock levels in clk_perf_tbl
  * @clk_perf_tbl:  Table of clock frequency input to serial engine clock
  * @icc_paths: Array of ICC paths for SE
- * @opp_table: Pointer to the OPP table
  */
 struct geni_se {
void __iomem *base;
@@ -57,7 +56,6 @@ struct geni_se {
unsigned int num_clk_levels;
unsigned long *clk_perf_tbl;
struct geni_icc_path icc_paths[3];
-   struct opp_table *opp_table;
 };
 
 /* Common SE registers */
-- 
2.29.2



[PATCH v2 06/14] serial: qcom_geni_serial: Convert to use resource-managed OPP API

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/tty/serial/qcom_geni_serial.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c 
b/drivers/tty/serial/qcom_geni_serial.c
index 291649f02821..7c6e029fdb2a 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1352,6 +1352,7 @@ static int qcom_geni_serial_probe(struct platform_device 
*pdev)
int irq;
bool console = false;
struct uart_driver *drv;
+   struct opp_table *opp_table;
 
if (of_device_is_compatible(pdev->dev.of_node, "qcom,geni-debug-uart"))
console = true;
@@ -1433,14 +1434,14 @@ static int qcom_geni_serial_probe(struct 
platform_device *pdev)
if (of_property_read_bool(pdev->dev.of_node, "cts-rts-swap"))
port->cts_rts_swap = true;
 
-   port->se.opp_table = dev_pm_opp_set_clkname(&pdev->dev, "se");
-   if (IS_ERR(port->se.opp_table))
-   return PTR_ERR(port->se.opp_table);
+   opp_table = devm_pm_opp_set_clkname(&pdev->dev, "se");
+   if (IS_ERR(opp_table))
+   return PTR_ERR(opp_table);
/* OPP table is optional */
-   ret = dev_pm_opp_of_add_table(&pdev->dev);
+   ret = devm_pm_opp_of_add_table(&pdev->dev);
	if (ret && ret != -ENODEV) {
		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-   goto put_clkname;
+   return ret;
}
 
port->private_data.drv = drv;
@@ -1450,7 +1451,7 @@ static int qcom_geni_serial_probe(struct platform_device 
*pdev)
 
ret = uart_add_one_port(drv, uport);
if (ret)
-   goto err;
+   return ret;
 
irq_set_status_flags(uport->irq, IRQ_NOAUTOEN);
ret = devm_request_irq(uport->dev, uport->irq, qcom_geni_serial_isr,
@@ -1458,7 +1459,7 @@ static int qcom_geni_serial_probe(struct platform_device 
*pdev)
if (ret) {
dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
uart_remove_one_port(drv, uport);
-   goto err;
+   return ret;
}
 
/*
@@ -1475,16 +1476,11 @@ static int qcom_geni_serial_probe(struct 
platform_device *pdev)
if (ret) {
		device_init_wakeup(&pdev->dev, false);
uart_remove_one_port(drv, uport);
-   goto err;
+   return ret;
}
}
 
return 0;
-err:
-   dev_pm_opp_of_remove_table(&pdev->dev);
-put_clkname:
-   dev_pm_opp_put_clkname(port->se.opp_table);
-   return ret;
 }
 
 static int qcom_geni_serial_remove(struct platform_device *pdev)
@@ -1492,8 +1488,6 @@ static int qcom_geni_serial_remove(struct platform_device 
*pdev)
struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
struct uart_driver *drv = port->private_data.drv;
 
-   dev_pm_opp_of_remove_table(&pdev->dev);
-   dev_pm_opp_put_clkname(port->se.opp_table);
	dev_pm_clear_wake_irq(&pdev->dev);
	device_init_wakeup(&pdev->dev, false);
	uart_remove_one_port(drv, &port->uport);
-- 
2.29.2



[PATCH v2 05/14] opp: Add devres wrapper for dev_pm_opp_register_notifier

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Add devres wrapper for dev_pm_opp_register_notifier() to simplify driver
code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/opp/core.c | 38 ++
 include/linux/pm_opp.h |  6 ++
 2 files changed, 44 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index f9e4ebb7aad0..509b2e052f3c 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2899,6 +2899,44 @@ int dev_pm_opp_unregister_notifier(struct device *dev,
 }
 EXPORT_SYMBOL(dev_pm_opp_unregister_notifier);
 
+static void devm_pm_opp_notifier_release(struct device *dev, void *res)
+{
+   struct notifier_block *nb = *(struct notifier_block **)res;
+
+   WARN_ON(dev_pm_opp_unregister_notifier(dev, nb));
+}
+
+/**
+ * devm_pm_opp_register_notifier() - Register OPP notifier for the device
+ * @dev:   Device for which notifier needs to be registered
+ * @nb:Notifier block to be registered
+ *
+ * Return: 0 on success or a negative error value.
+ *
+ * The notifier will be unregistered after the device is destroyed.
+ */
+int devm_pm_opp_register_notifier(struct device *dev, struct notifier_block 
*nb)
+{
+   struct notifier_block **ptr;
+   int ret;
+
+   ptr = devres_alloc(devm_pm_opp_notifier_release, sizeof(*ptr), 
GFP_KERNEL);
+   if (!ptr)
+   return -ENOMEM;
+
+   ret = dev_pm_opp_register_notifier(dev, nb);
+   if (ret) {
+   devres_free(ptr);
+   return ret;
+   }
+
+   *ptr = nb;
+   devres_add(dev, ptr);
+
+   return 0;
+}
+EXPORT_SYMBOL(devm_pm_opp_register_notifier);
+
 /**
  * dev_pm_opp_remove_table() - Free all OPPs associated with the device
  * @dev:   device pointer used to lookup OPP table.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 2f5dc123c1a0..7088997491b2 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -141,6 +141,7 @@ int dev_pm_opp_disable(struct device *dev, unsigned long 
freq);
 
 int dev_pm_opp_register_notifier(struct device *dev, struct notifier_block 
*nb);
 int dev_pm_opp_unregister_notifier(struct device *dev, struct notifier_block 
*nb);
+int devm_pm_opp_register_notifier(struct device *dev, struct notifier_block 
*nb);
 
 struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev, const u32 
*versions, unsigned int count);
 void dev_pm_opp_put_supported_hw(struct opp_table *opp_table);
@@ -313,6 +314,11 @@ static inline int dev_pm_opp_unregister_notifier(struct 
device *dev, struct noti
return -EOPNOTSUPP;
 }
 
+static inline int devm_pm_opp_register_notifier(struct device *dev, struct 
notifier_block *nb)
+{
+   return -EOPNOTSUPP;
+}
+
 static inline struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev,
const u32 *versions,
unsigned int count)
-- 
2.29.2

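[Editorial aside] Unlike the action-based wrappers elsewhere in the series, devm_pm_opp_register_notifier() has to remember *which* notifier block to unregister, so it uses the lower-level devres_alloc()/devres_add() pair: allocate a managed record first, attempt the registration, then either commit the record or free it without ever running its release. An illustrative userspace miniature of that two-phase pattern (hypothetical names, not kernel code):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Miniature of devres_alloc()/devres_add()/devres_free(): a managed record
 * is allocated up front, the acquire is attempted, and the record is either
 * committed to the device's release list or freed untouched on failure.
 */

typedef void (*release_fn)(void *res);

struct devres {
	release_fn release;
	struct devres *next;
};

static struct devres *list;	/* per-device resource list */
static int unregistered;	/* counts release invocations */

static struct devres *devres_alloc(release_fn release)
{
	struct devres *dr = calloc(1, sizeof(*dr));

	if (dr)
		dr->release = release;
	return dr;
}

static void devres_add(struct devres *dr) { dr->next = list; list = dr; }
static void devres_free(struct devres *dr) { free(dr); }

static void notifier_release(void *res) { unregistered++; }

/* register_notifier(fail) mimics devm_pm_opp_register_notifier(). */
static int register_notifier(int fail)
{
	struct devres *dr = devres_alloc(notifier_release);

	if (!dr)
		return -12;		/* -ENOMEM */
	if (fail) {			/* acquire failed: record discarded, */
		devres_free(dr);	/* its release never runs */
		return -22;		/* -EINVAL */
	}
	devres_add(dr);			/* success: commit to the list */
	return 0;
}

static void device_unbind(void)		/* devres core replaying releases */
{
	while (list) {
		struct devres *dr = list;

		list = dr->next;
		dr->release(dr);
		free(dr);
	}
}
```

The key property is that a failed registration leaves no trace: devres_free() discards the record, so the release callback fires exactly once per successful acquire.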


[PATCH v2 03/14] opp: Add devres wrapper for dev_pm_opp_set_supported_hw

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Add devres wrapper for dev_pm_opp_set_supported_hw() to simplify driver
code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/opp/core.c | 38 ++
 include/linux/pm_opp.h |  8 
 2 files changed, 46 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 32fa2bff847b..f9e4ebb7aad0 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1857,6 +1857,44 @@ void dev_pm_opp_put_supported_hw(struct opp_table 
*opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_supported_hw);
 
+static void devm_pm_opp_supported_hw_release(void *data)
+{
+   dev_pm_opp_put_supported_hw(data);
+}
+
+/**
+ * devm_pm_opp_set_supported_hw() - Set supported platforms
+ * @dev: Device for which supported-hw has to be set.
+ * @versions: Array of hierarchy of versions to match.
+ * @count: Number of elements in the array.
+ *
+ * This is required only for the V2 bindings, and it enables a platform to
+ * specify the hierarchy of versions it supports. OPP layer will then enable
+ * OPPs, which are available for those versions, based on its 
'opp-supported-hw'
+ * property.
+ *
+ * The opp_table structure will be freed after the device is destroyed.
+ */
+struct opp_table *devm_pm_opp_set_supported_hw(struct device *dev,
+  const u32 *versions,
+  unsigned int count)
+{
+   struct opp_table *opp_table;
+   int err;
+
+   opp_table = dev_pm_opp_set_supported_hw(dev, versions, count);
+   if (IS_ERR(opp_table))
+   return opp_table;
+
+   err = devm_add_action_or_reset(dev, devm_pm_opp_supported_hw_release,
+  opp_table);
+   if (err)
+   opp_table = ERR_PTR(err);
+
+   return opp_table;
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_supported_hw);
+
 /**
  * dev_pm_opp_set_prop_name() - Set prop-extn name
  * @dev: Device for which the prop-name has to be set.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 284d23665b15..e68c3c29301e 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -144,6 +144,7 @@ int dev_pm_opp_unregister_notifier(struct device *dev, 
struct notifier_block *nb
 
 struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev, const u32 
*versions, unsigned int count);
 void dev_pm_opp_put_supported_hw(struct opp_table *opp_table);
+struct opp_table *devm_pm_opp_set_supported_hw(struct device *dev, const u32 
*versions, unsigned int count);
 struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char 
*name);
 void dev_pm_opp_put_prop_name(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * 
const names[], unsigned int count);
@@ -321,6 +322,13 @@ static inline struct opp_table 
*dev_pm_opp_set_supported_hw(struct device *dev,
 
 static inline void dev_pm_opp_put_supported_hw(struct opp_table *opp_table) {}
 
+static inline struct opp_table *devm_pm_opp_set_supported_hw(struct device 
*dev,
+const u32 
*versions,
+unsigned int count)
+{
+   return ERR_PTR(-EOPNOTSUPP);
+}
+
 static inline struct opp_table *dev_pm_opp_register_set_opp_helper(struct 
device *dev,
int (*set_opp)(struct dev_pm_set_opp_data *data))
 {
-- 
2.29.2



[PATCH v2 04/14] opp: Add devres wrapper for dev_pm_opp_of_add_table

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Add devres wrapper for dev_pm_opp_of_add_table() to simplify driver
code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/opp/of.c   | 36 
 include/linux/pm_opp.h |  6 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index f480c10e6314..c582a9ca397b 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -1104,6 +1104,42 @@ static int _of_add_table_indexed(struct device *dev, int 
index, bool getclk)
return ret;
 }
 
+static void devm_pm_opp_of_table_release(void *data)
+{
+   dev_pm_opp_of_remove_table(data);
+}
+
+/**
+ * devm_pm_opp_of_add_table() - Initialize opp table from device tree
+ * @dev:   device pointer used to lookup OPP table.
+ *
+ * Register the initial OPP table with the OPP library for given device.
+ *
+ * The opp_table structure will be freed after the device is destroyed.
+ *
+ * Return:
+ * 0   On success OR
+ * Duplicate OPPs (both freq and volt are same) and opp->available
+ * -EEXIST Freq are same and volt are different OR
+ * Duplicate OPPs (both freq and volt are same) and !opp->available
+ * -ENOMEM Memory allocation failure
+ * -ENODEV when 'operating-points' property is not found or is invalid data
+ * in device node.
+ * -ENODATAwhen empty 'operating-points' property is found
+ * -EINVAL when invalid entries are found in opp-v2 table
+ */
+int devm_pm_opp_of_add_table(struct device *dev)
+{
+   int ret;
+
+   ret = dev_pm_opp_of_add_table(dev);
+   if (ret)
+   return ret;
+
+   return devm_add_action_or_reset(dev, devm_pm_opp_of_table_release, dev);
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_of_add_table);
+
 /**
  * dev_pm_opp_of_add_table() - Initialize opp table from device tree
  * @dev:   device pointer used to lookup OPP table.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index e68c3c29301e..2f5dc123c1a0 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -441,6 +441,7 @@ int dev_pm_opp_of_add_table(struct device *dev);
 int dev_pm_opp_of_add_table_indexed(struct device *dev, int index);
 int dev_pm_opp_of_add_table_noclk(struct device *dev, int index);
 void dev_pm_opp_of_remove_table(struct device *dev);
+int devm_pm_opp_of_add_table(struct device *dev);
 int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask);
 void dev_pm_opp_of_cpumask_remove_table(const struct cpumask *cpumask);
 int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask 
*cpumask);
@@ -473,6 +474,11 @@ static inline void dev_pm_opp_of_remove_table(struct 
device *dev)
 {
 }
 
+static inline int devm_pm_opp_of_add_table(struct device *dev)
+{
+   return -EOPNOTSUPP;
+}
+
 static inline int dev_pm_opp_of_cpumask_add_table(const struct cpumask 
*cpumask)
 {
return -EOPNOTSUPP;
-- 
2.29.2

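[Editorial aside] The of.c wrapper in this patch shows the simplest shape of a devres adapter: because dev_pm_opp_of_remove_table() takes only the device pointer, the wrapper passes `dev` itself to devm_add_action_or_reset() and needs no bookkeeping allocation at all. An illustrative userspace miniature of that shape (hypothetical names, not kernel code):

```c
#include <assert.h>

/*
 * Miniature of the devm_pm_opp_of_add_table() shape: acquire and release
 * both take only the "device", so the registered action's payload is the
 * device pointer itself -- no extra bookkeeping struct is needed.
 */

struct device { int table_added; };

static int opp_of_add_table(struct device *dev)
{
	dev->table_added = 1;
	return 0;
}

static void opp_of_remove_table(void *data)
{
	struct device *dev = data;

	dev->table_added = 0;
}

/* One-slot stand-in for the devres action list. */
static void (*pending_release)(void *);
static void *pending_data;

static int add_action_or_reset(struct device *dev,
			       void (*release)(void *), void *data)
{
	pending_release = release;	/* assume registration succeeds */
	pending_data = data;
	return 0;
}

static int devm_opp_of_add_table(struct device *dev)
{
	int ret = opp_of_add_table(dev);

	if (ret)
		return ret;		/* acquire failed: nothing to undo */
	return add_action_or_reset(dev, opp_of_remove_table, dev);
}

static void device_unbind(struct device *dev)
{
	if (pending_release)
		pending_release(pending_data);
}

static struct device gdev;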


[PATCH v2 02/14] opp: Add devres wrapper for dev_pm_opp_set_regulators

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Add devres wrapper for dev_pm_opp_set_regulators() to simplify driver
code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/opp/core.c | 39 +++
 include/linux/pm_opp.h |  8 
 2 files changed, 47 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 3345ab8da6b2..32fa2bff847b 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2047,6 +2047,45 @@ void dev_pm_opp_put_regulators(struct opp_table 
*opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators);
 
+static void devm_pm_opp_regulators_release(void *data)
+{
+   dev_pm_opp_put_regulators(data);
+}
+
+/**
+ * devm_pm_opp_set_regulators() - Set regulator names for the device
+ * @dev: Device for which regulator name is being set.
+ * @names: Array of pointers to the names of the regulator.
+ * @count: Number of regulators.
+ *
+ * In order to support OPP switching, OPP layer needs to know the name of the
+ * device's regulators, as the core would be required to switch voltages as
+ * well.
+ *
+ * This must be called before any OPPs are initialized for the device.
+ *
+ * The opp_table structure will be freed after the device is destroyed.
+ */
+struct opp_table *devm_pm_opp_set_regulators(struct device *dev,
+const char * const names[],
+unsigned int count)
+{
+   struct opp_table *opp_table;
+   int err;
+
+   opp_table = dev_pm_opp_set_regulators(dev, names, count);
+   if (IS_ERR(opp_table))
+   return opp_table;
+
+   err = devm_add_action_or_reset(dev, devm_pm_opp_regulators_release,
+  opp_table);
+   if (err)
+   opp_table = ERR_PTR(err);
+
+   return opp_table;
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_regulators);
+
 /**
  * dev_pm_opp_set_clkname() - Set clk name for the device
  * @dev: Device for which clk name is being set.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 6fb992168f1e..284d23665b15 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -148,6 +148,7 @@ struct opp_table *dev_pm_opp_set_prop_name(struct device 
*dev, const char *name)
 void dev_pm_opp_put_prop_name(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * 
const names[], unsigned int count);
 void dev_pm_opp_put_regulators(struct opp_table *opp_table);
+struct opp_table *devm_pm_opp_set_regulators(struct device *dev, const char * 
const names[], unsigned int count);
 struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char *name);
 void dev_pm_opp_put_clkname(struct opp_table *opp_table);
 struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char 
*name);
@@ -349,6 +350,13 @@ static inline struct opp_table 
*dev_pm_opp_set_regulators(struct device *dev, co
 
 static inline void dev_pm_opp_put_regulators(struct opp_table *opp_table) {}
 
+static inline struct opp_table *devm_pm_opp_set_regulators(struct device *dev,
+  const char * const 
names[],
+  unsigned int count)
+{
+   return ERR_PTR(-EOPNOTSUPP);
+}
+
 static inline struct opp_table *dev_pm_opp_set_clkname(struct device *dev, 
const char *name)
 {
return ERR_PTR(-EOPNOTSUPP);
-- 
2.29.2

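[Editorial aside] The devm wrappers in patches 01-03 return either a valid opp_table pointer or an encoded errno via ERR_PTR(err), which callers unpack with IS_ERR()/PTR_ERR(). An illustrative userspace re-creation of that pointer-error encoding (the real helpers live in include/linux/err.h; the acquire() function below is hypothetical):

```c
#include <assert.h>

/*
 * Re-creation of the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers: a small
 * negative errno is folded into the top of the pointer range, so a single
 * return value can carry either a valid pointer or an error code.
 */

#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * A wrapper in the style of devm_pm_opp_set_clkname(): return the valid
 * handle on success, or fold the error into the pointer on failure.
 */
static int table_storage;		/* stands in for a real opp_table */

static void *acquire(int fail)
{
	if (fail)
		return ERR_PTR(-12);	/* -ENOMEM */
	return &table_storage;
}
```

This is why the wrappers return `ERR_PTR(err)` when devm_add_action_or_reset() fails: the caller's existing IS_ERR() check handles both the acquire error and the registration error through one path.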


[PATCH v2 01/14] opp: Add devres wrapper for dev_pm_opp_set_clkname

2021-03-11 Thread Dmitry Osipenko
From: Yangtao Li 

Add devres wrapper for dev_pm_opp_set_clkname() to simplify driver code.

Signed-off-by: Yangtao Li 
Signed-off-by: Dmitry Osipenko 
---
 drivers/opp/core.c | 36 
 include/linux/pm_opp.h |  6 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 150be4c28c99..3345ab8da6b2 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2119,6 +2119,42 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
 
+static void devm_pm_opp_clkname_release(void *data)
+{
+   dev_pm_opp_put_clkname(data);
+}
+
+/**
+ * devm_pm_opp_set_clkname() - Set clk name for the device
+ * @dev: Device for which clk name is being set.
+ * @name: Clk name.
+ *
+ * In order to support OPP switching, OPP layer needs to get pointer to the
+ * clock for the device. Simple cases work fine without using this routine (i.e.
+ * by passing connection-id as NULL), but for a device with multiple clocks
+ * available, the OPP core needs to know the exact name of the clk to use.
+ *
+ * This must be called before any OPPs are initialized for the device.
+ *
+ * The opp_table structure will be freed after the device is destroyed.
+ */
+struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char *name)
+{
+   struct opp_table *opp_table;
+   int err;
+
+   opp_table = dev_pm_opp_set_clkname(dev, name);
+   if (IS_ERR(opp_table))
+   return opp_table;
+
+   err = devm_add_action_or_reset(dev, devm_pm_opp_clkname_release, opp_table);
+   if (err)
+   opp_table = ERR_PTR(err);
+
+   return opp_table;
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_clkname);
+
 /**
  * dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper
  * @dev: Device for which the helper is getting registered.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index c0371efa4a0f..6fb992168f1e 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -150,6 +150,7 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * con
 void dev_pm_opp_put_regulators(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char *name);
 void dev_pm_opp_put_clkname(struct opp_table *opp_table);
+struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char *name);
 struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
 void dev_pm_opp_unregister_set_opp_helper(struct opp_table *opp_table);
 struct opp_table *devm_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
@@ -355,6 +356,11 @@ static inline struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const
 
 static inline void dev_pm_opp_put_clkname(struct opp_table *opp_table) {}
 
+static inline struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char *name)
+{
+   return ERR_PTR(-EOPNOTSUPP);
+}
+
 static inline struct opp_table *dev_pm_opp_attach_genpd(struct device *dev, const char **names, struct device ***virt_devs)
 {
return ERR_PTR(-EOPNOTSUPP);
-- 
2.29.2



[PATCH v2 00/14] Introduce devm_pm_opp_* API

2021-03-11 Thread Dmitry Osipenko
This series adds resource-managed OPP API helpers and makes drivers
to use them.

Changelog:

v2: - This is a continuation of the work that was started by Yangtao Li.
  Apparently Yangtao doesn't have time to finish it, so I
  (Dmitry Osipenko) picked up the effort since these patches are
  wanted by the NVIDIA Tegra voltage-scaling series that I'm
  working on.

- Fixed the double put of OPP resources.

- Dropped all patches that are unrelated to OPP API. I also dropped
  the Tegra memory patch since it doesn't apply now and because I plan
  to switch all Tegra drivers soon to a common tegra-specific OPP helper
  that will use the resource-managed OPP API anyways.

- Squashed couple patches into a single ones since there was no
  good reason to separate them.

- Added acks that were given to a couple of v1 patches.

Yangtao Li (14):
  opp: Add devres wrapper for dev_pm_opp_set_clkname
  opp: Add devres wrapper for dev_pm_opp_set_regulators
  opp: Add devres wrapper for dev_pm_opp_set_supported_hw
  opp: Add devres wrapper for dev_pm_opp_of_add_table
  opp: Add devres wrapper for dev_pm_opp_register_notifier
  serial: qcom_geni_serial: Convert to use resource-managed OPP API
  spi: spi-geni-qcom: Convert to use resource-managed OPP API
  spi: spi-qcom-qspi: Convert to use resource-managed OPP API
  mmc: sdhci-msm: Convert to use resource-managed OPP API
  drm/msm: Convert to use resource-managed OPP API
  drm/lima: Convert to use resource-managed OPP API
  drm/panfrost: Convert to use resource-managed OPP API
  media: venus: Convert to use resource-managed OPP API
  memory: samsung: exynos5422-dmc: Convert to use resource-managed OPP
API

 drivers/gpu/drm/lima/lima_devfreq.c   |  43 ++---
 drivers/gpu/drm/lima/lima_devfreq.h   |   2 -
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |   2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |   2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |   2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |  24 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |   2 -
 drivers/gpu/drm/msm/dp/dp_ctrl.c  |  31 +---
 drivers/gpu/drm/msm/dp/dp_ctrl.h  |   1 -
 drivers/gpu/drm/msm/dp/dp_display.c   |   5 +-
 drivers/gpu/drm/msm/dsi/dsi_host.c|  14 +-
 drivers/gpu/drm/panfrost/panfrost_devfreq.c   |  33 +---
 drivers/gpu/drm/panfrost/panfrost_devfreq.h   |   1 -
 .../media/platform/qcom/venus/pm_helpers.c|  18 +--
 drivers/memory/samsung/exynos5422-dmc.c   |  13 +-
 drivers/mmc/host/sdhci-msm.c  |  20 +--
 drivers/opp/core.c| 151 ++
 drivers/opp/of.c  |  36 +
 drivers/spi/spi-geni-qcom.c   |  17 +-
 drivers/spi/spi-qcom-qspi.c   |  19 +--
 drivers/tty/serial/qcom_geni_serial.c |  24 ++-
 include/linux/pm_opp.h|  34 
 include/linux/qcom-geni-se.h  |   2 -
 23 files changed, 300 insertions(+), 196 deletions(-)

-- 
2.29.2



Re: [Intel-gfx] [PATCH] i915: Drop legacy execbuffer support

2021-03-11 Thread Dave Airlie
On Thu, 11 Mar 2021 at 08:04, Keith Packard  wrote:
>
> Jason Ekstrand  writes:
>
> > libdrm has supported the newer execbuffer2 ioctl and using it by default
> > when it exists since libdrm commit b50964027bef249a0cc3d511de05c2464e0a1e22
> > which landed Mar 2, 2010.  The i915 and i965 drivers in Mesa at the time
> > both used libdrm and so did the Intel X11 back-end.  The SNA back-end
> > for X11 has always used execbuffer2.
>
> All execbuffer users in the past that I'm aware of used libdrm, which
> now uses the execbuffer2 ioctl for this API. That means these
> applications will remain ABI compatible through this change.
>
> Acked-by: Keith Packard 

Acked-by: Dave Airlie 

Dave.


Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 12:20 PM Zbigniew Kempczyński
 wrote:
>
> On Thu, Mar 11, 2021 at 11:18:11AM -0600, Jason Ekstrand wrote:
> > On Thu, Mar 11, 2021 at 10:51 AM Zbigniew Kempczyński
> >  wrote:
> > >
> > > On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote:
> > > > On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
> > > > >
> > > > > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  
> > > > > wrote:
> > > > > >
> > > > > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > > > > > The Vulkan driver in Mesa for Intel hardware never uses 
> > > > > > > > relocations if
> > > > > > > > it's running on a version of i915 that supports at least 
> > > > > > > > softpin which
> > > > > > > > all versions of i915 supporting Gen12 do.  On the OpenGL side, 
> > > > > > > > Gen12+ is
> > > > > > > > only supported by iris which never uses relocations.  The older 
> > > > > > > > i965
> > > > > > > > driver in Mesa does use relocations but it only supports Intel 
> > > > > > > > hardware
> > > > > > > > through Gen11 and has been deprecated for all hardware Gen9+.  
> > > > > > > > The
> > > > > > > > compute driver also never uses relocations.  This only leaves 
> > > > > > > > the media
> > > > > > > > driver which is supposed to be switching to softpin going 
> > > > > > > > forward.
> > > > > > > > Making softpin a requirement for all future hardware seems 
> > > > > > > > reasonable.
> > > > > > > >
> > > > > > > > Rejecting relocations starting with Gen12 has the benefit that 
> > > > > > > > we don't
> > > > > > > > have to bother supporting it on platforms with local memory.  
> > > > > > > > Given how
> > > > > > > > much CPU touching of memory is required for relocations, not 
> > > > > > > > having to
> > > > > > > > do so on platforms where not all memory is directly 
> > > > > > > > CPU-accessible
> > > > > > > > carries significant advantages.
> > > > > > > >
> > > > > > > > v2 (Jason Ekstrand):
> > > > > > > >  - Allow TGL-LP platforms as they've already shipped
> > > > > > > >
> > > > > > > > v3 (Jason Ekstrand):
> > > > > > > >  - WARN_ON platforms with LMEM support in case the check is 
> > > > > > > > wrong
> > > > > > >
> > > > > > > I was asked to review of this patch. It works along with expected
> > > > > > > IGT check 
> > > > > > > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> > > > > > >
> > > > > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() 
> > > > > > > better place
> > > > > > > to do for loop just after copy_from_user() and check 
> > > > > > > relocation_count?
> > > > > > > We have an access to exec2_list there, we know the gen so we're 
> > > > > > > able to say
> > > > > > > relocations are not supported immediate, without entering 
> > > > > > > i915_gem_do_execbuffer().
> > > > > >
> > > > > > I considered that but it adds an extra object list walk for a case
> > > > > > which we expect to not happen.  I'm not sure how expensive the list
> > > > > > walk would be if all we do is check the number of relocations on 
> > > > > > each
> > > > > > object.  I guess, if it comes right after a copy_from_user, it's all
> > > > > > hot in the cache so it shouldn't matter.  Ok.  I've convinced 
> > > > > > myself.
> > > > > > I'll move it.
> > > > >
> > > > > I really wouldn't move it if it's another list walk. Execbuf has a lot
> > > > > of fast-paths going on, and we have extensive tests to make sure it
> > > > > unwinds correctly in all cases. It's not very intuitive, but execbuf
> > > > > code isn't scoring very high on that.
> > > >
> > > > And here I'd just finished doing the typing to move it.  Good thing I
> > > > hadn't closed vim yet and it was still in my undo buffer. :-)
> > >
> > > Before entering "slower" path from my perspective I would just check
> > > batch object at that place. We still would have single list walkthrough
> > > and quick check on the very beginning.
> >
> > Can you be more specific about what exactly you think we can check at
> > the beginning?  Either we add a flag that we can O(1) check at the
> > beginning or we have to check everything in exec2_list for
> > exec2_list[n].relocation_count == 0.  That's a list walk.  I'm not
> > seeing what up-front check you're thinking we can do without the list
> > walk.
>
> I expect that last (default) or first (I915_EXEC_BATCH_FIRST) execobj
> (batch) will likely have relocations. So we can check that single
> object without entering i915_gem_do_execbuffer(). If that check
> is missed (relocation_count = 0) you'll catch relocations in other
> objects in check_relocations() as you already did. This is a simple
> optimization but we can avoid two iterations over the buffer list
> (first is in eb_lookup_vmas()).

Sure, we can do an early-exit check of the first and last objects.
I'm just not seeing what that saves us given that we still have to do

Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Zbigniew Kempczyński
On Thu, Mar 11, 2021 at 11:18:11AM -0600, Jason Ekstrand wrote:
> On Thu, Mar 11, 2021 at 10:51 AM Zbigniew Kempczyński
>  wrote:
> >
> > On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote:
> > > On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
> > > >
> > > > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  
> > > > wrote:
> > > > >
> > > > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > > > > The Vulkan driver in Mesa for Intel hardware never uses 
> > > > > > > relocations if
> > > > > > > it's running on a version of i915 that supports at least softpin 
> > > > > > > which
> > > > > > > all versions of i915 supporting Gen12 do.  On the OpenGL side, 
> > > > > > > Gen12+ is
> > > > > > > only supported by iris which never uses relocations.  The older 
> > > > > > > i965
> > > > > > > driver in Mesa does use relocations but it only supports Intel 
> > > > > > > hardware
> > > > > > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > > > > > compute driver also never uses relocations.  This only leaves the 
> > > > > > > media
> > > > > > > driver which is supposed to be switching to softpin going forward.
> > > > > > > Making softpin a requirement for all future hardware seems 
> > > > > > > reasonable.
> > > > > > >
> > > > > > > Rejecting relocations starting with Gen12 has the benefit that we 
> > > > > > > don't
> > > > > > > have to bother supporting it on platforms with local memory.  
> > > > > > > Given how
> > > > > > > much CPU touching of memory is required for relocations, not 
> > > > > > > having to
> > > > > > > do so on platforms where not all memory is directly CPU-accessible
> > > > > > > carries significant advantages.
> > > > > > >
> > > > > > > v2 (Jason Ekstrand):
> > > > > > >  - Allow TGL-LP platforms as they've already shipped
> > > > > > >
> > > > > > > v3 (Jason Ekstrand):
> > > > > > >  - WARN_ON platforms with LMEM support in case the check is wrong
> > > > > >
> > > > > > I was asked to review of this patch. It works along with expected
> > > > > > IGT check 
> > > > > > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> > > > > >
> > > > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() 
> > > > > > better place
> > > > > > to do for loop just after copy_from_user() and check 
> > > > > > relocation_count?
> > > > > > We have an access to exec2_list there, we know the gen so we're 
> > > > > > able to say
> > > > > > relocations are not supported immediate, without entering 
> > > > > > i915_gem_do_execbuffer().
> > > > >
> > > > > I considered that but it adds an extra object list walk for a case
> > > > > which we expect to not happen.  I'm not sure how expensive the list
> > > > > walk would be if all we do is check the number of relocations on each
> > > > > object.  I guess, if it comes right after a copy_from_user, it's all
> > > > > hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
> > > > > I'll move it.
> > > >
> > > > I really wouldn't move it if it's another list walk. Execbuf has a lot
> > > > of fast-paths going on, and we have extensive tests to make sure it
> > > > unwinds correctly in all cases. It's not very intuitive, but execbuf
> > > > code isn't scoring very high on that.
> > >
> > > And here I'd just finished doing the typing to move it.  Good thing I
> > > hadn't closed vim yet and it was still in my undo buffer. :-)
> >
> > Before entering "slower" path from my perspective I would just check
> > batch object at that place. We still would have single list walkthrough
> > and quick check on the very beginning.
> 
> Can you be more specific about what exactly you think we can check at
> the beginning?  Either we add a flag that we can O(1) check at the
> beginning or we have to check everything in exec2_list for
> exec2_list[n].relocation_count == 0.  That's a list walk.  I'm not
> seeing what up-front check you're thinking we can do without the list
> walk.

I expect that last (default) or first (I915_EXEC_BATCH_FIRST) execobj
(batch) will likely have relocations. So we can check that single 
object without entering i915_gem_do_execbuffer(). If that check
is missed (relocation_count = 0) you'll catch relocations in other
objects in check_relocations() as you already did. This is a simple
optimization but we can avoid two iterations over the buffer list 
(first is in eb_lookup_vmas()).

--
Zbigniew

> 
> --Jason
> 
> > --
> > Zbigniew
> >
> > >
> > > --Jason
> > >
> > > > -Daniel
> > > >
> > > > >
> > > > > --Jason
> > > > >
> > > > > > --
> > > > > > Zbigniew
> > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Jason Ekstrand 
> > > > > > > Cc: Dave Airlie 
> > > > > > > Cc: Daniel Vetter 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 
> > > > > > > ---
> > > > > > >  1 file changed, 12 

RE: [PATCH 2/2] drm/amdgpu: fix a few compiler warnings

2021-03-11 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Rajneesh Bhardwaj 

-Original Message-
From: amd-gfx  On Behalf Of Oak Zeng
Sent: Wednesday, March 10, 2021 10:29 PM
To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
Cc: Zeng, Oak 
Subject: [PATCH 2/2] drm/amdgpu: fix a few compiler warnings

[CAUTION: External Email]

1. Make function mmhub_v1_7_setup_vm_pt_regs static
2. Indent an if statement

Signed-off-by: Oak Zeng 
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c | 4 ++--  
drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
index 3b4193c..8fca72e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
@@ -88,14 +88,14 @@ int gfxhub_v1_1_get_xgmi_info(struct amdgpu_device *adev)
adev->gmc.xgmi.num_physical_nodes = max_region + 1;

if (adev->gmc.xgmi.num_physical_nodes > max_num_physical_nodes)
-   return -EINVAL;
+   return -EINVAL;

adev->gmc.xgmi.physical_node_id =
REG_GET_FIELD(xgmi_lfb_cntl, MC_VM_XGMI_LFB_CNTL,
  PF_LFB_REGION);

if (adev->gmc.xgmi.physical_node_id > max_physical_node_id)
-   return -EINVAL;
+   return -EINVAL;

adev->gmc.xgmi.node_segment_size = seg_size;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
index ac74d66..29d7f50 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
@@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct amdgpu_device *adev)
return base;
 }

-void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
+static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
 			uint64_t page_table_base)
 {
struct amdgpu_vmhub *hub = >vmhub[AMDGPU_MMHUB_0];
--
2.7.4

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 11:25 AM Boris Brezillon
 wrote:
>
> Hi Jason,
>
> On Thu, 11 Mar 2021 10:58:46 -0600
> Jason Ekstrand  wrote:
>
> > Hi all,
> >
> > Dropping in where I may or may not be wanted, so feel free to ignore. :-)
>
> I'm glad you decided to chime in. :-)
>
>
> > > > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > > > explicit/implicit deps between them), and Vulkan asks that the out
> > > > > fence be signaled when all jobs are done. Timeline syncobjs are a
> > > > > good match for that use case. All we need to do is pass the same
> > > > > fence syncobj to all jobs being attached to a single QueueSubmit
> > > > > request, but a different point on the timeline. The syncobj
> > > > > timeline wait does the rest and guarantees that we've reached a
> > > > > given timeline point (IOW, all jobs before that point are done)
> > > > > before declaring the fence as signaled.
> > > > > One alternative would be to have dummy 'synchronization' jobs that
> > > > > don't actually execute anything on the GPU but declare a 
> > > > > dependency
> > > > > on all other jobs that are part of the QueueSubmit request, and
> > > > > signal the out fence (the scheduler would do most of the work for
> > > > > us, all we have to do is support NULL job heads and signal the
> > > > > fence directly when that happens instead of queueing the job).
> > > >
> > > > I have to admit to being rather hazy on the details of timeline
> > > > syncobjs, but I thought there was a requirement that the timeline moves
> > > > monotonically. I.e. if you have multiple jobs signalling the same
> > > > syncobj just with different points, then AFAIU the API requires that the
> > > > points are triggered in order.
> > >
> > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so I
> > > might be wrong, but my understanding is that queuing fences (addition
> > > of new points in the timeline) should happen in order, but signaling
> > > can happen in any order. When checking for a signaled fence the
> > > fence-chain logic starts from the last point (or from an explicit point
> > > if you use the timeline wait flavor) and goes backward, stopping at the
> > > first un-signaled node. If I'm correct, that means that fences that
> > > are part of a chain can be signaled in any order.
> >
> > You don't even need a timeline for this.  Just have a single syncobj
> > per-queue and make each submit wait on it and then signal it.
> > Alternatively, you can just always hang on to the out-fence from the
> > previous submit and make the next one wait on that.
>
> That's what I have right now, but it forces the serialization of all
> jobs that are pushed during a submit (and there can be more than one
> per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd
> be better to not force this serialization if we can avoid it.

I'm not familiar with panfrost's needs and I don't work on a tiler and
I know there are different issues there.  But...

The Vulkan spec requires that all the submits that
happen on a given vkQueue happen in-order.  Search the spec for
"Submission order" for more details.

So, generally speaking, there are some in-order requirements there.
Again, not having a lot of tiler experience, I'm not the one to weigh
in.

> > Timelines are overkill here, IMO.
>
> Mind developing why you think this is overkill? After looking at the
> kernel implementation I thought using timeline syncobjs would be
> pretty cheap compared to the other options I considered.

If you use a regular syncobj, every time you wait on it it inserts a
dependency between the current submit and the last thing to signal it
on the CPU timeline.  The internal dma_fences will hang around
as-needed to ensure those dependencies.  If you use a timeline, you
have to also track a uint64_t to reference the current time point.
This may work if you need to sync a bunch of in-flight stuff in one
go, but if you're trying to serialize, it's just extra
tracking for no point.  Again, maybe there's something I'm missing and
you don't actually want to serialize.

--Jason


> >
> > > Note that I also considered using a sync file, which has the ability to
> > > merge fences, but that required 3 extra ioctls for each syncobj to merge
> > > (for the export/merge/import round trip), and AFAICT, fences stay around
> > > until the sync file is destroyed, which forces some garbage collection
> > > if we want to keep the number of active fences low. One nice thing
> > > about the fence-chain/syncobj-timeline logic is that signaled fences
> > > are collected automatically when walking the chain.
> >
> > Yeah, that's the pain when working with sync files.  This is one of
> > the reasons why our driver takes an arbitrary number of in/out
> > syncobjs.
> >
> > > > So I'm not sure that you've actually fixed this point - you either need
> > > > to force an order 

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-11 Thread Boris Brezillon
Hi Jason,

On Thu, 11 Mar 2021 10:58:46 -0600
Jason Ekstrand  wrote:

> Hi all,
> 
> Dropping in where I may or may not be wanted, so feel free to ignore. :-)

I'm glad you decided to chime in. :-)

 
> > > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > > explicit/implicit deps between them), and Vulkan asks that the out
> > > > fence be signaled when all jobs are done. Timeline syncobjs are a
> > > > good match for that use case. All we need to do is pass the same
> > > > fence syncobj to all jobs being attached to a single QueueSubmit
> > > > request, but a different point on the timeline. The syncobj
> > > > timeline wait does the rest and guarantees that we've reached a
> > > > given timeline point (IOW, all jobs before that point are done)
> > > > before declaring the fence as signaled.
> > > > One alternative would be to have dummy 'synchronization' jobs that
> > > > don't actually execute anything on the GPU but declare a dependency
> > > > on all other jobs that are part of the QueueSubmit request, and
> > > > signal the out fence (the scheduler would do most of the work for
> > > > us, all we have to do is support NULL job heads and signal the
> > > > fence directly when that happens instead of queueing the job).  
> > >
> > > I have to admit to being rather hazy on the details of timeline
> > > syncobjs, but I thought there was a requirement that the timeline moves
> > > monotonically. I.e. if you have multiple jobs signalling the same
> > > syncobj just with different points, then AFAIU the API requires that the
> > > points are triggered in order.  
> >
> > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so I
> > might be wrong, but my understanding is that queuing fences (addition
> > of new points in the timeline) should happen in order, but signaling
> > can happen in any order. When checking for a signaled fence the
> > fence-chain logic starts from the last point (or from an explicit point
> > if you use the timeline wait flavor) and goes backward, stopping at the
> > first un-signaled node. If I'm correct, that means that fences that
> > are part of a chain can be signaled in any order.  
> 
> You don't even need a timeline for this.  Just have a single syncobj
> per-queue and make each submit wait on it and then signal it.
> Alternatively, you can just always hang on to the out-fence from the
> previous submit and make the next one wait on that.

That's what I have right now, but it forces the serialization of all
jobs that are pushed during a submit (and there can be more than one
per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd
be better to not force this serialization if we can avoid it.

> Timelines are overkill here, IMO.

Mind developing why you think this is overkill? After looking at the
kernel implementation I thought using timeline syncobjs would be
pretty cheap compared to the other options I considered.

> 
> > Note that I also considered using a sync file, which has the ability to
> > merge fences, but that required 3 extra ioctls for each syncobj to merge
> > (for the export/merge/import round trip), and AFAICT, fences stay around
> > until the sync file is destroyed, which forces some garbage collection
> > if we want to keep the number of active fences low. One nice thing
> > about the fence-chain/syncobj-timeline logic is that signaled fences
> > are collected automatically when walking the chain.  
> 
> Yeah, that's the pain when working with sync files.  This is one of
> the reasons why our driver takes an arbitrary number of in/out
> syncobjs.
> 
> > > So I'm not sure that you've actually fixed this point - you either need
> > > to force an order (in which case the last job can signal the Vulkan
> > > fence)  
> >
> > That options requires adding deps that do not really exist on the last
> > jobs, so I'm not sure I like it.
> >  
> > > or you still need a dummy job to do the many-to-one dependency.  
> >
> > Yeah, that's what I've considered doing before realizing timelined
> > syncojbs could solve this problem (assuming I got it right :-/).
> >  
> > >
> > > Or I may have completely misunderstood timeline syncobjs - definitely a
> > > possibility :)  
> >
> > I wouldn't pretend I got it right either :-).
> >  
> > >  
> > > > 3/ The current implementation lacks information about BO access,
> > > > so we serialize all jobs accessing the same set of BOs, even
> > > > if those jobs might just be reading from them (which can
> > > > happen concurrently). Other drivers pass an access type to the
> > > > list of referenced BOs to address that. Another option would be
> > > > to disable implicit deps (deps based on BOs) and force the driver
> > > > to pass all deps explicitly (interestingly, some drivers have
> > > > both the no-implicit-dep and r/w flags, probably to support
> > > > sub-resource access, so we might 

[PATCH v15 1/2] drm/tegra: dc: Support memory bandwidth management

2021-03-11 Thread Dmitry Osipenko
The display controller (DC) performs isochronous memory transfers, and
thus has a minimum memory bandwidth requirement that must be fulfilled,
otherwise framebuffer data can't be fetched fast enough; this results
in a DC data-FIFO underflow that is followed by visible corruption.

The Memory Controller drivers provide facility for memory bandwidth
management via interconnect API. Let's wire up the interconnect API
support to the DC driver in order to fix the distorted display output
on T30 Ouya, T124 TK1 and other Tegra devices.

Tested-by: Peter Geis  # Ouya T30
Tested-by: Matt Merhar  # Ouya T30
Tested-by: Nicolas Chauvet  # PAZ00 T20 and TK1 T124
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 352 ++
 drivers/gpu/drm/tegra/dc.h|  14 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 127 
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 526 insertions(+)

diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 5043dcaf1cf9..1650a448eabd 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -9,6 +9,7 @@ config DRM_TEGRA
select DRM_MIPI_DSI
select DRM_PANEL
select TEGRA_HOST1X
+   select INTERCONNECT
select IOMMU_IOVA
select CEC_CORE if CEC_NOTIFIER
help
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 0ae3a025efe9..49fa488cf930 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -616,6 +617,9 @@ static int tegra_plane_atomic_check(struct drm_plane *plane,
struct tegra_dc *dc = to_tegra_dc(state->crtc);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -802,6 +806,12 @@ static struct drm_plane *tegra_primary_plane_create(struct drm_device *drm,
formats = dc->soc->primary_formats;
modifiers = dc->soc->modifiers;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
err = drm_universal_plane_init(drm, >base, possible_crtcs,
   _plane_funcs, formats,
   num_formats, modifiers, type, NULL);
@@ -833,9 +843,13 @@ static const u32 tegra_cursor_plane_formats[] = {
 static int tegra_cursor_atomic_check(struct drm_plane *plane,
 struct drm_plane_state *state)
 {
+   struct tegra_plane_state *plane_state = to_tegra_plane_state(state);
struct tegra_plane *tegra = to_tegra_plane(plane);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -973,6 +987,12 @@ static struct drm_plane 
*tegra_dc_cursor_plane_create(struct drm_device *drm,
num_formats = ARRAY_SIZE(tegra_cursor_plane_formats);
formats = tegra_cursor_plane_formats;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
				       &tegra_plane_funcs, formats,
				       num_formats, NULL,
@@ -1087,6 +1107,12 @@ static struct drm_plane 
*tegra_dc_overlay_plane_create(struct drm_device *drm,
num_formats = dc->soc->num_overlay_formats;
formats = dc->soc->overlay_formats;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
if (!cursor)
type = DRM_PLANE_TYPE_OVERLAY;
else
@@ -1204,6 +1230,7 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
 {
struct tegra_dc_state *state = to_dc_state(crtc->state);
struct tegra_dc_state *copy;
+   unsigned int i;
 
copy = kmalloc(sizeof(*copy), GFP_KERNEL);
if (!copy)
@@ -1215,6 +1242,9 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
copy->div = state->div;
copy->planes = state->planes;
 
+   for (i = 0; i < ARRAY_SIZE(state->plane_peak_bw); i++)
+   copy->plane_peak_bw[i] = state->plane_peak_bw[i];
+
	return &copy->base;
 }
 
@@ -1741,6 +1771,106 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, 
unsigned long timeout)
return -ETIMEDOUT;
 }
 
+static void
+tegra_crtc_update_memory_bandwidth(struct drm_crtc 

[PATCH v15 2/2] drm/tegra: dc: Extend debug stats with total number of events

2021-03-11 Thread Dmitry Osipenko
It's useful to know the total number of underflow events and currently
the debug stats are getting reset each time CRTC is being disabled. Let's
account the overall number of events that doesn't get a reset.

Tested-by: Peter Geis 
Tested-by: Nicolas Chauvet 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/dc.c | 10 ++
 drivers/gpu/drm/tegra/dc.h |  5 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 49fa488cf930..ecac28e814ec 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -1539,6 +1539,11 @@ static int tegra_dc_show_stats(struct seq_file *s, void 
*data)
seq_printf(s, "underflow: %lu\n", dc->stats.underflow);
seq_printf(s, "overflow: %lu\n", dc->stats.overflow);
 
+   seq_printf(s, "frames total: %lu\n", dc->stats.frames_total);
+   seq_printf(s, "vblank total: %lu\n", dc->stats.vblank_total);
+   seq_printf(s, "underflow total: %lu\n", dc->stats.underflow_total);
+   seq_printf(s, "overflow total: %lu\n", dc->stats.overflow_total);
+
return 0;
 }
 
@@ -2313,6 +2318,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): frame end\n", __func__);
*/
+   dc->stats.frames_total++;
dc->stats.frames++;
}
 
@@ -2321,6 +2327,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
dev_dbg(dc->dev, "%s(): vertical blank\n", __func__);
*/
		drm_crtc_handle_vblank(&dc->base);
+   dc->stats.vblank_total++;
dc->stats.vblank++;
}
 
@@ -2328,6 +2335,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): underflow\n", __func__);
*/
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
@@ -2335,11 +2343,13 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): overflow\n", __func__);
*/
+   dc->stats.overflow_total++;
dc->stats.overflow++;
}
 
if (status & HEAD_UF_INT) {
dev_dbg_ratelimited(dc->dev, "%s(): head underflow\n", 
__func__);
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
index 69d4cca2e58c..ad8d51a55a00 100644
--- a/drivers/gpu/drm/tegra/dc.h
+++ b/drivers/gpu/drm/tegra/dc.h
@@ -48,6 +48,11 @@ struct tegra_dc_stats {
unsigned long vblank;
unsigned long underflow;
unsigned long overflow;
+
+   unsigned long frames_total;
+   unsigned long vblank_total;
+   unsigned long underflow_total;
+   unsigned long overflow_total;
 };
 
 struct tegra_windowgroup_soc {
-- 
2.29.2



[PATCH v15 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver

2021-03-11 Thread Dmitry Osipenko
This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
which is done using interconnect framework. It fixes display corruption that
happens due to insufficient memory bandwidth.

Changelog:

v15: - Corrected tegra_plane_icc_names[] NULL-check that was partially lost
   by accident in v14 after unsuccessful rebase.

v14: - Made improvements that were suggested by Michał Mirosław to v13:

   - Changed 'unsigned int' to 'bool'.
   - Renamed functions which calculate bandwidth state.
   - Reworked comment in the code that explains why downscaled plane
 require higher bandwidth.
   - Added round-up to bandwidth calculation.
   - Added sanity checks of the plane index and fixed out-of-bounds
 access which happened on T124 due to the cursor plane index.

v13: - No code changes. Patches missed v5.12, re-sending them for v5.13.

Dmitry Osipenko (2):
  drm/tegra: dc: Support memory bandwidth management
  drm/tegra: dc: Extend debug stats with total number of events

 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 362 ++
 drivers/gpu/drm/tegra/dc.h|  19 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 127 
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 541 insertions(+)

-- 
2.29.2



Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 10:51 AM Zbigniew Kempczyński
 wrote:
>
> On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote:
> > On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
> > >
> > > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  
> > > wrote:
> > > >
> > > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> > > >  wrote:
> > > > >
> > > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > > > The Vulkan driver in Mesa for Intel hardware never uses relocations 
> > > > > > if
> > > > > > it's running on a version of i915 that supports at least softpin 
> > > > > > which
> > > > > > all versions of i915 supporting Gen12 do.  On the OpenGL side, 
> > > > > > Gen12+ is
> > > > > > only supported by iris which never uses relocations.  The older i965
> > > > > > driver in Mesa does use relocations but it only supports Intel 
> > > > > > hardware
> > > > > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > > > > compute driver also never uses relocations.  This only leaves the 
> > > > > > media
> > > > > > driver which is supposed to be switching to softpin going forward.
> > > > > > Making softpin a requirement for all future hardware seems 
> > > > > > reasonable.
> > > > > >
> > > > > > Rejecting relocations starting with Gen12 has the benefit that we 
> > > > > > don't
> > > > > > have to bother supporting it on platforms with local memory.  Given 
> > > > > > how
> > > > > > much CPU touching of memory is required for relocations, not having 
> > > > > > to
> > > > > > do so on platforms where not all memory is directly CPU-accessible
> > > > > > carries significant advantages.
> > > > > >
> > > > > > v2 (Jason Ekstrand):
> > > > > >  - Allow TGL-LP platforms as they've already shipped
> > > > > >
> > > > > > v3 (Jason Ekstrand):
> > > > > >  - WARN_ON platforms with LMEM support in case the check is wrong
> > > > >
> > > > > I was asked to review of this patch. It works along with expected
> > > > > IGT check 
> > > > > https://patchwork.freedesktop.org/patch/423361/?series=82954&rev=25
> > > > >
> > > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() better 
> > > > > place
> > > > > to do for loop just after copy_from_user() and check relocation_count?
> > > > > We have an access to exec2_list there, we know the gen so we're able 
> > > > > to say
> > > > > relocations are not supported immediate, without entering 
> > > > > i915_gem_do_execbuffer().
> > > >
> > > > I considered that but it adds an extra object list walk for a case
> > > > which we expect to not happen.  I'm not sure how expensive the list
> > > > walk would be if all we do is check the number of relocations on each
> > > > object.  I guess, if it comes right after a copy_from_user, it's all
> > > > hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
> > > > I'll move it.
> > >
> > > I really wouldn't move it if it's another list walk. Execbuf has a lot
> > > of fast-paths going on, and we have extensive tests to make sure it
> > > unwinds correctly in all cases. It's not very intuitive, but execbuf
> > > code isn't scoring very high on that.
> >
> > And here I'd just finished doing the typing to move it.  Good thing I
> > hadn't closed vim yet and it was still in my undo buffer. :-)
>
> From my perspective, before entering the "slower" path I would just check
> the batch object at that place. We would still have a single list
> walkthrough and a quick check at the very beginning.

Can you be more specific about what exactly you think we can check at
the beginning?  Either we add a flag that we can O(1) check at the
beginning or we have to check everything in exec2_list for
exec2_list[n].relocation_count == 0.  That's a list walk.  I'm not
seeing what up-front check you're thinking we can do without the list
walk.

--Jason

> --
> Zbigniew
>
> >
> > --Jason
> >
> > > -Daniel
> > >
> > > >
> > > > --Jason
> > > >
> > > > > --
> > > > > Zbigniew
> > > > >
> > > > > >
> > > > > > Signed-off-by: Jason Ekstrand 
> > > > > > Cc: Dave Airlie 
> > > > > > Cc: Daniel Vetter 
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
> > > > > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> > > > > > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > index 99772f37bff60..b02dbd16bfa03 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > @@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer 
> > > > > > *eb, struct eb_vma *ev)
> > > > > >   return err;
> > > > > >  }
> > > > > >
> > > > > > -static int check_relocations(const struct 
> > > > > > drm_i915_gem_exec_object2 *entry)
> > > > > > +static int check_relocations(const struct i915_execbuffer *eb,
> > > > > > +  const struct 

Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Noralf Trønnes


On 11.03.2021 15.48, Peter Stuge wrote:
> Noralf Trønnes wrote:
>>> I didn't receive the expected bits/bytes for RGB111 on the bulk endpoint,
>>> I think because of how components were extracted in gud_xrgb_to_color().
>>>
>>> Changing to the following gets me the expected (X R1 G1 B1 X R2 G2 B2) 
>>> bytes:
>>>
>>> r = (*pix32 >> 8) & 0xff;
>>> g = (*pix32 >> 16) & 0xff;
>>> b = (*pix32++ >> 24) & 0xff;
>>
>> We're accessing the whole word here through pix32, no byte access, so
>> endianess doesn't come into play.
> 
> Endianness matters because parts of pix32 are used.
> 

This code:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    volatile uint32_t endian = 0x01234567;
    uint32_t v = 0xaabbccdd;
    uint32_t *pix32 = &v;
    uint8_t r, g, b, *p;

    r = *pix32 >> 16;
    g = *pix32 >> 8;
    b = *pix32++;

    printf("xrgb=%08x\n", v);

    printf("32-bit access:\n");
    printf("r=%02x\n", r);
    printf("g=%02x\n", g);
    printf("b=%02x\n", b);

    printf("Byte access on %s:\n", (*((uint8_t *)&endian)) == 0x67 ? "LE"
           : "BE");
    p = (uint8_t *)&v;
    printf("r=%02x\n", p[1]);
    printf("g=%02x\n", p[2]);
    printf("b=%02x\n", p[3]);
}

prints:

xrgb=aabbccdd
32-bit access:
r=bb
g=cc
b=dd
Byte access on LE:
r=cc
g=bb
b=aa

> Software only sees bytes (or larger) because addresses are byte granular,
> but must pay attention to the bit order when dealing with smaller values
> inside larger memory accesses.
> 
> Given 4 bytes of memory {0x11, 0x22, 0x33, 0x44} at address A, both LE
> and BE machines appear the same when accessing individual bytes, but with
> uint32_t *a32 = A then a32[0] is 0x44332211 on LE and 0x11223344 on BE.
> 
> 
> Hence the question: What does DRM promise about the XRGB8888 mode?
> 

That it's a 32-bit value. From include/uapi/drm/drm_fourcc.h:

/* 32 bpp RGB */
#define DRM_FORMAT_XRGB8888 fourcc_code('X', 'R', '2', '4') /* [31:0] x:R:G:B 8:8:8:8 little endian */

If a raw buffer was passed from a BE to an LE machine, there would be
problems because of how the value is stored, but here it's the same
endianness in userspace and kernel space.

There is code in gud_prep_flush() that handles a BE host with a
multibyte format:

} else if (gud_is_big_endian() && format->cpp[0] > 1) {
drm_fb_swab(buf, vaddr, fb, rect, !import_attach);

In this case we can't just pass on the raw buffer to the device since
the protocol is LE, and thus have to swap the bytes to match up how
they're stored in memory on the device.

I'm not losing any of the colors when running modetest. This is the
test image that modetest uses and it comes through just like that:
https://commons.wikimedia.org/wiki/File:SMPTE_Color_Bars.svg

Noralf.

> Is it guaranteed that the first byte in memory is always unused, the second
> represents red, the third green and the fourth blue (endianess agnostic)?
> I'd expect this and I guess that it is the case, but I don't know DRM?
> 
> Or is it instead guaranteed that when accessed natively as one 32-bit
> value the blue component is always in the most significant byte (endianess
> abstracted, always LE in memory) or always in the least significant byte
> (endianess abstracted, always BE in memory)?
> This would be annoying for userspace, but well, it's possible.
> 
> In the abstracted (latter) case pix32 would work, but could still be
> questioned on style, and in fact, pix32 didn't work for me, so at a
> minimum the byte order would be the reverse.
> 
> 
> In the agnostic (former) case your code was correct for BE and mine
> for LE, but I'd then suggest using a u8 * to both work correctly
> everywhere and be obvious.
> 
> 
>> This change will flip r and b, which gives: XRGB8888 -> XBGR8888
> 
> The old code was:
>   r = *pix32 >> 16;
>   g = *pix32 >> 8;
>   b = *pix32++;
> 
> On my LE machine this set r to the third byte (G), g to the second (R)
> and b to the first (X), explaining the color confusion that I saw.
> 
> 
>> BGR is a common thing on controllers, are you sure yours are set to RGB
>> and not BGR?
> 
> Yes; I've verified that my display takes red in MSB both in its data
> sheet and by writing raw bits to it on a system without the gud driver.
> 
> 
>> And the 0xff masking isn't necessary since we're assigning to a byte, right?
> 
> Not strictly neccessary but I like to do it anyway, both to be explicit
> and also to ensure that the compiler will never sign extend, if types
> are changed or if values become treated as signed and/or larger by the
> compiler because the code is changed.
> 
> It's frustrating to debug such unexpected changes in behavior due to
> a type change or calculation change, but if you find it too defensive
> then go ahead and remove it, if pix32 does stay.
> 
> 
>> I haven't got a native R1G1B1 display so I have emulated and I do get

Re: [PATCH v14 1/2] drm/tegra: dc: Support memory bandwidth management

2021-03-11 Thread Dmitry Osipenko
On 11.03.2021 20:06, Dmitry Osipenko wrote:
> +static const char * const tegra_plane_icc_names[TEGRA_DC_LEGACY_PLANES_NUM] 
> = {
> + "wina", "winb", "winc", "", "", "", "cursor",
> +};
> +
> +int tegra_plane_interconnect_init(struct tegra_plane *plane)
> +{
> + const char *icc_name = tegra_plane_icc_names[plane->index];
> + struct device *dev = plane->dc->dev;
> + struct tegra_dc *dc = plane->dc;
> + int err;
> +
> + if (WARN_ON(plane->index >= TEGRA_DC_LEGACY_PLANES_NUM) ||
> + WARN_ON(!tegra_plane_icc_names[plane->index]))
> + return -EINVAL;

It just occurred to me that I added the NULL-check here, but forgot to
change "" to NULLs. I'll make a v15 shortly.


Re: [PATCH v14 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver

2021-03-11 Thread Dmitry Osipenko
On 11.03.2021 20:06, Dmitry Osipenko wrote:
> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
> which is done using interconnect framework. It fixes display corruption that
> happens due to insufficient memory bandwidth.
> 
> Changelog:
> 
> v14: - Made improvements that were suggested by Michał Mirosław to v13:
> 
>- Changed 'unsigned int' to 'bool'.
>- Renamed functions which calculate bandwidth state.
>- Reworked comment in the code that explains why downscaled plane
>  require higher bandwidth.
>- Added round-up to bandwidth calculation.
>- Added sanity checks of the plane index and fixed out-of-bounds
>  access which happened on T124 due to the cursor plane index.
> 
> v13: - No code changes. Patches missed v5.12, re-sending them for v5.13.
> 
> Dmitry Osipenko (2):
>   drm/tegra: dc: Support memory bandwidth management
>   drm/tegra: dc: Extend debug stats with total number of events
> 
>  drivers/gpu/drm/tegra/Kconfig |   1 +
>  drivers/gpu/drm/tegra/dc.c| 362 ++
>  drivers/gpu/drm/tegra/dc.h|  19 ++
>  drivers/gpu/drm/tegra/drm.c   |  14 ++
>  drivers/gpu/drm/tegra/hub.c   |   3 +
>  drivers/gpu/drm/tegra/plane.c | 127 
>  drivers/gpu/drm/tegra/plane.h |  15 ++
>  7 files changed, 541 insertions(+)
> 

Michał, please let me know if v14 looks good to you. I'll appreciate
yours r-b, thanks in advance.


[PATCH v14 1/2] drm/tegra: dc: Support memory bandwidth management

2021-03-11 Thread Dmitry Osipenko
Display controller (DC) performs isochronous memory transfers, and thus,
has a requirement for a minimum memory bandwidth that shall be fulfilled,
otherwise framebuffer data can't be fetched fast enough and this results
in a DC's data-FIFO underflow that follows by a visual corruption.

The Memory Controller drivers provide facility for memory bandwidth
management via interconnect API. Let's wire up the interconnect API
support to the DC driver in order to fix the distorted display output
on T30 Ouya, T124 TK1 and other Tegra devices.

Tested-by: Peter Geis  # Ouya T30
Tested-by: Matt Merhar  # Ouya T30
Tested-by: Nicolas Chauvet  # PAZ00 T20 and TK1 T124
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 352 ++
 drivers/gpu/drm/tegra/dc.h|  14 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 127 
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 526 insertions(+)

diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 5043dcaf1cf9..1650a448eabd 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -9,6 +9,7 @@ config DRM_TEGRA
select DRM_MIPI_DSI
select DRM_PANEL
select TEGRA_HOST1X
+   select INTERCONNECT
select IOMMU_IOVA
select CEC_CORE if CEC_NOTIFIER
help
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 0ae3a025efe9..49fa488cf930 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -616,6 +617,9 @@ static int tegra_plane_atomic_check(struct drm_plane *plane,
struct tegra_dc *dc = to_tegra_dc(state->crtc);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -802,6 +806,12 @@ static struct drm_plane *tegra_primary_plane_create(struct 
drm_device *drm,
formats = dc->soc->primary_formats;
modifiers = dc->soc->modifiers;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
				       &tegra_plane_funcs, formats,
				       num_formats, modifiers, type, NULL);
@@ -833,9 +843,13 @@ static const u32 tegra_cursor_plane_formats[] = {
 static int tegra_cursor_atomic_check(struct drm_plane *plane,
 struct drm_plane_state *state)
 {
+   struct tegra_plane_state *plane_state = to_tegra_plane_state(state);
struct tegra_plane *tegra = to_tegra_plane(plane);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -973,6 +987,12 @@ static struct drm_plane 
*tegra_dc_cursor_plane_create(struct drm_device *drm,
num_formats = ARRAY_SIZE(tegra_cursor_plane_formats);
formats = tegra_cursor_plane_formats;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
				       &tegra_plane_funcs, formats,
				       num_formats, NULL,
@@ -1087,6 +1107,12 @@ static struct drm_plane 
*tegra_dc_overlay_plane_create(struct drm_device *drm,
num_formats = dc->soc->num_overlay_formats;
formats = dc->soc->overlay_formats;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
if (!cursor)
type = DRM_PLANE_TYPE_OVERLAY;
else
@@ -1204,6 +1230,7 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
 {
struct tegra_dc_state *state = to_dc_state(crtc->state);
struct tegra_dc_state *copy;
+   unsigned int i;
 
copy = kmalloc(sizeof(*copy), GFP_KERNEL);
if (!copy)
@@ -1215,6 +1242,9 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
copy->div = state->div;
copy->planes = state->planes;
 
+   for (i = 0; i < ARRAY_SIZE(state->plane_peak_bw); i++)
+   copy->plane_peak_bw[i] = state->plane_peak_bw[i];
+
	return &copy->base;
 }
 
@@ -1741,6 +1771,106 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, 
unsigned long timeout)
return -ETIMEDOUT;
 }
 
+static void
+tegra_crtc_update_memory_bandwidth(struct drm_crtc 

[PATCH v14 2/2] drm/tegra: dc: Extend debug stats with total number of events

2021-03-11 Thread Dmitry Osipenko
It's useful to know the total number of underflow events and currently
the debug stats are getting reset each time CRTC is being disabled. Let's
account the overall number of events that doesn't get a reset.

Tested-by: Peter Geis 
Tested-by: Nicolas Chauvet 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/dc.c | 10 ++
 drivers/gpu/drm/tegra/dc.h |  5 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 49fa488cf930..ecac28e814ec 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -1539,6 +1539,11 @@ static int tegra_dc_show_stats(struct seq_file *s, void 
*data)
seq_printf(s, "underflow: %lu\n", dc->stats.underflow);
seq_printf(s, "overflow: %lu\n", dc->stats.overflow);
 
+   seq_printf(s, "frames total: %lu\n", dc->stats.frames_total);
+   seq_printf(s, "vblank total: %lu\n", dc->stats.vblank_total);
+   seq_printf(s, "underflow total: %lu\n", dc->stats.underflow_total);
+   seq_printf(s, "overflow total: %lu\n", dc->stats.overflow_total);
+
return 0;
 }
 
@@ -2313,6 +2318,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): frame end\n", __func__);
*/
+   dc->stats.frames_total++;
dc->stats.frames++;
}
 
@@ -2321,6 +2327,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
dev_dbg(dc->dev, "%s(): vertical blank\n", __func__);
*/
		drm_crtc_handle_vblank(&dc->base);
+   dc->stats.vblank_total++;
dc->stats.vblank++;
}
 
@@ -2328,6 +2335,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): underflow\n", __func__);
*/
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
@@ -2335,11 +2343,13 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): overflow\n", __func__);
*/
+   dc->stats.overflow_total++;
dc->stats.overflow++;
}
 
if (status & HEAD_UF_INT) {
dev_dbg_ratelimited(dc->dev, "%s(): head underflow\n", 
__func__);
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
index 69d4cca2e58c..ad8d51a55a00 100644
--- a/drivers/gpu/drm/tegra/dc.h
+++ b/drivers/gpu/drm/tegra/dc.h
@@ -48,6 +48,11 @@ struct tegra_dc_stats {
unsigned long vblank;
unsigned long underflow;
unsigned long overflow;
+
+   unsigned long frames_total;
+   unsigned long vblank_total;
+   unsigned long underflow_total;
+   unsigned long overflow_total;
 };
 
 struct tegra_windowgroup_soc {
-- 
2.29.2



[PATCH v14 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver

2021-03-11 Thread Dmitry Osipenko
This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
which is done using interconnect framework. It fixes display corruption that
happens due to insufficient memory bandwidth.

Changelog:

v14: - Made improvements that were suggested by Michał Mirosław to v13:

   - Changed 'unsigned int' to 'bool'.
   - Renamed functions which calculate bandwidth state.
   - Reworked comment in the code that explains why downscaled plane
 require higher bandwidth.
   - Added round-up to bandwidth calculation.
   - Added sanity checks of the plane index and fixed out-of-bounds
 access which happened on T124 due to the cursor plane index.

v13: - No code changes. Patches missed v5.12, re-sending them for v5.13.

Dmitry Osipenko (2):
  drm/tegra: dc: Support memory bandwidth management
  drm/tegra: dc: Extend debug stats with total number of events

 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 362 ++
 drivers/gpu/drm/tegra/dc.h|  19 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 127 
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 541 insertions(+)

-- 
2.29.2



Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-11 Thread Jason Ekstrand
Hi all,

Dropping in where I may or may not be wanted, so feel free to ignore. :-)

On Thu, Mar 11, 2021 at 7:00 AM Boris Brezillon
 wrote:
>
> Hi Steven,
>
> On Thu, 11 Mar 2021 12:16:33 +
> Steven Price  wrote:
>
> > On 11/03/2021 09:25, Boris Brezillon wrote:
> > > Hello,
> > >
> > > I've been playing with Vulkan lately and struggled quite a bit to
> > > implement VkQueueSubmit with the submit ioctl we have. There are
> > > several limiting factors that can be worked around if we really have to,
> > > but I think it'd be much easier and future-proof if we introduce a new
> > > ioctl that addresses the current limitations:
> >
> > Hi Boris,
> >
> > I think what you've proposed is quite reasonable, some detailed comments
> > to your points below.
> >
> > >
> > > 1/ There can only be one out_sync, but Vulkan might ask us to signal
> > > several VkSemaphores and possibly one VkFence too, both of those
> > > being based on sync objects in my PoC. Making out_sync an array of
> > > syncobjs to attach the render_done fence to would make that possible.
> > > The other option would be to collect syncobj updates in userspace
> > > in a separate thread and propagate those updates to all
> > > semaphores+fences waiting on those events (I think the v3dv driver
> > > does something like that, but I didn't spend enough time studying
> > > the code to be sure, so I might be wrong).
> >
> > You should be able to avoid the separate thread to propagate by having a
> > proxy object in user space that maps between the one outsync of the job
> > and the possibly many Vulkan objects. But I've had this argument before
> > with the DDK... and the upshot of it was that he Vulkan API is
> > unnecessarily complex here and makes this really hard to do in practise.
> > So I agree adding this capability to the kernel is likely the best approach.

This is pretty easy to do from userspace.  Just take the out sync_file
and stuff it into each of the syncobjs using
DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE.  That's all the kernel would be doing
under the hood anyway.  Having it built into your submit ioctl does
have the advantage of fewer round-trips to kernel space, though.

> > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > explicit/implicit deps between them), and Vulkan asks that the out
> > > fence be signaled when all jobs are done. Timeline syncobjs are a
> > > good match for that use case. All we need to do is pass the same
> > > fence syncobj to all jobs being attached to a single QueueSubmit
> > > request, but a different point on the timeline. The syncobj
> > > timeline wait does the rest and guarantees that we've reached a
> > > given timeline point (IOW, all jobs before that point are done)
> > > before declaring the fence as signaled.
> > > One alternative would be to have dummy 'synchronization' jobs that
> > > don't actually execute anything on the GPU but declare a dependency
> > > on all other jobs that are part of the QueueSubmit request, and
> > > signal the out fence (the scheduler would do most of the work for
> > > us, all we have to do is support NULL job heads and signal the
> > > fence directly when that happens instead of queueing the job).
> >
> > I have to admit to being rather hazy on the details of timeline
> > syncobjs, but I thought there was a requirement that the timeline moves
> > monotonically. I.e. if you have multiple jobs signalling the same
> > syncobj just with different points, then AFAIU the API requires that the
> > points are triggered in order.
>
> I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so I
> might be wrong, but my understanding is that queuing fences (addition
> of new points in the timeline) should happen in order, but signaling
> can happen in any order. When checking for a signaled fence the
> fence-chain logic starts from the last point (or from an explicit point
> if you use the timeline wait flavor) and goes backward, stopping at the
> first un-signaled node. If I'm correct, that means that fences that
> are part of a chain can be signaled in any order.

You don't even need a timeline for this.  Just have a single syncobj
per-queue and make each submit wait on it and then signal it.
Alternatively, you can just always hang on to the out-fence from the
previous submit and make the next one wait on that.  Timelines are
overkill here, IMO.

> Note that I also considered using a sync file, which has the ability to
> merge fences, but that required 3 extra ioctls for each syncobj to merge
> (for the export/merge/import round trip), and AFAICT, fences stay around
> until the sync file is destroyed, which forces some garbage collection
> if we want to keep the number of active fences low. One nice thing
> about the fence-chain/syncobj-timeline logic is that signaled fences
> are collected automatically when walking the chain.

Yeah, that's the pain when working 

Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Zbigniew Kempczyński
On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote:
> On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
> >
> > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  wrote:
> > >
> > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> > >  wrote:
> > > >
> > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > > > > it's running on a version of i915 that supports at least softpin which
> > > > > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ 
> > > > > is
> > > > > only supported by iris which never uses relocations.  The older i965
> > > > > driver in Mesa does use relocations but it only supports Intel 
> > > > > hardware
> > > > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > > > compute driver also never uses relocations.  This only leaves the 
> > > > > media
> > > > > driver which is supposed to be switching to softpin going forward.
> > > > > Making softpin a requirement for all future hardware seems reasonable.
> > > > >
> > > > > Rejecting relocations starting with Gen12 has the benefit that we 
> > > > > don't
> > > > > have to bother supporting it on platforms with local memory.  Given 
> > > > > how
> > > > > much CPU touching of memory is required for relocations, not having to
> > > > > do so on platforms where not all memory is directly CPU-accessible
> > > > > carries significant advantages.
> > > > >
> > > > > v2 (Jason Ekstrand):
> > > > >  - Allow TGL-LP platforms as they've already shipped
> > > > >
> > > > > v3 (Jason Ekstrand):
> > > > >  - WARN_ON platforms with LMEM support in case the check is wrong
> > > >
> > > > I was asked to review this patch. It works along with the expected
> > > > IGT check
> > > > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> > > >
> > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better
> > > > place to do the for loop, just after copy_from_user(), and check
> > > > relocation_count? We have access to exec2_list there, and we know the
> > > > gen, so we're able to say relocations are not supported immediately,
> > > > without entering i915_gem_do_execbuffer().
> > >
> > > I considered that but it adds an extra object list walk for a case
> > > which we expect to not happen.  I'm not sure how expensive the list
> > > walk would be if all we do is check the number of relocations on each
> > > object.  I guess, if it comes right after a copy_from_user, it's all
> > > hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
> > > I'll move it.
> >
> > I really wouldn't move it if it's another list walk. Execbuf has a lot
> > of fast-paths going on, and we have extensive tests to make sure it
> > unwinds correctly in all cases. It's not very intuitive, but execbuf
> > code isn't scoring very high on that.
> 
> And here I'd just finished doing the typing to move it.  Good thing I
> hadn't closed vim yet and it was still in my undo buffer. :-)

Before entering the "slower" path, from my perspective, I would just check the
batch object at that place. We would still have a single list walkthrough
and a quick check at the very beginning.

--
Zbigniew

> 
> --Jason
> 
> > -Daniel
> >
> > >
> > > --Jason
> > >
> > > > --
> > > > Zbigniew
> > > >
> > > > >
> > > > > Signed-off-by: Jason Ekstrand 
> > > > > Cc: Dave Airlie 
> > > > > Cc: Daniel Vetter 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
> > > > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> > > > > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > index 99772f37bff60..b02dbd16bfa03 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > @@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer 
> > > > > *eb, struct eb_vma *ev)
> > > > >   return err;
> > > > >  }
> > > > >
> > > > > -static int check_relocations(const struct drm_i915_gem_exec_object2 
> > > > > *entry)
> > > > > +static int check_relocations(const struct i915_execbuffer *eb,
> > > > > +  const struct drm_i915_gem_exec_object2 
> > > > > *entry)
> > > > >  {
> > > > >   const char __user *addr, *end;
> > > > >   unsigned long size;
> > > > > @@ -1774,6 +1775,14 @@ static int check_relocations(const struct 
> > > > > drm_i915_gem_exec_object2 *entry)
> > > > >   if (size == 0)
> > > > >   return 0;
> > > > >
> > > > > + /* Relocations are disallowed for all platforms after TGL-LP */
> > > > > + if (INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* All discrete memory platforms are Gen12 or above */
> > > > > + if (WARN_ON(HAS_LMEM(eb->i915)))
> > > > > + 

Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 10:31 AM Chris Wilson  wrote:
>
> Quoting Zbigniew Kempczyński (2021-03-11 11:44:32)
> > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > > it's running on a version of i915 that supports at least softpin which
> > > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
> > > only supported by iris which never uses relocations.  The older i965
> > > driver in Mesa does use relocations but it only supports Intel hardware
> > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > compute driver also never uses relocations.  This only leaves the media
> > > driver which is supposed to be switching to softpin going forward.
> > > Making softpin a requirement for all future hardware seems reasonable.
> > >
> > > Rejecting relocations starting with Gen12 has the benefit that we don't
> > > have to bother supporting it on platforms with local memory.  Given how
> > > much CPU touching of memory is required for relocations, not having to
> > > do so on platforms where not all memory is directly CPU-accessible
> > > carries significant advantages.
> > >
> > > v2 (Jason Ekstrand):
> > >  - Allow TGL-LP platforms as they've already shipped
> > >
> > > v3 (Jason Ekstrand):
> > >  - WARN_ON platforms with LMEM support in case the check is wrong
> >
> > I was asked to review this patch. It works along with the expected
> > IGT check
> > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> > 
> > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better place
> > to do the for loop, just after copy_from_user(), and check relocation_count?
> > We have access to exec2_list there, and we know the gen, so we're able to say
> > relocations are not supported immediately, without entering
> > i915_gem_do_execbuffer().
>
> There's a NORELOC flag you can enforce as mandatory. That's trivial for
> userspace to set, really makes sure they are aware of the change afoot,
> and i915_gem_check_execbuffer() will perform the validation upfront with
> the other flag checks.

NORELOC doesn't quite ensure that there are no relocations; it just
makes things optional if the kernel hasn't moved anything.  I guess we
could require userspace to set it but it also doesn't do anything if
there are no relocations to begin with.  I think I'd personally err on
the side of not requiring pointless flags.

--Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] dt-bindings: display: sitronix, st7789v: Add Waveshare 2inch LCD module

2021-03-11 Thread Rob Herring
On Wed, 10 Mar 2021 22:08:35 +0800, Carlis wrote:
> From: "Carlis" 
> 
> Document support for the Waveshare 2inch LCD module display, which is a
> 240x320 2" TFT display driven by a Sitronix ST7789V TFT Controller.
> 
> Signed-off-by: Carlis 
> ---
>  .../bindings/display/sitronix,st7789v.yaml | 72 
> ++
>  1 file changed, 72 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/sitronix,st7789v.yaml
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.example.dt.yaml:
 panel@0: compatible: 'oneOf' conditional failed, one must be fixed:
['sitronix,st7789v'] is too short
'sitronix,st7789v' is not one of ['waveshare,ws-2inch-lcd']
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.example.dt.yaml:
 panel@0: 'dc-gpios' is a required property
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.example.dt.yaml:
 panel@0: 'port', 'power-supply', 'spi-cpha', 'spi-cpol' do not match any of 
the regexes: 'pinctrl-[0-9]+'
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.example.dt.yaml:
 display@0: compatible:0: 'sitronix,st7789v' was expected
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.example.dt.yaml:
 display@0: compatible: ['waveshare,ws-2inch-lcd', 'sitronix,st7789v'] is too 
long
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.example.dt.yaml:
 display@0: compatible: Additional items are not allowed ('sitronix,st7789v' 
was unexpected)
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/sitronix,st7789v.example.dt.yaml:
 display@0: 'power-supply' is a required property
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml

See https://patchwork.ozlabs.org/patch/1450539

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.



[PATCH] drm/amd/display: remove redundant initialization of variable result

2021-03-11 Thread Colin King
From: Colin Ian King 

The variable result is being initialized with a value that is
never read and it is being updated later with a new value.  The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 099f43709060..47e6c33f73cb 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -4281,7 +4281,7 @@ void dp_set_panel_mode(struct dc_link *link, enum 
dp_panel_mode panel_mode)
 
if (edp_config_set.bits.PANEL_MODE_EDP
!= panel_mode_edp) {
-   enum dc_status result = DC_ERROR_UNEXPECTED;
+   enum dc_status result;
 
edp_config_set.bits.PANEL_MODE_EDP =
panel_mode_edp;
-- 
2.30.2



Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Chris Wilson
Quoting Zbigniew Kempczyński (2021-03-11 11:44:32)
> On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > it's running on a version of i915 that supports at least softpin which
> > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
> > only supported by iris which never uses relocations.  The older i965
> > driver in Mesa does use relocations but it only supports Intel hardware
> > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > compute driver also never uses relocations.  This only leaves the media
> > driver which is supposed to be switching to softpin going forward.
> > Making softpin a requirement for all future hardware seems reasonable.
> > 
> > Rejecting relocations starting with Gen12 has the benefit that we don't
> > have to bother supporting it on platforms with local memory.  Given how
> > much CPU touching of memory is required for relocations, not having to
> > do so on platforms where not all memory is directly CPU-accessible
> > carries significant advantages.
> > 
> > v2 (Jason Ekstrand):
> >  - Allow TGL-LP platforms as they've already shipped
> > 
> > v3 (Jason Ekstrand):
> >  - WARN_ON platforms with LMEM support in case the check is wrong
> 
> I was asked to review this patch. It works along with the expected
> IGT check https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> 
> Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better place
> to do the for loop, just after copy_from_user(), and check relocation_count?
> We have access to exec2_list there, and we know the gen, so we're able to say
> relocations are not supported immediately, without entering
> i915_gem_do_execbuffer().

There's a NORELOC flag you can enforce as mandatory. That's trivial for
userspace to set, really makes sure they are aware of the change afoot,
and i915_gem_check_execbuffer() will perform the validation upfront with
the other flag checks.
-Chris


Re: [PATCH] video: fbdev: delete redundant printing of return value

2021-03-11 Thread kernel test robot
Hi Wang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.12-rc2 next-20210311]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Wang-Qing/video-fbdev-delete-redundant-printing-of-return-value/20210311-201743
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
a74e6a014c9d4d4161061f770c9b4f98372ac778
config: arm-pxa_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/6d93756a48a2f91e8ac0cfdfd8734d30080706c2
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Wang-Qing/video-fbdev-delete-redundant-printing-of-return-value/20210311-201743
git checkout 6d93756a48a2f91e8ac0cfdfd8734d30080706c2
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/video/fbdev/pxafb.c: In function 'pxafb_probe':
>> drivers/video/fbdev/pxafb.c:2329:2: warning: this 'if' clause does not 
>> guard... [-Wmisleading-indentation]
2329 |  if (irq < 0)
 |  ^~
   drivers/video/fbdev/pxafb.c:2331:3: note: ...this statement, but the latter 
is misleadingly indented as if it were guarded by the 'if'
2331 |   goto failed_free_mem;
 |   ^~~~


vim +/if +2329 drivers/video/fbdev/pxafb.c

  2235  
  2236  static int pxafb_probe(struct platform_device *dev)
  2237  {
  2238  struct pxafb_info *fbi;
  2239  struct pxafb_mach_info *inf, *pdata;
  2240  int i, irq, ret;
  2241  
  2242  dev_dbg(&dev->dev, "pxafb_probe\n");
  2243  
  2244  ret = -ENOMEM;
  2245  pdata = dev_get_platdata(&dev->dev);
  2246  inf = devm_kmalloc(&dev->dev, sizeof(*inf), GFP_KERNEL);
  2247  if (!inf)
  2248  goto failed;
  2249  
  2250  if (pdata) {
  2251  *inf = *pdata;
  2252  inf->modes =
  2253  devm_kmalloc_array(&dev->dev, pdata->num_modes,
  2254 sizeof(inf->modes[0]), 
GFP_KERNEL);
  2255  if (!inf->modes)
  2256  goto failed;
  2257  for (i = 0; i < inf->num_modes; i++)
  2258  inf->modes[i] = pdata->modes[i];
  2259  }
  2260  
  2261  if (!pdata)
  2262  inf = of_pxafb_of_mach_info(&dev->dev);
  2263  if (IS_ERR_OR_NULL(inf))
  2264  goto failed;
  2265  
  2266  ret = pxafb_parse_options(&dev->dev, g_options, inf);
  2267  if (ret < 0)
  2268  goto failed;
  2269  
  2270  pxafb_check_options(&dev->dev, inf);
  2271  
  2272  dev_dbg(&dev->dev, "got a %dx%dx%d LCD\n",
  2273  inf->modes->xres,
  2274  inf->modes->yres,
  2275  inf->modes->bpp);
  2276  if (inf->modes->xres == 0 ||
  2277  inf->modes->yres == 0 ||
  2278  inf->modes->bpp == 0) {
  2279  dev_err(&dev->dev, "Invalid resolution or bit depth\n");
  2280  ret = -EINVAL;
  2281  goto failed;
  2282  }
  2283  
  2284  fbi = pxafb_init_fbinfo(&dev->dev, inf);
  2285  if (IS_ERR(fbi)) {
  2286  dev_err(&dev->dev, "Failed to initialize framebuffer 
device\n");
  2287  ret = PTR_ERR(fbi);
  2288  goto failed;
  2289  }
  2290  
  2291  if (cpu_is_pxa3xx() && inf->acceleration_enabled)
  2292  fbi->fb.fix.accel = FB_ACCEL_PXA3XX;
  2293  
  2294  fbi->backlight_power = inf->pxafb_backlight_power;
  2295  fbi->lcd_power = inf->pxafb_lcd_power;
  2296  
  2297  fbi->lcd_supply = devm_regulator_get_optional(&dev->dev, "lcd");
  2298  if (IS_ERR(fbi->lcd_supply)) {
  2299  if (PTR_ERR(fbi->lcd_supply) == -EPROBE_DEFER)
  2300  return -EPROBE_DEFER;
  2301  
  2302  fbi->lcd_supply = NULL;
  2303  }
  2304  
  2305  fbi->mmio_base = devm_platform_ioremap_resource(dev, 0);
  

[PATCH] drm/i915/gem: Drop relocation support on all new hardware (v4)

2021-03-11 Thread Jason Ekstrand
The Vulkan driver in Mesa for Intel hardware never uses relocations if
it's running on a version of i915 that supports at least softpin which
all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
only supported by iris which never uses relocations.  The older i965
driver in Mesa does use relocations but it only supports Intel hardware
through Gen11 and has been deprecated for all hardware Gen9+.  The
compute driver also never uses relocations.  This only leaves the media
driver which is supposed to be switching to softpin going forward.
Making softpin a requirement for all future hardware seems reasonable.

There is one piece of hardware enabled by default in i915: RKL which was
enabled by e22fa6f0a976 which has not yet landed in drm-next, so this is
almost, but not really, a userspace API change for RKL.  If it becomes a
problem, we can always add !IS_ROCKETLAKE(eb->i915) to the condition.

Rejecting relocations starting with newer Gen12 platforms has the
benefit that we don't have to bother supporting it on platforms with
local memory.  Given how much CPU touching of memory is required for
relocations, not having to do so on platforms where not all memory is
directly CPU-accessible carries significant advantages.

v2 (Jason Ekstrand):
 - Allow TGL-LP platforms as they've already shipped

v3 (Jason Ekstrand):
 - WARN_ON platforms with LMEM support in case the check is wrong

v4 (Jason Ekstrand):
 - Call out Rocket Lake in the commit message

Signed-off-by: Jason Ekstrand 
Acked-by: Keith Packard 
Cc: Dave Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 99772f37bff60..b02dbd16bfa03 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer *eb, struct 
eb_vma *ev)
return err;
 }
 
-static int check_relocations(const struct drm_i915_gem_exec_object2 *entry)
+static int check_relocations(const struct i915_execbuffer *eb,
+const struct drm_i915_gem_exec_object2 *entry)
 {
const char __user *addr, *end;
unsigned long size;
@@ -1774,6 +1775,14 @@ static int check_relocations(const struct 
drm_i915_gem_exec_object2 *entry)
if (size == 0)
return 0;
 
+   /* Relocations are disallowed for all platforms after TGL-LP */
+   if (INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
+   return -EINVAL;
+
+   /* All discrete memory platforms are Gen12 or above */
+   if (WARN_ON(HAS_LMEM(eb->i915)))
+   return -EINVAL;
+
if (size > N_RELOC(ULONG_MAX))
return -EINVAL;
 
@@ -1807,7 +1816,7 @@ static int eb_copy_relocations(const struct 
i915_execbuffer *eb)
if (nreloc == 0)
continue;
 
-   err = check_relocations(&eb->exec[i]);
+   err = check_relocations(eb, &eb->exec[i]);
if (err)
goto err;
 
@@ -1880,7 +1889,7 @@ static int eb_prefault_relocations(const struct 
i915_execbuffer *eb)
for (i = 0; i < count; i++) {
int err;
 
-   err = check_relocations(&eb->exec[i]);
+   err = check_relocations(eb, &eb->exec[i]);
if (err)
return err;
}
-- 
2.29.2



Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
>
> On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  wrote:
> >
> > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> >  wrote:
> > >
> > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > > > it's running on a version of i915 that supports at least softpin which
> > > > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
> > > > only supported by iris which never uses relocations.  The older i965
> > > > driver in Mesa does use relocations but it only supports Intel hardware
> > > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > > compute driver also never uses relocations.  This only leaves the media
> > > > driver which is supposed to be switching to softpin going forward.
> > > > Making softpin a requirement for all future hardware seems reasonable.
> > > >
> > > > Rejecting relocations starting with Gen12 has the benefit that we don't
> > > > have to bother supporting it on platforms with local memory.  Given how
> > > > much CPU touching of memory is required for relocations, not having to
> > > > do so on platforms where not all memory is directly CPU-accessible
> > > > carries significant advantages.
> > > >
> > > > v2 (Jason Ekstrand):
> > > >  - Allow TGL-LP platforms as they've already shipped
> > > >
> > > > v3 (Jason Ekstrand):
> > > >  - WARN_ON platforms with LMEM support in case the check is wrong
> > >
> > > I was asked to review this patch. It works along with the expected
> > > IGT check
> > > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> > >
> > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better
> > > place to do the for loop, just after copy_from_user(), and check
> > > relocation_count? We have access to exec2_list there, and we know the
> > > gen, so we're able to say relocations are not supported immediately,
> > > without entering i915_gem_do_execbuffer().
> >
> > I considered that but it adds an extra object list walk for a case
> > which we expect to not happen.  I'm not sure how expensive the list
> > walk would be if all we do is check the number of relocations on each
> > object.  I guess, if it comes right after a copy_from_user, it's all
> > hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
> > I'll move it.
>
> I really wouldn't move it if it's another list walk. Execbuf has a lot
> of fast-paths going on, and we have extensive tests to make sure it
> unwinds correctly in all cases. It's not very intuitive, but execbuf
> code isn't scoring very high on that.

And here I'd just finished doing the typing to move it.  Good thing I
hadn't closed vim yet and it was still in my undo buffer. :-)

--Jason

> -Daniel
>
> >
> > --Jason
> >
> > > --
> > > Zbigniew
> > >
> > > >
> > > > Signed-off-by: Jason Ekstrand 
> > > > Cc: Dave Airlie 
> > > > Cc: Daniel Vetter 
> > > > ---
> > > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
> > > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > index 99772f37bff60..b02dbd16bfa03 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > @@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer *eb, 
> > > > struct eb_vma *ev)
> > > >   return err;
> > > >  }
> > > >
> > > > -static int check_relocations(const struct drm_i915_gem_exec_object2 
> > > > *entry)
> > > > +static int check_relocations(const struct i915_execbuffer *eb,
> > > > +  const struct drm_i915_gem_exec_object2 
> > > > *entry)
> > > >  {
> > > >   const char __user *addr, *end;
> > > >   unsigned long size;
> > > > @@ -1774,6 +1775,14 @@ static int check_relocations(const struct 
> > > > drm_i915_gem_exec_object2 *entry)
> > > >   if (size == 0)
> > > >   return 0;
> > > >
> > > > + /* Relocations are disallowed for all platforms after TGL-LP */
> > > > + if (INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
> > > > + return -EINVAL;
> > > > +
> > > > + /* All discrete memory platforms are Gen12 or above */
> > > > + if (WARN_ON(HAS_LMEM(eb->i915)))
> > > > + return -EINVAL;
> > > > +
> > > >   if (size > N_RELOC(ULONG_MAX))
> > > >   return -EINVAL;
> > > >
> > > > @@ -1807,7 +1816,7 @@ static int eb_copy_relocations(const struct 
> > > > i915_execbuffer *eb)
> > > >   if (nreloc == 0)
> > > >   continue;
> > > >
> > > > - err = check_relocations(&eb->exec[i]);
> > > > + err = check_relocations(eb, &eb->exec[i]);
> > > >   if (err)
> > > >   goto err;
> > > >
> > > > @@ -1880,7 

Re: [PATCH v2 0/5] drm/panel-simple: Patches for N116BCA-EA1

2021-03-11 Thread Linus Walleij
On Thu, Mar 11, 2021 at 2:01 AM Doug Anderson  wrote:

> If you happen to feel in an applying mood one other patch to
> simple-panel I think is OK to land is at:
>
> https://lore.kernel.org/r/20210222081716.1.I1a45aece5d2ac6a2e73bbec50da2086e43e0862b@changeid

I applied and pushed this as well.

Yours,
Linus Walleij


Re: [Intel-gfx] [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"

2021-03-11 Thread Chris Wilson
Quoting Daniel Vetter (2021-03-11 16:01:46)
> On Fri, Mar 05, 2021 at 11:05:46AM -0600, Jason Ekstrand wrote:
> > This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> > since that commit, we've been having issues where a hang in one client
> > can propagate to another.  In particular, a hang in an app can propagate
> > to the X server which causes the whole desktop to lock up.
> > 
> > Signed-off-by: Jason Ekstrand 
> > Reported-by: Marcin Slusarz 
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> > Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already 
> > signaled fences")
> 
> Yeah I suggested to just go with the revert, so I guess it's on me to give you
> the explainer to be added to the commit message.

If you took the patch this was copied from and only revert on the
dma-resv, things do not break horribly.

> Error propagation along fences sound like a good idea, but as your bug
> shows, surprising consequences, since propagating errors across security
> boundaries is not a good thing.
> 
> What we do have is track the hangs on the ctx, and report information to
> userspace using RESET_STATS.

And by the fence, which is far more precise.

> That's how arb_robustness works. Also, if my
> understanding is still correct, the EIO from execbuf is when your context
> is banned (because not recoverable or too many hangs). And in all these
> cases it's up to userspace to figure out what is all impacted and should
> be reported to the application, that's not on the kernel to guess and
> automatically propagate.
> 
> What's more, we're also building more features on top of ctx error
> reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
> userspace fence wait also relies on that mechanism. So it is the path
> going forward for reporting gpu hangs and resets to userspace.

That ioctl is not a solid basis; it never did quite work as expected, and
we kept realising the limitations of the information and the accuracy
that it could report.
 
> So all together that's why I think we should just bury this idea again as
> not quite the direction we want to go to, hence why I think the revert is
> the right option here.

No, as shown by igt it's a critical issue that we have to judiciously
choose which errors to ignore. And it's not just the ability to subvert
gen7 and gen9, it's the error tracking employed for preempting contexts
among others.  Hence go with the original patch to undo the propagation
along dma-resv.
-Chris


Re: [PATCH v2 0/5] drm/panel-simple: Patches for N116BCA-EA1

2021-03-11 Thread Linus Walleij
On Fri, Jan 15, 2021 at 11:44 PM Douglas Anderson  wrote:

> This series is to get the N116BCA-EA1 panel working. Most of the
> patches are simple, but on hardware I have in front of me the panel
> sometimes doesn't come up. I'm still working with the hardware
> manufacturer to get to the bottom of it, but I've got it working with
> retries. Adding the retries doesn't seem like an insane thing to do
> and makes some of the error handling more robust, so I've gone ahead
> and included those patches here. Hopefully they look OK.
>
> Changes in v2:

This v2 version applied to drm-misc-next.

Yours,
Linus Walleij


Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk

2021-03-11 Thread Alex Deucher
On Thu, Mar 11, 2021 at 10:02 AM Alexandre Desnoyers  wrote:
>
> On Thu, Mar 11, 2021 at 2:49 PM Daniel Gomez  wrote:
> >
> > On Thu, 11 Mar 2021 at 10:09, Daniel Gomez  wrote:
> > >
> > > On Wed, 10 Mar 2021 at 18:06, Alex Deucher  wrote:
> > > >
> > > > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez  wrote:
> > > > >
> > > > > Disabling GFXOFF via the quirk list fixes a hardware lockup in
> > > > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
> > > > >
> > > > > Signed-off-by: Daniel Gomez 
> > > > > ---
> > > > >
> > > > > This patch is a continuation of the work here:
> > > > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was 
> > > > > discussed and
> > > > > a dma_fence deadlock was provoked as a side effect. To reproduce the 
> > > > > issue
> > > > > please refer to the above link.
> > > > >
> > > > > The hardware lockup was introduced in 5.6-rc1 for our particular 
> > > > > revision as it
> > > > > wasn't part of the new blacklist. Before that, in kernel v5.5, this 
> > > > > hardware was
> > > > > working fine without any hardware lock because the GFXOFF was 
> > > > > actually disabled
> > > > > by the if condition for the CHIP_RAVEN case. So this patch, adds the 
> > > > > 'Radeon
> > > > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to disable 
> > > > > the GFXOFF.
> > > > >
> > > > > But besides the fix, I'd like to ask from where this revision comes 
> > > > > from. Is it
> > > > > an ASIC revision or is it hardcoded in the VBIOS from our vendor? 
> > > > > From what I
> > > > > can see, it comes from the ASIC and I wonder if somehow we can get an 
> > > > > APU in the
> > > > > future, 'not blacklisted', with the same problem. Then, should this 
> > > > > table only
> > > > > filter for the vendor and device and not the revision? Do you know if 
> > > > > there are
> > > > > any revisions for the 1002:15dd validated, tested and functional?
> > > >
> > > > The pci revision id (RID) is used to specify the specific SKU within a
> > > > family.  GFXOFF is supposed to be working on all raven variants.  It
> > > > was tested and functional on all reference platforms and any OEM
> > > > platforms that launched with Linux support.  There are a lot of
> > > > dependencies on sbios in the early raven variants (0x15dd), so it's
> > > > likely more of a specific platform issue, but there is not a good way
> > > > to detect this so we use the DID/SSID/RID as a proxy.  The newer raven
> > > > variants (0x15d8) have much better GFXOFF support since they all
> > > > shipped with newer firmware and sbios.
> > >
> > > We took one of the first reference platform boards to design our
> > > custom board based on the V1605B and I assume it has one of the early 
> > > 'unstable'
> > > raven variants with RID 0x83. Also, as OEM we are in control of the bios
> > > (provided by insyde) but I wasn't sure about the RID so, thanks for the
> > > clarification. Is there anything we can do with the bios to have the 
> > > GFXOFF
> > > enabled and 'stable' for this particular revision? Otherwise we'd need to 
> > > add
> > > the 0x83 RID to the table. Also, there is an extra ']' in the patch
> > > subject. Sorry
> > > for that. Would you need a new patch in case you accept it with the ']' 
> > > removed?
> > >
> > > Good to hear that the newer raven versions have better GFXOFF support.
> >
> > Adding Alex Desnoyer to the loop as he is the electronic/hardware and
> > bios responsible so, he can
> > provide more information about this.
>
> Hello everyone,
>
> We, Qtechnology, are the OEM of the hardware platform where we
> originally discovered the bug.  Our platform is based on the AMD
> Dibbler V-1000 reference design, with the latest Insyde BIOS release
> available for the (now unsupported) Dibbler platform.  We have the
> Insyde BIOS source code internally, so we can make some modifications
> as needed.
>
> The last test that Daniel and myself performed was on a standard
> Dibbler PCB rev.B1 motherboard (NOT our platform), and using the
> corresponding latest AMD released BIOS "RDB1109GA".  As Daniel wrote,
> the hardware lockup can be reproduced on the Dibbler, even if it has a
> different RID that our V1605B APU.
>
> We also have a Neousys Technology POC-515 embedded computer (V-1000,
> V1605B) in our office.  The Neousys PC also uses Insyde BIOS.  This
> computer is also locking-up in the test.
> https://www.neousys-tech.com/en/product/application/rugged-embedded/poc-500-amd-ryzen-ultra-compact-embedded-computer
>
>
> Digging into the BIOS source code, the only reference to GFXOFF is in
> the SMU and PSP firmware release notes, where some bug fixes have been
> mentioned for previous SMU/PSP releases.  After a quick "git grep -i
> gfx | grep -i off", there seems to be no mention of GFXOFF in the
> Insyde UEFI (inluding AMD PI) code base.  I would appreciate any
> information regarding BIOS modification needed to make the GFXOFF
> feature stable.  As you (Alex Deucher) mentioned, it should 

Re: [Intel-gfx] [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"

2021-03-11 Thread Daniel Vetter
On Fri, Mar 05, 2021 at 11:05:46AM -0600, Jason Ekstrand wrote:
> This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> since that commit, we've been having issues where a hang in one client
> can propagate to another.  In particular, a hang in an app can propagate
> to the X server which causes the whole desktop to lock up.
> 
> Signed-off-by: Jason Ekstrand 
> Reported-by: Marcin Slusarz 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled 
> fences")

Yeah, I suggested just going with the revert, so I guess it's on me to give you
the explainer to be added to the commit message.

Error propagation along fences sounds like a good idea, but as your bug
shows, it has surprising consequences, since propagating errors across security
boundaries is not a good thing.

What we do have is track the hangs on the ctx, and report information to
userspace using RESET_STATS. That's how arb_robustness works. Also, if my
understanding is still correct, the EIO from execbuf is when your context
is banned (because not recoverable or too many hangs). And in all these
cases it's up to userspace to figure out everything that is impacted and should
be reported to the application; it's not on the kernel to guess and
automatically propagate.

What's more, we're also building more features on top of ctx error
reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
userspace fence wait also relies on that mechanism. So it is the path
going forward for reporting gpu hangs and resets to userspace.

So all together that's why I think we should just bury this idea again as
not quite the direction we want to go, and why the revert is the right
option here.

Maybe quote the entire above thing in the commit message, if it makes some
sense?

Cheers, Daniel
> ---
>  drivers/gpu/drm/i915/i915_request.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c 
> b/drivers/gpu/drm/i915/i915_request.c
> index e7b4c4bc41a64..870d6083bb57e 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1232,10 +1232,8 @@ i915_request_await_execution(struct i915_request *rq,
>  
>   do {
>   fence = *child++;
> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> - i915_sw_fence_set_error_once(>submit, fence->error);
> + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
>   continue;
> - }
>  
>   if (fence->context == rq->fence.context)
>   continue;
> @@ -1333,10 +1331,8 @@ i915_request_await_dma_fence(struct i915_request *rq, 
> struct dma_fence *fence)
>  
>   do {
>   fence = *child++;
> - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> - i915_sw_fence_set_error_once(>submit, fence->error);
> + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
>   continue;
> - }
>  
>   /*
>* Requests on the same timeline are explicitly ordered, along
> -- 
> 2.29.2
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Daniel Vetter
On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand  wrote:
>
> On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
>  wrote:
> >
> > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > > it's running on a version of i915 that supports at least softpin which
> > > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
> > > only supported by iris which never uses relocations.  The older i965
> > > driver in Mesa does use relocations but it only supports Intel hardware
> > > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > > compute driver also never uses relocations.  This only leaves the media
> > > driver which is supposed to be switching to softpin going forward.
> > > Making softpin a requirement for all future hardware seems reasonable.
> > >
> > > Rejecting relocations starting with Gen12 has the benefit that we don't
> > > have to bother supporting it on platforms with local memory.  Given how
> > > much CPU touching of memory is required for relocations, not having to
> > > do so on platforms where not all memory is directly CPU-accessible
> > > carries significant advantages.
> > >
> > > v2 (Jason Ekstrand):
> > >  - Allow TGL-LP platforms as they've already shipped
> > >
> > > v3 (Jason Ekstrand):
> > >  - WARN_ON platforms with LMEM support in case the check is wrong
> >
> > I was asked to review this patch. It works along with the expected
> > IGT check 
> > https://patchwork.freedesktop.org/patch/423361/?series=82954=25
> >
> > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better place
> > to do the for loop just after copy_from_user() and check relocation_count?
> > We have access to exec2_list there, we know the gen so we're able to say
> > relocations are not supported immediately, without entering 
> > i915_gem_do_execbuffer().
>
> I considered that but it adds an extra object list walk for a case
> which we expect to not happen.  I'm not sure how expensive the list
> walk would be if all we do is check the number of relocations on each
> object.  I guess, if it comes right after a copy_from_user, it's all
> hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
> I'll move it.

I really wouldn't move it if it's another list walk. Execbuf has a lot
of fast-paths going on, and we have extensive tests to make sure it
unwinds correctly in all cases. It's not very intuitive, but execbuf
code isn't scoring very high on that.
-Daniel

>
> --Jason
>
> > --
> > Zbigniew
> >
> > >
> > > Signed-off-by: Jason Ekstrand 
> > > Cc: Dave Airlie 
> > > Cc: Daniel Vetter 
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
> > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > index 99772f37bff60..b02dbd16bfa03 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > @@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer *eb, 
> > > struct eb_vma *ev)
> > >   return err;
> > >  }
> > >
> > > -static int check_relocations(const struct drm_i915_gem_exec_object2 
> > > *entry)
> > > +static int check_relocations(const struct i915_execbuffer *eb,
> > > +  const struct drm_i915_gem_exec_object2 *entry)
> > >  {
> > >   const char __user *addr, *end;
> > >   unsigned long size;
> > > @@ -1774,6 +1775,14 @@ static int check_relocations(const struct 
> > > drm_i915_gem_exec_object2 *entry)
> > >   if (size == 0)
> > >   return 0;
> > >
> > > + /* Relocations are disallowed for all platforms after TGL-LP */
> > > + if (INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
> > > + return -EINVAL;
> > > +
> > > + /* All discrete memory platforms are Gen12 or above */
> > > + if (WARN_ON(HAS_LMEM(eb->i915)))
> > > + return -EINVAL;
> > > +
> > >   if (size > N_RELOC(ULONG_MAX))
> > >   return -EINVAL;
> > >
> > > @@ -1807,7 +1816,7 @@ static int eb_copy_relocations(const struct 
> > > i915_execbuffer *eb)
> > >   if (nreloc == 0)
> > >   continue;
> > >
> > > - err = check_relocations(>exec[i]);
> > > + err = check_relocations(eb, >exec[i]);
> > >   if (err)
> > >   goto err;
> > >
> > > @@ -1880,7 +1889,7 @@ static int eb_prefault_relocations(const struct 
> > > i915_execbuffer *eb)
> > >   for (i = 0; i < count; i++) {
> > >   int err;
> > >
> > > - err = check_relocations(>exec[i]);
> > > + err = check_relocations(eb, >exec[i]);
> > >   if (err)
> > >   return err;
> > >   }
> > > --
> > > 2.29.2
> > >
> > > 

Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

2021-03-11 Thread Jason Ekstrand
On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
 wrote:
>
> On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > The Vulkan driver in Mesa for Intel hardware never uses relocations if
> > it's running on a version of i915 that supports at least softpin which
> > all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
> > only supported by iris which never uses relocations.  The older i965
> > driver in Mesa does use relocations but it only supports Intel hardware
> > through Gen11 and has been deprecated for all hardware Gen9+.  The
> > compute driver also never uses relocations.  This only leaves the media
> > driver which is supposed to be switching to softpin going forward.
> > Making softpin a requirement for all future hardware seems reasonable.
> >
> > Rejecting relocations starting with Gen12 has the benefit that we don't
> > have to bother supporting it on platforms with local memory.  Given how
> > much CPU touching of memory is required for relocations, not having to
> > do so on platforms where not all memory is directly CPU-accessible
> > carries significant advantages.
> >
> > v2 (Jason Ekstrand):
> >  - Allow TGL-LP platforms as they've already shipped
> >
> > v3 (Jason Ekstrand):
> >  - WARN_ON platforms with LMEM support in case the check is wrong
>
> I was asked to review this patch. It works along with the expected
> IGT check https://patchwork.freedesktop.org/patch/423361/?series=82954=25
>
> Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() a better place
> to do the for loop just after copy_from_user() and check relocation_count?
> We have access to exec2_list there, we know the gen so we're able to say
> relocations are not supported immediately, without entering 
> i915_gem_do_execbuffer().

I considered that but it adds an extra object list walk for a case
which we expect to not happen.  I'm not sure how expensive the list
walk would be if all we do is check the number of relocations on each
object.  I guess, if it comes right after a copy_from_user, it's all
hot in the cache so it shouldn't matter.  Ok.  I've convinced myself.
I'll move it.

--Jason

> --
> Zbigniew
>
> >
> > Signed-off-by: Jason Ekstrand 
> > Cc: Dave Airlie 
> > Cc: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 99772f37bff60..b02dbd16bfa03 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -1764,7 +1764,8 @@ eb_relocate_vma_slow(struct i915_execbuffer *eb, 
> > struct eb_vma *ev)
> >   return err;
> >  }
> >
> > -static int check_relocations(const struct drm_i915_gem_exec_object2 *entry)
> > +static int check_relocations(const struct i915_execbuffer *eb,
> > +  const struct drm_i915_gem_exec_object2 *entry)
> >  {
> >   const char __user *addr, *end;
> >   unsigned long size;
> > @@ -1774,6 +1775,14 @@ static int check_relocations(const struct 
> > drm_i915_gem_exec_object2 *entry)
> >   if (size == 0)
> >   return 0;
> >
> > + /* Relocations are disallowed for all platforms after TGL-LP */
> > + if (INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
> > + return -EINVAL;
> > +
> > + /* All discrete memory platforms are Gen12 or above */
> > + if (WARN_ON(HAS_LMEM(eb->i915)))
> > + return -EINVAL;
> > +
> >   if (size > N_RELOC(ULONG_MAX))
> >   return -EINVAL;
> >
> > @@ -1807,7 +1816,7 @@ static int eb_copy_relocations(const struct 
> > i915_execbuffer *eb)
> >   if (nreloc == 0)
> >   continue;
> >
> > - err = check_relocations(>exec[i]);
> > + err = check_relocations(eb, >exec[i]);
> >   if (err)
> >   goto err;
> >
> > @@ -1880,7 +1889,7 @@ static int eb_prefault_relocations(const struct 
> > i915_execbuffer *eb)
> >   for (i = 0; i < count; i++) {
> >   int err;
> >
> > - err = check_relocations(>exec[i]);
> > + err = check_relocations(eb, >exec[i]);
> >   if (err)
> >   return err;
> >   }
> > --
> > 2.29.2
> >


Re: [PATCH v2 0/3] drm/amdgpu: Remove in_interrupt() usage.

2021-03-11 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Feb 9, 2021 at 7:50 AM Christian König  wrote:
>
> Reviewed-by: Christian König  for the series.
>
> Am 09.02.21 um 13:44 schrieb Sebastian Andrzej Siewior:
> > Folks,
> >
> > in the discussion about preempt count consistency across kernel
> > configurations:
> >
> >   
> > https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/
> >
> > it was concluded that the usage of in_interrupt() and related context
> > checks should be removed from non-core code.
> >
> > In the long run, usage of 'preemptible, in_*irq etc.' should be banned from
> > driver code completely.
> >
> > This series addresses parts of the amdgpu driver.  There are still call 
> > sites
> > left in in the amdgpu driver.
> >
> > v1…v2:
> > - Limit to admgpu only
> > - use "bool" instead of "bool == true"
> >
> > Sebastian
> >
> >
>
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH][next] drm/amdgpu: Fix spelling mistake "disabed" -> "disabled"

2021-03-11 Thread Alex Deucher
Applied.  Thanks!

Alex

On Thu, Mar 11, 2021 at 4:28 AM Colin King  wrote:
>
> From: Colin Ian King 
>
> There is a spelling mistake in a drm debug message. Fix it.
>
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 617e62e1eff9..dcbae9237cfa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3464,7 +3464,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   adev->ip_blocks[i].version->type == 
> AMD_IP_BLOCK_TYPE_COMMON ||
>   adev->ip_blocks[i].version->type == 
> AMD_IP_BLOCK_TYPE_IH ||
>   adev->ip_blocks[i].version->type == 
> AMD_IP_BLOCK_TYPE_SMC)) {
> -   DRM_DEBUG("IP %s disabed for 
> hw_init.\n",
> +   DRM_DEBUG("IP %s disabled for 
> hw_init.\n",
> 
> adev->ip_blocks[i].version->funcs->name);
> adev->ip_blocks[i].status.hw = true;
> }
> --
> 2.30.2
>


Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

2021-03-11 Thread Intel


On 3/11/21 2:17 PM, Daniel Vetter wrote:

On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
 wrote:

Hi!

On 3/11/21 2:00 PM, Daniel Vetter wrote:

On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:

On 3/1/21 3:09 PM, Daniel Vetter wrote:

On Mon, Mar 1, 2021 at 11:17 AM Christian König
 wrote:

Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):

On 3/1/21 10:05 AM, Daniel Vetter wrote:

On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
wrote:

Hi,

On 3/1/21 9:28 AM, Daniel Vetter wrote:

On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
 wrote:

On 2/26/21 2:28 PM, Daniel Vetter wrote:

So I think it stops gup. But I haven't verified at all. Would be
good
if Christian can check this with some direct io to a buffer in
system
memory.

Hmm,

Docs (again vm_normal_page() say)

   * VM_MIXEDMAP mappings can likewise contain memory with or
without "struct
   * page" backing, however the difference is that _all_ pages
with a struct
   * page (that is, those where pfn_valid is true) are refcounted
and
considered
   * normal pages by the VM. The disadvantage is that pages are
refcounted
   * (which can be slower and simply not an option for some PFNMAP
users). The
   * advantage is that we don't have to follow the strict
linearity rule of
   * PFNMAP mappings in order to support COWable mappings.

but it's true __vm_insert_mixed() ends up in the insert_pfn()
path, so
the above isn't really true, which makes me wonder if and in that
case
why there could any longer ever be a significant performance
difference
between MIXEDMAP and PFNMAP.

Yeah it's definitely confusing. I guess I'll hack up a patch and see
what sticks.


BTW regarding the TTM hugeptes, I don't think we ever landed that
devmap
hack, so they are (for the non-gup case) relying on
vma_is_special_huge(). For the gup case, I think the bug is still
there.

Maybe there's another devmap hack, but the ttm_vm_insert functions do
use PFN_DEV and all that. And I think that stops gup_fast from trying
to find the underlying page.
-Daniel

Hmm perhaps it might, but I don't think so. The fix I tried out was
to set

PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
true, and
then

follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
gup_fast()
backs off,

in the end that would mean setting in stone that "if there is a huge
devmap
page table entry for which we haven't registered any devmap struct
pages
(get_dev_pagemap returns NULL), we should treat that as a "special"
huge
page table entry".

From what I can tell, all code calling get_dev_pagemap() already
does that,
it's just a question of getting it accepted and formalizing it.

Oh I thought that's already how it works, since I didn't spot anything
else that would block gup_fast from falling over. I guess really would
need some testcases to make sure direct i/o (that's the easiest to test)
fails like we expect.

Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
Otherwise pmd_devmap() will not return true and since there is no
pmd_special() things break.

Is that maybe the issue we have seen with amdgpu and huge pages?

Yeah, essentially when you have a hugepte inserted by ttm, and it
happens to point at system memory, then gup will work on that. And
create all kinds of havoc.


Apart from that I'm lost guys, that devmap and gup stuff is not
something I have a good knowledge of apart from a one mile high view.

I'm not really better, hence would be good to do a testcase and see.
This should provoke it:
- allocate nicely aligned bo in system memory
- mmap, again nicely aligned to 2M
- do some direct io from a filesystem into that mmap, that should trigger gup
- before the gup completes free the mmap and bo so that ttm recycles
the pages, which should trip up on the elevated refcount. If you wait
until the direct io is completely, then I think nothing bad can be
observed.

Ofc if your amdgpu+hugepte issue is something else, then maybe we have
another issue.

Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
-Daniel

So I did the following quick experiment on vmwgfx, and it turns out that
with it,
fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds

I should probably craft an RFC formalizing this.

Yeah I think that would be good. Maybe even more formalized if we also
switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
something like that.

Otoh your description of when it only sometimes succeeds would indicate my
understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

My understanding from reading the vmf_insert_mixed() code is that iff
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
not consistent with the vm_normal_page() doc. For architectures without
pte_special, VM_PFNMAP must be used, and then we must also block COW

Re: [Intel-gfx] [PATCH 2/2] drm/i915/dp_link_training: Convert DRM_DEBUG_KMS to drm_dbg_kms

2021-03-11 Thread Ville Syrjälä
On Wed, Mar 10, 2021 at 04:47:57PM -0500, Sean Paul wrote:
> From: Sean Paul 
> 
> One instance of DRM_DEBUG_KMS was leftover in dp_link_training, convert
> it to the new shiny.
> 
> Signed-off-by: Sean Paul 
> ---
>  .../gpu/drm/i915/display/intel_dp_link_training.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c 
> b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> index ad02d493ec16..19ba7c7cbaab 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> @@ -26,12 +26,13 @@
>  #include "intel_dp_link_training.h"
>  
>  static void
> -intel_dp_dump_link_status(const u8 link_status[DP_LINK_STATUS_SIZE])
> +intel_dp_dump_link_status(struct drm_device *drm,

I'd generally pass 'i915' rather than the drm_device to
any i915 specific function. But it doesn't really matter here since
we just pass it straight through anyway.

Reviewed-by: Ville Syrjälä 

> +   const u8 link_status[DP_LINK_STATUS_SIZE])
>  {
> -
> - DRM_DEBUG_KMS("ln0_1:0x%x ln2_3:0x%x align:0x%x sink:0x%x 
> adj_req0_1:0x%x adj_req2_3:0x%x\n",
> -   link_status[0], link_status[1], link_status[2],
> -   link_status[3], link_status[4], link_status[5]);
> + drm_dbg_kms(drm,
> + "ln0_1:0x%x ln2_3:0x%x align:0x%x sink:0x%x adj_req0_1:0x%x 
> adj_req2_3:0x%x\n",
> + link_status[0], link_status[1], link_status[2],
> + link_status[3], link_status[4], link_status[5]);
>  }
>  
>  static void intel_dp_reset_lttpr_count(struct intel_dp *intel_dp)
> @@ -642,7 +643,7 @@ intel_dp_link_training_channel_equalization(struct 
> intel_dp *intel_dp,
>   /* Make sure clock is still ok */
>   if (!drm_dp_clock_recovery_ok(link_status,
> crtc_state->lane_count)) {
> - intel_dp_dump_link_status(link_status);
> + intel_dp_dump_link_status(>drm, link_status);
>   drm_dbg_kms(>drm,
>   "Clock recovery check failed, cannot "
>   "continue channel equalization\n");
> @@ -669,7 +670,7 @@ intel_dp_link_training_channel_equalization(struct 
> intel_dp *intel_dp,
>  
>   /* Try 5 times, else fail and try at lower BW */
>   if (tries == 5) {
> - intel_dp_dump_link_status(link_status);
> + intel_dp_dump_link_status(>drm, link_status);
>   drm_dbg_kms(>drm,
>   "Channel equalization failed 5 times\n");
>   }
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS
> 

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [PATCH 1/2] drm/i915/dp_link_training: Add newlines to debug messages

2021-03-11 Thread Ville Syrjälä
On Wed, Mar 10, 2021 at 04:47:56PM -0500, Sean Paul wrote:
> From: Sean Paul 
> 
> This patch adds some newlines which are missing from debug messages.
> This will prevent logs from being stacked up in dmesg.
> 
> Signed-off-by: Sean Paul 
> ---
>  drivers/gpu/drm/i915/display/intel_dp_link_training.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c 
> b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> index 892d7db7d94f..ad02d493ec16 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> @@ -29,7 +29,7 @@ static void
>  intel_dp_dump_link_status(const u8 link_status[DP_LINK_STATUS_SIZE])
>  {
>  
> - DRM_DEBUG_KMS("ln0_1:0x%x ln2_3:0x%x align:0x%x sink:0x%x 
> adj_req0_1:0x%x adj_req2_3:0x%x",
> + DRM_DEBUG_KMS("ln0_1:0x%x ln2_3:0x%x align:0x%x sink:0x%x 
> adj_req0_1:0x%x adj_req2_3:0x%x\n",
> link_status[0], link_status[1], link_status[2],
> link_status[3], link_status[4], link_status[5]);
>  }
> @@ -731,7 +731,7 @@ intel_dp_link_train_phy(struct intel_dp *intel_dp,
>  
>  out:
>   drm_dbg_kms(_to_i915(intel_dp)->drm,
> - "[CONNECTOR:%d:%s] Link Training %s at link rate = %d, lane 
> count = %d, at %s",
> + "[CONNECTOR:%d:%s] Link Training %s at link rate = %d, lane 
> count = %d, at %s\n",

Looking through some ci logs we do get the newline here somehow. A bit
weird.

Reviewed-by: Ville Syrjälä 

-- 
Ville Syrjälä
Intel


2021 X.Org Foundation Membership renewal period extended to Mar 18

2021-03-11 Thread Harry Wentland
Due to some hiccups with some of the early election emails and the large 
spike in membership registrations, the elections committee decided to 
extend the membership deadline by one week to Mar 18, 2021.


If you have not renewed your membership please do so by Thursday, Mar 18 
at https://members.x.org.


The nominated candidates will be announced on Sunday, allowing for a 
week of Candidate QA before the start of election on Mon Mar 22.


** Election Schedule **

Nomination period Start: Mon 22nd February
Nomination period End: Sun 7th March
Publication of Candidates & start of Candidate QA: Mon 15th March
Deadline of X.Org membership application or renewal: Thu 18th March
Election Planned Start: Mon 22nd March anywhere on earth
Election Planned End: Sun 4th April anywhere on earth

** Election Committee **

 * Eric Anholt
 * Mark Filion
 * Keith Packard
 * Harry Wentland

Thanks,
Harry Wentland,
on behalf of the X.Org elections committee


Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-11 Thread Peter Stuge
Noralf Trønnes wrote:
> > I didn't receive the expected bits/bytes for RGB111 on the bulk endpoint,
> > I think because of how components were extracted in gud_xrgb_to_color().
> > 
> > Changing to the following gets me the expected (X R1 G1 B1 X R2 G2 B2) 
> > bytes:
> > 
> > r = (*pix32 >> 8) & 0xff;
> > g = (*pix32 >> 16) & 0xff;
> > b = (*pix32++ >> 24) & 0xff;
> 
> We're accessing the whole word here through pix32, no byte access, so
> endianess doesn't come into play.

Endianness matters because parts of pix32 are used.

Software only sees bytes (or larger) because addresses are byte granular,
but must pay attention to the bit order when dealing with smaller values
inside larger memory accesses.

Given 4 bytes of memory {0x11, 0x22, 0x33, 0x44} at address A, both LE
and BE machines appear the same when accessing individual bytes, but with
uint32_t *a32 = A then a32[0] is 0x44332211 on LE and 0x11223344 on BE.


Hence the question: What does DRM promise about the XRGB mode?

Is it guaranteed that the first byte in memory is always unused, the second
represents red, the third green and the fourth blue (endianess agnostic)?
I'd expect this and I guess that it is the case, but I don't know DRM?

Or is it instead guaranteed that when accessed natively as one 32-bit
value the blue component is always in the most significant byte (endianess
abstracted, always LE in memory) or always in the least significant byte
(endianess abstracted, always BE in memory)?
This would be annoying for userspace, but well, it's possible.

In the abstracted (latter) case pix32 would work, but could still be
questioned on style, and in fact, pix32 didn't work for me, so at a
minimum the byte order would be the reverse.


In the agnostic (former) case your code was correct for BE and mine
for LE, but I'd then suggest using a u8 * to both work correctly
everywhere and be obvious.


> This change will flip r and b, which gives: XRGB -> XBGR

The old code was:
r = *pix32 >> 16;
g = *pix32 >> 8;
b = *pix32++;

On my LE machine this set r to the third byte (G), g to the second (R)
and b to the first (X), explaining the color confusion that I saw.


> BGR is a common thing on controllers, are you sure yours are set to RGB
> and not BGR?

Yes; I've verified that my display takes red in MSB both in its data
sheet and by writing raw bits to it on a system without the gud driver.


> And the 0xff masking isn't necessary since we're assigning to a byte, right?

Not strictly necessary but I like to do it anyway, both to be explicit
and also to ensure that the compiler will never sign extend, if types
are changed or if values become treated as signed and/or larger by the
compiler because the code is changed.

It's frustrating to debug such unexpected changes in behavior due to
a type change or calculation change, but if you find it too defensive
then go ahead and remove it, if pix32 does stay.


> I haven't got a native R1G1B1 display so I have emulated and I do get
> the expected colors. This is the conversion function I use on the device
> which I think is correct:
> 
> static size_t rgb111_to_rgb565(uint16_t *dst, uint8_t *src,
>uint16_t src_width, uint16_t src_height)
> {
> uint8_t rgb111, val = 0;
> size_t len = 0;
> 
> for (uint16_t y = 0; y < src_height; y++) {
> for (uint16_t x = 0; x < src_width; x++) {
> if (!(x % 2))
> val = *src++;
> rgb111 = val >> 4;
> *dst++ = ((rgb111 & 0x04) << 13) | ((rgb111 & 0x02) << 9) |
>  ((rgb111 & 0x01) << 4);

I'm afraid this isn't correct. Two wrongs end up cancelling each other
out and it's not so obvious because the destination has symmetric (565)
components.

If you were to convert to xrgb in the same way I think you'd also
see some color confusion, and in any case blue is getting lost already
in gud_xrgb_to_color() on LE.


//Peter


Re: [PATCH] dma-buf: Fix confusion of dynamic dma-buf vs dynamic attachment

2021-03-11 Thread Daniel Vetter
On Fri, Mar 05, 2021 at 11:54:49AM +0100, Christian König wrote:
> Am 05.03.21 um 11:51 schrieb Chris Wilson:
> > Commit c545781e1c55 ("dma-buf: doc polish for pin/unpin") disagrees with
> > the introduction of dynamism in commit: bb42df4662a4 ("dma-buf: add
> > dynamic DMA-buf handling v15") resulting in warning spew on
> > importing dma-buf. Silence the warning from the latter by only pinning
> > the attachment if the attachment rather than the dmabuf is to be
> > dynamic.
> 
> NAK, this is intentionally like this. You need to pin the DMA-buf if it is
> dynamic and the attachment isn't.
> 
> Otherwise the DMA-buf would be able to move even when it has an attachment
> which can't handle that.
> 
> We should rather fix the documentation if that is wrong on this point.

The doc is right, it's for the exporter function for importers. For
non-dynamic importers dma-buf.c code itself does ensure the pinning
happens. So non-dynamic importers really have no business calling
pin/unpin, because they always get a mapping that's put into system memory
and pinned there.

Ofc for driver specific stuff with direct interfaces you can do whatever
you feel like, but probably good to match these semantics.

But looking at the patch, I think this is more about the locking, not the
pin/unpin stuff. Locking rules definitely depend upon what the exporter
requires, and again dma-buf.c should do all the impedance matching that's
needed.

So I think we're all good with the doc, but please double-check.
-Daniel

> 
> Regards,
> Christian.
> 
> > 
> > Fixes: bb42df4662a4 ("dma-buf: add dynamic DMA-buf handling v15")
> > Fixes: c545781e1c55 ("dma-buf: doc polish for pin/unpin")
> > Signed-off-by: Chris Wilson 
> > Cc: Daniel Vetter 
> > Cc: Christian König 
> > Cc:  # v5.7+
> > ---
> >   drivers/dma-buf/dma-buf.c | 9 +
> >   1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index f264b70c383e..09f5ae458515 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -758,8 +758,8 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct 
> > device *dev,
> > dma_buf_is_dynamic(dmabuf)) {
> > struct sg_table *sgt;
> > -   if (dma_buf_is_dynamic(attach->dmabuf)) {
> > -   dma_resv_lock(attach->dmabuf->resv, NULL);
> > +   if (dma_buf_attachment_is_dynamic(attach)) {
> > +   dma_resv_lock(dmabuf->resv, NULL);
> > ret = dma_buf_pin(attach);
> > if (ret)
> > goto err_unlock;
> > @@ -772,8 +772,9 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct 
> > device *dev,
> > ret = PTR_ERR(sgt);
> > goto err_unpin;
> > }
> > -   if (dma_buf_is_dynamic(attach->dmabuf))
> > -   dma_resv_unlock(attach->dmabuf->resv);
> > +   if (dma_buf_attachment_is_dynamic(attach))
> > +   dma_resv_unlock(dmabuf->resv);
> > +
> > attach->sgt = sgt;
> > attach->dir = DMA_BIDIRECTIONAL;
> > }
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Query regarding DRM mastership sharing between multiple process

2021-03-11 Thread Daniel Vetter
On Fri, Mar 05, 2021 at 05:44:04PM +0200, Pekka Paalanen wrote:
> On Thu, 4 Mar 2021 09:43:22 +0530
> Hardik Panchal  wrote:
> 
> > Hello Sir/Madam,
> > 
> > I am trying to render some stuff using DRM with Qt GUI application and
> > decoded stream from Intel H/w decoder.
> > 
> > I have two applications one is for GUI content and another one is for
> > decoded video streams. While doing this I am facing an issue that only a
> > single process acquires DRM mastership while the other one is getting an
> > error.
> 
> Hi,
> 
> yes, this is deliberate and by design.
> 
> The idea of having two separate processes simultaneously controlling
> KMS planes of the same CRTC is fundamentally forbidden. Even if it was
> not forbidden, doing so would lead to other technical problems.
> 
> You have to change your architecture so that only one process controls
> KMS. It you need other processes, they have to pass buffers or
> rendering commands to the process that does control KMS. In other
> words, you need a display server.

One option is kms leases, where the main compositor with exclusive control
over the display can pass a select set of resources to another process.
But it's a clear lessor/lessee relationship, and the main compositor can
always revoke the lease if needed.
-Daniel

> 
> > While wondering how to get the privilege to render stuff I came
> > across GET_MAGIC and AUTH_MAGIC.
> > Please refer to this text from the MAN page of DRM.
> 
> Those will not help you with breaking the DRM master concept.
> 
> > > All DRM devices provide authentication mechanisms. Only a DRM-Master is
> > > allowed to perform mode-setting or modify core state and only one user can
> > > be DRM-Master at a time. See drmSetMaster
> > > (3) for
> > > information on how to become DRM-Master and what the limitations are. 
> > > Other
> > > DRM users can be authenticated to the DRM-Master via drmAuthMagic
> > > (3) so
> > > they can perform buffer allocations and rendering.
> > >  
> > 
> > As per this the client which is authenticated using magic code should be
> > able to allocate buffer and rendering.
> > But while doing this I am not able to use drmModeSetPlane() for rendering
> > stuff on display from an authenticated client application. It is giving me
> > Permission Denied.
> > 
> > As per my understanding if the client is authenticated by using
> > GET/AUTH_MAGIC it should be able to set a plane and render stuff on the
> > display.
> 
> No. Authentication gives access to buffer allocation and submitting
> rendering commands to the GPU. It does not give access to KMS.
> 
> 
> Sorry,
> pq





-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail

2021-03-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=211277

--- Comment #15 from kolAflash (kolafl...@kolahilft.de) ---
(In reply to Alex Deucher from comment #10)
> Can you bisect? 
> https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html

I've done several s2ram-wakeup cycles (100 automatic and about three manual
wakeups/day) with the kernel I compiled on 2021-03-07.

It's based on 5.10.21 with c6d2b0fbb reverted. (as suggested by Jerome)
Result: No crashes.
This looks very promising!

@Alex
Can I help with anything else to solve this?




I also compiled 5.10.21 without reverting c6d2b0fbb, tested it for a few hours
and got three wakeup-crashes.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


Re: [RESEND 00/53] Rid GPU from W=1 warnings

2021-03-11 Thread Lee Jones
On Thu, 11 Mar 2021, Daniel Vetter wrote:

> On Mon, Mar 08, 2021 at 09:19:32AM +, Lee Jones wrote:
> > On Fri, 05 Mar 2021, Roland Scheidegger wrote:
> > 
> > > The vmwgfx ones look all good to me, so for
> > > 23-53: Reviewed-by: Roland Scheidegger 
> > > That said, they were already signed off by Zack, so not sure what
> > > happened here.
> > 
> > Yes, they were accepted at one point, then dropped without a reason.
> > 
> > Since I rebased onto the latest -next, I had to pluck them back out of
> > a previous one.
> 
> They should show up in linux-next again. We merge patches for next merge
> window even during the current merge window, but need to make sure they
> don't pollute linux-next. Occasionally the cut off is wrong so patches
> show up, and then get pulled again.
> 
> Unfortunately especially the 5.12 merge cycle was very wobbly due to some
> confusion here. But your patches should all be in linux-next again (they
> are queued up for 5.13 in drm-misc-next, I checked that).
> 
> Sorry for the confusion here.

Oh, I see.  Well so long as they don't get dropped, I'll be happy.

Thanks for the explanation Daniel

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk

2021-03-11 Thread Daniel Gomez
On Thu, 11 Mar 2021 at 10:09, Daniel Gomez  wrote:
>
> On Wed, 10 Mar 2021 at 18:06, Alex Deucher  wrote:
> >
> > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez  wrote:
> > >
> > > Disabling GFXOFF via the quirk list fixes a hardware lockup in
> > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
> > >
> > > Signed-off-by: Daniel Gomez 
> > > ---
> > >
> > > This patch is a continuation of the work here:
> > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was discussed 
> > > and
> > > a dma_fence deadlock was provoked as a side effect. To reproduce the issue
> > > please refer to the above link.
> > >
> > > The hardware lockup was introduced in 5.6-rc1 for our particular revision 
> > > as it
> > > wasn't part of the new blacklist. Before that, in kernel v5.5, this 
> > > hardware was
> > > working fine without any hardware lock because the GFXOFF was actually 
> > > disabled
> > > by the if condition for the CHIP_RAVEN case. So this patch, adds the 
> > > 'Radeon
> > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to disable the 
> > > GFXOFF.
> > >
> > > But besides the fix, I'd like to ask from where this revision comes from. 
> > > Is it
> > > an ASIC revision or is it hardcoded in the VBIOS from our vendor? From 
> > > what I
> > > can see, it comes from the ASIC and I wonder if somehow we can get an APU 
> > > in the
> > > future, 'not blacklisted', with the same problem. Then, should this table 
> > > only
> > > filter for the vendor and device and not the revision? Do you know if 
> > > there are
> > > any revisions for the 1002:15dd validated, tested and functional?
> >
> > The pci revision id (RID) is used to specify the specific SKU within a
> > family.  GFXOFF is supposed to be working on all raven variants.  It
> > was tested and functional on all reference platforms and any OEM
> > platforms that launched with Linux support.  There are a lot of
> > dependencies on sbios in the early raven variants (0x15dd), so it's
> > likely more of a specific platform issue, but there is not a good way
> > to detect this so we use the DID/SSID/RID as a proxy.  The newer raven
> > variants (0x15d8) have much better GFXOFF support since they all
> > shipped with newer firmware and sbios.
>
> We took one of the first reference platform boards to design our
> custom board based on the V1605B and I assume it has one of the early 
> 'unstable'
> raven variants with RID 0x83. Also, as OEM we are in control of the bios
> (provided by insyde) but I wasn't sure about the RID so, thanks for the
> clarification. Is there anything we can do with the bios to have the GFXOFF
> enabled and 'stable' for this particular revision? Otherwise we'd need to add
> the 0x83 RID to the table. Also, there is an extra ']' in the patch
> subject. Sorry
> for that. Would you need a new patch in case you accept it with the ']' 
> removed?
>
> Good to hear that the newer raven versions have better GFXOFF support.

Adding Alex Desnoyer to the loop, as he is responsible for the
electronics/hardware and BIOS, so he can provide more information about
this.

I've now done a test on the reference platform (dibbler) with the
latest bios available
and the hw lockup can also be reproduced with the same steps.

For reference, I'm using mainline kernel 5.12-rc2.

[5.938544] [drm] initializing kernel modesetting (RAVEN
0x1002:0x15DD 0x1002:0x15DD 0xC1).
[5.939942] amdgpu: ATOM BIOS: 113-RAVEN-11

As in the previous cases, the clocks go to 100% of usage when the hang occurs.

However, when the gpu hangs, dmesg output displays the following:

[ 1568.279847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
timeout, signaled seq=188, emitted seq=191
[ 1568.434084] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process Xorg pid 311 thread Xorg:cs0 pid 312
[ 1568.279847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
timeout, signaled seq=188, emitted seq=191
[ 1568.434084] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process Xorg pid 311 thread Xorg:cs0 pid 312
[ 1568.507000] amdgpu :01:00.0: amdgpu: GPU reset begin!
[ 1628.491882] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1628.491882] rcu: 3-...!: (665 ticks this GP)
idle=f9a/1/0x4000 softirq=188533/188533 fqs=15
[ 1628.491882] rcu: rcu_sched kthread timer wakeup didn't happen for
58497 jiffies! g726761 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 1628.491882] rcu: Possible timer handling issue on cpu=2
timer-softirq=55225
[ 1628.491882] rcu: rcu_sched kthread starved for 58500 jiffies!
g726761 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
[ 1628.491882] rcu: Unless rcu_sched kthread gets sufficient CPU
time, OOM is now expected behavior.
[ 1628.491882] rcu: RCU grace-period kthread stack dump:
[ 1628.491882] rcu: Stack dump where RCU GP kthread last ran:
[ 1808.518445] rcu: INFO: rcu_sched self-detected stall on CPU
[ 1808.518445] rcu: 3-...!: (2643 ticks this GP)
idle=f9a/1/0x4000 

Re: [RESEND 00/53] Rid GPU from W=1 warnings

2021-03-11 Thread Daniel Vetter
On Mon, Mar 08, 2021 at 09:19:32AM +, Lee Jones wrote:
> On Fri, 05 Mar 2021, Roland Scheidegger wrote:
> 
> > The vmwgfx ones look all good to me, so for
> > 23-53: Reviewed-by: Roland Scheidegger 
> > That said, they were already signed off by Zack, so not sure what
> > happened here.
> 
> Yes, they were accepted at one point, then dropped without a reason.
> 
> Since I rebased onto the latest -next, I had to pluck them back out of
> a previous one.

They should show up in linux-next again. We merge patches for next merge
window even during the current merge window, but need to make sure they
don't pollute linux-next. Occasionally the cut off is wrong so patches
show up, and then get pulled again.

Unfortunately especially the 5.12 merge cycle was very wobbly due to some
confusion here. But your patches should all be in linux-next again (they
are queued up for 5.13 in drm-misc-next, I checked that).

Sorry for the confusion here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/vboxvideo: Use managed VRAM-helper initialization

2021-03-11 Thread Daniel Vetter
On Thu, Mar 11, 2021 at 02:13:57PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 3/11/21 2:11 PM, Daniel Vetter wrote:
> > On Wed, Mar 03, 2021 at 09:39:46AM +0800, Tian Tao wrote:
> >> updated to use drmm_vram_helper_init().
> >>
> >> Signed-off-by: Tian Tao 
> > 
> > Hans, do you plan to pick this up?
> 
> The drm patch-workflow falls outside my daily kernel-work workflow,
> so it is always a bit of a task-switch for me to switch to dealing
> with the "dim" workflow. ATM I don't have any other drm work pending,
> so I would appreciate it if someone else can pick this up.
> 
> The change does look good to me:
> 
> Reviewed-by: Hans de Goede 

I'll push, thanks for reviewing.
-Daniel

> 
> Regards,
> 
> Hans
> 
>  
> 
> > -Daniel
> > 
> >> ---
> >>  drivers/gpu/drm/vboxvideo/vbox_ttm.c | 7 ++-
> >>  1 file changed, 2 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vboxvideo/vbox_ttm.c 
> >> b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> >> index 0066a3c..fd8a53a 100644
> >> --- a/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> >> +++ b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> >> @@ -12,15 +12,13 @@
> >>  
> >>  int vbox_mm_init(struct vbox_private *vbox)
> >>  {
> >> -  struct drm_vram_mm *vmm;
> >>int ret;
> >>struct drm_device *dev = >ddev;
> >>struct pci_dev *pdev = to_pci_dev(dev->dev);
> >>  
> >> -  vmm = drm_vram_helper_alloc_mm(dev, pci_resource_start(pdev, 0),
> >> +  ret = drmm_vram_helper_init(dev, pci_resource_start(pdev, 0),
> >>   vbox->available_vram_size);
> >> -  if (IS_ERR(vmm)) {
> >> -  ret = PTR_ERR(vmm);
> >> +  if (ret) {
> >>DRM_ERROR("Error initializing VRAM MM; %d\n", ret);
> >>return ret;
> >>}
> >> @@ -33,5 +31,4 @@ int vbox_mm_init(struct vbox_private *vbox)
> >>  void vbox_mm_fini(struct vbox_private *vbox)
> >>  {
> >>arch_phys_wc_del(vbox->fb_mtrr);
> >> -  drm_vram_helper_release_mm(>ddev);
> >>  }
> >> -- 
> >> 2.7.4
> >>
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

2021-03-11 Thread Daniel Vetter
On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
 wrote:
>
> Hi!
>
> On 3/11/21 2:00 PM, Daniel Vetter wrote:
> > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
> >> On 3/1/21 3:09 PM, Daniel Vetter wrote:
> >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> >>>  wrote:
> 
>  Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> > On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >> wrote:
> >>> Hi,
> >>>
> >>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>  On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>   wrote:
> > On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >> So I think it stops gup. But I haven't verified at all. Would be
> >> good
> >> if Christian can check this with some direct io to a buffer in
> >> system
> >> memory.
> > Hmm,
> >
> > Docs (again vm_normal_page() say)
> >
> >   * VM_MIXEDMAP mappings can likewise contain memory with or
> > without "struct
> >   * page" backing, however the difference is that _all_ pages
> > with a struct
> >   * page (that is, those where pfn_valid is true) are refcounted
> > and
> > considered
> >   * normal pages by the VM. The disadvantage is that pages are
> > refcounted
> >   * (which can be slower and simply not an option for some 
> > PFNMAP
> > users). The
> >   * advantage is that we don't have to follow the strict
> > linearity rule of
> >   * PFNMAP mappings in order to support COWable mappings.
> >
> > but it's true __vm_insert_mixed() ends up in the insert_pfn()
> > path, so
> > the above isn't really true, which makes me wonder if and in that
> > case
> > why there could any longer ever be a significant performance
> > difference
> > between MIXEDMAP and PFNMAP.
>  Yeah it's definitely confusing. I guess I'll hack up a patch and see
>  what sticks.
> 
> > BTW regarding the TTM hugeptes, I don't think we ever landed that
> > devmap
> > hack, so they are (for the non-gup case) relying on
> > vma_is_special_huge(). For the gup case, I think the bug is still
> > there.
>  Maybe there's another devmap hack, but the ttm_vm_insert functions do
>  use PFN_DEV and all that. And I think that stops gup_fast from trying
>  to find the underlying page.
>  -Daniel
> >>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>> to set
> >>>
> >>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> >>> true, and
> >>> then
> >>>
> >>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>> gup_fast()
> >>> backs off,
> >>>
> >>> in the end that would mean setting in stone that "if there is a huge
> >>> devmap
> >>> page table entry for which we haven't registered any devmap struct
> >>> pages
> >>> (get_dev_pagemap returns NULL), we should treat that as a "special"
> >>> huge
> >>> page table entry".
> >>>
> >>>From what I can tell, all code calling get_dev_pagemap() already
> >>> does that,
> >>> it's just a question of getting it accepted and formalizing it.
> >> Oh I thought that's already how it works, since I didn't spot anything
> >> else that would block gup_fast from falling over. I guess really would
> >> need some testcases to make sure direct i/o (that's the easiest to 
> >> test)
> >> fails like we expect.
> > Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> > Otherwise pmd_devmap() will not return true and since there is no
> > pmd_special() things break.
>  Is that maybe the issue we have seen with amdgpu and huge pages?
> >>> Yeah, essentially when you have a hugepte inserted by ttm, and it
> >>> happens to point at system memory, then gup will work on that. And
> >>> create all kinds of havoc.
> >>>
>  Apart from that I'm lost guys, that devmap and gup stuff is not
>  something I have a good knowledge of apart from a one mile high view.
> >>> I'm not really better, hence would be good to do a testcase and see.
> >>> This should provoke it:
> >>> - allocate nicely aligned bo in system memory
> >>> - mmap, again nicely aligned to 2M
> >>> - do some direct io from a filesystem into that mmap, that should trigger 
> >>> gup
> >>> - before the gup completes free the mmap and bo so that ttm recycles
> >>> the pages, which should trip up on the elevated refcount. If you wait
> >>> until the direct io is completely, then I think nothing bad can be
> >>> observed.
> >>>
> >>> Ofc if your amdgpu+hugepte issue is something 

Re: [PATCH] drm/vboxvideo: Use managed VRAM-helper initialization

2021-03-11 Thread Hans de Goede
Hi,

On 3/11/21 2:11 PM, Daniel Vetter wrote:
> On Wed, Mar 03, 2021 at 09:39:46AM +0800, Tian Tao wrote:
>> updated to use drmm_vram_helper_init().
>>
>> Signed-off-by: Tian Tao 
> 
> Hans, do you plan to pick this up?

The drm patch-workflow falls outside my daily kernel-work workflow,
so it is always a bit of a task-switch for me to switch to dealing
with the "dim" workflow. ATM I don't have any other drm work pending,
so I would appreciate it if someone else can pick this up.

The change does look good to me:

Reviewed-by: Hans de Goede 

Regards,

Hans

 

> -Daniel
> 
>> ---
>>  drivers/gpu/drm/vboxvideo/vbox_ttm.c | 7 ++-
>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/vboxvideo/vbox_ttm.c 
>> b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
>> index 0066a3c..fd8a53a 100644
>> --- a/drivers/gpu/drm/vboxvideo/vbox_ttm.c
>> +++ b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
>> @@ -12,15 +12,13 @@
>>  
>>  int vbox_mm_init(struct vbox_private *vbox)
>>  {
>> -struct drm_vram_mm *vmm;
>>  int ret;
>>  struct drm_device *dev = >ddev;
>>  struct pci_dev *pdev = to_pci_dev(dev->dev);
>>  
>> -vmm = drm_vram_helper_alloc_mm(dev, pci_resource_start(pdev, 0),
>> +ret = drmm_vram_helper_init(dev, pci_resource_start(pdev, 0),
>> vbox->available_vram_size);
>> -if (IS_ERR(vmm)) {
>> -ret = PTR_ERR(vmm);
>> +if (ret) {
>>  DRM_ERROR("Error initializing VRAM MM; %d\n", ret);
>>  return ret;
>>  }
>> @@ -33,5 +31,4 @@ int vbox_mm_init(struct vbox_private *vbox)
>>  void vbox_mm_fini(struct vbox_private *vbox)
>>  {
>>  arch_phys_wc_del(vbox->fb_mtrr);
>> -drm_vram_helper_release_mm(>ddev);
>>  }
>> -- 
>> 2.7.4
>>
> 



Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

2021-03-11 Thread Intel

Hi!

On 3/11/21 2:00 PM, Daniel Vetter wrote:

On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:

On 3/1/21 3:09 PM, Daniel Vetter wrote:

On Mon, Mar 1, 2021 at 11:17 AM Christian König
 wrote:


Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):

On 3/1/21 10:05 AM, Daniel Vetter wrote:

On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
wrote:

Hi,

On 3/1/21 9:28 AM, Daniel Vetter wrote:

On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
 wrote:

On 2/26/21 2:28 PM, Daniel Vetter wrote:

So I think it stops gup. But I haven't verified at all. Would be
good
if Christian can check this with some direct io to a buffer in
system
memory.

Hmm,

Docs (again vm_normal_page() say)

  * VM_MIXEDMAP mappings can likewise contain memory with or
without "struct
  * page" backing, however the difference is that _all_ pages
with a struct
  * page (that is, those where pfn_valid is true) are refcounted
and
considered
  * normal pages by the VM. The disadvantage is that pages are
refcounted
  * (which can be slower and simply not an option for some PFNMAP
users). The
  * advantage is that we don't have to follow the strict
linearity rule of
  * PFNMAP mappings in order to support COWable mappings.

but it's true __vm_insert_mixed() ends up in the insert_pfn()
path, so
the above isn't really true, which makes me wonder if and in that
case
why there could any longer ever be a significant performance
difference
between MIXEDMAP and PFNMAP.

Yeah it's definitely confusing. I guess I'll hack up a patch and see
what sticks.


BTW regarding the TTM hugeptes, I don't think we ever landed that
devmap
hack, so they are (for the non-gup case) relying on
vma_is_special_huge(). For the gup case, I think the bug is still
there.

Maybe there's another devmap hack, but the ttm_vm_insert functions do
use PFN_DEV and all that. And I think that stops gup_fast from trying
to find the underlying page.
-Daniel

Hmm perhaps it might, but I don't think so. The fix I tried out was
to set

PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
true, and
then

follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
gup_fast()
backs off,

in the end that would mean setting in stone that "if there is a huge
devmap
page table entry for which we haven't registered any devmap struct
pages
(get_dev_pagemap returns NULL), we should treat that as a "special"
huge
page table entry".

From what I can tell, all code calling get_dev_pagemap() already
does that,
it's just a question of getting it accepted and formalizing it.

Oh I thought that's already how it works, since I didn't spot anything
else that would block gup_fast from falling over. I guess really would
need some testcases to make sure direct i/o (that's the easiest to test)
fails like we expect.

Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
Otherwise pmd_devmap() will not return true and since there is no
pmd_special() things break.

Is that maybe the issue we have seen with amdgpu and huge pages?

Yeah, essentially when you have a hugepte inserted by ttm, and it
happens to point at system memory, then gup will work on that. And
create all kinds of havoc.


Apart from that I'm lost guys, that devmap and gup stuff is not
something I have a good knowledge of apart from a one mile high view.

I'm not really better, hence would be good to do a testcase and see.
This should provoke it:
- allocate nicely aligned bo in system memory
- mmap, again nicely aligned to 2M
- do some direct io from a filesystem into that mmap, that should trigger gup
- before the gup completes free the mmap and bo so that ttm recycles
the pages, which should trip up on the elevated refcount. If you wait
until the direct io is completely, then I think nothing bad can be
observed.

Ofc if your amdgpu+hugepte issue is something else, then maybe we have
another issue.

Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
-Daniel

So I did the following quick experiment on vmwgfx, and it turns out that
with it,
fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds

I should probably craft an RFC formalizing this.

Yeah I think that would be good. Maybe even more formalized if we also
switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
something like that.

Otoh your description of when it only sometimes succeeds would indicate my
understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.


My understanding from reading the vmf_insert_mixed() code is that iff 
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's 
not consistent with the vm_normal_page() doc. For architectures without 
pte_special, VM_PFNMAP must be used, and then we must also block COW 
mappings.


If we can get someone can commit to verify that the potential PAT WC 
performance issue is gone with 

Re: [PATCH] drm/vboxvideo: Use managed VRAM-helper initialization

2021-03-11 Thread Daniel Vetter
On Wed, Mar 03, 2021 at 09:39:46AM +0800, Tian Tao wrote:
> updated to use drmm_vram_helper_init().
> 
> Signed-off-by: Tian Tao 

Hans, do you plan to pick this up?
-Daniel

> ---
>  drivers/gpu/drm/vboxvideo/vbox_ttm.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_ttm.c 
> b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> index 0066a3c..fd8a53a 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> @@ -12,15 +12,13 @@
>  
>  int vbox_mm_init(struct vbox_private *vbox)
>  {
> - struct drm_vram_mm *vmm;
>   int ret;
>   struct drm_device *dev = >ddev;
>   struct pci_dev *pdev = to_pci_dev(dev->dev);
>  
> - vmm = drm_vram_helper_alloc_mm(dev, pci_resource_start(pdev, 0),
> + ret = drmm_vram_helper_init(dev, pci_resource_start(pdev, 0),
>  vbox->available_vram_size);
> - if (IS_ERR(vmm)) {
> - ret = PTR_ERR(vmm);
> + if (ret) {
>   DRM_ERROR("Error initializing VRAM MM; %d\n", ret);
>   return ret;
>   }
> @@ -33,5 +31,4 @@ int vbox_mm_init(struct vbox_private *vbox)
>  void vbox_mm_fini(struct vbox_private *vbox)
>  {
>   arch_phys_wc_del(vbox->fb_mtrr);
> - drm_vram_helper_release_mm(>ddev);
>  }
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


  1   2   >