Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-08 Thread Christian König

On 07.09.23 at 18:33, suijingfeng wrote:

Hi,


On 2023/9/7 17:08, Christian König wrote:


I strongly suggest that you just completely drop this here 



Dropping this is OK, no problem. Then I will go and develop something else.
This version was not originally intended to be merged, as it's an RFC.
Also, the core mechanism is already finished; it is the first patch in 
this series.
What is left is just policy (how to specify one and parse the kernel 
command line), nothing interesting.
It is actually to fulfill my promise at V3, which was to give some 
examples as use cases.



and go into the AST driver and try to fix it. 


Well, someone told me yesterday that this is well-defined behavior,
which implies that it is not a bug. I'm not going to fix a non-bug.


Sorry for that, I didn't realize what you are actually trying to do.


But if Thomas asks me to fix it, then I probably have to try to fix it.
But I suggest that if things are not broken, don't fix them. Otherwise this
may cause bigger trouble. For the server's single-display use case, it is
good enough.


Yeah, exactly that's the reason why you shouldn't mess with this.

In theory you could try to re-program the necessary north bridge blocks 
to make integrated graphics work even if you installed a dedicated VGA 
adapter, but you will most likely be missing something.


The only real fix is to tell the BIOS that you want to use the 
integrated VGA device even if a dedicated one is detected.


If you want to learn more about the background AMD has a bunch of 
documentation around this on their website: 
https://www.amd.com/en/search/documentation/hub.html


The most interesting document for you is probably the BIOS programming 
manual, but don't ask me what exactly the title of that one is. @Alex do 
you remember what that was called?


IIRC Intel had similar documentation public, but I don't know where to 
find it offhand.


Regards,
Christian.




Thanks.





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread suijingfeng

Hi,


On 2023/9/7 17:08, Christian König wrote:


I strongly suggest that you just completely drop this here 



Dropping this is OK, no problem. Then I will go and develop something else.
This version was not originally intended to be merged, as it's an RFC.
Also, the core mechanism is already finished; it is the first patch in this series.
What is left is just policy (how to specify one and parse the kernel command line), 
nothing interesting.
It is actually to fulfill my promise at V3, which was to give some examples as 
use cases.


and go into the AST driver and try to fix it. 


Well, someone told me yesterday that this is well-defined behavior,
which implies that it is not a bug. I'm not going to fix a non-bug.
But if Thomas asks me to fix it, then I probably have to try to fix it.
But I suggest that if things are not broken, don't fix them. Otherwise this
may cause bigger trouble. For the server's single-display use case, it is
good enough.


Thanks.



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König

On 07.09.23 at 17:26, suijingfeng wrote:

[SNIP]



Then I'll give you another example; see below for a detailed description.
I have one AMD BC160 GPU, see [1] for what it looks like.

The GPU doesn't have a display connector exposed.
It can actually be seen as a render-only GPU or a compute-class GPU for 
bitcoin.

But its firmware still claims that this GPU is VGA compatible.
When this GPU is mounted on the motherboard, the system always selects it 
as primary.

But this GPU cannot be connected to a monitor.

In such a situation, modprobe.blacklist=amdgpu doesn't work either,
because vgaarb always selects this GPU as primary; this is a 
device-level decision.


It's not VGAARB which makes this selection, it's the BIOS. VGAARB just 
detects what the BIOS has decided.
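
One way to check from userspace which device was left as the boot VGA device is the `boot_vga` attribute that the PCI core exposes in sysfs for VGA-class devices. The following is only a minimal sketch and not part of the patch series; the helper name `find_boot_vga` and its `sysfs_root` parameter are made up for illustration, and the default path assumes a standard sysfs mount.

```python
from pathlib import Path


def find_boot_vga(sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses whose boot_vga attribute reads "1"."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    found = []
    for dev in sorted(root.iterdir()):
        try:
            value = (dev / "boot_vga").read_text().strip()
        except OSError:
            # Devices that are not VGA class have no boot_vga attribute.
            continue
        if value == "1":
            found.append(dev.name)
    return found


if __name__ == "__main__":
    print(find_boot_vga())
```

On a machine like the one described in this thread, the result would be expected to name the same device that the vgaarb messages report as the boot VGA device.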




$ dmesg | grep vgaarb:

[    3.541405] pci :0c:00.0: vgaarb: BAR 0: [mem 
0xa000-0xafff 64bit pref] contains firmware FB 
[0xa000-0xa02f]

[    3.901448] pci :05:00.0: vgaarb: setting as boot VGA device
[    3.905375] pci :05:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[    3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device 
(overriding previous)
[    3.909375] pci :0c:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
[    3.913375] pci :0d:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none

[    3.913377] vgaarb: loaded
[   13.513760] amdgpu :0c:00.0: vgaarb: deactivate vga console
[   19.020992] amdgpu :0c:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem


I'm using an Ubuntu 22.04 system; with ast.modeset=10 passed on the 
command line,
I am still able to enter the graphical system and view this GPU as a 
render-only GPU.

I will probably continue to examine what's wrong. Apart from this, 
drm/amdgpu reports
" *ERROR* IB test failed on sdma0 (-110)" to me.

Does this count as a problem?


No, again that is perfectly expected behavior.

Some BIOSes (or maybe most by modern standards) allow overriding this, 
but if you later override this from the OS, you run the hardware outside 
what's validated.


When you put a VGA device into a board with an integrated VGA device the 
integrated one gets disabled. This is even part of some PCIe 
specification IIRC.


So the problems you run into here are perfectly expected.

Regards,
Christian.



Until I find a solution, I have to keep this de facto render-only GPU 
mounted,
because I need to recompile the kernel module, install it, and 
test it.


All I need is a 2D video card to display something; the ast DRM driver is 
OK, despite being simple.

It suits my daily usage with VIM; that's enough for me.

Now, the real questions that I want to ask are:

1)

When the kernel driver module is blocked (by 
modprobe.blacklist=amdgpu),
vgaarb still selects the device as primary, which leaves the X server 
crashing (because no kernel-space driver is loaded).

Does this count as a problem?


2)

Does my approach of mounting another GPU as the primary display 
adapter,
while its real purpose is bug fixing and development for another 
GPU,

count as a use case?


$ cat demsg.txt | grep drm

[   10.099888] ACPI: bus type drm_connector registered
[   11.083920] etnaviv :0d:00.0: [drm] bind etnaviv-display, 
master name: :0d:00.0
[   11.084106] [drm] Initialized etnaviv 1.3.0 20151214 for 
:0d:00.0 on minor 0

[   13.301702] [drm] amdgpu kernel modesetting enabled.
[   13.359820] [drm] initializing kernel modesetting (NAVI12 
0x1002:0x7360 0x1002:0x0A34 0xC7).

[   13.368246] [drm] register mmio base: 0xEB10
[   13.372861] [drm] register mmio size: 524288
[   13.380788] [drm] add ip block number 0 
[   13.385661] [drm] add ip block number 1 
[   13.390531] [drm] add ip block number 2 
[   13.395405] [drm] add ip block number 3 
[   13.399760] [drm] add ip block number 4 
[   13.404111] [drm] add ip block number 5 
[   13.408378] [drm] add ip block number 6 
[   13.413249] [drm] add ip block number 7 
[   13.433546] [drm] add ip block number 8 
[   13.433547] [drm] add ip block number 9 
[   13.497757] [drm] VCN decode is enabled in VM mode
[   13.502540] [drm] VCN encode is enabled in VM mode
[   13.508785] [drm] JPEG decode is enabled in VM mode
[   13.529596] [drm] vm size is 262144 GB, 4 levels, block size is 
9-bit, fragment size is 9-bit

[   13.564762] [drm] Detected VRAM RAM=8176M, BAR=256M
[   13.569628] [drm] RAM width 2048bits HBM
[   13.574167] [drm] amdgpu: 8176M of VRAM memory ready
[   13.579125] [drm] amdgpu: 15998M of GTT memory ready.
[   13.584184] [drm] GART: num cpu pages 131072, num gpu pages 131072
[   13.590505] [drm] PCIE GART of 512M enabled (table at 
0x00800030).
[   13.598749] [drm] Found VCN firmware Version ENC: 1.16 DEC: 5 VEP: 
0 Revision: 4

[   13.671786] [drm] reserve 0xe0 from 0x81fd00 for PSP TMR
[   13.801235] [drm] Display Core v3.2.247 initialized on DCN 2.0
[   13.807061] [drm] 

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread suijingfeng

Hi,


On 2023/9/7 20:43, Christian König wrote:

On 07.09.23 at 14:32, suijingfeng wrote:

Hi,


On 2023/9/7 17:08, Christian König wrote:
Well, I have over 25 years of experience with display hardware and 
what you describe here was never an issue. 


I want to give you an example to let you know more.

I have an ASRock AD2550B-ITX board[1].
When another discrete video card is mounted into its mini PCIe slot or 
PCI slot,
the IGD cannot be the primary display adapter anymore. The display is 
totally black.

I have tried to draft a few trivial patches to help fix this[2].

And I want to use the IGD as primary; does this count as an issue?


No, this is completely expected behavior and a limitation of the 
hardware design.


As far as I know both AMD and Intel GPUs work the same here.

Regards,
Christian.



[1] https://www.asrock.com/mb/Intel/AD2550-ITX/
[2] https://patchwork.freedesktop.org/series/123073/



Then I'll give you another example; see below for a detailed description.
I have one AMD BC160 GPU, see [1] for what it looks like.

The GPU doesn't have a display connector exposed.
It can actually be seen as a render-only GPU or a compute-class GPU for bitcoin.
But its firmware still claims that this GPU is VGA compatible.
When this GPU is mounted on the motherboard, the system always selects it as 
primary.
But this GPU cannot be connected to a monitor.

In such a situation, modprobe.blacklist=amdgpu doesn't work either,
because vgaarb always selects this GPU as primary; this is a device-level 
decision.

$ dmesg | grep vgaarb:

[3.541405] pci :0c:00.0: vgaarb: BAR 0: [mem 0xa000-0xafff 
64bit pref] contains firmware FB [0xa000-0xa02f]
[3.901448] pci :05:00.0: vgaarb: setting as boot VGA device
[3.905375] pci :05:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device (overriding 
previous)
[3.909375] pci :0c:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
[3.913375] pci :0d:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[3.913377] vgaarb: loaded
[   13.513760] amdgpu :0c:00.0: vgaarb: deactivate vga console
[   19.020992] amdgpu :0c:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
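
The sequence in the log above (one device is set as boot VGA, then a later one overrides it) can be followed mechanically. The sketch below is illustrative only: the function names and regular expressions are assumptions of this example, not anything vgaarb provides, and the sample lines are the log lines from above joined back onto single lines, with the PCI addresses exactly as they appear in this archive.

```python
import re

# "<dev>: vgaarb: setting as boot VGA device" (optionally "(overriding previous)")
BOOT_VGA_RE = re.compile(r"(?P<dev>\S+): vgaarb: setting as boot VGA device")

# "<dev>: vgaarb: VGA device added: decodes=...,owns=...,locks=..."
ADDED_RE = re.compile(
    r"(?P<dev>\S+): vgaarb: VGA device added: "
    r"decodes=(?P<decodes>[^,]+),owns=(?P<owns>[^,]+),locks=(?P<locks>\S+)"
)


def final_boot_vga(log_lines):
    """Return the device named by the last 'setting as boot VGA device' line."""
    boot = None
    for line in log_lines:
        match = BOOT_VGA_RE.search(line)
        if match:
            boot = match.group("dev")
    return boot


def added_devices(log_lines):
    """Map each added VGA device to its decodes/owns/locks state."""
    devices = {}
    for line in log_lines:
        match = ADDED_RE.search(line)
        if match:
            fields = match.groupdict()
            devices[fields.pop("dev")] = fields
    return devices


SAMPLE = [
    "[3.901448] pci :05:00.0: vgaarb: setting as boot VGA device",
    "[3.905375] pci :05:00.0: vgaarb: VGA device added: "
    "decodes=io+mem,owns=none,locks=none",
    "[3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device "
    "(overriding previous)",
    "[3.909375] pci :0c:00.0: vgaarb: VGA device added: "
    "decodes=io+mem,owns=io+mem,locks=none",
]

print(final_boot_vga(SAMPLE))  # prints ":0c:00.0"
```

In this log the device that ends up as boot VGA is also the only one reported with `owns=io+mem`, which `added_devices(SAMPLE)` makes easy to confirm.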

I'm using an Ubuntu 22.04 system; with ast.modeset=10 passed on the command line,
I am still able to enter the graphical system and view this GPU as a 
render-only GPU.
I will probably continue to examine what's wrong. Apart from this, drm/amdgpu reports
" *ERROR* IB test failed on sdma0 (-110)" to me.

Does this count as a problem?

Until I find a solution, I have to keep this de facto render-only GPU mounted,
because I need to recompile the kernel module, install it, and test it.

All I need is a 2D video card to display something; the ast DRM driver is OK, 
despite being simple.
It suits my daily usage with VIM; that's enough for me.

Now, the real questions that I want to ask are:

1)

When the kernel driver module is blocked (by 
modprobe.blacklist=amdgpu),
vgaarb still selects the device as primary, which leaves the X server 
crashing (because no kernel-space driver is loaded).
Does this count as a problem?


2)

Does my approach of mounting another GPU as the primary display adapter,
while its real purpose is bug fixing and development for another GPU,
count as a use case?


$ cat demsg.txt | grep drm

[   10.099888] ACPI: bus type drm_connector registered
[   11.083920] etnaviv :0d:00.0: [drm] bind etnaviv-display, master 
name: :0d:00.0
[   11.084106] [drm] Initialized etnaviv 1.3.0 20151214 for :0d:00.0 
on minor 0

[   13.301702] [drm] amdgpu kernel modesetting enabled.
[   13.359820] [drm] initializing kernel modesetting (NAVI12 
0x1002:0x7360 0x1002:0x0A34 0xC7).

[   13.368246] [drm] register mmio base: 0xEB10
[   13.372861] [drm] register mmio size: 524288
[   13.380788] [drm] add ip block number 0 
[   13.385661] [drm] add ip block number 1 
[   13.390531] [drm] add ip block number 2 
[   13.395405] [drm] add ip block number 3 
[   13.399760] [drm] add ip block number 4 
[   13.404111] [drm] add ip block number 5 
[   13.408378] [drm] add ip block number 6 
[   13.413249] [drm] add ip block number 7 
[   13.433546] [drm] add ip block number 8 
[   13.433547] [drm] add ip block number 9 
[   13.497757] [drm] VCN decode is enabled in VM mode
[   13.502540] [drm] VCN encode is enabled in VM mode
[   13.508785] [drm] JPEG decode is enabled in VM mode
[   13.529596] [drm] vm size is 262144 GB, 4 levels, block size is 
9-bit, fragment size is 9-bit

[   13.564762] [drm] Detected VRAM RAM=8176M, BAR=256M
[   13.569628] [drm] RAM width 2048bits HBM
[   13.574167] [drm] amdgpu: 8176M of VRAM memory ready
[   13.579125] [drm] amdgpu: 15998M of GTT memory ready.
[   13.584184] [drm] GART: num cpu pages 131072, num gpu pages 131072
[   13.590505] [drm] PCIE GART of 512M 

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König

On 07.09.23 at 14:32, suijingfeng wrote:

Hi,


On 2023/9/7 17:08, Christian König wrote:
Well, I have over 25 years of experience with display hardware and 
what you describe here was never an issue. 


I want to give you an example to let you know more.

I have an ASRock AD2550B-ITX board[1].
When another discrete video card is mounted into its mini PCIe slot or 
PCI slot,
the IGD cannot be the primary display adapter anymore. The display is 
totally black.

I have tried to draft a few trivial patches to help fix this[2].

And I want to use the IGD as primary; does this count as an issue?


No, this is completely expected behavior and a limitation of the 
hardware design.


As far as I know both AMD and Intel GPUs work the same here.

Regards,
Christian.



[1] https://www.asrock.com/mb/Intel/AD2550-ITX/
[2] https://patchwork.freedesktop.org/series/123073/





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread suijingfeng

Hi,


On 2023/9/7 17:08, Christian König wrote:
Well, I have over 25 years of experience with display hardware and 
what you describe here was never an issue. 


I want to give you an example to let you know more.

I have an ASRock AD2550B-ITX board[1].
When another discrete video card is mounted into its mini PCIe slot or PCI slot,
the IGD cannot be the primary display adapter anymore. The display is totally 
black.
I have tried to draft a few trivial patches to help fix this[2].

And I want to use the IGD as primary; does this count as an issue?

[1] https://www.asrock.com/mb/Intel/AD2550-ITX/
[2] https://patchwork.freedesktop.org/series/123073/



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Jani Nikula
On Wed, 06 Sep 2023, suijingfeng  wrote:
> Another limitation of the 'nomodeset' parameter is that
> it is only available on recent upstream kernels. Older
> downstream kernels don't support this parameter yet.
> So this creates an inconsistent development experience. I believe that
> there are always some people who need to do back-port and upstream work
> for various reasons.

While that may be true, it's not an argument in favour of adding new
module parameters or special values to existing module parameters. They
would have to be backported just as well.
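
For readers unfamiliar with how values such as `ast.modeset=10` reach a driver: module parameters for built-in or auto-loaded modules appear on the kernel command line as `module.param=value` tokens. The sketch below only illustrates that token shape. The helper `module_params` is hypothetical, the `root=/dev/sda1 ro quiet` part of the sample string is invented, and the kernel's real parser has additional rules that this does not try to reproduce.

```python
import shlex


def module_params(cmdline):
    """Collect 'module.param=value' tokens from a kernel command line string."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        module, dot, param = key.partition(".")
        # Keep only tokens that have both a "module.param" key and a value.
        if sep and dot and module and param:
            params.setdefault(module, {})[param] = value
    return params


if __name__ == "__main__":
    sample = "root=/dev/sda1 ro quiet ast.modeset=10 modprobe.blacklist=amdgpu"
    print(module_params(sample))
    # prints {'ast': {'modeset': '10'}, 'modprobe': {'blacklist': 'amdgpu'}}
```

Any new parameter or new special value would show up in exactly this form, which is why it would need backporting just like `nomodeset` does.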

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-07 Thread Christian König

On 07.09.23 at 04:30, Sui Jingfeng wrote:

Hi,


On 2023/9/6 17:40, Christian König wrote:

On 06.09.23 at 11:08, suijingfeng wrote:

Well, you are welcome to correct me if I'm wrong.


You seem to have some very basic misunderstandings here.

The term framebuffer describes some VRAM memory used for scanout.

This framebuffer is exposed to userspace through some framebuffer 
driver, on UEFI platforms that is usually efifb but can be quite a 
bunch of different drivers.


When the DRM drivers load they remove the previous drivers using 
drm_aperture_remove_conflicting_pci_framebuffers() (or similar 
function), but this does not mean that the framebuffer or scanout 
parameters are modified in any way. It just means that the 
framebuffer is just no longer exposed through this driver.


Take over is the perfectly right description here because that's 
exactly what's happening. The framebuffer configuration including the 
VRAM memory as well as the parameters for scanout are exposed by the 
newly loaded DRM driver.


In other words, userspace can query through the DRM interfaces which 
monitors are already driven by the hardware and so, in your terminology, 
figure out which is the primary one.
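
As a rough userspace illustration of that kind of query, the connector directories under `/sys/class/drm` expose `status` and `enabled` files. This is a sketch under stated assumptions: it reads sysfs rather than the DRM ioctl interface referred to above, the function names are invented for this example, and what it reports is only the inherited configuration as long as nothing has reprogrammed the outputs since the driver loaded.

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Map connector name -> (status, enabled) for each DRM connector entry."""
    root = Path(drm_root)
    states = {}
    if not root.is_dir():
        return states
    # Connector entries look like "card0-HDMI-A-1"; plain "card0" is skipped.
    for entry in sorted(root.glob("card*-*")):
        try:
            status = (entry / "status").read_text().strip()
            enabled = (entry / "enabled").read_text().strip()
        except OSError:
            continue
        states[entry.name] = (status, enabled)
    return states


def driven_connectors(drm_root="/sys/class/drm"):
    """Return connectors that are both connected and currently enabled."""
    return [
        name
        for name, (status, enabled) in connector_states(drm_root).items()
        if status == "connected" and enabled == "enabled"
    ]


if __name__ == "__main__":
    print(driven_connectors())
```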



I'm not entirely convinced by this idea, though you might be correct.


Well I can point you to the code if you don't believe me.


But there are cases where there are multiple monitors and each video card
is connected to one.


Yeah, but this is irrelevant. The key point is the configuration is 
taken over when the driver loads.


So whatever is there before as setup (one monitor showing console, three 
monitors mirrored, whatever) should be there after loading the driver as 
well. This configuration is just immediately overwritten because nobody 
cares about it.




It is also quite common that no monitor is connected: let the machine boot
first, then find a monitor, connect it to a random display output, and see
which one will display. I don't expect the primary to shift with that.
The primary one has to be determined as early as possible, because
the VGA console and the framebuffer console may output directly to the 
primary.


Well, that is simply not correct. There is no concept of a "primary" 
display; it can just be that a monitor was brought up by the BIOS or 
bootloader and we take over this configuration.



Getting DDC and/or HPD involved may unnecessarily complicate the problem.

There are ASpeed BMCs which add a virtual connector in order to be able to 
display remotely.

There are also commands to force a connector into connected status.


It's just that, as Thomas explained as well, this is completely 
irrelevant to any modern desktop. Both X and Wayland iterate the 
available devices and start rendering to them; which one was used 
during boot doesn't really matter to them.



You may be correct, but I'm still not sure.
I probably need more time to investigate.
My colleagues and I are mainly using the X server;
the versions range from 1.20.4 to 1.21.1.4.
Even if this is true, the problems still exist for non-modern desktops.


Well, I have over 25 years of experience with display hardware and what 
you describe here was never an issue.


What you have is simply a broken display driver which for some reason 
can't handle your use case.


I strongly suggest that you just completely drop this here and go into 
the AST driver and try to fix it.


Regards,
Christian.




Apart from that ranting like this and trying to explain stuff to 
people who obviously have much better background in the topic is not 
going to help your patches getting upstream.




Thanks for sharing so much knowledge with me.
I realize where the problems are now.
I will try to resolve the concerns in the next version.



Regards,
Christian.





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng

Hi,


On 2023/9/6 17:40, Christian König wrote:

On 06.09.23 at 11:08, suijingfeng wrote:

Well, you are welcome to correct me if I'm wrong.


You seem to have some very basic misunderstandings here.

The term framebuffer describes some VRAM memory used for scanout.

This framebuffer is exposed to userspace through some framebuffer 
driver, on UEFI platforms that is usually efifb but can be quite a 
bunch of different drivers.


When the DRM drivers load they remove the previous drivers using 
drm_aperture_remove_conflicting_pci_framebuffers() (or similar 
function), but this does not mean that the framebuffer or scanout 
parameters are modified in any way. It just means that the framebuffer 
is just no longer exposed through this driver.


Take over is the perfectly right description here because that's 
exactly what's happening. The framebuffer configuration including the 
VRAM memory as well as the parameters for scanout are exposed by the 
newly loaded DRM driver.


In other words, userspace can query through the DRM interfaces which 
monitors are already driven by the hardware and so, in your terminology, 
figure out which is the primary one.



I'm not entirely convinced by this idea, though you might be correct.
But there are cases where there are multiple monitors and each video card
is connected to one.

It is also quite common that no monitor is connected: let the machine boot
first, then find a monitor, connect it to a random display output, and see
which one will display. I don't expect the primary to shift with that.
The primary one has to be determined as early as possible, because
the VGA console and the framebuffer console may output directly to the primary.
Getting DDC and/or HPD involved may unnecessarily complicate the problem.

There are ASpeed BMCs which add a virtual connector in order to be able to display 
remotely.
There are also commands to force a connector into connected status.


It's just that, as Thomas explained as well, this is completely 
irrelevant to any modern desktop. Both X and Wayland iterate the 
available devices and start rendering to them; which one was used 
during boot doesn't really matter to them.



You may be correct, but I'm still not sure.
I probably need more time to investigate.
My colleagues and I are mainly using the X server;
the versions range from 1.20.4 to 1.21.1.4.
Even if this is true, the problems still exist for non-modern desktops.

Apart from that ranting like this and trying to explain stuff to 
people who obviously have much better background in the topic is not 
going to help your patches getting upstream.




Thanks for sharing so much knowledge with me.
I realize where the problems are now.
I will try to resolve the concerns in the next version.



Regards,
Christian.



Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Alex Williamson
On Wed, 6 Sep 2023 11:51:59 +0800
Sui Jingfeng  wrote:

> Hi,
> 
> 
> On 2023/9/5 22:52, Alex Williamson wrote:
> > On Tue,  5 Sep 2023 03:57:15 +0800
> > Sui Jingfeng  wrote:
> >  
> >> From: Sui Jingfeng 
> >>
> >> On a machine with multiple GPUs, a Linux user has no control over which
> >> one is primary at boot time. This series tries to solve the above-mentioned
> >> problem by introducing the ->be_primary() function stub. Specific
> >> device drivers can provide an implementation to hook up with this stub by
> >> calling the vga_client_register() function.
> >>
> >> Once the driver has bound the device successfully, VGAARB will call back to
> >> the device driver to query whether the device driver wants to be primary or
> >> not. Device drivers can just pass NULL if they have no such need.
> >>
> >> Please note that:
> >>
> >> 1) The ARM64, LoongArch, and MIPS servers have a lot of PCIe slots, and I
> >> would like to mount at least three video cards.
> >>
> >> 2) Typically, those non-x86 machines don't have good UEFI firmware
> >> support, so selecting the primary GPU at the firmware stage is not supported.
> >> Even on x86, there are old UEFI firmwares which have already made an
> >> undesired decision for you.
> >>
> >> 3) This series is an attempt to solve the remaining problems at the driver
> >> level, while another series[1] of mine targets the majority of the
> >> problems at the device level.
> >>
> >> Tested (limited) on x86 with four video cards mounted; Intel UHD Graphics
> >> 630 is the default boot VGA, successfully overridden by ast2400 with
> >> ast.modeset=10 appended to the kernel command line.
> >>
> >> $ lspci | grep VGA
> >>
> >>   00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 
> >> [UHD Graphics 630]  
> > In all my previous experiments with VGA routing and IGD I found that
> > IGD can't actually release VGA routing and Intel confirmed the hardware
> > doesn't have the ability to do so.  
> 
> Which model of the IGD are you using? Even for the IGD in the Atom D2550,
> the legacy 128KB VGA memory range can be configured to be mapped to the IGD
> or to the DMI interface. See section 1.7.3.2 of the N2000 datasheet[1].

I believe it's the VGA I/O that can't be disabled, there's no means to
do so other than the I/O enable bit in the command register and iirc
the driver depends on this for other features.  The history of this is
pretty old, but here are some links:

https://lore.kernel.org/all/1376486637.31494.19.ca...@ul30vt.home/
https://bbs.archlinux.org/viewtopic.php?pid=1400212#p1400212
https://lore.kernel.org/all/20130815223917.27890.28003.st...@bling.home/
https://lore.kernel.org/all/20130824144701.23370.42110.st...@bling.home/
https://lore.kernel.org/all/20140509201655.2849.97478.st...@bling.home/

I think the issue was that i915 doesn't claim to the VGA arbiter to be
controlling legacy VGA ranges, but in fact the hardware does claim
those ranges.  We can "fix" i915 to report that VGA MMIO space is
owned and can be controlled, but then Xorg likely sees multiple VGA
arbiter clients and disables DRI because it wants to mmap VGA MMIO
space.

Therefore unless something has changed in the past 10yrs, i915 owns but
does not advertise ownership of the VGA address spaces and therefore
the arbiter can't and doesn't know to change VGA routing to enable a
"be_primary" path to another device.
 
> If a specific model of Intel has a bug in the VGA routing hardware logic unit,
> I would like to ignore it. Or switch to the UEFI firmware on such hardware.

That's a convenient and impractical approach.  I expect all Intel HD
graphics has this issue.  Unknown for Xe.

> It is the hardware engineer's responsibility, I will not worry about it.

We often need to deal with broken hardware in the kernel.

> Thanks for telling me this.
> 
> [1] 
> https://www.intel.com/content/dam/doc/datasheet/atom-d2000-n2000-vol-2-datasheet.pdf
> 
> 
> >   It will always be primary from a
> > VGA routing perspective.  Was this actually tested with non-UEFI?  
> 
> 
> As you already said, the generous Intel has already confirmed the
> hardware defect.
> So this is probably a good chance to switch to UEFI to solve the problem.
> Then no testing for legacy is needed.

Then why are we hacking on VGA arbitration in this series at all?

> > I suspect it might only work in UEFI mode where we probably don't
> > actually have a dependency on VGA routing.  This is essentially why
> > vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
> > broken to use on Intel systems with IGD.  Thanks,  
> 
> Thanks for telling me this.
> 
> To be honest, I have only tested my patch on machines with UEFI firmware,
> since UEFI has become mainstream. But if this patch is really useful for
> the majority of machines, I'm satisfied. The results are not too bad.

This looks like a pretty significant scoping issue if you're proposing
changes to the VGA arbiter which specifically handles the routing of
legacy VGA address spaces 

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng

Hi,

On 2023/9/6 14:45, Christian König wrote:
The firmware framebuffer device already gets killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to 
interact with the firmware framebuffer
(or more intelligent framebuffer drivers). It is for user-space 
programs, such as the X server and Wayland
compositors. It is for Linux users or DRM driver testers, allowing 
them to direct the graphics display server
to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best 
test examples.

If a specific DRM driver can't work with the X server as primary,
then there is probably something wrong.



But what's the use case for overriding this setting?



On a specific machine with multiple GPUs mounted,
only the primary graphics device gets POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware; this setup is 
called POST.


Well, you don't seem to understand the background here. This is 
perfectly normal behavior.


Secondary cards are posted after loading the appropriate DRM driver. 
At least for amdgpu this is done by calling the appropriate functions 
in the BIOS. 



Well, thanks for telling me this. You know more than me and definitely have a 
better understanding.

Are you telling me that the POST function for AMDGPU resides in the BIOS?
Does the kernel call into the BIOS?
Does the BIOS here refer to the UEFI runtime, the ATOM BIOS, or something else?

But the POST function for the ast DRM driver resides in kernel space (in other 
words, in ast.ko).
Is this statement correct?

I mean that for the ASpeed BMC chip, if the firmware does not POST the display 
controller,
then we have to POST it in kernel space before doing the various modeset operations.
We can only POST this chip by directly operating on the various registers.
Is my judgement about the ast DRM driver correct?

Thanks for your reviews.



Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng

Hi,


On 2023/9/5 22:52, Alex Williamson wrote:

On Tue,  5 Sep 2023 03:57:15 +0800
Sui Jingfeng  wrote:


From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve the above-mentioned
problem by introducing the ->be_primary() function stub. Specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver has bound the device successfully, VGAARB will call back to
the device driver to query whether the device driver wants to be primary or
not. Device drivers can just pass NULL if they have no such need.

Please note that:

1) The ARM64, LoongArch, and MIPS servers have a lot of PCIe slots, and I
would like to mount at least three video cards.

2) Typically, those non-x86 machines don't have good UEFI firmware
support, so selecting the primary GPU at the firmware stage is not supported.
Even on x86, there are old UEFI firmwares which have already made an
undesired decision for you.

3) This series is an attempt to solve the remaining problems at the driver
level, while another series[1] of mine targets the majority of the
problems at the device level.

Tested (limited) on x86 with four video cards mounted; Intel UHD Graphics
630 is the default boot VGA, successfully overridden by ast2400 with
ast.modeset=10 appended to the kernel command line.

$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]

In all my previous experiments with VGA routing and IGD I found that
IGD can't actually release VGA routing and Intel confirmed the hardware
doesn't have the ability to do so.


Which model of the IGD you are using? even for the IGD in Atom D2550,
the legacy 128KB VGA memory range can be tuned to be mapped to IGD
or to the DMI Interface. See the 1.7.3.2 section of the N2000 datasheet[1].

If a specific Intel model has a bug in its VGA routing hardware logic,
I would like to ignore it, or switch to UEFI firmware on such hardware.

That is the hardware engineers' responsibility; I will not worry about it.
Thanks for telling me this.

[1] 
https://www.intel.com/content/dam/doc/datasheet/atom-d2000-n2000-vol-2-datasheet.pdf



  It will always be primary from a
VGA routing perspective.  Was this actually tested with non-UEFI?



As you already said, Intel has kindly confirmed the hardware defect.
So this is probably a good chance to switch to UEFI to solve the problem.
Then no testing on legacy firmware is needed.



I suspect it might only work in UEFI mode where we probably don't
actually have a dependency on VGA routing.  This is essentially why
vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
broken to use on Intel systems with IGD.  Thanks,


Thanks for telling me this.

To be honest, I have only tested my patch on machines with UEFI firmware.
UEFI has become mainstream, so if this patch is really useful for the
majority of machines, I'm satisfied. The results are not too bad.

Thanks.


Alex



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng

Hi,

On 2023/9/5 23:05, Thomas Zimmermann wrote:
You might have found a bug in the ast driver. Ast has means to detect 
if the device has been POSTed and maybe do that. If this doesn't work 
correctly, it needs a fix.



That sounds fine.

The bug is not a big deal; I'm just taking it as an example and reporting it
to you. But a real fix can be complex, because quite a lot of servers
ship with ASpeed BMC hardware.

Honestly, I don't have the time to fix it properly.
I already have tons of patches pending, and I will focus on solving
VGAARB-related problems.


I also want to test your patches occasionally,
so this series is useful to me in corner cases.



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann

Hi

Am 06.09.23 um 11:48 schrieb suijingfeng:
[...]


There's 'nomodeset', which disables all native drivers. It's useful 
for debugging or as a quick-fix if the graphics driver breaks. If you 
want to disable a specific driver, please use one of the options for 
blacklisting.



Yeah, 'nomodeset' disables all native drivers.
That is its strong point, but it is also its weak point.


Well, that's by design. Graphics is at the core of the user experience. 
We often cannot _not_ provide it. And if it's broken, there needs to be 
a reliable fallback. There needs to be at least enough graphics support 
to run a terminal and repair the system. And it also needs to be simple 
enough for the average user. Falling back to serial terminals is often 
not an option.


At least here at SUSE, when users or customers report a broken graphics 
driver, we can tell them to start with 'nomodeset' and get at least the 
basic graphics. That's good enough for most productivity/office 
software. In the meantime, we investigate the problem.


There were concerns about the need of nomodeset, but I think it has 
proven to be useful in practice.



Sometimes, when you are developing a DRM driver for a new device,
you will feel the pain. All too often a programmer's modification
makes the entire Linux kernel hang. The problematic DRM
driver kernel module is already in the initrd. Then the real
need is to disable only the malfunctioning DRM driver kernel module,
while what you recommend disables them all. There
is a subtle difference.


I found that initcall_blacklist= works reliably for me.



Another limitation of the 'nomodeset' parameter is that
it is only available on recent upstream kernels. Older
downstream kernels don't support this parameter yet.
So this creates an inconsistent development experience. I believe
there are always some people who need to do back-port and upstream work
for various reasons.


Nomodeset used to be there, but in a different form. It forced VGA text 
mode IIRC. 'git grep' for vga_text_force() in an old kernel. We adopted 
the parameter for all of graphics, because it already did what we needed.


Best regards
Thomas



While we are debating (kindly, no offense intended): since we have
modprobe.blacklist, why do we still need the 'nomodeset' parameter?
Why not try
modprobe.blacklist="amdgpu,radeon,i915,ast,nouveau,gma500_gfx, ..."


:-/


But OK, overall I will listen to your advice.



Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83



For the modeset parameter, the authors of the various device drivers try
to keep its usage from conflicting with that of other drivers. I believe
this is a good thing for Linux users.
It is probably the responsibility of the DRM core maintainers to make the
various DRM drivers reach a minimal consensus. It is probably painful to
do so and doesn't pay off.

But reaching a minimal consensus does benefit Linux users.


You can use modprobe.blacklist or initcall_blacklist on the kernel 
command line.



There are some cases where modprobe.blacklist doesn't work;
I have come across them several times in the past.
The device selected by VGAARB is a device-level matter,
not a driver problem.

Sometimes, when VGAARB has a bug, it selects the wrong device as primary.
The X server then uses this wrong device as primary and crashes
completely, due to the lack of a driver. Take my old S3 Graphics card as
an example:

$ lspci | grep VGA

  00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
  03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
  07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
  08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)


Before applying this patch:

[    0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[    0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361781] vgaarb: loaded
[    0.367838] pci :00:06.1: Overriding boot device as 1002:6778
[    0.367841] pci :00:06.1: Overriding boot device as 5333:9070
[    0.367843] pci :00:06.1: Overriding boot device as 5333:9070


For some reason, one of my systems selects the S3 Graphics card as the
primary GPU.

But this S3 Graphics card does not even have a decent upstream DRM driver
yet. In such a case, I begin to believe that only a device that has a
driver deserves to be primary.

In such a situation, I want to reboot and enter the graphical environment
with one of the other working video cards, either the platform-integrated
or a discrete GPU.
This doesn't mean I should 

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König

Am 06.09.23 um 12:31 schrieb Sui Jingfeng:

Hi,

On 2023/9/6 14:45, Christian König wrote:
The firmware framebuffer device has already been killed by the
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to
interact with the firmware framebuffer
(or more intelligent framebuffer drivers). It is for user-space
programs, such as the X server and Wayland
compositors. It is for Linux users and DRM driver testers, and allows
them to direct the graphical display server
to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best
test examples.

If a specific DRM driver can't work with the X server as primary,
then there is probably something wrong.



But what's the use case for overriding this setting?



On a machine with multiple GPUs installed,
only the primary graphics device gets POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware. This setup is
what is called POST.


Well, you don't seem to understand the background here. This is 
perfectly normal behavior.


Secondary cards are posted after loading the appropriate DRM driver. 
At least for amdgpu this is done by calling the appropriate functions 
in the BIOS. 



Well, thanks for telling me this. You know more than I do and
definitely have a better understanding.


Are you telling me that the POST function for AMDGPU resides in the BIOS?
Does the kernel call into the BIOS?


Yes, exactly that.

Does the BIOS here refer to the UEFI runtime or ATOM BIOS or something 
else?


On dGPUs it's the VBIOS on a flashrom on the board, for iGPUs (APUs as 
AMD calls them) it's part of the system BIOS.


UEFI is actually just a small subsystem in the system BIOS which 
replaced the old interface used between system BIOS, video BIOS and 
operating system.




But the POST function for the DRM ast driver resides in kernel space (in
other words, in ast.ko).

Is this statement correct?


I don't know the ast driver well enough to answer that, but I assume 
they just read the BIOS and execute the appropriate functions.




I mean that for the ASpeed BMC chip, if the firmware does not POST the
display controller,
then we have to POST it in kernel space before doing the various
modeset operations.

We can only POST this chip by directly programming the various registers.
Is my judgement about the ast DRM driver correct?


Well POST just means Power On Self Test, but what you mean is 
initializing the hardware.


Some drivers can of course initialize the hardware without the help of 
the BIOS, but I don't think AST can do that. As far as I know it's a 
relatively simple driver.


BTW firmware is not the same as the BIOS (which runs the POST), firmware 
usually refers to something run on microcontrollers inside the ASIC 
while the (system or video) BIOS runs on the host CPU.


Regards,
Christian.



Thanks for your reviews.





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread suijingfeng

Hi,


On 2023/9/6 16:05, Thomas Zimmermann wrote:

Hi

Am 05.09.23 um 17:59 schrieb suijingfeng:
[...]
FYI: per-driver modeset parameters are deprecated and not to be 
used. Please don't promote them.



Well, please wait, I want to explain.



drm/nouveau already promotes it a little bit.

There is no code of conduct or specification guiding how module
parameters should be defined, but
I noticed that a lot of DRM drivers already support the
modeset parameter,


Please look at the history and discussion around this parameter. To my 
knowledge, 'modeset' got introduced when modesetting was still done 
in userspace. It was an easy way of disabling the kernel driver if the 
system's Xorg did not yet support kernel mode setting.


Fast forward a few years and all Linux systems use kernel modesetting, 
which makes the modeset parameters obsolete. We discussed and decided to 
keep them in, because many articles and blog posts refer to them. We 
didn't want to invalidate them. BUT modeset is deprecated and not allowed 
in new code. If you look at existing modeset usage, you will eventually 
come across the comment at [1].




OK, no problem. I agree with what you said.


There's 'nomodeset', which disables all native drivers. It's useful 
for debugging or as a quick-fix if the graphics driver breaks. If you 
want to disable a specific driver, please use one of the options for 
blacklisting.



Yeah, 'nomodeset' disables all native drivers.
That is its strong point, but it is also its weak point.

Sometimes, when you are developing a DRM driver for a new device,
you will feel the pain. All too often a programmer's modification
makes the entire Linux kernel hang. The problematic DRM
driver kernel module is already in the initrd. Then the real
need is to disable only the malfunctioning DRM driver kernel module,
while what you recommend disables them all. There
is a subtle difference.

Another limitation of the 'nomodeset' parameter is that
it is only available on recent upstream kernels. Older
downstream kernels don't support this parameter yet.
So this creates an inconsistent development experience. I believe
there are always some people who need to do back-port and upstream work
for various reasons.

While we are debating (kindly, no offense intended): since we have
modprobe.blacklist, why do we still need the 'nomodeset' parameter?
Why not try modprobe.blacklist="amdgpu,radeon,i915,ast,nouveau,gma500_gfx, ..."

:-/
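
To make the comparison concrete, here is a minimal Python sketch of assembling and reading back a modprobe.blacklist= argument for the kernel command line. The helper names are hypothetical, and the module list is only whatever the caller passes in:

```python
def build_blacklist_param(modules):
    """Build a modprobe.blacklist= argument from a list of module names."""
    names = [m.strip() for m in modules if m.strip()]
    if not names:
        return ""
    return "modprobe.blacklist=" + ",".join(names)


def parse_blacklist(cmdline):
    """Return the module names blacklisted on a kernel command line."""
    modules = []
    for token in cmdline.split():
        if token.startswith("modprobe.blacklist="):
            value = token.split("=", 1)[1].strip('"')
            modules.extend(m for m in value.split(",") if m)
    return modules
```

Unlike 'nomodeset', such a list has to be kept up to date by hand for every native driver that might be present.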


But OK, overall I will listen to your advice.



Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83



For the modeset parameter, the authors of the various device drivers try
to keep its usage from conflicting with that of other drivers. I believe
this is a good thing for Linux users.
It is probably the responsibility of the DRM core maintainers to make the
various DRM drivers reach a minimal consensus. It is probably painful to
do so and doesn't pay off.

But reaching a minimal consensus does benefit Linux users.


You can use modprobe.blacklist or initcall_blacklist on the kernel 
command line.



There are some cases where modprobe.blacklist doesn't work;
I have come across them several times in the past.
The device selected by VGAARB is a device-level matter,
not a driver problem.

Sometimes, when VGAARB has a bug, it selects the wrong device as primary.
The X server then uses this wrong device as primary and crashes
completely, due to the lack of a driver. Take my old S3 Graphics card as
an example:

$ lspci | grep VGA

  00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
  03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
  07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
  08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)


Before applying this patch:

[    0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[    0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361781] vgaarb: loaded
[    0.367838] pci :00:06.1: Overriding boot device as 1002:6778
[    0.367841] pci :00:06.1: Overriding boot device as 5333:9070
[    0.367843] pci :00:06.1: Overriding boot device as 5333:9070


For some reason, one of my systems selects the S3 Graphics card as the
primary GPU.

But this S3 Graphics card does not even have a decent upstream DRM driver
yet. In such a case, I begin to believe that only a device that has a
driver deserves to be primary.

In such a situation, I want to reboot and enter the graphical environment
with other working video cards. Either platform 

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König

Am 06.09.23 um 11:08 schrieb suijingfeng:

Well, welcome to correct me if I'm wrong.


You seem to have some very basic misunderstandings here.

The term framebuffer describes some VRAM memory used for scanout.

This framebuffer is exposed to userspace through some framebuffer 
driver, on UEFI platforms that is usually efifb but can be quite a bunch 
of different drivers.


When the DRM drivers load they remove the previous drivers using 
drm_aperture_remove_conflicting_pci_framebuffers() (or similar 
function), but this does not mean that the framebuffer or scanout 
parameters are modified in any way. It just means that the framebuffer 
is just no longer exposed through this driver.


Take over is the perfectly right description here because that's exactly 
what's happening. The framebuffer configuration including the VRAM 
memory as well as the parameters for scanout are exposed by the newly 
loaded DRM driver.


In other words, userspace can query through the DRM interfaces which 
monitors are already driven by the hardware and so, in your terminology, 
figure out which is the primary one.


It's just that, as Thomas explained as well, this is completely 
irrelevant to any modern desktop. Both X and Wayland iterate over the 
available devices and start rendering to them; which one was used during 
boot doesn't really matter to them.


Apart from that ranting like this and trying to explain stuff to people 
who obviously have much better background in the topic is not going to 
help your patches getting upstream.


Regards,
Christian.



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread suijingfeng

Hi,


On 2023/9/6 14:45, Christian König wrote:

Am 05.09.23 um 15:30 schrieb suijingfeng:

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over 
which
one is primary at boot time. This series tries to solve above 
mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time the DRM drivers get loaded successfully, the 
boot-up graphics have already finished.


This is an incorrect assumption.

drm_aperture_remove_conflicting_pci_framebuffers() and co don't kill 
the framebuffer, 


Well, my original description of this technical point was:

1) "Firmware framebuffer device already get killed by the 
drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings)"
2) "By the time of DRM drivers get loaded successfully, the boot-up graphics already 
finished."

The word "killed" here is a rough and coarse description of
how the DRM device driver takes over the firmware framebuffer.
Since something seems to be obscuring our communication,
let's make things clear. See below for a more elaborate description.



they just remove the current framebuffer driver to avoid further updates.


This statement doesn't sound right. For a UEFI environment,
a correct description is that they remove the platform device, not the
framebuffer driver.
For machines with UEFI firmware, the framebuffer driver here definitely
refers to efifb.
The efifb driver still resides in the system (the Linux kernel).

Please see the aperture_detach_platform_device() function in video/aperture.c

So what happens (at least for amdgpu) is that we take over the 
framebuffer,


This statement is also not an accurate description.

Strictly speaking, drm/amdgpu takes over the device (the VRAM hardware),
not the framebuffer.

The phrase "take over" here is also dubious, because drm/amdgpu takes over
nothing.

From the perspective of the device-driver model, the GPU hardware *belongs*
to the amdgpu driver.
Why would you need to take over a thing that originally belongs to you?

If you built drm/amdgpu into the kernel and had it loaded
before efifb, there would be no need to use the firmware framebuffer
(this discussion is limited to the boot graphics display purpose).
In such a case, the so-called "take over" would not happen.

The truth is that efifb creates a platform device, which *occupies*
part of the VRAM hardware resource. Thus, efifb and drm/amdgpu
conflict. They conflict because they share the same
hardware resource. It is the hardware resources (address ranges) used
by the two different drivers that conflict, not the efifb driver itself
that conflicts with the drm/amdgpu driver.
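
The conflict described here comes down to an overlap test between two address ranges. A minimal Python sketch of that test follows; the base addresses and sizes are made up for illustration and do not come from any real machine:

```python
def ranges_overlap(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a + size_a) and [base_b, base_b + size_b) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


# Hypothetical values: a firmware framebuffer placed inside a GPU's VRAM BAR,
# and an unrelated BAR that belongs to some other device.
VRAM_BAR = (0xC0000000, 256 * 1024 * 1024)
FIRMWARE_FB = (0xC0000000, 8 * 1024 * 1024)
OTHER_BAR = (0xE0000000, 16 * 1024 * 1024)
```

With these values, ranges_overlap(*VRAM_BAR, *FIRMWARE_FB) is true, so the platform device occupying that range would have to go, while OTHER_BAR is left alone.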

Thus, the drm_aperture_remove_conflicting_xx() function has to kill
one of the conflicting devices, not the driver. Therefore,
the correct word would be "reclaim":
drm/amdgpu *reclaims* the hardware resource (VRAM address range) that
originally belongs to it.

The modeset state (including the framebuffer content) still resides in the
amdgpu device.
You just get the dirty framebuffer image in the framebuffer object.
But the framebuffer object has been dirty since the UEFI firmware stage.

In conclusion, "reclaim" is more accurate than "take over".
And as far as I understand, drm/amdgpu takes over nothing and gains nothing.

Well, you are welcome to correct me if I'm wrong.



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann

Hi

Am 05.09.23 um 17:59 schrieb suijingfeng:
[...]
FYI: per-driver modeset parameters are deprecated and not to be used. 
Please don't promote them.



Well, please wait, I want to explain.



drm/nouveau already promotes it a little bit.

There is no code of conduct or specification guiding how module
parameters should be defined, but
I noticed that a lot of DRM drivers already support the modeset
parameter,


Please look at the history and discussion around this parameter. To my 
knowledge, 'modeset' got introduced when modesetting was still done in 
userspace. It was an easy way of disabling the kernel driver if the 
system's Xorg did not yet support kernel mode setting.


Fast forward a few years and all Linux systems use kernel modesetting, 
which makes the modeset parameters obsolete. We discussed and decided to 
keep them in, because many articles and blog posts refer to them. We 
didn't want to invalidate them. BUT modeset is deprecated and not allowed 
in new code. If you look at existing modeset usage, you will eventually 
come across the comment at [1].


There's 'nomodeset', which disables all native drivers. It's useful for 
debugging or as a quick-fix if the graphics driver breaks. If you want 
to disable a specific driver, please use one of the options for 
blacklisting.


Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83



For the modeset parameter, the authors of the various device drivers try
to keep its usage from conflicting with that of other drivers. I believe
this is a good thing for Linux users.
It is probably the responsibility of the DRM core maintainers to make the
various DRM drivers reach a minimal consensus. It is probably painful to
do so and doesn't pay off.

But reaching a minimal consensus does benefit Linux users.


You can use modprobe.blacklist or initcall_blacklist on the kernel 
command line.



There are some cases where modprobe.blacklist doesn't work;
I have come across them several times in the past.
The device selected by VGAARB is a device-level matter,
not a driver problem.

Sometimes, when VGAARB has a bug, it selects the wrong device as primary.
The X server then uses this wrong device as primary and crashes
completely, due to the lack of a driver. Take my old S3 Graphics card as
an example:

$ lspci | grep VGA

  00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
  03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
  07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
  08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)

Before applying this patch:

[    0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[    0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361781] vgaarb: loaded
[    0.367838] pci :00:06.1: Overriding boot device as 1002:6778
[    0.367841] pci :00:06.1: Overriding boot device as 5333:9070
[    0.367843] pci :00:06.1: Overriding boot device as 5333:9070
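
A log like this can be checked mechanically. The following Python sketch, assuming the "VGA device added" line format shown in this log (the helper names are hypothetical), extracts the decodes/owns/locks state per device and reports which device currently owns the legacy resources:

```python
import re

# Matches "pci <addr>: vgaarb: VGA device added: decodes=...,owns=...,locks=...".
# The address class also accepts the shortened ":00:06.1" form seen in this log.
VGAARB_ADDED = re.compile(
    r"pci (?P<dev>[0-9a-fA-F:.]+): vgaarb: VGA device added: "
    r"decodes=(?P<decodes>[^,\s]+),owns=(?P<owns>[^,\s]+),locks=(?P<locks>[^,\s]+)"
)

SAMPLE_LOG = """\
[    0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[    0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
"""


def parse_vgaarb(log_text):
    """Return one dict (dev, decodes, owns, locks) per 'VGA device added' line."""
    return [m.groupdict() for m in map(VGAARB_ADDED.search, log_text.splitlines()) if m]


def legacy_owners(devices):
    """Return the devices whose 'owns' field is not 'none'."""
    return [d["dev"] for d in devices if d["owns"] != "none"]
```

On the sample, only the 00:06.1 device reports owns=io+mem, which matches the "setting as boot VGA device" line.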


For some reason, one of my systems selects the S3 Graphics card as the
primary GPU.
But this S3 Graphics card does not even have a decent upstream DRM driver
yet. In such a case, I begin to believe that only a device that has a
driver deserves to be primary.

In such a situation, I want to reboot and enter the graphical environment
with one of the other working video cards, either the platform-integrated
or a discrete GPU.

This doesn't mean I should compromise by removing the S3 graphics card
from the motherboard, nor does it mean that I should update my BIOS
settings.

Sometimes the BIOS is even worse.

With this series applied, all I need to do is reboot the computer and
pass a kernel command-line option. By forcibly overriding the primary with
another video card (one that has decent driver support), I'm able to do the
debugging in a graphical environment. I would like to examine what's wrong
with vgaarb on a specific platform in an X server graphical environment.

I could probably try compiling a driver for this card and see whether it
works, and simply reboot without needing to change anything. That is very
efficient. So this is probably the second use of my patch: it hands
control back to the graphics developer.




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann

Hi

Am 06.09.23 um 05:08 schrieb suijingfeng:

Hi,


On 2023/9/5 23:05, Thomas Zimmermann wrote:
However, on modern Linux systems the primary display does not really 
exist. 'Primary' is the device that is available via VGA, VESA or EFI. 


I may be missing the point; what do you mean by choosing the word "modern"?
Are you trying to tell me that the X server is too old and Wayland is the 
modern display server?


It comes down to that. Xorg's device handling is out of date. Fixing it 
would require a redesign of the whole program. A 'modern' compositor 
delegates device handling to the kernel. All it does is to open the 
device files and use the provided functionality. I've briefly mentioned 
this in the other email.


There's more to 'modern', such as 'uses Wayland for compositing', 'Mesa 
for direct rendering' or 'does atomic modesetting'. But that's all 
unrelated here.






Our drivers don't use these interfaces, but the native registers.



Yes and no?

Yes for machines with UEFI firmware,
but I'm not sure whether this statement is true for machines with legacy
firmware.


What I mean is: the primary device is the one that owns the VGA/VESA/EFI 
I/O space. But DRM drivers don't program the device via VGA registers or 
VESA/EFI calls. They use the hardware's actual native registers in each 
device's I/O space. So each device operates on its own. They (usually) 
don't have to share/arbitrate access to the VGA registers.


Hence the idea of a primary device does not make sense here. It's useful 
to pick an initial default, but further display setup should rather be 
left to userspace.




The display controller in the ASpeed BMC is VGA compatible.
Therefore, in theory, it should work with the VGA console on a machine
with another VGA-compatible video card. So the ast_vga_set_decode()
function provided in patch 0007 is probably useful in a legacy firmware
environment.

To be honest, I have tested this on various machines with UEFI firmware.
But I didn't realize that I should also test in a legacy firmware
environment before sending this patch. The testing effort needed seems
quite exhausting, since all my machines come with UEFI firmware.

So is it OK to leave the legacy part to someone else who is interested in it?
Alex is probably more experienced with legacy VGA routing?


Maybe you can describe the user's problem to us. TBH I still don't 
understand what you're trying to solve. If you want to set the console's 
initial output device, you can make a parameter in vgaarb. But I 
don't really see a need for that either.


Best regards
Thomas


:-)






Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann

Hi

Am 06.09.23 um 04:34 schrieb suijingfeng:


On 2023/9/5 23:05, Thomas Zimmermann wrote:

Hi

Am 05.09.23 um 15:30 schrieb suijingfeng:

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over 
which
one is primary at boot time. This series tries to solve above 
mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time the DRM drivers get loaded successfully, the 
boot-up graphics have already finished.
The firmware framebuffer device has already been killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to 
interact with the firmware framebuffer


Yes and no. The helpers you mention will attempt to remove the 
firmware framebuffer on the given PCI device. If you have multiple PCI 
devices, the other devices would not be affected.



Yes and no.


For the yes part: drm_aperture_remove_conflicting_pci_framebuffers() 
only kills the conflicting one.

But for a specific machine with modern UEFI firmware,
there should be only one firmware framebuffer driver.
That should be EFIFB (UEFI GOP). I do have multiple PCI devices,
but I don't understand when and why a system would have more than one 
firmware framebuffer.


Maybe somewhat unrelated to the actual discussion, but it's not as 
simple as you assume. Many non-X86 systems use DeviceTree. On Sparc 
IIRC, there's the case of having multiple firmware framebuffers listed 
in the DT. We create a device for each and attach a DRM firmware 
driver; ofdrm in this case. I haven't seen this in the wild, but 
non-Sparc systems could also behave like that.


And in addition to that, ARM-based systems often use UEFI boot stub 
code that provides a simple UEFI environment to the kernel. For graphics 
we've had cases where we received the same firmware framebuffer from the 
DT and from the UEFI boot stub. We have to detect and handle such 
duplication in the kernel.


Best regards
Thomas



Even for machines with a legacy BIOS, the fixed VGA aperture address
range can only be owned by one firmware driver. It is just that we need
to handle the routing; the ->set_decode() callback of
vga_client_register() is used to do such work. Am I correct?






Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann

Hi

Am 06.09.23 um 04:14 schrieb suijingfeng:

Hi,


On 2023/9/5 23:05, Thomas Zimmermann wrote:
However, on modern Linux systems the primary display does not really 
exist.



No, it does exist. The X server needs to know which one is the primary GPU.
The '*' character in front of the (4@0:0:0) PCI device marks it as the
primary; see the log below.

(II) xfree86: Adding drm device (/dev/dri/card2)
(II) xfree86: Adding drm device (/dev/dri/card0)
(II) Platform probe for 
/sys/devices/pci:00/:00:1c.5/:003:00.0/:04:00.0/drm/card0

(II) xfree86: Adding drm device (/dev/dri/card3)
(II) Platform probe for 
/sys/devices/pci:00/:00:1c.6/:005:00.0/drm/card3
(--) PCI: (0@0:2:0) 8086:3e91:8086:3e91 rev 0, Mem @ 
0xdb00/16216, 0xa000/536870912, I/O @ 0xf000/64, BIOS @ 
0x/131072
(--) PCI: (1@0:0:0) 1002:6771:1043:8636 rev 0, Mem @ 
0xc000/2688435456, 0xdf22/131072, I/O @ 0xe000/256, BIOS @ 
0x/131072
(--) PCI:*(4@0:0:0) 1a03:2000:1a03:2000 rev 48, Mem @ 
0xde00/166777216, 0xdf02/131072, I/O @ 0xc000/128, BIOS @ 
0x/131072
(--) PCI: (5@0:0:0) 10de:1288:174b:b324 rev 161, Mem @ 
0xdc00/116777216, 0xd000/134217728, 0xd800/33554432, I/O @ 
0xb000/128, BIOS @@0x/524288


The modesetting driver of the X server will create the framebuffer on the 
primary video adapter.
If a 2D video adapter (like the aspeed BMC) is not the primary, then it 
probably will not be used. Its only chance to display something is to 
function as an output slave.
But the output slave technique needs PRIME support for cross-driver 
buffer sharing.


So there is some difference between the primary and non-primary 
video adapters.


Xorg is a pretty bad example, because X parses the PCI bus and then 
tries to match devices to /dev/dri/ files. That's also not fixable in 
Xorg's current code base. Please don't promote Xorg's design. It dates 
back to the time when Xorg did the modesetting by itself.


Userspace should just open existing device files and start rendering. 
Maybe pick the previous settings and/or do some guess work about the 
arrangement of these devices. AFAIK that's what the modern compositors do.


Best regards
Thomas




'Primary' is the device that is available via VGA, VESA or EFI. Our 
drivers don't use these interfaces, but the native registers. As you 
said yourself, these firmware devices (VGA, VESA, EFI) are removed 
ASAP by the native drivers. 






Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König

Am 05.09.23 um 16:28 schrieb Sui Jingfeng:

Hi,

On 2023/9/5 21:28, Christian König wrote:


2) Typically, those non-86 machines don't have a good UEFI firmware
    support, which doesn't support select primary GPU as firmware stage.
    Even on x86, there are old UEFI firmwares which already made undesired
    decision for you.

3) This series is attempt to solve the remain problems at the driver level,
    while another series[1] of me is target to solve the majority of the
    problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.

The value 10 is incredibly arbitrary, and multiplied as a magic number
all over the place.


+1 



This is the exact reason why I made this series an RFC: this is an 
open-ended problem.
The choices of 3, 4, 5, 6, 7, 8 and 9 are as arbitrary as the number 
'10'. '1' and '2' are definitely not suitable, because those seats have 
already been taken.


Well you are completely missing the point. *DON'T* abuse the modeset 
module parameters for this!


Whether you use 10 or any other value doesn't matter.

Regards,
Christian.



Take the drm/nouveau as an example:


```

MODULE_PARM_DESC(modeset, "enable driver (default: auto, "
  "0 = disabled, 1 = enabled, 2 = headless)");
int nouveau_modeset = -1;
module_param_named(modeset, nouveau_modeset, int, 0400);

```


'1' enables the drm driver; some drivers even use it to override the 
'nomodeset' parameter.


'2' is not suitable, because nouveau uses it for headless GPUs 
(render-only or compute-class GPUs?).


'3' is also unlikely to be the best choice; the concern is:
what if a specific drm driver wants to expand the usage in the future?

The reasons I picked the digit '10' are:


1) The modeset parameter is unlikely to get expanded up to 10 usages.

Other drm drivers only use '-1', '0' and '1'; choosing '2' would 
conflict with drm/nouveau.
Picking the digit '10' leaves some room for the various device 
driver authors.

It also helps to keep the usage consistent across various drivers.


2) An int takes up 4 bytes, and I don't want to waste even a single byte.

While defending my patch, I have to say that drafting another kernel 
command line parameter would waste precious RAM.


An int can have 2^31 usages, so why can't we improve the utilization rate?

3) Please consider the fact that modeset is the most common and 
attractive parameter.


No name is better than 'modeset', as other names are not easy to 
remember.


Again, this is for Linux users, thus it is not arbitrary.
Despite being simple and trivial, I thought about it for more than one week.
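
To make the value space under discussion concrete: going only by the
nouveau parameter description quoted above, the documented meanings can be
tabulated as in the following userspace sketch. The function name is
hypothetical and exists only for illustration; this is not kernel code.
Any value without a documented meaning, including the proposed 10, falls
through to NULL.

```
#include <stddef.h>

/*
 * Hypothetical helper: map a nouveau "modeset" value to the meaning given
 * in the MODULE_PARM_DESC text quoted in this thread.
 */
const char *nouveau_modeset_meaning(int value)
{
	switch (value) {
	case -1:
		return "auto";		/* the default */
	case 0:
		return "disabled";
	case 1:
		return "enabled";
	case 2:
		return "headless";
	default:
		return NULL;		/* undocumented, e.g. 10 */
	}
}
```

A value such as 10 gets its meaning only from the patch that introduces it,
which is the "magic number" concern raised in the review.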





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König

Am 05.09.23 um 15:30 schrieb suijingfeng:

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time of DRM drivers get loaded successfully,the 
boot-up graphics already finished.


This is an incorrect assumption.

drm_aperture_remove_conflicting_pci_framebuffers() and co don't kill the 
framebuffer, they just remove the current framebuffer driver to avoid 
further updates.


So what happens (at least for amdgpu) is that we take over the 
framebuffer, including both the mode and its contents, and provide a new 
framebuffer interface until DRM masters like X or Wayland take over.


The firmware framebuffer device has already been killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to 
interact with the firmware framebuffer
(or more intelligent framebuffer drivers). It is for user-space 
programs, such as the X server and Wayland
compositors. It is for Linux users or drm driver testers, allowing 
them to direct the graphics display server
to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best 
test examples.

If a specific DRM driver can't work with the X server as primary,
then there is probably something wrong.



But what's the use case for overriding this setting?



On a specific machine with multiple GPUs mounted,
only the primary graphics device gets POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware; this setup is 
called POST.


Well, you don't seem to understand the background here. This is 
perfectly normal behavior.


Secondary cards are posted after loading the appropriate DRM driver. At 
least for amdgpu this is done by calling the appropriate functions in 
the BIOS.




One of the use cases of this series is to test whether a specific DRM 
driver works properly
even though no prerequisite work has been done by the firmware 
at all.

And it seems that the results are not satisfying in all cases.

drm/ast is the first drm driver which refused to work if not 
POST-ed by the firmware.


As far as I know this is expected as well. AST is a relatively simple 
driver and when it's not the primary one during boot the assumption is 
that it isn't used at all.


Regards,
Christian.



Before applying this series, I was unable to make drm/ast the primary 
video card easily. On a
multiple video card configuration, the monitor connected to the 
AST2400 did not light up.

While confusing, a naive programmer may suspect that PRIME is not working.

After applying this series and passing ast.modeset=10 on the kernel cmd 
line,
I found that the monitor connected to my ast2400 video card was still 
black.
It didn't display anything.

While studying drm/ast, I learned that the drm/ast driver ships its own 
POST code.
See the ast_post_gpu() function. Then I wondered why this 
function doesn't work.
After a short (hasty) debugging session, I found that the 
ast_post_gpu() function
didn't get run, because it depends on ast->config_mode.


Without thinking too much, I hardcoded ast->config_mode as 
ast_use_p2a to
force the ast_post_gpu() function to run.

```

--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
 		}
 	}

+	ast->config_mode = ast_use_p2a;
+
 	switch (ast->config_mode) {
 	case ast_use_defaults:
 		drm_info(dev, "Using default configuration\n");

```

Then the monitor lit up and displayed the Ubuntu greeter.
Therefore, my patch is helpful, at least for Linux drm driver 
testers and developers.

It allows programmers to test a specific part of a specific driver
without changing a line of the source code and without needing 
sudo authority.

It helps to improve the efficiency of testing and patch verification.

I know about the PrimaryGPU option of the Xorg conf, but that approach 
remembers the setup that has been made; you need to modify it with root 
authority each time you want to switch the primary. But when rapidly 
developing and/or testing multiple video drivers with only one computer 
available, what we really want is probably a one-shot command like the 
one this series provides.

So, this is the first use case. This probably also helps to test full 
modeset, PRIME and reverse PRIME on multiple video card machines.



Best regards
Thomas







Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng

Hi,


On 2023/9/5 23:05, Thomas Zimmermann wrote:
However, on modern Linux systems the primary display does not really 
exist. 'Primary' is the device that is available via VGA, VESA or EFI. 


I may be missing the point: what do you mean by choosing the word "modern"?
Are you trying to tell me that the X server is too old and Wayland is the 
modern display server?



Our drivers don't use these interfaces, but the native registers.



Yes and no?

Yes for machines with UEFI firmware,
but I'm not sure whether this statement is true for machines with legacy 
firmware.

The display controller in the ASpeed BMC is VGA compatible.
Therefore, in theory, it should work with the VGA console on a machine
with another VGA-compatible video card. So the ast_vga_set_decode() function
provided in the 0007 patch is probably useful in a legacy firmware environment.

To be honest, I have tested this on various machines with UEFI firmware.
But I didn't realize that I should test in a legacy firmware environment
before sending this patch. It seems that the testing effort needed is quite
exhausting, since all my machines come with UEFI firmware.

So is it OK to leave the legacy part to someone else who is interested in it?
Probably Alex is more of an expert on legacy VGA routing stuff?
:-)




Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng



On 2023/9/5 23:05, Thomas Zimmermann wrote:

Hi

Am 05.09.23 um 15:30 schrieb suijingfeng:

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over 
which
one is primary at boot time. This series tries to solve above 
mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time of DRM drivers get loaded successfully,the 
boot-up graphics already finished.
Firmware framebuffer device already get killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So, this series is definitely not to 
interact with the firmware framebuffer


Yes and no. The helpers you mention will attempt to remove the 
firmware framebuffer on the given PCI device. If you have multiple PCI 
devices, the other devices would not be affected.



Yes and no.


For the yes part: drm_aperture_remove_conflicting_pci_framebuffers() only 
kills the conflicting one.
But on a specific machine with modern UEFI firmware,
there should be only one firmware framebuffer driver.
That should be EFIFB (UEFI GOP). I do have multiple PCI devices,
but I don't understand when and why a system would have more than one 
firmware framebuffer.

Even on machines with a legacy BIOS, the fixed VGA aperture address range
can only be owned by one firmware driver. It is just that we need to handle
the routing; the ->set_decode() callback of vga_client_register() is used to
do such work. Am I correct?
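
The routing question can be pictured with a userspace model of what a
->set_decode() callback reports back to the arbiter. This is only a sketch
under stated assumptions: the flag values are assumed to mirror
include/linux/vgaarb.h and should be checked against your tree,
model_set_decode() is a hypothetical name, and a real driver callback takes
a struct pci_dev and also programs the hardware.

```
#include <stdbool.h>

/* Assumed to mirror the resource flags in include/linux/vgaarb.h. */
#define VGA_RSRC_LEGACY_IO  0x01
#define VGA_RSRC_LEGACY_MEM 0x02
#define VGA_RSRC_NORMAL_IO  0x04
#define VGA_RSRC_NORMAL_MEM 0x08

/*
 * Hypothetical model of a ->set_decode() callback: return the set of
 * resources the device decodes once legacy VGA decoding has been
 * switched on (decode == true) or off (decode == false).
 */
unsigned int model_set_decode(bool decode)
{
	/* The device's own BARs stay decoded either way. */
	unsigned int rsrc = VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;

	/* Legacy VGA I/O ports and memory only while decoding is enabled. */
	if (decode)
		rsrc |= VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM;

	return rsrc;
}
```

In this model only one device at a time would report the legacy bits, which
is the single-owner property described above.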




Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng

Hi,


On 2023/9/5 23:05, Thomas Zimmermann wrote:
However, on modern Linux systems the primary display does not really 
exist.



No, it does exist. The X server needs to know which one is the primary GPU.
The '*' character in front of the (4@0:0:0) PCI device marks it as the
primary; see the log below.

(II) xfree86: Adding drm device (/dev/dri/card2)
(II) xfree86: Adding drm device (/dev/dri/card0)
(II) Platform probe for 
/sys/devices/pci:00/:00:1c.5/:003:00.0/:04:00.0/drm/card0

(II) xfree86: Adding drm device (/dev/dri/card3)
(II) Platform probe for 
/sys/devices/pci:00/:00:1c.6/:005:00.0/drm/card3
(--) PCI: (0@0:2:0) 8086:3e91:8086:3e91 rev 0, Mem @ 
0xdb00/16216, 0xa000/536870912, I/O @ 0xf000/64, BIOS @ 
0x/131072
(--) PCI: (1@0:0:0) 1002:6771:1043:8636 rev 0, Mem @ 
0xc000/2688435456, 0xdf22/131072, I/O @ 0xe000/256, BIOS @ 
0x/131072
(--) PCI:*(4@0:0:0) 1a03:2000:1a03:2000 rev 48, Mem @ 
0xde00/166777216, 0xdf02/131072, I/O @ 0xc000/128, BIOS @ 
0x/131072
(--) PCI: (5@0:0:0) 10de:1288:174b:b324 rev 161, Mem @ 
0xdc00/116777216, 0xd000/134217728, 0xd800/33554432, I/O @ 
0xb000/128, BIOS @@0x/524288


The modesetting driver of the X server will create the framebuffer on the 
primary video adapter.
If a 2D video adapter (like the aspeed BMC) is not the primary, then it 
probably will not be used. Its only chance to display something is to 
function as an output slave.
But the output slave technique needs PRIME support for cross-driver buffer 
sharing.

So there is some difference between the primary and non-primary video 
adapters.


'Primary' is the device that is available via VGA, VESA or EFI. Our 
drivers don't use these interfaces, but the native registers. As you 
said yourself, these firmware devices (VGA, VESA, EFI) are removed 
ASAP by the native drivers. 




Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Alex Williamson
On Wed, 6 Sep 2023 00:21:09 +0800
suijingfeng  wrote:

> Hi,
> 
> On 2023/9/5 22:52, Alex Williamson wrote:
> > On Tue,  5 Sep 2023 03:57:15 +0800
> > Sui Jingfeng  wrote:
> >  
> >> From: Sui Jingfeng 
> >>
> >> On a machine with multiple GPUs, a Linux user has no control over which
> >> one is primary at boot time. This series tries to solve above mentioned
> >> problem by introduced the ->be_primary() function stub. The specific
> >> device drivers can provide an implementation to hook up with this stub by
> >> calling the vga_client_register() function.
> >>
> >> Once the driver bound the device successfully, VGAARB will call back to
> >> the device driver. To query if the device drivers want to be primary or
> >> not. Device drivers can just pass NULL if have no such needs.
> >>
> >> Please note that:
> >>
> >> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
> >> like to mount at least three video cards.
> >>
> >> 2) Typically, those non-86 machines don't have a good UEFI firmware
> >> support, which doesn't support select primary GPU as firmware stage.
> >> Even on x86, there are old UEFI firmwares which already made undesired
> >> decision for you.
> >>
> >> 3) This series is attempt to solve the remain problems at the driver level,
> >> while another series[1] of me is target to solve the majority of the
> >> problems at device level.
> >>
> >> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
> >> 630 is the default boot VGA, successfully override by ast2400 with
> >> ast.modeset=10 append at the kernel cmd line.
> >>
> >> $ lspci | grep VGA
> >>
> >>   00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 
> >> [UHD Graphics 630]  
> > In all my previous experiments with VGA routing and IGD I found that
> > IGD can't actually release VGA routing and Intel confirmed the hardware
> > doesn't have the ability to do so.  It will always be primary from a
> > VGA routing perspective.  Was this actually tested with non-UEFI?  
> 
> Yes, I have tested on my aspire e471 notebook (i5 5200U),
> because that notebook using legacy firmware (also have UEFI, double firmware).
> But this machine have difficult in install ubuntu under UEFI firmware in the 
> past.
> So I keep it using the legacy firmware.
> 
> It have two video card, IGD and nvidia video card(GFORCE 840M).
> nvidia call its video card as 3D controller (pci->class = 0x030200)
> 
> I have tested this patch and another patch mention at [1] together.
> I can tell you that the firmware framebuffer of this notebook using vesafb, 
> not efifb.
> And the framebuffer size (lfb.size) is very small. This is very strange,
> but I don't have enough time to look in details. But still works.
> 
> I'm using and testing my patch whenever and wherever possible.

So you're testing VGA routing using a non-VGA 3D controller through the
VESA address space?  How does that test anything about VGA routing?
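
The objection turns on the PCI class code quoted above. A small userspace
sketch of the distinction follows. The two constants are meant to match the
kernel's pci_ids.h, and the helper name is made up for this example. The
arbiter is believed to apply the same shift-and-compare test when deciding
which devices to track, but that should be verified in vgaarb.c.

```
#include <stdbool.h>
#include <stdint.h>

/* 16-bit base-class/sub-class codes, assumed to match pci_ids.h. */
#define PCI_CLASS_DISPLAY_VGA 0x0300
#define PCI_CLASS_DISPLAY_3D  0x0302

/*
 * Hypothetical helper. The 24-bit class value holds base class,
 * sub-class and programming interface, so dropping the low byte
 * leaves the 16-bit class code to compare against.
 */
bool class_is_vga_compatible(uint32_t class24)
{
	return (class24 >> 8) == PCI_CLASS_DISPLAY_VGA;
}
```

With the value from the notebook, 0x030200 >> 8 is 0x0302, a 3D controller,
so the helper returns false for it.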

> > I suspect it might only work in UEFI mode where we probably don't
> > actually have a dependency on VGA routing.  This is essentially why
> > vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
> > broken to use on Intel systems with IGD.  Thanks,  
> 
> 
> What you tell me here is the side effect come with the VGA-compatible,
> but I'm focus on the arbitration itself. I think there no need to keep
> the VGA routing hardware features nowadays except that hardware vendor
> want keep the backward compatibility and/or comply the PCI VGA compatible 
> spec.

"VGA arbitration" is the mediation of VGA routing between devices, so
I'm confused how you can be focused on the arbitration without the
routing itself.  Thanks,

Alex



Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng

Hi,

On 2023/9/5 22:52, Alex Williamson wrote:

On Tue,  5 Sep 2023 03:57:15 +0800
Sui Jingfeng  wrote:


From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned
problem by introduced the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver bound the device successfully, VGAARB will call back to
the device driver. To query if the device drivers want to be primary or
not. Device drivers can just pass NULL if have no such needs.

Please note that:

1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
like to mount at least three video cards.

2) Typically, those non-86 machines don't have a good UEFI firmware
support, which doesn't support select primary GPU as firmware stage.
Even on x86, there are old UEFI firmwares which already made undesired
decision for you.

3) This series is attempt to solve the remain problems at the driver level,
while another series[1] of me is target to solve the majority of the
problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.

$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]

In all my previous experiments with VGA routing and IGD I found that
IGD can't actually release VGA routing and Intel confirmed the hardware
doesn't have the ability to do so.  It will always be primary from a
VGA routing perspective.  Was this actually tested with non-UEFI?


Yes, I have tested on my Aspire E471 notebook (i5 5200U),
because that notebook uses legacy firmware (it also has UEFI, dual firmware).
But this machine had difficulty installing Ubuntu under UEFI firmware in the 
past, so I keep it on the legacy firmware.

It has two video cards, the IGD and an NVIDIA video card (GeForce 840M).
NVIDIA calls its video card a 3D controller (pci->class = 0x030200).

I have tested this patch and another patch mentioned at [1] together.
I can tell you that the firmware framebuffer of this notebook uses vesafb, 
not efifb.
And the framebuffer size (lfb.size) is very small. This is very strange,
but I don't have enough time to look into the details. But it still works.

I'm using and testing my patch whenever and wherever possible.


I suspect it might only work in UEFI mode where we probably don't
actually have a dependency on VGA routing.  This is essentially why
vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
broken to use on Intel systems with IGD.  Thanks,



What you tell me here is a side effect that comes with VGA compatibility,
but I'm focused on the arbitration itself. I think there is no need to keep
the VGA routing hardware features nowadays, except that hardware vendors
want to keep backward compatibility and/or comply with the PCI VGA-compatible
spec.



Alex





Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng



On 2023/9/5 18:49, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned
problem by introduced the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this 
stub by

calling the vga_client_register() function.

Once the driver bound the device successfully, VGAARB will call back to
the device driver. To query if the device drivers want to be primary or
not. Device drivers can just pass NULL if have no such needs.

Please note that:

1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
    like to mount at least three video cards.

2) Typically, those non-86 machines don't have a good UEFI firmware
    support, which doesn't support select primary GPU as firmware stage.
    Even on x86, there are old UEFI firmwares which already made 
undesired

    decision for you.

3) This series is attempt to solve the remain problems at the driver 
level,

    while another series[1] of me is target to solve the majority of the
    problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.


FYI: per-driver modeset parameters are deprecated and not to be used. 
Please don't promote them.



Well, please wait, I want to explain.



drm/nouveau already promotes it a little bit.

There is no code of conduct or specification guiding what module parameters 
should look like.
Still, a lot of DRM drivers already support the modeset parameter, and the 
authors of the various device drivers try to make its usage not conflict 
with others. I believe that this is a good thing for Linux users.
It is probably the responsibility of the drm core maintainers to push the 
various drm drivers to reach a minimal consensus. Probably it is painful to 
do so and doesn't pay off.
But reaching a minimal consensus does benefit Linux users.


You can use modprobe.blacklist or initcall_blacklist on the kernel 
command line.



There are some cases where modprobe.blacklist doesn't work;
I have come across them several times in the past.
Because the device selected by VGAARB is a device-level thing,
it is not the driver's problem.

Sometimes, when VGAARB has a bug, it will select the wrong device as primary.
The X server will then use this wrong device as primary and crash completely,
due to the lack of a driver. Take my old S3 Graphics card as an example:

$ lspci | grep VGA

 00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
 07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
 08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)

Before applying this patch:

[0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[0.361781] vgaarb: loaded
[0.367838] pci :00:06.1: Overriding boot device as 1002:6778
[0.367841] pci :00:06.1: Overriding boot device as 5333:9070
[0.367843] pci :00:06.1: Overriding boot device as 5333:9070
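
The owns= field in the vgaarb lines above shows which device currently owns
the legacy VGA resources. A minimal userspace sketch for pulling that field
out of such a line follows. The function name is hypothetical, and the two
flag values are assumed to mirror VGA_RSRC_LEGACY_IO and VGA_RSRC_LEGACY_MEM
from the kernel's vgaarb header, so check them against your tree.

```
#include <string.h>

/* Assumed to mirror the legacy resource flags in include/linux/vgaarb.h. */
#define VGA_RSRC_LEGACY_IO  0x01
#define VGA_RSRC_LEGACY_MEM 0x02

/*
 * Hypothetical helper: find "owns=" in a vgaarb log line and turn its
 * value ("io+mem", "io", "mem" or "none") into a flag mask.
 * Returns -1 if the line has no usable "owns=" field.
 */
int vgaarb_owns_mask(const char *line)
{
	const char *p = strstr(line, "owns=");
	char field[16];
	char *tok;
	size_t len;
	int mask = 0;

	if (!p)
		return -1;
	p += strlen("owns=");

	/* The value ends at the next comma or at the end of the line. */
	len = strcspn(p, ",\n");
	if (len == 0 || len >= sizeof(field))
		return -1;
	memcpy(field, p, len);
	field[len] = '\0';

	/* Split on '+' and accumulate the flags; "none" adds nothing. */
	for (tok = strtok(field, "+"); tok; tok = strtok(NULL, "+")) {
		if (strcmp(tok, "io") == 0)
			mask |= VGA_RSRC_LEGACY_IO;
		else if (strcmp(tok, "mem") == 0)
			mask |= VGA_RSRC_LEGACY_MEM;
	}
	return mask;
}
```

Applied to the log above, only the 00:06.1 line yields a non-zero mask; the
other three devices report owns=none.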


For known reasons, one of my systems selects the S3 Graphics card as the 
primary GPU.
But this S3 Graphics card doesn't even have a decent upstream drm driver yet.
In such a case, I begin to believe that only a device that has a
driver deserves to be primary.

Under such conditions, I want to reboot and enter the graphical environment
with the other working video cards, either platform-integrated or discrete GPUs.
This doesn't mean I should compromise by unmounting the S3 graphics card from
the motherboard, nor does it mean that I should update my BIOS settings,
as sometimes the BIOS is even worse.

With this series applied, all I need to do is reboot the computer and
pass a command line option. By forcibly overriding another video card (one 
that has decent driver support) as primary, I'm able to do the debugging in a
graphical environment. I would like to examine what's wrong with vgaarb
on a specific platform under an X server graphical environment.

I could probably try compiling a driver for this card and see whether it 
works, simply rebooting without needing to change anything. It is so 
efficient. So this is probably the second use of my patch. It hands control 
back to the graphics developer.




Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Thomas Zimmermann

Hi

Am 05.09.23 um 15:30 schrieb suijingfeng:

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time of DRM drivers get loaded successfully,the boot-up 
graphics already finished.
Firmware framebuffer device already get killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So, this series is definitely not to 
interact with the firmware framebuffer


Yes and no. The helpers you mention will attempt to remove the firmware 
framebuffer on the given PCI device. If you have multiple PCI devices, 
the other devices would not be affected.


This also means that probing a non-primary card will not affect the 
firmware framebuffer on the primary card. You can have all these drivers 
co-exist next to each other. If you link a full DRM driver into the 
kernel image, it might even be loaded before the firmware-framebuffer's 
driver.  We had some funny bugs from these interactions.



(or more intelligent framebuffer drivers).  It is for user space 
program, such as X server and Wayland
compositor. Its for Linux user or drm drivers testers, which allow them 
to direct graphic display server

using right hardware of interested as primary video card.

Also, I believe that X server and Wayland compositor are the best test 
examples.

If a specific DRM driver can't work with X server as a primary,
then there probably have something wrong.


If you want to run a userspace compositor or X11 on a certain device, 
you best configure this in the program's config files. But not on the 
kernel command line.


The whole concept of a 'primary' display is bogus IMHO. It only exists 
because old VGA and BIOS (and their equivalents on non-PC systems) were 
unable to use more than one graphics device. Hence, as you write below, 
only the first device got POSTed by the BIOS. If you had an additional 
card, the device driver needed to perform the POSTing.


However, on modern Linux systems the primary display does not really 
exist. 'Primary' is the device that is available via VGA, VESA or EFI. 
Our drivers don't use these interfaces, but the native registers. As you 
said yourself, these firmware devices (VGA, VESA, EFI) are removed ASAP 
by the native drivers.






But what's the use case for overriding this setting?



On a specific machine with multiple GPUs mounted,
only the primary graphics get POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the rest video cards, have to choose to
work without the prerequisite setups done by firmware, This is called as 
POST.


One of the use cases of this series is to test if a specific DRM driver 
could works properly,
even though there is no prerequisite works have been done by firmware at 
all.

And it seems that the results is not satisfying in all cases.

drm/ast is the first drm drivers which refused to work if not being 
POST-ed by the firmware.


You might have found a bug in the ast driver. Ast has means to detect if 
the device has been POSTed and maybe do that. If this doesn't work 
correctly, it needs a fix.


As Christian mentioned, if anything, you might add an option to specify 
the default card to vgaarb (e.g., as PCI slot). But userspace should 
avoid the idea of a primary card IMHO.
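
Naming the default card by PCI slot would need a parser for the slot
string. A minimal userspace sketch follows, assuming the usual
domain:bus:device.function notation in hexadecimal. struct pci_slot and
parse_pci_slot() are hypothetical names, and the vgaarb option itself is
only a proposal in this thread.

```
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for a parsed slot address. */
struct pci_slot {
	unsigned int domain;
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/*
 * Parse a string such as "0000:04:00.0". Returns true and fills *out on
 * success; returns false on malformed input or out-of-range fields.
 */
bool parse_pci_slot(const char *s, struct pci_slot *out)
{
	unsigned int domain, bus, dev, fn;
	char extra;

	/* Exactly four fields must match; a fifth means trailing junk. */
	if (sscanf(s, "%x:%x:%x.%x%c", &domain, &bus, &dev, &fn, &extra) != 4)
		return false;

	/* Field widths: 16-bit domain, 8-bit bus, 5-bit device, 3-bit function. */
	if (domain > 0xffff || bus > 0xff || dev > 0x1f || fn > 0x7)
		return false;

	out->domain = domain;
	out->bus = bus;
	out->dev = dev;
	out->fn = fn;
	return true;
}
```

sscanf() is lenient about leading whitespace and similar details, so a real
implementation would want stricter validation than this sketch performs.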


Best regards
Thomas



Before applying this series, I was unable to make drm/ast the primary video 
card easily. On a
multiple video card configuration, the monitor connected to the 
AST2400 did not light up.

While confusing, a naive programmer may suspect that PRIME is not working.

After applying this series and passing ast.modeset=10 on the kernel cmd 
line,
I found that the monitor connected to my ast2400 video card was still black.
It didn't display anything.

While studying drm/ast, I learned that the drm/ast driver ships its own 
POST code.
See the ast_post_gpu() function. Then I wondered why this function 
doesn't work.
After a short (hasty) debugging session, I found that the 
ast_post_gpu() function
didn't get run, because it depends on ast->config_mode.

Without thinking too much, I hardcoded ast->config_mode as 
ast_use_p2a to
force the ast_post_gpu() function to run.

```

--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
 		}
 	}

+	ast->config_mode = ast_use_p2a;
+
 	switch (ast->config_mode) {
 	case ast_use_defaults:
 		drm_info(dev, "Using default configuration\n");

```

Then, the monitor light up, it display the Ubuntu 

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Alex Williamson
On Tue,  5 Sep 2023 03:57:15 +0800
Sui Jingfeng  wrote:

> From: Sui Jingfeng 
> 
> On a machine with multiple GPUs, a Linux user has no control over which
> one is primary at boot time. This series tries to solve above mentioned
> problem by introduced the ->be_primary() function stub. The specific
> device drivers can provide an implementation to hook up with this stub by
> calling the vga_client_register() function.
> 
> Once the driver bound the device successfully, VGAARB will call back to
> the device driver. To query if the device drivers want to be primary or
> not. Device drivers can just pass NULL if have no such needs.
> 
> Please note that:
> 
> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
>like to mount at least three video cards.
> 
> 2) Typically, those non-86 machines don't have a good UEFI firmware
>support, which doesn't support select primary GPU as firmware stage.
>Even on x86, there are old UEFI firmwares which already made undesired
>decision for you.
> 
> 3) This series is attempt to solve the remain problems at the driver level,
>while another series[1] of me is target to solve the majority of the
>problems at device level.
> 
> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
> 630 is the default boot VGA, successfully override by ast2400 with
> ast.modeset=10 append at the kernel cmd line.
> 
> $ lspci | grep VGA
> 
>  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
> Graphics 630]

In all my previous experiments with VGA routing and IGD I found that
IGD can't actually release VGA routing and Intel confirmed the hardware
doesn't have the ability to do so.  It will always be primary from a
VGA routing perspective.  Was this actually tested with non-UEFI?

I suspect it might only work in UEFI mode where we probably don't
actually have a dependency on VGA routing.  This is essentially why
vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
broken to use on Intel systems with IGD.  Thanks,

Alex

>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
> Caicos XTX [Radeon HD 8490 / R5 235X OEM]
>  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
> Family (rev 30)
>  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 
> 720] (rev a1)
> 
> $ sudo dmesg | grep vgaarb
> 
>  pci :00:02.0: vgaarb: setting as boot VGA device
>  pci :00:02.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=io+mem,locks=none
>  pci :01:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  pci :04:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  pci :05:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  vgaarb: loaded
>  ast :04:00.0: vgaarb: Override as primary by driver
>  i915 :00:02.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=io+mem
>  radeon :01:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=none
>  ast :04:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=none
> 
> v2:
>   * Add a simple implementation for drm/i915 and drm/ast
>   * Pick up all tags (Mario)
> v3:
>   * Fix a mistake in the drm/i915 implementation
>   * Fix the patch failing to apply because of a merge conflict.
> v4:
>   * Focus on solving the real problem.
> 
> v1,v2 at https://patchwork.freedesktop.org/series/120059/
>v3 at https://patchwork.freedesktop.org/series/120562/
> 
> [1] https://patchwork.freedesktop.org/series/122845/
> 
> Sui Jingfeng (9):
>   PCI/VGA: Allowing the user to select the primary video adapter at boot
> time
>   drm/nouveau: Implement .be_primary() callback
>   drm/radeon: Implement .be_primary() callback
>   drm/amdgpu: Implement .be_primary() callback
>   drm/i915: Implement .be_primary() callback
>   drm/loongson: Implement .be_primary() callback
>   drm/ast: Register as a VGA client by calling vga_client_register()
>   drm/hibmc: Register as a VGA client by calling vga_client_register()
>   drm/gma500: Register as a VGA client by calling vga_client_register()
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
>  drivers/gpu/drm/ast/ast_drv.c | 31 ++
>  drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
>  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
>  drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
>  drivers/gpu/drm/loongson/loongson_module.c|  2 +-
>  drivers/gpu/drm/loongson/loongson_module.h|  1 +
>  drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
>  drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
>  drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
>  drivers/pci/vgaarb.c  | 43 --
>  drivers/vfio/pci/vfio_pci_core.c   

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Sui Jingfeng

Hi,

On 2023/9/5 21:28, Christian König wrote:


2) Typically, those non-86 machines don't have a good UEFI firmware
    support, which doesn't support select primary GPU as firmware 
stage.
    Even on x86, there are old UEFI firmwares which already made 
undesired

    decision for you.

3) This series is attempt to solve the remain problems at the driver 
level,
    while another series[1] of me is target to solve the majority of 
the

    problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD 
Graphics

630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.

The value 10 is incredibly arbitrary, and multiplied as a magic number
all over the place.


+1 



This is exactly why I sent this series as an RFC: it is an open-ended
problem. The choices 3, 4, 5, 6, 7, 8 and 9 are as arbitrary as 10. The
values '1' and '2' are definitely not suitable, because they are already
taken.

Take the drm/nouveau as an example:


```

MODULE_PARM_DESC(modeset, "enable driver (default: auto, "
  "0 = disabled, 1 = enabled, 2 = headless)");
int nouveau_modeset = -1;
module_param_named(modeset, nouveau_modeset, int, 0400);

```


'1' enables the DRM driver; some drivers even let it override the
'nomodeset' parameter.

'2' is not suitable, because nouveau uses it for headless GPUs (render-only
or compute-class GPUs?).

'3' is probably not the best choice either: what if a specific DRM driver
wants to extend the parameter in the future?
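
The values discussed so far can be summarized in a small C sketch. The enum
and function names are hypothetical and exist in no driver; the meanings of
-1, 0, 1 and 2 are taken from the nouveau parameter description quoted
above, and 10 is only what this RFC proposes.

```c
#include <stdbool.h>

/* Values as described in this thread (names are illustrative only). */
enum modeset_value {
	MODESET_AUTO     = -1,  /* default: auto */
	MODESET_DISABLED = 0,
	MODESET_ENABLED  = 1,
	MODESET_HEADLESS = 2,   /* nouveau only, per its MODULE_PARM_DESC */
	MODESET_PRIMARY  = 10,  /* proposed by this RFC, not an accepted ABI */
};

/* Human-readable meaning of a modeset value; anything else is unassigned. */
const char *modeset_describe(int value)
{
	switch (value) {
	case MODESET_AUTO:     return "auto";
	case MODESET_DISABLED: return "disabled";
	case MODESET_ENABLED:  return "enabled";
	case MODESET_HEADLESS: return "headless (nouveau)";
	case MODESET_PRIMARY:  return "primary (RFC proposal)";
	default:               return "unassigned";
	}
}

/* True only for the value the RFC overloads to request primary status. */
bool modeset_requests_primary(int value)
{
	return value == MODESET_PRIMARY;
}
```

Written out this way, it is visible that the values 3 to 9 carry no
meaning, and that '10' adds a second, unrelated meaning to a parameter
whose other values only say whether the driver is enabled.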


The reasons I picked the value '10' are:


1) The modeset parameter is unlikely to grow to 10 distinct usages.

Other DRM drivers only use '-1', '0' and '1'; choosing '2' would conflict
with drm/nouveau. Picking '10' leaves some room for the various device
driver authors. It also helps to keep the usage consistent across drivers.


2) An int takes up 4 bytes, and I don't want to waste even a single byte.

In defense of my patch, I have to say that drafting another kernel command
line parameter would waste precious RAM. A 32-bit int has 2^31 non-negative
values, so why can't we improve the utilization rate?

3) Please consider that modeset is the most common and attractive parameter.

No name is better than 'modeset', as other names are not easy to remember.

Again, this is for Linux users, so it is not arbitrary.
Although it is simple and trivial, I thought about it for more than one week.



Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread suijingfeng

Hi,


On 2023/9/5 18:45, Thomas Zimmermann wrote:

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned


If anything, the primary graphics adapter is the one initialized by 
the firmware. I think our boot-up graphics also make this assumption 
implicitly.




Yes, but by the time the DRM drivers have loaded successfully, the boot-up
graphics have already finished. The firmware framebuffer device has already
been removed by drm_aperture_remove_conflicting_pci_framebuffers() (or one
of its siblings). So this series is definitely not meant to interact with
the firmware framebuffer (or more intelligent framebuffer drivers). It is
for user-space programs such as the X server and Wayland compositors. It is
for Linux users and DRM driver testers, and allows them to direct the
graphical display server to use the hardware of interest as the primary
video card.

Also, I believe that the X server and Wayland compositors are the best test
cases. If a specific DRM driver can't work with the X server as primary,
then something is probably wrong.



But what's the use case for overriding this setting?



On a machine with multiple GPUs installed, only the primary graphics
device gets POSTed (initialized) by the firmware. The DRM drivers for the
remaining video cards therefore have to work without the prerequisite
setup that the firmware performs during POST.

One of the use cases of this series is to test whether a specific DRM
driver works properly even though the firmware has done none of the
prerequisite setup. The results are not satisfying in all cases.

drm/ast is the first DRM driver I found that refuses to work if the device
has not been POSTed by the firmware.

Before applying this series, I was unable to make drm/ast the primary
video card easily. In a multiple-video-card configuration, the monitor
connected to the AST2400 did not light up. This is confusing: a naive
programmer may suspect that PRIME is not working.

After applying this series and passing ast.modeset=10 on the kernel
command line, I found that the monitor connected to my AST2400 video card
was still black. It displayed nothing.

While studying drm/ast, I learned that the driver ships its own POST code;
see the ast_post_gpu() function. I then wondered why this function did not
work. After a short (hasty) debugging session, I found that ast_post_gpu()
did not run at all, and that this has something to do with
ast->config_mode.

Without thinking too much, I hardcoded ast->config_mode to ast_use_p2a to
force ast_post_gpu() to run.

```

--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
 		}
 	}

+	ast->config_mode = ast_use_p2a;
+
 	switch (ast->config_mode) {
 	case ast_use_defaults:
 		drm_info(dev, "Using default configuration\n");

```

Then the monitor lit up and displayed the Ubuntu greeter.
Therefore, my patch is helpful, at least for Linux DRM driver testers and
developers. It allows programmers to test a specific part of a specific
driver without changing a line of source code and without needing sudo
privileges. It helps to improve the efficiency of testing and patch
verification.

I know about the PrimaryGPU option in the Xorg configuration, but that
approach persists the setting: you need to modify it with root privileges
each time you want to switch the primary. When rapidly developing and/or
testing multiple video drivers with only one computer available, what we
really want is probably a one-shot command like the one this series
provides.

So, this is the first use case. It probably also helps to test full modeset,
PRIME and reverse PRIME on machines with multiple video cards.



Best regards
Thomas





Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Christian König

Am 05.09.23 um 12:38 schrieb Jani Nikula:

On Tue, 05 Sep 2023, Sui Jingfeng  wrote:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned
problem by introduced the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver bound the device successfully, VGAARB will call back to
the device driver. To query if the device drivers want to be primary or
not. Device drivers can just pass NULL if have no such needs.

Please note that:

1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
like to mount at least three video cards.


Well, you rarely find a board which can actually handle a single one :)



2) Typically, those non-86 machines don't have a good UEFI firmware
support, which doesn't support select primary GPU as firmware stage.
Even on x86, there are old UEFI firmwares which already made undesired
decision for you.

3) This series is attempt to solve the remain problems at the driver level,
while another series[1] of me is target to solve the majority of the
problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.

The value 10 is incredibly arbitrary, and multiplied as a magic number
all over the place.


+1




$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]
  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Caicos XTX [Radeon HD 8490 / R5 235X OEM]
  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
Family (rev 30)
  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] 
(rev a1)

In this example, all of the GPUs are driven by different drivers. What
good does a module parameter do if you have multiple GPUs of the same
model, all driven by the same driver module?


Completely agree. Question is what is the benefit for the end user to 
actually specify this?


If you want the initial console on a different device, then implement a
kernel option for vgaarb and *not* for the drivers.


Regards,
Christian.



BR,
Jani.


$ sudo dmesg | grep vgaarb

  pci :00:02.0: vgaarb: setting as boot VGA device
  pci :00:02.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
  pci :01:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :04:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :05:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  vgaarb: loaded
  ast :04:00.0: vgaarb: Override as primary by driver
  i915 :00:02.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
  radeon :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
  ast :04:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none

v2:
* Add a simple implementation for drm/i915 and drm/ast
* Pick up all tags (Mario)
v3:
* Fix a mistake in the drm/i915 implementation
* Fix the patch failing to apply because of a merge conflict.
v4:
* Focus on solving the real problem.

v1,v2 at https://patchwork.freedesktop.org/series/120059/
v3 at https://patchwork.freedesktop.org/series/120562/

[1] https://patchwork.freedesktop.org/series/122845/

Sui Jingfeng (9):
   PCI/VGA: Allowing the user to select the primary video adapter at boot
 time
   drm/nouveau: Implement .be_primary() callback
   drm/radeon: Implement .be_primary() callback
   drm/amdgpu: Implement .be_primary() callback
   drm/i915: Implement .be_primary() callback
   drm/loongson: Implement .be_primary() callback
   drm/ast: Register as a VGA client by calling vga_client_register()
   drm/hibmc: Register as a VGA client by calling vga_client_register()
   drm/gma500: Register as a VGA client by calling vga_client_register()

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
  drivers/gpu/drm/ast/ast_drv.c | 31 ++
  drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
  drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
  drivers/gpu/drm/loongson/loongson_module.c|  2 +-
  drivers/gpu/drm/loongson/loongson_module.h|  1 +
  drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
  drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
  drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
  drivers/pci/vgaarb.c  | 43 --
  drivers/vfio/pci/vfio_pci_core.c  |  2 +-
  

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Thomas Zimmermann

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned
problem by introduced the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver bound the device successfully, VGAARB will call back to
the device driver. To query if the device drivers want to be primary or
not. Device drivers can just pass NULL if have no such needs.

Please note that:

1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
like to mount at least three video cards.

2) Typically, those non-86 machines don't have a good UEFI firmware
support, which doesn't support select primary GPU as firmware stage.
Even on x86, there are old UEFI firmwares which already made undesired
decision for you.

3) This series is attempt to solve the remain problems at the driver level,
while another series[1] of me is target to solve the majority of the
problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.


FYI: per-driver modeset parameters are deprecated and not to be used. 
Please don't promote them. You can use modprobe.blacklist or 
initcall_blacklist on the kernel command line.


Best regards
Thomas



$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]
  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Caicos XTX [Radeon HD 8490 / R5 235X OEM]
  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
Family (rev 30)
  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] 
(rev a1)

$ sudo dmesg | grep vgaarb

  pci :00:02.0: vgaarb: setting as boot VGA device
  pci :00:02.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
  pci :01:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :04:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :05:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  vgaarb: loaded
  ast :04:00.0: vgaarb: Override as primary by driver
  i915 :00:02.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
  radeon :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
  ast :04:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none

v2:
* Add a simple implementation for drm/i915 and drm/ast
* Pick up all tags (Mario)
v3:
* Fix a mistake in the drm/i915 implementation
* Fix the patch failing to apply because of a merge conflict.
v4:
* Focus on solving the real problem.

v1,v2 at https://patchwork.freedesktop.org/series/120059/
v3 at https://patchwork.freedesktop.org/series/120562/

[1] https://patchwork.freedesktop.org/series/122845/

Sui Jingfeng (9):
   PCI/VGA: Allowing the user to select the primary video adapter at boot
 time
   drm/nouveau: Implement .be_primary() callback
   drm/radeon: Implement .be_primary() callback
   drm/amdgpu: Implement .be_primary() callback
   drm/i915: Implement .be_primary() callback
   drm/loongson: Implement .be_primary() callback
   drm/ast: Register as a VGA client by calling vga_client_register()
   drm/hibmc: Register as a VGA client by calling vga_client_register()
   drm/gma500: Register as a VGA client by calling vga_client_register()

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
  drivers/gpu/drm/ast/ast_drv.c | 31 ++
  drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
  drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
  drivers/gpu/drm/loongson/loongson_module.c|  2 +-
  drivers/gpu/drm/loongson/loongson_module.h|  1 +
  drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
  drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
  drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
  drivers/pci/vgaarb.c  | 43 --
  drivers/vfio/pci/vfio_pci_core.c  |  2 +-
  include/linux/vgaarb.h|  8 ++-
  14 files changed, 210 insertions(+), 19 deletions(-)



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Thomas Zimmermann

Hi

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve above mentioned


If anything, the primary graphics adapter is the one initialized by the 
firmware. I think our boot-up graphics also make this assumption implicitly.


But what's the use case for overriding this setting?

Best regards
Thomas


problem by introduced the ->be_primary() function stub. The specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver bound the device successfully, VGAARB will call back to
the device driver. To query if the device drivers want to be primary or
not. Device drivers can just pass NULL if have no such needs.

Please note that:

1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
like to mount at least three video cards.

2) Typically, those non-86 machines don't have a good UEFI firmware
support, which doesn't support select primary GPU as firmware stage.
Even on x86, there are old UEFI firmwares which already made undesired
decision for you.

3) This series is attempt to solve the remain problems at the driver level,
while another series[1] of me is target to solve the majority of the
problems at device level.

Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
630 is the default boot VGA, successfully override by ast2400 with
ast.modeset=10 append at the kernel cmd line.

$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]
  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Caicos XTX [Radeon HD 8490 / R5 235X OEM]
  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
Family (rev 30)
  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] 
(rev a1)

$ sudo dmesg | grep vgaarb

  pci :00:02.0: vgaarb: setting as boot VGA device
  pci :00:02.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
  pci :01:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :04:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  pci :05:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
  vgaarb: loaded
  ast :04:00.0: vgaarb: Override as primary by driver
  i915 :00:02.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
  radeon :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
  ast :04:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none

v2:
* Add a simple implementation for drm/i915 and drm/ast
* Pick up all tags (Mario)
v3:
* Fix a mistake in the drm/i915 implementation
* Fix the patch failing to apply because of a merge conflict.
v4:
* Focus on solving the real problem.

v1,v2 at https://patchwork.freedesktop.org/series/120059/
v3 at https://patchwork.freedesktop.org/series/120562/

[1] https://patchwork.freedesktop.org/series/122845/

Sui Jingfeng (9):
   PCI/VGA: Allowing the user to select the primary video adapter at boot
 time
   drm/nouveau: Implement .be_primary() callback
   drm/radeon: Implement .be_primary() callback
   drm/amdgpu: Implement .be_primary() callback
   drm/i915: Implement .be_primary() callback
   drm/loongson: Implement .be_primary() callback
   drm/ast: Register as a VGA client by calling vga_client_register()
   drm/hibmc: Register as a VGA client by calling vga_client_register()
   drm/gma500: Register as a VGA client by calling vga_client_register()

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
  drivers/gpu/drm/ast/ast_drv.c | 31 ++
  drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
  drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
  drivers/gpu/drm/loongson/loongson_module.c|  2 +-
  drivers/gpu/drm/loongson/loongson_module.h|  1 +
  drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
  drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
  drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
  drivers/pci/vgaarb.c  | 43 --
  drivers/vfio/pci/vfio_pci_core.c  |  2 +-
  include/linux/vgaarb.h|  8 ++-
  14 files changed, 210 insertions(+), 19 deletions(-)



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Jani Nikula
On Tue, 05 Sep 2023, Sui Jingfeng  wrote:
> From: Sui Jingfeng 
>
> On a machine with multiple GPUs, a Linux user has no control over which
> one is primary at boot time. This series tries to solve above mentioned
> problem by introduced the ->be_primary() function stub. The specific
> device drivers can provide an implementation to hook up with this stub by
> calling the vga_client_register() function.
>
> Once the driver bound the device successfully, VGAARB will call back to
> the device driver. To query if the device drivers want to be primary or
> not. Device drivers can just pass NULL if have no such needs.
>
> Please note that:
>
> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would
>like to mount at least three video cards.
>
> 2) Typically, those non-86 machines don't have a good UEFI firmware
>support, which doesn't support select primary GPU as firmware stage.
>Even on x86, there are old UEFI firmwares which already made undesired
>decision for you.
>
> 3) This series is attempt to solve the remain problems at the driver level,
>while another series[1] of me is target to solve the majority of the
>problems at device level.
>
> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics
> 630 is the default boot VGA, successfully override by ast2400 with
> ast.modeset=10 append at the kernel cmd line.

The value 10 is incredibly arbitrary, and multiplied as a magic number
all over the place.

> $ lspci | grep VGA
>
>  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
> Graphics 630]
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
> Caicos XTX [Radeon HD 8490 / R5 235X OEM]
>  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
> Family (rev 30)
>  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 
> 720] (rev a1)

In this example, all of the GPUs are driven by different drivers. What
good does a module parameter do if you have multiple GPUs of the same
model, all driven by the same driver module?
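
The ambiguity can be shown with a small userspace sketch in C. The struct
and function names are hypothetical, and the device list in the usage note
below is invented for illustration; it is not the test machine above.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative description of one GPU as seen by a selection policy. */
struct gpu {
	const char *driver;  /* name of the bound driver module */
	const char *slot;    /* PCI address string */
};

/* Number of devices that a per-driver module parameter would apply to. */
size_t count_by_driver(const struct gpu *gpus, size_t n, const char *driver)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (strcmp(gpus[i].driver, driver) == 0)
			hits++;
	return hits;
}

/* Number of devices that a per-slot selection would apply to. */
size_t count_by_slot(const struct gpu *gpus, size_t n, const char *slot)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (strcmp(gpus[i].slot, slot) == 0)
			hits++;
	return hits;
}
```

With two cards bound to the same driver, for example { "amdgpu",
"0000:01:00.0" } and { "amdgpu", "0000:05:00.0" }, selecting by driver name
matches both devices, while selecting by slot matches exactly one.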

BR,
Jani.

>
> $ sudo dmesg | grep vgaarb
>
>  pci :00:02.0: vgaarb: setting as boot VGA device
>  pci :00:02.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=io+mem,locks=none
>  pci :01:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  pci :04:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  pci :05:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=none,locks=none
>  vgaarb: loaded
>  ast :04:00.0: vgaarb: Override as primary by driver
>  i915 :00:02.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=io+mem
>  radeon :01:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=none
>  ast :04:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=none
>
> v2:
>   * Add a simple implementation for drm/i915 and drm/ast
>   * Pick up all tags (Mario)
> v3:
>   * Fix a mistake in the drm/i915 implementation
>   * Fix the patch failing to apply because of a merge conflict.
> v4:
>   * Focus on solving the real problem.
>
> v1,v2 at https://patchwork.freedesktop.org/series/120059/
>v3 at https://patchwork.freedesktop.org/series/120562/
>
> [1] https://patchwork.freedesktop.org/series/122845/
>
> Sui Jingfeng (9):
>   PCI/VGA: Allowing the user to select the primary video adapter at boot
> time
>   drm/nouveau: Implement .be_primary() callback
>   drm/radeon: Implement .be_primary() callback
>   drm/amdgpu: Implement .be_primary() callback
>   drm/i915: Implement .be_primary() callback
>   drm/loongson: Implement .be_primary() callback
>   drm/ast: Register as a VGA client by calling vga_client_register()
>   drm/hibmc: Register as a VGA client by calling vga_client_register()
>   drm/gma500: Register as a VGA client by calling vga_client_register()
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 13 -
>  drivers/gpu/drm/ast/ast_drv.c | 31 ++
>  drivers/gpu/drm/gma500/psb_drv.c  | 57 ++-
>  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   | 15 +
>  drivers/gpu/drm/i915/display/intel_vga.c  | 15 -
>  drivers/gpu/drm/loongson/loongson_module.c|  2 +-
>  drivers/gpu/drm/loongson/loongson_module.h|  1 +
>  drivers/gpu/drm/loongson/lsdc_drv.c   | 10 +++-
>  drivers/gpu/drm/nouveau/nouveau_vga.c | 11 +++-
>  drivers/gpu/drm/radeon/radeon_device.c| 10 +++-
>  drivers/pci/vgaarb.c  | 43 --
>  drivers/vfio/pci/vfio_pci_core.c  |  2 +-
>  include/linux/vgaarb.h|  8 ++-
>  14 files changed, 210 insertions(+), 19 deletions(-)

-- 
Jani Nikula, Intel Open Source Graphics Center


[RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-05 Thread Sui Jingfeng
From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve that problem by
introducing the ->be_primary() function stub. Device drivers can provide
an implementation and hook it up to this stub by calling the
vga_client_register() function.

Once the driver has bound to the device successfully, VGAARB calls back
into the device driver to query whether it wants to be primary or not.
Device drivers can simply pass NULL if they have no such need.
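
To make the callback flow described above concrete, here is a minimal
userspace sketch in C. It is not the code from this series and it does
not use the kernel API: struct fake_pci_dev, struct arbiter,
arbiter_add_device(), arbiter_register_client() and always_primary() are
hypothetical names made up for illustration. It only assumes what the
text says: a driver may hand over a callback at registration time, NULL
means "no preference", and a callback that returns true overrides the
firmware's default choice.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct fake_pci_dev {
	const char *slot;     /* e.g. "04:00.0" */
	bool firmware_boot;   /* true if firmware picked it as boot VGA */
};

/* Modelled on the ->be_primary() idea: return true to ask for primary. */
typedef bool (*be_primary_fn)(const struct fake_pci_dev *pdev);

/* Hypothetical arbiter state; the real VGAARB keeps far more than this. */
struct arbiter {
	const struct fake_pci_dev *clients[MAX_CLIENTS];
	size_t nclients;
	const struct fake_pci_dev *primary;
};

static void arbiter_init(struct arbiter *arb)
{
	arb->nclients = 0;
	arb->primary = NULL;
}

/* Device discovery: the first firmware-selected device is the default. */
static void arbiter_add_device(struct arbiter *arb,
			       const struct fake_pci_dev *pdev)
{
	if (pdev->firmware_boot && arb->primary == NULL)
		arb->primary = pdev;
}

/*
 * Driver registration, called once the driver has bound to the device.
 * A NULL callback means "no preference". Returns 0 on success, -1 if
 * the client table is full.
 */
static int arbiter_register_client(struct arbiter *arb,
				   const struct fake_pci_dev *pdev,
				   be_primary_fn be_primary)
{
	if (arb->nclients >= MAX_CLIENTS)
		return -1;
	arb->clients[arb->nclients++] = pdev;

	/* Call back into the driver and let it override the default. */
	if (be_primary != NULL && be_primary(pdev))
		arb->primary = pdev;
	return 0;
}

/* Example driver policy (an assumption): always ask to be primary. */
static bool always_primary(const struct fake_pci_dev *pdev)
{
	(void)pdev;
	return true;
}
```

In this model, a driver registering with a NULL callback leaves the
firmware default untouched, while a later driver registering with
always_primary takes over as primary. How the real arbiter resolves two
drivers that both ask to be primary is not covered by this sketch.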

Please note that:

1) ARM64, LoongArch and MIPS servers have a lot of PCIe slots, and I
   would like to mount at least three video cards.

2) Typically, those non-x86 machines don't have good UEFI firmware
   support, and the firmware cannot select the primary GPU at the
   firmware stage. Even on x86, there are old UEFI firmwares which have
   already made an undesired decision for you.

3) This series is an attempt to solve the remaining problems at the
   driver level, while another series of mine [1] targets the majority
   of the problems at the device level.

Tested (limited) on x86 with four video cards mounted. Intel UHD Graphics
630 is the default boot VGA; it is successfully overridden by the ast2400
with ast.modeset=10 appended to the kernel command line.

$ lspci | grep VGA

 00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XTX [Radeon HD 8490 / R5 235X OEM]
 04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
 05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] (rev a1)

$ sudo dmesg | grep vgaarb

 pci :00:02.0: vgaarb: setting as boot VGA device
 pci :00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
 pci :01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pci :04:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 vgaarb: loaded
 ast :04:00.0: vgaarb: Override as primary by driver
 i915 :00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
 radeon :01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
 ast :04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
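
The test above selects the ast device by appending ast.modeset=10 to the
kernel command line. As a rough userspace illustration of such a policy,
the C sketch below scans a command-line string for that token and reports
whether the driver would ask to be primary. The helper names
cmdline_get_int() and wants_primary() are hypothetical, and treating the
value 10 as "be primary" is an assumption taken from the test described
above, not from the driver source. The kernel itself handles module
parameters through its own infrastructure, which this sketch does not
reproduce.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for "key=<integer>" as a space-delimited token in cmdline.
 * On success, store the value in *out and return true.
 */
static bool cmdline_get_int(const char *cmdline, const char *key, long *out)
{
	size_t klen = strlen(key);
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept a match at the start of a token. */
		bool at_token_start = (p == cmdline) || (p[-1] == ' ');

		if (at_token_start && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *end;
			long v = strtol(val, &end, 10);

			/* Require digits, followed by end of token. */
			if (end != val && (*end == '\0' || *end == ' ')) {
				*out = v;
				return true;
			}
		}
		p += klen;
	}
	return false;
}

/* Assumed policy: ask to be primary only when ast.modeset is 10. */
static bool wants_primary(const char *cmdline)
{
	long modeset;

	return cmdline_get_int(cmdline, "ast.modeset", &modeset) &&
	       modeset == 10;
}
```

With this sketch, "root=/dev/sda1 ast.modeset=10 quiet" makes
wants_primary() return true, while a command line without the token, or
with a different value, returns false.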

v2:
* Add a simple implementation for drm/i915 and drm/ast
* Pick up all tags (Mario)
v3:
* Fix a mistake in the drm/i915 implementation
* Fix the patch failing to apply because of a merge conflict.
v4:
* Focus on solving the real problem.

v1,v2 at https://patchwork.freedesktop.org/series/120059/
   v3 at https://patchwork.freedesktop.org/series/120562/

[1] https://patchwork.freedesktop.org/series/122845/

Sui Jingfeng (9):
  PCI/VGA: Allowing the user to select the primary video adapter at boot time
  drm/nouveau: Implement .be_primary() callback
  drm/radeon: Implement .be_primary() callback
  drm/amdgpu: Implement .be_primary() callback
  drm/i915: Implement .be_primary() callback
  drm/loongson: Implement .be_primary() callback
  drm/ast: Register as a VGA client by calling vga_client_register()
  drm/hibmc: Register as a VGA client by calling vga_client_register()
  drm/gma500: Register as a VGA client by calling vga_client_register()

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 11 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c     | 13 -
 drivers/gpu/drm/ast/ast_drv.c               | 31 ++
 drivers/gpu/drm/gma500/psb_drv.c            | 57 ++-
 .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c | 15 +
 drivers/gpu/drm/i915/display/intel_vga.c    | 15 -
 drivers/gpu/drm/loongson/loongson_module.c  |  2 +-
 drivers/gpu/drm/loongson/loongson_module.h  |  1 +
 drivers/gpu/drm/loongson/lsdc_drv.c         | 10 +++-
 drivers/gpu/drm/nouveau/nouveau_vga.c       | 11 +++-
 drivers/gpu/drm/radeon/radeon_device.c      | 10 +++-
 drivers/pci/vgaarb.c                        | 43 --
 drivers/vfio/pci/vfio_pci_core.c            |  2 +-
 include/linux/vgaarb.h                      |  8 ++-
 14 files changed, 210 insertions(+), 19 deletions(-)

-- 
2.34.1