Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 05:23:00PM +, Taylor R Campbell wrote:
> > Date: Sun, 29 Jan 2023 16:44:08 +0100
> > From: tlaro...@polynum.com
> >
> > I will look (silently) to dev/pci/radeonfb.c to understand better the
> > logics and try to find if there is a way to obtain a better console
> > display.
>
> FYI, dev/pci/radeonfb.c is the legacy radeon framebuffer driver only
> for very old (~20-year-old) devices, not the modern drm driver.
>

Yep. I realized that when the debugging information I added to this file
did not show up...

> > BTW, the problem is with VGA and DVI(-D) connections. With another monitor
> > connected with HDMI (so more recent than this present 16:9 monitor, that
> > have only VGA and DVI-D connectors and was manufactured in
> > 2012 according to the EDID), the framebuffer has a better resolution.
>
> Comparing dmesg output from `boot -vx' with the two connectors may
> help to diagnose what's happening.
>
> (If you already sent it, sorry -- haven't had time to look closely
> yet.)

Yes: I have already sent the various dmesg outputs to you :-)

In the meantime, I will try to worm my way into the sources. Even if I
don't succeed in finding a cure, I will undoubtedly learn things along
the way...
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
> Date: Sun, 29 Jan 2023 16:44:08 +0100
> From: tlaro...@polynum.com
>
> I will look (silently) to dev/pci/radeonfb.c to understand better the
> logics and try to find if there is a way to obtain a better console
> display.

FYI, dev/pci/radeonfb.c is the legacy radeon framebuffer driver only
for very old (~20-year-old) devices, not the modern drm driver.

> BTW, the problem is with VGA and DVI(-D) connections. With another monitor
> connected with HDMI (so more recent than this present 16:9 monitor, that
> have only VGA and DVI-D connectors and was manufactured in
> 2012 according to the EDID), the framebuffer has a better resolution.

Comparing dmesg output from `boot -vx' with the two connectors may
help to diagnose what's happening.

(If you already sent it, sorry -- haven't had time to look closely
yet.)
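A comparison of two dmesg captures, as suggested above, can be done mechanically. The following is a minimal sketch in Python, not a tool used in this thread: the function name `diff_dmesg` and the labels are illustrative, and the sample lines are taken from the 9.2 and 10.0 excerpts quoted elsewhere in the thread. The same approach would apply to one capture per connector.

```python
import difflib

def diff_dmesg(old_lines, new_lines, old_label="old", new_label="new"):
    """Return only the removed ('-') and added ('+') lines between two captures."""
    diff = list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_label, tofile=new_label,
                                     lineterm="", n=0))
    body = diff[2:]  # drop the '---' / '+++' file header lines
    # With n=0 there are no context lines, only '@@' hunk markers to skip.
    return [line for line in body if not line.startswith("@@")]

dmesg_92 = [
    "Zone kernel: Available graphics memory: 2601178 kiB",
    "Zone dma32: Available graphics memory: 2097152 kiB",
    "cpu0 at mainbus0 apid 0",
]
dmesg_100 = [
    "Zone kernel: Available graphics memory: 9007199254079374 KiB",
    "Zone dma32: Available graphics memory: 2097152 KiB",
    "cpu0 at mainbus0 apid 0",
]

changes = diff_dmesg(dmesg_92, dmesg_100, "9.2", "10.0")
for line in changes:
    print(line)
```

Lines that are identical in both captures (here the `cpu0` line) are dropped, which keeps attention on what actually changed.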
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 03:59:45PM +0100, tlaro...@polynum.com wrote:
> On Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com wrote:
> > On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> > >
> > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> > > production). Only kernel and modules (not userland); and kernel is not
> > > GENERIC but a special config one matching the previous 9.2 config
> > > running on the node.
> > >
> > > No problem so far. As a user (and as advertised), I had simply to use
> > > audiocfg(1) to set the new correct default for audio in order to have
> > > sound back where I used to expect it.
> > >
> > > The main difference is about the framebuffer: previous kernel version
> > > picked the correct mode. NetBSD 10.0 does not and uses "entry level"
> > > mode 640x480x67, resulting in stretched fat big characters; message:
> > >
> > > no data for est. mode 640x480x67
> >
> > I think we are looking at the wrong place. The problem is the depth
> > in the mode looked for: 67! The only depths the cards know about are
> > multiples of 2^3.
> >
> > So where does this come from?
>
> Replying to myself: it is not the depth, but the frequency and it comes
> from sys/dev/videomode/edid.c.
>
> Now trying to find why, at least, it does not find 640x480x60, which
> exists---and 720x400x70 that exists also.

I had it backwards: the failure message is printed, under DIAGNOSTIC, for
the one mode that is not found, but this does not mean that the others
are not found.

The monitor EDID advertises only two modes: 640x480x60 and 720x400x70
(while it can do others). The screen being 16:9 (nominal resolution is
1600x900), the VESA mode chosen leads to this "ugly" rendering with
stretched, fat characters---which was not the case with 9.2. But this is
consistent with the logic as implemented, if I'm not (this time)
mistaken.

I will look (silently) at dev/pci/radeonfb.c to better understand the
logic and try to find out whether there is a way to obtain a better
console display.

BTW, the problem is with VGA and DVI(-D) connections. With another
monitor connected with HDMI (so more recent than the present 16:9
monitor, which has only VGA and DVI-D connectors and was manufactured in
2012 according to the EDID), the framebuffer has a better resolution.
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
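The "stretched, fat characters" follow from the aspect ratios alone. The small sketch below is plain arithmetic in Python. It assumes the 640x480 mode is scaled to fill the whole 1600x900 panel, which the thread does not confirm, and the function name is illustrative.

```python
from fractions import Fraction

def horizontal_stretch(mode, panel):
    """Factor by which content is widened when `mode` is scaled to fill `panel`."""
    mode_ratio = Fraction(mode[0], mode[1])     # 640x480   -> 4/3
    panel_ratio = Fraction(panel[0], panel[1])  # 1600x900  -> 16/9
    return panel_ratio / mode_ratio

stretch = horizontal_stretch((640, 480), (1600, 900))
print(stretch)  # 4/3
```

Under that assumption every glyph ends up one third wider than designed, on top of being enlarged by the low resolution.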
Re: NetBSD 10.0 BETA kernel testing: framebuffer
67 is a refresh rate, not a depth.

On Sun, 29 Jan 2023, tlaro...@polynum.com wrote:

> On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> >
> > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> > production). Only kernel and modules (not userland); and kernel is not
> > GENERIC but a special config one matching the previous 9.2 config
> > running on the node.
> >
> > No problem so far. As a user (and as advertised), I had simply to use
> > audiocfg(1) to set the new correct default for audio in order to have
> > sound back where I used to expect it.
> >
> > The main difference is about the framebuffer: previous kernel version
> > picked the correct mode. NetBSD 10.0 does not and uses "entry level"
> > mode 640x480x67, resulting in stretched fat big characters; message:
> >
> > no data for est. mode 640x480x67
>
> I think we are looking at the wrong place. The problem is the depth
> in the mode looked for: 67! The only depths the cards know about are
> multiples of 2^3.
>
> So where does this come from?
> --
> Thierry Laronde
> http://www.kergis.com/
> http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

+--------------------+--------------------------+----------------------+
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:    |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com    |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org  |
| & Network Engineer |                          | pgoyett...@gmail.com |
+--------------------+--------------------------+----------------------+
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
>
> Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> production). Only kernel and modules (not userland); and kernel is not
> GENERIC but a special config one matching the previous 9.2 config
> running on the node.
>
> No problem so far. As a user (and as advertised), I had simply to use
> audiocfg(1) to set the new correct default for audio in order to have
> sound back where I used to expect it.
>
> The main difference is about the framebuffer: previous kernel version
> picked the correct mode. NetBSD 10.0 does not and uses "entry level"
> mode 640x480x67, resulting in stretched fat big characters; message:
>
> no data for est. mode 640x480x67

I think we are looking at the wrong place. The problem is the depth
in the mode looked for: 67! The only depths the cards know about are
multiples of 2^3.

So where does this come from?
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com wrote:
> On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> >
> > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> > production). Only kernel and modules (not userland); and kernel is not
> > GENERIC but a special config one matching the previous 9.2 config
> > running on the node.
> >
> > No problem so far. As a user (and as advertised), I had simply to use
> > audiocfg(1) to set the new correct default for audio in order to have
> > sound back where I used to expect it.
> >
> > The main difference is about the framebuffer: previous kernel version
> > picked the correct mode. NetBSD 10.0 does not and uses "entry level"
> > mode 640x480x67, resulting in stretched fat big characters; message:
> >
> > no data for est. mode 640x480x67
>
> I think we are looking at the wrong place. The problem is the depth
> in the mode looked for: 67! The only depths the cards know about are
> multiples of 2^3.
>
> So where does this come from?

Replying to myself: it is not the depth but the frequency, and it comes
from sys/dev/videomode/edid.c.

Now trying to find out why, at least, it does not find 640x480x60, which
exists---and 720x400x70, which also exists.
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
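For orientation: the "est. mode" in the message presumably refers to the EDID established timings, a bitmap in which each bit stands for one fixed mode, written here as width x height x refresh rate in Hz. The sketch below decodes such a bitmap in Python, assuming the bit layout of EDID 1.3 (bytes 35 and 36 of the base block). It is not the code in sys/dev/videomode/edid.c; the function name and the sample bitmap are hypothetical.

```python
# Established timings I and II in EDID 1.3 order, most significant bit first.
EST_TIMINGS_1 = ["720x400x70", "720x400x88", "640x480x60", "640x480x67",
                 "640x480x72", "640x480x75", "800x600x56", "800x600x60"]
EST_TIMINGS_2 = ["800x600x72", "800x600x75", "832x624x75", "1024x768x87i",
                 "1024x768x60", "1024x768x70", "1024x768x75", "1280x1024x75"]

def decode_established(byte35, byte36):
    """Return the mode names whose bits are set in the two bitmap bytes."""
    modes = []
    for byte, names in ((byte35, EST_TIMINGS_1), (byte36, EST_TIMINGS_2)):
        for bit, name in zip(range(7, -1, -1), names):
            if byte & (1 << bit):
                modes.append(name)
    return modes

# Hypothetical bitmap with 720x400x70, 640x480x60 and 640x480x67 set.
print(decode_established(0b10110000, 0x00))
```

A driver then needs timing data for each advertised name; a message such as "no data for est. mode 640x480x67" would mean the name was advertised but no matching entry was found in the driver's mode table.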
Re: NetBSD 10.0 BETA kernel testing: framebuffer
> Date: Mon, 23 Jan 2023 09:33:45 +0100
> From: tlaro...@polynum.com
>
> On Mon, Jan 23, 2023 at 05:17:29AM +0700, Robert Elz wrote:
> > Date: Sun, 22 Jan 2023 20:27:24 +0100
> > From: tlaro...@polynum.com
> > Message-ID:
> >
> > | +Zone kernel: Available graphics memory: 9007199254079374 KiB
> >
> > I see something like that too, but while it is obviously absurd,
> > I'm not sure that it actually does any harm (maybe) - my system
> > mostly works -- though I am still using wsfb - the last time I
> > tried to start X with nouveau and no X server config at all
> > (a week or so ago) the kernel crashed very soon after.
> >
> > In every case I have looked that big number has been (when converted
> > to bytes, which the actual value being printed is - the output simply
> > divides by 2^10 (ie: >>10) for our convenience) a value of the same
> > general form, in your case
> >
> > 9007199254079374 KiB == 9223372036177278976 bytes == 0x7FFFFFFFD79E3800
> >
> > To me that suggests that probably something has a 64 bit value set to
> > MAXINT, and then writes a 32 bit value on top of it (and then treats that
> > as a 64 bit value). The top 32 bits being 0x7FFFFFFF seems always there.
> > [...]

This is the result of an integer overflow. It started to happen after
a change in some uvm kernel memory parameters which are used by a
Linux API shim, struct sysinfo::totalhigh. The definition of
si_meminfo, which initializes this, is wrong, but previously the
kernel map virtual size was usually much smaller than the amount of
physical RAM anyway so it would print a less bonkers number in the
past.

This should be fixed, but it's only relevant to allocation in
low-memory situations, not relevant to mode setting early at boot.

> Another possibility is a ptr diff'ing that gave the correct value
> previously and is not pertinent anymore because the memory address has
> changed:
>
> 9.2:
> -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, stride 6400
>
> while 10.0 is:
> +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 6400

Red herring:

- 9.2 prints the framebuffer's kernel virtual address, which is not
  very useful (and bad for kaslr).

- 10.0 prints the framebuffer's physical address.

Nothing changed in what is stored or calculated -- only in what is
printed, in genfb.c 1.79.
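To make the overflow concrete: a number of this shape is what an unsigned 64-bit subtraction produces when the value subtracted is the larger one and the wrapped result is then halved. The Python sketch below only illustrates that arithmetic. Both input values are hypothetical, chosen so that the result matches the figure printed in the dmesg; the halving step is an assumption as well, and nothing here is a transcription of si_meminfo or of the drm code.

```python
MASK64 = (1 << 64) - 1

def wrapped_sub(a, b):
    """a - b as unsigned 64-bit arithmetic would compute it (wraps modulo 2^64)."""
    return (a - b) & MASK64

# Hypothetical byte counts: the second exceeds the first by 1354993664.
total_ram = 8 * 1024**3
total_high = total_ram + 1354993664

# Assumed post-processing: halve the wrapped result, then convert to KiB.
half_kib = (wrapped_sub(total_ram, total_high) >> 1) >> 10
print(half_kib)  # 9007199254079374
```

With the operands the other way round in size, the same expression yields an unremarkable value, consistent with the remark that the number used to be "less bonkers" while the kernel map was smaller than physical RAM.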
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Mon, Jan 23, 2023 at 05:17:29AM +0700, Robert Elz wrote:
> Date: Sun, 22 Jan 2023 20:27:24 +0100
> From: tlaro...@polynum.com
> Message-ID:
>
> | +Zone kernel: Available graphics memory: 9007199254079374 KiB
>
> I see something like that too, but while it is obviously absurd,
> I'm not sure that it actually does any harm (maybe) - my system
> mostly works -- though I am still using wsfb - the last time I
> tried to start X with nouveau and no X server config at all
> (a week or so ago) the kernel crashed very soon after.
>
> In every case I have looked that big number has been (when converted
> to bytes, which the actual value being printed is - the output simply
> divides by 2^10 (ie: >>10) for our convenience) a value of the same
> general form, in your case
>
> 9007199254079374 KiB == 9223372036177278976 bytes == 0x7FFFFFFFD79E3800
>
> To me that suggests that probably something has a 64 bit value set to
> MAXINT, and then writes a 32 bit value on top of it (and then treats that
> as a 64 bit value). The top 32 bits being 0x7FFFFFFF seems always there.
> [...]

Another possibility is a pointer difference that gave the correct value
previously and is no longer pertinent because the memory address has
changed:

9.2:
-radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, stride 6400

while 10.0 is:
+radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 6400

FWIW,
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
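Apart from the address, the two framebuffer lines report the same geometry, and that geometry is self-consistent: the stride is the number of bytes per scanline. A quick check in Python, assuming a packed pixel format with no padding at the end of each line (the function name is illustrative):

```python
def stride_bytes(width, depth_bits):
    """Bytes per scanline for a packed framebuffer without line padding."""
    return width * depth_bits // 8

stride = stride_bytes(1600, 32)
print(stride)        # 6400
print(stride * 900)  # 5760000 bytes for one 1600x900 frame
```

So the kernel sets up a 1600x900 framebuffer in both versions; only the address differs between the two lines.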
Re: NetBSD 10.0 BETA kernel testing: framebuffer
Hello,

On Sun, Jan 22, 2023 at 04:59:19PM +0100, Martin Husemann wrote:
> On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> > no data for est. mode 640x480x67
>
> [..]
>
> > I have not updated the boot blocks. Is the 10.0 kernel expecting to have
> > hints about the modes from the bootloader i.e. a new install would
> > have updated the boot blocks and I would not have seen this?
>
> Boot blocks should be unrelated to this, but boot method (UEFI or BIOS)
> may play a role (that is not fully analyzed).
>
> We need more details, like full dmesg.
>
> Does the kernel probe the correct display connection?
>
> There are a few i915 PRs open that are caused by the wrong connector being
> used or the proper connector not responding, so the display capabilities
> can not be read, but there may be other reasons why the kernel can not
> read the EDID data.

Please find attached the 10.0 dmesg and the diff from the 9.2 dmesg to
the 10.0 dmesg (not edited, although the vast majority of the differences
are just PCI IDs now being translated to vendor and product strings).

Best,
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
    2018, 2019, 2020, 2021, 2022, 2023
    The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.

NetBSD 10.0_BETA (CONFIG) #0: Sun Jan 22 11:01:04 CET 2023
    tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-9.2-amd64_netbsd-amd64/netbsd/obj/sys/arch/amd64/compile/CONFIG
total memory = 8120 MB
avail memory = 7834 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
mainbus0 (root)
ACPI: RSDP 0x000F04A0 24 (v02 ALASKA)
ACPI: XSDT 0xDDF9A078 74 (v01 ALASKA A M I01072009 AMI 00010013)
ACPI: FACP 0xDDFA7AC8 00010C (v05 ALASKA A M I01072009 AMI 00010013)
ACPI: DSDT 0xDDF9A188 00D940 (v02 ALASKA A M I0034 INTL 20120711)
ACPI: FACS 0xDDFC7F80 40
ACPI: APIC 0xDDFA7BD8 62 (v03 ALASKA A M I01072009 AMI 00010013)
ACPI: FPDT 0xDDFA7C40 44 (v01 ALASKA A M I01072009 AMI 00010013)
ACPI: SSDT 0xDDFA7C88 000539 (v01 PmRef Cpu0Ist 3000 INTL 20120711)
ACPI: SSDT 0xDDFA81C8 000AD8 (v01 PmRef CpuPm3000 INTL 20120711)
ACPI: MCFG 0xDDFA8CA0 3C (v01 ALASKA A M I01072009 MSFT 0097)
ACPI: HPET 0xDDFA8CE0 38 (v01 ALASKA A M I01072009 AMI. 0005)
ACPI: SSDT 0xDDFA8D18 00036D (v01 SataRe SataTabl 1000 INTL 20120711)
ACPI: SSDT 0xDDFA9088 0034E1 (v01 SaSsdt SaSsdt 3000 INTL 20091112)
ACPI: ASF! 0xDDFAC570 A5 (v32 INTEL HCG 0001 TFSM 000F4240)
ACPI: 5 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins
cpu0 at mainbus0 apid 0
cpu0: Use lfence to serialize rdtsc
cpu0: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu0: node 0, package 0, core 0, smt 0
cpu1 at mainbus0 apid 2
cpu1: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu1: node 0, package 0, core 1, smt 0
acpi0 at mainbus0: Intel ACPICA 20221020
acpi0: X/RSDT: OemId , AslId
acpi0: MCFG: segment 0, bus 0-63, address 0xf800
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0x8E7E9B90F808 0005AA (v01 PmRef ApIst3000 INTL 20120711)
acpi0: SCI interrupting at int 9
acpi0: fixed power button present
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
acpiec0 at acpi0 (H_EC, PNP0C09-1): not present
TPMX (PNP0C01) at acpi0 not configured
FWHD (INT0800) at acpi0 not configured
attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0
com0 at acpi0 (UAR1, PNP0501-1): io 0x3f8-0x3ff irq 4
com0: ns16550a, 16-byte FIFO
lpt0 at acpi0 (LPTE, PNP0400): io 0x378-0x37f irq 5
acpiwmi0 at acpi0 (WMI1, PNP0C14-MXM2): ACPI WMI Interface
acpiwmibus at acpiwmi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
acpiwmi1 at acpi0 (WMIO, PNP0C14-0): ACPI WMI Interface
acpiwmibus at acpiwmi1 not configured
acpifan0 at acpi0 (FAN0, PNP0C0B-0): ACPI Fan
acpifan1 at acpi0 (FAN1, PNP0C0B-1): ACPI Fan
acpifan2 at acpi0 (FAN2, PNP0C0B-2): ACPI Fan
acpifan3 at acpi0 (FAN3, PNP0C0B-3): ACPI Fan
acpifan4 at acpi0 (FAN4, PNP0C0B-4): ACPI Fan
acpitz0 at acpi0 (TZ00)
acpitz0: active cooling level 0: 80.0C
acpitz0: active cooling level 1: 55.0C
acpitz0: active cooling level 2: 0.0C
acpitz0: active cooling level
Re: NetBSD 10.0 BETA kernel testing: framebuffer
Date: Sun, 22 Jan 2023 20:27:24 +0100
From: tlaro...@polynum.com
Message-ID:

| +Zone kernel: Available graphics memory: 9007199254079374 KiB

I see something like that too, but while it is obviously absurd,
I'm not sure that it actually does any harm (maybe) - my system
mostly works -- though I am still using wsfb - the last time I
tried to start X with nouveau and no X server config at all
(a week or so ago) the kernel crashed very soon after.

In every case I have looked at, that big number has been (when converted
to bytes, which is what the actual value being printed is - the output
simply divides by 2^10 (ie: >>10) for our convenience) a value of the
same general form, in your case

9007199254079374 KiB == 9223372036177278976 bytes == 0x7FFFFFFFD79E3800

To me that suggests that probably something has a 64 bit value set to
MAXINT, and then writes a 32 bit value on top of it (and then treats that
as a 64 bit value). The top 32 bits being 0x7FFFFFFF seems always there.

It could also be doing a read of a 64 bit value from hardware, where
most (or all) of the upper 32 bits don't really exist, and simply
float, which isn't being masked - but it seems very unlikely an issue
like that would affect multiple different graphics board types (from
different manufacturers).

I took a quick look in the kernel, and while I could find where this
value exists, and is printed, attempting to track down what sets it
eluded me. It looks to be via a function referenced by a structure,
but I couldn't find anything which looked like it might be calling it
(it may be hidden in a macro or something.)

Since the same thing happens with all different video drivers, it is
unlikely to be in those (though it could be a common, cut-and-paste
buggy code type issue).

kre
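The conversion quoted above can be checked directly. Python integers have arbitrary precision, so no truncation interferes; the variable names are only illustrative.

```python
kib = 9007199254079374
nbytes = kib << 10         # KiB -> bytes (multiply by 2^10)

print(nbytes)              # 9223372036177278976
print(hex(nbytes))         # 0x7fffffffd79e3800
print(hex(nbytes >> 32))   # 0x7fffffff, the top 32 bits
print((1 << 63) - nbytes)  # 677496832, the distance below 2^63
```

The value is therefore a full 64-bit quantity (16 hexadecimal digits) sitting about 646 MiB below 2^63.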
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> [...]
>
> The main difference is about the framebuffer: previous kernel version
> picked the correct mode. NetBSD 10.0 does not and uses "entry level"
> mode 640x480x67, resulting in stretched fat big characters; message:
>
> no data for est. mode 640x480x67
>
> while in dmesg the framebuffer has the same dimensions as with the
> 9.2 kernel:
>
> 9.2:
> -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, stride 6400
>
> 10.0:
> +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 6400
>

The differences between 9.2 (/^-/) and 10.0 (/^+/) extracted:

-kern info: [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164).
+initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164 0x00).
-Zone kernel: Available graphics memory: 2601178 kiB
-Zone dma32: Available graphics memory: 2097152 kiB
+Zone kernel: Available graphics memory: 9007199254079374 KiB
+Zone dma32: Available graphics memory: 2097152 KiB

Note the 10.0 value for "Zone kernel" and compare it with the correct
(9.2) one.

PR #56847 mentions this for "nouveau" (I have "radeon") and says the
problem occurs with UEFI and not with BIOS: this is incorrect, since my
node is in legacy boot: it uses BIOS and the value is still incorrect.
So the problem is not UEFI vs. BIOS.

There is also a third argument (0x00) on the CEDAR line in 10.0 that
does not exist in 9.2.

Maybe it is the same as for the sound: 10.0 is not enumerating in the
same order, and what succeeded previously because the first entry
happened to be the correct one is now failing.

Note: I had stumbled upon PR #56847 earlier, while searching for
something else, and now that I remembered it I had quite a time finding
it again with the PR search tools. And then, while trying to find a way
to find it again... I stumbled on a page by D. Holland stating that the
bug report system should be revamped. It's difficult not to concur...

May I suggest that a future system send candidate PRs to a mailing
list, so that keywords and sorting are done by knowledgeable people, in
order to group PRs by the moon they are (probably) pointing to, instead
of by the finger of the reporter? (This is not a dig at the
reporter---me included; the reporter reports what he sees: symptoms.)
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> no data for est. mode 640x480x67

[..]

> I have not updated the boot blocks. Is the 10.0 kernel expecting to have
> hints about the modes from the bootloader i.e. a new install would
> have updated the boot blocks and I would not have seen this?

Boot blocks should be unrelated to this, but boot method (UEFI or BIOS)
may play a role (that is not fully analyzed).

We need more details, like full dmesg.

Does the kernel probe the correct display connection?

There are a few i915 PRs open that are caused by the wrong connector being
used or the proper connector not responding, so the display capabilities
can not be read, but there may be other reasons why the kernel can not
read the EDID data.

Martin