Re: SunBlade 100: X is very yellow with XVR-100 (radeon r100)

2021-12-09 Thread Jonathan Gray
On Thu, Dec 09, 2021 at 10:01:30PM -0700, Ted Bullock wrote:
> On 2021-12-09 6:46 p.m., Ted Bullock wrote:
> > On 2021-12-06 4:21 p.m., Ted Bullock wrote:
> > I think that there is a bug triggered by endian code here:
> > 
> >> radeondrm0: RV100
> >> BIOS signature incorrect 0 0
> > 
> > in sys/dev/pci/drm/radeon/radeon_bios.c:840
> > 
> > if (rdev->bios[0] != 0x55 || rdev->bios[1] != 0xaa) {
> > 	printk("BIOS signature incorrect %x %x\n", rdev->bios[0],
> > 	    rdev->bios[1]);
> > 	goto free_bios;
> > }
> > 
> > I'm pretty sure that on sparc those bytes aren't going to be reporting
> > the same information as on a little endian machine. Or am I crazy and
> > wrong...
> > 
> 
> Indeed, I'm correct about there being an endian bug here.
> 
> I wrote some testing printfs to determine the code path since I'm still
> an uneducated peasant who doesn't understand ddb.  At least part of the
> problem for this card/system starts with the following code:
> 
> function: radeon_read_bios in sys/dev/pci/drm/radeon/radeon_bios.c:157
> 
> I added a test around the memcpy where the card's BIOS is copied to the
> rdev->bios buffer, printing the first 8 bytes before and after the copy.
> 
>   printk("radeon bios header: %x %x %x %x %x %x %x %x\n",
>   bios[0],
>   bios[1],
>   bios[2],
>   bios[3],
>   bios[4],
>   bios[5],
>   bios[6],
>   bios[7]);
> 
>   rdev->bios = kmalloc(size, GFP_KERNEL);
>   memcpy(rdev->bios, bios, size);
> 
>   printk("buffered bios header: %x %x %x %x %x %x %x %x\n",
>   rdev->bios[0],
>   rdev->bios[1],
>   rdev->bios[2],
>   rdev->bios[3],
>   rdev->bios[4],
>   rdev->bios[5],
>   rdev->bios[6],
>   rdev->bios[7]);
> 
> On the following boot I see this:
> 
> Rebooting with command: boot
> Boot device: disk  File and args:
> OpenBSD IEEE 1275 Bootblock 2.1
> ..>> OpenBSD BOOT 1.22
> Trying bsd...
> 
> 
> 
> radeondrm0: RV100
> radeon bios header: 55 aa 34 0 0 0 0 0
> buffered bios header: 0 0 0 0 0 34 aa 55
> BIOS signature incorrect 0 0
> [drm] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
> [drm] *ERROR* radeon: cp isn't working (-22).
> drm:pid0:r100_startup *ERROR* failed initializing CP (-22).
> drm:pid0:r100_init *ERROR* Disabling GPU acceleration
> [drm] *ERROR* Wait for CP idle timeout, shutting down CP.
> Failed to wait GUI idle while programming pipes. Bad things might happen.
> radeondrm0: 1280x1024, 8bpp
> wsdisplay1 at radeondrm0 mux 1
> wsdisplay1: screen 0 added (std, sun emulation)
> Bogus possible_clones: [ENCODER:45:TMDS-45] possible_clones=0x6 (full encoder 
> mask=0x7)
> Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x5 (full encoder 
> mask=0x7)
> Bogus possible_clones: [ENCODER:48:DAC-48] possible_clones=0x3 (full encoder 
> mask=0x7)
> 
> Thoughts folks? This is clearly going to impact all big endian + radeon gear.
> 
> Actually, I bet that the macppc platform has the same problem too.

sparc64 maps PCI little endian; I don't think macppc does.

can you try the following?

Index: sys/dev/pci/drm/amd/amdgpu/amdgpu_bios.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/amd/amdgpu/amdgpu_bios.c,v
retrieving revision 1.4
diff -u -p -r1.4 amdgpu_bios.c
--- sys/dev/pci/drm/amd/amdgpu/amdgpu_bios.c	7 Jul 2021 02:38:22 -0000	1.4
+++ sys/dev/pci/drm/amd/amdgpu/amdgpu_bios.c	10 Dec 2021 07:41:39 -0000
@@ -200,7 +200,6 @@ bool amdgpu_read_bios(struct amdgpu_devi
 #else
 bool amdgpu_read_bios(struct amdgpu_device *adev)
 {
-	uint8_t __iomem *bios;
 	size_t size;
 	pcireg_t address, mask;
 	bus_space_handle_t romh;
@@ -218,25 +217,15 @@ bool amdgpu_read_bios(struct amdgpu_devi
 	size = PCI_ROM_SIZE(mask);
 	if (size == 0)
 		return false;
-	rc = bus_space_map(adev->memt, PCI_ROM_ADDR(address), size,
-	    BUS_SPACE_MAP_LINEAR, &romh);
+	rc = bus_space_map(adev->memt, PCI_ROM_ADDR(address), size, 0, &romh);
 	if (rc != 0) {
 		printf(": can't map PCI ROM (%d)\n", rc);
 		return false;
 	}
-	bios = (uint8_t *)bus_space_vaddr(adev->memt, romh);
-	if (!bios) {
-		printf(": bus_space_vaddr failed\n");
-		return false;
-	}
 
 	adev->bios = kzalloc(size, GFP_KERNEL);
-	if (adev->bios == NULL) {
-		bus_space_unmap(adev->memt, romh, size);
-		return false;
-	}
 	adev->bios_size = size;
-	memcpy_fromio(adev->bios, bios, size);
+	bus_space_read_region_1(adev->memt, romh, 0, adev->bios, size);
 	bus_space_unmap(adev->memt, romh, size);
 
 	if (!check_atom_bios(adev->bios, size)) {
Index: sys/dev/pci/drm/radeon/radeon_bios.c
===================================================================
RCS file: 

Re: SunBlade 100: X is very yellow with XVR-100 (radeon r100)

2021-12-09 Thread Ted Bullock
On 2021-12-09 6:46 p.m., Ted Bullock wrote:
> On 2021-12-06 4:21 p.m., Ted Bullock wrote:
> I think that there is a bug triggered by endian code here:
> 
>> radeondrm0: RV100
>> BIOS signature incorrect 0 0
> 
> in sys/dev/pci/drm/radeon/radeon_bios.c:840
> 
> if (rdev->bios[0] != 0x55 || rdev->bios[1] != 0xaa) {
> 	printk("BIOS signature incorrect %x %x\n", rdev->bios[0],
> 	    rdev->bios[1]);
> 	goto free_bios;
> }
> 
> I'm pretty sure that on sparc those bytes aren't going to be reporting
> the same information as on a little endian machine. Or am I crazy and
> wrong...
> 

Indeed, I'm correct about there being an endian bug here.

I wrote some testing printfs to determine the code path since I'm still
an uneducated peasant who doesn't understand ddb.  At least part of the
problem for this card/system starts with the following code:

function: radeon_read_bios in sys/dev/pci/drm/radeon/radeon_bios.c:157

I added a test around the memcpy where the card's BIOS is copied to the
rdev->bios buffer, printing the first 8 bytes before and after the copy.

printk("radeon bios header: %x %x %x %x %x %x %x %x\n",
bios[0],
bios[1],
bios[2],
bios[3],
bios[4],
bios[5],
bios[6],
bios[7]);

rdev->bios = kmalloc(size, GFP_KERNEL);
memcpy(rdev->bios, bios, size);

printk("buffered bios header: %x %x %x %x %x %x %x %x\n",
rdev->bios[0],
rdev->bios[1],
rdev->bios[2],
rdev->bios[3],
rdev->bios[4],
rdev->bios[5],
rdev->bios[6],
rdev->bios[7]);

On the following boot I see this:

Rebooting with command: boot
Boot device: disk  File and args:
OpenBSD IEEE 1275 Bootblock 2.1
..>> OpenBSD BOOT 1.22
Trying bsd...



radeondrm0: RV100
radeon bios header: 55 aa 34 0 0 0 0 0
buffered bios header: 0 0 0 0 0 34 aa 55
BIOS signature incorrect 0 0
[drm] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
[drm] *ERROR* radeon: cp isn't working (-22).
drm:pid0:r100_startup *ERROR* failed initializing CP (-22).
drm:pid0:r100_init *ERROR* Disabling GPU acceleration
[drm] *ERROR* Wait for CP idle timeout, shutting down CP.
Failed to wait GUI idle while programming pipes. Bad things might happen.
radeondrm0: 1280x1024, 8bpp
wsdisplay1 at radeondrm0 mux 1
wsdisplay1: screen 0 added (std, sun emulation)
Bogus possible_clones: [ENCODER:45:TMDS-45] possible_clones=0x6 (full encoder 
mask=0x7)
Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x5 (full encoder 
mask=0x7)
Bogus possible_clones: [ENCODER:48:DAC-48] possible_clones=0x3 (full encoder 
mask=0x7)

Thoughts folks? This is clearly going to impact all big endian + radeon gear.

Actually, I bet that the macppc platform has the same problem too.
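
The two header dumps above are consistent with the copy delivering each
8-byte group of the ROM in reversed byte order. The sketch below only
reproduces that pattern in userland; it is an assumption about the
mechanism, not something the logs prove, and the helper names
swap64_groups and rom_signature_ok are hypothetical (they do not exist in
the driver).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy src to dst, reversing the byte order inside every 8-byte group.
 * This models the assumed effect of the copy; it is not driver code.
 */
void
swap64_groups(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t i, j;

	assert(len % 8 == 0);
	for (i = 0; i < len; i += 8)
		for (j = 0; j < 8; j++)
			dst[i + j] = src[i + 7 - j];
}

/* Same comparison as the driver's check: the ROM must start 0x55 0xaa. */
int
rom_signature_ok(const uint8_t *rom)
{
	return rom[0] == 0x55 && rom[1] == 0xaa;
}
```

Feeding the "radeon bios header" bytes (55 aa 34 0 0 0 0 0) through
swap64_groups yields the "buffered bios header" bytes, and
rom_signature_ok then fails on the copy while passing on the direct read.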


-- 
Ted Bullock 



Re: SunBlade 100: X is very yellow with XVR-100 (radeon r100)

2021-12-09 Thread Ted Bullock
On 2021-12-06 4:21 p.m., Ted Bullock wrote:
> Ok, so this time I plugged a discrete GPU into this UltraSPARC
> system: the Sun XVR-100, which is a PCI card with VGA and DVI ports.  The
> card uses an ATI Radeon R100-generation video chip.

I think that there is a bug triggered by endian code here:

> radeondrm0: RV100
> BIOS signature incorrect 0 0

in sys/dev/pci/drm/radeon/radeon_bios.c:840

if (rdev->bios[0] != 0x55 || rdev->bios[1] != 0xaa) {
	printk("BIOS signature incorrect %x %x\n", rdev->bios[0],
	    rdev->bios[1]);
	goto free_bios;
}

I'm pretty sure that on sparc those bytes aren't going to be reporting
the same information as on a little endian machine. Or am I crazy and
wrong...

At the moment I don't know how to use the debugger to inspect what's
happening here. So stay tuned I suppose while I learn some stuff. In the
meantime I'll throw in some printfs to see what's actually there (after this
slow machine builds a test kernel, which seems to take a while :P).

> [drm] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
> [drm] *ERROR* radeon: cp isn't working (-22).
> drm:pid0:r100_startup *ERROR* failed initializing CP (-22).
> drm:pid0:r100_init *ERROR* Disabling GPU acceleration
> [drm] *ERROR* Wait for CP idle timeout, shutting down CP.
> Failed to wait GUI idle while programming pipes. Bad things might happen.
> radeondrm0: 1280x1024, 8bpp
> wsdisplay1 at radeondrm0 mux 1: console (std, sun emulation), using wskbd0
> Bogus possible_clones: [ENCODER:45:TMDS-45] possible_clones=0x6 (full encoder 
> mask=0x7)
> Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x5 (full encoder 
> mask=0x7)
> Bogus possible_clones: [ENCODER:48:DAC-48] possible_clones=0x3 (full encoder 
> mask=0x7)
> 

^^^ I don't think that any of this information means anything until
talking to the card's BIOS works.

-- 
Ted Bullock 



Re: vte(4): restore MDC clock speed register value after MAC reset

2021-12-09 Thread Andrius V
Hi,

I would like to follow up on the submitted patch: is there any concern
about applying it to the vte(4) driver (as well as adding the new PHY
models to rdcphy(4):
https://marc.info/?l=openbsd-bugs&m=163105121012387&w=2)?
It would help me avoid patching it manually.
Thank you.

Regards,
Andrius V




On Mon, Sep 13, 2021 at 1:42 PM Andrius V wrote:
>
> Hi,
>
> On some Vortex86 SoCs the MDC speed control register needs to be restored
> to its original value after a MAC reset. The issue occurs when the MAC has
> a non-default VTE_MDCSC register value before the reset and the reset
> erroneously sets it back to the default, causing reads of certain PHY
> registers to fail. Since PHY registers determine link status, the link is
> never established (ifconfig media shows "none"). Another obvious
> sign is an incorrect OUI value in dmesg (0x34 instead of the one
> defined in MII_OUI_RDC).
>
> Initially, I found and fixed this in NetBSD, but it affects all BSDs
> and Linux. The patch is already applied in NetBSD
> (http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/pci/if_vte.c.diff?r1=1.31&r2=1.32)
> and in the Linux netdev branch
> (https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=e3f0cc1a945fcefec0c7c9d9dfd028a51daa1846).
> I am sending the same patch for OpenBSD. More info and my debugging
> history can be found in the
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=53494
> thread. I tested the patch on OpenBSD on my Vortex86DX3 (link is
> established, OUI is correct) and DX2 based machines (the patch itself was
> also tested by a few more people on NetBSD/Linux).
>
> This patch is loosely related to my request to add new PHY models, but
> can be applied independently, since vte(4) works with the generic PHY
> driver as well.
>
> ---
> Index: sys/dev/pci/if_vte.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/if_vte.c,v
> retrieving revision 1.24
> diff -u -p -u -p -r1.24 if_vte.c
> --- sys/dev/pci/if_vte.c	10 Jul 2020 13:26:38 -0000	1.24
> +++ sys/dev/pci/if_vte.c	13 Sep 2021 10:22:06 -0000
> @@ -1084,9 +1084,10 @@ vte_tick(void *arg)
>  void
>  vte_reset(struct vte_softc *sc)
>  {
> -	uint16_t mcr;
> +	uint16_t mcr, mdcsc;
>  	int i;
>
> +	mdcsc = CSR_READ_2(sc, VTE_MDCSC);
>  	mcr = CSR_READ_2(sc, VTE_MCR1);
>  	CSR_WRITE_2(sc, VTE_MCR1, mcr | MCR1_MAC_RESET);
>  	for (i = VTE_RESET_TIMEOUT; i > 0; i--) {
> @@ -1105,6 +1106,14 @@ vte_reset(struct vte_softc *sc)
>  	CSR_WRITE_2(sc, VTE_MACSM, 0x0002);
>  	CSR_WRITE_2(sc, VTE_MACSM, 0);
>  	DELAY(5000);
> +
> +	/*
> +	 * On some SoCs (like Vortex86DX3) MDC speed control register value
> +	 * needs to be restored to original value instead of default one,
> +	 * otherwise some PHY registers may fail to be read.
> +	 */
> +	if (mdcsc != MDCSC_DEFAULT)
> +		CSR_WRITE_2(sc, VTE_MDCSC, mdcsc);
>  }
>  }
>
>  int
> 
>
> Regards,
> Andrius V
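
The save-and-restore ordering in the quoted patch can be tried out in
userland with a small simulation. This is only a sketch under stated
assumptions: struct sim_mac, sim_hw_reset, sim_reset_preserving_mdcsc and
the SIM_MDCSC_DEFAULT value are all hypothetical stand-ins, not the real
vte(4) registers or constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MDCSC_DEFAULT 0x0030	/* hypothetical default, for illustration */

/* Hypothetical stand-in for the MAC registers touched by the reset path. */
struct sim_mac {
	uint16_t mcr1;
	uint16_t mdcsc;
};

/* Models the side effect described above: a reset puts MDCSC at default. */
void
sim_hw_reset(struct sim_mac *m)
{
	assert(m != NULL);
	m->mdcsc = SIM_MDCSC_DEFAULT;
}

/* Read MDCSC before the reset, write it back afterwards if non-default. */
void
sim_reset_preserving_mdcsc(struct sim_mac *m)
{
	uint16_t mdcsc;

	assert(m != NULL);
	mdcsc = m->mdcsc;		/* save before the reset clobbers it */
	sim_hw_reset(m);
	if (mdcsc != SIM_MDCSC_DEFAULT)
		m->mdcsc = mdcsc;	/* restore the pre-reset value */
}
```

The key point is the ordering: the value has to be read before the reset
is triggered, because after the reset only the default is left to read.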