I've recently acquired a Sparc T3-1, and installed Debian Unstable's Sparc port 
on it, as a guest in an Oracle VM Server for Sparc ("ldm") VM.

I ran into a few issues, which I've cataloged in the story below. But it has a
happy ending!

Kernel sunvdc module
====================
Installation wasn't 100% straightforward, as the "sunvdc" virtual disk driver, 
at least as used in kernel 3.16.7-ckt9-3, which was what was in the d-i image I 
downloaded from http://d-i.debian.org/daily-images/sparc/ at the time, seems to 
be basically 100% broken. As soon as the installer got to the partitioner, the 
whole VM would hang. I see that there have been a lot of commits to that driver 
from Oracle people in the last few months, so I hope they're working on fixing 
it. Dunno. 

I also never tried installing on "bare metal", since I wanted to keep Solaris;
random forum posts lead me to believe that does work out of the box. (I didn't
realize, going in, how hard I was making things for myself...)

So long story short on that, I ended up doing an NFS root install instead, 
since the sunvnet network driver worked fine. It would be real nice if 
debian-installer had the ability to install to NFS readily available; I had to 
go extract the nfs modules manually from the normal kernel package, and then 
run debootstrap manually. (But -- I'm sure happy that Debian's initramfs has
built-in support for NFS root!)

klibc-utils
===========
The next problem I found is that klibc-utils' ipconfig program gets a Bus Error
when trying to get itself a DHCP address. I believe that DHCP client is only
ever used in the initramfs, and only if you want to do an NFS root; the other
DHCP clients, e.g. the one in debian-installer, had worked fine. So, I told it
to use a static IP instead, which worked. (I'm sure the bug is just an obvious
misaligned memory access; I can look into that later.)
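
For anyone unfamiliar with that class of bug, here is a minimal sketch of what
I mean. It is NOT taken from klibc; the helper names load_u32_unaligned and
read_be32 are made up for illustration, and I haven't confirmed that ipconfig
fails in exactly this way. The idea is that casting a byte pointer at an odd
offset to a wider integer type and dereferencing it traps on Sparc, while
copying the bytes out does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not klibc code): fetch four bytes from any offset
 * in a packet buffer. memcpy makes no alignment assumption, so this is
 * safe on strict-alignment CPUs such as Sparc. The value is returned in
 * host byte order. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: decode a big-endian (network order) 32-bit field
 * one byte at a time. Works at any alignment and on any host endianness. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The kind of access that can raise SIGBUS on Sparc when buf + 1 is not
 * 4-byte aligned (left as a comment on purpose):
 *
 *     uint32_t v = *(const uint32_t *)(buf + 1);
 */
```

On x86 the commented-out form usually appears to work, which is how this sort
of thing tends to go unnoticed until it runs on a strict-alignment machine.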

glibc
=====
After that, everything seemed to be going fine, except that programs like GCC 
would randomly segfault and give parse errors. This has been reported before, 
e.g. http://thread.gmane.org/gmane.linux.ports.sparc/16835, from 2 years ago. 
Things were stable enough to use interactively, if you're willing to keep 
retrying a build until it works, but not stable enough to use for any autobuild 
system.

After getting a hint from Aurelien that disabling the optimized memcpy routines
in glibc (eglibc 2.19-1, on Wed, 04 Jun 2014 20:32:06 +0200) had improved, but
not fixed, the problem, I started looking into that....

...And found that recompiling glibc, disabling the sparcv9 optimizations (that 
is: eliminating debian/patches/sparc/local-sparcv9-target.diff), *appears* to 
have completely fixed the stability issue!

To try to verify that, I ran a loop building and rebuilding 'clang' (with full
"ninja" parallelism) overnight, and it had zero crashes across all 14 builds of
clang that it got through. Prior to fixing glibc, at least one of the ~2300
build steps (gcc/as/ld) was sure to crash unreproducibly.
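
If someone wants to repeat that kind of test on their own machine, the loop
can be sketched roughly as below. This is not the script I used, just an
illustration; run_once and count_failures are names I made up for it. It runs
the same command repeatedly and counts every run that exits non-zero or is
killed by a signal. Since the inputs never change, any failure at all points
at instability rather than at the thing being built.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Result of one run: sig is non-zero if the child was killed by a signal
 * (e.g. SIGSEGV or SIGBUS); otherwise code holds its exit status.
 * code stays -1 if the run could not be started or was signalled. */
struct run_result {
    int sig;
    int code;
};

/* Run argv[0] with the given arguments once and wait for it to finish. */
struct run_result run_once(char *const argv[])
{
    struct run_result r = { 0, -1 };
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return r;                  /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return r;
    if (WIFSIGNALED(status))
        r.sig = WTERMSIG(status);
    else if (WIFEXITED(status))
        r.code = WEXITSTATUS(status);
    return r;
}

/* Run the command `runs` times and return how many runs failed, either
 * by being killed by a signal or by exiting with a non-zero status. */
int count_failures(char *const argv[], int runs)
{
    int failures = 0;

    for (int i = 0; i < runs; i++) {
        struct run_result r = run_once(argv);
        if (r.sig != 0) {
            failures++;
            fprintf(stderr, "run %d: killed by signal %d\n", i + 1, r.sig);
        } else if (r.code != 0) {
            failures++;
            fprintf(stderr, "run %d: exit status %d\n", i + 1, r.code);
        }
    }
    return failures;
}
```

Note that when a compiler pass crashes underneath a driver or a build tool,
the top-level command normally just reports a non-zero exit status, which is
why the sketch counts those as failures too instead of looking only for
signals.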

It'd be great if someone wants to try to figure out exactly /which/ of the asm
routines in the various sysdeps/**/sparc32/sparcv9 directories are broken, to
narrow down the problem better, too. I highly suspect there's just something
wrong in one or more of the hand-written asm files, but it's certainly possible
there's some wider problem that the sparcv9 optimizations of glibc (but nothing
else I've seen so far) just happen to expose.

GCC
===

Oh, and I'll mention one more bug I ran into, which is not sparc-specific, but 
does affect building some C++11 software on Sparc:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65945

The workaround for that is usually to just compile at an optimization level 
greater than -O0, as the problematic construct typically only occurs in inline 
templates forwarding their arguments onto another function, which all just 
disappear at high opt levels.

Conclusion
==========

It seems like the one change to glibc is probably a good-enough fix to get the 
Sparc port back to a position of stability.

And I hope this can help avoid Sparc needing to be deleted from Debian...

It seems to really *not* be in as bad a shape as one might be led to believe. 
E.g. I'm not sure what "lack of proper kernel support" means (from Joerg's 
https://lists.debian.org/debian-devel/2015/04/msg00284.html). The kernel 
appears to be working fine. I ran into some bugs, but besides the one glibc 
issue, none really seem fatal to the health of the port in Debian.

James
