Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread John Paul Adrian Glaubitz
On 02/18/2018 10:40 PM, Frank Scheiner wrote:
>> , it's just that
>> the tool you are using is apparently producing unaligned accesses.
> 
> What you emphasized is from the AlphaServer DS25, do you assume that 
> unaligned accesses lead to bus errors on sparc64?

To be honest, you wrote so much text that I didn't notice you were
talking about an Alpha machine here.

If you're talking about a command crashing, it would be enough to just
show the reproducer command line and then someone can have a look with
GDB to find the place where the unaligned access happens.

So, please try to keep messages shorter in such cases so that we don't
have to search for the important information in a wall of text.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread Frank Scheiner

On 02/18/2018 10:48 PM, James Clarke wrote:

> Having said that if there are unaligned
> accesses reported on alpha then those are likely to also happen on sparc64 and
> be the critters we're looking for.


Why do these only trap to software on alpha and not on sparc64? Is that 
due to a compiler option?
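
As far as I understand, the difference is in the kernel rather than in the compiler: Linux on alpha emulates an unaligned user-space access and logs it (hence the "unaligned trap" lines), while on sparc64 the process receives SIGBUS, which the shell reports as "Bus error". A common source of such accesses in network code is casting a byte pointer into a packet buffer to a wider type. The following is only a minimal sketch of that pattern and of two alignment-safe alternatives; `load_u32` and `read_be32` are hypothetical helpers, not code from klibc, and nothing in this thread confirms where in `ipconfig` the access actually happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers for illustration only (not taken from klibc).
 *
 * The risky pattern would be:  uint32_t v = *(const uint32_t *)(buf + off);
 * If buf + off is not a multiple of 4, that load is unaligned.
 */

/* Copy 4 bytes in host byte order; memcpy makes no alignment assumption. */
static uint32_t load_u32(const unsigned char *buf, size_t len, size_t off)
{
    uint32_t v;

    assert(off <= len && len - off >= sizeof(v)); /* stay inside the buffer */
    memcpy(&v, buf + off, sizeof(v));
    return v;
}

/* Assemble a big-endian (network order) value one byte at a time. */
static uint32_t read_be32(const unsigned char *buf, size_t len, size_t off)
{
    assert(off <= len && len - off >= 4);         /* stay inside the buffer */
    return ((uint32_t)buf[off] << 24) |
           ((uint32_t)buf[off + 1] << 16) |
           ((uint32_t)buf[off + 2] << 8) |
            (uint32_t)buf[off + 3];
}
```

With a 5-byte buffer holding 00 12 34 56 78, `read_be32(buf, 5, 1)` yields 0x12345678 even though offset 1 is odd.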


Cheers,
Frank



Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread James Clarke
On 18 Feb 2018, at 21:34, John Paul Adrian Glaubitz wrote:
> On 02/18/2018 10:10 PM, John Paul Adrian Glaubitz wrote:
>> See my emphasis above. It's not that the network stack is broken, it's just that
>> the tool you are using is apparently producing unaligned accesses.
>> 
>> Did you try running systemd-networkd?
> 
> As a quick shot into the dark, I have triggered a binNMU of src:klibc, maybe
> it was broken by binutils which recently suffered from issues on sparc64.

Which will (or should) be rejected by mini-dak, as my 2.0.4-11+sparc64.1 is in
unreleased (#885852 hasn't been fixed yet or even acknowledged by anyone). I'd
be surprised if it's the binutils bug, but I guess you could do a rebuild
locally and upload a .2 to unreleased. Having said that if there are unaligned
accesses reported on alpha then those are likely to also happen on sparc64 and
be the critters we're looking for.

James



Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread Frank Scheiner

On 02/18/2018 10:10 PM, John Paul Adrian Glaubitz wrote:

>> IP-Config: no response after 3 s[    7.950191] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2
>> [    8.058589] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2
> ^^
>> (...)
>> Any idea what could be wrong with `ipconfig` or how I can further debug this?
>
> See my emphasis above. It's not that the network stack is broken

Yeah, I should have been more specific in the subject, it's `ipconfig`
or some dependency of it that looks broken.

> , it's just that
> the tool you are using is apparently producing unaligned accesses.

What you emphasized is from the AlphaServer DS25, do you assume that
unaligned accesses lead to bus errors on sparc64?

> Did you try running systemd-networkd?

No, but I also don't know how to include this in a Debian initramfs.

Cheers,
Frank



Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread John Paul Adrian Glaubitz
On 02/18/2018 10:10 PM, John Paul Adrian Glaubitz wrote:
> See my emphasis above. It's not that the network stack is broken, it's just that
> the tool you are using is apparently producing unaligned accesses.
> 
> Did you try running systemd-networkd?

As a quick shot into the dark, I have triggered a binNMU of src:klibc, maybe
it was broken by binutils which recently suffered from issues on sparc64.

Adrian




Re: IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread John Paul Adrian Glaubitz
On 02/18/2018 09:55 PM, Frank Scheiner wrote:
> I'm currently working on creating a useful configuration for network boot 
> with GRUB2 on sparc64. But I'm experiencing problems during IP 
> auto-configuration.
> This is done by using the `ip=[...]` kernel command line option and the 
> `ipconfig` tool ([1]) included in the initramfs which evaluates this option.
> (...)
> Hence I currently do all needed configuration on sparc64 "manually" by
> providing all required addresses in the `ip=[...]` option which at least
> works but of course duplicates configuration. Other architectures (ppc64,
> alpha, hppa) don't have this problem AFAICS.
> (...)
> IP-Config: no response after 3 s[    7.950191] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2
> [    8.058589] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2
^^
> (...)
> Any idea what could be wrong with `ipconfig` or how I can further debug this?

See my emphasis above. It's not that the network stack is broken, it's just that
the tool you are using is apparently producing unaligned accesses.

Did you try running systemd-networkd?

Adrian




IP auto-config with DHCP on sparc64 possibly broken

2018-02-18 Thread Frank Scheiner

Hi,

I'm currently working on creating a useful configuration for network 
boot with GRUB2 on sparc64. But I'm experiencing problems during IP 
auto-configuration. This is done by using the `ip=[...]` kernel command 
line option and the `ipconfig` tool ([1]) included in the initramfs 
which evaluates this option.


[1]: https://git.kernel.org/pub/scm/libs/klibc/klibc.git/tree/usr/kinit/ipconfig/README.ipconfig


Whenever I try to use DHCP I see bus errors during operation:

```
[...]
Begin: Running /scripts/init-premount .[   36.957993] RPC: Registered 
named UNIX socket transport module.

.. do[   37.033724] RPC: Registered udp transport module.
[   37.094913] RPC: Registered tcp transport module.

Beg[   37.155124] RPC: Registered tcp NFSv4.1 backchannel transport module.
in: Mounting root file system ... Begin: Running /scr[   37.286986] 
FS-Cache: Netfs 'nfs' registered for caching

ipts/nfs-top ... done.
Begin: Running /scripts/nfs-premount ... done.
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56 mtu 1500 DHCP
IP-Config: no response after 2 secs - giving up
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56 mtu 1500 DHCP
Bus error
[...]
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56 mtu 1500 DHCP
Bus error
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56[   41.042475] 
enp1s1f1: Link is up using

 mtu[   41.042483] internal
 150[   41.092568] transceiver at
0 DH[   41.123683] 100Mb/s, Full Duplex.
CP
Bus error
[...]
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56 mtu 1500 DHCP
Bus error
/init: .: line 230: can't open '/run/net-enp1s1f1.conf'
[   41.534502] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0200

[...]
```

These are from an Ultra 10, but I also saw the same on a T5220 and a T5240.

The same happens when running `ipconfig` manually from the initramfs:
```
[...]
No init found. Try passing init= bootarg.


BusyBox v1.27.2 (Debian 1:1.27.2-2) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) ipconfig :enp1s1f1:dhcp
IP-Config: enp1s1f1 hardware address 08:00:20:12:34:56 mtu 1500 DHCP
Bus error
```

Hence I currently do all needed configuration on sparc64 "manually" by 
providing all required addresses in the `ip=[...]` option which at least 
works but of course duplicates configuration. Other architectures 
(ppc64, alpha, hppa) don't have this problem AFAICS.


e.g. on DS25:
```
[...]
Begin: Running /scripts/init-p[3.334959] tg3 0002:02:05.0: firmware: 
direct-loading firmware tigon/tg3_tso.bin

remount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/nfs-top ... 
done.
Begin: Runnin[3.439451] IPv6: ADDRCONF(NETDEV_UP): enP2p2s5: link is 
not ready

g /scripts/nfs-premount ... done.
IP-Config: enP2p2s5 hardware address 00:16:35:12:34:56 mtu 1500 DHCP
[3.803708] random: crng init done
IP-Config: no response after 2 secs - giving up
IP-Config: enP2p2s5 hardware address 00:16:35:12:34:56 mtu 1500 DHCP
[7.153316] tg3 0002:02:05.0 enP2p2s5: Link is up at 1000 Mbps, full 
duplex
[7.154293] tg3 0002:02:05.0 enP2p2s5: Flow control is on for TX and 
on for RX

[7.155269] IPv6: ADDRCONF(NETDEV_CHANGE): enP2p2s5: link becomes ready
IP-Config: no response after 3 s[7.950191] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2
[8.058589] ipconfig(196): unaligned trap at 000120003868: 00011ffcf3af 28 2


IP-Config: enP2p2s5 hardware address 00:16:35:12:34:56 mtu 1500 DHCP
IP-Config: enP2p2s5 guessed broadcast address 172.16.255.255
IP-Config: enP2p2s5 complete (dhcp from 172.16.0.1):
 address: 172.16.2.89  broadcast: 172.16.255.255   netmask: 255.255.0.0
 gateway: 172.16.0.1   dns0 : 172.16.0.1   dns1   : 0.0.0.0
 host   : ds25
 domain : domain.tld
 rootserver: 172.16.0.9 rootpath: /path/to/ds25
 filename  : /AC100259
done.
[...]
```
...everything works as expected.
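
The trap lines in the DS25 log can be sanity-checked by hand. Assuming the fields after "unaligned trap at" are the program counter, the data address, the opcode and the register number (an assumption about the alpha kernel's message format, not something stated in this thread), and assuming opcode 28 denotes a 4-byte load, the address 00011ffcf3af should not be a multiple of 4. A small sketch, with `misalignment` as a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes by which addr lies past the
 * previous multiple of size. size must be a power of two.
 */
static unsigned misalignment(uint64_t addr, unsigned size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two only */
    return (unsigned)(addr & (uint64_t)(size - 1));
}
```

For the logged data address, `misalignment(0x00011ffcf3af, 4)` gives 3, so under the assumptions above the load starts three bytes past a 4-byte boundary. The program counter 000120003868 gives 0, as expected for an instruction address.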

As there were some problems with klibc on sparc64 at the end of 2017, I
assume that there might still be something wrong there which breaks IP
auto-configuration with DHCP on sparc64.


I only tried a limited selection of machines, but since the Ultra 10 and
the T5240 were introduced years apart, use different NICs, and still show
the same issue, it's likely that this affects most or all sparc64
machines.


You can reproduce this by reconfiguring and rebuilding your initramfs
(add `BOOT=nfs` to `/etc/initramfs-tools/initramfs.conf` and rebuild)
and adding `ip=::dhcp` to your kernel command line.
The latter should work regardless of whether you use SILO or GRUB2, and
also when the kernel is loaded from disk.


Before rebooting, make sure you still have your original initramfs and a
boot loader configuration which boots with the original initramfs.


A running DHCP server is required. I use the ISC DHCP server and a host 
configuration similar to the following:


```
host ultra-10 {
hardware ethernet 08:00:20:12:34:56;
fixed-address