Hi all,

At work I am in the process of deploying an array of 4 cubietrucks for use in the Xen Project automated test framework.
Two of the four boards seem to work just fine in (repeated) pre-commissioning tests, but the other two are failing fairly reliably. One fails with this message from ssh:

    Timeout, server not responding.

and the other with networking issues (DNS timeouts etc.) during initial installation (Debian installer).

These boards are now in a colo, but previously, when they were on my desk, the same two boards both failed with the ssh Timeout error and the other two were OK, so the problem boards do seem to persist over changes of infrastructure etc.

ssh is used to log in to the boxes and drive the test case from a controller machine. In this case the failing test is a build job which is building Xen or a Linux kernel etc. All build jobs run natively under Debian (running things under Xen would follow, but it never gets past the build jobs due to this issue).

The kernel in use is 3.16.7-ckt2 (Debian revision 1~bpo70+1). The boards are all using u-boot v2014.10.

I can't see anything in the logs and the ifconfig stats show no errors. Apart from the hiccough, networking seems fine (i.e. subsequent ssh commands do work). ssh is using the "-o BatchMode=yes -o ConnectTimeout=100 -o ServerAliveInterval=100" options and make is invoked with -j4, which doesn't seem too aggressive.

In the case of the failing installation, /sys/class/net/eth0/statistics/*{dropped,errors} are all 0 and there is nothing in dmesg or the logs. TBH this one might be a cabling or infrastructure issue, but I'm reasonably confident that the ssh one is not.

Since two of the boards are OK and two are not, I suppose something somewhere must be marginal. I'm not really sure where to start looking. Perhaps CONFIG_GMAC_TX_DELAY on the u-boot side might be relevant? I've had a look through the logs from v3.16 to master for drivers/net/ethernet/stmicro/stmmac and nothing leaps out as being a relevant backport.

Or perhaps it isn't networking related at all, e.g. perhaps AHCI is stalling and stopping sshd from responding to pings.
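
For anyone wanting to repeat the counter check on their own boards, here is a minimal Python sketch of the /sys/class/net/eth0/statistics/*{dropped,errors} check mentioned above. The helper names (read_counters, nonzero) are made up for illustration, and it assumes the usual sysfs layout of one integer per file; it is not part of the test framework.

```python
from pathlib import Path
import sys


def read_counters(stats_dir, suffixes=("dropped", "errors")):
    """Return {file name: integer value} for counters ending in one of `suffixes`.

    Mirrors the shell glob *{dropped,errors}. Hypothetical helper, assuming
    each statistics file holds a single integer.
    """
    counters = {}
    for entry in sorted(Path(stats_dir).iterdir()):
        if entry.is_file() and entry.name.endswith(tuple(suffixes)):
            try:
                counters[entry.name] = int(entry.read_text().strip())
            except (OSError, ValueError):
                # Skip unreadable or non-numeric files rather than abort.
                continue
    return counters


def nonzero(counters):
    """Keep only the counters that have moved off zero."""
    return {name: value for name, value in counters.items() if value != 0}


if __name__ == "__main__":
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth0"
    stats_dir = Path("/sys/class/net") / iface / "statistics"
    if not stats_dir.is_dir():
        print(f"{stats_dir} not found")
    else:
        bad = nonzero(read_counters(stats_dir))
        if bad:
            for name, value in bad.items():
                print(f"{iface} {name} = {value}")
        else:
            print(f"{iface}: all dropped/errors counters are 0")
```

Running it before and after a failing build job and comparing the output would show whether any counter moves during the stall; all zeros both times matches what is described above.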
There's nothing in the logs to indicate one way or another.

Any bright ideas on where to look / what to try would be gratefully received.

Ian.