> On Thu, Jul 30, 2009 at 10:50 AM, Andreas Barth<[email protected]> wrote:
> > You know your porters mailing list best, but I want to highlight some of
> > the issues:
> > http://lists.debian.org/debian-hppa/2009/07/msg00002.html
>
> I can't comment on this issue. I hope Dave can?

Over the past few weeks, I have been testing 2.6.30.y on three different
platforms (c3750, rp3440 and A500-7X). I have run identical 32 and 64-bit
kernels on the c3750. To the base system, I have applied a collected set
of patches. Except for the typo change recently posted to the parisc
linux list, all the changes are now in 2.6.31.

With the exception of nscd, I have had no segfault problems with 2.6.30.y
on the c3750. However, the same is not true for the rp3440 and A500-7X.
The rp3440 is worse than the A500-7X, but on both machines application
segfaults occurred very quickly when running SMP kernels and building GCC
(usually in our old friend the dynamic loader).

The A500-7X (gsyprf11) is now back running a modified SMP version of
2.6.19.22. The last change was the U bit fix. It has now run eight days
without any obvious segfaults. 2.6.19.22 with the above changes is not
segfault free on the rp3440. However, it is better than any other SMP
build on this processor.

I am currently running a UP build of 2.6.30.3 on the rp3440. It is not
segfault free, but I can usually get through a GCC build without a fault.
So, even with a UP kernel, we still get cache corruption on this machine.
I wonder if it is possible to turn L2 off. I had hoped that the U bit fix
would help. However, its effect is not dramatic.

When rebooting the rp3440, it would sometimes report memory errors in the
system hardware log. Similarly, the display attached to the VisEG on the
c3750 would sometimes get noisy. Resetting the display mode at boot would
cure this. Another effect was for CPUs to mysteriously get disabled. I
suspect that the kernel was sometimes accidentally writing to the control
memory for these devices. These problems may be fixed or reduced with the
U bit fix.

In summary, the segfault problem is still there and a major issue,
particularly with SMP kernels. Without a testcase that consistently
triggers the problem, it's almost impossible to debug what's going wrong.

glob2 built for me, so the build failure was probably caused by cache
corruption.

Dave
--
J. David Anglin                                  [email protected]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

