Re: f11 ppc64 woes
On Sun, Jun 07, 2009 at 12:57:06PM +0100, David Woodhouse wrote:
> On Sun, 7 Jun 2009, Josh Boyer wrote:
>
>> On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>>> On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>>>> I blame yaboot...
>>>
>>> Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>>>
>>> We missed the boat to get that in F11 GA, right?
>>
>> Pretty sure. Jesse is getting on a plane today, and we release Tuesday.
>>
>> 0-day update it seems. Though that isn't going to help the installer any.
>> We might have to recommend yum updating or pre-upgrade for the machines this
>> impacted.
>
> Or 'netboot' with the zImage, which can be done from the CD.
>
> Precisely where in the release kernel does the 4MiB corruption happen?

drivers/scsi/scsi_transport_iscsi.c, which is unconditionally hit (via
iscsi_tcp) by anaconda.

___
Fedora-kernel-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/fedora-kernel-list
Re: f11 ppc64 woes
On Sun, 7 Jun 2009, Josh Boyer wrote:

> On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>> On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>>> I blame yaboot...
>>
>> Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>>
>> We missed the boat to get that in F11 GA, right?
>
> Pretty sure. Jesse is getting on a plane today, and we release Tuesday.
>
> 0-day update it seems. Though that isn't going to help the installer any.
> We might have to recommend yum updating or pre-upgrade for the machines this
> impacted.

Or 'netboot' with the zImage, which can be done from the CD.

Precisely where in the release kernel does the 4MiB corruption happen?

-- 
dwmw2
Re: f11 ppc64 woes
On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>> I blame yaboot...
>
>Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>
>We missed the boat to get that in F11 GA, right?

Pretty sure. Jesse is getting on a plane today, and we release Tuesday.

0-day update it seems. Though that isn't going to help the installer any.
We might have to recommend yum updating or pre-upgrade for the machines this
impacted.

josh
Re: f11 ppc64 woes
On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
> I blame yaboot...

Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).

We missed the boat to get that in F11 GA, right?

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]                              Intel Corporation
Re: f11 ppc64 woes
On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
> On Sat, 2009-06-06 at 15:29 +0100, David Woodhouse wrote:
> > You could also try using kexec -- that should help eliminate yaboot
> > bugs too.
>
> Booting with kexec (after rebuilding it because for some reason we're
> shipping a ppc32-capable kexec again, gr)

I needed this too, btw:

--- kexec/arch/ppc64/kexec-elf-rel-ppc64.c.orig	2009-06-06 16:27:10.0 +0100
+++ kexec/arch/ppc64/kexec-elf-rel-ppc64.c	2009-06-06 16:08:37.0 +0100
@@ -88,6 +88,11 @@ void machine_apply_elf_rel(struct mem_eh
 			| (value & 0x03fc);
 		break;
 
+	case R_PPC64_REL32:
+		/* Convert value to relative */
+		*(uint32_t *)location = value - address;
+		break;
+
 	case R_PPC64_ADDR16_LO:
 		*(uint16_t *)location = value & 0x;
 		break;

> I blame yaboot...

I note that yaboot doesn't actually do any relocations when it loads the
relocatable kernel, while kexec does. Should it?

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]                              Intel Corporation
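For readers unfamiliar with the relocation type the patch above adds: a 32-bit PC-relative relocation stores the distance from the patched field to its target rather than the target's absolute address. The following is a minimal standalone sketch of that arithmetic, not kexec-tools code. The function names apply_rel32 and rel32_target are hypothetical, and memcpy is used instead of the pointer cast in the patch only to keep the sketch free of alignment assumptions.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kexec-tools): apply a 32-bit
 * PC-relative relocation.
 *   location - where the 32-bit field lives in our local buffer
 *   address  - the run-time address that field will have once loaded
 *   value    - the target address (symbol value plus addend)
 */
void apply_rel32(void *location, uint64_t address, uint64_t value)
{
    /* Same arithmetic as the patch: value - address, truncated to 32 bits. */
    uint32_t rel = (uint32_t)(value - address);
    memcpy(location, &rel, sizeof rel);
}

/*
 * Hypothetical inverse: recover the target address from a relocated
 * field by sign-extending the stored 32-bit offset.
 */
uint64_t rel32_target(const void *location, uint64_t address)
{
    uint32_t rel;
    memcpy(&rel, location, sizeof rel);
    return address + (uint64_t)(int64_t)(int32_t)rel;
}
```

A field at address 0x1000 that refers to 0x1040 ends up holding 0x40, and a backward reference is stored as a negative offset in two's complement. The sketch does not check that the distance actually fits in 32 bits.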
Re: f11 ppc64 woes
On Sat, 2009-06-06 at 15:29 +0100, David Woodhouse wrote:
> You could also try using kexec -- that should help eliminate yaboot
> bugs too.

Booting with kexec (after rebuilding it because for some reason we're
shipping a ppc32-capable kexec again, gr) shows that the corruption has
gone away:

Instruction dump:
fbbd fbbd0008 48224d75 6000 eb9e8000 7f83e378 48227829 6000
<7fa3eb78> e89c0020 38bc0018 4beecb21 6000 7f83e378 4822715d 6000
3860 383f0090 e8010010 7c0803a6 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8

I had been seeing it before:

fbbd fbbd0008 48224d75 6000 eb9e8000 7f83e378 48227829 6000
<1010> 0008 1013 000f 7961626f 6f74 00101600 0c00
0200 00101100 0810 7c0803a6 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8

I blame yaboot...

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]                              Intel Corporation
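The reason the bad dump points at the bootloader is that the clobbering words contain the bytes 79 61 62 6f 6f 74 (printed as `7961626f 6f74` in this archive), which are plain ASCII. A small sketch of how to check that is below. The helper name dump_to_ascii is hypothetical, and it assumes the bytes are taken in the order they appear in the dump, as on big-endian ppc64.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: render raw dump bytes as text, replacing anything
 * outside printable ASCII with '.', in the style of `hexdump -C`.
 * Returns the number of characters written, not counting the NUL.
 */
size_t dump_to_ascii(const uint8_t *bytes, size_t n, char *out, size_t outsz)
{
    size_t i;

    if (outsz == 0)
        return 0;

    /* Stop one short of outsz so the terminator always fits. */
    for (i = 0; i < n && i + 1 < outsz; i++)
        out[i] = (bytes[i] >= 0x20 && bytes[i] <= 0x7e) ? (char)bytes[i] : '.';

    out[i] = '\0';
    return i;
}
```

Passing the six bytes 0x79, 0x61, 0x62, 0x6f, 0x6f, 0x74 produces the string "yaboot".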
Re: f11 ppc64 woes
On Sat, 2009-06-06 at 09:54 -0400, Josh Boyer wrote:
> ybin isn't needed on the powerstation iirc. Anyway, that is indeed odd.
>
> We should have Tony take a look at this if possible. Or if David can remember
> how to do a netboot directly from OF (and skipping yaboot), that would be a
> good test too.

/usr/sbin/wrapper -o zImage /boot/vmlinuz-2.6.29.4-167.fc11.ppc64 \
	-i /boot/initrd-2.6.29.4-167.fc11.ppc64.img

Give resulting zImage to OpenFirmware. Various versions of OF have
different bugs with that (image size, etc.) but I think the PowerStation
ought to be fine.

You could also try using kexec -- that should help eliminate yaboot
bugs too.

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]                              Intel Corporation
Re: f11 ppc64 woes
On Sat, Jun 06, 2009 at 05:27:49AM -0700, Roland McGrath wrote:
>I reproduced what jwb reported on the powerstation.
>Mine is with F10 + updates userland, only the kernel seems to matter.
>The test case is:
>
>	# modprobe iscsi_tcp
>	Illegal Instruction
>	#
>
>On -16[278], same oops that jwb saw, wrong text appearing at a page boundary.
>
>This kernel:
>http://kojipkgs.fedoraproject.org/scratch/roland/task_1396640/kernel-vanilla-2.6.29.4-168.fc11.ppc64.rpm
>
>does not exhibit the problem. That should be all the same buildroot stuff,
>and 2.6.29.4 with no extra patches.
>
>OTOH, this kernel:
>http://kojipkgs.fedoraproject.org/scratch/roland/task_1396192/kernel-2.6.29.4-167.fc10.ppc64.rpm
>
>also does not exhibit the problem. That is normal -167 with all the same
>patches, but built in dist-f10-updates-candidate buildroots.
>
>But contrary to jwb's reports:
>On my powerstation 2.6.29.3-159.fc11.ppc64 fails to boot:

That's not contrary. We were testing on different machines. I was testing on
an Apple PowerMac7,2 (dual ppc970 G5) which uses sata_svw for storage, not ipr.

>This is obviously a variant of the same problem.

Right.

>It's losing on clobbered instructions at a page boundary.

Yes, seems so.

>Man but these bastards boot slow.

I've noticed that about the powerstation, yes. The G5 boots surprisingly quick
with F11. Go figure.

>Oh, and note the two variant crashes in different kernels are in different
>routines in different builds, but always at PC 0xc040,
>and always clobbered the next few words with:
>	1010 0008 1013 000f
>
>The magic PAGE_OFFSET+4MB effect. So, youse gots to wonder, and...
>
>On 2.6.29.3-142.fc11.ppc64, which has "no problem", I built the appended
>module.
>It printed this:
>
>Instruction dump:
>e809 f8410028 7f83e378 e9690010 7fa5eb78 7c0903a6 e8490008 4e800421
><1010> 0008 1013 000f 7961626f 6f74 00101600 0c00
>	<-- spells "yaboot"
>0400 00101100 0800 7fa3eb78 4bfff24d 6000 3860 383f00b0
>	<-- goes to correct text again from here
>
>The magic 44 bytes of bogon at PAGE_OFFSET+4MB effect.
>We have no idea how long we have been screwed.
>
>I updated to yaboot-1.3.14-12.fc11.ppc (was f10), ran ybin, no help.

ybin isn't needed on the powerstation iirc. Anyway, that is indeed odd.

We should have Tony take a look at this if possible. Or if David can remember
how to do a netboot directly from OF (and skipping yaboot), that would be a
good test too.

josh
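The "PAGE_OFFSET+4MB" address mentioned above can be worked out by hand. The sketch below does that arithmetic, assuming the conventional ppc64 kernel PAGE_OFFSET of 0xc000000000000000 (the PC value as it appears in this archive looks shortened, so the constant here is an assumption taken from the kernel's usual layout, not from the message). The names PPC64_PAGE_OFFSET, kernel_vaddr and at_page_boundary are hypothetical.

```c
#include <stdint.h>

/* Assumed base of the ppc64 kernel linear mapping (not taken from the thread). */
#define PPC64_PAGE_OFFSET 0xc000000000000000ULL

/* Kernel virtual address corresponding to a given offset from PAGE_OFFSET. */
uint64_t kernel_vaddr(uint64_t offset)
{
    return PPC64_PAGE_OFFSET + offset;
}

/* Nonzero if addr is the first byte of a page; page_size must be a power of two. */
int at_page_boundary(uint64_t addr, uint64_t page_size)
{
    return (addr & (page_size - 1)) == 0;
}
```

With an offset of 4MB (4 << 20, or 0x400000) this gives 0xc000000000400000, which sits on a page boundary for both 4K and 64K pages. That matches the description of wrong text appearing at a page boundary, with 44 bytes (eleven 32-bit words) clobbered from that point.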
