Re: f11 ppc64 woes

2009-06-07 Thread Kyle McMartin
On Sun, Jun 07, 2009 at 12:57:06PM +0100, David Woodhouse wrote:
> On Sun, 7 Jun 2009, Josh Boyer wrote:
>
>> On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>>> On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>>>> I blame yaboot...
>>>
>>> Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>>>
>>> We missed the boat to get that in F11 GA, right?
>>
>> Pretty sure.  Jesse is getting on a plane today, and we release Tuesday.
>>
>> 0-day update it seems.  Though that isn't going to help the installer any.
>> We might have to recommend yum updating or pre-upgrade for the machines this
>> impacted.
>
> Or 'netboot' with the zImage, which can be done from the CD.
>
> Precisely where in the release kernel does the 4MiB corruption happen?
>

drivers/scsi/scsi_transport_iscsi.c, which is unconditionally hit
(via iscsi_tcp) by anaconda.

___
Fedora-kernel-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/fedora-kernel-list


Re: f11 ppc64 woes

2009-06-07 Thread David Woodhouse

On Sun, 7 Jun 2009, Josh Boyer wrote:

> On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>> On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>>> I blame yaboot...
>>
>> Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>>
>> We missed the boat to get that in F11 GA, right?
>
> Pretty sure.  Jesse is getting on a plane today, and we release Tuesday.
>
> 0-day update it seems.  Though that isn't going to help the installer any.
> We might have to recommend yum updating or pre-upgrade for the machines this
> impacted.

Or 'netboot' with the zImage, which can be done from the CD.

Precisely where in the release kernel does the 4MiB corruption happen?

--
dwmw2



Re: f11 ppc64 woes

2009-06-07 Thread Josh Boyer
On Sun, Jun 07, 2009 at 09:28:25AM +0100, David Woodhouse wrote:
>On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
>> I blame yaboot...
>
>Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).
>
>We missed the boat to get that in F11 GA, right?

Pretty sure.  Jesse is getting on a plane today, and we release Tuesday.

0-day update it seems.  Though that isn't going to help the installer any.
We might have to recommend yum updating or pre-upgrade for the machines this
impacted.

josh



Re: f11 ppc64 woes

2009-06-07 Thread David Woodhouse
On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
> I blame yaboot...

Fixed in yaboot-1.3.14-13 (thanks to benh for pointing out the problem).

We missed the boat to get that in F11 GA, right?

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]  Intel Corporation



Re: f11 ppc64 woes

2009-06-06 Thread David Woodhouse
On Sat, 2009-06-06 at 16:13 +0100, David Woodhouse wrote:
> On Sat, 2009-06-06 at 15:29 +0100, David Woodhouse wrote:
> > You could also try using kexec -- that should help eliminate yaboot
> > bugs too.
> 
> Booting with kexec (after rebuilding it because for some reason we're
> shipping a ppc32-capable kexec again, gr)

I needed this too, btw:

--- kexec/arch/ppc64/kexec-elf-rel-ppc64.c.orig 2009-06-06 16:27:10.0 +0100
+++ kexec/arch/ppc64/kexec-elf-rel-ppc64.c  2009-06-06 16:08:37.0 +0100
@@ -88,6 +88,11 @@ void machine_apply_elf_rel(struct mem_eh
 					| (value & 0x03fffffc);
 		break;
 
+	case R_PPC64_REL32:
+		/* Convert value to relative */
+		*(uint32_t *)location = value - address;
+		break;
+
 	case R_PPC64_ADDR16_LO:
 		*(uint16_t *)location = value & 0xffff;
 		break;

> I blame yaboot...

I note that yaboot doesn't actually do any relocations when it loads the
relocatable kernel, while kexec does. Should it?

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]  Intel Corporation
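
For readers who don't have the ppc64 relocation types in their head, the two cases touched by the patch in this message can be sketched as follows. This is a standalone illustration, not the kexec-tools code: `sketch_apply_rel` and the `sketch_rel` enum are hypothetical names, and the enum values are placeholders, not the numeric relocation types from the ELF ABI. Like the patch, it stores in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two relocation kinds in the patch;
 * these are NOT the numeric values from the ELF ABI. */
enum sketch_rel { SKETCH_REL32, SKETCH_ADDR16_LO };

/*
 * Patch one relocation site.
 *   location: host pointer to the bytes being patched
 *   address:  address the site will have once loaded
 *   value:    address of the symbol being referenced
 */
static void sketch_apply_rel(enum sketch_rel type, void *location,
			     uint64_t address, uint64_t value)
{
	uint32_t w32;
	uint16_t w16;

	switch (type) {
	case SKETCH_REL32:
		/* Distance from the site to the target, truncated to 32 bits */
		w32 = (uint32_t)(value - address);
		memcpy(location, &w32, sizeof(w32));
		break;
	case SKETCH_ADDR16_LO:
		/* Low 16 bits of the absolute target address */
		w16 = (uint16_t)(value & 0xffff);
		memcpy(location, &w16, sizeof(w16));
		break;
	}
}
```

A backward reference comes out as a two's-complement value, so a target 0x10 bytes before the site is stored as 0xfffffff0.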



Re: f11 ppc64 woes

2009-06-06 Thread David Woodhouse
On Sat, 2009-06-06 at 15:29 +0100, David Woodhouse wrote:
> You could also try using kexec -- that should help eliminate yaboot
> bugs too.

Booting with kexec (after rebuilding it because for some reason we're
shipping a ppc32-capable kexec again, gr) shows that the corruption has
gone away:

Instruction dump:
fbbd fbbd0008 48224d75 6000 eb9e8000 7f83e378 48227829 6000 
<7fa3eb78> e89c0020 38bc0018 4beecb21 6000 7f83e378 4822715d 6000 
3860 383f0090 e8010010 7c0803a6 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 

I had been seeing it before:
fbbd fbbd0008 48224d75 6000 eb9e8000 7f83e378 48227829 6000 
<1010> 0008 1013 000f 7961626f 6f74 00101600 0c00 
0200 00101100 0810 7c0803a6 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 

I blame yaboot...

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]  Intel Corporation



Re: f11 ppc64 woes

2009-06-06 Thread David Woodhouse
On Sat, 2009-06-06 at 09:54 -0400, Josh Boyer wrote:
> ybin isn't needed on the powerstation iirc.  Anyway, that is indeed odd.
> 
> We should have Tony take a look at this if possible.  Or if David can remember
> how to do a netboot directly from OF (and skipping yaboot), that would be a
> good test too.

/usr/sbin/wrapper -o zImage /boot/vmlinuz-2.6.29.4-167.fc11.ppc64 \
 -i /boot/initrd-2.6.29.4-167.fc11.ppc64.img 

Give resulting zImage to OpenFirmware.

Various versions of OF have different bugs with that (image size, etc.)
but I think the PowerStation ought to be fine.

You could also try using kexec -- that should help eliminate yaboot bugs
too.

-- 
David Woodhouse                            Open Source Technology Centre
[email protected]  Intel Corporation



Re: f11 ppc64 woes

2009-06-06 Thread Josh Boyer
On Sat, Jun 06, 2009 at 05:27:49AM -0700, Roland McGrath wrote:
>I reproduced what jwb reported on the powerstation.
>Mine is with F10 + updates userland, only the kernel seems to matter.
>The test case is:
>
>   # modprobe iscsi_tcp
>   Illegal Instruction
>   #
>
>On -16[278], same oops that jwb saw, wrong text appearing at a page boundary.
>
>This kernel:
>http://kojipkgs.fedoraproject.org/scratch/roland/task_1396640/kernel-vanilla-2.6.29.4-168.fc11.ppc64.rpm
>
>does not exhibit the problem.  That should be all the same buildroot stuff,
>and 2.6.29.4 with no extra patches.
>
>OTOH, this kernel:
>http://kojipkgs.fedoraproject.org/scratch/roland/task_1396192/kernel-2.6.29.4-167.fc10.ppc64.rpm
>
>also does not exhibit the problem.  That is normal -167 with all the same
>patches, but built in dist-f10-updates-candidate buildroots.
>
>But contrary to jwb's reports:
>On my powerstation 2.6.29.3-159.fc11.ppc64 fails to boot:

That's not contrary.  We were testing on different machines.  I was testing on
an Apple PowerMac7,2 (dual ppc970 G5) which uses sata_svw for storage, not ipr.

>This is obviously a variant of the same problem.  

Right.

>It's losing on clobbered instructions at a page boundary.

Yes, seems so.

>Man but these bastards boot slow.

I've noticed that about the powerstation, yes.  The G5 boots surprisingly
quick with F11.  Go figure.


>Oh, and note the two variant crashes in different kernels are in different
>routines in different builds, but always at PC 0xc000000000400000,
>and always clobbered the next few words with:
>   1010 0008 1013 000f 
>
>The magic PAGE_OFFSET+4MB effect.  So, youse gots to wonder, and...
>
>On 2.6.29.3-142.fc11.ppc64, which has "no problem", I built the appended 
>module.
>It printed this:
>
>Instruction dump:
>e809 f8410028 7f83e378 e9690010 7fa5eb78 7c0903a6 e8490008 4e800421 
><1010> 0008 1013 000f 7961626f 6f74 00101600 0c00 
>   <-- spells "yaboot"
>0400 00101100 0800 7fa3eb78 4bfff24d 6000 3860 383f00b0 
>   <-- goes to correct text again from here
>
>The magic 44 bytes of bogon at PAGE_OFFSET+4MB effect.
>We have no idea how long we have been screwed.
>
>I updated to yaboot-1.3.14-12.fc11.ppc (was f10), ran ybin, no help.

ybin isn't needed on the powerstation iirc.  Anyway, that is indeed odd.

We should have Tony take a look at this if possible.  Or if David can remember
how to do a netboot directly from OF (and skipping yaboot), that would be a
good test too.

josh
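
Two of Roland's observations quoted above are easy to check by hand: where PAGE_OFFSET+4MB lands, and why the clobbering words read as "yaboot". A rough sketch, with assumptions stated: it takes PAGE_OFFSET to be the ppc64 linear-map base 0xc000000000000000, and it assumes the word shown as "6f74" in the dump is the 32-bit value 0x6f740000 (the archive copy is truncated, so that padding is a guess). The names `SKETCH_*` and `sketch_words_to_ascii` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ppc64 kernel linear-map base, plus the 4MB offset from the thread. */
#define SKETCH_PAGE_OFFSET  0xc000000000000000ULL
#define SKETCH_CLOBBER_ADDR (SKETCH_PAGE_OFFSET + (4ULL << 20))

/*
 * Render 32-bit words as text, most significant byte first (big-endian,
 * as the instruction dump prints them). Non-printable bytes become '.'.
 * out must have room for 4 * n + 1 bytes.
 */
static void sketch_words_to_ascii(const uint32_t *words, size_t n, char *out)
{
	size_t i;
	int shift;

	for (i = 0; i < n; i++) {
		for (shift = 24; shift >= 0; shift -= 8) {
			unsigned char c = (words[i] >> shift) & 0xff;
			*out++ = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
		}
	}
	*out = '\0';
}
```

Feeding it the two words 0x7961626f and 0x6f740000 gives the bytes 'y' 'a' 'b' 'o' 'o' 't' followed by two NULs, and SKETCH_CLOBBER_ADDR works out to 0xc000000000400000.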
