Re: PS3 early lock-up

2008-08-05 Thread Geert Uytterhoeven
On Tue, 5 Aug 2008, Benjamin Herrenschmidt wrote: Can you find out where that stupid value comes from? I didn't have time to look at it in detail, but it fails from the ioremap call in ps3_map_htab (arch/powerpc/platforms/ps3/htab.c): htab = (__force struct hash_pte

Re: PS3 early lock-up

2008-08-05 Thread Benjamin Herrenschmidt
arch/powerpc/platforms/ps3/htab.c:ps3_hpte_updatepp() uses `htab[slot].v'. Ah, I missed that one. Indeed it -is- used. Ok, that leaves us with 2 options: - Change ps3_hpte_updatepp() to not read from the hash table via that mapping (i.e., do you have an LV1 call to read an HPTE? Do you

Re: PS3 early lock-up

2008-08-05 Thread Benjamin Herrenschmidt
You could do that by adding:

    if (!(pteflags & (_PAGE_USER | _PAGE_RW)))
            rflags |= (1 << 1) | (1 << 63);

Dbl check that the resulting mapping isn't accessible to user space though. Make these 1UL << x, and a proper patch would have to also test that the CPU supports the 3rd
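The suggested check, with the shifts done on a 64-bit unsigned type as the reply asks, might look like the sketch below. It is illustrative only: the function name, the SKETCH_* macros and their numeric values are assumptions made for this example, not taken from the kernel tree under discussion, and the CPU-support test that the reply says a proper patch needs is left out. A plain `1 << 63` shifts a 32-bit int out of range, which is why the constant needs a 64-bit unsigned type.

```c
#include <stdint.h>

/* Assumed, illustrative bit values; check the real kernel headers. */
#define SKETCH_PAGE_USER  UINT64_C(0x002)
#define SKETCH_PAGE_RW    UINT64_C(0x200)

/*
 * Hypothetical helper: if the PTE flags carry neither the user bit nor the
 * read/write bit, set bit 1 and bit 63 in rflags, as in the suggestion.
 * The CPU feature check mentioned in the thread is deliberately omitted.
 */
uint64_t sketch_kernel_ro_rflags(uint64_t pteflags, uint64_t rflags)
{
    if (!(pteflags & (SKETCH_PAGE_USER | SKETCH_PAGE_RW)))
        rflags |= (UINT64_C(1) << 1) | (UINT64_C(1) << 63);
    return rflags;
}
```

On ppc64, where unsigned long is 64 bits wide, `1UL << 63` in kernel code expresses the same constant as `UINT64_C(1) << 63` here.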

Re: PS3 early lock-up

2008-08-05 Thread Geoff Levand
Benjamin Herrenschmidt wrote: arch/powerpc/platforms/ps3/htab.c:ps3_hpte_updatepp() uses `htab[slot].v'. Ok, that leaves us with 2 options: - Change ps3_hpte_updatepp() to not read from the hash table via that mapping (i.e., do you have an LV1 call to read an HPTE? Do you measure any

Re: PS3 early lock-up

2008-08-04 Thread Benjamin Herrenschmidt
On Mon, 2008-08-04 at 17:48 +0200, Geert Uytterhoeven wrote: On PS3, recent kernels lock up in the very early stage (i.e. before mere mortals get to see a working console). The kernel crashes with | kernel BUG at linux/arch/powerpc/platforms/ps3/htab.c:141! Bisecting shows this happens

Re: PS3 early lock-up

2008-08-04 Thread Geoff Levand
Hi, Benjamin Herrenschmidt wrote: ps3_hpte_insert() seems to be called during system initialization with the following values of rflags:
  - first call: 0x190
  - initial memory: 0x194 (455 times)
  - hotplug memory:
      o crash: 0x115
      o OK: 0x117
Do you have an idea of

Re: PS3 early lock-up

2008-08-04 Thread Benjamin Herrenschmidt
Which should be 0x194. That is 0x190. 0x194 = _PAGE_EXEC | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX Right, _PAGE_EXEC should only be set for the part covering the kernel text. In any case, it shouldn't be what you showed. Can you find out where that stupid value comes
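The flag arithmetic in this message can be checked with a few lines of C. This is a sketch under stated assumptions: the SKETCH_* names and their numeric values are chosen for illustration so that the OR matches the 0x194 given above, and PP_RWXX is assumed to be 0. None of them are copied from the kernel headers under discussion.

```c
#include <stdint.h>

/* Assumed, illustrative bit values; verify against the kernel tree in question. */
#define SKETCH_PAGE_EXEC      UINT64_C(0x004)
#define SKETCH_PAGE_COHERENT  UINT64_C(0x010)
#define SKETCH_PAGE_DIRTY     UINT64_C(0x080)
#define SKETCH_PAGE_ACCESSED  UINT64_C(0x100)
#define SKETCH_PP_RWXX        UINT64_C(0x000)

/* Flags named in the message for the kernel-text part of the mapping. */
uint64_t sketch_text_flags(void)
{
    return SKETCH_PAGE_EXEC | SKETCH_PAGE_ACCESSED | SKETCH_PAGE_DIRTY |
           SKETCH_PAGE_COHERENT | SKETCH_PP_RWXX;
}

/* The same flags with the assumed exec bit cleared. */
uint64_t sketch_nonexec_flags(void)
{
    return sketch_text_flags() & ~SKETCH_PAGE_EXEC;
}

/* Bits that differ between two flag words. */
uint64_t sketch_flag_diff(uint64_t a, uint64_t b)
{
    return a ^ b;
}
```

Under these assumed values, 0x194 and 0x190 differ only in the exec bit, and the two hotplug values reported in the thread, 0x115 and 0x117, differ only in bit 1.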