Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Will Deacon
On Tue, Nov 06, 2018 at 10:39:47PM +0100, Ard Biesheuvel wrote:
> On 6 November 2018 at 22:34, Will Deacon  wrote:
> > On Tue, Nov 06, 2018 at 12:37:28PM +0100, Ard Biesheuvel wrote:
> >> This series addresses the kexec/kdump crash on arm64 system with many CPUs
> >> that was reported by Bhupesh.
> >>
> >> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
> >> EFI persistent memreserve infrastructure so that fewer memblock 
> >> reservations
> >> are required.
> >
> > I acked the arm64 part and patches 3 and 4 look good afaict, so:
> >
> > Acked-by: Will Deacon 
> >
> > for those as well.
> >
> > The only thing I was wondering is whether or not exhausting the page-sized
> > array in the first list entry is rare enough that we could just realloc the
> > thing and copy instead of chaining together new pages. That said, without
> > seeing the code it's hard to tell whether you save much complexity with such
> > a scheme so I'll leave it up to you.
> >
> 
> Well, the problem is that the page-sized array may have been allocated
> by a previous kernel, and so it may be memblock_reserve()d already, in
> which case reallocating does not actually free up the memory in a
> useful way.
> 
> Also, in the unlikely event that we race and allocate two new pages at
> the same time, the current scheme will not break (and we will
> ultimately fill up all the slots in both pages before allocating even
> more pages). This will become a lot trickier if we do reallocation.

Ah crap, yeah, the concurrency angle makes that really awful. Thanks.

> So if the current approach looks correct to you, then I'd prefer to
> stick with it.
> 
> Do you agree with applying #3 and #4 as fixes?

If Patch 1 alone is sufficient to solve the issue for Bhupesh, then I don't
think we have a need to treat 3 and 4 as fixes (and therefore we can avoid
having to backport them to stable kernels).

Will


Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Ard Biesheuvel
On 6 November 2018 at 22:34, Will Deacon  wrote:
> Hi Ard,
>
> On Tue, Nov 06, 2018 at 12:37:28PM +0100, Ard Biesheuvel wrote:
>> This series addresses the kexec/kdump crash on arm64 system with many CPUs
>> that was reported by Bhupesh.
>>
>> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
>> EFI persistent memreserve infrastructure so that fewer memblock reservations
>> are required.
>
> I acked the arm64 part and patches 3 and 4 look good afaict, so:
>
> Acked-by: Will Deacon 
>
> for those as well.
>
> The only thing I was wondering is whether or not exhausting the page-sized
> array in the first list entry is rare enough that we could just realloc the
> thing and copy instead of chaining together new pages. That said, without
> seeing the code it's hard to tell whether you save much complexity with such
> a scheme so I'll leave it up to you.
>

Well, the problem is that the page-sized array may have been allocated
by a previous kernel, and so it may be memblock_reserve()d already, in
which case reallocating does not actually free up the memory in a
useful way.

Also, in the unlikely event that we race and allocate two new pages at
the same time, the current scheme will not break (and we will
ultimately fill up all the slots in both pages before allocating even
more pages). This will become a lot trickier if we do reallocation.

So if the current approach looks correct to you, then I'd prefer to
stick with it.

Do you agree with applying #3 and #4 as fixes?


Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Will Deacon
Hi Ard,

On Tue, Nov 06, 2018 at 12:37:28PM +0100, Ard Biesheuvel wrote:
> This series addresses the kexec/kdump crash on arm64 system with many CPUs
> that was reported by Bhupesh.
> 
> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
> EFI persistent memreserve infrastructure so that fewer memblock reservations
> are required.

I acked the arm64 part and patches 3 and 4 look good afaict, so:

Acked-by: Will Deacon 

for those as well.

The only thing I was wondering is whether or not exhausting the page-sized
array in the first list entry is rare enough that we could just realloc the
thing and copy instead of chaining together new pages. That said, without
seeing the code it's hard to tell whether you save much complexity with such
a scheme so I'll leave it up to you.

Will


Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Marc Zyngier
On Tue, 06 Nov 2018 19:01:51 +,
Ard Biesheuvel  wrote:
> 
> On 6 November 2018 at 19:27, Marc Zyngier  wrote:
> > Hi Ard,
> >
> > On 06/11/18 11:37, Ard Biesheuvel wrote:
> >> This series addresses the kexec/kdump crash on arm64 system with many CPUs
> >> that was reported by Bhupesh.
> >>
> >> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
> >> EFI persistent memreserve infrastructure so that fewer memblock 
> >> reservations
> >> are required.
> >>
> >> Ard Biesheuvel (4):
> >>   arm64: memblock: don't permit memblock resizing until linear mapping
> >> is up
> >>   efi/arm: defer persistent reservations until after paging_init()
> >>   efi: permit multiple entries in persistent memreserve data structure
> >>   efi: reduce the amount of memblock reservations for persistent
> >> allocations
> >>
> >>  arch/arm/kernel/setup.c |  1 +
> >>  arch/arm64/kernel/setup.c   |  1 +
> >>  arch/arm64/mm/init.c|  2 -
> >>  arch/arm64/mm/mmu.c |  2 +
> >>  drivers/firmware/efi/efi.c  | 59 ++--
> >>  drivers/firmware/efi/libstub/arm-stub.c |  2 +-
> >>  include/linux/efi.h | 23 +++-
> >>  7 files changed, 68 insertions(+), 22 deletions(-)
> >
> > I've just given these patches a go a a TX2 box (one of the 224 CPU
> > ones...), and kexec worked just fine (v4.20-rc1 vanilla didn't manage to
> > kexec on this box).
> >
> > There seem to be some additional userspace patches that are required for
> > the ACPI tables not to be corrupted in the secondary kernel, but that's
> > an orthogonal issue.
> >
> > Feel free to add
> >
> > Tested-by: Marc Zyngier 
> >
> 
> Thanks Marc.
> 
> Any thoughts on whether patches #3 and #4 should be included as fixes?
> I'm leaning towards yes.

They certainly massively reduce the number of list entries, and
probably help reducing the overhead. I'd be inclined to merge them as
well.

Thanks,

M.

-- 
Jazz is not dead, it just smell funny.


Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Ard Biesheuvel
On 6 November 2018 at 19:27, Marc Zyngier  wrote:
> Hi Ard,
>
> On 06/11/18 11:37, Ard Biesheuvel wrote:
>> This series addresses the kexec/kdump crash on arm64 system with many CPUs
>> that was reported by Bhupesh.
>>
>> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
>> EFI persistent memreserve infrastructure so that fewer memblock reservations
>> are required.
>>
>> Ard Biesheuvel (4):
>>   arm64: memblock: don't permit memblock resizing until linear mapping
>> is up
>>   efi/arm: defer persistent reservations until after paging_init()
>>   efi: permit multiple entries in persistent memreserve data structure
>>   efi: reduce the amount of memblock reservations for persistent
>> allocations
>>
>>  arch/arm/kernel/setup.c |  1 +
>>  arch/arm64/kernel/setup.c   |  1 +
>>  arch/arm64/mm/init.c|  2 -
>>  arch/arm64/mm/mmu.c |  2 +
>>  drivers/firmware/efi/efi.c  | 59 ++--
>>  drivers/firmware/efi/libstub/arm-stub.c |  2 +-
>>  include/linux/efi.h | 23 +++-
>>  7 files changed, 68 insertions(+), 22 deletions(-)
>
> I've just given these patches a go a a TX2 box (one of the 224 CPU
> ones...), and kexec worked just fine (v4.20-rc1 vanilla didn't manage to
> kexec on this box).
>
> There seem to be some additional userspace patches that are required for
> the ACPI tables not to be corrupted in the secondary kernel, but that's
> an orthogonal issue.
>
> Feel free to add
>
> Tested-by: Marc Zyngier 
>

Thanks Marc.

Any thoughts on whether patches #3 and #4 should be included as fixes?
I'm leaning towards yes.


Re: [PATCH 0/4] arm/efi: fix memblock reallocation crash due to persistent reservations

2018-11-06 Thread Marc Zyngier
Hi Ard,

On 06/11/18 11:37, Ard Biesheuvel wrote:
> This series addresses the kexec/kdump crash on arm64 system with many CPUs
> that was reported by Bhupesh.
> 
> Patches #1 and #2 fix the actual crash. Patches #3 and #4 optimize the
> EFI persistent memreserve infrastructure so that fewer memblock reservations
> are required.
> 
> Ard Biesheuvel (4):
>   arm64: memblock: don't permit memblock resizing until linear mapping
> is up
>   efi/arm: defer persistent reservations until after paging_init()
>   efi: permit multiple entries in persistent memreserve data structure
>   efi: reduce the amount of memblock reservations for persistent
> allocations
> 
>  arch/arm/kernel/setup.c |  1 +
>  arch/arm64/kernel/setup.c   |  1 +
>  arch/arm64/mm/init.c|  2 -
>  arch/arm64/mm/mmu.c |  2 +
>  drivers/firmware/efi/efi.c  | 59 ++--
>  drivers/firmware/efi/libstub/arm-stub.c |  2 +-
>  include/linux/efi.h | 23 +++-
>  7 files changed, 68 insertions(+), 22 deletions(-)

I've just given these patches a go a a TX2 box (one of the 224 CPU
ones...), and kexec worked just fine (v4.20-rc1 vanilla didn't manage to
kexec on this box).

There seem to be some additional userspace patches that are required for
the ACPI tables not to be corrupted in the secondary kernel, but that's
an orthogonal issue.

Feel free to add

Tested-by: Marc Zyngier 

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...