Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-29 Thread Alex Ghiti

Le 6/3/20 à 11:36 AM, Alexandre Ghiti a écrit :

This small patchset intends to use PUD/PGDIR entries for linear mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
beginning of the DRAM: this is achieved in patch 1.

But furthermore, at the moment, the firmware (opensbi) explicitly asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for performance
reasons, we must at least make sure that PMD entries can be used: that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that whenever
possible, PUD/PGDIR entries are used.

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xc000-0xc00176e00x8020 5998M PMD D A . 
. . W R V

And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xc000-0xc0014000 0x8000 5G PUD D A 
. . . W R V
0xc0014000-0xc00177000x0001c000 880M PMD D A . 
. . W R V

Alexandre Ghiti (2):
   riscv: Get memory below load_pa while ensuring linear mapping is PMD
 aligned
   riscv: Use PUD/PGDIR entries for linear mapping when possible

  arch/riscv/include/asm/page.h |  8 
  arch/riscv/mm/init.c  | 69 +--
  2 files changed, 65 insertions(+), 12 deletions(-)



The way to handle the remapping of the first 2MB is incorrect: Atish has 
issues while using an initrd because the initrd_start variable is 
defined using __va between setup_vm and setup_vm_final and then its 
value is inconsistent after setup_vm_final since virtual addressing was 
modified with the remapping of the first 2MB.


I will come with another solution to this problem since the way I handle 
it for now is not correct.


Thanks,

Alex


Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-16 Thread Atish Patra
On Sun, Jun 14, 2020 at 10:35 PM Alex Ghiti  wrote:
>
> Hi Atish,
>
> Le 6/12/20 à 1:43 PM, Atish Patra a écrit :
> > On Fri, Jun 12, 2020 at 6:17 AM Alex Ghiti  wrote:
> >> Le 6/12/20 à 8:59 AM, Alex Ghiti a écrit :
> >>> Hi Atish,
> >>>
> >>> Le 6/11/20 à 1:29 PM, Atish Patra a écrit :
>  On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:
> > Hi Atish,
> >
> > Le 6/10/20 à 2:32 PM, Atish Patra a écrit :
> >> On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:
> >>> This small patchset intends to use PUD/PGDIR entries for linear
> >>> mapping
> >>> in order to better utilize TLB.
> >>>
> >>> At the moment, only PMD entries can be used since on common platforms
> >>> (qemu/unleashed), the kernel is loaded at DRAM + 2MB which
> >>> dealigns virtual
> >>> and physical addresses and then prevents the use of PUD/PGDIR
> >>> entries.
> >>> So the kernel must be able to get those 2MB for PAGE_OFFSET to map
> >>> the
> >>> beginning of the DRAM: this is achieved in patch 1.
> >>>
> >> I don't have in depth knowledge of how mm code works so this question
> >> may be a completely
> >> stupid one :). Just for my understanding,
> >> As per my understanding, kernel will map those 2MB of memory but
> >> never use it.
> >> How does the kernel ensure that it doesn't allocate any memory from
> >> those 2MB
> >> memory if it is not marked as reserved?
> > Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
> > stage to mark this region
> > as reserved if there is something there (like opensbi). Otherwise, the
> > kernel will indeed try to
> > allocate memory from there :)
> >
>  In that case, this patch mandates that the firmware region has to be
>  mark "reserved"
>  the device tree so that the Linux kernel doesn't try to allocate
>  memory from there.
>  OpenSBI is already doing it from v0.7. Thus, any user using latest
>  OpenSBI can leverage
>  this patch for a better TLB utilization.
> >>>
> >>> Note that *currently* OpenSBI v0.7 still adds the "no-map" property
> >>> which prevents such optimization.
> >>>
> > Thanks for the clarification. When I said latest, I meant including
> > your patch in the mailing list.
> >
>  However, legacy previous boot stages(BBL) do not reserve this area via
>  DT which may
>  result in an unexpected crash. I am not sure how many developers still
>  use BBL though.
> 
>  Few general suggestions to tackle this problem:
>  1. This mandatory requirement should be added to the booting document
>  so that any other
>  SBI implementation is also aware of it.
>  2. You may have to move the patch1 to a separate config so that any
>  users of legacy boot stages
>  can disable this feature.
> >>>
> >>> IMHO, the region occupied by runtime services should be marked as
> >>> reserved in the device-tree. So it seems redundant to add this as a
> >>> requirement, I would rather consider its absence as a bug.
> >>>
> > I agree. I was just suggesting to document this bug :).
>
> Oh ok then, we meant the same thing :)
> >
> >>> Even if I understand that this might break some system, I don't like
> >>> the idea of a new config to support old "buggy" bootloaders: when will
> >>> we be able to remove it ? We'll never know when people will stop using
> >>> those bootloaders, so it will stay here forever...Where can I find the
> > Personally, I am fine with that. However, there were few concerns in the 
> > past.
> > I am leaving it to Palmer to decide.
> >
> > @Palmer Dabbelt : Any thoughts ?
> >
> >>> boot document you are talking about ? Can we simply state here that
> >>> this kernel version will not be compatible with those bootloaders
> >>> (we'll draw an exhaustive list here) ?
> > Yes.
> >
> >> Ok, I have just found Documentation/riscv/boot-image-header.rst: could
> >> we imagine doing something like incrementing the version and use that as
> >> a hint in the kernel not to map the 2MB offset ? That's still legacy,
> >> but at least it does not require to recompile a kernel as the check
> >> would be done at runtime.
> >>
> > I was suggesting to add a risc-v specific booting document and
> > document this "bug".
> > Documentation/riscv/boot-image-header.rst can be linked from that document 
> > or
> > the boot hader content can be included in that. No changes in code is 
> > necessary.
> >
> > Eventually, this booting document will also include other additional
> > booting constraints for RISC-V
> > such as minimum extension required to boot Linux, csr state upon
> > entering S-mode, mmu state.
>
>
> Ok I will prepare a boot document that links to the existing documents and
> add all of that, I will need you for the last constraints that I don't
> know about.
>

Thanks. I will send patches on top of your boot document patch.
This was long due. Thanks for coming up with the initial

Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-14 Thread Alex Ghiti

Hi Atish,

Le 6/12/20 à 1:43 PM, Atish Patra a écrit :

On Fri, Jun 12, 2020 at 6:17 AM Alex Ghiti  wrote:

Le 6/12/20 à 8:59 AM, Alex Ghiti a écrit :

Hi Atish,

Le 6/11/20 à 1:29 PM, Atish Patra a écrit :

On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:

Hi Atish,

Le 6/10/20 à 2:32 PM, Atish Patra a écrit :

On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:

This small patchset intends to use PUD/PGDIR entries for linear
mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which
dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR
entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map
the
beginning of the DRAM: this is achieved in patch 1.


I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but
never use it.
How does the kernel ensure that it doesn't allocate any memory from
those 2MB
memory if it is not marked as reserved?

Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
stage to mark this region
as reserved if there is something there (like opensbi). Otherwise, the
kernel will indeed try to
allocate memory from there :)


In that case, this patch mandates that the firmware region has to be
mark "reserved"
the device tree so that the Linux kernel doesn't try to allocate
memory from there.
OpenSBI is already doing it from v0.7. Thus, any user using latest
OpenSBI can leverage
this patch for a better TLB utilization.


Note that *currently* OpenSBI v0.7 still adds the "no-map" property
which prevents such optimization.


Thanks for the clarification. When I said latest, I meant including
your patch in the mailing list.


However, legacy previous boot stages(BBL) do not reserve this area via
DT which may
result in an unexpected crash. I am not sure how many developers still
use BBL though.

Few general suggestions to tackle this problem:
1. This mandatory requirement should be added to the booting document
so that any other
SBI implementation is also aware of it.
2. You may have to move the patch1 to a separate config so that any
users of legacy boot stages
can disable this feature.


IMHO, the region occupied by runtime services should be marked as
reserved in the device-tree. So it seems redundant to add this as a
requirement, I would rather consider its absence as a bug.


I agree. I was just suggesting to document this bug :).


Oh ok then, we meant the same thing :)



Even if I understand that this might break some system, I don't like
the idea of a new config to support old "buggy" bootloaders: when will
we be able to remove it ? We'll never know when people will stop using
those bootloaders, so it will stay here forever...Where can I find the

Personally, I am fine with that. However, there were few concerns in the past.
I am leaving it to Palmer to decide.

@Palmer Dabbelt : Any thoughts ?


boot document you are talking about ? Can we simply state here that
this kernel version will not be compatible with those bootloaders
(we'll draw an exhaustive list here) ?

Yes.


Ok, I have just found Documentation/riscv/boot-image-header.rst: could
we imagine doing something like incrementing the version and use that as
a hint in the kernel not to map the 2MB offset ? That's still legacy,
but at least it does not require to recompile a kernel as the check
would be done at runtime.


I was suggesting to add a risc-v specific booting document and
document this "bug".
Documentation/riscv/boot-image-header.rst can be linked from that document or
the boot hader content can be included in that. No changes in code is necessary.

Eventually, this booting document will also include other additional
booting constraints for RISC-V
such as minimum extension required to boot Linux, csr state upon
entering S-mode, mmu state.



Ok I will prepare a boot document that links to the existing documents and
add all of that, I will need you for the last constraints that I don't 
know about.


Thanks Atish,

Alex


Alex



Alex



But furthermore, at the moment, the firmware (opensbi) explicitly
asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers
some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for
performance
reasons, we must at least make sure that PMD entries can be used:
that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that
whenever
possible, PUD/

Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-12 Thread Atish Patra
On Fri, Jun 12, 2020 at 6:17 AM Alex Ghiti  wrote:
>
> Le 6/12/20 à 8:59 AM, Alex Ghiti a écrit :
> > Hi Atish,
> >
> > Le 6/11/20 à 1:29 PM, Atish Patra a écrit :
> >> On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:
> >>> Hi Atish,
> >>>
> >>> Le 6/10/20 à 2:32 PM, Atish Patra a écrit :
>  On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:
> > This small patchset intends to use PUD/PGDIR entries for linear
> > mapping
> > in order to better utilize TLB.
> >
> > At the moment, only PMD entries can be used since on common platforms
> > (qemu/unleashed), the kernel is loaded at DRAM + 2MB which
> > dealigns virtual
> > and physical addresses and then prevents the use of PUD/PGDIR
> > entries.
> > So the kernel must be able to get those 2MB for PAGE_OFFSET to map
> > the
> > beginning of the DRAM: this is achieved in patch 1.
> >
>  I don't have in depth knowledge of how mm code works so this question
>  may be a completely
>  stupid one :). Just for my understanding,
>  As per my understanding, kernel will map those 2MB of memory but
>  never use it.
>  How does the kernel ensure that it doesn't allocate any memory from
>  those 2MB
>  memory if it is not marked as reserved?
> >>> Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
> >>> stage to mark this region
> >>> as reserved if there is something there (like opensbi). Otherwise, the
> >>> kernel will indeed try to
> >>> allocate memory from there :)
> >>>
> >> In that case, this patch mandates that the firmware region has to be
> >> mark "reserved"
> >> the device tree so that the Linux kernel doesn't try to allocate
> >> memory from there.
> >> OpenSBI is already doing it from v0.7. Thus, any user using latest
> >> OpenSBI can leverage
> >> this patch for a better TLB utilization.
> >
> >
> > Note that *currently* OpenSBI v0.7 still adds the "no-map" property
> > which prevents such optimization.
> >

Thanks for the clarification. When I said latest, I meant including
your patch in the mailing list.

> >> However, legacy previous boot stages(BBL) do not reserve this area via
> >> DT which may
> >> result in an unexpected crash. I am not sure how many developers still
> >> use BBL though.
> >>
> >> Few general suggestions to tackle this problem:
> >> 1. This mandatory requirement should be added to the booting document
> >> so that any other
> >> SBI implementation is also aware of it.
> >> 2. You may have to move the patch1 to a separate config so that any
> >> users of legacy boot stages
> >> can disable this feature.
> >
> >
> > IMHO, the region occupied by runtime services should be marked as
> > reserved in the device-tree. So it seems redundant to add this as a
> > requirement, I would rather consider its absence as a bug.
> >

I agree. I was just suggesting to document this bug :).

> > Even if I understand that this might break some system, I don't like
> > the idea of a new config to support old "buggy" bootloaders: when will
> > we be able to remove it ? We'll never know when people will stop using
> > those bootloaders, so it will stay here forever...Where can I find the

Personally, I am fine with that. However, there were few concerns in the past.
I am leaving it to Palmer to decide.

@Palmer Dabbelt : Any thoughts ?

> > boot document you are talking about ? Can we simply state here that
> > this kernel version will not be compatible with those bootloaders
> > (we'll draw an exhaustive list here) ?
>

Yes.

>
> Ok, I have just found Documentation/riscv/boot-image-header.rst: could
> we imagine doing something like incrementing the version and use that as
> a hint in the kernel not to map the 2MB offset ? That's still legacy,
> but at least it does not require to recompile a kernel as the check
> would be done at runtime.
>

I was suggesting to add a risc-v specific booting document and
document this "bug".
Documentation/riscv/boot-image-header.rst can be linked from that document or
the boot hader content can be included in that. No changes in code is necessary.

Eventually, this booting document will also include other additional
booting constraints for RISC-V
such as minimum extension required to boot Linux, csr state upon
entering S-mode, mmu state.
>
> >
> > Alex
> >
> >
> >>> Alex
> >>>
> >>>
> > But furthermore, at the moment, the firmware (opensbi) explicitly
> > asks the
> > kernel not to map the region it occupies, which is on those common
> > platforms at the very beginning of the DRAM and then it also dealigns
> > virtual and physical addresses. I proposed a patch here:
> >
> > https://github.com/riscv/opensbi/pull/167
> >
> > that removes this 'constraint' but *not* all the time as it offers
> > some
> > kind of protection in case PMP is not available. So sometimes, we may
> > have a part of the memory below the kernel that is removed creating a
> > misalignment b

Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-12 Thread Alex Ghiti

Le 6/12/20 à 8:59 AM, Alex Ghiti a écrit :

Hi Atish,

Le 6/11/20 à 1:29 PM, Atish Patra a écrit :

On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:

Hi Atish,

Le 6/10/20 à 2:32 PM, Atish Patra a écrit :

On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:
This small patchset intends to use PUD/PGDIR entries for linear 
mapping

in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which 
dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR 
entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map 
the

beginning of the DRAM: this is achieved in patch 1.


I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but 
never use it.
How does the kernel ensure that it doesn't allocate any memory from 
those 2MB

memory if it is not marked as reserved?

Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
stage to mark this region
as reserved if there is something there (like opensbi). Otherwise, the
kernel will indeed try to
allocate memory from there :)


In that case, this patch mandates that the firmware region has to be
mark "reserved"
the device tree so that the Linux kernel doesn't try to allocate
memory from there.
OpenSBI is already doing it from v0.7. Thus, any user using latest
OpenSBI can leverage
this patch for a better TLB utilization.



Note that *currently* OpenSBI v0.7 still adds the "no-map" property 
which prevents such optimization.



However, legacy previous boot stages(BBL) do not reserve this area via
DT which may
result in an unexpected crash. I am not sure how many developers still
use BBL though.

Few general suggestions to tackle this problem:
1. This mandatory requirement should be added to the booting document
so that any other
SBI implementation is also aware of it.
2. You may have to move the patch1 to a separate config so that any
users of legacy boot stages
can disable this feature.



IMHO, the region occupied by runtime services should be marked as 
reserved in the device-tree. So it seems redundant to add this as a 
requirement, I would rather consider its absence as a bug.


Even if I understand that this might break some system, I don't like 
the idea of a new config to support old "buggy" bootloaders: when will 
we be able to remove it ? We'll never know when people will stop using 
those bootloaders, so it will stay here forever...Where can I find the 
boot document you are talking about ? Can we simply state here that 
this kernel version will not be compatible with those bootloaders 
(we'll draw an exhaustive list here) ?



Ok, I have just found Documentation/riscv/boot-image-header.rst: could 
we imagine doing something like incrementing the version and use that as 
a hint in the kernel not to map the 2MB offset ? That's still legacy, 
but at least it does not require to recompile a kernel as the check 
would be done at runtime.





Alex



Alex


But furthermore, at the moment, the firmware (opensbi) explicitly 
asks the

kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers 
some

kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for 
performance
reasons, we must at least make sure that PMD entries can be used: 
that

is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that 
whenever

possible, PUD/PGDIR entries are used.

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xc000-0xc00176e0 0x8020 5998M 
PMD D A . . . W R V


And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xc000-0xc0014000 0x8000 
5G PUD D A . . . W R V
0xc0014000-0xc0017700 0x0001c000 880M 
PMD D A . . . W R V


Alexandre Ghiti (2):
    riscv: Get memory below load_pa while ensuring linear mapping 
is PMD

  aligned
    riscv: Use PUD/PGDIR entries for linear mapping when possible

   arch/riscv/include/asm/page.h |  8 
   arch/riscv/mm/init.c  | 69 
+--

   2 files changed, 65 insertions(+), 12 deletions(-)

--
2.20.1








Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-12 Thread Alex Ghiti

Hi Atish,

Le 6/11/20 à 1:29 PM, Atish Patra a écrit :

On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:

Hi Atish,

Le 6/10/20 à 2:32 PM, Atish Patra a écrit :

On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:

This small patchset intends to use PUD/PGDIR entries for linear mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
beginning of the DRAM: this is achieved in patch 1.


I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but never use it.
How does the kernel ensure that it doesn't allocate any memory from those 2MB
memory if it is not marked as reserved?

Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
stage to mark this region
as reserved if there is something there (like opensbi). Otherwise, the
kernel will indeed try to
allocate memory from there :)


In that case, this patch mandates that the firmware region has to be
mark "reserved"
the device tree so that the Linux kernel doesn't try to allocate
memory from there.
OpenSBI is already doing it from v0.7. Thus, any user using latest
OpenSBI can leverage
this patch for a better TLB utilization.



Note that *currently* OpenSBI v0.7 still adds the "no-map" property 
which prevents such optimization.



However, legacy previous boot stages(BBL) do not reserve this area via
DT which may
result in an unexpected crash. I am not sure how many developers still
use BBL though.

Few general suggestions to tackle this problem:
1. This mandatory requirement should be added to the booting document
so that any other
SBI implementation is also aware of it.
2. You may have to move the patch1 to a separate config so that any
users of legacy boot stages
can disable this feature.



IMHO, the region occupied by runtime services should be marked as 
reserved in the device-tree. So it seems redundant to add this as a 
requirement, I would rather consider its absence as a bug.


Even if I understand that this might break some system, I don't like the 
idea of a new config to support old "buggy" bootloaders: when will we be 
able to remove it ? We'll never know when people will stop using those 
bootloaders, so it will stay here forever...Where can I find the boot 
document you are talking about ? Can we simply state here that this 
kernel version will not be compatible with those bootloaders (we'll draw 
an exhaustive list here) ?


Alex



Alex



But furthermore, at the moment, the firmware (opensbi) explicitly asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for performance
reasons, we must at least make sure that PMD entries can be used: that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that whenever
possible, PUD/PGDIR entries are used.

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xc000-0xc00176e00x8020 5998M PMD D A . 
. . W R V

And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xc000-0xc0014000 0x8000 5G PUD D A 
. . . W R V
0xc0014000-0xc00177000x0001c000 880M PMD D A . 
. . W R V

Alexandre Ghiti (2):
riscv: Get memory below load_pa while ensuring linear mapping is PMD
  aligned
riscv: Use PUD/PGDIR entries for linear mapping when possible

   arch/riscv/include/asm/page.h |  8 
   arch/riscv/mm/init.c  | 69 +--
   2 files changed, 65 insertions(+), 12 deletions(-)

--
2.20.1






Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-11 Thread Atish Patra
On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti  wrote:
>
> Hi Atish,
>
> Le 6/10/20 à 2:32 PM, Atish Patra a écrit :
> > On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:
> >> This small patchset intends to use PUD/PGDIR entries for linear mapping
> >> in order to better utilize TLB.
> >>
> >> At the moment, only PMD entries can be used since on common platforms
> >> (qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
> >> and physical addresses and then prevents the use of PUD/PGDIR entries.
> >> So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
> >> beginning of the DRAM: this is achieved in patch 1.
> >>
> > I don't have in depth knowledge of how mm code works so this question
> > may be a completely
> > stupid one :). Just for my understanding,
> > As per my understanding, kernel will map those 2MB of memory but never use 
> > it.
> > How does the kernel ensure that it doesn't allocate any memory from those 
> > 2MB
> > memory if it is not marked as reserved?
>
> Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
> stage to mark this region
> as reserved if there is something there (like opensbi). Otherwise, the
> kernel will indeed try to
> allocate memory from there :)
>
In that case, this patch mandates that the firmware region has to be
mark "reserved"
the device tree so that the Linux kernel doesn't try to allocate
memory from there.
OpenSBI is already doing it from v0.7. Thus, any user using latest
OpenSBI can leverage
this patch for a better TLB utilization.
However, legacy previous boot stages(BBL) do not reserve this area via
DT which may
result in an unexpected crash. I am not sure how many developers still
use BBL though.

Few general suggestions to tackle this problem:
1. This mandatory requirement should be added to the booting document
so that any other
SBI implementation is also aware of it.
2. You may have to move the patch1 to a separate config so that any
users of legacy boot stages
can disable this feature.

> Alex
>
>
> >> But furthermore, at the moment, the firmware (opensbi) explicitly asks the
> >> kernel not to map the region it occupies, which is on those common
> >> platforms at the very beginning of the DRAM and then it also dealigns
> >> virtual and physical addresses. I proposed a patch here:
> >>
> >> https://github.com/riscv/opensbi/pull/167
> >>
> >> that removes this 'constraint' but *not* all the time as it offers some
> >> kind of protection in case PMP is not available. So sometimes, we may
> >> have a part of the memory below the kernel that is removed creating a
> >> misalignment between virtual and physical addresses. So for performance
> >> reasons, we must at least make sure that PMD entries can be used: that
> >> is guaranteed by patch 1 too.
> >>
> >> Finally the second patch simply improves best_map_size so that whenever
> >> possible, PUD/PGDIR entries are used.
> >>
> >> Below is the kernel page table without this patch on a 6G platform:
> >>
> >> ---[ Linear mapping ]---
> >> 0xc000-0xc00176e00x8020 5998M PMD 
> >> D A . . . W R V
> >>
> >> And with this patchset + opensbi patch:
> >>
> >> ---[ Linear mapping ]---
> >> 0xc000-0xc0014000 0x8000 5G PUD
> >>  D A . . . W R V
> >> 0xc0014000-0xc00177000x0001c000 880M PMD D 
> >> A . . . W R V
> >>
> >> Alexandre Ghiti (2):
> >>riscv: Get memory below load_pa while ensuring linear mapping is PMD
> >>  aligned
> >>riscv: Use PUD/PGDIR entries for linear mapping when possible
> >>
> >>   arch/riscv/include/asm/page.h |  8 
> >>   arch/riscv/mm/init.c  | 69 +--
> >>   2 files changed, 65 insertions(+), 12 deletions(-)
> >>
> >> --
> >> 2.20.1
> >>
> >>
> >



-- 
Regards,
Atish


Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-10 Thread Alex Ghiti

Hi Atish,

Le 6/10/20 à 2:32 PM, Atish Patra a écrit :

On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:

This small patchset intends to use PUD/PGDIR entries for linear mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
beginning of the DRAM: this is achieved in patch 1.


I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but never use it.
How does the kernel ensure that it doesn't allocate any memory from those 2MB
memory if it is not marked as reserved?


Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot 
stage to mark this region
as reserved if there is something there (like opensbi). Otherwise, the 
kernel will indeed try to

allocate memory from there :)

Alex



But furthermore, at the moment, the firmware (opensbi) explicitly asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for performance
reasons, we must at least make sure that PMD entries can be used: that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that whenever
possible, PUD/PGDIR entries are used.

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xc000-0xc00176e00x8020 5998M PMD D A . 
. . W R V

And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xc000-0xc0014000 0x8000 5G PUD D A 
. . . W R V
0xc0014000-0xc00177000x0001c000 880M PMD D A . 
. . W R V

Alexandre Ghiti (2):
   riscv: Get memory below load_pa while ensuring linear mapping is PMD
 aligned
   riscv: Use PUD/PGDIR entries for linear mapping when possible

  arch/riscv/include/asm/page.h |  8 
  arch/riscv/mm/init.c  | 69 +--
  2 files changed, 65 insertions(+), 12 deletions(-)

--
2.20.1






Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-10 Thread Atish Patra
On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti  wrote:
>
> This small patchset intends to use PUD/PGDIR entries for linear mapping
> in order to better utilize TLB.
>
> At the moment, only PMD entries can be used since on common platforms
> (qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
> and physical addresses and then prevents the use of PUD/PGDIR entries.
> So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
> beginning of the DRAM: this is achieved in patch 1.
>

I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but never use it.
How does the kernel ensure that it doesn't allocate any memory from those 2MB
memory if it is not marked as reserved?

> But furthermore, at the moment, the firmware (opensbi) explicitly asks the
> kernel not to map the region it occupies, which is on those common
> platforms at the very beginning of the DRAM and then it also dealigns
> virtual and physical addresses. I proposed a patch here:
>
> https://github.com/riscv/opensbi/pull/167
>
> that removes this 'constraint' but *not* all the time as it offers some
> kind of protection in case PMP is not available. So sometimes, we may
> have a part of the memory below the kernel that is removed creating a
> misalignment between virtual and physical addresses. So for performance
> reasons, we must at least make sure that PMD entries can be used: that
> is guaranteed by patch 1 too.
>
> Finally the second patch simply improves best_map_size so that whenever
> possible, PUD/PGDIR entries are used.
>
> Below is the kernel page table without this patch on a 6G platform:
>
> ---[ Linear mapping ]---
> 0xc000-0xc00176e00x8020 5998M PMD D A 
> . . . W R V
>
> And with this patchset + opensbi patch:
>
> ---[ Linear mapping ]---
> 0xc000-0xc0014000 0x8000 5G PUD D 
> A . . . W R V
> 0xc0014000-0xc00177000x0001c000 880M PMD D A 
> . . . W R V
>
> Alexandre Ghiti (2):
>   riscv: Get memory below load_pa while ensuring linear mapping is PMD
> aligned
>   riscv: Use PUD/PGDIR entries for linear mapping when possible
>
>  arch/riscv/include/asm/page.h |  8 
>  arch/riscv/mm/init.c  | 69 +--
>  2 files changed, 65 insertions(+), 12 deletions(-)
>
> --
> 2.20.1
>
>


-- 
Regards,
Atish


[PATCH 0/2] PUD/PGDIR entries for linear mapping

2020-06-03 Thread Alexandre Ghiti
This small patchset intends to use PUD/PGDIR entries for linear mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
beginning of the DRAM: this is achieved in patch 1.

But furthermore, at the moment, the firmware (opensbi) explicitly asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for performance
reasons, we must at least make sure that PMD entries can be used: that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that whenever
possible, PUD/PGDIR entries are used. 

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xc000-0xc00176e00x8020 5998M PMD D A . 
. . W R V 

And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xc000-0xc0014000 0x8000 5G PUD D A 
. . . W R V
0xc0014000-0xc00177000x0001c000 880M PMD D A . 
. . W R V

Alexandre Ghiti (2):
  riscv: Get memory below load_pa while ensuring linear mapping is PMD
aligned
  riscv: Use PUD/PGDIR entries for linear mapping when possible

 arch/riscv/include/asm/page.h |  8 
 arch/riscv/mm/init.c  | 69 +--
 2 files changed, 65 insertions(+), 12 deletions(-)

-- 
2.20.1