Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling

2015-10-13 Thread Suzuki K. Poulose

On 13/10/15 16:39, Christoffer Dall wrote:

On Mon, Oct 12, 2015 at 10:55:24AM +0100, Suzuki K. Poulose wrote:

On 10/10/15 15:52, Christoffer Dall wrote:

Hi Suzuki,


Hi Christoffer,

Thanks for being patient enough to review the code :-) without much
documentation in place. I now realise it needs far more documentation than
what I have put in already. I am taking care of this in the next
revision.


I had to refresh my mind a fair bit to be able to review this, so I
thought it may be useful to just remind us all what the constraints of
this whole thing are, and make sure we agree on them:

1. We fix the IPA max width to 40 bits
2. We don't support systems with a PARange smaller than 40 bits (do we
check this anywhere or document this anywhere?)


AFAICT, no, we don't check it anywhere. Maybe we should. We could plug this
into my CPU feature infrastructure[1] and let is_hyp_mode_available()
use the info to decide whether we can support a 40bit IPA?



If we support 40bit IPA or more, yes, I think that would be sane.  Or at
least put a comment somewhere, perhaps in Documentation.


OK


3. We always assume we are running on a system with a PARange of 40 bits
and we are therefore constrained to use concatenation.

As an implication of (3) above, this code will attempt to allocate 256K
of physically contiguous memory for each VM on the system.  That is
probably ok, but I just wanted to point it out in case it raises any
eyebrows for other people following this thread.


Right, I will document this in a comment.


level:   0       1         2         3
bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
         ^      ^         ^
         |      |         |
host entry      |         x---- stage-2 entry
                |
        IPA ----x


Isn't the stage-2 entry using bits [39:25], because you resolve
more than 11 bits on the initial level of lookup when you concatenate
tables?


Yes, the stage-2 entry is just supposed to show the entry level (2).



I don't understand, the stage-2 entry level will be at bit 39, not 35?



That picture shows 'level 2', the level at which the stage-2 translation
begins. With 16 tables concatenated, that level resolves bits 39-25. The
host kernel macros normally only see up to bit 35, which is fixed up using
kvm_pgd_index() to pick the right PGD entry for a VA.

Thanks
Suzuki

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling

2015-10-12 Thread Suzuki K. Poulose

On 10/10/15 15:52, Christoffer Dall wrote:

Hi Suzuki,


Hi Christoffer,

Thanks for being patient enough to review the code :-) without much
documentation in place. I now realise it needs far more documentation than
what I have put in already. I am taking care of this in the next
revision.


I had to refresh my mind a fair bit to be able to review this, so I
thought it may be useful to just remind us all what the constraints of
this whole thing are, and make sure we agree on them:

1. We fix the IPA max width to 40 bits
2. We don't support systems with a PARange smaller than 40 bits (do we
check this anywhere or document this anywhere?)


AFAICT, no, we don't check it anywhere. Maybe we should. We could plug this
into my CPU feature infrastructure[1] and let is_hyp_mode_available()
use the info to decide whether we can support a 40bit IPA?


3. We always assume we are running on a system with a PARange of 40 bits
and we are therefore constrained to use concatenation.

As an implication of (3) above, this code will attempt to allocate 256K
of physically contiguous memory for each VM on the system.  That is
probably ok, but I just wanted to point it out in case it raises any
eyebrows for other people following this thread.


Right, I will document this in a comment.


level:   0       1         2         3
bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
         ^      ^         ^
         |      |         |
host entry      |         x---- stage-2 entry
                |
        IPA ----x


Isn't the stage-2 entry using bits [39:25], because you resolve
more than 11 bits on the initial level of lookup when you concatenate
tables?


Yes, the stage-2 entry is just supposed to show the entry level (2).



The following conditions hold true for all cases(with 40bit IPA)
1) The stage-2 entry level <= 2
2) Number of fake page-table entries is in the inclusive range [0, 2].


nit: Number of fake levels of page tables


Correct, I have fixed it already.



+/*
+ * At stage-2 entry level, upto 16 tables can be concatenated and


nit: Can you rewrite the first part of this comment to be in line with
the ARM ARM, such as: "The stage-2 page tables can concatenate up to 16
tables at the inital level"  ?


Yes, will do it.





+ * the hardware expects us to use concatenation, whenever possible.


I think the 'hardware expects us' is a bit vague.  At least I find this
whole part of the architecture incredibly confusing already, so it would
help me in the future if we put something like:

"The hardware requires that we use concatenation depending on the
supported PARange and page size.  We always assume the hardware's PASize
is maximum 40 bits in this context, and with a fixed IPA width of 40
bits, we concatenate 2 tables for 4K pages, 16 tables for 16K pages, and
do not use concatenation for 64K pages."

Did I get this right?


You are right. The rule is simple: up to 16 tables can be concatenated at
the stage-2 entry level.




+ * So, number of page table levels for KVM_PHYS_SHIFT is always
+ * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
+ */
+#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)


I see the math lines up, but I don't think it's intuitive, as I don't
understand why it's obvious that it's the 'normal' page table for
KVM_PHYS_SHIFT - 4.


Because we can concatenate up to 16 page tables at the entry level. With the
current set of page sizes the above 'magic' formula works out. But yes, the
following suggestion makes more sense.



I see this as an architectural limitation given in the ARM ARM, and we
should just refer to that, and do:

#if PAGE_SHIFT == 12
#define S2_PGTABLE_LEVELS   3
#else
#define S2_PGTABLE_LEVELS   2
#endif


OK, we could do that.




+/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
+#define HYP_PGTABLE_SHIFT	ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
+#define HYP_PGDIR_SHIFT	ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
+#define HYP_PGTABLE_ENTRY_LEVEL	(4 - HYP_PGTABLE_LEVELS)


We are introducing a huge number of defines here, which are all more or
less opaque to anyone coming back to this code.

I may be extraordinarily stupid, but I really need each define explained
in a comment to be able to follow this code (those above and the
S2_ENTRY_TABLES below).


No, you're right. I need to document all of the above properly, which is
something I am in the middle of doing.



I actually wonder from looking at this whole patch if we even want to go
here.  Maybe this is really the time to say that we should get rid of
the dependency between the host page table layout and the stage-2 page
table layout.

Since the rest of this series looks pretty good, I'm wondering if you
should just disable KVM in the config system if 16K pages is selected,
and then you can move ahead with this series while we fix KVM properly?


I can send an updated version (which is in the test furnace) soon, so that
you can take 

Re: [PATCH 03/15] arm64: Introduce helpers for page table levels

2015-10-09 Thread Suzuki K. Poulose

On 08/10/15 18:28, Catalin Marinas wrote:

On Thu, Oct 08, 2015 at 06:22:34PM +0100, Suzuki K. Poulose wrote:

On 08/10/15 15:45, Christoffer Dall wrote:

On Wed, Oct 07, 2015 at 10:26:14AM +0100, Marc Zyngier wrote:

I just had a chat with Catalin, who did shed some light on this.
It all has to do with rounding up. What you would like to have here is:

#define ARM64_HW_PGTABLE_LEVELS(va_bits) DIV_ROUND_UP(va_bits - PAGE_SHIFT, PAGE_SHIFT - 3)

where (va_bits - PAGE_SHIFT) is the total number of bits we deal
with during a page table walk, and (PAGE_SHIFT - 3) is the number
of bits we deal with per level.

The clue is in how DIV_ROUND_UP is written:

#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

which gives you Suzuki's magic formula.

I'd vote for the DIV_ROUND_UP(), which will make things a lot more readable.


Thanks for the explanation, I vote for DIV_ROUND_UP too.


Btw, DIV_ROUND_UP is defined in linux/kernel.h, and including that in the
required headers breaks the build. I could add the same definition locally.


Or just keep the original magic formula and add the DIV_ROUND_UP one in
a comment.



OK, will keep proper documentation with the cryptic formula ;)

Suzuki



Re: [PATCH 03/15] arm64: Introduce helpers for page table levels

2015-10-08 Thread Suzuki K. Poulose

On 08/10/15 15:45, Christoffer Dall wrote:

On Wed, Oct 07, 2015 at 10:26:14AM +0100, Marc Zyngier wrote:

On 07/10/15 09:26, Christoffer Dall wrote:

Hi Suzuki,





I just had a chat with Catalin, who did shed some light on this.
It all has to do with rounding up. What you would like to have here is:

#define ARM64_HW_PGTABLE_LEVELS(va_bits) DIV_ROUND_UP(va_bits - PAGE_SHIFT, PAGE_SHIFT - 3)

where (va_bits - PAGE_SHIFT) is the total number of bits we deal
with during a page table walk, and (PAGE_SHIFT - 3) is the number
of bits we deal with per level.

The clue is in how DIV_ROUND_UP is written:

#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

which gives you Suzuki's magic formula.

I'd vote for the DIV_ROUND_UP(), which will make things a lot more readable.


Thanks for the explanation, I vote for DIV_ROUND_UP too.


Btw, DIV_ROUND_UP is defined in linux/kernel.h, and including that in the
required headers breaks the build. I could add the same definition locally.



You can stash this away for a cryptic interview question ;)


;)


Suzuki



Re: [PATCH 03/15] arm64: Introduce helpers for page table levels

2015-10-07 Thread Suzuki K. Poulose

On 07/10/15 10:26, Marc Zyngier wrote:

On 07/10/15 09:26, Christoffer Dall wrote:

Hi Suzuki,

On Tue, Sep 15, 2015 at 04:41:12PM +0100, Suzuki K. Poulose wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Introduce helpers for finding the number of page table
levels required for a given VA width, shift for a particular
page table level.

Convert the existing users to the new helpers. More users
to follow.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
  arch/arm64/include/asm/pgtable-hwdef.h |   15 ++++++++++++---
  1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 24154b0..ce18389 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -16,13 +16,21 @@
  #ifndef __ASM_PGTABLE_HWDEF_H
  #define __ASM_PGTABLE_HWDEF_H

+/*
+ * Number of page-table levels required to address 'va_bits' wide
+ * address, without section mapping
+ */
+#define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3))


I don't understand the '(va_bits) - 4' here, can you explain it (and add a
comment to that effect) ?


I just had a chat with Catalin, who did shed some light on this.
It all has to do with rounding up. What you would like to have here is:

#define ARM64_HW_PGTABLE_LEVELS(va_bits) DIV_ROUND_UP(va_bits - PAGE_SHIFT, PAGE_SHIFT - 3)

where (va_bits - PAGE_SHIFT) is the total number of bits we deal
with during a page table walk, and (PAGE_SHIFT - 3) is the number
of bits we deal with per level.

The clue is in how DIV_ROUND_UP is written:

#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

which gives you Suzuki's magic formula.


Thanks Marc for pitching in. That explains it better.



I'd vote for the DIV_ROUND_UP(), which will make things a lot more readable.


Sure, I can change that.

Suzuki



Re: [PATCH 11/15] arm64: Cleanup VTCR_EL2 computation

2015-10-07 Thread Suzuki K. Poulose

On 07/10/15 11:11, Marc Zyngier wrote:

On 15/09/15 16:41, Suzuki K. Poulose wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>




diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index bdf139e..699554d 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -138,6 +138,9 @@
   * The magic numbers used for VTTBR_X in this patch can be found in Tables
   * D4-23 and D4-25 in ARM DDI 0487A.b.
   */
+#define VTCR_EL2_COMMON_BITS   (VTCR_EL2_SH0_INNER | VTCR_EL2_ORGN0_WBWA | \
+VTCR_EL2_IRGN0_WBWA | VTCR_EL2_T0SZ_40B)
+
  #ifdef CONFIG_ARM64_64K_PAGES
  /*
   * Stage2 translation configuration:
@@ -145,9 +148,8 @@
   * 64kB pages (TG0 = 1)
   * 2 level page tables (SL = 1)
   */
-#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_64K | VTCR_EL2_SH0_INNER | \
-VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
-VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
+#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_64K | VTCR_EL2_SL0_LVL1 | \
+VTCR_EL2_COMMON_BITS)
  #define VTTBR_X   (38 - VTCR_EL2_T0SZ_40B)
  #else
  /*
@@ -156,9 +158,8 @@
   * 4kB pages (TG0 = 0)
   * 3 level page tables (SL = 1)
   */
-#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_4K | VTCR_EL2_SH0_INNER | \
-VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
-VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
+#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_4K | VTCR_EL2_SL0_LVL1 | \
+VTCR_EL2_COMMON_BITS)
  #define VTTBR_X   (37 - VTCR_EL2_T0SZ_40B)
  #endif




This looks OK, but is going to clash badly with 857d1a9 ("arm64: KVM:
set {v,}TCR_EL2 RES1 bits"). Nothing we can't fix though.



As discussed, I will rebase my series on top of 4.3-rc4 to avoid this.

Thanks
Suzuki



Re: [PATCH 03/15] arm64: Introduce helpers for page table levels

2015-10-07 Thread Suzuki K. Poulose

On 07/10/15 09:26, Christoffer Dall wrote:

Hi Suzuki,

On Tue, Sep 15, 2015 at 04:41:12PM +0100, Suzuki K. Poulose wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Introduce helpers for finding the number of page table
levels required for a given VA width, shift for a particular
page table level.

Convert the existing users to the new helpers. More users
to follow.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
  arch/arm64/include/asm/pgtable-hwdef.h |   15 ++++++++++++---
  1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 24154b0..ce18389 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -16,13 +16,21 @@
  #ifndef __ASM_PGTABLE_HWDEF_H
  #define __ASM_PGTABLE_HWDEF_H

+/*
+ * Number of page-table levels required to address 'va_bits' wide
+ * address, without section mapping
+ */
+#define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3))


I don't understand the '(va_bits) - 4' here, can you explain it (and add a
comment to that effect) ?


As mentioned, I will change it to DIV_ROUND_UP() as suggested by Marc.




+#define ARM64_HW_PGTABLE_LEVEL_SHIFT(level) \
+   ((PAGE_SHIFT - 3) * (level) + 3)
+


While this change is clearly correct, if you can explain the math here
in a comment as well, that would be helpful.


Sure, will add a comment to that effect.

Suzuki



Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling

2015-10-07 Thread Suzuki K. Poulose

On 07/10/15 12:13, Marc Zyngier wrote:

On 15/09/15 16:41, Suzuki K. Poulose wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

The existing fake pgd handling code assumes that the stage-2 entry
level can only be one level down from that of the host, which may not
always be true (e.g, with the introduction of the 16K page size).

e.g.
With 16k page size and 48bit VA and 40bit IPA we have the following
split for page table levels:

level:   0       1         2         3
bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
         ^      ^         ^
         |      |         |
host entry      |         x---- stage-2 entry
                |
        IPA ----x

The stage-2 entry level is 2, due to the concatenation of 16 tables
at level 2 (mandated by the hardware). So, we need to fake two levels
to actually reach the hyp page table. This case cannot be handled


Nit: this is the stage-2 PT, not HYP.


with the existing code, as all we know about is KVM_PREALLOC_LEVEL,
which stands for two different pieces of information:

1) Whether we have fake page table entry levels.
2) The entry level of stage-2 translation.

We lose the information about the number of fake levels that
we may have to use. Also, the KVM_PREALLOC_LEVEL computation itself
is wrong, as we assume the hw entry level is always 1 level down
from the host's.

This patch introduces two seperate indicators :


Nit: "separate".


1) Accurate entry level for stage-2 translation - HYP_PGTABLE_ENTRY_LEVEL -
using the new helpers.


Same confusion here. HYP has its own set of page tables, and this
definitely is S2, not HYP. Please update this symbol (and all the
similar ones) so that it is not confusing.



Sure, I will use S2 everywhere.


2) Number of levels of fake pagetable entries. (KVM_FAKE_PGTABLE_LEVELS)

The following conditions hold true for all cases(with 40bit IPA)
1) The stage-2 entry level <= 2
2) Number of fake page-table entries is in the inclusive range [0, 2].

Cc: kvm...@lists.cs.columbia.edu
Cc: christoffer.d...@linaro.org
Cc: marc.zyng...@arm.com
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
  arch/arm64/include/asm/kvm_mmu.h |  114 --
  1 file changed, 61 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 2567fe8..72cfd9e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -41,18 +41,6 @@
   */
  #define TRAMPOLINE_VA (HYP_PAGE_OFFSET_MASK & PAGE_MASK)

-/*
- * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation
- * levels in addition to the PGD and potentially the PUD which are
- * pre-allocated (we pre-allocate the fake PGD and the PUD when the Stage-2
- * tables use one level of tables less than the kernel.
- */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define KVM_MMU_CACHE_MIN_PAGES1
-#else
-#define KVM_MMU_CACHE_MIN_PAGES2
-#endif
-
  #ifdef __ASSEMBLY__

  /*
@@ -80,6 +68,26 @@
  #define KVM_PHYS_SIZE (1UL << KVM_PHYS_SHIFT)
  #define KVM_PHYS_MASK (KVM_PHYS_SIZE - 1UL)

+/*
+ * At stage-2 entry level, upto 16 tables can be concatenated and
+ * the hardware expects us to use concatenation, whenever possible.
+ * So, number of page table levels for KVM_PHYS_SHIFT is always
+ * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
+ */
+#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
+/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
+#define HYP_PGTABLE_SHIFT	ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)


Why +1? I don't understand where that is coming from... which makes the
rest of the patch fairly opaque to me...


Sorry for the confusion in the numbering of levels and the lack of comments.

Taking the above example in the description, with 16K.


(level numbering below is the one used by the helpers, counting up from
the page level; the ARM ARM entry-level numbering differs)

no. of
levels :   4       3         2         1        0

vabits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
          ^      ^   ^      ^
          |      |   |      |
 host entry      |   |      x---- stage-2 entry
                 |   x---- HYP_PGTABLE_SHIFT
         IPA ----x


1) ARM64_HW_PGTABLE_LEVEL_SHIFT(x) gives the size a level 'x' entry can map.

e.g, PTE_SHIFT => ARM64_HW_PGTABLE_LEVEL_SHIFT(1) => PAGE_SHIFT = 14
     PMD_SHIFT => ARM64_HW_PGTABLE_LEVEL_SHIFT(2) => (PAGE_SHIFT - 3) + PAGE_SHIFT = 25
     PUD_SHIFT => ARM64_HW_PGTABLE_LEVEL_SHIFT(3) => 36

and so on.

Now we get HYP_PGTABLE_LEVELS = 2.

To calculate the number of concatenated tables, we need to know the total
size (HYP_PGTABLE_SHIFT) that can be mapped by the stage-2 page table with
HYP_PGTABLE_LEVELS (= 2). That is nothing but the size mapped by an entry
one level up, i.e, ARM64_HW_PGTABLE_LEVEL_SHIFT(3) = 36 (= 39 for 4K).

We can use that to calculate the number of concatenated tables:

1 << (KVM_PHYS_SHIFT - HYP_PGTABLE_SHIFT)

Numbering of

Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-05 Thread Suzuki K. Poulose

On 02/10/15 16:49, Catalin Marinas wrote:

On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:

From: Ard Biesheuvel <ard.biesheu...@linaro.org>

This patch adds the page size to the arm64 kernel image header
so that one can infer the PAGESIZE used by the kernel. This will
be helpful to diagnose failures to boot the kernel with page size
not supported by the CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>


This patch needs your Signed-off-by as well, since you are posting it. And
IIRC I acked it as well, I'll check.



Yes, you did mention that you were OK with the patch, but I thought there
was no 'Acked-by' tag added, hence I didn't pick it up.



If you are fine with adding your Signed-off-by, I can add it to the patch
when applying (unless I see other issues with the series).



Yes, please go ahead, if I don't have to send another version depending on
the review of the KVM bits. If I do, I will add the S-o-b and your Acked-by.


Thanks
Suzuki



[PATCH 05/15] arm64: Handle 4 level page table for swapper

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

At the moment, we only support maximum of 3-level page table for
swapper. With 48bit VA, 64K has only 3 levels and 4K uses section
mapping. Add support for 4-level page table for swapper, needed
by 16K pages.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/kernel/head.S |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 46670bf..01b8e58 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -271,7 +271,10 @@ ENDPROC(preserve_boot_args)
  */
.macro  create_pgd_entry, tbl, virt, tmp1, tmp2
create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
-#if SWAPPER_PGTABLE_LEVELS == 3
+#if SWAPPER_PGTABLE_LEVELS > 3
+   create_table_entry \tbl, \virt, PUD_SHIFT, PTRS_PER_PUD, \tmp1, \tmp2
+#endif
+#if SWAPPER_PGTABLE_LEVELS > 2
create_table_entry \tbl, \virt, SWAPPER_TABLE_SHIFT, PTRS_PER_PTE, 
\tmp1, \tmp2
 #endif
.endm
-- 
1.7.9.5



[PATCH 15/15] arm64: 36 bit VA

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

36bit VA lets us use 2 level page tables while limiting the
available address space to 64GB.

Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/Kconfig |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2253819..3560241 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -168,6 +168,7 @@ config FIX_EARLYCON_MEM
 
 config PGTABLE_LEVELS
int
+   default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
@@ -373,6 +374,10 @@ choice
  space sizes. The level of translation table is determined by
  a combination of page size and virtual address space size.
 
+config ARM64_VA_BITS_36
+   bool "36-bit"
+   depends on ARM64_16K_PAGES
+
 config ARM64_VA_BITS_39
bool "39-bit"
depends on ARM64_4K_PAGES
@@ -392,6 +397,7 @@ endchoice
 
 config ARM64_VA_BITS
int
+   default 36 if ARM64_VA_BITS_36
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
default 47 if ARM64_VA_BITS_47
@@ -465,7 +471,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if ARM64_4K_PAGES || ARM64_16K_PAGES
+   def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
-- 
1.7.9.5



[PATCH 12/15] arm: kvm: Move fake PGD handling to arch specific files

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Rearrange the code for fake pgd handling, which is applicable
only to ARM64. The intention is to keep the common code cleaner,
unaware of the underlying hacks.

Cc: kvm...@lists.cs.columbia.edu
Cc: christoffer.d...@linaro.org
Cc: marc.zyng...@arm.com
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |    7 +++++++
 arch/arm/kvm/mmu.c               |   44 +++++---------------------------------------
 arch/arm64/include/asm/kvm_mmu.h |   43 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 39 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa18..1c9aa8a 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -173,6 +173,13 @@ static inline unsigned int kvm_get_hwpgd_size(void)
return PTRS_PER_S2_PGD * sizeof(pgd_t);
 }
 
+static inline pgd_t *kvm_setup_fake_pgd(pgd_t *pgd)
+{
+   return pgd;
+}
+
+static inline void kvm_free_fake_pgd(pgd_t *pgd) {}
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)   __cpuc_flush_dcache_area((a), (l))
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 7b42012..b210622 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -677,43 +677,11 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
 * guest, we allocate a fake PGD and pre-populate it to point
 * to the next-level page table, which will be the real
 * initial page table pointed to by the VTTBR.
-*
-* When KVM_PREALLOC_LEVEL==2, we allocate a single page for
-* the PMD and the kernel will use folded pud.
-* When KVM_PREALLOC_LEVEL==1, we allocate 2 consecutive PUD
-* pages.
 */
-   if (KVM_PREALLOC_LEVEL > 0) {
-   int i;
-
-   /*
-* Allocate fake pgd for the page table manipulation macros to
-* work.  This is not used by the hardware and we have no
-* alignment requirement for this allocation.
-*/
-   pgd = kmalloc(PTRS_PER_S2_PGD * sizeof(pgd_t),
-   GFP_KERNEL | __GFP_ZERO);
-
-   if (!pgd) {
-   kvm_free_hwpgd(hwpgd);
-   return -ENOMEM;
-   }
-
-   /* Plug the HW PGD into the fake one. */
-   for (i = 0; i < PTRS_PER_S2_PGD; i++) {
-   if (KVM_PREALLOC_LEVEL == 1)
-   pgd_populate(NULL, pgd + i,
-(pud_t *)hwpgd + i * PTRS_PER_PUD);
-   else if (KVM_PREALLOC_LEVEL == 2)
-   pud_populate(NULL, pud_offset(pgd, 0) + i,
-(pmd_t *)hwpgd + i * PTRS_PER_PMD);
-   }
-   } else {
-   /*
-* Allocate actual first-level Stage-2 page table used by the
-* hardware for Stage-2 page table walks.
-*/
-   pgd = (pgd_t *)hwpgd;
+   pgd = kvm_setup_fake_pgd(hwpgd);
+   if (IS_ERR(pgd)) {
+   kvm_free_hwpgd(hwpgd);
+   return PTR_ERR(pgd);
}
 
kvm_clean_pgd(pgd);
@@ -820,9 +788,7 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
 
unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
kvm_free_hwpgd(kvm_get_hwpgd(kvm));
-   if (KVM_PREALLOC_LEVEL > 0)
-   kfree(kvm->arch.pgd);
-
+   kvm_free_fake_pgd(kvm->arch.pgd);
kvm->arch.pgd = NULL;
 }
 
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6150567..2567fe8 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -198,6 +198,49 @@ static inline unsigned int kvm_get_hwpgd_size(void)
return PTRS_PER_S2_PGD * sizeof(pgd_t);
 }
 
+/*
+ * Allocate fake pgd for the page table manipulation macros to
+ * work.  This is not used by the hardware and we have no
+ * alignment requirement for this allocation.
+ */
+static inline pgd_t* kvm_setup_fake_pgd(pgd_t *hwpgd)
+{
+   int i;
+   pgd_t *pgd;
+
+   if (!KVM_PREALLOC_LEVEL)
+   return hwpgd;
+   /*
+* When KVM_PREALLOC_LEVEL==2, we allocate a single page for
+* the PMD and the kernel will use folded pud.
+* When KVM_PREALLOC_LEVEL==1, we allocate 2 consecutive PUD
+* pages.
+*/
+   pgd = kmalloc(PTRS_PER_S2_PGD * sizeof(pgd_t),
+   GFP_KERNEL | __GFP_ZERO);
+
+   if (!pgd)
+   return ERR_PTR(-ENOMEM);
+
+   /* Plug the HW PGD into the fake one. */
+   for (i = 0; i < PTRS_PER_S2_PGD; i++) {
+   if (KVM_PREALLOC_LEVEL == 1)
+   pgd_populate(NULL, pgd + i,
+(pud_t *)hwpgd + i * PTRS_PER_PUD);
+

[PATCH 14/15] arm64: Add 16K page size support

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

This patch turns on the 16K page support in the kernel. We
support 48bit VA (4 level page tables) and 47bit VA (3 level
page tables).

Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/Kconfig   |   25 -
 arch/arm64/include/asm/fixmap.h  |4 +++-
 arch/arm64/include/asm/kvm_arm.h |   12 
 arch/arm64/include/asm/page.h|2 ++
 arch/arm64/include/asm/sysreg.h  |2 ++
 arch/arm64/include/asm/thread_info.h |2 ++
 arch/arm64/kernel/head.S |7 ++-
 arch/arm64/mm/proc.S |4 +++-
 8 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 83bca48..2253819 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -171,7 +171,8 @@ config PGTABLE_LEVELS
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
-   default 4 if ARM64_4K_PAGES && ARM64_VA_BITS_48
+   default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
+   default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
 
 source "init/Kconfig"
 
@@ -345,6 +346,13 @@ config ARM64_4K_PAGES
help
  This feature enables 4KB pages support.
 
+config ARM64_16K_PAGES
+   bool "16KB"
+   help
 The system will use 16KB pages support. AArch32 emulation
 requires applications compiled with 16K (or a multiple of 16K)
 aligned segments.
+
 config ARM64_64K_PAGES
bool "64KB"
help
@@ -358,6 +366,7 @@ endchoice
 choice
prompt "Virtual address space size"
default ARM64_VA_BITS_39 if ARM64_4K_PAGES
+   default ARM64_VA_BITS_47 if ARM64_16K_PAGES
default ARM64_VA_BITS_42 if ARM64_64K_PAGES
help
  Allows choosing one of multiple possible virtual address
@@ -372,6 +381,10 @@ config ARM64_VA_BITS_42
bool "42-bit"
depends on ARM64_64K_PAGES
 
+config ARM64_VA_BITS_47
+   bool "47-bit"
+   depends on ARM64_16K_PAGES
+
 config ARM64_VA_BITS_48
bool "48-bit"
 
@@ -381,6 +394,7 @@ config ARM64_VA_BITS
int
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
+   default 47 if ARM64_VA_BITS_47
default 48 if ARM64_VA_BITS_48
 
 config CPU_BIG_ENDIAN
@@ -451,7 +465,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if ARM64_4K_PAGES
+   def_bool y if ARM64_4K_PAGES || ARM64_16K_PAGES
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
@@ -488,6 +502,7 @@ config XEN
 config FORCE_MAX_ZONEORDER
int
default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
+   default "12" if (ARM64_16K_PAGES && TRANSPARENT_HUGEPAGE)
default "11"
 
 menuconfig ARMV8_DEPRECATED
@@ -674,9 +689,9 @@ config COMPAT
  the user helper functions, VFP support and the ptrace interface are
  handled appropriately by the kernel.
 
- If you also enabled CONFIG_ARM64_64K_PAGES, please be aware that you
- will only be able to execute AArch32 binaries that were compiled with
- 64k aligned segments.
+ If you use a page size other than 4KB (i.e, 16KB or 64KB), please be aware
+ that you will only be able to execute AArch32 binaries that were compiled
+ with page size aligned segments.
 
  If you want to execute 32-bit userspace applications, say Y.
 
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 8b9884c..a294c70 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -55,8 +55,10 @@ enum fixed_addresses {
 * Temporary boot-time mappings, used by early_ioremap(),
 * before ioremap() is functional.
 */
-#ifdef CONFIG_ARM64_64K_PAGES
+#if defined(CONFIG_ARM64_64K_PAGES)
 #define NR_FIX_BTMAPS  4
+#elif defined(CONFIG_ARM64_16K_PAGES)
+#define NR_FIX_BTMAPS  16
 #else
 #define NR_FIX_BTMAPS  64
 #endif
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 699554d..b28a06e 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -113,6 +113,7 @@
 #define VTCR_EL2_TG0_MASK  (3 << 14)
 #define VTCR_EL2_TG0_4K        (0 << 14)
 #define VTCR_EL2_TG0_64K   (1 <<

[PATCH 06/15] arm64: Clean config usages for page size

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

We use !CONFIG_ARM64_64K_PAGES to imply CONFIG_ARM64_4K_PAGES
(and vice versa) in code. This has worked well so far, since
we only had two options. Now, with the introduction of 16K,
these cases will break. This patch cleans up the code to
use the required CONFIG symbol expression, without the
assumption that !64K => 4K (and vice versa).

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Acked-by: Mark Rutland <mark.rutl...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/Kconfig   |4 ++--
 arch/arm64/Kconfig.debug |2 +-
 arch/arm64/include/asm/thread_info.h |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7d95663..ab0a36f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -451,7 +451,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if !ARM64_64K_PAGES
+   def_bool y if ARM64_4K_PAGES
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
@@ -663,7 +663,7 @@ source "fs/Kconfig.binfmt"
 
 config COMPAT
bool "Kernel support for 32-bit EL0"
-   depends on !ARM64_64K_PAGES || EXPERT
+   depends on ARM64_4K_PAGES || EXPERT
select COMPAT_BINFMT_ELF
select HAVE_UID16
select OLD_SIGSUSPEND3
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index d6285ef..c24d6ad 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -77,7 +77,7 @@ config DEBUG_RODATA
   If in doubt, say Y
 
 config DEBUG_ALIGN_RODATA
-   depends on DEBUG_RODATA && !ARM64_64K_PAGES
+   depends on DEBUG_RODATA && ARM64_4K_PAGES
bool "Align linker sections up to SECTION_SIZE"
help
  If this option is enabled, sections that may potentially be marked as
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index dcd06d1..d9c8c9f 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -23,7 +23,7 @@
 
 #include 
 
-#ifndef CONFIG_ARM64_64K_PAGES
+#ifdef CONFIG_ARM64_4K_PAGES
 #define THREAD_SIZE_ORDER  2
 #endif
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 13/15] arm64: kvm: Rewrite fake pgd handling

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

The existing fake pgd handling code assumes that the stage-2 entry
level can only be one level below that of the host, which may not
always be true (e.g., with the introduction of 16K page size).

e.g.
With 16k page size and 48bit VA and 40bit IPA we have the following
split for page table levels:

level:   0        1          2          3
bits : [47]  [46 - 36]  [35 - 25]  [24 - 14]  [13 - 0]
          ^        ^          ^
          |        |          |
    host entry     |          x stage-2 entry
                   |
           IPA ----x

The stage-2 entry level is 2, due to the concatenation of 16 tables
at level 2 (mandated by the hardware). So, we need to fake two levels
to actually reach the hyp page table. This case cannot be handled
with the existing code, as all we have is KVM_PREALLOC_LEVEL,
which stands for two different pieces of information:

1) Whether we have fake page table entry levels.
2) The entry level of stage-2 translation.

We lose the information about the number of fake levels that
we may have to use. Also, the KVM_PREALLOC_LEVEL computation itself
is wrong, as we assume the hw entry level is always one level below
that of the host.

This patch introduces two separate indicators:
1) Accurate entry level for stage-2 translation - HYP_PGTABLE_ENTRY_LEVEL -
   using the new helpers.
2) Number of levels of fake pagetable entries (KVM_FAKE_PGTABLE_LEVELS).

The following conditions hold true for all cases (with 40bit IPA):
1) The stage-2 entry level <= 2
2) Number of fake page-table entries is in the inclusive range [0, 2].

Cc: kvm...@lists.cs.columbia.edu
Cc: christoffer.d...@linaro.org
Cc: marc.zyng...@arm.com
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h |  114 --
 1 file changed, 61 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 2567fe8..72cfd9e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -41,18 +41,6 @@
  */
 #define TRAMPOLINE_VA  (HYP_PAGE_OFFSET_MASK & PAGE_MASK)
 
-/*
- * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation
- * levels in addition to the PGD and potentially the PUD which are
- * pre-allocated (we pre-allocate the fake PGD and the PUD when the Stage-2
- * tables use one level of tables less than the kernel.
- */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define KVM_MMU_CACHE_MIN_PAGES        1
-#else
-#define KVM_MMU_CACHE_MIN_PAGES        2
-#endif
-
 #ifdef __ASSEMBLY__
 
 /*
@@ -80,6 +68,26 @@
 #define KVM_PHYS_SIZE  (1UL << KVM_PHYS_SHIFT)
 #define KVM_PHYS_MASK  (KVM_PHYS_SIZE - 1UL)
 
+/*
+ * At the stage-2 entry level, up to 16 tables can be concatenated and
+ * the hardware expects us to use concatenation whenever possible.
+ * So, number of page table levels for KVM_PHYS_SHIFT is always
+ * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
+ */
+#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
+/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
+#define HYP_PGTABLE_SHIFT      ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
+#define HYP_PGDIR_SHIFT        ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
+#define HYP_PGTABLE_ENTRY_LEVEL        (4 - HYP_PGTABLE_LEVELS)
+
+/*
+ * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation
+ * levels in addition to the PGD and potentially the PUD which are
+ * pre-allocated (we pre-allocate the fake PGD and the PUD when the Stage-2
+ * tables use one level of tables less than the kernel.
+ */
+#define KVM_MMU_CACHE_MIN_PAGES        (HYP_PGTABLE_LEVELS - 1)
+
 int create_hyp_mappings(void *from, void *to);
 int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
 void free_boot_hyp_pgd(void);
@@ -145,56 +153,41 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 #define kvm_pud_addr_end(addr, end)    pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)    pmd_addr_end(addr, end)
 
-/*
- * In the case where PGDIR_SHIFT is larger than KVM_PHYS_SHIFT, we can address
- * the entire IPA input range with a single pgd entry, and we would only need
- * one pgd entry.  Note that in this case, the pgd is actually not used by
- * the MMU for Stage-2 translations, but is merely a fake pgd used as a data
- * structure for the kernel pgtable macros to work.
- */
-#if PGDIR_SHIFT > KVM_PHYS_SHIFT
-#define PTRS_PER_S2_PGD_SHIFT  0
+/* Number of concatenated tables in stage-2 entry level */
+#if KVM_PHYS_SHIFT > HYP_PGTABLE_SHIFT
+#define S2_ENTRY_TABLES_SHIFT  (KVM_PHYS_SHIFT - HYP_PGTABLE_SHIFT)
 #else
-#define PTRS_PER_S2_PGD_SHIFT  (KVM_PHYS_SHIFT - PGDIR_SHIFT)
+#define S2_ENTRY_TABLES_SHIFT  0
 #endif
+#define S2_ENTRY_TABLES                (1 << (S2_ENTRY_TABLES_SHIFT))
+
+/* Number of page table levels we fake to reach the hw pgta

[PATCH 04/15] arm64: Calculate size for idmap_pg_dir at compile time

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Now that we can calculate the number of levels required for
mapping a VA width, reserve the exact number of pages that would
be required to cover the idmap. The idmap should be able to handle
the maximum physical address size supported.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/include/asm/boot.h   |1 +
 arch/arm64/include/asm/kernel-pgtable.h |7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b6..678b63e 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -2,6 +2,7 @@
 #ifndef __ASM_BOOT_H
 #define __ASM_BOOT_H
 
+#include 
 #include 
 
 /*
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 5876a36..def7168 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -33,16 +33,19 @@
  * map to pte level. The swapper also maps the FDT (see __create_page_tables
  * for more information). Note that the number of ID map translation levels
  * could be increased on the fly if system RAM is out of reach for the default
- * VA range, so 3 pages are reserved in all cases.
+ * VA range, so pages required to map highest possible PA are reserved in all
+ * cases.
  */
 #if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#define IDMAP_PGTABLE_LEVELS   (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT) - 1)
 #else
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
+#define IDMAP_PGTABLE_LEVELS   (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT))
 #endif
 
 #define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
-#define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
+#define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
 
 /* Initial memory map size */
 #if ARM64_SWAPPER_USES_SECTION_MAPS
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 00/15] arm64: 16K translation granule support

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

This series enables 16K page size support on Linux for arm64.
It adds support for 48bit VA (4 level), 47bit VA (3 level) and
36bit VA (2 level) with 16K. 16K was a late addition to the architecture
and is not implemented by all CPUs. A check has been added to ensure the
selected granule size is supported by the CPU, failing which the CPU
won't proceed with booting. Also, the kernel page size is added to the
kernel image header (patch from Ard).

KVM bits have been tested on a fast model with GICv3 using kvmtool [1].

Patches 1-7 cleans up the kernel page size handling code.
Patch 8 Adds a check to ensure the CPU supports the selected granule size.
Patch 9 Adds the page size information to image header.
Patches 10-13   Fixes some issues with the KVM bits, mainly the fake PGD
handling code.
Patches 14-15   Adds the 16k page size support bits.

This series applies on top of 4.3-rc1.

The tree is also available here:

git://linux-arm.org/linux-skp.git  16k/v2-4.3-rc1

Changes since V1:
  - Rebase to 4.3-rc1
  - Fix vmemmap_populate for 16K (use !ARM64_SWAPPER_USES_SECTION_MAPS)
  - Better description for patch2 (suggested-by: Ard)
  - Add page size information to the image header flags.
  - Added reviewed-by/tested-by Ard.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git

Ard Biesheuvel (1):
  arm64: Add page size to the kernel image header

Suzuki K. Poulose (14):
  arm64: Move swapper pagetable definitions
  arm64: Handle section maps for swapper/idmap
  arm64: Introduce helpers for page table levels
  arm64: Calculate size for idmap_pg_dir at compile time
  arm64: Handle 4 level page table for swapper
  arm64: Clean config usages for page size
  arm64: Kconfig: Fix help text about AArch32 support with 64K pages
  arm64: Check for selected granule support
  arm64: kvm: Fix {V}TCR_EL2_TG0 mask
  arm64: Cleanup VTCR_EL2 computation
  arm: kvm: Move fake PGD handling to arch specific files
  arm64: kvm: Rewrite fake pgd handling
  arm64: Add 16K page size support
  arm64: 36 bit VA

 Documentation/arm64/booting.txt |7 +-
 arch/arm/include/asm/kvm_mmu.h  |7 ++
 arch/arm/kvm/mmu.c  |   44 ++
 arch/arm64/Kconfig  |   37 +++--
 arch/arm64/Kconfig.debug|2 +-
 arch/arm64/include/asm/boot.h   |1 +
 arch/arm64/include/asm/fixmap.h |4 +-
 arch/arm64/include/asm/kernel-pgtable.h |   77 ++
 arch/arm64/include/asm/kvm_arm.h|   29 +--
 arch/arm64/include/asm/kvm_mmu.h|  135 +--
 arch/arm64/include/asm/page.h   |   20 +
 arch/arm64/include/asm/pgtable-hwdef.h  |   15 +++-
 arch/arm64/include/asm/sysreg.h |8 ++
 arch/arm64/include/asm/thread_info.h|4 +-
 arch/arm64/kernel/head.S|   71 +---
 arch/arm64/kernel/image.h   |5 +-
 arch/arm64/kernel/vmlinux.lds.S |1 +
 arch/arm64/mm/mmu.c |   72 -
 arch/arm64/mm/proc.S|4 +-
 19 files changed, 348 insertions(+), 195 deletions(-)
 create mode 100644 arch/arm64/include/asm/kernel-pgtable.h

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/15] arm64: kvm: Fix {V}TCR_EL2_TG0 mask

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

{V}TCR_EL2_TG0 is a 2bit wide field, where:

 00 - 4K
 01 - 64K
 10 - 16K

But we use only 1 bit, which has worked well so far since
we never cared about 16K. Fix it for 16K support.

Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Christoffer Dall <christoffer.d...@linaro.org>
Cc: kvm...@lists.cs.columbia.edu
Acked-by: Mark Rutland <mark.rutl...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 7605e09..bdf139e 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -98,7 +98,7 @@
 #define TCR_EL2_TBI    (1 << 20)
 #define TCR_EL2_PS     (7 << 16)
 #define TCR_EL2_PS_40B (2 << 16)
-#define TCR_EL2_TG0    (1 << 14)
+#define TCR_EL2_TG0    (3 << 14)
 #define TCR_EL2_SH0    (3 << 12)
 #define TCR_EL2_ORGN0  (3 << 10)
 #define TCR_EL2_IRGN0  (3 << 8)
@@ -110,7 +110,7 @@
 
 /* VTCR_EL2 Registers bits */
 #define VTCR_EL2_PS_MASK   (7 << 16)
-#define VTCR_EL2_TG0_MASK  (1 << 14)
+#define VTCR_EL2_TG0_MASK  (3 << 14)
 #define VTCR_EL2_TG0_4K        (0 << 14)
 #define VTCR_EL2_TG0_64K   (1 << 14)
 #define VTCR_EL2_SH0_MASK  (3 << 12)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/15] arm64: Handle section maps for swapper/idmap

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

We use section maps with 4K page size to create the swapper/idmaps.
So far we have used !64K or 4K checks to handle the case where we
use the section maps.
This patch adds a new symbol, ARM64_SWAPPER_USES_SECTION_MAPS, to
handle cases where we use section maps, instead of using the page size
symbols.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
Changes since v1:
  - Use ARM64_SWAPPER_USES_SECTION_MAPS for vmemmap_populate()
  - Fix description
---
 arch/arm64/include/asm/kernel-pgtable.h |   31 -
 arch/arm64/mm/mmu.c |   72 ++-
 2 files changed, 52 insertions(+), 51 deletions(-)

diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 622929d..5876a36 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -19,6 +19,13 @@
 #ifndef __ASM_KERNEL_PGTABLE_H
 #define __ASM_KERNEL_PGTABLE_H
 
+/* With 4K pages, we use section maps. */
+#ifdef CONFIG_ARM64_4K_PAGES
+#define ARM64_SWAPPER_USES_SECTION_MAPS 1
+#else
+#define ARM64_SWAPPER_USES_SECTION_MAPS 0
+#endif
+
 /*
  * The idmap and swapper page tables need some space reserved in the kernel
  * image. Both require pgd, pud (4 levels only) and pmd tables to (section)
@@ -28,26 +35,28 @@
  * could be increased on the fly if system RAM is out of reach for the default
  * VA range, so 3 pages are reserved in all cases.
  */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#else
+#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
 #endif
 
 #define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
 #define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
 
 /* Initial memory map size */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_BLOCK_SHIFT    PAGE_SHIFT
-#define SWAPPER_BLOCK_SIZE     PAGE_SIZE
-#define SWAPPER_TABLE_SHIFT    PMD_SHIFT
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_BLOCK_SHIFT    SECTION_SHIFT
 #define SWAPPER_BLOCK_SIZE     SECTION_SIZE
 #define SWAPPER_TABLE_SHIFT    PUD_SHIFT
+#else
+#define SWAPPER_BLOCK_SHIFT    PAGE_SHIFT
+#define SWAPPER_BLOCK_SIZE     PAGE_SIZE
+#define SWAPPER_TABLE_SHIFT    PMD_SHIFT
 #endif
 
+/* The size of the initial kernel direct mapping */
+#define SWAPPER_INIT_MAP_SIZE  (_AC(1, UL) << SWAPPER_TABLE_SHIFT)
 
 /*
  * Initial memory map attributes.
@@ -55,10 +64,10 @@
 #define SWAPPER_PTE_FLAGS  PTE_TYPE_PAGE | PTE_AF | PTE_SHARED
 #define SWAPPER_PMD_FLAGS  PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S
 
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_MM_MMUFLAGS    PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_MM_MMUFLAGS    PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS
+#else
+#define SWAPPER_MM_MMUFLAGS    PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS
 #endif
 
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 9211b85..f533312 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -32,6 +32,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -353,14 +354,11 @@ static void __init map_mem(void)
 * memory addressable from the initial direct kernel mapping.
 *
 * The initial direct kernel mapping, located at swapper_pg_dir, gives
-* us PUD_SIZE (4K pages) or PMD_SIZE (64K pages) memory starting from
-* PHYS_OFFSET (which must be aligned to 2MB as per
-* Documentation/arm64/booting.txt).
+* us PUD_SIZE (with SECTION maps, i.e, 4K) or PMD_SIZE (without
+* SECTION maps, i.e, 64K pages) memory starting from PHYS_OFFSET
+* (which must be aligned to 2MB as per Documentation/arm64/booting.txt).
 */
-   if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
-   limit = PHYS_OFFSET + PMD_SIZE;
-   else
-   limit = PHYS_OFFSET + PUD_SIZE;
+   limit = PHYS_OFFSET + SWAPPER_INIT_MAP_SIZE;
memblock_set_current_limit(limit);
 
/* map all the memory banks */
@@ -371,21 +369,24 @@ static void __init map_mem(void)
if (start >= end)
break;
 
-#ifndef CONFIG_ARM64_64K_PAGES
-   /*
-* For the first memory bank align the start address and
-* current memblock limit to prevent create_mapping() from
-* allocating pte page tables from unmapped memory.
-* When 64K pages are enabled, the pte page table for the
-* first PGDIR_SIZE is already present in swapper_pg_dir

[PATCH 01/15] arm64: Move swapper pagetable definitions

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Move the kernel pagetable (both swapper and idmap) definitions
from the generic asm/page.h to a new file, asm/kernel-pgtable.h.

This is mostly a cosmetic change, to clean up the asm/page.h to
get rid of the arch specific details which are not needed by the
generic code.

Also renames the symbols to prevent conflicts. e.g,
BLOCK_SHIFT => SWAPPER_BLOCK_SHIFT

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/include/asm/kernel-pgtable.h |   65 +++
 arch/arm64/include/asm/page.h   |   18 -
 arch/arm64/kernel/head.S|   37 --
 arch/arm64/kernel/vmlinux.lds.S |1 +
 4 files changed, 74 insertions(+), 47 deletions(-)
 create mode 100644 arch/arm64/include/asm/kernel-pgtable.h

diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
new file mode 100644
index 000..622929d
--- /dev/null
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -0,0 +1,65 @@
+/*
+ * asm/kernel-pgtable.h : Kernel page table mapping
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_KERNEL_PGTABLE_H
+#define __ASM_KERNEL_PGTABLE_H
+
+/*
+ * The idmap and swapper page tables need some space reserved in the kernel
+ * image. Both require pgd, pud (4 levels only) and pmd tables to (section)
+ * map the kernel. With the 64K page configuration, swapper and idmap need to
+ * map to pte level. The swapper also maps the FDT (see __create_page_tables
+ * for more information). Note that the number of ID map translation levels
+ * could be increased on the fly if system RAM is out of reach for the default
+ * VA range, so 3 pages are reserved in all cases.
+ */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
+#else
+#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#endif
+
+#define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
+#define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
+
+/* Initial memory map size */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define SWAPPER_BLOCK_SHIFT    PAGE_SHIFT
+#define SWAPPER_BLOCK_SIZE     PAGE_SIZE
+#define SWAPPER_TABLE_SHIFT    PMD_SHIFT
+#else
+#define SWAPPER_BLOCK_SHIFT    SECTION_SHIFT
+#define SWAPPER_BLOCK_SIZE     SECTION_SIZE
+#define SWAPPER_TABLE_SHIFT    PUD_SHIFT
+#endif
+
+
+/*
+ * Initial memory map attributes.
+ */
+#define SWAPPER_PTE_FLAGS  PTE_TYPE_PAGE | PTE_AF | PTE_SHARED
+#define SWAPPER_PMD_FLAGS  PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#define SWAPPER_MM_MMUFLAGS    PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS
+#else
+#define SWAPPER_MM_MMUFLAGS    PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS
+#endif
+
+
+#endif
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 7d9c7e4..3c9ce8c 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -28,24 +28,6 @@
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
-/*
- * The idmap and swapper page tables need some space reserved in the kernel
- * image. Both require pgd, pud (4 levels only) and pmd tables to (section)
- * map the kernel. With the 64K page configuration, swapper and idmap need to
- * map to pte level. The swapper also maps the FDT (see __create_page_tables
- * for more information). Note that the number of ID map translation levels
- * could be increased on the fly if system RAM is out of reach for the default
- * VA range, so 3 pages are reserved in all cases.
- */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
-#else
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
-#endif
-
-#define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
-#define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
-
 #ifndef __ASSEMBLY__
 
 #include 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a055be6..46670bf 100644
--- a/arch/arm64

[PATCH 03/15] arm64: Introduce helpers for page table levels

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Introduce helpers for finding the number of page table
levels required for a given VA width, shift for a particular
page table level.

Convert the existing users to the new helpers. More users
to follow.

Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/include/asm/pgtable-hwdef.h |   15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 24154b0..ce18389 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -16,13 +16,21 @@
 #ifndef __ASM_PGTABLE_HWDEF_H
 #define __ASM_PGTABLE_HWDEF_H
 
+/*
+ * Number of page-table levels required to address 'va_bits' wide
+ * address, without section mapping
+ */
+#define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3))
+#define ARM64_HW_PGTABLE_LEVEL_SHIFT(level) \
+   ((PAGE_SHIFT - 3) * (level) + 3)
+
 #define PTRS_PER_PTE   (1 << (PAGE_SHIFT - 3))
 
 /*
  * PMD_SHIFT determines the size a level 2 page table entry can map.
  */
 #if CONFIG_PGTABLE_LEVELS > 2
-#define PMD_SHIFT  ((PAGE_SHIFT - 3) * 2 + 3)
+#define PMD_SHIFT  ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
 #define PMD_SIZE   (_AC(1, UL) << PMD_SHIFT)
 #define PMD_MASK   (~(PMD_SIZE-1))
 #define PTRS_PER_PMD   PTRS_PER_PTE
@@ -32,7 +40,7 @@
  * PUD_SHIFT determines the size a level 1 page table entry can map.
  */
 #if CONFIG_PGTABLE_LEVELS > 3
-#define PUD_SHIFT  ((PAGE_SHIFT - 3) * 3 + 3)
+#define PUD_SHIFT  ARM64_HW_PGTABLE_LEVEL_SHIFT(3)
 #define PUD_SIZE   (_AC(1, UL) << PUD_SHIFT)
 #define PUD_MASK   (~(PUD_SIZE-1))
 #define PTRS_PER_PUD   PTRS_PER_PTE
@@ -42,7 +50,8 @@
  * PGDIR_SHIFT determines the size a top-level page table entry can map
  * (depending on the configuration, this level can be 0, 1 or 2).
  */
-#define PGDIR_SHIFT((PAGE_SHIFT - 3) * CONFIG_PGTABLE_LEVELS + 3)
+#define PGDIR_SHIFT\
+   ARM64_HW_PGTABLE_LEVEL_SHIFT(CONFIG_PGTABLE_LEVELS)
 #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT)
 #define PGDIR_MASK (~(PGDIR_SIZE-1))
 #define PTRS_PER_PGD   (1 << (VA_BITS - PGDIR_SHIFT))
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/15] arm64: Add page size to the kernel image header

2015-09-15 Thread Suzuki K. Poulose
From: Ard Biesheuvel 

This patch adds the page size to the arm64 kernel image header
so that one can infer the PAGESIZE used by the kernel. This will
be helpful in diagnosing failures to boot the kernel with a page
size not supported by the CPU.

Signed-off-by: Ard Biesheuvel 
---
 Documentation/arm64/booting.txt |7 ++-
 arch/arm64/kernel/image.h   |5 -
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 7d9d3c2..aaf6d77 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -104,7 +104,12 @@ Header notes:
 - The flags field (introduced in v3.17) is a little-endian 64-bit field
   composed as follows:
   Bit 0:   Kernel endianness.  1 if BE, 0 if LE.
-  Bits 1-63:   Reserved.
+  Bits 1-2:Kernel Page size.
+   0 - Unspecified.
+   1 - 4K
+   2 - 16K
+   3 - 64K
+  Bits 3-63:   Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index 8fae075..73b736c 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -47,7 +47,10 @@
 #define __HEAD_FLAG_BE 0
 #endif
 
-#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0)
+#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+
+#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0) | \
+                       (__HEAD_FLAG_PAGE_SIZE << 1)
 
 /*
  * These will output as part of the Image header, which should be little-endian
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/15] arm64: Kconfig: Fix help text about AArch32 support with 64K pages

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Update the help text for ARM64_64K_PAGES to reflect the reality
about AArch32 support.

Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/Kconfig |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ab0a36f..83bca48 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -350,8 +350,8 @@ config ARM64_64K_PAGES
help
  This feature enables 64KB pages support (4KB by default)
  allowing only two levels of page tables and faster TLB
- look-up. AArch32 emulation is not available when this feature
- is enabled.
+ look-up. AArch32 emulation requires applications compiled
+ with 64K aligned segments.
 
 endchoice
 
-- 
1.7.9.5



[PATCH 11/15] arm64: Cleanup VTCR_EL2 computation

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

No functional changes. Group the common bits for VTCR_EL2
initialisation for better readability. The granule size
and the entry level are controlled by the page size.

Cc: Christoffer Dall <christoffer.d...@linaro.org>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: kvm...@lists.cs.columbia.edu
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index bdf139e..699554d 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -138,6 +138,9 @@
  * The magic numbers used for VTTBR_X in this patch can be found in Tables
  * D4-23 and D4-25 in ARM DDI 0487A.b.
  */
+#define VTCR_EL2_COMMON_BITS   (VTCR_EL2_SH0_INNER | VTCR_EL2_ORGN0_WBWA | \
+VTCR_EL2_IRGN0_WBWA | VTCR_EL2_T0SZ_40B)
+
 #ifdef CONFIG_ARM64_64K_PAGES
 /*
  * Stage2 translation configuration:
@@ -145,9 +148,8 @@
  * 64kB pages (TG0 = 1)
  * 2 level page tables (SL = 1)
  */
-#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_64K | VTCR_EL2_SH0_INNER | \
-VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
-VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
+#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_64K | VTCR_EL2_SL0_LVL1 | \
+VTCR_EL2_COMMON_BITS)
#define VTTBR_X (38 - VTCR_EL2_T0SZ_40B)
 #else
 /*
@@ -156,9 +158,8 @@
  * 4kB pages (TG0 = 0)
  * 3 level page tables (SL = 1)
  */
-#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_4K | VTCR_EL2_SH0_INNER | \
-VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
-VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
+#define VTCR_EL2_FLAGS (VTCR_EL2_TG0_4K | VTCR_EL2_SL0_LVL1 | \
+VTCR_EL2_COMMON_BITS)
#define VTTBR_X (37 - VTCR_EL2_T0SZ_40B)
 #endif
 
-- 
1.7.9.5



[PATCH 08/15] arm64: Check for selected granule support

2015-09-15 Thread Suzuki K. Poulose
From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

Ensure that the selected page size is supported by the
CPU(s).

Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poul...@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
---
 arch/arm64/include/asm/sysreg.h |6 ++
 arch/arm64/kernel/head.S|   24 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index a7f3d4b..e01d323 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -87,4 +87,10 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
 }
 #endif
 
+#define ID_AA64MMFR0_TGran4_SHIFT  28
+#define ID_AA64MMFR0_TGran64_SHIFT 24
+
+#define ID_AA64MMFR0_TGran4_ENABLED    0x0
+#define ID_AA64MMFR0_TGran64_ENABLED   0x0
+
 #endif /* __ASM_SYSREG_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01b8e58..0cb04db 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -31,10 +31,11 @@
#include <asm/cputype.h>
#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
-#include <asm/thread_info.h>
#include <asm/pgtable-hwdef.h>
#include <asm/pgtable.h>
#include <asm/page.h>
+#include <asm/sysreg.h>
+#include <asm/thread_info.h>
#include <asm/virt.h>
 
 #define __PHYS_OFFSET  (KERNEL_START - TEXT_OFFSET)
@@ -606,9 +607,25 @@ ENDPROC(__secondary_switched)
  *  x27 = *virtual* address to jump to upon completion
  *
  * other registers depend on the function called upon completion
+ * Checks if the selected granule size is supported by the CPU.
  */
+#if defined(CONFIG_ARM64_64K_PAGES)
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran64_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran64_ENABLED
+
+#else
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran4_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran4_ENABLED
+
+#endif
.section ".idmap.text", "ax"
 __enable_mmu:
+   mrs x1, ID_AA64MMFR0_EL1
+   ubfxx2, x1, #ID_AA64MMFR0_TGran_SHIFT, 4
+   cmp x2, #ID_AA64MMFR0_TGran_ENABLED
+   b.ne__no_granule_support
ldr x5, =vectors
msr vbar_el1, x5
msr ttbr0_el1, x25  // load TTBR0
@@ -626,3 +643,8 @@ __enable_mmu:
isb
br  x27
 ENDPROC(__enable_mmu)
+
+__no_granule_support:
+   wfe
+   b __no_granule_support
+ENDPROC(__no_granule_support)
-- 
1.7.9.5



Re: [PATCH 00/14] arm64: 16K translation granule support

2015-09-02 Thread Suzuki K. Poulose

On 02/09/15 10:55, Ard Biesheuvel wrote:

On 13 August 2015 at 13:33, Suzuki K. Poulose <suzuki.poul...@arm.com> wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>




Patches 1-7 clean up the kernel page size handling code.
Patches 8-11 fix some issues with the KVM bits, mainly the fake PGD
  handling code.
Patch 12 adds a check to ensure the CPU supports the selected granule size.
Patches 13-14 add the 16k page size support bits.

This series applies on top of for-next/core branch of the aarch64 tree and is
also available here:

 git://linux-arm.org/linux-skp.git  16k/v1




Hi Suzuki,

I have given this a spin on the FVP Base model to check UEFI booting,
and everything seems to work fine. (I tested 2-level and 3-level)
I didn't test the KVM changes, so for all patches except those:

Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>


Thanks for the review and testing !!

Suzuki



Re: [PATCH 02/14] arm64: Handle section maps for swapper/idmap

2015-09-02 Thread Suzuki K. Poulose

On 02/09/15 10:38, Ard Biesheuvel wrote:

On 13 August 2015 at 13:33, Suzuki K. Poulose <suzuki.poul...@arm.com> wrote:

From: "Suzuki K. Poulose" <suzuki.poul...@arm.com>

We use section maps with 4K page size to create the
swapper/idmaps. So far we have used !64K or 4K checks
to handle the case where we use the section maps. This
patch adds a symbol to make it clear those cases.



That sentence does not make sense.


I agree. How about :

"This patch adds a new symbol, 'ARM64_SWAPPER_USES_SECTION_MAPS', to
handle cases where we use section maps, instead of using the page size
symbols."

Suzuki




[PATCH 13/14] arm64: Add 16K page size support

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

This patch turns on the 16K page support in the kernel. We
support 48bit VA (4 level page tables) and 47bit VA (3 level
page tables).

Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Cc: Steve Capper steve.cap...@linaro.org
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/Kconfig   |   25 -
 arch/arm64/include/asm/fixmap.h  |4 +++-
 arch/arm64/include/asm/kvm_arm.h |   12 
 arch/arm64/include/asm/page.h|2 ++
 arch/arm64/include/asm/sysreg.h  |2 ++
 arch/arm64/include/asm/thread_info.h |2 ++
 arch/arm64/kernel/head.S |7 ++-
 arch/arm64/mm/proc.S |4 +++-
 8 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b247897..8327edf 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -167,7 +167,8 @@ config PGTABLE_LEVELS
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
-   default 4 if ARM64_4K_PAGES && ARM64_VA_BITS_48
+   default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
+   default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
 
 source init/Kconfig
 
@@ -444,6 +445,13 @@ config ARM64_4K_PAGES
help
  This feature enables 4KB pages support.
 
+config ARM64_16K_PAGES
+   bool "16KB"
+   help
+ The system will use 16KB pages support. AArch32 emulation
+ requires applications compiled with 16K (or a multiple of 16K)
+ aligned segments.
+
 config ARM64_64K_PAGES
bool "64KB"
help
@@ -457,6 +465,7 @@ endchoice
 choice
prompt "Virtual address space size"
default ARM64_VA_BITS_39 if ARM64_4K_PAGES
+   default ARM64_VA_BITS_47 if ARM64_16K_PAGES
default ARM64_VA_BITS_42 if ARM64_64K_PAGES
help
  Allows choosing one of multiple possible virtual address
@@ -471,6 +480,10 @@ config ARM64_VA_BITS_42
bool "42-bit"
depends on ARM64_64K_PAGES
 
+config ARM64_VA_BITS_47
+   bool "47-bit"
+   depends on ARM64_16K_PAGES
+
 config ARM64_VA_BITS_48
bool "48-bit"
 
@@ -480,6 +493,7 @@ config ARM64_VA_BITS
int
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
+   default 47 if ARM64_VA_BITS_47
default 48 if ARM64_VA_BITS_48
 
 config CPU_BIG_ENDIAN
@@ -550,7 +564,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if ARM64_4K_PAGES
+   def_bool y if ARM64_4K_PAGES || ARM64_16K_PAGES
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
@@ -587,6 +601,7 @@ config XEN
 config FORCE_MAX_ZONEORDER
int
default 14 if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE)
+   default 12 if (ARM64_16K_PAGES && TRANSPARENT_HUGEPAGE)
default 11
 
 menuconfig ARMV8_DEPRECATED
@@ -773,9 +788,9 @@ config COMPAT
  the user helper functions, VFP support and the ptrace interface are
  handled appropriately by the kernel.
 
- If you also enabled CONFIG_ARM64_64K_PAGES, please be aware that you
- will only be able to execute AArch32 binaries that were compiled with
- 64k aligned segments.
+ If you use a page size other than 4KB (i.e. 16KB or 64KB), please be
+ aware that you will only be able to execute AArch32 binaries that were
+ compiled with page size aligned segments.
 
  If you want to execute 32-bit userspace applications, say Y.
 
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index c0739187..f44a390 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -55,8 +55,10 @@ enum fixed_addresses {
 * Temporary boot-time mappings, used by early_ioremap(),
 * before ioremap() is functional.
 */
-#ifdef CONFIG_ARM64_64K_PAGES
+#if defined(CONFIG_ARM64_64K_PAGES)
#define NR_FIX_BTMAPS  4
+#elif defined(CONFIG_ARM64_16K_PAGES)
+#define NR_FIX_BTMAPS  16
 #else
 #define NR_FIX_BTMAPS  64
 #endif
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index dcaf799..4d6a022 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -113,6 +113,7 @@
#define VTCR_EL2_TG0_MASK  (3 << 14)
#define VTCR_EL2_TG0_4K    (0 << 14)
#define VTCR_EL2_TG0_64K   (1 << 14)
+#define VTCR_EL2_TG0_16K   (2 << 14)
#define VTCR_EL2_SH0_MASK  (3 << 12)
#define VTCR_EL2_SH0_INNER (3 << 12)
#define VTCR_EL2_ORGN0_MASK(3 << 10)
@@ -134,6 +135,8 @@
  *
  * Note that when using 4K pages, we concatenate two first level page tables
  * together.
+ * With 16K pages, we concatenate 16 first level page tables and enter at
+ * level

[PATCH 14/14] arm64: 36 bit VA

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

36bit VA lets us use 2 level page tables while limiting the
available address space to 64GB.

Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/Kconfig |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8327edf..0407fd3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -164,6 +164,7 @@ config FIX_EARLYCON_MEM
 
 config PGTABLE_LEVELS
int
+   default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
@@ -472,6 +473,10 @@ choice
  space sizes. The level of translation table is determined by
  a combination of page size and virtual address space size.
 
+config ARM64_VA_BITS_36
+   bool "36-bit"
+   depends on ARM64_16K_PAGES
+
 config ARM64_VA_BITS_39
bool "39-bit"
depends on ARM64_4K_PAGES
@@ -491,6 +496,7 @@ endchoice
 
 config ARM64_VA_BITS
int
+   default 36 if ARM64_VA_BITS_36
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
default 47 if ARM64_VA_BITS_47
@@ -564,7 +570,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if ARM64_4K_PAGES || ARM64_16K_PAGES
+   def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
-- 
1.7.9.5



[PATCH 10/14] arm: kvm: Move fake PGD handling to arch specific files

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

Rearrange the code for fake pgd handling, which is applicable
only to arm64. The intention is to keep the common code cleaner,
unaware of the underlying hacks.

Cc: kvm...@lists.cs.columbia.edu
Cc: christoffer.d...@linaro.org
Cc: marc.zyng...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm/include/asm/kvm_mmu.h   |7 ++
 arch/arm/kvm/mmu.c   |   44 +-
 arch/arm64/include/asm/kvm_mmu.h |   43 +
 3 files changed, 55 insertions(+), 39 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa18..1c9aa8a 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -173,6 +173,13 @@ static inline unsigned int kvm_get_hwpgd_size(void)
return PTRS_PER_S2_PGD * sizeof(pgd_t);
 }
 
+static inline pgd_t *kvm_setup_fake_pgd(pgd_t *pgd)
+{
+   return pgd;
+}
+
+static inline void kvm_free_fake_pgd(pgd_t *pgd) {}
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)   __cpuc_flush_dcache_area((a), (l))
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 7b42012..b210622 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -677,43 +677,11 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
 * guest, we allocate a fake PGD and pre-populate it to point
 * to the next-level page table, which will be the real
 * initial page table pointed to by the VTTBR.
-*
-* When KVM_PREALLOC_LEVEL==2, we allocate a single page for
-* the PMD and the kernel will use folded pud.
-* When KVM_PREALLOC_LEVEL==1, we allocate 2 consecutive PUD
-* pages.
 */
-   if (KVM_PREALLOC_LEVEL > 0) {
-   int i;
-
-   /*
-* Allocate fake pgd for the page table manipulation macros to
-* work.  This is not used by the hardware and we have no
-* alignment requirement for this allocation.
-*/
-   pgd = kmalloc(PTRS_PER_S2_PGD * sizeof(pgd_t),
-   GFP_KERNEL | __GFP_ZERO);
-
-   if (!pgd) {
-   kvm_free_hwpgd(hwpgd);
-   return -ENOMEM;
-   }
-
-   /* Plug the HW PGD into the fake one. */
-   for (i = 0; i < PTRS_PER_S2_PGD; i++) {
-   if (KVM_PREALLOC_LEVEL == 1)
-   pgd_populate(NULL, pgd + i,
-(pud_t *)hwpgd + i * PTRS_PER_PUD);
-   else if (KVM_PREALLOC_LEVEL == 2)
-   pud_populate(NULL, pud_offset(pgd, 0) + i,
-(pmd_t *)hwpgd + i * PTRS_PER_PMD);
-   }
-   } else {
-   /*
-* Allocate actual first-level Stage-2 page table used by the
-* hardware for Stage-2 page table walks.
-*/
-   pgd = (pgd_t *)hwpgd;
+   pgd = kvm_setup_fake_pgd(hwpgd);
+   if (IS_ERR(pgd)) {
+   kvm_free_hwpgd(hwpgd);
+   return PTR_ERR(pgd);
}
 
kvm_clean_pgd(pgd);
@@ -820,9 +788,7 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
 
unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
kvm_free_hwpgd(kvm_get_hwpgd(kvm));
-   if (KVM_PREALLOC_LEVEL > 0)
-   kfree(kvm->arch.pgd);
-
+   kvm_free_fake_pgd(kvm->arch.pgd);
kvm->arch.pgd = NULL;
 }
 
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6150567..2567fe8 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -198,6 +198,49 @@ static inline unsigned int kvm_get_hwpgd_size(void)
return PTRS_PER_S2_PGD * sizeof(pgd_t);
 }
 
+/*
+ * Allocate fake pgd for the page table manipulation macros to
+ * work.  This is not used by the hardware and we have no
+ * alignment requirement for this allocation.
+ */
+static inline pgd_t *kvm_setup_fake_pgd(pgd_t *hwpgd)
+{
+   int i;
+   pgd_t *pgd;
+
+   if (!KVM_PREALLOC_LEVEL)
+   return hwpgd;
+   /*
+* When KVM_PREALLOC_LEVEL==2, we allocate a single page for
+* the PMD and the kernel will use folded pud.
+* When KVM_PREALLOC_LEVEL==1, we allocate 2 consecutive PUD
+* pages.
+*/
+   pgd = kmalloc(PTRS_PER_S2_PGD * sizeof(pgd_t),
+   GFP_KERNEL | __GFP_ZERO);
+
+   if (!pgd)
+   return ERR_PTR(-ENOMEM);
+
+   /* Plug the HW PGD into the fake one. */
+   for (i = 0; i < PTRS_PER_S2_PGD; i++) {
+   if (KVM_PREALLOC_LEVEL == 1)
+   pgd_populate(NULL, pgd + i,
+(pud_t *)hwpgd + i * PTRS_PER_PUD);
+   else if (KVM_PREALLOC_LEVEL == 2

[PATCH 11/14] arm64: kvm: Rewrite fake pgd handling

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

The existing fake pgd handling code assumes that the stage-2 entry
level can only be one level down from that of the host, which may not
always be true (e.g., with the introduction of the 16K page size).

e.g.
With 16k page size and 48bit VA and 40bit IPA we have the following
split for page table levels:

level:  0   1 2 3
bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
 ^   ^ ^
 |   | |
   host entry| x stage-2 entry
 |
IPA -x

The stage-2 entry level is 2, due to the concatenation of 16 tables
at level 2 (mandated by the hardware). So, we need to fake two levels
to actually reach the hyp page table. This case cannot be handled
with the existing code, as all we know about is KVM_PREALLOC_LEVEL,
which stands for two different pieces of information:

1) Whether we have fake page table entry levels.
2) The entry level of stage-2 translation.

We lose the information about the number of fake levels that
we may have to use. Also, the KVM_PREALLOC_LEVEL computation itself
is wrong, as we assume the hw entry level is always one level down
from the host.

This patch introduces two separate indicators:
1) Accurate entry level for stage-2 translation - HYP_PGTABLE_ENTRY_LEVEL -
   using the new helpers.
2) Number of levels of fake pagetable entries. (KVM_FAKE_PGTABLE_LEVELS)

The following conditions hold true for all cases (with 40bit IPA):
1) The stage-2 entry level <= 2
2) The number of fake page-table levels is in the inclusive range [0, 2].

Cc: kvm...@lists.cs.columbia.edu
Cc: christoffer.d...@linaro.org
Cc: marc.zyng...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/include/asm/kvm_mmu.h |  114 --
 1 file changed, 61 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 2567fe8..72cfd9e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -41,18 +41,6 @@
  */
#define TRAMPOLINE_VA  (HYP_PAGE_OFFSET_MASK & PAGE_MASK)
 
-/*
- * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation
- * levels in addition to the PGD and potentially the PUD which are
- * pre-allocated (we pre-allocate the fake PGD and the PUD when the Stage-2
- * tables use one level of tables less than the kernel.
- */
-#ifdef CONFIG_ARM64_64K_PAGES
#define KVM_MMU_CACHE_MIN_PAGES 1
-#else
#define KVM_MMU_CACHE_MIN_PAGES 2
-#endif
-
 #ifdef __ASSEMBLY__
 
 /*
@@ -80,6 +68,26 @@
#define KVM_PHYS_SIZE  (1UL << KVM_PHYS_SHIFT)
 #define KVM_PHYS_MASK  (KVM_PHYS_SIZE - 1UL)
 
+/*
+ * At stage-2 entry level, up to 16 tables can be concatenated and
+ * the hardware expects us to use concatenation, whenever possible.
+ * So, number of page table levels for KVM_PHYS_SHIFT is always
+ * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
+ */
+#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
+/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
+#define HYP_PGTABLE_SHIFT  ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
+#define HYP_PGDIR_SHIFT    ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
+#define HYP_PGTABLE_ENTRY_LEVEL    (4 - HYP_PGTABLE_LEVELS)
+
+/*
+ * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation
+ * levels in addition to the PGD and potentially the PUD which are
+ * pre-allocated (we pre-allocate the fake PGD and the PUD when the Stage-2
+ * tables use one level of tables less than the kernel.
+ */
+#define KVM_MMU_CACHE_MIN_PAGES    (HYP_PGTABLE_LEVELS - 1)
+
 int create_hyp_mappings(void *from, void *to);
 int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
 void free_boot_hyp_pgd(void);
@@ -145,56 +153,41 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 #define kvm_pud_addr_end(addr, end)pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)pmd_addr_end(addr, end)
 
-/*
- * In the case where PGDIR_SHIFT is larger than KVM_PHYS_SHIFT, we can address
- * the entire IPA input range with a single pgd entry, and we would only need
- * one pgd entry.  Note that in this case, the pgd is actually not used by
- * the MMU for Stage-2 translations, but is merely a fake pgd used as a data
- * structure for the kernel pgtable macros to work.
- */
-#if PGDIR_SHIFT  KVM_PHYS_SHIFT
-#define PTRS_PER_S2_PGD_SHIFT  0
+/* Number of concatenated tables in stage-2 entry level */
+#if KVM_PHYS_SHIFT > HYP_PGTABLE_SHIFT
+#define S2_ENTRY_TABLES_SHIFT  (KVM_PHYS_SHIFT - HYP_PGTABLE_SHIFT)
 #else
-#define PTRS_PER_S2_PGD_SHIFT  (KVM_PHYS_SHIFT - PGDIR_SHIFT)
+#define S2_ENTRY_TABLES_SHIFT  0
 #endif
+#define S2_ENTRY_TABLES    (1 << (S2_ENTRY_TABLES_SHIFT))
+
+/* Number of page table levels we fake to reach the hw pgtable for hyp */
+#define KVM_FAKE_PGTABLE_LEVELS

[PATCH 00/14] arm64: 16K translation granule support

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

This series enables the 16K page size support on Linux for arm64.
It adds support for 48bit VA (4 level), 47bit VA (3 level) and
36bit VA (2 level) with 16K. 16K was a late addition to the architecture
and is not implemented by all CPUs. A check has been added to ensure the
selected granule size is supported by the CPU, failing which the CPU
won't proceed with booting.

KVM bits have been tested on a fast model with GICv3 using Andre's kvmtool
with gicv3 support[1].

Patches 1-7 clean up the kernel page size handling code.
Patches 8-11 fix some issues with the KVM bits, mainly the fake PGD
 handling code.
Patch 12 adds a check to ensure the CPU supports the selected granule size.
Patches 13-14 add the 16k page size support bits.

This series applies on top of for-next/core branch of the aarch64 tree and is
also available here:

git://linux-arm.org/linux-skp.git  16k/v1

[1] git://linux-arm.org/kvmtool.git gicv3/v4

TODO:
 1) Testing on silicon
 2) Analyse the performance of HugePages with 16K (32MB) on silicon.
 3) SMMU driver

Suzuki K. Poulose (14):
  arm64: Move swapper pagetable definitions
  arm64: Handle section maps for swapper/idmap
  arm64: Introduce helpers for page table levels
  arm64: Calculate size for idmap_pg_dir at compile time
  arm64: Handle 4 level page table for swapper
  arm64: Clean config usages for page size
  arm64: Kconfig: Fix help text about AArch32 support with 64K pages
  arm64: kvm: Fix {V}TCR_EL2_TG0 mask
  arm64: Cleanup VTCR_EL2 computation
  arm: kvm: Move fake PGD handling to arch specific files
  arm64: kvm: Rewrite fake pgd handling
  arm64: Check for selected granule support
  arm64: Add 16K page size support
  arm64: 36 bit VA

 arch/arm/include/asm/kvm_mmu.h  |7 ++
 arch/arm/kvm/mmu.c  |   44 ++
 arch/arm64/Kconfig  |   37 +++--
 arch/arm64/Kconfig.debug|2 +-
 arch/arm64/include/asm/boot.h   |1 +
 arch/arm64/include/asm/fixmap.h |4 +-
 arch/arm64/include/asm/kernel-pgtable.h |   77 ++
 arch/arm64/include/asm/kvm_arm.h|   29 +--
 arch/arm64/include/asm/kvm_mmu.h|  135 +--
 arch/arm64/include/asm/page.h   |   20 +
 arch/arm64/include/asm/pgtable-hwdef.h  |   15 +++-
 arch/arm64/include/asm/sysreg.h |8 ++
 arch/arm64/include/asm/thread_info.h|4 +-
 arch/arm64/kernel/head.S|   71 +---
 arch/arm64/kernel/vmlinux.lds.S |1 +
 arch/arm64/mm/mmu.c |   70 +++-
 arch/arm64/mm/proc.S|4 +-
 17 files changed, 337 insertions(+), 192 deletions(-)
 create mode 100644 arch/arm64/include/asm/kernel-pgtable.h

-- 
1.7.9.5



[PATCH 12/14] arm64: Check for selected granule support

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

Ensure that the selected page size is supported by the
CPU(s).

Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/include/asm/sysreg.h |6 ++
 arch/arm64/kernel/head.S|   24 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index a7f3d4b..e01d323 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -87,4 +87,10 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
 }
 #endif
 
+#define ID_AA64MMFR0_TGran4_SHIFT  28
+#define ID_AA64MMFR0_TGran64_SHIFT 24
+
+#define ID_AA64MMFR0_TGran4_ENABLED    0x0
+#define ID_AA64MMFR0_TGran64_ENABLED   0x0
+
 #endif /* __ASM_SYSREG_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01b8e58..0cb04db 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -31,10 +31,11 @@
#include <asm/cputype.h>
#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
-#include <asm/thread_info.h>
#include <asm/pgtable-hwdef.h>
#include <asm/pgtable.h>
#include <asm/page.h>
+#include <asm/sysreg.h>
+#include <asm/thread_info.h>
#include <asm/virt.h>
 
 #define __PHYS_OFFSET  (KERNEL_START - TEXT_OFFSET)
@@ -606,9 +607,25 @@ ENDPROC(__secondary_switched)
  *  x27 = *virtual* address to jump to upon completion
  *
  * other registers depend on the function called upon completion
+ * Checks if the selected granule size is supported by the CPU.
  */
+#if defined(CONFIG_ARM64_64K_PAGES)
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran64_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran64_ENABLED
+
+#else
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran4_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran4_ENABLED
+
+#endif
.section ".idmap.text", "ax"
 __enable_mmu:
+   mrs x1, ID_AA64MMFR0_EL1
+   ubfxx2, x1, #ID_AA64MMFR0_TGran_SHIFT, 4
+   cmp x2, #ID_AA64MMFR0_TGran_ENABLED
+   b.ne__no_granule_support
ldr x5, =vectors
msr vbar_el1, x5
msr ttbr0_el1, x25  // load TTBR0
@@ -626,3 +643,8 @@ __enable_mmu:
isb
br  x27
 ENDPROC(__enable_mmu)
+
+__no_granule_support:
+   wfe
+   b __no_granule_support
+ENDPROC(__no_granule_support)
-- 
1.7.9.5



[PATCH 08/14] arm64: kvm: Fix {V}TCR_EL2_TG0 mask

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

{V}TCR_EL2_TG0 is a 2bit wide field, where:

 00 - 4K
 01 - 64K
 10 - 16K

But we use only 1 bit, which has worked well so far since
we never cared about 16K. Fix it for 16K support.

Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Cc: Marc Zyngier marc.zyng...@arm.com
Cc: Christoffer Dall christoffer.d...@linaro.org
Cc: kvm...@lists.cs.columbia.edu
Acked-by: Mark Rutland mark.rutl...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/include/asm/kvm_arm.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index ac6fafb..52dc9cc 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -98,7 +98,7 @@
#define TCR_EL2_TBI    (1 << 20)
#define TCR_EL2_PS     (7 << 16)
#define TCR_EL2_PS_40B (2 << 16)
-#define TCR_EL2_TG0    (1 << 14)
+#define TCR_EL2_TG0    (3 << 14)
#define TCR_EL2_SH0    (3 << 12)
#define TCR_EL2_ORGN0  (3 << 10)
#define TCR_EL2_IRGN0  (3 << 8)
@@ -110,7 +110,7 @@
 
 /* VTCR_EL2 Registers bits */
#define VTCR_EL2_PS_MASK   (7 << 16)
-#define VTCR_EL2_TG0_MASK  (1 << 14)
+#define VTCR_EL2_TG0_MASK  (3 << 14)
#define VTCR_EL2_TG0_4K    (0 << 14)
#define VTCR_EL2_TG0_64K   (1 << 14)
#define VTCR_EL2_SH0_MASK  (3 << 12)
-- 
1.7.9.5



[PATCH 05/14] arm64: Handle 4 level page table for swapper

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

At the moment, we only support maximum of 3-level page table for
swapper. With 48bit VA, 64K has only 3 levels and 4K uses section
mapping. Add support for 4-level page table for swapper, needed
by 16K pages.

Cc: Ard Biesheuvel ard.biesheu...@linaro.org
Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/kernel/head.S |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 46670bf..01b8e58 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -271,7 +271,10 @@ ENDPROC(preserve_boot_args)
  */
.macro  create_pgd_entry, tbl, virt, tmp1, tmp2
create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
-#if SWAPPER_PGTABLE_LEVELS == 3
+#if SWAPPER_PGTABLE_LEVELS > 3
+   create_table_entry \tbl, \virt, PUD_SHIFT, PTRS_PER_PUD, \tmp1, \tmp2
+#endif
+#if SWAPPER_PGTABLE_LEVELS > 2
create_table_entry \tbl, \virt, SWAPPER_TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
 #endif
.endm
-- 
1.7.9.5



[PATCH 02/14] arm64: Handle section maps for swapper/idmap

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

We use section maps with 4K page size to create the
swapper/idmaps. So far we have used !64K or 4K checks
to handle the case where we use the section maps. This
patch adds a symbol to make it clear those cases.

Cc: Ard Biesheuvel ard.biesheu...@linaro.org
Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/include/asm/kernel-pgtable.h |   31 +-
 arch/arm64/mm/mmu.c |   70 ++-
 2 files changed, 51 insertions(+), 50 deletions(-)

diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 622929d..5876a36 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -19,6 +19,13 @@
 #ifndef __ASM_KERNEL_PGTABLE_H
 #define __ASM_KERNEL_PGTABLE_H
 
+/* With 4K pages, we use section maps. */
+#ifdef CONFIG_ARM64_4K_PAGES
+#define ARM64_SWAPPER_USES_SECTION_MAPS 1
+#else
+#define ARM64_SWAPPER_USES_SECTION_MAPS 0
+#endif
+
 /*
  * The idmap and swapper page tables need some space reserved in the kernel
  * image. Both require pgd, pud (4 levels only) and pmd tables to (section)
@@ -28,26 +35,28 @@
  * could be increased on the fly if system RAM is out of reach for the default
  * VA range, so 3 pages are reserved in all cases.
  */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#else
+#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
 #endif
 
 #define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
 #define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
 
 /* Initial memory map size */
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_BLOCK_SHIFTPAGE_SHIFT
-#define SWAPPER_BLOCK_SIZE PAGE_SIZE
-#define SWAPPER_TABLE_SHIFTPMD_SHIFT
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_BLOCK_SHIFTSECTION_SHIFT
 #define SWAPPER_BLOCK_SIZE SECTION_SIZE
 #define SWAPPER_TABLE_SHIFTPUD_SHIFT
+#else
+#define SWAPPER_BLOCK_SHIFTPAGE_SHIFT
+#define SWAPPER_BLOCK_SIZE PAGE_SIZE
+#define SWAPPER_TABLE_SHIFTPMD_SHIFT
 #endif
 
+/* The size of the initial kernel direct mapping */
+#define SWAPPER_INIT_MAP_SIZE  (_AC(1, UL) << SWAPPER_TABLE_SHIFT)
 
 /*
  * Initial memory map attributes.
@@ -55,10 +64,10 @@
 #define SWAPPER_PTE_FLAGS  PTE_TYPE_PAGE | PTE_AF | PTE_SHARED
 #define SWAPPER_PMD_FLAGS  PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S
 
-#ifdef CONFIG_ARM64_64K_PAGES
-#define SWAPPER_MM_MMUFLAGSPTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS
-#else
+#if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_MM_MMUFLAGSPMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS
+#else
+#define SWAPPER_MM_MMUFLAGSPTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS
 #endif
 
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 9211b85..71230488 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -32,6 +32,7 @@
 
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/sizes.h>
@@ -353,14 +354,11 @@ static void __init map_mem(void)
 * memory addressable from the initial direct kernel mapping.
 *
 * The initial direct kernel mapping, located at swapper_pg_dir, gives
-* us PUD_SIZE (4K pages) or PMD_SIZE (64K pages) memory starting from
-* PHYS_OFFSET (which must be aligned to 2MB as per
-* Documentation/arm64/booting.txt).
+* us PUD_SIZE (with SECTION maps, i.e, 4K) or PMD_SIZE (without
+* SECTION maps, i.e, 64K pages) memory starting from PHYS_OFFSET
+* (which must be aligned to 2MB as per Documentation/arm64/booting.txt).
 */
-   if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
-   limit = PHYS_OFFSET + PMD_SIZE;
-   else
-   limit = PHYS_OFFSET + PUD_SIZE;
+   limit = PHYS_OFFSET + SWAPPER_INIT_MAP_SIZE;
memblock_set_current_limit(limit);
 
/* map all the memory banks */
@@ -371,21 +369,24 @@ static void __init map_mem(void)
 if (start >= end)
break;
 
-#ifndef CONFIG_ARM64_64K_PAGES
-   /*
-* For the first memory bank align the start address and
-* current memblock limit to prevent create_mapping() from
-* allocating pte page tables from unmapped memory.
-* When 64K pages are enabled, the pte page table for the
-* first PGDIR_SIZE is already present in swapper_pg_dir.
-*/
-   if (start < limit)
-   start = ALIGN(start, PMD_SIZE);
-   if (end < limit) {
-   limit = end & PMD_MASK;

[PATCH 07/14] arm64: Kconfig: Fix help text about AArch32 support with 64K pages

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

Update the help text for ARM64_64K_PAGES to reflect the reality
about AArch32 support.

Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/Kconfig |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d1fb2a3..b247897 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -449,8 +449,8 @@ config ARM64_64K_PAGES
help
  This feature enables 64KB pages support (4KB by default)
  allowing only two levels of page tables and faster TLB
- look-up. AArch32 emulation is not available when this feature
- is enabled.
+ look-up. AArch32 emulation requires applications compiled
+ with 64K aligned segments.
 
 endchoice
 
-- 
1.7.9.5



[PATCH 06/14] arm64: Clean config usages for page size

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

We use !CONFIG_ARM64_64K_PAGES for CONFIG_ARM64_4K_PAGES
(and vice versa) in code. That has worked well so far, since
we only had two options. Now, with the introduction of 16K,
these checks will break. This patch cleans up the code to
use the required CONFIG symbol expression without the assumption
that !64K => 4K (and vice versa).

Cc: Ard Biesheuvel ard.biesheu...@linaro.org
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Cc: Steve Capper steve.cap...@linaro.org
Acked-by: Mark Rutland mark.rutl...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/Kconfig   |4 ++--
 arch/arm64/Kconfig.debug |2 +-
 arch/arm64/include/asm/thread_info.h |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 78b89fa..d1fb2a3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -550,7 +550,7 @@ config ARCH_WANT_GENERAL_HUGETLB
def_bool y
 
 config ARCH_WANT_HUGE_PMD_SHARE
-   def_bool y if !ARM64_64K_PAGES
+   def_bool y if ARM64_4K_PAGES
 
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
@@ -762,7 +762,7 @@ source fs/Kconfig.binfmt
 
 config COMPAT
bool Kernel support for 32-bit EL0
-   depends on !ARM64_64K_PAGES || EXPERT
+   depends on ARM64_4K_PAGES || EXPERT
select COMPAT_BINFMT_ELF
select HAVE_UID16
select OLD_SIGSUSPEND3
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index d6285ef..c24d6ad 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -77,7 +77,7 @@ config DEBUG_RODATA
   If in doubt, say Y
 
 config DEBUG_ALIGN_RODATA
-   depends on DEBUG_RODATA && !ARM64_64K_PAGES
+   depends on DEBUG_RODATA && ARM64_4K_PAGES
bool Align linker sections up to SECTION_SIZE
help
  If this option is enabled, sections that may potentially be marked as
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index dcd06d1..d9c8c9f 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -23,7 +23,7 @@
 
 #include <linux/compiler.h>
 
-#ifndef CONFIG_ARM64_64K_PAGES
+#ifdef CONFIG_ARM64_4K_PAGES
 #define THREAD_SIZE_ORDER  2
 #endif
 
-- 
1.7.9.5



[PATCH 04/14] arm64: Calculate size for idmap_pg_dir at compile time

2015-08-13 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

Now that we can calculate the number of levels required for
mapping a VA width, reserve the exact number of pages that would
be required to cover the idmap. The idmap should be able to handle
the maximum physical address size supported.

Cc: Ard Biesheuvel ard.biesheu...@linaro.org
Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 arch/arm64/include/asm/boot.h   |1 +
 arch/arm64/include/asm/kernel-pgtable.h |7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b6..678b63e 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -2,6 +2,7 @@
 #ifndef __ASM_BOOT_H
 #define __ASM_BOOT_H
 
+#include <asm/page.h>
 #include <asm/sizes.h>
 
 /*
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 5876a36..def7168 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -33,16 +33,19 @@
  * map to pte level. The swapper also maps the FDT (see __create_page_tables
  * for more information). Note that the number of ID map translation levels
  * could be increased on the fly if system RAM is out of reach for the default
- * VA range, so 3 pages are reserved in all cases.
+ * VA range, so pages required to map highest possible PA are reserved in all
+ * cases.
  */
 #if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
+#define IDMAP_PGTABLE_LEVELS   (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT) - 1)
 #else
 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
+#define IDMAP_PGTABLE_LEVELS   (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT))
 #endif
 
 #define SWAPPER_DIR_SIZE   (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
-#define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
+#define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
 
 /* Initial memory map size */
 #if ARM64_SWAPPER_USES_SECTION_MAPS
-- 
1.7.9.5
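The `IDMAP_PGTABLE_LEVELS` arithmetic above can be sketched the same way: compute the levels needed to map `PHYS_MASK_SHIFT` bits of physical address, and drop one table page when section maps fold the last level. A hedged sketch (illustrative helper name, not kernel code; stands in for `ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT)` minus the section-map adjustment):

```c
#include <assert.h>

/* Table pages reserved for the idmap: levels needed to map
 * `phys_shift` bits with pages of size 2^page_shift, where each
 * level resolves (page_shift - 3) bits; with section maps the
 * last level is folded, so one fewer page is needed. */
static int idmap_pgtable_levels(int phys_shift, int page_shift,
				int uses_section_maps)
{
	int bits_per_level = page_shift - 3;
	int levels = (phys_shift - page_shift + bits_per_level - 1)
			/ bits_per_level;

	return uses_section_maps ? levels - 1 : levels;
}
/* IDMAP_DIR_SIZE then becomes idmap_pgtable_levels(...) * PAGE_SIZE */
```

For 4K pages with a 48-bit PA and section maps this yields 3 pages, matching the previous hard-coded `3 * PAGE_SIZE` reservation, while 16K pages correctly get 4.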



Re: [PATCH 12/14] arm64: Check for selected granule support

2015-08-13 Thread Suzuki K. Poulose

On 13/08/15 13:28, Steve Capper wrote:

On 13 August 2015 at 12:34, Suzuki K. Poulose suzuki.poul...@arm.com wrote:

From: Suzuki K. Poulose suzuki.poul...@arm.com

Ensure that the selected page size is supported by the
CPU(s).

Cc: Mark Rutland mark.rutl...@arm.com
Cc: Catalin Marinas catalin.mari...@arm.com
Cc: Will Deacon will.dea...@arm.com
Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
  arch/arm64/include/asm/sysreg.h |6 ++
  arch/arm64/kernel/head.S|   24 +++-
  2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index a7f3d4b..e01d323 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -87,4 +87,10 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
  }
  #endif

+#define ID_AA64MMFR0_TGran4_SHIFT  28
+#define ID_AA64MMFR0_TGran64_SHIFT 24
+
+#define ID_AA64MMFR0_TGran4_ENABLED0x0
+#define ID_AA64MMFR0_TGran64_ENABLED   0x0
+
  #endif /* __ASM_SYSREG_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01b8e58..0cb04db 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -31,10 +31,11 @@
   #include <asm/cputype.h>
   #include <asm/kernel-pgtable.h>
   #include <asm/memory.h>
-#include <asm/thread_info.h>
   #include <asm/pgtable-hwdef.h>
   #include <asm/pgtable.h>
   #include <asm/page.h>
+#include <asm/sysreg.h>
+#include <asm/thread_info.h>
   #include <asm/virt.h>

  #define __PHYS_OFFSET  (KERNEL_START - TEXT_OFFSET)
@@ -606,9 +607,25 @@ ENDPROC(__secondary_switched)
   *  x27 = *virtual* address to jump to upon completion
   *
   * other registers depend on the function called upon completion
+ * Checks if the selected granule size is supported by the CPU.
   */
+#if defined(CONFIG_ARM64_64K_PAGES)
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran64_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran64_ENABLED
+
+#else
+
+#define ID_AA64MMFR0_TGran_SHIFT   ID_AA64MMFR0_TGran4_SHIFT
+#define ID_AA64MMFR0_TGran_ENABLED ID_AA64MMFR0_TGran4_ENABLED
+
+#endif
  .section    ".idmap.text", "ax"
  __enable_mmu:
+   mrs x1, ID_AA64MMFR0_EL1
+   ubfx    x2, x1, #ID_AA64MMFR0_TGran_SHIFT, 4
+   cmp x2, #ID_AA64MMFR0_TGran_ENABLED
+   b.ne    __no_granule_support
 ldr x5, =vectors
 msr vbar_el1, x5
 msr ttbr0_el1, x25  // load TTBR0
@@ -626,3 +643,8 @@ __enable_mmu:
 isb
 br  x27
  ENDPROC(__enable_mmu)
+
+__no_granule_support:
+   wfe
+   b __no_granule_support
+ENDPROC(__no_granule_support)
--
1.7.9.5



Hi Suzuki,
Is it possible to tell the user that the kernel has failed to boot due
to the kernel granule being unsupported?


We don't have anything up at this time. The looping address is actually a clue
to the (expert) user. Not sure we can do much more until we get something like
DEBUG_LL(?). Or should we let it continue and end in a panic(?). The current
situation can boot a multi-cluster system where the boot cluster has TGran
support (which doesn't make a strong use case, though). I will try out some
options and get back to you.


Thanks
Suzuki



Cheers,
--
Steve
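For readers following the patch above, the `ubfx`/`cmp` sequence added to `__enable_mmu` can be rendered as a small C sketch (illustrative only; `granule_supported` is a made-up helper, and the field encodings are the ones defined in the patch's sysreg.h hunk):

```c
#include <assert.h>
#include <stdint.h>

#define ID_AA64MMFR0_TGRAN4_SHIFT	28
#define ID_AA64MMFR0_TGRAN64_SHIFT	24
#define ID_AA64MMFR0_TGRAN_ENABLED	0x0	/* 0xf means "not supported" */

/* Mirror of the ubfx + cmp in __enable_mmu: extract the 4-bit TGran
 * field for the configured granule from ID_AA64MMFR0_EL1 and compare
 * it against the "supported" encoding. */
static int granule_supported(uint64_t mmfr0, unsigned int shift)
{
	return ((mmfr0 >> shift) & 0xf) == ID_AA64MMFR0_TGRAN_ENABLED;
}
```

If the field is not 0x0, the boot CPU parks itself in the `__no_granule_support` wfe loop discussed above.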





Re: [PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-23 Thread Suzuki K. Poulose

On 23/06/15 13:39, Christoffer Dall wrote:

On Mon, Jun 22, 2015 at 09:44:48AM +0100, Peter Maydell wrote:

On 17 June 2015 at 10:00, Suzuki K. Poulose suzuki.poul...@arm.com wrote:

From: Suzuki K. Poulose suzuki.poul...@arm.com

This patch adds a generic ARM v8 KVM target cpu type for use
by new CPUs which eventually end up using the common sys_reg
table. For backward compatibility the existing targets have been
preserved. Any new target CPU that can be covered by the generic v8
sys_reg tables should make use of the new generic target.


How do you intend this to work for cross-host migration?
Is the idea that the kernel guarantees that generic looks
100% the same to the guest regardless of host hardware? I'm
not sure that can be made to work, given impdef differences
in ID register values, bp/wp registers, and so on.

Given that, it seems to me that we still need to provide
KVM_ARM_TARGET_$THISCPU defines so userspace can request
a specific guest CPU flavour; so what does this patch
provide that isn't already provided by just having userspace
query for the preferred CPU type as it does already?


I'm guessing the intention is to avoid having to add code in the kernel
to support KVM on a new CPU where nothing else needs to be done to
support KVM on that system.

Yes, that's the *only* motivation behind the patch; it doesn't address
the migration issue. Maybe we can create a dummy set of values for
the ID registers, which doesn't provide any 'special functionality',
so that it is safe to migrate across any host?



Wrt. migration, I was also wondering about this.  Would the differences
in the CPU architecture be detected when feeding back the invariant
sysregs from userspace on VM restore?

-Christoffer



Suzuki



[PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-17 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

This patch adds a generic ARM v8 KVM target cpu type for use
by new CPUs which eventually end up using the common sys_reg
table. For backward compatibility the existing targets have been
preserved. Any new target CPU that can be covered by the generic v8
sys_reg tables should make use of the new generic target.

Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
Acked-by: Marc Zyngier marc.zyng...@arm.com
---
 arch/arm64/include/uapi/asm/kvm.h|   10 --
 arch/arm64/kvm/guest.c   |3 ++-
 arch/arm64/kvm/sys_regs_generic_v8.c |2 ++
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index d268320..f5de418 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -53,14 +53,20 @@ struct kvm_regs {
struct user_fpsimd_state fp_regs;
 };
 
-/* Supported Processor Types */
+/*
+ * Supported CPU Targets - Adding a new target type is not recommended,
+ * unless there are some special registers not supported by the
+ * genericv8 sysreg table.
+ */
 #define KVM_ARM_TARGET_AEM_V8  0
 #define KVM_ARM_TARGET_FOUNDATION_V8   1
 #define KVM_ARM_TARGET_CORTEX_A57  2
 #define KVM_ARM_TARGET_XGENE_POTENZA   3
 #define KVM_ARM_TARGET_CORTEX_A53  4
+/* Generic ARM v8 target */
+#define KVM_ARM_TARGET_GENERIC_V8  5
 
-#define KVM_ARM_NUM_TARGETS5
+#define KVM_ARM_NUM_TARGETS6
 
 /* KVM_ARM_SET_DEVICE_ADDR ioctl id encoding */
 #define KVM_ARM_DEVICE_TYPE_SHIFT  0
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 9535bd5..124aa57 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -293,7 +293,8 @@ int __attribute_const__ kvm_target_cpu(void)
break;
};
 
-   return -EINVAL;
+   /* Return a default generic target */
+   return KVM_ARM_TARGET_GENERIC_V8;
 }
 
 int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c b/arch/arm64/kvm/sys_regs_generic_v8.c
index 475fd29..1e45768 100644
--- a/arch/arm64/kvm/sys_regs_generic_v8.c
+++ b/arch/arm64/kvm/sys_regs_generic_v8.c
@@ -94,6 +94,8 @@ static int __init sys_reg_genericv8_init(void)
  genericv8_target_table);
kvm_register_target_sys_reg_table(KVM_ARM_TARGET_XGENE_POTENZA,
  genericv8_target_table);
+   kvm_register_target_sys_reg_table(KVM_ARM_TARGET_GENERIC_V8,
+ genericv8_target_table);
 
return 0;
 }
-- 
1.7.9.5



[PATCH] kvmtool: virtio-9p: Convert EMFILE error at the server to ENFILE for the guest

2015-01-16 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

If an open at the 9p server (host) fails with EMFILE (too many open files for
the process), we should return ENFILE (too many open files in the system) to
the guest to indicate the actual status within the guest.

This was uncovered during LTP, where getdtablesize01 fails to open the maximum
number of open files.

getdtablesize01    0  TINFO  :  Maximum number of files a process can have opened is 1024
getdtablesize01    0  TINFO  :  Checking with the value returned by getrlimit...RLIMIT_NOFILE
getdtablesize01    1  TPASS  :  got correct dtablesize, value is 1024
getdtablesize01    0  TINFO  :  Checking Max num of files that can be opened by a process.Should be: RLIMIT_NOFILE - 1
getdtablesize01    2  TFAIL  :  getdtablesize01.c:102: 974 != 1023

For a more practical impact:

 # ./getdtablesize01 
[1] 1834
 getdtablesize01    0  TINFO  :  Maximum number of files a process can have opened is 1024
 getdtablesize01    0  TINFO  :  Checking with the value returned by getrlimit...RLIMIT_NOFILE
 getdtablesize01    1  TPASS  :  got correct dtablesize, value is 1024
 getdtablesize01    0  TINFO  :  Checking Max num of files that can be opened by a process.Should be: RLIMIT_NOFILE - 1
 getdtablesize01    2  TFAIL  :  getdtablesize01.c:102: 974 != 1023
 [--- Modified to sleep indefinitely, without closing the files --- ]

 # ls
 bash: /bin/ls: Too many open files

That gives a misleading error message in bash when getdtablesize01 has
exhausted the system-wide limit, giving false indicators.

With the fix, we get :

 # ls
 bash: /bin/ls: Too many open files in system

Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
---
 tools/kvm/virtio/9p.c |4 
 1 file changed, 4 insertions(+)

diff --git a/tools/kvm/virtio/9p.c b/tools/kvm/virtio/9p.c
index 9073a1e..b24c0f2 100644
--- a/tools/kvm/virtio/9p.c
+++ b/tools/kvm/virtio/9p.c
@@ -152,6 +152,10 @@ static void virtio_p9_error_reply(struct p9_dev *p9dev,
 {
u16 tag;
 
+   /* EMFILE at server implies ENFILE for the VM */
+   if (err == EMFILE)
+   err = ENFILE;
+
	pdu->write_offset = VIRTIO_9P_HDR_LEN;
	virtio_p9_pdu_writef(pdu, "d", err);
	*outlen = pdu->write_offset;
-- 
1.7.9.5
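The one-line translation the patch adds can be isolated as a sketch (hypothetical helper name; the real code does this inline in `virtio_p9_error_reply()`):

```c
#include <assert.h>
#include <errno.h>

/* EMFILE (per-process fd limit hit in the host server) is reported to
 * the guest as ENFILE (system-wide limit): from inside the VM, the
 * server process's limit looks like a system-wide resource limit.
 * All other errno values pass through unchanged. */
static int host_to_guest_errno(int err)
{
	return (err == EMFILE) ? ENFILE : err;
}
```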




[PATCH] [kvmtool]: Use the arch default transport method for network

2014-12-16 Thread Suzuki K. Poulose
From: Suzuki K. Poulose suzuki.poul...@arm.com

lkvm by default sets up a virtio-pci transport for the network, if none is
specified. This can be a problem on archs (e.g. ARM64) where virtio-pci is
not supported yet, and causes the following warning at exit.

  # KVM compatibility warning.
	virtio-net device was not detected.
	While you have requested a virtio-net device, the guest kernel did not initialize it.
	Please make sure that the guest kernel was compiled with CONFIG_VIRTIO_NET=y enabled in .config.

This patch changes it to make use of the default transport method for the
architecture when none is specified. This will ensure that on every arch
we get the network up by default in the VM.

Applies on top of the kvm/arm branch in Will's kvmtool tree.

Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
Acked-by: Will Deacon will.dea...@arm.com
---
 tools/kvm/include/kvm/virtio.h |1 +
 tools/kvm/virtio/core.c|9 +
 tools/kvm/virtio/net.c |   21 +++--
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio.h b/tools/kvm/include/kvm/virtio.h
index 8a9eab5..768ee96 100644
--- a/tools/kvm/include/kvm/virtio.h
+++ b/tools/kvm/include/kvm/virtio.h
@@ -160,6 +160,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
struct virtio_ops *ops, enum virtio_trans trans,
int device_id, int subsys_id, int class);
 int virtio_compat_add_message(const char *device, const char *config);
+const char* virtio_trans_name(enum virtio_trans trans);
 
 static inline void *virtio_get_vq(struct kvm *kvm, u32 pfn, u32 page_size)
 {
diff --git a/tools/kvm/virtio/core.c b/tools/kvm/virtio/core.c
index 9ae7887..3b6e4d7 100644
--- a/tools/kvm/virtio/core.c
+++ b/tools/kvm/virtio/core.c
@@ -12,6 +12,15 @@
 #include kvm/kvm.h
 
 
+const char* virtio_trans_name(enum virtio_trans trans)
+{
+   if (trans == VIRTIO_PCI)
+   return "pci";
+   else if (trans == VIRTIO_MMIO)
+   return "mmio";
+   return "unknown";
+}
+
 struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32 head, u32 len)
 {
struct vring_used_elem *used_elem;
diff --git a/tools/kvm/virtio/net.c b/tools/kvm/virtio/net.c
index c8af385..e9daea4 100644
--- a/tools/kvm/virtio/net.c
+++ b/tools/kvm/virtio/net.c
@@ -758,6 +758,7 @@ static int virtio_net__init_one(struct virtio_net_params *params)
int i, err;
struct net_dev *ndev;
struct virtio_ops *ops;
+   enum virtio_trans trans = VIRTIO_DEFAULT_TRANS(params->kvm);
 
ndev = calloc(1, sizeof(struct net_dev));
if (ndev == NULL)
@@ -799,12 +800,20 @@ static int virtio_net__init_one(struct virtio_net_params *params)
}
 
*ops = net_dev_virtio_ops;
-   if (params->trans && strcmp(params->trans, "mmio") == 0)
-   virtio_init(params->kvm, ndev, &ndev->vdev, ops, VIRTIO_MMIO,
-   PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET);
-   else
-   virtio_init(params->kvm, ndev, &ndev->vdev, ops, VIRTIO_PCI,
-   PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET);
+
+   if (params->trans) {
+   if (strcmp(params->trans, "mmio") == 0)
+   trans = VIRTIO_MMIO;
+   else if (strcmp(params->trans, "pci") == 0)
+   trans = VIRTIO_PCI;
+   else
+   pr_warning("virtio-net: Unknown transport method : %s, "
+  "falling back to %s.", params->trans,
+  virtio_trans_name(trans));
+   }
+
+   virtio_init(params->kvm, ndev, &ndev->vdev, ops, trans,
+   PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET);
 
	if (params->vhost)
		virtio_net__vhost_init(params->kvm, ndev);
-- 
1.7.9.5

