Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-17 Thread Andre Przywara
Hi,



>> So Xen does not need to throw in its own ideas here. Which would avoid
>> some of the hard problems we encountered.
> I got all your points.
> Just one question: why does the existing CPUFreq on x86 have its own logic? Is
> there something different on ARM that means having its own logic in Xen
> doesn't make any sense?

That's a good question. From quickly poking some people in #xendevel,
Julien learnt that CPUFreq on x86 might not really work well or at least
not as expected.
So the benefit is not even clear there. It just went in the tree once,
and possibly nobody ever revisited it since.
And even if there were good reasons back then, modern CPUs tend to be
quite different in terms of power characteristics.



>> 0. Decide whether CPUFreq justifies 1.-4. in the first place.
> Sure,
>> That sounds like a lot of work and code, so we should be sure it's worth it.
>>
>> I wonder if you could provide some input, ideally measurements on the
>> actual power savings CPUFreq provides.
> Well, I think I will be able to provide some numbers when a firmware,
> which runs on the SoC
> I am using, is ready. Actually, currently I have an emulator without
> any real freq/volt changes.

Yes, some actual numbers would very much help the case. I don't think
you need very sophisticated equipment: just running a workload once with
and once without CPUFreq and comparing the power consumption would be a
good start. This could be as easy as measuring the (m)Wh consumed with
some wall-plug type power meter. I use a very cheap USB power
meter[1], which I put between the PSU and some single board computer to
get an idea of what the power consumption is. Surely not really
reliable, but better than nothing.

>> Does the wish to have CPUFreq purely come from some "tick-the-box"
>> exercise? As in: We have it on native Linux, so we need it in Xen?
> As I said before, we are interested in purely embedded use-cases
> where power consumption is a question.
> If you know how to save power without having CPUFreq involved I would
> appreciate the pointers.

As Julien said, I guess idling and CPU offlining/CPU suspend (via PSCI)
would be a good start to look at. You could try to get some numbers on
this as well.

Cheers,
Andre.

[1]
https://www.ebay.co.uk/itm/USB-Charger-Doctor-Voltage-Current-Meter-Mobile-Battery-Tester-Power-Detector-UK/263220956905

>> What power savings can we expect from CPUFreq? Can those possible
>> savings be transferred into a virtualized environment at all? And do
>> those saving justify all the extra code in Xen?
>>
>> I think those questions need to be answered first, then we can discuss
>> about the implementation details.
> OK.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-16 Thread Andre Przywara
Hi,

On 16/11/17 14:57, Oleksandr Tyshchenko wrote:
> On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
> <andre.przyw...@linaro.org> wrote:
>> Hi,
> Hi Andre, Jassi
> 
> Thank you for your comments!
> 
>>
>> On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
>>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>>> <andre.przyw...@linaro.org> wrote:
>>>> Hi,
>>> Hi Andre
>>>
>>>>
>>>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>>>> <andre.przyw...@linaro.org> wrote:
>>>>>> Hi,
>>>>> Hi Andre,
>>>>>
>>>>>>
>>>>>> thanks very much for your work on this!
>>>>> Thank you for your comments.
>>>>>
>>>>>>
>>>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshche...@epam.com>
>>>>>>>
>>>>>>> Hi, all.
>>>>>>>
>>>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen 
>>>>>>> on ARM.
>>>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
>>>>>>> use-cases in virtualized system powered by Xen hypervisor. Rationale 
>>>>>>> behind this activity is that CPU virtualization is done by hypervisor 
>>>>>>> and the guest OS doesn't actually know anything about physical CPUs 
>>>>>>> because it is running on virtual CPUs. It is quite clear that a 
>>>>>>> decision about frequency change should be taken by hypervisor as only 
>>>>>>> it has information about actual CPU load.
>>>>>>
>>>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>>>> in the design are quite different between those.
>>>>> We keep embedded use-cases in mind. For example, it is a system with
>>>>> several domains,
>>>>> where one domain has most critical SW running on and other domain(s)
>>>>> are, let's say, for entertainment purposes.
>>>>> I think, the CPUFreq is useful where power consumption is a question.
>>>>
>>>> Does the SoC you use allow different frequencies for each core? Or is it
>>>> one frequency for all cores? Most x86 CPUs allow different frequencies
>>>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>>>> limit the usefulness of this approach in general.
>>> Good question. All cores in a cluster share the same clock. It is
>>> impossible to set different frequencies on the cores inside one
>>> cluster.
>>>
>>>>
>>>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>>>> position to make a decision on the proper frequency physical CPUs should
>>>>>> run with. From all I know it's already hard for an OS kernel to make
>>>>>> that call. So I would actually expect that guests provide some input,
>>>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>>>> could then decide to act on it - or not.
>>>>> Each running guest sees only part of the picture, but hypervisor has
>>>>> the whole picture: it knows all about the CPU, measures CPU load and is able
>>>>> to choose the required CPU frequency to run at.
>>>>
>>>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>>>> hypercall or on WFI, for that matter. It does not know much more about
>>>> the guest, especially it's rather clueless about what the guest OS
>>>> actually intended to do.
>>>> For instance Linux can track the actual utilization of a core by keeping
>>>> statistics of runnable processes and monitoring their time slice usage.
>>>> It can see that a certain process exhibits periodical, but bursty CPU
>>>> usage, which may hint that it could run at a lower frequency. Xen does not
>>>> see this fine-grained information.
>>>>
>>>>> I am wondering, does Xen
>>>>> need additional input from guests to make a decision?
>>>>

[Xen-devel] [PATCH] arm64: ITS: fix cacheability adjustment

2017-11-16 Thread Andre Przywara
If the host GICv3 redistributor reports that the pending table cannot
use shareable memory, we try to drop the cacheability attributes as
well. However we fail horribly in doing computer science 101 bit
masking, effectively clearing the whole register instead of just a few
bits.
Fix this by removing the one redundant masking operation and adding the
magic negation for the actually needed other operation.

Reported-by: Manish Jaggi <manish.ja...@linaro.org>
Signed-off-by: Andre Przywara <andre.przyw...@linaro.org>
---
Julien,

can we have this still for 4.10, please? Seems like an obvious bug to me.

Cheers,
Andre

 xen/arch/arm/gic-v3-lpi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index c3474f5434..84582157b8 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -359,8 +359,7 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base)
     /* If the hardware reports non-shareable, drop cacheability as well. */
     if ( !(table_reg & GICR_PENDBASER_SHAREABILITY_MASK) )
     {
-        table_reg &= GICR_PENDBASER_SHAREABILITY_MASK;
-        table_reg &= GICR_PENDBASER_INNER_CACHEABILITY_MASK;
+        table_reg &= ~GICR_PENDBASER_INNER_CACHEABILITY_MASK;
         table_reg |= GIC_BASER_CACHE_nC << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
 
         writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
-- 
2.14.1
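
To illustrate the masking problem described in the commit message, here is
a minimal standalone sketch. The field positions and the DEMO_* names are
assumptions made for this example only; they are not copied from Xen's
headers. It compares the old two-step masking with the fixed one-step
masking on the same register value.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative field layout (assumed here, not copied from Xen's headers). */
#define DEMO_INNER_CACHE_SHIFT 7
#define DEMO_INNER_CACHE_MASK  (UINT64_C(7) << DEMO_INNER_CACHE_SHIFT)
#define DEMO_SHAREABILITY_MASK (UINT64_C(3) << 10)
#define DEMO_CACHE_nC          UINT64_C(1)

/* Old behaviour: ANDing with two disjoint masks leaves nothing of the register. */
static uint64_t demo_drop_cacheability_old(uint64_t reg)
{
    reg &= DEMO_SHAREABILITY_MASK;
    reg &= DEMO_INNER_CACHE_MASK;
    reg |= DEMO_CACHE_nC << DEMO_INNER_CACHE_SHIFT;
    return reg;
}

/* Fixed behaviour: clear only the inner cacheability field, then set it to nC. */
static uint64_t demo_drop_cacheability_new(uint64_t reg)
{
    /* This path is only taken when the hardware reported non-shareable. */
    assert((reg & DEMO_SHAREABILITY_MASK) == 0);
    reg &= ~DEMO_INNER_CACHE_MASK;
    reg |= DEMO_CACHE_nC << DEMO_INNER_CACHE_SHIFT;
    return reg;
}
```

With the old code every bit outside the cacheability field is lost, including
the table address; with the negated mask only the three cacheability bits
change.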




Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-15 Thread Andre Przywara
Hi,

On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
> <andre.przyw...@linaro.org> wrote:
>> Hi,
> Hi Andre
> 
>>
>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>> <andre.przyw...@linaro.org> wrote:
>>>> Hi,
>>> Hi Andre,
>>>
>>>>
>>>> thanks very much for your work on this!
>>> Thank you for your comments.
>>>
>>>>
>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshche...@epam.com>
>>>>>
>>>>> Hi, all.
>>>>>
>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on 
>>>>> ARM.
>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
>>>>> use-cases in virtualized system powered by Xen hypervisor. Rationale 
>>>>> behind this activity is that CPU virtualization is done by hypervisor and 
>>>>> the guest OS doesn't actually know anything about physical CPUs because 
>>>>> it is running on virtual CPUs. It is quite clear that a decision about 
>>>>> frequency change should be taken by hypervisor as only it has information 
>>>>> about actual CPU load.
>>>>
>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>> in the design are quite different between those.
>>> We keep embedded use-cases in mind. For example, it is a system with
>>> several domains,
>>> where one domain has most critical SW running on and other domain(s)
>>> are, let's say, for entertainment purposes.
>>> I think, the CPUFreq is useful where power consumption is a question.
>>
>> Does the SoC you use allow different frequencies for each core? Or is it
>> one frequency for all cores? Most x86 CPUs allow different frequencies
>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>> limit the usefulness of this approach in general.
> Good question. All cores in a cluster share the same clock. It is
> impossible to set different frequencies on the cores inside one
> cluster.
> 
>>
>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>> position to make a decision on the proper frequency physical CPUs should
>>>> run with. From all I know it's already hard for an OS kernel to make
>>>> that call. So I would actually expect that guests provide some input,
>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>> could then decide to act on it - or not.
>>> Each running guest sees only part of the picture, but hypervisor has
>>> the whole picture: it knows all about the CPU, measures CPU load and is able
>>> to choose the required CPU frequency to run at.
>>
>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>> hypercall or on WFI, for that matter. It does not know much more about
>> the guest, especially it's rather clueless about what the guest OS
>> actually intended to do.
>> For instance Linux can track the actual utilization of a core by keeping
>> statistics of runnable processes and monitoring their time slice usage.
>> It can see that a certain process exhibits periodical, but bursty CPU
>> usage, which may hint that it could run at a lower frequency. Xen does not
>> see this fine-grained information.
>>
>>> I am wondering, does Xen
>>> need additional input from guests to make a decision?
>>
>> I very much believe so. The guest OS is in a much better position to
>> make that call.
>>
>>> BTW, currently a guest domain on ARM doesn't even know how many physical
>>> CPUs the system has or what its OPPs are. When creating a guest
>>> domain, Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>>> OPPs, thermal, etc., is not passed to the guest.
>>
>> Sure, because this is what virtualization is about. And I am not asking
>> for unconditionally allowing any guest to change frequency.
>> But there could be certain use cases where this could be considered:
>> Think about your "critical SW" mentioned above, which is probably some
>> RTOS, also possibly running on pinned vCPUs. For that
>> (latency-sensitive

Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-15 Thread Andre Przywara
Hi,

On 15/11/17 03:03, Jassi Brar wrote:
> On 15 November 2017 at 02:16, Oleksandr Tyshchenko <olekst...@gmail.com> 
> wrote:
>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>> <andre.przyw...@linaro.org> wrote:
>>
> 
>>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC 
>>>>>> triggered mailbox driver. All components except mailbox driver are in 
>>>>>> mainline Linux.
>>>>>
>>>>> Why do you actually need this mailbox framework?
>>
> It is unnecessary if you are always going to use one particular signal
> mechanism, say SMC. However ...
> 
>>>>> Actually I just
>>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>> "press a button". I don't see a need to import the whole Linux
>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>> mailbox is similarly simple, for instance.
>>
> ... It's going to be SMC and MHU now... and you talk about Rockchip as
> well later. That becomes unwieldy.
> 
> 
>>>
>>>> The protocol relies on the mailbox feature, so I ported the mailbox too. I think
>>>> it would be much easier for me to just add
>>>> handling for the few required commands by issuing an SMC call, without any
>>>> mailbox infrastructure involved.
>>>> But I wanted to show what is going on and where these things come from.
>>>
>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>> functionality.
>>
> That is a painful misconception.
> Mailbox api is designed to be (almost) as light weight as being
> transparent. Please have a look at mbox_send_message() and see how
> negligible overhead it adds for "SMC controller" that you compare
> against here. just integer manipulations protected by a spinlock.
> Of course if your protocol needs async messaging, you pay the price
> but only fair.

Normally I would agree on importing some well-designed code rather than
hacking up something yourself.

BUT: This is Xen, which is meant to be a lean, micro-kernel-like
hypervisor. If we now add code from Linux, there must be a good
rationale why we need it. And this is why we need to make sure that
CPUFreq is really justified in the first place.
So I am a bit wary that pulling some rather unrelated Linux *framework*
into Xen bloats it up and introduces more burden to the trusted code
base. With SCPI being the only user, this controller-client
abstraction is not really needed. And to trigger an interrupt on
the SCP side we just need to:
    writel(BIT(channel), base_addr + CPU_INTR_H_SET);

I expect other mailboxes to be similarly simple.
The only other code needed is some DT parsing.
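
To make the "press a button" idea concrete, here is a minimal sketch of such
a doorbell helper. Everything prefixed with demo_/DEMO_ is made up for this
example: the register offset is an assumption (the real value depends on the
mailbox hardware), and demo_writel() merely stands in for writel() so the
snippet can be built outside of Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register offset, for illustration only. */
#define DEMO_CPU_INTR_H_SET 0x128u
#define DEMO_BIT(n)         (UINT32_C(1) << (n))

/* Stand-in for writel(): a single 32-bit store to a device register. */
static void demo_writel(uint32_t val, volatile void *addr)
{
    *(volatile uint32_t *)addr = val;
}

/* Ring the doorbell for one channel: one write, no queueing, no callbacks. */
static void demo_signal_mailbox(volatile uint8_t *base_addr, unsigned int channel)
{
    assert(channel < 32);
    demo_writel(DEMO_BIT(channel), base_addr + DEMO_CPU_INTR_H_SET);
}
```

The SCPI code would fill in the command in shared memory first and then call
something like this; a different mailbox would only need a different body for
the same one-function interface.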

That being said, I haven't looked too closely at how much code this
actually pulls in; it is just my gut feeling that it's a bit over the top,
conceptually.

>>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>>> mailbox, we can get away with a *much* simpler solution.
>>
>> Agreed, but I am afraid that simplifying things now might lead to some
>> difficulties when there is a need
>> to integrate a slightly different mailbox IP. Also, we need to
>> recheck whether SCMI, which we might want to support as well,
>> has a similar interface to the mailbox.
>>
> Exactly.

My understanding is that the SCMI transport protocol is not different
from that used by SCPI.

Cheers,
Andre.

>>> - We would need to port mailbox drivers one-by-one anyway, so we could
>>> as well implement the simple "press-the-button" subset for each mailbox
>>> separately.
>>
> Is it about virtual controller?
> 
>>> The interface between the SCPI code and the mailbox is
>>> probably just "signal_mailbox()".
>>
> After all we should have the following to spread the nice feeling of
> "supporting doorbell controllers"  :)
> 
> mailbox_client.h
> ***
> void signal_mailbox(struct mbox_chan *chan)
> {
>     (void)mbox_send_message(chan, NULL);
>     mbox_client_txdone(chan, 0);
> }
> 
> 
> Cheers!
> 



Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-14 Thread Andre Przywara
Hi,

On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
> <andre.przyw...@linaro.org> wrote:
>> Hi,
> Hi Andre
> 
>>
>> thanks very much for your work on this!
> Thank you for your comments.
> 
>>
>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshche...@epam.com>
>>>
>>> Hi, all.
>>>
>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on 
>>> ARM.
>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
>>> use-cases in virtualized system powered by Xen hypervisor. Rationale behind 
>>> this activity is that CPU virtualization is done by hypervisor and the 
>>> guest OS doesn't actually know anything about physical CPUs because it is 
>>> running on virtual CPUs. It is quite clear that a decision about frequency 
>>> change should be taken by hypervisor as only it has information about 
>>> actual CPU load.
>>
>> Can you please sketch your usage scenario or workloads here? I can think
>> of quite different scenarios (oversubscribed server vs. partitioning
>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>> in the design are quite different between those.
> We keep embedded use-cases in mind. For example, it is a system with
> several domains,
> where one domain has most critical SW running on and other domain(s)
> are, let's say, for entertainment purposes.
> I think, the CPUFreq is useful where power consumption is a question.

Does the SoC you use allow different frequencies for each core? Or is it
one frequency for all cores? Most x86 CPUs allow different frequencies
for each core, AFAIK. Just having the same OPP for the whole SoC might
limit the usefulness of this approach in general.

>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>> position to make a decision on the proper frequency physical CPUs should
>> run with. From all I know it's already hard for an OS kernel to make
>> that call. So I would actually expect that guests provide some input,
>> for instance by signalling OPP change request up to the hypervisor. This
>> could then decide to act on it - or not.
> Each running guest sees only part of the picture, but hypervisor has
> the whole picture: it knows all about the CPU, measures CPU load and is able
> to choose the required CPU frequency to run at.

But based on what data? All Xen sees is a vCPU trapping on MMIO, a
hypercall or on WFI, for that matter. It does not know much more about
the guest, especially it's rather clueless about what the guest OS
actually intended to do.
For instance Linux can track the actual utilization of a core by keeping
statistics of runnable processes and monitoring their time slice usage.
It can see that a certain process exhibits periodical, but bursty CPU
usage, which may hint that it could run at a lower frequency. Xen does not
see this fine-grained information.

> I am wondering, does Xen
> need additional input from guests to make a decision?

I very much believe so. The guest OS is in a much better position to
make that call.

> BTW, currently a guest domain on ARM doesn't even know how many physical
> CPUs the system has or what its OPPs are. When creating a guest
> domain, Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
> OPPs, thermal, etc., is not passed to the guest.

Sure, because this is what virtualization is about. And I am not asking
for unconditionally allowing any guest to change frequency.
But there could be certain use cases where this could be considered:
Think about your "critical SW" mentioned above, which is probably some
RTOS, also possibly running on pinned vCPUs. For that
(latency-sensitive) guest it might be well suited to run at a lower
frequency for some time, but how should Xen know about this?
"Normally" the best strategy to save power is to run as fast as
possible, finish all outstanding work, then put the core to sleep.
Because not running at all consumes much less energy than running at a
reduced frequency. But this may not be suitable for an RTOS.
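
As a rough way to reason about this trade-off, the energy over a fixed time
window can be written as active power times busy time plus idle power times
the remaining time. The helper below is only a sketch of that arithmetic;
its name and parameters are invented for this example, and any numbers fed
into it are assumptions until measured on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Energy in microjoules over one window: the core draws p_active_mw while
 * busy for t_active_ms and p_idle_mw for the rest of the window
 * (mW * ms = uJ).
 */
static uint64_t demo_window_energy_uj(uint32_t p_active_mw, uint32_t t_active_ms,
                                      uint32_t p_idle_mw, uint32_t window_ms)
{
    assert(t_active_ms <= window_ms);
    return (uint64_t)p_active_mw * t_active_ms +
           (uint64_t)p_idle_mw * (window_ms - t_active_ms);
}
```

With purely made-up figures (100 ms window, 50 mW asleep), running 10 ms at
2000 mW costs 24500 uJ, while running 20 ms at 1200 mW costs 28000 uJ, so
finishing fast and sleeping wins. If the slower OPP only drew 600 mW, the
slow run would cost 16000 uJ and win instead. Which case applies depends on
the SoC, which is why measurements matter here.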

So I think we would need a combined approach:
a) Let an administrator (via tools running in Dom0) tell Xen about power
management strategies to use for certain guests. An RTOS could be
treated differently (lower, but constant frequency) than an
"entertainment" guest (varying frequency, based on guest OS input), also
differently than some background guest doing logging, OTA update, etc.
(constant high frequency, but putting cores to sleep instead as often as
possible).
b) Allow some guests (based on policy from (a)) to signal CPUFreq change
requests to the hypervisor. Xen takes those into accoun

Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-13 Thread Andre Przywara
Hi,

thanks very much for your work on this!

On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Hi, all.
> 
> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
> use-cases in virtualized system powered by Xen hypervisor. Rationale behind 
> this activity is that CPU virtualization is done by hypervisor and the guest 
> OS doesn't actually know anything about physical CPUs because it is running 
> on virtual CPUs. It is quite clear that a decision about frequency change 
> should be taken by hypervisor as only it has information about actual CPU 
> load.

Can you please sketch your usage scenario or workloads here? I can think
of quite different scenarios (oversubscribed server vs. partitioning
RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
in the design are quite different between those.

In general I doubt that a hypervisor scheduling vCPUs is in a good
position to make a decision on the proper frequency physical CPUs should
run with. From all I know it's already hard for an OS kernel to make
that call. So I would actually expect that guests provide some input,
for instance by signalling OPP change request up to the hypervisor. This
could then decide to act on it - or not.
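
A very rough sketch of what "act on it - or not" could look like follows.
This is not taken from the patch series: the types, names and the
arbitration rule (take the highest honoured request, clamp it, otherwise
fall back to a default) are all assumptions chosen for illustration, on the
premise that only one OPP can be in effect for the cores the guests share.

```c
#include <assert.h>
#include <stddef.h>

enum demo_policy {
    DEMO_POLICY_IGNORE,  /* requests from this guest are dropped */
    DEMO_POLICY_HONOUR   /* requests from this guest are taken into account */
};

struct demo_opp_request {
    enum demo_policy policy;  /* per-guest setting, chosen by whoever configures the system */
    unsigned int opp_index;   /* requested OPP; a higher index means a higher frequency */
};

/* Pick one OPP from all guest requests, or default_opp if none is honoured. */
static unsigned int demo_pick_opp(const struct demo_opp_request *reqs, size_t n,
                                  unsigned int default_opp, unsigned int max_opp)
{
    unsigned int chosen = 0;
    int have_request = 0;
    size_t i;

    assert(reqs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        unsigned int opp;

        if (reqs[i].policy != DEMO_POLICY_HONOUR)
            continue;
        /* Never go beyond what the hypervisor allows. */
        opp = reqs[i].opp_index > max_opp ? max_opp : reqs[i].opp_index;
        if (!have_request || opp > chosen)
            chosen = opp;
        have_request = 1;
    }
    return have_request ? chosen : default_opp;
}
```

The point is only that the guest supplies a hint and the hypervisor keeps
the final say; a real policy would likely be more involved.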

> Although these required components (CPUFreq core, governors, etc.) already
> exist in Xen, it is worth mentioning that they are ACPI-specific. So, a part
> of the current patch series makes them more generic in order to make
> CPUFreq usable on architectures without ACPI support.

Have you looked at how this is used on x86 these days? Can you briefly
describe how this works and how it's used there?

> But the main question we have to answer is about the frequency changing
> interface in a virtualized system. The frequency changing interface and all
> dependent components that CPUFreq needs to be functional on ARM are not
> present in Xen these days. The list of required components is quite big and
> may vary across different ARM SoC vendors. As an example, the following
> components are involved in DVFS on the Renesas Salvator-X board, which has an
> R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks,
> vendor’s CPG, PMIC, AVS and THS drivers, i2c support, etc.
> 
> We considered a few possible approaches to hypervisor-based CPUFreq on
> ARM and came to the conclusion to base this solution on the currently popular
> ARM System Control and Power Interface (SCPI) protocol [1], which is already
> upstream in Linux. We chose SCPI instead of the newer ARM System Control and
> Management Interface (SCMI) protocol [2] since it is widely used in Linux,
> there are good examples of how to use it, its range of capabilities is
> enough for implementing hypervisor-based CPUFreq and, what is more, upstream
> Linux support for SCMI is missing so far. SCMI could be used as well, though.
> 
> Briefly speaking, the SCPI protocol is used between the System Control
> Processor (SCP) and the Application Processors (AP). The mailbox feature
> provides a mechanism for inter-processor communication between SCP and AP.
> The main purpose of the SCP is to offload different PM-related tasks from the
> AP, and one of the services that the SCP provides is Dynamic Voltage and
> Frequency Scaling (DVFS), which is what we actually need for CPUFreq. I will
> describe this approach in detail further down.
> 
> Let me explain a bit more what these possible approaches are:
> 
> 1. “Xen+hwdom” solution.
> The GlobalLogic team proposed a split model [3], where a “hwdom-cpufreq” frontend
> driver in Xen interacts with the “xen-cpufreq” backend driver in the Linux hwdom
> (possibly dom0) in order to scale physical CPUs. This solution hasn’t been
> accepted by the Xen community yet, and it seems it is not going to be accepted
> without addressing the major questions that are still unanswered and proving
> that the “all-in-Xen” solution, which the Xen community considered the
> architecturally cleaner option, would be unworkable in practice.
> The other reasons why we decided not to stick to this approach are the complex
> communication interface between Xen and hwdom (event channel, hypercalls,
> syscalls, passing CPU info via DT, etc.) and possible synchronization issues
> with the proposed solution.
> It is worth mentioning, though, that the beauty of this approach was that
> there wouldn’t be a need to port a lot of things to Xen. The whole frequency
> changing interface and all dependent components that CPUFreq needs to be
> functional were already in place.

Stefano, Julien and I were thinking about this: Wouldn't it be possible
to come up with some hardware domain, solely dealing with CPUFreq
changes? This could run a Linux kernel, but no or very little userland.
All its vCPUs would be pinned to pCPUs and would normally not be
scheduled by Xen. If Xen wants to change the 

Re: [Xen-devel] [PATCH 04/12] ARM: VGIC: move gic_remove_irq_from_queues()

2017-11-10 Thread Andre Przywara
Hi,

...

>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index e489d0bf21..8d0ff65708 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -204,6 +204,7 @@ extern int vcpu_vgic_init(struct vcpu *v);
>>  extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>> +void vgic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
> 
> cosmetic: you might as well add an extern

I was wondering about that. I think extern in front of a prototype in a
header file is a bit pointless. Linux mostly doesn't use it (apart from
fs/ and some parts of security/).
Though I can of course easily add it.
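
For illustration, here is a tiny standalone example of why the keyword makes
no difference on a function prototype. The names are invented for this
example and have nothing to do with the VGIC code.

```c
#include <assert.h>

/* Both declarations are compatible: functions have external linkage by default. */
int demo_add(int a, int b);
extern int demo_add(int a, int b);

int demo_add(int a, int b)
{
    return a + b;
}

/* For objects the keyword does matter: this line declares without defining. */
extern int demo_counter;
int demo_counter = 0;

/* Small sanity check that the single definition serves both declarations. */
void demo_selfcheck(void)
{
    assert(demo_add(2, 3) == 5);
}
```

So for prototypes in a header this is purely a matter of style and
consistency with the surrounding declarations.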

Cheers,
Andre.

> 
> 
>>  extern void vgic_clear_pending_irqs(struct vcpu *v);
>>  extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>> -- 
>> 2.14.1
>>



Re: [Xen-devel] [PATCH 03/12] ARM: VGIC: remove gic_clear_pending_irqs()

2017-11-10 Thread Andre Przywara
Hi,

On 26/10/17 01:14, Stefano Stabellini wrote:
> On Thu, 19 Oct 2017, Andre Przywara wrote:
>> gic_clear_pending_irqs() was not only misnamed, but also misplaced, as
>> a function solely dealing with the GIC emulation should not live in gic.c.
>> Move the functionality of this function into its only caller in vgic.c
>>
>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
> 
> The reason why gic_clear_pending_irqs is in gic.c is that lr_mask and
> lr_pending are considered part of the gic driver (gic.c). On the other
> hand, inflight is part of the vgic.
> 
> As an example, the idea is that the code outside of gic.c (for example
> vgic.c) shouldn't have to know, or have to care, whether a given IRQ is
> in the lr_pending queue or actually in a LR register.

I can understand that the lr_pending queue *should* be a logical
continuation of the LR registers, something like spill-over LRs.
Though I wasn't aware of this before ;-)
So I can see that from a *logical* point of view it looks like it
belongs to the hardware part of the GIC (more specifically gic-vgic.c),
which deals with the actual LRs. But I guess this is somewhat of a grey
area.

BUT:
This is a design choice of the VGIC, and one which the KVM VGIC design
for instance does *not* share. Also my earlier Xen VGIC rework patches
got rid of this as well (because dealing with two lists is too complicated).
Also, the name is misleading: gic_clear_pending_irqs() does not hint at
all that this is dealing with the GIC emulation, I think it should read
vgic_vcpu_clear_pending_irqs().
And as it accesses VGIC specific data structures only, I don't think it
belongs to gic.c, really.
So I could live with moving it into the new gic-vgic.c, let me see if
that works.

The need for this patch didn't come out of the blue, I actually need it
to be able to reuse gic.c with *any* other VGIC implementation. And this
applies to both a VGIC rework and the KVM VGIC port.
These lr_queue and lr_pending queues are really an implementation detail
of the existing *VGIC*, and, more importantly: they refer to the struct
pending_irq, which is definitely a VGIC detail.

The rabbit to follow in this series is to strictly split the usage of
struct pending_irq from the hardware GIC driver. The KVM VGIC does not
have a "struct pending_irq", so we can't have anything mentioning that
in code that should survive a KVM VGIC port.
So short of replacing gic.c altogether, moving everything mentioning
pending_irq out of gic.c is the only option.

Cheers,
Andre.

> lr_mask and lr_pending are only accessed from gic.c. The only exception
> is the initialization (INIT_LIST_HEAD(&v->arch.vgic.lr_pending)).
> 
> 
>> ---
>>  xen/arch/arm/gic.c        | 11 -----------
>>  xen/arch/arm/vgic.c       |  4 +++-
>>  xen/include/asm-arm/gic.h |  1 -
>>  3 files changed, 3 insertions(+), 13 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index ed363f6c37..75b2e0e0ca 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -675,17 +675,6 @@ out:
>>      spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>>  }
>>  
>> -void gic_clear_pending_irqs(struct vcpu *v)
>> -{
>> -    struct pending_irq *p, *t;
>> -
>> -    ASSERT(spin_is_locked(&v->arch.vgic.lock));
>> -
>> -    v->arch.lr_mask = 0;
>> -    list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
>> -        gic_remove_from_lr_pending(v, p);
>> -}
>> -
>>  int gic_events_need_delivery(void)
>>  {
>>      struct vcpu *v = current;
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index d8acbbeaaa..451a306a98 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -504,7 +504,9 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>      list_for_each_entry_safe ( p, t, &v->arch.vgic.inflight_irqs, inflight )
>>          list_del_init(&p->inflight);
>> -    gic_clear_pending_irqs(v);
>> +    list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
>> +        gic_remove_from_lr_pending(v, p);
>> +    v->arch.lr_mask = 0;
>>      spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>>  }
>>  
>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
>> index d3d7bda50d..2f248301ce 100644
>> --- a/xen/include/asm-arm/gic.h
>> +++ b/xen/include/asm-arm/gic.h
>> @@ -236,7 +236,6 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
>>                                struct irq_desc *desc);
>>  
>>  extern void gic_inject(void);
>> -extern void gic_clear_pending_irqs(struct vcpu *v);
>>  extern int gic_events_need_delivery(void);
>>  
>>  extern void init_maintenance_interrupt(void);
>> -- 
>> 2.14.1
>>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-02 Thread Andre Przywara
Hi,

On 01/11/17 21:54, Stefano Stabellini wrote:
> On Wed, 1 Nov 2017, Andre Przywara wrote:
>> Hi Stefano,
>>
>>
>> On 01/11/17 01:58, Stefano Stabellini wrote:
>>> On Wed, 11 Oct 2017, Andre Przywara wrote:
>>
>> many thanks for going through all of this!
> 
> No problems, and thanks for your work and for caring about doing the
> best thing for the project.
> 
> 
>>>> (CC:ing some KVM/ARM folks involved in the VGIC)
>>>>
>>>> starting with the addition of the ITS support we were seeing more and
>>>> more issues with the current implementation of our ARM Generic Interrupt
>>>> Controller (GIC) emulation, the VGIC.
>>>> Among other approaches to fix those issues it was proposed to copy the
>>>> VGIC emulation used in KVM. This one was suffering from very similar
>>>> issues, and a clean design from scratch led to a very robust and
>>>> capable re-implementation. Interestingly this implementation is fairly
>>>> self-contained, so it seems feasible to copy it. Hopefully we only need
>>>> minor adjustments, possibly we can even copy it verbatim with some
>>>> additional glue layer code.
>>>>
>>>> Stefano asked for getting a design overview, to assess the feasibility
>>>> of copying the KVM code without reviewing tons of code in the first
>>>> place.
>>>> So to follow Xen rules for new features, this design document below is
>>>> an attempt to describe the current KVM VGIC design - in a hypervisor
>>>> agnostic fashion. It is a bit of a retro-fit design description, as it
>>>> is not strictly forward-looking only, but actually describing the
>>>> existing implementation [1].
>>>>
>>>> Please have a look and let me know:
>>>> 1) if this document has the right scope
>>>> 2) if this document has the right level of detail
>>>> 3) if there are points missing from the document
>>>> 4) if the design in general is a fit
>>>
>>> Please read the following statements as genuine questions and concerns.
>>> Most ideas on this document are good. Some of them I have even suggested
>>> them myself in the context of GIC improvements for Xen. I asked for a
>>> couple of clarifications.
>>>
>>> But I don't see why we cannot implement these ideas on top of the
>>> existing code, rather than with a separate codebase, ending up with two
>>> drivers. I would prefer a natural evolution. Specifically, the following
>>> improvements would be simple and would give us most of the benefits on
>>> top of the current codebase:
>>> - adding the irq lock, and the refcount
>>> - taking both vcpu locks when necessary (on migration code for example
>>>   it would help a lot), the lower vcpu_id first
>>> - level irq emulation
>>
>> I think some of those points you mentioned are not easily implemented in
>> the current Xen. For instance I ran into locking order issues with those
>> *two* inflight and lr_queue lists, when trying to implement the lock and
>> the refcount.
>> Also this "put vIRQs into LRs early, but possibly rip them out again" is
>> really complicating things a lot.
>>
>> I believe only level IRQs could be added in a relatively
>> straightforward manner.
>>
>> So the problem with the evolutionary approach is that it generates a lot
>> of patches, some of them quite invasive, others creating hard-to-read
>> diffs, which are both hard to review.
>> And chances are that the actual result would be pretty close to the KVM
>> code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
>> days some months ago, but it took me *weeks* to make sane patches of
>> only the first part of it.
>> And this would not cover all those general, tedious corner cases that
>> the VGIC comes with. Those would need to be fixed in a painful process,
>> which we could avoid by "lifting" the KVM code.
> 
> I hear you, but the principal cost here is the review time, not the
> development time. Julien told me that it would be pretty much the same
> for him in terms of time it takes to review the changes, it doesn't
> matter if it's a new driver or changes to the existing driver. For me,
> it wouldn't be the same: I think it would take me far less time to
> review them if they were against the existing codebase.

I am not so sure about this. The changes are quite dramatic, and those
changes tend to produce horrible diffs. Or we try to mitigate this, but
this comes

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi Christoffer,

On 12/10/17 13:05, Christoffer Dall wrote:
> Hi Andre,
> 
> On Wed, Oct 11, 2017 at 03:33:03PM +0100, Andre Przywara wrote:
>> Hi,
>>
>> (CC:ing some KVM/ARM folks involved in the VGIC)
> 
> Very nice writeup!
> 
> I added a bunch of comments, mostly for the writing and clarity, I hope
> it helps.

Thank you very much for the response and the comments! I really
appreciate your precise (academic) language here.
I held back the response since Stefano was the actual addressee of this
write-up, so: sorry for the delay.

>> starting with the addition of the ITS support we were seeing more and
>> more issues with the current implementation of our ARM Generic Interrupt
>> Controller (GIC) emulation, the VGIC.
>> Among other approaches to fix those issues it was proposed to copy the
>> VGIC emulation used in KVM. This one was suffering from very similar
>> issues, and a clean design from scratch led to a very robust and
>> capable re-implementation. Interestingly this implementation is fairly
>> self-contained, so it seems feasible to copy it. Hopefully we only need
>> minor adjustments, possibly we can even copy it verbatim with some
>> additional glue layer code.
>> Stefano asked for getting a design overview, to assess the feasibility
>> of copying the KVM code without reviewing tons of code in the first
>> place.
>> So to follow Xen rules for new features, this design document below is
>> an attempt to describe the current KVM VGIC design - in a hypervisor
>> agnostic fashion. It is a bit of a retro-fit design description, as it
>> is not strictly forward-looking only, but actually describing the
>> existing implementation [1].
>>
>> Please have a look and let me know:
>> 1) if this document has the right scope
>> 2) if this document has the right level of detail
>> 3) if there are points missing from the document
>> 4) if the design in general is a fit
>>
>> Appreciate any feedback!
>>
>> Cheers,
>> Andre.
>>
>> ---
>>
>> VGIC design
>> ===========
>>
>> This document describes the design of an ARM Generic Interrupt Controller (GIC)
>> emulation. It is meant to emulate a GIC for a guest in a virtual machine,
>> the common name for that is VGIC (from "virtual GIC").
>>
>> This design was the result of a one-week-long design session with some
>> engineers in a room, triggered by ever-increasing difficulties in maintaining
>> the existing GIC emulation in the KVM hypervisor. The design eventually
>> materialised as an alternative VGIC implementation in the Linux kernel
>> (merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
>> was removed, so it is now the current code used by Linux.
>> Although being used in KVM, the actual design of this VGIC is rather hypervisor
>> agnostic and can be used by other hypervisors as well, in particular for Xen.
>>
>> GIC hardware virtualization support
>> -----------------------------------
>>
>> The ARM Generic Interrupt Controller (since v2) supports the virtualization
>> extensions, which allow some parts of the interrupt life cycle to be handled
>> purely inside the guest without exiting into the hypervisor.
>> In the GICv2 and GICv3 architecture this covers mostly the "interrupt
>> acknowledgement", "priority drop" and "interrupt deactivate" actions.
>> So a guest can handle most of the interrupt processing code without
>> leaving EL1 and trapping into the hypervisor. To accomplish
>> this, the GIC holds so-called "list registers" (LRs), which shadow the
>> interrupt state for any virtual interrupt. Injecting an interrupt to a guest
>> involves setting up one LR with the interrupt number, its priority and initial
>> state (mostly "pending"), then entering the guest. Any EOI related action
>> from within the guest just acts on those LRs, the hypervisor can later update
>> the virtual interrupt state when the guest exits the next time (for whatever
>> reason).
>> But despite the GIC hardware helping out here, the whole interrupt
>> configuration management is not virtualized at all and needs to be emulated
>> by the hypervisor - or another related software component, for instance a
>> userland emulator. This so called "distributor" part of the GIC consists of
>> memory mapped registers, which can be trapped by the hypervisor, so any guest
>> access can be emulated in the usual way.
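
To make "setting up one LR" a bit more concrete, here is a minimal sketch
for the GICv2 case. The bit positions follow the GICH_LR layout of the GICv2
architecture; the macro and function names are made up for this example and
are not the ones used in Xen or KVM. Hardware-mapped interrupts (the HW bit)
and the group bit are left out.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 GICH_LRn fields (illustrative names, not Xen or KVM identifiers) */
#define LR_VIRT_ID_MASK   0x3ffu       /* bits [9:0]: virtual interrupt ID */
#define LR_PRIO_SHIFT     23           /* bits [27:23]: upper 5 priority bits */
#define LR_STATE_PENDING  (1u << 28)   /* bit 28: pending */
#define LR_STATE_ACTIVE   (1u << 29)   /* bit 29: active */

/* Build the LR value that injects virtual IRQ "virq" as pending. */
static uint32_t make_lr_pending(uint32_t virq, uint8_t priority)
{
    uint32_t lr;

    assert(virq <= LR_VIRT_ID_MASK);

    lr = virq & LR_VIRT_ID_MASK;
    /* The LR only holds the top five bits of the 8-bit priority. */
    lr |= (uint32_t)(priority >> 3) << LR_PRIO_SHIFT;
    lr |= LR_STATE_PENDING;

    return lr;
}
```

The hypervisor would write such a value into a free list register before
entering the guest, and read it back on exit to learn whether the state
bits have changed.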
>>
>> VGIC design motivation
>> 

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi Stefano,


On 01/11/17 01:58, Stefano Stabellini wrote:
> On Wed, 11 Oct 2017, Andre Przywara wrote:

many thanks for going through all of this!

>> (CC:ing some KVM/ARM folks involved in the VGIC)
>>
>> starting with the addition of the ITS support we were seeing more and
>> more issues with the current implementation of our ARM Generic Interrupt
>> Controller (GIC) emulation, the VGIC.
>> Among other approaches to fix those issues it was proposed to copy the
>> VGIC emulation used in KVM. This one was suffering from very similar
>> issues, and a clean design from scratch led to a very robust and
>> capable re-implementation. Interestingly this implementation is fairly
>> self-contained, so it seems feasible to copy it. Hopefully we only need
>> minor adjustments, possibly we can even copy it verbatim with some
>> additional glue layer code.
>>
>> Stefano asked for getting a design overview, to assess the feasibility
>> of copying the KVM code without reviewing tons of code in the first
>> place.
>> So to follow Xen rules for new features, this design document below is
>> an attempt to describe the current KVM VGIC design - in a hypervisor
>> agnostic fashion. It is a bit of a retro-fit design description, as it
>> is not strictly forward-looking only, but actually describing the
>> existing implementation [1].
>>
>> Please have a look and let me know:
>> 1) if this document has the right scope
>> 2) if this document has the right level of detail
>> 3) if there are points missing from the document
>> 4) if the design in general is a fit
> 
> Please read the following statements as genuine questions and concerns.
> Most ideas on this document are good. Some of them I have even suggested
> them myself in the context of GIC improvements for Xen. I asked for a
> couple of clarifications.
> 
> But I don't see why we cannot implement these ideas on top of the
> existing code, rather than with a separate codebase, ending up with two
> drivers. I would prefer a natural evolution. Specifically, the following
> improvements would be simple and would give us most of the benefits on
> top of the current codebase:
> - adding the irq lock, and the refcount
> - taking both vcpu locks when necessary (on migration code for example
>   it would help a lot), the lower vcpu_id first
> - level irq emulation
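
For reference, the "lower vcpu_id first" rule from the second point can be
sketched in isolation as below. This is a toy model only: the struct, the
spinlock built on atomic_flag and all function names are assumptions for
this example, not existing Xen code, and it deliberately ignores the
interaction with the per-IRQ lock and the lists.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_vcpu {
    unsigned int vcpu_id;
    atomic_flag lock;          /* stand-in for the per-VCPU VGIC lock */
};

static void toy_lock(atomic_flag *l)
{
    while ( atomic_flag_test_and_set_explicit(l, memory_order_acquire) )
        ;                      /* spin until the lock is free */
}

static void toy_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Take both VCPU locks, always the one with the lower vcpu_id first. */
static void lock_vcpu_pair(struct toy_vcpu *a, struct toy_vcpu *b)
{
    assert(a != NULL && b != NULL);

    if ( a == b )
    {
        toy_lock(&a->lock);
        return;
    }

    if ( a->vcpu_id > b->vcpu_id )
    {
        struct toy_vcpu *tmp = a;

        a = b;
        b = tmp;
    }

    toy_lock(&a->lock);
    toy_lock(&b->lock);
}

static void unlock_vcpu_pair(struct toy_vcpu *a, struct toy_vcpu *b)
{
    toy_unlock(&a->lock);
    if ( a != b )
        toy_unlock(&b->lock);
}
```

Since every caller acquires the locks in the same global order, two CPUs
migrating IRQs between the same pair of VCPUs in opposite directions cannot
deadlock on each other.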

I think some of those points you mentioned are not easily implemented in
the current Xen. For instance I ran into locking order issues with those
*two* inflight and lr_queue lists, when trying to implement the lock and
the refcount.
Also this "put vIRQs into LRs early, but possibly rip them out again" is
really complicating things a lot.

I believe only level IRQs could be added in a relatively
straightforward manner.

So the problem with the evolutionary approach is that it generates a lot
of patches, some of them quite invasive, others creating hard-to-read
diffs, which are both hard to review.
And chances are that the actual result would be pretty close to the KVM
code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
days some months ago, but it took me *weeks* to make sane patches of
only the first part of it.
And this would not cover all those general, tedious corner cases that
the VGIC comes with. Those would need to be fixed in a painful process,
which we could avoid by "lifting" the KVM code.

> If we do end up with a second separate driver for technical or process
> reasons, I would expect the regular Xen submission/review process to be
> followed. The code style will be different, the hooks into the rest of
> the hypervisors will be different and things will be generally changed.
> The new V/GIC might be derived from KVM, but it should end up looking
> and feeling like a 100% genuine Xen component. After all, we'll
> maintain it going forward. I don't want a copy of a Linux driver with
> glue code. The Xen community cannot be expected not to review the
> submission, but if we review it, then we'll ask for changes. Once we
> change the code, there will be no point in keeping the Linux code
> separate with glue code. We should fully adapt it to Xen.

I see your point, and this actually simplifies *my* work, but I am a bit
worried about the effects of having two separate implementations which
then diverge over time.
At the moment we have two separate implementations as well, but they are
quite different, which has the advantage of doing things differently
enough to help in finding bugs in the other one (something we should
actually exploit in testing, I believe).

So how do you feel about some shared "libvgic"? I understand that
people are not too happy about that extra maintenance cost of having a
separate reposit

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi,

On 01/11/17 04:31, Christoffer Dall wrote:
> On Wed, Nov 1, 2017 at 9:58 AM, Stefano Stabellini
>  wrote:
> 
> []

Christoffer, many thanks for answering this!
I think we have a lot of assumptions about the whole VGIC life cycle
floating around, but it would indeed be good to get some numbers behind it.
I would be all too happy to trace some workloads on Xen again and
get some metrics, though this sounds time consuming if done properly.

Do you have any numbers on VGIC performance available somewhere?



>>> ### List register management
>>>
>>> A list register (LR) holds the state of a virtual interrupt, which will
>>> be used by the GIC hardware to simulate an IRQ life cycle for a guest.
>>> Each GIC hardware implementation can choose to implement a number of LRs,
>>> having four of them seems to be a common value. This design here does not
>>> try to manage the LRs very cleverly, instead on every guest exit every LR
>>> in use will be synced to the emulated state, then cleared. Upon guest entry
>>> the top priority virtual IRQs will be inserted into the LRs. If there are
>>> more pending or active IRQs than list registers, the GIC management IRQ
>>> will be configured to notify the hypervisor of a free LR (once the guest
>>> has EOIed one IRQ). This will trigger a normal exit, which will go through
>>> the normal cleanup/repopulate scheme, possibly now queuing the leftover
>>> interrupt(s).
>>> To facilitate quick guest exit and entry times, the VGIC maintains the list
>>> of pending or active interrupts (ap\_list) sorted by their priority. Active
>>> interrupts always go first on the list, since a guest and the hardware GIC
>>> expect those to stay until they have been explicitly deactivated. Failure
>>> in keeping active IRQs around will result in error conditions in the GIC.
>>> The second sort criterion for the ap\_list is their priority, so higher
>>> priority pending interrupts always go first into the LRs.
>>
>> The suggestion of using this model in Xen was made in the past already.
>> I always objected for the reason that we don't actually know how many
>> LRs the hardware provides, potentially very many, and it is expensive
>> and needless to read/write them all every time on entry/exit.
>>
>> I would prefer to avoid that, but I'll be honest: I can be convinced
>> that that model of handling LRs is so much simpler that it is worth it.
>> I am more concerned about the future maintenance of a separate new
>> driver developed elsewhere.
> 
> [Having just spent a fair amount of time optimizing KVM/ARM and
> measuring GIC interaction, I'll comment on this and leave it up to
> Andre to drive the rest of the discussion].
> 
> In KVM we currently only ever touch an LR when we absolutely have to.
> For example, if there are no interrupts, we do not touch an LR.

Yes, I think this is a key point. We only touch LRs that we need to
touch: On guest entry we iterate our per-VCPU list of pending IRQs
(ap_list, which could be empty!), fill one LR per IRQ and store the number
of LRs used in a variable.
On exit we just sync back that many LRs.
I think the code in KVM explains it quite well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/virt/kvm/arm/vgic/vgic.c#n677
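
As a rough illustration of that scheme (not the actual KVM code), the toy
model below fills LRs from a list on entry, remembers how many it used, and
reads back only that many on exit. The struct, the function names and the
assumed number of four LRs are made up for this sketch; the sorting of the
list and the underflow maintenance interrupt are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_LRS 4   /* assumed number of implemented list registers */

struct toy_vcpu {
    uint32_t lr[NR_LRS];       /* stand-in for the hardware LRs */
    unsigned int used_lrs;     /* how many LRs were filled on entry */
};

/* Guest entry: program one LR per queued IRQ, remember how many. */
static void flush_lr_state(struct toy_vcpu *v, const uint32_t *ap_list,
                           size_t nr_irqs)
{
    unsigned int count = 0;

    for ( size_t i = 0; i < nr_irqs && count < NR_LRS; i++ )
        v->lr[count++] = ap_list[i];

    v->used_lrs = count;
}

/* Guest exit: read back and clear only the LRs that were in use. */
static unsigned int fold_lr_state(struct toy_vcpu *v, uint32_t *state)
{
    unsigned int touched = 0;

    assert(v->used_lrs <= NR_LRS);

    for ( unsigned int i = 0; i < v->used_lrs; i++ )
    {
        state[i] = v->lr[i];
        v->lr[i] = 0;
        touched++;
    }
    v->used_lrs = 0;

    return touched;
}
```

With an empty list, neither function touches a single LR, which is the case
Christoffer describes above.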

> When you do have an interrupt in flight, and have programmed one or
> more LRs, you have to either read back that LR, or read one of the
> status registers to figure out if the interrupt has become inactive
> (and should potentially be injected again).  I measured both on KVM
> for various workloads and it was faster to never read the status
> registers, but simply read back the LRs that were in use when entering
> the guest.
> 
> You can potentially micro-optimize slightly by remembering the exit
> value of an LR (and not clearing it on guest exit), but you have to
> pay the cost in terms of additional logic during VCPU migration and
> when you enter a VM again, maintaining a mapping of the LR and the
> virtual state, to avoid rewriting the same value to the LR again.  We
> tried that in KVM and could not measure any benefit using either a
> pinned or oversubscribed workload; I speculate that the number of
> times you exit with unprocessed interrupts in the LRs is extremely
> rare.
> 
> In terms of the number of LRs, I still haven't seen an implementation
> with anything other than 4 LRs.

Yes, that is what I know of as well. The fast model has 16, but I guess
this doesn't count - though it's good to test some code. I can try to
learn the figure in newer hardware.

In the past I traced some workloads and found only a small number of LRs
to be actually used, with 4 or more being extremely rare.

Cheers,
Andre.



Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-27 Thread Andre Przywara
Hi,

On 25/10/17 09:22, Manish Jaggi wrote:
>
>
> On 10/23/2017 7:27 PM, Andre Przywara wrote:
>> Hi Manish,
>>
>> On 12/10/17 22:03, Manish Jaggi wrote:
>>> ACPI/IORT Support in Xen.
>>> --
>>>
>>> I had sent out patch series [0] to hide the SMMU from the Dom0 IORT. This
>>> draft extends the scope to include all that is required to support
>>> ACPI/IORT in Xen. Presenting for review the first _draft_ of the design of
>>> ACPI/IORT support in Xen. It is not complete though.
>>>
>>> Discussed is the parsing and generation of the IORT table for Dom0 and
>>> DomUs.
>>> It is proposed that the IORT be parsed and the information saved into a
>>> Xen data structure, say host_iort_struct, which is reused by all Xen
>>> subsystems like ITS / SMMU etc.
>>>
>>> Since this is a first draft, it is open to technical comments,
>>> modifications and suggestions. Please be open and feel free to add any
>>> missing points / additions.
>>>
>>> 1. What is IORT. What are its components ?
>>> 2. Current Support in Xen
>>> 3. IORT for Dom0
>>> 4. IORT for DomU
>>> 5. Parsing of IORT in Xen
>>> 6. Generation of IORT
>>> 7. Future Work and TODOs
>>>
>>> 1. What is IORT. What are its components ?
>>> 
>>> IORT refers to the Input Output Remapping Table. It is essentially used
>>> to find information about the IO topology (PCIRC-SMMU-ITS) and
>>> relationships between devices.
>>>
>>> In general, an IORT has nodes which hold information about PCI RC,
>>> SMMU, ITS and platform devices. Using an IORT table, the relationship
>>> RID -> StreamID -> DeviceID can be obtained. More specifically, the IORT
>>> describes which device is behind which SMMU and which interrupt
>>> controller.
>>>
>>> RID is a requester ID in PCI context,
>>> StreamID is the ID of the device in SMMU context,
>>> DeviceID is the ID programmed in ITS.
>>>
>>> For a non-pci device RID could be simply an ID.
>>>
>>> Each iort_node contains an ID map array to translate from one ID into
>>> another.
>>> IDmap Entry {input_range, output_range, output_node_ref, id_count}
>>> This array is present in PCI RC node,SMMU node, Named component node etc
>>> and can reference to a SMMU or ITS node.
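
To illustrate the ID map idea, here is a minimal sketch of a single
translation step, for instance RID -> StreamID at a PCI RC node. The field
names loosely follow the ID mapping format in the IORT specification (where
the count is stored as "number of IDs minus one"); the struct and function
names are assumptions for this example and not part of any existing Xen
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One ID mapping entry of an IORT node (illustrative layout). */
struct id_map {
    uint32_t input_base;    /* first input ID covered by this entry */
    uint32_t id_count;      /* number of IDs in the range, minus one */
    uint32_t output_base;   /* output ID that input_base maps to */
    uint32_t output_ref;    /* reference to the output node (SMMU or ITS) */
};

/*
 * Translate "in" using the first matching entry. Returns false if no
 * entry covers the input ID.
 */
static bool id_map_translate(const struct id_map *map, size_t nr_entries,
                             uint32_t in, uint32_t *out, uint32_t *out_ref)
{
    assert(out != NULL && out_ref != NULL);

    for ( size_t i = 0; i < nr_entries; i++ )
    {
        const struct id_map *m = &map[i];

        if ( in < m->input_base || in - m->input_base > m->id_count )
            continue;

        *out = m->output_base + (in - m->input_base);
        *out_ref = m->output_ref;
        return true;
    }

    return false;
}
```

Getting from a RID all the way to a DeviceID would mean applying this step
again on the node that output_ref points to, until an ITS group node is
reached.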
>>>
>>> 2. Current Support of IORT
>>> ---
>>> Currently Xen passes host IORT table to dom0 without any modifications.
>>> For DomU no IORT table is passed.
>>>
>>> 3. IORT for Dom0
>>> -
>>> IORT for Dom0 is prepared by xen and it is fairly similar to the host
>>> iort.
>>> However a few nodes could be removed or modified. For instance
>>> - host SMMU nodes should not be present
>>> - ITS group nodes are same as host iort but, no stage2 mapping is done
>>> for them.
>> What do you mean with stage2 mapping?
> Please ignore this line. Copy paste error. Read it as follows
>
> - ITS group nodes are same as host iort.
> (though I would modify the same as in next draft)
>
>>
>>> - platform nodes (named components) may be selectively present depending
>>> on the case where xen is using some. This could be controlled by the xen
>>> command line.
>> Mmh, I am not so sure platform devices described in the IORT (those
>> which use MSIs!) are so much different from PCI devices here. My
>> understanding is those platform devices are network adapters, for
>> instance, for which Xen has no use.
> ok.
>> So I would translate "Named Components" or "platform devices" as devices
>> just not using the PCIe bus (so no config space and no (S)BDF), but
>> being otherwise the same from an ITS or SMMU point of view.
> Correct.
>>> - More items : TODO
>> I think we agreed upon rewriting the IORT table instead of patching it?
> yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was
> _rewriting_ most of Dom0 IORT and not patching it.

I was just after the wording above:
"IORT for Dom0 is prepared by xen and it is fairly similar to the host
iort. However a few nodes could be removed or modified."
... which sounds a bit like you alter the h/w IORT.
It would be good to clarify this by explicitly mentioning the
parsing/genera

[Xen-devel] [PATCH v3 1/2] arm/xen: vpl011: Fix the slow early console SBSA UART output

2017-10-24 Thread Andre Przywara
From: Bhupinder Thakur <bhupinder.tha...@linaro.org>

The early console output uses pl011_early_write() to write data. This
function waits for the BUSY bit to get cleared before writing the next byte.

In the SBSA UART emulation logic, the BUSY bit was set as soon as one
byte was written to the FIFO and it remained set until the FIFO was
emptied. This meant that the output was delayed as each character needed
the BUSY bit to get cleared.

Since the SBSA UART is emulated in Xen using ring buffers, data that has
been enqueued in the FIFO is guaranteed to be received by xenconsole, so
it is safe to set the BUSY bit only when the FIFO becomes full. This
ensures that pl011_early_write() is not delayed unduly when writing the
data.

Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
Reviewed-by: Andre Przywara <andre.przyw...@linaro.org>
Signed-off-by: Andre Przywara <andre.przyw...@linaro.org>
---
 xen/arch/arm/vpl011.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
index f7ddccb42a..0b0743679f 100644
--- a/xen/arch/arm/vpl011.c
+++ b/xen/arch/arm/vpl011.c
@@ -159,9 +159,15 @@ static void vpl011_write_data(struct domain *d, uint8_t data)
     {
         vpl011->uartfr |= TXFF;
         vpl011->uartris &= ~TXI;
-    }
 
-    vpl011->uartfr |= BUSY;
+        /*
+         * This bit is set only when FIFO becomes full. This ensures that
+         * the SBSA UART driver can write the early console data as fast as
+         * possible, without waiting for the BUSY bit to get cleared before
+         * writing each byte.
+         */
+        vpl011->uartfr |= BUSY;
+    }
 
     vpl011->uartfr &= ~TXFE;
 
@@ -371,11 +377,16 @@ static void vpl011_data_avail(struct domain *d)
     {
         vpl011->uartfr &= ~TXFF;
         vpl011->uartris |= TXI;
+
+        /*
+         * Clear the BUSY bit as soon as space becomes available
+         * so that the SBSA UART driver can start writing more data
+         * without any further delay.
+         */
+        vpl011->uartfr &= ~BUSY;
+
         if ( out_ring_qsize == 0 )
-        {
-            vpl011->uartfr &= ~BUSY;
             vpl011->uartfr |= TXFE;
-        }
     }
 
     vpl011_update_interrupt_status(d);
-- 
2.14.1




[Xen-devel] [PATCH v3 2/2] arm/xen: vpl011: Fix SBSA UART interrupt assertion

2017-10-24 Thread Andre Przywara
From: Bhupinder Thakur <bhupinder.tha...@linaro.org>

With the current SBSA UART emulation, streaming larger amounts of data
(as caused by "find /", for instance) can lead to character losses.
This is due to the OUT ring buffer getting full, because we change the
TX interrupt bit only when the FIFO is actually full, and not already
when it's half-way filled, as the Linux driver expects.
The SBSA spec does not explicitly state this, but we assume that an
SBSA compliant UART uses the PL011 default "interrupt FIFO level select
register" value of "1/2 way". The Linux driver certainly makes this
assumption, so it expects to be able to write a number of characters
after the TX interrupt line has been asserted.
Similarly, we have the same wrong behaviour on the receive side.
However changing the RX interrupt to trigger on reaching half of the FIFO
level will lead to lag, because the guest would not be notified of incoming
characters until the FIFO is half way filled. This leads to unacceptable
lags when typing on a terminal.
Real hardware solves this issue by using the "receive timeout
interrupt" (RTI), which is triggered when character reception stops for
32 baud cycles. As we cannot and do not want to emulate any timing here,
we slightly abuse the timeout interrupt to notify the guest of new
characters: when a new character comes in, the RTI is asserted, when
the FIFO is cleared, the interrupt gets cleared.

So this patch changes the emulated interrupt trigger behaviour to come
as close to real hardware as possible: the RX and TX interrupts trigger
when the FIFO gets half full / half empty, and the RTI signals
new incoming characters.

[Andre: reword commit message, introduce receive timeout interrupt, add
comments]

Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
Reviewed-by: Andre Przywara <andre.przyw...@linaro.org>
Signed-off-by: Andre Przywara <andre.przyw...@linaro.org>
---
 xen/arch/arm/vpl011.c| 131 ++-
 xen/include/asm-arm/vpl011.h |   2 +
 2 files changed, 94 insertions(+), 39 deletions(-)

diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
index 0b0743679f..6d02406acf 100644
--- a/xen/arch/arm/vpl011.c
+++ b/xen/arch/arm/vpl011.c
@@ -18,6 +18,9 @@
 
 #define XEN_WANT_FLEX_CONSOLE_RING 1
 
+/* We assume the PL011 default of "1/2 way" for the FIFO trigger level. */
+#define SBSA_UART_FIFO_LEVEL (SBSA_UART_FIFO_SIZE / 2)
+
 #include 
 #include 
 #include 
@@ -93,24 +96,37 @@ static uint8_t vpl011_read_data(struct domain *d)
      */
     if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) > 0 )
     {
+        unsigned int fifo_level;
+
         data = intf->in[xencons_mask(in_cons, sizeof(intf->in))];
         in_cons += 1;
         smp_mb();
         intf->in_cons = in_cons;
+
+        fifo_level = xencons_queued(in_prod, in_cons, sizeof(intf->in));
+
+        /* If the FIFO is now empty, we clear the receive timeout interrupt. */
+        if ( fifo_level == 0 )
+        {
+            vpl011->uartfr |= RXFE;
+            vpl011->uartris &= ~RTI;
+        }
+
+        /* If the FIFO is more than half empty, we clear the RX interrupt. */
+        if ( fifo_level < sizeof(intf->in) - SBSA_UART_FIFO_LEVEL )
+            vpl011->uartris &= ~RXI;
+
+        vpl011_update_interrupt_status(d);
     }
     else
         gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n");
 
-    if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) == 0 )
-    {
-        vpl011->uartfr |= RXFE;
-        vpl011->uartris &= ~RXI;
-    }
-
+    /*
+     * We have consumed a character or the FIFO was empty, so clear the
+     * "FIFO full" bit.
+     */
     vpl011->uartfr &= ~RXFF;
 
-    vpl011_update_interrupt_status(d);
-
     VPL011_UNLOCK(d, flags);
 
     /*
@@ -122,6 +138,24 @@ static uint8_t vpl011_read_data(struct domain *d)
     return data;
 }
 
+static void vpl011_update_tx_fifo_status(struct vpl011 *vpl011,
+                                         unsigned int fifo_level)
+{
+    struct xencons_interface *intf = vpl011->ring_buf;
+    unsigned int fifo_threshold = sizeof(intf->out) - SBSA_UART_FIFO_LEVEL;
+
+    BUILD_BUG_ON(sizeof (intf->out) < SBSA_UART_FIFO_SIZE);
+
+    /*
+     * Set the TXI bit only when there is space for fifo_size/2 bytes which
+     * is the trigger level for asserting/de-asserting the TX interrupt.
+     */
+    if ( fifo_level <= fifo_threshold )
+        vpl011->uartris |= TXI;
+    else
+        vpl011->uartris &= ~TXI;
+}
+
 static void vpl011_write_data(struct domain *d, uint8_t data)
 {
     unsigned long flags;
@@ -146,33 +180,37 @@ static void vpl011_write_data(struct domain *d, uint8_t 
data)
 if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) !=
  sizeof (intf->o

Re: [Xen-devel] [PATCH RFC] ARM: vPL011: use receive timeout interrupt

2017-10-24 Thread Andre Przywara
Hi,

On 24/10/17 12:00, Julien Grall wrote:
> Hi,
> 
> On 23/10/2017 17:01, Andre Przywara wrote:
>> Hi,
>>
>> On 18/10/17 17:32, Bhupinder Thakur wrote:
>>> Hi Andre,
>>>
>>> I verified this patch on qualcomm platform. It is working fine.
>>>
>>> On 18 October 2017 at 19:11, Andre Przywara <andre.przyw...@arm.com>
>>> wrote:
>>>> Instead of asserting the receive interrupt (RXI) on the first character
>>>> in the FIFO, lets (ab)use the receive timeout interrupt (RTI) for that
>>>> purpose. That seems to be closer to the spec and what hardware does.
>>>> Improve the readability of vpl011_data_avail() on the way.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>>>> ---
>>>> Hi,
>>>>
>>>> this one is the approach I mentioned in the email earlier today.
>>>> It goes on top of Bhupinders v12 27/27, but should eventually be merged
>>>> into this one once we agreed on the subject. I just carved it out here
>>>> for clarity to make it clearer what has been changed.
>>>> Would be good if someone could test it.
>>>>
>>>> Cheers,
>>>> Andre.
>>>>  xen/arch/arm/vpl011.c | 61
>>>> ---
>>>>  1 file changed, 29 insertions(+), 32 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
>>>> index adf1711571..ae18bddd81 100644
>>>> --- a/xen/arch/arm/vpl011.c
>>>> +++ b/xen/arch/arm/vpl011.c
>>>> @@ -105,9 +105,13 @@ static uint8_t vpl011_read_data(struct domain *d)
>>>>          if ( fifo_level == 0 )
>>>>          {
>>>>              vpl011->uartfr |= RXFE;
>>>> -            vpl011->uartris &= ~RXI;
>>>> -            vpl011_update_interrupt_status(d);
>>>> +            vpl011->uartris &= ~RTI;
>>>>          }
>>>> +
>>>> +        if ( fifo_level < sizeof(intf->in) - SBSA_UART_FIFO_SIZE / 2 )
>>>> +            vpl011->uartris &= ~RXI;
>>>> +
>>>> +        vpl011_update_interrupt_status(d);
>>> I think we check if ( fifo_level < SBSA_UART_FIFO_SIZE / 2 ) which
>>> should be a valid condition to clear the RX interrupt.
>>
>> Are you sure? My understanding is that the semantics of the return value
>> of xencons_queued() differs between intf and outf:
>> - For intf, Xen fills that buffer with incoming characters. The
>> watermark is assumed to be (FIFO / 2), which translates into 16
>> characters. Now for the SBSA vUART RX side that means: "Assert the RX
>> interrupt if there is only room for 16 (or fewer) characters in the FIFO"
>> (read: intf buffer in our case). Since we (ab)use the Xen buffer for the
>> FIFO, this means we warn if the number of queued characters exceeds
>> (buffersize - 16).
>> - For outf, the UART emulation fills the buffer. The SBSA vUART TX side
>> demands that the TX interrupt is asserted if the fill level of the
>> transmit FIFO is less than or equal to 16 characters, which means:
>> number of queued characters is less than 16.
>>
>> I think the key point is that our trigger level isn't symmetrical here,
>> since we have to emulate the architected 32-byte FIFO semantics for the
>> driver, but have a (secretly) much larger "FIFO" internally.
>>
>> Do you agree with this reasoning or do I have a thinko here? Could well
>> be I am seriously misguided here.
> 
> xencons_queued calculates how many bytes are currently on the ring. So I
> think your description makes sense.
> 
> With (fifo_level < (SBSA_UART_FIFO_SIZE / 2)), you would only clear it
> when the ring has less than 16 bytes queued.
> 
> I have a few requests on those patches for the resender:
> - Please introduce a define for SBSA_UART_FIFO_SIZE / 2 and use it
> everywhere.
> - Please add a bit more documentation on top of the checks in
> vpl011_read_data function. The checks in vpl011_write_data looks
> well-documented.

I am just rewording the commit message and was planning on re-sending
the (merged) patches later today (keeping Bhupinder's authorship, of
course).

I hope that Bhupinder doesn't mind and that this doesn't clash with any
of his plans.

Cheers,
Andre.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC] ARM: vPL011: use receive timeout interrupt

2017-10-23 Thread Andre Przywara
Hi,

On 18/10/17 17:32, Bhupinder Thakur wrote:
> Hi Andre,
> 
> I verified this patch on qualcomm platform. It is working fine.
> 
> On 18 October 2017 at 19:11, Andre Przywara <andre.przyw...@arm.com> wrote:
>> Instead of asserting the receive interrupt (RXI) on the first character
>> in the FIFO, lets (ab)use the receive timeout interrupt (RTI) for that
>> purpose. That seems to be closer to the spec and what hardware does.
>> Improve the readability of vpl011_data_avail() on the way.
>>
>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>> ---
>> Hi,
>>
>> this one is the approach I mentioned in the email earlier today.
>> It goes on top of Bhupinders v12 27/27, but should eventually be merged
>> into this one once we agreed on the subject. I just carved it out here
>> for clarity to make it clearer what has been changed.
>> Would be good if someone could test it.
>>
>> Cheers,
>> Andre.
>>  xen/arch/arm/vpl011.c | 61 
>> ---
>>  1 file changed, 29 insertions(+), 32 deletions(-)
>>
>> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
>> index adf1711571..ae18bddd81 100644
>> --- a/xen/arch/arm/vpl011.c
>> +++ b/xen/arch/arm/vpl011.c
>> @@ -105,9 +105,13 @@ static uint8_t vpl011_read_data(struct domain *d)
>>  if ( fifo_level == 0 )
>>  {
>>  vpl011->uartfr |= RXFE;
>> -vpl011->uartris &= ~RXI;
>> -vpl011_update_interrupt_status(d);
>> +vpl011->uartris &= ~RTI;
>>  }
>> +
>> +if ( fifo_level < sizeof(intf->in) - SBSA_UART_FIFO_SIZE / 2 )
>> +vpl011->uartris &= ~RXI;
>> +
>> +vpl011_update_interrupt_status(d);
> I think we check if ( fifo_level < SBSA_UART_FIFO_SIZE / 2 ) which
> should be a valid condition to clear the RX interrupt.

Are you sure? My understanding is that the semantics of the return value
of xencons_queued() differs between intf and outf:
- For intf, Xen fills that buffer with incoming characters. The
watermark is assumed to be (FIFO / 2), which translates into 16
characters. Now for the SBSA vUART RX side that means: "Assert the RX
interrupt if there is only room for 16 (or fewer) characters in the FIFO"
(read: intf buffer in our case). Since we (ab)use the Xen buffer for the
FIFO, this means we warn if the number of queued characters exceeds
(buffersize - 16).
- For outf, the UART emulation fills the buffer. The SBSA vUART TX side
demands that the TX interrupt is asserted if the fill level of the
transmit FIFO is less than or equal to 16 characters, which means:
number of queued characters is less than 16.

I think the key point is that our trigger level isn't symmetrical here,
since we have to emulate the architected 32-byte FIFO semantics for the
driver, but have a (secretly) much larger "FIFO" internally.

Do you agree with this reasoning or do I have a thinko here? Could well
be I am seriously misguided here.
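
To make the asymmetry concrete, here is a minimal sketch of the two
trigger checks as the patch above applies them. The helper names
(ring_queued, rx_irq_asserted, tx_irq_asserted) are hypothetical and not
part of vpl011.c, ring_queued() is a simplified stand-in for
xencons_queued(), and the 1024-byte ring size used when trying it out is
only an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Architected SBSA UART FIFO depth and its half-way trigger level. */
#define SBSA_UART_FIFO_SIZE 32u
#define FIFO_TRIGGER_LEVEL  (SBSA_UART_FIFO_SIZE / 2)

/*
 * Simplified stand-in for xencons_queued(): bytes currently on a ring
 * that uses free-running 32-bit producer/consumer indices. Unsigned
 * subtraction handles index wrap-around.
 */
static unsigned int ring_queued(unsigned int prod, unsigned int cons)
{
    return prod - cons;
}

/*
 * RX side: the (much larger) Xen ring plays the role of the FIFO, so RXI
 * stays asserted only while 16 or fewer free slots remain. This mirrors
 * the patch, which clears RXI when fifo_level < sizeof(intf->in) - 16.
 */
static bool rx_irq_asserted(unsigned int queued, size_t ring_size)
{
    assert(ring_size >= SBSA_UART_FIFO_SIZE);
    return queued >= ring_size - FIFO_TRIGGER_LEVEL;
}

/*
 * TX side: mirrors vpl011_update_tx_fifo_status(), which sets TXI when
 * fifo_level <= sizeof(intf->out) - 16, i.e. there is room for at least
 * half an architected FIFO.
 */
static bool tx_irq_asserted(unsigned int queued, size_t ring_size)
{
    assert(ring_size >= SBSA_UART_FIFO_SIZE);
    return queued <= ring_size - FIFO_TRIGGER_LEVEL;
}
```

With a 1024-byte ring, both thresholds sit at 1008 queued bytes. A check
against (SBSA_UART_FIFO_SIZE / 2) alone would instead clear RXI only once
fewer than 16 bytes are queued.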

Cheers,
Andre

>>  }
>>  else
>>  gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n");
>> @@ -129,7 +133,7 @@ static void vpl011_update_tx_fifo_status(struct vpl011 *vpl011,
>>   unsigned int fifo_level)
>>  {
>>  struct xencons_interface *intf = vpl011->ring_buf;
>> -unsigned int fifo_threshold;
>> +unsigned int fifo_threshold = sizeof(intf->out) - SBSA_UART_FIFO_SIZE/2;
>>
>>  BUILD_BUG_ON(sizeof (intf->out) < SBSA_UART_FIFO_SIZE);
>>
>> @@ -137,8 +141,6 @@ static void vpl011_update_tx_fifo_status(struct vpl011 *vpl011,
>>   * Set the TXI bit only when there is space for fifo_size/2 bytes which
>>   * is the trigger level for asserting/de-asserting the TX interrupt.
>>   */
>> -fifo_threshold = sizeof(intf->out) - SBSA_UART_FIFO_SIZE/2;
>> -
>>  if ( fifo_level <= fifo_threshold )
>>  vpl011->uartris |= TXI;
>>  else
>> @@ -390,35 +392,30 @@ static void vpl011_data_avail(struct domain *d)
>>  out_cons,
>>  sizeof(intf->out));
>>
>> -/* Update the uart rx state if the buffer is not empty. */
>> -if ( in_fifo_level != 0 )
>> -{
>> +/* Update the UART RX state */
>> +
>> +/* Clear the FIFO_EMPTY bit if the FIFO holds at least one character. */
>> +if ( in_fifo_level > 0 )
>>  vpl011->uartfr &= ~RXF

Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-23 Thread Andre Przywara
Hi Manish,

On 12/10/17 22:03, Manish Jaggi wrote:
> ACPI/IORT Support in Xen.
> --
> 
> I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending
> the scope
> and including all that is required to support ACPI/IORT in Xen.
> Presenting for review
> first _draft_ of design of ACPI/IORT support in Xen. Not complete though.
> 
> Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
> It is proposed that IORT be parsed and the information is saved into xen
> data-structure
> say host_iort_struct and is reused by all xen subsystems like ITS / SMMU
> etc.
> 
> Since this is a first draft, it is open to technical comments, modifications
> and suggestions. Please be open and feel free to add any missing points
> / additions.
> 
> 1. What is IORT. What are its components ?
> 2. Current Support in Xen
> 3. IORT for Dom0
> 4. IORT for DomU
> 5. Parsing of IORT in Xen
> 6. Generation of IORT
> 7. Future Work and TODOs
> 
> 1. What is IORT. What are its components ?
> 
> IORT refers to Input Output remapping table. It is essentially used to find
> information about the IO topology (PCIRC-SMMU-ITS) and relationships
> between
> devices.
> 
> A general structure of IORT has nodes which have information about
> PCI RC,
> SMMU, ITS and Platform devices. Using an IORT table relationship between
> RID -> StreamID -> DeviceId can be obtained. More specifically which
> device is
> behind which SMMU and which interrupt controller, this topology is
> described in
> IORT Table.
> 
> RID is a requester ID in PCI context,
> StreamID is the ID of the device in SMMU context,
> DeviceID is the ID programmed in ITS.
> 
> For a non-pci device RID could be simply an ID.
> 
> Each iort_node contains an ID map array to translate from one ID into
> another.
> IDmap Entry {input_range, output_range, output_node_ref, id_count}
> This array is present in PCI RC node,SMMU node, Named component node etc
> and can reference to a SMMU or ITS node.
> 
> 2. Current Support of IORT
> ---
> Currently Xen passes host IORT table to dom0 without any modifications.
> For DomU no IORT table is passed.
> 
> 3. IORT for Dom0
> -
> IORT for Dom0 is prepared by xen and it is fairly similar to the host iort.
> However a few nodes could be removed or modified. For instance
> - host SMMU nodes should not be present
> - ITS group nodes are same as host iort but, no stage2 mapping is done
> for them.

What do you mean by stage2 mapping?

> - platform nodes (named components) may be selectively present depending
> on the case where xen is using some. This could be controlled by  xen command
> line.

Mmh, I am not so sure platform devices described in the IORT (those
which use MSIs!) are so much different from PCI devices here. My
understanding is those platform devices are network adapters, for
instance, for which Xen has no use.
So I would translate "Named Components" or "platform devices" as devices
just not using the PCIe bus (so no config space and no (S)BDF), but
being otherwise the same from an ITS or SMMU point of view.

> - More items : TODO

I think we agreed upon rewriting the IORT table instead of patching it?
So to some degree your statements are true, but when we rewrite the IORT
table without SMMUs (and possibly without other components like the
PMUs), it would be kind of a stretch to call it "fairly similar to the
host IORT". I think "based on the host IORT" would be more precise.

> 4. IORT for DomU
> -
> IORT for DomU is generated by the toolstack. IORT topology is different
> when DomU supports device passthrough.

Can you elaborate on that? Different compared to what? My understanding
is that without device passthrough there would be no IORT in the first
place?

> At a minimum domU IORT should include a single PCIRC and ITS Group.
> Similar PCIRC can be added in DSDT.
> Additional node can be added if platform device is assigned to domU.
> No extra node should be required for PCI device pass-through.

Again I don't fully understand this last sentence.

> It is proposed that the idrange of PCIRC and ITS group be constant for
> domUs.

"constant" is a bit confusing here. Maybe "arbitrary", "from scratch" or
"independent from the actual h/w"?

> In case of PCI PT, using a domctl the toolstack can communicate
> physical RID: virtual RID, deviceID: virtual deviceID to xen.
> 
> It is assumed that domU PCI Config access would be trapped in Xen. The
> RID at which assigned device is enumerated would be the one provided by the
> domctl, domctl_set_deviceid_mapping
> 
> TODO: device assign domctl i/f.
> Note: This should suffice the virtual deviceID support pointed by Andre.
> [4]

Well, there's more to it. First thing: while I tried to allow virtual
ITS deviceIDs to be different from physical ones, at the moment they
are fixed to being mapped 1:1 in the code.

So the first step would 

Re: [Xen-devel] [PATCH v2] arm: configure interrupts to be in non-secure group1

2017-10-20 Thread Andre Przywara
Hi,

On 18/10/17 22:29, Stefano Stabellini wrote:
> Xen uses non-secure group1 interrupts, however it doesn't configure the
> GICv3 accordingly. Xen needs to set GICD_IGROUPR for SPIs and
> GICR_IGROUPR0 for local interrupt to "1" to specify that interrupts
> belong to group1. This is particularly important if the system has
> GICD_CTLR.DS set, also see commit
> 7c9b973061b03af62734f613f6abec46c0dd4a88 in Linux.

Indeed, good catch!
The spec says that those registers initialize to 0, and on normal
hardware this will be adjusted by the secure firmware side.
That's why we didn't see the issue before.
Now with QEMU there might be no secure firmware, also the emulated GIC
only provides a single security state, so we have to set this up ourselves.
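
As a rough illustration of what the patch does to the group registers,
here is a small model with the register bank replaced by a plain array.
The helper names and the array are hypothetical; the real code writes
GENMASK(31, 0) to GICD_IGROUPR (and GICR_IGROUPR0 for SGIs/PPIs) via
writel_relaxed():

```c
#include <assert.h>
#include <stdint.h>

/* SGIs and PPIs (IRQs 0-31) are per-CPU and live in the redistributor. */
#define NR_LOCAL_IRQS 32u

/*
 * Hypothetical model of the GICD_IGROUPR bank: one 32-bit register per
 * 32 interrupts, one bit per interrupt (0 = Group 0, 1 = Group 1).
 * Marks every SPI as Group 1, leaving the local IRQ word untouched.
 */
static void set_spis_group1(uint32_t *igroupr, unsigned int nr_lines)
{
    for (unsigned int i = NR_LOCAL_IRQS; i < nr_lines; i += 32)
        igroupr[i / 32] = UINT32_MAX;
}

/* Returns the group (0 or 1) a given interrupt is configured for. */
static unsigned int irq_group(const uint32_t *igroupr, unsigned int irq)
{
    return (igroupr[irq / 32] >> (irq % 32)) & 1u;
}
```

Since the registers reset to 0, any interrupt not touched this way stays
in Group 0, which is what bites on a GIC with a single security state.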

> Signed-off-by: Stefano Stabellini <sstabell...@kernel.org>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>


> This is a candidate for stable backports.

It should definitely go into 4.10.

Cheers,
Andre.

> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 74d00e0..77da892 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -569,6 +569,13 @@ static void __init gicv3_dist_init(void)
>  for ( i = NR_GIC_LOCAL_IRQS; i < nr_lines; i += 32 )
>  writel_relaxed(0x, GICD + GICD_ICENABLER + (i / 32) * 4);
>  
> +/*
> + * Configure SPIs as non-secure Group-1. This will only matter
> + * if the GIC only has a single security state.
> + */
> +for ( i = NR_GIC_LOCAL_IRQS; i < nr_lines; i += 32 )
> +writel_relaxed(GENMASK(31, 0), GICD + GICD_IGROUPR + (i / 32) * 4);
> +
>  gicv3_dist_wait_for_rwp();
>  
>  /* Turn on the distributor */
> @@ -775,6 +782,8 @@ static int gicv3_cpu_init(void)
>   */
>  writel_relaxed(0x, GICD_RDIST_SGI_BASE + GICR_ICENABLER0);
>  writel_relaxed(0x, GICD_RDIST_SGI_BASE + GICR_ISENABLER0);
> +/* Configure SGIs/PPIs as non-secure Group-1 */
> +writel_relaxed(GENMASK(31, 0), GICD_RDIST_SGI_BASE + GICR_IGROUPR0);
>  
>  gicv3_redist_wait_for_rwp();
>  
> 



Re: [Xen-devel] [PATCH 00/12] ARM: VGIC/GIC separation cleanups

2017-10-19 Thread Andre Przywara
Hi,

On 19/10/17 13:48, Andre Przywara wrote:
> By the original VGIC design, Xen differentiates between the actual VGIC
> emulation on one hand and the GIC hardware accesses on the other.
> It seems there were some deviations from that scheme (over time?), so at
> the moment we end up happily accessing VGIC specific data structures
> like struct pending_irq and struct vgic_irq_rank from pure GIC files
> like gic.c or even irq.c (try: git grep -l struct\ pending_irq xen/arch/arm).
> But any future VGIC rework will depend on a clean separation, so this
> series tries to clean this up.
> It starts with some rather innocent patches, reaches its peak with the
> ugly patch 5/12 and the heavy 6/12, and calms down in the rest of the
> series again.
> After this series there are no more references to VGIC structures from
> GIC files, at least for non-ITS code. The ITS is a beast on its own
> (blame the author) and will be addressed later.
> 
> This is a first shot, any ideas on improvements are welcome.

Forgot to mention: This is of course not 4.10 material.

And I tested this on Midway and Juno, with two guests migrating
interrupts like crazy over night:
           CPU0       CPU1
 18:    8892519    8892530     GIC-0  27 Level     arch_timer
 19:  193048966  192887534     GIC-0  31 Level     events
 20:        366          0   xen-dyn    Edge-event xenbus
 21:     180335     183325   xen-dyn    Edge-event hvc_console
 22:  112174867   81289537   xen-dyn    Edge-event blkif
 23:   80768079  111489990   xen-dyn    Edge-event blkif

But please give it a good shake on your setup to spot any regressions.

Cheers,
Andre.



[Xen-devel] [PATCH 07/12] ARM: VGIC: split gic.c to observe hardware/virtual GIC separation

2017-10-19 Thread Andre Przywara
Currently gic.c holds code to handle hardware IRQs as well as code to
bridge VGIC requests to the GIC virtualization hardware.
Despite being named gic.c, this file reaches into the VGIC and uses data
structures describing virtual IRQs.
To improve abstraction, move the VGIC functions into a separate file,
so that gic.c does what it says on the tin.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/Makefile   |   1 +
 xen/arch/arm/gic-vgic.c | 395 
 xen/arch/arm/gic.c  | 348 +-
 3 files changed, 398 insertions(+), 346 deletions(-)
 create mode 100644 xen/arch/arm/gic-vgic.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 30a2a6500a..41d7366527 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -16,6 +16,7 @@ obj-y += domain_build.o
 obj-y += domctl.o
 obj-$(EARLY_PRINTK) += early_printk.o
 obj-y += gic.o
+obj-y += gic-vgic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
 obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
new file mode 100644
index 00..66cae21e82
--- /dev/null
+++ b/xen/arch/arm/gic-vgic.c
@@ -0,0 +1,395 @@
+/*
+ * xen/arch/arm/gic-vgic.c
+ *
+ * ARM Generic Interrupt Controller virtualization support
+ *
+ * Tim Deegan <t...@xen.org>
+ * Copyright (c) 2011 Citrix Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern uint64_t per_cpu__lr_mask;
+extern const struct gic_hw_operations *gic_hw_ops;
+
+#define lr_all_full() (this_cpu(lr_mask) == ((1 << gic_hw_ops->info->nr_lrs) - 1))
+
+#undef GIC_DEBUG
+
+static void gic_update_one_lr(struct vcpu *v, int i);
+
+static inline void gic_set_lr(int lr, struct pending_irq *p,
+  unsigned int state)
+{
+ASSERT(!local_irq_is_enabled());
+
+clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status);
+
+gic_hw_ops->update_lr(lr, p, state);
+
+set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
+clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
+p->lr = lr;
+}
+
+static inline void gic_add_to_lr_pending(struct vcpu *v, struct pending_irq *n)
+{
+struct pending_irq *iter;
+
+ASSERT(spin_is_locked(&v->arch.vgic.lock));
+
+if ( !list_empty(&n->lr_queue) )
+return;
+
+list_for_each_entry ( iter, &v->arch.vgic.lr_pending, lr_queue )
+{
+if ( iter->priority > n->priority )
+{
+list_add_tail(&n->lr_queue, &iter->lr_queue);
+return;
+}
+}
+list_add_tail(&n->lr_queue, &v->arch.vgic.lr_pending);
+}
+
+void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
+{
+struct pending_irq *n = irq_to_pending(v, virtual_irq);
+
+/* If an LPI has been removed meanwhile, there is nothing left to raise. */
+if ( unlikely(!n) )
+return;
+
+ASSERT(spin_is_locked(&v->arch.vgic.lock));
+
+/* Don't try to update the LR if the interrupt is disabled */
+if ( !test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) )
+return;
+
+if ( list_empty(&n->lr_queue) )
+{
+if ( v == current )
+gic_update_one_lr(v, n->lr);
+}
+#ifdef GIC_DEBUG
+else
+gdprintk(XENLOG_DEBUG, "trying to inject irq=%u into d%dv%d, when it is still lr_pending\n",
+ virtual_irq, v->domain->domain_id, v->vcpu_id);
+#endif
+}
+
+/*
+ * Find an unused LR to insert an IRQ into, starting with the LR given
+ * by @lr. If this new interrupt is a PRISTINE LPI, scan the other LRs to
+ * avoid inserting the same IRQ twice. This situation can occur when an
+ * event gets discarded while the LPI is in an LR, and a new LPI with the
+ * same number gets mapped quickly afterwards.
+ */
+static unsigned int gic_find_unused_lr(struct vcpu *v,
+   struct pending_irq *p,
+   unsigned int lr)
+{
+unsigned int nr_lrs = gic_hw_ops->info->nr_lrs;
+unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask);
+struct gic_lr lr_val;
+
+ASSERT(spin_is_locked(&v->arch.vgic.lock));
+
+if ( unlikely(test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status)) )
+{
+unsigned int used_lr;

[Xen-devel] [PATCH 10/12] ARM: VGIC: factor out vgic_connect_hw_irq()

2017-10-19 Thread Andre Przywara
At the moment we happily access VGIC internal data structures like
the rank and struct pending_irq in gic.c, which should be VGIC agnostic.

Factor out a new function vgic_connect_hw_irq(), which allows a virtual
IRQ to be connected to a hardware IRQ (using the hw bit in the LR).

This removes said accesses to VGIC data structures and improves abstraction.
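
The connect/disconnect semantics can be sketched in isolation as below.
This is only a simplified model: the struct and function names are
hypothetical stand-ins for struct pending_irq, struct irq_desc and
vgic_connect_hw_irq(), and the rank locking done by the real function is
left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_desc. */
struct hw_irq {
    unsigned int irq;
};

/* Hypothetical stand-in for the relevant parts of struct pending_irq. */
struct virq_state {
    struct hw_irq *desc;   /* connected hardware IRQ, NULL if none */
    bool guest_enabled;    /* models the GIC_IRQ_GUEST_ENABLED bit */
};

/*
 * Connect a virtual IRQ to a hardware IRQ (desc != NULL) or disconnect
 * it (desc == NULL). Connecting fails with -EBUSY if the virtual IRQ is
 * already connected or already enabled by the guest.
 */
static int connect_hw_irq(struct virq_state *p, struct hw_irq *desc)
{
    if (desc) {
        if (p->desc || p->guest_enabled)
            return -EBUSY;
        p->desc = desc;
    } else {
        p->desc = NULL;
    }

    return 0;
}
```

Passing NULL for the descriptor is what lets gic_remove_irq_from_guest()
reuse the same entry point for the disconnect case.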

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic-vgic.c| 31 +++
 xen/arch/arm/gic.c | 42 ++
 xen/include/asm-arm/vgic.h |  2 ++
 3 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 66cae21e82..bf9455a34e 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -385,6 +385,37 @@ void gic_inject(struct vcpu *v)
 gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
 }
 
+int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
+struct irq_desc *desc)
+{
+unsigned long flags;
+/* Use vcpu0 to retrieve the pending_irq struct. Given that we only
+ * route SPIs to guests, it doesn't make any difference. */
+struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
+struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
+struct pending_irq *p = irq_to_pending(v_target, virq);
+int ret = 0;
+
+/* We are taking the rank lock to prevent parallel connections. */
+vgic_lock_rank(v_target, rank, flags);
+
+if ( desc )
+{
+/* The VIRQ should not be already enabled by the guest */
+if ( !p->desc &&
+ !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+p->desc = desc;
+else
+ret = -EBUSY;
+}
+else
+p->desc = NULL;
+
+vgic_unlock_rank(v_target, rank, flags);
+
+return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 4cb74d449e..d46a6d54b3 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -128,27 +128,12 @@ void gic_route_irq_to_xen(struct irq_desc *desc, unsigned int priority)
 int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
struct irq_desc *desc, unsigned int priority)
 {
-unsigned long flags;
-/* Use vcpu0 to retrieve the pending_irq struct. Given that we only
- * route SPIs to guests, it doesn't make any difference. */
-struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
-struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
-struct pending_irq *p = irq_to_pending(v_target, virq);
-int res = -EBUSY;
-
 ASSERT(spin_is_locked(&desc->lock));
 /* Caller has already checked that the IRQ is an SPI */
 ASSERT(virq >= 32);
 ASSERT(virq < vgic_num_irqs(d));
 ASSERT(!is_lpi(virq));
 
-vgic_lock_rank(v_target, rank, flags);
-
-if ( p->desc ||
- /* The VIRQ should not be already enabled by the guest */
- test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
-goto out;
-
 desc->handler = gic_hw_ops->gic_guest_irq_type;
 set_bit(_IRQ_GUEST, &desc->status);
 
@@ -156,31 +141,19 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
 gic_set_irq_type(desc, desc->arch.type);
 gic_set_irq_priority(desc, priority);
 
-p->desc = desc;
-res = 0;
-
-out:
-vgic_unlock_rank(v_target, rank, flags);
-
-return res;
+return vgic_connect_hw_irq(d, NULL, virq, desc);
 }
 
 /* This function only works with SPIs for now */
 int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
   struct irq_desc *desc)
 {
-struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
-struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
-struct pending_irq *p = irq_to_pending(v_target, virq);
-unsigned long flags;
+int ret;
 
 ASSERT(spin_is_locked(&desc->lock));
 ASSERT(test_bit(_IRQ_GUEST, &desc->status));
-ASSERT(p->desc == desc);
 ASSERT(!is_lpi(virq));
 
-vgic_lock_rank(v_target, rank, flags);
-
 if ( d->is_dying )
 {
 desc->handler->shutdown(desc);
@@ -198,19 +171,16 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
  */
 if ( test_bit(_IRQ_INPROGRESS, &desc->status) ||
  !test_bit(_IRQ_DISABLED, &desc->status) )
-{
-vgic_unlock_rank(v_target, rank, flags);
 return -EBUSY;
-}
 }
 
+ret = vgic_connect_hw_irq(d, NULL, virq, NULL);
+if ( ret )
+return ret;
+
 clear_bit(_IRQ_GUEST, &desc->status);
 desc->handler = &no_irq_type;
 
-p->desc = NULL;
-
-vgic_unlock_rank(v_target, rank, flags);
-
 return 0;
 }
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index dcdb1acaf3..cf02dc6394 100644
--- a/xe

[Xen-devel] [PATCH 04/12] ARM: VGIC: move gic_remove_irq_from_queues()

2017-10-19 Thread Andre Przywara
gic_remove_irq_from_queues() was not only misnamed, it also has the wrong
abstraction, as it should not live in gic.c.
Move it into vgic.c and vgic.h, where it belongs, and rename it on
the way.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c |  9 -
 xen/arch/arm/vgic-v3-its.c |  4 ++--
 xen/arch/arm/vgic.c| 11 ++-
 xen/include/asm-arm/gic.h  |  1 -
 xen/include/asm-arm/vgic.h |  1 +
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 75b2e0e0ca..ef041354ea 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -411,15 +411,6 @@ void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p)
 list_del_init(&p->lr_queue);
 }
 
-void gic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p)
-{
-ASSERT(spin_is_locked(&v->arch.vgic.lock));
-
-clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
-list_del_init(&p->inflight);
-gic_remove_from_lr_pending(v, p);
-}
-
 void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
 {
 struct pending_irq *n = irq_to_pending(v, virtual_irq);
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 72a5c70656..d8fa44258d 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -381,7 +381,7 @@ static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
  * have no active state, we don't need to care about this here.
  */
 if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-gic_remove_irq_from_queues(vcpu, p);
+vgic_remove_irq_from_queues(vcpu, p);
 
 spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
 ret = 0;
@@ -619,7 +619,7 @@ static int its_discard_event(struct virt_its *its,
 }
 
 /* Cleanup the pending_irq and disconnect it from the LPI. */
-gic_remove_irq_from_queues(vcpu, p);
+vgic_remove_irq_from_queues(vcpu, p);
 vgic_init_pending_irq(p, INVALID_LPI);
 
 spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 451a306a98..cd50b90d67 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -281,7 +281,7 @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
 /* If the IRQ is still lr_pending, re-inject it to the new vcpu */
 if ( !list_empty(&p->lr_queue) )
 {
-gic_remove_irq_from_queues(old, p);
+vgic_remove_irq_from_queues(old, p);
 irq_set_affinity(p->desc, cpumask_of(new->processor));
 spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
 vgic_vcpu_inject_irq(new, irq);
@@ -510,6 +510,15 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }
 
+void vgic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p)
+{
+ASSERT(spin_is_locked(&v->arch.vgic.lock));
+
+clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
+list_del_init(&p->inflight);
+gic_remove_from_lr_pending(v, p);
+}
+
 void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 {
 uint8_t priority;
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 2f248301ce..030c1d86a7 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -243,7 +243,6 @@ extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
 unsigned int priority);
 extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq);
 extern void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p);
-extern void gic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
 
 /* Accept an interrupt from the GIC and dispatch its handler */
 extern void gic_interrupt(struct cpu_user_regs *regs, int is_fiq);
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index e489d0bf21..8d0ff65708 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -204,6 +204,7 @@ extern int vcpu_vgic_init(struct vcpu *v);
 extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
+void vgic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
 extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
-- 
2.14.1




[Xen-devel] [PATCH 01/12] ARM: remove unneeded gic.h inclusions

2017-10-19 Thread Andre Przywara
gic.h is supposed to hold defines and prototypes for the hardware side
of the GIC interrupt controller. A lot of parts in Xen should not be
bothered with that, as they either only care about the VGIC or use
more generic interfaces.
Remove inclusions of gic.h from files that do not actually need them.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/domain_build.c  | 1 -
 xen/arch/arm/p2m.c   | 1 -
 xen/arch/arm/platforms/vexpress.c| 1 -
 xen/arch/arm/platforms/xgene-storm.c | 1 -
 xen/arch/arm/time.c  | 1 -
 xen/arch/arm/traps.c | 1 -
 xen/arch/arm/vpsci.c | 1 -
 xen/arch/arm/vtimer.c| 1 -
 8 files changed, 8 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index bf29299707..e7899fbf19 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 
-#include 
 #include 
 #include 
 #include "kernel.h"
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 68b488997d..07f5cc4468 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -10,7 +10,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/arm/platforms/vexpress.c b/xen/arch/arm/platforms/vexpress.c
index 39b6bcc70e..70839d676f 100644
--- a/xen/arch/arm/platforms/vexpress.c
+++ b/xen/arch/arm/platforms/vexpress.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #define DCC_SHIFT  26
 #define FUNCTION_SHIFT 20
diff --git a/xen/arch/arm/platforms/xgene-storm.c b/xen/arch/arm/platforms/xgene-storm.c
index 3b007fe5ed..deb8479a49 100644
--- a/xen/arch/arm/platforms/xgene-storm.c
+++ b/xen/arch/arm/platforms/xgene-storm.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 
 /* XGENE RESET Specific defines */
 #define XGENE_RESET_ADDR0x1714UL
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index 105c7410c7..36f640f0c1 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index f6f6de3691..ff3d6ff2aa 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -43,7 +43,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 0e024f7578..cd724904ef 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -15,7 +15,6 @@
 #include 
 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 3f84893a74..f52a723a5f 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -24,7 +24,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.14.1




[Xen-devel] [PATCH 02/12] ARM: vGIC: fix nr_irq definition

2017-10-19 Thread Andre Przywara
The global variable "nr_irqs" is used for x86 and some common Xen code.
To make the latter work easily for ARM, it was #defined to NR_IRQS.
This not only violated the common habit of capitalizing macros, but
also caused issues if one wanted to use a rather innocent "nr_irqs" as
a local variable name or as a function parameter.
Drop the optimization and make nr_irqs a normal variable for ARM also.
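
To illustrate the naming problem described above, here is a minimal sketch,
not taken from the patch. The value of NR_IRQS and the helper count_spis()
are hypothetical; the count of 32 local IRQs per CPU is an assumption made
for the example.

```c
#include <assert.h>

#define NR_IRQS 1024   /* hypothetical value, for illustration only */

/*
 * With "#define nr_irqs NR_IRQS" in scope, the parameter declaration below
 * would expand to "unsigned int 1024" and fail to compile. With nr_irqs as
 * an ordinary variable, the parameter simply shadows the global.
 */
unsigned int nr_irqs = NR_IRQS;

/* Hypothetical helper: number of shared IRQs, assuming 32 local ones. */
unsigned int count_spis(unsigned int nr_irqs)
{
    assert(nr_irqs >= 32);
    return nr_irqs - 32;
}
```

Calling count_spis(nr_irqs) at file scope level passes the global, while
inside the function the name refers to the parameter only.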

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/irq.c        | 2 ++
 xen/include/asm-arm/irq.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index cbc7e6ebb8..7f133de549 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 
+unsigned int __read_mostly nr_irqs = NR_IRQS;
+
 static unsigned int local_irqs_type[NR_LOCAL_IRQS];
 static DEFINE_SPINLOCK(local_irqs_type_lock);
 
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 2de76d0f56..abc8f06a13 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -31,7 +31,7 @@ struct arch_irq_desc {
 /* LPIs are always numbered starting at 8192, so 0 is a good invalid case. */
 #define INVALID_LPI 0
 
-#define nr_irqs NR_IRQS
+extern unsigned int nr_irqs;
 #define nr_static_irqs NR_IRQS
 #define arch_hwdom_irqs(domid) NR_IRQS
 
-- 
2.14.1




[Xen-devel] [PATCH 08/12] ARM: VGIC: split up gic_dump_info() to cover virtual part separately

2017-10-19 Thread Andre Przywara
Currently gic_dump_info() not only dumps the hardware state of the GIC,
but also the VGIC internal virtual IRQ lists.
Split the latter off and move it into vgic.c to observe the abstraction.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/domain.c  |  1 +
 xen/arch/arm/gic.c | 12 
 xen/arch/arm/vgic.c| 11 +++
 xen/include/asm-arm/vgic.h |  2 ++
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 73f4d4b2b2..5250bc2f88 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -942,6 +942,7 @@ long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
 void arch_dump_vcpu_info(struct vcpu *v)
 {
 gic_dump_info(v);
+vgic_dump_info(v);
 }
 
 void vcpu_mark_events_pending(struct vcpu *v)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 04e6d66b69..4cb74d449e 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -443,20 +443,8 @@ static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *r
 
 void gic_dump_info(struct vcpu *v)
 {
-struct pending_irq *p;
-
 printk("GICH_LRs (vcpu %d) mask=%"PRIx64"\n", v->vcpu_id, v->arch.lr_mask);
 gic_hw_ops->dump_state(v);
-
-list_for_each_entry ( p, &v->arch.vgic.inflight_irqs, inflight )
-{
-printk("Inflight irq=%u lr=%u\n", p->irq, p->lr);
-}
-
-list_for_each_entry( p, &v->arch.vgic.lr_pending, lr_queue )
-{
-printk("Pending irq=%d\n", p->irq);
-}
 }
 
 void init_maintenance_interrupt(void)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 2cdaca7480..37a083e804 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -656,6 +656,17 @@ void vgic_free_virq(struct domain *d, unsigned int virq)
 clear_bit(virq, d->arch.vgic.allocated_irqs);
 }
 
+void vgic_dump_info(struct vcpu *v)
+{
+struct pending_irq *p;
+
+list_for_each_entry ( p, &v->arch.vgic.inflight_irqs, inflight )
+printk("Inflight irq=%u lr=%u\n", p->irq, p->lr);
+
+list_for_each_entry( p, &v->arch.vgic.lr_pending, lr_queue )
+printk("Pending irq=%d\n", p->irq);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 0d3810e6af..49b8a4bec0 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -226,6 +226,8 @@ extern bool vgic_to_sgi(struct vcpu *v, register_t sgir,
 const struct sgi_target *target);
 extern bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq);
 
+void vgic_dump_info(struct vcpu *v);
+
 /* Reserve a specific guest vIRQ */
 extern bool vgic_reserve_virq(struct domain *d, unsigned int virq);
 
-- 
2.14.1




[Xen-devel] [PATCH 05/12] ARM: VGIC: move gic_remove_from_lr_pending()

2017-10-19 Thread Andre Przywara
gic_remove_from_lr_pending() was not only misnamed, it also had the wrong
abstraction, as it should not live in gic.c.
Move it into vgic.c and vgic.h, where it belongs, and rename it on the
way.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c |  7 ---
 xen/arch/arm/vgic-v3-its.c |  2 +-
 xen/arch/arm/vgic.c| 13 ++---
 xen/include/asm-arm/gic.h  |  1 -
 xen/include/asm-arm/vgic.h |  1 +
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index ef041354ea..59dd255c2c 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -404,13 +404,6 @@ static inline void gic_add_to_lr_pending(struct vcpu *v, struct pending_irq *n)
 list_add_tail(&n->lr_queue, &v->arch.vgic.lr_pending);
 }
 
-void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p)
-{
-ASSERT(spin_is_locked(&v->arch.vgic.lock));
-
-list_del_init(&p->lr_queue);
-}
-
 void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
 {
 struct pending_irq *n = irq_to_pending(v, virtual_irq);
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index d8fa44258d..5b77594723 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -449,7 +449,7 @@ static void update_lpi_vgic_status(struct vcpu *v, struct pending_irq *p)
 gic_raise_guest_irq(v, p->irq, p->lpi_priority);
 }
 else
-gic_remove_from_lr_pending(v, p);
+vgic_remove_from_lr_pending(v, p);
 }
 
 static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index cd50b90d67..2cdaca7480 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -345,7 +345,7 @@ void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 spin_lock_irqsave(&v_target->arch.vgic.lock, flags);
 p = irq_to_pending(v_target, irq);
 clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
-gic_remove_from_lr_pending(v_target, p);
+vgic_remove_from_lr_pending(v_target, p);
 desc = p->desc;
 spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
 
@@ -505,18 +505,25 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 list_for_each_entry_safe ( p, t, &v->arch.vgic.inflight_irqs, inflight )
 list_del_init(&p->inflight);
 list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
-gic_remove_from_lr_pending(v, p);
+vgic_remove_from_lr_pending(v, p);
 v->arch.lr_mask = 0;
 spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }
 
+void vgic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p)
+{
+ASSERT(spin_is_locked(&v->arch.vgic.lock));
+
+list_del_init(&p->lr_queue);
+}
+
 void vgic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p)
 {
 ASSERT(spin_is_locked(&v->arch.vgic.lock));
 
 clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
 list_del_init(&p->inflight);
-gic_remove_from_lr_pending(v, p);
+vgic_remove_from_lr_pending(v, p);
 }
 
 void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 030c1d86a7..4b2a60ee64 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -242,7 +242,6 @@ extern void init_maintenance_interrupt(void);
 extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
 unsigned int priority);
 extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq);
-extern void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p);
 
 /* Accept an interrupt from the GIC and dispatch its handler */
 extern void gic_interrupt(struct cpu_user_regs *regs, int is_fiq);
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 8d0ff65708..0d3810e6af 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -205,6 +205,7 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 void vgic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
+void vgic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
 extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
-- 
2.14.1




[Xen-devel] [PATCH 11/12] ARM: VGIC: factor out vgic_get_hw_irq_desc()

2017-10-19 Thread Andre Przywara
At the moment we happily access the VGIC internal struct pending_irq
(which describes a virtual IRQ) in irq.c.
Factor out the actually needed functionality to learn the associated
hardware IRQ and move that into gic-vgic.c to improve abstraction.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic-vgic.c| 15 +++
 xen/arch/arm/irq.c |  7 ++-
 xen/include/asm-arm/vgic.h |  2 ++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index bf9455a34e..7765d83432 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -385,6 +385,21 @@ void gic_inject(struct vcpu *v)
 gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
 }
 
+struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu *v,
+  unsigned int virq)
+{
+struct pending_irq *p;
+
+if ( !v )
+v = d->vcpu[0];
+
+p = irq_to_pending(v, virq);
+if ( !p )
+return NULL;
+
+return p->desc;
+}
+
 int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
 struct irq_desc *desc)
 {
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 7f133de549..62103a20e3 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -534,19 +534,16 @@ int release_guest_irq(struct domain *d, unsigned int virq)
 struct irq_desc *desc;
 struct irq_guest *info;
 unsigned long flags;
-struct pending_irq *p;
 int ret;
 
 /* Only SPIs are supported */
 if ( virq < NR_LOCAL_IRQS || virq >= vgic_num_irqs(d) )
 return -EINVAL;
 
-p = spi_to_pending(d, virq);
-if ( !p->desc )
+desc = vgic_get_hw_irq_desc(d, NULL, virq);
+if ( !desc )
 return -EINVAL;
 
-desc = p->desc;
-
 spin_lock_irqsave(&desc->lock, flags);
 
 ret = -EINVAL;
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index cf02dc6394..947950875b 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -220,6 +220,8 @@ int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);
 
 bool vgic_evtchn_irq_pending(struct vcpu *v);
+struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu *v,
+  unsigned int virq);
 int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
 struct irq_desc *desc);
 
-- 
2.14.1




[Xen-devel] [PATCH 06/12] ARM: VGIC: streamline gic_restore_pending_irqs()

2017-10-19 Thread Andre Przywara
In gic_restore_pending_irqs() we push our pending virtual IRQs into the
list registers. This function is called once from a GIC context and once
from a VGIC context. Refactor the calls so that we have only one callsite
from the VGIC context. This will help to separate the two worlds later.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/domain.c |  1 +
 xen/arch/arm/gic.c| 11 +--
 xen/arch/arm/traps.c  |  2 +-
 xen/include/asm-arm/gic.h |  2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index a74ff1c07c..73f4d4b2b2 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -185,6 +185,7 @@ static void ctxt_switch_to(struct vcpu *n)
 
 /* VGIC */
 gic_restore_state(n);
+gic_inject(n);
 
 /* VFP */
 vfp_restore_state(n);
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 59dd255c2c..58d69955fb 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -36,8 +36,6 @@
 #include 
 #include 
 
-static void gic_restore_pending_irqs(struct vcpu *v);
-
 static DEFINE_PER_CPU(uint64_t, lr_mask);
 
 #define lr_all_full() (this_cpu(lr_mask) == ((1 << gic_hw_ops->info->nr_lrs) - 1))
@@ -91,8 +89,6 @@ void gic_restore_state(struct vcpu *v)
 gic_hw_ops->restore_state(v);
 
 isb();
-
-gic_restore_pending_irqs(v);
 }
 
 /* desc->irq needs to be disabled before calling this function */
@@ -697,11 +693,14 @@ out:
 return rc;
 }
 
-void gic_inject(void)
+void gic_inject(struct vcpu *v)
 {
 ASSERT(!local_irq_is_enabled());
 
-gic_restore_pending_irqs(current);
+gic_restore_pending_irqs(v);
+
+if ( v != current )
+return;
 
 if ( !list_empty(>arch.vgic.lr_pending) && lr_all_full() )
 gic_hw_ops->update_hcr_status(GICH_HCR_UIE, true);
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index ff3d6ff2aa..7fd676ed9d 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2298,7 +2298,7 @@ void leave_hypervisor_tail(void)
 {
 local_irq_disable();
 if (!softirq_pending(smp_processor_id())) {
-gic_inject();
+gic_inject(current);
 
 /*
  * If the SErrors handle option is "DIVERSE", we have to prevent
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 4b2a60ee64..fe14094c0f 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -235,7 +235,7 @@ extern int gic_route_irq_to_guest(struct domain *, unsigned int virq,
 int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
   struct irq_desc *desc);
 
-extern void gic_inject(void);
+extern void gic_inject(struct vcpu *v);
 extern int gic_events_need_delivery(void);
 
 extern void init_maintenance_interrupt(void);
-- 
2.14.1




[Xen-devel] [PATCH 12/12] ARM: VGIC: rework gicv[23]_update_lr to not use pending_irq

2017-10-19 Thread Andre Przywara
The functions to actually populate a list register were accessing
the VGIC internal pending_irq struct, although they should be abstracting
from that.
Break the needed information down to remove the reference to pending_irq
from gic-v[23].c.
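
As a rough illustration of the idea, the sketch below composes a list
register value from plain scalars instead of a struct pointer. It is not the
code from this patch: the field layout, the macro names, the helper make_lr()
and the priority scaling are all hypothetical and only loosely resemble a
GICv2-style LR. The "no hardware IRQ" marker follows the -1 convention used
in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, chosen for illustration only. */
#define LR_VIRQ_MASK    0x3ffu
#define LR_HWIRQ_SHIFT  10
#define LR_HWIRQ_MASK   0x3ffu
#define LR_PRIO_SHIFT   23
#define LR_PRIO_MASK    0x1fu
#define LR_HW           (1u << 31)
#define NO_HW_IRQ       ((unsigned int)-1)

/*
 * Build an LR value from scalars only, so the hardware-facing code needs
 * no knowledge of the virtual GIC's internal data structures.
 */
uint32_t make_lr(unsigned int virq, uint8_t priority, unsigned int hw_irq)
{
    uint32_t lr = virq & LR_VIRQ_MASK;

    /* Assumed scaling: keep the top five bits of the 8-bit priority. */
    lr |= ((uint32_t)(priority >> 3) & LR_PRIO_MASK) << LR_PRIO_SHIFT;

    /* Only mark the entry as hardware-backed if a physical IRQ was given. */
    if ( hw_irq != NO_HW_IRQ )
        lr |= LR_HW | ((hw_irq & LR_HWIRQ_MASK) << LR_HWIRQ_SHIFT);

    return lr;
}
```

The caller, which does know about the virtual IRQ state, extracts the three
values and passes them down, mirroring the direction of the change above.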

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic-v2.c | 14 +++---
 xen/arch/arm/gic-v3.c | 12 ++--
 xen/arch/arm/gic-vgic.c   |  3 ++-
 xen/include/asm-arm/gic.h |  4 ++--
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 511c8d7294..e5acff8900 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -428,8 +428,8 @@ static void gicv2_disable_interface(void)
 spin_unlock();
 }
 
-static void gicv2_update_lr(int lr, const struct pending_irq *p,
-unsigned int state)
+static void gicv2_update_lr(int lr, unsigned int virq, uint8_t priority,
+unsigned int hw_irq, unsigned int state)
 {
 uint32_t lr_reg;
 
@@ -437,12 +437,12 @@ static void gicv2_update_lr(int lr, const struct pending_irq *p,
 BUG_ON(lr < 0);
 
 lr_reg = (((state & GICH_V2_LR_STATE_MASK) << GICH_V2_LR_STATE_SHIFT)  |
-  ((GIC_PRI_TO_GUEST(p->priority) & GICH_V2_LR_PRIORITY_MASK)
- << GICH_V2_LR_PRIORITY_SHIFT) |
-  ((p->irq & GICH_V2_LR_VIRTUAL_MASK) << GICH_V2_LR_VIRTUAL_SHIFT));
+  ((GIC_PRI_TO_GUEST(priority) & GICH_V2_LR_PRIORITY_MASK)
+  << GICH_V2_LR_PRIORITY_SHIFT) |
+  ((virq & GICH_V2_LR_VIRTUAL_MASK) << GICH_V2_LR_VIRTUAL_SHIFT));
 
-if ( p->desc != NULL )
-lr_reg |= GICH_V2_LR_HW | ((p->desc->irq & GICH_V2_LR_PHYSICAL_MASK )
+if ( hw_irq != -1 )
+lr_reg |= GICH_V2_LR_HW | ((hw_irq & GICH_V2_LR_PHYSICAL_MASK )
<< GICH_V2_LR_PHYSICAL_SHIFT);
 
 writel_gich(lr_reg, GICH_LR + lr * 4);
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 74d00e0c54..3dec407a02 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -944,8 +944,8 @@ static void gicv3_disable_interface(void)
 spin_unlock();
 }
 
-static void gicv3_update_lr(int lr, const struct pending_irq *p,
-unsigned int state)
+static void gicv3_update_lr(int lr, unsigned int virq, uint8_t priority,
+unsigned int hw_irq, unsigned int state)
 {
 uint64_t val = 0;
 
@@ -961,11 +961,11 @@ static void gicv3_update_lr(int lr, const struct pending_irq *p,
 if ( current->domain->arch.vgic.version == GIC_V3 )
 val |= GICH_LR_GRP1;
 
-val |= ((uint64_t)p->priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
-val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
+val |= (uint64_t)priority << GICH_LR_PRIORITY_SHIFT;
+val |= ((uint64_t)virq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
 
-   if ( p->desc != NULL )
-   val |= GICH_LR_HW | (((uint64_t)p->desc->irq & GICH_LR_PHYSICAL_MASK)
+   if ( hw_irq != -1 )
+   val |= GICH_LR_HW | (((uint64_t)hw_irq & GICH_LR_PHYSICAL_MASK)
<< GICH_LR_PHYSICAL_SHIFT);
 
 gicv3_ich_write_lr(lr, val);
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 7765d83432..e783f3b54b 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -52,7 +52,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
 
 clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status);
 
-gic_hw_ops->update_lr(lr, p, state);
+gic_hw_ops->update_lr(lr, p->irq, p->priority,
+  p->desc ? p->desc->irq : -1, state);
 
 set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
 clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index fe14094c0f..66f0957fab 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -339,8 +339,8 @@ struct gic_hw_operations {
 /* Disable CPU physical and virtual interfaces */
 void (*disable_interface)(void);
 /* Update LR register with state and priority */
-void (*update_lr)(int lr, const struct pending_irq *pending_irq,
-  unsigned int state);
+void (*update_lr)(int lr, unsigned int virq, uint8_t priority,
+  unsigned int hw_irq, unsigned int state);
 /* Update HCR status register */
 void (*update_hcr_status)(uint32_t flag, bool set);
 /* Clear LR register */
-- 
2.14.1




[Xen-devel] [PATCH 00/12] ARM: VGIC/GIC separation cleanups

2017-10-19 Thread Andre Przywara
By the original VGIC design, Xen differentiates between the actual VGIC
emulation on one hand and the GIC hardware accesses on the other.
It seems there were some deviations from that scheme (over time?), so at
the moment we end up happily accessing VGIC specific data structures
like struct pending_irq and struct vgic_irq_rank from pure GIC files
like gic.c or even irq.c (try: git grep -l struct\ pending_irq xen/arch/arm).
But any future VGIC rework will depend on a clean separation, so this
series tries to clean this up.
It starts with some rather innocent patches, reaches its peak with the
ugly patch 5/12 and the heavy 6/12, and calms down in the rest of the
series again.
After this series there are no more references to VGIC structures from
GIC files, at least for non-ITS code. The ITS is a beast on its own
(blame the author) and will be addressed later.

This is a first shot, any ideas on improvements are welcome.

Cheers,
Andre.

Andre Przywara (12):
  ARM: remove unneeded gic.h inclusions
  ARM: vGIC: fix nr_irq definition
  ARM: VGIC: remove gic_clear_pending_irqs()
  ARM: VGIC: move gic_remove_irq_from_queues()
  ARM: VGIC: move gic_remove_from_lr_pending()
  ARM: VGIC: streamline gic_restore_pending_irqs()
  ARM: VGIC: split gic.c to observe hardware/virtual GIC separation
  ARM: VGIC: split up gic_dump_info() to cover virtual part separately
  ARM: VGIC: rework events_need_delivery()
  ARM: VGIC: factor out vgic_connect_hw_irq()
  ARM: VGIC: factor out vgic_get_hw_irq_desc()
  ARM: VGIC: rework gicv[23]_update_lr to not use pending_irq

 xen/arch/arm/Makefile|   1 +
 xen/arch/arm/domain.c|   2 +
 xen/arch/arm/domain_build.c  |   1 -
 xen/arch/arm/gic-v2.c|  14 +-
 xen/arch/arm/gic-v3.c|  12 +-
 xen/arch/arm/gic-vgic.c  | 442 +++
 xen/arch/arm/gic.c   | 430 +-
 xen/arch/arm/irq.c   |   9 +-
 xen/arch/arm/p2m.c   |   1 -
 xen/arch/arm/platforms/vexpress.c|   1 -
 xen/arch/arm/platforms/xgene-storm.c |   1 -
 xen/arch/arm/time.c  |   1 -
 xen/arch/arm/traps.c |   3 +-
 xen/arch/arm/vgic-v3-its.c   |   6 +-
 xen/arch/arm/vgic.c  |  46 +++-
 xen/arch/arm/vpsci.c |   1 -
 xen/arch/arm/vtimer.c|   1 -
 xen/include/asm-arm/event.h  |  13 +-
 xen/include/asm-arm/gic.h|   9 +-
 xen/include/asm-arm/irq.h|   2 +-
 xen/include/asm-arm/vgic.h   |  10 +
 21 files changed, 534 insertions(+), 472 deletions(-)
 create mode 100644 xen/arch/arm/gic-vgic.c

-- 
2.14.1




[Xen-devel] [PATCH 09/12] ARM: VGIC: rework events_need_delivery()

2017-10-19 Thread Andre Przywara
In event.h we very deeply dive into the VGIC to learn if an event for
a guest is pending.
Rework that function to abstract the VGIC specific part out. Also
reorder the queries there, as we only actually need to check for the
event channel if there are no other pending IRQs.
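
The reordered check can be sketched as follows. This is a simplified model
and not the Xen code: struct vcpu_state and its three flags are hypothetical
stand-ins for gic_events_need_delivery(), the evtchn_upcall_pending flag and
the "event channel IRQ is already inflight" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU state consulted by the check. */
struct vcpu_state {
    bool gic_irq_pending;      /* some other virtual IRQ awaits delivery */
    bool upcall_pending;       /* models the evtchn_upcall_pending flag */
    bool evtchn_irq_inflight;  /* event channel IRQ already on inflight list */
};

bool events_need_delivery(const struct vcpu_state *s)
{
    assert(s != NULL);

    /* Any other pending IRQ settles the question immediately. */
    if ( s->gic_irq_pending )
        return true;

    /* No upcall pending: no need to look into the VGIC at all. */
    if ( !s->upcall_pending )
        return false;

    /* Only now ask the VGIC-specific question. */
    return !s->evtchn_irq_inflight;
}
```

The VGIC lookup is the last step, so it is skipped whenever one of the
cheaper checks already decides the result.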

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic.c | 11 +++
 xen/include/asm-arm/event.h | 13 +++--
 xen/include/asm-arm/vgic.h  |  2 ++
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 37a083e804..f8d0f46e71 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -602,6 +602,17 @@ void arch_evtchn_inject(struct vcpu *v)
 vgic_vcpu_inject_irq(v, v->domain->arch.evtchn_irq);
 }
 
+bool vgic_evtchn_irq_pending(struct vcpu *v)
+{
+struct pending_irq *p;
+
+p = irq_to_pending(v, v->domain->arch.evtchn_irq);
+/* Does not work for LPIs. */
+ASSERT(!is_lpi(v->domain->arch.evtchn_irq));
+
+return list_empty(&p->inflight);
+}
+
 bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr)
 {
 struct vcpu *v = current;
diff --git a/xen/include/asm-arm/event.h b/xen/include/asm-arm/event.h
index caefa506a9..67684e9763 100644
--- a/xen/include/asm-arm/event.h
+++ b/xen/include/asm-arm/event.h
@@ -16,12 +16,6 @@ static inline int vcpu_event_delivery_is_enabled(struct vcpu *v)
 
 static inline int local_events_need_delivery_nomask(void)
 {
-struct pending_irq *p = irq_to_pending(current,
-   current->domain->arch.evtchn_irq);
-
-/* Does not work for LPIs. */
-ASSERT(!is_lpi(current->domain->arch.evtchn_irq));
-
 /* XXX: if the first interrupt has already been delivered, we should
  * check whether any other interrupts with priority higher than the
  * one in GICV_IAR are in the lr_pending queue or in the LR
@@ -33,11 +27,10 @@ static inline int local_events_need_delivery_nomask(void)
 if ( gic_events_need_delivery() )
 return 1;
 
-if ( vcpu_info(current, evtchn_upcall_pending) &&
-list_empty(&p->inflight) )
-return 1;
+if ( !vcpu_info(current, evtchn_upcall_pending) )
+return 0;
 
-return 0;
+return vgic_evtchn_irq_pending(current);
 }
 
 static inline int local_events_need_delivery(void)
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 49b8a4bec0..dcdb1acaf3 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -219,6 +219,8 @@ extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
 int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);
 
+bool vgic_evtchn_irq_pending(struct vcpu *v);
+
 extern int domain_vgic_register(struct domain *d, int *mmio_count);
 extern int vcpu_vgic_free(struct vcpu *v);
 extern bool vgic_to_sgi(struct vcpu *v, register_t sgir,
-- 
2.14.1




[Xen-devel] [PATCH 03/12] ARM: VGIC: remove gic_clear_pending_irqs()

2017-10-19 Thread Andre Przywara
gic_clear_pending_irqs() was not only misnamed, but also misplaced, as
a function solely dealing with the GIC emulation should not live in gic.c.
Move the functionality of this function into its only caller in vgic.c.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c| 11 ---
 xen/arch/arm/vgic.c   |  4 +++-
 xen/include/asm-arm/gic.h |  1 -
 3 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index ed363f6c37..75b2e0e0ca 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -675,17 +675,6 @@ out:
 spin_unlock_irqrestore(>arch.vgic.lock, flags);
 }
 
-void gic_clear_pending_irqs(struct vcpu *v)
-{
-struct pending_irq *p, *t;
-
-ASSERT(spin_is_locked(&v->arch.vgic.lock));
-
-v->arch.lr_mask = 0;
-list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
-gic_remove_from_lr_pending(v, p);
-}
-
 int gic_events_need_delivery(void)
 {
 struct vcpu *v = current;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index d8acbbeaaa..451a306a98 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -504,7 +504,9 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 spin_lock_irqsave(&v->arch.vgic.lock, flags);
 list_for_each_entry_safe ( p, t, &v->arch.vgic.inflight_irqs, inflight )
 list_del_init(&p->inflight);
-gic_clear_pending_irqs(v);
+list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
+gic_remove_from_lr_pending(v, p);
+v->arch.lr_mask = 0;
 spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }
 
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index d3d7bda50d..2f248301ce 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -236,7 +236,6 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
   struct irq_desc *desc);
 
 extern void gic_inject(void);
-extern void gic_clear_pending_irqs(struct vcpu *v);
 extern int gic_events_need_delivery(void);
 
 extern void init_maintenance_interrupt(void);
-- 
2.14.1




[Xen-devel] [PATCH RFC] ARM: vPL011: use receive timeout interrupt

2017-10-18 Thread Andre Przywara
Instead of asserting the receive interrupt (RXI) on the first character
in the FIFO, let's (ab)use the receive timeout interrupt (RTI) for that
purpose. That seems to be closer to the spec and what hardware does.
Improve the readability of vpl011_data_avail() on the way.
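
The resulting RX interrupt rule can be modelled with the small sketch below.
It is a simplification, not the patch itself: it recomputes both bits from
the fill level in one place, whereas the patch sets them when data arrives
and clears them on reads. FIFO_SIZE, RING_SIZE and rx_raw_status() are
hypothetical; the RXI and RTI bit positions are assumed to follow the PL011
raw interrupt status register layout.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_SIZE  32u        /* hypothetical emulated FIFO depth */
#define RING_SIZE  1024u      /* hypothetical ring buffer size */
#define RXI        (1u << 4)  /* receive interrupt (assumed bit position) */
#define RTI        (1u << 6)  /* receive timeout interrupt (assumed) */

/* Derive the raw RX interrupt bits from the current fill level. */
uint32_t rx_raw_status(uint32_t ris, unsigned int fifo_level)
{
    assert(fifo_level <= RING_SIZE);

    ris &= ~(RXI | RTI);

    /* RXI: only once the fill level reaches the trigger threshold. */
    if ( fifo_level >= RING_SIZE - FIFO_SIZE / 2 )
        ris |= RXI;

    /* RTI: any character waiting counts as an expired timeout. */
    if ( fifo_level > 0 )
        ris |= RTI;

    return ris;
}
```

With a single character queued only RTI is raised, so a guest still gets an
interrupt without RXI firing on every byte.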

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
Hi,

this is the approach I mentioned in my email earlier today.
It goes on top of Bhupinder's v12 27/27, but should eventually be merged
into that patch once we have agreed on the approach. I carved it out here
to make it clear what has changed.
It would be good if someone could test it.

Cheers,
Andre.
 xen/arch/arm/vpl011.c | 61 ---
 1 file changed, 29 insertions(+), 32 deletions(-)

diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
index adf1711571..ae18bddd81 100644
--- a/xen/arch/arm/vpl011.c
+++ b/xen/arch/arm/vpl011.c
@@ -105,9 +105,13 @@ static uint8_t vpl011_read_data(struct domain *d)
 if ( fifo_level == 0 )
 {
 vpl011->uartfr |= RXFE;
-vpl011->uartris &= ~RXI;
-vpl011_update_interrupt_status(d);
+vpl011->uartris &= ~RTI;
 }
+
+if ( fifo_level < sizeof(intf->in) - SBSA_UART_FIFO_SIZE / 2 )
+vpl011->uartris &= ~RXI;
+
+vpl011_update_interrupt_status(d);
 }
 else
 gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n");
@@ -129,7 +133,7 @@ static void vpl011_update_tx_fifo_status(struct vpl011 *vpl011,
  unsigned int fifo_level)
 {
 struct xencons_interface *intf = vpl011->ring_buf;
-unsigned int fifo_threshold;
+unsigned int fifo_threshold = sizeof(intf->out) - SBSA_UART_FIFO_SIZE/2;
 
 BUILD_BUG_ON(sizeof (intf->out) < SBSA_UART_FIFO_SIZE);
 
@@ -137,8 +141,6 @@ static void vpl011_update_tx_fifo_status(struct vpl011 *vpl011,
  * Set the TXI bit only when there is space for fifo_size/2 bytes which
  * is the trigger level for asserting/de-assterting the TX interrupt.
  */
-fifo_threshold = sizeof(intf->out) - SBSA_UART_FIFO_SIZE/2;
-
 if ( fifo_level <= fifo_threshold )
 vpl011->uartris |= TXI;
 else
@@ -390,35 +392,30 @@ static void vpl011_data_avail(struct domain *d)
 out_cons,
 sizeof(intf->out));
 
-/* Update the uart rx state if the buffer is not empty. */
-if ( in_fifo_level != 0 )
-{
+/* Update the UART RX state */
+
+/* Clear the FIFO_EMPTY bit if the FIFO holds at least one character. */
+if ( in_fifo_level > 0 )
 vpl011->uartfr &= ~RXFE;
 
-if ( in_fifo_level == sizeof(intf->in) )
-vpl011->uartfr |= RXFF;
+/* Set the FIFO_FULL bit if the ring buffer is full. */
+if ( in_fifo_level == sizeof(intf->in) )
+vpl011->uartfr |= RXFF;
 
-/*
- * Currently, the RXI bit is getting set even if there is a single
- * byte of data in the rx fifo. Ideally, the RXI bit should be set
- * only if the rx fifo level reaches the threshold.
- *
- * However, since currently RX timeout interrupt is not
- * implemented as there is not enough clarity in the SBSA spec,
- * the guest may keep waiting for an interrupt to read more
- * data. To ensure that guest reads all the data without
- * any delay, the RXI interrupt is raised if there is RX data
- * available without checking whether fifo level has reached
- * the threshold.
- *
- * TBD: Once there is more clarity in the SBSA spec on whether RX
- * timeout interrupt needs to be implemented, the RXI interrupt
- * will be raised only when rx fifo level reaches the threshold.
- */
+/* The FIFO trigger level is fixed to half of the FIFO. */
+if ( in_fifo_level >= sizeof(intf->in) - SBSA_UART_FIFO_SIZE / 2 )
 vpl011->uartris |= RXI;
-}
 
-/* Update the uart tx state if the buffer is not full. */
+/*
+ * If the input queue is not empty, we assert the receive timeout interrupt.
+ * As we don't emulate any timing here, we ignore the actual timeout
+ * of 32 bit periods.
+ */
+if ( in_fifo_level > 0 )
+vpl011->uartris |= RTI;
+
+/* Update the UART TX state */
+
 if ( out_fifo_level != sizeof(intf->out) )
 {
 vpl011->uartfr &= ~TXFF;
@@ -431,13 +428,13 @@ static void vpl011_data_avail(struct domain *d)
 vpl011->uartfr &= ~BUSY;
 
 vpl011_update_tx_fifo_status(vpl011, out_fifo_level);
-
-if ( out_fifo_level == 0 )
-vpl011->uartfr |= TXFE;
 }
 
 vpl011_update_interrupt_status(d);
 
+if ( out_fifo_level == 0 )
+vpl011->uartfr

Re: [Xen-devel] [PATCH 26/27 v12] arm/xen: vpl011: Fix the slow early console SBSA UART output

2017-10-18 Thread Andre Przywara
Hi,

On 18/10/17 11:17, Bhupinder Thakur wrote:
> Hi Andre,
> 
> On 17 October 2017 at 15:21, Andre Przywara <andre.przyw...@arm.com> wrote:
>> Hi Bhupinder,
>>
>> first thing: As the bulk of the series has been merged now, please
>> restart your patch and version numbering, so a (potential) next post
>> should be prefixed [PATCH v3 1/2]. And please have a cover letter giving
>> a brief overview what this series fixes.
>>
> Should I resend the patch series with a cover letter? I will also add
> a reported-by tag.

Please wait until we have settled upon a solution, especially for that
other patch. We can talk about this in our meeting later today.

Cheers,
Andre.

>> On 13/10/17 11:40, Bhupinder Thakur wrote:
>>> The early console output uses pl011_early_write() to write data. This
>>> function waits for BUSY bit to get cleared before writing the next byte.
>>
>> ... which is questionable given the actual definition of the BUSY bit in
>> the PL011 TRM:
>>
>> 
>>  The BUSY signal goes HIGH as soon as data is written to the
>> transmit FIFO (that is, the FIFO is non-empty) and remains asserted
>> HIGH while data is being transmitted. BUSY is negated only when the
>> transmit FIFO is empty, and the last character has been transmitted from
>> the shift register, 
>> 
>>
>> (I take it you are talking about the Linux driver in a guest here).
>> I think the early_write routine tries to (deliberately?) ignore the
>> FIFO, possibly to make sure characters really get pushed out before a
>> system crashes, maybe.
>>
>>>
>>> In the SBSA UART emulation logic, the BUSY bit was set as soon as one
>>> byte was written in the FIFO and it remained set until the FIFO was
>>> emptied.
>>
>> Which is correct behaviour, as this matches the PL011 TRM as quoted above.
>>
>>> This meant that the output was delayed as each character needed
>>> the BUSY to get cleared.
>>
>> But this is true as well!
>>
>>> Since the SBSA UART is getting emulated in Xen using ring buffers, it
>>> ensures that once the data is enqueued in the FIFO, it will be received
>>> by xenconsole so it is safe to set the BUSY bit only when FIFO becomes
>>> full. This will ensure that pl011_early_write() is not delayed unduly
>>> to write the data.
>>
>> So I can confirm that this patch fixes the very slow earlycon output
>> observed with the current staging HEAD.
>>
>> So while this is somewhat deviating from the spec, I can see the benefit
>> for an emulation scenario. I believe that emulations in general might
>> choose to implement things a bit differently, to cope with the
>> fundamental differences in their environment, like the virtually endless
>> "FIFO" and the lack of any timing restrictions on the emulated "wire".
>>
>> So unless someone comes up with a better solution, I would support
>> taking this patch, as this fixes a real problem.
>>
>> Cheers,
>> Andre
>>
>>> Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
>>> ---
>>> CC: Julien Grall <julien.gr...@arm.com>
>>> CC: Andre Przywara <andre.przyw...@arm.com>
>>> CC: Stefano Stabellini <sstabell...@kernel.org>
>>>
>>>  xen/arch/arm/vpl011.c | 21 -
>>>  1 file changed, 16 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
>>> index f7ddccb..0b07436 100644
>>> --- a/xen/arch/arm/vpl011.c
>>> +++ b/xen/arch/arm/vpl011.c
>>> @@ -159,9 +159,15 @@ static void vpl011_write_data(struct domain *d, 
>>> uint8_t data)
>>>  {
>>>  vpl011->uartfr |= TXFF;
>>>  vpl011->uartris &= ~TXI;
>>> -}
>>>
>>> -vpl011->uartfr |= BUSY;
>>> +/*
>>> + * This bit is set only when FIFO becomes full. This ensures that
>>> + * the SBSA UART driver can write the early console data as fast as
>>> + * possible, without waiting for the BUSY bit to get cleared before
>>> + * writing each byte.
>>> + */
>>> +vpl011->uartfr |= BUSY;
>>> +}
>>>
>>>  vpl011->uartfr &= ~TXFE;
>>>
>>> @@ -371,11 +377,16 @@ static void vpl011_data_avail(struct domain *d)
>>>  {
>>>  vpl011->uartfr &= ~TXFF;
>>>  vpl011->uartris |= TXI;
>>> +
>>> +/*
>>> + * Clear the BUSY bit as soon as space becomes available
>>> + * so that the SBSA UART driver can start writing more data
>>> + * without any further delay.
>>> + */
>>> +vpl011->uartfr &= ~BUSY;
>>> +
>>>  if ( out_ring_qsize == 0 )
>>> -{
>>> -vpl011->uartfr &= ~BUSY;
>>>  vpl011->uartfr |= TXFE;
>>> -}
>>>  }
>>>
>>>  vpl011_update_interrupt_status(d);
>>>
> 
> Regards,
> Bhupinder
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 27/27 v12] arm/xen: vpl011: Correct the logic for asserting/de-asserting SBSA UART TX interrupt

2017-10-18 Thread Andre Przywara
Hi,

On 13/10/17 11:40, Bhupinder Thakur wrote:
> This patch fixes the issue observed when pl011 patches were tested on
> the Juno hardware by Andre/Julien. It was observed that when large
> output is generated such as on running 'find /', output was getting
> truncated intermittently due to OUT ring buffer getting full.
> 
> This issue was due to the fact that the SBSA UART driver expects that
> when a TX interrupt is asserted then the FIFO queue should be at least
> half empty and that it can write N bytes in the FIFO, where N is half
> the FIFO queue size, without the bytes getting dropped due to FIFO
> getting full.
> 
> The SBSA UART emulation logic was asserting the TX interrupt as soon
> as any space became available in the FIFO and the SBSA UART driver
> tried to write more data (up to 16 bytes) in the FIFO expecting that
> there is enough space available leading to dropped bytes.
> 
> The SBSA spec [1] does not specify when the TX interrupt should be
> asserted or de-asserted. Due to lack of clarity on the expected
> behavior, it is assumed for now that TX interrupt should be asserted
> only when the FIFO goes half empty.
> 
> TBD: Once the SBSA spec is updated with the expected behavior, the
> implementation will be modified to align with the spec requirement.

So similar to the other patch:

- I can confirm that this patch fixes the dropped characters issue we
see with current staging HEAD. And, unlike the first patch, this one
fixes a correctness issue (we are losing characters at the moment)
rather than just a performance problem. So I think we definitely
need something along those lines.
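
To make the TX side of the problem concrete, here is a minimal sketch of
the threshold rule the commit message describes: only signal TXI when a
burst of half the FIFO size is guaranteed to fit. The helper name and
its parameters are made up for illustration and are not taken from the
patch; only SBSA_UART_FIFO_SIZE is a name the changelog mentions, and
the value 32 is assumed from the changelog's "rx fifo size of 32".

```c
#include <stdbool.h>

/* Assumed FIFO depth; the driver then writes bursts of up to 16 bytes. */
#define SBSA_UART_FIFO_SIZE 32u

/*
 * Hypothetical helper: decide whether TXI may be asserted.
 * ring_size is the capacity of the OUT ring, fifo_level the number of
 * bytes currently queued in it (caller guarantees fifo_level <= ring_size).
 */
static bool txi_should_assert(unsigned int ring_size, unsigned int fifo_level)
{
    unsigned int free_space = ring_size - fifo_level;

    /* Only promise space if a full half-FIFO burst cannot overflow. */
    return free_space >= SBSA_UART_FIFO_SIZE / 2;
}
```

With a 2048 byte ring this keeps TXI asserted up to a level of 2032 and
drops it from 2033 onwards, so a 16 byte burst never hits a full ring.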

However ... ;-)
Asserting the receive interrupt at the first character, while it is
architected to be only triggered at half the FIFO level, is not right.
What we probably want is to use the timeout interrupt instead. I
quickly hacked something up like that:
- In vpl011_data_avail() we assert the timeout interrupt (RTI) if the
in-FIFO is not empty. This is following the idea that when this function
is called, Xen says: this is all the data I have at the moment. The
guest should be able to see the data, because Xen has no idea when and
if more data will come in.
- If we drain the in-FIFO in vpl011_mmio_read() (fifo_level becomes 0),
we clear RTI.
- We handle RXI as described in the spec: assert it in data_avail() if
the FIFO has space for 16 or fewer characters left, clear it in
mmio_read() if the FIFO has space for more than 16 characters.

This basically moves the trick of asserting RXI to asserting RTI
instead, which sounds architecturally cleaner.
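
To illustrate the three rules above, a rough standalone model (not the
actual Xen code): the struct and function names are invented for this
sketch, the 32 entry FIFO depth is an assumption, and the bit positions
follow my reading of the PL011 UARTRIS layout. RXI is taken to mean
"FIFO at least half full", which is how I read the threshold.

```c
#include <stdint.h>

/* UARTRIS bits as I read the PL011 TRM; treat them as illustrative. */
#define RXI (1u << 4)   /* receive interrupt (threshold reached) */
#define RTI (1u << 6)   /* receive timeout interrupt */

#define FIFO_SIZE    32u              /* assumed emulated FIFO depth */
#define RX_THRESHOLD (FIFO_SIZE / 2)  /* half-full trigger level */

/* Hypothetical, stripped-down model of the emulated UART's RX side. */
struct vuart {
    uint32_t ris;           /* raw interrupt status */
    unsigned int in_level;  /* characters waiting in the in-FIFO */
};

/* The backend tells us how much data is now waiting for the guest. */
static void model_data_avail(struct vuart *u, unsigned int new_level)
{
    u->in_level = new_level;

    if (u->in_level > 0)
        u->ris |= RTI;      /* "this is all the data I have right now" */
    if (u->in_level >= RX_THRESHOLD)
        u->ris |= RXI;      /* room for 16 or fewer characters left */
}

/* The guest reads one character from the data register. */
static void model_read_byte(struct vuart *u)
{
    if (u->in_level == 0)
        return;

    u->in_level--;
    if (u->in_level == 0)
        u->ris &= ~RTI;     /* FIFO drained, nothing left to time out on */
    if (u->in_level < RX_THRESHOLD)
        u->ris &= ~RXI;     /* room for more than 16 characters again */
}
```

So a single incoming character raises only RTI, and RXI keeps its
architected half-full meaning.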

Let me try to clean up my approach and post it.

Cheers,
Andre.



> 
> [1] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf
> 
> Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
> ---
> CC: Julien Grall <julien.gr...@arm.com>
> CC: Andre Przywara <andre.przyw...@arm.com>
> CC: Stefano Stabellini <sstabell...@kernel.org>
> CC: Dave Martin <dave.mar...@arm.com>
> 
> Changes since v11:
> - Add a build assert to check that ring buffer size is more than minimum
>   rx fifo size of 32
> - Added a comment to explain why threshold based logic is not implemented
>   for rx fifo
> - Moved calls to vpl011_update_interrupt_status() near where TXI/RXI
>   status bit is set
>  
> Changes since v8:
> - Used variables fifo_level/fifo_threshold for more clarity
> - Added a new macro SBSA_UART_FIFO_SIZE instead of using a magic number
> - Renamed ring_qsize variables to fifo_level for consistency 
> 
>  xen/arch/arm/vpl011.c| 113 
> ++-
>  xen/include/asm-arm/vpl011.h |   2 +
>  2 files changed, 82 insertions(+), 33 deletions(-)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 0b07436..adf1711 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -93,24 +93,27 @@ static uint8_t vpl011_read_data(struct domain *d)
>   */
>  if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) > 0 )
>  {
> +unsigned int fifo_level;
> +
>  data = intf->in[xencons_mask(in_cons, sizeof(intf->in))];
>  in_cons += 1;
>  smp_mb();
>  intf->in_cons = in_cons;
> +
> +fifo_level = xencons_queued(in_prod, in_cons, sizeof(intf->in));
> +
> +if ( fifo_level == 0 )
> +{
> +vpl011->uartfr |= RXFE;
> +vpl011->uartris &= ~RXI;
> +vpl011_update_interrupt_status(d);
> +}
>  }
>  else
>  gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n");
>  
> -if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) == 0 )
> -{
> -  

Re: [Xen-devel] [PATCH 26/27 v12] arm/xen: vpl011: Fix the slow early console SBSA UART output

2017-10-17 Thread Andre Przywara
Hi Bhupinder,

first thing: As the bulk of the series has been merged now, please
restart your patch and version numbering, so a (potential) next post
should be prefixed [PATCH v3 1/2]. And please have a cover letter giving
a brief overview of what this series fixes.

On 13/10/17 11:40, Bhupinder Thakur wrote:
> The early console output uses pl011_early_write() to write data. This
> function waits for BUSY bit to get cleared before writing the next byte.

... which is questionable given the actual definition of the BUSY bit in
the PL011 TRM:


 The BUSY signal goes HIGH as soon as data is written to the
transmit FIFO (that is, the FIFO is non-empty) and remains asserted
HIGH while data is being transmitted. BUSY is negated only when the
transmit FIFO is empty, and the last character has been transmitted from
the shift register, 


(I take it you are talking about the Linux driver in a guest here).
I think the early_write routine tries to (deliberately?) ignore the
FIFO, possibly to make sure characters really get pushed out before a
system crashes.

> 
> In the SBSA UART emulation logic, the BUSY bit was set as soon as one
> byte was written in the FIFO and it remained set until the FIFO was
> emptied.

Which is correct behaviour, as this matches the PL011 TRM as quoted above.

> This meant that the output was delayed as each character needed
> the BUSY to get cleared.

But this is true as well!

> Since the SBSA UART is getting emulated in Xen using ring buffers, it
> ensures that once the data is enqueued in the FIFO, it will be received
> by xenconsole so it is safe to set the BUSY bit only when FIFO becomes
> full. This will ensure that pl011_early_write() is not delayed unduly
> to write the data.

So I can confirm that this patch fixes the very slow earlycon output
observed with the current staging HEAD.

So while this is somewhat deviating from the spec, I can see the benefit
for an emulation scenario. I believe that emulations in general might
choose to implement things a bit differently, to cope with the
fundamental differences in their environment, like the virtually endless
"FIFO" and the lack of any timing restrictions on the emulated "wire".

So unless someone comes up with a better solution, I would support
taking this patch, as this fixes a real problem.
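
To put a rough number on the difference, here is a toy model (not Xen
code) of an early-console style writer that polls BUSY before every
byte. Everything in it is an assumption made for illustration: the 32
entry FIFO, the backend draining the whole FIFO every drain_period
polls, and the shift register being ignored.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIFO_SIZE 32u   /* assumed FIFO depth, for illustration only */

/* BUSY per the TRM wording: set as long as the TX FIFO is non-empty. */
static bool busy_per_trm(size_t tx_level)
{
    return tx_level > 0;
}

/* BUSY as the patch models it: set only once the FIFO is full. */
static bool busy_per_patch(size_t tx_level)
{
    return tx_level == FIFO_SIZE;
}

typedef bool (*busy_fn)(size_t tx_level);

/*
 * Count how often a writer has to poll BUSY while pushing nbytes bytes.
 * The backend is modelled as emptying the FIFO every drain_period polls;
 * drain_period must be non-zero.
 */
static unsigned long count_polls(size_t nbytes, busy_fn busy,
                                 unsigned int drain_period)
{
    size_t level = 0;
    unsigned long polls = 0;

    for (size_t i = 0; i < nbytes; i++) {
        while (busy(level)) {
            polls++;
            if (polls % drain_period == 0)
                level = 0;      /* backend consumed everything */
        }
        level++;                /* write one byte into the FIFO */
    }

    return polls;
}
```

For 64 bytes and a drain period of 10 polls, the TRM behaviour costs 630
polls in this model, while the patched behaviour costs 10.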

Cheers,
Andre

> Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
> ---
> CC: Julien Grall <julien.gr...@arm.com>
> CC: Andre Przywara <andre.przyw...@arm.com>
> CC: Stefano Stabellini <sstabell...@kernel.org>
> 
>  xen/arch/arm/vpl011.c | 21 -
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index f7ddccb..0b07436 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -159,9 +159,15 @@ static void vpl011_write_data(struct domain *d, uint8_t 
> data)
>  {
>  vpl011->uartfr |= TXFF;
>  vpl011->uartris &= ~TXI;
> -}
>  
> -vpl011->uartfr |= BUSY;
> +/*
> + * This bit is set only when FIFO becomes full. This ensures that
> + * the SBSA UART driver can write the early console data as fast as
> + * possible, without waiting for the BUSY bit to get cleared before
> + * writing each byte.
> + */
> +vpl011->uartfr |= BUSY;
> +}
>  
>  vpl011->uartfr &= ~TXFE;
>  
> @@ -371,11 +377,16 @@ static void vpl011_data_avail(struct domain *d)
>  {
>  vpl011->uartfr &= ~TXFF;
>  vpl011->uartris |= TXI;
> +
> +/*
> + * Clear the BUSY bit as soon as space becomes available
> + * so that the SBSA UART driver can start writing more data
> + * without any further delay.
> + */
> +vpl011->uartfr &= ~BUSY;
> +
>  if ( out_ring_qsize == 0 )
> -{
> -vpl011->uartfr &= ~BUSY;
>  vpl011->uartfr |= TXFE;
> -}
>  }
>  
>  vpl011_update_interrupt_status(d);
> 



Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-11 Thread Andre Przywara
Hi,

On 11/10/17 15:33, Andre Przywara wrote:
> Hi,
> 
> (CC:ing some KVM/ARM folks involved in the VGIC)
> 
> starting with the addition of the ITS support we were seeing more and
> more issues with the current implementation of our ARM Generic Interrupt
> Controller (GIC) emulation, the VGIC.
> Among other approaches to fix those issues it was proposed to copy the
> VGIC emulation used in KVM. This one was suffering from very similar
> issues, and a clean design from scratch led to a very robust and
> capable re-implementation. Interestingly this implementation is fairly
> self-contained, so it seems feasible to copy it. Hopefully we only need
> minor adjustments, possibly we can even copy it verbatim with some
> additional glue layer code.
> Stefano asked for getting a design overview, to assess the feasibility
> of copying the KVM code without reviewing tons of code in the first
> place.
> So to follow Xen rules for new features, this design document below is
> an attempt to describe the current KVM VGIC design - in a hypervisor
> agnostic fashion. It is a bit of a retro-fit design description, as it
> is not strictly forward-looking only, but actually describing the
> existing implementation [1].

and that link should point to:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/virt/kvm/arm/vgic

Cheers,
Andre.



Re: [Xen-devel] [PATCH for-4.10] xen/arm: mm: Rework MAIR* definitions to handle 32-bit compilation environment

2017-10-11 Thread Andre Przywara
Hi,

On 11/10/17 15:15, Julien Grall wrote:
> Commit a0543df403 "xen/arm: page: Clean-up the definition of MAIRVAL"
> combined the definition of MAIR0VAL and MAIR1VAL in MAIRVAL. Sadly, when
> building in a 32-bit environment, the assembler is unable to compute a
> 64-bit constant and will ignore the 32 most-significant bits. This
> results in MAIR1 being set to 0.
> 
> Rather than fully reverting the offending commit, the code is reworked
> to still avoid hardcoded values but split the definition in 2.

Nasty issue, but given the circumstances the workaround seems not too
bad to me.
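
For anyone following along, a small C sketch of the idea: build each
32-bit half from 32-bit arithmetic only and combine them afterwards,
where 64-bit math is known to work. The macro and function names are
made up for this example, and the attribute index values (0-4 and 7)
are what I assume the MT_* constants to be; the attribute encodings are
the ones from the patch.

```c
#include <stdint.h>

/* Attribute indices 0..3 live in MAIR0, indices 4..7 in MAIR1. */
#define MAIR0_FIELD(attr, idx) ((uint32_t)(attr) << ((idx) * 8))
#define MAIR1_FIELD(attr, idx) ((uint32_t)(attr) << (((idx) * 8) - 32))

/* Lower half: only needs 32-bit constants. */
static uint32_t mair0(void)
{
    return MAIR0_FIELD(0x00, 0) |   /* MT_DEVICE_nGnRnE */
           MAIR0_FIELD(0x44, 1) |   /* MT_NORMAL_NC */
           MAIR0_FIELD(0xaa, 2) |   /* MT_NORMAL_WT */
           MAIR0_FIELD(0xee, 3);    /* MT_NORMAL_WB */
}

/* Upper half: shifted down by 32 so it also fits into 32 bits. */
static uint32_t mair1(void)
{
    return MAIR1_FIELD(0x04, 4) |   /* MT_DEVICE_nGnRE */
           MAIR1_FIELD(0xff, 7);    /* MT_NORMAL */
}

/* The combined 64-bit value, only built where 64-bit math is safe. */
static uint64_t mair(void)
{
    return ((uint64_t)mair1() << 32) | mair0();
}
```

The point is that neither half ever needs a constant wider than 32 bits,
which is exactly what a 32-bit assembler can cope with.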

> Lastly, a comment is added to avoid trying to blindly combine both
> definitions again in the future.
> 
> Signed-off-by: Julien Grall <julien.gr...@linaro.org>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/page.h | 23 ++-
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index f558184e10..d948250a4a 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -52,18 +52,23 @@
>   *   ??   101
>   *   reserved 110
>   *   MT_NORMAL111      -- Write-back write-allocate
> + *
> + * /!\ It is not possible to combine the definition in MAIRVAL and then
> + * split because it would result to a 64-bit value that some assembler
> + * doesn't understand.
>   */
> -#define MAIR(attr, mt) (_AC(attr, ULL) << ((mt) * 8))
> +#define _MAIR0(attr, mt) (_AC(attr, ULL) << ((mt) * 8))
> +#define _MAIR1(attr, mt) (_AC(attr, ULL) << (((mt) * 8) - 32))
> +
> +#define MAIR0VAL (_MAIR0(0x00, MT_DEVICE_nGnRnE)| \
> +  _MAIR0(0x44, MT_NORMAL_NC)| \
> +  _MAIR0(0xaa, MT_NORMAL_WT)| \
> +  _MAIR0(0xee, MT_NORMAL_WB))
>  
> -#define MAIRVAL (MAIR(0x00, MT_DEVICE_nGnRnE)| \
> - MAIR(0x44, MT_NORMAL_NC)| \
> - MAIR(0xaa, MT_NORMAL_WT)| \
> - MAIR(0xee, MT_NORMAL_WB)| \
> - MAIR(0x04, MT_DEVICE_nGnRE) | \
> - MAIR(0xff, MT_NORMAL))
> +#define MAIR1VAL (_MAIR1(0x04, MT_DEVICE_nGnRE) | \
> +  _MAIR1(0xff, MT_NORMAL))
>  
> -#define MAIR0VAL (MAIRVAL & 0xffffffff)
> -#define MAIR1VAL (MAIRVAL >> 32)
> +#define MAIRVAL (MAIR1VAL << 32 | MAIR0VAL)
>  
>  /*
>   * Layout of the flags used for updating the hypervisor page tables
> 



[Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-11 Thread Andre Przywara
Hi,

(CC:ing some KVM/ARM folks involved in the VGIC)

starting with the addition of the ITS support we were seeing more and
more issues with the current implementation of our ARM Generic Interrupt
Controller (GIC) emulation, the VGIC.
Among other approaches to fix those issues it was proposed to copy the
VGIC emulation used in KVM. This one was suffering from very similar
issues, and a clean design from scratch led to a very robust and
capable re-implementation. Interestingly this implementation is fairly
self-contained, so it seems feasible to copy it. Hopefully we only need
minor adjustments, possibly we can even copy it verbatim with some
additional glue layer code.
Stefano asked for getting a design overview, to assess the feasibility
of copying the KVM code without reviewing tons of code in the first
place.
So to follow Xen rules for new features, this design document below is
an attempt to describe the current KVM VGIC design - in a hypervisor
agnostic fashion. It is a bit of a retro-fit design description, as it
is not strictly forward-looking only, but actually describing the
existing implementation [1].

Please have a look and let me know:
1) if this document has the right scope
2) if this document has the right level of detail
3) if there are points missing from the document
4) if the design in general is a fit

Appreciate any feedback!

Cheers,
Andre.

---

VGIC design
===========

This document describes the design of an ARM Generic Interrupt Controller (GIC)
emulation. It is meant to emulate a GIC for a guest in a virtual machine,
the common name for that is VGIC (from "virtual GIC").

This design was the result of a one-week-long design session with some
engineers in a room, triggered by ever-increasing difficulties in maintaining
the existing GIC emulation in the KVM hypervisor. The design eventually
materialised as an alternative VGIC implementation in the Linux kernel
(merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
was removed, so it is now the current code used by Linux.
Although being used in KVM, the actual design of this VGIC is rather hypervisor
agnostic and can be used by other hypervisors as well, in particular for Xen.

GIC hardware virtualization support
-----------------------------------

The ARM Generic Interrupt Controller (since v2) supports the virtualization
extensions, which allows some parts of the interrupt life cycle to be handled
purely inside the guest without exiting into the hypervisor.
In the GICv2 and GICv3 architecture this covers mostly the "interrupt
acknowledgement", "priority drop" and "interrupt deactivate" actions.
So a guest can handle most of the interrupt processing code without
leaving EL1 and trapping into the hypervisor. To accomplish
this, the GIC holds so called "list registers" (LRs), which shadow the
interrupt state for any virtual interrupt. Injecting an interrupt to a guest
involves setting up one LR with the interrupt number, its priority and initial
state (mostly "pending"), then entering the guest. Any EOI related action
from within the guest just acts on those LRs; the hypervisor can later update
the virtual interrupt state when the guest exits the next time (for whatever
reason).
But despite the GIC hardware helping out here, the whole interrupt
configuration management is not virtualized at all and needs to be emulated
by the hypervisor - or another related software component, for instance a
userland emulator. This so called "distributor" part of the GIC consists of
memory mapped registers, which can be trapped by the hypervisor, so any guest
access can be emulated in the usual way.
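
As a concrete illustration of what "setting up one LR" means, here is a
minimal sketch that packs and unpacks a GICv2 list register. The field
positions follow my reading of the GICH_LR layout in the GICv2
architecture specification (virtual ID in bits [9:0], the upper five
priority bits in [27:23], state in [29:28]); the function and enum names
are invented for this example and are not taken from KVM or Xen.

```c
#include <stdint.h>

/* State field encoding of a list register. */
enum lr_state {
    LR_INVALID        = 0,
    LR_PENDING        = 1,
    LR_ACTIVE         = 2,
    LR_PENDING_ACTIVE = 3,
};

/* Pack a software-generated virtual interrupt into a GICv2-style LR. */
static uint32_t gicv2_lr_encode(uint32_t virq, uint8_t priority,
                                enum lr_state state)
{
    uint32_t lr = virq & 0x3ffu;                      /* VirtualID [9:0] */

    lr |= ((uint32_t)(priority >> 3) & 0x1fu) << 23;  /* Priority [27:23] */
    lr |= ((uint32_t)state & 0x3u) << 28;             /* State [29:28] */

    return lr;
}

/* Read back the state after the guest has exited. */
static enum lr_state gicv2_lr_state(uint32_t lr)
{
    return (enum lr_state)((lr >> 28) & 0x3u);
}

/* Read back the virtual interrupt number held in the LR. */
static uint32_t gicv2_lr_virq(uint32_t lr)
{
    return lr & 0x3ffu;
}
```

The hypervisor would write such a value before entering the guest and
read the state field back on the next exit to learn what the guest did
with the interrupt.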

VGIC design motivation
----------------------

A GIC emulation thus needs to take care of those bits:

- trap GIC distributor MMIO accesses and shadow the configuration setup
  (enabled/disabled, level/edge, priority, affinity) for virtual interrupts
- handle incoming hardware and virtual interrupt requests and inject the
  associated virtual interrupt by manipulating one of the list registers
- track the state of a virtual interrupt by inspecting the LRs after the
  guest has exited, possibly adjusting the shadowed virtual interrupt state

Despite the distributor MMIO register emulation being a sizeable chunk of
the emulation, it is actually not dominant if looking at the frequency at
which it is accessed. Normally the interrupt configuration is done at boot
time or upon initialising the device (driver), but rarely during the actual
run time of a system. Injecting and EOI-ing interrupts however happens much
more often. A good emulation approach should thus focus on tracking the virtual
interrupt state efficiently, allowing quick handling of incoming and EOI-ed
interrupts.
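
The three tasks listed above boil down to keeping one small shadow
record per virtual interrupt. The following is only a sketch with
hypothetical names and a simplified state model (it is not the KVM data
structure): configuration fields are written by the trapped distributor
accesses, the state fields are refreshed from the LR after a guest exit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shadow state for one virtual interrupt. */
struct virq {
    uint32_t intid;     /* virtual interrupt number */
    uint8_t priority;   /* shadowed from distributor writes */
    bool enabled;       /* shadowed from distributor writes */
    bool level;         /* level (true) or edge (false) triggered */
    bool pending;       /* tracked state */
    bool active;        /* tracked state */
};

/*
 * After a guest exit: fold the 2-bit LR state field back into the
 * shadow record (bit 0 = pending, bit 1 = active).
 */
static void virq_sync_from_lr(struct virq *irq, unsigned int lr_state)
{
    irq->pending = (lr_state & 1u) != 0;
    irq->active  = (lr_state & 2u) != 0;
}

/* Before guest entry: does this interrupt want a list register? */
static bool virq_wants_lr(const struct virq *irq)
{
    return irq->enabled && irq->pending;
}
```

The hot paths (injection and sync) only touch the state fields, which
matches the observation that configuration accesses are rare.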

The actual interrupt state tracking can be quite tricky in parts. Interrupt
injections can be independent from the guest entry/exit points, also MMIO
configuration accesses could be triggered by any VCPU at any point in 

Re: [Xen-devel] [PATCH v5 4/5] ARM: Update Formula to compute MADT size using new callbacks in gic_hw_operations

2017-10-10 Thread Andre Przywara
Hi Manish,

On 10/10/17 07:16, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi <mja...@cavium.com>
> 
> estimate_acpi_efi_size needs to be updated to provide correct size of
> hardware domain's MADT, which now adds ITS information as well.
> 
> This patch updates the formula to compute extra MADT size, as per GICv2/3
> by calling gic_get_hwdom_extra_madt_size
> 
> Signed-off-by: Manish Jaggi <mja...@cavium.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Thanks!
Andre
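
For reference, the size computation as I understand it from the patch
below, written as a standalone sketch: a common part (MADT header, one
GICC entry per VCPU, one GICD entry) plus a GIC specific extra part,
which is zero for GICv2. The struct and function names are made up here,
and the entry sizes are passed in rather than hardcoded, so nothing in
this example should be read as the real ACPI structure sizes.

```c
#include <stddef.h>

/* Sizes of the individual MADT building blocks, supplied by the caller. */
struct madt_sizes {
    size_t header;  /* MADT table header */
    size_t gicc;    /* one GIC CPU interface entry */
    size_t gicd;    /* GIC distributor entry */
    size_t gicr;    /* one GICv3 redistributor region entry */
    size_t its;     /* one GICv3 ITS (translator) entry */
};

/* Extra part: what only a GICv3 contributes; pass 0/0 for a GICv2. */
static size_t madt_extra_size(const struct madt_sizes *s,
                              unsigned int gicr_regions,
                              unsigned int its_count)
{
    return s->gicr * gicr_regions + s->its * its_count;
}

/* Total size: common part plus the GIC specific extra part. */
static size_t madt_total_size(const struct madt_sizes *s,
                              unsigned int max_vcpus,
                              unsigned int gicr_regions,
                              unsigned int its_count)
{
    size_t base = s->header + s->gicc * max_vcpus + s->gicd;

    return base + madt_extra_size(s, gicr_regions, its_count);
}
```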

> ---
>  xen/arch/arm/domain_build.c |  7 +--
>  xen/arch/arm/gic-v2.c   |  6 ++
>  xen/arch/arm/gic-v3.c   | 19 +++
>  xen/arch/arm/gic.c  | 12 
>  xen/include/asm-arm/gic.h   |  3 +++
>  5 files changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index d6f9585..f17fcf1 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1808,12 +1808,7 @@ static int estimate_acpi_efi_size(struct domain *d, 
> struct kernel_info *kinfo)
>  acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
>  acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
>  
> -madt_size = sizeof(struct acpi_table_madt)
> -+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> -+ sizeof(struct acpi_madt_generic_distributor);
> -if ( d->arch.vgic.version == GIC_V3 )
> -madt_size += sizeof(struct acpi_madt_generic_redistributor)
> - * d->arch.vgic.nr_regions;
> +madt_size = gic_get_hwdom_madt_size(d);
>  acpi_size += ROUNDUP(madt_size, 8);
>  
>  addr = acpi_os_get_root_pointer();
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index cbe71a9..0123ea4 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct domain 
> *d)
>  return iomem_deny_access(d, mfn, mfn + nr);
>  }
>  
> +static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
> +{
> +return 0;
> +}
> +
>  #ifdef CONFIG_ACPI
>  static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
>  {
> @@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
>  .read_apr= gicv2_read_apr,
>  .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv2_make_hwdom_madt,
> +.get_hwdom_extra_madt_size = gicv2_get_hwdom_extra_madt_size,
>  .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>  .iomem_deny_access   = gicv2_iomem_deny_access,
>  .do_LPI  = gicv2_do_LPI,
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b3d605d..447998d 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1406,6 +1406,19 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  return table_len;
>  }
>  
> +static unsigned long gicv3_get_hwdom_extra_madt_size(const struct domain *d)
> +{
> +unsigned long size;
> +
> +size  = sizeof(struct acpi_madt_generic_redistributor)
> +* d->arch.vgic.nr_regions;
> +
> +size  += vgic_v3_its_count(d)
> +* sizeof(struct acpi_madt_generic_translator);
> +
> +return size;
> +}
> +
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>  const unsigned long end)
> @@ -1597,6 +1610,11 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  {
>  return 0;
>  }
> +
> +static unsigned long gicv3_get_hwdom_extra_madt_size(const struct domain *d)
> +{
> +return 0;
> +}
>  #endif
>  
>  /* Set up the GIC */
> @@ -1698,6 +1716,7 @@ static const struct gic_hw_operations gicv3_ops = {
>  .secondary_init  = gicv3_secondary_cpu_init,
>  .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv3_make_hwdom_madt,
> +.get_hwdom_extra_madt_size = gicv3_get_hwdom_extra_madt_size,
>  .iomem_deny_access   = gicv3_iomem_deny_access,
>  .do_LPI  = gicv3_do_LPI,
>  };
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 6c803bf..3c7b6df 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -851,6 +851,18 @@ int gic_make_hwdom_madt(const struct domain *d, u32 
> offset)
>  return gic_hw_ops->make_hwdom_madt(d, offset);
>  }
>  
> +unsigned long gic_get_hwdom_madt_size(const struct domain *d)
> +{
> +unsigned long madt_size;
> +
> +madt_size = sizeof(struct acpi_table_madt

Re: [Xen-devel] [PATCH 27/27 v11] xen/arm: vpl011: Correct the logic for asserting/de-asserting SBSA UART TX interrupt

2017-10-09 Thread Andre Przywara
Hi,

can you please re-break the commit message to fit into 72 characters?
git show looks rather ugly as it is now.

On 27/09/17 07:13, Bhupinder Thakur wrote:
> This patch fixes the issue observed when pl011 patches were tested on
> the junos hardware by Andre/Julien. It was observed that when large output is
> generated such as on running 'find /', output was getting truncated 
> intermittently
> due to OUT ring buffer getting full.
> 
> This issue was due to the fact that the SBSA UART driver expects that when
> a TX interrupt is asserted then the FIFO queue should be atleast half empty 
> and
> that it can write N bytes in the FIFO, where N is half the FIFO queue size, 
> without
> the bytes getting dropped due to FIFO getting full.
> 
> The SBSA UART emulation logic was asserting the TX interrupt as soon as
> any space became available in the FIFO and the SBSA UART driver tried to write
> more data (upto 16 bytes) in the FIFO expecting that there is enough space
> available leading to dropped bytes.
> 
> The SBSA spec [1] does not specify when the TX interrupt should be asserted
> or de-asserted. Due to lack of clarity on the expected behavior, it is
> assumed for now that TX interrupt should be asserted only when the FIFO goes
> half empty.
> 
> TBD: Once the SBSA spec is updated with the expected behavior, the 
> implementation
> will be modified to align with the spec requirement.
> 
> [1] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf
> 
> Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>

Only some minor things left below, but in general this looks much better
to me now.

> ---
> CC: Julien Grall <julien.gr...@arm.com>
> CC: Andre Przywara <andre.przyw...@arm.com>
> CC: Stefano Stabellini <sstabell...@kernel.org>
> 
> Changes since v8:
> - Used variables fifo_level/fifo_threshold for more clarity
> - Added a new macro SBSA_UART_FIFO_SIZE instead of using a magic number
> - Renamed ring_qsize variables to fifo_level for consistency 
> 
>  xen/arch/arm/vpl011.c| 87 
> ++--
>  xen/include/asm-arm/vpl011.h |  2 +
>  2 files changed, 61 insertions(+), 28 deletions(-)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 36794d8..1f97261 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -91,20 +91,24 @@ static uint8_t vpl011_read_data(struct domain *d)
>   */
>  if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) > 0 )
>  {
> +unsigned int fifo_level;
> +
>  data = intf->in[xencons_mask(in_cons, sizeof(intf->in))];
>  in_cons += 1;
>  smp_mb();
>  intf->in_cons = in_cons;
> +
> +fifo_level = xencons_queued(in_prod, in_cons, sizeof(intf->in));
> +
> +if ( fifo_level == 0 )
> +{
> +vpl011->uartfr |= RXFE;
> +vpl011->uartris &= ~RXI;
> +}
>  }
>  else
>  gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n");
>  
> -if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) == 0 )
> -{
> -vpl011->uartfr |= RXFE;
> -vpl011->uartris &= ~RXI;
> -}
> -
>  vpl011->uartfr &= ~RXFF;
>  
>  vpl011_update_interrupt_status(d);
> @@ -144,28 +148,41 @@ static void vpl011_write_data(struct domain *d, uint8_t 
> data)
>  if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) !=
>   sizeof (intf->out) )
>  {
> +unsigned int fifo_level, fifo_threshold;
> +
>  intf->out[xencons_mask(out_prod, sizeof(intf->out))] = data;
>  out_prod += 1;
>  smp_wmb();
>  intf->out_prod = out_prod;
> -}
> -else
> -gprintk(XENLOG_ERR, "vpl011: Unexpected OUT ring buffer full\n");
>  
> -if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) ==
> - sizeof (intf->out) )
> -{
> -vpl011->uartfr |= TXFF;
> -vpl011->uartris &= ~TXI;
> +fifo_level = xencons_queued(out_prod, out_cons, sizeof(intf->out));
> +
> +if ( fifo_level == sizeof (intf->out) )
> +{
> +vpl011->uartfr |= TXFF;
> +
> +/*
> + * This bit is set only when FIFO becomes full. This ensures that
> + * the SBSA UART driver can write the early console data as fast 
> as
> + * possible, without waiting for the BUSY bit to get cleared 
> before
> + * writing each byte.
> + */
> +vpl011

[Xen-devel] [PATCH] ARM: sunxi: support more Allwinner SoCs

2017-10-06 Thread Andre Przywara
So far we only supported the Allwinner A20 SoC. Add support for most
of the other virtualization capable Allwinner SoCs by:
- supporting the watchdog in newer (sun8i) SoCs
- getting the watchdog address from DT
- adding compatible strings for other 32-bit SoCs
- adding compatible strings for 64-bit SoCs

As all 64-bit SoCs support system reset via PSCI, we don't use the
platform specific reset routine there. Should the 32-bit SoCs start to
properly support the PSCI 0.2 SYSTEM_RESET call, we will use it for them
automatically, as we try PSCI first, then fall back to platform reset.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
Hi,

this is based on staging, which has the required UART fix.
Tested on:
- BananaPi M1 (A20)
- OrangePi Zero (H2+, which is almost the same as H3)
- OrangePi PC 2 (H5, arm64)
- Pine64+ (A64, arm64)

On the 64-bit boards I could boot to the Dom0 prompt.
I had issues with U-Boot's fdt command on the two 32-bit boards, so couldn't
inject the Dom0 magic into the DT. But at least Xen booted and reset
worked with both the "old" and "new" watchdog.
The newer boards require "clk_ignore_unused" on the Linux command line at
the moment; I will try to find a more sustainable solution next week.
I will try to update the Wiki later on.

Please let me know if this is worth splitting up into multiple patches
(watchdog address from DT, new watchdog support, arm64 support).
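
To help with that decision, here is the difference between the two
watchdog generations boiled down to a standalone sketch. The register
offsets and bit values are the ones from the patch below; the fake
register array, reg_write() and wdt_reset() are stand-ins invented for
this example (the real code goes through ioremap and writel).

```c
#include <stdbool.h>
#include <stdint.h>

/* "Old" watchdog (sun4i style): a single mode register. */
#define WDT_MODE_REG            0x04
#define WDT_MODE_EN             (1u << 0)
#define WDT_MODE_RST_EN         (1u << 1)

/* "New" watchdog (sun6i style): separate config and mode registers. */
#define WDOG0_CFG_REG           0x14
#define WDOG0_MODE_REG          0x18
#define WDT_CONFIG_SYSTEM_RESET (1u << 0)

/* Stand-in for writel(): store into a fake 32-bit register block. */
static void reg_write(uint32_t *base, unsigned int offset, uint32_t val)
{
    base[offset / 4] = val;
}

/* Arm the watchdog so that it resets the system once it expires. */
static void wdt_reset(uint32_t *wdt, bool new_wdt)
{
    if (new_wdt) {
        /* Select "reset whole system" first, then start the timer. */
        reg_write(wdt, WDOG0_CFG_REG, WDT_CONFIG_SYSTEM_RESET);
        reg_write(wdt, WDOG0_MODE_REG, WDT_MODE_EN);
    } else {
        /* Enable and reset-enable live in the same register. */
        reg_write(wdt, WDT_MODE_REG, WDT_MODE_EN | WDT_MODE_RST_EN);
    }
}
```

So the new watchdog support is really just a second, two-write register
sequence selected by the compatible string.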

Many thanks to Awais for the idea and his original patch, and for testing
this one!

Cheers,
Andre.

 xen/arch/arm/platforms/Makefile |  2 +-
 xen/arch/arm/platforms/sunxi.c  | 96 +++--
 2 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
index 49fa683780..53a47e48d2 100644
--- a/xen/arch/arm/platforms/Makefile
+++ b/xen/arch/arm/platforms/Makefile
@@ -5,6 +5,6 @@ obj-$(CONFIG_ARM_32) += midway.o
 obj-$(CONFIG_ARM_32) += omap5.o
 obj-$(CONFIG_ARM_32) += rcar2.o
 obj-$(CONFIG_ARM_64) += seattle.o
-obj-$(CONFIG_ARM_32) += sunxi.o
+obj-y += sunxi.o
 obj-$(CONFIG_ARM_64) += xgene-storm.o
 obj-$(CONFIG_ARM_64) += xilinx-zynqmp.o
diff --git a/xen/arch/arm/platforms/sunxi.c b/xen/arch/arm/platforms/sunxi.c
index 0ba7b3d9b4..c8a3e8eec8 100644
--- a/xen/arch/arm/platforms/sunxi.c
+++ b/xen/arch/arm/platforms/sunxi.c
@@ -1,7 +1,7 @@
 /*
  * xen/arch/arm/platforms/sunxi.c
  *
- * SUNXI (AllWinner A20/A31) specific settings
+ * SUNXI (Allwinner ARM SoCs) specific settings
  *
  * Copyright (c) 2013 Citrix Systems.
  *
@@ -22,36 +22,103 @@
 #include 
 
 /* Watchdog constants: */
-#define SUNXI_WDT_BASE0x01c20c90
-#define SUNXI_WDT_MODE0x04
-#define SUNXI_WDT_MODEADDR(SUNXI_WDT_BASE + SUNXI_WDT_MODE)
+#define SUNXI_WDT_MODE_REG0x04
 #define SUNXI_WDT_MODE_EN (1 << 0)
 #define SUNXI_WDT_MODE_RST_EN (1 << 1)
 
+#define SUNXI_WDT_CONFIG_SYSTEM_RESET   (1 << 0)
+#define SUNXI_WDOG0_CFG_REG 0x14
+#define SUNXI_WDOG0_MODE_REG 0x18
 
-static void sunxi_reset(void)
+static void __iomem *sunxi_map_watchdog(bool *new_wdt)
 {
     void __iomem *wdt;
+    struct dt_device_node *node;
+    paddr_t wdt_start, wdt_len;
+    bool _new_wdt = false;
+    int ret;
+
+    node = dt_find_compatible_node(NULL, NULL, "allwinner,sun6i-a31-wdt");
+    if ( node )
+        _new_wdt = true;
+    else
+        node = dt_find_compatible_node(NULL, NULL, "allwinner,sun4i-a10-wdt");
+
+    if ( !node )
+    {
+        dprintk(XENLOG_ERR, "Cannot find matching watchdog node in DT\n");
+        return NULL;
+    }
 
-    wdt = ioremap_nocache(SUNXI_WDT_MODEADDR & PAGE_MASK, PAGE_SIZE);
+    ret = dt_device_get_address(node, 0, &wdt_start, &wdt_len);
+    if ( ret )
+    {
+        dprintk(XENLOG_ERR, "Cannot read watchdog register address\n");
+        return NULL;
+    }
+
+    wdt = ioremap_nocache(wdt_start & PAGE_MASK, PAGE_SIZE);
     if ( !wdt )
     {
         dprintk(XENLOG_ERR, "Unable to map watchdog register!\n");
-        return;
+        return NULL;
     }
 
-    /* Enable watchdog to trigger a reset after 500 ms: */
+    if ( new_wdt )
+        *new_wdt = _new_wdt;
+
+    return wdt + (wdt_start & ~PAGE_MASK);
+}
+
+/* Enable watchdog to trigger a reset after 500 ms */
+static void sunxi_old_wdt_reset(void __iomem *wdt)
+{
     writel(SUNXI_WDT_MODE_EN | SUNXI_WDT_MODE_RST_EN,
-           wdt + (SUNXI_WDT_MODEADDR & ~PAGE_MASK));
+           wdt + SUNXI_WDT_MODE_REG);
+}
+
+static void sunxi_new_wdt_reset(void __iomem *wdt)
+{
+    writel(SUNXI_WDT_CONFIG_SYSTEM_RESET, wdt + SUNXI_WDOG0_CFG_REG);
+    writel(SUNXI_WDT_MODE_EN, wdt + SUNXI_WDOG0_MODE_REG);
+}
+
+static void sunxi_reset(void)
+{
+    void __iomem *wdt;
+    bool is_new_wdt;
+
+    wdt = sunxi_map_watchdog(&is_new_wdt);
+    if ( !wdt )
+        return;
+
+    if ( is_new_wdt )
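
The page arithmetic in sunxi_map_watchdog() can be checked in isolation. The following is only a minimal sketch, assuming 4 KiB pages and a PAGE_MASK defined as ~(PAGE_SIZE - 1); the helper names page_base() and page_offset() are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, PAGE_MASK clears the in-page offset bits. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Page-aligned base: the address that would be handed to the mapping call. */
static uint64_t page_base(uint64_t paddr)
{
    return paddr & PAGE_MASK;
}

/* Offset of the register block inside the mapped page. */
static uint64_t page_offset(uint64_t paddr)
{
    return paddr & ~PAGE_MASK;
}
```

With the watchdog address 0x01c20c90 that was previously hardcoded, the page to map would start at 0x01c20000 and the registers would sit 0xc90 bytes into that mapping, which is why the function returns the mapped pointer plus (wdt_start & ~PAGE_MASK).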
+

Re: [Xen-devel] [PATCH v4 5/5] ARM: ITS: Expose ITS in the MADT table

2017-10-05 Thread Andre Przywara
Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi <mja...@cavium.com>
> 
> Add gicv3_its_make_hwdom_madt to update hwdom MADT ITS information.

Thanks for the rework, that looks much better now!

> Signed-off-by: Manish Jaggi <mja...@cavium.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/gic-v3-its.c| 19 +++
>  xen/arch/arm/gic-v3.c|  1 +
>  xen/include/asm-arm/gic_v3_its.h |  8 
>  3 files changed, 28 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 8697e5b..e3e7e92 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -1062,6 +1062,25 @@ void gicv3_its_acpi_init(void)
>  acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
>  gicv3_its_acpi_probe, 0);
>  }
> +
> +unsigned long gicv3_its_make_hwdom_madt(const struct domain *d, void *base_ptr)
> +{
> +    unsigned long i = 0;
> +    void *fw_its;
> +    struct acpi_madt_generic_translator *hwdom_its;
> +
> +    hwdom_its = base_ptr;
> +
> +    for ( i = 0; i < vgic_v3_its_count(d); i++ )
> +    {
> +        fw_its = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
> +                                           i);
> +        memcpy(hwdom_its, fw_its, sizeof(struct acpi_madt_generic_translator));
> +        hwdom_its++;
> +    }
> +
> +    return sizeof(struct acpi_madt_generic_translator) * vgic_v3_its_count(d);
> +}
>  #endif
>  
>  /*
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 6e8d580..d29eea6 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1403,6 +1403,7 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  table_len += size;
>  }
>  
> +table_len += gicv3_its_make_hwdom_madt(d, base_ptr + table_len);
>  return table_len;
>  }
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h 
> b/xen/include/asm-arm/gic_v3_its.h
> index 31fca66..fc37776 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -138,6 +138,8 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  
>  #ifdef CONFIG_ACPI
>  void gicv3_its_acpi_init(void);
> +unsigned long gicv3_its_make_hwdom_madt(const struct domain *d,
> +void *base_ptr);
>  #endif
>  
>  /* Deny iomem access for its */
> @@ -208,6 +210,12 @@ static inline void gicv3_its_dt_init(const struct 
> dt_device_node *node)
>  static inline void gicv3_its_acpi_init(void)
>  {
>  }
> +
> +static inline unsigned long gicv3_its_make_hwdom_madt(const struct domain *d,
> +  void *base_ptr)
> +{
> +return 0;
> +}
>  #endif
>  
>  static inline int gicv3_its_deny_access(const struct domain *d)
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 4/5] ARM: Introduce get_hwdom_madt_size in gic_hw_operations

2017-10-05 Thread Andre Przywara
Hi,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> estimate_acpi_efi_size needs to be updated to provide correct size of
> hardware domains MADT, which now adds ITS information as well.
> 
> Introducing gic_get_hwdom_madt_size.
> 
> Signed-off-by: Manish Jaggi 
> ---
>  xen/arch/arm/domain_build.c |  7 +--
>  xen/arch/arm/gic-v2.c   |  9 +
>  xen/arch/arm/gic-v3.c   | 19 +++
>  xen/arch/arm/gic.c  | 12 
>  xen/include/asm-arm/gic.h   |  3 +++
>  5 files changed, 44 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index d6f9585..f17fcf1 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1808,12 +1808,7 @@ static int estimate_acpi_efi_size(struct domain *d, 
> struct kernel_info *kinfo)
>  acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
>  acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
>  
> -madt_size = sizeof(struct acpi_table_madt)
> -+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> -+ sizeof(struct acpi_madt_generic_distributor);
> -if ( d->arch.vgic.version == GIC_V3 )
> -madt_size += sizeof(struct acpi_madt_generic_redistributor)
> - * d->arch.vgic.nr_regions;
> +madt_size = gic_get_hwdom_madt_size(d);
>  acpi_size += ROUNDUP(madt_size, 8);
>  
>  addr = acpi_os_get_root_pointer();
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index cbe71a9..2868766 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1012,6 +1012,14 @@ static int gicv2_iomem_deny_access(const struct domain 
> *d)
>  return iomem_deny_access(d, mfn, mfn + nr);
>  }
>  
> +static unsigned long gicv2_get_hwdom_madt_size(const struct domain *d)
> +{
> +return sizeof(struct acpi_table_madt)
> ++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> ++ sizeof(struct acpi_madt_generic_distributor);
> +
> +}
> +
>  #ifdef CONFIG_ACPI
>  static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
>  {
> @@ -1248,6 +1256,7 @@ const static struct gic_hw_operations gicv2_ops = {
>  .read_apr= gicv2_read_apr,
>  .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv2_make_hwdom_madt,
> +.get_hwdom_madt_size = gicv2_get_hwdom_madt_size,
>  .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>  .iomem_deny_access   = gicv2_iomem_deny_access,
>  .do_LPI  = gicv2_do_LPI,
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b3d605d..6e8d580 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1406,6 +1406,19 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  return table_len;
>  }
>  
> +static unsigned long gicv3_get_hwdom_madt_size(const struct domain *d)
> +{
> +unsigned long size;
> +
> +size  = sizeof(struct acpi_madt_generic_redistributor)
> +* d->arch.vgic.nr_regions;
> +
> +size  += vgic_v3_its_count(d)
> +* sizeof(struct acpi_madt_generic_translator);
> +
> +return size;
> +}
> +
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>  const unsigned long end)
> @@ -1597,6 +1610,11 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  {
>  return 0;
>  }
> +
> +static unsigned long gicv3_get_hwdom_madt_size(const struct domain *d)
> +{
> +return 0;
> +}
>  #endif
>  
>  /* Set up the GIC */
> @@ -1698,6 +1716,7 @@ static const struct gic_hw_operations gicv3_ops = {
>  .secondary_init  = gicv3_secondary_cpu_init,
>  .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv3_make_hwdom_madt,
> +.get_hwdom_madt_size = gicv3_get_hwdom_madt_size,
>  .iomem_deny_access   = gicv3_iomem_deny_access,
>  .do_LPI  = gicv3_do_LPI,
>  };
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 6c803bf..f3c1f0b 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -851,6 +851,18 @@ int gic_make_hwdom_madt(const struct domain *d, u32 
> offset)
>  return gic_hw_ops->make_hwdom_madt(d, offset);
>  }
>  
> +unsigned long gic_get_hwdom_madt_size(const struct domain *d)
> +{
> +unsigned long madt_size;
> +
> +madt_size = sizeof(struct acpi_table_madt)
> ++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> ++ sizeof(struct acpi_madt_generic_distributor)
> ++ gic_hw_ops->get_hwdom_madt_size(d);

But this is now doubled for a GICv2? As you already do that calculation
in the GICv2 callback?
So I suggest you drop that *there* and rename the function member to
get_hwdom_extra_madt_size() (or so).
So in the 
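
One possible shape of the split suggested above, as a rough sketch only: the generic code accounts for the common MADT parts once, and the per-GIC callback returns just the extra bytes. The struct toy_domain, the function names and the entry sizes below are illustrative stand-ins and not the actual Xen definitions.

```c
#include <stddef.h>

/* Stand-in entry sizes; the real values come from the ACPI structures. */
enum {
    MADT_HEADER_SIZE = 44,
    GICC_ENTRY_SIZE  = 80,
    GICD_ENTRY_SIZE  = 24,
    GICR_ENTRY_SIZE  = 16,
    ITS_ENTRY_SIZE   = 20,
};

/* Hypothetical, cut-down view of the fields the size estimate needs. */
struct toy_domain {
    unsigned int max_vcpus;
    unsigned int nr_rdist_regions;
    unsigned int nr_its;
};

/* Per-GIC callback: returns only what this GIC adds to the common part. */
typedef size_t (*extra_madt_size_fn)(const struct toy_domain *d);

static size_t gicv2_extra_madt_size(const struct toy_domain *d)
{
    (void)d;
    return 0; /* GICv2 needs nothing beyond header, GICC and GICD entries */
}

static size_t gicv3_extra_madt_size(const struct toy_domain *d)
{
    return (size_t)d->nr_rdist_regions * GICR_ENTRY_SIZE
           + (size_t)d->nr_its * ITS_ENTRY_SIZE;
}

/* Generic part: header, one GICC entry per vCPU, one GICD, plus the extras. */
static size_t hwdom_madt_size(const struct toy_domain *d,
                              extra_madt_size_fn extra)
{
    return MADT_HEADER_SIZE
           + (size_t)d->max_vcpus * GICC_ENTRY_SIZE
           + GICD_ENTRY_SIZE
           + extra(d);
}
```

With this layout the common terms appear in exactly one place, so a GICv2 system no longer counts them twice.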

Re: [Xen-devel] [PATCH v4 3/5] ARM: ITS: Deny hardware domain access to ITS

2017-10-05 Thread Andre Przywara
Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi <mja...@cavium.com>
> 
> This patch extends the gicv3_iomem_deny_access functionality by adding
> support for ITS region as well. Add function gicv3_its_deny_access.
> 
> Signed-off-by: Manish Jaggi <mja...@cavium.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Thanks,
Andre.

> ---
>  xen/arch/arm/gic-v3-its.c| 22 ++
>  xen/arch/arm/gic-v3.c|  3 +++
>  xen/include/asm-arm/gic_v3_its.h |  9 +
>  3 files changed, 34 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 0f662cf..8697e5b 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -905,6 +906,27 @@ struct pending_irq *gicv3_assign_guest_event(struct 
> domain *d,
>  return pirq;
>  }
>  
> +int gicv3_its_deny_access(const struct domain *d)
> +{
> +    int rc = 0;
> +    unsigned long mfn, nr;
> +    const struct host_its *its_data;
> +
> +    list_for_each_entry( its_data, &host_its_list, entry )
> +    {
> +        mfn = paddr_to_pfn(its_data->addr);
> +        nr = PFN_UP(GICV3_ITS_SIZE);
> +        rc = iomem_deny_access(d, mfn, mfn + nr);
> +        if ( rc )
> +        {
> +            printk( "iomem_deny_access failed for %lx:%lx \r\n", mfn, nr);
> +            break;
> +        }
> +    }
> +
> +    return rc;
> +}
> +
>  /*
>   * Create the respective guest DT nodes from a list of host ITSes.
>   * This copies the reg property, so the guest sees the ITS at the same 
> address
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 6f562f4..b3d605d 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1308,6 +1308,9 @@ static int gicv3_iomem_deny_access(const struct domain 
> *d)
>  if ( rc )
>  return rc;
>  
> +if ( gicv3_its_deny_access(d) )
> +return rc;
> +
>  for ( i = 0; i < gicv3.rdist_count; i++ )
>  {
>  mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
> diff --git a/xen/include/asm-arm/gic_v3_its.h 
> b/xen/include/asm-arm/gic_v3_its.h
> index e1be33c..31fca66 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -139,6 +139,10 @@ void gicv3_its_dt_init(const struct dt_device_node 
> *node);
>  #ifdef CONFIG_ACPI
>  void gicv3_its_acpi_init(void);
>  #endif
> +
> +/* Deny iomem access for its */
> +int gicv3_its_deny_access(const struct domain *d);
> +
>  bool gicv3_its_host_has_its(void);
>  
>  unsigned int vgic_v3_its_count(const struct domain *d);
> @@ -206,6 +210,11 @@ static inline void gicv3_its_acpi_init(void)
>  }
>  #endif
>  
> +static inline int gicv3_its_deny_access(const struct domain *d)
> +{
> +return 0;
> +}
> +
>  static inline bool gicv3_its_host_has_its(void)
>  {
>  return false;
> 



Re: [Xen-devel] [PATCH v4 2/5] ARM: ITS: Populate host_its_list from ACPI MADT Table

2017-10-05 Thread Andre Przywara
Hi,

On 04/10/17 06:29, Manish Jaggi wrote:
> Hello Julien,
> 
> On 10/3/2017 7:17 PM, Julien Grall wrote:
>> Hi Manish,
>>
>> On 21/09/17 14:17, mja...@caviumnetworks.com wrote:
>>> From: Manish Jaggi <mja...@cavium.com>
>>>
>>> Added gicv3_its_acpi_init to update host_its_list from MADT table.
>>> For ACPI, host_its structure  stores dt_node as NULL.
>>>
>>> Signed-off-by: Manish Jaggi <mja...@cavium.com>
>>> ---
>>>   xen/arch/arm/gic-v3-its.c    | 24 
>>>   xen/arch/arm/gic-v3.c    |  2 ++
>>>   xen/include/asm-arm/gic_v3_its.h | 10 ++
>>>   3 files changed, 36 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>> index 0610991..0f662cf 100644
>>> --- a/xen/arch/arm/gic-v3-its.c
>>> +++ b/xen/arch/arm/gic-v3-its.c
>>> @@ -18,6 +18,7 @@
>>>    * along with this program; If not, see
>>> <http://www.gnu.org/licenses/>.
>>>    */
>>>   +#include 
>>>   #include 
>>>   #include 
>>>   #include 
>>> @@ -1018,6 +1019,29 @@ void gicv3_its_dt_init(const struct
>>> dt_device_node *node)
>>>   }
>>>   }
>>>   +#ifdef CONFIG_ACPI
>>> +static int gicv3_its_acpi_probe(struct acpi_subtable_header *header,
>>> +    const unsigned long end)
>>> +{
>>> +    struct acpi_madt_generic_translator *its;
>>> +
>>> +    its = (struct acpi_madt_generic_translator *)header;
>>> +    if ( BAD_MADT_ENTRY(its, end) )
>>> +    return -EINVAL;
>>> +
>>> +    add_to_host_its_list(its->base_address, GICV3_ITS_SIZE, NULL);
>>
>> After the comment from Andre, I was expecting some rework to avoid
>> store the size of the ITS in host_its. So what's the plan for that?
> GICV3_ITS_SIZE  is now 128K (prev 64k, see below), same as what used in
> linux code, I think andre mentioned that need to add additional 64K.

That was one thing, but I was wondering about why we would need to store
that value as a *variable* in struct host_its when it is actually an
architecture defined constant. But as it was there before and it seems
cleaner to use the DT provided size, it could stay as well. We might fix
that later on.

>>
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +void gicv3_its_acpi_init(void)
>>> +{
>>> +    /* Parse ITS information */
>>> +    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
>>> +    gicv3_its_acpi_probe, 0);
>>
>> The indentation still looks wrong here.
> ah.. ok.

So ignoring that "size" thing above and assuming this w/s issue fixed:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre

>>
>>> +}
>>> +#endif
>>> +
>>>   /*
>>>    * Local variables:
>>>    * mode: C
>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>>> index f990eae..6f562f4 100644
>>> --- a/xen/arch/arm/gic-v3.c
>>> +++ b/xen/arch/arm/gic-v3.c
>>> @@ -1567,6 +1567,8 @@ static void __init gicv3_acpi_init(void)
>>>     gicv3.rdist_stride = 0;
>>>   +    gicv3_its_acpi_init();
>>> +
>>>   /*
>>>    * In ACPI, 0 is considered as the invalid address. However the
>>> rest
>>>    * of the initialization rely on the invalid address to be
>>> diff --git a/xen/include/asm-arm/gic_v3_its.h
>>> b/xen/include/asm-arm/gic_v3_its.h
>>> index 1fac1c7..e1be33c 100644
>>> --- a/xen/include/asm-arm/gic_v3_its.h
>>> +++ b/xen/include/asm-arm/gic_v3_its.h
>>> @@ -20,6 +20,7 @@
>>>   #ifndef __ASM_ARM_ITS_H__
>>>   #define __ASM_ARM_ITS_H__
>>>   +#define GICV3_ITS_SIZE  SZ_128K
>>
>> A less random place for this is close to the ITS_DOORBELL_OFFSET
>> definition.
> ok will do :)
>>
>>>   #define GITS_CTLR 0x000
>>>   #define GITS_IIDR   0x004
>>>   #define GITS_TYPER  0x008
>>> @@ -135,6 +136,9 @@ extern struct list_head host_its_list;
>>>   /* Parse the host DT and pick up all host ITSes. */
>>>   void gicv3_its_dt_init(const struct dt_device_node *node);
>>>   +#ifdef CONFIG_ACPI
>>> +void gicv3_its_acpi_init(void);
>>> +#endif
>>>   bool gicv3_its_host_has_its(void);
>>>     unsigned int vgic_v3_its_count(const struct domain *d);
>>> @@ -196,6 +200,12 @@ static inline void gicv3_its_dt_init(const
>>> struct dt_device_node *node)
>>>   {
>>>   }
>>>   +#ifdef CONFIG_ACPI
>>> +static inline void gicv3_its_acpi_init(void)
>>> +{
>>> +}
>>> +#endif
>>> +
>>>   static inline bool gicv3_its_host_has_its(void)
>>>   {
>>>   return false;
>>>
>>
>> Cheers,
>>
> 



Re: [Xen-devel] [PATCH v2 1/2] xen/arm64: Add Support for Allwinner H5 (sun50i)

2017-10-04 Thread Andre Przywara
Hi,



>>> Since reset routine will not be required with PSCI, I assume should revert 
>>> the reset code changes for this H5 patch and leave the DT retrieval for 
>>> another patch that adds H3 support. Or should I try that stuff for next 
>>> version of this patch? 
>>
>> Thanks for the offer, but I already made a patch that adds support for
>> basically all virtualization capable Allwinner SoCs (both v7 and v8
>> ones). This looks into the DT for ARMv7 SoCs, but relies entirely on
>> PSCI for ARMv8 SoCs. I just need to test it, then will send it out.
>>
>> So actually we won't need anything from that patch here at all, since my
>> patch supersedes it in a more generic way.
>> Do you plan on reworking/resending the UART fix (which should come
>> first, btw, as it is a prerequisite for H5 enablement)?
>>
>> You could either send the UART fix on its own if there are changes or I
>> include it as patch 1/2 of my Allwinner "series".
> 
> I'll send the latest version of UART fix as a standalone patch.

Thanks Awais, much appreciated. I don't have that other patch here at
the moment, but will send it to you for testing ASAP.

Cheers,
Andre.



Re: [Xen-devel] [PATCH v2 1/2] xen/arm64: Add Support for Allwinner H5 (sun50i)

2017-10-04 Thread Andre Przywara
Hi Awais,

On 04/10/17 10:16, Awais Masood wrote:
> Hi,
> 
> On 09/29/2017 09:35 PM, Andre Przywara wrote:
>> Hi,
>>
>> On 09/28/2017 03:49 PM, Andre Przywara wrote:
>>> Hi,
>>>
>>> On 09/28/2017 01:03 PM, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 09/26/2017 10:37 AM, Awais Masood wrote:
>>>>> This patch adds support for Allwinner H5/sun50i SoC.
>>>>>
>>>>> Makefile updated to enable ARM64 compilation for sunxi.c.
>>
>> ...
>>
>>>>> --- a/xen/arch/arm/platforms/sunxi.c
>>>>> +++ b/xen/arch/arm/platforms/sunxi.c
>>>>> @@ -22,18 +22,18 @@
>>>>>   #include 
>>>>>   /* Watchdog constants: */
>>>>> -#define SUNXI_WDT_BASE0x01c20c90
>>>>> +#define SUNXI_WDT_A20_BASE0x01c20c90
>>>>> +#define SUNXI_WDT_H5_BASE 0x01c20cA0
>>>>
>>>> I know that we hardcoded this value for the A20. However, I am wondering 
>>>> if we could find this address from the Device-Tree?
>>>
>>> Yes, both sun7i-a20.dtsi and the H5 .dts have the WDT.
>>> Its compatible strings are sun4i-a10-wdt and sun6i-a31-wdt, respectively. I 
>>> have to check what the differences are, but I guess for our purposes these 
>>> should be small.
>>> That seems like a call to some proper DT driven timer/WDT driver?
>>
>> Scratch that. I just see that this is solely used for the reset function. So 
>> we should not need this for the H5 (and the A64 for that matter). We may 
>> need this for the H3 (Cortex-A7) support, however, which seems quite popular 
>> on cheap boards.
>>
> 
> Since reset routine will not be required with PSCI, I assume should revert 
> the reset code changes for this H5 patch and leave the DT retrieval for 
> another patch that adds H3 support. Or should I try that stuff for next 
> version of this patch? 

Thanks for the offer, but I already made a patch that adds support for
basically all virtualization capable Allwinner SoCs (both v7 and v8
ones). This looks into the DT for ARMv7 SoCs, but relies entirely on
PSCI for ARMv8 SoCs. I just need to test it, then will send it out.

So actually we won't need anything from that patch here at all, since my
patch supersedes it in a more generic way.
Do you plan on reworking/resending the UART fix (which should come
first, btw, as it is a prerequisite for H5 enablement)?

You could either send the UART fix on its own if there are changes or I
include it as patch 1/2 of my Allwinner "series".

Thanks!
Andre.

>> Cheers,
>> Andre
>>
>>>>>   #define SUNXI_WDT_MODE0x04
>>>>> -#define SUNXI_WDT_MODEADDR(SUNXI_WDT_BASE + SUNXI_WDT_MODE)
>>>>>   #define SUNXI_WDT_MODE_EN (1 << 0)
>>>>>   #define SUNXI_WDT_MODE_RST_EN (1 << 1)
>>>>> -static void sunxi_reset(void)
>>>>> +static void sunxi_reset(u32 base)
>>>>>   {
>>>>>   void __iomem *wdt;
>>>>> -wdt = ioremap_nocache(SUNXI_WDT_MODEADDR & PAGE_MASK, PAGE_SIZE);
>>>>> +wdt = ioremap_nocache((base + SUNXI_WDT_MODE) & PAGE_MASK, 
>>>>> PAGE_SIZE);
>>>>>   if ( !wdt )
>>>>>   {
>>>>>   dprintk(XENLOG_ERR, "Unable to map watchdog register!\n");
>>>>> @@ -42,19 +42,35 @@ static void sunxi_reset(void)
>>>>>   /* Enable watchdog to trigger a reset after 500 ms: */
>>>>>   writel(SUNXI_WDT_MODE_EN | SUNXI_WDT_MODE_RST_EN,
>>>>> -  wdt + (SUNXI_WDT_MODEADDR & ~PAGE_MASK));
>>>>> +  wdt + ((base + SUNXI_WDT_MODE) & ~PAGE_MASK));
>>>>>   iounmap(wdt);
>>>>>   for (;;)
>>>>>   wfi();
>>>>>   }
>>>>>
>>>>> -static const char * const sunxi_dt_compat[] __initconst =
>>>>> +static void sunxi_a20_reset(void)
>>>>> +{
>>>>> +sunxi_reset(SUNXI_WDT_A20_BASE);
>>>>> +}
>>>>> +
>>>>> +static void sunxi_h5_reset(void)
>>>>> +{
>>>>> +sunxi_reset(SUNXI_WDT_H5_BASE);
>>>>
>>>> If I read correctly the Device-Tree for 
>>>> (linux/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi), the firmware is 
>>>> supporting PSCI 0.2.
>>>>
>>>> PSCI 0.2 provides call for power-off/reset, so implementation the reset 
>>>> callback

Re: [Xen-devel] [PATCH v2 1/2] xen/arm64: Add Support for Allwinner H5 (sun50i)

2017-09-29 Thread Andre Przywara

Hi,

On 09/28/2017 03:49 PM, Andre Przywara wrote:
> Hi,
>
> On 09/28/2017 01:03 PM, Julien Grall wrote:
>> Hi,
>>
>> On 09/26/2017 10:37 AM, Awais Masood wrote:
>>> This patch adds support for Allwinner H5/sun50i SoC.
>>>
>>> Makefile updated to enable ARM64 compilation for sunxi.c.

...

>>> --- a/xen/arch/arm/platforms/sunxi.c
>>> +++ b/xen/arch/arm/platforms/sunxi.c
>>> @@ -22,18 +22,18 @@
>>>  #include
>>>  /* Watchdog constants: */
>>> -#define SUNXI_WDT_BASE    0x01c20c90
>>> +#define SUNXI_WDT_A20_BASE    0x01c20c90
>>> +#define SUNXI_WDT_H5_BASE 0x01c20cA0
>>
>> I know that we hardcoded this value for the A20. However, I am
>> wondering if we could find this address from the Device-Tree?
>
> Yes, both sun7i-a20.dtsi and the H5 .dts have the WDT.
> Its compatible strings are sun4i-a10-wdt and sun6i-a31-wdt,
> respectively. I have to check what the differences are, but I guess for
> our purposes these should be small.
> That seems like a call to some proper DT driven timer/WDT driver?

Scratch that. I just see that this is solely used for the reset
function. So we should not need this for the H5 (and the A64 for that
matter). We may need this for the H3 (Cortex-A7) support, however, which
seems quite popular on cheap boards.

Cheers,
Andre

>>>  #define SUNXI_WDT_MODE    0x04
>>> -#define SUNXI_WDT_MODEADDR    (SUNXI_WDT_BASE + SUNXI_WDT_MODE)
>>>  #define SUNXI_WDT_MODE_EN (1 << 0)
>>>  #define SUNXI_WDT_MODE_RST_EN (1 << 1)
>>> -static void sunxi_reset(void)
>>> +static void sunxi_reset(u32 base)
>>>  {
>>>      void __iomem *wdt;
>>> -    wdt = ioremap_nocache(SUNXI_WDT_MODEADDR & PAGE_MASK, PAGE_SIZE);
>>> +    wdt = ioremap_nocache((base + SUNXI_WDT_MODE) & PAGE_MASK, PAGE_SIZE);
>>>      if ( !wdt )
>>>      {
>>>          dprintk(XENLOG_ERR, "Unable to map watchdog register!\n");
>>> @@ -42,19 +42,35 @@ static void sunxi_reset(void)
>>>      /* Enable watchdog to trigger a reset after 500 ms: */
>>>      writel(SUNXI_WDT_MODE_EN | SUNXI_WDT_MODE_RST_EN,
>>> -           wdt + (SUNXI_WDT_MODEADDR & ~PAGE_MASK));
>>> +           wdt + ((base + SUNXI_WDT_MODE) & ~PAGE_MASK));
>>>      iounmap(wdt);
>>>      for (;;)
>>>          wfi();
>>>  }
>>>
>>> -static const char * const sunxi_dt_compat[] __initconst =
>>> +static void sunxi_a20_reset(void)
>>> +{
>>> +    sunxi_reset(SUNXI_WDT_A20_BASE);
>>> +}
>>> +
>>> +static void sunxi_h5_reset(void)
>>> +{
>>> +    sunxi_reset(SUNXI_WDT_H5_BASE);
>>
>> If I read correctly the Device-Tree for
>> (linux/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi), the firmware is
>> supporting PSCI 0.2.
>>
>> PSCI 0.2 provides call for power-off/reset, so implementation the
>> reset callback should not be necessary.
>
> Yes, indeed, on the H5 PSCI 0.2 reset works via ATF.
>
>> Similarly the cubietrucks we have in osstest are using PSCI 0.2 and
>> should not need the reset. Andre do you know if it is the case for all
>> the A20?
>
> It claims 0.2, but in fact it seems not to be fully compliant, as (from
> looking at the code) U-Boot lacks the reset and poweroff calls. But it
> looks rather straight-forward to add them, as U-Boot knows how to reset
> and one would just need to wire up psci_system_reset to this.
>
>> For H5, I would impose PSCI 0.2 as the way to reset the platform.
>
> Yes.
>
>> I am leaning towards the same for A20 given that it would just be a
>> matter of upgrading the bootloader. Most likely you would have already
>> done that to get fixes.
>
> Not sure we should push people to upgrade U-Boot in general to be able
> to use Xen, but as even current mainline U-Boot doesn't seem to support
> it, I would rather leave the current reset support code in. Last time I
> checked Linux does the same.
>
> Cheers,
> Andre.




Re: [Xen-devel] [PATCH v2 1/2] xen/arm64: Add Support for Allwinner H5 (sun50i)

2017-09-28 Thread Andre Przywara

Hi,

On 09/28/2017 01:03 PM, Julien Grall wrote:
> Hi,
>
> On 09/26/2017 10:37 AM, Awais Masood wrote:
>> This patch adds support for Allwinner H5/sun50i SoC.
>>
>> Makefile updated to enable ARM64 compilation for sunxi.c.
>>
>> sunxi.c updates include:
>>    - Addition of H5/sun50i dt compatibility string.
>>    - Handling of different Watchdog timer base addresses on sun7i
>>      and sun50i.
>>
>> Tested on Orange Pi PC2
>>
>> Signed-off-by: Awais Masood
>>
>> ---
>> Changes since v1:
>>    - Improved patch description
>> ---
>>  xen/arch/arm/platforms/Makefile |  1 +
>>  xen/arch/arm/platforms/sunxi.c  | 40 +++-
>>  2 files changed, 32 insertions(+), 9 deletions(-)
>>
>> diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
>> index 49fa683..722897a 100644
>> --- a/xen/arch/arm/platforms/Makefile
>> +++ b/xen/arch/arm/platforms/Makefile
>> @@ -6,5 +6,6 @@ obj-$(CONFIG_ARM_32) += omap5.o
>>  obj-$(CONFIG_ARM_32) += rcar2.o
>>  obj-$(CONFIG_ARM_64) += seattle.o
>>  obj-$(CONFIG_ARM_32) += sunxi.o
>> +obj-$(CONFIG_ARM_64) += sunxi.o
>
> Please use obj-y += sunxi.o as the platform is now supported by both
> Arm32 and Arm64.
>
>>  obj-$(CONFIG_ARM_64) += xgene-storm.o
>>  obj-$(CONFIG_ARM_64) += xilinx-zynqmp.o
>> diff --git a/xen/arch/arm/platforms/sunxi.c b/xen/arch/arm/platforms/sunxi.c
>> index 0ba7b3d..06d62e7 100644
>> --- a/xen/arch/arm/platforms/sunxi.c
>> +++ b/xen/arch/arm/platforms/sunxi.c
>> @@ -22,18 +22,18 @@
>>  #include
>>  /* Watchdog constants: */
>> -#define SUNXI_WDT_BASE    0x01c20c90
>> +#define SUNXI_WDT_A20_BASE    0x01c20c90
>> +#define SUNXI_WDT_H5_BASE 0x01c20cA0
>
> I know that we hardcoded this value for the A20. However, I am wondering
> if we could find this address from the Device-Tree?

Yes, both sun7i-a20.dtsi and the H5 .dts have the WDT.
Its compatible strings are sun4i-a10-wdt and sun6i-a31-wdt,
respectively. I have to check what the differences are, but I guess for
our purposes these should be small.

That seems like a call to some proper DT driven timer/WDT driver?

>>  #define SUNXI_WDT_MODE    0x04
>> -#define SUNXI_WDT_MODEADDR    (SUNXI_WDT_BASE + SUNXI_WDT_MODE)
>>  #define SUNXI_WDT_MODE_EN (1 << 0)
>>  #define SUNXI_WDT_MODE_RST_EN (1 << 1)
>> -static void sunxi_reset(void)
>> +static void sunxi_reset(u32 base)
>>  {
>>      void __iomem *wdt;
>> -    wdt = ioremap_nocache(SUNXI_WDT_MODEADDR & PAGE_MASK, PAGE_SIZE);
>> +    wdt = ioremap_nocache((base + SUNXI_WDT_MODE) & PAGE_MASK, PAGE_SIZE);
>>      if ( !wdt )
>>      {
>>          dprintk(XENLOG_ERR, "Unable to map watchdog register!\n");
>> @@ -42,19 +42,35 @@ static void sunxi_reset(void)
>>      /* Enable watchdog to trigger a reset after 500 ms: */
>>      writel(SUNXI_WDT_MODE_EN | SUNXI_WDT_MODE_RST_EN,
>> -           wdt + (SUNXI_WDT_MODEADDR & ~PAGE_MASK));
>> +           wdt + ((base + SUNXI_WDT_MODE) & ~PAGE_MASK));
>>      iounmap(wdt);
>>      for (;;)
>>          wfi();
>>  }
>>
>> -static const char * const sunxi_dt_compat[] __initconst =
>> +static void sunxi_a20_reset(void)
>> +{
>> +    sunxi_reset(SUNXI_WDT_A20_BASE);
>> +}
>> +
>> +static void sunxi_h5_reset(void)
>> +{
>> +    sunxi_reset(SUNXI_WDT_H5_BASE);
>
> If I read correctly the Device-Tree for
> (linux/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi), the firmware is
> supporting PSCI 0.2.
>
> PSCI 0.2 provides call for power-off/reset, so implementation the reset
> callback should not be necessary.

Yes, indeed, on the H5 PSCI 0.2 reset works via ATF.

> Similarly the cubietrucks we have in osstest are using PSCI 0.2 and
> should not need the reset. Andre do you know if it is the case for all
> the A20?

It claims 0.2, but in fact it seems not to be fully compliant, as (from
looking at the code) U-Boot lacks the reset and poweroff calls. But it
looks rather straight-forward to add them, as U-Boot knows how to reset
and one would just need to wire up psci_system_reset to this.

> For H5, I would impose PSCI 0.2 as the way to reset the platform.

Yes.

> I am leaning towards the same for A20 given that it would just be a
> matter of upgrading the bootloader. Most likely you would have already
> done that to get fixes.

Not sure we should push people to upgrade U-Boot in general to be able
to use Xen, but as even current mainline U-Boot doesn't seem to support
it, I would rather leave the current reset support code in. Last time I
checked Linux does the same.


Cheers,
Andre.



Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-09-22 Thread Andre Przywara
Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> The set is divided into two patches. First one calculates the size of IORT
> while second one writes the IORT table itself.

It would be good if you could give a quick introduction *why* this set
is needed here (and introduce IORT to the casual reader).
In general some more high-level documentation on your functions would be
good, as it took me quite some time to understand what each function does.

So my understanding is:
phase 1:
- go over each entry in each RC node
-   if that points to an SMMU node, go over each outgoing ITS entry and
    find overlaps with this RC entry
-     for each overlap create a new entry in a list with this RC
      pointing to the ITS directly

phase 2, creating the new IORT:
- go over each RC node
-   if that points to an ITS, copy through IORT entries
-   if that points to an SMMU, replace with the remapped entries
- go over each ITS node
-   copy through IORT entries

So I believe this would do the trick and you end up with an efficient
representation of the IORT without SMMUs - at least for RC nodes.
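
To illustrate the overlap step from phase 1, here is a minimal sketch, assuming a simplified ID mapping of (input base, count, output base). This is not the ACPI IORT layout, and struct id_map and compose_id_map() are hypothetical names that do not appear in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified mapping: IDs [input_base, input_base + count) are translated
 * to [output_base, output_base + count). Not the ACPI structure layout. */
struct id_map {
    uint32_t input_base;
    uint32_t count;
    uint32_t output_base;
};

/*
 * Compose an RC->SMMU mapping with an SMMU->ITS mapping. On overlap,
 * fill in the direct RC->ITS mapping and return true; otherwise false.
 */
static bool compose_id_map(const struct id_map *rc_to_smmu,
                           const struct id_map *smmu_to_its,
                           struct id_map *rc_to_its)
{
    uint64_t rc_out_lo = rc_to_smmu->output_base;
    uint64_t rc_out_hi = rc_out_lo + rc_to_smmu->count;
    uint64_t sm_in_lo = smmu_to_its->input_base;
    uint64_t sm_in_hi = sm_in_lo + smmu_to_its->count;
    /* Intersection of the RC output range and the SMMU input range. */
    uint64_t lo = rc_out_lo > sm_in_lo ? rc_out_lo : sm_in_lo;
    uint64_t hi = rc_out_hi < sm_in_hi ? rc_out_hi : sm_in_hi;

    if ( lo >= hi )
        return false;

    rc_to_its->input_base = rc_to_smmu->input_base + (uint32_t)(lo - rc_out_lo);
    rc_to_its->count = (uint32_t)(hi - lo);
    rc_to_its->output_base = smmu_to_its->output_base + (uint32_t)(lo - sm_in_lo);
    return true;
}
```

Each successful composition would become one entry in the list of remapped RC entries that phase 2 writes out in place of the SMMU reference.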

After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like RC nodes.
2) Eventually we will need *virtual* deviceID support, for DomUs. Now we
could start introducing that already, also doing some virtual mapping
for Dom0. The ITS code would then translate each virtual device ID that
Dom0 requests into a hardware device ID.
I agree that this means a lot more work, but we will need it anyway.
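
Just to make the idea in 2) concrete, a translation step could look roughly like the sketch below. It assumes a flat per-domain table; struct vdevid_entry and vdevid_to_host() are invented for illustration, and the real design would still have to be worked out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain entry: the device ID the guest uses and the
 * hardware device ID it stands for. */
struct vdevid_entry {
    uint32_t virt_id;
    uint32_t host_id;
};

/* Look up the hardware device ID for a virtual one; false if unmapped. */
static bool vdevid_to_host(const struct vdevid_entry *table, size_t nr,
                           uint32_t virt_id, uint32_t *host_id)
{
    for ( size_t i = 0; i < nr; i++ )
    {
        if ( table[i].virt_id == virt_id )
        {
            *host_id = table[i].host_id;
            return true;
        }
    }

    return false;
}
```

A linear scan is only meant to show the lookup; a real implementation would probably want a denser structure if a domain has many devices.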

I think 1) can be solved using this series as a base. I have quite a few
comments ready for the patches, should we follow this route.

2) obviously would change the game completely. We need to sit down and
design this properly. Probably this means that Xen parses the IORT and
builds internal representations of the mappings, which are consulted as
needed when passing through devices. The guest's (that would include
Dom0) IORT would then be generated completely from scratch.

I would like to hear your opinion on this. I will try to discuss the
feasibility of 2) with people at Connect. It would be good if we could
decide whether this is the way to go or we should use a solution based
on this series.

Cheers,
Andre.


> patch1: estimates size of hardware domain IORT table by parsing all
> the pcirc nodes and their idmaps, and thereby calculating size by
> removing smmu nodes.
> 
> Hardware domain IORT table will have only ITS and PCIRC nodes, and PCIRC
> nodes' idmap will have output references to ITS group nodes.
> 
> patch 2: The steps are:
> a. First ITS group nodes are written and their offsets are saved
> along with the respective offsets from the firmware table.
> This is required when smmu node is hidden and smmu node still points
> to the old output_reference.
> 
> b. PCIRC idmap is parsed and a list of idmaps is created which will
> have PCIRC idmap -> ITS group nodes.
> Each idmap is written by resolving ITS offset from the map saved in
> previous step.
> 
> Changes wrt v1:
> No assumption is made wrt format of IORT / hw support
> 
> Manish Jaggi (2):
>   ARM: ACPI: IORT: Estimate the size of hardware domain IORT table
>   ARM: ACPI: IORT: Write Hardware domain's IORT table
> 
>  xen/arch/arm/acpi/Makefile  |   1 +
>  xen/arch/arm/acpi/iort.c| 414 
> 
>  xen/arch/arm/domain_build.c |  49 +-
>  xen/include/asm-arm/acpi.h  |   1 +
>  xen/include/asm-arm/iort.h  |  17 ++
>  5 files changed, 481 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/arm/acpi/iort.c
>  create mode 100644 xen/include/asm-arm/iort.h
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/24] xen/arm: page: Use ARMv8 naming to improve readability

2017-09-21 Thread Andre Przywara
Hi,

On 21/09/17 16:46, Stefano Stabellini wrote:
> On Thu, 21 Sep 2017, Julien Grall wrote:
>> Hi,
>>
>> On 20/09/17 00:45, Stefano Stabellini wrote:
>>>> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
>>>> index 30fcfa0778..899fd1801a 100644
>>>> --- a/xen/include/asm-arm/page.h
>>>> +++ b/xen/include/asm-arm/page.h
>>>> @@ -26,14 +26,14 @@
>>>>   * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and MAIR1.
>>>>   *
>>>>   *                    ai    encoding
>>>> - *   MT_UNCACHED      000   0000 0000  -- Strongly Ordered
>>>> - *   MT_BUFFERABLE    001   0100 0100  -- Non-Cacheable
>>>> - *   MT_WRITETHROUGH  010   1010 1010  -- Write-through
>>>> - *   MT_WRITEBACK     011   1110 1110  -- Write-back
>>>> - *   MT_DEV_SHARED    100   0000 0100  -- Device
>>>> + *   MT_DEVICE_nGnRE  000   0000 0000  -- Strongly Ordered/Device nGnRnE
>>>
>>> I admit I always hated the "nGnRE" acronym. However, it is on the ARM
>>> ARM too, so if you'd like to introduce it here, I'll accept it. But
>>> please at least expand the acronym in the comment to make it
>>> understandable (same with nGnRnE).
>>
>> "nGnRE" acronym are not great but convey the meaning of what would be the
>> resulting attribute.

I agree they are hideous to read, but easy to break down once you get
the idea ...

> This is an honest question, no pun intended: how do they convey the
> meaning? Personally, I have to look it up every time on the ARM ARM...

ARMv8 ARM B2.7.2  Device memory

G -> Gathering (can merge multiple accesses into one transfer)
R -> Reordering
E -> Early Write acknowledgement (other agents than the endpoint
(caches) can acknowledge the transfer).

n means not.

Done. More details in the ARM ARM.
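
To make the breakdown concrete: the Device attribute bytes in MAIR_ELx have
the form 0b0000dd00, where dd selects one of the four G/R/E combinations. A
small sketch of that decoding follows; mair_device_name() is a hypothetical
helper for illustration, not something in the Xen tree.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Return the ARMv8 name of a Device memory attribute byte as used in
 * MAIR_ELx, or NULL if the byte does not encode Device memory.
 * Device encodings have the form 0b0000dd00.
 */
static const char *mair_device_name(uint8_t attr)
{
    static const char *const names[] = {
        "Device-nGnRnE",  /* dd = 00: no Gathering, no Reordering, no Early ack */
        "Device-nGnRE",   /* dd = 01: Early write acknowledgement allowed */
        "Device-nGRE",    /* dd = 10: Reordering allowed as well */
        "Device-GRE",     /* dd = 11: Gathering allowed as well */
    };

    if ( attr & 0xf3 )    /* upper nibble and bits [1:0] must be zero */
        return NULL;

    return names[(attr >> 2) & 0x3];
}
```

With this, 0x00 decodes to Device-nGnRnE (the old "Strongly Ordered") and
0x04 to Device-nGnRE, which are the two encodings discussed in this thread.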

Cheers,
Andre.

>> For instance MT_UNCACHED does not really say if it is for
>> device or memory. Lets not even mention MT_BUFFERABLE which is in fact
>> non-cacheable memory :).
>>
>>>
>>> Also, the comment say "nGnRnE" while the definition is MT_DEVICE_nGnRE.
>>
>> Actually, the comment is correct but not the naming. It should
>> MT_DEVICE_nGnRnE. I will rename it.
>>
>> Aside that, I think the comment is understandable. nGnRnE is equivalent to
>> Strongly ordered. I could expand nGnRnE (non-Gatherable, non-Reordering, No
>> Early write acknowledgment) but I feel at this stage you can just search the
>> name in the ARM ARM...
> 
> I am not asking to expand the name, only to expand nGnRnE in the
> comment on the side. Searching through that pdf is not really a fun
> activity.
> 



Re: [Xen-devel] [PATCH v3 5/5] ARM: ITS: Expose ITS in the MADT table

2017-09-07 Thread Andre Przywara
Hi,

On 05/09/17 18:15, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> Add gicv3_its_make_hwdom_madt to update hwdom MADT ITS information.
> 
> Signed-off-by: Manish Jaggi 
> ---
>  xen/arch/arm/gic-v3-its.c| 23 +++
>  xen/arch/arm/gic-v3.c|  1 +
>  xen/include/asm-arm/gic_v3_its.h |  8 
>  3 files changed, 32 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 0ab1466..bf84db8 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -1064,6 +1064,29 @@ void gicv3_its_acpi_init(void)
>  acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
>gicv3_its_acpi_probe, 0);
>  }
> +
> +unsigned long gicv3_its_make_hwdom_madt(const struct domain *d, u8 *base_ptr,
> +unsigned long offset)

How about we drop offset here and add it at the caller, then return
just the size of the ITS MADT entries? Also base_ptr could be a void* then.

> +{
> +unsigned long i;
> +struct acpi_madt_generic_translator *fw_its;

If you make this either a "void *" or a "struct acpi_subtable_header *"
then you can save the rather ugly cast in the assignment below.

> +struct acpi_madt_generic_translator *hwdom_its;
> +
> +hwdom_its = (struct acpi_madt_generic_translator *)(base_ptr
> +   + offset);

If you drop offset as mentioned above and make base_ptr a void*, you can
save the cast.

> +
> +for ( i = 0; i < vgic_v3_its_count(d); i++ )
> +{
> +fw_its = (struct acpi_madt_generic_translator *)
> +acpi_table_get_entry_madt(
> +ACPI_MADT_TYPE_GENERIC_TRANSLATOR, i);
> +memcpy(hwdom_its, fw_its, sizeof(struct 
> acpi_madt_generic_translator));
> +hwdom_its++;
> +}
> +
> +return (offset + sizeof(struct acpi_madt_generic_translator)
> +   * vgic_v3_its_count(d));
> +}
>  #endif
>  
>  /*
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 3eb67f2..0392795 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1403,6 +1403,7 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  table_len += size;
>  }
>  
> +table_len = gicv3_its_make_hwdom_madt(d, base_ptr, table_len);

... and here you could mimic the other calls then:
table_len += gicv3_its_make_hwdom_madt(d, base_ptr + table_len);

(or directly return).
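
Putting the suggestions together, a rough sketch of the shape I have in
mind. It uses a stand-in struct its_entry instead of struct
acpi_madt_generic_translator and passes the firmware entries in directly,
so the names, the layout and the signature are assumptions for
illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for struct acpi_madt_generic_translator; layout is illustrative. */
struct its_entry {
    unsigned char type;
    unsigned char length;
    unsigned short reserved;
    unsigned int translation_id;
    unsigned long long base_address;
    unsigned int reserved2;
};

/*
 * Copy nr firmware ITS entries to dst and return the number of bytes
 * written, so the caller can simply do "table_len += ...".
 */
static size_t its_make_hwdom_madt(void *dst, const struct its_entry *fw,
                                  unsigned int nr)
{
    struct its_entry *hwdom_its = dst;   /* void * converts without a cast */
    unsigned int i;

    for ( i = 0; i < nr; i++ )
        memcpy(&hwdom_its[i], &fw[i], sizeof(*hwdom_its));

    return nr * sizeof(*hwdom_its);
}
```

The caller keeps ownership of the running offset, which matches how the
other MADT subtables are appended.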

Cheers,
Andre.


>  return table_len;
>  }
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h 
> b/xen/include/asm-arm/gic_v3_its.h
> index 9cf18da..ae8a494 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -137,6 +137,8 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  
>  #ifdef CONFIG_ACPI
>  void gicv3_its_acpi_init(void);
> +unsigned long gicv3_its_make_hwdom_madt(const struct domain *d, u8 *base_ptr,
> +unsigned long offset);
>  #endif
>  
>  /* Deny iomem access for its */
> @@ -207,6 +209,12 @@ static inline void gicv3_its_dt_init(const struct 
> dt_device_node *node)
>  static inline void gicv3_its_acpi_init(void)
>  {
>  }
> +
> +unsigned long gicv3_its_make_hwdom_madt(struct domain *d, u8 *base_ptr,
> +unsigned long offset)
> +{
> +return 0;
> +}
>  #endif
>  
>  static inline int gicv3_its_deny_access(const struct domain *d)
> 



Re: [Xen-devel] [PATCH v3 3/5] ARM: ITS: Deny hardware domain access to ITS

2017-09-07 Thread Andre Przywara
Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> This patch extends the gicv3_iomem_deny_access functionality by adding
> support for ITS region as well. Add function gicv3_its_deny_access.
> 
> Signed-off-by: Manish Jaggi 
> ---
>  xen/arch/arm/gic-v3-its.c| 22 ++
>  xen/arch/arm/gic-v3.c|  3 +++
>  xen/include/asm-arm/gic_v3_its.h |  9 +
>  3 files changed, 34 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 536b48d..0ab1466 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,6 +20,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -906,6 +907,27 @@ struct pending_irq *gicv3_assign_guest_event(struct 
> domain *d,
>  return pirq;
>  }
>  
> +int gicv3_its_deny_access(const struct domain *d)
> +{
> +int rc = 0;
> +unsigned long mfn, nr;
> +const struct host_its *its_data;
> +
> +list_for_each_entry( its_data, &host_its_list, entry )
> +{
> +mfn = paddr_to_pfn(its_data->addr);
> +nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);

Shouldn't this not only cover the ITS register frame, but also the
following 64K page containing the doorbell address? Otherwise we leave
the doorbell address open, which seems to be asking for trouble ...
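
For illustration, the range I would expect to be denied spans two 64K
frames: the control register frame and the translation frame holding the
doorbell. A sketch of the arithmetic, assuming 4K pages; the SKETCH_*
macros and helper names are made up for this example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12              /* assumes 4K pages */
#define SKETCH_ITS_FRAME    0x10000ULL      /* one 64K ITS frame */

/* Number of 4K pages spanning the control frame plus the translation frame. */
static unsigned long its_deny_nr_pages(void)
{
    uint64_t size = 2 * SKETCH_ITS_FRAME;

    return (unsigned long)((size + (1ULL << SKETCH_PAGE_SHIFT) - 1)
                           >> SKETCH_PAGE_SHIFT);
}

/* First page frame number of an ITS located at physical address addr. */
static unsigned long its_first_pfn(uint64_t addr)
{
    return (unsigned long)(addr >> SKETCH_PAGE_SHIFT);
}
```

So instead of 16 pages per ITS, the deny range would cover 32.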

Cheers,
Andre.

> +rc = iomem_deny_access(d, mfn, mfn + nr);
> +if ( rc )
> +{
> +printk( "iomem_deny_access failed for %lx:%lx \r\n", mfn, nr);
> +break;
> +}
> +}
> +
> +return rc;
> +}
> +
>  /*
>   * Create the respective guest DT nodes from a list of host ITSes.
>   * This copies the reg property, so the guest sees the ITS at the same 
> address
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 6f562f4..b3d605d 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1308,6 +1308,9 @@ static int gicv3_iomem_deny_access(const struct domain 
> *d)
>  if ( rc )
>  return rc;
>  
> +if ( gicv3_its_deny_access(d) )
> +return rc;
> +
>  for ( i = 0; i < gicv3.rdist_count; i++ )
>  {
>  mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
> diff --git a/xen/include/asm-arm/gic_v3_its.h 
> b/xen/include/asm-arm/gic_v3_its.h
> index 993819a..9cf18da 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -138,6 +138,10 @@ void gicv3_its_dt_init(const struct dt_device_node 
> *node);
>  #ifdef CONFIG_ACPI
>  void gicv3_its_acpi_init(void);
>  #endif
> +
> +/* Deny iomem access for its */
> +int gicv3_its_deny_access(const struct domain *d);
> +
>  bool gicv3_its_host_has_its(void);
>  
>  unsigned int vgic_v3_its_count(const struct domain *d);
> @@ -205,6 +209,11 @@ static inline void gicv3_its_acpi_init(void)
>  }
>  #endif
>  
> +static inline int gicv3_its_deny_access(const struct domain *d)
> +{
> +return 0;
> +}
> +
>  static inline bool gicv3_its_host_has_its(void)
>  {
>  return false;
> 



Re: [Xen-devel] [PATCH v3 4/5] ARM: Introduce get_hwdom_madt_size in gic_hw_operations

2017-09-07 Thread Andre Przywara
Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> estimate_acpi_efi_size needs to be updated to provide correct size of
> hardware domains MADT, which now adds ITS information as well.
> 
> Introducing gic_get_hwdom_madt_size.
> 
> Signed-off-by: Manish Jaggi 
> ---
>  xen/arch/arm/domain_build.c |  7 +--
>  xen/arch/arm/gic-v2.c   |  6 ++
>  xen/arch/arm/gic-v3.c   | 18 ++
>  xen/arch/arm/gic.c  | 11 +++
>  xen/include/asm-arm/gic.h   |  3 +++
>  5 files changed, 39 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 1bec4fa..5739ea4 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1806,12 +1806,7 @@ static int estimate_acpi_efi_size(struct domain *d, 
> struct kernel_info *kinfo)
>  acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
>  acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
>  
> -madt_size = sizeof(struct acpi_table_madt)
> -+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> -+ sizeof(struct acpi_madt_generic_distributor);
> -if ( d->arch.vgic.version == GIC_V3 )
> -madt_size += sizeof(struct acpi_madt_generic_redistributor)
> - * d->arch.vgic.nr_regions;
> +madt_size = gic_get_hwdom_madt_size(d);
>  acpi_size += ROUNDUP(madt_size, 8);
>  
>  addr = acpi_os_get_root_pointer();
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index cbe71a9..737c50a 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct domain 
> *d)
>  return iomem_deny_access(d, mfn, mfn + nr);
>  }
>  
> +static unsigned long gicv2_get_hwdom_madt_size(const struct domain *d)
> +{
> +return 0;
> +}

Nothing too critical, but this looks a bit confusing, as the size of the
GIC part of the MADT isn't 0 even for GICv2. So either you rename it to
something containing "additional" or the like or you do what it says on
the tin and return the per-VCPU size and the size for the distributor
here (at the cost of copying this to the GICv3 code).
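
A sketch of the second option, with the per-VCPU and distributor sizes
reported by the GICv2 callback itself. The SKETCH_* sizes are stand-ins
for sizeof() on the ACPICA structures and the function names are
hypothetical; a shared helper is one way to avoid duplicating the common
part in the GICv3 code.

```c
#include <stddef.h>

/* Stand-in subtable sizes; real code would use sizeof() on the ACPI structs. */
#define SKETCH_GICC_SIZE  80u   /* per-VCPU GIC CPU interface entry */
#define SKETCH_GICD_SIZE  24u   /* GIC distributor entry */
#define SKETCH_GICR_SIZE  16u   /* redistributor region entry (GICv3) */
#define SKETCH_ITS_SIZE   20u   /* ITS entry (GICv3) */

/* GICv2: one CPU interface entry per VCPU plus one distributor entry. */
static size_t gicv2_madt_size(unsigned int max_vcpus)
{
    return (size_t)max_vcpus * SKETCH_GICC_SIZE + SKETCH_GICD_SIZE;
}

/* GICv3: the same common part plus redistributor regions and ITS entries. */
static size_t gicv3_madt_size(unsigned int max_vcpus,
                              unsigned int nr_rdist_regions,
                              unsigned int nr_its)
{
    return gicv2_madt_size(max_vcpus)
           + (size_t)nr_rdist_regions * SKETCH_GICR_SIZE
           + (size_t)nr_its * SKETCH_ITS_SIZE;
}
```

The generic wrapper would then only add the size of the MADT header on top
of what the callback returns.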

Cheers,
Andre.

> +
>  #ifdef CONFIG_ACPI
>  static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
>  {
> @@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
>  .read_apr= gicv2_read_apr,
>  .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv2_make_hwdom_madt,
> +.get_hwdom_madt_size = gicv2_get_hwdom_madt_size,
>  .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>  .iomem_deny_access   = gicv2_iomem_deny_access,
>  .do_LPI  = gicv2_do_LPI,
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b3d605d..3eb67f2 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1406,6 +1406,18 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  return table_len;
>  }
>  
> +static unsigned long gicv3_get_hwdom_madt_size(const struct domain *d)
> +{
> +unsigned long size;
> +size  = sizeof(struct acpi_madt_generic_redistributor)
> +* d->arch.vgic.nr_regions;
> +
> +size  += vgic_v3_its_count(d)
> +* sizeof(struct acpi_madt_generic_translator);
> +
> +return size;
> +}
> +
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>  const unsigned long end)
> @@ -1597,6 +1609,11 @@ static int gicv3_make_hwdom_madt(const struct domain 
> *d, u32 offset)
>  {
>  return 0;
>  }
> +
> +static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
> +{
> +return 0;
> +}
>  #endif
>  
>  /* Set up the GIC */
> @@ -1698,6 +1715,7 @@ static const struct gic_hw_operations gicv3_ops = {
>  .secondary_init  = gicv3_secondary_cpu_init,
>  .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>  .make_hwdom_madt = gicv3_make_hwdom_madt,
> +.get_hwdom_madt_size = gicv3_get_hwdom_madt_size,
>  .iomem_deny_access   = gicv3_iomem_deny_access,
>  .do_LPI  = gicv3_do_LPI,
>  };
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 6c803bf..9ffd33a 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -851,6 +851,17 @@ int gic_make_hwdom_madt(const struct domain *d, u32 
> offset)
>  return gic_hw_ops->make_hwdom_madt(d, offset);
>  }
>  
> +unsigned long gic_get_hwdom_madt_size(const struct domain *d)
> +{
> +unsigned long madt_size;
> +madt_size = sizeof(struct acpi_table_madt)
> ++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> ++ sizeof(struct acpi_madt_generic_distributor)
> ++ gic_hw_ops->get_hwdom_madt_size(d);
> +
> +return madt_size;
> +}
> +
>  int 

Re: [Xen-devel] [PATCH v3 2/5] ARM: ITS: Populate host_its_list from ACPI MADT Table

2017-09-07 Thread Andre Przywara
Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi 
> 
> Added gicv3_its_acpi_init to update host_its_list from MADT table.
> For ACPI, host_its structure  stores dt_node as NULL.
> 
> Signed-off-by: Manish Jaggi 
> ---
>  xen/arch/arm/gic-v3-its.c| 26 ++
>  xen/arch/arm/gic-v3.c|  2 ++
>  xen/include/asm-arm/gic_v3_its.h |  9 +
>  3 files changed, 37 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 61a6452..536b48d 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -32,6 +33,7 @@
>  #include 
>  
>  #define ITS_CMD_QUEUE_SZSZ_1M
> +#define ACPI_GICV3_ITS_MEM_SIZE SZ_64K

Although this is used for ACPI only, this size is really the architected
size for the ITS register frame and thus should be named like this,
possibly GUEST_GICV3_ITS_SIZE or so (in xen/include/public/arch-arm.h).
Which actually makes me wonder why we would need to store this size in
the data structure in the first place ...

>  /*
>   * No lock here, as this list gets only populated upon boot while scanning
> @@ -1018,6 +1020,30 @@ void gicv3_its_dt_init(const struct dt_device_node 
> *node)
>  }
>  }
>  
> +#ifdef CONFIG_ACPI
> +int gicv3_its_acpi_probe(struct acpi_subtable_header *header,
> +const unsigned long end)

  w/s?
> +{
> +struct acpi_madt_generic_translator *its;
> +
> +its = (struct acpi_madt_generic_translator *)header;
> +if ( BAD_MADT_ENTRY(its, end) )
> +return -EINVAL;
> +
> +add_to_host_its_list(its->base_address,
> +ACPI_GICV3_ITS_MEM_SIZE, NULL);

  w/s?

> +
> +return 0;
> +}
> +
> +void gicv3_its_acpi_init(void)
> +{
> +/* Parse ITS information */
> +acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
> +  gicv3_its_acpi_probe, 0);

w/s?

Cheers,
Andre.

> +}
> +#endif
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index f990eae..6f562f4 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1567,6 +1567,8 @@ static void __init gicv3_acpi_init(void)
>  
>  gicv3.rdist_stride = 0;
>  
> +gicv3_its_acpi_init();
> +
>  /*
>   * In ACPI, 0 is considered as the invalid address. However the rest
>   * of the initialization rely on the invalid address to be
> diff --git a/xen/include/asm-arm/gic_v3_its.h 
> b/xen/include/asm-arm/gic_v3_its.h
> index 1fac1c7..993819a 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -135,6 +135,9 @@ extern struct list_head host_its_list;
>  /* Parse the host DT and pick up all host ITSes. */
>  void gicv3_its_dt_init(const struct dt_device_node *node);
>  
> +#ifdef CONFIG_ACPI
> +void gicv3_its_acpi_init(void);
> +#endif
>  bool gicv3_its_host_has_its(void);
>  
>  unsigned int vgic_v3_its_count(const struct domain *d);
> @@ -196,6 +199,12 @@ static inline void gicv3_its_dt_init(const struct 
> dt_device_node *node)
>  {
>  }
>  
> +#ifdef CONFIG_ACPI
> +static inline void gicv3_its_acpi_init(void)
> +{
> +}
> +#endif
> +
>  static inline bool gicv3_its_host_has_its(void)
>  {
>  return false;
> 



Re: [Xen-devel] [PATCH v3 1/5] ARM: ITS: Introduce common function add_to_host_its_list

2017-09-07 Thread Andre Przywara
Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:
> From: Manish Jaggi <mja...@cavium.com>
> 
> add_to_host_its_list will update the host_its_list. This common
> function to be invoked from gicv3_its_dt_init and gic_v3_its_acpi_probe.
> 
> Signed-off-by: Manish Jaggi <mja...@cavium.com>

Makes sense.

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/gic-v3-its.c | 32 
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 2d36030..61a6452 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -976,11 +976,29 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain 
> *d,
>  return res;
>  }
>  
> +/* Common function for adding to host_its_list */
> +static void add_to_host_its_list(paddr_t addr, paddr_t size,
> + const struct dt_device_node *node)
> +{
> +struct host_its *its_data;
> +
> +its_data = xzalloc(struct host_its);
> +if ( !its_data )
> +panic("GICv3: Cannot allocate memory for ITS frame");
> +
> +its_data->addr = addr;
> +its_data->size = size;
> +its_data->dt_node = node;
> +
> +printk("GICv3: Found ITS @0x%lx\n", addr);
> +
> +list_add_tail(&its_data->entry, &host_its_list);
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. 
> */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>  const struct dt_device_node *its = NULL;
> -struct host_its *its_data;
>  
>  /*
>   * Check for ITS MSI subnodes. If any, add the ITS register
> @@ -996,17 +1014,7 @@ void gicv3_its_dt_init(const struct dt_device_node 
> *node)
>  if ( dt_device_get_address(its, 0, &addr, &size) )
>  panic("GICv3: Cannot find a valid ITS frame address");
>  
> -its_data = xzalloc(struct host_its);
> -if ( !its_data )
> -panic("GICv3: Cannot allocate memory for ITS frame");
> -
> -its_data->addr = addr;
> -its_data->size = size;
> -its_data->dt_node = its;
> -
> -printk("GICv3: Found ITS @0x%lx\n", addr);
> -
> -list_add_tail(&its_data->entry, &host_its_list);
> +add_to_host_its_list(addr, size, its);
>  }
>  }
>  
> 



Re: [Xen-devel] [PATCH 26/27 v8] xen/arm: vpl011: Correct the logic for asserting/de-asserting SBSA UART TX interrupt

2017-09-07 Thread Andre Przywara
Hi,

On 28/08/17 09:56, Bhupinder Thakur wrote:
> This patch fixes the issue observed when pl011 patches were tested on
> the Juno hardware by Andre/Julien. It was observed that when large output is
> generated such as on running 'find /', output was getting truncated 
> intermittently
> due to OUT ring buffer getting full.
> 
> This issue was due to the fact that the SBSA UART driver expects that when
> a TX interrupt is asserted then the FIFO queue should be atleast half empty 
> and
> that it can write N bytes in the FIFO, where N is half the FIFO queue size, 
> without
> the bytes getting dropped due to FIFO getting full.
> 
> This requirement is as per section 3.4.2 of [1], which is:
> 
> ---
> UARTTXINTR
> 
> If the FIFOs are enabled and the transmit FIFO reaches the programmed
> trigger level. When this happens, the transmit interrupt is asserted HIGH. The
> transmit interrupt is cleared by writing data to the transmit FIFO until it
> becomes greater than the trigger level, or by clearing the interrupt.
> ---
> 
> The SBSA UART fifo size is 32 bytes and so it expects that space for 16 bytes
> should be available when TX interrupt is asserted.
> 
> The pl011 emulation logic was asserting the TX interrupt as soon as
> any space became available in the FIFO and the SBSA UART driver tried to write
> more data (upto 16 bytes) in the FIFO expecting that there is enough space
> available.
> 
> The fix was to ensure that the TX interrupt is raised only when there
> is space available for 16 bytes or more in the FIFO.
> 
> [1] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf
> 
> Signed-off-by: Bhupinder Thakur <bhupinder.tha...@linaro.org>
> ---
> CC: Julien Grall <julien.gr...@arm.com>
> CC: Andre Przywara <andre.przyw...@arm.com>
> CC: Stefano Stabellini <sstabell...@kernel.org>
> 
>  xen/arch/arm/vpl011.c | 29 +++--
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 56d9cbe..1e72fca 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -152,12 +152,20 @@ static void vpl011_write_data(struct domain *d, uint8_t 
> data)
>  else
>  gprintk(XENLOG_ERR, "vpl011: Unexpected OUT ring buffer full\n");
>  
> -if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) ==
> - sizeof (intf->out) )
> -{
> -vpl011->uartfr |= TXFF;
> +/*
> + * Ensure that there is space for atleast 16 bytes before asserting the
> + * TXI interrupt status bit because the SBSA UART driver may write
> + * 16 bytes (i.e. half the SBSA UART fifo size of 32) on getting
> + * a TX interrupt.
> + */
> +if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) <=
> + (sizeof (intf->out) - 16) )
> +vpl011->uartris |= TXI;
> +else if ( xencons_queued(out_prod, out_cons, sizeof(intf->out)) !=
> +  sizeof (intf->out) )

Now this is really hard to read. Can't you use:

fifo_level = xencons_queued(out_prod, out_cons, sizeof(intf->out));

Also I think you could start the patch a few lines above, where you
check for any free space in the buffer.

>  vpl011->uartris &= ~TXI;
> -}
> +else
> +vpl011->uartfr |= TXFF;

And I believe we should separate the FIFO full condition from the
interrupt condition. I think it should look more like:

vpl011->uartfr |= BUSY;
vpl011->uartfr &= ~TXFE;

if ( fifo_level == sizeof(intf->out) )
vpl011->uartfr |= TXFF;

if ( fifo_level >= sizeof(intf->out) - 16 )
vpl011->uartris &= ~TXI;

Which is much easier to read and understand, and also follows the spec
closely. The "16" should be either expressed as FIFOSIZE / 2 or
explained in a comment.
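
As a self-contained illustration of that structure, here is a small model
of the flag handling. The struct, the function name and SBSA_FIFO_SIZE are
assumptions for this sketch, and the bit positions are stand-ins following
the PL011 register layout. It drops TXI only once fewer than half a FIFO of
space is free, which matches the "<=" test in the patch.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in bit definitions, following the PL011 register layout. */
#define TXFE  (1u << 7)   /* UARTFR: transmit FIFO empty */
#define TXFF  (1u << 5)   /* UARTFR: transmit FIFO full */
#define BUSY  (1u << 3)   /* UARTFR: UART busy */
#define TXI   (1u << 5)   /* UARTRIS: transmit interrupt */

#define SBSA_FIFO_SIZE  32u   /* the driver may write half of this per IRQ */

/* Minimal model of the two emulated registers touched here. */
struct vuart_model {
    uint32_t uartfr;
    uint32_t uartris;
};

/*
 * Update the flags after a byte was queued. fifo_level is the number of
 * bytes currently queued in an OUT ring of ring_size bytes.
 */
static void vuart_after_write(struct vuart_model *u, size_t fifo_level,
                              size_t ring_size)
{
    u->uartfr |= BUSY;
    u->uartfr &= ~TXFE;

    /* FIFO full condition, independent of the interrupt state. */
    if ( fifo_level == ring_size )
        u->uartfr |= TXFF;

    /* Interrupt condition: less than half a FIFO of free space left. */
    if ( ring_size - fifo_level < SBSA_FIFO_SIZE / 2 )
        u->uartris &= ~TXI;
}
```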

>  
>  vpl011->uartfr |= BUSY;
>  
> @@ -368,7 +376,16 @@ static void vpl011_data_avail(struct domain *d)
>  if ( out_ring_qsize != sizeof(intf->out) )
>  {
>  vpl011->uartfr &= ~TXFF;
> -vpl011->uartris |= TXI;
> +
> +/*
> + * Ensure that there is space for atleast 16 bytes before asserting 
> the
> + * TXI interrupt status bit because the SBSA UART driver may write 
> upto
> + * 16 bytes (i.e. half the SBSA UART fifo size of 32) on getting
> + * a TX interrupt.

The comment sounds a bit like this is a hack, where it actually is a
totally legit spec requirement (the interrupt 

Re: [Xen-devel] [PATCH 04/27 v8] xen/arm: vpl011: Add support for vuart in libxl

2017-09-07 Thread Andre Przywara
Hi,

On 28/08/17 09:55, Bhupinder Thakur wrote:
> An option is provided in libxl to enable/disable SBSA vuart while
> creating a guest domain.
> 
> Libxl now supports a generic vuart console and SBSA uart is a specific type.
> In future support can be added for multiple vuart of different types.
> 
> User can enable SBSA vuart by adding the following line in the guest
> configuration file:
> 
> vuart = "sbsa_uart"
> 
> Signed-off-by: Bhupinder Thakur 
> Acked-by: Stefano Stabellini 
> Acked-by: Wei Liu 
> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Stefano Stabellini 
> CC: Julien Grall 
> 
> Changes since v4:
> - Renamed "pl011" to "sbsa_uart".
> 
> Changes since v3:
> - Added a new config option CONFIG_VUART_CONSOLE to enable/disable vuart 
> console
>   support.
> - Moved libxl_vuart_type to arch-arm part of libxl_domain_build_info
> - Updated xl command help to mention new console type - vuart.
> 
> Changes since v2:
> - Defined vuart option as an enum instead of a string.
> - Removed the domain creation flag defined for vuart and the related code
>   to pass on the information while domain creation. Now vpl011 is initialized
>   independent of domain creation through new DOMCTL APIs.
> 
>  tools/libxl/libxl.h  | 6 ++
>  tools/libxl/libxl_console.c  | 3 +++
>  tools/libxl/libxl_dom.c  | 1 +
>  tools/libxl/libxl_internal.h | 3 +++
>  tools/libxl/libxl_types.idl  | 7 +++
>  tools/xl/xl_cmdtable.c   | 2 +-
>  tools/xl/xl_console.c| 5 -
>  tools/xl/xl_parse.c  | 8 
>  8 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 229e289..8ce920a 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -306,6 +306,12 @@
>  #define LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE 1
>  
>  /*
> + * LIBXL_HAVE_BUILDINFO_ARM_VUART indicates that the toolstack supports 
> virtual UART
> + * for ARM.
> + */
> +#define LIBXL_HAVE_BUILDINFO_ARM_VUART 1
> +
> +/*

This requires some trivial fixup now if applied against origin/master
(or staging).

Cheers,
Andre.

>   * libxl ABI compatibility
>   *
>   * The only guarantee which libxl makes regarding ABI compatibility
> diff --git a/tools/libxl/libxl_console.c b/tools/libxl/libxl_console.c
> index 446e766..853be15 100644
> --- a/tools/libxl/libxl_console.c
> +++ b/tools/libxl/libxl_console.c
> @@ -67,6 +67,9 @@ int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int 
> cons_num,
>  case LIBXL_CONSOLE_TYPE_SERIAL:
>  cons_type_s = "serial";
>  break;
> +case LIBXL_CONSOLE_TYPE_VUART:
> +cons_type_s = "vuart";
> +break;
>  default:
>  goto out;
>  }
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index f54fd49..e0f0d78 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -803,6 +803,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
>  if (xc_dom_translated(dom)) {
>  state->console_mfn = dom->console_pfn;
>  state->store_mfn = dom->xenstore_pfn;
> +state->vuart_gfn = dom->vuart_gfn;
>  } else {
>  state->console_mfn = xc_dom_p2m(dom, dom->console_pfn);
>  state->store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 7247509..6b38453 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1139,6 +1139,9 @@ typedef struct {
>  uint32_t num_vmemranges;
>  
>  xc_domain_configuration_t config;
> +
> +xen_pfn_t vuart_gfn;
> +evtchn_port_t vuart_port;
>  } libxl__domain_build_state;
>  
>  _hidden int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 6e80d36..9959efb 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -105,6 +105,7 @@ libxl_console_type = Enumeration("console_type", [
>  (0, "UNKNOWN"),
>  (1, "SERIAL"),
>  (2, "PV"),
> +(3, "VUART"),
>  ])
>  
>  libxl_disk_format = Enumeration("disk_format", [
> @@ -240,6 +241,11 @@ libxl_checkpointed_stream = 
> Enumeration("checkpointed_stream", [
>  (2, "COLO"),
>  ])
>  
> +libxl_vuart_type = Enumeration("vuart_type", [
> +(0, "unknown"),
> +(1, "sbsa_uart"),
> +])
> +
>  #
>  # Complex libxl types
>  #
> @@ -581,6 +587,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>  
>  
>  ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
> +   ("vuart", libxl_vuart_type),
>])),
>  # Alternate p2m is not bound to any architecture or guest type, as it is
>  # supported by x86 HVM and ARM support is planned.
> diff --git 

Re: [Xen-devel] [PATCH 00/27] xen/arm: Memory subsystem clean-up

2017-08-23 Thread Andre Przywara
Hi Julien,

On 14/08/17 15:23, Julien Grall wrote:
> Hi all,
> 
> This patch series contains clean-up for the ARM Memory subsystem in
> preparation of reworking the page tables handling.

thanks for the work!
I am done with the review, the series looks fine in general to me.
Whenever verifiable bits were changed, I tried to check against
the spec and couldn't spot any issues.
The smaller comments I had were more about clarity or documentation and
should be easy to fix.

> A branch with the patches can be found on xenbits:
> 
> https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
> branch mm-cleanup-v1

I also compile-tested every patch for ARM and arm64, no warnings.

Thanks,
Andre.

> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Jan Beulich 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Tim Deegan 
> Cc: Wei Liu 
> Cc: Ross Lagerwall 
> 
> 
> Julien Grall (27):
>   xen/x86: numa: Don't check alloc_boot_pages return
>   xen/x86: srat: Don't check alloc_boot_pages return
>   xen/x86: mm: Don't check alloc_boot_pages return
>   xen/mm: Move {G,M]FN <-> {G,M}ADDR helpers to common code
>   xen/mm: Use typesafe MFN for alloc_boot_pages return
>   xen/mm: Use __virt_to_mfn in map_domain_page instead of virt_to_mfn
>   xen/arm: mm: Redefine mfn_to_virt to use typesafe
>   xen/arm: hsr_iabt: Document RES0 field
>   xen/arm: traps: Don't define FAR_EL2 for ARM32
>   xen/arm: arm32: Don't define FAR_EL1
>   xen/arm: Add FnV field in hsr_*abt
>   xen/arm: Introduce hsr_xabt to gather common bits between hsr_dabt and
>   xen/arm: traps: Introduce a helper to read the hypersivor fault
> register
>   xen/arm: traps: Improve logging for data/prefetch abort fault
>   xen/arm: Replace ioremap_attr(PAGE_HYPERVISOR_NOCACHE) call by
> ioremap_nocache
>   xen/arm: page: Remove unused attributes DEV_NONSHARED and DEV_CACHED
>   xen/arm: page: Use directly BUFFERABLE and drop DEV_WC
>   xen/arm: page: Prefix memory types with MT_
>   xen/arm: page: Clean-up the definition of MAIRVAL
>   xen/arm: page: Use ARMv8 naming to improve readability
>   xen/arm: mm: Rename and clarify AP[1] in the stage-1 page table
>   xen/arm: Switch to SYS_STATE_boot just after end_boot_allocator()
>   xen/arm: mm: Rename 'ai' into 'flags' in create_xen_entries
>   xen/arm: page: Describe the layout of flags used to update page tables
>   xen/arm: mm: Embed permission in the flags
>   xen/arm: mm: Handling permission flags when adding a new mapping
>   xen/arm: mm: Use memory flags for modify_xen_mappings rather than
> custom one
> 
>  xen/arch/arm/kernel.c |   2 +-
>  xen/arch/arm/livepatch.c  |   6 +--
>  xen/arch/arm/mm.c |  79 +-
>  xen/arch/arm/platforms/exynos5.c  |   2 +-
>  xen/arch/arm/platforms/omap5.c|   6 +--
>  xen/arch/arm/platforms/vexpress.c |   2 +-
>  xen/arch/arm/setup.c  |  12 +++--
>  xen/arch/arm/traps.c  |  52 +---
>  xen/arch/x86/mm.c |   8 +--
>  xen/arch/x86/numa.c   |  10 +---
>  xen/arch/x86/srat.c   |   7 +--
>  xen/common/page_alloc.c   |   7 ++-
>  xen/drivers/acpi/osl.c|   2 +-
>  xen/drivers/video/arm_hdlcd.c |   2 +-
>  xen/include/asm-arm/cpregs.h  |   2 -
>  xen/include/asm-arm/lpae.h|   2 +-
>  xen/include/asm-arm/mm.h  |   7 +--
>  xen/include/asm-arm/page.h| 100 
> ++
>  xen/include/asm-arm/processor.h   |  25 --
>  xen/include/xen/domain_page.h |   2 +-
>  xen/include/xen/mm.h  |   9 +++-
>  21 files changed, 204 insertions(+), 140 deletions(-)
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 25/27] xen/arm: mm: Embed permission in the flags

2017-08-23 Thread Andre Przywara
Hi,

On 23/08/17 15:26, Julien Grall wrote:
> On 08/23/2017 03:08 PM, Andre Przywara wrote:
>> Hi,
> 
> Hi,
> 
>> On 14/08/17 15:24, Julien Grall wrote:
>>> Currently, it is not possible to specify the permission of a new
>>> mapping. It would be necessary to use the function modify_xen_mappings
>>> with a different set of flags.
>>>

Just saw that I forgot the typos here:

>>> Add introduce a couple of new flags for the permissions (Non-eXecutable,

Either "add" or "introduce", I guess.

>>> Read-Only) and also provides define that combine the memory attribute
>>> and permission for common combination.

Somehow the plural/singular is messed up here, I needed to read that
sentence multiple times.

>>
>> If I haven't been lost in the definitions, this now adds "not
>> executable" to the existing definitions, which seems to make sense, but
>> is a change that might trigger regressions (especially for
>> PAGE_HYPERVISOR). So I wonder if that should be mentioned in the commit
>> message then?
> 
> It will not trigger regression because mfn_to_xen_entry is setting xn to
> 1 by default. So all the mapping will be execute never when using
> PAGE_HYPERVISOR.

Ah right, I missed that. It might still be worth mentioning in the commit
message, as this isn't obvious from just that patch.

Cheers,
Andre.


Re: [Xen-devel] [PATCH 27/27] xen/arm: mm: Use memory flags for modify_xen_mappings rather than custom one

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> This will help to consolidate the page-table code and avoid different
> path depending on the action to perform.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

It could be worth mentioning in the commit message that the removed
ASSERT is now already taken care of in create_xen_entries() (which is
also another hint to make that an ASSERT, actually).

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> 
> ---
> 
> Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
> Cc: Ross Lagerwall <ross.lagerw...@citrix.com>
> 
> arch_livepatch_secure is now the same as on x86. It might be
> possible to combine both, but I left that alone for now.
> ---
>  xen/arch/arm/livepatch.c   |  6 +++---
>  xen/arch/arm/mm.c  |  5 ++---
>  xen/include/asm-arm/page.h | 11 ---
>  3 files changed, 5 insertions(+), 17 deletions(-)
> 
> diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
> index 3e53524365..279d52cc6c 100644
> --- a/xen/arch/arm/livepatch.c
> +++ b/xen/arch/arm/livepatch.c
> @@ -146,15 +146,15 @@ int arch_livepatch_secure(const void *va, unsigned int 
> pages, enum va_type type)
>  switch ( type )
>  {
>  case LIVEPATCH_VA_RX:
> -flags = PTE_RO; /* R set, NX clear */
> +flags = PAGE_HYPERVISOR_RX;
>  break;
>  
>  case LIVEPATCH_VA_RW:
> -flags = PTE_NX; /* R clear, NX set */
> +flags = PAGE_HYPERVISOR_RW;
>  break;
>  
>  case LIVEPATCH_VA_RO:
> -flags = PTE_NX | PTE_RO; /* R set, NX set */
> +flags = PAGE_HYPERVISOR_RO;
>  break;
>  
>  default:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index fe0646002e..c2fd4baef9 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1046,8 +1046,8 @@ static int create_xen_entries(enum xenmap_operation op,
>  else
>  {
>  pte = *entry;
> -pte.pt.ro = PTE_RO_MASK(flags);
> -pte.pt.xn = PTE_NX_MASK(flags);
> +pte.pt.ro = PAGE_RO_MASK(flags);
> +pte.pt.xn = PAGE_XN_MASK(flags);
>  if ( !pte.pt.ro && !pte.pt.xn )
>  {
>  printk("%s: Incorrect combination for addr=%lx\n",
> @@ -1090,7 +1090,6 @@ int destroy_xen_mappings(unsigned long v, unsigned long 
> e)
>  
>  int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int flags)
>  {
> -ASSERT((flags & (PTE_NX | PTE_RO)) == flags);
>  return create_xen_entries(MODIFY, s, INVALID_MFN, (e - s) >> PAGE_SHIFT,
>flags);
>  }
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 047220f86b..079097d429 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -91,17 +91,6 @@
>  #define PAGE_HYPERVISOR_WC  (_PAGE_DEVICE|MT_NORMAL_NC)
>  
>  /*
> - * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to 
> be
> - * used with modify_xen_mappings.
> - */
> -#define _PTE_NX_BIT 0U
> -#define _PTE_RO_BIT 1U
> -#define PTE_NX  (1U << _PTE_NX_BIT)
> -#define PTE_RO  (1U << _PTE_RO_BIT)
> -#define PTE_NX_MASK(x)  (((x) >> _PTE_NX_BIT) & 0x1U)
> -#define PTE_RO_MASK(x)  (((x) >> _PTE_RO_BIT) & 0x1U)
> -
> -/*
>   * Stage 2 Memory Type.
>   *
>   * These are valid in the MemAttr[3:0] field of an LPAE stage 2 page
> 


Re: [Xen-devel] [PATCH 26/27] xen/arm: mm: Handling permission flags when adding a new mapping

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Currently, all the new mappings will be read-write non-executable. Allow the
> caller to use other permissions.
> 
> Signed-off-by: Julien Grall 
> ---
>  xen/arch/arm/mm.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index cd7bcf7aca..fe0646002e 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1022,6 +1022,14 @@ static int create_xen_entries(enum xenmap_operation op,
>  if ( op == RESERVE )
>  break;
>  pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
> +pte.pt.ro = PAGE_RO_MASK(flags);
> +pte.pt.xn = PAGE_XN_MASK(flags);
> +if (  !pte.pt.ro && !pte.pt.xn )
> +{
> +printk("%s: Incorrect combination for addr=%lx\n",
> +   __func__, addr);
> +return -EINVAL;

I don't think this should be handled as a runtime error, but should rather
be a BUG_ON() or an ASSERT().
I chased down the call chain for all create_xen_entries() invocations,
and they all stem from some constant (combination of) hard-coded flags.
So ending up with an invalid combination here is clearly a bug in the
code and should be treated as such.

Cheers,
Andre.

> +}
>  pte.pt.table = 1;
>  write_pte(entry, pte);
>  break;
> 


Re: [Xen-devel] [PATCH 21/27] xen/arm: mm: Rename and clarify AP[1] in the stage-1 page table

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> The description of AP[1] in Xen is based on testing rather than the ARM
> ARM.
> 
> Per the ARM ARM, on EL2 stage-1 page table, AP[1] is RES1 as the
> translation regime applies to only one exception level (see D4.4.4 and
> G4.6.1 in ARM DDI 0487B.a).

Indeed.

> 
> Update the comment and also rename the field to match the description in
> the ARM ARM.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/mm.c  | 10 +-
>  xen/include/asm-arm/lpae.h |  2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index ce1858fbf3..c0d5fda269 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -273,7 +273,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned 
> attr)
>  .table = 0,   /* Set to 1 for links and 4k maps */
>  .ai = attr,
>  .ns = 1,  /* Hyp mode is in the non-secure world */
> -.user = 1,/* See below */
> +.up = 1,  /* See below */
>  .ro = 0,  /* Assume read-write */
>  .af = 1,  /* No need for access tracking */
>  .ng = 1,  /* Makes TLB flushes easier */
> @@ -282,10 +282,10 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, 
> unsigned attr)
>  .avail = 0,   /* Reference count for domheap mapping */
>  }};
>  /*
> - * Setting the User bit is strange, but the ATS1H[RW] instructions
> - * don't seem to work otherwise, and since we never run on Xen
> - * pagetables in User mode it's OK.  If this changes, remember
> - * to update the hard-coded values in head.S too.
> + * For EL2 stage-1 page table, up (aka AP[1]) is RES1 as the translation
> + * regime applies to only one exception level (see D4.4.4 and G4.6.1
> + * in ARM DDI 0487B.a). If this changes, remember to update the
> + * hard-coded values in head.S too.
>   */
>  
>  switch ( attr )
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index a62b118630..9402434c1e 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -33,7 +33,7 @@ typedef struct __packed {
>   */
>  unsigned long ai:3; /* Attribute Index */
>  unsigned long ns:1; /* Not-Secure */
> -unsigned long user:1;   /* User-visible */
> +unsigned long up:1; /* Unpriviledged access */
>  unsigned long ro:1; /* Read-Only */
>  unsigned long sh:2; /* Shareability */
>  unsigned long af:1; /* Access Flag */
> 


Re: [Xen-devel] [PATCH 25/27] xen/arm: mm: Embed permission in the flags

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Currently, it is not possible to specify the permission of a new
> mapping. It would be necessary to use the function modify_xen_mappings
> with a different set of flags.
> 
> Add introduce a couple of new flags for the permissions (Non-eXecutable,
> Read-Only) and also provides define that combine the memory attribute
> and permission for common combination.

If I haven't got lost in the definitions, this now adds "not
executable" to the existing definitions, which seems to make sense, but
is a change that might trigger regressions (especially for
PAGE_HYPERVISOR). So I wonder if that should be mentioned in the commit
message then?

The actual patch looks OK though, so:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> 
> A follow-up patch will change modify_xen_mappings to use the new flags.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>
> ---
>  xen/include/asm-arm/page.h | 22 +++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 1bf8e9d012..047220f86b 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -67,12 +67,28 @@
>   * Layout of the flags used for updating the hypervisor page tables
>   *
>   * [0:2] Memory Attribute Index
> + * [3:4] Permission flags
>   */
>  #define PAGE_AI_MASK(x) ((x) & 0x7U)
>  
> -#define PAGE_HYPERVISOR (MT_NORMAL)
> -#define PAGE_HYPERVISOR_NOCACHE (MT_DEVICE_nGnRE)
> -#define PAGE_HYPERVISOR_WC  (MT_NORMAL_NC)
> +#define _PAGE_XN_BIT    3
> +#define _PAGE_RO_BIT    4
> +#define _PAGE_XN        (1U << _PAGE_XN_BIT)
> +#define _PAGE_RO        (1U << _PAGE_RO_BIT)
> +#define PAGE_XN_MASK(x) (((x) >> _PAGE_XN_BIT) & 0x1U)
> +#define PAGE_RO_MASK(x) (((x) >> _PAGE_RO_BIT) & 0x1U)
> +
> +/* Device memory will always be mapped read-write non-executable. */
> +#define _PAGE_DEVICE    _PAGE_XN
> +#define _PAGE_NORMAL    MT_NORMAL
> +
> +#define PAGE_HYPERVISOR_RO  (_PAGE_NORMAL|_PAGE_RO|_PAGE_XN)
> +#define PAGE_HYPERVISOR_RX  (_PAGE_NORMAL|_PAGE_RO)
> +#define PAGE_HYPERVISOR_RW  (_PAGE_NORMAL|_PAGE_XN)
> +
> +#define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW
> +#define PAGE_HYPERVISOR_NOCACHE (_PAGE_DEVICE|MT_DEVICE_nGnRE)
> +#define PAGE_HYPERVISOR_WC  (_PAGE_DEVICE|MT_NORMAL_NC)
>  
>  /*
>   * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to 
> be
> 


Re: [Xen-devel] [PATCH 24/27] xen/arm: page: Describe the layout of flags used to update page tables

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Currently, the flags used to update page tables (i.e PAGE_HYPERVISOR_*)
> only contains the memory attribute index. Follow-up patches will add
> more information in it.
> 
> At the same time introduce PAGE_AI_MASK to get the memory attribute
> index easily.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

I wonder if that should be merged with the next patch, to explain the
reason for it. As it stands now it just applies a mask to some existing
call, which looks a bit suspicious.
But that's just a nit and the patch itself is fine, so:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/mm.c  | 2 +-
>  xen/include/asm-arm/page.h | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 411fe02842..cd7bcf7aca 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1021,7 +1021,7 @@ static int create_xen_entries(enum xenmap_operation op,
>  }
>  if ( op == RESERVE )
>  break;
> -pte = mfn_to_xen_entry(mfn, flags);
> +pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
>  pte.pt.table = 1;
>  write_pte(entry, pte);
>  break;
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index d9dac92e73..1bf8e9d012 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -63,6 +63,13 @@
>  #define MAIR0VAL (MAIRVAL & 0xffffffff)
>  #define MAIR1VAL (MAIRVAL >> 32)
>  
> +/*
> + * Layout of the flags used for updating the hypervisor page tables
> + *
> + * [0:2] Memory Attribute Index
> + */
> +#define PAGE_AI_MASK(x) ((x) & 0x7U)
> +
>  #define PAGE_HYPERVISOR (MT_NORMAL)
>  #define PAGE_HYPERVISOR_NOCACHE (MT_DEVICE_nGnRE)
>  #define PAGE_HYPERVISOR_WC  (MT_NORMAL_NC)
> 


Re: [Xen-devel] [PATCH 23/27] xen/arm: mm: Rename 'ai' into 'flags' in create_xen_entries

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> The parameter 'ai' is used either for attribute index or for
> permissions. Follow-up patch will rework that parameters to carry more
> information. So rename the parameter to 'flags'.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/mm.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index c0d5fda269..411fe02842 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -986,7 +986,7 @@ static int create_xen_entries(enum xenmap_operation op,
>unsigned long virt,
>mfn_t mfn,
>unsigned long nr_mfns,
> -  unsigned int ai)
> +  unsigned int flags)
>  {
>  int rc;
>  unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
> @@ -1021,7 +1021,7 @@ static int create_xen_entries(enum xenmap_operation op,
>  }
>  if ( op == RESERVE )
>  break;
> -pte = mfn_to_xen_entry(mfn, ai);
> +pte = mfn_to_xen_entry(mfn, flags);
>  pte.pt.table = 1;
>  write_pte(entry, pte);
>  break;
> @@ -1038,8 +1038,8 @@ static int create_xen_entries(enum xenmap_operation op,
>  else
>  {
>  pte = *entry;
> -pte.pt.ro = PTE_RO_MASK(ai);
> -pte.pt.xn = PTE_NX_MASK(ai);
> +pte.pt.ro = PTE_RO_MASK(flags);
> +pte.pt.xn = PTE_NX_MASK(flags);
>  if ( !pte.pt.ro && !pte.pt.xn )
>  {
>  printk("%s: Incorrect combination for addr=%lx\n",
> 


Re: [Xen-devel] [PATCH 19/27] xen/arm: page: Clean-up the definition of MAIRVAL

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Currently MAIRVAL is defined in term of MAIR0VAL and MAIR1VAL which are
> both hardcoded value. This makes quite difficult to understand the value
> written in both registers.
> 
> Rework the definition by using value of each attribute shifted by their
> associated index.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

I checked all the bits and encoding against the ARMv8 ARM, they look
correct to me.
However, I feel that the attribute renaming patch (20/27) should come
before this one.
That said:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/page.h | 43 +--
>  1 file changed, 25 insertions(+), 18 deletions(-)
> 
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index d7a048b64d..86b227c291 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -22,6 +22,21 @@
>  #define LPAE_SH_INNER 0x3
>  
>  /*
> + * Attribute Indexes.
> + *
> + * These are valid in the AttrIndx[2:0] field of an LPAE stage 1 page
> + * table entry. They are indexes into the bytes of the MAIR*
> + * registers, as defined above.
> + *
> + */
> +#define MT_UNCACHED      0x0
> +#define MT_BUFFERABLE    0x1
> +#define MT_WRITETHROUGH  0x2
> +#define MT_WRITEBACK     0x3
> +#define MT_DEV_SHARED    0x4
> +#define MT_WRITEALLOC    0x7
> +
> +/*
>   * LPAE Memory region attributes. Indexed by the AttrIndex bits of a
>   * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and 
> MAIR1.
>   *
> @@ -35,26 +50,18 @@
>   *   reserved         110
>   *   MT_WRITEALLOC    111   1111 1111  -- Write-back write-allocate
>   *
> - *   MT_DEV_WC        001   (== BUFFERABLE)
>   */
> -#define MAIR0VAL 0xeeaa4400
> -#define MAIR1VAL 0xff000004
> -#define MAIRVAL (MAIR0VAL|MAIR1VAL<<32)
> +#define MAIR(attr, mt) (_AC(attr, ULL) << ((mt) * 8))
>  
> -/*
> - * Attribute Indexes.
> - *
> - * These are valid in the AttrIndx[2:0] field of an LPAE stage 1 page
> - * table entry. They are indexes into the bytes of the MAIR*
> - * registers, as defined above.
> - *
> - */
> -#define MT_UNCACHED      0x0
> -#define MT_BUFFERABLE    0x1
> -#define MT_WRITETHROUGH  0x2
> -#define MT_WRITEBACK     0x3
> -#define MT_DEV_SHARED    0x4
> -#define MT_WRITEALLOC    0x7
> +#define MAIRVAL (MAIR(0x00, MT_UNCACHED) | \
> + MAIR(0x44, MT_BUFFERABLE)   | \
> + MAIR(0xaa, MT_WRITETHROUGH) | \
> + MAIR(0xee, MT_WRITEBACK)| \
> + MAIR(0x04, MT_DEV_SHARED)   | \
> + MAIR(0xff, MT_WRITEALLOC))
> +
> +#define MAIR0VAL (MAIRVAL & 0xffffffff)
> +#define MAIR1VAL (MAIRVAL >> 32)
>  
>  #define PAGE_HYPERVISOR (MT_WRITEALLOC)
>  #define PAGE_HYPERVISOR_NOCACHE (MT_DEV_SHARED)
> 


Re: [Xen-devel] [PATCH 18/27] xen/arm: page: Prefix memory types with MT_

2017-08-23 Thread Andre Przywara
fn(domheap+i*LPAE_ENTRIES),
> -   WRITEALLOC);
> +   MT_WRITEALLOC);
>  pte.pt.table = 1;
>  
> write_pte([first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
>  }
> @@ -869,13 +869,13 @@ void __init setup_xenheap_mappings(unsigned long 
> base_mfn,
>  mfn_t first_mfn = alloc_boot_pages(1, 1);
>  
>  clear_page(mfn_to_virt(first_mfn));
> -pte = mfn_to_xen_entry(first_mfn, WRITEALLOC);
> +pte = mfn_to_xen_entry(first_mfn, MT_WRITEALLOC);
>  pte.pt.table = 1;
>  write_pte(p, pte);
>  first = mfn_to_virt(first_mfn);
>  }
>  
> -pte = mfn_to_xen_entry(_mfn(mfn), WRITEALLOC);
> +pte = mfn_to_xen_entry(_mfn(mfn), MT_WRITEALLOC);
>  /* TODO: Set pte.pt.contig when appropriate. */
>  write_pte([first_table_offset(vaddr)], pte);
>  
> @@ -915,7 +915,7 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t 
> pe)
>  for ( i = 0; i < nr_second; i++ )
>  {
>  clear_page(mfn_to_virt(mfn_add(second_base, i)));
> -pte = mfn_to_xen_entry(mfn_add(second_base, i), WRITEALLOC);
> +pte = mfn_to_xen_entry(mfn_add(second_base, i), MT_WRITEALLOC);
>  pte.pt.table = 1;
>  write_pte(_first[first_table_offset(FRAMETABLE_VIRT_START)+i], 
> pte);
>  }
> @@ -969,7 +969,7 @@ static int create_xen_table(lpae_t *entry)
>  if ( p == NULL )
>  return -ENOMEM;
>  clear_page(p);
> -pte = mfn_to_xen_entry(virt_to_mfn(p), WRITEALLOC);
> +pte = mfn_to_xen_entry(virt_to_mfn(p), MT_WRITEALLOC);
>  pte.pt.table = 1;
>  write_pte(entry, pte);
>  return 0;
> diff --git a/xen/arch/arm/platforms/vexpress.c 
> b/xen/arch/arm/platforms/vexpress.c
> index a26ac324ba..9badbc079d 100644
> --- a/xen/arch/arm/platforms/vexpress.c
> +++ b/xen/arch/arm/platforms/vexpress.c
> @@ -65,7 +65,7 @@ int vexpress_syscfg(int write, int function, int device, 
> uint32_t *data)
>  uint32_t *syscfg = (uint32_t *) FIXMAP_ADDR(FIXMAP_MISC);
>  int ret = -1;
>  
> -set_fixmap(FIXMAP_MISC, maddr_to_mfn(V2M_SYS_MMIO_BASE), DEV_SHARED);
> +set_fixmap(FIXMAP_MISC, maddr_to_mfn(V2M_SYS_MMIO_BASE), MT_DEV_SHARED);
>  
>  if ( syscfg[V2M_SYS_CFGCTRL/4] & V2M_SYS_CFG_START )
>  goto out;
> diff --git a/xen/drivers/video/arm_hdlcd.c b/xen/drivers/video/arm_hdlcd.c
> index 3915f731f5..5fa7f518b1 100644
> --- a/xen/drivers/video/arm_hdlcd.c
> +++ b/xen/drivers/video/arm_hdlcd.c
> @@ -227,7 +227,7 @@ void __init video_init(void)
>  /* uses FIXMAP_MISC */
>  set_pixclock(videomode->pixclock);
>  
> -set_fixmap(FIXMAP_MISC, maddr_to_mfn(hdlcd_start), DEV_SHARED);
> +set_fixmap(FIXMAP_MISC, maddr_to_mfn(hdlcd_start), MT_DEV_SHARED);
>  HDLCD[HDLCD_COMMAND] = 0;
>  
>  HDLCD[HDLCD_LINELENGTH] = videomode->xres * bytes_per_pixel;
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 660e1779c5..d7a048b64d 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -25,17 +25,17 @@
>   * LPAE Memory region attributes. Indexed by the AttrIndex bits of a
>   * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and 
> MAIR1.
>   *
> - *                 ai    encoding
> - *   UNCACHED      000   0000 0000  -- Strongly Ordered
> - *   BUFFERABLE    001   0100 0100  -- Non-Cacheable
> - *   WRITETHROUGH  010   1010 1010  -- Write-through
> - *   WRITEBACK     011   1110 1110  -- Write-back
> - *   DEV_SHARED    100   0000 0100  -- Device
> - *   ??            101
> - *   reserved      110
> - *   WRITEALLOC    111   1111 1111  -- Write-back write-allocate
> + *                    ai    encoding
> + *   MT_UNCACHED      000   0000 0000  -- Strongly Ordered
> + *   MT_BUFFERABLE    001   0100 0100  -- Non-Cacheable
> + *   MT_WRITETHROUGH  010   1010 1010  -- Write-through
> + *   MT_WRITEBACK     011   1110 1110  -- Write-back
> + *   MT_DEV_SHARED    100   0000 0100  -- Device
> + *   ??               101
> + *   reserved         110
> + *   MT_WRITEALLOC    111   1111 1111  -- Write-back write-allocate
>   *
> - *   DEV_WC        001   (== BUFFERABLE)
> + *   MT_DEV_WC        001   (== BUFFERABLE)

It's just a comment, but for consistency this should be MT_BUFFERABLE
here as well, I guess.

Apart from that nit the rest looks correct.

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

>   */
>  #define MAIR0VAL 0xeeaa4400
>  #define MAIR1VAL 0xff000004
> @@ -49,16 +49,16 @@
>   * registers, as defined above.
>   *
>   */
> -#define UNCAC

Re: [Xen-devel] [PATCH 20/27] xen/arm: page: Use ARMv8 naming to improve readability

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> This is based on the Linux ARMv8 naming scheme (see arch/arm64/mm/proc.S). 
> Each
> type will contain "NORMAL" or "DEVICE" to make clear whether each attribute
> targets device or normal memory.

Yes, that makes sense and improves readability as the naming matches the
spec and is more intuitive. Also it looks correct to me.

However, I feel it would be more helpful if this patch came before the
previous one, which reworks the MAIR construction.

However for this patch:

> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/kernel.c |  2 +-
>  xen/arch/arm/mm.c | 28 +-
>  xen/arch/arm/platforms/vexpress.c |  2 +-
>  xen/drivers/video/arm_hdlcd.c |  2 +-
>  xen/include/asm-arm/page.h| 42 
> +++
>  5 files changed, 38 insertions(+), 38 deletions(-)
> 
> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
> index 9c183f96da..a12baa86e7 100644
> --- a/xen/arch/arm/kernel.c
> +++ b/xen/arch/arm/kernel.c
> @@ -54,7 +54,7 @@ void copy_from_paddr(void *dst, paddr_t paddr, unsigned 
> long len)
>  s = paddr & (PAGE_SIZE-1);
>  l = min(PAGE_SIZE - s, len);
>  
> -set_fixmap(FIXMAP_MISC, maddr_to_mfn(paddr), MT_BUFFERABLE);
> +set_fixmap(FIXMAP_MISC, maddr_to_mfn(paddr), MT_NORMAL_NC);
>  memcpy(dst, src + s, l);
>  clean_dcache_va_range(dst, l);
>  
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 45974846a9..ce1858fbf3 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -290,7 +290,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned 
> attr)
>  
>  switch ( attr )
>  {
> -case MT_BUFFERABLE:
> +case MT_NORMAL_NC:
>  /*
>   * ARM ARM: Overlaying the shareability attribute (DDI
>   * 0406C.b B3-1376 to 1377)
> @@ -305,8 +305,8 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned 
> attr)
>   */
>  e.pt.sh = LPAE_SH_OUTER;
>  break;
> -case MT_UNCACHED:
> -case MT_DEV_SHARED:
> +case MT_DEVICE_nGnRnE:
> +case MT_DEVICE_nGnRE:
>  /*
>   * Shareability is ignored for non-Normal memory, Outer is as
>   * good as anything.
> @@ -369,7 +369,7 @@ static void __init create_mappings(lpae_t *second,
>  
>  count = nr_mfns / LPAE_ENTRIES;
>  p = second + second_linear_offset(virt_offset);
> -pte = mfn_to_xen_entry(_mfn(base_mfn), MT_WRITEALLOC);
> +pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL);
>  if ( granularity == 16 * LPAE_ENTRIES )
>  pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. 
> */
>  for ( i = 0; i < count; i++ )
> @@ -422,7 +422,7 @@ void *map_domain_page(mfn_t mfn)
>  else if ( map[slot].pt.avail == 0 )
>  {
>  /* Commandeer this 2MB slot */
> -pte = mfn_to_xen_entry(_mfn(slot_mfn), MT_WRITEALLOC);
> +pte = mfn_to_xen_entry(_mfn(slot_mfn), MT_NORMAL);
>  pte.pt.avail = 1;
>  write_pte(map + slot, pte);
>  break;
> @@ -543,7 +543,7 @@ static inline lpae_t pte_of_xenaddr(vaddr_t va)
>  {
>  paddr_t ma = va + phys_offset;
>  
> -return mfn_to_xen_entry(maddr_to_mfn(ma), MT_WRITEALLOC);
> +return mfn_to_xen_entry(maddr_to_mfn(ma), MT_NORMAL);
>  }
>  
>  /* Map the FDT in the early boot page table */
> @@ -652,7 +652,7 @@ void __init setup_pagetables(unsigned long 
> boot_phys_offset, paddr_t xen_paddr)
>  /* Initialise xen second level entries ... */
>  /* ... Xen's text etc */
>  
> -pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_WRITEALLOC);
> +pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_NORMAL);
>  pte.pt.xn = 0;/* Contains our text mapping! */
>  xen_second[second_table_offset(XEN_VIRT_START)] = pte;
>  
> @@ -669,7 +669,7 @@ void __init setup_pagetables(unsigned long 
> boot_phys_offset, paddr_t xen_paddr)
>  
>  /* ... Boot Misc area for xen relocation */
>  dest_va = BOOT_RELOC_VIRT_START;
> -pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_WRITEALLOC);
> +pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_NORMAL);
>  /* Map the destination in xen_second. */
>  xen_second[second_table_offset(dest_va)] = pte;
>  /* Map the destination in boot_second. */
> @@ -700,7 +700,7 @@ void __init setup_pagetables(unsigned long 
> boot_phys_offset, paddr_t xen_paddr)
>  unsigned long va = XEN_VIRT_START +

Re: [Xen-devel] [PATCH 16/27] xen/arm: page: Remove unused attributes DEV_NONSHARED and DEV_CACHED

2017-08-23 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> They were imported from non-LPAE Linux, but Xen is LPAE only. It is time
> to do some clean-up in the memory attribute and keep only what make
> sense for Xen. Follow-up patch will do more clean-up.
> 
> Also, update the comment saying our attribute matches Linux.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/page.h | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index cef2f28914..465300c6e5 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -21,9 +21,9 @@
>  #define LPAE_SH_OUTER 0x2
>  #define LPAE_SH_INNER 0x3
>  
> -/* LPAE Memory region attributes, to match Linux's (non-LPAE) choices.
> - * Indexed by the AttrIndex bits of a LPAE entry;
> - * the 8-bit fields are packed little-endian into MAIR0 and MAIR1
> +/*
> + * LPAE Memory region attributes. Indexed by the AttrIndex bits of a
> + * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and 
> MAIR1.
>   *
>   *                 ai    encoding
>   *   UNCACHED      000   0000 0000  -- Strongly Ordered
> @@ -35,9 +35,7 @@
>   *   reserved      110
>   *   WRITEALLOC    111   1111 1111  -- Write-back write-allocate
>   *
> - *   DEV_NONSHARED 100   (== DEV_SHARED)
>   *   DEV_WC        001   (== BUFFERABLE)
> - *   DEV_CACHED    011   (== WRITEBACK)
>   */
>  #define MAIR0VAL 0xeeaa4400
>  #define MAIR1VAL 0xff000004
> @@ -57,9 +55,7 @@
>  #define WRITEBACK     0x3
>  #define DEV_SHARED    0x4
>  #define WRITEALLOC    0x7
> -#define DEV_NONSHARED DEV_SHARED
>  #define DEV_WC        BUFFERABLE
> -#define DEV_CACHED    WRITEBACK
>  
>  #define PAGE_HYPERVISOR (WRITEALLOC)
>  #define PAGE_HYPERVISOR_NOCACHE (DEV_SHARED)
> 


Re: [Xen-devel] [PATCH 22/27] xen/arm: Switch to SYS_STATE_boot just after end_boot_allocator()

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> We should consider the early boot period to end when we stop using the
> boot allocator. This is inline with x86 and will be helpful to know
> whether we should allocate memory from the boot allocator or xenheap.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/setup.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 277b566b88..46737a2eca 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -757,6 +757,12 @@ void __init start_xen(unsigned long boot_phys_offset,
>  
>  end_boot_allocator();
>  
> +/*
> + * The memory subsystem has been initialized, we can now switch from
> + * early_boot -> boot.
> + */
> +system_state = SYS_STATE_boot;
> +
>  vm_init();
>  
>  if ( acpi_disabled )
> @@ -779,8 +785,6 @@ void __init start_xen(unsigned long boot_phys_offset,
>  console_init_preirq();
>  console_init_ring();
>  
> -system_state = SYS_STATE_boot;
> -
>  processor_id();
>  
>  smp_init_cpus();
> 


Re: [Xen-devel] [PATCH 17/27] xen/arm: page: Use directly BUFFERABLE and drop DEV_WC

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> DEV_WC is only used for PAGE_HYPERVISOR_WC and does not bring much
> improvement.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/page.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 465300c6e5..660e1779c5 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -55,11 +55,10 @@
>  #define WRITEBACK     0x3
>  #define DEV_SHARED    0x4
>  #define WRITEALLOC    0x7
> -#define DEV_WC        BUFFERABLE
>  
>  #define PAGE_HYPERVISOR (WRITEALLOC)
>  #define PAGE_HYPERVISOR_NOCACHE (DEV_SHARED)
> -#define PAGE_HYPERVISOR_WC  (DEV_WC)
> +#define PAGE_HYPERVISOR_WC  (BUFFERABLE)
>  
>  /*
>   * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to be
> 


Re: [Xen-devel] [PATCH 15/27] xen/arm: Replace ioremap_attr(PAGE_HYPERVISOR_NOCACHE) call by ioremap_nocache

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> ioremap_nocache is a wrapper of ioremap_attr(...).
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre

> ---
>  xen/arch/arm/platforms/exynos5.c | 2 +-
>  xen/arch/arm/platforms/omap5.c   | 6 ++
>  2 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/platforms/exynos5.c b/xen/arch/arm/platforms/exynos5.c
> index 2ae5fa66e0..95d6581d33 100644
> --- a/xen/arch/arm/platforms/exynos5.c
> +++ b/xen/arch/arm/platforms/exynos5.c
> @@ -62,7 +62,7 @@ static int exynos5_init_time(void)
>  dprintk(XENLOG_INFO, "mct_base_addr: %016llx size: %016llx\n",
>  mct_base_addr, size);
>  
> -mct = ioremap_attr(mct_base_addr, size, PAGE_HYPERVISOR_NOCACHE);
> +mct = ioremap_nocache(mct_base_addr, size);
>  if ( !mct )
>  {
>  dprintk(XENLOG_ERR, "Unable to map MCT\n");
> diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c
> index 1e1f9fa970..7dbba95756 100644
> --- a/xen/arch/arm/platforms/omap5.c
> +++ b/xen/arch/arm/platforms/omap5.c
> @@ -51,8 +51,7 @@ static int omap5_init_time(void)
>  unsigned int sys_clksel;
>  unsigned int num, den, frac1, frac2;
>  
> -ckgen_prm_base = ioremap_attr(OMAP5_CKGEN_PRM_BASE,
> -  0x20, PAGE_HYPERVISOR_NOCACHE);
> +ckgen_prm_base = ioremap_nocache(OMAP5_CKGEN_PRM_BASE, 0x20);
>  if ( !ckgen_prm_base )
>  {
>  dprintk(XENLOG_ERR, "%s: PRM_BASE ioremap failed\n", __func__);
> @@ -64,8 +63,7 @@ static int omap5_init_time(void)
>  
>  iounmap(ckgen_prm_base);
>  
> -rt_ct_base = ioremap_attr(REALTIME_COUNTER_BASE,
> -  0x20, PAGE_HYPERVISOR_NOCACHE);
> +rt_ct_base = ioremap_nocache(REALTIME_COUNTER_BASE, 0x20);
>  if ( !rt_ct_base )
>  {
>  dprintk(XENLOG_ERR, "%s: REALTIME_COUNTER_BASE ioremap failed\n", __func__);
> 


Re: [Xen-devel] [PATCH 14/27] xen/arm: traps: Improve logging for data/prefetch abort fault

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Walk the hypervisor page table for data/prefetch abort fault to help
> diagnose errors in the page tables.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>
> ---
>  xen/arch/arm/traps.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 819bdbc69e..dac4e54fa7 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2967,7 +2967,26 @@ asmlinkage void do_trap_hyp_sync(struct cpu_user_regs *regs)
>  do_trap_brk(regs, hsr);
>  break;
>  #endif
> +case HSR_EC_DATA_ABORT_CURR_EL:
> +case HSR_EC_INSTR_ABORT_CURR_EL:
> +{
> +bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_CURR_EL);
> +const char *fault = (is_data) ? "Data Abort" : "Instruction Abort";
> +
> +printk("%s Trap. Syndrome=%#x\n", fault, hsr.iss);
> +/*
> + * FAR may not be valid for a Synchronous External abort other
> + * than translation table walk.
> + */
> +if ( hsr.xabt.fsc != FSC_SEA || !hsr.xabt.fnv )

This is quite hard to read. Would the DeMorgan'ed version be better?
    if ( hsr.xabt.fsc == FSC_SEA && hsr.xabt.fnv )
        printk ...
    else
        dump_hyp_walk ...
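
To make the equivalence concrete, here is a small standalone sketch (not
Xen code). The FSC_SEA value and all helper names are assumptions made up
for illustration; the only point is that the condition in the patch and
the De Morgan'ed form are exact complements, so the two branches can be
swapped without changing behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real definition lives in Xen's headers. */
#define FSC_SEA 0x10u

/* Condition as written in the patch: true means "walk the tables". */
static bool far_usable(unsigned int fsc, bool fnv)
{
    return fsc != FSC_SEA || !fnv;
}

/* De Morgan'ed form: true means "FAR is invalid, do not walk". */
static bool far_invalid(unsigned int fsc, bool fnv)
{
    return fsc == FSC_SEA && fnv;
}

/*
 * Check that the two forms disagree on every tested input.
 * Returns the number of cases checked.
 */
static unsigned int check_complementary(void)
{
    static const unsigned int codes[] = { 0x00u, 0x04u, FSC_SEA, 0x3fu };
    unsigned int checked = 0;

    for ( size_t i = 0; i < sizeof(codes) / sizeof(codes[0]); i++ )
        for ( int fnv = 0; fnv <= 1; fnv++ )
        {
            assert(far_usable(codes[i], fnv) != far_invalid(codes[i], fnv));
            checked++;
        }

    return checked;
}
```

Only the (FSC_SEA, fnv set) combination takes the "invalid FAR" branch;
every other combination walks the tables, in both spellings.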

> +dump_hyp_walk(get_hfar(is_data));
> +else
> +printk("Invalid FAR, don't walk the hypervisor tables\n");

Nit: "not walking" sounds less ambiguous.

> +do_unexpected_trap(fault, regs);
>  
> +break;
> +}
>  default:
>      printk("Hypervisor Trap. HSR=0x%x EC=0x%x IL=%x 
> Syndrome=0x%"PRIx32"\n",
> hsr.bits, hsr.ec, hsr.len, hsr.iss);

Ignoring the nits above:
Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.


Re: [Xen-devel] [PATCH 13/27] xen/arm: traps: Introduce a helper to read the hypersivor fault register

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> While ARM32 has 2 distinct registers for the hypervisor fault register
> (one for prefetch abort, the other for data abort), AArch64 has only
> one.
> 
> Currently, the logic is open-code but a follow-up patch will require to
> read it too. So move the logic in a separate helper and use it instead
> of open-coding it.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/traps.c | 35 +--
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 498d8c594a..819bdbc69e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2530,6 +2530,28 @@ done:
>  if (first) unmap_domain_page(first);
>  }
>  
> +/*
> + * Return the value of the hypervisor fault address register.
> + *
> + * On ARM32, the register will be different depending whether the
> + * fault is a prefetch abort or data abort.
> + */
> +static inline vaddr_t get_hfar(bool is_data)
> +{
> +vaddr_t gva;
> +
> +#ifdef CONFIG_ARM_32
> +if ( is_data )
> +gva = READ_CP32(HDFAR);
> +else
> +gva = READ_CP32(HIFAR);
> +#else
> +gva =  READ_SYSREG(FAR_EL2);
> +#endif
> +
> +return gva;
> +}
> +
>  static inline paddr_t get_faulting_ipa(vaddr_t gva)
>  {
>  register_t hpfar = READ_SYSREG(HPFAR_EL2);
> @@ -2565,11 +2587,7 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
>  paddr_t gpa;
>  mfn_t mfn;
>  
> -#ifdef CONFIG_ARM_32
> -gva = READ_CP32(HIFAR);
> -#else
> -gva = READ_SYSREG64(FAR_EL2);
> -#endif
> +gva = get_hfar(false /* is_data */);
>  
>  /*
>   * If this bit has been set, it means that this instruction abort is caused
> @@ -2711,11 +2729,8 @@ static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
>  return __do_trap_serror(regs, true);
>  
>  info.dabt = dabt;
> -#ifdef CONFIG_ARM_32
> -info.gva = READ_CP32(HDFAR);
> -#else
> -info.gva = READ_SYSREG64(FAR_EL2);
> -#endif
> +
> +info.gva = get_hfar(true /* is_data */);
>  
>  if ( hpfar_is_valid(dabt.s1ptw, fsc) )
>  info.gpa = get_faulting_ipa(info.gva);
> 


Re: [Xen-devel] [PATCH 12/27] xen/arm: Introduce hsr_xabt to gather common bits between hsr_dabt and

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> This will allow to consolidate some part of the data abort and prefetch
> abort handling in a single function later on.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/processor.h | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 3ef606c554..9964348189 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -537,6 +537,19 @@ union hsr {
>  unsigned long ec:6;/* Exception Class */
>  } dabt; /* HSR_EC_DATA_ABORT_* */
>  
> +/* Contain the common bits between DABT and IABT */
> +struct hsr_xabt {
> +unsigned long fsc:6;/* Fault status code */
> +unsigned long pad1:1;
> +unsigned long s1ptw:1;  /* Stage 2 fault during stage 1 translation */
> +unsigned long pad2:1;
> +unsigned long eat:1;/* External abort type */
> +unsigned long fnv:1;/* FAR not Valid */
> +unsigned long pad3:14;
> +unsigned long len:1;/* Instruction length */
> +unsigned long ec:6; /* Exception Class */
> +} xabt;
> +
>  #ifdef CONFIG_ARM_64
>  struct hsr_brk {
>  unsigned long comment:16;   /* Comment */
> 


Re: [Xen-devel] [PATCH 11/27] xen/arm: Add FnV field in hsr_*abt

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> FnV (FAR not Valid) bit was introduced by ARMv8 in both AArch32 and
> AArch64 (See D7-2275, D7-2277, G6-4958, G6-4962 in ARM DDI 0487B.a).

I understand that this just prepares the data structures for patch #14,
but I was wondering if we should update the other fields along the way as
well; for instance, "ar" is now also defined in AArch32.

> Signed-off-by: Julien Grall <julien.gr...@arm.com>

But the actual bits are correct, so if we just need "fnv", then this is:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

> ---
>  xen/include/asm-arm/processor.h | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 51645f08c0..3ef606c554 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -509,7 +509,8 @@ union hsr {
>  unsigned long s1ptw:1; /* Stage 2 fault during stage 1 translation */
>  unsigned long res1:1;  /* RES0 */
>  unsigned long eat:1;   /* External abort type */
> -unsigned long res2:15;
> +unsigned long fnv:1;   /* FAR not Valid */
> +unsigned long res2:14;
>  unsigned long len:1;   /* Instruction length */
>  unsigned long ec:6;/* Exception Class */
>  } iabt; /* HSR_EC_INSTR_ABORT_* */
> @@ -520,10 +521,11 @@ union hsr {
>  unsigned long s1ptw:1; /* Stage 2 fault during stage 1 translation */
>  unsigned long cache:1; /* Cache Maintenance */
>  unsigned long eat:1;   /* External Abort Type */
> +unsigned long fnv:1;   /* FAR not Valid */
>  #ifdef CONFIG_ARM_32
> -unsigned long sbzp0:6;
> +unsigned long sbzp0:5;

This can be broken down further, as the ARMv8 ARM explains more of these
bits now. "ar" is now also defined here, for instance.

>  #else
> -unsigned long sbzp0:4;
> +unsigned long sbzp0:3;

And on the AArch64 side there are now more bits in use as well.

Cheers,
Andre.

>  unsigned long ar:1;/* Acquire Release */
>  unsigned long sf:1;/* Sixty Four bit register */
>  #endif
> 


Re: [Xen-devel] [PATCH 10/27] xen/arm: arm32: Don't define FAR_EL1

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Aliasing FAR_EL1 to IFAR is wrong because on ARMv8 FAR_EL1[31:0] is
> architecturally mapped to DFAR and FAR_EL1[63:32] to DFAR.
   
Should be IFAR, I guess?
Please put a quid into the copy-and-paste piggy bank ;-)

Otherwise it's fine.

> As FAR_EL1 is not currently used in ARM32 code, remove it.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/include/asm-arm/cpregs.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
> index 1889d7cbfb..9e138489f0 100644
> --- a/xen/include/asm-arm/cpregs.h
> +++ b/xen/include/asm-arm/cpregs.h
> @@ -306,7 +306,6 @@
>  #define DACR32_EL2  DACR
>  #define ESR_EL1 DFSR
>  #define ESR_EL2 HSR
> -#define FAR_EL1 HIFAR
>  #define HCR_EL2 HCR
>  #define HPFAR_EL2   HPFAR
>  #define HSTR_EL2    HSTR
> 


Re: [Xen-devel] [PATCH 08/27] xen/arm: hsr_iabt: Document RES0 field

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:23, Julien Grall wrote:
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

> ---
>  xen/include/asm-arm/processor.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index ab5225fa6c..51645f08c0 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -505,9 +505,9 @@ union hsr {
>  
>  struct hsr_iabt {
>  unsigned long ifsc:6;  /* Instruction fault status code */
> -unsigned long res0:1;
> +unsigned long res0:1;  /* RES0 */
>  unsigned long s1ptw:1; /* Stage 2 fault during stage 1 translation */
> -unsigned long res1:1;
> +unsigned long res1:1;  /* RES0 */
>  unsigned long eat:1;   /* External abort type */
>  unsigned long res2:15;

While we are at it: newer versions of the ARM ARM have the "FnV" bit here
at bit 10, so would it be worth updating it as:

   unsigned long fnv:1;   /* FAR not Valid */
   unsigned long res2:14;
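
As a quick sanity check of the bit position, here is a standalone sketch
(not the Xen header). The union and helper names are made up for
illustration, and it assumes the usual GCC/Clang behaviour on
little-endian targets of allocating bit-fields from the least significant
bit upwards, which is what the existing union hsr relies on as well:

```c
#include <stdint.h>

/* Hypothetical stand-in for the iabt view of the syndrome register. */
union iabt_syndrome {
    uint32_t bits;
    struct {
        unsigned int ifsc:6;   /* bits [5:0]   Instruction fault status code */
        unsigned int res0:1;   /* bit  6       RES0 */
        unsigned int s1ptw:1;  /* bit  7       Stage 2 fault during stage 1 */
        unsigned int res1:1;   /* bit  8       RES0 */
        unsigned int eat:1;    /* bit  9       External abort type */
        unsigned int fnv:1;    /* bit  10      FAR not Valid */
        unsigned int res2:14;  /* bits [24:11] */
        unsigned int len:1;    /* bit  25      Instruction length */
        unsigned int ec:6;     /* bits [31:26] Exception Class */
    } iabt;
};

/* The field widths must add up to exactly one 32-bit register. */
_Static_assert(sizeof(union iabt_syndrome) == sizeof(uint32_t),
               "bit-field layout does not fit a 32-bit register");

/* Read FnV through the bit-field view. */
static unsigned int fnv_via_bitfield(uint32_t raw)
{
    union iabt_syndrome s = { .bits = raw };

    return s.iabt.fnv;
}

/* Read FnV with an explicit shift, independent of bit-field layout. */
static unsigned int fnv_via_shift(uint32_t raw)
{
    return (raw >> 10) & 1u;
}
```

With the layout assumption above, both helpers agree, which confirms
that splitting res2:15 into fnv:1 plus res2:14 keeps len and ec where
they were.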

Cheers,
Andre.

>  unsigned long len:1;   /* Instruction length */
> 


Re: [Xen-devel] [PATCH 09/27] xen/arm: traps: Don't define FAR_EL2 for ARM32

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:24, Julien Grall wrote:
> Aliasing FAR_EL2 to HIFAR makes the code confusing because on ARMv8
> FAR_EL2[31:0] is architecturally mapped to HDFAR and FAR_EL2[63:32] to
> FAR_EL2.
  ^^^
I guess you mean HIFAR here.
Otherwise the patch makes sense.

> See D7.2.30 in ARM DDI 0487B.a. Open-code the alias instead.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
>  xen/arch/arm/traps.c | 8 +++-
>  xen/include/asm-arm/cpregs.h | 1 -
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index c07999b518..498d8c594a 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2560,11 +2560,17 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
>const union hsr hsr)
>  {
>  int rc;
> -register_t gva = READ_SYSREG(FAR_EL2);
> +register_t gva;
>  uint8_t fsc = hsr.iabt.ifsc & ~FSC_LL_MASK;
>  paddr_t gpa;
>  mfn_t mfn;
>  
> +#ifdef CONFIG_ARM_32
> +gva = READ_CP32(HIFAR);
> +#else
> +gva = READ_SYSREG64(FAR_EL2);
> +#endif
> +
>  /*
>   * If this bit has been set, it means that this instruction abort is caused
>   * by a guest external abort. We can handle this instruction abort as guest
> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
> index af45ec7a65..1889d7cbfb 100644
> --- a/xen/include/asm-arm/cpregs.h
> +++ b/xen/include/asm-arm/cpregs.h
> @@ -307,7 +307,6 @@
>  #define ESR_EL1 DFSR
>  #define ESR_EL2 HSR
>  #define FAR_EL1 HIFAR
> -#define FAR_EL2 HIFAR
>  #define HCR_EL2 HCR
>  #define HPFAR_EL2   HPFAR
>  #define HSTR_EL2    HSTR
> 


Re: [Xen-devel] [PATCH 03/27] xen/x86: mm: Don't check alloc_boot_pages return

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:23, Julien Grall wrote:
> The only way alloc_boot_pages will return 0 is during the error case.

This statement is not true. If alloc_boot_pages() returns, it has
succeeded. Returning 0 is nothing special.

> Although, Xen will panic in the error path. So the check in the caller
> is pointless.
> 
> Looking at the loop, my understanding is it will try to allocate in
> smaller chunk if a bigger chunk fail. Given that alloc_boot_pages can
> never check, the loop seems unecessary.

Agreed, the algorithm doesn't work with the (current) implementation of
alloc_boot_pages(), so the patch is valid.

> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Given that you adjust the commit message:

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> 
> I haven't tested this code, only build test it. I can't see how
> alloc_boot_pages would return 0 other than the error path.
> ---
>  xen/arch/x86/mm.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index f53ca43554..66e337109d 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -200,11 +200,7 @@ static void __init init_frametable_chunk(void *start, void *end)
>   */
>  while ( step && s + (step << PAGE_SHIFT) > e + (4 << PAGE_SHIFT) )
>  step >>= PAGETABLE_ORDER;
> -do {
> -mfn = alloc_boot_pages(step, step);
> -} while ( !mfn && (step >>= PAGETABLE_ORDER) );
> -if ( !mfn )
> -panic("Not enough memory for frame table");
> +mfn = alloc_boot_pages(step, step);
>  map_pages_to_xen(s, mfn, step, PAGE_HYPERVISOR);
>  }
>  
> 


Re: [Xen-devel] [PATCH 02/27] xen/x86: srat: Don't check alloc_boot_pages return

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:23, Julien Grall wrote:
> alloc_boot_pages will panic if it is not possible to allocate. So the
> check in the caller is pointless.
> 
> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Thanks,
Andre.

> ---
> 
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> ---
>  xen/arch/x86/srat.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index cd1283e58c..95660a9bbc 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -194,11 +194,6 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
>   return;
>   }
>   mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
> - if (!mfn) {
> - printk(KERN_ERR "ACPI: Unable to allocate memory for "
> -"saving ACPI SLIT numa information.\n");
> - return;
> - }
>   acpi_slit = mfn_to_virt(mfn);
>   memcpy(acpi_slit, slit, slit->header.length);
>  }
> 


Re: [Xen-devel] [PATCH 01/27] xen/x86: numa: Don't check alloc_boot_pages return

2017-08-22 Thread Andre Przywara
Hi,

On 14/08/17 15:23, Julien Grall wrote:
> alloc_boot_pages will panic if it is not possible to allocate. So the
> check in the caller is pointless.

More than that, I don't see why "0" couldn't be a valid MFN. At least the
code in alloc_boot_pages() doesn't treat it specially, so it doesn't
signal an error condition in the first place.

> Signed-off-by: Julien Grall <julien.gr...@arm.com>

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>

Cheers,
Andre.

> ---
> 
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> ---
>  xen/arch/x86/numa.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index d45196fafc..ffeba6e180 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -101,14 +101,6 @@ static int __init allocate_cachealigned_memnodemap(void)
>  unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
>  unsigned long mfn = alloc_boot_pages(size, 1);
>  
> -if ( !mfn )
> -{
> -printk(KERN_ERR
> -   "NUMA: Unable to allocate Memory to Node hash map\n");
> -memnodemapsize = 0;
> -return -1;
> -}
> -
>  memnodemap = mfn_to_virt(mfn);
>  mfn <<= PAGE_SHIFT;
>  size <<= PAGE_SHIFT;
> 


Re: [Xen-devel] [RFC PATCH v2 07/22] ARM: vGIC: introduce priority setter/getter

2017-08-18 Thread Andre Przywara
Hi,

On 18/08/17 15:21, Julien Grall wrote:
> 
> 
> On 17/08/17 18:06, Andre Przywara wrote:
>> Hi,
> 
> Hi Andre,
> 
>> On 11/08/17 15:10, Julien Grall wrote:
>>> Hi Andre,
>>> 
>>> On 21/07/17 20:59, Andre Przywara wrote:
>>>> Since the GICs MMIO access always covers a number of IRQs at
>>>> once, introduce wrapper functions which loop over those IRQs,
>>>> take their locks and read or update the priority values. This
>>>> will be used in a later patch.
>>>> 
>>>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>>>> ---
>>>>  xen/arch/arm/vgic.c| 37 +
>>>>  xen/include/asm-arm/vgic.h |  5 +
>>>>  2 files changed, 42 insertions(+)
>>>> 
>>>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>>>> index 434b7e2..b2c9632 100644
>>>> --- a/xen/arch/arm/vgic.c
>>>> +++ b/xen/arch/arm/vgic.c
>>>> @@ -243,6 +243,43 @@ static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>>>  return ACCESS_ONCE(rank->priority[virq & INTERRUPT_RANK_MASK]);
>>>>  }
>>>> 
>>>> +#define MAX_IRQS_PER_IPRIORITYR 4
>>> 
>>> The name gives the impression that you may have IPRIORITYR with
>>> only 1 IRQ. But this is not true. The registers is always 4.
>>> However, you are able to access using byte or word.
>>> 
>>>> +uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
>>> 
>>> I am well aware that the vgic code is mixing between virq and
>>> irq. Moving forward, we should use virq to avoid confusion.
>>> 
>>>> + unsigned int first_irq)
>>> 
>>> Please stay consistent, with the naming. Either nr_irqs/first_irq
>>> or nrirqs/firstirq. But not a mix.
>>> 
>>> Also, it makes more sense to describe first the start then
>>> number.
>>> 
>>>> +{
>>>> +struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
>>>> +unsigned long flags;
>>>> +uint32_t ret = 0, i;
>>>> +
>>>> +local_irq_save(flags);
>>>> +vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
>>> 
>>> I am not convinced on the usefulness of taking all the locks in
>>> one go. At one point in the time, you only need to lock a given
>>> pending_irq.
>> 
>> I don't think so. The MMIO access a guest does is expected to be
>> atomic, so it expects to read the priorities of the four interrupts
>> as they were *at one point in time*. This issue is more obvious for
>> the enabled bit, for instance, but also here a (32-bit) read and a
>> write of some IPRIORITYR might race against each other. This was
>> covered by the rank lock before, but now we have to bite the bullet
>> and lock all involved IRQs.
> 
> A well-behaved guest would need a lock in order to modify the
> hardware as it can't predict in which order the write will happen. If
> the guest does not respect that, I don't think it is necessary to
> require atomicity of the modification.
> 
> This is making the code more complex for little benefit and also
> increases the time during which interrupts are masked.
> 
> So as long as it does not affect the hypervisor, then I think it is
> fine to not handle more than the atomicity at the IRQ level.

Fair enough, I can live with that. I didn't like the added complexity
for the tiny benefit either, just wanted to retain the behaviour we had
naturally with the rank lock before.
So this is definitely true for IPRIORITYR, ICFGR and friends, but I need
to double check on ISENABLER/ICENABLER, because of the OR/AND-NOT
semantics, which allows lockless accesses from the software side. I
believe this is fine, though.

Cheers,
Andre.


Re: [Xen-devel] [RFC PATCH v2 07/22] ARM: vGIC: introduce priority setter/getter

2017-08-17 Thread Andre Przywara
Hi,

On 11/08/17 15:10, Julien Grall wrote:
> Hi Andre,
> 
> On 21/07/17 20:59, Andre Przywara wrote:
>> Since the GICs MMIO access always covers a number of IRQs at once,
>> introduce wrapper functions which loop over those IRQs, take their
>> locks and read or update the priority values.
>> This will be used in a later patch.
>>
>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>> ---
>>  xen/arch/arm/vgic.c| 37 +
>>  xen/include/asm-arm/vgic.h |  5 +
>>  2 files changed, 42 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 434b7e2..b2c9632 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -243,6 +243,43 @@ static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  return ACCESS_ONCE(rank->priority[virq & INTERRUPT_RANK_MASK]);
>>  }
>>
>> +#define MAX_IRQS_PER_IPRIORITYR 4
> 
> The name gives the impression that you may have IPRIORITYR with only 1
> IRQ. But this is not true. The registers is always 4. However, you are
> able to access using byte or word.
> 
>> +uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
> 
> I am well aware that the vgic code is mixing between virq and irq.
> Moving forward, we should use virq to avoid confusion.
> 
>> + unsigned int first_irq)
> 
> Please stay consistent, with the naming. Either nr_irqs/first_irq or
> nrirqs/firstirq. But not a mix.
> 
> Also, it makes more sense to describe first the start then number.
> 
>> +{
>> +struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
>> +unsigned long flags;
>> +uint32_t ret = 0, i;
>> +
>> +local_irq_save(flags);
>> +vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
> 
> I am not convinced on the usefulness of taking all the locks in one go.
> At one point in the time, you only need to lock a given pending_irq.

I don't think so. The MMIO access a guest does is expected to be atomic,
so it expects to read the priorities of the four interrupts as they were
*at one point in time*.
This issue is more obvious for the enabled bit, for instance, but also
here a (32-bit) read and a write of some IPRIORITYR might race against
each other. This was covered by the rank lock before, but now we have to
bite the bullet and lock all involved IRQs.
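
To illustrate why one 32-bit access touches four interrupts, here is a
standalone sketch of the packing only, without any locking. It is not the
code from the patch: the constant and function names are made up for
illustration, and plain byte arrays stand in for the pending_irq
structures.

```c
#include <assert.h>
#include <stdint.h>

#define IRQS_PER_IPRIORITYR 4      /* one priority byte per IRQ */
#define PRIORITY_BITS       8
#define PRIORITY_MASK       0xffu

/* Build the register value a guest would read: IRQ i sits in byte i. */
static uint32_t pack_priorities(const uint8_t prio[], unsigned int nr)
{
    uint32_t reg = 0;

    assert(nr <= IRQS_PER_IPRIORITYR);
    for ( unsigned int i = 0; i < nr; i++ )
        reg |= (uint32_t)prio[i] << (i * PRIORITY_BITS);

    return reg;
}

/* Split a written register value back into per-IRQ priorities. */
static void unpack_priorities(uint32_t reg, uint8_t prio[], unsigned int nr)
{
    assert(nr <= IRQS_PER_IPRIORITYR);
    for ( unsigned int i = 0; i < nr; i++, reg >>= PRIORITY_BITS )
        prio[i] = reg & PRIORITY_MASK;
}
```

A word-sized read is therefore a snapshot of four independent
priorities, which is exactly where the question of per-IRQ versus
all-at-once locking comes from.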

Cheers,
Andre.

>> +
>> +for ( i = 0; i < nrirqs; i++ )
>> +ret |= pirqs[i]->priority << (i * 8);
> 
> Please avoid open-coding number.
> 
>> +
>> +vgic_unlock_irqs(pirqs, nrirqs);
>> +local_irq_restore(flags);
>> +
>> +return ret;
>> +}
>> +
>> +void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq, uint32_t value)
>> +{
>> +struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
>> +unsigned long flags;
>> +unsigned int i;
>> +
>> +local_irq_save(flags);
>> +vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
>> +
>> +for ( i = 0; i < nrirqs; i++, value >>= 8 )
> 
> Same here.
> 
>> +pirqs[i]->priority = value & 0xff;
>> +
>> +vgic_unlock_irqs(pirqs, nrirqs);
>> +local_irq_restore(flags);
>> +}
>> +
>>  bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
>>  {
>>  unsigned long flags;
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index ecf4969..f3791c8 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -198,6 +198,11 @@ void vgic_lock_irqs(struct vcpu *v, unsigned int nrirqs, unsigned int first_irq,
>>  struct pending_irq **pirqs);
>>  void vgic_unlock_irqs(struct pending_irq **pirqs, unsigned int nrirqs);
>>
>> +uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq);
>> +void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq, uint32_t reg);
>> +
>>  enum gic_sgi_mode;
>>
>>  /*
>>
> 
> Cheers,
> 


Re: [Xen-devel] [RFC PATCH v2 07/22] ARM: vGIC: introduce priority setter/getter

2017-08-16 Thread Andre Przywara
Hi,

On 11/08/17 15:10, Julien Grall wrote:
> Hi Andre,
> 
> On 21/07/17 20:59, Andre Przywara wrote:
>> Since the GICs MMIO access always covers a number of IRQs at once,
>> introduce wrapper functions which loop over those IRQs, take their
>> locks and read or update the priority values.
>> This will be used in a later patch.
>>
>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>> ---
>>  xen/arch/arm/vgic.c| 37 +
>>  xen/include/asm-arm/vgic.h |  5 +
>>  2 files changed, 42 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 434b7e2..b2c9632 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -243,6 +243,43 @@ static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  return ACCESS_ONCE(rank->priority[virq & INTERRUPT_RANK_MASK]);
>>  }
>>
>> +#define MAX_IRQS_PER_IPRIORITYR 4
> 
> The name gives the impression that you may have IPRIORITYR with only 1
> IRQ. But this is not true. The registers is always 4. However, you are
> able to access using byte or word.
> 
>> +uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
> 
> I am well aware that the vgic code is mixing between virq and irq.
> Moving forward, we should use virq to avoid confusion.
> 
>> + unsigned int first_irq)
> 
> Please stay consistent, with the naming. Either nr_irqs/first_irq or
> nrirqs/firstirq. But not a mix.

I totally agree, but check this out:
xen/include/asm-arm/irq.h:#define nr_irqs NR_IRQS

So wherever you write nr_irqs in *any* part of ARM IRQ code you end up
with a compile error ...
Not easy to fix, though, hence I moved to the name without the
underscore, even though I don't really like it.
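
A minimal standalone sketch of the clash, for anyone who wants to see it
outside the Xen tree. The NR_IRQS value is just an illustrative
assumption and the function name is made up; only the alias itself is
taken from the header quoted above.

```c
#include <assert.h>

#define NR_IRQS 1024        /* illustrative value, not taken from Xen */
#define nr_irqs NR_IRQS     /* the alias from asm-arm/irq.h */

/*
 * A declaration such as
 *     unsigned int checked_count(unsigned int nr_irqs);
 * is preprocessed into
 *     unsigned int checked_count(unsigned int 1024);
 * which does not compile. Using a different spelling for the
 * parameter sidesteps the macro.
 */
static unsigned int checked_count(unsigned int nrirqs)
{
    /* Here nr_irqs still expands to the constant, which is harmless. */
    assert(nrirqs <= nr_irqs);

    return nrirqs;
}
```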

Cheers,
Andre.

> 
> Also, it makes more sense to describe first the start then number.
> 
>> +{
>> +struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
>> +unsigned long flags;
>> +uint32_t ret = 0, i;
>> +
>> +local_irq_save(flags);
>> +vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
> 
> I am not convinced on the usefulness of taking all the locks in one go.
> At one point in the time, you only need to lock a given pending_irq.
> 
>> +
>> +for ( i = 0; i < nrirqs; i++ )
>> +ret |= pirqs[i]->priority << (i * 8);
> 
> Please avoid open-coding number.
> 
>> +
>> +vgic_unlock_irqs(pirqs, nrirqs);
>> +local_irq_restore(flags);
>> +
>> +return ret;
>> +}
>> +
>> +void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq, uint32_t value)
>> +{
>> +struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
>> +unsigned long flags;
>> +unsigned int i;
>> +
>> +local_irq_save(flags);
>> +vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
>> +
>> +for ( i = 0; i < nrirqs; i++, value >>= 8 )
> 
> Same here.
> 
>> +pirqs[i]->priority = value & 0xff;
>> +
>> +vgic_unlock_irqs(pirqs, nrirqs);
>> +local_irq_restore(flags);
>> +}
>> +
>>  bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
>>  {
>>  unsigned long flags;
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index ecf4969..f3791c8 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -198,6 +198,11 @@ void vgic_lock_irqs(struct vcpu *v, unsigned int nrirqs, unsigned int first_irq,
>>  struct pending_irq **pirqs);
>>  void vgic_unlock_irqs(struct pending_irq **pirqs, unsigned int nrirqs);
>>
>> +uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq);
>> +void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
>> + unsigned int first_irq, uint32_t reg);
>> +
>>  enum gic_sgi_mode;
>>
>>  /*
>>
> 
> Cheers,
> 


Re: [Xen-devel] [RFC PATCH v2 01/22] ARM: vGIC: introduce and initialize pending_irq lock

2017-08-16 Thread Andre Przywara
Hi,

On 10/08/17 16:35, Julien Grall wrote:
> Hi,
> 
> On 21/07/17 20:59, Andre Przywara wrote:
>> Currently we protect the pending_irq structure with the corresponding
>> VGIC VCPU lock. There are problems in certain corner cases (for
>> instance if an IRQ is migrating), so let's introduce a per-IRQ lock,
>> which will protect the consistency of this structure independent from
>> any VCPU.
>> For now this just introduces and initializes the lock, also adds
>> wrapper macros to simplify its usage (and help debugging).
>>
>> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
>> ---
>>  xen/arch/arm/vgic.c|  1 +
>>  xen/include/asm-arm/vgic.h | 11 +++
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 1e5107b..38dacd3 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -69,6 +69,7 @@ void vgic_init_pending_irq(struct pending_irq *p,
>> unsigned int virq)
>>  memset(p, 0, sizeof(*p));
>>  INIT_LIST_HEAD(&p->inflight);
>>  INIT_LIST_HEAD(&p->lr_queue);
>> +spin_lock_init(&p->lock);
>>  p->irq = virq;
>>  p->lpi_vcpu_id = INVALID_VCPU_ID;
>>  }
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index d4ed23d..1c38b9a 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -90,6 +90,14 @@ struct pending_irq
>>   * TODO: when implementing irq migration, taking only the current
>>   * vgic lock is not going to be enough. */
>>  struct list_head lr_queue;
>> +/* The lock protects the consistency of this structure. A single
>> status bit
>> + * can be read and/or set without holding the lock using the atomic
>> + * set_bit/clear_bit/test_bit functions, however accessing
>> multiple bits or
>> + * relating to other members in this struct requires the lock.
>> + * The list_head members are protected by their corresponding
>> VCPU lock,
>> + * it is not sufficient to hold this pending_irq lock here to
>> query or
>> + * change list order or affiliation. */
> 
> Actually, I have one question here. Is the vCPU lock sufficient to
> protect the list_head members? Or do you also mandate the pending_irq to
> be locked as well?

For *manipulating* a list (removing or adding a pending_irq) you need to
hold both locks. We need the VCPU lock because the list head in struct vcpu
could change, and we need the per-IRQ lock to prevent a pending_irq from
being inserted into two lists at the same time (and because the list_head
member variables in the pending_irq itself are changed).
However, just *checking* whether a certain pending_irq is a member of a
list works while holding only the per-IRQ lock.
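
To make the rule concrete, here is a minimal single-threaded sketch. It is
not Xen code: all toy_* names are made up for illustration, and the "locks"
are plain flags so that the two rules can be checked with assertions instead
of real spinlocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock: just remembers whether it is held. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct toy_vcpu {
    struct toy_lock lock;       /* VCPU lock, protects the list head */
    unsigned int nr_queued;     /* stands in for the list itself */
};

struct toy_irq {
    struct toy_lock lock;       /* per-IRQ lock */
    struct toy_vcpu *on_list;   /* NULL while the IRQ is on no list */
};

/* Manipulating a list: both the VCPU lock and the per-IRQ lock are required. */
static bool toy_queue_irq(struct toy_vcpu *v, struct toy_irq *p)
{
    assert(v->lock.held && p->lock.held);

    if ( p->on_list != NULL )   /* already on a list: refuse a second insert */
        return false;

    p->on_list = v;
    v->nr_queued++;
    return true;
}

/* Checking membership: holding the per-IRQ lock alone is enough. */
static bool toy_irq_is_queued(const struct toy_irq *p)
{
    assert(p->lock.held);
    return p->on_list != NULL;
}
```

A second toy_queue_irq() call for an IRQ that is still on another VCPU's
list is refused, which is the double-insertion case the per-IRQ lock is
there to prevent.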

> Also, it would be good to have the locking order documented maybe in
> docs/misc?

Yes, I agree that having a high-level VGIC document (focussing on the
locking to begin with) is a good idea.

Cheers,
Andre.

> 
>> +spinlock_t lock;
>>  };
>>
>>  #define NR_INTERRUPT_PER_RANK   32
>> @@ -156,6 +164,9 @@ struct vgic_ops {
>>  #define vgic_lock(v)   spin_lock_irq(&(v)->domain->arch.vgic.lock)
>>  #define vgic_unlock(v) spin_unlock_irq(&(v)->domain->arch.vgic.lock)
>>
>> +#define vgic_irq_lock(p, flags) spin_lock_irqsave(&(p)->lock, flags)
>> +#define vgic_irq_unlock(p, flags) spin_unlock_irqrestore(&(p)->lock,
>> flags)
>> +
>>  #define vgic_lock_rank(v, r, flags)   spin_lock_irqsave(&(r)->lock,
>> flags)
>>  #define vgic_unlock_rank(v, r, flags)
>> spin_unlock_irqrestore(&(r)->lock, flags)
>>
>>
> 
> Cheers,
> 



[Xen-devel] [RFC PATCH v2 17/22] ARM: vGIC: introduce vgic_lock_vcpu_irq()

2017-07-21 Thread Andre Przywara
Since a VCPU can own multiple IRQs, the natural locking order is to take
a VCPU lock first, then the individual per-IRQ locks.
However there are situations where the target VCPU is not known without
looking into the struct pending_irq first, which usually means we need to
take the IRQ lock first.
To solve this problem, we provide a function called vgic_lock_vcpu_irq(),
which takes a locked struct pending_irq and returns with *both* the
VCPU and the IRQ lock held.
This is done by looking up the target VCPU, then briefly dropping the
IRQ lock, taking the VCPU lock, then grabbing the per-IRQ lock again.
Before returning there is a check whether something has changed in the
brief period where we didn't hold the IRQ lock, retrying in this (very
rare) case.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 1ba0010..0e6dfe5 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -224,6 +224,48 @@ int vcpu_vgic_free(struct vcpu *v)
 return 0;
 }
 
+/**
+ * vgic_lock_vcpu_irq(): lock both the pending_irq and the corresponding VCPU
+ *
+ * @v: the VCPU (for private IRQs)
+ * @p: pointer to the locked struct pending_irq
+ * @flags: pointer to the IRQ flags used when locking the VCPU
+ *
+ * The function takes a locked IRQ and returns with both the IRQ and the
+ * corresponding VCPU locked. This is non-trivial due to the locking order
+ * being actually the other way round (VCPU first, then IRQ).
+ *
+ * Returns: pointer to the VCPU this IRQ is targeting.
+ */
+struct vcpu *vgic_lock_vcpu_irq(struct vcpu *v, struct pending_irq *p,
+                                unsigned long *flags)
+{
+    struct vcpu *target_vcpu;
+
+    ASSERT(spin_is_locked(&p->lock));
+
+    target_vcpu = vgic_get_target_vcpu(v, p);
+    spin_unlock(&p->lock);
+
+    do
+    {
+        struct vcpu *current_vcpu;
+
+        spin_lock_irqsave(&target_vcpu->arch.vgic.lock, *flags);
+        spin_lock(&p->lock);
+
+        current_vcpu = vgic_get_target_vcpu(v, p);
+
+        if ( target_vcpu->vcpu_id == current_vcpu->vcpu_id )
+            return target_vcpu;
+
+        spin_unlock(&p->lock);
+        spin_unlock_irqrestore(&target_vcpu->arch.vgic.lock, *flags);
+
+        target_vcpu = current_vcpu;
+    } while (1);
+}
+
 struct vcpu *vgic_get_target_vcpu(struct vcpu *v, struct pending_irq *p)
 {
 struct vgic_irq_rank *rank = vgic_rank_irq(v, p->irq);
-- 
2.9.0
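
For readers who want to experiment with the drop-and-retry idea outside of
Xen, the following is a small single-threaded model of it. Everything here
is hypothetical (struct model, window_without_irq_lock(), the flag-based
"locks"); the simulated retargeting merely stands in for another CPU
changing the target while the IRQ lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VCPUS 4

/* Hypothetical model state: lock flags plus the IRQ's current target. */
struct model {
    bool vcpu_locked[NR_VCPUS];
    bool irq_locked;
    unsigned int target;         /* VCPU the IRQ currently targets */
    unsigned int pending_moves;  /* simulated concurrent retargets left */
};

/* Stands in for the window in which nobody holds the IRQ lock. */
static void window_without_irq_lock(struct model *m)
{
    if ( m->pending_moves )
    {
        m->pending_moves--;
        m->target = (m->target + 1) % NR_VCPUS;
    }
}

/* Enter with the IRQ lock held, return with VCPU and IRQ lock held. */
static unsigned int lock_vcpu_irq(struct model *m)
{
    unsigned int guess;

    assert(m->irq_locked);
    guess = m->target;
    m->irq_locked = false;              /* drop the IRQ lock */

    for ( ; ; )
    {
        unsigned int now;

        window_without_irq_lock(m);     /* a racing update may hit here */

        assert(!m->vcpu_locked[guess]);
        m->vcpu_locked[guess] = true;   /* VCPU lock first ... */
        m->irq_locked = true;           /* ... then the IRQ lock */

        now = m->target;
        if ( now == guess )
            return guess;               /* target unchanged, both locks held */

        m->irq_locked = false;          /* wrong VCPU: undo and retry */
        m->vcpu_locked[guess] = false;
        guess = now;
    }
}
```

With two simulated retargets the loop runs three times and ends up holding
the lock of the final target only; without any racing update it returns on
the first pass.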




[Xen-devel] [RFC PATCH v2 18/22] ARM: vGIC: move virtual IRQ target VCPU from rank to pending_irq

2017-07-21 Thread Andre Przywara
The VCPU a shared virtual IRQ is targeting is currently stored in the
irq_rank structure.
For LPIs we already store the target VCPU in struct pending_irq, so
move SPIs over as well.
The ITS code, which was using this field already, was so far using the
VCPU lock to protect the pending_irq, so move this over to the new lock.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic-v2.c | 56 +++
 xen/arch/arm/vgic-v3-its.c |  9 +++---
 xen/arch/arm/vgic-v3.c | 69 ---
 xen/arch/arm/vgic.c| 73 +-
 xen/include/asm-arm/vgic.h | 13 +++--
 5 files changed, 96 insertions(+), 124 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 0c8a598..c7ed3ce 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -66,19 +66,22 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
  *
  * Note the byte offset will be aligned to an ITARGETSR boundary.
  */
-static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank *rank,
- unsigned int offset)
+static uint32_t vgic_fetch_itargetsr(struct vcpu *v, unsigned int offset)
 {
 uint32_t reg = 0;
 unsigned int i;
+unsigned long flags;
 
-ASSERT(spin_is_locked(&rank->lock));
-
-offset &= INTERRUPT_RANK_MASK;
 offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
 
 for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++ )
-reg |= (1 << read_atomic(&rank->vcpu[offset])) << (i * NR_BITS_PER_TARGET);
+{
+struct pending_irq *p = irq_to_pending(v, offset);
+
+vgic_irq_lock(p, flags);
+reg |= (1 << p->vcpu_id) << (i * NR_BITS_PER_TARGET);
+vgic_irq_unlock(p, flags);
+}
 
 return reg;
 }
@@ -89,32 +92,29 @@ static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank *rank,
  *
  * Note the byte offset will be aligned to an ITARGETSR boundary.
  */
-static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
+static void vgic_store_itargetsr(struct domain *d,
  unsigned int offset, uint32_t itargetsr)
 {
 unsigned int i;
 unsigned int virq;
 
-ASSERT(spin_is_locked(&rank->lock));
-
 /*
  * The ITARGETSR0-7, used for SGIs/PPIs, are implemented RO in the
  * emulation and should never call this function.
  *
- * They all live in the first rank.
+ * They all live in the first four bytes of ITARGETSR.
  */
-BUILD_BUG_ON(NR_INTERRUPT_PER_RANK != 32);
-ASSERT(rank->index >= 1);
+ASSERT(offset >= 4);
 
-offset &= INTERRUPT_RANK_MASK;
+virq = offset;
 offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
 
-virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
-
 for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++, virq++ )
 {
 unsigned int new_target, old_target;
+unsigned long flags;
 uint8_t new_mask;
+struct pending_irq *p = spi_to_pending(d, virq);
 
 /*
  * Don't need to mask as we rely on new_mask to fit for only one
@@ -151,16 +151,14 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
 /* The vCPU ID always starts from 0 */
 new_target--;
 
-old_target = read_atomic(&rank->vcpu[offset]);
+vgic_irq_lock(p, flags);
+old_target = p->vcpu_id;
 
 /* Only migrate the vIRQ if the target vCPU has changed */
 if ( new_target != old_target )
-{
-if ( vgic_migrate_irq(d->vcpu[old_target],
- d->vcpu[new_target],
- virq) )
-write_atomic(&rank->vcpu[offset], new_target);
-}
+vgic_migrate_irq(p, &flags, d->vcpu[new_target]);
+else
+vgic_irq_unlock(p, flags);
 }
 }
 
@@ -264,11 +262,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
 uint32_t itargetsr;
 
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
-vgic_unlock_rank(v, rank, flags);
+itargetsr = vgic_fetch_itargetsr(v, gicd_reg - GICD_ITARGETSR);
 *r = vreg_reg32_extract(itargetsr, info);
 
 return 1;
@@ -498,14 +492,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 uint32_t itargetsr;
 
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-   
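
As a side note on the register layout used by vgic_fetch_itargetsr() above:
each 32-bit ITARGETSR holds four one-byte target masks, and the emulation
reports exactly one bit per byte, namely the bit of the VCPU stored in
pending_irq. The sketch below shows that packing in isolation. The helper
names are hypothetical, and the two constants are assumed to be 4 and 8
(four 8-bit fields per register); VCPU IDs are assumed to be below 8, as
for a GICv2.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: four 8-bit target fields per 32-bit ITARGETSR. */
#define NR_TARGETS_PER_ITARGETSR 4
#define NR_BITS_PER_TARGET       8

/* Build one ITARGETSR value from the target VCPU ID of four IRQs. */
static uint32_t pack_itargetsr(const uint8_t vcpu_id[NR_TARGETS_PER_ITARGETSR])
{
    uint32_t reg = 0;
    unsigned int i;

    for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++ )
        reg |= (UINT32_C(1) << vcpu_id[i]) << (i * NR_BITS_PER_TARGET);

    return reg;
}

/* Return the lowest VCPU ID set in field @i of @reg, or -1 if none is set. */
static int first_target(uint32_t reg, unsigned int i)
{
    uint8_t mask = (reg >> (i * NR_BITS_PER_TARGET)) & 0xff;
    int bit;

    for ( bit = 0; bit < NR_BITS_PER_TARGET; bit++ )
        if ( mask & (1u << bit) )
            return bit;

    return -1;
}
```

For the VCPU IDs { 0, 1, 2, 7 } this yields 0x80040201: one bit set in each
byte, at the position of the respective VCPU ID.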

[Xen-devel] [RFC PATCH v2 21/22] ARM: vITS: injecting LPIs: use pending_irq lock

2017-07-21 Thread Andre Przywara
Instead of using an atomic access and hoping for the best, let's use
the new pending_irq lock now to make sure we read a sane version of
the target VCPU.
That still doesn't solve the problem mentioned in the comment, but
paves the way for future improvements.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic-v3-lpi.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 2306b58..9db26ed 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -140,20 +140,22 @@ void vgic_vcpu_inject_lpi(struct domain *d, unsigned int virq)
 {
 /*
  * TODO: this assumes that the struct pending_irq stays valid all of
- * the time. We cannot properly protect this with the current locking
- * scheme, but the future per-IRQ lock will solve this problem.
+ * the time. We cannot properly protect this with the current code,
+ * but a future refcounting will solve this problem.
  */
 struct pending_irq *p = irq_to_pending(d->vcpu[0], virq);
+unsigned long flags;
 unsigned int vcpu_id;
 
 if ( !p )
 return;
 
-vcpu_id = ACCESS_ONCE(p->vcpu_id);
-if ( vcpu_id >= d->max_vcpus )
-  return;
+vgic_irq_lock(p, flags);
+vcpu_id = p->vcpu_id;
+vgic_irq_unlock(p, flags);
 
-vgic_vcpu_inject_irq(d->vcpu[vcpu_id], virq);
+if ( vcpu_id < d->max_vcpus )
+vgic_vcpu_inject_irq(d->vcpu[vcpu_id], virq);
 }
 
 /*
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 09/22] ARM: vITS: protect LPI priority update with pending_irq lock

2017-07-21 Thread Andre Przywara
As the priority value is now officially a member of struct pending_irq,
we need to take its lock when manipulating it via ITS commands.
Make sure we take the IRQ lock after the VCPU lock when we need both.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 66095d4..705708a 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -402,6 +402,7 @@ static int update_lpi_property(struct domain *d, struct pending_irq *p)
 uint8_t property;
 int ret;
 
+ASSERT(spin_is_locked(&p->lock));
 /*
  * If no redistributor has its LPIs enabled yet, we can't access the
  * property table. In this case we just can't update the properties,
@@ -419,7 +420,7 @@ static int update_lpi_property(struct domain *d, struct pending_irq *p)
 if ( ret )
 return ret;
 
-write_atomic(&p->priority, property & LPI_PROP_PRIO_MASK);
+p->priority = property & LPI_PROP_PRIO_MASK;
 
 if ( property & LPI_PROP_ENABLED )
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
@@ -457,7 +458,7 @@ static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
 uint32_t devid = its_cmd_get_deviceid(cmdptr);
 uint32_t eventid = its_cmd_get_id(cmdptr);
 struct pending_irq *p;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 struct vcpu *vcpu;
 uint32_t vlpi;
 int ret = -1;
@@ -485,7 +486,8 @@ static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
 if ( unlikely(!p) )
 goto out_unlock_its;
 
-spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+spin_lock_irqsave(&vcpu->arch.vgic.lock, vcpu_flags);
+vgic_irq_lock(p, flags);
 
 /* Read the property table and update our cached status. */
 if ( update_lpi_property(d, p) )
@@ -497,7 +499,8 @@ static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
 ret = 0;
 
 out_unlock:
-spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+vgic_irq_unlock(p, flags);
+spin_unlock_irqrestore(&vcpu->arch.vgic.lock, vcpu_flags);
 
 out_unlock_its:
 spin_unlock(&its->its_lock);
@@ -517,7 +520,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
 struct pending_irq *pirqs[16];
 uint64_t vlpi = 0;  /* 64-bit to catch overflows */
 unsigned int nr_lpis, i;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 int ret = 0;
 
 /*
@@ -542,7 +545,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
 vcpu = get_vcpu_from_collection(its, collid);
 spin_unlock(&its->its_lock);
 
-spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+spin_lock_irqsave(&vcpu->arch.vgic.lock, vcpu_flags);
 read_lock(&its->d->arch.vgic.pend_lpi_tree_lock);
 
 do
@@ -555,9 +558,13 @@ static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
 
 for ( i = 0; i < nr_lpis; i++ )
 {
+vgic_irq_lock(pirqs[i], flags);
 /* We only care about LPIs on our VCPU. */
 if ( pirqs[i]->lpi_vcpu_id != vcpu->vcpu_id )
+{
+vgic_irq_unlock(pirqs[i], flags);
 continue;
+}
 
 vlpi = pirqs[i]->irq;
 /* If that fails for a single LPI, carry on to handle the rest. */
@@ -566,6 +573,8 @@ static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
 update_lpi_vgic_status(vcpu, pirqs[i]);
 else
 ret = err;
+
+vgic_irq_unlock(pirqs[i], flags);
 }
 /*
  * Loop over the next gang of pending_irqs until we reached the end of
@@ -576,7 +585,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
   (nr_lpis == ARRAY_SIZE(pirqs)) );
 
 read_unlock(&its->d->arch.vgic.pend_lpi_tree_lock);
-spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+spin_unlock_irqrestore(&vcpu->arch.vgic.lock, vcpu_flags);
 
 return ret;
 }
@@ -712,6 +721,7 @@ static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
 uint32_t intid = its_cmd_get_physical_id(cmdptr), _intid;
 uint16_t collid = its_cmd_get_collection(cmdptr);
 struct pending_irq *pirq;
+unsigned long flags;
 struct vcpu *vcpu = NULL;
 int ret = -1;
 
@@ -765,7 +775,9 @@ static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
  * We don't need the VGIC VCPU lock here, because the pending_irq isn't
  * in the radix tree yet.
  */
+vgic_irq_lock(pirq, flags);
 ret = update_lpi_property(its->d, pirq);
+vgic_irq_unlock(pirq, flags);
 if ( ret )
 goto out_remove_host_entry;
 
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 14/22] ARM: vGIC: move virtual IRQ configuration from rank to pending_irq

2017-07-21 Thread Andre Przywara
The IRQ configuration (level or edge triggered) for a group of IRQs
is still stored in the irq_rank structure.
Introduce a new bit called GIC_IRQ_GUEST_LEVEL in the "status" field,
which holds that information.
Remove the storage from the irq_rank and use the existing wrappers to
store and retrieve the configuration bit for multiple IRQs.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic-v2.c | 21 +++-
 xen/arch/arm/vgic-v3.c | 25 --
 xen/arch/arm/vgic.c| 81 +-
 xen/include/asm-arm/vgic.h |  5 ++-
 4 files changed, 73 insertions(+), 59 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index a3fd500..0c8a598 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -278,20 +278,12 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
 goto read_reserved;
 
 case VRANGE32(GICD_ICFGR, GICD_ICFGRN):
-{
-uint32_t icfgr;
-
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-icfgr = rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR, DABT_WORD)];
-vgic_unlock_rank(v, rank, flags);
 
-*r = vreg_reg32_extract(icfgr, info);
+irq = (gicd_reg - GICD_ICFGR) * 4;
+*r = vgic_fetch_irq_config(v, irq);
 
 return 1;
-}
 
 case VRANGE32(0xD00, 0xDFC):
 goto read_impl_defined;
@@ -529,13 +521,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 
 case VRANGE32(GICD_ICFGR2, GICD_ICFGRN): /* SPIs */
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-vreg_reg32_update(&rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR,
- DABT_WORD)],
-  r, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICFGR) * 4; /* 2 bit per IRQ */
+vgic_store_irq_config(v, irq, r);
 return 1;
 
 case VRANGE32(0xD00, 0xDFC):
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d3356ae..e9e36eb 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -722,20 +722,11 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
 return 1;
 
 case VRANGE32(GICD_ICFGR, GICD_ICFGRN):
-{
-uint32_t icfgr;
-
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL ) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-icfgr = rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR, DABT_WORD)];
-vgic_unlock_rank(v, rank, flags);
-
-*r = vreg_reg32_extract(icfgr, info);
-
+irq = (reg - GICD_ICFGR) * 4;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_config(v, irq);
 return 1;
-}
 
 default:
 printk(XENLOG_G_ERR
@@ -834,13 +825,9 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v,
 /* ICFGR1 for PPI's, which is implementation defined
if ICFGR1 is programmable or not. We chose to program */
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL ) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-vreg_reg32_update(&rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR,
- DABT_WORD)],
-  r, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (reg - GICD_ICFGR) * 4;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_config(v, irq, r);
 return 1;
 
 default:
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index ddcd99b..e5a4765 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -268,6 +268,55 @@ void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
 local_irq_restore(flags);
 }
 
+#define IRQS_PER_CFGR   16
+/**
+ * vgic_fetch_irq_config: assemble the configuration bits for a group of 16 IRQs
+ * @v: the VCPU for private IRQs, any VCPU of a domain for SPIs
+ * @first_irq: the first IRQ to be queried, must be aligned to 16
+ */
+uint32_t vgic_fetch_irq_config(struct vcpu *v, unsigned int first_irq)
+{
+struct pending_irq *pirqs[IRQS_PER_CFGR];
+unsigned long flags;
+uint32_t ret = 0, i;
+
+local_irq_save(flags);
+vgic_lock_irqs(v, IRQS_PER_CFGR, first_irq, pi
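
Since the archived copy of this patch is cut off before the packing loop,
here is a standalone sketch of the arithmetic the MMIO handlers above rely
on. It is an illustration, not the code from the patch: the helper names
are hypothetical, and the encoding assumes the GIC convention that the
upper bit of each 2-bit ICFGR field is 1 for edge-triggered and 0 for
level-sensitive interrupts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IRQS_PER_CFGR 16   /* 2 bits per IRQ, so 16 IRQs per 32-bit register */

/* Byte offset into the ICFGR range -> first IRQ covered (4 IRQs per byte). */
static unsigned int icfgr_offset_to_irq(unsigned int byte_offset)
{
    return byte_offset * 4;
}

/*
 * Assemble one ICFGR value from a per-IRQ "is level triggered" flag, which
 * stands in for a GIC_IRQ_GUEST_LEVEL-style status bit.
 */
static uint32_t pack_icfgr(const bool level[IRQS_PER_CFGR])
{
    uint32_t reg = 0;
    unsigned int i;

    for ( i = 0; i < IRQS_PER_CFGR; i++ )
        if ( !level[i] )
            reg |= UINT32_C(2) << (i * 2);   /* upper bit of the field: edge */

    return reg;
}
```

Byte offset 8 (ICFGR2) maps to IRQ 32, the first SPI, which matches the
"SPIs" comment in the GICv2 write handler.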

[Xen-devel] [RFC PATCH v2 03/22] ARM: vGIC: move gic_raise_inflight_irq() into vgic_vcpu_inject_irq()

2017-07-21 Thread Andre Przywara
Currently there is a gic_raise_inflight_irq(), which serves the very
special purpose of handling a newly injected interrupt while an older
one is still handled. This has only one user, in vgic_vcpu_inject_irq().

Now with the introduction of the pending_irq lock this will later on
result in a nasty deadlock, which can only be solved properly by
actually embedding the function into the caller (and dropping the lock
later in-between).

This has the admittedly hideous consequence of needing to export
gic_update_one_lr(), but this will go away in a later stage of a rework.
In this respect this patch is more a temporary kludge.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c| 30 +-
 xen/arch/arm/vgic.c   | 11 ++-
 xen/include/asm-arm/gic.h |  2 +-
 3 files changed, 12 insertions(+), 31 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 2c99d71..5bd66a2 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -44,8 +44,6 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
 
 #undef GIC_DEBUG
 
-static void gic_update_one_lr(struct vcpu *v, int i);
-
 static const struct gic_hw_operations *gic_hw_ops;
 
 void register_gic_ops(const struct gic_hw_operations *ops)
@@ -416,32 +414,6 @@ void gic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p)
 gic_remove_from_lr_pending(v, p);
 }
 
-void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
-{
-struct pending_irq *n = irq_to_pending(v, virtual_irq);
-
-/* If an LPI has been removed meanwhile, there is nothing left to raise. */
-if ( unlikely(!n) )
-return;
-
-ASSERT(spin_is_locked(&v->arch.vgic.lock));
-
-/* Don't try to update the LR if the interrupt is disabled */
-if ( !test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) )
-return;
-
-if ( list_empty(&n->lr_queue) )
-{
-if ( v == current )
-gic_update_one_lr(v, n->lr);
-}
-#ifdef GIC_DEBUG
-else
-gdprintk(XENLOG_DEBUG, "trying to inject irq=%u into d%dv%d, when it is still lr_pending\n",
- virtual_irq, v->domain->domain_id, v->vcpu_id);
-#endif
-}
-
 /*
  * Find an unused LR to insert an IRQ into, starting with the LR given
  * by @lr. If this new interrupt is a PRISTINE LPI, scan the other LRs to
@@ -503,7 +475,7 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq,
 gic_add_to_lr_pending(v, p);
 }
 
-static void gic_update_one_lr(struct vcpu *v, int i)
+void gic_update_one_lr(struct vcpu *v, int i)
 {
 struct pending_irq *p;
 int irq;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 38dacd3..7b122cd 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -536,7 +536,16 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 
 if ( !list_empty(&n->inflight) )
 {
-gic_raise_inflight_irq(v, virq);
+bool update = test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) &&
+  list_empty(&n->lr_queue) && (v == current);
+
+if ( update )
+gic_update_one_lr(v, n->lr);
+#ifdef GIC_DEBUG
+else
+gdprintk(XENLOG_DEBUG, "trying to inject irq=%u into d%dv%d, when it is still lr_pending\n",
+ n->irq, v->domain->domain_id, v->vcpu_id);
+#endif
 goto out;
 }
 
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 6203dc5..cf8b8fb 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -237,12 +237,12 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
 
 extern void gic_inject(void);
 extern void gic_clear_pending_irqs(struct vcpu *v);
+extern void gic_update_one_lr(struct vcpu *v, int lr);
 extern int gic_events_need_delivery(void);
 
 extern void init_maintenance_interrupt(void);
 extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
 unsigned int priority);
-extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq);
 extern void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p);
 extern void gic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
 
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 04/22] ARM: vGIC: rename pending_irq->priority to cur_priority

2017-07-21 Thread Andre Przywara
In preparation for storing the virtual interrupt priority in the struct
pending_irq, rename the existing "priority" member to "cur_priority".
This is to signify that this is the current priority of an interrupt
which has been injected into a VCPU. Once this has happened, its priority
must stay fixed at this value; subsequent MMIO accesses to change the
priority can only affect newly triggered interrupts.
Also, since the priority is a sorting criterion for the inflight list, it
must not change while the IRQ is on a VCPU's list.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic-v2.c  |  2 +-
 xen/arch/arm/gic-v3.c  |  2 +-
 xen/arch/arm/gic.c | 10 +-
 xen/arch/arm/vgic.c|  6 +++---
 xen/include/asm-arm/vgic.h |  2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index cbe71a9..735e23d 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -437,7 +437,7 @@ static void gicv2_update_lr(int lr, const struct pending_irq *p,
 BUG_ON(lr < 0);
 
 lr_reg = (((state & GICH_V2_LR_STATE_MASK) << GICH_V2_LR_STATE_SHIFT)  |
-  ((GIC_PRI_TO_GUEST(p->priority) & GICH_V2_LR_PRIORITY_MASK)
+  ((GIC_PRI_TO_GUEST(p->cur_priority) & GICH_V2_LR_PRIORITY_MASK)
  << GICH_V2_LR_PRIORITY_SHIFT) |
   ((p->irq & GICH_V2_LR_VIRTUAL_MASK) << 
GICH_V2_LR_VIRTUAL_SHIFT));
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f990eae..449bd55 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -961,7 +961,7 @@ static void gicv3_update_lr(int lr, const struct pending_irq *p,
 if ( current->domain->arch.vgic.version == GIC_V3 )
 val |= GICH_LR_GRP1;
 
-val |= ((uint64_t)p->priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
+val |= ((uint64_t)p->cur_priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
 val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
 
if ( p->desc != NULL )
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 5bd66a2..8dec736 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -389,7 +389,7 @@ static inline void gic_add_to_lr_pending(struct vcpu *v, struct pending_irq *n)
 
 list_for_each_entry ( iter, &v->arch.vgic.lr_pending, lr_queue )
 {
-if ( iter->priority > n->priority )
+if ( iter->cur_priority > n->cur_priority )
 {
 list_add_tail(&n->lr_queue, &iter->lr_queue);
 return;
@@ -542,7 +542,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
  test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
  !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
-gic_raise_guest_irq(v, irq, p->priority);
+gic_raise_guest_irq(v, irq, p->cur_priority);
 else {
 list_del_init(&p->inflight);
 /*
@@ -610,7 +610,7 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 /* No more free LRs: find a lower priority irq to evict */
 list_for_each_entry_reverse( p_r, inflight_r, inflight )
 {
-if ( p_r->priority == p->priority )
+if ( p_r->cur_priority == p->cur_priority )
 goto out;
 if ( test_bit(GIC_IRQ_GUEST_VISIBLE, &p_r->status) &&
  !test_bit(GIC_IRQ_GUEST_ACTIVE, &p_r->status) )
@@ -676,9 +676,9 @@ int gic_events_need_delivery(void)
  * ordered by priority */
 list_for_each_entry( p, &v->arch.vgic.inflight_irqs, inflight )
 {
-if ( GIC_PRI_TO_GUEST(p->priority) >= mask_priority )
+if ( GIC_PRI_TO_GUEST(p->cur_priority) >= mask_priority )
 goto out;
-if ( GIC_PRI_TO_GUEST(p->priority) >= active_priority )
+if ( GIC_PRI_TO_GUEST(p->cur_priority) >= active_priority )
 goto out;
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
 {
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 7b122cd..21b545e 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -395,7 +395,7 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 p = irq_to_pending(v_target, irq);
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
 if ( !list_empty(&p->inflight) && !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-gic_raise_guest_irq(v_target, irq, p->priority);
+gic_raise_guest_irq(v_target, irq, p->cur_priority);
 spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
 if ( p->desc != NULL )
 {
@@ -550,7 +550,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 }
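
To illustrate why the commit message insists that cur_priority stays fixed
while an IRQ is queued, here is a small sketch of a priority-sorted queue.
It uses a plain array instead of Xen's list_head, and all toy_* names are
hypothetical; only the comparison mirrors the
"iter->cur_priority > n->cur_priority" test in gic_add_to_lr_pending().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUED 8

struct toy_queue {
    uint8_t prio[MAX_QUEUED];  /* cur_priority of each queued IRQ */
    size_t len;
};

/* Insert in front of the first entry with a strictly larger priority value. */
static int toy_enqueue(struct toy_queue *q, uint8_t cur_priority)
{
    size_t pos, i;

    if ( q->len == MAX_QUEUED )
        return -1;

    for ( pos = 0; pos < q->len; pos++ )
        if ( q->prio[pos] > cur_priority )
            break;

    for ( i = q->len; i > pos; i-- )
        q->prio[i] = q->prio[i - 1];

    q->prio[pos] = cur_priority;
    q->len++;

    return 0;
}

/* The invariant that every walker of the queue relies on. */
static int toy_is_sorted(const struct toy_queue *q)
{
    size_t i;

    for ( i = 1; i < q->len; i++ )
        if ( q->prio[i - 1] > q->prio[i] )
            return 0;

    return 1;
}
```

Overwriting the priority of an entry that is already queued breaks the
ordering without anybody noticing, which is why a guest's priority update
must only take effect the next time the interrupt is injected.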
 

[Xen-devel] [RFC PATCH v2 12/22] ARM: vGIC: protect gic_update_one_lr() with pending_irq lock

2017-07-21 Thread Andre Przywara
When we return from a domain with the active bit set in an LR,
we update our pending_irq accordingly. This touches multiple status
bits, so requires the pending_irq lock.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 9637682..84b282b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -508,6 +508,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 
 if ( lr_val.state & GICH_LR_ACTIVE )
 {
+vgic_irq_lock(p, flags);
 set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
  test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
@@ -521,6 +522,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 gdprintk(XENLOG_WARNING, "unable to inject hw irq=%d into d%dv%d: already active in LR%d\n",
  irq, v->domain->domain_id, v->vcpu_id, i);
 }
+vgic_irq_unlock(p, flags);
 }
 else if ( lr_val.state & GICH_LR_PENDING )
 {
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 11/22] ARM: vGIC: protect gic_events_need_delivery() with pending_irq lock

2017-07-21 Thread Andre Przywara
gic_events_need_delivery() reads the cur_priority field twice and also
relies on the consistency of the status bits.
So it should take the pending_irq lock.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/gic.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index df89530..9637682 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -666,7 +666,7 @@ int gic_events_need_delivery(void)
 {
 struct vcpu *v = current;
 struct pending_irq *p;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 const unsigned long apr = gic_hw_ops->read_apr(0);
 int mask_priority;
 int active_priority;
@@ -675,7 +675,7 @@ int gic_events_need_delivery(void)
 mask_priority = gic_hw_ops->read_vmcr_priority();
 active_priority = find_next_bit(&apr, 32, 0);
 
-spin_lock_irqsave(&v->arch.vgic.lock, flags);
+spin_lock_irqsave(&v->arch.vgic.lock, vcpu_flags);
 
 /* TODO: We order the guest irqs by priority, but we don't change
  * the priority of host irqs. */
@@ -684,19 +684,21 @@ int gic_events_need_delivery(void)
  * ordered by priority */
 list_for_each_entry( p, &v->arch.vgic.inflight_irqs, inflight )
 {
-if ( GIC_PRI_TO_GUEST(p->cur_priority) >= mask_priority )
-goto out;
-if ( GIC_PRI_TO_GUEST(p->cur_priority) >= active_priority )
-goto out;
-if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+vgic_irq_lock(p, flags);
+if ( GIC_PRI_TO_GUEST(p->cur_priority) < mask_priority &&
+ GIC_PRI_TO_GUEST(p->cur_priority) < active_priority &&
+ !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
 {
-rc = 1;
-goto out;
+vgic_irq_unlock(p, flags);
+continue;
 }
+
+rc = test_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+vgic_irq_unlock(p, flags);
+break;
 }
 
-out:
-spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+spin_unlock_irqrestore(&v->arch.vgic.lock, vcpu_flags);
 return rc;
 }
 
-- 
2.9.0
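
Reading the hunk above as posted, the rewritten loop body does not look like
a pure refactoring of the old one, so it may be worth double-checking. The
sketch below models the per-entry decision of both versions as plain
functions and compares them exhaustively over a small range. It is a
standalone model, not Xen code: old_step() and new_step() are hypothetical
names, a result of -1 means "continue with the next entry", and rc is
assumed to start out as 0 (its initialisation is not visible in the hunk).

```c
#include <assert.h>
#include <stdbool.h>

/* Old loop body: stop with rc = 0 on a priority miss, rc = 1 if enabled. */
static int old_step(int prio, int mask, int active, bool enabled)
{
    if ( prio >= mask )
        return 0;
    if ( prio >= active )
        return 0;
    if ( enabled )
        return 1;
    return -1;                  /* keep scanning the list */
}

/* New loop body: continue only if both checks pass and the IRQ is disabled. */
static int new_step(int prio, int mask, int active, bool enabled)
{
    if ( prio < mask && prio < active && !enabled )
        return -1;              /* keep scanning the list */
    return enabled ? 1 : 0;     /* rc = test_bit(ENABLED); break; */
}

/* Count the inputs (all values in 0..3) for which the two versions differ. */
static unsigned int count_mismatches(void)
{
    unsigned int n = 0;
    int prio, mask, active, en;

    for ( prio = 0; prio < 4; prio++ )
        for ( mask = 0; mask < 4; mask++ )
            for ( active = 0; active < 4; active++ )
                for ( en = 0; en < 2; en++ )
                    if ( old_step(prio, mask, active, en) !=
                         new_step(prio, mask, active, en) )
                        n++;

    return n;
}
```

In this model the two versions disagree exactly when one of the priority
checks fails while the interrupt is enabled: the old code returns 0 there,
the new code returns 1. If that change is intended, it probably deserves a
sentence in the commit message.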




[Xen-devel] [RFC PATCH v2 20/22] ARM: vGIC: move virtual IRQ enable bit from rank to pending_irq

2017-07-21 Thread Andre Przywara
The enabled bits for a group of IRQs are still stored in the irq_rank
structure, although we already have the same information in pending_irq,
in the GIC_IRQ_GUEST_ENABLED bit of the "status" field.
Remove the storage from the irq_rank and just utilize the existing
wrappers to cover enabling/disabling of multiple IRQs.
This also marks the removal of the last member of struct vgic_irq_rank.
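
To illustrate the idea of keeping the enable state only per IRQ, here is a minimal standalone sketch of how a 32-bit ISENABLER/ICENABLER value could be gathered from, and scattered to, per-IRQ status bits. All names (`pending_irq_sketch`, `fetch_irq_enabled()`, `store_irq_enable()`, `first_irq_for_offset()`, the table size and the bit position) are hypothetical and do not reproduce the helpers added by this patch; locking is left out entirely. The base offset 0x100 is the GICD_ISENABLER offset from the GIC architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_IRQS_SKETCH     64u     /* hypothetical table size */
#define IRQ_GUEST_ENABLED  0u      /* hypothetical bit position in "status" */
#define ISENABLER_BASE     0x100u  /* GICD_ISENABLER offset in the distributor */

/* Hypothetical stand-in for struct pending_irq. */
struct pending_irq_sketch {
    unsigned long status;
};

static struct pending_irq_sketch irqs[NR_IRQS_SKETCH];

/* Each register byte covers 8 IRQs, so a byte offset maps to offset * 8. */
static unsigned int first_irq_for_offset(unsigned int reg_offset)
{
    return (reg_offset - ISENABLER_BASE) * 8;
}

/* Gather the enabled bits of 32 consecutive IRQs into one register value. */
static uint32_t fetch_irq_enabled(unsigned int first_irq)
{
    uint32_t reg = 0;

    assert(first_irq + 32 <= NR_IRQS_SKETCH);
    for ( unsigned int i = 0; i < 32; i++ )
        if ( irqs[first_irq + i].status & (1UL << IRQ_GUEST_ENABLED) )
            reg |= UINT32_C(1) << i;

    return reg;
}

/*
 * Write-1-to-set (enable == true) or write-1-to-clear (enable == false):
 * only IRQs whose bit is set in "value" are touched.
 */
static void store_irq_enable(unsigned int first_irq, uint32_t value,
                             bool enable)
{
    assert(first_irq + 32 <= NR_IRQS_SKETCH);
    for ( unsigned int i = 0; i < 32; i++ )
    {
        if ( !(value & (UINT32_C(1) << i)) )
            continue;
        if ( enable )
            irqs[first_irq + i].status |= 1UL << IRQ_GUEST_ENABLED;
        else
            irqs[first_irq + i].status &= ~(1UL << IRQ_GUEST_ENABLED);
    }
}
```

With this layout a read of either register returns the same gathered value, which matches the emulation in the diff below, where both ISENABLER and ICENABLER reads call the same fetch helper.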

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 xen/arch/arm/vgic-v2.c |  41 +++--
 xen/arch/arm/vgic-v3.c |  41 +++--
 xen/arch/arm/vgic.c| 201 +++--
 xen/include/asm-arm/vgic.h |  10 +--
 4 files changed, 152 insertions(+), 141 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index c7ed3ce..3320642 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -166,9 +166,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
 register_t *r, void *priv)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
-unsigned long flags;
 unsigned int irq;
 
 perfc_incr(vgicd_reads);
@@ -222,20 +220,16 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
 
 case VRANGE32(GICD_ISENABLER, GICD_ISENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISENABLER, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-*r = vreg_reg32_extract(rank->ienable, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ISENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_enabled(v, irq);
 return 1;
 
 case VRANGE32(GICD_ICENABLER, GICD_ICENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICENABLER, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-*r = vreg_reg32_extract(rank->ienable, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_enabled(v, irq);
 return 1;
 
 /* Read the pending status of an IRQ via GICD is not supported */
@@ -386,10 +380,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 register_t r, void *priv)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
-uint32_t tr;
-unsigned long flags;
 unsigned int irq;
 
 perfc_incr(vgicd_writes);
@@ -426,24 +417,16 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 
 case VRANGE32(GICD_ISENABLER, GICD_ISENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISENABLER, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-tr = rank->ienable;
-vreg_reg32_setbits(&rank->ienable, r, info);
-vgic_enable_irqs(v, (rank->ienable) & (~tr), rank->index);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ISENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_enable(v, irq, r);
 return 1;
 
 case VRANGE32(GICD_ICENABLER, GICD_ICENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICENABLER, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-tr = rank->ienable;
-vreg_reg32_clearbits(&rank->ienable, r, info);
-vgic_disable_irqs(v, (~rank->ienable) & tr, rank->index);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_disable(v, irq, r);
 return 1;
 
 case VRANGE32(GICD_ISPENDR, GICD_ISPENDRN):
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index e9d46af..00cc1e5 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -676,8 +676,6 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
 register_t *r)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
-unsigned long flags;
 unsigned int irq;
 
 switch ( reg )
@@ -689,20 +687,16 @@ static int __v
