Re: [Xen-devel] Problem with commit bf22ff45bed664aefb5c4e43029057a199b7070c

2017-07-07 Thread Marc Zyngier
On 07/07/17 15:51, Juergen Gross wrote:
> Commit bf22ff45bed664aefb5c4e43029057a199b7070c ("genirq: Avoid
> unnecessary low level irq function calls") breaks Xen guest
> save/restore handling.
> 
> The main problem are the PV devices using Xen event channels as
> interrupt sources which are represented as an "irq chip" in the kernel.
> When saving the guest the event channels are masked internally. At
> restore time event channels are re-established and unmasked via
> irq_startup(). Unfortunately above commit will let the unmask operation
> be a nop as the irq handling doesn't know about the masking done before.
> 
> I have a patch repairing the issue, but I'm not sure if this way to do
> it would be accepted. I have exported mask_irq() and I'm doing the
> masking now through this function. Would the attached patch be
> acceptable? Or is there a better way to solve the problem?

The correct API to prevent an interrupt from firing would be
disable_irq(), which is much more straightforward and is readily exported.

I'm unsure of what the expected flow is though, so there might be some
more fixes to be made in your code.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 02/18] xen/arm: Restore HCR_EL2 register

2017-03-22 Thread Marc Zyngier
On 22/03/17 12:45, Mark Rutland wrote:
> On Wed, Mar 22, 2017 at 12:16:20PM +, Julien Grall wrote:
>> (CC Mark for the TLB question)
> 
> [Adding Marc since he should understand this better than I do]
> 
> I've trimmed a lot of context here, since it wasn't clear if it was
> relevant to the question. If there's something I've missed, please point
> that out explicitly.
> 
>> I am not entirely sure if we can only restore HCR_RW. My concern is
>> some of the bits are cached in the TLBs. Looking at the ARM ARM
>> (both v7 and v8 see D4.7 in DDI0487A.k_iss10775): "In these
>> descriptions, TLB entries for a translation regime for a particular
>> Exception level are out of context when executing at a higher
>> Exception level.". Which I interpret as TLB result cannot be cached
>> when translating a guest VA to guest PA at EL2. So I guess restoring
>> only HCR_RW might be fine.
> 
> My understanding is that when a translation regime is out-of-context,
> the only requirement is that the core does not begin speculative walks
> for that translation regime. See ARM DDI 0487A.k_iss10775, page D4-1735,
> "Use of out-of-context translation regimes".

+1. An AT instruction is certainly allowed to hit in the TLB, and the
only thing the core is not allowed to do is to speculate (because you
cannot restore a guest context atomically).

> I believe that an explicit address translation involving a translation
> regime can validly make use of (any part of) that regime's state and/or
> allocate new TLB entries for that regime.
> 
> It looks like you're asking if you can avoid installing all of the
> registers related to the EL1&0 regime when performing an EL1&0
> translation at EL2, right?
> 
> I do not believe that is valid; my understanding is that all of the
> relevant registers need to be installed first.

Indeed. I can't see how the you can perform an AT without first
restoring all of the context that define how the page tables are parsed
and including all the element that may be used to hit in the TLBs. Even
worse, you could create TLB entries (as you said, AT doesn't preclude
TLBs from being populated) that would conflict with an existing one on
the next lookup, generating a Conflict Abort.

In short, not restoring the full VM context is doomed.

Now, what is the actual question? ;-)

Cheers,

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v11 4/5] arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops

2015-11-17 Thread Marc Zyngier
On 17/11/15 17:29, Will Deacon wrote:
> On Tue, Nov 10, 2015 at 02:11:38PM +, Stefano Stabellini wrote:
>> On Thu, 5 Nov 2015, Stefano Stabellini wrote:
>>> Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM64.
>>> Necessary duplication of paravirt.h and paravirt.c with ARM.
>>>
>>> The only paravirt interface supported is pv_time_ops.steal_clock, so no
>>> runtime pvops patching needed.
>>>
>>> This allows us to make use of steal_account_process_tick for stolen
>>> ticks accounting.
>>>
>>> Signed-off-by: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
>>> Acked-by: Marc Zyngier <marc.zyng...@arm.com>
>>
>> Ping?
>>
>> Catalin, Will,
>> are you happy with this change?
> 
> I'm happy if Marc's happy. Marc?

My Ack is already on the tin! ;-)

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] KVM: arm64: add workaround for Cortex-A57 erratum #852523

2015-09-14 Thread Marc Zyngier
On 14/09/15 16:06, Will Deacon wrote:
> When restoring the system register state for an AArch32 guest at EL2,
> writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
> which can lead to the guest effectively running with junk in the DACR
> and running into unexpected domain faults.
> 
> This patch works around the issue by re-ordering our restoration of the
> AArch32 register aliases so that they happen before the AArch64 system
> registers. Ensuring that the registers are restored in this order
> guarantees that they will be correctly synchronised by the core.
> 
> Cc: <sta...@vger.kernel.org>
> Cc: Marc Zyngier <marc.zyng...@arm.com>
> Signed-off-by: Will Deacon <will.dea...@arm.com>

Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>

I'll queue that together with the next batch of fixes.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 20/22] xen/arm: ITS: Map ITS translation space

2015-09-02 Thread Marc Zyngier
On 02/09/15 16:45, Ian Campbell wrote:
> On Tue, 2015-08-18 at 23:37 +0100, Marc Zyngier wrote:
>> On Tue, 18 Aug 2015 20:14:43 +0100 Julien Grall <julien.gr...@citrix.com> 
>> wrote:
>>
>>> Marc pointed me today that if the processor is writing into 
>>> GITS_TRANSLATER it may be able to deadlock the system.
>>>
>>> Reading more closely the spec (8.1.3 IHI0069A), there is undefined 
>>> behavior when writing to this register with wrong access size.
>>>
>>> Currently the page table are shared between the processor and the SMMU, 
>>>
>>> so that means that a domain will be able to deadlock the processor and 
>>> therefore the whole platform.
>>
>> Indeed. A CPU should *never* be able to write to the GITS_TRANSLATER
>> register. What would be the meaning anyway? How would a DeviceID be
>> sampled? This is definitely UNPREDICTIBLE territory, and you want to
>> make sure a guest cannot directly write to the HW.
>>
>>> So we should never expose GITS_TRANSLATER into the processor page 
>>> table. 
>>> Which means unsharing some parts if not all of the page tables between 
>>> the processor and the SMMU.
>>
>> Agreed. It looks to me like the CPU should only see the the virtual
>> ITS, and nothing else.
> 
> It's rather unfortunate that using an ITS therefore precludes sharing stage
> -2 page tables between MMU and SMMU, which it seems otherwise the
> architecture designers have tried hard to allow.
> 
> Do you know if this will be fixed in some future revision (although given
> we now need to have the functionality anyway I'm not sure it help more than
>  saving a few pages of memory :-()

I don't have any idea if something is being worked on to address this,
but I think you may be able to share at least the page tables describing
the memory, which should really be the bulk of the page tables.

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 20/22] xen/arm: ITS: Map ITS translation space

2015-08-18 Thread Marc Zyngier
On Tue, 18 Aug 2015 20:14:43 +0100
Julien Grall julien.gr...@citrix.com wrote:

 Hi,
 
 On 27/07/2015 04:12, vijay.kil...@gmail.com wrote:
  From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com
 
  ITS translation space contains GITS_TRANSLATOR register
 
 s/GITS_TRANSLATOR/GITS_TRANSLATOR/

I assume you mean GITS_TRANSLATER? ;-)
 
  which is written by device to raise LPI. This space needs
  to mapped to every domain address space for all physical
  ITS available,so that device can access GITS_TRANSLATOR
 
 Ditto
 
  register using SMMU.
 
 Marc pointed me today that if the processor is writing into 
 GITS_TRANSLATER it may be able to deadlock the system.
 
 Reading more closely the spec (8.1.3 IHI0069A), there is undefined 
 behavior when writing to this register with wrong access size.
 
 Currently the page table are shared between the processor and the SMMU, 
 so that means that a domain will be able to deadlock the processor and 
 therefore the whole platform.

Indeed. A CPU should *never* be able to write to the GITS_TRANSLATER
register. What would be the meaning anyway? How would a DeviceID be
sampled? This is definitely UNPREDICTIBLE territory, and you want to
make sure a guest cannot directly write to the HW.

 So we should never expose GITS_TRANSLATER into the processor page table. 
 Which means unsharing some parts if not all of the page tables between 
 the processor and the SMMU.

Agreed. It looks to me like the CPU should only see the the virtual
ITS, and nothing else.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 05/17] xen/arm: ITS: implement hw_irq_controller for LPIs

2015-07-16 Thread Marc Zyngier
On 16/07/15 15:49, Ian Campbell wrote:
 On Wed, 2015-07-15 at 22:31 +0530, Vijay Kilari wrote:
 Sorry. I may not be clear.

 In Linux when MSIx is enabled.
 Device is created first and its_device-its_collection is set for
 first onine cpu
 and mapvi is called with collection set in its_create_device() as below.
 [...]
 When affinity is set, movi is sent with collection id selected
 for the cpu_mask.
 
 OK, so at start of day all events for a given device end up mapped to a
 single processor, but as individual interrupts are rebalanced they will
 then become spread out among different CPUs, that makes sense.
 
 I'm not sure I follow the scheme which Linux is using to achieve that
 behaviour though, so I've CC'd Marc.
 
 It seems to me from looking at the Linux code that its_dev-collection
 doesn't, as one might expect, contain the collection associated with the
 device (and therefore all Events of that device), but rather it contains
 the collection corresponding to a single Event which is the one which
 most recently had its affinity changed, i.e. the one with an potentially
 outstanding INV.
 
 So its_send_movi sends the MOVI command which ends up calling INV on
 that collection, which at this point is the _old_ collection. Then it
 stores the new collection for that Event in its_dev-collection.
 
 What I don't follow is how set_lpi_config copes with this, since it
 always sends the INV for an Event to the corresponding its_dev's
 collection, but after a bunch of Events/IRQs have been rebalanced there
 doesn't seem to be any guarantee that this would corresponding to the
 current collection associated with the interrupt which has been
 reconfigured.
 
 Marc, have I misunderstood what the Linux ITS driver is trying to
 achieve or how it goes about it?

Seems like there is a lot of terminology related confusion here, so
let's start from the beginning:

The ITS maps a [DevID, EventID] pair to a [LPI, Collection] pair, where
DevID is the device writing the MSI, EventID is the payload being
written, LPI is the actual interrupt number, and Collection is roughly
equivalent to a target CPU.

The DevID is not directly associated to a Collection; it is the LPI that
is mapped to a Collection (through the MAPVI command). When you perform
a MOVI, you are moving the *result* of the [DevID, EventID] pair
(effectively an LPI) from a Collection to another.

 Perhaps lpi_set_config (and by extension its_(un)mask_irq) is only
 called at start of day?

lpi_set_config is indeed only used when we put the interrupt in service.
It updates the configuration table for the redistributor (the thing
pointed to by the Collection) in charge of this LPI (*LPI*, not device
or event) and issue an INV so that the redistributor (gets the new
value). And yes, this is always done by [DevID, EventID] pair.

When you perform a MOVI, this implicitly contains an INV for the target
redistributor (MOVI will also move the pending state, and the old
redistributor will be left alone).

 I'm not sure if Xen is going to (want to) follow the way Linux arranges
 this stuff, but I would very much like to at least understand it, since
 it may have implications...

Hope this helps,

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 05/17] xen/arm: ITS: implement hw_irq_controller for LPIs

2015-07-16 Thread Marc Zyngier
On 16/07/15 17:18, Ian Campbell wrote:
 On Thu, 2015-07-16 at 16:21 +0100, Marc Zyngier wrote:
 Hope this helps,
 
 It, plus the chat we had on IRC did, yes, thanks.
 
 In summary:
 
 I was very confusedly talking about INV when I meant SYNC.
 
 There is a real issue with the update of its_dev-collection in
 its_set_affinity since it will result in the wrong collection being used
 for subsequent affinity operations.

Yeah, this is a very stupid bug on my side, and I realize that I never
saw it because the whole ITS driver has been written on a model that can
only generate a single interrupt per device. Lovely.

 Marc is intending to replace its_dev-collection with an array
 (presumably dynamically allocated) in its_dev mapping eventid to a
 collection.

Exactly.

 I think that is probably the right answer for Xen too.

I'd think so.

Thanks so much for putting me right!

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ARM: KVM/XEN: how should we support virt-what?

2015-03-25 Thread Marc Zyngier
On 25/03/15 09:44, Andrew Jones wrote:
 Hello ARM virt maintainers,
 
 I'd like to start a discussion about supporting virt-what[1]. virt-what
 allows userspace to determine if the system it's running on is running
 in a guest, and of what type (KVM, Xen, etc.). Despite it being a best
 effort tool, see the Caveat emptor in [1], it has become quite a useful
 tool, and is showing up in different places, such as OpenStack. If you
 look at the code[2], specifically [3], then you'll see how it works on
 x86, which is to use the dedicated hypervisor cpuid leaves. I'm
 wondering what equivalent we have, or can develop, for arm.
 Here are some thoughts;
 0) there's already something we can use, and I just need to be told
about it.
 1) be as similar as possible to x86 by dedicating some currently
undefined sysreg bits. This would take buy-in from lots of parties,
so is not likely the way to go.
 2) create a specific DT node that will get exposed through sysfs, or
somewhere.
 3) same as (2), but just use the nodes currently in mach-virt's DT
as the indication we're a guest. This would just be a heuristic,
i.e. have virtio mmio  psci.method == hvc, or something,
and we'd still need a way to know if we're kvm vs. xen vs. ??.

KVM doesn't have any specific DT node so far (at least kvmtool doesn't
expose anything, and QEMU doesn't seem to do it either).

You could define KVM as being !Xen by looking for the Xen hypervisor
node, but that's quite a heuristic.

Overall, I'm quite reluctant to introduce things that would immediately
become an ABI (cue the recent bogomips flame wars), forming a contract
between the host userspace and the guest userspace.

M.
-- 
Jazz is not dead. It just smells funny...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel