On 05.12.19 09:41, Peng Fan wrote:
>> Subject: Re: [PATCH 1/2] arm: irqchip: add lock in irqchip_inject_pending
>>
>> On 05.12.19 03:28, Peng Fan wrote:
>>> Hi Jan,
>>>
>>>> -----Original Message-----
>>>> From: Jan Kiszka <[email protected]>
>>>> Sent: 2019年12月3日 17:18
>>>> To: Peng Fan <[email protected]>; [email protected]
>>>> Cc: Alice Guo <[email protected]>
>>>> Subject: Re: [PATCH 1/2] arm: irqchip: add lock in
>>>> irqchip_inject_pending
>>>>
>>>> On 03.12.19 10:15, Peng Fan wrote:
>>>>>> Subject: Re: [PATCH 1/2] arm: irqchip: add lock in
>>>>>> irqchip_inject_pending
>>>>>>
>>>>>> On 03.12.19 09:58, Peng Fan wrote:
>>>>>>>> Subject: Re: [PATCH 1/2] arm: irqchip: add lock in
>>>>>>>> irqchip_inject_pending
>>>>>>>>
>>>>>>>> On 03.12.19 09:27, Peng Fan wrote:
>>>>>>>>> Thinking about core0 is inject SGI to core1, core1 is handling
>>>>>>>>> SGI interrupt.
>>>>>>>>>
>>>>>>>>> That means core0 might be in path to enqueue SGI into the
>>>>>>>>> pending_irqs array, core1 might be in path handling SGI and pick
>>>>>>>>> one from pending_irqs array. So need to use lock to protect
>>>>>>>>> unqueue, not only enqueue.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Peng Fan <[email protected]>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> V1:
>>>>>>>>>  The best case is only lock one entry, so no good solution,
>>>>>>>>> because there is possibility that inject fail.
>>>>>>>>>
>>>>>>>>>  hypervisor/arch/arm-common/irqchip.c | 5 +++++
>>>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/hypervisor/arch/arm-common/irqchip.c
>>>>>>>>> b/hypervisor/arch/arm-common/irqchip.c
>>>>>>>>> index 1c881b64..fbaa3099 100644
>>>>>>>>> --- a/hypervisor/arch/arm-common/irqchip.c
>>>>>>>>> +++ b/hypervisor/arch/arm-common/irqchip.c
>>>>>>>>> @@ -279,11 +279,14 @@ void irqchip_inject_pending(void)
>>>>>>>>>       struct pending_irqs *pending =
>> &this_cpu_public()->pending_irqs;
>>>>>>>>>       u16 irq_id, sender;
>>>>>>>>>
>>>>>>>>> +     spin_lock(&pending->lock);
>>>>>>>>> +
>>>>>>>>>       while (pending->head != pending->tail) {
>>>>>>>>>               irq_id = pending->irqs[pending->head];
>>>>>>>>>               sender = pending->sender[pending->head];
>>>>>>>>>
>>>>>>>>>               if (irqchip.inject_irq(irq_id, sender) == -EBUSY) {
>>>>>>>>> +                     spin_unlock(&pending->lock);
>>>>>>>>>                       /*
>>>>>>>>>                        * The list registers are full, trigger 
>>>>>>>>> maintenance
>>>>>>>>>                        * interrupt and leave.
>>>>>>>>> @@ -295,6 +298,8 @@ void irqchip_inject_pending(void)
>>>>>>>>>               pending->head = (pending->head + 1) %
>> MAX_PENDING_IRQS;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> +     spin_unlock(&pending->lock);
>>>>>>>>> +
>>>>>>>>>       /*
>>>>>>>>>        * The software interrupt queue is empty - turn off the
>>>>>> maintenance
>>>>>>>>>        * interrupt.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Did you see real corruptions?
>>>>>>>
>>>>>>> No. just code inspection currently. We met some SGI inject return
>>>>>>> -EBUSY, so go through the code, and think this piece code needs a lock.
>>>>>>>
>>>>>>>>
>>>>>>>> The idea behind this was that the injection to the lock-less
>>>>>>>> queue happens in a way that it won't make changes visible to the
>>>>>>>> consumer that are inconsistent, hence the barrier in
>> irqchip_set_pending.
>>>>>>>> Looking that again, though, we possibly need another barrier,
>>>>>>>> right before updating
>>>>>>>> pending->tail.
>>>>>>>
>>>>>>> Barrier could not prohibit enqueue/unqueue contention.
>>>>>>> enqueue will check head, unqueue will update head.
>>>>>>
>>>>>> Some topic as with lockless enqueuing: We need a barrier to shield
>>>>>> the readout of the head entry from the update of pending->head. So,
>>>>>> we are definitely lacking barriers here, but I don't think we need
>>>>>> the spinlock between producer and consumer because there is only
>>>>>> one
>>>> consumer.
>>>>>
>>>>> Lock-free should be possible, let me work out a non-lock solution.
>>>>
>>>> This is what comes to my mind so far, but please re-check carefully:
>>>>
>>>> diff --git a/hypervisor/arch/arm-common/irqchip.c
>>>> b/hypervisor/arch/arm-common/irqchip.c
>>>> index 1c881b64..a83cd81d 100644
>>>> --- a/hypervisor/arch/arm-common/irqchip.c
>>>> +++ b/hypervisor/arch/arm-common/irqchip.c
>>>> @@ -246,12 +246,12 @@ void irqchip_set_pending(struct public_per_cpu
>>>> *cpu_public, u16 irq_id)
>>>>    if (new_tail != pending->head) {
>>>>            pending->irqs[pending->tail] = irq_id;
>>>>            pending->sender[pending->tail] = sender;
>>>> -          pending->tail = new_tail;
>>>>            /*
>>>> -           * Make the change to pending_irqs.tail visible before the
>>>> -           * caller sends SGI_INJECT.
>>>> +           * Make the entry content visible before updating the tail
>>>> +           * index.
>>>>             */
>>>>            memory_barrier();
>>>> +          pending->tail = new_tail;
>>>
>>> The spin_unlock implies memory barrier. I think no need memory_barrier
>> here.
>>
>> We /might/ be fine here for the archs we support, but Linux is more
>> strict:
>>
>> "(2) RELEASE operation implication:
>>
>>      Memory operations issued before the RELEASE will be completed
>> before the
>>      RELEASE operation has completed.
>>
>>      Memory operations issued after the RELEASE may be completed before
>> the
>>      RELEASE operation has completed."
>>
>> Now, the x86 is ordered anyway, ARMv7 indeed has the same barrier in the
>> unload as in the memory_barrier(). However, ARM64 is not that clear to me.
> 
> ARM64 STLRH:
> Store-Release Register Halfword stores a halfword from a 32-bit register to a
>  memory location. The instruction also has memory ordering semantics as 
> described in Load-Acquire, Load-AcquirePC, and Store-Release on page B2-108.
> 
> DDI0487D version, B2-108:
> The Load-Acquire, Load-AcquirePC, and Store-Release instructions can remove
> the requirement to use the explicit DMB instruction.
> 
> Hope this is clear.

OK, let's document this in our spinlock implementations, to ensure that
potential future ones (and there will be at least one further arch added
in to mid-term) will read that and follow that semantic. Then we can go
away without the explicit barriers in my diff.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/b1309642-8acd-1d7c-310f-07b64e5a2940%40siemens.com.

Reply via email to