Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

Jan Beulich Wed, 27 Jan 2021 03:41:15 -0800

On 27.01.2021 12:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeul...@suse.com>
>> Sent: 27 January 2021 11:21
>> To: p...@xen.org
>> Cc: 'Paul Durrant' <pdurr...@amazon.com>; 'Konrad Rzeszutek Wilk' 
>> <konrad.w...@oracle.com>; 'Roger Pau
>> Monné' <roger....@citrix.com>; 'Jens Axboe' <ax...@kernel.dk>; 'Dongli Zhang'
>> <dongli.zh...@oracle.com>; linux-kernel@vger.kernel.org; 
>> linux-bl...@vger.kernel.org; xen-
>> de...@lists.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>> rings
>>
>> On 27.01.2021 12:09, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeul...@suse.com>
>>>> Sent: 27 January 2021 10:57
>>>> To: Paul Durrant <p...@xen.org>
>>>> Cc: Paul Durrant <pdurr...@amazon.com>; Konrad Rzeszutek Wilk 
>>>> <konrad.w...@oracle.com>; Roger Pau
>>>> Monné <roger....@citrix.com>; Jens Axboe <ax...@kernel.dk>; Dongli Zhang 
>>>> <dongli.zh...@oracle.com>;
>>>> linux-kernel@vger.kernel.org; linux-bl...@vger.kernel.org; 
>>>> xen-de...@lists.xenproject.org
>>>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>>>> rings
>>>>
>>>> On 27.01.2021 11:30, Paul Durrant wrote:
>>>>> From: Paul Durrant <pdurr...@amazon.com>
>>>>>
>>>>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
>>>>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
>>>>> behaviour of xen-blkback when connecting to a frontend was:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring specified by 'ring-ref'
>>>>> - else expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>>
>>>>> This was correct behaviour, but was broken by the afforementioned commit 
>>>>> to
>>>>> become:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring
>>>>> - expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>> - if that didn't work then see if there's a single page ring specified by
>>>>>   'ring-ref'
>>>>>
>>>>> This incorrect behaviour works most of the time but fails when a frontend
>>>>> that sets 'ring-page-order' is unloaded and replaced by one that does not
>>>>> because, instead of reading 'ring-ref', xen-blkback will read the stale
>>>>> 'ring-ref0' left around by the previous frontend will try to map the wrong
>>>>> grant reference.
>>>>>
>>>>> This patch restores the original behaviour.
>>>>
>>>> Isn't this only the 2nd of a pair of fixes that's needed, the
>>>> first being the drivers, upon being unloaded, to fully clean up
>>>> after itself? Any stale key left may lead to confusion upon
>>>> re-use of the containing directory.
>>>
>>> In a backend we shouldn't be relying on, nor really expect IMO, a frontend 
>>> to clean up after itself.
>> Any backend should know *exactly* what xenstore nodes it’s looking for from 
>> a frontend.
>>
>> But the backend can't know whether a node exists because the present
>> frontend has written it, or because an earlier instance forgot to
>> delete it. It can only honor what's there. (In fact the other day I
>> was wondering whether some of the writes of boolean "false" nodes
>> wouldn't better be xenbus_rm() instead.)
> 
> In the particular case this patch is fixing for me, the frontends are the 
> Windows XENVBD driver and the Windows crash version of the same driver 
> (actually built from different code). The 'normal' instance is multi-page 
> aware and the crash instance is not quite, i.e. it uses the old ring-ref but 
> knows to clean up 'ring-page-order'.
> Clearly, in a crash situation, we cannot rely on frontend to clean up


Ah, I see (and agree).

> so what you say does highlight that there indeed needs to be a second patch 
> to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' 
> cycles through Closed and back to InitWait.

And not just this one node then, I suppose?

> I think this patch does still stand on its own though.

Perhaps, yes.

Jan

Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

Reply via email to