Re: [PATCH] powerpc/opal: Fix EBUSY bug in acquiring tokens

2017-11-06 Thread Michael Ellerman
William Kennington writes:

>> On Nov 4, 2017, at 2:14 AM, Michael Ellerman wrote:
>>
>> "William A. Kennington III" <w...@google.com> writes:
>> 
>>> The current code checks the completion map to look for the first token
>>> that is complete. In some cases, a completion can come in but the token
>>> can still be on lease to the caller processing the completion. If this
>>> completed but unreleased token is the first token found in the bitmap by
>>> another task trying to acquire a token, then the __test_and_set_bit
>>> call will fail since the token will still be on lease. The acquisition
>>> will then fail with an EBUSY.
>>> 
>>> This patch reorganizes the acquisition code to look at the
>>> opal_async_token_map for an unleased token. If the token has no lease it
>>> must have no outstanding completions so we should never see an EBUSY,
>>> unless we have leased out too many tokens. Since
>>> opal_async_get_token_interruptible is protected by a semaphore, we will
>>> practically never see EBUSY anymore.
>>> 
>>> Signed-off-by: William A. Kennington III
>>> ---
>>> arch/powerpc/platforms/powernv/opal-async.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> I think this is superseded by Cyril's rework (which he's finally
>> posted):
>> 
>>  http://patchwork.ozlabs.org/patch/833630/ 
>> 
>> 
>> If not please let us know.
>
> Yeah, I think Cyril’s rework fixes this. I wasn’t sure how long it
> would take for master to receive his changes so I figured we could use
> something in the interim to fix the locking failures. If his changes
> will be mailed into the next merge window then we should have the
> issue fixed in master. I understand that rework probably won’t make it
> into stable kernels? If not then we should probably send this along to
> stable kernel maintainers.

OK. I didn't realise the bug was sufficiently bad to need a backport
to stable.

To make a backport easier I've merged this patch first, and then Cyril's
on top of it (which essentially deletes this patch).

I assume you've tested this patch at least somewhat? :)

cheers


Re: [PATCH] powerpc/opal: Fix EBUSY bug in acquiring tokens

2017-11-05 Thread William Kennington

> On Nov 4, 2017, at 2:14 AM, Michael Ellerman wrote:
>
> "William A. Kennington III" <w...@google.com> writes:
> 
>> The current code checks the completion map to look for the first token
>> that is complete. In some cases, a completion can come in but the token
>> can still be on lease to the caller processing the completion. If this
>> completed but unreleased token is the first token found in the bitmap by
>> another task trying to acquire a token, then the __test_and_set_bit
>> call will fail since the token will still be on lease. The acquisition
>> will then fail with an EBUSY.
>> 
>> This patch reorganizes the acquisition code to look at the
>> opal_async_token_map for an unleased token. If the token has no lease it
>> must have no outstanding completions so we should never see an EBUSY,
>> unless we have leased out too many tokens. Since
>> opal_async_get_token_interruptible is protected by a semaphore, we will
>> practically never see EBUSY anymore.
>> 
>> Signed-off-by: William A. Kennington III
>> ---
>> arch/powerpc/platforms/powernv/opal-async.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
> 
> I think this is superseded by Cyril's rework (which he's finally
> posted):
> 
>  http://patchwork.ozlabs.org/patch/833630/ 
> 
> 
> 
> If not please let us know.
> 
> cheers

Yeah, I think Cyril’s rework fixes this. I wasn’t sure how long it would take 
for master to receive his changes so I figured we could use something in the 
interim to fix the locking failures. If his changes will be mailed into the 
next merge window then we should have the issue fixed in master. I understand 
that rework probably won’t make it into stable kernels? If not then we should 
probably send this along to stable kernel maintainers.

- William

Re: [PATCH] powerpc/opal: Fix EBUSY bug in acquiring tokens

2017-11-04 Thread Michael Ellerman
"William A. Kennington III" writes:

> The current code checks the completion map to look for the first token
> that is complete. In some cases, a completion can come in but the token
> can still be on lease to the caller processing the completion. If this
> completed but unreleased token is the first token found in the bitmap by
> another task trying to acquire a token, then the __test_and_set_bit
> call will fail since the token will still be on lease. The acquisition
> will then fail with an EBUSY.
>
> This patch reorganizes the acquisition code to look at the
> opal_async_token_map for an unleased token. If the token has no lease it
> must have no outstanding completions so we should never see an EBUSY,
> unless we have leased out too many tokens. Since
> opal_async_get_token_interruptible is protected by a semaphore, we will
> practically never see EBUSY anymore.
>
> Signed-off-by: William A. Kennington III 
> ---
>  arch/powerpc/platforms/powernv/opal-async.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

I think this is superseded by Cyril's rework (which he's finally
posted):

  http://patchwork.ozlabs.org/patch/833630/


If not please let us know.

cheers
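
For readers following the thread, the race described in the patch
description can be modelled in a few lines of user-space C. This is only
a rough sketch under stated assumptions, not the code in opal-async.c:
the two bitmaps are plain integers, locking and the semaphore are left
out, and every name below (MAX_TOKENS, get_token_old, get_token_new,
complete_token, release_token) is hypothetical and chosen for
illustration.

```c
#include <assert.h>
#include <errno.h>

#define MAX_TOKENS 8u /* hypothetical pool size for this model */

/* Bit set: the token is idle or its request has completed. */
static unsigned int complete_map = (1u << MAX_TOKENS) - 1u;
/* Bit set: the token is currently on lease to a caller. */
static unsigned int token_map = 0u;

/* Old strategy: take the first *completed* token, then try to lease it. */
int get_token_old(void)
{
    for (unsigned int t = 0; t < MAX_TOKENS; t++) {
        if (!(complete_map & (1u << t)))
            continue;
        if (token_map & (1u << t))
            return -EBUSY; /* completed, but its owner has not released it */
        token_map |= 1u << t;
        complete_map &= ~(1u << t);
        return (int)t;
    }
    return -EBUSY; /* no completed token at all */
}

/* New strategy: take the first *unleased* token. */
int get_token_new(void)
{
    for (unsigned int t = 0; t < MAX_TOKENS; t++) {
        if (token_map & (1u << t))
            continue;
        token_map |= 1u << t;
        complete_map &= ~(1u << t);
        return (int)t;
    }
    return -EBUSY; /* every token is on lease */
}

/* A completion arrives; the token stays on lease until released. */
void complete_token(int t)
{
    assert(t >= 0 && (unsigned int)t < MAX_TOKENS);
    complete_map |= 1u << t;
}

/* The owner finishes processing and hands the token back. */
void release_token(int t)
{
    assert(t >= 0 && (unsigned int)t < MAX_TOKENS);
    token_map &= ~(1u << t);
    complete_map |= 1u << t;
}
```

In this model, leasing a token and then calling complete_token() on it
without releasing it makes get_token_old() return -EBUSY even though
other tokens are free, while get_token_new() skips the leased token and
hands out the next one.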