Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-19 Thread Ard Biesheuvel
On 19 October 2017 at 11:57, David Miller wrote:
> From: Ard Biesheuvel 
> Date: Wed, 18 Oct 2017 16:45:15 +0100
>
>> Even though calling dql_completed() with a count that exceeds the
>> queued count is a serious error, it still does not justify bringing
>> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
>> instead.
>>
>> Signed-off-by: Ard Biesheuvel 
>
> This is bogus.
>
> Unless you are going to do all of the work necessary to fix
> the out-of-bounds condition here, you cannot safely continue
> into the rest of this function.
>
> Things are going to explode in many places if you don't, at
> a minimum, fix the 'count' value to be in range.
>
> But like others I don't like this, the driver needs to be fixed
> urgently if this condition triggers.
>
> Sorry I'm not applying this.

Fair enough.


Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-19 Thread David Miller
From: Ard Biesheuvel 
Date: Wed, 18 Oct 2017 16:45:15 +0100

> Even though calling dql_completed() with a count that exceeds the
> queued count is a serious error, it still does not justify bringing
> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
> instead.
> 
> Signed-off-by: Ard Biesheuvel 

This is bogus.

Unless you are going to do all of the work necessary to fix
the out-of-bounds condition here, you cannot safely continue
into the rest of this function.

Things are going to explode in many places if you don't, at
a minimum, fix the 'count' value to be in range.

But like others I don't like this, the driver needs to be fixed
urgently if this condition triggers.

Sorry I'm not applying this.
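
[Editor's sketch, not part of the original message.] The minimum
requirement named above, getting 'count' back in range before the rest
of the function runs, could look roughly like the user-space sketch
below. The struct dql_sketch type and the dql_clamp_count() helper are
hypothetical names invented for illustration, and fprintf() stands in
for the kernel's WARN_ON(). This only shows the clamping idea; it does
not fix the driver bug that triggers the condition.

```c
#include <stdio.h>

/* Hypothetical user-space stand-in for the two struct dql fields used here. */
struct dql_sketch {
	unsigned int num_queued;
	unsigned int num_completed;
};

/*
 * Return a completion count that is safe to use: if the caller reports
 * more than is outstanding, warn and clamp to the outstanding amount.
 */
static unsigned int dql_clamp_count(const struct dql_sketch *dql,
				    unsigned int count)
{
	/* Unsigned subtraction stays correct when the counters wrap. */
	unsigned int outstanding = dql->num_queued - dql->num_completed;

	if (count > outstanding) {
		/* Stand-in for WARN_ON(); the real code would log a backtrace. */
		fprintf(stderr, "dql: completed %u > outstanding %u, clamping\n",
			count, outstanding);
		return outstanding;
	}
	return count;
}
```

A caller would pass the clamped value on in place of the raw count, so
that later arithmetic never sees more completions than were queued.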


Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread Ard Biesheuvel
On 18 October 2017 at 19:45, Eric Dumazet wrote:
> On Wed, 2017-10-18 at 18:57 +0100, Ard Biesheuvel wrote:
>> On 18 October 2017 at 17:29, Eric Dumazet  wrote:
>> > On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
>> >> Even though calling dql_completed() with a count that exceeds the
>> >> queued count is a serious error, it still does not justify bringing
>> >> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
>> >> instead.
>> >>
>> >> Signed-off-by: Ard Biesheuvel 
>> >> ---
>> >>  lib/dynamic_queue_limits.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
>> >> index f346715e2255..24ce495d78f3 100644
>> >> --- a/lib/dynamic_queue_limits.c
>> >> +++ b/lib/dynamic_queue_limits.c
>> >> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
>> >>   num_queued = ACCESS_ONCE(dql->num_queued);
>> >>
>> >>   /* Can't complete more than what's in queue */
>> >> - BUG_ON(count > num_queued - dql->num_completed);
>> >> + WARN_ON(count > num_queued - dql->num_completed);
>> >>
>> >>   completed = dql->num_completed + count;
>> >>   limit = dql->limit;
>> >
>> > So instead of fixing the faulty driver, you'll have strange lockups, and
>> > force your users to reboot anyway, after annoying periods where
>> > "Internet does not work"
>> >
>> > These kinds of errors should be found when testing a new device driver
>> > or new kernel.
>> >
>> > Have you found the root cause ?
>> >
>>
>> Not yet, and I don't intend to send out any patches for this
>> particular hardware until this is fixed.
>>
>> But that still doesn't mean you should crash hard. As Linus puts it,
>> it is better to 'limp on' if you can (unless we're likely to corrupt
>> any non-volatile data, e.g., files on disk etc)
>
> How many BUG() do you plan to change to WARN() exactly ?
>

How is that relevant?

> If you want to comply with Linus's wish, just compile your kernel
> with appropriate option.
>
> CONFIG_BUG=n
>

If it is essential that we crash hard in this location, without *any*
opportunity whatsoever to shut down cleanly or perform any diagnosis on
the system while it is still up, then please disregard this patch.


Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread Eric Dumazet
On Wed, 2017-10-18 at 18:57 +0100, Ard Biesheuvel wrote:
> On 18 October 2017 at 17:29, Eric Dumazet  wrote:
> > On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
> >> Even though calling dql_completed() with a count that exceeds the
> >> queued count is a serious error, it still does not justify bringing
> >> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
> >> instead.
> >>
> >> Signed-off-by: Ard Biesheuvel 
> >> ---
> >>  lib/dynamic_queue_limits.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
> >> index f346715e2255..24ce495d78f3 100644
> >> --- a/lib/dynamic_queue_limits.c
> >> +++ b/lib/dynamic_queue_limits.c
> >> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
> >>   num_queued = ACCESS_ONCE(dql->num_queued);
> >>
> >>   /* Can't complete more than what's in queue */
> >> - BUG_ON(count > num_queued - dql->num_completed);
> >> + WARN_ON(count > num_queued - dql->num_completed);
> >>
> >>   completed = dql->num_completed + count;
> >>   limit = dql->limit;
> >
> > So instead of fixing the faulty driver, you'll have strange lockups, and
> > force your users to reboot anyway, after annoying periods where
> > "Internet does not work"
> >
> > These kinds of errors should be found when testing a new device driver
> > or new kernel.
> >
> > Have you found the root cause ?
> >
> 
> Not yet, and I don't intend to send out any patches for this
> particular hardware until this is fixed.
> 
> But that still doesn't mean you should crash hard. As Linus puts it,
> it is better to 'limp on' if you can (unless we're likely to corrupt
> any non-volatile data, e.g., files on disk etc)

How many BUG() do you plan to change to WARN() exactly ?

If you want to comply with Linus's wish, just compile your kernel
with appropriate option.

CONFIG_BUG=n




Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread Ard Biesheuvel
On 18 October 2017 at 17:29, Eric Dumazet wrote:
> On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
>> Even though calling dql_completed() with a count that exceeds the
>> queued count is a serious error, it still does not justify bringing
>> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
>> instead.
>>
>> Signed-off-by: Ard Biesheuvel 
>> ---
>>  lib/dynamic_queue_limits.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
>> index f346715e2255..24ce495d78f3 100644
>> --- a/lib/dynamic_queue_limits.c
>> +++ b/lib/dynamic_queue_limits.c
>> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
>>   num_queued = ACCESS_ONCE(dql->num_queued);
>>
>>   /* Can't complete more than what's in queue */
>> - BUG_ON(count > num_queued - dql->num_completed);
>> + WARN_ON(count > num_queued - dql->num_completed);
>>
>>   completed = dql->num_completed + count;
>>   limit = dql->limit;
>
> So instead of fixing the faulty driver, you'll have strange lockups, and
> force your users to reboot anyway, after annoying periods where
> "Internet does not work"
>
> These kinds of errors should be found when testing a new device driver
> or new kernel.
>
> Have you found the root cause ?
>

Not yet, and I don't intend to send out any patches for this
particular hardware until this is fixed.

But that still doesn't mean you should crash hard. As Linus puts it,
it is better to 'limp on' if you can (unless we're likely to corrupt
any non-volatile data, e.g., files on disk etc)


Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread Eric Dumazet
On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
> Even though calling dql_completed() with a count that exceeds the
> queued count is a serious error, it still does not justify bringing
> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
> instead.
> 
> Signed-off-by: Ard Biesheuvel 
> ---
>  lib/dynamic_queue_limits.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
> index f346715e2255..24ce495d78f3 100644
> --- a/lib/dynamic_queue_limits.c
> +++ b/lib/dynamic_queue_limits.c
> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
>   num_queued = ACCESS_ONCE(dql->num_queued);
>  
>   /* Can't complete more than what's in queue */
> - BUG_ON(count > num_queued - dql->num_completed);
> + WARN_ON(count > num_queued - dql->num_completed);
>  
>   completed = dql->num_completed + count;
>   limit = dql->limit;

So instead of fixing the faulty driver, you'll have strange lockups, and
force your users to reboot anyway, after annoying periods where
"Internet does not work"

These kinds of errors should be found when testing a new device driver
or new kernel.

Have you found the root cause ?




RE: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread David Laight
From: Ard Biesheuvel
> Sent: 18 October 2017 16:45
> Even though calling dql_completed() with a count that exceeds the
> queued count is a serious error, it still does not justify bringing
> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
> instead.
> 
> Signed-off-by: Ard Biesheuvel 
> ---
>  lib/dynamic_queue_limits.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
> index f346715e2255..24ce495d78f3 100644
> --- a/lib/dynamic_queue_limits.c
> +++ b/lib/dynamic_queue_limits.c
> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
>   num_queued = ACCESS_ONCE(dql->num_queued);
> 
>   /* Can't complete more than what's in queue */
> - BUG_ON(count > num_queued - dql->num_completed);
> + WARN_ON(count > num_queued - dql->num_completed);
> 
>   completed = dql->num_completed + count;

Don't you need to bound 'count' so that horrid things don't
happen further down the code?

David
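
[Editor's sketch, not part of the original message.] To illustrate the
concern about what happens further down: the quoted hunk shows
completed = dql->num_completed + count, and the sketch below assumes a
later step subtracts that value from num_queued in unsigned arithmetic.
That later step is not shown in the hunk, so treat it as an assumption;
inprogress_after() and show_wraparound() are hypothetical names used
only for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: the in-flight amount as computed by unsigned
 * arithmetic of the form num_queued - (num_completed + count).
 */
static unsigned int inprogress_after(unsigned int num_queued,
				     unsigned int num_completed,
				     unsigned int count)
{
	unsigned int completed = num_completed + count;

	/* Wraps to a huge value when completed exceeds num_queued. */
	return num_queued - completed;
}

/* Print an in-range case and an out-of-range case side by side. */
static void show_wraparound(void)
{
	printf("in range:     %u\n", inprogress_after(100, 90, 10));
	printf("out of range: %u\n", inprogress_after(100, 90, 11));
}
```

With a count that is one too large, the result is no longer a small
number but wraps around to the maximum unsigned value, which is the
kind of out-of-range state any limit calculation built on it would
inherit.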



[PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

2017-10-18 Thread Ard Biesheuvel
Even though calling dql_completed() with a count that exceeds the
queued count is a serious error, it still does not justify bringing
down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
instead.

Signed-off-by: Ard Biesheuvel 
---
 lib/dynamic_queue_limits.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
index f346715e2255..24ce495d78f3 100644
--- a/lib/dynamic_queue_limits.c
+++ b/lib/dynamic_queue_limits.c
@@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
 	num_queued = ACCESS_ONCE(dql->num_queued);
 
 	/* Can't complete more than what's in queue */
-	BUG_ON(count > num_queued - dql->num_completed);
+	WARN_ON(count > num_queued - dql->num_completed);
 
 	completed = dql->num_completed + count;
 	limit = dql->limit;
-- 
2.11.0