Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-14 Thread Mihai Donțu
On Tue, 13 Sep 2016 19:25:54 +0100 Andrew Cooper wrote:
> On 13/09/16 14:46, Mihai Donțu wrote:
> > On Tue, 13 Sep 2016 09:10:32 -0400 Konrad Rzeszutek Wilk wrote:  
> >> On Mon, Sep 12, 2016 at 09:23:41AM -0600, Jan Beulich wrote:  
> >>> All,
> >>>
> >>> in
> >>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
> >>> and
> >>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
> >>> Andrew basically suggests that we should switch away from using
> >>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
> >>> cases. And honestly I'm not convinced of this: We've been adding
> >>> quite a few ASSERT()s over the last years with the aim of doing
> >>> sanity checking in debug builds, without adding overhead to non-
> >>> debug builds. I can certainly see possible cases where using
> >>> BUG_ON() to prevent further possible damage is appropriate, but
> >>> I don't think we should overdo here.
> >>>
> >>> Thanks for other's opinions,
> >> I am in the mindset that ASSERTS are in the cases where a check
> >> has been done earlier and the ASSERT is more of a catch if we ended up
> >> somehow in the wrong state. We can then slowly follow the breadcrumbs to
> >> see what changed the state. In other words - something that the hypervisor
> >> has checked for and that invariant should have not changed.
> >>
> >> But a BUG_ON is in the same category - it should not have happend.
> >>
> >> Perhaps the distinction is that for ASSERTS() it is to catch me messing
> >> things up. While BUG_ON() is something (or somebody) else messing things 
> >> up.
> >>
> >> It is kind of hard to describe the semantic of an ASSERT vs BUG_ON now
> >> that I think of it ..  
> > I would see ASSERT() used to check for conditions that have immediate
> > and visible consequences, like the ones that lead to lack of
> > functionality (like a hw feature misdetection) or straight crashes
> > (like NULL-dereference).  
> 
> NULL dereferences in the context of PV guests are also a security issue,
> as the PV guest controls what is mapped at 0.
> 
> >  BUG_ON(), on the other hand, would be an early
> > warning for subtle corruptions that can lead to a hypervisor crash or
> > corrupted data after many hours of use (close to impossible to track
> > down).
> >
> > For example, a while ago I posted a small patch that would BUG_ON()
> > when it detected that the heap chunks were not properly linked. That's
> > the type of bug that's a pain to detect.  
> 
> Speaking of, what happened to that patch?

I did not post and updated version after the last review because I
wanted to produce some numbers too (wrt performance impact) ... and I
stopped there as I got distracted by other issues. I have a good mind
to update it, write a quick test and publish them. I hope I can do this
all sometime next week.

-- 
Mihai Donțu

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-14 Thread Sander Eikelenboom

Wednesday, September 14, 2016, 10:35:30 AM, you wrote:

> On Tue, Sep 13, 2016 at 7:16 PM, Andrew Cooper
>  wrote:
>> On 12/09/16 16:23, Jan Beulich wrote:
>>> All,
>>>
>>> in
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
>>> and
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
>>> Andrew basically suggests that we should switch away from using
>>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
>>> cases. And honestly I'm not convinced of this: We've been adding
>>> quite a few ASSERT()s over the last years with the aim of doing
>>> sanity checking in debug builds, without adding overhead to non-
>>> debug builds. I can certainly see possible cases where using
>>> BUG_ON() to prevent further possible damage is appropriate, but
>>> I don't think we should overdo here.
>>
>> I am not advocating switching all ASSERT()s to BUG_ON()s.  That would be
>> silly.
>>
>> However, ASSERT()'s as a bounds check very definitely are dangerous.  If
>> there is any uncertainty about the bounds, the check must not disappear
>> in a release build.  (Better yet, code which copes cleanly with
>> insufficient bounds).
>>
>>
>> For anyone reading this email who hasn't already worked out the details
>> of XSA-186, the data corruption issue is here:
>>
>> static int hvmemul_insn_fetch(...)
>> {
>> unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip;
>> ...
>> ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf));
>> memcpy(_ctxt->insn_buf[insn_off], p_data, bytes);
>> ...
>>
>> It is left as an exercise to the reader to work out how to exploit this
>> on a release build of Xen, but it is hopefully obvious that the ASSERT()
>> isn't helpful.  A BUG_ON() in this case would have been a host DoS,
>> which is substantially better than a guest escape.

> This seems quite sensible, and I'm glad Andy clarified.  (Although in
> a lot of these cases, a domain_crash() would be preferable to a
> BUG_ON()  if possible.)

>  -George

Just my two cents, but please try to limit BUG_ON()'s to only really 
unrecoverable cases. Because it is quite a disaster especially on
remote administrated machines to have the host crash on a BUG_ON() 
(with or without auto restart).
 
Try to recover (for instance if necessary by crashing the guest 
instead of the host) as much as you can to make at least the hypervisor and 
dom0 
survive. Put a big fat warning in xl dmesg and taint the hypervisor 
(preferably in a way you could easily read out with monitoring tools),
so you could propagate that to the sysadmin that the system is
now deemed instable, so he can investigate (and report the bug),
triage and schedule a reboot. Instead of only having a hard to debug crashing 
machine with probably no or limited logs to go by. 

In the Linux kernel the problem now is to get a lot of them converted back to
ratelimited warnings instead of BUG_ON()'s.

--
Sander


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-14 Thread Andrew Cooper
On 14/09/16 09:35, George Dunlap wrote:
> On Tue, Sep 13, 2016 at 7:16 PM, Andrew Cooper
>  wrote:
>> On 12/09/16 16:23, Jan Beulich wrote:
>>> All,
>>>
>>> in
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
>>> and
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
>>> Andrew basically suggests that we should switch away from using
>>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
>>> cases. And honestly I'm not convinced of this: We've been adding
>>> quite a few ASSERT()s over the last years with the aim of doing
>>> sanity checking in debug builds, without adding overhead to non-
>>> debug builds. I can certainly see possible cases where using
>>> BUG_ON() to prevent further possible damage is appropriate, but
>>> I don't think we should overdo here.
>> I am not advocating switching all ASSERT()s to BUG_ON()s.  That would be
>> silly.
>>
>> However, ASSERT()'s as a bounds check very definitely are dangerous.  If
>> there is any uncertainty about the bounds, the check must not disappear
>> in a release build.  (Better yet, code which copes cleanly with
>> insufficient bounds).
>>
>>
>> For anyone reading this email who hasn't already worked out the details
>> of XSA-186, the data corruption issue is here:
>>
>> static int hvmemul_insn_fetch(...)
>> {
>> unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip;
>> ...
>> ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf));
>> memcpy(_ctxt->insn_buf[insn_off], p_data, bytes);
>> ...
>>
>> It is left as an exercise to the reader to work out how to exploit this
>> on a release build of Xen, but it is hopefully obvious that the ASSERT()
>> isn't helpful.  A BUG_ON() in this case would have been a host DoS,
>> which is substantially better than a guest escape.
> This seems quite sensible, and I'm glad Andy clarified.  (Although in
> a lot of these cases, a domain_crash() would be preferable to a
> BUG_ON()  if possible.)

Absolutely.  Clean error handling without a crash is certainly
preferable when possible.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-14 Thread George Dunlap
On Tue, Sep 13, 2016 at 7:16 PM, Andrew Cooper
 wrote:
> On 12/09/16 16:23, Jan Beulich wrote:
>> All,
>>
>> in
>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
>> and
>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
>> Andrew basically suggests that we should switch away from using
>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
>> cases. And honestly I'm not convinced of this: We've been adding
>> quite a few ASSERT()s over the last years with the aim of doing
>> sanity checking in debug builds, without adding overhead to non-
>> debug builds. I can certainly see possible cases where using
>> BUG_ON() to prevent further possible damage is appropriate, but
>> I don't think we should overdo here.
>
> I am not advocating switching all ASSERT()s to BUG_ON()s.  That would be
> silly.
>
> However, ASSERT()'s as a bounds check very definitely are dangerous.  If
> there is any uncertainty about the bounds, the check must not disappear
> in a release build.  (Better yet, code which copes cleanly with
> insufficient bounds).
>
>
> For anyone reading this email who hasn't already worked out the details
> of XSA-186, the data corruption issue is here:
>
> static int hvmemul_insn_fetch(...)
> {
> unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip;
> ...
> ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf));
> memcpy(_ctxt->insn_buf[insn_off], p_data, bytes);
> ...
>
> It is left as an exercise to the reader to work out how to exploit this
> on a release build of Xen, but it is hopefully obvious that the ASSERT()
> isn't helpful.  A BUG_ON() in this case would have been a host DoS,
> which is substantially better than a guest escape.

This seems quite sensible, and I'm glad Andy clarified.  (Although in
a lot of these cases, a domain_crash() would be preferable to a
BUG_ON()  if possible.)

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-13 Thread Andrew Cooper
On 13/09/16 14:46, Mihai Donțu wrote:
> On Tue, 13 Sep 2016 09:10:32 -0400 Konrad Rzeszutek Wilk wrote:
>> On Mon, Sep 12, 2016 at 09:23:41AM -0600, Jan Beulich wrote:
>>> All,
>>>
>>> in
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
>>> and
>>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
>>> Andrew basically suggests that we should switch away from using
>>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
>>> cases. And honestly I'm not convinced of this: We've been adding
>>> quite a few ASSERT()s over the last years with the aim of doing
>>> sanity checking in debug builds, without adding overhead to non-
>>> debug builds. I can certainly see possible cases where using
>>> BUG_ON() to prevent further possible damage is appropriate, but
>>> I don't think we should overdo here.
>>>
>>> Thanks for other's opinions,  
>> I am in the mindset that ASSERTS are in the cases where a check
>> has been done earlier and the ASSERT is more of a catch if we ended up
>> somehow in the wrong state. We can then slowly follow the breadcrumbs to
>> see what changed the state. In other words - something that the hypervisor
>> has checked for and that invariant should have not changed.
>>
>> But a BUG_ON is in the same category - it should not have happend.
>>
>> Perhaps the distinction is that for ASSERTS() it is to catch me messing
>> things up. While BUG_ON() is something (or somebody) else messing things up.
>>
>> It is kind of hard to describe the semantic of an ASSERT vs BUG_ON now
>> that I think of it ..
> I would see ASSERT() used to check for conditions that have immediate
> and visible consequences, like the ones that lead to lack of
> functionality (like a hw feature misdetection) or straight crashes
> (like NULL-dereference).

NULL dereferences in the context of PV guests are also a security issue,
as the PV guest controls what is mapped at 0.

>  BUG_ON(), on the other hand, would be an early
> warning for subtle corruptions that can lead to a hypervisor crash or
> corrupted data after many hours of use (close to impossible to track
> down).
>
> For example, a while ago I posted a small patch that would BUG_ON()
> when it detected that the heap chunks were not properly linked. That's
> the type of bug that's a pain to detect.

Speaking of, what happened to that patch?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-13 Thread Andrew Cooper
On 12/09/16 16:23, Jan Beulich wrote:
> All,
>
> in
> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
> and
> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
> Andrew basically suggests that we should switch away from using
> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
> cases. And honestly I'm not convinced of this: We've been adding
> quite a few ASSERT()s over the last years with the aim of doing
> sanity checking in debug builds, without adding overhead to non-
> debug builds. I can certainly see possible cases where using
> BUG_ON() to prevent further possible damage is appropriate, but
> I don't think we should overdo here.

I am not advocating switching all ASSERT()s to BUG_ON()s.  That would be
silly.

However, ASSERT()'s as a bounds check very definitely are dangerous.  If
there is any uncertainty about the bounds, the check must not disappear
in a release build.  (Better yet, code which copes cleanly with
insufficient bounds).


For anyone reading this email who hasn't already worked out the details
of XSA-186, the data corruption issue is here:

static int hvmemul_insn_fetch(...)
{
unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip;
...
ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf));
memcpy(_ctxt->insn_buf[insn_off], p_data, bytes);
...

It is left as an exercise to the reader to work out how to exploit this
on a release build of Xen, but it is hopefully obvious that the ASSERT()
isn't helpful.  A BUG_ON() in this case would have been a host DoS,
which is substantially better than a guest escape.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-13 Thread Mihai Donțu
On Tue, 13 Sep 2016 09:10:32 -0400 Konrad Rzeszutek Wilk wrote:
> On Mon, Sep 12, 2016 at 09:23:41AM -0600, Jan Beulich wrote:
> > All,
> > 
> > in
> > https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
> > and
> > https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
> > Andrew basically suggests that we should switch away from using
> > ASSERT() and over to BUG_ON() in perhaps quite broad a set of
> > cases. And honestly I'm not convinced of this: We've been adding
> > quite a few ASSERT()s over the last years with the aim of doing
> > sanity checking in debug builds, without adding overhead to non-
> > debug builds. I can certainly see possible cases where using
> > BUG_ON() to prevent further possible damage is appropriate, but
> > I don't think we should overdo here.
> > 
> > Thanks for other's opinions,  
> 
> I am in the mindset that ASSERTS are in the cases where a check
> has been done earlier and the ASSERT is more of a catch if we ended up
> somehow in the wrong state. We can then slowly follow the breadcrumbs to
> see what changed the state. In other words - something that the hypervisor
> has checked for and that invariant should have not changed.
> 
> But a BUG_ON is in the same category - it should not have happend.
> 
> Perhaps the distinction is that for ASSERTS() it is to catch me messing
> things up. While BUG_ON() is something (or somebody) else messing things up.
> 
> It is kind of hard to describe the semantic of an ASSERT vs BUG_ON now
> that I think of it ..

I would see ASSERT() used to check for conditions that have immediate
and visible consequences, like the ones that lead to lack of
functionality (like a hw feature misdetection) or straight crashes
(like NULL-dereference). BUG_ON(), on the other hand, would be an early
warning for subtle corruptions that can lead to a hypervisor crash or
corrupted data after many hours of use (close to impossible to track
down).

For example, a while ago I posted a small patch that would BUG_ON()
when it detected that the heap chunks were not properly linked. That's
the type of bug that's a pain to detect.

-- 
Mihai Donțu

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-13 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 12 September 2016 16:24
> To: xen-devel <xen-de...@lists.xenproject.org>
> Subject: [Xen-devel] BUG_ON() vs ASSERT()
> 
> All,
> 
> in
> https://lists.xenproject.org/archives/html/xen-devel/2016-
> 09/msg01201.html
> and
> https://lists.xenproject.org/archives/html/xen-devel/2016-
> 09/msg01210.html
> Andrew basically suggests that we should switch away from using
> ASSERT() and over to BUG_ON() in perhaps quite broad a set of cases. And
> honestly I'm not convinced of this: We've been adding quite a few ASSERT()s
> over the last years with the aim of doing sanity checking in debug builds,
> without adding overhead to non- debug builds. I can certainly see possible
> cases where using
> BUG_ON() to prevent further possible damage is appropriate, but I don't
> think we should overdo here.
> 
> Thanks for other's opinions,

If there's a situation where code can survive a 'should not happen' then an 
ASSERT is good to catch the 'should not happen' when debugging, but being 
compiled out in production is fine. If the code is definitely not going to 
survive a 'should not happen' then I think a BUG_ON at the earliest opportunity 
is a good thing because it makes post mortem diagnosis much easier. Clearly 
this does have to be balanced against the cost of calculation of the subject of 
the BUG_ON though.

  Paul

> Jan
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] BUG_ON() vs ASSERT()

2016-09-13 Thread Konrad Rzeszutek Wilk
On Mon, Sep 12, 2016 at 09:23:41AM -0600, Jan Beulich wrote:
> All,
> 
> in
> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
> and
> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
> Andrew basically suggests that we should switch away from using
> ASSERT() and over to BUG_ON() in perhaps quite broad a set of
> cases. And honestly I'm not convinced of this: We've been adding
> quite a few ASSERT()s over the last years with the aim of doing
> sanity checking in debug builds, without adding overhead to non-
> debug builds. I can certainly see possible cases where using
> BUG_ON() to prevent further possible damage is appropriate, but
> I don't think we should overdo here.
> 
> Thanks for other's opinions,

I am in the mindset that ASSERTS are in the cases where a check
has been done earlier and the ASSERT is more of a catch if we ended up
somehow in the wrong state. We can then slowly follow the breadcrumbs to
see what changed the state. In other words - something that the hypervisor
has checked for and that invariant should have not changed.

But a BUG_ON is in the same category - it should not have happend.

Perhaps the distinction is that for ASSERTS() it is to catch me messing
things up. While BUG_ON() is something (or somebody) else messing things up.

It is kind of hard to describe the semantic of an ASSERT vs BUG_ON now
that I think of it ..

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] BUG_ON() vs ASSERT()

2016-09-12 Thread Jan Beulich
All,

in
https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html
and
https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html
Andrew basically suggests that we should switch away from using
ASSERT() and over to BUG_ON() in perhaps quite broad a set of
cases. And honestly I'm not convinced of this: We've been adding
quite a few ASSERT()s over the last years with the aim of doing
sanity checking in debug builds, without adding overhead to non-
debug builds. I can certainly see possible cases where using
BUG_ON() to prevent further possible damage is appropriate, but
I don't think we should overdo here.

Thanks for other's opinions,
Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel