[ https://issues.apache.org/jira/browse/MESOS-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Conway updated MESOS-3991:
-------------------------------
    Priority: Major  (was: Blocker)

Definitely something to discuss, although I don't think it is a blocker.

The classical argument for _not_ doing this is that, if a CHECK fails, you 
can't necessarily continue execution safely. By failing the assertion and 
bailing out, you avoid possibly corrupting distributed state or causing worse 
downstream problems. Since Mesos should always be run under a process 
supervisor in production, the real problem with the current behavior (IMO) 
arises mostly when the CHECK failure (a) is relatively innocuous and (b) occurs 
repeatedly. That is the case for the floating point precision problem, but not 
for many other CHECKs in the source code.
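
To make the distinction concrete, here is a minimal sketch of the two styles. 
None of it is taken from the Mesos source: FATAL_CHECK is a hypothetical 
stand-in for a glog-style CHECK, and unreserveOrDie, validateUnreserve, and the 
1e-9 tolerance are names and values assumed purely for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for a glog-style CHECK: report and abort the process.
#define FATAL_CHECK(cond)                                  \
  do {                                                     \
    if (!(cond)) {                                         \
      std::fprintf(stderr, "Check failed: %s\n", #cond);   \
      std::abort();                                        \
    }                                                      \
  } while (false)

// Invariant style: a violated condition kills the process. This is
// appropriate when continuing could corrupt state, but with an exact
// comparison on doubles a rounding error is enough to trigger it
// (0.1 + 0.2 is typically slightly greater than 0.3 in IEEE 754 doubles).
double unreserveOrDie(double reserved, double requested) {
  FATAL_CHECK(requested <= reserved);
  return reserved - requested;
}

// Validation style: an innocuous, externally triggerable condition is
// reported to the caller instead of aborting. The tolerance is an
// assumed value, not one used by Mesos.
bool validateUnreserve(double reserved, double requested, std::string* error) {
  const double epsilon = 1e-9;
  if (requested > reserved + epsilon) {
    *error = "requested amount exceeds reserved amount";
    return false;
  }
  return true;
}
```

With these definitions, validateUnreserve(0.3, 0.1 + 0.2, &error) accepts the 
request, whereas unreserveOrDie(0.3, 0.1 + 0.2) would typically abort. The 
fatal form still makes sense for invariants whose violation means the process 
state can no longer be trusted.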

> CHECK shouldn't be an assert in a production environment.
> ---------------------------------------------------------
>
>                 Key: MESOS-3991
>                 URL: https://issues.apache.org/jira/browse/MESOS-3991
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Gabriel Hartmann
>
> For example:
> In this issue some very error-prone double math causes Mesos master to crash 
> when presented with a resource RESERVE Operation of the right form.  
> On-demand DOS!
> https://issues.apache.org/jira/browse/MESOS-3552



