[
https://issues.apache.org/jira/browse/MESOS-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neil Conway updated MESOS-3991:
-------------------------------
Priority: Major (was: Blocker)
This is definitely worth discussing, although I don't think it is a
blocker.
The classical argument for _not_ doing this is that, if a CHECK fails, you
can't necessarily continue execution safely. By failing the assertion and
bailing out, you avoid possibly corrupting distributed state or causing worse
downstream problems. Since Mesos should always be run under a process
supervisor in production, the real problem with the current behavior (IMO) is
mostly when the CHECK failure (a) is relatively innocuous and (b) occurs
repeatedly. That is the case for the floating point precision problem, but not
for many other CHECKs in the source code.
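To illustrate the "innocuous but repeated" case, here is a minimal sketch of how an exact comparison on doubles can trip an abort-on-failure check, and how a tolerance-based comparison avoids it. This is not the Mesos or glog code: `SKETCH_CHECK`, `exactlyEqual`, `nearlyEqual`, and the default epsilon are all hypothetical names and values chosen for the example.

```cpp
#include <cmath>
#include <cstdlib>
#include <iostream>

// Hypothetical stand-in for a CHECK-style macro (an assumption for this
// sketch, not the real implementation): log the failed condition and abort.
#define SKETCH_CHECK(cond)                                      \
  do {                                                          \
    if (!(cond)) {                                              \
      std::cerr << "Check failed: " #cond << std::endl;         \
      std::abort();                                             \
    }                                                           \
  } while (false)

// Exact comparison: sensitive to rounding error accumulated by double math.
inline bool exactlyEqual(double a, double b)
{
  return a == b;
}

// Tolerance-based comparison. The default epsilon is an arbitrary
// illustrative value, not one taken from Mesos.
inline bool nearlyEqual(double a, double b, double epsilon = 1e-9)
{
  return std::fabs(a - b) <= epsilon;
}
```

With IEEE 754 doubles, `SKETCH_CHECK(exactlyEqual(0.1 + 0.2, 0.3))` would abort the process, and would do so again after every supervisor restart if the same input is replayed, whereas `SKETCH_CHECK(nearlyEqual(0.1 + 0.2, 0.3))` passes.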
> CHECK shouldn't be an assert in a production environment.
> ---------------------------------------------------------
>
> Key: MESOS-3991
> URL: https://issues.apache.org/jira/browse/MESOS-3991
> Project: Mesos
> Issue Type: Bug
> Reporter: Gabriel Hartmann
>
> For example:
> In this issue, some very error-prone double math causes the Mesos master to
> crash when presented with a resource RESERVE operation of the right form.
> On-demand DoS!
> https://issues.apache.org/jira/browse/MESOS-3552