[ 
https://issues.apache.org/jira/browse/MESOS-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James DeFelice updated MESOS-4548:
----------------------------------
    Description: 
For example, in mesos 0.24 there was a change to the error message generated by 
the master when a previously removed framework attempts to re-register: 
https://github.com/apache/mesos/commit/8661672d80cbe3ebd05e68a6fc4167b54ea139ef

Some frameworks, rightly or not, attempt to compare the generated error string 
to "Completed framework attempted to re-register" which changed in mesos 0.24 
to "Framework has been removed". These frameworks are now broken with respect 
to this aspect of their error handling, at least until they're changed to check 
for the new error string.

Arguably frameworks shouldn't be comparing error strings since they're not 
guaranteed to remain stable across releases. However, mesos currently offers no 
alternative since there's no error **code** in the API.
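
As a rough illustration of the fragility described above, the sketch below 
shows a framework matching on the pre-0.24 wording. The function name 
`is_framework_removed` is hypothetical and not taken from any real framework; 
only the two message strings come from this issue.

```python
# Hypothetical framework-side check. Only the two message strings are taken
# from this issue; the function and constant names are illustrative.
OLD_MESSAGE = "Completed framework attempted to re-register"  # before 0.24
NEW_MESSAGE = "Framework has been removed"  # 0.24 and later


def is_framework_removed(error_message: str) -> bool:
    """Brittle check written against the pre-0.24 wording."""
    return error_message == OLD_MESSAGE


# Against a 0.24 master the same condition is no longer recognized.
print(is_framework_removed(OLD_MESSAGE))  # True
print(is_framework_removed(NEW_MESSAGE))  # False
```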

Furthermore, with the rise of the HTTP API there's room for two classes of 
errors: synchronous validation errors vs. asynchronous errors. It would be 
ideal to have meaningful 4xx error code responses for synchronous errors as 
well as error codes for asynchronous errors delivered via ERROR events. These 
error codes would become part of a stable API that mesos would treat just like 
the rest of its APIs, allowing for deprecation cycles before breaking changes - 
or at the very least a release note indicating an immediate breaking change.
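
A minimal sketch of what a code-based check might look like on the framework 
side, assuming an asynchronous ERROR event that carries a stable code next to 
the human-readable message. `ErrorCode`, `ErrorEvent`, `FRAMEWORK_REMOVED` and 
`should_give_up` are all hypothetical names; Mesos does not define them today.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCode(Enum):
    # Hypothetical codes; not part of any existing Mesos API.
    UNKNOWN = 0
    FRAMEWORK_REMOVED = 1


@dataclass
class ErrorEvent:
    # Hypothetical shape of an asynchronous ERROR event carrying a code.
    code: ErrorCode
    message: str  # human-readable text, free to change between releases


def should_give_up(event: ErrorEvent) -> bool:
    """Decide on the stable code, never on the message text."""
    return event.code is ErrorCode.FRAMEWORK_REMOVED


# The same code with either wording leads to the same decision.
old = ErrorEvent(ErrorCode.FRAMEWORK_REMOVED,
                 "Completed framework attempted to re-register")
new = ErrorEvent(ErrorCode.FRAMEWORK_REMOVED, "Framework has been removed")
print(should_give_up(old), should_give_up(new))  # True True
```

With a scheme along these lines the message text could be reworded freely, 
while a change to the code itself would go through a deprecation cycle.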

/cc [~vinodkone], [~bmahler]

  was:
For example, in mesos 0.24 there was a change to the error message generated by 
the master when a previously removed framework attempts to re-register: 
https://github.com/apache/mesos/commit/8661672d80cbe3ebd05e68a6fc4167b54ea139ef

Some frameworks, rightly or not, attempt to compare the generated error string 
to "Completed framework attempted to re-register" which changed in mesos 0.24 
to "Framework has been removed". These frameworks are now broken with respect 
to this aspect of their error handling, at least until they're changed to check 
for the new error string.

Arguably frameworks shouldn't be comparing error strings since they're not 
guaranteed to remain stable across releases. However, mesos currently offers no 
alternative since there's no error **code** in the API.

Furthermore, with the rise of the HTTP API there's room for two classes of 
errors: synchronous validation errors vs. asynchronous errors. It would be 
ideal to have meaningful 4xx error code responses for synchronous errors as 
well as error codes for asynchronous errors delivered via ERROR events. These 
error codes would become part of a stable API that mesos would treat just like 
the rest of its APIs, allowing for deprecation cycles before breaking changes - 
or at the very least a release note indicating an immediate breaking change.

/cc [~vinodkone]


> Errors communicated to the scheduler should be associated with stable error 
> codes.
> ----------------------------------------------------------------------------------
>
>                 Key: MESOS-4548
>                 URL: https://issues.apache.org/jira/browse/MESOS-4548
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.24.0
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> For example, in mesos 0.24 there was a change to the error message generated 
> by the master when a previously removed framework attempts to re-register: 
> https://github.com/apache/mesos/commit/8661672d80cbe3ebd05e68a6fc4167b54ea139ef
> Some frameworks, rightly or not, attempt to compare the generated error 
> string to "Completed framework attempted to re-register" which changed in 
> mesos 0.24 to "Framework has been removed". These frameworks are now broken 
> with respect to this aspect of their error handling, at least until they're 
> changed to check for the new error string.
> Arguably frameworks shouldn't be comparing error strings since they're not 
> guaranteed to remain stable across releases. However, mesos currently offers 
> no alternative since there's no error **code** in the API.
> Furthermore, with the rise of the HTTP API there's room for two classes of 
> errors: synchronous validation errors vs. asynchronous errors. It would be 
> ideal to have meaningful 4xx error code responses for synchronous errors as 
> well as error codes for asynchronous errors delivered via ERROR events. These 
> error codes would become part of a stable API that mesos would treat just 
> like the rest of its APIs, allowing for deprecation cycles before breaking 
> changes - or at the very least a release note indicating an immediate 
> breaking change.
> /cc [~vinodkone], [~bmahler]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
