On Wed, May 20, 2015 at 7:43 AM, Nick Wellnhofer wrote:
> On 18/05/2015 02:09, Marvin Humphrey wrote:
>>
>> As an alternative to throwing exceptions or storing exception objects in
>> thread-local variables, let's consider encoding error information into
>> return values using a crude form of algebraic data types: pre-defined
>> "MAYBE" types which can be either an Err or something else.
>
> +1. This is a great idea.

OK, let's explore this further.
Changing the Clownfish runtime to avoid exceptions will mean more explicit
error checking in client code. Whether this is an improvement in general is
an old debate -- some people prefer the elegance of exceptions, some people
prefer the transparency of local status code handling. But for Clownfish
specifically, preferring return status codes over exceptions offers a key
advantage:
  * It is reasonable to promote return codes to raised exceptions at the
    Clownfish/host border.
  * In contrast, it is challenging to trap all raised exceptions at the
    Clownfish/host border and convert them to return codes.
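To make the first point concrete, here is a minimal sketch of promoting a return code to an exception at the border. It is not Clownfish code: the host's exception mechanism is simulated with `setjmp`/`longjmp`, and every name in it (`MAYBEint`, `safe_divide`, `host_raise`, `host_divide`, `host_try_divide`) is hypothetical. The core routine only ever returns; the glue layer is the single place where control flow becomes non-local.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for Err and a MAYBE type. */
typedef struct Err { const char *message; } Err;
typedef struct MAYBEint { int value; Err *err; } MAYBEint;

/* Simulated host exception machinery (an assumption for this sketch;
 * a real binding would call into the host language's own API). */
jmp_buf  host_catch_point;
Err     *host_pending_exception = NULL;

void
host_raise(Err *err) {
    host_pending_exception = err;
    longjmp(host_catch_point, 1);
}

/* Core routine: reports failure through its return value only. */
Err divide_by_zero = { "Division by zero" };

MAYBEint
safe_divide(int a, int b) {
    MAYBEint result = { 0, NULL };
    if (b == 0) {
        result.err = &divide_by_zero;
    }
    else {
        result.value = a / b;
    }
    return result;
}

/* Border glue: inspect the MAYBE value and promote an error to a
 * raised host exception. */
int
host_divide(int a, int b) {
    MAYBEint result = safe_divide(a, b);
    if (result.err != NULL) {
        host_raise(result.err);
    }
    return result.value;
}

/* Host-side caller: returns 0 on success, 1 if an exception was caught. */
int
host_try_divide(int a, int b, int *out) {
    host_pending_exception = NULL;
    if (setjmp(host_catch_point) != 0) {
        return 1;
    }
    *out = host_divide(a, b);
    return 0;
}
```

Going the other way would require wrapping every call that might raise in something like `host_try_divide`, which is the hard direction described above.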
For sample usage from C, let's consider a refactored version of Vector's
constructor, which can fail with either an out-of-memory error or an overflow
error.
    MAYBEVector
    Vec_new(size_t capacity) {
        Vector *self = (Vector*)Class_Make_Obj(VECTOR);
        if (self == NULL) {
            return MAYBEVec_bad(cfish_Err_oom);
        }
        return Vec_init(self, capacity);
    }

    MAYBEVector
    Vec_init(Vector *self, size_t capacity) {
        if (capacity > SIZE_MAX / sizeof(Obj*)) {
            DECREF(self);
            Err *err = DROP(OVERFLOWERR, "Vector index overflow");
            return MAYBEVec_bad(err);
        }
        else {
            self->size  = 0;
            self->cap   = capacity;
            self->elems = (Obj**)CALLOC(capacity, sizeof(Obj*));
            if (self->elems == NULL && capacity > 0) {
                DECREF(self);
                return MAYBEVec_bad(cfish_Err_oom);
            }
        }
        return MAYBEVec_good(self);
    }
A couple of notes about the novel constructs `DROP` and `cfish_Err_oom`:
  * `DROP` is intended as something akin to `THROW`, in that it generates an
    exception object with context information using `__LINE__`, `__FILE__`,
    and so on -- but it returns the error object rather than raising it as
    an exception.
  * `cfish_Err_oom` is a special, immortal Err object indicating
    out-of-memory. It's needed because attempting to allocate a new error
    when you've just run out of memory has a significant chance of failing.
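As a rough illustration of those two constructs, here is a self-contained sketch in plain C. It makes several assumptions that are not part of the proposal: the `Err` struct layout, the helpers `Err_make` and `Err_destroy`, a one-argument `DROP` (the version above also takes an error class), and the `MAYBEsize` type with its `_good`/`_bad` constructors are all invented here to keep the example small.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Clownfish's Err class. */
typedef struct Err {
    const char *message;   /* not copied; assumed to be a string literal */
    const char *file;
    int         line;
    int         immortal;  /* nonzero: statically allocated, never freed */
} Err;

/* Immortal out-of-memory error. It lives in static storage, so reporting
 * OOM never requires another allocation. */
static Err Err_oom_instance = { "Out of memory", "(static)", 0, 1 };
Err *const Err_oom = &Err_oom_instance;

/* Build an Err carrying context; fall back to the immortal OOM error if
 * the allocation for the Err itself fails. */
Err*
Err_make(const char *message, const char *file, int line) {
    Err *err = (Err*)malloc(sizeof(Err));
    if (err == NULL) {
        return Err_oom;
    }
    err->message  = message;
    err->file     = file;
    err->line     = line;
    err->immortal = 0;
    return err;
}

/* Release an Err; immortal errors are left alone. */
void
Err_destroy(Err *err) {
    if (err != NULL && !err->immortal) {
        free(err);
    }
}

/* Like THROW in that it captures __FILE__ and __LINE__, but the error
 * object is returned to the caller instead of being raised. */
#define DROP(message) Err_make((message), __FILE__, __LINE__)

/* A MAYBE type: `err` is NULL on success, otherwise `value` is unused. */
typedef struct MAYBEsize {
    size_t  value;
    Err    *err;
} MAYBEsize;

MAYBEsize
MAYBEsize_good(size_t value) {
    MAYBEsize maybe = { value, NULL };
    return maybe;
}

MAYBEsize
MAYBEsize_bad(Err *err) {
    MAYBEsize maybe = { 0, err };
    return maybe;
}

/* Same overflow check as Vec_init, isolated into a helper that drops an
 * error rather than throwing. */
MAYBEsize
bytes_for_elems(size_t capacity) {
    if (capacity > SIZE_MAX / sizeof(void*)) {
        return MAYBEsize_bad(DROP("Vector index overflow"));
    }
    return MAYBEsize_good(capacity * sizeof(void*));
}
```

A caller would check `err` before touching `value`, and hand any non-NULL error to `Err_destroy` (or pass it further up) once done with it.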
Some Vector subroutines currently return void but can throw exceptions -- for
example, `Vec_Push`. Such subroutines will need to change to return a value
which encodes a potential error:
    MAYBEbool
    Vec_Push_IMP(Vector *self, Obj *element) {
        MAYBEbool result = SI_grow_and_oversize(self, self->size, 1);
        if (MAYBEbool_ERR(result)) {
            return result;
        }
        self->elems[self->size] = element;
        self->size++;
        return MAYBEbool_good(true);
    }
Here's some client code using Vector as it exists today. It builds up a
Vector containing the strings "1" through "10":
    Vector*
    one_through_ten(void) {
        Vector *vector = Vec_new(10);
        for (int32_t i = 1; i <= 10; i++) {
            Vec_Push(vector, (Obj*)Str_newf("%i32", i));
        }
        return vector;
    }
Here's the equivalent code -- which still throws exceptions -- but using
Vector after migration to MAYBE types.
    MAYBEVector
    one_through_ten(void) {
        Vector *vector = Vec_UNWRAP(Vec_new(10));
        for (int32_t i = 1; i <= 10; i++) {
            String *string = Str_UNWRAP(Str_newf("%i32", i));
            Vec_Push(vector, (Obj*)string);
        }
        return MAYBEVec_good(vector);
    }
Here's how that code might look after refactoring to handle error codes
locally instead of throwing exceptions:
    MAYBEVector
    one_through_ten(void) {
        MAYBEVector maybe_vec = Vec_new(10);
        if (MAYBEVec_ERR(maybe_vec)) {
            return maybe_vec;
        }
        Vector *vector = MAYBEVec_VALUE(maybe_vec);
        for (int32_t i = 1; i <= 10; i++) {
            MAYBEString maybe_str = Str_newf("%i32", i);
            String *string = MAYBEStr_VALUE(maybe_str);
            if (string == NULL) {
                DECREF(vector);
                return MAYBEVec_bad(MAYBEStr_ERR(maybe_str));
            }
            Err *error = MAYBEbool_ERR(Vec_Push(vector, (Obj*)string));
            if (error != NULL) {
                DECREF(vector);
                return MAYBEVec_bad(error);
            }
        }
        return MAYBEVec_good(vector);
    }
Pretty verbose!
Next let's contemplate Clownfish header syntax.
Syntax for sum types typically uses boolean operators. Here's some sample ML:
https://en.wikipedia.org/wiki/Tagged_union#Examples
    datatype tree = Leaf
                  | Node of (int * tree * tree)
The problem with using such syntax in Clownfish headers is that the Clownfish
signature and C signature would look too different from each other.
Instead, I'd suggest declaring the possible error types that a subroutine can
return using a `drops` clause:
    public inert incremented MAYBEVector
    new(size_t capacity)
        drops OverflowErr*, OutOfMemoryErr*;

    public inert incremented MAYBEVector
    init(Vector *vector, size_t capacity)
        drops OverflowErr*, OutOfMemoryErr*;

    // ...

    public MAYBEbool
    Push(Vector *vector, Obj *elem)
        drops OverflowErr*, OutOfMemoryErr*;
Thoughts?
To be clear, this is only brainstorming at this point -- it's not something
we're going to implement tomorrow.
Marvin Humphrey