Re: [ClojureScript] Erlang-inspired error handling for core.async

Christian Weilbach Mon, 14 Sep 2015 13:48:32 -0700

Hi Moe,

> 
> I haven't felt the need for Erlang-style supervision in core.async,
> but am generally interested in moving around errors
> asynchronously.
> 
> How are you dealing with failures which pass through higher-order
> channel operations, e.g. into, map, pipe, etc.?


into and pipe should probably be safe. The best way to implement pure
channel logic now is with transducers. I have added a chan-super
constructor which automatically reports exceptions in the transduction
to the supervisor. But for cljs I need to interleave control flow with
blocking side-effects, in which case the transducers don't help. For
this purpose I have refit the for-comprehension to work with channels
instead of lazy-seqs. I have added an example for both higher-order
constructs here, just uncomment the respective expressions and see how
you receive the exception after 3 retry attempts:

https://github.com/whilo/full.monty/blob/41ddec45f707ca4e29f67df4c8cff09b91bb8de6/full.async/src/full/async.clj#L452
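
For the transducer case, a minimal sketch of the idea using only plain
core.async, not the actual chan-super from full.async: `chan` accepts
an exception handler as its third argument, and that handler can
forward the error somewhere central. The names `error-ch` and
`reporting-chan` are hypothetical stand-ins for the supervisor and the
constructor.

```clojure
(require '[clojure.core.async :as async :refer [chan >!! <!! close!]])

;; Hypothetical stand-in for a supervisor: a channel that collects errors.
(def error-ch (chan 10))

;; `chan` takes an optional ex-handler as third argument. It is called
;; with any Throwable raised while the transducer runs; returning nil
;; means nothing is added to the channel for the failing item.
(defn reporting-chan
  [buf-size xform]
  (chan buf-size xform (fn [e] (async/put! error-ch e) nil)))

(def c (reporting-chan 10 (map (fn [x] (/ 10 x)))))

(>!! c 5)   ; transduced to 2 and buffered on c
(>!! c 0)   ; division by zero, the exception is forwarded to error-ch
(close! c)

;; Only the successful item arrives on c; the failure arrives on error-ch.
(def results [(<!! c) (<!! c)])
(def reported (<!! error-ch))
```

Here the failing item is simply dropped from `c`; a real supervisor
would presumably also decide whether to restart the surrounding
context.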

Can you find constructs/cases I am missing?

> 
> Take care, Moe
> 
> P.S. What does >? do?

It is just there to throw an abort exception when the supervised
context dies. I understand your confusion, but this is my best bet to
abort all nested contexts: intercept all blocking ops and throw
exceptions there. This allows you to clean up safely with
try-catch-finally locally, if this is needed (e.g. file-handles,
sockets, db connections...). Preemption would be best, but this
needs runtime support or very invasive code instrumentation, I think.
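
As a rough sketch of what such an aborting put could look like (this
is an assumption for illustration, not the code in full.async): the
supervisor is modelled as a map with a hypothetical `:abort` channel
that is closed when the context dies, and the blocking variant `alts!!`
is used to keep the example short.

```clojure
(require '[clojure.core.async :refer [chan close! alts!!]])

;; Hypothetical put that aborts: races the put against the supervisor's
;; abort channel. `:priority true` makes a closed abort channel win even
;; if the put could also proceed.
(defn put-or-abort!!
  [{:keys [abort]} ch v]
  (let [[_ port] (alts!! [abort [ch v]] :priority true)]
    (when (= port abort)
      (throw (ex-info "Aborted by supervisor." {:type :aborted})))
    true))

(def super {:abort (chan)})   ; hypothetical supervisor
(def ch (chan 2))

;; While the context is alive the put goes through normally.
(def first-put (put-or-abort!! super ch :a))

;; Once the context dies, the next put throws and local cleanup runs.
(close! (:abort super))
(def cleaned-up (atom false))
(def outcome
  (try
    (put-or-abort!! super ch :b)
    (catch clojure.lang.ExceptionInfo e
      (:type (ex-data e)))
    (finally
      (reset! cleaned-up true))))
```

The point is only that the exception surfaces at the blocking op, so
the usual try-catch-finally around it gets a chance to release
resources.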

Christian

> 
> 
> On Sun, Sep 13, 2015 at 5:30 PM, Christian Weilbach
> <whitesp...@polyc0l0r.net> wrote:
> 
>> Hi,
>> 
>> I am working on a replication system which also works in the
>> browser (1). So far I have come a long way with core.async and a
>> pub-sub architecture, but I have avoided error-handling in the
>> beginning, just using a cascading close on the pub-sub
>> architecture on errors (e.g. disconnects). Lately I introduced <?
>> and go-try as described here (2) and extended that to go-loops
>> with dedicated error-channels. But this never felt right, as
>> error handling in a distributed system should not be an
>> afterthought. Erlang has a very successful and sound error 
>> handling concept, as Rich also pointed out (3).
>> 
>> The idea is basically that uncaught errors will always happen at
>> some point and when they do they just propagate and trigger the
>> restart of whole subsystems. Processes in Erlang terms, which are
>> somewhat an extended version of go-routines, have ids and are
>> explicitly wired to each other to receive exit messages when one
>> of them fails.
>> 
>> To retain the robustness of Erlang, I have reduced the concept to
>> the following 2 requirements:
>> 
>> 
>> 1) All errors (exceptions) need to be caught and propagated.
>> 2) Resources need to be freed reliably.
>> 
>> 
>> Since we operate with channels in core.async and not named
>> processes as in Erlang, the supervision needs to be passed to
>> routines by some other means unless go-routines were to be
>> globally registered as in Erlang. The most natural way is passing
>> a parameter to the routines, but this can be verbose. To reduce
>> this, one could use bindings, but these are thread-local and
>> difficult to reason about, which might break 1).
>> 
>> For requirement 2) the Erlang VM can preemptively terminate
>> processes. Since this is not possible in core.async, I have
>> decided to inject "abort" exceptions on every blocking channel
>> op, <?, >?, alt?, ... This is not perfect, but will eventually
>> throw an exception in every go-routine and free all resources
>> satisfying 2). The supervisor then needs to track all running
>> go-routines under its supervision and only restart the subsystem
>> once all routines are finished. Using the default
>> try-catch-finally exception mechanism inside of go-routines + 
>> bubbling through <? makes the error-handling somewhat intuitive
>> along Java/JavaScript semantics.
>> 
>> Example (4):
>> 
>> (let [test-fn (fn [super]
>>                 (go-super super
>>                   (try (<? (timeout 500))
>>                        (catch Exception e
>>                          (println "Caught:" (.getMessage e)))
>>                        (finally (<! (timeout 500))
>>                                 (println "Cleaned up slowly.")))))
>>       start-fn (fn [super]
>>                  (go-super super
>>                    (go-try (throw (ex-info "stale" {})))
>>                    (test-fn super)
>>                    (<? (timeout 300))
>>                    (throw (ex-info "foo" {}))))]
>>   (<?? (restarting-supervisor start-fn :retries 1 :stale-timeout 100)))
>> 
>> Since one can accidentally leave exceptions stale in some
>> channel without taking them, we also need to track these and act
>> after some timeout to satisfy 1). I am really wondering what
>> other people have thought about error handling with core.async so
>> far and whether this is the first real attempt to satisfy these
>> requirements. Since error-handling should be standardized to
>> compose over library boundaries, I think something like this
>> should move into core.async eventually. What do you think?
>> 
>> Christian
>> 
>> 
>> (1) https://github.com/replikativ/replikativ
>> (2) http://swannodette.github.io/2013/08/31/asynchronous-error-handling/
>> (3) http://www.erlang.org/download/armstrong_thesis_2003.pdf
>> (4) https://github.com/whilo/full.monty/blob/3ba439c6971c8255d34b2d39f1b7619b3b016236/full.async/src/full/async.clj#L451
>>
>> --
>> Note that posts from new members are moderated - please be
>> patient with your first post. --- You received this message
>> because you are subscribed to the Google Groups "ClojureScript"
>> group. To unsubscribe from this group and stop receiving emails
>> from it, send an email to
>> clojurescript+unsubscr...@googlegroups.com. To post to this
>> group, send email to clojurescript@googlegroups.com. Visit this
>> group at http://groups.google.com/group/clojurescript.
>> 
> 
