Going from silently ignoring to crashing might be too big a step. There
could well be unhandled messages that people don't know about. Should we
log for at least the next release cycle?
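
For illustration, here is a minimal sketch of what a logging default might
look like, written as an explicit catch-all clause. The module name
`LoggingServer` is hypothetical, and the log level and message wording are
assumptions, not what Elixir currently ships.

```elixir
defmodule LoggingServer do
  # Hypothetical module, for illustration only.
  use GenServer
  require Logger

  def start_link(state) do
    GenServer.start_link(__MODULE__, state)
  end

  def init(state), do: {:ok, state}

  # Catch-all clause: log the unexpected message and keep running,
  # instead of silently dropping it or crashing the process.
  def handle_info(msg, state) do
    Logger.error(
      "#{inspect(__MODULE__)} #{inspect(self())} received unexpected message: " <>
        inspect(msg)
    )

    {:noreply, state}
  end
end
```

With this clause in place, `send(pid, :unexpected)` leaves the server alive
and its state untouched; the only visible effect is the log entry.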

On 24 September 2016 at 18:24, James Fish <[email protected]> wrote:

> When I saw this thread I thought we should be logging in the default
> `handle_info/2` implementation, regardless of the changes proposed here,
> because we are silently ignoring an unhandled message. This could certainly
> hide bugs, as described by Alexei. We got this one wrong and need to change
> the default behaviour. However, if we are logging in `handle_info/2`, should
> we avoid crashing and do the same for `handle_call/3` and `handle_cast/2`?
>
> For `handle_call/3`, if we log and reply then the caller may handle the
> error, and if we log and noreply then the caller blocks for the timeout,
> which could be infinity. Therefore, even though we are logging a message, it
> still requires handling by a human. A well designed supervision tree (and
> sometimes even a poorly designed one) will limit the fault to the minimal
> error kernel and recover to a (hopefully) good state as soon as possible
> after a crash. In many cases the error will either happen nearly all the
> time and be caught early in the development cycle, or happen very rarely
> and be fixed when priorities allow. That could be anywhere from
> immediately to never; such is the beauty of OTP's "let it crash" philosophy.
>
> With this in mind it makes sense to crash on unexpected calls/casts/other
> messages so that we can use the supervision tree to recover the system to a
> good state. These can occur for three reasons:
>
> 1) The message is intended for the GenServer but the server is in a bad
> state
> 2) The message is intended for the GenServer but the sender is in a bad
> state
> 3) The message is not intended for the GenServer and the sender is in a
> bad state
>
> We cannot recover from a bad state to a good state by logging alone.
>
> Currently `handle_call/3` and `handle_cast/2` behave differently from
> `handle_info/2` because calls and casts should only be triggered by function
> calls to the callback module from client processes. This means an unhandled
> call or cast definitely indicates a bad state somewhere. A `handle_info/2`
> message, by contrast, is more likely to be the result of a function call
> made by the GenServer itself. Common examples would be starting async tasks,
> monitors, links, ports and sockets. In many situations some messages can be
> safely ignored, such as for async tasks, where the `:DOWN` message can be
> ignored after receiving the result message. Therefore it is a common pattern
> to have a catch-all clause as the last function clause for `handle_info/2`
> but not for `handle_call/3` and `handle_cast/2`.
>
> When struggling with these problems I often look at the OTP source to see
> how it is handled there. The supervisor is the cornerstone of OTP and it
> logs unexpected messages:
> https://github.com/erlang/otp/blob/41d1444bb13317f78fdf600a7102ff9f7cb12d13/lib/stdlib/src/supervisor.erl#L630
> However, a supervisor can run user code, in the `init/1` or `start`
> function. Both of these can be called and have exceptions caught, and the
> supervisor will continue to run. This can cause unhandled messages; one
> example would be catching a `GenServer.call/3` timeout, which can lead to an
> unexpected response message arriving later. This type of handling is not
> normal though.
>
> If `handle_info/2` is not implemented then miscellaneous messages are not
> expected and perhaps should crash in the default implementation. Clearly
> something has gone wrong and there is a bug. When there is a bug the
> default behaviour in OTP is to crash and allow the supervision tree to do
> its work. Once the bug is understood then explicit handling is added, such
> as logging the unhandled message. I think we should crash in the default
> implementation of `handle_info/2`, as with `handle_call/3` and `handle_cast/2`.
>
> On 23 September 2016 at 21:38, José Valim <[email protected]> wrote:
>
>>> 2. The default implementation of handle_info/2 will exit on any
>>> incoming message, in the same way handle_cast/2 and handle_call/3 already
>>> do.
>>>
>>
>> I believe this is not a good default because your processes may receive
>> regular messages from other processes and you don't want to crash because
>> of them. This is much less likely to happen with cast and call though, so
>> we can afford to raise in casts and calls.
>>
>> I am wondering if the best way to solve this problem would be to have a
>> @before_compile callback that checks whether you defined any of the
>> callbacks with a different arity without also defining one with the proper
>> arity.

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/CA%2BibZ98WrNhF0Xo_mbVn-ST2b9ysqryOCPoLWNf-jM7baisH2g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.