When I saw this thread I thought we should be logging in the default
`handle_info/2` implementation, regardless of the changes proposed here,
because we are silently ignoring an unhandled message. This could certainly
hide bugs as described by Alexei, we got this one wrong and need to change
the default behaviour. However if we are logging `handle_info/2` should we
avoid crashing and do similarly for `handle_call/3` and `handle_cast/2`?

For `handle_call/3`, if we log and reply then the caller may handle the
error and if we log and noreply then the caller blocks for the timeout,
which could be infinity. Therefore even though we are logging a message it
requires handling by a human. A well designed supervision tree (and a
poorly designed one can too) will limit the fault to the minimal error
kernel and recover to a (hopefully) good state as soon as possible after a
crash. In many cases the error will happen either nearly all the time and
be caught early in the development cycle or will happen very rarely and can
be fixed when priorities allow. This could be somewhere from immediately to
never such is the beauty of OTP's "let it crash" philosophy.

With this in mind it makes sense to crash on unexpected calls/casts/other
messages so that we can use the supervision tree to recover the system to a
good state. These can occur for three reasons:

1) The message is intended for the GenServer but the server is in a bad
state
2) The message is intended for the GenServer but the sender is in a bad
state
2) The message is not intended for the GenServer and the sender is in a bad
state

We can not recover from a bad state to a good state by logging alone.

Currently `handle_call/3` and `handle_cast/2` have different behaviour to
`handle_info/2` because calls and cast should only be triggered by function
calls to the callback module from client processes. This means an unhandled
call or cast is definitely bad state somewhere. Whereas `handle_info/2` is
more likely to be a result of a function call made by the GenServer. Common
examples would starting async tasks, monitors, links, ports and sockets. In
many situations some messages can be safely ignored, such as for async
tasks where the :DOWN can be ignored after receiving the result message.
Therefore it is a common pattern to have a catch all clause as the last
function clause for `handle_info/2` but not `handle_call/3` and
`handle_cast/2`.

When struggling with these problems I often look at the OTP source to see
how it is handled there. The supervisor is the corner stone of OTP and it
logs unexpected messages:
https://github.com/erlang/otp/blob/41d1444bb13317f78fdf600a7102ff9f7cb12d13/lib/stdlib/src/supervisor.erl#L630.
However a supervisor can run user code, in the `init/1` or `start`
function. Both of these can be called and have exceptions caught and the
supervisor will continue to run. These cause unhandled messages, one
example would be catching a `GenServer.call/3` timeout, which can lead to
an unexpected response message arriving later. This type of handling is not
normal though.

If `handle_info/2` is not implemented then miscellaneous messages are not
expected and perhaps should crash in the default implementation. Clearly
something has gone wrong and there is a bug. When there is a bug the
default behaviour in OTP is to crash and allow the supervision tree to do
its work. Once the bug is understood then explicit handling is added, such
as logging the unhandled message. I think we should crash in default
implementation `handle_info/2` as with `handle_call/3` and `handle_cast/2`.

On 23 September 2016 at 21:38, José Valim <[email protected]>
wrote:

>   2. The default implementation of handle_info/1 will exit on any incoming
>> message, in the same way handle_cast/2 and handle_call/3 already do.
>>
>
> I believe this is not a good default because your processes may receive
> regular messages from other processes and you don't want to crash because
> of them. This is much more unlikely to happen with cast and call though, so
> we can afford to raise in cast and calls.
>
> I am wondering if the best way to solve this problem would be to have a
> @before_compile callback that checks if you defined any of the callbacks
> with a different arity while, at the same time, did not define one with the
> proper arity.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/elixir-lang-core/CAGnRm4JqC6rzBUYJHd9ij%3DQRH9-mPNDG1qGk
> 1hTMZ6r331iZKA%40mail.gmail.com
> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4JqC6rzBUYJHd9ij%3DQRH9-mPNDG1qGk1hTMZ6r331iZKA%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/CA%2BibZ996XjTtLS%2BK6MQeFA2S%3DiTc%3DpkhO0ftvx5NCZGYgp_yow%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to