Going from silently ignoring to crashing might be too much of a step. It would certainly be possible for there to be unhandled messages people don't know about. Should we log for at least the next release cycle?
On 24 September 2016 at 18:24, James Fish <[email protected]> wrote: > When I saw this thread I thought we should be logging in the default > `handle_info/2` implementation, regardless of the changes proposed here, > because we are silently ignoring an unhandled message. This could certainly > hide bugs as described by Alexei, we got this one wrong and need to change > the default behaviour. However if we are logging `handle_info/2` should we > avoid crashing and do similarly for `handle_call/3` and `handle_cast/2`? > > For `handle_call/3`, if we log and reply then the caller may handle the > error and if we log and noreply then the caller blocks for the timeout, > which could be infinity. Therefore even though we are logging a message it > requires handling by a human. A well designed supervision tree (and a > poorly designed one can too) will limit the fault to the minimal error > kernel and recover to a (hopefully) good state as soon as possible after a > crash. In many cases the error will happen either nearly all the time and > be caught early in the development cycle or will happen very rarely and can > be fixed when priorities allow. This could be somewhere from immediately to > never such is the beauty of OTP's "let it crash" philosophy. > > With this in mind it makes sense to crash on unexpected calls/casts/other > messages so that we can use the supervision tree to recover the system to a > good state. These can occur for three reasons: > > 1) The message is intended for the GenServer but the server is in a bad > state > 2) The message is intended for the GenServer but the sender is in a bad > state > 2) The message is not intended for the GenServer and the sender is in a > bad state > > We can not recover from a bad state to a good state by logging alone. > > Currently `handle_call/3` and `handle_cast/2` have different behaviour to > `handle_info/2` because calls and cast should only be triggered by function > calls to the callback module from client processes. This means an unhandled > call or cast is definitely bad state somewhere. Whereas `handle_info/2` is > more likely to be a result of a function call made by the GenServer. Common > examples would starting async tasks, monitors, links, ports and sockets. In > many situations some messages can be safely ignored, such as for async > tasks where the :DOWN can be ignored after receiving the result message. > Therefore it is a common pattern to have a catch all clause as the last > function clause for `handle_info/2` but not `handle_call/3` and > `handle_cast/2`. > > When struggling with these problems I often look at the OTP source to see > how it is handled there. The supervisor is the corner stone of OTP and it > logs unexpected messages: https://github.com/erlang/otp/blob/ > 41d1444bb13317f78fdf600a7102ff9f7cb12d13/lib/stdlib/src/ > supervisor.erl#L630. However a supervisor can run user code, in the > `init/1` or `start` function. Both of these can be called and have > exceptions caught and the supervisor will continue to run. These cause > unhandled messages, one example would be catching a `GenServer.call/3` > timeout, which can lead to an unexpected response message arriving later. > This type of handling is not normal though. > > If `handle_info/2` is not implemented then miscellaneous messages are not > expected and perhaps should crash in the default implementation. Clearly > something has gone wrong and there is a bug. When there is a bug the > default behaviour in OTP is to crash and allow the supervision tree to do > its work. Once the bug is understood then explicit handling is added, such > as logging the unhandled message. I think we should crash in default > implementation `handle_info/2` as with `handle_call/3` and `handle_cast/2`. > > On 23 September 2016 at 21:38, José Valim <[email protected] > > wrote: > >> 2. The default implementation of handle_info/1 will exit on any >>> incoming message, in the same way handle_cast/2 and handle_call/3 already >>> do. >>> >> >> I believe this is not a good default because your processes may receive >> regular messages from other processes and you don't want to crash because >> of them. This is much more unlikely to happen with cast and call though, so >> we can afford to raise in cast and calls. >> >> I am wondering if the best way to solve this problem would be to have a >> @before_compile callback that checks if you defined any of the callbacks >> with a different arity while, at the same time, did not define one with the >> proper arity. >> >> -- >> You received this message because you are subscribed to the Google Groups >> "elixir-lang-core" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion on the web visit https://groups.google.com/d/ms >> gid/elixir-lang-core/CAGnRm4JqC6rzBUYJHd9ij%3DQRH9-mPNDG1qGk >> 1hTMZ6r331iZKA%40mail.gmail.com >> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4JqC6rzBUYJHd9ij%3DQRH9-mPNDG1qGk1hTMZ6r331iZKA%40mail.gmail.com?utm_medium=email&utm_source=footer> >> . >> >> For more options, visit https://groups.google.com/d/optout. >> > > -- You received this message because you are subscribed to the Google Groups "elixir-lang-core" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CA%2BibZ98WrNhF0Xo_mbVn-ST2b9ysqryOCPoLWNf-jM7baisH2g%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
