On 5-12-2011 19:45, Sebastian Benoit wrote:
> I see relayd crashes like this: (1)
> fatal: relay_dispatch_pfe: invalid host id
> or like this: (2)
> fatal: pfe_dispatch_hce: invalid host id
>
> There is a race of the hce and the other childs (pfe and relays)
> between loading the configuration and start of processing IMSG_HOST_STATUS
> messages.
>
> The problem is that in hce_setup_events() the host checks are started before
> all childs have all of the configuration.

Yes, I experienced the same thing, see:
http://marc.info/?l=openbsd-bugs&m=132207738531052&w=2

> A quick hack is to insert a sleep(1) at the beginning of hce_setup_events().

No, that does not work; I've seen crashes with sleeps of up to 3 seconds on
my system. And it is still a race.

> A fix might be to make 'invalid host id' non fatal:

That might lead to crashes later on, especially if the hce notifies about
new host ids that the other processes have not loaded yet.

> Another might be to inhibit the processing of IMSG_HOST_STATUS only until
> the configuration has been completed (that is after receiving IMSG_CFG_DONE):

I'm going to try this one. I'm not sure how bad it is to discard messages
though.
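
To make the second idea concrete, here is a minimal standalone sketch of the
gating logic. It is not relayd code: the message enum, struct child,
child_init() and child_dispatch() are all hypothetical names made up for
illustration, and the real imsg handling looks different. It only shows the
shape of "drop IMSG_HOST_STATUS until IMSG_CFG_DONE has been seen".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for IMSG_CFG_DONE and IMSG_HOST_STATUS. */
enum msg_type { MSG_CFG_DONE, MSG_HOST_STATUS };

struct msg {
	enum msg_type type;
	int host_id;	/* only meaningful for MSG_HOST_STATUS */
	int up;		/* 1 = host up, 0 = host down */
};

#define MAX_HOSTS 8

/* Hypothetical state of one child (a pfe or a relay). */
struct child {
	bool cfg_done;			/* set once the config is complete */
	int host_up[MAX_HOSTS];		/* -1 = no status received yet */
	unsigned int dropped;		/* status messages discarded early */
};

void
child_init(struct child *c)
{
	assert(c != NULL);
	c->cfg_done = false;
	c->dropped = 0;
	for (size_t i = 0; i < MAX_HOSTS; i++)
		c->host_up[i] = -1;
}

/*
 * Returns 0 if the message was applied, 1 if it was discarded because
 * the configuration is not complete yet, and -1 for an invalid host id
 * (the case that is fatal in the crashes quoted above).
 */
int
child_dispatch(struct child *c, const struct msg *m)
{
	assert(c != NULL && m != NULL);

	switch (m->type) {
	case MSG_CFG_DONE:
		c->cfg_done = true;
		return 0;
	case MSG_HOST_STATUS:
		if (!c->cfg_done) {
			/* Host table may be incomplete: ignore, don't die. */
			c->dropped++;
			return 1;
		}
		if (m->host_id < 0 || m->host_id >= MAX_HOSTS)
			return -1;
		c->host_up[m->host_id] = m->up;
		return 0;
	}
	return -1;
}
```

In this sketch a status update that arrives too early is simply lost, and the
host stays in the "unknown" state until another update for it arrives. Whether
that is acceptable depends on whether the hce sends the state again later,
which I have not verified.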
