Re: [ntp:questions] refclock use causes core dump of ntpd

wa6zvp Thu, 22 Feb 2007 09:46:09 -0800

On Feb 22, 7:04 am, Ronan Flood <[EMAIL PROTECTED]> wrote:
> "wa6zvp" <[EMAIL PROTECTED]> wrote:
> > > > We know it is from the call to abort() at line 788 of refclock_true.c.
>
> > > Yep.  Unfortunately, the code should never get there.  Yea, right.
>
> > OK, I got warm and cuddly with gdb, at least enough to set some
> > breakpoints and look at variables.
>
> > The main culprit looks like line 540 (in refclock_true).  This is in
> > the received data function.  It calls true_doevent with a parameter
> > of e_Poll.  Event e_Poll is never handled anywhere in doevent, so the
> > outcome is very state dependent.
>
> > Even after replacing the original abort() call at line 788 with a
> > break;, the program would abort at other unhandled places in doevent.
>
> That's more understandable, but looking at the code I don't see how it
> got to line 788, since that's the default on a switch(up->type) which
> should only ever be one of t_unknown, t_goes, t_omega, t_tm, or t_tcu
> as they are the only values ever assigned to it, and they all have
> matching cases in the switch.  What was the value of up->type when
> it got to line 788?  And up->state?


* My recollection is that up->type actually had t_unknown in it, making
it even more puzzling.  I don't remember the state.

> What I'd expect is that the state machine starts with t_unknown and
> s_Base then sees e_Init, from true_start() lines 290-292, which takes
> it into ss_InqGOES.  If it then gets e_Poll from true_receive(), it
> would abort at line 726.  Various other scenarios I have not looked
> at exhaustively, but getting to line 788 is puzzling ...

* It certainly is.  I'll fiddle with gdb some more tonight, maybe doing
some instruction tracing from true_receive.

I can't do much from work, since I can't disconnect the serial data
line.  If I start ntpd under gdb, it just says 'normal completion',
while the forked process crashes.  Is there a way to get gdb to follow
into the forked process?  If not, I have to get ntpd running without
the data and then attach to the running process.  This will have to
wait till tonight.

My feeling is that a refclock driver should _never_ cause ntpd to die.
I think it should just do verbose debugging and continue on as best it
can.  The fact that it never gets into a reached status would be a clue
that it's not working right.  In this case, however, continuing makes
it work.  This happens because the serial data is actually parsed
correctly.
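
To make the "log and continue" idea concrete, here is a rough sketch.
It is not the real refclock_true.c code: the names demo_doevent,
demo_unit, DS_* and DE_* are made up for illustration, and the state
table is reduced to a single transition.  The only point is that an
unhandled state/event pair is reported and counted instead of ending
in abort().

```c
#include <stdio.h>

/* Hypothetical states and events, loosely modeled on the ones discussed
 * in this thread.  This is an illustration, not the driver's code. */
enum demo_state { DS_BASE, DS_INQ_GOES };
enum demo_event { DE_INIT, DE_POLL };

struct demo_unit {
    enum demo_state state;
    unsigned unhandled;     /* how many events were dropped */
};

/* Returns 1 if the event was handled, 0 if it was logged and ignored.
 * Nothing in here calls abort(); the caller just carries on. */
int demo_doevent(struct demo_unit *up, enum demo_event ev)
{
    switch (up->state) {
    case DS_BASE:
        if (ev == DE_INIT) {
            up->state = DS_INQ_GOES;    /* the one transition we model */
            return 1;
        }
        break;
    case DS_INQ_GOES:
        /* This sketch handles no events in this state. */
        break;
    }

    /* Unhandled combination: complain, count it, keep the state. */
    up->unhandled++;
    fprintf(stderr, "demo_doevent: unhandled event %d in state %d\n",
            (int)ev, (int)up->state);
    return 0;
}
```

With a unit starting in DS_BASE, DE_INIT moves it to DS_INQ_GOES, and a
following DE_POLL is logged and dropped while the state stays put.  A
counter like 'unhandled' gives the same kind of clue as the clock never
showing as reached, without taking the whole daemon down.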

More later.

Roger



_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.isc.org/mailman/listinfo/questions
