Re: possible bug in OpenNTPD code?

2008-12-22 Thread Henning Brauer
this analysis is as correct as thye provided diff, which has been
committed.

* Anirban Sinha  [2008-12-05 03:07]:
> Hi:
> 
> I am sort of digging my way through the OpenNTPD codebase for my work. I
> think I find a bug in the code. Please help me to understand the reason
> if this is not a bug.
> 
> In function ntp_main() (ntp.c), we poll() to check if there are any
> events of interest. We do this:
> 
> 1. Check internal fds (PIPE_MAIN)
> 2. Then check PIPE_DNS fds
> 3. Then check PIPE_HOTPLUG fds
> 
> Next, for the server, we check all the fds we are listening on. And then
> finally, for nfs clients, we check the fds for the remote servers. Now,
> there's the issue in this line;
> 
> for (j = 1; nfds > 0 && j < idx_peers; j++) {
> ...
> }
> 
> Shouldn't the index start with 3? That is, shouldn't we do this:
> 
> for (j = 3; nfds > 0 && j < idx_peers; j++)
> 
> since, indices 0,1 and 2 correspond to the three checks I have written
> above which are already done.
> 
> In other words, can we apply the following patch to fix the issue?
> 
> Index: ntpd/ntp.c
> ===
> --- ntpd.orig/ntp.c
> +++ ntpd/ntp.c
> @@ -344,7 +344,7 @@ ntp_main(int pipe_prnt[2], struct ntpd_c
>   sensor_hotplugevent(hotplugfd);
>   }
> 
> - for (j = 1; nfds > 0 && j < idx_peers; j++)
> + for (j = PFD_MAX; nfds > 0 && j < idx_peers; j++)
>   if (pfd[j].revents & (POLLIN|POLLERR)) {
>   nfds--;
>   if (server_dispatch(pfd[j].fd, conf) ==
> -1)
> 
> 
> 
> Thanks,
> 
> Ani
> 

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam



Re: possible bug in OpenNTPD code?

2008-12-12 Thread Anirban Sinha
>I'd take a look at your other diff later.


Thanks for patching the upstream source in cvs with my patch.

Ani


>
>   -Otto



Re: possible bug in OpenNTPD code?

2008-12-11 Thread Ed Ahlsen-Girard (TYBRIN Corp.)
> From: Tom Van Looy [mailto:[EMAIL PROTECTED]

> Sent: Wednesday, December 10, 2008 10:50 AM

> To: Anirban Sinha

> Cc: misc@openbsd.org

> Subject: Re: possible bug in OpenNTPD code?

>

> >>Why would you assume that? That seems a bit hostile. Perhaps the

> >>developers are a bit busy at the moment.

> >

> >True. I generally post on the Linux lists and I believe I am spoiled by

> getting quick responses from my postings. In future, I will remember to keep

> more patience.

>

> I would like to respond to this with a small anecdote: I opened a call at

> vendor x at 17/sep/2008. The call resulted in me having to open a design

> change request for their OS. So, 65 days later (coincidently today), I

> received a "tracking number" and probably have to wait an other (max) 90 days

> for the developers to decide if they even will make a fix. Oh, did I mention

> how ridiculously much we pay each month for their support?





This varies GREATLY from vendor to vendor.  We had a vendor that put additional 
features in the product, because we asked for them.



Of course, we were paying for between a third and half of a full time engineer 
in support fees.



--



Ed Ahlsen-Girard




Re: possible bug in OpenNTPD code?

2008-12-10 Thread Anirban Sinha
>I would like to respond to this with a small anecdote: I opened a call

>at vendor x at 17/sep/2008. The call resulted in me having to open a

>design change request for their OS. So, 65 days later (coincidently

>today), I received a "tracking number" and probably have to wait an

>other (max) 90 days for the developers to decide if they even will make

>a fix. Oh, did I mention how ridiculously much we pay each month for

>their support?



Yes. This bites in the ass. But this is not something unheard of. Due to this 
very same reasons, we have moved our entire OS distro development in house so 
that we can timely address all issues ourselves as opposed to waiting for a 
third party vendor to address our issues. 



In this respect, the open source philosophy really works well. If you post it 
in an appropriate mailing list, someone somewhere will definitely respond to 
your message, albeit may be with slight delay. Long live the open source 
community. 



Ani




Re: possible bug in OpenNTPD code?

2008-12-10 Thread Tom Van Looy
>>Why would you assume that? That seems a bit hostile. Perhaps the
>>developers are a bit busy at the moment.
>
>True. I generally post on the Linux lists and I believe I am spoiled by
getting quick responses from my postings. In future, I will remember to keep
more patience.

I would like to respond to this with a small anecdote: I opened a call at
vendor x at 17/sep/2008. The call resulted in me having to open a design
change request for their OS. So, 65 days later (coincidently today), I
received a "tracking number" and probably have to wait an other (max) 90 days
for the developers to decide if they even will make a fix. Oh, did I mention
how ridiculously much we pay each month for their support?



Re: possible bug in OpenNTPD code?

2008-12-10 Thread Hannah Schroeter
Hi!

On Tue, Dec 09, 2008 at 10:36:46PM -0800, Anirban Sinha wrote:
>>How is it blocking indefinitely?  Is wait() not returning -1 with
>>errno == ECHILD when there are no children to reap?  What led you to
>>the conclusion that this code was blocking?  (What platform are you
>>running this on?)

>Hmm, agreed. Looks like I was wrong with my analysis. In any case, I am
>running the portable version of the ntpd on Linux. I am definitely
>observing the parent still alive and blocked (sleeping) even when the
>child is dead. I need to do some more digging on this.

IIRC, you can show the wait channel on Linux too, using something like
  ps alxww|grep ntpd

Then you perhaps can see what the parent ntpd process is really waiting
for.

Kind regards,

Hannah.



Re: possible bug in OpenNTPD code?

2008-12-10 Thread Claudio Jeker
On Thu, Dec 04, 2008 at 06:03:32PM -0800, Anirban Sinha wrote:
> Hi:
> 
> I am sort of digging my way through the OpenNTPD codebase for my work. I
> think I find a bug in the code. Please help me to understand the reason
> if this is not a bug.
> 
> In function ntp_main() (ntp.c), we poll() to check if there are any
> events of interest. We do this:
> 
> 1. Check internal fds (PIPE_MAIN)
> 2. Then check PIPE_DNS fds
> 3. Then check PIPE_HOTPLUG fds
> 
> Next, for the server, we check all the fds we are listening on. And then
> finally, for nfs clients, we check the fds for the remote servers. Now,
> there's the issue in this line;
> 
> for (j = 1; nfds > 0 && j < idx_peers; j++) {
> ...
> }
> 
> Shouldn't the index start with 3? That is, shouldn't we do this:
> 
> for (j = 3; nfds > 0 && j < idx_peers; j++)
> 
> since, indices 0,1 and 2 correspond to the three checks I have written
> above which are already done.
> 
> In other words, can we apply the following patch to fix the issue?
> 

Yes, that seems like this bit was forgotten when PFD_MAX was "introduced"
in 1.68 of ntp.c. Your fix is correct and commited.

thanks for the report
-- 
:wq Claudio



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Otto Moerbeek
On Tue, Dec 09, 2008 at 09:57:25PM -0800, Philip Guenther wrote:

> On Tue, Dec 9, 2008 at 8:10 PM, Anirban Sinha <[EMAIL PROTECTED]> wrote:
> > There is yet another bug in Openntpd. This is direct copy-paste from
> > openntpd code (ntpd.c:main()):
> >
> > do {
> >if ((pid = wait(NULL)) == -1 &&
> >errno != EINTR && errno != ECHILD)
> >fatal("wait");
> >} while (pid != -1 || (pid == -1 && errno == EINTR));
> >
> > What this code intends to do is to reap all children and move on when
> > there are no more. Instead, it ends up blocking indefinitely even when
> > there are no children to reap!
> 
> How is it blocking indefinitely?  Is wait() not returning -1 with
> errno == ECHILD when there are no children to reap?  What led you to
> the conclusion that this code was blocking?  (What platform are you
> running this on?)
> 
> 
> > The way I fixed the bug is by doing this:
> > +   if (chld_pid && (pid = waitpid(chld_pid, &status, 0)) == -1 &&
> > +   errno != EINTR && errno != ECHILD)
> > +   fatal("wait");
> > +   if (pid !=-1)
> > +   log_info("child %d exited with return code %d", 
> > pid,WEXITSTATUS(status));
> 
> This code fails to retry the waitpid() if it returns with EINTR.
> 
> 
> Philip Guenther

Philip is right. The code is ok as is. Besides, in current ntpd has
multiple children, so you one child argument is lost as well.

I'd take a look at your other diff later.

-Otto



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Otto Moerbeek
On Wed, Dec 10, 2008 at 12:05:05AM -0600, Todd Alan Smith wrote:

> On Tue, Dec 9, 2008 at 10:10 PM, Anirban Sinha <[EMAIL PROTECTED]> wrote:
> > I hope I will get some response to this. If not, I will assume that
> > there is really no interest in fixing bugs in openntpd and in that case,
> 
> Why would you assume that? That seems a bit hostile. Perhaps the
> developers are a bit busy at the moment.

Indeed it sounded hostile to me. Especially since the op is sending in a
buggy diff to code that is ok, afaiks.

-Otto



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Anirban Sinha
>Why would you assume that? That seems a bit hostile. Perhaps the

>developers are a bit busy at the moment.



True. I generally post on the Linux lists and I believe I am spoiled by getting 
quick responses from my postings. In future, I will remember to keep more 
patience.



Ani




Re: possible bug in OpenNTPD code?

2008-12-09 Thread Anirban Sinha
>How is it blocking indefinitely?  Is wait() not returning -1 with
>errno == ECHILD when there are no children to reap?  What led you to
>the conclusion that this code was blocking?  (What platform are you
>running this on?)


Hmm, agreed. Looks like I was wrong with my analysis. In any case, I am
running the portable version of the ntpd on Linux. I am definitely
observing the parent still alive and blocked (sleeping) even when the
child is dead. I need to do some more digging on this.

Apologies.

Ani



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Todd Alan Smith
On Tue, Dec 9, 2008 at 10:10 PM, Anirban Sinha <[EMAIL PROTECTED]> wrote:
> I hope I will get some response to this. If not, I will assume that
> there is really no interest in fixing bugs in openntpd and in that case,

Why would you assume that? That seems a bit hostile. Perhaps the
developers are a bit busy at the moment.



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Philip Guenther
On Tue, Dec 9, 2008 at 8:10 PM, Anirban Sinha <[EMAIL PROTECTED]> wrote:
> There is yet another bug in Openntpd. This is direct copy-paste from
> openntpd code (ntpd.c:main()):
>
> do {
>if ((pid = wait(NULL)) == -1 &&
>errno != EINTR && errno != ECHILD)
>fatal("wait");
>} while (pid != -1 || (pid == -1 && errno == EINTR));
>
> What this code intends to do is to reap all children and move on when
> there are no more. Instead, it ends up blocking indefinitely even when
> there are no children to reap!

How is it blocking indefinitely?  Is wait() not returning -1 with
errno == ECHILD when there are no children to reap?  What led you to
the conclusion that this code was blocking?  (What platform are you
running this on?)


> The way I fixed the bug is by doing this:
> +   if (chld_pid && (pid = waitpid(chld_pid, &status, 0)) == -1 &&
> +   errno != EINTR && errno != ECHILD)
> +   fatal("wait");
> +   if (pid !=-1)
> +   log_info("child %d exited with return code %d", 
> pid,WEXITSTATUS(status));

This code fails to retry the waitpid() if it returns with EINTR.


Philip Guenther



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Otto Moerbeek
On Tue, Dec 09, 2008 at 08:10:20PM -0800, Anirban Sinha wrote:

> I hope I will get some response to this. If not, I will assume that
> there is really no interest in fixing bugs in openntpd and in that case,
> I will patch only our local copy of the ntpd codebase (as opposed to
> reporting to the community).

misc@ is probably not the most suited place to report bugs. Better use
[EMAIL PROTECTED]

Anyway, I'll take a look at your diffs.

-Otto



Re: possible bug in OpenNTPD code?

2008-12-09 Thread Anirban Sinha
There is yet another bug in Openntpd. This is direct copy-paste from
openntpd code (ntpd.c:main()):

do {
if ((pid = wait(NULL)) == -1 &&
errno != EINTR && errno != ECHILD)
fatal("wait");
} while (pid != -1 || (pid == -1 && errno == EINTR));

What this code intends to do is to reap all children and move on when
there are no more. Instead, it ends up blocking indefinitely even when
there are no children to reap!

The way I fixed the bug is by doing this:

Index: openntpd-src/ntpd.c
===
--- openntpd-src.orig/ntpd.c
+++ openntpd-src/ntpd.c
@@ -90,10 +90,11 @@ main(int argc, char *argv[])
 {
struct ntpd_conf lconf;
struct pollfdpfd[POLL_MAX];
-   pid_tchld_pid = 0, pid;
+   pid_tchld_pid = 0, pid=0;
const char  *conffile;
int  ch, nfds, timeout = INFTIM;
int  pipe_chld[2];
+   int  status;
extern char *__progname;

__progname = _compat_get_progname(argv[0]);
@@ -233,11 +234,11 @@ main(int argc, char *argv[])
if (chld_pid)
kill(chld_pid, SIGTERM);

-   do {
-   if ((pid = wait(NULL)) == -1 &&
-   errno != EINTR && errno != ECHILD)
-   fatal("wait");
-   } while (pid != -1 || (pid == -1 && errno == EINTR));
+   if (chld_pid && (pid = waitpid(chld_pid, &status, 0)) == -1 &&
+   errno != EINTR && errno != ECHILD)
+   fatal("wait");
+   if (pid !=-1)
+   log_info("child %d exited with return code %d", pid,
WEXITSTATUS(status));

msgbuf_clear(&ibuf->w);
free(ibuf);


It forks one child anyway. So it suffices to reap that one child.

I hope I will get some response to this. If not, I will assume that
there is really no interest in fixing bugs in openntpd and in that case,
I will patch only our local copy of the ntpd codebase (as opposed to
reporting to the community).

Thanks,

Ani

>-Original Message-----
>From: Anirban Sinha
>Sent: Thursday, December 04, 2008 6:04 PM
>To: 'misc@openbsd.org'
>Subject: possible bug in OpenNTPD code?
>
>Hi:
>
>I am sort of digging my way through the OpenNTPD codebase for my work.
I
>think I find a bug in the code. Please help me to understand the reason
>if this is not a bug.
>
>In function ntp_main() (ntp.c), we poll() to check if there are any
>events of interest. We do this:
>
>1. Check internal fds (PIPE_MAIN)
>2. Then check PIPE_DNS fds
>3. Then check PIPE_HOTPLUG fds
>
>Next, for the server, we check all the fds we are listening on. And
then
>finally, for nfs clients, we check the fds for the remote servers. Now,
>there's the issue in this line;
>
>for (j = 1; nfds > 0 && j < idx_peers; j++) {
>...
>}
>
>Shouldn't the index start with 3? That is, shouldn't we do this:
>
>for (j = 3; nfds > 0 && j < idx_peers; j++)
>
>since, indices 0,1 and 2 correspond to the three checks I have written
>above which are already done.
>
>In other words, can we apply the following patch to fix the issue?
>
>Index: ntpd/ntp.c
>===
>--- ntpd.orig/ntp.c
>+++ ntpd/ntp.c
>@@ -344,7 +344,7 @@ ntp_main(int pipe_prnt[2], struct ntpd_c
>   sensor_hotplugevent(hotplugfd);
>   }
>
>-  for (j = 1; nfds > 0 && j < idx_peers; j++)
>+  for (j = PFD_MAX; nfds > 0 && j < idx_peers; j++)
>   if (pfd[j].revents & (POLLIN|POLLERR)) {
>   nfds--;
>   if (server_dispatch(pfd[j].fd, conf) ==
-1)
>
>
>
>Thanks,
>
>Ani



possible bug in OpenNTPD code?

2008-12-04 Thread Anirban Sinha
Hi:

I am sort of digging my way through the OpenNTPD codebase for my work. I
think I find a bug in the code. Please help me to understand the reason
if this is not a bug.

In function ntp_main() (ntp.c), we poll() to check if there are any
events of interest. We do this:

1. Check internal fds (PIPE_MAIN)
2. Then check PIPE_DNS fds
3. Then check PIPE_HOTPLUG fds

Next, for the server, we check all the fds we are listening on. And then
finally, for nfs clients, we check the fds for the remote servers. Now,
there's the issue in this line;

for (j = 1; nfds > 0 && j < idx_peers; j++) {
...
}

Shouldn't the index start with 3? That is, shouldn't we do this:

for (j = 3; nfds > 0 && j < idx_peers; j++)

since, indices 0,1 and 2 correspond to the three checks I have written
above which are already done.

In other words, can we apply the following patch to fix the issue?

Index: ntpd/ntp.c
===
--- ntpd.orig/ntp.c
+++ ntpd/ntp.c
@@ -344,7 +344,7 @@ ntp_main(int pipe_prnt[2], struct ntpd_c
sensor_hotplugevent(hotplugfd);
}

-   for (j = 1; nfds > 0 && j < idx_peers; j++)
+   for (j = PFD_MAX; nfds > 0 && j < idx_peers; j++)
if (pfd[j].revents & (POLLIN|POLLERR)) {
nfds--;
if (server_dispatch(pfd[j].fd, conf) ==
-1)



Thanks,

Ani