Out of curiosity I just took a look at the thttpd CGI code and noticed something I don't understand myself.
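For comparison, this is roughly the fork/exec pattern I would normally expect. It is only a sketch for a POSIX host, not the thttpd code: run_and_capture() is a made-up helper, and a pipe stands in for the connection socket. I have not checked how NuttX's own exec() and task_create() differ from this.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of thttpd): run argv[0] with its stdout
 * connected to a pipe and copy what it prints into buf.
 * Returns the number of bytes read, or -1 on error. If exit_code is not
 * NULL it receives the child's exit status (-1 if it did not exit normally).
 */
ssize_t run_and_capture(char *const argv[], char *buf, size_t len,
                        int *exit_code)
{
  int fds[2];
  int status;
  pid_t pid;
  ssize_t total = 0;
  ssize_t n;

  assert(argv != NULL && argv[0] != NULL);

  if (pipe(fds) < 0)
    {
      return -1;
    }

  /* Mark both ends close-on-exec, similar in effect to what
   * accept4(..., SOCK_CLOEXEC) does for the accepted socket.
   */
  if (fcntl(fds[0], F_SETFD, FD_CLOEXEC) < 0 ||
      fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0 ||
      (pid = fork()) < 0)
    {
      close(fds[0]);
      close(fds[1]);
      return -1;
    }

  if (pid == 0)
    {
      /* Child: dup2() creates the new descriptor with FD_CLOEXEC cleared,
       * so stdout survives the exec while fds[0] and fds[1] are closed.
       */
      if (dup2(fds[1], STDOUT_FILENO) < 0)
        {
          _exit(126);
        }

      execvp(argv[0], argv);

      /* Only reached if exec failed: the child just terminates. */
      _exit(127);
    }

  /* Parent: drop the write end and read until EOF or the buffer is full. */
  close(fds[1]);
  while ((size_t)total < len &&
         (n = read(fds[0], buf + total, len - (size_t)total)) > 0)
    {
      total += n;
    }

  close(fds[0]);

  if (waitpid(pid, &status, 0) < 0)
    {
      return -1;
    }

  if (exit_code != NULL)
    {
      *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }

  return total;
}
```

Called with { "echo", "hello", NULL } this should hand back whatever echo printed. With a program that does not exist, the child leaves via _exit(127) and nothing else happens after the failed exec, which is the behaviour I would have expected from cgi_child().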
There is cgi_child(), which looks like it is meant to be called after a fork, in the child. It prepares the file descriptors and calls exec. The result of the exec is stored in a variable named child, which is an odd name. Normally exec never returns, and if it does, that is an error; the usual thing would be for the child to kill itself. The code does go into error handling if exec returned < 0, which is always the case when exec returns. But the next thing it does is set up a timeout for the child. This is the child, and it has already failed the exec, so why set up a timeout when we are already in an error state?

However, the function is started with task_create, not fork. If I got it right, the task has its own file descriptors, as a forked process would, so exec closing the task's copies of the sockets should be fine as long as the use count on them is properly increased. But if it is intended to behave like fork/exec, why does anything happen after the exec?

On Tue, Jul 01, 2025 at 03:00:23PM -0300, Alan C. Assis wrote:
> Hi Tim,
>
> You are right, it doesn't execute, but some subprocess (like a CGI) could
> try to execute.
>
> This comment there shed some light about it:
>
> "I wouldn't describe O_CLOEXEC as there principally for privilege
> escalation / security reasons -- it's also very, very common to have
> non-security bugs happen (frequently of the indefinite-blocking variety)
> if a FD is left open beyond when it's intended to be closed because a
> subprocess still has it."
>
> So, why does removing SOCK_CLOEXEC make http work? If the fd is not
> executed, the socket shouldn't be closed, right?
>
> And why was it working in the past? Which modification broke this?
> Maybe understanding it is important to have the right fix (maybe removing
> it is acting as a band-aid).
>
> Wengzhe, could you please help us to understand this network issue?
>
> BR,
>
> Alan
>
> On Tue, Jul 1, 2025 at 12:28 PM Tim Hardisty <timhardist...@gmail.com>
> wrote:
>
> > But that's the point - thttpd *does* call exec() so the open socket file
> > descriptor gets closed when it is still needed by the exec'd application.
> >
> > If there's another way of doing this I'm listening!
> >
> > On 01/07/2025 16:13, Alan C. Assis wrote:
> > > Hi Tim,
> > >
> > > Nice finding!
> > >
> > > Now we need to understand why this worked in the past and now it doesn't.
> > >
> > > Also, what are the implications of removing SOCK_CLOEXEC? A few pointers
> > > here:
> > >
> > > https://stackoverflow.com/questions/22304631/what-is-the-purpose-to-set-sock-cloexec-flag-with-accept4-same-as-o-cloexec
> > >
> > > BR,
> > >
> > > Alan
> > >
> > > On Tue, Jul 1, 2025 at 11:27 AM Tim Hardisty <timhardist...@gmail.com>
> > > wrote:
> > >
> > >> The error was, indeed, the socket being opened with the SOCK_CLOEXEC
> > >> flag set.
> > >>
> > >> PR to follow.
> > >>
> > >> On 28/06/2025 16:16, Tim Hardisty wrote:
> > >>> Actually - it might be a change last year. The socket is now opened
> > >>> like this and I assume CLOEXEC will mess up the operation of the
> > >>> executed CGI app (will investigate on Monday; not sure what socket
> > >>> mode it needs to be):
> > >>>
> > >>> hc->conn_fd = accept4(listen_fd, (struct sockaddr *)&sa, &sz,
> > >>>                       SOCK_CLOEXEC);
> > >>>
> > >>> On 28/06/2025 13:22, Alan C. Assis wrote:
> > >>>> Hi Tim,
> > >>>>
> > >>>> Yes, I think send() is the preferred form to work with sockets
> > >>>> because you can have fine control, i.e. passing flags as the fourth
> > >>>> argument (MSG_DONTWAIT, etc).
> > >>>>
> > >>>> If you suspect that the bug was caused by some recent modification,
> > >>>> try to find a supported board that was used to test thttpd in the
> > >>>> past and test an old NuttX release with it.
> > >>>> This is the approach I use to double check if something is broken in
> > >>>> the mainline.
> > >>>>
> > >>>> BR,
> > >>>>
> > >>>> Alan
> > >>>>
> > >>>> On Fri, Jun 27, 2025 at 3:39 PM Tim Hardisty <timhardist...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Is it as "simple" as thttpd should do:
> > >>>>>
> > >>>>> nwritten = send(sock_fd, buffer, totalbytesread, 0);
> > >>>>>
> > >>>>> rather than the generic:
> > >>>>>
> > >>>>> nwritten = write(sock_fd, buffer, nbytes);
> > >>>>>
> > >>>>> On 27/06/2025 18:40, Tim Hardisty wrote:
> > >>>>>> Trying to get thttpd's CGI handling working and have found that the
> > >>>>>> dup(2) calls of stdin and stdout return a file descriptor that's
> > >>>>>> already been allocated to the NET socket (via thttpd I think).
> > >>>>>>
> > >>>>>> That isn't right is it?
> > >>>>>>
> > >>>>>> I am not sure if it's a side effect of something that thttpd does
> > >>>>>> (that might have been OK in the past but is now not right) or a
> > >>>>>> NuttX bug, or a missing Kconfig setting that relates to this.
> > >>>>>>
> > >>>>>> The result is that the ultimate copying of buffered html that should
> > >>>>>> be written via the socket FD gets rejected as the FD doesn't have WR
> > >>>>>> access (and is now the wrong FD anyway!).
> > >>>>>>
> > >>>>>> Perhaps there's been a change in the way NuttX deals with all of
> > >>>>>> this that didn't get sorted in thttpd?

--
B.Walter <be...@bwct.de> https://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and much more.