As I mentioned in my previous post about ioctl operations, LwIP
imposes a maximum number of sockets, which is a big problem for a
system like the Hurd. Now that I've finished all the tasks in my
initial proposal, I thought it was a good idea to spend some time
studying this issue and trying to find a solution.

The problem is this: in LwIP, sockets are stored in a fixed-size
array. That is the best solution for embedded systems: it is a natural
way of indexing items with no need for additional variables, and it
lets the user access any socket directly from its index in constant
time. If we remove the limit on the number of sockets, then we can no
longer use an array, and another solution is required to handle them.

In the LwIP translator, I've chosen to replace the array with a linked
list, and that has some implications: we need to iterate over the list
any time we create, destroy or look up a particular socket, make sure
each socket takes its proper place in the list so it stays indexable,
and add new variables to make the list work. All this can be done if
we accept the implicit loss of performance, but there's a problem that
doesn't have an easy solution: the limit established by the FD_SETSIZE
macro. I'll explain it in detail.

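As a rough illustration of what the linked list implies, here is a
minimal sketch in C. The names sock_node, sock_find, sock_create and
sock_destroy are made up for this example and are not the ones used in
the translator; the real code also stores the socket state in each
node. The list is kept sorted by index so the lowest free index can be
reused.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type, for illustration only. */
struct sock_node {
    int index;               /* socket number seen by callers */
    struct sock_node *next;  /* next socket, in increasing index order */
};

/* Look a socket up by index: O(n) instead of the array's O(1). */
static struct sock_node *sock_find(struct sock_node *head, int index)
{
    for (; head != NULL && head->index <= index; head = head->next)
        if (head->index == index)
            return head;
    return NULL;
}

/* Create a socket at the lowest free index, keeping the list sorted. */
static struct sock_node *sock_create(struct sock_node **head)
{
    struct sock_node **link = head;
    struct sock_node *node;
    int index = 0;

    assert(head != NULL);

    /* Skip the leading run of consecutive indices 0, 1, 2, ... */
    while (*link != NULL && (*link)->index == index) {
        link = &(*link)->next;
        index++;
    }

    node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->index = index;
    node->next = *link;
    *link = node;
    return node;
}

/* Unlink and free the socket with the given index, if it exists. */
static void sock_destroy(struct sock_node **head, int index)
{
    struct sock_node **link;

    for (link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->index == index) {
            struct sock_node *dead = *link;
            *link = dead->next;
            free(dead);
            return;
        }
    }
}
```

Every operation walks the list, which is the loss of performance
mentioned above, but nothing here bounds the number of sockets.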
In the Hurd, each process has its own socket counter, whereas LwIP has
one single global socket counter. That implies that the global counter
is actually the sum of the socket counters of all processes, and it's
quite easy for this total to eventually exceed FD_SETSIZE. When that
happens, the stack can't put the next socket into a struct fd_set, and
therefore it can no longer call lwip_select() with this socket. Since
Glibc calls the io_select() operation each time the user calls send()
or recv(), in practice such sockets are just unusable. Thus, only the
processes that get one of the first FD_SETSIZE sockets receive a
useful socket; the others will receive a potentially infinite number
of useless sockets.

After thinking about this, I finally found a solution I'm not
especially happy with, but it was the only one that came to mind and,
since removing the restriction on the number of sockets is a priority,
I wrote it.

This is the solution: if we take a look at FD_SET, FD_CLR and FD_ISSET
as they are defined in Glibc, we can see they don't protect the
fd_set against overflow (this is getting interesting), so we can use
them to write past the FD_SETSIZE-th bit of the fd_set. Since in our
LwIP server all calls to lwip_select() are centralized in the
io_select() operation, we can allocate as much memory as needed in
this operation and cast the pointer to this memory area so it is
treated as a pointer to fd_set, and lwip_select() can work as usual.
Unfortunately, I've had to make some changes in lwip_select() too,
because FD_ZERO doesn't clear anything past FD_SETSIZE and because
lwip_select() declares some internal fd_sets that must also be safe to
write to. In the end, the whole thing looks like this. That said, if
anyone has a better idea, I'll be glad to hear it.

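To make the trick more concrete, here is a minimal sketch in C of an
oversized set that is allocated on the heap and passed around as a
pointer to fd_set. It is not the code from the translator: the
big_fdset_* helpers are hypothetical names, and they set and test bits
by hand instead of calling FD_SET and FD_ISSET, so the sketch doesn't
depend on how a particular libc build treats descriptors above
FD_SETSIZE. It assumes the Glibc layout, where fd_set is an array of
long-sized words and descriptor fd lives in bit fd % bits-per-word of
word fd / bits-per-word.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/select.h>

/* Bits per word, assuming fd_set is an array of long-sized words. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for descriptors 0..nfds-1, never less than a real fd_set. */
static size_t big_fdset_bytes(int nfds)
{
    size_t words = ((size_t)nfds + WORD_BITS - 1) / WORD_BITS;
    size_t bytes = words * sizeof(unsigned long);
    return bytes < sizeof(fd_set) ? sizeof(fd_set) : bytes;
}

/* Allocate a zeroed area and hand it out as if it were an fd_set. */
static fd_set *big_fdset_alloc(int nfds)
{
    assert(nfds >= 0);
    return calloc(1, big_fdset_bytes(nfds));
}

/* Replacement for FD_ZERO, which only clears sizeof(fd_set) bytes. */
static void big_fdset_zero(fd_set *set, int nfds)
{
    memset(set, 0, big_fdset_bytes(nfds));
}

/* Set the bit for fd, with no check against FD_SETSIZE. */
static void big_fdset_set(int fd, fd_set *set)
{
    unsigned long *words = (unsigned long *)set;
    words[(size_t)fd / WORD_BITS] |= 1UL << ((size_t)fd % WORD_BITS);
}

/* Test the bit for fd, with no check against FD_SETSIZE. */
static int big_fdset_isset(int fd, const fd_set *set)
{
    const unsigned long *words = (const unsigned long *)set;
    return (int)((words[(size_t)fd / WORD_BITS]
                  >> ((size_t)fd % WORD_BITS)) & 1UL);
}
```

In io_select(), nfds would be derived from the highest socket number
involved, and the resulting pointer is what gets passed on to
lwip_select(). The sets declared inside lwip_select() need the same
sizing and zeroing, which is why it had to be changed as well.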
In conclusion, the issue is solved in the Hurd, but all the changes
I've made to the stack were initially intended to be merged upstream
and become part of the next release of LwIP. Unfortunately, the
FD_SETSIZE issue doesn't have an obvious solution for embedded
systems. Here is the link to the discussion in the LwIP mailing
lists for further information.