At this time, the only functionality provided by pfinet that is still
pending in the LwIP translator is the --peer option; everything else is
done. From now on, most of the work will consist of polishing and fixing
bugs, so I'll probably talk about many different topics in each post.

The first topic for today is an error I got when I called apt-get update
over LwIP:

"../sysdeps/posix/getaddrinfo.c:1722: rfc3484_sort: Assertion
`a1->source_addr.sin6_family == PF_INET6' failed."

Unlike the previous errors I faced, this one was located in Glibc and
aborted the user program, not the translator. But obviously, it was some
unexpected behavior in the translator that led Glibc to fail, and it was
time to find it. Taking a look at the log generated by rpctrace, I could
see that the last LwIP-provided operation called by the client before the
crash was socket_whatis_address(). I'll try to explain what this operation
is for.

Glibc needs to call two complementary operations: socket_create_address()
and socket_whatis_address(). The first one[1] creates a libports object
containing a struct sockaddr built from the given parameters: family, data
and length. The second one[2] does the opposite: given a port, it returns
the family, the data and the length of the sockaddr structure contained in
the libports object of that port. When Glibc receives a struct sockaddr
from the user, it sends its content to create_address() to get a port,
which it will later pass to other operations, for instance, bind()[3]. In
the same way, when some socket operation returns an address, Glibc receives
a port and has to call whatis_address() to get the data it needs to create
a copy of the sockaddr structure for the user. We can see it in action in
getpeername()[4].
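
To make the round trip concrete, here is a minimal sketch in plain C. It
is only a model: struct addr_object and the model_* functions are
hypothetical stand-ins for the libports object and the two RPCs, not the
code of the translator. The point is the invariant that the family given to
create_address() must be the one whatis_address() later reports.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the libports object behind an address port. */
struct addr_object
{
  socklen_t len;                    /* Bytes of `storage' in use. */
  struct sockaddr_storage storage;  /* Copy of the caller's sockaddr. */
};

/* Model of socket_create_address(): build the object from the family,
   the raw sockaddr bytes and their length.  Returns 0, or -1 on bad
   input.  */
int
model_create_address (struct addr_object *obj, int family,
                      const void *data, socklen_t len)
{
  if (len < sizeof (struct sockaddr) || len > sizeof obj->storage)
    return -1;
  memset (obj, 0, sizeof *obj);
  memcpy (&obj->storage, data, len);
  obj->storage.ss_family = (sa_family_t) family;
  obj->len = len;
  return 0;
}

/* Model of socket_whatis_address(): hand back exactly what was stored.
   *LEN is the size of DATA on entry and the bytes written on return.  */
int
model_whatis_address (const struct addr_object *obj, int *family,
                      void *data, socklen_t *len)
{
  if (*len < obj->len)
    return -1;
  memcpy (data, &obj->storage, obj->len);
  *family = obj->storage.ss_family;
  *len = obj->len;
  return 0;
}

/* Returns 1 when FAMILY survives a create/whatis round trip.  */
int
round_trip_keeps_family (int family)
{
  struct sockaddr_storage in, out;
  struct addr_object obj;
  socklen_t in_len = (family == AF_INET6)
    ? sizeof (struct sockaddr_in6) : sizeof (struct sockaddr_in);
  socklen_t out_len = sizeof out;
  int got = -1;

  memset (&in, 0, sizeof in);
  in.ss_family = (sa_family_t) family;
  if (model_create_address (&obj, family, &in, in_len) != 0)
    return 0;
  if (model_whatis_address (&obj, &got, &out, &out_len) != 0)
    return 0;
  return got == family && out_len == in_len;
}
```

In this model round_trip_keeps_family (AF_INET6) returns 1; a server that
stores an IPv6 address and reports an IPv4 one breaks exactly this check.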

Well then, in the rpctrace output I could see that the client was calling
create_address() with family 26 (IPv6), but a bit later, when calling
whatis_address() on the same port, the server was returning a sockaddr
whose family was 2 (IPv4). The problem was a bug in LwIP that, when asked
for the address of an IPv6 socket, returned an IPv4 address if dual-stack
was enabled. LwIP developers fixed[5] this bug months ago and the fix will
be ready for the next release, but for now I've applied the patch in my
source.
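
As an illustration only (this is not the code of the LwIP patch), one
common way for a dual-stack implementation to keep the family consistent is
to report an IPv4 address held by an IPv6 socket as an IPv4-mapped IPv6
address (::ffff:a.b.c.d). The helper name v4_to_mapped_v6 is made up for
this sketch.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Wrap an IPv4 sockaddr as an IPv4-mapped IPv6 sockaddr, so a caller
   that asked for AF_INET6 always gets AF_INET6 back.  */
void
v4_to_mapped_v6 (const struct sockaddr_in *v4, struct sockaddr_in6 *v6)
{
  memset (v6, 0, sizeof *v6);
  v6->sin6_family = AF_INET6;
  v6->sin6_port = v4->sin_port;          /* Already in network order. */

  /* Bytes 0..9 stay zero, bytes 10..11 are 0xff, bytes 12..15 carry
     the IPv4 address in network byte order.  */
  v6->sin6_addr.s6_addr[10] = 0xff;
  v6->sin6_addr.s6_addr[11] = 0xff;
  memcpy (&v6->sin6_addr.s6_addr[12], &v4->sin_addr.s_addr, 4);
}
```

The standard IN6_IS_ADDR_V4MAPPED() macro recognizes the result, so the
original IPv4 address can still be recovered from the last four bytes.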

After this problem, I wanted to query the server with inetutils-ifconfig,
and found it didn't work because it makes a call to ioctl with the
SIOCGIFCONF command, and that command is part of the pfinet interface,
which was still not implemented. I wrote the operation, but first I
noticed that all ioctl-related operations were written to work only over an
Ethernet device, and that produced two problems: first, trying an ioctl
over the loopback interface led to a crash; and second, I was planning to
create a new module to implement PPP, and then all ioctl and pfinet
operations would need to be rewritten. For these reasons, I've decided to
create a common interface[6] for communicating with device modules.
Besides, I've created a new loopback module[7] that sets up the loopback
interface in LwIP to adapt it to the Hurd's requirements and provides the
necessary functions to be integrated with the ioctls.
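
The idea behind a common device interface can be sketched as a table of
function pointers that every module fills in, so the ioctl code never needs
to know which kind of device it is talking to. This is a hypothetical
sketch, not the contents of ifcommon.h: struct dev_ops, the demo_* names
and the MTU values are all made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical operations every device module would export.  */
struct dev_ops
{
  const char *name;             /* Interface name, e.g. "eth0" or "lo". */
  int (*get_mtu) (void);        /* Answers an MTU query for the device. */
  int (*is_loopback) (void);    /* Nonzero for the loopback module. */
};

/* Placeholder values, chosen only for the demo.  */
enum { DEMO_ETH_MTU = 1500, DEMO_LO_MTU = 16384 };

static int eth_get_mtu (void) { return DEMO_ETH_MTU; }
static int eth_is_loopback (void) { return 0; }
static int lo_get_mtu (void) { return DEMO_LO_MTU; }
static int lo_is_loopback (void) { return 1; }

/* Each module contributes one entry; a PPP module would add another.  */
static const struct dev_ops demo_devices[] = {
  { "eth0", eth_get_mtu, eth_is_loopback },
  { "lo", lo_get_mtu, lo_is_loopback },
};

/* Look a device up by interface name; NULL when it does not exist.  */
const struct dev_ops *
demo_find_device (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof demo_devices / sizeof demo_devices[0]; i++)
    if (strcmp (demo_devices[i].name, name) == 0)
      return &demo_devices[i];
  return NULL;
}

/* Generic handler: works for any module and fails cleanly (-1) on an
   unknown interface instead of assuming an Ethernet device.  */
int
demo_ioctl_get_mtu (const char *ifname, int *mtu)
{
  const struct dev_ops *dev = demo_find_device (ifname);

  if (dev == NULL)
    return -1;
  *mtu = dev->get_mtu ();
  return 0;
}
```

With this shape, supporting a new device type means adding one table entry
rather than rewriting every ioctl handler.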

Another issue I left pending for some time was improving the performance of
the stack. Since the beginning of the project, LwIP never reached more than
40Kbps of download speed. That's because the default TCP tuning parameters
in LwIP are rather conservative, e.g. an MSS of 536 bytes or a TCP receive
window of 2144 bytes. I tried increasing the receive window to 65536 and
setting the MSS to 1460, the usual value for the Internet, and with this
setting the stack reaches about 600Kbps, still far from being competitive.
I spent a few days trying to make the stack faster but it has proved to be
non-trivial. Increasing the receive window a bit more results in more lost
packets, and for some reason LwIP has a hard time dealing with the
fast-retransmit mechanism when the window is large enough to receive
hundreds of packets before the sender fast-retransmits the lost packet. In
practice, this leads the stack to download at full speed for a random time
until the first packet is lost, which triggers the fast-retransmit
mechanism, which produces more lost packets and so on, preventing the stack
from receiving more than 100Kbps.
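
A rough way to see why a small receive window caps the download speed: the
sender can have at most one window of data in flight per round trip, so
throughput is bounded by window / RTT. The sketch below only does that
arithmetic. The function names are made up, and the 50 ms round-trip time
in the usage note is an assumption for illustration, not a measurement from
my setup.

```c
#include <stdint.h>

/* Upper bound on throughput, in bytes per second, when at most
   WINDOW_BYTES can be in flight during each round trip of RTT_MS
   milliseconds.  RTT_MS must be nonzero.  */
uint64_t
window_limited_bytes_per_sec (uint32_t window_bytes, uint32_t rtt_ms)
{
  return (uint64_t) window_bytes * 1000u / rtt_ms;
}

/* How many full-sized segments fit in the receive window.  */
uint32_t
segments_per_window (uint32_t window_bytes, uint32_t mss)
{
  return window_bytes / mss;
}
```

With the default values, segments_per_window (2144, 536) is 4, and over an
assumed 50 ms round trip window_limited_bytes_per_sec (2144, 50) gives
42880 bytes per second. With a 65536-byte window and an MSS of 1460 the
window holds 44 segments and the same bound rises to 1310720 bytes per
second, so past that point the limit is no longer the window itself but how
the stack copes with losses.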

---------------------------------------
[1]
https://github.com/jlledom/lwip-hurd/blob/b865e22e41b713198028de1034879dbff5f97824/socket-ops.c#L227
[2]
https://github.com/jlledom/lwip-hurd/blob/b865e22e41b713198028de1034879dbff5f97824/socket-ops.c#L272
[3]
http://git.savannah.gnu.org/cgit/hurd/glibc.git/tree/sysdeps/mach/hurd/bind.c?h=tschwinge/Roger_Whittaker#n107
[4]
http://git.savannah.gnu.org/cgit/hurd/glibc.git/tree/sysdeps/mach/hurd/getpeername.c?h=tschwinge/Roger_Whittaker#n38
[5]
https://github.com/yarrick/lwip/commit/d9a738d85f74f66dc7366ea8098f9c0bf4eb00e1
[6]
https://github.com/jlledom/lwip-hurd/blob/b865e22e41b713198028de1034879dbff5f97824/port/include/netif/ifcommon.h
[7]
https://github.com/jlledom/lwip-hurd/blob/b865e22e41b713198028de1034879dbff5f97824/port/netif/hurdloopif.c
