Hello everyone,

I am facing a deadlock in the TCP stack and I'm looking for some
discussion on ways I might fix it.

My setup:
I am using a slightly older version of NuttX (nuttx-12.1.0) but I
believe this is still a problem in the latest code.
I am using a device with an external modem and a slightly modified
version of nuttx-apps/netutils/pppd/pppd.c (but the stock one should
have the same problem).
pppd uses the "tun" driver to pass TCP packets from the modem to the TCP stack.

The problem is that in my application, there is a period of time where
I am not calling recv() but instead I am calling send() (trying to
send something large).

What is happening is that during this time, I may receive data from
the server, which consumes all of the IOBs. I am now unable to send
anything because IOBs are required to send. Specifically, I am failing
to get an IOB in devif_send. This leads to a loop of TCP retries that
always fail to get an IOB.

In my application I can't exit the send (it needs IOBs), so my
application never gets to recv(), which would free the IOBs for the
send to finish (a circular dependency on the shared IOB pool).
I know I can change my application so that recv can never depend on
send (2 independent threads) but I am looking for a more general
approach.
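
For reference, a minimal sketch of that two-thread split, using plain
POSIX sockets and pthreads. The names drain_ctx, drain_thread and
send_all are hypothetical and only for illustration; a real application
would hand the received bytes to a consumer instead of counting them.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical context for the receiver thread. */
struct drain_ctx
{
  int fd;        /* connected socket */
  size_t total;  /* bytes received so far */
};

/* Receiver thread: keeps calling recv() so that received data is
 * consumed (and its buffers released) regardless of what the sending
 * side is doing.  Exits on EOF or error.
 */
static void *drain_thread(void *arg)
{
  struct drain_ctx *ctx = arg;
  char buf[256];
  ssize_t n;

  while ((n = recv(ctx->fd, buf, sizeof(buf), 0)) > 0)
    {
      ctx->total += (size_t)n;
    }

  return NULL;
}

/* Sender side: loops until the whole buffer has been handed to send(). */
static int send_all(int fd, const char *data, size_t len)
{
  while (len > 0)
    {
      ssize_t n = send(fd, data, len, 0);
      if (n < 0)
        {
          return -1;
        }

      data += n;
      len -= (size_t)n;
    }

  return 0;
}
```

With this split, a blocked send_all() no longer prevents recv() from
running, which is why it avoids the circular dependency at the
application level.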

I know there is CONFIG_IOB_THROTTLE to throttle RX packets, but that
won't work here because the tun driver needs an IOB to input the data
from the modem in tun_write. Both new data and TCP ACKs need to come
through tun_write, and it can't know the difference when allocating
the IOB.

My current workaround is to set CONFIG_NET_RECV_BUFSIZE such that RX
is limited and cannot consume all of the IOBs. This seems to be
working, but requires some manual pre-calculation and is likely to
break in the future with changes to other consumers of IOBs (CAN,
syslog buffer, etc.) or if I have multiple sockets open.
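
The pre-calculation can be sketched roughly as below. This is only an
approximation under stated assumptions: the receive limit is taken to
apply per socket, per-IOB overhead and chain granularity are ignored,
and rx_bufsize_cap and its parameters are hypothetical names. The
inputs would come from CONFIG_IOB_NBUFFERS and CONFIG_IOB_BUFSIZE plus
however many IOBs I decide to keep free for TX and other users.

```c
/* Rough upper bound for the per-socket receive buffer size, in bytes,
 * such that nsockets full receive buffers still leave reserved_iobs
 * IOBs free for sending and for other IOB consumers.
 *
 * nbuffers      - total number of IOBs in the pool
 * iob_bufsize   - payload bytes per IOB
 * reserved_iobs - IOBs that RX must never be able to consume
 * nsockets      - number of sockets that may buffer RX data at once
 */
static unsigned int rx_bufsize_cap(unsigned int nbuffers,
                                   unsigned int iob_bufsize,
                                   unsigned int reserved_iobs,
                                   unsigned int nsockets)
{
  if (nsockets == 0 || nbuffers <= reserved_iobs)
    {
      return 0;  /* nothing can safely be given to RX */
    }

  return (nbuffers - reserved_iobs) * iob_bufsize / nsockets;
}
```

For example, with 36 IOBs of 196 bytes and 8 IOBs held back, one socket
gets a cap of 5488 bytes and two sockets get 2744 bytes each. Any new
IOB consumer changes reserved_iobs, which is why this is fragile.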

My ask:

1. Is there something I am missing? Is there a better approach or fix
anyone is aware of?

2. While looking at IOB, I think the CONFIG_IOB_THROTTLE logic was
broken in https://github.com/apache/nuttx/pull/7616/
netdev_iob_prepare always calls net_iobtimedalloc(false, timeout);
first, so it always uses a non-throttled IOB even when throttled=true.
Unless I am misunderstanding something, this means that it
effectively ignores the throttled parameter. I think the 2 calls to
net_iobtimedalloc should be reversed. I did see it was flipped in
https://github.com/apache/nuttx/pull/8029/ (along with other changes)
but that was reverted a few days later in
https://github.com/apache/nuttx/pull/8081
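
To illustrate the ordering concern, here is a toy model. It is not
NuttX code: toy_pool, toy_alloc, prepare_current and prepare_reversed
are hypothetical names, and the model assumes a throttled allocation
only succeeds while more than "throttle" buffers are free.
prepare_reversed is just one possible reading of "reversing the calls".

```c
#include <stdbool.h>

/* Toy IOB pool: nfree buffers left, the last 'throttle' of them are
 * reserved for non-throttled callers.
 */
struct toy_pool
{
  int nfree;
  int throttle;
};

/* A throttled request may not dip into the reserved buffers. */
static bool toy_alloc(struct toy_pool *p, bool throttled)
{
  int floor = throttled ? p->throttle : 0;

  if (p->nfree <= floor)
    {
      return false;
    }

  p->nfree--;
  return true;
}

/* Order as I read it today: non-throttled first, throttled fallback.
 * The fallback can never succeed once the first call has failed.
 */
static bool prepare_current(struct toy_pool *p, bool throttled)
{
  if (toy_alloc(p, false))
    {
      return true;
    }

  return throttled && toy_alloc(p, true);
}

/* Reversed order: throttled first, and only callers that did not ask
 * for throttling may fall back to the reserved buffers.
 */
static bool prepare_reversed(struct toy_pool *p, bool throttled)
{
  if (toy_alloc(p, true))
    {
      return true;
    }

  return !throttled && toy_alloc(p, false);
}
```

In this model, with 2 free buffers and a throttle of 2, a throttled
request succeeds under prepare_current (taking a reserved buffer) but
is refused under prepare_reversed, which is the behavior I would
expect from the throttled parameter.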

thanks

Daniel Lizewski
