Yigal Reiss (yreiss) <[email protected]> wrote: > Currently the "fail-open" feature in NFQUEUE passes packets only in the case > where the limit comes from the queue size (queue->queue_total >= > queue->queue_maxlen).
Right. > In case where the qlen is high and the load is high, packets will be dropped > as result of crossing the socket's receive buffer size. This will eventually > be reported through the proc file as 'user dropped' (don't know why). It should also be reported to the application via -ENOBUFS (see netlink_overrun() in net/netlink.c . > From user perspective IMO the user doesn't care if the packet cannot be > passed as a result of a queue size or socket receive buffer size. > This is quite arbitrary and depends on average packet size. Yes, it depends on packet size and time until verdict is sent. > Actually the result may be opposite to what the user desired if he raises the > qlen wishing to increase availability but in fact causing more packets be > dropped > due to receive buffer limitations. If userspace bumps qlen it should also increase its sk buffer size accordingly. (see for instance nfnl_rcvbufsiz() in libnfnetlink). > I suggest implementing the same behavior (fail open) also for the case where > nfnetlink_unicast() fails (which usually would be due to receive buffer > limit). > > Does this make sense? Even when qlen is near-full sk rmem could be low, and vice versa. I agree that it would be nice to also implement failopen reinject. > If so, what would be the recommended way of achieving that? > The problem is that skb is being freed at netlink level (netlink_attachskb). > So it's either copying the skb each time before calling nfnetlink_unicast() > (wasteful) or passing a flag all the way to indicate that freeing is not > desired (lots of changes and involves also core netlink). Any other > suggestions? I think it would be good to consider auditing the kernel and see if its feasible to switch to a 'skb must be free'd by caller on error' model. This doesn't necessarily mean changing netlink_unicast(), it could for instance be resolved by adding netlink_unicast_try (you might want to figure out a better name...) that doesn't free on error and then make netlink_unicast a short-hand wrapper for the latter to keep the current fire-and-forget sematics around. (Alternatively, use your 'flag' method, I'm not sure what will be 'nicer' or less intrusive). -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
