Thanks Ido and David for your confirmation and insight.

-- Ashutosh
On Wed, Jul 29, 2020 at 8:17 AM David Ahern <dsah...@gmail.com> wrote:
>
> On 7/29/20 5:43 AM, Ido Schimmel wrote:
> > On Tue, Jul 28, 2020 at 05:52:44PM -0700, Ashutosh Grewal wrote:
> >> Hello David and all,
> >>
> >> I hope this is the correct way to report a bug.
> >
> > Sure
> >
> >>
> >> I observed this problem with 256 v4 next-hops or 128 v6 next-hops (or
> >> 128 or so # of v4 next-hops with labels).
> >>
> >> Here is an example -
> >>
> >> root@a6be8c892bb7:/# ip route show 2.2.2.2
> >> Error: Buffer too small for object.
> >> Dump terminated
> >>
> >> Kernel details (though I recall running into the same problem on 4.4*
> >> kernel as well) -
> >> root@ubuntu-vm:/# uname -a
> >> Linux ch1 5.4.0-33-generic #37-Ubuntu SMP Thu May 21 12:53:59 UTC 2020
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> I think the problem may be to do with the size of the skbuf being
> >> allocated as part of servicing the netlink request.
> >>
> >> static int netlink_dump(struct sock *sk)
> >> {
> >> <snip>
> >>
> >> skb = alloc_skb(...)
> >
> > Yes, I believe you are correct. You will get an skb of size 4K and it
> > can't fit the entire RTA_MULTIPATH attribute with all the nested
> > nexthops. Since it's a single attribute it cannot be split across
> > multiple messages.
>
> yep, well known problem.
>
> >
> > Looking at the code, I think a similar problem was already encountered
> > with IFLA_VFINFO_LIST. See commit c7ac8679bec9 ("rtnetlink: Compute and
> > store minimum ifinfo dump size").
> >
> > Maybe we can track the maximum number of IPv4/IPv6 nexthops during
> > insertion and then consult it to adjust 'min_dump_alloc' for
> > RTM_GETROUTE.
>
> That seems better than the current design for GETLINK which walks all
> devices to determine max dump size. Not sure how you will track that
> efficiently though - add is easy, delete is not.
>
> >
> > It's a bit complicated for IPv6 because you can append nexthops, but I
> > believe anyone using so many nexthops is already using RTA_MULTIPATH to
> > insert them, so we can simplify.
>
> I hope so.
>
> >
> > David, what do you think? You have a better / simpler idea? Maybe one
> > day everyone will be using the new nexthop API and this won't be needed
> > :)
>
> exactly. You won't have this problem with separate nexthops since each
> one is small (< 4k) and the group (multipath) is a set of ids, not the
> full set of attributes.
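
For reference, a rough back-of-the-envelope sketch of why the counts reported above land near the 4K skb. It assumes each nested nexthop carries only a struct rtnexthop header plus one RTA_GATEWAY attribute; the header sizes are hard-coded to mirror the UAPI layouts rather than taken from <linux/rtnetlink.h>, and the helper names are hypothetical, not kernel functions.

```c
#include <stddef.h>

/* Sizes mirroring the Linux UAPI layouts. They are hard-coded here as an
 * assumption so the sketch builds without the kernel headers. */
enum {
    RTATTR_HDR = 4,  /* struct rtattr: u16 rta_len + u16 rta_type */
    RTNH_HDR   = 8,  /* struct rtnexthop: len, flags, hops, ifindex */
    ALIGNTO    = 4   /* netlink attribute alignment */
};

/* Round n up to the netlink alignment boundary. */
static size_t align4(size_t n)
{
    return (n + ALIGNTO - 1) & ~(size_t)(ALIGNTO - 1);
}

/* Hypothetical helper: bytes for one nested nexthop that carries only an
 * RTA_GATEWAY attribute with a gateway address of gw_addr_len bytes. */
static size_t nexthop_size(size_t gw_addr_len)
{
    return align4(RTNH_HDR) + align4(RTATTR_HDR + gw_addr_len);
}

/* Hypothetical helper: bytes for a single RTA_MULTIPATH attribute that
 * nests n such nexthops (attribute header plus the nexthop entries). */
static size_t multipath_size(size_t n, size_t gw_addr_len)
{
    return RTATTR_HDR + n * nexthop_size(gw_addr_len);
}
```

Under these assumptions multipath_size(256, 4) comes to 4100 bytes for 256 IPv4 nexthops and multipath_size(128, 16) comes to 3588 bytes for 128 IPv6 nexthops. The estimate covers the attribute alone; the netlink and route headers and the other route attributes share the same buffer, and any extra per-nexthop attributes (for example encapsulation for labels) grow each entry further.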