On Wed, Jun 19, 2013 at 02:56:03AM -0700, Eric Dumazet wrote:
> On Wed, 2013-06-19 at 12:11 +0300, Michael S. Tsirkin wrote:
>
> > Well KVM supports up to 160 VCPUs on x86.
> >
> > Creating a queue per CPU is very reasonable, and
> > assuming cache line size of 64 bytes, netdev_queue seems to be 320
> > bytes, that's 320*160 = 51200. So 12.5 pages, order-4 allocation.
> > I agree most people don't have such systems yet, but
> > they do exist.
>
> Even so, it will just work, like a fork() is likely to work, even if a
> process needs order-1 allocation for kernel stack.
>
> Some drivers still use order-10 allocations with kmalloc(), and nobody
> complained yet.
>
> We had complaints with mlx4 driver lately only because kmalloc() now gives
> a warning if allocations above MAX_ORDER are attempted.
>
> Having a single pointer means that we can:
>
> - Attempt a regular kmalloc() call, it will work most of the time.
> - Fall back to vmalloc() _if_ kmalloc() failed.

That's a good trick too - vmalloc memory is a bit slower on x86 since it
is not mapped with huge pages, but that only matters when we have lots of
CPUs/queues...
Short term - how about switching to vmalloc if there are more than 32 queues?

> Frankly, if you want one tx queue per cpu, I would rather use
> NETIF_F_LLTX, like some other virtual devices.
>
> This way, you can have real per cpu memory, with proper NUMA affinity.
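
For reference, here is a rough userspace sketch of the sizing arithmetic from this thread (320 bytes per queue, 160 queues) together with the 32-queue cut-off suggested in the reply. It is not kernel code: the 4096-byte page size is assumed, and PAGE_SIZE_BYTES, QUEUE_BYTES, VMALLOC_QUEUE_THRESHOLD and all three helper names are made up for illustration.

```c
/* Userspace sketch only. The constants and helper names below are
 * illustrative assumptions, not kernel definitions. */
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES         4096u /* assumed x86 base page size */
#define QUEUE_BYTES             320u  /* per-queue size estimated in the thread */
#define VMALLOC_QUEUE_THRESHOLD 32u   /* cut-off floated in the reply */

/* Bytes needed for one contiguous array of nqueues queue structures. */
static size_t queue_array_bytes(unsigned int nqueues)
{
    return (size_t)nqueues * QUEUE_BYTES;
}

/* Smallest order such that (PAGE_SIZE_BYTES << order) >= bytes. */
static unsigned int alloc_order(size_t bytes)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_BYTES;

    assert(bytes > 0);
    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Short-term policy: prefer vmalloc-style memory above the cut-off. */
static int prefer_vmalloc(unsigned int nqueues)
{
    return nqueues > VMALLOC_QUEUE_THRESHOLD;
}
```

With these assumptions, queue_array_bytes(160) is 51200 bytes and alloc_order(51200) is 4, matching the order-4 figure quoted above, while a 32-queue array (10240 bytes) needs only order 2 and stays below the cut-off.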

