> Considering your configuration, if this machine has more addresses than one...
Scratch that; I just noticed you're using more than one address per protocol. This means that the traversal is `(65535 - 1025 + 1) * 16` nodes long. Oops.

On Mon, Sep 4, 2017 at 7:09 PM, Alberto Leiva <[email protected]> wrote:

> Hmmmmmmmmmmmmmmmmmmmmmmmmmm. I have a theory. There might be an
> optimization that could fix this to a massive extent. Although I have
> the feeling that I've already thought about this before, so there might
> be a good reason why I didn't apply this optimization already. Then
> again, I could just have forgotten while developing.
>
> How many addresses does the attacker machine have? I don't really know
> the nature of TRex's test, but I can picture it trying to use as many of
> its node's IPv6 addresses and ports as possible to open as many
> connections as possible through Jool. Considering your configuration, if
> this machine has more addresses than one... then I think I see the
> problem. It's a very, very stupid oversight of mine. And one I am
> extremely surprised managed to slip past the performance tests so
> easily.
>
> Then again, I haven't had to dive deep into the session code for a
> while, so I might simply be missing this optimization. I badly need to
> look into this.
>
> > Q2: what can we do to improve this?
>
> Ok, here's my attempt to explain it:
>
> There should (in theory) exist a quick way for Jool to tell whether
> pool4 has been exhausted (i.e., there is one existing BIB entry for
> every available pool4 address). If this information were available, Jool
> should be able to skip *an entire tree traversal* for every translated
> packet that needs the creation of a new BIB entry.
>
> In your case, the tree traversal is `65535 - 1025 + 1 = 64511` nodes
> long. And yes, this operation has to lock, because otherwise it can end
> up with a corrupted tree. So yeah, this is really bad.
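To make the arithmetic above concrete, here is a quick sanity check (a throwaway sketch of my own; the 1025–65535 port range and the 16 addresses per protocol come from the configuration discussed in this thread):

```python
# Sanity check of the traversal sizes quoted above.
# Port range 1025..65535 inclusive, as configured in the thread.
ports_per_address = 65535 - 1025 + 1
print(ports_per_address)  # 64511 candidate nodes for a single pool4 address

# With 16 addresses per protocol, an exhausted pool4 means every
# (address, port) candidate is visited, under the spinlock, before the
# allocation attempt fails:
total_candidates = ports_per_address * 16
print(total_candidates)  # 1032176 nodes walked per failed allocation
```

So every packet that needs a new BIB entry against an exhausted pool4 pays for a million-node walk while holding the lock, which is consistent with every ksoftirqd thread pegging a core.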
> Now, I think I'm starting to realize why I might not have implemented
> this: because pool4 is intended as a mostly static database (this helps
> minimize locking), and usage counters would break that. So implementing
> this might end up being a tradeoff. But give me a few days; I'll think
> about it.
>
> Maybe I also assumed that the admin would always grant enough addresses
> to pool4. If pool4 has enough addresses to actually serve the traffic,
> it won't waste so much time iterating pointlessly.
>
> > PS: Developers that want to work on this and would like access to my
> > lab boxes (Dell R630 with lots of cores, memory and some 10Gbit/s
> > NICs): feel free to contact me!
>
> That'd be interesting. If my theory proves to be a flop, this would be
> helpful in spotting the real bottleneck; I don't have anything right now
> capable of overwhelming Jool.
>
> > Lockless/LRU structures?
>
> I'd love to find a way to do this locklessly, but I haven't proven that
> creative.
>
> The problem is that there are two separate trees that need to be kept in
> harmony (one for IPv4 BIB lookups, another for IPv6 BIB lookups);
> otherwise, Jool will inevitably create conflicting entries and traffic
> will start going in mismatched directions. At the same time, the BIB
> needs to be kept consistent with pool4, so the creation of a BIB entry
> also depends on many other BIB entries.
>
> On top of that, this all happens in interrupt context, so Jool can't
> afford itself the luxury of a mutex. It *has* to be a dirty spinlock.
>
> Come to think of it, Jool 4 might not necessarily translate in interrupt
> context, so the latter constraint could be overcome. Interesting.
>
> > "f-args": 10,
>
> This should be contributing to the problem, but not by much, I think.
>
> Actually, if TRex's test is random, it's probably not contributing much
> at all. It would be more of an issue in a real environment.
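The exhaustion shortcut being proposed above can be modeled in a few lines. This is purely an illustrative userspace sketch — `Pool4Sketch` and its methods are names I made up, and Jool's real implementation is kernel C guarded by a spinlock — but it shows the core idea: keep a usage counter next to pool4 so an exhausted pool is detected in O(1) instead of by walking every candidate node:

```python
# Illustrative sketch only, NOT Jool's actual code: a pool4-like
# allocator that counts (address, port) pairs in use, so a request for a
# new BIB entry against an exhausted pool fails immediately instead of
# traversing the whole candidate tree under a lock.
class Pool4Sketch:
    def __init__(self, addresses, port_min=1025, port_max=65535):
        self.capacity = len(addresses) * (port_max - port_min + 1)
        self.in_use = 0  # the proposed usage counter
        self._free = [(addr, port)
                      for addr in addresses
                      for port in range(port_min, port_max + 1)]

    def allocate(self):
        if self.in_use == self.capacity:  # O(1) exhaustion check:
            return None                   # skip the traversal entirely
        self.in_use += 1
        return self._free.pop()

    def release(self, pair):
        self._free.append(pair)
        self.in_use -= 1
```

The tradeoff mentioned above shows up in `in_use`: the counter turns pool4 from a mostly static, read-mostly database into something every allocation and release must write to, which is exactly the kind of shared mutable state the current design tries to avoid.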
> > kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
> > [kworker/u769:0:209211]
>
> Yeah... this is not acceptable. I really hope I can find a fix.
>
> On Mon, Sep 4, 2017 at 4:23 PM, Sander Steffann <[email protected]>
> wrote:
>
>> Hi,
>>
>> > Just before the box freezes there are a lot of ksoftirqd threads
>> > quite busy:
>> >
>> >    PID USER  PR  NI   VIRT   RES   SHR S  %CPU %MEM   TIME+ COMMAND
>> >     20 root  20   0      0     0     0 R 100.0  0.0 1:27.29 [ksoftirqd/2]
>> >     60 root  20   0      0     0     0 R 100.0  0.0 1:16.64 [ksoftirqd/10]
>> >     90 root  20   0      0     0     0 R 100.0  0.0 1:44.79 [ksoftirqd/16]
>> >    160 root  20   0      0     0     0 R 100.0  0.0 1:32.92 [ksoftirqd/30]
>> >    220 root  20   0      0     0     0 R 100.0  0.0 1:42.52 [ksoftirqd/42]
>> >     50 root  20   0      0     0     0 R 100.0  0.0 1:26.21 [ksoftirqd/8]
>> >    120 root  20   0      0     0     0 R 100.0  0.0 1:24.15 [ksoftirqd/22]
>> >    190 root  20   0      0     0     0 R 100.0  0.0 1:47.05 [ksoftirqd/36]
>> >    200 root  20   0      0     0     0 R 100.0  0.0 1:49.58 [ksoftirqd/38]
>> >    210 root  20   0      0     0     0 R 100.0  0.0 1:58.60 [ksoftirqd/40]
>> >    230 root  20   0      0     0     0 R 100.0  0.0 1:35.77 [ksoftirqd/44]
>> >    240 root  20   0      0     0     0 R 100.0  0.0 1:59.37 [ksoftirqd/46]
>> >    250 root  20   0      0     0     0 R 100.0  0.0 1:41.46 [ksoftirqd/48]
>> >    260 root  20   0      0     0     0 R 100.0  0.0 1:28.37 [ksoftirqd/50]
>> >    280 root  20   0      0     0     0 R 100.0  0.0 1:17.73 [ksoftirqd/54]
>> >    100 root  20   0      0     0     0 R  86.3  0.0 1:35.04 [ksoftirqd/18]
>> >    150 root  20   0      0     0     0 R  86.3  0.0 1:36.73 [ksoftirqd/28]
>> >     30 root  20   0      0     0     0 R  85.3  0.0 1:18.45 [ksoftirqd/4]
>> >     40 root  20   0      0     0     0 R  85.3  0.0 1:42.14 [ksoftirqd/6]
>> >     70 root  20   0      0     0     0 R  85.3  0.0 1:27.51 [ksoftirqd/12]
>> >    110 root  20   0      0     0     0 R  85.3  0.0 1:22.93 [ksoftirqd/20]
>> >    130 root  20   0      0     0     0 R  85.3  0.0 1:37.32 [ksoftirqd/24]
>> >    140 root  20   0      0     0     0 R  85.3  0.0 1:36.85 [ksoftirqd/26]
>> >      3 root  20   0      0     0     0 S  84.3  0.0 2:47.02 [ksoftirqd/0]
>> >     80 root  20   0      0     0     0 R  84.3  0.0 1:43.63 [ksoftirqd/14]
>> >    270 root  20   0      0     0     0 R  84.3  0.0 1:21.22 [ksoftirqd/52]
>> >    170 root  20   0      0     0     0 R  66.7  0.0 1:43.50 [ksoftirqd/32]
>> >    180 root  20   0      0     0     0 R  51.0  0.0 0:52.70 [ksoftirqd/34]
>> >    444 root  20   0      0     0     0 R  46.1  0.0 1:41.75 [kworker/34:1]
>> > 205389 root  20   0      0     0     0 R  19.6  0.0 0:04.53 [kworker/u769:2]
>> >    892 root  20   0 110908 69888 69556 S  18.6  0.1 4:06.82 /usr/lib/systemd/systemd-journald
>> >    644 root  20   0      0     0     0 S  15.7  0.0 0:19.73 [kworker/14:1]
>> >    740 root  20   0      0     0     0 S  15.7  0.0 0:08.38 [kworker/52:1]
>> >  26633 root  20   0      0     0     0 S  15.7  0.0 0:35.72 [kworker/6:0]
>> > 209211 root  20   0      0     0     0 S  15.7  0.0 0:03.96 [kworker/u769:0]
>> >    111 root  20   0      0     0     0 S  14.7  0.0 0:16.67 [kworker/20:0]
>> >    541 root  20   0      0     0     0 S  14.7  0.0 0:16.59 [kworker/12:1]
>> >   2698 root  20   0      0     0     0 S  14.7  0.0 0:09.77 [kworker/28:1]
>> >  12100 root  20   0      0     0     0 S  14.7  0.0 0:17.94 [kworker/18:1]
>> >  40451 root  20   0      0     0     0 S  14.7  0.0 0:12.69 [kworker/24:0]
>> >  40592 root  20   0      0     0     0 S  14.7  0.0 0:23.30 [kworker/32:1]
>> > 212923 root  20   0      0     0     0 S  14.7  0.0 0:02.89 [kworker/4:1]
>> > 300255 root  20   0      0     0     0 S  14.7  0.0 0:17.89 [kworker/26:0]
>> >
>> > At least I'm making some use of those 28 cores ;)
>>
>> Fun addition: the kernel just warned me right after I could log back
>> in:
>>
>> Message from [email protected] at Sep 4 23:15:42 ...
>> kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
>> [kworker/u769:0:209211]
>>
>> Very busy indeed :)
>>
>> Cheers!
>> Sander
>>
>>
>> _______________________________________________
>> Jool-list mailing list
>> [email protected]
>> https://mail-lists.nic.mx/listas/listinfo/jool-list
_______________________________________________
Jool-list mailing list
[email protected]
https://mail-lists.nic.mx/listas/listinfo/jool-list
