On Thu, Nov 8, 2018 at 12:23 PM Juliusz Chroboczek <[email protected]> wrote:
>
> > Would you like me to try merging this against head?
>
> No need, it should be an easy merge.  Please continue testing, you're
> being very helpful.

If you want packet caps of any of this insanity, let me know.

I did four tests, detailed below. To be clear, what I just tested was my
nlogn-uthash-merge branch with your latest stuff. (However, these tests do
not really touch the resend code; it's code elsewhere that runs.) Compiled
with -O2 -pg on gcc 7 (except on the older ubuntu boxes) (your merge is
pushed out now). I do also have an osx box (on the boat).

I am primarily using unicast and enjoying my switches blinking madly...

... being as I was always the kid that brought the lithium to the swimming
hole, I tried 64k routes from one box (instead of my usual 2 or 6). Things
fall apart at about 32k in this case. (I'll try just the nlogn branch and
see what happens. It's lunchtime though, don't expect a fast turnaround.)

1) unicast, 64k routes, my uthash + you

     % cumulative     self               self    total
  time    seconds  seconds     calls  ms/call  ms/call  name
 28.75       5.06     5.06     42690     0.12     0.16  netlink_read
 19.15       8.43     3.37                              kernel_route_compare
  8.44       9.92     1.49                              check_xroutes
  8.18      11.36     1.44 193771004     0.00     0.00  do_filter
  6.68      12.53     1.18 178893184     0.00     0.00  filter_route
  5.57      13.51     0.98 238581203     0.00     0.00  xroute_compare
  4.09      14.23     0.72  21816688     0.00     0.00  really_buffer_update
  3.21      14.80     0.57 357610981     0.00     0.00  martian_prefix
  2.53      15.24     0.45   4593628     0.00     0.00  find_xroute_slot
  1.99      15.59     0.35 171937780     0.00     0.00  redistribute_filter
  1.51      15.86     0.27  27374912     0.00     0.00  roughly
  1.48      16.12     0.26   9159708     0.00     0.00  find_route_slot
  0.88      16.27     0.16                              compare_buffered_updates
  0.80      16.41     0.14   4504797     0.00     0.00  really_send_update
  0.74      16.54     0.13  21816807     0.00     0.00  output_filter
  0.71      16.67     0.13         2    62.50    62.50  wait_for_fd
  0.68      16.79     0.12  26357212     0.00     0.00  timeval_minus_msec
  0.57      16.89     0.10         1   100.00   100.00  getint

2) I inlined QSORT, and the relevant xroute and route match routines (the
compiler thinks they are too big to inline, and misses that the caller data
part of the compare is static... which is the whole point of inlining
qsort....) - and I get to...

wait for it... 61785 routes, with ~5k being unreachable... before it all
falls apart. It's not just cpu but I/O here... lemme try mcast

     % cumulative     self               self    total
  time    seconds  seconds     calls  ms/call  ms/call  name
 15.24       0.39     0.39        94     4.15    22.16  flushupdates
 13.28       0.73     0.34   2714259     0.00     0.00  find_xroute_slot
  8.20       0.94     0.21   5370044     0.00     0.00  find_route_slot
  7.81       1.14     0.20  44478662     0.00     0.00  check_xroutes
  7.62       1.34     0.20  13823548     0.00     0.00  roughly
  7.42       1.53     0.19   2634733     0.00     0.00  really_buffer_update
  7.42       1.72     0.19  10676052     0.00     0.00  send_multihop_request
  5.08       1.85     0.13  10883489     0.00     0.00  network_prefix
  4.49       1.96     0.12  55686887     0.00     0.00  find_xroute
  2.73       2.03     0.07       479     0.15     0.29  netlink_read
  2.73       2.10     0.07       437     0.16     0.20  parse_packet
  2.15       2.16     0.06  13353980     0.00     0.00  timeval_minus_msec
  2.15       2.21     0.06   3085315     0.00     0.00  filter_route
  1.95       2.26     0.05   2714259     0.00     0.00  add_xroute
  1.56       2.30     0.04  10780559     0.00     0.00  flushbuf
  1.56       2.34     0.04         1    40.00    40.00  getword
  1.17       2.37     0.03         3    10.00    10.00  do_filter
  0.98       2.40     0.03  10780555     0.00     0.00  jitter

3) mcast instead: it appears to "hold it together" longer...

4) killed -pg (64k routes or BUST!) :) compiled with -O3

nope. But I'll try 6 boxes....

...

Just like the sith, there are always two bugs, and in these cases I'm
looking at the uthash branch on my weak i3 nuc.

In terms of caps, evaluating the routes sent... I could really use some
automation to pull apart the caps in context with the test. In particular,
I would like to see when wildcard and other forms of expired requests are
getting sent. Got anything?

>
> -- Juliusz

-- 
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
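
Note on comparing the two gprof flat profiles in this message: the following is only a minimal sketch, assuming the usual flat-profile row layout (three numeric columns, then optionally calls and two ms/call columns, then the function name). The names `FlatRow` and `parse_flat_profile` are hypothetical helpers made up for illustration; they are not part of gprof or babeld. It handles rows such as kernel_route_compare, where gprof prints no call count.

```python
import re
from typing import List, NamedTuple, Optional


class FlatRow(NamedTuple):
    """One row of a gprof flat profile (hypothetical helper type)."""
    pct_time: float
    cumulative_s: float
    self_s: float
    calls: Optional[int]  # blank in gprof output for some functions
    name: str


# Three leading decimals, then optionally (calls, self ms/call, total ms/call),
# then a plain C identifier. Names containing dots are not handled here.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)"
    r"(?:\s+(\d+)\s+\d+\.\d+\s+\d+\.\d+)?"
    r"\s+([A-Za-z_]\w*)\s*$"
)


def parse_flat_profile(text: str) -> List[FlatRow]:
    """Return the profile rows found in text, skipping headers and prose."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header line or anything that is not a profile row
        pct, cum, self_s, calls, name = m.groups()
        rows.append(FlatRow(float(pct), float(cum), float(self_s),
                            int(calls) if calls is not None else None,
                            name))
    return rows


# Three rows copied from test 1 in this message.
sample = """\
 28.75       5.06     5.06     42690     0.12     0.16  netlink_read
 19.15       8.43     3.37                              kernel_route_compare
  8.18      11.36     1.44 193771004     0.00     0.00  do_filter
"""
rows = parse_flat_profile(sample)
for row in sorted(rows, key=lambda r: r.self_s, reverse=True):
    print(row.name, row.self_s, row.calls)
```

Keying the parsed rows by name would make it straightforward to line up self seconds and call counts for the same function across two runs.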
