Appropriate Byte Counting during Congestion Avoidance
Hi everyone,

We noticed CWND is growing much slower than expected during congestion avoidance with new reno, and we came to this piece of code in cc_ack_received() at tcp_input.c:353

    if (type == CC_ACK) {
            if (tp->snd_cwnd > tp->snd_ssthresh) {
                    tp->t_bytes_acked += min(tp->ccv->bytes_this_ack,
                        nsegs * V_tcp_abc_l_var * tcp_maxseg(tp));
                    if (tp->t_bytes_acked >= tp->snd_cwnd) {
                            tp->t_bytes_acked -= tp->snd_cwnd;
                            tp->ccv->flags |= CCF_ABC_SENTAWND;
                    }

The increment of t_bytes_acked is capped at 2*maxseg. The description of the sysctl variable tcp_abc_l_var (default value 2) is "Cap the max cwnd increment during slow-start to this number of segments"

After reading RFC3465, it doesn't look like this cap should be applied here since this is clearly not during slow-start. We've seen in some cases the receiver is ACKing every 16 packets, and CWND is growing at 1/8 of the expected rate because of this.

I would appreciate your opinion on this. Thanks a lot.

Regards,
Liang
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
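The 1/8 figure in the message above can be checked with a small sketch. It models only the min() expression from the quoted snippet and is not the kernel code itself. The function names are made up for illustration, and it assumes nsegs is 1 for a single, non-coalesced ACK, which the quoted snippet does not show.

```c
#include <stdint.h>

/* Smaller of two unsigned values, standing in for the kernel's min(). */
uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/*
 * Bytes credited to t_bytes_acked for one ACK when the cap from the
 * quoted snippet applies: min(bytes_this_ack, nsegs * abc_l * maxseg).
 */
uint32_t
abc_credit_capped(uint32_t bytes_this_ack, uint32_t nsegs, uint32_t abc_l,
    uint32_t maxseg)
{
	return (min_u32(bytes_this_ack, nsegs * abc_l * maxseg));
}

/* Bytes credited if every acknowledged byte were counted (no cap). */
uint32_t
abc_credit_uncapped(uint32_t bytes_this_ack)
{
	return (bytes_this_ack);
}

/*
 * Number of ACKs needed to accumulate cwnd bytes of credit, treating
 * cwnd as fixed for the duration (a simplification).
 */
uint32_t
acks_until_cwnd_increase(uint32_t cwnd, uint32_t credit_per_ack)
{
	return ((cwnd + credit_per_ack - 1) / credit_per_ack);
}
```

With a hypothetical maxseg of 1448 bytes, a cwnd of 64 segments, and one ACK per 16 segments, the capped credit is 2 * 1448 bytes per ACK and the uncapped credit is 16 * 1448 bytes, so the capped path needs 32 ACKs per cwnd increase where the uncapped path needs 4. That is the factor of 8 described above.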
Re: Is anybody using ng_pipe?
Writing a new netgraph node is relatively simple. Take ng_sample.c and ng_sample.h and copy them. Change names to suit, and add your own code in the middle. Use one of the 50 other nodes as examples. No matter what you want to do, one of them already does it.

-- Julian Elischer jul...@elischer.org
Re: Is anybody using ng_pipe?
> On Tue, Aug 18, 2020 at 2:43 PM Eugene Grosbein wrote:
> > Sorry, missed that. But why wasn't possible?
>
> There's a daemon running on the system that handles most network
> configuration. It's quite inflexible and will override any manual
> configuration changes. It manages firewall rules but is ignorant of
> netgraph, so it will remove any dummynet rules but leave netgraph
> configuration alone. It was significantly easier to just use ng_pipe,
> even after having to fix or work around the bugs, than it was to fight
> the daemon.
>
> On Tue, Aug 18, 2020 at 2:56 PM Marko Zec wrote:
> > The probability that a frame is completely unaffected by BER events,
> > and thus shouldn't be dropped, is currently computed as
> >
> > Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)
>
> The problem is in its calculation of Psingle_bit_unaffected(BER). The
> BER is the fraction of bits that are affected, therefore it is the
> probability that a bit is affected. But for some reason,
> Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 -
> BER. This leads to the probability table being wrong. For example,
> given a BER of 2350, the probability that a 1500-byte packet is
> not dropped is:

Is this a confusion over Bit Error Rate vs. Bit Error Ratio? A BER of 2350 I must assume is 1 bit in 2350 bits.

1 - 1/BER looks correct to me for a Bit Error Rate.
1 - BER looks correct to me for a Bit Error Ratio (usually a percentage).

>
> (1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%.
>
> However, ng_pipe calculates a fixed-point probability value of
> 281460603879001. To calculate whether a frame should be dropped,
> ng_pipe takes this probability value and shifts it right by 17,
> yielding 2147373991. It then calls rand() to generate a random number
> in the range [0,2**31-1]; if the random number is larger than the
> probability value then it is dropped, otherwise it is kept. The
> chances that a packet is kept is therefore 2147373991/(2**31 - 1), or
> about 99.99%.

This looks like an optimization to reduce calculation, as it is done for every packet. Why not do the calculation more accurately and only for each "error"? See below.

> It's easy enough to fix this one, but I wasn't sure that it would be
> so easy to fix the TSO/LRO issue without significantly increasing the
> memory usage, so I wanted to gauge whether it was worth pursuing that
> avenue or if a simpler model would be a better use of my time. The
> feedback is definitely that a simpler model is *not* warranted, so
> let's talk about fixing TSO/LRO.

I am not even sure how to deal with TSO/LRO and BER. You're not going to discard the whole segment, are you? Are you going to try to packetize it, drop the packet(s) with errors, and reassemble it?

My method of calculating the future error point would at least allow you to just pass a whole segment without any of that hassle, and only have to do that when an error falls within some segment.

>
> On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes wrote:
> > Hum, that sounds like a poor implementation indeed. It seems
> > like it would be easy to convert a BER into a packet drop
> > probability based on bytes that have passed through the pipe.
>
> I'm not quite following you; can you elaborate? Would this solution
> require us to update some shared state between each packet? One
> advantage of the current approach is that there is no mutable state
> (except, of course, when configuration changes).

You would use the bytes-transferred state that is already stored, and compute a "next error" point based on BER and some randomness, such that your errors are not clocked at exact BER intervals. Compare the next error point to the bytes transferred + size of this packet/segment to decide if it needs to be dropped. If you drop, then you must calculate a new "next error" point.

This should considerably reduce the overhead for error rates that affect less than 50% of packets, and would be the same overhead for a BER that affects every packet. And it would be 100x more efficient for things that affect 1% of packets.

--
Rod Grimes rgri...@freebsd.org
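A minimal sketch of the "next error point" idea described in the reply above, under several assumptions that are not from the thread: the struct and function names are invented, positions are counted in bits rather than bytes, BER is taken as "one error per N bits", the gap between errors is drawn uniformly from [1, 2N-1] (mean N) instead of from a geometric distribution, and a toy xorshift generator stands in for whatever random source a kernel implementation would use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-pipe state; none of these names come from ng_pipe. */
struct ber_state {
	uint64_t bits_passed;    /* running count of bits seen so far */
	uint64_t next_error_bit; /* absolute position of the next bit error */
	uint64_t bits_per_error; /* mean spacing, must be >= 1 and < 2^63 */
	uint64_t rng;            /* xorshift64 state, must be nonzero */
};

/* Toy PRNG (xorshift64), a stand-in for a real random source. */
static uint64_t
ber_random(struct ber_state *s)
{
	uint64_t x = s->rng;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	s->rng = x;
	return (x);
}

/* Gap to the next error, uniform in [1, 2 * bits_per_error - 1]. */
static uint64_t
ber_random_gap(struct ber_state *s)
{
	uint64_t span = 2 * s->bits_per_error - 1;

	return (1 + ber_random(s) % span);
}

void
ber_init(struct ber_state *s, uint64_t bits_per_error, uint64_t seed)
{
	s->bits_passed = 0;
	s->bits_per_error = bits_per_error;
	s->rng = (seed != 0) ? seed : 1;
	s->next_error_bit = ber_random_gap(s);
}

/*
 * Account for one packet or segment. Returns true if at least one
 * scheduled error position falls inside it. A new error point is only
 * computed when an error is actually consumed.
 */
bool
ber_should_drop(struct ber_state *s, uint32_t pkt_bytes)
{
	uint64_t end = s->bits_passed + (uint64_t)pkt_bytes * 8;
	bool drop = false;

	while (s->next_error_bit <= end) {
		drop = true;
		s->next_error_bit += ber_random_gap(s);
	}
	s->bits_passed = end;
	return (drop);
}
```

In this sketch the common case is one addition and one comparison per packet, regardless of packet size, so a large TSO/LRO segment costs the same as a small frame. The price is the mutable state asked about in the quoted text: bits_passed and next_error_bit change on every packet and would need whatever locking the pipe already uses.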
Re: Is anybody using ng_pipe?
On Tue, 18 Aug 2020 18:04:24 -0400 Ryan Stone wrote: > On Tue, Aug 18, 2020 at 5:43 PM Marko Zec wrote: > > Since in ng_pipe we define BER as an one error in BER bits (integer > > value), wouldn't your formula P = 1 - BER yield results less than or > > equal to zero for all non-zero values of BER? The domain of P is > > [0..1], so that won't fly. > > > > Your analysis seems to be based on an assumption that the BER > > parameter is given as a multiplier to 0.5**48, which it is not. > > > > The proper fix would be to clarify the current BER parameter > > semantics in ng_pipe(4), not to break bridges with the existing > > scripts / software which rely on ng_pipe. > > > > Cheers, > > > > Marko > > The manpage defined the BER parameter as follows: > > u_int64_t ber; /* errors per 2^48 bits */ > > If we instead want to change the documentation to match the > implementation, that's fine too. Yeah, I only saw that a few minutes ago... Fixed in r364367. Cheers Marko ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: Is anybody using ng_pipe?
On Tue, Aug 18, 2020 at 5:43 PM Marko Zec wrote: > Since in ng_pipe we define BER as an one error in BER bits (integer > value), wouldn't your formula P = 1 - BER yield results less than or > equal to zero for all non-zero values of BER? The domain of P is > [0..1], so that won't fly. > > Your analysis seems to be based on an assumption that the BER > parameter is given as a multiplier to 0.5**48, which it is not. > > The proper fix would be to clarify the current BER parameter semantics > in ng_pipe(4), not to break bridges with the existing scripts / software > which rely on ng_pipe. > > Cheers, > > Marko The manpage defined the BER parameter as follows: u_int64_t ber; /* errors per 2^48 bits */ If we instead want to change the documentation to match the implementation, that's fine too. ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: Is anybody using ng_pipe?
On Tue, 18 Aug 2020 17:01:37 -0400 Ryan Stone wrote: ... > On Tue, Aug 18, 2020 at 2:56 PM Marko Zec wrote: > > The probability that a frame is completely unaffected by BER events, > > and thus shouldn't be dropped, is currently computed as > > > > Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen) > > The problem is in its calculation of Psingle_bit_unaffected(BER). The > BER is the fraction of bits that are affected, therefore it is the > probability that a bit is affected. But for some reason, > Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 - > BER. Since in ng_pipe we define BER as an one error in BER bits (integer value), wouldn't your formula P = 1 - BER yield results less than or equal to zero for all non-zero values of BER? The domain of P is [0..1], so that won't fly. Your analysis seems to be based on an assumption that the BER parameter is given as a multiplier to 0.5**48, which it is not. The proper fix would be to clarify the current BER parameter semantics in ng_pipe(4), not to break bridges with the existing scripts / software which rely on ng_pipe. Cheers, Marko > This leads to the probability table being wrong. For example, > given a BER of 2350, the probability that a 1500-byte packet is > not dropped is: > > (1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%. > > However, ng_pipe calculates a fixed-point probability value of > 281460603879001. To calculate whether a frame should be dropped, > ng_pipe takes this probability value and shifts it right by 17, > yielding 2147373991. It then calls rand() to generate a random number > in the range [0,2**31-1]; if the random number is larger than the > probability value than it is dropped, otherwise it is kept. The > chances that a packet is kept is therefore 2147373991/(2**31 - 1), or > about 99.99%. 
> > It's easy enough to fix this one, but I wasn't sure that it would be > so easy to fix the TSO/LRO issue without significantly increasing the > memory usage, so I wanted to gauge whether it was worth pursuing that > avenue or if a simpler model would be a better use of my time. The > feedback is definitely that a simpler model is *not* warranted, so > let's talk about fixing TSO/LRO. > > On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes > wrote: > > Hum, that sounds like a poor implementation indeed. It seems > > like it would be easy to convert a BER into a packet drop > > probability based on bytes that have passed through the pipe. > > I'm not quite following you; can you elaborate? Would this solution > require us to update some shared state between each packet? One > advantage of the current approach is that there is no mutable state > (except, of course, when configuration changes). ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: Is anybody using ng_pipe?
On Tue, Aug 18, 2020 at 2:43 PM Eugene Grosbein wrote:
> Sorry, missed that. But why wasn't possible?

There's a daemon running on the system that handles most network configuration. It's quite inflexible and will override any manual configuration changes. It manages firewall rules but is ignorant of netgraph, so it will remove any dummynet rules but leave netgraph configuration alone. It was significantly easier to just use ng_pipe, even after having to fix or work around the bugs, than it was to fight the daemon.

On Tue, Aug 18, 2020 at 2:56 PM Marko Zec wrote:
> The probability that a frame is completely unaffected by BER events,
> and thus shouldn't be dropped, is currently computed as
>
> Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)

The problem is in its calculation of Psingle_bit_unaffected(BER). The BER is the fraction of bits that are affected, therefore it is the probability that a bit is affected. But for some reason, Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 - BER. This leads to the probability table being wrong. For example, given a BER of 2350, the probability that a 1500-byte packet is not dropped is:

(1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%.

However, ng_pipe calculates a fixed-point probability value of 281460603879001. To calculate whether a frame should be dropped, ng_pipe takes this probability value and shifts it right by 17, yielding 2147373991. It then calls rand() to generate a random number in the range [0,2**31-1]; if the random number is larger than the probability value then it is dropped, otherwise it is kept. The chance that a packet is kept is therefore 2147373991/(2**31 - 1), or about 99.99%.

It's easy enough to fix this one, but I wasn't sure that it would be so easy to fix the TSO/LRO issue without significantly increasing the memory usage, so I wanted to gauge whether it was worth pursuing that avenue or if a simpler model would be a better use of my time. The feedback is definitely that a simpler model is *not* warranted, so let's talk about fixing TSO/LRO.

On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes wrote:
> Hum, that sounds like a poor implementation indeed. It seems
> like it would be easy to convert a BER into a packet drop
> probability based on bytes that have passed through the pipe.

I'm not quite following you; can you elaborate? Would this solution require us to update some shared state between each packet? One advantage of the current approach is that there is no mutable state (except, of course, when configuration changes).
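The arithmetic in the message above can be checked with a short sketch that evaluates both readings of the BER parameter. The function names are invented, and floating point is used only for readability; a kernel implementation would stay in fixed point. One observation from running the numbers, offered as an inference and not as something the thread states: a BER of 2350 as printed gives a pass probability indistinguishable from 100% under the "errors per 2^48 bits" reading and under 1% under the "one error in BER bits" reading, while a BER near 235,000,000 reproduces both the 99.00% and the 99.99% figures, so the printed value may have lost digits in the archive.

```c
#include <stdint.h>

/* x raised to a non-negative integer power by repeated squaring. */
double
pow_uint(double x, uint64_t n)
{
	double r = 1.0;

	while (n > 0) {
		if (n & 1)
			r *= x;
		x *= x;
		n >>= 1;
	}
	return (r);
}

/* "Errors per 2^48 bits" reading: per-bit error probability is ber / 2^48. */
double
pass_prob_per_2_48(uint64_t ber, uint64_t nbits)
{
	return (pow_uint(1.0 - (double)ber / 281474976710656.0, nbits));
}

/* "One error in ber bits" reading: per-bit error probability is 1 / ber. */
double
pass_prob_one_in_n(uint64_t ber, uint64_t nbits)
{
	return (pow_uint(1.0 - 1.0 / (double)ber, nbits));
}

/*
 * Fraction of frames kept by the drop test described above: a 48-bit
 * fixed-point probability shifted right by 17 and compared against a
 * random number in [0, 2^31 - 1].
 */
double
kept_fraction(uint64_t fixed48)
{
	return ((double)(fixed48 >> 17) / 2147483647.0);
}
```

For a 1500-byte frame (12000 bits), pass_prob_per_2_48(235000000, 12000) comes out close to 0.990 and pass_prob_one_in_n(235000000, 12000) close to 0.99995, which matches kept_fraction(281460603879001).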
Re: net.add_addr_allfibs=1 behaviour deprecation
18.08.2020, 09:17, "Grzegorz Junka" : > On 18/08/2020 07:54, Julian Elischer wrote: >> The reason for the two behaviours is that there are two ways that the >> previous behaviour of "add addresses to the only FIB" could be >> interpreted and extended once multiple fibs became available. The >> single fib case could be interpreted as either of: >> >> "Add to All N fibs where N == 1" or "add to the first (of 1) >> fibs". >> I decided to do both :-) >> >> At Ironport where I wrote it we had a scenario where I didn't want to >> have wrong entries in all the fibs when a new interface was brought >> up. Even for a moment. An other scenarios where for example a tunnel >> uses fib 1 but the rest of the system uses fib0 (which points through >> the tunnel) The addition of new routes into the tunnel's route when a >> new virtual interface is brought up pointing through the tunnel to the >> same address, leads in the tunnel immediately redirecting packets >> through itself which ends in tears. SO the obvious thing to do was >> to make it possible to only add the entry in the fib that was the >> default fib or in the case of Ironport, the fib that was the default >> fib of the process adding the interface. >> >> If you had to make a choice I think the '0' choice is the way to go. >> All other fibs need to be populated deliberately.. >> >> On 8/15/20 4:24 AM, Alexander V. Chernikov wrote: >>> 18.07.2020, 14:22, "Alexander V. Chernikov" : Dear FreeBSD users, I would like to make net.add_addr_allfibs=0 as the default system behaviour and remove net.add_addr_allfibs. To do so, I would like to collect use cases with net.add_addr_allfibs=1 and multiple fibs, to ensure they can still be supported after removal. Background: Multi-fib support was added in r17 [1], 12 years ago. Addition of interface addresses to all fibs was a feature from day 1. The `net.add_addr_allfibs` sysctl was added in r180840 [2], 12 years ago. 
Problem: The goal of the fib support is to provide multiple independent routing tables, isolated from each other. `net.add_addr_allfibs` default tries to shift gears in the opposite direction, unconditionally inserting all addresses to all of the fibs. It complicates the logic, kernel code and makes control plane performance decrease with the number of fibs. It make impossible to use the same prefixes in multiple fibs, which may be desired given shortage of IPv4 address space. I do understand that there are some cases where such behaviour is desired. For example, it can be used to achieve VRF route leaking or binding on address from different fibs. I would like to collect such cases to consider supporting them in a different way. The goal is to make net.add_addr_allfibs=0 default behaviour and remove net.add_addr_allfibs. It will simplify kernel fib-related code and allow bringing more fib-related features. It will also improve fib scaling. >>> No objections has been received. >>> Next steps: >>> * Switch net.add_addr_allfibs to 0 ( >>> https://reviews.freebsd.org/D26076 ) >>> * Provide an ability to use nexthops from different fibs >>> * Remove net.add_addr_allfibs Timeline: Aug 1: summarising feedback and the usecases, decision on proceeding further Aug 20 (tentative): patches for supported usecases Sep 15 (tentative): net.add_addr_allfibs removal. [1]: [base Contents of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup) [2]: [base Diff of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;) /Alexander > > Agree completely, defaulting "add_addr_allfibs" to 1 broke many existing > installations, which goes against the least surprise principle so many > times advocated on FreeBSD lists. 
> > This is just one example: > https://forums.freebsd.org/threads/strange-behavior-of-setfib-since-freebsd-12-0.73348/ > > Now, changing the default again might again break existing > installations, which shouldn't be a reason for not doing it, but might > be a reason to better communicate it this time around. I plan to communicate it the following way: 1) this thread 2) GONE_IN13 in the sysctl (which will print console message when set to 1) 3) Release notes Do you think there are other communication channels one should try to use? > > GrzegorzJ > > ___ > freebsd-net@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org" ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net
Re: Is anybody using ng_pipe?
On Tue, 18 Aug 2020 13:17:48 -0400 Ryan Stone wrote:
> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely. Is anybody using ng_pipe
> with a non-zero BER who would object to this? Given this litany of
> issues I doubt it, but I thought that I'd be sure.

Yes, the BER feature is being actively used, please don't nuke it. If you wish to supplement it with PER, which is less realistic but simpler to implement, by all means go ahead...

> On Tue, Aug 18, 2020 at 1:17 PM Ryan Stone wrote:
> > 4. The table calculation had two integer truncation bugs and used
> > the wrong formula. I'm reasonably sure it would never calculate a
> > probability other than 0 due to a 64-bit constant being truncated to
> > 32-bits.
>
> I've gone back and checked, and I was partially wrong on this point.
> I had gotten the idea that integer literals would be truncated to int,
> which is not true. The use of the wrong formula still means that
> packets are dropped at entirely the wrong rate, though.

The probability that a frame is completely unaffected by BER events, and thus shouldn't be dropped, is currently computed as

Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)

where

Nbits(plen) = plen * 8 + user-configurable framing overhead.

This is a crude model, yet one which was fairly simple to implement. Could you elaborate why you consider it to be entirely wrong?

Cheers,

Marko
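For reference, the model above can be written out as follows. This assumes the "one error in BER bits" reading of the parameter that is given elsewhere in the thread, and it assumes the framing overhead is expressed in bits, as the sum suggests. The exponential approximation is an addition here, valid when BER is much larger than 1.

```latex
\begin{align*}
P_{\text{bit}} &= 1 - \frac{1}{\mathrm{BER}} \\
N_{\text{bits}}(plen) &= 8 \cdot plen + \text{overhead} \\
P_{\text{pass}}(\mathrm{BER}, plen) &= P_{\text{bit}}^{\,N_{\text{bits}}(plen)}
  \approx \exp\!\left(-\frac{N_{\text{bits}}(plen)}{\mathrm{BER}}\right)
  \qquad \text{for } \mathrm{BER} \gg 1
\end{align*}
```

The approximation makes the scaling easy to see: the drop probability depends only on the ratio of frame size in bits to BER, so doubling the frame size has the same effect as halving BER.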
Re: Is anybody using ng_pipe?
19.08.2020 0:17, Ryan Stone wrote: > where dummynet wasn't possible Sorry, missed that. But why wasn't possible? If you could use ng_pipe, you could probably use ng_ipfw too, or maybe create small node ng_dummynet to connect NETGRAPH network with kernel-side dummynet directly. ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: Is anybody using ng_pipe?
19.08.2020 0:17, Ryan Stone wrote:

> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely. Is anybody using ng_pipe
> with a non-zero BER who would object to this? Given this litany of
> issues I doubt it, but I thought that I'd be sure.

Take a look at dummynet(4):

kldload dummynet
# adds (optional) queueing delay plus 10 ms additional delay
ipfw pipe 1 config bw 100Mbit/s delay 10
# add packet drop probability
ipfw add 3000 prob 0.05 deny ip from any to any in
# apply bandwidth limit/delay
ipfw add 3010 pipe 1 ip from any to any in
Re: Is anybody using ng_pipe?
> I recently needed to be able to simulate a lossy, high-latency network
> in an environment where dummynet wasn't possible. I gave ng_pipe a
> try, and hit some major issues
>
> 1. Instead of configuring a packet drop rate, you configure a bit
> error rate, which I found significantly less intuitive

From your background being packet-network centric, perhaps? For those of us who have line-oriented, aka telecom-centric, backgrounds, BER is a very meaningful and useful metric.

> 2. The use of BER makes for a very inconvenient implementation, as
> ng_pipe has to maintain a table of packet drop rates for every
> possible packet size

Hum, that sounds like a poor implementation indeed. It seems like it would be easy to convert a BER into a packet drop probability based on bytes that have passed through the pipe.

It should be easy to convert a BER into a packet drop rate, but doing the converse leads to quantization errors. I would rather see us keep the BER as the metric and fix what is broken rather than convert this to a packet drop rate.

> 3. The table implementation isn't sized right for LRO or TSO, leading
> to ng_pipe going out of bounds of the array and panicking the system

Code predates LRO and TSO, so not unexpected.

> 4. The table calculation had two integer truncation bugs and used the
> wrong formula. I'm reasonably sure it would never calculate a
> probability other than 0 due to a 64-bit constant being truncated to
> 32-bits.

You retracted this.

> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely. Is anybody using ng_pipe
> with a non-zero BER who would object to this? Given this litany of
> issues I doubt it, but I thought that I'd be sure.

My gut instinct is that statistically BER leads to a more realistic model.

--
Rod Grimes rgri...@freebsd.org
Re: Is anybody using ng_pipe?
On Tue, Aug 18, 2020 at 1:17 PM Ryan Stone wrote: > 4. The table calculation had two integer truncation bugs and used the > wrong formula. I'm reasonably sure it would never calculate a > probability other than 0 due a 64-bit constant being truncated to > 32-bits. I've gone back and checked, and I was partially wrong on this point. I had gotten the idea that integer literals would be truncated to int, which is not true. The use of the wrong formula still means that packets are dropped at entirely the wrong rate, though. ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Is anybody using ng_pipe?
I recently needed to be able to simulate a lossy, high-latency network in an environment where dummynet wasn't possible. I gave ng_pipe a try, and hit some major issues:

1. Instead of configuring a packet drop rate, you configure a bit error rate, which I found significantly less intuitive
2. The use of BER makes for a very inconvenient implementation, as ng_pipe has to maintain a table of packet drop rates for every possible packet size
3. The table implementation isn't sized right for LRO or TSO, leading to ng_pipe going out of bounds of the array and panicking the system
4. The table calculation had two integer truncation bugs and used the wrong formula. I'm reasonably sure it would never calculate a probability other than 0 due to a 64-bit constant being truncated to 32-bits.

I'd like to dump all of this and just implement a packet loss rate, which would simplify all this immensely. Is anybody using ng_pipe with a non-zero BER who would object to this? Given this litany of issues I doubt it, but I thought that I'd be sure.
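One possible way to keep a per-size table (issue 2) without indexing past its end for oversized TSO/LRO frames (issue 3) is sketched below. This is not how ng_pipe is implemented. The names and the table size are hypothetical, doubles are used only for readability where kernel code would use fixed point, and BER is read as "one error per N bits". It relies on bit errors being independent, so the pass probability of a long frame is the product of the pass probabilities of its pieces.

```c
#include <stddef.h>
#include <stdint.h>

#define BER_TABLE_MAX_LEN 2048	/* hypothetical table size, in bytes */

struct ber_table {
	/* pass[n] is the probability that an n-byte frame has no bit errors */
	double pass[BER_TABLE_MAX_LEN + 1];
};

/* Fill the table incrementally; bits_per_error must be at least 1. */
void
ber_table_init(struct ber_table *t, uint64_t bits_per_error)
{
	double p_bit = 1.0 - 1.0 / (double)bits_per_error;
	double p_byte = 1.0;

	for (int i = 0; i < 8; i++)
		p_byte *= p_bit;
	t->pass[0] = 1.0;
	for (size_t n = 1; n <= BER_TABLE_MAX_LEN; n++)
		t->pass[n] = t->pass[n - 1] * p_byte;
}

/*
 * Pass probability for a frame of any length. Frames longer than the
 * table are split into table-sized chunks whose probabilities multiply,
 * so the index never exceeds BER_TABLE_MAX_LEN.
 */
double
ber_table_pass(const struct ber_table *t, size_t len)
{
	double p = 1.0;

	while (len > BER_TABLE_MAX_LEN) {
		p *= t->pass[BER_TABLE_MAX_LEN];
		len -= BER_TABLE_MAX_LEN;
	}
	return (p * t->pass[len]);
}
```

The table stays the same size no matter how large a coalesced segment gets, at the cost of a few extra multiplications for frames beyond the table limit. Whether a whole TSO/LRO segment should be dropped on a single bit error is a separate modelling question that this sketch does not address.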
[Bug 247912] IPv6 ndp does not work across local bridge members
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247912

--- Comment #2 from Martin Birgmeier ---
Hi Li,

Since you want it "before and after the creation of bridge0", the following is from the host; but the issue actually occurs on the client - I'll provide the output for that, too.

Host before "bridge0 create" and "tap904 create":

[0]# ndp -a
Neighbor                                Linklayer Address  Netif  Expire     S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6  re0    permanent  R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6  re0    permanent  R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6  re0    permanent  R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6  re0    permanent  R
gandalf.xyzzy                           00:03:0d:4f:f3:a7  re0    23h57m34s  S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7  re0    23h55m33s  S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b  re0    23h55m33s  S R
hal.xyzzy                               20:cf:30:55:5c:b6  re0    permanent  R
mizar.xyzzy                             f0:de:f1:98:86:a9  re0    23h58m35s  S
[0]#

After "ifconfig bridge0 create && ifconfig bridge0 addm re0 && ifconfig bridge0 up":

[0]# ndp -a
Neighbor                                Linklayer Address  Netif  Expire     S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6  re0    permanent  R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6  re0    permanent  R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6  re0    permanent  R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6  re0    permanent  R
gandalf.xyzzy                           00:03:0d:4f:f3:a7  re0    23h58m48s  S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7  re0    23h51m46s  S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b  re0    23h51m46s  S R
hal.xyzzy                               20:cf:30:55:5c:b6  re0    permanent  R
mizar.xyzzy                             f0:de:f1:98:86:a9  re0    23h59m48s  S
[0]#

After "ifconfig tap904 create && ifconfig bridge0 addm tap904":

[0]# ndp -a
Neighbor                                Linklayer Address  Netif  Expire     S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6  re0    permanent  R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6  re0    permanent  R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6  re0    permanent  R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6  re0    permanent  R
gandalf.xyzzy                           00:03:0d:4f:f3:a7  re0    23h58m2s   S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7  re0    23h51m0s   S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b  re0    23h51m0s   S R
hal.xyzzy                               20:cf:30:55:5c:b6  re0    permanent  R
mizar.xyzzy                             f0:de:f1:98:86:a9  re0    23h59m2s   S
[0]#

Now starting the bhyve VM; the rest is from inside the VM.

Before manually added ndp entries:

[0]# ndp -a
Neighbor                                Linklayer Address  Netif  Expire     S Flags
v904.xyzzy                              00:a0:98:50:35:17  vtnet0 permanent  R
gandalf.xyzzy                           00:03:0d:4f:f3:a7  vtnet0 23h59m57s  S R
fe80::203:dff:fe4f:f3a7%vtnet0          00:03:0d:4f:f3:a7  vtnet0 23h59m2s   S R
fe80::218:e7ff:fee0:807b%vtnet0         00:18:e7:e0:80:7b  vtnet0 23h59m2s   S R
2002:b2bf:ee7e:4d42:2a0:98ff:fe50:3517  00:a0:98:50:35:17  vtnet0 permanent  R
fec0::4d42:2a0:98ff:fe50:3517           00:a0:98:50:35:17  vtnet0 permanent  R
fe80::2a0:98ff:fe50:3517%vtnet0         00:a0:98:50:35:17  vtnet0 permanent  R
mizar.xyzzy                             f0:de:f1:98:86:a9  vtnet0 23h59m57s  S
[0]#

After "ndp -s fec0:0:0:4d42::e 20:cf:30:55:5c:b6 && ndp -s fec0:0:0:4d42::e1 20:cf:30:55:5c:b6" (the host has two IPv6 addresses assigned to its interface; fec0:0:0:4d42::e resolves to hal.xyzzy):

[0]# ndp -a
Neighbor                                Linklayer Address  Netif  Expire     S Flags
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6  vtnet0 permanent  R
v904.xyzzy                              00:a0:98:50:35:17  vtnet0 permanent  R
gandalf.xyzzy                           00:03:0d:4f:f3:a7  vtnet0 23h58m54s  S R
fe80::203:dff:fe4f:f3a7%vtnet0          00:03:0d:4f:f3:a7  vtnet0 23h57m59s  S R
fe80::218:e7ff:fee0:807b%vtnet0         00:18:e7:e0:80:7b  vtnet0 23h57m59s  S R
2002:b2bf:ee7e:4d42:2a0:98ff:fe50:3517  00:a0:98:50:35:17  vtnet0 permanent  R
fec0::4d42:2a0:98ff:fe50:3517           00:a0:98:50:35:17  vtnet0 permanent  R
fe80::2a0:98ff:fe50:3517%vtnet0         00:a0:98:50:35:17  vtnet0 permanent  R
hal.xyzzy                               20:cf:30:55:5c:b6  vtnet0 permanent  R
mizar.xyzzy                             f0:de:f1:98:86:a9  vtnet0 23h58m54s  S
[0]#

-- Martin

--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 248652] netmap: pkt-gen TX huge pps difference between 11-STABLE and 12-STABLE/CURRENT on ix & ixl NIC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248652 --- Comment #7 from Vincenzo Maffione --- (In reply to Kubilay Kocak from comment #6) I would say ix/ixl and/or NIC driver & iflib because it's not something related to the netmap module itself, and it is an optimization which derives from ix/ixl netmap support code, which now is included within iflib. -- You are receiving this mail because: You are the assignee for the bug. You are on the CC list for the bug. ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: net.add_addr_allfibs=1 behaviour deprecation
On 18/08/2020 07:54, Julian Elischer wrote:

The reason for the two behaviours is that there are two ways that the previous behaviour of "add addresses to the only FIB" could be interpreted and extended once multiple fibs became available. The single fib case could be interpreted as either of: "Add to All N fibs where N == 1" or "add to the first (of 1) fibs". I decided to do both :-)

At Ironport, where I wrote it, we had a scenario where I didn't want to have wrong entries in all the fibs when a new interface was brought up, even for a moment. Another scenario is where, for example, a tunnel uses fib 1 but the rest of the system uses fib 0 (which points through the tunnel). The addition of new routes into the tunnel's route when a new virtual interface is brought up pointing through the tunnel to the same address leads to the tunnel immediately redirecting packets through itself, which ends in tears.

So the obvious thing to do was to make it possible to only add the entry in the fib that was the default fib or, in the case of Ironport, the fib that was the default fib of the process adding the interface.

If you had to make a choice, I think the '0' choice is the way to go. All other fibs need to be populated deliberately.

On 8/15/20 4:24 AM, Alexander V. Chernikov wrote:

18.07.2020, 14:22, "Alexander V. Chernikov" :

Dear FreeBSD users,

I would like to make net.add_addr_allfibs=0 the default system behaviour and remove net.add_addr_allfibs. To do so, I would like to collect use cases with net.add_addr_allfibs=1 and multiple fibs, to ensure they can still be supported after removal.

Background: Multi-fib support was added in r17 [1], 12 years ago. Addition of interface addresses to all fibs was a feature from day 1. The `net.add_addr_allfibs` sysctl was added in r180840 [2], 12 years ago.

Problem: The goal of the fib support is to provide multiple independent routing tables, isolated from each other. The `net.add_addr_allfibs` default shifts gears in the opposite direction, unconditionally inserting all addresses into all of the fibs. It complicates the logic and the kernel code, and makes control plane performance decrease with the number of fibs. It makes it impossible to use the same prefixes in multiple fibs, which may be desired given the shortage of IPv4 address space.

I do understand that there are some cases where such behaviour is desired. For example, it can be used to achieve VRF route leaking or binding on addresses from different fibs. I would like to collect such cases to consider supporting them in a different way.

The goal is to make net.add_addr_allfibs=0 the default behaviour and remove net.add_addr_allfibs. It will simplify the kernel fib-related code and allow bringing in more fib-related features. It will also improve fib scaling.

No objections have been received.

Next steps:
* Switch net.add_addr_allfibs to 0 ( https://reviews.freebsd.org/D26076 )
* Provide an ability to use nexthops from different fibs
* Remove net.add_addr_allfibs

Timeline:
Aug 1: summarising feedback and the use cases, decision on proceeding further
Aug 20 (tentative): patches for supported use cases
Sep 15 (tentative): net.add_addr_allfibs removal.

[1]: [base Contents of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup)
[2]: [base Diff of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;)

/Alexander

Agree completely. Defaulting "add_addr_allfibs" to 1 broke many existing installations, which goes against the principle of least surprise so often advocated on FreeBSD lists. This is just one example: https://forums.freebsd.org/threads/strange-behavior-of-setfib-since-freebsd-12-0.73348/

Now, changing the default again might again break existing installations, which shouldn't be a reason for not doing it, but might be a reason to communicate it better this time around.
GrzegorzJ
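For readers following the thread, the difference between the two sysctl settings can be sketched with a toy model. This is only an illustration in Python, not kernel code: the function `add_interface_address`, the dictionary-per-fib layout, and the interface names `em0` and `em1` are all assumptions made up for the example. It shows the two points raised in the problem statement: with add_addr_allfibs=1 the work per address grows with the number of fibs, and the same prefix cannot be used independently in two fibs.

```python
import ipaddress


def add_interface_address(fibs, iface, prefix, iface_fib, add_addr_allfibs):
    """Toy model: install the connected route for `prefix`.

    Returns a list of (fib number, existing owner) pairs for every fib in
    which the prefix was already claimed by another interface. What the
    real kernel does on such a conflict is not modelled here.
    """
    net = ipaddress.ip_network(prefix)
    # allfibs=1: touch every fib; allfibs=0: only the interface's own fib.
    targets = range(len(fibs)) if add_addr_allfibs else [iface_fib]
    conflicts = []
    for n in targets:
        owner = fibs[n].setdefault(net, iface)
        if owner != iface:
            conflicts.append((n, owner))
    return conflicts


# Two interfaces, each assigned to its own fib, both using 10.0.0.0/24.
isolated = [dict(), dict()]
iso_first = add_interface_address(isolated, "em0", "10.0.0.0/24", 0, False)
iso_second = add_interface_address(isolated, "em1", "10.0.0.0/24", 1, False)

shared = [dict(), dict()]
shared_first = add_interface_address(shared, "em0", "10.0.0.0/24", 0, True)
shared_second = add_interface_address(shared, "em1", "10.0.0.0/24", 1, True)

print("add_addr_allfibs=0 conflicts:", iso_first + iso_second)
print("add_addr_allfibs=1 conflicts:", shared_first + shared_second)
```

In this model the isolated case reports no conflicts, because each fib only ever sees its own interface. In the shared case the first interface has already claimed the prefix in both fibs by the time the second one is configured.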
Re: net.add_addr_allfibs=1 behaviour deprecation
The reason for the two behaviours is that there are two ways that the previous behaviour of "add addresses to the only FIB" could be interpreted and extended once multiple fibs became available. The single fib case could be interpreted as either of: "Add to All N fibs where N == 1" or "add to the first (of 1) fibs". I decided to do both :-)

At Ironport, where I wrote it, we had a scenario where I didn't want to have wrong entries in all the fibs when a new interface was brought up, even for a moment. Another scenario is where, for example, a tunnel uses fib 1 but the rest of the system uses fib 0 (which points through the tunnel). The addition of new routes into the tunnel's route when a new virtual interface is brought up pointing through the tunnel to the same address leads to the tunnel immediately redirecting packets through itself, which ends in tears.

So the obvious thing to do was to make it possible to only add the entry in the fib that was the default fib or, in the case of Ironport, the fib that was the default fib of the process adding the interface.

If you had to make a choice, I think the '0' choice is the way to go. All other fibs need to be populated deliberately.

On 8/15/20 4:24 AM, Alexander V. Chernikov wrote:

18.07.2020, 14:22, "Alexander V. Chernikov" :

Dear FreeBSD users,

I would like to make net.add_addr_allfibs=0 the default system behaviour and remove net.add_addr_allfibs. To do so, I would like to collect use cases with net.add_addr_allfibs=1 and multiple fibs, to ensure they can still be supported after removal.

Background: Multi-fib support was added in r17 [1], 12 years ago. Addition of interface addresses to all fibs was a feature from day 1. The `net.add_addr_allfibs` sysctl was added in r180840 [2], 12 years ago.

Problem: The goal of the fib support is to provide multiple independent routing tables, isolated from each other. The `net.add_addr_allfibs` default shifts gears in the opposite direction, unconditionally inserting all addresses into all of the fibs. It complicates the logic and the kernel code, and makes control plane performance decrease with the number of fibs. It makes it impossible to use the same prefixes in multiple fibs, which may be desired given the shortage of IPv4 address space.

I do understand that there are some cases where such behaviour is desired. For example, it can be used to achieve VRF route leaking or binding on addresses from different fibs. I would like to collect such cases to consider supporting them in a different way.

The goal is to make net.add_addr_allfibs=0 the default behaviour and remove net.add_addr_allfibs. It will simplify the kernel fib-related code and allow bringing in more fib-related features. It will also improve fib scaling.

No objections have been received.

Next steps:
* Switch net.add_addr_allfibs to 0 ( https://reviews.freebsd.org/D26076 )
* Provide an ability to use nexthops from different fibs
* Remove net.add_addr_allfibs

Timeline:
Aug 1: summarising feedback and the use cases, decision on proceeding further
Aug 20 (tentative): patches for supported use cases
Sep 15 (tentative): net.add_addr_allfibs removal.

[1]: [base Contents of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup)
[2]: [base Diff of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;)

/Alexander
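The tunnel scenario described in this message can be sketched as a small longest-prefix-match model. This is one possible reading of the scenario, not a description of the actual Ironport setup: the interface names `gif0` and `em0`, the endpoint address 203.0.113.1 (a documentation address), and the helpers `lookup`, `build_fibs` and `outer_packet_loops` are all hypothetical. The assumption is that fib 1 carries only the tunnel's outer packets, fib 0 sends everything through the tunnel, and the new virtual interface installs a host route to the tunnel endpoint via the tunnel.

```python
import ipaddress

TUNNEL_ENDPOINT = "203.0.113.1"  # hypothetical remote end of the tunnel


def lookup(fib, dst):
    """Longest-prefix match: return the outgoing interface for dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in fib if addr in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]


def build_fibs(add_addr_allfibs):
    """Build the two toy fibs, then bring up the new virtual interface."""
    default = ipaddress.ip_network("0.0.0.0/0")
    fibs = [
        {default: "gif0"},  # fib 0: the rest of the system goes via the tunnel
        {default: "em0"},   # fib 1: only the tunnel's outer packets use this
    ]
    # Assumed effect of the new interface: a host route to the tunnel
    # endpoint that points through the tunnel itself.
    peer = ipaddress.ip_network(TUNNEL_ENDPOINT + "/32")
    targets = range(len(fibs)) if add_addr_allfibs else [0]
    for n in targets:
        fibs[n][peer] = "gif0"
    return fibs


def outer_packet_loops(fibs):
    """True if the tunnel's own outer packets would re-enter the tunnel."""
    return lookup(fibs[1], TUNNEL_ENDPOINT) == "gif0"


for setting in (True, False):
    fibs = build_fibs(setting)
    print("add_addr_allfibs=%d loops: %s" % (setting, outer_packet_loops(fibs)))
```

In this model the more specific /32 route wins over the default route in fib 1 as soon as it is copied there, which is the "redirecting packets through itself" failure. With the route confined to fib 0, the outer packets still leave through the physical interface.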