Appropriate Byte Counting during Congestion Avoidance

2020-08-18 Thread Liang Tian
Hi everyone,

We noticed that CWND grows much more slowly than expected during congestion
avoidance with NewReno, which led us to this piece of code in
cc_ack_received() at tcp_input.c:353:

    if (type == CC_ACK) {
        if (tp->snd_cwnd > tp->snd_ssthresh) {
            tp->t_bytes_acked += min(tp->ccv->bytes_this_ack,
                nsegs * V_tcp_abc_l_var * tcp_maxseg(tp));
            if (tp->t_bytes_acked >= tp->snd_cwnd) {
                tp->t_bytes_acked -= tp->snd_cwnd;
                tp->ccv->flags |= CCF_ABC_SENTAWND;
            }

The increment of t_bytes_acked is capped at 2*maxseg.
The description of the sysctl variable tcp_abc_l_var (default value 2) is
"Cap the max cwnd increment during slow-start to this number of segments".
After reading RFC 3465, it doesn't look like this cap should be applied
here, since this is clearly not during slow-start.
We've seen cases where the receiver is ACKing every 16 packets, and
CWND is growing at 1/8 of the expected rate because of this.
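
To make the 1/8 figure concrete, here is a minimal Python sketch of the byte-counting step quoted above. It is not the kernel code: it assumes nsegs is 1 (which is what a 2*maxseg cap implies), a 1448-byte MSS, and a window of 160 segments. The function name acks_per_cwnd_increment and those numbers are made up for illustration.

```python
def acks_per_cwnd_increment(cwnd_segs, segs_per_ack, cap_segs=None, mss=1448):
    """Count the ACKs needed before t_bytes_acked reaches cwnd once.

    cap_segs=None models no cap; cap_segs=2 models the tcp_abc_l_var
    default being applied to every ACK (assuming nsegs == 1).
    """
    cwnd = cwnd_segs * mss
    bytes_this_ack = segs_per_ack * mss
    credit = bytes_this_ack
    if cap_segs is not None:
        credit = min(bytes_this_ack, cap_segs * mss)

    t_bytes_acked = 0
    acks = 0
    while t_bytes_acked < cwnd:
        t_bytes_acked += credit
        acks += 1
    return acks


uncapped = acks_per_cwnd_increment(160, 16)
capped = acks_per_cwnd_increment(160, 16, cap_segs=2)
print(uncapped, capped, capped // uncapped)  # prints "10 80 8"
```

With one ACK per 16 segments, the capped counter needs eight times as many ACKs to earn a single window increment, which matches the slowdown reported here.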

I would appreciate your opinion on this. Thanks a lot.

Regards,
Liang
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Is anybody using ng_pipe?

2020-08-18 Thread Julian Elischer
Writing a new netgraph node is relatively simple. Take ng_sample.c and
ng_sample.h and copy them.
Change names to suit, and add your own code in the middle. Use one of the 50
other nodes as examples.
No matter what you want to do, one of them already does it.

-- 
+--\  _  __
|   __--_|\  Julian Elischer\   \\   U \/ / On assignment
|  /   \ jul...@elischer.org \   \ USA\ in a very strange
| (   OZ) \-->x   ___ | country !
+- X_.---._/ Mountain View, California \_/   \\
  v


Re: Is anybody using ng_pipe?

2020-08-18 Thread Rodney W. Grimes
> On Tue, Aug 18, 2020 at 2:43 PM Eugene Grosbein  wrote:
> > Sorry, missed that. But why wasn't possible?
> 
> There's a daemon running on the system that handles most network
> configuration.  It's quite inflexible and will override any manual
> configuration changes.  It manages firewall rules but is ignorant of
> netgraph, so it will remove any dummynet rules but leave netgraph
> configuration alone.  It was significantly easier to just use ng_pipe,
> even after having to fix or work around the bugs, than it was to fight
> the daemon.
> 
> On Tue, Aug 18, 2020 at 2:56 PM Marko Zec  wrote:
> > The probability that a frame is completely unaffected by BER events,
> > and thus shouldn't be dropped, is currently computed as
> >
> > Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)
> 
> The problem is in its calculation of Psingle_bit_unaffected(BER).  The
> BER is the fraction of bits that are affected, therefore it is the
> probability that a bit is affected.  But for some reason,
> Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 -
> BER.  This leads to the probability table being wrong.  For example,
> given a BER of 2350, the probability that a 1500-byte packet is
> not dropped is:

Is this a confusion over Bit Error Rate vs. Bit Error Ratio?
A BER of 2350, I must assume, is 1 bit in 2350 bits.

1 - 1/BER looks correct to me for a Bit Error Rate.
1 - BER looks correct to me for a Bit Error Ratio (usually a percentage).

> 
> (1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%.
> 
> However, ng_pipe calculates a fixed-point probability value of
> 281460603879001.  To calculate whether a frame should be dropped,
> ng_pipe takes this probability value and shifts it right by 17,
> yielding 2147373991.  It then calls rand() to generate a random number
> in the range [0,2**31-1]; if the random number is larger than the
> probability value than it is dropped, otherwise it is kept.  The
> chances that a packet is kept is therefore 2147373991/(2**31 - 1), or
> about 99.99%.

This looks like an optimization to reduce the calculation, since it is
done for every packet.  Why not do the calculation more accurately
and only for each "error"?  See below.

> It's easy enough to fix this one, but I wasn't sure that it would be
> so easy to fix the TSO/LRO issue without significantly increasing the
> memory usage, so I wanted to gauge whether it was worth pursuing that
> avenue or if a simpler model would be a better use of my time.  The
> feedback is definitely that a simpler model is *not* warranted, so
> let's talk about fixing TSO/LRO.

I am not even sure how to deal with TSO/LRO and BER.  You're not
going to discard the whole segment, are you?  Are you going to
try to packetize it, drop the packet(s) with errors, and reassemble
it?  My method of calculating the future error point would
at least allow you to just pass a whole segment without any
of that hassle, and you would only have to do that when an error is
within some segment.

> 
> On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes
>  wrote:
> > Hum, that sounds like a poor implementation indeed.  It seems
> > like it would be easy to convert a BER into a packet drop
> > probability based on bytes that have passed through the pipe.
> 
> I'm not quite following you; can you elaborate?  Would this solution
> require us to update some shared state between each packet?  One
> advantage of the current approach is that there is no mutable state
> (except, of course, when configuration changes).

You would use the bytes-transferred state that is already stored,
and compute a "next error" point based on BER and some randomness
such that your errors are not clocked at exact BER intervals.

Compare the next error point to the bytes transferred + size of
this packet/segment to decide if it needs to be dropped; if you drop, then
you must calculate a new "next error" point.

This should considerably reduce the overhead for error rates
that affect less than 50% of packets, and would be the
same overhead for a BER that affects every packet.  And 100x
more efficient for things that affect 1% of packets.
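
The "next error" idea can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not ng_pipe code: the class name NextErrorDropper is hypothetical, positions are counted in bits rather than bytes, BER is read as "one error per bits_per_error bits", and the gap to the next error is drawn from a geometric distribution as one way of supplying the randomness.

```python
import math
import random


class NextErrorDropper:
    """Track the bit position of the next simulated bit error."""

    def __init__(self, bits_per_error, rng=None):
        # Assumes bits_per_error > 1, so the per-bit probability is below 1.
        self.p = 1.0 / bits_per_error
        self.rng = rng if rng is not None else random.Random()
        self.bits_passed = 0
        self.next_error = self._draw_next_error()

    def _draw_next_error(self):
        # Geometric gap (at least 1 bit) via inverse-transform sampling, so
        # errors are not clocked at exact BER intervals.
        u = self.rng.random()  # uniform in [0, 1)
        gap = 1 + int(math.log(1.0 - u) / math.log1p(-self.p))
        return self.bits_passed + gap

    def should_drop(self, nbytes):
        """Return True if the next error falls inside this frame."""
        end = self.bits_passed + nbytes * 8
        drop = self.next_error <= end
        self.bits_passed = end
        if drop:
            # Only a dropped frame costs a new random draw.
            self.next_error = self._draw_next_error()
        return drop


dropper = NextErrorDropper(1_000_000, random.Random(1))
drops = sum(dropper.should_drop(1500) for _ in range(100_000))
print(drops / 100_000)  # should land near 1 - (1 - 1e-6) ** 12000
```

A frame that is passed costs one addition and one comparison, whatever its size, so a large TSO/LRO segment needs no per-size table. The price is the mutable per-pipe state asked about above.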

-- 
Rod Grimes rgri...@freebsd.org


Re: Is anybody using ng_pipe?

2020-08-18 Thread Marko Zec
On Tue, 18 Aug 2020 18:04:24 -0400
Ryan Stone  wrote:

> On Tue, Aug 18, 2020 at 5:43 PM Marko Zec  wrote:
> > Since in ng_pipe we define BER as an one error in BER bits (integer
> > value), wouldn't your formula P = 1 - BER yield results less than or
> > equal to zero for all non-zero values of BER?  The domain of P is
> > [0..1], so that won't fly.
> >
> > Your analysis seems to be based on an assumption that the BER
> > parameter is given as a multiplier to 0.5**48, which it is not.
> >
> > The proper fix would be to clarify the current BER parameter
> > semantics in ng_pipe(4), not to break bridges with the existing
> > scripts / software which rely on ng_pipe.
> >
> > Cheers,
> >
> > Marko  
> 
> The manpage defined the BER parameter as follows:
> 
> u_int64_t  ber; /* errors per 2^48 bits */
> 
> If we instead want to change the documentation to match the
> implementation, that's fine too.

Yeah, I only saw that a few minutes ago...  Fixed in r364367.

Cheers

Marko


Re: Is anybody using ng_pipe?

2020-08-18 Thread Ryan Stone
On Tue, Aug 18, 2020 at 5:43 PM Marko Zec  wrote:
> Since in ng_pipe we define BER as an one error in BER bits (integer
> value), wouldn't your formula P = 1 - BER yield results less than or
> equal to zero for all non-zero values of BER?  The domain of P is
> [0..1], so that won't fly.
>
> Your analysis seems to be based on an assumption that the BER
> parameter is given as a multiplier to 0.5**48, which it is not.
>
> The proper fix would be to clarify the current BER parameter semantics
> in ng_pipe(4), not to break bridges with the existing scripts / software
> which rely on ng_pipe.
>
> Cheers,
>
> Marko

The manpage defined the BER parameter as follows:

u_int64_t  ber; /* errors per 2^48 bits */

If we instead want to change the documentation to match the
implementation, that's fine too.


Re: Is anybody using ng_pipe?

2020-08-18 Thread Marko Zec
On Tue, 18 Aug 2020 17:01:37 -0400
Ryan Stone  wrote:
...
> On Tue, Aug 18, 2020 at 2:56 PM Marko Zec  wrote:
> > The probability that a frame is completely unaffected by BER events,
> > and thus shouldn't be dropped, is currently computed as
> >
> > Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)  
> 
> The problem is in its calculation of Psingle_bit_unaffected(BER).  The
> BER is the fraction of bits that are affected, therefore it is the
> probability that a bit is affected.  But for some reason,
> Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 -
> BER.

Since in ng_pipe we define BER as one error in BER bits (integer
value), wouldn't your formula P = 1 - BER yield results less than or
equal to zero for all non-zero values of BER?  The domain of P is
[0..1], so that won't fly.

Your analysis seems to be based on an assumption that the BER
parameter is given as a multiplier to 0.5**48, which it is not.

The proper fix would be to clarify the current BER parameter semantics
in ng_pipe(4), not to break bridges with the existing scripts / software
which rely on ng_pipe.

Cheers,

Marko

> This leads to the probability table being wrong.  For example,
> given a BER of 2350, the probability that a 1500-byte packet is
> not dropped is:
> 
> (1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%.
> 
> However, ng_pipe calculates a fixed-point probability value of
> 281460603879001.  To calculate whether a frame should be dropped,
> ng_pipe takes this probability value and shifts it right by 17,
> yielding 2147373991.  It then calls rand() to generate a random number
> in the range [0,2**31-1]; if the random number is larger than the
> probability value than it is dropped, otherwise it is kept.  The
> chances that a packet is kept is therefore 2147373991/(2**31 - 1), or
> about 99.99%.
> 
> It's easy enough to fix this one, but I wasn't sure that it would be
> so easy to fix the TSO/LRO issue without significantly increasing the
> memory usage, so I wanted to gauge whether it was worth pursuing that
> avenue or if a simpler model would be a better use of my time.  The
> feedback is definitely that a simpler model is *not* warranted, so
> let's talk about fixing TSO/LRO.
> 
> On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes
>  wrote:
> > Hum, that sounds like a poor implementation indeed.  It seems
> > like it would be easy to convert a BER into a packet drop
> > probability based on bytes that have passed through the pipe.  
> 
> I'm not quite following you; can you elaborate?  Would this solution
> require us to update some shared state between each packet?  One
> advantage of the current approach is that there is no mutable state
> (except, of course, when configuration changes).



Re: Is anybody using ng_pipe?

2020-08-18 Thread Ryan Stone
On Tue, Aug 18, 2020 at 2:43 PM Eugene Grosbein  wrote:
> Sorry, missed that. But why wasn't possible?

There's a daemon running on the system that handles most network
configuration.  It's quite inflexible and will override any manual
configuration changes.  It manages firewall rules but is ignorant of
netgraph, so it will remove any dummynet rules but leave netgraph
configuration alone.  It was significantly easier to just use ng_pipe,
even after having to fix or work around the bugs, than it was to fight
the daemon.

On Tue, Aug 18, 2020 at 2:56 PM Marko Zec  wrote:
> The probability that a frame is completely unaffected by BER events,
> and thus shouldn't be dropped, is currently computed as
>
> Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)

The problem is in its calculation of Psingle_bit_unaffected(BER).  The
BER is the fraction of bits that are affected, therefore it is the
probability that a bit is affected.  But for some reason,
Psingle_bit_unaffected(BER) is calculated as 1 - 1/BER rather than 1 -
BER.  This leads to the probability table being wrong.  For example,
given a BER of 2350, the probability that a 1500-byte packet is
not dropped is:

(1 - 2350/2**48)**(1500 * 8), which is approximately 99.00%.

However, ng_pipe calculates a fixed-point probability value of
281460603879001.  To calculate whether a frame should be dropped,
ng_pipe takes this probability value and shifts it right by 17,
yielding 2147373991.  It then calls rand() to generate a random number
in the range [0,2**31-1]; if the random number is larger than the
probability value then the frame is dropped, otherwise it is kept.  The
chance that a packet is kept is therefore 2147373991/(2**31 - 1), or
about 99.99%.
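
The arithmetic above can be replayed in a few lines of Python. This is only a sketch with illustrative variable names. One loud assumption: the BER setting as printed in this copy of the message does not reproduce the quoted percentages, so the sketch uses 235000000, which reproduces both the 99.00% figure and the fixed-point value 281460603879001.

```python
BER = 235_000_000   # assumed setting, see the note above
NBITS = 1500 * 8    # 1500-byte packet, no framing overhead

# Manpage reading: BER errors per 2**48 bits.
p_pass_manpage = (1 - BER / 2**48) ** NBITS

# Implementation reading: one error per BER bits.
p_pass_impl = (1 - 1 / BER) ** NBITS

# Fixed-point value quoted above, reduced to the 31-bit range of rand().
fixed_point = 281460603879001
threshold = fixed_point >> 17
p_keep = threshold / (2**31 - 1)

print(f"manpage reading:        {p_pass_manpage:.4%}")
print(f"implementation reading: {p_pass_impl:.4%}")
print(threshold, f"{p_keep:.4%}")
```

Under that assumption the two readings differ by roughly a factor of 200 in drop rate, which is the whole disagreement: the code is self-consistent for "one error per BER bits" but does not match "errors per 2^48 bits".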

It's easy enough to fix this one, but I wasn't sure that it would be
so easy to fix the TSO/LRO issue without significantly increasing the
memory usage, so I wanted to gauge whether it was worth pursuing that
avenue or if a simpler model would be a better use of my time.  The
feedback is definitely that a simpler model is *not* warranted, so
let's talk about fixing TSO/LRO.

On Tue, Aug 18, 2020 at 1:47 PM Rodney W. Grimes
 wrote:
> Hum, that sounds like a poor implementation indeed.  It seems
> like it would be easy to convert a BER into a packet drop
> probability based on bytes that have passed through the pipe.

I'm not quite following you; can you elaborate?  Would this solution
require us to update some shared state between each packet?  One
advantage of the current approach is that there is no mutable state
(except, of course, when configuration changes).


Re: net.add_addr_allfibs=1 behaviour deprecation

2020-08-18 Thread Alexander V . Chernikov
18.08.2020, 09:17, "Grzegorz Junka" :
> On 18/08/2020 07:54, Julian Elischer wrote:
>>  The reason for the two behaviours is that there are two ways that the
>>  previous behaviour of  "add addresses to the only FIB" could be
>>  interpreted and extended once multiple fibs became available. The
>>  single fib case could be interpreted as either of:
>>
>>  "Add to All N fibs where N == 1"    or     "add to the first (of 1)
>>  fibs".
>>  I decided to do both :-)
>>
>>  At Ironport where I wrote it we had a scenario where I didn't want to
>>  have wrong entries in all the fibs when a new interface was brought
>>  up. Even for a moment. An other scenarios where  for example a tunnel
>>  uses fib 1 but the rest of the system uses fib0 (which points through
>>  the tunnel) The addition of new routes into the tunnel's route when a
>>  new virtual interface is brought up pointing through the tunnel to the
>>  same address, leads in the tunnel immediately redirecting packets
>>  through itself which ends in tears. SO the obvious thing to do was
>>  to make it possible to only add the entry in the fib that was the
>>  default fib or in the case of Ironport, the fib that was the default
>>  fib of the process adding the interface.
>>
>>  If you had to make a choice I think the '0' choice is the way to go.
>>  All other fibs need to be populated deliberately..
>>
>>  On 8/15/20 4:24 AM, Alexander V. Chernikov wrote:
>>>  18.07.2020, 14:22, "Alexander V. Chernikov" :
>>>>  Dear FreeBSD users,
>>>>
>>>>  I would like to make net.add_addr_allfibs=0 as the default system
>>>>  behaviour and remove net.add_addr_allfibs.
>>>>  To do so, I would like to collect use cases with
>>>>  net.add_addr_allfibs=1 and multiple fibs, to ensure they can still
>>>>  be supported after removal.
>>>>
>>>>  Background:
>>>>
>>>>  Multi-fib support was added in r17 [1], 12 years ago. Addition
>>>>  of interface addresses to all fibs was a feature from day 1.
>>>>  The `net.add_addr_allfibs` sysctl  was added in r180840 [2], 12
>>>>  years ago.
>>>>
>>>>  Problem:
>>>>  The goal of the fib support is to provide multiple independent
>>>>  routing tables, isolated from each other.
>>>>  `net.add_addr_allfibs` default tries to shift gears in the opposite
>>>>  direction, unconditionally inserting all addresses to all of the fibs.
>>>>
>>>>  It complicates the logic, kernel code and makes control plane
>>>>  performance decrease with the number of fibs.
>>>>  It make impossible to use the same prefixes in multiple fibs, which
>>>>  may be desired given shortage of IPv4 address space.
>>>>
>>>>  I do understand that there are some cases where such behaviour is
>>>>  desired.
>>>>  For example, it can be used to achieve VRF route leaking or binding
>>>>  on address from different fibs.
>>>>  I would like to collect such cases to consider supporting them in a
>>>>  different way.
>>>>
>>>>  The goal is to make net.add_addr_allfibs=0 default behaviour and
>>>>  remove net.add_addr_allfibs.
>>>>  It will simplify kernel fib-related code and allow bringing more
>>>>  fib-related features. It will also improve fib scaling.
>>>  No objections has been received.
>>>  Next steps:
>>>  * Switch net.add_addr_allfibs to 0 (
>>>  https://reviews.freebsd.org/D26076 )
>>>  * Provide an ability to use nexthops from different fibs
>>>  * Remove net.add_addr_allfibs
>>>>  Timeline:
>>>>  Aug 1: summarising feedback and the usecases, decision on proceeding
>>>>  further
>>>>  Aug 20 (tentative):  patches for supported usecases
>>>>  Sep 15 (tentative):  net.add_addr_allfibs removal.
>>>>
>>>>  [1]: [base Contents of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup)
>>>>  [2]: [base Diff of /head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;)
>>>>
>>>>  /Alexander
>
> Agree completely, defaulting "add_addr_allfibs" to 1 broke many existing
> installations, which goes against the least surprise principle so many
> times advocated on FreeBSD lists.
>
> This is just one example:
> https://forums.freebsd.org/threads/strange-behavior-of-setfib-since-freebsd-12-0.73348/
>
> Now, changing the default again might again break existing
> installations, which shouldn't be a reason for not doing it, but might
> be a reason to better communicate it this time around.
I plan to communicate it in the following ways:
1) this thread
2) GONE_IN13 in the sysctl (which will print a console message when set to 1)
3) Release notes
Do you think there are other communication channels one should try to use?
>
> GrzegorzJ
>


Re: Is anybody using ng_pipe?

2020-08-18 Thread Marko Zec
On Tue, 18 Aug 2020 13:17:48 -0400
Ryan Stone  wrote:

> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely.  Is anybody using ng_pipe
> with a non-zero BER who would object to this?  Given this litany of
> issues I doubt it, but I thought that I'd be sure.

Yes, the BER feature is being actively used, please don't nuke it.  If
you wish to supplement it with PER, which is less realistic but simpler
to implement, by all means go ahead...

> On Tue, Aug 18, 2020 at 1:17 PM Ryan Stone  wrote:
> > 4. The table calculation had two integer truncation bugs and used
> > the wrong formula.  I'm reasonably sure it would never calculate a
> > probability other than 0 due a 64-bit constant being truncated to
> > 32-bits.  
> 
> I've gone back and checked, and I was partially wrong on this point.
> I had gotten the idea that integer literals would be truncated to int,
> which is not true.  The use of the wrong formula still means that
> packets are dropped at entirely the wrong rate, though.

The probability that a frame is completely unaffected by BER events,
and thus shouldn't be dropped, is currently computed as

Ppass(BER, plen) = Psingle_bit_unaffected(BER) ^ Nbits(plen)

where Nbits(plen) = plen * 8 + user-configurable framing overhead.
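
As a small illustration of this model, here is a hedged Python sketch. The function name p_pass, the overhead_bits parameter, and the sample values are assumptions made for the example, and BER is taken as "one error per ber bits".

```python
def p_pass(ber, plen, overhead_bits=0):
    """Probability that a plen-byte frame sees no bit error.

    ber follows the "one error per ber bits" semantics; overhead_bits
    stands in for the user-configurable framing overhead.
    """
    nbits = plen * 8 + overhead_bits
    p_single_bit_unaffected = 1.0 - 1.0 / ber
    return p_single_bit_unaffected ** nbits


# The same BER gives a different loss probability for each frame size.
for plen in (64, 512, 1500):
    print(plen, 1.0 - p_pass(1_000_000, plen, overhead_bits=208))
```

The size dependence is what a single configured packet error rate cannot express: with BER, long frames are lost more often than short ones on the same link.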

This is a crude model yet one which was fairly simple to implement.
Could you elaborate why you consider it to be entirely wrong?

Cheers,

Marko


Re: Is anybody using ng_pipe?

2020-08-18 Thread Eugene Grosbein
19.08.2020 0:17, Ryan Stone wrote:

> where dummynet wasn't possible

Sorry, missed that. But why wasn't it possible?

If you could use ng_pipe, you could probably use ng_ipfw too,
or maybe create a small ng_dummynet node to connect the netgraph network
with kernel-side dummynet directly.




Re: Is anybody using ng_pipe?

2020-08-18 Thread Eugene Grosbein
19.08.2020 0:17, Ryan Stone wrote:

> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely.  Is anybody using ng_pipe
> with a non-zero BER who would object to this?  Given this litany of
> issues I doubt it, but I thought that I'd be sure.

Take a look at dummynet(4):

kldload dummynet

# adds (optional) queueing delay plus 10 ms additional delay
ipfw pipe 1 config bw 100Mbit/s delay 10

# add packet drop probability
ipfw add 3000 prob 0.05 deny ip from any to any in

# apply bandwidth limit/delay
ipfw add 3010 pipe 1 ip from any to any in



Re: Is anybody using ng_pipe?

2020-08-18 Thread Rodney W. Grimes
> I recently needed to be able to simulate a lossy, high-latency network
> in an environment where dummynet wasn't possible.  I gave ng_pipe a
> try, and hit some major issues
> 
> 1. Instead of configuring a packet drop rate, you configure a bit
> error rate, which I found significantly less intuitive

From your background being packet network centric perhaps?
For those of us who have line oriented, aka telecom, centric backgrounds,
BER is a very meaningful and useful metric.

> 2. The use of BER makes for a very inconvenient implementation, as
> ng_pipe has to maintain a table of packet drop rates for every
> possible packet size

Hum, that sounds like a poor implementation indeed.  It seems
like it would be easy to convert a BER into a packet drop
probability based on bytes that have passed through the pipe.

It should be easy to convert a BER into a packet drop rate, but
doing the converse leads to quantization errors.  I would rather
see us keep the BER as the metric and fix what is broken rather
than convert this to a packet drop rate.

> 3. The table implementation isn't sized right for LRO or TSO, leading
> to ng_pipe going out of bounds of the array and panicking the system

Code predates LRO and TSO, so not unexpected.

> 4. The table calculation had two integer truncation bugs and used the
> wrong formula.  I'm reasonably sure it would never calculate a
> probability other than 0 due a 64-bit constant being truncated to
> 32-bits.
You retracted this.

> I'd like to dump all of this and just implement a packet loss rate,
> which would simplify all this immensely.  Is anybody using ng_pipe
> with a non-zero BER who would object to this?  Given this litany of
> issues I doubt it, but I thought that I'd be sure.

My gut instinct is that statistically BER leads to a more realistic model.

-- 
Rod Grimes rgri...@freebsd.org


Re: Is anybody using ng_pipe?

2020-08-18 Thread Ryan Stone
On Tue, Aug 18, 2020 at 1:17 PM Ryan Stone  wrote:
> 4. The table calculation had two integer truncation bugs and used the
> wrong formula.  I'm reasonably sure it would never calculate a
> probability other than 0 due a 64-bit constant being truncated to
> 32-bits.

I've gone back and checked, and I was partially wrong on this point.
I had gotten the idea that integer literals would be truncated to int,
which is not true.  The use of the wrong formula still means that
packets are dropped at entirely the wrong rate, though.


Is anybody using ng_pipe?

2020-08-18 Thread Ryan Stone
I recently needed to be able to simulate a lossy, high-latency network
in an environment where dummynet wasn't possible.  I gave ng_pipe a
try, and hit some major issues:

1. Instead of configuring a packet drop rate, you configure a bit
error rate, which I found significantly less intuitive
2. The use of BER makes for a very inconvenient implementation, as
ng_pipe has to maintain a table of packet drop rates for every
possible packet size
3. The table implementation isn't sized right for LRO or TSO, leading
to ng_pipe going out of bounds of the array and panicking the system
4. The table calculation had two integer truncation bugs and used the
wrong formula.  I'm reasonably sure it would never calculate a
probability other than 0 due to a 64-bit constant being truncated to
32 bits.
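
Points 2 and 3 can be pictured with a toy Python model. Everything here is hypothetical: MAX_FRAME, build_table, and lookup are invented names, the table size is not taken from ng_pipe, and the bounds check with a fallback is just one possible guard rather than a fix proposed in this thread. BER is read as "one error per ber bits".

```python
MAX_FRAME = 2048  # hypothetical table size, chosen for Ethernet-sized frames


def build_table(ber, max_len=MAX_FRAME):
    """One pass probability per frame length, 0..max_len bytes."""
    p_bit = 1.0 - 1.0 / ber
    return [p_bit ** (plen * 8) for plen in range(max_len + 1)]


def lookup(table, plen, ber):
    """Bounds-checked lookup that falls back to direct computation."""
    if plen < len(table):
        return table[plen]
    # A TSO/LRO aggregate can be far larger than any single wire frame.
    return (1.0 - 1.0 / ber) ** (plen * 8)


table = build_table(1_000_000)
print(lookup(table, 1500, 1_000_000))
print(lookup(table, 65535, 1_000_000))
```

An unchecked table[plen] with plen = 65535 would raise IndexError in Python; in kernel C the equivalent access reads past the end of the array, which is the panic described in point 3.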

I'd like to dump all of this and just implement a packet loss rate,
which would simplify all this immensely.  Is anybody using ng_pipe
with a non-zero BER who would object to this?  Given this litany of
issues I doubt it, but I thought that I'd be sure.


[Bug 247912] IPv6 ndp does not work across local bridge members

2020-08-18 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247912

--- Comment #2 from Martin Birgmeier  ---
Hi Li,

Since you want it "before and after the creation of bridge0", the following is
from the host; but the issue actually occurs on the client - I'll provide the
output for that, too.

Host before "bridge0 create" and "tap904 create":

[0]# ndp -a
Neighbor                                Linklayer Address Netif  Expire    S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6 re0    permanent R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6 re0    permanent R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6 re0    permanent R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6 re0    permanent R
gandalf.xyzzy                           00:03:0d:4f:f3:a7 re0    23h57m34s S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7 re0    23h55m33s S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b re0    23h55m33s S R
hal.xyzzy                               20:cf:30:55:5c:b6 re0    permanent R
mizar.xyzzy                             f0:de:f1:98:86:a9 re0    23h58m35s S
[0]# 

After "ifconfig bridge0 create && ifconfig bridge0 addm re0 && ifconfig bridge0
up":

[0]# ndp -a 
Neighbor                                Linklayer Address Netif  Expire    S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6 re0    permanent R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6 re0    permanent R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6 re0    permanent R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6 re0    permanent R
gandalf.xyzzy                           00:03:0d:4f:f3:a7 re0    23h58m48s S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7 re0    23h51m46s S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b re0    23h51m46s S R
hal.xyzzy                               20:cf:30:55:5c:b6 re0    permanent R
mizar.xyzzy                             f0:de:f1:98:86:a9 re0    23h59m48s S
[0]# 

After "ifconfig tap904 create && ifconfig bridge0 addm tap904":

[0]# ndp -a
Neighbor                                Linklayer Address Netif  Expire    S Flags
2002:b2bf:ee7e:4d42:22cf:30ff:fe55:5cb6 20:cf:30:55:5c:b6 re0    permanent R
fec0::4d42:22cf:30ff:fe55:5cb6          20:cf:30:55:5c:b6 re0    permanent R
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6 re0    permanent R
fe80::22cf:30ff:fe55:5cb6%re0           20:cf:30:55:5c:b6 re0    permanent R
gandalf.xyzzy                           00:03:0d:4f:f3:a7 re0    23h58m2s  S R
fe80::203:dff:fe4f:f3a7%re0             00:03:0d:4f:f3:a7 re0    23h51m0s  S R
fe80::218:e7ff:fee0:807b%re0            00:18:e7:e0:80:7b re0    23h51m0s  S R
hal.xyzzy                               20:cf:30:55:5c:b6 re0    permanent R
mizar.xyzzy                             f0:de:f1:98:86:a9 re0    23h59m2s  S
[0]# 

Now starting the bhyve VM; the rest is from inside the VM.

Before manually added ndp entries:

[0]# ndp -a
Neighbor                                Linklayer Address Netif  Expire    S Flags
v904.xyzzy                              00:a0:98:50:35:17 vtnet0 permanent R
gandalf.xyzzy                           00:03:0d:4f:f3:a7 vtnet0 23h59m57s S R
fe80::203:dff:fe4f:f3a7%vtnet0          00:03:0d:4f:f3:a7 vtnet0 23h59m2s  S R
fe80::218:e7ff:fee0:807b%vtnet0         00:18:e7:e0:80:7b vtnet0 23h59m2s  S R
2002:b2bf:ee7e:4d42:2a0:98ff:fe50:3517  00:a0:98:50:35:17 vtnet0 permanent R
fec0::4d42:2a0:98ff:fe50:3517           00:a0:98:50:35:17 vtnet0 permanent R
fe80::2a0:98ff:fe50:3517%vtnet0         00:a0:98:50:35:17 vtnet0 permanent R
mizar.xyzzy                             f0:de:f1:98:86:a9 vtnet0 23h59m57s S
[0]# 

After "ndp -s fec0:0:0:4d42::e 20:cf:30:55:5c:b6 && ndp -s fec0:0:0:4d42::e1
20:cf:30:55:5c:b6" (the host has two IPv6 addresses assigned to its interface;
fec0:0:0:4d42::e resolves to hal.xyzzy):

[0]# ndp -a
Neighbor                                Linklayer Address Netif  Expire    S Flags
fec0:0:0:4d42::e1                       20:cf:30:55:5c:b6 vtnet0 permanent R
v904.xyzzy                              00:a0:98:50:35:17 vtnet0 permanent R
gandalf.xyzzy                           00:03:0d:4f:f3:a7 vtnet0 23h58m54s S R
fe80::203:dff:fe4f:f3a7%vtnet0          00:03:0d:4f:f3:a7 vtnet0 23h57m59s S R
fe80::218:e7ff:fee0:807b%vtnet0         00:18:e7:e0:80:7b vtnet0 23h57m59s S R
2002:b2bf:ee7e:4d42:2a0:98ff:fe50:3517  00:a0:98:50:35:17 vtnet0 permanent R
fec0::4d42:2a0:98ff:fe50:3517           00:a0:98:50:35:17 vtnet0 permanent R
fe80::2a0:98ff:fe50:3517%vtnet0         00:a0:98:50:35:17 vtnet0 permanent R
hal.xyzzy                               20:cf:30:55:5c:b6 vtnet0 permanent R
mizar.xyzzy                             f0:de:f1:98:86:a9 vtnet0 23h58m54s S
[0]# 

-- Martin


[Bug 248652] netmap: pkt-gen TX huge pps difference between 11-STABLE and 12-STABLE/CURRENT on ix & ixl NIC

2020-08-18 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248652

--- Comment #7 from Vincenzo Maffione  ---
(In reply to Kubilay Kocak from comment #6)
I would say
  ix/ixl and/or NIC driver & iflib
because it's not something related to the netmap module itself, and it is an
optimization which derives from ix/ixl netmap support code, which now is
included within iflib.



Re: net.add_addr_allfibs=1 behaviour deprecation

2020-08-18 Thread Grzegorz Junka

On 18/08/2020 07:54, Julian Elischer wrote:
The reason for the two behaviours is that there are two ways that the 
previous behaviour of  "add addresses to the only FIB" could be 
interpreted and extended once multiple fibs became available. The 
single fib case could be interpreted as either of:


"Add to All N fibs where N == 1"    or     "add to the first (of 1) 
fibs".

I decided to do both :-)

At Ironport, where I wrote it, we had a scenario where I didn't want to
have wrong entries in all the fibs when a new interface was brought
up, even for a moment. Another scenario is where, for example, a tunnel
uses fib 1 but the rest of the system uses fib 0 (which points through
the tunnel). Adding new routes to the tunnel's fib when a new virtual
interface is brought up pointing through the tunnel to the same
address results in the tunnel immediately redirecting packets through
itself, which ends in tears. So the obvious thing to do was to make it
possible to add the entry only in the fib that was the default fib or,
in the case of Ironport, the fib that was the default fib of the
process adding the interface.


If you had to make a choice, I think the '0' choice is the way to go.
All other fibs need to be populated deliberately.


On 8/15/20 4:24 AM, Alexander V. Chernikov wrote:

18.07.2020, 14:22, "Alexander V. Chernikov" :

Dear FreeBSD users,

I would like to make net.add_addr_allfibs=0 the default system
behaviour and remove net.add_addr_allfibs.
To do so, I would like to collect use cases with 
net.add_addr_allfibs=1 and multiple fibs, to ensure they can still 
be supported after removal.


Background:

Multi-fib support was added in r17 [1], 12 years ago. Addition 
of interface addresses to all fibs was a feature from day 1.
The `net.add_addr_allfibs` sysctl  was added in r180840 [2], 12 
years ago.


Problem:
The goal of the fib support is to provide multiple independent 
routing tables, isolated from each other.
`net.add_addr_allfibs` default tries to shift gears in the opposite 
direction, unconditionally inserting all addresses to all of the fibs.


It complicates the logic and the kernel code, and makes control-plane
performance decrease with the number of fibs.
It makes it impossible to use the same prefixes in multiple fibs, which
may be desired given the shortage of IPv4 address space.


I do understand that there are some cases where such behaviour is 
desired.
For example, it can be used to achieve VRF route leaking or binding 
on address from different fibs.
I would like to collect such cases to consider supporting them in a 
different way.


The goal is to make net.add_addr_allfibs=0 the default behaviour and
remove net.add_addr_allfibs.
It will simplify the kernel's fib-related code and allow more
fib-related features to be added. It will also improve fib scaling.

No objections have been received.
Next steps:
* Switch net.add_addr_allfibs to 0 ( 
https://reviews.freebsd.org/D26076 )

* Provide an ability to use nexthops from different fibs
* Remove net.add_addr_allfibs

Timeline:
Aug 1: summarising feedback and the use cases, decision on proceeding
further
Aug 20 (tentative): patches for supported use cases
Sep 15 (tentative): net.add_addr_allfibs removal.

[1]: [base Contents of 
/head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup)
[2]: [base Diff of 
/head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;)


/Alexander




Agree completely. Defaulting "add_addr_allfibs" to 1 broke many existing
installations, which goes against the principle of least surprise so
often advocated on the FreeBSD lists.


This is just one example: 
https://forums.freebsd.org/threads/strange-behavior-of-setfib-since-freebsd-12-0.73348/


Changing the default again might once more break existing
installations. That shouldn't be a reason not to do it, but it might
be a reason to communicate it better this time around.


GrzegorzJ

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: net.add_addr_allfibs=1 behaviour deprecation

2020-08-18 Thread Julian Elischer
The reason for the two behaviours is that there are two ways that the
previous behaviour of "add addresses to the only FIB" could be
interpreted and extended once multiple fibs became available. The
single fib case could be interpreted as either of:

"Add to all N fibs where N == 1" or "add to the first (of 1) fibs".

I decided to do both :-)

At Ironport, where I wrote it, we had a scenario where I didn't want to
have wrong entries in all the fibs when a new interface was brought
up, even for a moment. Another scenario is where, for example, a tunnel
uses fib 1 but the rest of the system uses fib 0 (which points through
the tunnel). Adding new routes to the tunnel's fib when a new virtual
interface is brought up pointing through the tunnel to the same
address results in the tunnel immediately redirecting packets through
itself, which ends in tears. So the obvious thing to do was to make it
possible to add the entry only in the fib that was the default fib or,
in the case of Ironport, the fib that was the default fib of the
process adding the interface.


If you had to make a choice, I think the '0' choice is the way to go.
All other fibs need to be populated deliberately.
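
The two readings of the old single-fib behaviour can be sketched with a toy model. This is not kernel code: the class `FibSet`, the method `add_interface_route`, the prefix 10.9.0.0/24 and the interface name `tun0` are all made up for illustration, and only the name `add_addr_allfibs` comes from the thread. It shows that the two readings agree with one fib and diverge with two, which is where the tunnel case goes wrong.

```python
from ipaddress import ip_network


class FibSet:
    """Toy stand-in for a set of routing tables; not the kernel's structure."""

    def __init__(self, numfibs):
        # One dict per fib: prefix -> outgoing interface name.
        self.fibs = [dict() for _ in range(numfibs)]

    def add_interface_route(self, prefix, ifname, default_fib=0,
                            add_addr_allfibs=True):
        """Install the route for a newly configured interface address."""
        net = ip_network(prefix)
        if add_addr_allfibs:
            targets = list(range(len(self.fibs)))  # "add to all N fibs"
        else:
            targets = [default_fib]                # "add to the first fib"
        for fibnum in targets:
            self.fibs[fibnum][net] = ifname
        return targets


# With a single fib the two readings cannot be told apart.
single_all = FibSet(1).add_interface_route("10.9.0.0/24", "tun0",
                                           add_addr_allfibs=True)
single_one = FibSet(1).add_interface_route("10.9.0.0/24", "tun0",
                                           add_addr_allfibs=False)

# With two fibs they diverge. Assume fib 1 carries the tunnel's own
# outer packets, so a route via tun0 in fib 1 would loop.
both = FibSet(2)
both.add_interface_route("10.9.0.0/24", "tun0", add_addr_allfibs=True)
only0 = FibSet(2)
only0.add_interface_route("10.9.0.0/24", "tun0", default_fib=0,
                          add_addr_allfibs=False)

net = ip_network("10.9.0.0/24")
print(single_all == single_one)  # True
print(net in both.fibs[1])       # True: the tunnel's fib now points at itself
print(net in only0.fibs[1])      # False: fib 1 stays as the admin left it
```

In this sketch the '0' setting leaves every fib other than the chosen one untouched, so anything in them has to be put there deliberately.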


On 8/15/20 4:24 AM, Alexander V. Chernikov wrote:

18.07.2020, 14:22, "Alexander V. Chernikov" :

Dear FreeBSD users,

I would like to make net.add_addr_allfibs=0 the default system behaviour and
remove net.add_addr_allfibs.
To do so, I would like to collect use cases with net.add_addr_allfibs=1 and 
multiple fibs, to ensure they can still be supported after removal.

Background:

Multi-fib support was added in r17 [1], 12 years ago. Addition of interface 
addresses to all fibs was a feature from day 1.
The `net.add_addr_allfibs` sysctl  was added in r180840 [2], 12 years ago.

Problem:
The goal of the fib support is to provide multiple independent routing tables, 
isolated from each other.
`net.add_addr_allfibs` default tries to shift gears in the opposite direction, 
unconditionally inserting all addresses to all of the fibs.

It complicates the logic and the kernel code, and makes control-plane
performance decrease with the number of fibs.
It makes it impossible to use the same prefixes in multiple fibs, which may be
desired given the shortage of IPv4 address space.
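
Both points can be illustrated with a small simulation. It is only a sketch under stated assumptions: the function `configure`, the interface names `vlan10` and `vlan20`, and the prefix 192.168.1.0/24 are hypothetical, one "insertion" is counted per table write as a rough proxy for control-plane work, and the kernel's real handling of a clashing prefix is not modelled.

```python
from ipaddress import ip_network


def configure(addresses, numfibs, add_addr_allfibs):
    """Toy model: install (prefix, ifname, fib) entries into per-fib tables.

    Returns (tables, insertions, conflicts).
    """
    tables = [dict() for _ in range(numfibs)]
    insertions = 0
    conflicts = []
    for prefix, ifname, fib in addresses:
        net = ip_network(prefix)
        targets = range(numfibs) if add_addr_allfibs else [fib]
        for fibnum in targets:
            insertions += 1
            # setdefault keeps the first owner, so a second interface
            # with the same prefix in the same fib shows up as a clash.
            owner = tables[fibnum].setdefault(net, ifname)
            if owner != ifname:
                conflicts.append((fibnum, str(net), owner, ifname))
    return tables, insertions, conflicts


# Two interfaces reuse the same private prefix, each meant for its own fib.
addrs = [("192.168.1.0/24", "vlan10", 0),
         ("192.168.1.0/24", "vlan20", 1)]

_, n_all, clash_all = configure(addrs, numfibs=2, add_addr_allfibs=True)
_, n_one, clash_one = configure(addrs, numfibs=2, add_addr_allfibs=False)

print(n_all, len(clash_all))  # 4 2
print(n_one, len(clash_one))  # 2 0
```

In the model, the all-fibs setting performs addresses × fibs table writes and the reused prefix collides in every fib, while the per-fib setting performs one write per address and the two tables stay independent.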

I do understand that there are some cases where such behaviour is desired.
For example, it can be used to achieve VRF route leaking or binding on address 
from different fibs.
I would like to collect such cases to consider supporting them in a different 
way.

The goal is to make net.add_addr_allfibs=0 the default behaviour and remove
net.add_addr_allfibs.
It will simplify the kernel's fib-related code and allow more fib-related
features to be added. It will also improve fib scaling.

No objections have been received.
Next steps:
* Switch net.add_addr_allfibs to 0 ( https://reviews.freebsd.org/D26076 )
* Provide an ability to use nexthops from different fibs
* Remove net.add_addr_allfibs

Timeline:
Aug 1: summarising feedback and the use cases, decision on proceeding further
Aug 20 (tentative): patches for supported use cases
Sep 15 (tentative): net.add_addr_allfibs removal.

[1]: [base Contents of 
/head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?revision=17=markup)
[2]: [base Diff of 
/head/sys/net/route.c](https://svnweb.freebsd.org/base/head/sys/net/route.c?r1=180839=180840;)

/Alexander

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


