Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-12 Thread Hans Petter Selasky

On 02/11/16 17:47, Warner Losh wrote:

On Wed, Feb 10, 2016 at 8:08 AM, Pedro Giffuni  wrote:


Hello;

El 10/02/2016 a las 02:20, Hans Petter Selasky escribió:


On 01/19/16 17:09, Ryan Stone wrote:


On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky <
hsela...@freebsd.org>
wrote:



+   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf
*),
+   _lro_mbuf_compare_header);



In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space.  Is there a DOS concern here?



Hi,

Our FreeBSD qsort() routine has been specifically modified to not exhibit
the so-called QuickSort worst case behaviour of O(N**2) sorting time. This
is not documented in our source code, but here:

http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf

So I think DOS w.r.t O(N**2) is not a valid consern.

Thank you for your input Ryan.

BTW:

Drew Gallatin has tested our qsort() v.s. my mergesort() and found that:

"It looks like mergesort is nearly 2x as expensive. (4.7% vs 2.5%)"



FWIW, our libc qsort() has an additional enhancement:

http://svnweb.freebsd.org/base?view=revision=279663

In my measurements qsort(3) was now always faster than mergesort(3).



If it is faster, is there any good reason to maintain both qsort and
mergesort
in the kernel then?



No, I've abandoned the mergesort patch.

--HPS

___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-11 Thread Pedro Giffuni



On 02/11/16 11:47, Warner Losh wrote:



On Wed, Feb 10, 2016 at 8:08 AM, Pedro Giffuni 

Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-11 Thread Warner Losh
On Wed, Feb 10, 2016 at 8:08 AM, Pedro Giffuni  wrote:

> Hello;
>
> El 10/02/2016 a las 02:20, Hans Petter Selasky escribió:
>
>> On 01/19/16 17:09, Ryan Stone wrote:
>>
>>> On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky <
>>> hsela...@freebsd.org>
>>> wrote:
>>>
>>>
 +   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf
 *),
 +   _lro_mbuf_compare_header);


>>> In the worst case, qsort() can take O(n**2) time and consume O(n) stack
>>> space.  Is there a DOS concern here?
>>>
>>>
>> Hi,
>>
>> Our FreeBSD qsort() routine has been specifically modified to not exhibit
>> the so-called QuickSort worst case behaviour of O(N**2) sorting time. This
>> is not documented in our source code, but here:
>>
>> http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf
>>
>> So I think DOS w.r.t O(N**2) is not a valid consern.
>>
>> Thank you for your input Ryan.
>>
>> BTW:
>>
>> Drew Gallatin has tested our qsort() v.s. my mergesort() and found that:
>>
>> "It looks like mergesort is nearly 2x as expensive. (4.7% vs 2.5%)"
>>
>>
> FWIW, our libc qsort() has an additional enhancement:
>
> http://svnweb.freebsd.org/base?view=revision=279663
>
> In my measurements qsort(3) was now always faster than mergesort(3).


If it is faster, is there any good reason to maintain both qsort and
mergesort
in the kernel then?

Warner
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-11 Thread Benjamin Kaduk
On Thu, Feb 11, 2016 at 10:47 AM, Warner Losh  wrote:

> On Wed, Feb 10, 2016 at 8:08 AM, Pedro Giffuni  wrote:
>
>
> If it is faster, is there any good reason to maintain both qsort and
> mergesort
> in the kernel then?
>

qsort is not stable; mergesort is.
(It's a shame that glibc didn't pick up mergesort.)

-Ben
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-10 Thread Pedro Giffuni

Hello;

El 10/02/2016 a las 02:20, Hans Petter Selasky escribió:

On 01/19/16 17:09, Ryan Stone wrote:
On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky 


wrote:



+   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct 
mbuf *),

+   _lro_mbuf_compare_header);



In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space.  Is there a DOS concern here?



Hi,

Our FreeBSD qsort() routine has been specifically modified to not 
exhibit the so-called QuickSort worst case behaviour of O(N**2) 
sorting time. This is not documented in our source code, but here:


http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf

So I think DOS w.r.t O(N**2) is not a valid consern.

Thank you for your input Ryan.

BTW:

Drew Gallatin has tested our qsort() v.s. my mergesort() and found that:

"It looks like mergesort is nearly 2x as expensive. (4.7% vs 2.5%)"



FWIW, our libc qsort() has an additional enhancement:

http://svnweb.freebsd.org/base?view=revision=279663

In my measurements qsort(3) was now always faster than mergesort(3).

Pedro.

___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-02-09 Thread Hans Petter Selasky

On 01/19/16 17:09, Ryan Stone wrote:

On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky 
wrote:



+   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf *),
+   _lro_mbuf_compare_header);



In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space.  Is there a DOS concern here?



Hi,

Our FreeBSD qsort() routine has been specifically modified to not 
exhibit the so-called QuickSort worst case behaviour of O(N**2) sorting 
time. This is not documented in our source code, but here:


http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf

So I think DOS w.r.t O(N**2) is not a valid consern.

Thank you for your input Ryan.

BTW:

Drew Gallatin has tested our qsort() v.s. my mergesort() and found that:

"It looks like mergesort is nearly 2x as expensive. (4.7% vs 2.5%)"

See:

https://reviews.freebsd.org/D5200

And:

https://reviews.freebsd.org/D4994

--HPS
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Adrian Chadd
On 19 January 2016 at 12:24, Hans Petter Selasky  wrote:
> On 01/19/16 21:05, Adrian Chadd wrote:
>>
>> Erm, ok. So I'm confused about the state of how the streams are
>> tracked. So not all mbufs are in a stream, but they're in some single
>> LRO mbuf list?
>>
>
> Hi,
>
> The streams are typically tracked using the hardware generated Toeplitz hash
> value. The mbufs are sorted according to the hash value and the hash type.
> The list of mbufs is scanned, flushing the LRO entries every time we see a
> change in the hash value or hash type. OK?

Sure, but TCP fragments and non-fragments can hash to different
values, so suddenly you get interleaved packets.

You can't rely on the toeplitz hash 100% hashing /all packets in a
stream/ reliably, because it's highly dependent upon the NIC config.

This is why I did all that effort in the RSS code to handle things.

-a
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Adrian Chadd
On 19 January 2016 at 12:44, Hans Petter Selasky  wrote:
> On 01/19/16 21:23, Adrian Chadd wrote:
>>
>> On 19 January 2016 at 12:24, Hans Petter Selasky  wrote:
>>>
>>> On 01/19/16 21:05, Adrian Chadd wrote:


 Erm, ok. So I'm confused about the state of how the streams are
 tracked. So not all mbufs are in a stream, but they're in some single
 LRO mbuf list?

>>>
>>> Hi,
>>>
>>> The streams are typically tracked using the hardware generated Toeplitz
>>> hash
>>> value. The mbufs are sorted according to the hash value and the hash
>>> type.
>>> The list of mbufs is scanned, flushing the LRO entries every time we see
>>> a
>>> change in the hash value or hash type. OK?
>>
>>
>
> Hi,
>
>> Sure, but TCP fragments and non-fragments can hash to different
>> values, so suddenly you get interleaved packets.
>
>
> Fragmented TCP packets should not be subject to LRO. Depending on the NIC we
> will get the same hash value or not. I think that's fine. If you want to use
> this feature the NIC should not hash the TCP port numbers when it sees a
> fragmented packet including the starting fragment. That will give the best
> result.
>
> What does the RSS code expect currently?

The NIC doesn't mis-tag something, so it knows to rehash things in the
netisr input path.

And yes, I've seen streams with both fragments and non-fragments at
the same time. Then you end up with some packets with different hash
types. The IP input path will reassemble fragments, and then you'll
rehash the packet with the correct L3 or L4 calculation as
appropriate.

In this setup you may have the fragments show up at a different hash
value to the non-fragments, so you'll handle all the non-fragments
first, then the fragments show up later. So hopefully the LRO code
handles seeing that hole in the TCP stream and will eject the whole
stream up.


>> You can't rely on the toeplitz hash 100% hashing /all packets in a
>> stream/ reliably, because it's highly dependent upon the NIC config.
>
>
>> This is why I did all that effort in the RSS code to handle things.
>
> --HPS
>
>
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 21:23, Adrian Chadd wrote:

On 19 January 2016 at 12:24, Hans Petter Selasky  wrote:

On 01/19/16 21:05, Adrian Chadd wrote:


Erm, ok. So I'm confused about the state of how the streams are
tracked. So not all mbufs are in a stream, but they're in some single
LRO mbuf list?



Hi,

The streams are typically tracked using the hardware generated Toeplitz hash
value. The mbufs are sorted according to the hash value and the hash type.
The list of mbufs is scanned, flushing the LRO entries every time we see a
change in the hash value or hash type. OK?




Hi,


Sure, but TCP fragments and non-fragments can hash to different
values, so suddenly you get interleaved packets.


Fragmented TCP packets should not be subject to LRO. Depending on the 
NIC we will get the same hash value or not. I think that's fine. If you 
want to use this feature the NIC should not hash the TCP port numbers 
when it sees a fragmented packet including the starting fragment. That 
will give the best result.


What does the RSS code expect currently?



You can't rely on the toeplitz hash 100% hashing /all packets in a
stream/ reliably, because it's highly dependent upon the NIC config.


> This is why I did all that effort in the RSS code to handle things.

--HPS


___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Adrian Chadd
.. and just for reference, qsort ends up being very inefficient when
you start having lots of entries with the same value, because the
partitioning stops being log n...


-a
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 21:47, Adrian Chadd wrote:

In this setup you may have the fragments show up at a different hash
value to the non-fragments, so you'll handle all the non-fragments
first, then the fragments show up later. So hopefully the LRO code
handles seeing that hole in the TCP stream and will eject the whole
stream up.


Yes, the LRO code ejects all fragmented packets. They might be received 
out of order though after the sorting, assuming that's fine.


BTW: I don't see how you can get very high througput with fragmented TCP 
packets w/o LRO.


--HPS
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 21:05, Adrian Chadd wrote:

Erm, ok. So I'm confused about the state of how the streams are
tracked. So not all mbufs are in a stream, but they're in some single
LRO mbuf list?



Hi,

The streams are typically tracked using the hardware generated Toeplitz 
hash value. The mbufs are sorted according to the hash value and the 
hash type. The list of mbufs is scanned, flushing the LRO entries every 
time we see a change in the hash value or hash type. OK?


--HPS

___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Ryan Stone
On Tue, Jan 19, 2016 at 3:23 PM, Adrian Chadd 
wrote:

> Sure, but TCP fragments and non-fragments can hash to different
> values, so suddenly you get interleaved packets.
>

If the NIC is hashing fragments and non-fragments to different values,
aren't we already potentially getting out-of-order packets?  If the same
stream is spread across multiple queues all bets are already off.  I'm not
sure that this change has made things significantly worse in this regard.

> So hopefully the LRO code
> handles seeing that hole in the TCP stream and will eject the whole
> stream up.

The LRO code is aware of sequence numbers and will not merge frames if
there is a hole in the sequence numbers.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 20:26, Adrian Chadd wrote:

Well, why are you using qsort to do it, rather than walking the list
of streams and inputting all the mbufs per stream?

That's an O(1) operation.



Because you don't know when the stream ends and then you'll need to pass 
the list one time for every stream which is O(N*N) :-)


--HPS

___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Adrian Chadd
Well, why are you using qsort to do it, rather than walking the list
of streams and inputting all the mbufs per stream?

That's an O(1) operation.



-a
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Adrian Chadd
Erm, ok. So I'm confused about the state of how the streams are
tracked. So not all mbufs are in a stream, but they're in some single
LRO mbuf list?


-adrian

On 19 January 2016 at 11:38, Hans Petter Selasky  wrote:
> On 01/19/16 20:26, Adrian Chadd wrote:
>>
>> Well, why are you using qsort to do it, rather than walking the list
>> of streams and inputting all the mbufs per stream?
>>
>> That's an O(1) operation.
>>
>
> Because you don't know when the stream ends and then you'll need to pass the
> list one time for every stream which is O(N*N) :-)
>
> --HPS
>
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 18:01, Ryan Stone wrote:

libkern's qsort() is a quicksort implementation. AFAICT it has the worst
case behaviour that I describe.


FYI:

https://reviews.freebsd.org/D4994

--HPS
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Ryan Stone
On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky 
wrote:

>
> +   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf *),
> +   _lro_mbuf_compare_header);
>

In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space.  Is there a DOS concern here?
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Hans Petter Selasky

On 01/19/16 17:09, Ryan Stone wrote:

On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky 
wrote:



+   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf *),
+   _lro_mbuf_compare_header);



In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space.  Is there a DOS concern here?



Hi Ryan,

Is this the case for the qsort() we have in the FreeBSD kernel?

There are other sorting algorithms which can be used instead of qsort() 
which consume O(n * log(n)) time and O(1) stack, but requires a power of 
two set of elements to sort.


--HPS
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294327 - in head/sys: dev/cxgb dev/cxgbe dev/e1000 dev/hyperv/netvsc dev/ixgbe dev/mxge netinet sys

2016-01-19 Thread Ryan Stone
libkern's qsort() is a quicksort implementation. AFAICT it has the worst
case behaviour that I describe.

On Tue, Jan 19, 2016 at 11:54 AM, Hans Petter Selasky 
wrote:

> On 01/19/16 17:09, Ryan Stone wrote:
>
>> On Tue, Jan 19, 2016 at 10:33 AM, Hans Petter Selasky <
>> hsela...@freebsd.org>
>> wrote:
>>
>>
>>> +   qsort(lc->lro_mbuf_data, lc->lro_mbuf_count, sizeof(struct mbuf
>>> *),
>>> +   _lro_mbuf_compare_header);
>>>
>>>
>> In the worst case, qsort() can take O(n**2) time and consume O(n) stack
>> space.  Is there a DOS concern here?
>>
>>
> Hi Ryan,
>
> Is this the case for the qsort() we have in the FreeBSD kernel?
>
> There are other sorting algorithms which can be used instead of qsort()
> which consume O(n * log(n)) time and O(1) stack, but requires a power of
> two set of elements to sort.
>
> --HPS
>
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"