Re: [RFC PATCH 1/1] dql: add dql_set_min_limit()

2021-03-09 Thread Dave Taht
I note that "proof" here is very much a matter of the developer's
opinion and a limited testing base.

Actual operational experience, as in a real deployment, with other applications,
heavy context switching, or virtualization, might yield better results.

There are lots of defaults in the linux kernel that are just swags, the
default NAPI and rx/tx ring buffer sizes being two where devs just
copy/paste values that either don't scale up or don't scale down.

This does not mean I oppose your patch! However I have two points I'd
like to make regarding bql and dql in general that I have long wished
to see explored.

0) Being an advocate of low latency in general, I have no problem with,
and even prefer, starving the device rather than always keeping it busy.

/me hides

1) BQL is MIAD - multiplicative increase, additive decrease. While in
practice so far this does not seem to matter much (and measuring these
things down to the microsecond is really hard), the stabler algorithm is
AIMD - additive increase, multiplicative decrease. BQL often absorbs a
large TSO burst - usually a minimum of 128k is observed on gbit, where a
stabler state (without GSO) seemed to be around 40k on many of the
chipsets I worked with, back when I was working in this area.

(cake's gso-splitting also gets lower bql values in general, if you
have enough cpu to run cake)

2) BQL + hardware mq is increasingly an issue in my mind: if, say, you
are hitting 64 hw queues, each with 128k stored in there, the buffering
is additive, whereas servicing interrupts properly and keeping the media
busy might only require 128k total, spread across the active queues and
flows. I have often thought that making BQL scale better to multiple hw
queues by globally sharing the buffering state(s) would lead to lower
latency, but also that sharing that state would probably be too high
overhead.

Having not worked out a solution to 2), preferring to start with 1), and
not having a whole lot of support for item 0) in the world, I just
thought I'd mention these, in the hope someone might give them a go.


Re: [PATCH] ath10k: increase rx buffer size to 2048

2020-04-28 Thread Dave Taht
On Tue, Apr 28, 2020 at 5:06 AM Kalle Valo  wrote:
>
> Sven Eckelmann  writes:
>
> > On Wednesday, 1 April 2020 09:00:49 CEST Sven Eckelmann wrote:
> >> On Wednesday, 5 February 2020 20:10:43 CEST Linus Lüssing wrote:
> >> > From: Linus Lüssing 
> >> >
> >> > Before, only frames with a maximum size of 1528 bytes could be
> >> > transmitted between two 802.11s nodes.
> >> >
> >> > For batman-adv for instance, which adds its own header to each frame,
> >> > we typically need an MTU of at least 1532 bytes to be able to transmit
> >> > without fragmentation.
> >> >
> >> > This patch now increases the maximum frame size from 1528 to 1656
> >> > bytes.
> >> [...]
> >>
> >> @Kalle, I saw that this patch was marked as deferred [1] but I
> >> couldn't find any mail why it was done so. It seems like this
> >> currently creates real world problems - so it would be nice if you
> >> could explain shortly what is currently blocking its acceptance.
> >
> > Ping?
>
> Sorry for the delay, my plan was to first write some documentation about
> different hardware families but haven't managed to do that yet.
>
> My problem with this patch is that I don't know what hardware and
> firmware versions were tested, so it needs analysis before I feel safe
> to apply it. The ath10k hardware families are so different that even
> if a patch works perfectly on one ath10k hardware it could still break
> badly on another one.
>
> What makes me faster to apply ath10k patches is to have a comprehensive
> analysis in the commit log. This shows me the patch author has
> considered all hardware families, not just the one he is testing
> on, and that I don't need to do the analysis myself.

I have been struggling to get the ath10k to sing and dance using various
variants of the firmware, on this bug over here:

https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/

The puzzling thing is the loss of bidirectional throughput at codel target 20,
and getting WAY more (but less than I expected) at codel target 5.

This doesn't quite have bearing on the size of the rx ring, except that
in my experiments the rx ring is rather small!! And yet I get way more
performance out of it.

(still, as you'll see from the bug, it's WAY better than it used to be)

Is NAPI in this driver? I'm afraid to look.
> --
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


Re: [PATCH net-next v5] sched: Add dualpi2 qdisc

2019-08-28 Thread Dave Taht
On Wed, Aug 28, 2019 at 7:00 AM Bob Briscoe  wrote:
>
> Olivier, Dave,
>
> On 23/08/2019 13:59, Tilmans, Olivier (Nokia - BE/Antwerp) wrote:
>
> as best as I can
> tell (but could be wrong) the NQB idea wants to put something into the
> l4s fast queue? Or is NQB supposed to
> be a third queue?
>
> NQB is not supported in this release of the code. But FYI, it's not for a 
> third queue.

At the time of my code review of dualpi I had not gone back to review
the NQB draft fully.

> We can add support for NQB in the future, by expanding the
> dualpi2_skb_classify() function. This is however out of scope at the
> moment as NQB is not yet adopted by the TSV WG. I'd guess we may want more

> than just the NQB DSCP codepoint in the L queue, which then warrant
> another way to classify traffic, e.g., using tc filter hints.

Yes, you'll find folk are fans of being able to put tc (and ebpf)
filters in front of various qdiscs for classification, logging, and/or
dropping behavior.

A fairly typical stanza is here:
https://github.com/torvalds/linux/blob/master/net/sched/sch_sfq.c#L171
to line 193.

> The IETF adopted the NQB draft at the meeting just passed in July, but the 
> draft has not yet been updated to reflect that: 
> https://tools.ietf.org/html/draft-white-tsvwg-nqb-02

Hmmm... no. I think Olivier's statement was correct.

NQB was put into the "call for adoption into tsvwg" state (
https://mailarchive.ietf.org/arch/msg/tsvwg/fjyYQgU9xQCNalwPO7v9-al6mGk
) in the tsvwg on Aug 21st, which
doesn't mean "adopted by the ietf", either. In response to that call
several folk did put in (rather pithy)
comments on the current state of the NQB idea and internet draft, starting here:

https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk

For those here that are not familiar with IETF processes (and there
are many!) there are "internet drafts" that may or may not become
working group items, that if they become accepted by the working group
may or may not evolve to become actual RFCs.  Unlike lkml usage, where
we use RFC in its original meaning as a mere request for comments,
there are several classes of IETF RFC - standards track, experimental,
and informational - once they are adopted and published by the
ietf.

There are RFCs for how they do RFCs, and BCPs and other TLAs, and if
you really want to know more about how the ietf processes actually
work, please contact me off list. Anyway...

Much of the experimental L4S architecture itself (of which NQB MAY
become part, and dualpi/tcpprague/etc are) is presently an accepted
tsvwg wg item with a list of 11 problems on the bug database here (
https://trac.ietf.org/trac/tsvwg/report/1?sort=ticket=1=1 ).
IMHO it's not currently near last call for standardization as a set of
experimental RFCs.

L4S takes advantage of several RFCs that have
indeed been published as experimental, notably, RFC8311, which too few
have read as yet.

While using up ECT1 in the L4S code as an identifier and not as a
congestion indicator is very controversial for me (
https://lwn.net/Articles/783673/ ), AND I'd rather it not be baked
into the linux api for dualpi should this identifier not be chosen by
the wg (thus my suggestion of a mask or lookup table)...

... I also dearly would like both sides of this code - dualpi and tcp
prague - in a simultaneously testable and high quality state. Without
that, many core ideas in dualpi cannot be tested, nor objectively
evaluated against other tcps and qdiscs using rfc3168 behavior along
the path. Multiple experimental ideas in RFC8311 (such as those in
section 4.3) have also not been re-evaluated in any context.

Is the known-to-work reference codebase for "tcp prague" still 3.19 based?

> The draft requests 0x2A (decimal 42) as the DSCP but, until the IETF 
> converges on a specific DSCP for NQB, I believe we should not code in a 
> default classifier anyway.
>
>
>
> Bob
>
> --
> 
> Bob Briscoe   http://bobbriscoe.net/



--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


Re: [PATCH net-next v5] sched: Add dualpi2 qdisc

2019-08-22 Thread Dave Taht
This is vastly improved code, thank you!

1) Since we're still duking it out over the meaning of the bits - not
just the SCE thing, but as best as I can
tell (but could be wrong) the NQB idea wants to put something into the
l4s fast queue? Or is NQB supposed to
be a third queue?

In those cases, the ecn_mask should just be a mask.

2) Is the intent to make the drop probability 0 by default? (10 in the
pie rfc, not mentioned in the l4s rfc as yet)

3) has this been tested on a hw mq system as yet? (10gigE is typically
64 queues)


Re: [PATCH] net: can: Increase tx queue length

2019-03-09 Thread Dave Taht
Toke Høiland-Jørgensen  writes:

> Appana Durga Kedareswara Rao  writes:
>
>> Hi Andre,
>>
>>  
>>> 
>>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
>>> > While stress testing the CAN interface on xilinx axi can in loopback
>>> > mode getting message "write: no buffer space available"
>>> > Increasing device tx queue length resolved the above mentioned issue.
>>> 
>>> No need to patch the kernel:
>>> 
>>> $ ip link set  txqueuelen 500
>>> 
>>> does the same thing.
>>
>> Thanks for the review... 
>> Agree but it is not an out of box solution right?? 
>> Do you have any idea for socket can devices why the tx queue length is 10 
>> whereas
>> for other network devices (ex: ethernet) it is 1000 ??
>
> Probably because you don't generally want a long queue adding latency on
> a CAN interface? The default 1000 is already way too much even for an
> Ethernet device in a lot of cases.
>
> If you get "out of buffer" errors it means your application is sending
> things faster than the receiver (or device) can handle them. If you
> solve this by increasing the queue length you are just papering over the
> underlying issue, and trading latency for fewer errors. This tradeoff
> *may* be appropriate for your particular application, but I can imagine
> it would not be appropriate as a default. Keeping the buffer size small
> allows errors to propagate up to the application, which can then back
> off, or do something smarter, as appropriate.
>
> I don't know anything about the actual discussions going on when the
> defaults were set, but I can imagine something along the lines of the
> above was probably a part of it :)
>
> -Toke

In a related - loud and often difficult - discussion over here on the can bus,

https://github.com/systemd/systemd/issues/9194#issuecomment-469403685

we found that applying fq_codel as the default qdisc via sysctl is a bad
idea for at least one model of can device.

If you scroll back on the bug, a good description of what the can
subsystem expects from the qdisc is therein - it mandates an in-order
fifo qdisc or no queue at all. The CAN protocol expects each packet to
be either transmitted successfully or rejected; if rejected, the error
is passed up to userspace, which is supposed to stop sending further
input.

As this was the first serious bug ever reported against using fq_codel
as the default in 5+ years of systemd and 7 of openwrt deployment, I've
been taking it very seriously. It goes beyond just systemd - openwrt
patches out pfifo_fast entirely. pfifo_fast is the wrong qdisc - the
right choices are noqueue and possibly pfifo.

However, the vcan device exposes noqueue, and so far only the one
device (an 8Devices socketcan USB2CAN) whose driver did not do this
was misbehaving.

Which was just corrected with a simple:

static int usb_8dev_probe(struct usb_interface *intf,
			  const struct usb_device_id *id)
{
	...
	netdev->netdev_ops = &usb_8dev_netdev_ops;

	netdev->flags |= IFF_ECHO; /* we support local echo */
+	netdev->priv_flags |= IFF_NO_QUEUE;
	...
}

and successfully tested on that bug report.

So at the moment, my thought is that all can devices should default to
noqueue, if they are not already. I think a pfifo_fast and a qlen of any
size is the wrong thing, but I still don't know enough about what other
can devices do or did to be certain.



Re: [bug, bisected] pfifo_fast causes packet reordering

2018-03-13 Thread Dave Taht
On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher
 wrote:
> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
> delivered out-of-order.
>
> We have tracked the problem down to the driver interface level, and it seems
> that the driver's net_device_ops.ndo_start_xmit() function gets the packets
> handed over in the wrong order.
>
> This behavior was not observed on Linux v4.15 and I have bisected the
> problem down to this patch:
>
>> commit c5ad119fb6c09b0297446be05bd66602fa564758
>> Author: John Fastabend 
>> Date:   Thu Dec 7 09:58:19 2017 -0800
>>
>>net: sched: pfifo_fast use skb_array
>>
>>This converts the pfifo_fast qdisc to use the skb_array data structure
>>and set the lockless qdisc bit. pfifo_fast is the first qdisc to
>> support
>>the lockless bit that can be a child of a qdisc requiring locking. So
>>we add logic to clear the lock bit on initialization in these cases
>> when
>>the qdisc graft operation occurs.
>>
>>This also removes the logic used to pick the next band to dequeue from
>>and instead just checks a per priority array for packets from top
>> priority
>>to lowest. This might need to be a bit more clever but seems to work
>>for now.
>>
>>Signed-off-by: John Fastabend 
>>Signed-off-by: David S. Miller 
>
>
> The patch does not revert cleanly, but moving to one commit earlier makes
> the problem go away.
>
> Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go
> away as well.

I am, of course, a fan of obsoleting pfifo_fast. There's no good reason
for it anymore.

>
> Is this an unintended side-effect of the patch or is there something the
> driver has to do to request in-order delivery?
>
> Thanks,
> Jakob



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619


Re: [Tech-board-discuss] [Ksummit-discuss] Linux Foundation Technical Advisory Board election results 2017

2017-10-25 Thread Dave Taht
On Wed, Oct 25, 2017 at 5:23 PM, H. Peter Anvin  wrote:
> On 10/26/17 00:14, Laurent Pinchart wrote:
>>
>>> It was a very close election, with the next candidate on the list
>>> receiving 36 votes.
>>
>> Do we have a procedure in place in case the tie that we only avoided by one
>> vote would have happened ?
>>
>
> In that case the winner(s) would be drawn randomly.  The scanning tool
> script actually assigns each candidate a tiebreaker number via
> /dev/urandom, but we have not yet decided if we would do that or, for
> example, a physical coin toss.

Once upon a time - it was Heinlein, I think - it was suggested that a
candidate for office be picked entirely randomly from the pool. The
lottery concept - the idea that if you ran, you had a small chance of
serving no matter the popular vote - encouraged participation.

>
> We have had ties in the past, and had one this time too, but none that
> has ever crossed the cut line.
>
> -hpa
>
> ___
> Tech-board-discuss mailing list
> tech-board-disc...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/tech-board-discuss



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-10 Thread Dave Taht
On Tue, Jan 10, 2017 at 9:23 AM, Felix Fietkau <n...@nbd.name> wrote:
> On 2017-01-10 18:17, Dave Taht wrote:
>> In the case of wifi I have 3 issues with this line of thought.
>>
>> multicast in wifi has generally supposed to be unreliable. This makes
>> it reliable. reliability comes at a cost -
>>
>> multicast is typically set at a fixed low rate today. unicast is
>> retried at different rates until it succeeds - for every station
>> listening. If one station is already at the lowest rate, the total
>> cost of the transmit increases, rather than decreases.
>>
>> unicast gets block acks until it succeeds. Again, more delay.
>>
>> I think there is something like 31 soft-retries in the ath9k driver
> If I remember correctly, hardware retries are counted here as well.

I chopped this to something more reasonable but never got around to
quantifying it, so never pushed the patch. I figured I'd measure ATF
in a noisy environment (which I'd be doing now if it weren't for
https://bugs.lede-project.org/index.php?do=details_id=368 )
first.

>> what happens to diffserv markings here? for unicast CS1 goes into the
>> BE queue, CS6, the VO queue. Do we go from one flat queue for all of
>> multicast to punching it through one of the hardware queues based on
>> the diffserv mark now with this patch?

I meant CS1=BK here. Tracing the path through the bridge code made my
head hurt, I can go look at some aircaps to see if the mcast->unicast
conversion respects those markings or not (my vote is *not*).

>> I would like it if there was a way to preserve the unreliability
>> (which multiple mesh protocols depend on), send stuff with QoSNoack,
>> etc - or dynamically choose (based on the rates of the stations)
>> between conventional multicast and unicast.
>>
>> Or - better, IMHO, keep sending multicast as is but pick the best of
>> the rates available to all the listening stations for it.

> The advantage of the multicast-to-unicast conversion goes beyond simply
> selecting a better rate - aggregation matters a lot as well, and that is
> simply incompatible with normal multicast.

Except for the VO queue which cannot aggregate. And for that matter,
using any other hardware queue than BE tends to eat a txop that would
otherwise possibly be combined with an aggregate.

(and the VI queue has always misbehaved, long on my todo list)

> Some multicast streams use lots of small-ish packets, the airtime impact
> of those is vastly reduced, even if the transmission has to be
> duplicated for a few stations.

The question is basically how far up it scales. Arguably, for a
very few, well connected stations, this patch would help. For a
network with more - and more badly connected - stations, I think it
would hurt.

What sorts of multicast traffic are being observed that flood the
network sufficiently to be worth optimizing out? arp? nd? upnp? mdns?
uftp? tv?

(my questions above are related to basically trying to setup a sane
a/b test, I've been building up a new testbed in noisy environment to
match the one I have in a quiet one, and don't have any "good" mcast
tests defined. Has anyone done an a/b test of this code with some
repeatable test already?)

(In my observations... the only truly heavy creators of a multicast
"burp" have tended to be upnp and mdns on smaller networks. Things like
nd and arp get more problematic as the number of stations goes up also.
I can try things like abusing vlc or uftp to see what happens?)

I certainly agree multicast is a "problem" (I've seen 20-80% or more
of a given wifi network eaten by multicast) but I'm not convinced that
making it reliable, aggregatable unicast scales much past
basement-level testing of a few "good" stations, and don't know which
protocols are making it worse, the worst, in typical environments.
Certainly apple gear puts out a lot of multicast.

...

As best as I recall a recommendation in the 802.11-2012 standard was
that multicast packets be rate-limited so that you'd have a fixed
amount of crap after each beacon sufficient to keep the rest of the
unicast traffic flowing rapidly, instead of dumping everything into a
given beacon transmit.

That, combined with (maybe) picking the "best" union of known rates
per station, was essentially the strategy I'd intended[1] to pursue
for tackling the currently infinite wifi multicast queue - fq the
entries, have a fairly short queue (codel is not the best choice here)
drop from head, and limit the number of packets transmitted per beacon
to spread them out. That would solve the issue for sparse multicast
(dhcp etc), and smooth out the burps from bigger chunks while
impacting conventional unicast minimally.

There's also the pursuit of less multicast overall at least in some protocols

https://tools.ietf.org/html/draft-ietf-dnssd-hybrid-05


>
> - Felix


[1] but make-wifi-fast has been out of funding since august

-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-10 Thread Dave Taht
In the case of wifi I have 3 issues with this line of thought.

multicast in wifi has generally been supposed to be unreliable. This
makes it reliable, and reliability comes at a cost -

multicast is typically set at a fixed low rate today. unicast is
retried at different rates until it succeeds - for every station
listening. If one station is already at the lowest rate, the total
cost of the transmit increases, rather than decreases.

unicast gets block acks until it succeeds. Again, more delay.

I think there is something like 31 soft-retries in the ath9k driver

what happens to diffserv markings here? for unicast CS1 goes into the
BE queue, CS6, the VO queue. Do we go from one flat queue for all of
multicast to punching it through one of the hardware queues based on
the diffserv mark now with this patch?

I would like it if there was a way to preserve the unreliability
(which multiple mesh protocols depend on), send stuff with QoSNoack,
etc - or dynamically choose (based on the rates of the stations)
between conventional multicast and unicast.

Or - better, IMHO, keep sending multicast as is but pick the best of
the rates available to all the listening stations for it.

Has anyone actually looked at the effects of this with, say, 5-10
stations at middling to poor quality (longer distance), using something
to measure the real effect of the multicast conversion? (uftp, mdns?)


Re: [PATCH net-next] bridge: multicast to unicast

2017-01-10 Thread Dave Taht
In the case of wifi I have 3 issues with this line of thought.

multicast in wifi has generally supposed to be unreliable. This makes
it reliable. reliability comes at a cost -

multicast is typically set at a fixed low rate today. unicast is
retried at different rates until it succeeds - for every station
listening. If one station is already at the lowest rate, the total
cost of the transmit increases, rather than decreases.

unicast gets block acks until it succeeds. Again, more delay.

I think there is something like 31 soft-retries in the ath9k driver

what happens to diffserv markings here? for unicast CS1 goes into the
BE queue, CS6, the VO queue. Do we go from one flat queue for all of
multicast to punching it through one of the hardware queues based on
the diffserv mark now with this patch?

I would like it if there was a way to preserve the unreliability
(which multiple mesh protocols depend on), send stuff with QoSNoack,
etc - or dynamically choose (based on the rates of the stations)
between conventional multicast and unicast.

Or - better, IMHO, keep sending multicast as is but pick the best of
the rates available to all the listening stations for it.

Has anyone actually looked at the effects of this with, say, 5-10
stations at middling to poor quality (longer distance), using something
to measure the real effect of the multicast conversion? (uftp, mdns?)


Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-07 Thread Dave Taht
The openwrt tree has long contained a set of patches that correct for
unaligned issues throughout the linux network stack.

https://git.lede-project.org/?p=openwrt/source.git;a=blob;f=target/linux/ar71xx/patches-4.4/910-unaligned_access_hacks.patch;h=b4b749e4b9c02a74a9f712a2740d63e554de5c64;hb=ee53a240ac902dc83209008a2671e7fdcf55957a

Unaligned access traps in the packet-processing path are horrifically
expensive on certain versions of the MIPS architecture. I had kind of
hoped these patches, in some form, would have made it upstream by now
(or that the affected arches would be retired; I think it's mostly just
mips24k).




Re: [PATCH net-next] net: stmmac: add BQL support

2014-12-29 Thread Dave Taht
On Sun, Dec 28, 2014 at 1:48 PM, Beniamino Galvani  wrote:
> On Sun, Dec 28, 2014 at 08:25:40AM -0800, Dave Taht wrote:
>> On Sun, Dec 28, 2014 at 6:57 AM, Beniamino Galvani  
>> wrote:
>> > Add support for Byte Queue Limits to the STMicro MAC driver.
>>
>> Thank you!
>>
>> > Tested on an Amlogic S805 Cortex-A5 board, where the use of BQL
>> > slightly decreases the ping latency from ~10ms to ~3ms when the
>> > 100Mbps link is saturated by TCP streams. No difference is
>> > observed at 1Gbps.
>>
>> I see the plural. With TSQ in place it is hard (without something like
>> the rrul test driving multiple streams) to drive a driver to
>> saturation with small numbers of flows. This was with pfifo_fast, not
>> sch_fq, at 100mbit?
>
> Hi Dave,
>
> yes, this was with pfifo_fast and I used 4 iperf TCP streams. The total
> throughput didn't seem to increase adding more streams.

>>
>> Can this board actually drive a full gigabit in the first place? Until
>> now most of the low end arm boards I have seen only came with
>> a 100mbit mac, and the gig ones lacking offloads seemed to peak
>> out at about 600mbit.
>
> I measured a throughput of 650mbit in rx and 600mbit in tx.

You might want to try the rrul test, which exercises both directions
and measures latency at the same time.

In my case I have been trying to find a low-cost chip that can do soft
rate limiting (htb) + fq_codel at up to 300mbit/sec, as that is about
the peak speed we will be getting from cable modems. Those modems are
horribly overbuffered at these speeds too, with 1.2sec of bidirectional
latency observed at 120mbit/12mbit.

I'm open to crazy ideas like trying to find a use for the gpu, etc, to
get there.

>
>>
>> Under my christmas tree landed a quad core A5 (odroid-c1), also an
>> xgene and zedboard - both of the latter are a-needing BQL,
>> and I haven't booted the udroid yet. Hopefully it is the
>> same driver you just improved.
>
> I'm using the odroid-c1 too, with this tree based on the recent
> Amlogic mainline work:
>
>   https://github.com/bengal/linux/tree/meson8b

Oh, cool, thx!

> Unfortunately at the moment the support for the board is very basic
> (for example, SMP is not working yet) but it's enough to do some NIC
> tests.

Good to know. Have you looked at xmit_more yet?

http://lwn.net/Articles/615238/


> Beniamino



-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH net-next] net: stmmac: add BQL support

2014-12-28 Thread Dave Taht
On Sun, Dec 28, 2014 at 6:57 AM, Beniamino Galvani  wrote:
> Add support for Byte Queue Limits to the STMicro MAC driver.

Thank you!

> Tested on an Amlogic S805 Cortex-A5 board, where the use of BQL
> slightly decreases the ping latency from ~10ms to ~3ms when the
> 100Mbps link is saturated by TCP streams. No difference is
> observed at 1Gbps.

I see the plural. With TSQ in place it is hard (without something like
the rrul test driving multiple streams) to drive a driver to
saturation with small numbers of flows. This was with pfifo_fast, not
sch_fq, at 100mbit?

Can this board actually drive a full gigabit in the first place? Until
now most of the low end arm boards I have seen only came with
a 100mbit mac, and the gig ones lacking offloads seemed to peak
out at about 600mbit.

Under my christmas tree landed a quad core A5 (odroid-c1), also an
xgene and zedboard - both of the latter are a-needing BQL,
and I haven't booted the odroid yet. Hopefully it is the
same driver you just improved.

(https://plus.google.com/u/0/107942175615993706558/posts/f1D43umhm7E )

> Signed-off-by: Beniamino Galvani 
> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 118a427..c5af3d8 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1097,6 +1097,7 @@ static int init_dma_desc_rings(struct net_device *dev, 
> gfp_t flags)
>
> priv->dirty_tx = 0;
> priv->cur_tx = 0;
> +   netdev_reset_queue(priv->dev);
>
> stmmac_clear_descriptors(priv);
>
> @@ -1300,6 +1301,7 @@ static void stmmac_dma_operation_mode(struct 
> stmmac_priv *priv)
>  static void stmmac_tx_clean(struct stmmac_priv *priv)
>  {
> unsigned int txsize = priv->dma_tx_size;
> +   unsigned int bytes_compl = 0, pkts_compl = 0;
>
> spin_lock(>tx_lock);
>
> @@ -1356,6 +1358,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
> priv->hw->mode->clean_desc3(priv, p);
>
> if (likely(skb != NULL)) {
> +   pkts_compl++;
> +   bytes_compl += skb->len;
> dev_consume_skb_any(skb);
> priv->tx_skbuff[entry] = NULL;
> }
> @@ -1364,6 +1368,9 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
>
> priv->dirty_tx++;
> }
> +
> +   netdev_completed_queue(priv->dev, pkts_compl, bytes_compl);
> +
> if (unlikely(netif_queue_stopped(priv->dev) &&
>  stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv))) {
> netif_tx_lock(priv->dev);
> @@ -1418,6 +1425,7 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
>  (i == txsize - 1));
> priv->dirty_tx = 0;
> priv->cur_tx = 0;
> +   netdev_reset_queue(priv->dev);
> priv->hw->dma->start_tx(priv->ioaddr);
>
> priv->dev->stats.tx_errors++;
> @@ -2049,6 +2057,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, 
> struct net_device *dev)
> skb_tx_timestamp(skb);
>
> priv->hw->dma->enable_dma_transmission(priv->ioaddr);
> +   netdev_sent_queue(dev, skb->len);
>
> spin_unlock(>tx_lock);
> return NETDEV_TX_OK;
> --
> 2.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks



Re: [RFC PATCH net-next] tun: support retrieving multiple packets in a single read with IFF_MULTI_READ

2014-12-22 Thread Dave Taht
On Mon, Dec 22, 2014 at 12:18 PM, Alex Gartrell  wrote:
> Hey Herbert,
>
> Thanks for getting back to me
>
> On 12/22/14 4:09 AM, Herbert Xu wrote:
>>
>> As tun already has a socket interface can we do this through
>> recvmmsg?
>
>
> This just presents an easier interface (IMHO) for accomplishing that. And I
> say easier because I was unable to figure out the recvmmsg way to do it.

the recvmsg and recvmmsg calls, and the layers above them, could use an
abstraction that allows for better passing of per-packet header
information to applications in the QUIC and webrtc age.

> While fully aware that this makes me look like an idiot, I have to admit

I have lost several days of hair to *msg calls. So have the authors of
multipath mosh
(which is WAY cool, btw: https://github.com/boutier/mosh )

So, no, trying and failing does not make you an idiot. Trying at all does
make you a mite crazy, however. :)

> that I've tried and failed to figure out how to get a socket fd out of the
> tun device.
>
> The regular fd doesn't work (which is obvious when you look at the
> implementation sock_from_file), there's a tun_get_socket function but it's
> only referenced by a single file, and none of the ioctl's jump out at me as
> doing anything to enable this behavior.  Additionally, tuntap.txt makes no
> mention of sockets specifically.
>
> FWIW, I don't feel strongly that IFF_MULTI_READ is the right way to do this
> either.

I have been thinking about how to eliminate serialization dependencies
in userspace vpns using fair queueing and multithreading
(splitting the seqno + address across an entire /64)...

... and how to cut excess latency with multipacket reads, and then by
codeling internal queues (as many vpns bottleneck on the encap and
encode step, allowing packets to accumulate in the OS recv buffer)

See:

http://www.tinc-vpn.org/pipermail/tinc-devel/2014-December/000680.html

And especially:

https://plus.google.com/u/0/107942175615993706558/posts/QWPWLoGMtrm

and after having just suffered through making that work with recvmsg,
was dreading trying to make it work with recvmmsg.

It appears that one of the core crazy ideas (listening on an entire
/64) doesn't work with the existing APIs, and this new interface would
help? Or recvmmsg could be generalized? Or?


-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks



Re: [PATCH 3/3] ath10k: add firmware files

2014-03-14 Thread Dave Taht
On Fri, Mar 14, 2014 at 5:36 AM, Luis R. Rodriguez
 wrote:
> On Fri, Mar 14, 2014 at 1:45 AM, Kalle Valo  wrote:
>> +* No Reverse engineering, decompiling, decrypting, or disassembling of
>> +  this software is permitted.
>
> We have other firmware licenses that have this language already on
> linux-firmware but let us also keep in mind that linux-firmware will
> likely not only be used by Linux folks.
>
> <-- cut -->
>
>> + NO LICENSES OR OTHER RIGHTS,
>> +WHETHER EXPRESS, IMPLIED, BASED ON ESTOPPEL OR OTHERWISE, ARE GRANTED
>> +TO ANY PARTY'S PATENTS, PATENT APPLICATIONS, OR PATENTABLE INVENTIONS
>> +BY VIRTUE OF THIS LICENSE OR THE DELIVERY OR PROVISION BY QUALCOMM
>> +ATHEROS, INC. OF THE SOFTWARE.
>
> This -- however is new to linux-firmware -- and I hereby raise a big
> red fucking flag. All other licenses on linux-firmware provide at the
> very least a limited patent grant. What makes Qualcomm special ?
>
>> +FOR ANY DIRECT DAMAGES ARISING UNDER OR RESULTING FROM
>> +THIS AGREEMENT OR IN CONNECTION WITH ANY USE OF THE SOFTWARE SHALL NOT
>> +EXCEED A TOTAL AMOUNT OF US$5.00.
>
> WTF - was this inspired by a Sci-Fi movie?

Actually this is awesome. Does this mean I CAN reverse engineer,
decompile, decrypt and disassemble this software?

Where do I send my 5 bucks?

>   Luis
>
> ___
> ath10k mailing list
> ath...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k



-- 
Dave Täht



Re: [PATCH,RFC] random: collect cpu randomness

2014-02-06 Thread Dave Taht
On Thu, Feb 6, 2014 at 5:20 PM, Kees Cook  wrote:
> Hi Jörn,
>
> On Sun, Feb 02, 2014 at 03:36:17PM -0500, Jörn Engel wrote:
>> Collects entropy from random behaviour all modern cpus exhibit.  The
>> scheduler and slab allocator are instrumented for this purpose.  How
>> much randomness can be gathered is clearly hardware-dependent and hard
>> to estimate.  Therefore the entropy estimate is zero, but random bits
>> still get mixed into the pools.
>
> Have you seen this work from PaX Team?
>
> http://grsecurity.net/pipermail/grsecurity/2012-July/001093.html
>
> See http://grsecurity.net/test/grsecurity-3.0-3.13.1-201402052349.patch
> and search for PAX_LATENT_ENTROPY.

The hardware rng world just got easier with the "hashlet".

https://plus.google.com/u/0/107942175615993706558/posts/4iq6W524SxL

Kernel driver wanted...

> -Kees
>
> --
> Kees Cook@outflux.net



-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

