Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread David Woodhouse
On Mon, 2012-04-30 at 19:16 +, Karl P wrote:
> As a user, I _never_ want to have to go and turn on the switch that says, 
> "make 
> my device work as fast as it can"
> 
> Is there any reason that I would _not_ want this?  

You may only have a printer, or something like that, attached to the
crappy Ethernet port. And hacking the entire network stack to load
integers byte-by-byte instead of doing 32-bit loads may have an adverse
effect on the performance of the decent interfaces that you *do* care
about.

Do we have numbers on how much the performance *decrease* is, for decent
network devices that *do* actually align their packets sanely?

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Karl P



On 04/30/2012 06:48 PM, David Woodhouse wrote:

On Mon, 2012-04-30 at 20:34 +0200, Felix Fietkau wrote:

There could be a generic Kconfig variable which could be selected by CPU
targets or ethernet drivers (with a dependency on
!HAVE_EFFICIENT_UNALIGNED_ACCESS).
That's cleaner than messing around with #define stuff manually.


Don't just select it; you *might* have a crappy network driver that
can't insert padding to align its packets, but you might not care about
performance for that particular device — you might use it for a
low-traffic interface or something, or not use it at all.

It's a user choice, I think.


As a user, I _never_ want to have to go and turn on the switch that says, "make 
my device work as fast as it can"


Is there any reason that I would _not_ want this?  Not caring about performance 
is not a good reason to not provide performance.


Cheers,
Karl P
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread David Woodhouse
On Mon, 2012-04-30 at 20:34 +0200, Felix Fietkau wrote:
> There could be a generic Kconfig variable which could be selected by CPU
> targets or ethernet drivers (with a dependency on
> !HAVE_EFFICIENT_UNALIGNED_ACCESS).
> That's cleaner than messing around with #define stuff manually.

Don't just select it; you *might* have a crappy network driver that
can't insert padding to align its packets, but you might not care about
performance for that particular device — you might use it for a
low-traffic interface or something, or not use it at all.

It's a user choice, I think.

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread David Woodhouse
On Mon, 2012-04-30 at 11:11 -0700, Dave Taht wrote:
> On Mon, Apr 30, 2012 at 10:26 AM, Felix Fietkau  wrote:
> > On 2012-04-30 5:08 PM, Dave Taht wrote:
> >> On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau  wrote:
> >>> On 2012-04-30 4:49 PM, David Woodhouse wrote:
>  On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
> > Tell it to however wired up this chip and shipped it in qty millions.
> > Actually that message was already received, successor chipsets from
> > this manufacturer did it up right.
> 
>  So the real problem is that the ar71xx doesn't allow you to DMA to
>  addresses which aren't aligned to 4 bytes? Which network devices does
>  that restriction apply to? All of them?
> >>> AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.
> >>
> >> I guess a long term issue is this patch needs to apply only to those 
> >> arches,
> >> somehow. It's silly to be hurtful to the successor arches.
> >>
> >>>
>  Out of interest, have you done a performance comparison with just
>  *moving* the packet when it arrives?
> >>> I did some tests before I did the first unaligned hack patch, I don't
> >>> have the numbers anymore, but the results were horrible.
> >>
> >> I'm still looking for the 'one liner to pass the driver to tell it to 
> >> realign',
> >> but was that the approach you were trying...
> > I did test with such a one liner (or maybe it was a two liner), but I
> > didn't keep the patch.
> > I think it's quite normal that this approach is so much more expensive
> > than adding the unaligned access hacks. The main bottleneck on these
> > devices is the memory bus, and for normal traffic passed through the
> > device as a router, the CPU does not have to touch much of the payload
> > contents at all, since it's only touched by DMA.
> > Doing a memmove to re-align the packet contents blows the dcache
> > footprint out of proportion and significantly increases the amount of
> > unnecessary memory bus traffic.
> 
> I agree it would blow up the dcache and be worse than what exists by a lot.
> 
> So, out of this conversation:
> 
> 1) It would be nice to not have this patch take effect on any but the
> ar71xx and ar91xx. As these share code in openwrt, doing it with a
> compile time define

This bothers me. It sounds like you don't even *care* about getting code
merged upstream. I know it's potentially hard in this case, but it
should still be the default mode.

A good start would be to have patches that aren't platform-specific, and
can be applied across the board.

 
> 2) IF there existed another brain damaged ethernet chip on some other
> arch, 

There are certainly plenty of architectures where alignment traps are
very expensive, and a fair number of brain-damaged network devices that
can't ensure their packets end up suitably aligned, either. I think it's
sane enough to at least *try* talking to DaveM about a condition
'__packed' on the relevant structures, and a simple 'load __be32' and
'load __u32' operation that may or may not explicitly use unaligned
loads according to the same config option.

> it would be worth coming up with a Kbuild option to enable
> defining __packed generically as part of the network stack for those
> arches. Something more
> pithy than
> 
> #define F_ING_HW_ENGINEER_SAVED_PIN
> 
> would be needed tho.
> 
> #define UNALIGNED_ETHERNET

Network, not Ethernet. But yeah, something like that.

> perhaps.
> 
> 3) I don't like the tcp_hdr macro in general, but it looks like that
> is an obsolete part of the patch anyway, so I'll try ripping it out.

Sounds good.

There are also a bunch of 16-bit loads that are being patched — is there
any point in that? Surely the packets are still aligned to *2* bytes?

This one looks suspicious too:

--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -17,7 +17,10 @@ static void iph_to_flow_copy_addrs(struct flow_keys *flow, co
nst struct iphdr *i
 {
BUILD_BUG_ON(offsetof(typeof(*flow), dst) !=
 offsetof(typeof(*flow), src) + sizeof(flow->src));
-   memcpy(&flow->src, &iph->saddr, sizeof(flow->src) + sizeof(flow->dst));
+   /*  memcpy(&flow->src, &iph->saddr, sizeof(flow->src) + 
sizeof(flow->dst)); */
+   flow->src = iph->saddr;
+   flow->dst = iph->daddr;
+
 }

> 4) get_unaligned_be32 seems like the right thing rather than
> get_unaligned_cpu32?

In some cases, yes. I actually think you should define specific macros
which do each, and their behaviour is just a straight dereference if
CONFIG_NETWORK_UNALIGNED is not set.

> 5) I THINK the 'if aligned, do assembly version' for the checksums is
> a win, but if item #2 is true, I'm not happy with the casts...
> 
> @@ -105,6 +141,9 @@ static inline __sum16 ip_fast_csum(const void
> *iph, unsigned int ihl)
> unsigned int csum;
> int carry;
> 
> +   if ((unsigned int) iph & 3)
> +   return ip_fast_csum_unaligned(iph,ihl);
> +

At least that one is in M

Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Felix Fietkau
On 2012-04-30 8:11 PM, Dave Taht wrote:
> On Mon, Apr 30, 2012 at 10:26 AM, Felix Fietkau  wrote:
>> On 2012-04-30 5:08 PM, Dave Taht wrote:
>>> On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau  wrote:
 On 2012-04-30 4:49 PM, David Woodhouse wrote:
> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
>> Tell it to however wired up this chip and shipped it in qty millions.
>> Actually that message was already received, successor chipsets from
>> this manufacturer did it up right.
>
> So the real problem is that the ar71xx doesn't allow you to DMA to
> addresses which aren't aligned to 4 bytes? Which network devices does
> that restriction apply to? All of them?
 AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.
>>>
>>> I guess a long term issue is this patch needs to apply only to those arches,
>>> somehow. It's silly to be hurtful to the successor arches.
>>>

> Out of interest, have you done a performance comparison with just
> *moving* the packet when it arrives?
 I did some tests before I did the first unaligned hack patch, I don't
 have the numbers anymore, but the results were horrible.
>>>
>>> I'm still looking for the 'one liner to pass the driver to tell it to 
>>> realign',
>>> but was that the approach you were trying...
>> I did test with such a one liner (or maybe it was a two liner), but I
>> didn't keep the patch.
>> I think it's quite normal that this approach is so much more expensive
>> than adding the unaligned access hacks. The main bottleneck on these
>> devices is the memory bus, and for normal traffic passed through the
>> device as a router, the CPU does not have to touch much of the payload
>> contents at all, since it's only touched by DMA.
>> Doing a memmove to re-align the packet contents blows the dcache
>> footprint out of proportion and significantly increases the amount of
>> unnecessary memory bus traffic.
> 
> I agree it would blow up the dcache and be worse than what exists by a lot.
> 
> So, out of this conversation:
> 
> 1) It would be nice to not have this patch take effect on any but the
> ar71xx and ar91xx. As these share code in openwrt, doing it with a
> compile time define
Right. For upstreaming parts of this stuff, this would definitely be
necessary.

> 2) IF there existed another brain damaged ethernet chip on some other
> arch, it would be worth coming up with a Kbuild option to enable
> defining __packed generically as part of the network stack for those
> arches. Something more
> pithy than
> 
> #define F_ING_HW_ENGINEER_SAVED_PIN
> 
> would be needed tho.
> 
> #define UNALIGNED_ETHERNET
> 
> perhaps.
There could be a generic Kconfig variable which could be selected by CPU
targets or ethernet drivers (with a dependency on
!HAVE_EFFICIENT_UNALIGNED_ACCESS).
That's cleaner than messing around with #define stuff manually.

> 3) I don't like the tcp_hdr macro in general, but it looks like that
> is an obsolete part of the patch anyway, so I'll try ripping it out.
> 
> 4) get_unaligned_be32 seems like the right thing rather than
> get_unaligned_cpu32?
Right.

> 5) I THINK the 'if aligned, do assembly version' for the checksums is
> a win, but if item #2 is true, I'm not happy with the casts...
> 
> @@ -105,6 +141,9 @@ static inline __sum16 ip_fast_csum(const void
> *iph, unsigned int ihl)
> unsigned int csum;
> int carry;
> 
> +   if ((unsigned int) iph & 3)
> +   return ip_fast_csum_unaligned(iph,ihl);
> +
For casts from pointers to integers, the kernel typically uses unsigned
long instead of unsigned int.

- Felix
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Dave Taht
On Mon, Apr 30, 2012 at 10:26 AM, Felix Fietkau  wrote:
> On 2012-04-30 5:08 PM, Dave Taht wrote:
>> On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau  wrote:
>>> On 2012-04-30 4:49 PM, David Woodhouse wrote:
 On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
> Tell it to however wired up this chip and shipped it in qty millions.
> Actually that message was already received, successor chipsets from
> this manufacturer did it up right.

 So the real problem is that the ar71xx doesn't allow you to DMA to
 addresses which aren't aligned to 4 bytes? Which network devices does
 that restriction apply to? All of them?
>>> AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.
>>
>> I guess a long term issue is this patch needs to apply only to those arches,
>> somehow. It's silly to be hurtful to the successor arches.
>>
>>>
 Out of interest, have you done a performance comparison with just
 *moving* the packet when it arrives?
>>> I did some tests before I did the first unaligned hack patch, I don't
>>> have the numbers anymore, but the results were horrible.
>>
>> I'm still looking for the 'one liner to pass the driver to tell it to 
>> realign',
>> but was that the approach you were trying...
> I did test with such a one liner (or maybe it was a two liner), but I
> didn't keep the patch.
> I think it's quite normal that this approach is so much more expensive
> than adding the unaligned access hacks. The main bottleneck on these
> devices is the memory bus, and for normal traffic passed through the
> device as a router, the CPU does not have to touch much of the payload
> contents at all, since it's only touched by DMA.
> Doing a memmove to re-align the packet contents blows the dcache
> footprint out of proportion and significantly increases the amount of
> unnecessary memory bus traffic.

I agree it would blow up the dcache and be worse than what exists by a lot.

So, out of this conversation:

1) It would be nice to not have this patch take effect on any but the
ar71xx and ar91xx. As these share code in openwrt, doing it with a
compile time define

2) IF there existed another brain damaged ethernet chip on some other
arch, it would be worth coming up with a Kbuild option to enable
defining __packed generically as part of the network stack for those
arches. Something more
pithy than

#define F_ING_HW_ENGINEER_SAVED_PIN

would be needed tho.

#define UNALIGNED_ETHERNET

perhaps.

3) I don't like the tcp_hdr macro in general, but it looks like that
is an obsolete part of the patch anyway, so I'll try ripping it out.

4) get_unaligned_be32 seems like the right thing rather than
get_unaligned_cpu32?

5) I THINK the 'if aligned, do assembly version' for the checksums is
a win, but if item #2 is true, I'm not happy with the casts...

@@ -105,6 +141,9 @@ static inline __sum16 ip_fast_csum(const void
*iph, unsigned int ihl)
unsigned int csum;
int carry;

+   if ((unsigned int) iph & 3)
+   return ip_fast_csum_unaligned(iph,ihl);
+

>
> - Felix



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Felix Fietkau
On 2012-04-30 5:08 PM, Dave Taht wrote:
> On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau  wrote:
>> On 2012-04-30 4:49 PM, David Woodhouse wrote:
>>> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
 Tell it to however wired up this chip and shipped it in qty millions.
 Actually that message was already received, successor chipsets from
 this manufacturer did it up right.
>>>
>>> So the real problem is that the ar71xx doesn't allow you to DMA to
>>> addresses which aren't aligned to 4 bytes? Which network devices does
>>> that restriction apply to? All of them?
>> AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.
> 
> I guess a long term issue is this patch needs to apply only to those arches,
> somehow. It's silly to be hurtful to the successor arches.
> 
>>
>>> Out of interest, have you done a performance comparison with just
>>> *moving* the packet when it arrives?
>> I did some tests before I did the first unaligned hack patch, I don't
>> have the numbers anymore, but the results were horrible.
> 
> I'm still looking for the 'one liner to pass the driver to tell it to 
> realign',
> but was that the approach you were trying...
I did test with such a one liner (or maybe it was a two liner), but I
didn't keep the patch.
I think it's quite normal that this approach is so much more expensive
than adding the unaligned access hacks. The main bottleneck on these
devices is the memory bus, and for normal traffic passed through the
device as a router, the CPU does not have to touch much of the payload
contents at all, since it's only touched by DMA.
Doing a memmove to re-align the packet contents blows the dcache
footprint out of proportion and significantly increases the amount of
unnecessary memory bus traffic.

- Felix
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Dave Taht
On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau  wrote:
> On 2012-04-30 4:49 PM, David Woodhouse wrote:
>> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
>>> Tell it to however wired up this chip and shipped it in qty millions.
>>> Actually that message was already received, successor chipsets from
>>> this manufacturer did it up right.
>>
>> So the real problem is that the ar71xx doesn't allow you to DMA to
>> addresses which aren't aligned to 4 bytes? Which network devices does
>> that restriction apply to? All of them?
> AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.

I guess a long term issue is this patch needs to apply only to those arches,
somehow. It's silly to be hurtful to the successor arches.

>
>> Out of interest, have you done a performance comparison with just
>> *moving* the packet when it arrives?
> I did some tests before I did the first unaligned hack patch, I don't
> have the numbers anymore, but the results were horrible.

I'm still looking for the 'one liner to pass the driver to tell it to realign',
but was that the approach you were trying...

>
>> If this really is needed to work around hardware limitations, I don't
>> see why *some* of it shouldn't be acceptable upstream. A config option
>> to add '__packed' to various structures is fairly harmless, for
>> example...
> Makes sense
>
> - Felix



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Dave Taht
On Mon, Apr 30, 2012 at 7:49 AM, David Woodhouse  wrote:
> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
>> Tell it to however wired up this chip and shipped it in qty millions.
>> Actually that message was already received, successor chipsets from
>> this manufacturer did it up right.
>
> So the real problem is that the ar71xx doesn't allow you to DMA to
> addresses which aren't aligned to 4 bytes? Which network devices does
> that restriction apply to? All of them?


> Out of interest, have you done a performance comparison with just
> *moving* the packet when it arrives?

I did not. Felix did.



Fixing this issue has had a long, tortuous history...

this got tried and later backed out...

https://dev.openwrt.org/changeset/20892

"- on ar724x, rx buffers can be aligned with an offset of 2, which
keeps the ip header aligned
- alignment offset is only added if the ar8216 workaround is not
active and the phy driver does not advertise its own packet alignment
- ar71xx and ar91xx can not handle rx alignment offsets, however
taking a hit on unaligned exceptions seems to have less overhead than
re-aligning the data for large packets
- use memmove to re-align small packets, if necessary

tested on ar9132, ar7240 and ar7242 based devices without ar8216 headers"

Which got backed out here:

https://dev.openwrt.org/ticket/7236

So I'd certainly be willing to attempt realignment in the driver, and
sort of remember that it's like a one-liner to do nowadays...

but it is a 16 bit bus, and packets are dmaed, so getting the cpu
involved on every ethernet transfer strikes me as potentially very
bad.

>
> If this really is needed to work around hardware limitations, I don't
> see why *some* of it shouldn't be acceptable upstream. A config option
> to add '__packed' to various structures is fairly harmless, for
> example...

The __packed trick, so far as I know, is undefined gcc behavior. But, yea,
perhaps something more PC than this

#define F_ING_HW_ENGINEER_SAVED_A_PIN __packed

>
> --
> dwmw2



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Felix Fietkau
On 2012-04-30 4:49 PM, David Woodhouse wrote:
> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
>> Tell it to however wired up this chip and shipped it in qty millions.
>> Actually that message was already received, successor chipsets from
>> this manufacturer did it up right.
> 
> So the real problem is that the ar71xx doesn't allow you to DMA to
> addresses which aren't aligned to 4 bytes? Which network devices does
> that restriction apply to? All of them?
AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.

> Out of interest, have you done a performance comparison with just
> *moving* the packet when it arrives?
I did some tests before I did the first unaligned hack patch, I don't
have the numbers anymore, but the results were horrible.

> If this really is needed to work around hardware limitations, I don't
> see why *some* of it shouldn't be acceptable upstream. A config option
> to add '__packed' to various structures is fairly harmless, for
> example...
Makes sense

- Felix
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread David Woodhouse
On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
> Tell it to however wired up this chip and shipped it in qty millions.
> Actually that message was already received, successor chipsets from
> this manufacturer did it up right.

So the real problem is that the ar71xx doesn't allow you to DMA to
addresses which aren't aligned to 4 bytes? Which network devices does
that restriction apply to? All of them?

Out of interest, have you done a performance comparison with just
*moving* the packet when it arrives?

If this really is needed to work around hardware limitations, I don't
see why *some* of it shouldn't be acceptable upstream. A config option
to add '__packed' to various structures is fairly harmless, for
example...

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread Dave Taht
On Mon, Apr 30, 2012 at 3:49 AM, David Woodhouse  wrote:
> On Sun, 2012-04-22 at 14:48 -0700, Dave Täht wrote:
>>

Thank you very much for the code review!

 +
>> +-#define tcp_flag_word(tp) ( ((union tcp_word_hdr *)(tp))->words [3])
>> ++#define tcp_flag_word2(tp) ( ((union tcp_word_hdr *)(tp))->words [3])
>> ++#define tcp_flag_word(tp) ( __get_unaligned_cpu32(&(((union tcp_word_hdr 
>> *)(tp))->words [3])))
>
> Ewww. And didn't you already make that union __packed anyway?

yes, ewww. I'll note that I don't like the define in the general case.
It is used both for assignment (once!) and for access , which is why
it got broken into two macros...

>
>> +diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> +index 22ef5f9..bf3f773 100644
>> +--- a/net/ipv4/tcp.c
>>  b/net/ipv4/tcp.c
>> +@@ -2834,7 +2834,7 @@ found:
>> +
>> +       p = *head;
>> +       th2 = tcp_hdr(p);
>> +-      tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH);
>> ++      tcp_flag_word2(th2) = tcp_flag_word(th2) | flags & (TCP_FLAG_FIN | 
>> TCP_FLAG_PSH);
>
> net/ipv4/tcp.c: In function 'tcp_gro_receive':
> net/ipv4/tcp.c:2837:82: warning: suggest parentheses around arithmetic in 
> operand of '|' [-Wparentheses]
>
> Have you posted these patches to the netdev list? And is the response
> going to be "just make sure your network devices are receiving packets
> into sensibly-aligned buffers, and none of this is necessary"?

Yes, that would be the answer I expect from the list.

There is no reason to try to push this stuff upstream.

> There's a reason a lot of Ethernet drivers pad the start of their skb by
> 2 bytes before receiving the packet...

Tell it to however wired up this chip and shipped it in qty millions.
Actually that message was already received, successor chipsets from
this manufacturer did it up right.

I am somewhat forgiving in that. Compared to some arches I've worked
with the ar71xx is very solid. (for comparison, take the ep9302, where
a working toolchain didn't arrive until 2 years after the chip had
ceased production)

But this problem is a PITA, and as demonstrated, scribbling on these
core portions of the stack improves performance by an enormous amount.

> --
> dwmw2
>
> ___
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 2/2] Comprehensive ipv4 and ipv6 unaligned access patch for ar71xx

2012-04-30 Thread David Woodhouse
On Sun, 2012-04-22 at 14:48 -0700, Dave Täht wrote:
> 
> + 
> +-#define tcp_flag_word(tp) ( ((union tcp_word_hdr *)(tp))->words [3]) 
> ++#define tcp_flag_word2(tp) ( ((union tcp_word_hdr *)(tp))->words [3])
> ++#define tcp_flag_word(tp) ( __get_unaligned_cpu32(&(((union tcp_word_hdr 
> *)(tp))->words [3])))

Ewww. And didn't you already make that union __packed anyway? 

> +diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> +index 22ef5f9..bf3f773 100644
> +--- a/net/ipv4/tcp.c
>  b/net/ipv4/tcp.c
> +@@ -2834,7 +2834,7 @@ found:
> + 
> +   p = *head;
> +   th2 = tcp_hdr(p);
> +-  tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH);
> ++  tcp_flag_word2(th2) = tcp_flag_word(th2) | flags & (TCP_FLAG_FIN | 
> TCP_FLAG_PSH);

net/ipv4/tcp.c: In function 'tcp_gro_receive':
net/ipv4/tcp.c:2837:82: warning: suggest parentheses around arithmetic in 
operand of '|' [-Wparentheses]

Have you posted these patches to the netdev list? And is the response
going to be "just make sure your network devices are receiving packets
into sensibly-aligned buffers, and none of this is necessary"?

There's a reason a lot of Ethernet drivers pad the start of their skb by
2 bytes before receiving the packet... 

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH 1/3 V1] correct eglibc version numbers

2012-04-30 Thread Emmanuel Deloget

Le 28/04/2012 22:50, Mirko Vogt a écrit :

By the way: All your patches didn't apply out of the box due to
whitespace errors.


My bad. I think I did a straight svn diff + select the text in my
xterm and copy it using the middle button. That may have changed
the whitespaces here and there. I'll do somethign better for my
next patches.

Best regards,

-- Emmanuel Deloget
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] [PATCH] net/openl2tp: fix ppp setup timeout in non-rpc (mini) version

2012-04-30 Thread Daniel Golle
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I merged your branches into my openwrt, packages and luci repos and gave it a
try over the past 24 hours running on a ath9k-based router.
Results: very nice. This should go uptream.

On 27/04/12 15:39, David Woodhouse wrote:
> On Fri, 2012-04-27 at 15:30 +0300, Daniel Golle wrote:
>> I wasn't aware of your solution, which I like a lot and which perfectly
>> fits into how things are done in OpenWrt. Having openl2tp running and
>> handling all the ppp stuff is pita and just having a plain pppd plugin is
>> a much better solution
> 
> Well, you do still have xl2tpd running, but at least it's less than half 
> the size of openl2tp-mini, and a quarter of the size of openl2tp-full!
> 
>> -- given that this uses the in-kernel ppp-over-l2tp handling and doesn't
>> just provide a PTY through userspace, as that would be a lot of 
>> unnecessary copying from kernel-space to user-space and back...
> 
> Right. I fixed that in the xl2tpd package too ☺
> 
>> Can your work go upstream into the openwrt trunk?
> 
> I posted the pull request yesterday...
> 


- -- 
ALLNET GmbH ; Maistr. 2 ; D-82110 Germering ; Germany
Tel. +49-89-8942 - Fax +49-89-89422233
http://www.allnet.de
email: Daniel Golle 
Schulungs-/Veranstaltungsprogramm: http://www.802lab.de
Geschäftsführer: Wolfgang Marcus Bauer
Handelsregister München B 95922 ; UST-ID-Nr. DE 128214294 ;
St.-Nr.117/115/00164
WEEE-Reg.-NR. DE 13101093
Bankverbindung:
Sparkasse Fürstenfeldbruck KTO: 2774594 ; BLZ: 70053070
Swift-Code: BYLADEM1FFB ; IBAN: DE6170053072774594
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)

iQIcBAEBAgAGBQJPnllKAAoJEDy9cKN/1Et0xAYP/1W0yxRscji2rQ/3JmpZQVX8
ZCSQz9jMRcHlnI11K4H+rMiCbPg08Un5jyrqZV3TvMX1dU87ywGGnxKCLffb9X4U
b7sk8dojcPDkUPTCBrYTqjJ4JDdL0hN20RFK55975w7ZhTpQm4i1+1K/uUedV7V6
ZwZDQQsPgax8DxopnVDzgSj1sDAqu8xkwSGG4+TdJThRTEUfQ4Qbwzytf4J8fMOF
ANMWhu13vxJFQY6Z4L6sge+wkYUKJpbdKpUyjfGYLvTdOtRBwhwZGbDTSBrlesqD
LOtDd2db+rzP/nREVOEtEarIzZ1z7k8mNKpWNWU+R5lzKvxkwOLpHD1oMuY6LDRH
oMxUO+dfT3va2O8oLLE8jgmN0VdYTPp2vjz5nHclgAWgMFZK+dydjNINIXacNspa
fNz+TLGyvZ6GNQWYME3bOzDUfIsq9VYPb1EVgAKeiTXifvTYAZHxj8X6jTeEaeko
11kA8IYSzG6MNt3DooDHb3c9tNgb5jIIPRw6AFEy23j05X/XP+lscKBBWmF3tqUO
qRlicwAsKtQRqJ8Wx5mN/2qzNNVVXnKfkF0rieuvQo60xVsm5xnv4d+2y1qr7eIb
jO9no/+F5az4ibzL9VtkhM23w9knB2UEg+SyyCljJ8n64wbQ+2VnSgDh/R1cGZxs
50r5PcBJELR8Dqe+i3Nf
=qPGs
-END PGP SIGNATURE-
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel