On Mon, Apr 30, 2012 at 10:26 AM, Felix Fietkau <n...@openwrt.org> wrote:
> On 2012-04-30 5:08 PM, Dave Taht wrote:
>> On Mon, Apr 30, 2012 at 8:02 AM, Felix Fietkau <n...@openwrt.org> wrote:
>>> On 2012-04-30 4:49 PM, David Woodhouse wrote:
>>>> On Mon, 2012-04-30 at 07:41 -0700, Dave Taht wrote:
>>>>> Tell it to however wired up this chip and shipped it in qty millions.
>>>>> Actually that message was already received, successor chipsets from
>>>>> this manufacturer did it up right.
>>>>
>>>> So the real problem is that the ar71xx doesn't allow you to DMA to
>>>> addresses which aren't aligned to 4 bytes? Which network devices does
>>>> that restriction apply to? All of them?
>>> AR71xx and AR91xx are affected, AR724x, AR933x, AR934x are unaffected.
>>
>> I guess a long term issue is this patch needs to apply only to those arches,
>> somehow. It's silly to be hurtful to the successor arches.
>>
>>>
>>>> Out of interest, have you done a performance comparison with just
>>>> *moving* the packet when it arrives?
>>> I did some tests before I did the first unaligned hack patch, I don't
>>> have the numbers anymore, but the results were horrible.
>>
>> I'm still looking for the 'one liner to pass the driver to tell it to
>> realign',
>> but was that the approach you were trying...
> I did test with such a one liner (or maybe it was a two liner), but I
> didn't keep the patch.
> I think it's quite normal that this approach is so much more expensive
> than adding the unaligned access hacks. The main bottleneck on these
> devices is the memory bus, and for normal traffic passed through the
> device as a router, the CPU does not have to touch much of the payload
> contents at all, since it's only touched by DMA.
> Doing a memmove to re-align the packet contents blows the dcache
> footprint out of proportion and significantly increases the amount of
> unnecessary memory bus traffic.

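For anyone following along, the two approaches being compared look roughly
like the userspace sketch below. This is not the driver or patch code: the
function names are made up for illustration, and load_be32_unaligned() is
only a stand-in for what the kernel's get_unaligned_be32() helper does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Approach A (the "unaligned access hack"): read a big-endian 32-bit
 * header field one byte at a time, so no 32-bit load is ever issued on
 * an address that is not 4-byte aligned. Hypothetical name.
 */
static uint32_t load_be32_unaligned(const void *p)
{
    const uint8_t *b = p;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/*
 * Approach B (realign on receive): shift the whole frame forward by
 * `shift` bytes so that the headers end up aligned. The CPU reads and
 * writes every byte of the frame, including payload it would otherwise
 * never touch. The caller must provide at least len + shift bytes.
 * Hypothetical name.
 */
static uint8_t *realign_frame(uint8_t *buf, size_t len, size_t shift)
{
    memmove(buf + shift, buf, len);
    return buf + shift;
}
```

Approach A costs a few extra instructions per header field. Approach B
costs a full pass over the frame through the dcache, which is the memory
bus traffic Felix describes above.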
I agree it would blow up the dcache and be a lot worse than what exists.

So, out of this conversation:

1) It would be nice to not have this patch take effect on any but the
ar71xx and ar91xx. As these share code in openwrt, that probably means
doing it with a compile time define.

2) IF there existed another brain damaged ethernet chip on some other
arch, it would be worth coming up with a Kbuild option to enable
defining __packed generically as part of the network stack for those
arches.

Something more pithy than #define F_ING_HW_ENGINEER_SAVED_PIN would be
needed tho. #define UNALIGNED_ETHERNET perhaps.

3) I don't like the tcp_hdr macro in general, but it looks like that
is an obsolete part of the patch anyway, so I'll try ripping it out.

4) get_unaligned_be32 seems like the right thing rather than
get_unaligned_cpu32?

5) I THINK the 'if aligned, do assembly version' for the checksums is
a win, but if item #2 is true, I'm not happy with the casts...

@@ -105,6 +141,9 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
        unsigned int csum;
        int carry;

+       if ((unsigned int) iph & 3)
+               return ip_fast_csum_unaligned(iph,ihl);
+

>
> - Felix

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
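
To make item 5 concrete, here is a minimal userspace sketch of the
"if aligned, take the fast path" dispatch from the hunk above. It is an
illustration under stated assumptions, not the patch: all names ending in
_sketch are hypothetical, the fast path is plain C standing in for the
assembly version, the return value is the checksum as a big-endian number
rather than a kernel __sum16, and the alignment test uses uintptr_t instead
of the (unsigned int) cast so it also builds on 64-bit hosts.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits, then take the one's complement. */
static uint16_t csum_fold32(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Slow path: byte loads only, so it is safe at any address. */
static uint16_t ip_csum_unaligned_sketch(const void *iph, unsigned int ihl)
{
    const uint8_t *b = iph;
    uint32_t sum = 0;

    for (unsigned int i = 0; i < ihl * 4; i += 2)
        sum += ((uint32_t)b[i] << 8) | b[i + 1];
    return csum_fold32(sum);
}

/*
 * Fast path stand-in for the assembly version: 32-bit loads, so the
 * pointer must be 4-byte aligned. The sum is taken over native-order
 * 16-bit halves; the bytes of the folded result are then read back in
 * memory order to get the same big-endian value as the slow path.
 */
static uint16_t ip_csum_aligned_sketch(const uint32_t *w, unsigned int ihl)
{
    uint32_t sum = 0;

    for (unsigned int i = 0; i < ihl; i++)
        sum += (w[i] & 0xffff) + (w[i] >> 16);

    uint16_t native = csum_fold32(sum);
    const uint8_t *p = (const uint8_t *)&native;
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Dispatch on alignment, as in the hunk. The kernel builds with
 * -fno-strict-aliasing; in portable userspace code, aligned callers
 * should pass storage that is declared as uint32_t.
 */
static uint16_t ip_csum_sketch(const void *iph, unsigned int ihl)
{
    if ((uintptr_t)iph & 3)
        return ip_csum_unaligned_sketch(iph, ihl);
    return ip_csum_aligned_sketch((const uint32_t *)iph, ihl);
}
```

Both paths have to agree on the result for every header, so a quick sanity
check is to run the same 20-byte header through ip_csum_sketch() once from
an aligned buffer and once from the same bytes copied to an odd offset.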