Re: bonding sysfs output
On Mon, 26 Nov 2007 09:29:40 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote: Andrew Morton [EMAIL PROTECTED] writes: On Sun, 25 Nov 2007 16:12:57 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote: I propose it as a fix for trailing NULs and spaces like eg. $ od -c /sys/class/net/bond0/bonding/slaves 000 e t h - l e f t e t h - r i g 020 h t \n \0 025 I'm afraid there're other problems with ++more++ handling, but let's not consider those just yet. Find the patch attached. The first hunks also renames buffer to buf, for consistency's shake. The original version had varying behaviour for Not Applicable cases. This patch also settles for empty files (not even a line feed) in those cases, but I'm not sure about the general policy on this matter. hm, there are a lot of changes there. Were they all actually needed to fix the one bug which you have described? Trailing NULs are present in each file under /sys/class/net/*/bonding and also in /sys/class/net/bonding_masters. That is, in every file provided by drivers/net/bonding/bond_sysfs.c. Most of the patch is concerned with this. Closely related is the presence of trailing spaces in multivalue files. There are three such files, one of them has the trailing space removed. This patch removes it from the other two. During this it also renames one function argument 'buffer' to 'buf', for consistency. On the policy side: some files are not applicable to some types of bonds, and return a single linefeed in that case. Except for one single case, which returns 'NA\n'. The patch changes these cases into emtpy files. If these are worthy changes, I'm absolutely willing to split up the patch into three parts as the above. Well that would be good if poss, thanks. But fixing bugs is way more important than niceties of patch presentation however I wasn't prepared to fix the rejects which that patch is hitting in the considerably-changed bonding_show_ad_partner_mac(). Please: - raise patches against the latest Linus tree (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/) - cc netdev@vger.kernel.org on networking-related matters - Include a Signed-off-by: as per Documentation/SubmittingPatches - Try to ensure that the full explanation (such as you have above) is covered in the changelog text. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
Perhaps you've got lots of patches were people are using internal APIs they shouldn't? Maybe the issue is who can tell since what is external and what is internal is not explicitly defined? Exactly. Or rather it is not defined on the module level. We got static of course, but I think we should have a similar mechanism on a module level. Explicitly documenting what comprises the kernel API (external, supported) It would not be fully supported either -- can still change etc. -- but there is a reasonable expectation that those external APIs will change less often than internal interfaces. - forcing developers to identify their exports as part of the implementation or as part of the kernel API That is EXPORT_SYMBOL already. The trouble is just that it covers too much. My patchkit is trying to limit it again for a specific use case -- exporting an internal interface to another module. Or rather a set of modules. Standard example is TCP: TCP exports nearly everything and the single user is the TCP code in ipv6.ko. Instead those symbols should be limited to be only accessable to ipv6.ko. The reason I went with the more generic namespace mechanism instead of EXPORT_SYMBOL_TO() is that ipv6 is ever split up it would still work. Also using namespaces doesn't have any more overhead than EXPORT_SYMBOL_TO() and the complexity is about the same (not very much anyways -- just look at the patches) - making it easier for reviewers to identify when developers are adding to the kernel API and thereby focusing the appropriate level of review to the new function That is another reason. -ANdi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Does tc-prio really work as advertised?
Jarek, iptables chains (this is what I think you are referring to) are not the issue. This is about the qdisc that sits immediately over the device driver and decides the order waiting packets are sent over the line/air/carrier pigeon/... . My suspicion is that skb-priority used to be set to a value that derived from the TOS bits. Then something changed and nobody noticed. -- Regards Joerg - Ursprüngliche Mail Von: Jarek Poplawski [EMAIL PROTECTED] An: Jarek Poplawski [EMAIL PROTECTED] CC: Joerg Pommnitz [EMAIL PROTECTED]; netdev@vger.kernel.org; OLSR discussion and development [EMAIL PROTECTED] Gesendet: Dienstag, den 27. November 2007, 07:46:04 Uhr Betreff: Re: Does tc-prio really work as advertised? On 26-11-2007 23:25, Jarek Poplawski wrote: ... Are you doing this on the same box? I was tracing this long time ago too, and, if I didn't miss something, it was about the place! So, as I recall (after finding some old message) this TOS is considered only for packets going through the FORWARD chain. (But, I haven't checked this at all now, so no complaints...) ...Too exactly! Iptables aren't needed for this, so going through the forward. should be enough... Jarek P. Machen Sie Yahoo! zu Ihrer Startseite. Los geht's: http://de.yahoo.com/set - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
2.6.23-mm1 tg3 wake-on-lan oddity...
Scenario - Dell Latitude D820 laptop, tg3 driver says this at boot: eth0: Tigon3 [partno(BCM5752KFBG) rev 6002 PHY(5752)] (PCI Express) 10/100/1000Base-T Ethernet 00:15:c5:c8:33:4e eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[1] eth0: dma_rwctrl[7618] dma_mask[64-bit] # (lspci; lspci -n) | grep 09 09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express (rev 02) 09:00.0 0200: 14e4:1600 (rev 02) (I think that's most of the likely-relevant info...) Issue: I (for unrelated reasons) run powertop, and it suggests I conserve power by doing 'ethtool -s eth0 wol d'. I look at it, and think that it's daft, because (a) the Dell factory default is WOL disabled and (b) if it wasn't the default, I'd have *set* it to disabled, and (c) I even went back and rebooted and checked the BIOS setting - disabled. Nonetheless: # ethtool eth0 | grep Wake Supports Wake-on: g Wake-on: g Is this expected behavior? pgph055mRDfkl.pgp Description: PGP signature
Re: Does tc-prio really work as advertised?
On Tue, Nov 27, 2007 at 01:28:43AM -0800, Joerg Pommnitz wrote: Jarek, iptables chains (this is what I think you are referring to) are not the issue. Yes, but this could (wrongly) look like this according to my 1-st message. This is about the qdisc that sits immediately over the device driver and decides the order waiting packets are sent over the line/air/carrier pigeon/... . My suspicion is that skb-priority used to be set to a value that derived from the TOS bits. Then something changed and nobody noticed. I'm not sure of your problem: did you try this on a box which gets packets with TOS set earlier, does forwarding, and uses this prio on egress? If so, and this doesn't work, then you are right something could be wrong. Regards, Jarek P. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: bonding sysfs output
Andrew Morton [EMAIL PROTECTED] writes: On Mon, 26 Nov 2007 09:29:40 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote: Trailing NULs are present in each file under /sys/class/net/*/bonding and also in /sys/class/net/bonding_masters. That is, in every file provided by drivers/net/bonding/bond_sysfs.c. Most of the patch is concerned with this. Closely related is the presence of trailing spaces in multivalue files. There are three such files, one of them has the trailing space removed. This patch removes it from the other two. During this it also renames one function argument 'buffer' to 'buf', for consistency. On the policy side: some files are not applicable to some types of bonds, and return a single linefeed in that case. Except for one single case, which returns 'NA\n'. The patch changes these cases into emtpy files. If these are worthy changes, I'm absolutely willing to split up the patch into three parts as the above. Well that would be good if poss, thanks. Will do. Not exactly a simple thing, as the changes collide. But fixing bugs is way more important than niceties of patch presentation however I wasn't prepared to fix the rejects which that patch is hitting in the considerably-changed bonding_show_ad_partner_mac(). Please: Yes, the patch was against 2.6.23.8. Forgot to mention. :( - raise patches against the latest Linus tree (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/) I thought it was better to change to git. Isn't it so? SubmittingPatches has nothing to say about that... Can I find collected best practices somewhere? Which tree, which branch, how/when to rebase, format-patch, etc... (If given no further instructions, I'll try my best and you can reflect on the result. I mean, the above questions are not blocking me, feel free not to answer.) - cc netdev@vger.kernel.org on networking-related matters - Include a Signed-off-by: as per Documentation/SubmittingPatches - Try to ensure that the full explanation (such as you have above) is covered in the changelog text. Ok. -- Thanks for your time, Feri. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: bonding sysfs output
On Tue, 27 Nov 2007 10:56:43 +0100 Ferenc Wagner [EMAIL PROTECTED] wrote: - raise patches against the latest Linus tree (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/) I thought it was better to change to git. Isn't it so? Yes, git is a bit more uptodate than the snapshots. But if that matters you were very unlucky. SubmittingPatches has nothing to say about that... Can I find collected best practices somewhere? Which tree, which branch, how/when to rebase, format-patch, etc... gosh. Documentation/Submit*, http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt, http://linux.yyz.us/patch-format.html, other places. Probably people have written books about it by now. But don't sweat it - you're close enough ;) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][UNIX] EOF on non-blocking SOCK_SEQPACKET
Hi, I am not absolutely sure whether this actually is a bug (as in: I've got no clue what the standards say or what other implementations do), but at least I was pretty surprised when I noticed that a recv() on a non-blocking unix domain socket of type SOCK_SEQPACKET (which is connection oriented, after all) where the remote end has closed the connection returned -1 (EAGAIN) rather than 0 to indicate end of file. This is a test case: | #include sys/types.h | #include unistd.h | #include sys/socket.h | #include sys/un.h | #include fcntl.h | #include string.h | #include stdlib.h | | int main(){ | int sock; | struct sockaddr_un addr; | char buf[4096]; | int pfds[2]; | | pipe(pfds); | sock=socket(PF_UNIX,SOCK_SEQPACKET,0); | addr.sun_family=AF_UNIX; | strcpy(addr.sun_path,/tmp/foobar_testsock); | bind(sock,(struct sockaddr *)addr,sizeof(addr)); | listen(sock,1); | if(fork()){ | close(sock); | sock=socket(PF_UNIX,SOCK_SEQPACKET,0); | connect(sock,(struct sockaddr *)addr,sizeof(addr)); | fcntl(sock,F_SETFL,fcntl(sock,F_GETFL)|O_NONBLOCK); | close(pfds[1]); | read(pfds[0],buf,sizeof(buf)); | recv(sock,buf,sizeof(buf),0); // -- this one | }else accept(sock,NULL,NULL); | exit(0); | } If you try it, make sure /tmp/foobar_testsock doesn't exist. The marked recv() returns -1 (EAGAIN) on 2.6.23.9. Below you find a patch that fixes that. Florian --- Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED] diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a05c342..fa0aec5 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1632,8 +1632,13 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock, mutex_lock(u-readlock); skb = skb_recv_datagram(sk, flags, noblock, err); - if (!skb) + if (!skb) { + unix_state_lock(sk); + if (sk-sk_type==SOCK_SEQPACKET err==-EAGAIN (sk-sk_shutdownRCV_SHUTDOWN)) + err=0; /* signal EOF on disconnected non-blocking SEQPACKET socket */ + unix_state_unlock(sk); goto out_unlock; + } wake_up_interruptible(u-peer_wait); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 03:26:52PM +1100, Rusty Russell wrote: On Monday 26 November 2007 16:58:08 Roland Dreier wrote: I agree that we shouldn't make things too hard for out-of-tree modules, but I disagree with your first statement: there clearly is a large class of symbols that are used by multiple modules but which are not generically useful -- they are only useful by a certain small class of modules. If it is so clear, you should be able to easily provide examples? Sure -- Andi's example of symbols required only by TCP congestion modules; Exactly. Why exactly should someone not write a new TCP congestion module? Agreed the congestion modules are a corner case. I even mentioned that in the patch. I would be happy to drop that one if that is the consensus. It was more done as a example anyways. That is why I made it an separate namespace from tcp But for many other TCP symbols it makes a lot of sense: all the functions only used by tcp_ipv6.c. If someone wants to write support for a IPv7 or similar they really should do it in tree. So I think the tcp and inet namespaces make a lot of sense. Right. So presumably there will only ever be two drivers using this core code, so no new users will ever be written? Now we've found one use case, is If there are new users they will need to get proper review and should be in tree. MODULE_ALLOW() enforces that. it worth the complexity of namespaces? Is it worth the halfway point of What complexity? You're always claiming it is complex. It isn't really. export-to-module? No, we've seen the solution and various people applying it. I'm still trying to discover the problem it's solving. Again for rusty @) Goals are: - Limit the interfaces available for out of tree modules to reasonably stable ones that are already used by a larger set of drivers. This can also have further downstream advantages. For example it might be a reasonable future rule to require all unconditionally EXPORT_SYMBOL()s to have a complete LinuxDoc documentation entry. - Explicitely declare in source what is clearly internal and not intended to be a generally usable interface. e.g. for the LinuxDoc example above such internal functions don't necessarily need full LinuxDoc documentation. - Force review from core maintainers for use of such internal interfaces - Limit size of exported API to make stable ABIs for enterprise distributions easier [Yes I know that is not a popular topic on l-k, but it's a day-to-day problem for these distros and out of tree solutions do not work] -Andi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
AW: Does tc-prio really work as advertised?
Jarek, this is all about outgoing packets, e.g. egress to use your word. It doesn't matter whether the packets are originated locally or whether the packets are forwarded from another host (I tried both). To restate the problem: according to my observations the prio qdisc (and probably pfifo_fast, but I couldn't observe this) does not prioritize at all and always uses the band indicated by the first entry in the priomap. By default the priomap looks like this: qdisc prio 1: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through the different bands according to the TOS value of the packets. My observation is, that the traffic always uses the band pointed to by the first entry in the priomap. This value is 1 by default, so all traffic goes through band 1:2. Now it's entirely possible that I did something stupid, but nobody came forward to show me the error of my ways (neither here nor on the lartc list). -- Regards Joerg - Ursprüngliche Mail Von: Jarek Poplawski [EMAIL PROTECTED] An: Joerg Pommnitz [EMAIL PROTECTED] CC: netdev@vger.kernel.org Gesendet: Dienstag, den 27. November 2007, 10:58:38 Uhr Betreff: Re: Does tc-prio really work as advertised? On Tue, Nov 27, 2007 at 01:28:43AM -0800, Joerg Pommnitz wrote: Jarek, iptables chains (this is what I think you are referring to) are not the issue. Yes, but this could (wrongly) look like this according to my 1-st message. This is about the qdisc that sits immediately over the device driver and decides the order waiting packets are sent over the line/air/carrier pigeon/... . My suspicion is that skb-priority used to be set to a value that derived from the TOS bits. Then something changed and nobody noticed. I'm not sure of your problem: did you try this on a box which gets packets with TOS set earlier, does forwarding, and uses this prio on egress? If so, and this doesn't work, then you are right something could be wrong. Regards, Jarek P. __ Ihr erstes Baby? Holen Sie sich Tipps von anderen Eltern. www.yahoo.de/clever - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module
Hi Patrick, I appreciate your comments. I will update and resend patches. +typedef long long gapclock_t; It seems you only add to this, does it need to be signed? How about using a fixed size type (u64) and getting rid of the typedef? OK. I will use u64 instead of gapclock_t, and remove the typedef. +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; +}; What unit is rate measured in? 'rate' is the transmission rate in bytes per sec. +struct tc_psp_xstats +{ + __u32 bytes; /* gap packet statistics */ + __u32 packets; +}; How about using gnet_stats_basic for this? OK. I will use gnet_stats_basic. + if (!skb) + return NULL; + + memset(skb-data, 0xff, size); memsetting the header portion seems unnecessary since you overwrite it again directly below. The size of a gap packet is variable, so it fills a whole gap packet with 0xff, where 'size' indicates the interface MTU size. + skb_put(skb, size); This is usually done before putting data in the packet. Therefore, skb_put() is needed. +static int recalc_gapsize(struct sk_buff *skb, struct Qdisc *sch) +{ + struct psp_sched_data *q = qdisc_priv(sch); + struct psp_class *cl; + unsigned int len = skb-len; + int gapsize = 0; + int err; + + cl = psp_classify(skb, sch, err); + switch (cl-mode) { + case MODE_STATIC: + gapsize = cl-gapsize; + if (len q-mtu) /* workaround for overflow */ + gapsize = (int)((long long)gapsize * len / q-mtu); No need to cast to the LHS type. Shouldn't this also be rounded up like the other gapsize calculation? How about this? u64 gapsize = 0; : case MODE_STATIC: gapsize = (u64)cl-gapsize * len / q-mtu; gapsize = min_t(u64, gapsize, UINT_MAX); break; } +static struct psp_class *lookup_next_class(struct Qdisc *sch, int *gapsize) +{ + struct psp_sched_data *q = qdisc_priv(sch); + struct psp_class *cl, *next = NULL; + gapclock_t diff; + int shortest; + + /* pacing class */ + shortest = q-mtu; + list_for_each_entry(cl, q-pacing_list, plist) { + diff = cl-clock - q-clock; + if (diff 0) { + if ((gapclock_t)shortest diff) + shortest = (int)diff; Is diff guaranteed not to exceed an int? No. But actually, 'diff' is much smaller than INT_MAX. How about this? u64 diff, shortest; : if (q-clock cl-clock) { diff = cl-clock - q-clock; if (shortest diff) shortest = diff; continue; } +static void psp_destroy(struct Qdisc *sch) +{ + struct psp_sched_data *q = qdisc_priv(sch); + + tcf_destroy_chain(q-filter_list); + + while (!list_empty(q-root)) + psp_destroy_class(sch, list_entry(q-root.next, + struct psp_class, sibling)); list_for_each_entry_safe. I think it works well. Should I need to use list_for_each_entry_safe? Best regards, Ryousei Takano - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage
[snip] Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify -lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman [EMAIL PROTECTED] Thanks, Eric. Much better ('find /proc' works and so does 'ls ..'), but one issue is still unsolved :( I mentioned the program, that opens the directory and dumps the content of the /proc/self/fd. Here it is (stupid but simple): prog.c #include stdio.h #include unistd.h #include stdlib.h #include asm/fcntl.h #include unistd.h int main(int argc, char **argv) { int fd; fd = open(argv[1], O_RDONLY|O_DIRECTORY); if (fd == -1) { perror(Can't open); return 1; } system(ls -l /proc/self/fd); return 0; } So. Here's the result of running this program: # cd /proc/net/ # pwd /proc/net # ~/a.out . total 0 lrwx-- 1 root root 64 Nov 27 13:27 0 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:27 1 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:27 2 - /dev/pts/0 lr-x-- 1 root root 64 Nov 27 13:27 3 - /proc/net (deleted) lr-x-- 1 root root 64 Nov 27 13:27 4 - /proc/4475/fd # cd /proc # pwd /proc # ~/a.out net total 0 lrwx-- 1 root root 64 Nov 27 13:27 0 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:27 1 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:27 2 - /dev/pts/0 lr-x-- 1 root root 64 Nov 27 13:27 3 - /proc/net lr-x-- 1 root root 64 Nov 27 13:27 4 - /proc/4477/fd # cd /proc/net/stat # pwd /proc/net/stat # ~/a.out .. total 0 lrwx-- 1 root root 64 Nov 27 13:29 0 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:29 1 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:29 2 - /dev/pts/0 lr-x-- 1 root root 64 Nov 27 13:29 3 - /proc/net (deleted) lr-x-- 1 root root 64 Nov 27 13:29 4 - /proc/4482/fd # ~/a.out . total 0 lrwx-- 1 root root 64 Nov 27 13:32 0 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:32 1 - /dev/pts/0 lrwx-- 1 root root 64 Nov 27 13:32 2 - /dev/pts/0 lr-x-- 1 root root 64 Nov 27 13:32 3 - /proc/net/stat lr-x-- 1 root root 64 Nov 27 13:32 4 - /proc/4488/fd Bad thing is that . when cdir is /proc/net and .. when cdir is anything under /proc/net (i.e. the /proc/net itself) is marked as (deleted). [snip] Thanks, Pavel - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property
On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote: On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote: fixed-link says: register new Fixed/emulated PHY, i.e. PHY that not connected to the real MDIO bus. Signed-off-by: Vitaly Bordug [EMAIL PROTECTED] Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- Documentation/powerpc/booting-without-of.txt |3 + arch/powerpc/sysdev/fsl_soc.c| 56 ++ 2 files changed, 42 insertions(+), 17 deletions(-) diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index e9a3cb1..cf25070 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1254,6 +1254,9 @@ platforms are moved over to use the flattened-device-tree model. services interrupts for this device. - phy-handle : The phandle for the PHY connected to this ethernet controller. +- fixed-link : a b c where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000. Good work! May I suggest adding a d to a b c where d is flow control - 0 no, 1 yes Well, I see no reference of the flow neither in the include/linux/mii.h nor in the drivers/net/phy/*. :-/ Thus today there is no such register bit we can emulate?.. flow control or not just popped up here today so I got a use for it. ..and I looked into few drivers (gianfar, ucc), and they seem to use hard-coded flow control values, thus they don't speak to the PHYs about that. Can you give any reference to the driver that needs that parameter? Does it use PHY subsystem to obtain flow-control on/off information? Thanks, -- Anton Vorontsov email: [EMAIL PROTECTED] backup email: [EMAIL PROTECTED] irc://irc.freenode.net/bd2 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Does tc-prio really work as advertised?
On Tue, Nov 27, 2007 at 02:54:10AM -0800, Joerg Pommnitz wrote: Jarek, this is all about outgoing packets, e.g. egress to use your word. It doesn't matter whether the packets are originated locally or whether the packets are forwarded from another host (I tried both). To restate the problem: according to my observations the prio qdisc (and probably pfifo_fast, but I couldn't observe this) does not prioritize at all and always uses the band indicated by the first entry in the priomap. By default the priomap looks like this: qdisc prio 1: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through the different bands according to the TOS value of the packets. My observation is, that the traffic always uses the band pointed to by the first entry in the priomap. This value is 1 by default, so all traffic goes through band 1:2. Now it's entirely possible that I did something stupid, but nobody came forward to show me the error of my ways (neither here nor on the lartc list). I don't think there could be anything stupid if something is maybe not enough documented. But, this really should work - just like you've found: TOS should be recalculated to skb-priority, and this should affect prio. You should only consider that this recalculation isn't done for all packets, but only forwarded ones (if I can remember, didn't miss something, and nothing changed later...). So, are you still sure you've tested such a case? (Btw., there are some other tools which can change this priority field, so I hope you don't use them too.) Jarek P. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module
TAKANO Ryousei wrote: Hi Patrick, +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; +}; What unit is rate measured in? 'rate' is the transmission rate in bytes per sec. So wouldn't it make sense to use u64 then for 10GBit networks? + skb_put(skb, size); This is usually done before putting data in the packet. Therefore, skb_put() is needed. I meant this is usually done before writing to the packet data, so you should move it up a few lines. +static int recalc_gapsize(struct sk_buff *skb, struct Qdisc *sch) +{ + struct psp_sched_data *q = qdisc_priv(sch); + struct psp_class *cl; + unsigned int len = skb-len; + int gapsize = 0; + int err; + + cl = psp_classify(skb, sch, err); + switch (cl-mode) { + case MODE_STATIC: + gapsize = cl-gapsize; + if (len q-mtu) /* workaround for overflow */ + gapsize = (int)((long long)gapsize * len / q-mtu); No need to cast to the LHS type. Shouldn't this also be rounded up like the other gapsize calculation? How about this? u64 gapsize = 0; : case MODE_STATIC: gapsize = (u64)cl-gapsize * len / q-mtu; gapsize = min_t(u64, gapsize, UINT_MAX); Really a minimum of UINT_MAX? This will probably also break on 32 bit unless you use do_div(). break; } +static struct psp_class *lookup_next_class(struct Qdisc *sch, int *gapsize) +{ + struct psp_sched_data *q = qdisc_priv(sch); + struct psp_class *cl, *next = NULL; + gapclock_t diff; + int shortest; + + /* pacing class */ + shortest = q-mtu; + list_for_each_entry(cl, q-pacing_list, plist) { + diff = cl-clock - q-clock; + if (diff 0) { + if ((gapclock_t)shortest diff) + shortest = (int)diff; Is diff guaranteed not to exceed an int? No. But actually, 'diff' is much smaller than INT_MAX. How about this? u64 diff, shortest; : if (q-clock cl-clock) { diff = cl-clock - q-clock; if (shortest diff) shortest = diff; continue; } I don't know, its your qdisc :) I was merely wondering whether you need larger types since there seems to be some inconsisteny. Your old code had a check for diff 0, so it appears that cl-clock could be smaller than q-clock. + while (!list_empty(q-root)) + psp_destroy_class(sch, list_entry(q-root.next, + struct psp_class, sibling)); list_for_each_entry_safe. I think it works well. Should I need to use list_for_each_entry_safe? I don't doubt that it works, but list_for_each_entry_safe is the proper interface for this. One more thing: your qdisc can only be used as root qdisc since it produces packets itself and thereby violates the rule that a qdisc can only hand out packets that were previously enqueued to it. Using it as a leaf qdisc can make the upper qdiscs qlen counter go negative, causing infinite dequeue-loops, so you should make sure that its only possibly to use as root qdisc by checking the parent. It would also be better to do something like netem (enqueue produced packets at the root) to make sure the qlen counter is always accurate. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Does tc-prio really work as advertised?
Jarek Poplawski wrote: On Tue, Nov 27, 2007 at 02:54:10AM -0800, Joerg Pommnitz wrote: Jarek, this is all about outgoing packets, e.g. egress to use your word. It doesn't matter whether the packets are originated locally or whether the packets are forwarded from another host (I tried both). To restate the problem: according to my observations the prio qdisc (and probably pfifo_fast, but I couldn't observe this) does not prioritize at all and always uses the band indicated by the first entry in the priomap. By default the priomap looks like this: qdisc prio 1: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through the different bands according to the TOS value of the packets. My observation is, that the traffic always uses the band pointed to by the first entry in the priomap. This value is 1 by default, so all traffic goes through band 1:2. Now it's entirely possible that I did something stupid, but nobody came forward to show me the error of my ways (neither here nor on the lartc list). I don't think there could be anything stupid if something is maybe not enough documented. But, this really should work - just like you've found: TOS should be recalculated to skb-priority, and this should affect prio. You should only consider that this recalculation isn't done for all packets, but only forwarded ones (if I can remember, didn't miss something, and nothing changed later...). So, are you still sure you've tested such a case? (Btw., there are some other tools which can change this priority field, so I hope you don't use them too.) It works fine here, I'm guessing that Jörg is using an old kernel version that had a bug in prio classification without filters. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage
Pavel Emelyanov [EMAIL PROTECTED] writes: [snip] Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify -lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman [EMAIL PROTECTED] Thanks, Eric. Much better ('find /proc' works and so does 'ls ..'), but one issue is still unsolved :( I mentioned the program, that opens the directory and dumps the content of the /proc/self/fd. Here it is (stupid but simple): The cause is totally different this time, and is not limited to /proc/net. But this is the only side effect of the changes now. I used the one line version of your test program to confirm this: $ cd /proc/driver/ $ ls -l /proc/self/fd/100 100 . lr-x-- 1 eric eric 64 Nov 27 05:06 /proc/self/fd/100 - /proc/driver/ $ ls -l /proc/self/fd/100 100 . lr-x-- 1 eric eric 64 Nov 27 05:07 /proc/self/fd/100 - /proc/driver (deleted) What is happening is that the aggressive non-caching logic is dropping the dentries and thus they show up as deleted. I.e. When something triggers another lookup of that dentry revalidate drops the old dentry. This actually fixes a race in /proc today where if you open a file, remove the module for it, reload the module for that file, and then attempt to access it, lookup will return the old dentry with the old fops (even if there is not a current version of that file). kill_proc_inodes current keeps us from using the old version of the file_operations for directories (because it removes them) but it does not prevent this DOS attack where keeping a proc file open always ensures everyone will see the old version. Given that the behavior seems more correct then what we have currently I can live with files showing up as deleted from time to time. Ultimately the fix is to correct the caching logic in fs/proc/generic.c to only drop dentries when necessary and then the deleted markers will only show up when the file or directory really goes away. I don't expect user space will care about the semi-legitimate (deleted) marks. So I don't think this one down side will be a big deal for 2.6.24. Eric - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
AW: Does tc-prio really work as advertised?
It works fine here, I'm guessing that Jörg is using an old kernel version that had a bug in prio classification without filters. This is 2.6.20.21, from 17-Oct-2007. -- Regards Joerg __ Ihre erste Baustelle? Wissenswertes für Bastler und Hobby Handwerker. www.yahoo.de/clever - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: AW: Does tc-prio really work as advertised?
Joerg Pommnitz wrote: It works fine here, I'm guessing that Jörg is using an old kernel version that had a bug in prio classification without filters. This is 2.6.20.21, from 17-Oct-2007. Yes, that version is broken. I think it was fixed in 2.6.22 or 2.6.23. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
AW: Does tc-prio really work as advertised?
So, are you still sure you've tested such a case? Well, the problem that triggered my investigation was that the OLSR daemon (www.olsr.org) calculates the quality of a link according to the packet loss for LQ HELLO packets (UDP broadcast packets). To prevent other traffic from interfering with the LQ calculation, olsrd sends the HELLO packets with a TOS value of 0x10 (minimize delay). This should give them the highest priority. What I saw was a degrading Link quality with more user traffic over a link. The LQ fell so far that olsrd judged the other host unreachable and deleted the routing entry. The user traffic in question was iperf (TOS value 0x00). The OLSR traffic was obviously generated locally (not forwarded). You claim, that the TOS value for locally generated traffic does not influence its priority? Now I THINK that I did my tests both, for forwarded and for local traffic, but I'll redo my tests to make sure. -- Regards and thanks for taking an interest Joerg __ Ihr erstes Baby? Holen Sie sich Tipps von anderen Eltern. www.yahoo.de/clever - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sungem: fix napi regression with reset work
sungem's gem_reset_task() will unconditionally try to disable NAPI even when it's called while the interface is not operating and hence the NAPI struct isn't enabled. Make napi_disable() depend on gp-running. Also removes a superfluous test of gp-running in the same function. Signed-off-by: Johannes Berg [EMAIL PROTECTED] --- Not exactly sure how I got there but I did get stuck in there (sysrq-W) when I plugged in the network cable and booted the computer on the other end of it... while the interface was down. I would've thought that the driver wouldn't even notice the fact that I plugged in the cable while the interface is down, but maybe it does. Or it sets up the device wrongly at resume time when the interface was down at suspend. In any case, this should fix the issue. drivers/net/sungem.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) --- everything.orig/drivers/net/sungem.c2007-11-26 20:48:11.489044815 +0100 +++ everything/drivers/net/sungem.c 2007-11-26 21:01:28.519045736 +0100 @@ -2279,34 +2279,33 @@ static void gem_reset_task(struct work_s { struct gem *gp = container_of(work, struct gem, reset_task); mutex_lock(gp-pm_mutex); - napi_disable(gp-napi); + if (gp-opened) + napi_disable(gp-napi); spin_lock_irq(gp-lock); spin_lock(gp-tx_lock); - if (gp-running == 0) - goto not_running; - if (gp-running) { netif_stop_queue(gp-dev); /* Reset the chip rings */ gem_reinit_chip(gp); if (gp-lstate == link_up) gem_set_link_modes(gp); netif_wake_queue(gp-dev); } - not_running: + gp-reset_task_pending = 0; spin_unlock(gp-tx_lock); spin_unlock_irq(gp-lock); - napi_enable(gp-napi); + if (gp-opened) + napi_enable(gp-napi); mutex_unlock(gp-pm_mutex); } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Does tc-prio really work as advertised?
Great :-(. I went over the ChangeLogs for later kernel versions, but couldn't find anything. I'll try to come up with a working .config for the most current kernel that works on my hardware and test again. Do you have a pointer to the fix so that I could try to patch a 2.6.20 kernel? -- Regards Joerg - Ursprüngliche Mail Von: Patrick McHardy [EMAIL PROTECTED] An: Joerg Pommnitz [EMAIL PROTECTED] CC: Jarek Poplawski [EMAIL PROTECTED]; netdev@vger.kernel.org Gesendet: Dienstag, den 27. November 2007, 13:54:11 Uhr Betreff: Re: AW: Does tc-prio really work as advertised? Joerg Pommnitz wrote: It works fine here, I'm guessing that Jörg is using an old kernel version that had a bug in prio classification without filters. This is 2.6.20.21, from 17-Oct-2007. Yes, that version is broken. I think it was fixed in 2.6.22 or 2.6.23. Jetzt Mails schnell in einem Vorschaufenster überfliegen. Dies und viel mehr bietet das neue Yahoo! Mail - www.yahoo.de/mail - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 2/3] Interface group: core (netlink) part
Laszlo Attila Toth wrote: Interface groups let handle different interfaces together. Modified net device structure and netlink interface. Looks good. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 3/3] Netfilter Interface group match
Laszlo Attila Toth wrote: Interface group values can be checked on both input and output interfaces. Needs a minor update to comply with the naming scheme Jan introduced, but I can take care of that once the other patches are merged. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Fix inet_diag.ko register vs rcv race
The following race is possible when one cpu unregisters the handler while other one is trying to receive a message and call this one: CPU1: CPU2: inet_diag_rcv() inet_diag_unregister() mutex_lock(inet_diag_mutex); netlink_rcv_skb(skb, inet_diag_rcv_msg); if (inet_diag_table[nlh-nlmsg_type] == NULL) /* false handler is still registered */ ... netlink_dump_start(idiagnl, skb, nlh, inet_diag_dump, NULL); cb = kzalloc(sizeof(*cb), GFP_KERNEL); /* sleep here freeing memory * or preempt * or sleep later on nlk-cb_mutex */ spin_lock(inet_diag_register_lock); inet_diag_table[type] = NULL; ... spin_unlock(inet_diag_register_lock); synchronize_rcu(); /* CPU1 is sleeping - RCU quiescent * state is passed */ return; /* inet_diag_dump is finally called: */ inet_diag_dump() handler = inet_diag_table[cb-nlh-nlmsg_type]; BUG_ON(handler == NULL); /* OOPS! While we slept the unregister has set * handler to NULL :( */ Grep showed, that the register/unregister functions are called from init/fini module callbacks for tcp_/dccp_diag, so it's OK to use the inet_diag_mutex to synchronize manipulations with the inet_diag_table and the access to it. Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index b017073..5fe32d5 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -853,8 +853,6 @@ static void inet_diag_rcv(struct sk_buff *skb) mutex_unlock(inet_diag_mutex); } -static DEFINE_SPINLOCK(inet_diag_register_lock); - int inet_diag_register(const struct inet_diag_handler *h) { const __u16 type = h-idiag_type; @@ -863,13 +861,13 @@ int inet_diag_register(const struct inet_diag_handler *h) if (type = INET_DIAG_GETSOCK_MAX) goto out; - spin_lock(inet_diag_register_lock); + mutex_lock(inet_diag_mutex); err = -EEXIST; if (inet_diag_table[type] == NULL) { inet_diag_table[type] = h; err = 0; } - spin_unlock(inet_diag_register_lock); + mutex_unlock(inet_diag_mutex); out: return err; } @@ -882,11 +880,9 @@ void inet_diag_unregister(const struct inet_diag_handler *h) if (type = INET_DIAG_GETSOCK_MAX) return; - spin_lock(inet_diag_register_lock); + mutex_lock(inet_diag_mutex); inet_diag_table[type] = NULL; - spin_unlock(inet_diag_register_lock); - - synchronize_rcu(); + mutex_unlock(inet_diag_mutex); } EXPORT_SYMBOL_GPL(inet_diag_unregister); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property
On Tue, 2007-11-27 at 14:39 +0300, Anton Vorontsov wrote: On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote: On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote: fixed-link says: register new Fixed/emulated PHY, i.e. PHY that not connected to the real MDIO bus. Signed-off-by: Vitaly Bordug [EMAIL PROTECTED] Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- Documentation/powerpc/booting-without-of.txt |3 + arch/powerpc/sysdev/fsl_soc.c| 56 ++ 2 files changed, 42 insertions(+), 17 deletions(-) diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index e9a3cb1..cf25070 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1254,6 +1254,9 @@ platforms are moved over to use the flattened-device-tree model. services interrupts for this device. - phy-handle : The phandle for the PHY connected to this ethernet controller. +- fixed-link : a b c where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000. Good work! May I suggest adding a d to a b c where d is flow control - 0 no, 1 yes Well, I see no reference of the flow neither in the include/linux/mii.h nor in the drivers/net/phy/*. :-/ Thus today there is no such register bit we can emulate?.. Well, as good as I can recall, flow control(pause) is something that the PHY negotiates, just like FDX/HDX and should be dealt with in the adjust_link callback but not many do currently. If you seach for pause in phy_device.c you will find it. flow control or not just popped up here today so I got a use for it. ..and I looked into few drivers (gianfar, ucc), and they seem to use hard-coded flow control values, thus they don't speak to the PHYs about that. Yes, but they should. Can you give any reference to the driver that needs that parameter? Does it use PHY subsystem to obtain flow-control on/off information? Thanks, - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] bridging: don't forward EAPOL frames
+ if (unlikely(skb-protocol = htons(ETH_P_PAE))) + goto drop; + switch (p-state) { case BR_STATE_FORWARDING: Not needed because the bridge is already handling it: 1) If running STP (ie true bridge), then all link local multicast is only received by the bridge and never forwarded. Well, typical access point setups bridge the wireless AP interface with wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and we want to avoid having them unicast to another host. Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather than depending on the link-local multicast address because otherwise eapol frames can be unicast into the network behind the (authorized) port which is undesirable. johannes signature.asc Description: This is a digitally signed message part
Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property
On Tue, Nov 27, 2007 at 02:17:11PM +0100, Joakim Tjernlund wrote: On Tue, 2007-11-27 at 14:39 +0300, Anton Vorontsov wrote: On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote: On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote: fixed-link says: register new Fixed/emulated PHY, i.e. PHY that not connected to the real MDIO bus. Signed-off-by: Vitaly Bordug [EMAIL PROTECTED] Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- Documentation/powerpc/booting-without-of.txt |3 + arch/powerpc/sysdev/fsl_soc.c| 56 ++ 2 files changed, 42 insertions(+), 17 deletions(-) diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index e9a3cb1..cf25070 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1254,6 +1254,9 @@ platforms are moved over to use the flattened-device-tree model. services interrupts for this device. - phy-handle : The phandle for the PHY connected to this ethernet controller. +- fixed-link : a b c where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000. Good work! May I suggest adding a d to a b c where d is flow control - 0 no, 1 yes Well, I see no reference of the flow neither in the include/linux/mii.h nor in the drivers/net/phy/*. :-/ Thus today there is no such register bit we can emulate?.. Well, as good as I can recall, flow control(pause) is something that the PHY negotiates, just like FDX/HDX and should be dealt with in the adjust_link callback but not many do currently. If you seach for pause in phy_device.c you will find it. Ah, pause. Sure, we can emulate that. -- Anton Vorontsov email: [EMAIL PROTECTED] backup email: [EMAIL PROTECTED] irc://irc.freenode.net/bd2 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
Andi Kleen [EMAIL PROTECTED] wrote: On Tue, Nov 27, 2007 at 03:26:52PM +1100, Rusty Russell wrote: On Monday 26 November 2007 16:58:08 Roland Dreier wrote: I agree that we shouldn't make things too hard for out-of-tree modules, but I disagree with your first statement: there clearly is a large class of symbols that are used by multiple modules but which are not generically useful -- they are only useful by a certain small class of modules. If it is so clear, you should be able to easily provide examples? Sure -- Andi's example of symbols required only by TCP congestion modules; Exactly. Why exactly should someone not write a new TCP congestion module? Agreed the congestion modules are a corner case. I even mentioned that in the patch. I would be happy to drop that one if that is the consensus. It was more done as a example anyways. That is why I made it an separate namespace from tcp But for many other TCP symbols it makes a lot of sense: all the functions only used by tcp_ipv6.c. If someone wants to write support for a IPv7 or similar they really should do it in tree. So I think the tcp and inet namespaces make a lot of sense. OK, short of making IPv4 a module (which would be a worthy task :) do you have an example where a symbol is used by more than one module but needs to be put into a namespace? Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][BRIDGE] Lost call to br_fdb_fini() in br_init() error path
In case the br_netfilter_init() (or any subsequent call) fails, the br_fdb_fini() must be called to free the allocated in br_fdb_init() br_fdb_cache kmem cache. Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/bridge/br.c b/net/bridge/br.c index 93867bb..a901828 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -39,7 +39,7 @@ static int __init br_init(void) err = br_fdb_init(); if (err) - goto err_out1; + goto err_out; err = br_netfilter_init(); if (err) @@ -65,6 +65,8 @@ err_out3: err_out2: br_netfilter_fini(); err_out1: + br_fdb_fini(); +err_out: llc_sap_put(br_stp_sap); return err; } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
In message [EMAIL PROTECTED],Rober t P. J. Day writes: in any event, i just thought i'd point it out. if you're absolutely sure there will never be another call to setup_dev() from somewhere else, then, yes, it's safe. I understood your opinions. and partially agree with you. But isn't it a unfounded fear? i don't know, i just thought i'd mention it. if no one thinks it's an issue, it's certainly fine with me. its very unlikely that setup_dev() is likely to be called from another code path. this patch looks fine to me. i will take it and get it submitted on the next merge. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 03:12:42PM +0100, Andi Kleen wrote: For Networking: e.g. symbols i put into inet, which are only used by protocols (sctp, dccp, udplite, ipv6) Wait, that's exactly Rusty's point (I think :) These symbols are exported because they're needed by protocols. If they weren't available to everyone then it would be difficult to start writing new protocols. I already caught someone doing something wrong with that BTW -- wanrouter clearly does some things it shouldn't be doing. Can you be more precise? Or the fib namespace, where all the fib functions should be only used by the two fib_* modules and ipv6/decnet. Again, if it's used by decnet then it sounds like it should be exported because new protocol families may need them. So based on the network code at least I'm kind of starting to agree with Rusty now: if a symbol is needed by more than one in-tree module chances are we want it to be exported for all. Although I admit I haven't examined your examples elsewhere. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH]: fix lro_gen_skb() alignment
The inet_lro.c:lro_gen_skb() function fails to include NET_IP_ALIGN padding at the front of the sk_buffs it creates, leading to alignment warnings on architectures which require strict alignment (seen on sparc64). The attached patch adds NET_IP_ALIGN padding. Signed off by: Andrew Gallatin [EMAIL PROTECTED] Drew diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c index ac3b1d3..91e9371 100644 --- a/net/ipv4/inet_lro.c +++ b/net/ipv4/inet_lro.c @@ -401,10 +401,11 @@ static struct sk_buff *lro_gen_skb(struc int data_len = len; int hdr_len = min(len, hlen); - skb = netdev_alloc_skb(lro_mgr-dev, hlen); + skb = netdev_alloc_skb(lro_mgr-dev, hlen + NET_IP_ALIGN); if (!skb) return NULL; + skb_reserve(skb, NET_IP_ALIGN); skb-len = len; skb-data_len = len - hdr_len; skb-truesize += true_size;
Re: [Bridge] Re: [RFC] bridging: don't forward EAPOL frames
On Nov 27, 2007 8:24 AM, Johannes Berg [EMAIL PROTECTED] wrote: + if (unlikely(skb-protocol = htons(ETH_P_PAE))) + goto drop; + switch (p-state) { case BR_STATE_FORWARDING: Not needed because the bridge is already handling it: 1) If running STP (ie true bridge), then all link local multicast is only received by the bridge and never forwarded. Well, typical access point setups bridge the wireless AP interface with wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and we want to avoid having them unicast to another host. Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather than depending on the link-local multicast address because otherwise eapol frames can be unicast into the network behind the (authorized) port which is undesirable. johannes I agree with Stephen, that based on the way it's likely people use linux bridges it seems like this is something that could be configured rather than simply dropping those frames without any chance to forward them. There are probably quite a few people out there who will not expect this change, so it should be easy to change during runtime. Don't forget that a simple ebtables rule could also drop EAPOL if needed. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bridge] Re: [RFC] bridging: don't forward EAPOL frames
Not needed because the bridge is already handling it: 1) If running STP (ie true bridge), then all link local multicast is only received by the bridge and never forwarded. Well, typical access point setups bridge the wireless AP interface with wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and we want to avoid having them unicast to another host. Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather than depending on the link-local multicast address because otherwise eapol frames can be unicast into the network behind the (authorized) port which is undesirable. I agree with Stephen, that based on the way it's likely people use linux bridges it seems like this is something that could be configured rather than simply dropping those frames without any chance to forward them. Well Stephen is wrong in one thing that eapol need not be link local multicast for 802.11, it's unicast there so the dropping of link local packets doesn't help. There are probably quite a few people out there who will not expect this change, so it should be easy to change during runtime. I'm not aware of any use for EAPOL frames traversing the network. I'm also not aware of any proper 802.1X implementation for linux bridges but I didn't do too much research yet. I don't see why people would rely on EAPOL frames being bridged when the protocol is by definition local to a link. Don't forget that a simple ebtables rule could also drop EAPOL if needed. Indeed, but I'd prefer the bridge to do the right thing in absence of configuration. johannes signature.asc Description: This is a digitally signed message part
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
Rusty said: Well, introduce an EXPORT_SYMBOL_INTERNAL(). It's a lot less code. But you'd still need to show that people are having trouble knowing what APIs to use. Might the recent discussion on the exporting of sys_open() and sys_read() be an example here? There would appear to be a consensus that people should not have used those functions, but they are now proving difficult to unexport. Perhaps the best use of the namespace stuff might be for *future* exports which are needed to help make the mainline kernel fully modular, but which are not meant to be part of a more widely-used API? jon - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 08:43:24AM -0700, Jonathan Corbet wrote: Rusty said: Well, introduce an EXPORT_SYMBOL_INTERNAL(). It's a lot less code. But you'd still need to show that people are having trouble knowing what APIs to use. Might the recent discussion on the exporting of sys_open() and sys_read() be an example here? There would appear to be a consensus that people should not have used those functions, but they are now proving difficult to unexport. That is a good example yes. Perhaps the best use of the namespace stuff might be for *future* exports which are needed to help make the mainline kernel fully modular, but which are not meant to be part of a more widely-used API? Not sure about future only, but yes that is its intention. Thanks for putting it clearly. -Andi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.23-mm1 tg3 wake-on-lan oddity...
[EMAIL PROTECTED] wrote: (a) the Dell factory default is WOL disabled and (b) if it wasn't the default, I'd have *set* it to disabled, and (c) I even went back and rebooted and checked the BIOS setting - disabled. Nonetheless: # ethtool eth0 | grep Wake Supports Wake-on: g Wake-on: g Is this expected behavior? The new tg3 is supposed to follow the WoL setting in the NVRAM, so this is not expected. We'll have to look into this. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][BRIDGE] Properly dereference the br_should_route_hook
This hook is protected with the RCU, so simple if (br_should_route_hook) br_should_route_hook(...) is not enough on some architectures. Use the rcu_dereference/rcu_assign_pointer in this case. Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 3cedd4e..b42b192 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest) struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb) { const unsigned char *dest = eth_hdr(skb)-h_dest; + typeof(br_should_route_hook) rhook; if (!is_valid_ether_addr(eth_hdr(skb)-h_source)) goto drop; @@ -147,9 +148,9 @@ struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb) switch (p-state) { case BR_STATE_FORWARDING: - - if (br_should_route_hook) { - if (br_should_route_hook(skb)) + rhook = rcu_dereference(br_should_route_hook); + if (rhook != NULL) { + if (rhook(skb)) return skb; dest = eth_hdr(skb)-h_dest; } diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c index e44519e..be6f186 100644 --- a/net/bridge/netfilter/ebtable_broute.c +++ b/net/bridge/netfilter/ebtable_broute.c @@ -70,13 +70,13 @@ static int __init ebtable_broute_init(void) if (ret 0) return ret; /* see br_input.c */ - br_should_route_hook = ebt_broute; + rcu_assign_pointer(br_should_route_hook, ebt_broute); return ret; } static void __exit ebtable_broute_fini(void) { - br_should_route_hook = NULL; + rcu_assign_pointer(br_should_route_hook, NULL); synchronize_net(); ebt_unregister_table(broute_table); } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][BRIDGE] Properly dereference the br_should_route_hook
On Tue, 27 Nov 2007 19:12:11 +0300 Pavel Emelyanov [EMAIL PROTECTED] wrote: This hook is protected with the RCU, so simple if (br_should_route_hook) br_should_route_hook(...) is not enough on some architectures. Use the rcu_dereference/rcu_assign_pointer in this case. Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 3cedd4e..b42b192 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest) struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb) { const unsigned char *dest = eth_hdr(skb)-h_dest; + typeof(br_should_route_hook) rhook; Okay, but I don't like the typeof() magic. Resubmit with proper declartion. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][BRIDGE] Lost call to br_fdb_fini() in br_init() error path
On Tue, 27 Nov 2007 17:39:42 +0300 Pavel Emelyanov [EMAIL PROTECTED] wrote: In case the br_netfilter_init() (or any subsequent call) fails, the br_fdb_fini() must be called to free the allocated in br_fdb_init() br_fdb_cache kmem cache. Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/bridge/br.c b/net/bridge/br.c index 93867bb..a901828 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -39,7 +39,7 @@ static int __init br_init(void) err = br_fdb_init(); if (err) - goto err_out1; + goto err_out; err = br_netfilter_init(); if (err) @@ -65,6 +65,8 @@ err_out3: err_out2: br_netfilter_fini(); err_out1: + br_fdb_fini(); +err_out: llc_sap_put(br_stp_sap); return err; } Good catch, thanks I hope you didn't find this in live system. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH (resubmit)][BRIDGE] Properly dereference the br_should_route_hook
This hook is protected with the RCU, so simple if (br_should_route_hook) br_should_route_hook(...) is not enough on some architectures. Use the rcu_dereference/rcu_assign_pointer in this case. Fixed Stephen's comment concerning using the typeof(). Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED] --- diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 3cedd4e..0ee79a7 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest) struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb) { const unsigned char *dest = eth_hdr(skb)-h_dest; + int (*rhook)(struct sk_buff *skb); if (!is_valid_ether_addr(eth_hdr(skb)-h_source)) goto drop; @@ -147,9 +148,9 @@ struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb) switch (p-state) { case BR_STATE_FORWARDING: - - if (br_should_route_hook) { - if (br_should_route_hook(skb)) + rhook = rcu_dereference(br_should_route_hook); + if (rhook != NULL) { + if (rhook(skb)) return skb; dest = eth_hdr(skb)-h_dest; } diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c index e44519e..be6f186 100644 --- a/net/bridge/netfilter/ebtable_broute.c +++ b/net/bridge/netfilter/ebtable_broute.c @@ -70,13 +70,13 @@ static int __init ebtable_broute_init(void) if (ret 0) return ret; /* see br_input.c */ - br_should_route_hook = ebt_broute; + rcu_assign_pointer(br_should_route_hook, ebt_broute); return ret; } static void __exit ebtable_broute_fini(void) { - br_should_route_hook = NULL; + rcu_assign_pointer(br_should_route_hook, NULL); synchronize_net(); ebt_unregister_table(broute_table); } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] SGISEEQ: use cached memory access to make driver work on IP28
On Sat, Nov 24, 2007 at 01:29:19PM +0100, Thomas Bogendoerfer wrote: Following patch is clearly 2.6.25 material and is needed to get SGI IP28 machines supported. Thomas. SGI IP28 machines would need special treatment (enable adding addtional wait states) when accessing memory uncached. To avoid this pain I changed the driver to use only cached access to memory. Signed-off-by: Thomas Bogendoerfer [EMAIL PROTECTED] IP28 is clearly a maximum weirdo beast. Technically the patch looks fine it's just a few stilistic issues such as there no reason for DMA_SYNC_DESC_CPU and DMA_SYNC_DESC_DEV being macros so why not using inlines. Acked-by: Ralf Baechle [EMAIL PROTECTED] Ralf - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 08:43:24AM -0700, Jonathan Corbet wrote: Might the recent discussion on the exporting of sys_open() and sys_read() be an example here? There would appear to be a consensus that people should not have used those functions, but they are now proving difficult to unexport. They're not. The exports aren't used anymore and could just go away any time. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module
One more thing: your qdisc can only be used as root qdisc since it produces packets itself and thereby violates the rule that a qdisc can only hand out packets that were previously enqueued to it. Using it as a leaf qdisc can make the upper qdiscs qlen counter go negative, causing infinite dequeue-loops, so you should make sure that its only possibly to use as root qdisc by checking the parent. It would also be better to do something like netem (enqueue produced packets at the root) to make sure the qlen counter is always accurate. I agree with you. PSPacer should not use with other rate regulation qdiscs. But, I think that a combination of netem and PSPacer is a useful for emulating networks. The following paper describes experimental results using PSPacer with netem: Large Scale Gigabit Emulated Testbed for Grid Transport Evaluation, PFLDnet 2006. http://www.hpcc.jp/pfldnet2006/paper/s1_02.pdf Best regards, Ryousei Takano - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 01/01] ipv6: RFC4214 Support (v2.5)
-Original Message- From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] Sent: Monday, November 26, 2007 10:01 AM To: [EMAIL PROTECTED] Cc: Templin, Fred L; netdev@vger.kernel.org; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PATCH 01/01] ipv6: RFC4214 Support (v2.5) In article [EMAIL PROTECTED] eing.com (at Mon, 26 Nov 2007 09:16:16 -0800), Templin, Fred L [EMAIL PROTECTED] says: From: Fred L. Templin [EMAIL PROTECTED] This patch includes support for the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses the SIT module, and is configured using extensions to the iproute2 utility. The diffs are specific to the Linux 2.6.24-rc2 kernel distribution. This version includes the diff for ./include/linux/if.h which was missing in the v2.4 submission and is needed to make the patch compile. The patch has been installed, compiled and tested in a clean 2.6.24-rc2 kernel build area. Signed-off-by: Fred L. Templin [EMAIL PROTECTED] Acked-by: YOSHIFUJI Hideaki [EMAIL PROTECTED] Note: With linux-2.6: | % patch -p1 /tmp/isatap.patch | patching file include/linux/if.h | patching file include/linux/if_tunnel.h | patching file include/linux/in.h | patching file include/net/addrconf.h | patching file net/ipv6/addrconf.c | patching file net/ipv6/route.c | Hunk #1 succeeded at 1660 (offset -8 lines). | patching file net/ipv6/sit.c Confirmed in a clean linux-2.6.24-rc3 build area: - Installed patch with identical results as above. - Verified that the patch installed correctly (it did). - Compiled, booted and tested. Fred [EMAIL PROTECTED] With net-2.6.24: | % patch -p1 /tmp/isatap.patch | % patch -p1 /tmp/isatap.patch | patching file include/linux/if.h | patching file include/linux/if_tunnel.h | patching file include/linux/in.h | patching file include/net/addrconf.h | patching file net/ipv6/addrconf.c | Hunk #1 succeeded at 378 (offset -1 lines). | Hunk #2 succeeded at 1441 (offset -1 lines). | Hunk #3 succeeded at 1479 (offset -1 lines). | Hunk #4 succeeded at 2210 (offset -1 lines). | patching file net/ipv6/route.c | Hunk #1 succeeded at 1727 (offset 59 lines). | patching file net/ipv6/sit.c --yoshfuji - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote: On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote: ... No. That's the wrong question. What's the real upside? Explicitly documenting what comprises the kernel API (external, supported) and what comprises the kernel implementation (internal, not supported). ... There is not, never was, and never will be, any supported external API of the kernel. cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: wireless vs. alignment requirements
Stephen Hemminger wrote: Herbert Xu wrote: On Sat, Nov 24, 2007 at 02:49:36PM +0100, Johannes Berg wrote: Right. I just didn't think that would be a valid value for an architecture to set. OK. Let me clarify this a bit more. We require at least one of the following rules to be met: * the IPv4/IPv6 header is aligned by 8 bytes on reception; * or the platform provides unaligned exception handlers. So if your platform violates both rules then it won't work with the IP stack, simple as that. Fortunately I don't think such a platform exists currently on Linux. Cheers, Then what about hardware that can't dma ethernet to non-aligned address. Sky2 hardware breaks if DMA is not 8 byte aligned. IMHO the IP stack should handle any alignment, and do the appropriate memove if the CPU requires alignment. I wrote a patch for the IP stack to realign packets if necessary at one point. I should dredge it up again and submit it for collective flamage. -hpa - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: wireless vs. alignment requirements
Herbert Xu wrote: Luckily all sky2 users have been on x86 so far :) Hardly so. My previous employer was MIPS and we used it there (with my patch.) -hpa - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 10:02:22AM +0100, Andi Kleen wrote: ... That is EXPORT_SYMBOL already. The trouble is just that it covers too much. My patchkit is trying to limit it again for a specific use case -- exporting an internal interface to another module. Or rather a set of modules. Standard example is TCP: TCP exports nearly everything and the single user is the TCP code in ipv6.ko. Instead those symbols should be limited to be only accessable to ipv6.ko. ... Let's forget about external modules that are anyway irrelevant for upstream kernel development. Do you have past examples where this would have brought advantages for the upstream kernel justifying all the work required for creating and maintaining these namespaces? IOW, where modules were submitted for upstream inclusion and merging them was impossible or much harder only because they were developed aginst the wrong API? -ANdi cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e1000 driver problems
[moving this discussion to netdev, dropping lkml] Lukas Hejtmanek wrote: On Tue, Nov 27, 2007 at 08:48:52AM -0800, Kok, Auke wrote: unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2 kernel due to compilation errors. but the in-kernel version of e1000 supports the ich8 lan device just fine and can be builtin. also this kernel has the first release of e1000e which supports the ich9 onboard lan device. I'm afraid, I'm missing the point as you have stated that in-kernel drivers have problem with suspicious board hang... my mistake, sorry for that confusion. the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the code for a line that says: adapter-tx_timeout_factor = 8; change that number to 16 to cope with the problem on 10mbit link partners. However this won't fix the issue on 100mbit partners and we need to investigate that if that is the case. Auke - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, 2007-11-27 at 18:15 +0100, Adrian Bunk wrote: On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote: On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote: ... No. That's the wrong question. What's the real upside? Explicitly documenting what comprises the kernel API (external, supported) and what comprises the kernel implementation (internal, not supported). ... There is not, never was, and never will be, any supported external API of the kernel. Philosophically I understand what you're saying, but in practical terms there is the issue of managing core API like kmalloc. Although kmalloc _could_ change, doing so would be extremely painful. In fact anyone who proposed such a change would have to have a profoundly powerful argument as to why it was necessary. I think this patchset is an attempt to make it easier to identify and review these kinds of interfaces. cu Adrian - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e1000 driver problems
On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote: I'm afraid, I'm missing the point as you have stated that in-kernel drivers have problem with suspicious board hang... my mistake, sorry for that confusion. the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the code for a line that says: adapter-tx_timeout_factor = 8; change that number to 16 to cope with the problem on 10mbit link partners. However this won't fix the issue on 100mbit partners and we need to investigate that if that is the case. 100mbit is the issue in my case. -- Lukáš Hejtmánek - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e1000 driver problems
Lukas Hejtmanek wrote: On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote: I'm afraid, I'm missing the point as you have stated that in-kernel drivers have problem with suspicious board hang... my mistake, sorry for that confusion. the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the code for a line that says: adapter-tx_timeout_factor = 8; change that number to 16 to cope with the problem on 10mbit link partners. However this won't fix the issue on 100mbit partners and we need to investigate that if that is the case. 100mbit is the issue in my case. can you see if your problem goes away with this patch? --- e1000: increase tx timeout factor for 100mbit speeds Signed-off-by: Auke Kok [EMAIL PROTECTED] diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index b7c3070..2e46a15 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -2601,7 +2601,7 @@ e1000_watchdog(unsigned long data) case SPEED_100: txb2b = 0; netdev-tx_queue_len = 100; - /* maybe add some timeout factor ? */ + adapter-tx_timeout_factor = 4; break; } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: wireless vs. alignment requirements
On Tue, 27 Nov 2007 09:16:07 -0800 H. Peter Anvin [EMAIL PROTECTED] wrote: Stephen Hemminger wrote: Herbert Xu wrote: On Sat, Nov 24, 2007 at 02:49:36PM +0100, Johannes Berg wrote: Right. I just didn't think that would be a valid value for an architecture to set. OK. Let me clarify this a bit more. We require at least one of the following rules to be met: * the IPv4/IPv6 header is aligned by 8 bytes on reception; * or the platform provides unaligned exception handlers. So if your platform violates both rules then it won't work with the IP stack, simple as that. Fortunately I don't think such a platform exists currently on Linux. Cheers, Then what about hardware that can't dma ethernet to non-aligned address. Sky2 hardware breaks if DMA is not 8 byte aligned. IMHO the IP stack should handle any alignment, and do the appropriate memove if the CPU requires alignment. I wrote a patch for the IP stack to realign packets if necessary at one point. I should dredge it up again and submit it for collective flamage. -hpa Is there any standard kernel config define for this platform can't do unaligned accesses? -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[ANNOUNCE] Open-FCoE - Fibre Channel over Ethernet Project
Hello Linux Community, I'd like to introduce a new Open Source project named Open-FCoE. FCoE is the acronym for Fibre Channel over Ethernet, which is the encapsulation of Fibre Channel frames within Ethernet packets. FCoE will allow systems with an Ethernet adapter and a Fibre Channel Forwarder to login to a Fibre Channel fabric (the FCF is a gateway that bridges the LAN and the SAN). That fabric login was previously reserved exclusively for Fibre Channel HBAs. This technology reduces complexity in the data center by aiding network convergence. It is targeted for 10Gps Ethernet NICs but will work on any Ethernet NIC supporting pause frames. Our code base provides a Fibre Channel protocol processing module as well as an Ethernet based transport module. The Open-FC module acts as a LLD for SCSI and the Open-FCoE transport uses net_device to send and receive packets. We've set up a web home for this project at www.Open-FCoE.org. Its purpose is to support development and users; it's very light on content right now. We have a lot of code and it will take time for us to harden and improve upon it. We're still early in development but are making this code available as soon as possible so that our project's development can truly be open. This is our code's organization as it is today. 1) git://open-fcoe.org/openfc/open-fcoe-upstream.git This repository is based on the SCSI 2.6.24 bug fix tree. Two patches are applied to add our code and resolve any compatibility problems between our code and 2.6.24. 2) git://open-fcoe.org/openfc/open-fcoe.git This repository contains our user space tools. It also contains an out-of-kernel Makefile that allows a user to build the kernel modules within this layout. Pulling our kernel code from open-fcoe-upstream into this code base and then tarring it up provides an easily distributable package. 3) git://open-fcoe.org/openfc/open-fcoe-misc.git This repository contains our SW target code. The SW target allows someone to connect two systems back-to-back and run FCoE traffic from one to another. Currently our target uses SCST which is not upstream. This is causing us a problem because the target shares code with the initiator. Since the initiator is building with 2.6.24 and SCST isn't we're in a bind. A priority of ours will be to get the target back to a better shape as it's our primary development tool. We also have a lot of things to fix. Here are our initial concerns, 1) Kernel/Userspace interaction needs direction. Currently we're using ioctls, which are no longer desirable. We're planning to change the implementation to netlink. Also, some non-data path code will probably need to move to user space. Open-iSCSI has spent a lot of time on its kernel to user interaction; we will likely borrow from that model as much as we can. 2) SW Target and its integration with the kernel. 3) Reduction of code, removal of unnecessary abstraction. 4) Non-GPL code. We're currently using a BSD queue.h and some CRC code from BSD as well. We need to convert this code to use kernel code. 5) Documentation. Our basic documentation needs some help so people can use the initiator and target easily. We have a mailing list established on our website, but believe that the community will want us to discuss this technology on the SCSI mailing list. That is where we plan to start discussing this code. We're looking forward to building a strong community around this technology and code base. We're eager to start coding and improving our code with any other community contributors! Robert Love Open-FCoE.org - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 11:45:37AM -0600, Tom Tucker wrote: On Tue, 2007-11-27 at 18:15 +0100, Adrian Bunk wrote: On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote: On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote: ... No. That's the wrong question. What's the real upside? Explicitly documenting what comprises the kernel API (external, supported) and what comprises the kernel implementation (internal, not supported). ... There is not, never was, and never will be, any supported external API of the kernel. Philosophically I understand what you're saying, but in practical terms there is the issue of managing core API like kmalloc. Although kmalloc _could_ change, doing so would be extremely painful. In fact anyone who proposed such a change would have to have a profoundly powerful argument as to why it was necessary. The latter should at least in theory be required for all changes no matter how core they are... I think this patchset is an attempt to make it easier to identify and review these kinds of interfaces. As long as the submitter fixes all in-kernel users these interfaces are not handled differently from interfaces with fewer users. And I remember at least one commit that changed 1000 files because it changed a frequently used driver API. [1] cu Adrian [1] commit 7d12e780e003f93433d49ce78cfedf4b4c52adc5 -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote: 1) Why is everyone so concerned that export symbol space is large? - does it cost cpu or running memory? - does it cause bugs? - or are you just worried about evil modules? To clarify something here, by evil, don't necessarily think binary only. Out of tree modules are frequently using symbols that they shouldn't be. Because they get no peer-review here, they 'get away with it' for the most part. Until distro vendors push rebased kernel updates that removed exports that should never have been exported, and suddenly people like me get bombed with Fedora broke my xyz driver mails. Reducing the opportunity for people to screw things up is a good thing. If a symbol is exported, most out-of-tree driver authors seem to think its fair game to use it. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
net_rx_action/NAPI oops [PATCH]
Hello! I've discovered a bug while testing the new multiQ NAPI code. In hi-load situations when we take down an interface we get a kernel panic. The oops is below. From what I see this happens when driver does napi_disable() and clears NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight a sort indirect test but that's now not enough to cover the load situation. where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still full quota. Latest git but I'll guess the is the same in all later kernels. There might be different solutions... one variant is below: Signed-off-by: Robert Olsson [EMAIL PROTECTED] diff --git a/net/core/dev.c b/net/core/dev.c index 043e2f8..1031233 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2207,7 +2207,7 @@ static void net_rx_action(struct softirq_action *h) * still owns the NAPI instance and therefore can * move the instance around on the list at-will. */ - if (unlikely(work == weight)) + if (unlikely(work == weight) (test_bit(NAPI_STATE_SCHED, n-state))) list_move_tail(n-poll_list, list); netpoll_poll_unlock(have); Cheers --ro labb:/# ifconfig eth0 down BUG: unable to handle kernel paging request at virtual address 00100104 printing eip: c0433d67 *pde = Oops: 0002 [#1] SMP Modules linked in: Pid: 4, comm: ksoftirqd/0 Not tainted (2.6.24-rc3bifrost-gb3664d45-dirty #32) EIP: 0060:[c0433d67] EFLAGS: 00010046 CPU: 0 EIP is at net_rx_action+0x107/0x120 EAX: 00100100 EBX: f757d4e0 ECX: c200d334 EDX: 00200200 ESI: 0040 EDI: c200d334 EBP: 00ec ESP: f7c6bf78 DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 Process ksoftirqd/0 (pid: 4, ti=f7c6a000 task=f7c58ab0 task.ti=f7c6a000) Stack: c0236217 c200ce9c c200ce9c fffcf892 0040 0005 c05b2a98 c0603e60 0008 c022a275 c06066c0 c06066c0 0246 c022a5e0 c022a327 c06066c0 c022a636 fffc c02384f2 Call Trace: [c0236217] __rcu_process_callbacks+0x107/0x190 [c022a275] __do_softirq+0x75/0xf0 [c022a5e0] ksoftirqd+0x0/0xd0 [c022a327] do_softirq+0x37/0x40 [c022a636] ksoftirqd+0x56/0xd0 [c02384f2] kthread+0x42/0x70 [c02384b0] kthread+0x0/0x70 [c02039df] kernel_thread_helper+0x7/0x18 === Code: 88 8c 52 c0 e8 4b 1d df ff e8 96 0c dd ff c7 05 64 7d 63 c0 01 00 00 00 e9 61 ff ff ff 8d b4 26 00 00 00 00 8b 03 8b 53 04 89 f9 89 50 04 89 02 89 d8 8b 57 04 e8 5a 34 eb ff e9 4a ff ff ff 90 EIP: [c0433d67] net_rx_action+0x107/0x120 SS:ESP 0068:f7c6bf78 Kernel panic - not syncing: Fatal exception in interrupt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.23-mm1 tg3 wake-on-lan oddity...
On Tue, 27 Nov 2007 08:04:28 PST, Michael Chan said: [EMAIL PROTECTED] wrote: (a) the Dell factory default is WOL disabled and (b) if it wasn't the default, I'd have *set* it to disabled, and (c) I even went back and rebooted and checked the BIOS setting - disabled. Nonetheless: # ethtool eth0 | grep Wake Supports Wake-on: g Wake-on: g Is this expected behavior? The new tg3 is supposed to follow the WoL setting in the NVRAM, so this is not expected. We'll have to look into this. Any info that would help? printk's to stick in tg3.c? Dumping the relevant bytes of NVRAM? etc? pgpwUS9hBRdWf.pgp Description: PGP signature
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, 27 Nov 2007 15:12:42 +0100, Andi Kleen said: OK, short of making IPv4 a module (which would be a worthy task :) At some point there were patches, it is probably not very difficult. But DaveM resisted at some point because he didn't want people to replace the network stack (although I personally don't have a problem with that) Personally, I wouldn't find replacing the ipv4 stack very interesting. However, stress-testing your system for IPv6-readiness by doing 'rmmod ipv4' :) (Though I admit it's something I'd *try* if it was available, but certainly not sufficient for a reason to do it...) pgpAMJ4wTClYk.pgp Description: PGP signature
Re: 2.6.23-mm1 tg3 wake-on-lan oddity...
On Tue, 2007-11-27 at 01:35 -0800, [EMAIL PROTECTED] wrote: Issue: I (for unrelated reasons) run powertop, and it suggests I conserve power by doing 'ethtool -s eth0 wol d'. I look at it, and think that it's daft, because (a) the Dell factory default is WOL disabled and (b) if it wasn't the default, I'd have *set* it to disabled, and (c) I even went back and rebooted and checked the BIOS setting - disabled. Nonetheless: # ethtool eth0 | grep Wake Supports Wake-on: g Wake-on: g Is this expected behavior? What's happening is that there are 2 WoL settings: one in the BIOS and one in the NIC's NVRAM. For WoL to work, I think both settings have to be enabled. Apparently in this case, when you turn off WoL in the BIOS, the NVRAM's WoL setting is unchanged, and will be seen by tg3 as enabled. Ideally, the BIOS should modify the NVRAM's setting when it is changed. We will talk to Dell to get their opinion on this as this is very confusing to the user. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 02:00:37PM -0500, Dave Jones wrote: On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote: 1) Why is everyone so concerned that export symbol space is large? - does it cost cpu or running memory? - does it cause bugs? - or are you just worried about evil modules? To clarify something here, by evil, don't necessarily think binary only. Out of tree modules are frequently using symbols that they shouldn't be. Because they get no peer-review here, they 'get away with it' for the most part. Until distro vendors push rebased kernel updates that removed exports that should never have been exported, and suddenly people like me get bombed with Fedora broke my xyz driver mails. ... The real problem is that these drivers are not in the upstream kernel. Are there common reasons why these drivers are not upstream? Dave cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
The real problem is that these drivers are not in the upstream kernel. Are there common reasons why these drivers are not upstream? One might be that upstream has not accepted them. Anything doing or smelling of TOE comes to mind right away. rick jones - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 10:09:42PM +0100, Adrian Bunk wrote: On Tue, Nov 27, 2007 at 02:00:37PM -0500, Dave Jones wrote: On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote: 1) Why is everyone so concerned that export symbol space is large? - does it cost cpu or running memory? - does it cause bugs? - or are you just worried about evil modules? To clarify something here, by evil, don't necessarily think binary only. Out of tree modules are frequently using symbols that they shouldn't be. Because they get no peer-review here, they 'get away with it' for the most part. Until distro vendors push rebased kernel updates that removed exports that should never have been exported, and suddenly people like me get bombed with Fedora broke my xyz driver mails. ... The real problem is that these drivers are not in the upstream kernel. You're preaching to the choir. Are there common reasons why these drivers are not upstream? It varies case by case. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] cxgb - driver fixes.
Jeff, I'm submitting a patch series for inclusion in 2.6.24 for the cxgb driver. The patches are built against Linus'git tree. Here is a brief description: - Ensure that GSO skbs have enough headroom before encapsulating them, - Fix a crash in NAPI mode, - Fix statistics accounting and report. Cheers, Divy - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] cxgb - fix T2 GSO
From: Divy Le Ray [EMAIL PROTECTED] The patch ensures that a GSO skb has enough headroom to push an encapsulating cpl_tx_pkt_lso header. Signed-off-by: Divy Le Ray [EMAIL PROTECTED] --- drivers/net/chelsio/cxgb2.c |3 ++- drivers/net/chelsio/sge.c | 34 +++--- drivers/net/chelsio/sge.h |1 + 3 files changed, 18 insertions(+), 20 deletions(-) diff --git a/drivers/net/chelsio/cxgb2.c b/drivers/net/chelsio/cxgb2.c old mode 100644 new mode 100755 index 2dbf8dc..2461f91 --- a/drivers/net/chelsio/cxgb2.c +++ b/drivers/net/chelsio/cxgb2.c @@ -401,7 +401,8 @@ static char stats_strings[][ETH_GSTRING_LEN] = { TxTso, RxVlan, TxVlan, - + TxNeedHeadroom, + /* Interrupt stats */ rx drops, pure_rsps, diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c old mode 100644 new mode 100755 index 4436662..e8b1036 --- a/drivers/net/chelsio/sge.c +++ b/drivers/net/chelsio/sge.c @@ -991,6 +991,7 @@ void t1_sge_get_port_stats(const struct sge *sge, int port, ss-tx_packets += st-tx_packets; ss-tx_cso += st-tx_cso; ss-tx_tso += st-tx_tso; + ss-tx_need_hdrroom += st-tx_need_hdrroom; ss-vlan_xtract += st-vlan_xtract; ss-vlan_insert += st-vlan_insert; } @@ -1848,7 +1849,8 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct adapter *adapter = dev-priv; struct sge *sge = adapter-sge; - struct sge_port_stats *st = per_cpu_ptr(sge-port_stats[dev-if_port], smp_processor_id()); + struct sge_port_stats *st = per_cpu_ptr(sge-port_stats[dev-if_port], + smp_processor_id()); struct cpl_tx_pkt *cpl; struct sk_buff *orig_skb = skb; int ret; @@ -1856,6 +1858,18 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) if (skb-protocol == htons(ETH_P_CPL5)) goto send; + /* +* We are using a non-standard hard_header_len. +* Allocate more header room in the rare cases it is not big enough. +*/ + if (unlikely(skb_headroom(skb) dev-hard_header_len - ETH_HLEN)) { + skb = skb_realloc_headroom(skb, sizeof(struct cpl_tx_pkt_lso)); + ++st-tx_need_hdrroom; + dev_kfree_skb_any(orig_skb); + if (!skb) + return NETDEV_TX_OK; + } + if (skb_shinfo(skb)-gso_size) { int eth_type; struct cpl_tx_pkt_lso *hdr; @@ -1889,24 +1903,6 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } - /* -* We are using a non-standard hard_header_len and some kernel -* components, such as pktgen, do not handle it right. -* Complain when this happens but try to fix things up. -*/ - if (unlikely(skb_headroom(skb) dev-hard_header_len - ETH_HLEN)) { - pr_debug(%s: headroom %d header_len %d\n, dev-name, -skb_headroom(skb), dev-hard_header_len); - - if (net_ratelimit()) - printk(KERN_ERR %s: inadequate headroom in - Tx packet\n, dev-name); - skb = skb_realloc_headroom(skb, sizeof(*cpl)); - dev_kfree_skb_any(orig_skb); - if (!skb) - return NETDEV_TX_OK; - } - if (!(adapter-flags UDP_CSUM_CAPABLE) skb-ip_summed == CHECKSUM_PARTIAL ip_hdr(skb)-protocol == IPPROTO_UDP) { diff --git a/drivers/net/chelsio/sge.h b/drivers/net/chelsio/sge.h old mode 100644 new mode 100755 index 713d9c5..285bbb2 --- a/drivers/net/chelsio/sge.h +++ b/drivers/net/chelsio/sge.h @@ -64,6 +64,7 @@ struct sge_port_stats { u64 tx_tso; /* # of TSO requests */ u64 vlan_xtract; /* # of VLAN tag extractions */ u64 vlan_insert; /* # of VLAN tag insertions */ + u64 tx_need_hdrroom; /* # of TX skbs in need of more header room */ }; struct sk_buff; - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] cxgb - fix NAPI
From: Divy Le Ray [EMAIL PROTECTED] netif_rx_complete() should be called only when work_done budget. Signed-off-by: Divy Le ray [EMAIL PROTECTED] --- drivers/net/chelsio/sge.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c old mode 100755 new mode 100644 index e8b1036..4b6258f --- a/drivers/net/chelsio/sge.c +++ b/drivers/net/chelsio/sge.c @@ -1625,11 +1625,9 @@ int t1_poll(struct napi_struct *napi, int budget) { struct adapter *adapter = container_of(napi, struct adapter, napi); struct net_device *dev = adapter-port[0].dev; - int work_done; - - work_done = process_responses(adapter, budget); + int work_done = process_responses(adapter, budget); - if (likely(!responses_pending(adapter))) { + if (likely(work_done budget)) { netif_rx_complete(dev, napi); writel(adapter-sge-respQ.cidx, adapter-regs + A_SG_SLEEPING); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] cxgb - fix stats
From: Divy Le Ray [EMAIL PROTECTED] Fix MAC stats accounting. Fix get_stats. Signed-off-by: Divy Le Ray [EMAIL PROTECTED] --- drivers/net/chelsio/cxgb2.c | 67 +++-- drivers/net/chelsio/pm3393.c | 112 +- drivers/net/chelsio/sge.c|4 -- drivers/net/chelsio/sge.h|2 - 4 files changed, 96 insertions(+), 89 deletions(-) diff --git a/drivers/net/chelsio/cxgb2.c b/drivers/net/chelsio/cxgb2.c index 2461f91..c597504 100755 --- a/drivers/net/chelsio/cxgb2.c +++ b/drivers/net/chelsio/cxgb2.c @@ -374,7 +374,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = { TxInternalMACXmitError, TxFramesWithExcessiveDeferral, TxFCSErrors, - + TxJumboFramesOk, + TxJumboOctetsOk, + RxOctetsOK, RxOctetsBad, RxUnicastFramesOK, @@ -392,11 +394,11 @@ static char stats_strings[][ETH_GSTRING_LEN] = { RxInRangeLengthErrors, RxOutOfRangeLengthField, RxFrameTooLongErrors, + RxJumboFramesOk, + RxJumboOctetsOk, /* Port stats */ - RxPackets, RxCsumGood, - TxPackets, TxCsumOffload, TxTso, RxVlan, @@ -464,23 +466,56 @@ static void get_stats(struct net_device *dev, struct ethtool_stats *stats, const struct cmac_statistics *s; const struct sge_intr_counts *t; struct sge_port_stats ss; - unsigned int len; s = mac-ops-statistics_update(mac, MAC_STATS_UPDATE_FULL); - - len = sizeof(u64)*(s-TxFCSErrors + 1 - s-TxOctetsOK); - memcpy(data, s-TxOctetsOK, len); - data += len; - - len = sizeof(u64)*(s-RxFrameTooLongErrors + 1 - s-RxOctetsOK); - memcpy(data, s-RxOctetsOK, len); - data += len; - + t = t1_sge_get_intr_counts(adapter-sge); t1_sge_get_port_stats(adapter-sge, dev-if_port, ss); - memcpy(data, ss, sizeof(ss)); - data += sizeof(ss); - t = t1_sge_get_intr_counts(adapter-sge); + *data++ = s-TxOctetsOK; + *data++ = s-TxOctetsBad; + *data++ = s-TxUnicastFramesOK; + *data++ = s-TxMulticastFramesOK; + *data++ = s-TxBroadcastFramesOK; + *data++ = s-TxPauseFrames; + *data++ = s-TxFramesWithDeferredXmissions; + *data++ = s-TxLateCollisions; + *data++ = s-TxTotalCollisions; + *data++ = s-TxFramesAbortedDueToXSCollisions; + *data++ = s-TxUnderrun; + *data++ = s-TxLengthErrors; + *data++ = s-TxInternalMACXmitError; + *data++ = s-TxFramesWithExcessiveDeferral; + *data++ = s-TxFCSErrors; + *data++ = s-TxJumboFramesOK; + *data++ = s-TxJumboOctetsOK; + + *data++ = s-RxOctetsOK; + *data++ = s-RxOctetsBad; + *data++ = s-RxUnicastFramesOK; + *data++ = s-RxMulticastFramesOK; + *data++ = s-RxBroadcastFramesOK; + *data++ = s-RxPauseFrames; + *data++ = s-RxFCSErrors; + *data++ = s-RxAlignErrors; + *data++ = s-RxSymbolErrors; + *data++ = s-RxDataErrors; + *data++ = s-RxSequenceErrors; + *data++ = s-RxRuntErrors; + *data++ = s-RxJabberErrors; + *data++ = s-RxInternalMACRcvError; + *data++ = s-RxInRangeLengthErrors; + *data++ = s-RxOutOfRangeLengthField; + *data++ = s-RxFrameTooLongErrors; + *data++ = s-RxJumboFramesOK; + *data++ = s-RxJumboOctetsOK; + + *data++ = ss.rx_cso_good; + *data++ = ss.tx_cso; + *data++ = ss.tx_tso; + *data++ = ss.vlan_xtract; + *data++ = ss.vlan_insert; + *data++ = ss.tx_need_hdrroom; + *data++ = t-rx_drops; *data++ = t-pure_rsps; *data++ = t-unhandled_irqs; diff --git a/drivers/net/chelsio/pm3393.c b/drivers/net/chelsio/pm3393.c old mode 100644 new mode 100755 index 678778a..2117c4f --- a/drivers/net/chelsio/pm3393.c +++ b/drivers/net/chelsio/pm3393.c @@ -45,7 +45,7 @@ #include linux/crc32.h -#define OFFSET(REG_ADDR)(REG_ADDR 2) +#define OFFSET(REG_ADDR)((REG_ADDR) 2) /* Max frame size PM3393 can handle. Includes Ethernet header and CRC. */ #define MAX_FRAME_SIZE 9600 @@ -428,69 +428,26 @@ static int pm3393_set_speed_duplex_fc(struct cmac *cmac, int speed, int duplex, return 0; } -static void pm3393_rmon_update(struct adapter *adapter, u32 offs, u64 *val, - int over) -{ - u32 val0, val1, val2; - - t1_tpi_read(adapter, offs, val0); - t1_tpi_read(adapter, offs + 4, val1); - t1_tpi_read(adapter, offs + 8, val2); - - *val = ~0ull 40; - *val |= val0 0x; - *val |= (val1 0x) 16; - *val |= (u64)(val2 0xff) 32; - - if (over) - *val += 1ull 40; +#define RMON_UPDATE(mac, name, stat_name) \ +{ \ + t1_tpi_read((mac)-adapter, OFFSET(name), val0); \ + t1_tpi_read((mac)-adapter, OFFSET((name)+1), val1); \ + t1_tpi_read((mac)-adapter, OFFSET((name)+2), val2); \ +
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 01:15:23PM -0800, Rick Jones wrote: The real problem is that these drivers are not in the upstream kernel. Are there common reasons why these drivers are not upstream? One might be that upstream has not accepted them. Anything doing or smelling of TOE comes to mind right away. Which modules doing or smelling of TOE do work with unmodified vendor kernels? AFAIR TOE was both not a module and required some hooks in the network stack, so it's completely outside the scope of this thread. rick jones cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.23-mm1 tg3 wake-on-lan oddity...
On Tue, 27 Nov 2007 13:34:57 PST, Michael Chan said: Ideally, the BIOS should modify the NVRAM's setting when it is changed. We will talk to Dell to get their opinion on this as this is very confusing to the user. That would certainly explain what I'm seeing, and I can certainly wait if the answer is indeed Buggy BIOS, fixed in D820 A08 or A09 or whenever. (If Dell cares, I'm at BIOS A07 already). In the meantime, I just stuck an 'ethtool -s eth0 wol d' in /etc/rc.local until a proper fix shows up. pgprI27r6QKbi.pgp Description: PGP signature
Re: net_rx_action/NAPI oops [PATCH]
On Tue, 27 Nov 2007 19:52:24 +0100 Robert Olsson [EMAIL PROTECTED] wrote: Hello! I've discovered a bug while testing the new multiQ NAPI code. In hi-load situations when we take down an interface we get a kernel panic. The oops is below. From what I see this happens when driver does napi_disable() and clears NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight a sort indirect test but that's now not enough to cover the load situation. where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still full quota. Latest git but I'll guess the is the same in all later kernels. There might be different solutions... one variant is below: It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear NAPI_STATE_SCHED) and do a full quota. That bug already had to be fixed in other drivers, look like e1000 has same problem. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: net_rx_action/NAPI oops [PATCH]
Stephen Hemminger wrote: On Tue, 27 Nov 2007 19:52:24 +0100 Robert Olsson [EMAIL PROTECTED] wrote: Hello! I've discovered a bug while testing the new multiQ NAPI code. In hi-load situations when we take down an interface we get a kernel panic. The oops is below. From what I see this happens when driver does napi_disable() and clears NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight a sort indirect test but that's now not enough to cover the load situation. where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still full quota. Latest git but I'll guess the is the same in all later kernels. There might be different solutions... one variant is below: It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear NAPI_STATE_SCHED) and do a full quota. That bug already had to be fixed in other drivers, look like e1000 has same problem. Stephen, please enlighten me, can you e.g. show me a commit of other drivers where you fixed this up? Thanks, Auke - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] sky2: don't use AER routines
Using PCIE advanced error recovery stuff creates more user problems than it's worth. The AER stuff depends on MMCONFIG and in many configurations it just doesn't work. Plus it doesn't add any real functionality to the driver. The sky2 driver handles its own errors fine as is. This reverts 555382cbfc6d2187b53888190755e56f52308cd6 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- a/drivers/net/sky2.c2007-11-26 14:02:31.0 -0800 +++ b/drivers/net/sky2.c2007-11-26 14:02:36.0 -0800 @@ -31,7 +31,6 @@ #include linux/etherdevice.h #include linux/ethtool.h #include linux/pci.h -#include linux/aer.h #include linux/ip.h #include net/ip.h #include linux/tcp.h @@ -2432,26 +2431,15 @@ static void sky2_hw_intr(struct sky2_hw if (status Y2_IS_PCI_EXP) { /* PCI-Express uncorrectable Error occurred */ - int aer = pci_find_aer_capability(hw-pdev); u32 err; - if (aer) { - pci_read_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, - err); - pci_cleanup_aer_uncorrect_error_status(pdev); - } else { - /* Either AER not configured, or not working -* because of bad MMCONFIG, so just do recover -* manually. -*/ - err = sky2_read32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS); - sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS, -0xul); - } - + err = sky2_read32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS); + sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS, +0xul); if (net_ratelimit()) dev_err(pdev-dev, PCI Express error (0x%x)\n, err); + sky2_read32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS); } if (status Y2_HWE_L1_MASK) @@ -2806,24 +2794,13 @@ static void sky2_reset(struct sky2_hw *h cap = pci_find_capability(pdev, PCI_CAP_ID_EXP); if (cap) { - if (pci_find_aer_capability(pdev)) { - /* Check for advanced error reporting */ - pci_cleanup_aer_uncorrect_error_status(pdev); - pci_cleanup_aer_correct_error_status(pdev); - } else { - dev_warn(pdev-dev, - PCI Express Advanced Error Reporting -not configured or MMCONFIG problem?\n); - - sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS, -0xul); - } + sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS, +0xul); /* If error bit is stuck on ignore it */ if (sky2_read32(hw, B0_HWE_ISRC) Y2_IS_PCI_EXP) dev_info(pdev-dev, ignoring stuck error report bit\n); - - else if (pci_enable_pcie_error_reporting(pdev)) + else hwe_mask |= Y2_IS_PCI_EXP; } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] sky2: turn of dynamic Tx watermark workaround (FE+ only)
Add workaround for issues FE+ (A0) transmit watermark. This is copied verbatim from vendor driver sk98lin (10.22.4.3). Don't have that chip version and no more information seems to be available. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- a/drivers/net/sky2.c2007-11-26 14:02:36.0 -0800 +++ b/drivers/net/sky2.c2007-11-26 14:28:47.0 -0800 @@ -845,6 +845,13 @@ static void sky2_mac_init(struct sky2_hw sky2_set_tx_stfwd(hw, port); } + if (hw-chip_id == CHIP_ID_YUKON_FE_P + hw-chip_rev == CHIP_REV_YU_FE2_A0) { + /* disable dynamic watermark */ + reg = sky2_read16(hw, SK_REG(port, TX_GMF_EA)); + reg = ~TX_DYN_WM_ENA; + sky2_write16(hw, SK_REG(port, TX_GMF_EA), reg); + } } /* Assign Ram Buffer allocation to queue */ - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
With my Enterprise hat on, I can see where Andi was coming from originally. For the record my original motivation was to fix the TCP exports everything for ipv6.ko case cleanly. I later realized that it would be useful for the ABI stability issues too, but it was really not my primary motivation. This is why I didn't even mention that in the original patch description. -Andi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] sky2: revert to access PCI config via device space
Using the hardware window into PCI config space is more reliable and smaller/faster than using the pci_config routines. It avoids issues with MMCONFIG etc. Reverts: 167f53d05fccb47b6eeadac7f6705b3f2f042d03 Please apply for 2.6.24 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- a/drivers/net/sky2.c2007-11-26 14:02:14.0 -0800 +++ b/drivers/net/sky2.c2007-11-26 14:02:31.0 -0800 @@ -240,22 +240,21 @@ static void sky2_power_on(struct sky2_hw sky2_write8(hw, B2_Y2_CLK_GATE, 0); if (hw-flags SKY2_HW_ADV_POWER_CTL) { - struct pci_dev *pdev = hw-pdev; u32 reg; - pci_write_config_dword(pdev, PCI_DEV_REG3, 0); + sky2_pci_write32(hw, PCI_DEV_REG3, 0); - pci_read_config_dword(pdev, PCI_DEV_REG4, reg); + reg = sky2_pci_read32(hw, PCI_DEV_REG4); /* set all bits to 0 except bits 15..12 and 8 */ reg = P_ASPM_CONTROL_MSK; - pci_write_config_dword(pdev, PCI_DEV_REG4, reg); + sky2_pci_write32(hw, PCI_DEV_REG4, reg); - pci_read_config_dword(pdev, PCI_DEV_REG5, reg); + reg = sky2_pci_read32(hw, PCI_DEV_REG5); /* set all bits to 0 except bits 28 27 */ reg = P_CTL_TIM_VMAIN_AV_MSK; - pci_write_config_dword(pdev, PCI_DEV_REG5, reg); + sky2_pci_write32(hw, PCI_DEV_REG5, reg); - pci_write_config_dword(pdev, PCI_CFG_REG_1, 0); + sky2_pci_write32(hw, PCI_CFG_REG_1, 0); /* Enable workaround for dev 4.107 on Yukon-Ultra Extreme */ reg = sky2_read32(hw, B2_GP_IO); @@ -619,12 +618,11 @@ static void sky2_phy_init(struct sky2_hw static void sky2_phy_power(struct sky2_hw *hw, unsigned port, int onoff) { - struct pci_dev *pdev = hw-pdev; u32 reg1; static const u32 phy_power[] = { PCI_Y2_PHY1_POWD, PCI_Y2_PHY2_POWD }; static const u32 coma_mode[] = { PCI_Y2_PHY1_COMA, PCI_Y2_PHY2_COMA }; - pci_read_config_dword(pdev, PCI_DEV_REG1, reg1); + reg1 = sky2_pci_read32(hw, PCI_DEV_REG1); /* Turn on/off phy power saving */ if (onoff) reg1 = ~phy_power[port]; @@ -634,8 +632,8 @@ static void sky2_phy_power(struct sky2_h if (onoff hw-chip_id == CHIP_ID_YUKON_XL hw-chip_rev 1) reg1 |= coma_mode[port]; - pci_write_config_dword(pdev, PCI_DEV_REG1, reg1); - pci_read_config_dword(pdev, PCI_DEV_REG1, reg1); + sky2_pci_write32(hw, PCI_DEV_REG1, reg1); + reg1 = sky2_pci_read32(hw, PCI_DEV_REG1); udelay(100); } @@ -704,9 +702,9 @@ static void sky2_wol_init(struct sky2_po sky2_write16(hw, WOL_REGS(port, WOL_CTRL_STAT), ctrl); /* Turn on legacy PCI-Express PME mode */ - pci_read_config_dword(hw-pdev, PCI_DEV_REG1, reg1); + reg1 = sky2_pci_read32(hw, PCI_DEV_REG1); reg1 |= PCI_Y2_PME_LEGACY; - pci_write_config_dword(hw-pdev, PCI_DEV_REG1, reg1); + sky2_pci_write32(hw, PCI_DEV_REG1, reg1); /* block receiver */ sky2_write8(hw, SK_REG(port, RX_GMF_CTRL_T), GMF_RST_SET); @@ -1322,9 +1320,10 @@ static int sky2_up(struct net_device *de (cap = pci_find_capability(hw-pdev, PCI_CAP_ID_PCIX))) { u16 cmd; - pci_read_config_word(hw-pdev, cap + PCI_X_CMD, cmd); + cmd = sky2_pci_read16(hw, cap + PCI_X_CMD); cmd = ~PCI_X_CMD_MAX_SPLIT; - pci_write_config_word(hw-pdev, cap + PCI_X_CMD, cmd); + sky2_pci_write16(hw, cap + PCI_X_CMD, cmd); + } if (netif_msg_ifup(sky2)) @@ -2422,12 +2421,12 @@ static void sky2_hw_intr(struct sky2_hw if (status (Y2_IS_MST_ERR | Y2_IS_IRQ_STAT)) { u16 pci_err; - pci_read_config_word(pdev, PCI_STATUS, pci_err); + pci_err = sky2_pci_read16(hw, PCI_STATUS); if (net_ratelimit()) dev_err(pdev-dev, PCI hardware error (0x%x)\n, pci_err); - pci_write_config_word(pdev, PCI_STATUS, + sky2_pci_write16(hw, PCI_STATUS, pci_err | PCI_STATUS_ERROR_BITS); } @@ -2699,13 +2698,10 @@ static inline u32 sky2_clk2us(const stru static int __devinit sky2_init(struct sky2_hw *hw) { - int rc; u8 t8; /* Enable all clocks and check for bad PCI access */ - rc = pci_write_config_dword(hw-pdev, PCI_DEV_REG3, 0); - if (rc) - return rc; + sky2_pci_write32(hw, PCI_DEV_REG3, 0); sky2_write8(hw, B0_CTST, CS_RST_CLR); @@ -2802,9 +2798,9 @@ static void sky2_reset(struct sky2_hw *h sky2_write8(hw, B2_TST_CTRL1, TST_CFG_WRITE_ON); /* clear PCI errors, if any */ - pci_read_config_word(pdev, PCI_STATUS,
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, 27 Nov 2007 22:09:42 +0100, Adrian Bunk said: Are there common reasons why these drivers are not upstream? Well, on my laptop, I'm currently dragging along 3 out-of-tree kernel modules. 2 are well-known binary blobs so it's between me and the vendor, as usual. The third is a USB webcam driver that happened to get caught at the wrong end of the colorspace-conversion-in-kernel bunfight in the V4L playpen. Somebody wants to figure out how to get the gspca drivers into the kernel, they're at http://mxhaard.free.fr/download.html waiting for attention. ;) (Don't look at *me*, I don't understand the code, or the bunfight - I just happen to have one of the 244 supported webcams, and it works with that driver) pgpRlM5JB5L1m.pgp Description: PGP signature
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, Nov 27, 2007 at 03:00:22PM -0800, Stephen Hemminger wrote: On Tue, 27 Nov 2007 23:37:43 +0100 Andi Kleen [EMAIL PROTECTED] wrote: With my Enterprise hat on, I can see where Andi was coming from originally. For the record my original motivation was to fix the TCP exports everything for ipv6.ko case cleanly. I later realized that it would be useful for the ABI stability issues too, but it was really not my primary motivation. This is why I didn't even mention that in the original patch description. Since ipv6 can never be removed because it references itself, the whole concept AFAIK that is obsolete anyways. It was done because it was feared it would be broken, but at least the broken cases I knew about were all fixed a long time ago. Most likely it could be safely removed. of a modular ipv6 is flawed. Modules that cannot be unloaded are still useful. Standard case: Distributions like to offer an option to not use ipv6 because that is popular workaround for the common DNS server eats queries and causes delays issue. Forcing the user to rebuild the kernel for this wouldn't be practical. If ipv6 wasn't modular that would be hard to do. -Andi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote: On Monday 26 November 2007 17:15:44 Roland Dreier wrote: It seems pretty clear to me that having a mechanism that requires modules to make explicit which (semi-)internal APIs makes reviewing easier Perhaps you've got lots of patches were people are using internal APIs they shouldn't? With my Enterprise hat on, I can see where Andi was coming from originally. Just like we do in RHEL, SuSE have a concept of a kernel ABI and we too have a concept of a whitelist of symbols - a subset of kernel ABI that is approved for use by third parties (this is nothing to do with licensing, this is to do with ABI stability we try to ensure). As part of figuring out what should and should not be used by external modules (outside of the core kernel), we've gained a bit of experience, and it would be nice to be able to help others - this is nothing to do with upstream ABI stability, but just to be of a public service by documenting those functions that are never intended for modules but which necessarily are exported because they have one or two users. , makes it easier to communicate please don't use that API to module authors, Well, introduce an EXPORT_SYMBOL_INTERNAL(). It's a lot less code. But you'd still need to show that people are having trouble knowing what APIs to use. I suggested this exact idea privately at OLS. I still think it's the best compromise, though I like bits of the namespace idea. An EXPORT_SYMBOL_INTERNAL would indeed be trivial. Jon. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] e1000: Fix NAPI state bug when Rx complete
Don't exit polling when we have not yet used our budget, this causes the NAPI system to end up with a messed up poll list. Signed-off-by: Auke Kok [EMAIL PROTECTED] --- drivers/net/e1000/e1000_main.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index b7c3070..724f067 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3926,7 +3926,7 @@ e1000_clean(struct napi_struct *napi, int budget) work_done, budget); /* If no Tx and not enough Rx work done, exit the polling mode */ - if ((!tx_cleaned (work_done budget)) || + if ((!tx_cleaned (work_done == 0)) || !netif_running(poll_dev)) { quit_polling: if (likely(adapter-itr_setting 3)) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: net_rx_action/NAPI oops [PATCH]
Stephen Hemminger wrote: On Tue, 27 Nov 2007 14:34:44 -0800 Kok, Auke [EMAIL PROTECTED] wrote: Stephen Hemminger wrote: On Tue, 27 Nov 2007 19:52:24 +0100 Robert Olsson [EMAIL PROTECTED] wrote: Hello! I've discovered a bug while testing the new multiQ NAPI code. In hi-load situations when we take down an interface we get a kernel panic. The oops is below. From what I see this happens when driver does napi_disable() and clears NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight a sort indirect test but that's now not enough to cover the load situation. where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still full quota. Latest git but I'll guess the is the same in all later kernels. There might be different solutions... one variant is below: It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear NAPI_STATE_SCHED) and do a full quota. That bug already had to be fixed in other drivers, look like e1000 has same problem. Stephen, please enlighten me, can you e.g. show me a commit of other drivers where you fixed this up? Thanks, Auke Author: David S. Miller [EMAIL PROTECTED] 2007-10-11 18:08:29 Committer: David S. Miller [EMAIL PROTECTED] 2007-10-11 18:08:29 Parent: b9f2c0440d806e01968c3ed4def930a43be248ad ([netdrvr] Stop using legacy hooks -self_test_count, -get_stats_count) Child: 266918303226cceac7eca38ced30f15f277bd89c ([SKY2]: status polling loop (post merge)) Branches: master, origin Follows: v2.6.23 Precedes: v2.6.24-rc1 [NET]: Fix NAPI completion handling in some drivers. In order for the list handling in net_rx_action() to be correct, drivers must follow certain rules as stated by this comment in net_rx_action(): /* Drivers must not modify the NAPI state if they * consume the entire weight. In such cases this code * still owns the NAPI instance and therefore can * move the instance around on the list at-will. */ A few drivers do not do this because they mix the budget checks with reading hardware state, resulting in crashes like the one reported by [EMAIL PROTECTED] BNX2 and TG3 are taken care of here, SKY2 fix is from Stephen Hemminger. Signed-off-by: David S. Miller [EMAIL PROTECTED] OK, I'm not sure what went wrong there with e1000, but I'll send a patch in a second. Robert, please give that patch a try (it fixes a crash that I had here as well) and let us know if it works for you. Auke - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.
Which ones were they exactly? I think JamesB wanted only common transport values in the transport class. If it is driver specific then it should go on the host or target or device with the scsi_host_template attrs. It's a chicken egg issue to put port mapper sysfs entry in scsi host attributes. Application won't see sysfs unless initiator creates an Sorry for the late response. I was on vacation. That is only with how you coded it today. I asked you to do something like qla4xxx where the session and host are not so closely bound. bnx2i is still using scsi host per iscsi session model, we plan to migrate to scsi host per device model pretty soon. Previously you indicated that you are working on this port, can you please share the code? We will build on your work and bring this to closure. iSCSI session and driver can't create an iSCSI session without a tcp That is not right with how things are today even. The iscsi_session struct can be created before the tcp connection. This was done because we thought we were going to have to use only sysfs for all setup and management (we ended up netlink and sysfs though). You are right, software is flexible enough to allow creation of session/connection and endpoint independently but so far user daemon creates TCP endpoint before iSCSI session and bind them all together. Any changes to this scheme will not be compatible with existing distributions - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] net/bonding: Return nothing for not applicable values
The previous code returned '\n' (that is, a single empty line) from most files, with one exception (xmit_hash_policy), where it returned 'NA\n'. This patch consolidates each file to return nothing at all if not applicable, not even a '\n'. I find this behaviour more usual, more useful, more efficient and shorter to code from both sides. Signed-off-by: Ferenc Wágner [EMAIL PROTECTED] --- drivers/net/bonding/bond_sysfs.c | 25 - 1 files changed, 4 insertions(+), 21 deletions(-) diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index a3f1b4a..6bb91e2 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -455,14 +455,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d, struct device_attribute *attr, char *buf) { - int count; + int count = 0; struct bonding *bond = to_bond(d); - if ((bond-params.mode != BOND_MODE_XOR) - (bond-params.mode != BOND_MODE_8023AD)) { - // Not Applicable - count = sprintf(buf, NA\n); - } else { + if ((bond-params.mode == BOND_MODE_XOR) || + (bond-params.mode == BOND_MODE_8023AD)) { count = sprintf(buf, %s %d\n, xmit_hashtype_tbl[bond-params.xmit_policy].modename, bond-params.xmit_policy); @@ -1079,8 +1076,6 @@ static ssize_t bonding_show_primary(struct device *d, if (bond-primary_slave) count = sprintf(buf, %s\n, bond-primary_slave-dev-name); - else - count = sprintf(buf, \n); return count; } @@ -1186,7 +1181,7 @@ static ssize_t bonding_show_active_slave(struct device *d, { struct slave *curr; struct bonding *bond = to_bond(d); - int count; + int count = 0; read_lock(bond-curr_slave_lock); curr = bond-curr_active_slave; @@ -1194,8 +1189,6 @@ static ssize_t bonding_show_active_slave(struct device *d, if (USES_PRIMARY(bond-params.mode) curr) count = sprintf(buf, %s\n, curr-dev-name); - else - count = sprintf(buf, \n); return count; } @@ -1309,8 +1302,6 @@ static ssize_t bonding_show_ad_aggregator(struct device *d, struct ad_info ad_info; count = sprintf(buf, %d\n, (bond_3ad_get_active_agg_info(bond, ad_info)) ? 0 : ad_info.aggregator_id); } - else - count = sprintf(buf, \n); return count; } @@ -1331,8 +1322,6 @@ static ssize_t bonding_show_ad_num_ports(struct device *d, struct ad_info ad_info; count = sprintf(buf, %d\n, (bond_3ad_get_active_agg_info(bond, ad_info)) ? 0: ad_info.ports); } - else - count = sprintf(buf, \n); return count; } @@ -1353,8 +1342,6 @@ static ssize_t bonding_show_ad_actor_key(struct device *d, struct ad_info ad_info; count = sprintf(buf, %d\n, (bond_3ad_get_active_agg_info(bond, ad_info)) ? 0 : ad_info.actor_key); } - else - count = sprintf(buf, \n); return count; } @@ -1375,8 +1362,6 @@ static ssize_t bonding_show_ad_partner_key(struct device *d, struct ad_info ad_info; count = sprintf(buf, %d\n, (bond_3ad_get_active_agg_info(bond, ad_info)) ? 0 : ad_info.partner_key); } - else - count = sprintf(buf, \n); return count; } @@ -1401,8 +1386,6 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d, print_mac(mac, ad_info.partner_system)); } } - else - count = sprintf(buf, \n); return count; } -- 1.4.4.4 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] net/bonding: Purely cosmetic: rename a local variable.
Code for rendering multivalue sysfs files occurs three times in this module. Rename 'buffer' to 'buf' in the first, for the sake of consistency. Signed-off-by: Ferenc Wágner [EMAIL PROTECTED] --- drivers/net/bonding/bond_sysfs.c |9 - 1 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index 6bb91e2..5c31f5c 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -74,7 +74,7 @@ struct rw_semaphore bonding_rwsem; * show function for the bond_masters attribute. * The class parameter is ignored. */ -static ssize_t bonding_show_bonds(struct class *cls, char *buffer) +static ssize_t bonding_show_bonds(struct class *cls, char *buf) { int res = 0; struct bonding *bond; @@ -86,13 +86,12 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buffer) /* not enough space for another interface name */ if ((PAGE_SIZE - res) 10) res = PAGE_SIZE - 10; - res += sprintf(buffer + res, ++more++ ); + res += sprintf(buf + res, ++more++ ); break; } - res += sprintf(buffer + res, %s , - bond-dev-name); + res += sprintf(buf + res, %s , bond-dev-name); } - if (res) buffer[res-1] = '\n'; /* eat the leftover space */ + if (res) buf[res-1] = '\n'; /* eat the leftover space */ up_read((bonding_rwsem)); return res; } -- 1.4.4.4 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: bonding sysfs output
Andrew Morton [EMAIL PROTECTED] writes: On Tue, 27 Nov 2007 10:56:43 +0100 Ferenc Wagner [EMAIL PROTECTED] wrote: - raise patches against the latest Linus tree (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/) I thought it was better to change to git. Isn't it so? SubmittingPatches has nothing to say about that... Can I find collected best practices somewhere? Which tree, which branch, how/when to rebase, format-patch, etc... gosh. Documentation/Submit*, http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt, http://linux.yyz.us/patch-format.html, other places. Probably people have written books about it by now. But don't sweat it - you're close enough ;) I wonder where the information got lost... I miss docs on submitting patches from git ONLY. The general documentation is pretty good and helpful, just doesn't treat git (not using git in general, but using it for submitting patches to the Linux kernel). On the other hand there's a multitude of repositories to clone times a zillion branches to follow. Which should be the basis of the patches? That's not very clear. Anyway, find them in my previous mails. Too bad I realised just after the fact that cosmetic changes should go first. Hope it's mostly OK. -- Regards, Feri. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] Remove trailing NULs from network bonding sysfs interface.
Also remove trailing spaces from multivalued files. This fixes output like for example: $ od -c /sys/class/net/bond0/bonding/slaves 000 e t h - l e f t e t h - r i g 020 h t \n \0 025 It mostly entails deleting '+1'-s after sprintf() calls: the return value of sprintf is the number of characters printed, without the closing NUL, ie. exactly what the sysfs interface requires. The three multivalue cases are different, because they also have to swallow back a trailing space. Signed-off-by: Ferenc Wágner [EMAIL PROTECTED] --- drivers/net/bonding/bond_sysfs.c | 66 + 1 files changed, 30 insertions(+), 36 deletions(-) diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index b29330d..a3f1b4a 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -86,14 +86,13 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buffer) /* not enough space for another interface name */ if ((PAGE_SIZE - res) 10) res = PAGE_SIZE - 10; - res += sprintf(buffer + res, ++more++); + res += sprintf(buffer + res, ++more++ ); break; } res += sprintf(buffer + res, %s , bond-dev-name); } - res += sprintf(buffer + res, \n); - res++; + if (res) buffer[res-1] = '\n'; /* eat the leftover space */ up_read((bonding_rwsem)); return res; } @@ -235,14 +234,13 @@ static ssize_t bonding_show_slaves(struct device *d, /* not enough space for another interface name */ if ((PAGE_SIZE - res) 10) res = PAGE_SIZE - 10; - res += sprintf(buf + res, ++more++); + res += sprintf(buf + res, ++more++ ); break; } res += sprintf(buf + res, %s , slave-dev-name); } read_unlock(bond-lock); - res += sprintf(buf + res, \n); - res++; + if (res) buf[res-1] = '\n'; /* eat the leftover space */ return res; } @@ -406,7 +404,7 @@ static ssize_t bonding_show_mode(struct device *d, return sprintf(buf, %s %d\n, bond_mode_tbl[bond-params.mode].modename, - bond-params.mode) + 1; + bond-params.mode); } static ssize_t bonding_store_mode(struct device *d, @@ -463,11 +461,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d, if ((bond-params.mode != BOND_MODE_XOR) (bond-params.mode != BOND_MODE_8023AD)) { // Not Applicable - count = sprintf(buf, NA\n) + 1; + count = sprintf(buf, NA\n); } else { count = sprintf(buf, %s %d\n, xmit_hashtype_tbl[bond-params.xmit_policy].modename, - bond-params.xmit_policy) + 1; + bond-params.xmit_policy); } return count; @@ -527,7 +525,7 @@ static ssize_t bonding_show_arp_validate(struct device *d, return sprintf(buf, %s %d\n, arp_validate_tbl[bond-params.arp_validate].modename, - bond-params.arp_validate) + 1; + bond-params.arp_validate); } static ssize_t bonding_store_arp_validate(struct device *d, @@ -627,7 +625,7 @@ static ssize_t bonding_show_arp_interval(struct device *d, { struct bonding *bond = to_bond(d); - return sprintf(buf, %d\n, bond-params.arp_interval) + 1; + return sprintf(buf, %d\n, bond-params.arp_interval); } static ssize_t bonding_store_arp_interval(struct device *d, @@ -711,10 +709,7 @@ static ssize_t bonding_show_arp_targets(struct device *d, res += sprintf(buf + res, %u.%u.%u.%u , NIPQUAD(bond-params.arp_targets[i])); } - if (res) - res--; /* eat the leftover space */ - res += sprintf(buf + res, \n); - res++; + if (res) buf[res-1] = '\n'; /* eat the leftover space */ return res; } @@ -815,7 +810,7 @@ static ssize_t bonding_show_downdelay(struct device *d, { struct bonding *bond = to_bond(d); - return sprintf(buf, %d\n, bond-params.downdelay * bond-params.miimon) + 1; + return sprintf(buf, %d\n, bond-params.downdelay * bond-params.miimon); } static ssize_t bonding_store_downdelay(struct device *d, @@ -872,7 +867,7 @@ static ssize_t bonding_show_updelay(struct device *d, { struct bonding *bond = to_bond(d); - return sprintf(buf, %d\n, bond-params.updelay * bond-params.miimon) + 1; + return sprintf(buf, %d\n, bond-params.updelay * bond-params.miimon); } @@ -936,7
Re: [PATCH 6/7] [IPSEC]: Lock state when copying non-atomic fields to user-space
Herbert, Monday 26 November 2007 20:07, Herbert Xu wrote: On Mon, Nov 26, 2007 at 11:18:45AM +0800, Herbert Xu wrote: I'm just going to revert this patch for 2.6.24 since we've lived with this race for so long anyway. Actually, instead of reverting it completely I'm just going to remove the newly added locks which should be just as effective. This would reduce the churn in the code as we'd be putting most of it back soon anyway. With the patch you sent, the xfrm_state_walk() issue I reported is solved at current net-2.6.25. -- Masahide NAKAMURA - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
Adrian Bunk wrote: On Tue, Nov 27, 2007 at 01:15:23PM -0800, Rick Jones wrote: The real problem is that these drivers are not in the upstream kernel. Are there common reasons why these drivers are not upstream? One might be that upstream has not accepted them. Anything doing or smelling of TOE comes to mind right away. Which modules doing or smelling of TOE do work with unmodified vendor kernels? At the very real risk of further demonstrating my Linux vocabulary limitations, I believe there is a Linux Sockets Acceleration module/whatnot for NetXen and related 10G NICs, and a cxgb3_toe (?) module for Chelsio 10G NICs. rick jones - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
drivers/net/r6040.c warnings on x86_64
drivers/net/r6040.c: In function 'rx_buf_alloc': drivers/net/r6040.c:262: warning: passing argument 2 of 'pci_map_single' makes pointer from integer without a cast drivers/net/r6040.c: In function 'r6040_up': drivers/net/r6040.c:631: warning: cast from pointer to integer of different size drivers/net/r6040.c:632: warning: cast to pointer from integer of different size drivers/net/r6040.c:637: warning: cast from pointer to integer of different size drivers/net/r6040.c:638: warning: cast to pointer from integer of different size can we fix these please? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html