Re: bonding sysfs output

2007-11-27 Thread Andrew Morton
On Mon, 26 Nov 2007 09:29:40 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote:

 Andrew Morton [EMAIL PROTECTED] writes:
 
  On Sun, 25 Nov 2007 16:12:57 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote:
 
  I propose it as a fix for trailing NULs and spaces like eg.
  
  $ od -c /sys/class/net/bond0/bonding/slaves 
  000   e   t   h   -   l   e   f   t   e   t   h   -   r   i   g
  020   h   t  \n  \0
  025
  
  I'm afraid there're other problems with ++more++ handling, but let's
  not consider those just yet.  Find the patch attached.  The first
  hunks also renames buffer to buf, for consistency's shake.
  
  The original version had varying behaviour for Not Applicable cases.
  This patch also settles for empty files (not even a line feed) in
  those cases, but I'm not sure about the general policy on this matter.
 
  hm, there are a lot of changes there.  Were they all actually needed to fix
  the one bug which you have described?
 
 Trailing NULs are present in each file under /sys/class/net/*/bonding
 and also in /sys/class/net/bonding_masters.  That is, in every file
 provided by drivers/net/bonding/bond_sysfs.c.  Most of the patch is
 concerned with this.
 
 Closely related is the presence of trailing spaces in multivalue
 files.  There are three such files, one of them has the trailing space
 removed.  This patch removes it from the other two.  During this it
 also renames one function argument 'buffer' to 'buf', for consistency.
 
 On the policy side: some files are not applicable to some types of
 bonds, and return a single linefeed in that case.  Except for one
 single case, which returns 'NA\n'.  The patch changes these cases into
 emtpy files.
 
 If these are worthy changes, I'm absolutely willing to split up the
 patch into three parts as the above.

Well that would be good if poss, thanks.

But fixing bugs is way more important than niceties of patch presentation
however I wasn't prepared to fix the rejects which that patch is hitting in
the considerably-changed bonding_show_ad_partner_mac().  Please:

- raise patches against the latest Linus tree
(ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/)

- cc netdev@vger.kernel.org on networking-related matters

- Include a Signed-off-by: as per Documentation/SubmittingPatches

- Try to ensure that the full explanation (such as you have above) is
covered in the changelog text.

Thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Andi Kleen

  Perhaps you've got lots of patches were people are using internal APIs they 
  shouldn't?
  
 
 Maybe the issue is who can tell since what is external and what is
 internal is not explicitly defined?

Exactly.  Or rather it is not defined on the module level. We got 
static of course, but I think we should have a similar mechanism
on a module level.


 Explicitly documenting what comprises the kernel API (external,
 supported) 

It would not be fully supported either -- can still change etc. --
but there is a reasonable expectation that those external
APIs will change less often than internal interfaces.

 - forcing developers to identify their exports as part of the
 implementation or as part of the kernel API

That is EXPORT_SYMBOL already. The trouble is just that it covers
too much. My patchkit is trying to limit it again for a specific
use case -- exporting an internal interface to another module.
Or rather a set of modules. 

Standard example is TCP: TCP exports nearly everything and the
single user is the TCP code in ipv6.ko. Instead those symbols should
be limited to be only accessable to ipv6.ko. 

The reason I went with the more generic namespace mechanism
instead of EXPORT_SYMBOL_TO() is that ipv6 is ever split up
it would still work. 

Also using namespaces doesn't have any more overhead than
EXPORT_SYMBOL_TO() and the complexity is about the same
(not very much anyways -- just look at the patches) 

 - making it easier for reviewers to identify when developers are adding
 to the kernel API and thereby focusing the appropriate level of review
 to the new function

That is another reason.
 
-ANdi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Does tc-prio really work as advertised?

2007-11-27 Thread Joerg Pommnitz
Jarek,
iptables chains (this is what I think you are referring to) are not the issue. 
This
is about the qdisc that sits immediately over the device driver and decides the
order waiting packets are sent over the line/air/carrier pigeon/... .
My suspicion is that skb-priority used to be set to a value that derived from 
the
TOS bits. Then something changed and nobody noticed.
-- 
 
Regards
 
   Joerg
 


- Ursprüngliche Mail 
Von: Jarek Poplawski [EMAIL PROTECTED]
An: Jarek Poplawski [EMAIL PROTECTED]
CC: Joerg Pommnitz [EMAIL PROTECTED]; netdev@vger.kernel.org; OLSR discussion 
and development [EMAIL PROTECTED]
Gesendet: Dienstag, den 27. November 2007, 07:46:04 Uhr
Betreff: Re: Does tc-prio really work as advertised?

On 26-11-2007 23:25, Jarek Poplawski wrote:
...
 Are you doing this on the same box? I was tracing this long time ago  too, 
 and, if
 I didn't miss something, it was about the place! So, as I recall  (after 
 finding
 some old message) this TOS is considered only for packets going  through the 
 FORWARD
 chain. (But, I haven't checked this at all now, so no  complaints...)

...Too exactly! Iptables aren't needed for this, so going through
the forward. should be enough...

Jarek P.






  Machen Sie Yahoo! zu Ihrer Startseite. Los geht's: 
http://de.yahoo.com/set
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


2.6.23-mm1 tg3 wake-on-lan oddity...

2007-11-27 Thread Valdis . Kletnieks
Scenario - Dell Latitude D820 laptop, tg3 driver says this at boot:

eth0: Tigon3 [partno(BCM5752KFBG) rev 6002 PHY(5752)] (PCI Express) 
10/100/1000Base-T Ethernet 00:15:c5:c8:33:4e
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[7618] dma_mask[64-bit]

# (lspci; lspci -n) | grep 09
09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752 Gigabit 
Ethernet PCI Express (rev 02)
09:00.0 0200: 14e4:1600 (rev 02)

(I think that's most of the likely-relevant info...)

Issue:

I (for unrelated reasons) run powertop, and it suggests I conserve power
by doing 'ethtool -s eth0 wol d'.  I look at it, and think that it's daft,
because (a) the Dell factory default is WOL disabled and (b) if it wasn't
the default, I'd have *set* it to disabled, and (c) I even went back and
rebooted and checked the BIOS setting - disabled. Nonetheless:

# ethtool eth0 | grep Wake
Supports Wake-on: g
Wake-on: g

Is this expected behavior?


pgph055mRDfkl.pgp
Description: PGP signature


Re: Does tc-prio really work as advertised?

2007-11-27 Thread Jarek Poplawski
On Tue, Nov 27, 2007 at 01:28:43AM -0800, Joerg Pommnitz wrote:
 Jarek,
 iptables chains (this is what I think you are referring to) are not the issue.

Yes, but this could (wrongly) look like this according to my 1-st message.

 This
 is about the qdisc that sits immediately over the device driver and decides 
 the
 order waiting packets are sent over the line/air/carrier pigeon/... .
 My suspicion is that skb-priority used to be set to a value that derived 
 from the
 TOS bits. Then something changed and nobody noticed.

I'm not sure of your problem: did you try this on a box which
gets packets with TOS set earlier, does forwarding, and uses this
prio on egress? If so, and this doesn't work, then you are right
something could be wrong.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bonding sysfs output

2007-11-27 Thread Ferenc Wagner
Andrew Morton [EMAIL PROTECTED] writes:

 On Mon, 26 Nov 2007 09:29:40 +0100 Wagner Ferenc [EMAIL PROTECTED] wrote:

 Trailing NULs are present in each file under /sys/class/net/*/bonding
 and also in /sys/class/net/bonding_masters.  That is, in every file
 provided by drivers/net/bonding/bond_sysfs.c.  Most of the patch is
 concerned with this.
 
 Closely related is the presence of trailing spaces in multivalue
 files.  There are three such files, one of them has the trailing space
 removed.  This patch removes it from the other two.  During this it
 also renames one function argument 'buffer' to 'buf', for consistency.
 
 On the policy side: some files are not applicable to some types of
 bonds, and return a single linefeed in that case.  Except for one
 single case, which returns 'NA\n'.  The patch changes these cases into
 emtpy files.
 
 If these are worthy changes, I'm absolutely willing to split up the
 patch into three parts as the above.

 Well that would be good if poss, thanks.

Will do.  Not exactly a simple thing, as the changes collide.

 But fixing bugs is way more important than niceties of patch presentation
 however I wasn't prepared to fix the rejects which that patch is hitting in
 the considerably-changed bonding_show_ad_partner_mac().  Please:

Yes, the patch was against 2.6.23.8.  Forgot to mention. :(

 - raise patches against the latest Linus tree
 (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/)

I thought it was better to change to git.  Isn't it so?
SubmittingPatches has nothing to say about that...
Can I find collected best practices somewhere?  Which tree, which
branch, how/when to rebase, format-patch, etc...

(If given no further instructions, I'll try my best and you can
reflect on the result.  I mean, the above questions are not blocking
me, feel free not to answer.)

 - cc netdev@vger.kernel.org on networking-related matters

 - Include a Signed-off-by: as per Documentation/SubmittingPatches

 - Try to ensure that the full explanation (such as you have above) is
 covered in the changelog text.

Ok.
-- 
Thanks for your time,
Feri.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bonding sysfs output

2007-11-27 Thread Andrew Morton
On Tue, 27 Nov 2007 10:56:43 +0100 Ferenc Wagner [EMAIL PROTECTED] wrote:

  - raise patches against the latest Linus tree
  (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/)
 
 I thought it was better to change to git.  Isn't it so?

Yes, git is a bit more uptodate than the snapshots.  But if that matters
you were very unlucky.

 SubmittingPatches has nothing to say about that...
 Can I find collected best practices somewhere?  Which tree, which
 branch, how/when to rebase, format-patch, etc...

gosh.  Documentation/Submit*,
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt,
http://linux.yyz.us/patch-format.html, other places.  Probably people have
written books about it by now.  But don't sweat it - you're close enough ;)

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][UNIX] EOF on non-blocking SOCK_SEQPACKET

2007-11-27 Thread Florian Zumbiehl
Hi,

I am not absolutely sure whether this actually is a bug (as in: I've got
no clue what the standards say or what other implementations do), but at
least I was pretty surprised when I noticed that a recv() on a
non-blocking unix domain socket of type SOCK_SEQPACKET (which is connection
oriented, after all) where the remote end has closed the connection
returned -1 (EAGAIN) rather than 0 to indicate end of file.

This is a test case:

| #include sys/types.h
| #include unistd.h
| #include sys/socket.h
| #include sys/un.h
| #include fcntl.h
| #include string.h
| #include stdlib.h
| 
| int main(){
|   int sock;
|   struct sockaddr_un addr;
|   char buf[4096];
|   int pfds[2];
| 
|   pipe(pfds);
|   sock=socket(PF_UNIX,SOCK_SEQPACKET,0);
|   addr.sun_family=AF_UNIX;
|   strcpy(addr.sun_path,/tmp/foobar_testsock);
|   bind(sock,(struct sockaddr *)addr,sizeof(addr));
|   listen(sock,1);
|   if(fork()){
|   close(sock);
|   sock=socket(PF_UNIX,SOCK_SEQPACKET,0);
|   connect(sock,(struct sockaddr *)addr,sizeof(addr));
|   fcntl(sock,F_SETFL,fcntl(sock,F_GETFL)|O_NONBLOCK);
|   close(pfds[1]);
|   read(pfds[0],buf,sizeof(buf));
|   recv(sock,buf,sizeof(buf),0); // -- this one
|   }else accept(sock,NULL,NULL);
|   exit(0);
| }

If you try it, make sure /tmp/foobar_testsock doesn't exist.

The marked recv() returns -1 (EAGAIN) on 2.6.23.9. Below you find a
patch that fixes that.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a05c342..fa0aec5 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1632,8 +1632,13 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct 
socket *sock,
mutex_lock(u-readlock);
 
skb = skb_recv_datagram(sk, flags, noblock, err);
-   if (!skb)
+   if (!skb) {
+   unix_state_lock(sk);
+   if (sk-sk_type==SOCK_SEQPACKET  err==-EAGAIN  
(sk-sk_shutdownRCV_SHUTDOWN))
+   err=0; /* signal EOF on disconnected non-blocking 
SEQPACKET socket */
+   unix_state_unlock(sk);
goto out_unlock;
+   }
 
wake_up_interruptible(u-peer_wait);
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Andi Kleen
On Tue, Nov 27, 2007 at 03:26:52PM +1100, Rusty Russell wrote:
 On Monday 26 November 2007 16:58:08 Roland Dreier wrote:
 I agree that we shouldn't make things too hard for out-of-tree
 modules, but I disagree with your first statement: there clearly is a
 large class of symbols that are used by multiple modules but which are
 not generically useful -- they are only useful by a certain small
 class of modules.
   
If it is so clear, you should be able to easily provide examples?
 
  Sure -- Andi's example of symbols required only by TCP congestion
  modules;
 
 Exactly.  Why exactly should someone not write a new TCP congestion module?

Agreed the congestion modules are a corner case. I even mentioned that in the
patch. I would be happy to drop that one if that is the consensus.
It was more done as a example anyways. That is why I made it an separate
namespace from tcp

But for many other TCP symbols it makes a lot of sense: all the functions
only used by tcp_ipv6.c. If someone wants to write support for a IPv7 or
similar they really should do it in tree. So I think the tcp and  inet
namespaces make a lot of sense.

 Right.  So presumably there will only ever be two drivers using this core 
 code, so no new users will ever be written?  Now we've found one use case, is 

If there are new users they will need to get proper review and should
be in tree.  MODULE_ALLOW() enforces that.

 it worth the complexity of namespaces?  Is it worth the halfway point of 

What complexity? You're always claiming it is complex. It isn't really.

 export-to-module?
 

 No, we've seen the solution and various people applying it.  I'm still trying 
 to discover the problem it's solving.

Again for rusty @)

Goals are:
- Limit the interfaces available for out of tree modules to reasonably 
stable ones that are already used by a larger set of drivers.

This can also have further downstream advantages.
For example it might be a reasonable future rule to require all unconditionally
EXPORT_SYMBOL()s to have a complete LinuxDoc documentation entry.

- Explicitely declare in source what is clearly internal and not intended to
be a generally usable interface.

e.g. for the LinuxDoc example above such internal functions don't necessarily
need full LinuxDoc documentation.

- Force review from core maintainers for use of such internal interfaces

- Limit size of exported API to make stable ABIs for enterprise
distributions easier
[Yes I know that is not a popular topic on l-k, but it's a day-to-day
problem for these distros and out of tree solutions do not work]

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


AW: Does tc-prio really work as advertised?

2007-11-27 Thread Joerg Pommnitz
Jarek,
this is all about outgoing packets, e.g. egress to use your word.
It doesn't matter whether the packets are originated locally or
whether the packets are forwarded from another host (I tried
both).

To restate the problem: according to my observations the prio qdisc
(and probably pfifo_fast, but I couldn't observe this) does not prioritize
at all and always uses the band indicated by the first entry in the
priomap.

By default the priomap looks like this:
qdisc prio 1: dev eth1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through
the different bands according to the TOS value of the packets. My observation
is, that the traffic always uses the band pointed to by the first entry in the
priomap. This value is 1 by default, so all traffic goes through band 1:2.

Now it's entirely possible that I did something stupid, but nobody came forward
to show me the error of my ways (neither here nor on the lartc list).

-- Regards
 
   Joerg

- Ursprüngliche Mail 
Von: Jarek Poplawski [EMAIL PROTECTED]
An: Joerg Pommnitz [EMAIL PROTECTED]
CC: netdev@vger.kernel.org
Gesendet: Dienstag, den 27. November 2007, 10:58:38 Uhr
Betreff: Re: Does tc-prio really work as advertised?

On Tue, Nov 27, 2007 at 01:28:43AM -0800, Joerg Pommnitz wrote:
 Jarek,
 iptables chains (this is what I think you are referring to) are not  the 
 issue.

Yes, but this could (wrongly) look like this according to my 1-st  message.

 This
 is about the qdisc that sits immediately over the device driver and  decides 
 the
 order waiting packets are sent over the line/air/carrier pigeon/... .
 My suspicion is that skb-priority used to be set to a value that  derived 
 from the
 TOS bits. Then something changed and nobody noticed.

I'm not sure of your problem: did you try this on a box which
gets packets with TOS set earlier, does forwarding, and uses this
prio on egress? If so, and this doesn't work, then you are right
something could be wrong.

Regards,
Jarek P.






__  Ihr erstes Baby? Holen Sie sich 
Tipps von anderen Eltern.  www.yahoo.de/clever
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module

2007-11-27 Thread TAKANO Ryousei
Hi Patrick,

I appreciate your comments.
I will update and resend patches.

  +typedef long long gapclock_t;
 
 It seems you only add to this, does it need to be signed?
 How about using a fixed size type (u64) and getting rid
 of the typedef?
 
OK. I will use u64 instead of gapclock_t, and remove the typedef.

  +struct tc_psp_qopt
  +{
  +   __u32   defcls;
  +   __u32   rate;
  +};
 
 
 What unit is rate measured in?
 
'rate' is the transmission rate in bytes per sec.

  +struct tc_psp_xstats
  +{
  +   __u32   bytes;  /* gap packet statistics */
  +   __u32   packets;
  +};
 
 How about using gnet_stats_basic for this?
 
OK. I will use gnet_stats_basic.

  +   if (!skb)
  +   return NULL;
  +
  +   memset(skb-data, 0xff, size);
 
 memsetting the header portion seems unnecessary since you overwrite
 it again directly below.
 
The size of a gap packet is variable, so it fills a whole gap packet
with 0xff, where 'size' indicates the interface MTU size.

  +   skb_put(skb, size);
 
 This is usually done before putting data in the packet.
 
Therefore, skb_put() is needed.

  +static int recalc_gapsize(struct sk_buff *skb, struct Qdisc *sch)
  +{
  +   struct psp_sched_data *q = qdisc_priv(sch);
  +   struct psp_class *cl;
  +   unsigned int len = skb-len;
  +   int gapsize = 0;
  +   int err;
  +
  +   cl = psp_classify(skb, sch, err);
  +   switch (cl-mode) {
  +   case MODE_STATIC:
  +   gapsize = cl-gapsize;
  +   if (len  q-mtu) /* workaround for overflow */
  +   gapsize = (int)((long long)gapsize * len / q-mtu);
 
 No need to cast to the LHS type. Shouldn't this also be rounded up
 like the other gapsize calculation?
 
How about this?

u64 gapsize = 0;
:
case MODE_STATIC:
gapsize = (u64)cl-gapsize * len / q-mtu;
gapsize = min_t(u64, gapsize, UINT_MAX);
break;
}

  +static struct psp_class *lookup_next_class(struct Qdisc *sch, int *gapsize)
  +{
  +   struct psp_sched_data *q = qdisc_priv(sch);
  +   struct psp_class *cl, *next = NULL;
  +   gapclock_t diff;
  +   int shortest;
  +
  +   /* pacing class */
  +   shortest = q-mtu;
  +   list_for_each_entry(cl, q-pacing_list, plist) {
  +   diff = cl-clock - q-clock;
  +   if (diff  0) {
  +   if ((gapclock_t)shortest  diff)
  +   shortest = (int)diff;
 
 Is diff guaranteed not to exceed an int?
 
No. But actually, 'diff' is much smaller than INT_MAX.
How about this?

u64 diff, shortest;
:
if (q-clock  cl-clock) {
diff = cl-clock - q-clock;
if (shortest  diff)
shortest = diff;
continue;
}


  +static void psp_destroy(struct Qdisc *sch)
  +{
  +   struct psp_sched_data *q = qdisc_priv(sch);
  +
  +   tcf_destroy_chain(q-filter_list);
  +
  +   while (!list_empty(q-root))
  +   psp_destroy_class(sch, list_entry(q-root.next,
  + struct psp_class, sibling));
 
 list_for_each_entry_safe.
 
I think it works well. Should I need to use list_for_each_entry_safe?

Best regards,
Ryousei Takano
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage

2007-11-27 Thread Pavel Emelyanov
[snip]

 
 Well I clearly goofed when I added the initial network namespace support
 for /proc/net.  Currently things work but there are odd details visible
 to user space, even when we have a single network namespace.
 
 Since we do not cache proc_dir_entry dentries at the moment we can
 just modify -lookup to return a different directory inode depending
 on the network namespace of the process looking at /proc/net, replacing
 the current technique of using a magic and fragile follow_link method.
 
 To accomplish that this patch:
 - introduces a shadow_proc method to allow different dentries to
   be returned from proc_lookup.
 - Removes the old /proc/net follow_link magic
 - Fixes a weakness in our not caching of proc generic dentries.
 
 As shadow_proc uses a task struct to decided which dentry to return we
 can go back later and fix the proc generic caching without modifying any code 
 that
 uses the shadow_proc method.
 
 Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]

Thanks, Eric. 

Much better ('find /proc' works and so does 'ls ..'), but one 
issue is still unsolved :(

I mentioned the program, that opens the directory and dumps the
content of the /proc/self/fd. Here it is (stupid but simple):

 prog.c
#include stdio.h
#include unistd.h
#include stdlib.h
#include asm/fcntl.h
#include unistd.h

int main(int argc, char **argv)
{
int fd;

fd = open(argv[1], O_RDONLY|O_DIRECTORY);
if (fd == -1) {
perror(Can't open);
return 1;
}

system(ls -l /proc/self/fd);
return 0;
}


So. Here's the result of running this program:

# cd /proc/net/
# pwd
/proc/net
# ~/a.out .
total 0
lrwx--  1 root root 64 Nov 27 13:27 0 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:27 1 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:27 2 - /dev/pts/0
lr-x--  1 root root 64 Nov 27 13:27 3 - /proc/net (deleted)
lr-x--  1 root root 64 Nov 27 13:27 4 - /proc/4475/fd

# cd /proc
# pwd 
/proc
# ~/a.out net
total 0
lrwx--  1 root root 64 Nov 27 13:27 0 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:27 1 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:27 2 - /dev/pts/0
lr-x--  1 root root 64 Nov 27 13:27 3 - /proc/net
lr-x--  1 root root 64 Nov 27 13:27 4 - /proc/4477/fd

# cd /proc/net/stat
# pwd
/proc/net/stat
# ~/a.out ..
total 0
lrwx--  1 root root 64 Nov 27 13:29 0 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:29 1 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:29 2 - /dev/pts/0
lr-x--  1 root root 64 Nov 27 13:29 3 - /proc/net (deleted)
lr-x--  1 root root 64 Nov 27 13:29 4 - /proc/4482/fd
# ~/a.out .
total 0
lrwx--  1 root root 64 Nov 27 13:32 0 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:32 1 - /dev/pts/0
lrwx--  1 root root 64 Nov 27 13:32 2 - /dev/pts/0
lr-x--  1 root root 64 Nov 27 13:32 3 - /proc/net/stat
lr-x--  1 root root 64 Nov 27 13:32 4 - /proc/4488/fd

Bad thing is that . when cdir is /proc/net and .. when cdir is
anything under /proc/net (i.e. the /proc/net itself) is marked as (deleted).

[snip]

Thanks,
Pavel
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property

2007-11-27 Thread Anton Vorontsov
On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote:
 On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote:
  fixed-link says: register new Fixed/emulated PHY, i.e. PHY that
  not connected to the real MDIO bus.
  
  Signed-off-by: Vitaly Bordug [EMAIL PROTECTED]
  Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
  
  ---
  
   Documentation/powerpc/booting-without-of.txt |3 +
   arch/powerpc/sysdev/fsl_soc.c|   56 
  ++
   2 files changed, 42 insertions(+), 17 deletions(-)
  
  
  diff --git a/Documentation/powerpc/booting-without-of.txt 
  b/Documentation/powerpc/booting-without-of.txt
  index e9a3cb1..cf25070 100644
  --- a/Documentation/powerpc/booting-without-of.txt
  +++ b/Documentation/powerpc/booting-without-of.txt
  @@ -1254,6 +1254,9 @@ platforms are moved over to use the 
  flattened-device-tree model.
 services interrupts for this device.
   - phy-handle : The phandle for the PHY connected to this ethernet
 controller.
  +- fixed-link : a b c where a is emulated phy id - choose any,
  +  but unique to the all specified fixed-links, b is duplex - 0 half,
  +  1 full, c is link speed - d#10/d#100/d#1000.
 
 Good work!
 May I suggest adding a d to a b c where d is flow control - 0 no, 1 yes

Well, I see no reference of the flow neither in the include/linux/mii.h
nor in the drivers/net/phy/*. :-/ Thus today there is no such register
bit we can emulate?..

 flow control or not just popped up here today so I got a use for it.

..and I looked into few drivers (gianfar, ucc), and they seem to use
hard-coded flow control values, thus they don't speak to the PHYs
about that.

Can you give any reference to the driver that needs that parameter?
Does it use PHY subsystem to obtain flow-control on/off information?

Thanks,

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Does tc-prio really work as advertised?

2007-11-27 Thread Jarek Poplawski
On Tue, Nov 27, 2007 at 02:54:10AM -0800, Joerg Pommnitz wrote:
 Jarek,
 this is all about outgoing packets, e.g. egress to use your word.
 It doesn't matter whether the packets are originated locally or
 whether the packets are forwarded from another host (I tried
 both).
 
 To restate the problem: according to my observations the prio qdisc
 (and probably pfifo_fast, but I couldn't observe this) does not prioritize
 at all and always uses the band indicated by the first entry in the
 priomap.
 
 By default the priomap looks like this:
 qdisc prio 1: dev eth1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 
 there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through
 the different bands according to the TOS value of the packets. My observation
 is, that the traffic always uses the band pointed to by the first entry in the
 priomap. This value is 1 by default, so all traffic goes through band 1:2.
 
 Now it's entirely possible that I did something stupid, but nobody came 
 forward
 to show me the error of my ways (neither here nor on the lartc list).
 

I don't think there could be anything stupid if something is maybe not
enough documented. But, this really should work - just like you've
found: TOS should be recalculated to skb-priority, and this should
affect prio. You should only consider that this recalculation isn't
done for all packets, but only forwarded ones (if I can remember, didn't
miss something, and nothing changed later...). So, are you still sure
you've tested such a case? (Btw., there are some other tools which can
change this priority field, so I hope you don't use them too.)

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module

2007-11-27 Thread Patrick McHardy

TAKANO Ryousei wrote:

Hi Patrick,


+struct tc_psp_qopt
+{
+   __u32   defcls;
+   __u32   rate;
+};


What unit is rate measured in?


'rate' is the transmission rate in bytes per sec.



So wouldn't it make sense to use u64 then for 10GBit networks?


+   skb_put(skb, size);

This is usually done before putting data in the packet.


Therefore, skb_put() is needed.



I meant this is usually done before writing to the packet data,
so you should move it up a few lines.


+static int recalc_gapsize(struct sk_buff *skb, struct Qdisc *sch)
+{
+   struct psp_sched_data *q = qdisc_priv(sch);
+   struct psp_class *cl;
+   unsigned int len = skb-len;
+   int gapsize = 0;
+   int err;
+
+   cl = psp_classify(skb, sch, err);
+   switch (cl-mode) {
+   case MODE_STATIC:
+   gapsize = cl-gapsize;
+   if (len  q-mtu) /* workaround for overflow */
+   gapsize = (int)((long long)gapsize * len / q-mtu);

No need to cast to the LHS type. Shouldn't this also be rounded up
like the other gapsize calculation?


How about this?

u64 gapsize = 0;
:
case MODE_STATIC:
gapsize = (u64)cl-gapsize * len / q-mtu;
gapsize = min_t(u64, gapsize, UINT_MAX);



Really a minimum of UINT_MAX? This will probably also break on 32 bit
unless you use do_div().


break;
}


+static struct psp_class *lookup_next_class(struct Qdisc *sch, int *gapsize)
+{
+   struct psp_sched_data *q = qdisc_priv(sch);
+   struct psp_class *cl, *next = NULL;
+   gapclock_t diff;
+   int shortest;
+
+   /* pacing class */
+   shortest = q-mtu;
+   list_for_each_entry(cl, q-pacing_list, plist) {
+   diff = cl-clock - q-clock;
+   if (diff  0) {
+   if ((gapclock_t)shortest  diff)
+   shortest = (int)diff;

Is diff guaranteed not to exceed an int?


No. But actually, 'diff' is much smaller than INT_MAX.
How about this?

u64 diff, shortest;
:
if (q-clock  cl-clock) {
diff = cl-clock - q-clock;
if (shortest  diff)
shortest = diff;
continue;
}



I don't know, its your qdisc :) I was merely wondering whether
you need larger types since there seems to be some inconsisteny.
Your old code had a check for diff  0, so it appears that
cl-clock could be smaller than q-clock.


+   while (!list_empty(q-root))
+   psp_destroy_class(sch, list_entry(q-root.next,
+ struct psp_class, sibling));

list_for_each_entry_safe.


I think it works well. Should I need to use list_for_each_entry_safe?



I don't doubt that it works, but list_for_each_entry_safe is
the proper interface for this.

One more thing: your qdisc can only be used as root qdisc since it
produces packets itself and thereby violates the rule that a qdisc
can only hand out packets that were previously enqueued to it.
Using it as a leaf qdisc can make the upper qdiscs qlen counter
go negative, causing infinite dequeue-loops, so you should make
sure that its only possibly to use as root qdisc by checking the
parent. It would also be better to do something like netem
(enqueue produced packets at the root) to make sure the qlen
counter is always accurate.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Does tc-prio really work as advertised?

2007-11-27 Thread Patrick McHardy

Jarek Poplawski wrote:

On Tue, Nov 27, 2007 at 02:54:10AM -0800, Joerg Pommnitz wrote:

Jarek,
this is all about outgoing packets, e.g. egress to use your word.
It doesn't matter whether the packets are originated locally or
whether the packets are forwarded from another host (I tried
both).

To restate the problem: according to my observations the prio qdisc
(and probably pfifo_fast, but I couldn't observe this) does not prioritize
at all and always uses the band indicated by the first entry in the
priomap.

By default the priomap looks like this:
qdisc prio 1: dev eth1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

there are 3 bands (1:1, 1:2 and 1:3). In theory the traffic should go through
the different bands according to the TOS value of the packets. My observation
is, that the traffic always uses the band pointed to by the first entry in the
priomap. This value is 1 by default, so all traffic goes through band 1:2.

Now it's entirely possible that I did something stupid, but nobody came forward
to show me the error of my ways (neither here nor on the lartc list).



I don't think there could be anything stupid if something is maybe not
enough documented. But, this really should work - just like you've
found: TOS should be recalculated to skb-priority, and this should
affect prio. You should only consider that this recalculation isn't
done for all packets, but only forwarded ones (if I can remember, didn't
miss something, and nothing changed later...). So, are you still sure
you've tested such a case? (Btw., there are some other tools which can
change this priority field, so I hope you don't use them too.)



It works fine here, I'm guessing that Jörg is using an old kernel
version that had a bug in prio classification without filters.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage

2007-11-27 Thread Eric W. Biederman
Pavel Emelyanov [EMAIL PROTECTED] writes:

 [snip]

 
 Well I clearly goofed when I added the initial network namespace support
 for /proc/net.  Currently things work but there are odd details visible
 to user space, even when we have a single network namespace.
 
 Since we do not cache proc_dir_entry dentries at the moment we can
 just modify -lookup to return a different directory inode depending
 on the network namespace of the process looking at /proc/net, replacing
 the current technique of using a magic and fragile follow_link method.
 
 To accomplish that this patch:
 - introduces a shadow_proc method to allow different dentries to
   be returned from proc_lookup.
 - Removes the old /proc/net follow_link magic
 - Fixes a weakness in our not caching of proc generic dentries.
 
 As shadow_proc uses a task struct to decided which dentry to return we
 can go back later and fix the proc generic caching without modifying any code
 that
 uses the shadow_proc method.
 
 Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]

 Thanks, Eric. 

 Much better ('find /proc' works and so does 'ls ..'), but one 
 issue is still unsolved :(

 I mentioned the program, that opens the directory and dumps the
 content of the /proc/self/fd. Here it is (stupid but simple):

The cause is totally different this time, and is not limited
to /proc/net.  But this is the only side effect of the changes now.

I used the one line version of your test program to confirm this:

 $ cd /proc/driver/

 $ ls -l /proc/self/fd/100 100 .

 lr-x-- 1 eric eric 64 Nov 27 05:06 /proc/self/fd/100 - /proc/driver/

 $  ls -l /proc/self/fd/100 100 .

 lr-x-- 1 eric eric 64 Nov 27 05:07 /proc/self/fd/100 - /proc/driver 
(deleted)

What is happening is that the aggressive non-caching logic is dropping
the dentries and thus they show up as deleted.  I.e. When something
triggers another lookup of that dentry revalidate drops the old
dentry.

This actually fixes a race in /proc today where if you open a file,
remove the module for it, reload the module for that file, and then
attempt to access it, lookup will return the old dentry with the
old fops (even if there is not a current version of that file).
kill_proc_inodes current keeps us from using the old version of
the file_operations for directories (because it removes them) but
it does not prevent this DOS attack where keeping a proc file open
always ensures everyone will see the old version.

Given that the behavior seems more correct then what we have currently
I can live with files showing up as deleted from time to time.

Ultimately the fix is to correct the caching logic in
fs/proc/generic.c to only drop dentries when necessary and then
the deleted markers will only show up when the file or directory
really goes away.

I don't expect user space will care about the semi-legitimate
(deleted) marks.  So I don't think this one down side will be a big
deal for 2.6.24.

Eric




-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


AW: Does tc-prio really work as advertised?

2007-11-27 Thread Joerg Pommnitz
  It works fine here, I'm guessing that Jörg is using an old kernel
  version that had a bug in prio classification without filters.

This is 2.6.20.21, from 17-Oct-2007. 
 
--  
Regards
 
   Joerg




   __  Ihre erste Baustelle? Wissenswertes 
für Bastler und Hobby Handwerker. www.yahoo.de/clever
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AW: Does tc-prio really work as advertised?

2007-11-27 Thread Patrick McHardy

Joerg Pommnitz wrote:

  It works fine here, I'm guessing that Jörg is using an old kernel
  version that had a bug in prio classification without filters.

This is 2.6.20.21, from 17-Oct-2007. 



Yes, that version is broken. I think it was fixed in 2.6.22 or 2.6.23.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


AW: Does tc-prio really work as advertised?

2007-11-27 Thread Joerg Pommnitz
 So, are you still sure you've tested such a case?

Well, the problem that triggered my investigation was
that the OLSR daemon (www.olsr.org) calculates the quality
of a link according to the packet loss for LQ HELLO packets
(UDP broadcast packets). To prevent other traffic from
interfering with the LQ calculation, olsrd sends the HELLO
packets with a TOS value of 0x10 (minimize delay). This
should give them the highest priority.

What I saw was a degrading Link quality with more user traffic
over a link. The LQ fell so far that olsrd judged the other host
unreachable and deleted the routing entry. The user traffic in
question was iperf (TOS value 0x00).

The OLSR traffic was obviously generated locally (not forwarded).
You claim, that the TOS value for locally generated traffic does
not influence its priority?

Now I THINK that I did my tests both, for forwarded and for
local traffic, but I'll redo my tests to make sure.

 
--  
Regards and thanks for taking an interest

 
   Joerg




__  Ihr erstes Baby? Holen Sie sich 
Tipps von anderen Eltern.  www.yahoo.de/clever
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] sungem: fix napi regression with reset work

2007-11-27 Thread Johannes Berg
sungem's gem_reset_task() will unconditionally try to disable NAPI even
when it's called while the interface is not operating and hence the NAPI
struct isn't enabled. Make napi_disable() depend on gp-running.

Also removes a superfluous test of gp-running in the same function.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
---
Not exactly sure how I got there but I did get stuck in there (sysrq-W)
when I plugged in the network cable and booted the computer on the other
end of it... while the interface was down. I would've thought that the
driver wouldn't even notice the fact that I plugged in the cable while
the interface is down, but maybe it does. Or it sets up the device
wrongly at resume time when the interface was down at suspend.

In any case, this should fix the issue.

 drivers/net/sungem.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

--- everything.orig/drivers/net/sungem.c2007-11-26 20:48:11.489044815 
+0100
+++ everything/drivers/net/sungem.c 2007-11-26 21:01:28.519045736 +0100
@@ -2279,34 +2279,33 @@ static void gem_reset_task(struct work_s
 {
struct gem *gp = container_of(work, struct gem, reset_task);
 
mutex_lock(gp-pm_mutex);
 
-   napi_disable(gp-napi);
+   if (gp-opened)
+   napi_disable(gp-napi);
 
spin_lock_irq(gp-lock);
spin_lock(gp-tx_lock);
 
-   if (gp-running == 0)
-   goto not_running;
-
if (gp-running) {
netif_stop_queue(gp-dev);
 
/* Reset the chip  rings */
gem_reinit_chip(gp);
if (gp-lstate == link_up)
gem_set_link_modes(gp);
netif_wake_queue(gp-dev);
}
- not_running:
+
gp-reset_task_pending = 0;
 
spin_unlock(gp-tx_lock);
spin_unlock_irq(gp-lock);
 
-   napi_enable(gp-napi);
+   if (gp-opened)
+   napi_enable(gp-napi);
 
mutex_unlock(gp-pm_mutex);
 }
 
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Does tc-prio really work as advertised?

2007-11-27 Thread Joerg Pommnitz
Great :-(. I went over the ChangeLogs for later kernel versions, but
couldn't find anything.

I'll try to come up with a working .config for the most current kernel
that works on my hardware and test again.

Do you have a pointer to the fix so that I could try to patch a 2.6.20
kernel?
 
--  
Regards
 
   Joerg

- Ursprüngliche Mail 
Von: Patrick McHardy [EMAIL PROTECTED]
An: Joerg Pommnitz [EMAIL PROTECTED]
CC: Jarek Poplawski [EMAIL PROTECTED]; netdev@vger.kernel.org
Gesendet: Dienstag, den 27. November 2007, 13:54:11 Uhr
Betreff: Re: AW: Does tc-prio really work as advertised?

Joerg Pommnitz wrote:
   It works fine here, I'm guessing that Jörg is using an old kernel
   version that had a bug in prio classification without filters.
 
 This is 2.6.20.21, from 17-Oct-2007. 


Yes, that version is broken. I think it was fixed in 2.6.22 or 2.6.23.






  Jetzt Mails schnell in einem Vorschaufenster überfliegen. Dies und viel 
mehr bietet das neue Yahoo! Mail - www.yahoo.de/mail
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 2/3] Interface group: core (netlink) part

2007-11-27 Thread Patrick McHardy

Laszlo Attila Toth wrote:

Interface groups let handle different interfaces together.
Modified net device structure and netlink interface.



Looks good.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 3/3] Netfilter Interface group match

2007-11-27 Thread Patrick McHardy

Laszlo Attila Toth wrote:

Interface group values can be checked on both input and output interfaces.



Needs a minor update to comply with the naming scheme Jan introduced,
but I can take care of that once the other patches are merged.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix inet_diag.ko register vs rcv race

2007-11-27 Thread Pavel Emelyanov
The following race is possible when one cpu unregisters the handler
while other one is trying to receive a message and call this one:

CPU1: CPU2:
inet_diag_rcv()   inet_diag_unregister()
  mutex_lock(inet_diag_mutex);
  netlink_rcv_skb(skb, inet_diag_rcv_msg);
if (inet_diag_table[nlh-nlmsg_type] == 
   NULL) /* false handler is still registered */
...
netlink_dump_start(idiagnl, skb, nlh,
   inet_diag_dump, NULL);
   cb = kzalloc(sizeof(*cb), GFP_KERNEL);
   /* sleep here freeing memory 
* or preempt
* or sleep later on nlk-cb_mutex
*/
 
spin_lock(inet_diag_register_lock);
 inet_diag_table[type] 
= NULL;
...  
spin_unlock(inet_diag_register_lock);
 synchronize_rcu();
 /* CPU1 is sleeping - 
RCU quiescent
  * state is passed
  */
 return;
/* inet_diag_dump is finally called: */
inet_diag_dump()
  handler = inet_diag_table[cb-nlh-nlmsg_type];
  BUG_ON(handler == NULL); 
  /* OOPS! While we slept the unregister has set
   * handler to NULL :(
   */

Grep showed, that the register/unregister functions are called
from init/fini module callbacks for tcp_/dccp_diag, so it's OK
to use the inet_diag_mutex to synchronize manipulations with the
inet_diag_table and the access to it.

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index b017073..5fe32d5 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -853,8 +853,6 @@ static void inet_diag_rcv(struct sk_buff *skb)
mutex_unlock(inet_diag_mutex);
 }
 
-static DEFINE_SPINLOCK(inet_diag_register_lock);
-
 int inet_diag_register(const struct inet_diag_handler *h)
 {
const __u16 type = h-idiag_type;
@@ -863,13 +861,13 @@ int inet_diag_register(const struct inet_diag_handler *h)
if (type = INET_DIAG_GETSOCK_MAX)
goto out;
 
-   spin_lock(inet_diag_register_lock);
+   mutex_lock(inet_diag_mutex);
err = -EEXIST;
if (inet_diag_table[type] == NULL) {
inet_diag_table[type] = h;
err = 0;
}
-   spin_unlock(inet_diag_register_lock);
+   mutex_unlock(inet_diag_mutex);
 out:
return err;
 }
@@ -882,11 +880,9 @@ void inet_diag_unregister(const struct inet_diag_handler 
*h)
if (type = INET_DIAG_GETSOCK_MAX)
return;
 
-   spin_lock(inet_diag_register_lock);
+   mutex_lock(inet_diag_mutex);
inet_diag_table[type] = NULL;
-   spin_unlock(inet_diag_register_lock);
-
-   synchronize_rcu();
+   mutex_unlock(inet_diag_mutex);
 }
 EXPORT_SYMBOL_GPL(inet_diag_unregister);
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property

2007-11-27 Thread Joakim Tjernlund

On Tue, 2007-11-27 at 14:39 +0300, Anton Vorontsov wrote:
 On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote:
  On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote:
   fixed-link says: register new Fixed/emulated PHY, i.e. PHY that
   not connected to the real MDIO bus.
   
   Signed-off-by: Vitaly Bordug [EMAIL PROTECTED]
   Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
   
   ---
   
Documentation/powerpc/booting-without-of.txt |3 +
arch/powerpc/sysdev/fsl_soc.c|   56 
   ++
2 files changed, 42 insertions(+), 17 deletions(-)
   
   
   diff --git a/Documentation/powerpc/booting-without-of.txt 
   b/Documentation/powerpc/booting-without-of.txt
   index e9a3cb1..cf25070 100644
   --- a/Documentation/powerpc/booting-without-of.txt
   +++ b/Documentation/powerpc/booting-without-of.txt
   @@ -1254,6 +1254,9 @@ platforms are moved over to use the 
   flattened-device-tree model.
  services interrupts for this device.
- phy-handle : The phandle for the PHY connected to this ethernet
  controller.
   +- fixed-link : a b c where a is emulated phy id - choose any,
   +  but unique to the all specified fixed-links, b is duplex - 0 half,
   +  1 full, c is link speed - d#10/d#100/d#1000.
  
  Good work!
  May I suggest adding a d to a b c where d is flow control - 0 no, 1 yes
 
 Well, I see no reference of the flow neither in the include/linux/mii.h
 nor in the drivers/net/phy/*. :-/ Thus today there is no such register
 bit we can emulate?..

Well, as good as I can recall, flow control(pause) is something that the
PHY negotiates, just like FDX/HDX and should be dealt with in the
adjust_link callback but not many do currently.

If you seach for pause in phy_device.c you will find it.

 
  flow control or not just popped up here today so I got a use for it.
 
 ..and I looked into few drivers (gianfar, ucc), and they seem to use
 hard-coded flow control values, thus they don't speak to the PHYs
 about that.

Yes, but they should.

 
 Can you give any reference to the driver that needs that parameter?
 Does it use PHY subsystem to obtain flow-control on/off information?
 
 Thanks,
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] bridging: don't forward EAPOL frames

2007-11-27 Thread Johannes Berg

  +   if (unlikely(skb-protocol = htons(ETH_P_PAE)))
  +   goto drop;
  +
  switch (p-state) {
  case BR_STATE_FORWARDING:
 
 Not needed because the bridge is already handling it:
 
 1) If running STP (ie true bridge), then all link local multicast is only 
 received by
 the bridge and never forwarded.

Well, typical access point setups bridge the wireless AP interface with
wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and
we want to avoid having them unicast to another host.

Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather
than depending on the link-local multicast address because otherwise
eapol frames can be unicast into the network behind the (authorized)
port which is undesirable.

johannes


signature.asc
Description: This is a digitally signed message part


Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property

2007-11-27 Thread Anton Vorontsov
On Tue, Nov 27, 2007 at 02:17:11PM +0100, Joakim Tjernlund wrote:
 
 On Tue, 2007-11-27 at 14:39 +0300, Anton Vorontsov wrote:
  On Mon, Nov 26, 2007 at 04:04:08PM +0100, Joakim Tjernlund wrote:
   On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote:
fixed-link says: register new Fixed/emulated PHY, i.e. PHY that
not connected to the real MDIO bus.

Signed-off-by: Vitaly Bordug [EMAIL PROTECTED]
Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]

---

 Documentation/powerpc/booting-without-of.txt |3 +
 arch/powerpc/sysdev/fsl_soc.c|   56 
++
 2 files changed, 42 insertions(+), 17 deletions(-)


diff --git a/Documentation/powerpc/booting-without-of.txt 
b/Documentation/powerpc/booting-without-of.txt
index e9a3cb1..cf25070 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1254,6 +1254,9 @@ platforms are moved over to use the 
flattened-device-tree model.
   services interrupts for this device.
 - phy-handle : The phandle for the PHY connected to this ethernet
   controller.
+- fixed-link : a b c where a is emulated phy id - choose any,
+  but unique to the all specified fixed-links, b is duplex - 0 
half,
+  1 full, c is link speed - d#10/d#100/d#1000.
   
   Good work!
   May I suggest adding a d to a b c where d is flow control - 0 no, 1 
   yes
  
  Well, I see no reference of the flow neither in the include/linux/mii.h
  nor in the drivers/net/phy/*. :-/ Thus today there is no such register
  bit we can emulate?..
 
 Well, as good as I can recall, flow control(pause) is something that the
 PHY negotiates, just like FDX/HDX and should be dealt with in the
 adjust_link callback but not many do currently.
 
 If you seach for pause in phy_device.c you will find it.

Ah, pause. Sure, we can emulate that.

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Herbert Xu
Andi Kleen [EMAIL PROTECTED] wrote:
 On Tue, Nov 27, 2007 at 03:26:52PM +1100, Rusty Russell wrote:
 On Monday 26 November 2007 16:58:08 Roland Dreier wrote:
 I agree that we shouldn't make things too hard for out-of-tree
 modules, but I disagree with your first statement: there clearly is a
 large class of symbols that are used by multiple modules but which are
 not generically useful -- they are only useful by a certain small
 class of modules.
   
If it is so clear, you should be able to easily provide examples?
 
  Sure -- Andi's example of symbols required only by TCP congestion
  modules;
 
 Exactly.  Why exactly should someone not write a new TCP congestion module?
 
 Agreed the congestion modules are a corner case. I even mentioned that in the
 patch. I would be happy to drop that one if that is the consensus.
 It was more done as a example anyways. That is why I made it an separate
 namespace from tcp
 
 But for many other TCP symbols it makes a lot of sense: all the functions
 only used by tcp_ipv6.c. If someone wants to write support for a IPv7 or
 similar they really should do it in tree. So I think the tcp and  inet
 namespaces make a lot of sense.

OK, short of making IPv4 a module (which would be a worthy task :)
do you have an example where a symbol is used by more than one module
but needs to be put into a namespace?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][BRIDGE] Lost call to br_fdb_fini() in br_init() error path

2007-11-27 Thread Pavel Emelyanov
In case the br_netfilter_init() (or any subsequent call) 
fails, the br_fdb_fini() must be called to free the allocated
in br_fdb_init() br_fdb_cache kmem cache.

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---

diff --git a/net/bridge/br.c b/net/bridge/br.c
index 93867bb..a901828 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -39,7 +39,7 @@ static int __init br_init(void)
 
err = br_fdb_init();
if (err)
-   goto err_out1;
+   goto err_out;
 
err = br_netfilter_init();
if (err)
@@ -65,6 +65,8 @@ err_out3:
 err_out2:
br_netfilter_fini();
 err_out1:
+   br_fdb_fini();
+err_out:
llc_sap_put(br_stp_sap);
return err;
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc

2007-11-27 Thread chas williams - CONTRACTOR
In message [EMAIL PROTECTED],Rober
t P. J. Day writes:
  in any event, i just thought i'd point it out.  if you're absolutely
  sure there will never be another call to setup_dev() from somewhere
  else, then, yes, it's safe.

 I understood your opinions. and partially agree with you.
 But isn't it a unfounded fear?

i don't know, i just thought i'd mention it.  if no one thinks it's an
issue, it's certainly fine with me.

its very unlikely that setup_dev() is likely to be called from another
code path.  this patch looks fine to me.  i will take it and get it
submitted on the next merge.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Herbert Xu
On Tue, Nov 27, 2007 at 03:12:42PM +0100, Andi Kleen wrote:

 For Networking: e.g. symbols i put into inet, which are only
 used by protocols (sctp, dccp, udplite, ipv6)

Wait, that's exactly Rusty's point (I think :)

These symbols are exported because they're needed by protocols.
If they weren't available to everyone then it would be difficult
to start writing new protocols.

 I already caught someone doing something wrong with that BTW -- 
 wanrouter clearly does some things it shouldn't be doing. 

Can you be more precise?

 Or the fib namespace, where all the fib functions should be only
 used by the two fib_* modules and ipv6/decnet.

Again, if it's used by decnet then it sounds like it should be
exported because new protocol families may need them.

So based on the network code at least I'm kind of starting to
agree with Rusty now: if a symbol is needed by more than one
in-tree module chances are we want it to be exported for all.

Although I admit I haven't examined your examples elsewhere.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH]: fix lro_gen_skb() alignment

2007-11-27 Thread Andrew Gallatin

The inet_lro.c:lro_gen_skb() function fails to include
NET_IP_ALIGN padding at the front of the sk_buffs it creates,
leading to alignment warnings on architectures which require
strict alignment (seen on sparc64).  The attached patch
adds NET_IP_ALIGN padding.

Signed off by: Andrew Gallatin [EMAIL PROTECTED]

Drew

diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c
index ac3b1d3..91e9371 100644
--- a/net/ipv4/inet_lro.c
+++ b/net/ipv4/inet_lro.c
@@ -401,10 +401,11 @@ static struct sk_buff *lro_gen_skb(struc
int data_len = len;
int hdr_len = min(len, hlen);
 
-   skb = netdev_alloc_skb(lro_mgr-dev, hlen);
+   skb = netdev_alloc_skb(lro_mgr-dev, hlen + NET_IP_ALIGN);
if (!skb)
return NULL;
 
+   skb_reserve(skb, NET_IP_ALIGN);
skb-len = len;
skb-data_len = len - hdr_len;
skb-truesize += true_size;


Re: [Bridge] Re: [RFC] bridging: don't forward EAPOL frames

2007-11-27 Thread Andy Gospodarek
On Nov 27, 2007 8:24 AM, Johannes Berg [EMAIL PROTECTED] wrote:

   +   if (unlikely(skb-protocol = htons(ETH_P_PAE)))
   +   goto drop;
   +
   switch (p-state) {
   case BR_STATE_FORWARDING:
 
  Not needed because the bridge is already handling it:
 
  1) If running STP (ie true bridge), then all link local multicast is only 
  received by
  the bridge and never forwarded.

 Well, typical access point setups bridge the wireless AP interface with
 wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and
 we want to avoid having them unicast to another host.

 Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather
 than depending on the link-local multicast address because otherwise
 eapol frames can be unicast into the network behind the (authorized)
 port which is undesirable.

 johannes



I agree with Stephen, that based on the way it's likely people use
linux bridges it seems like this is something that could be configured
rather than simply dropping those frames without any chance to forward
them.  There are probably quite a few people out there who will not
expect this change, so it should be easy to change during runtime.

Don't forget that a simple ebtables rule could also drop EAPOL if needed.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bridge] Re: [RFC] bridging: don't forward EAPOL frames

2007-11-27 Thread Johannes Berg

   Not needed because the bridge is already handling it:
  
   1) If running STP (ie true bridge), then all link local multicast is only 
   received by
   the bridge and never forwarded.
 
  Well, typical access point setups bridge the wireless AP interface with
  wired, EAPOL frames can be unicast (and 802.11 specifies to do so) and
  we want to avoid having them unicast to another host.
 
  Also, 802.1X in C.3.3 recommends not bridging the *ethertype* rather
  than depending on the link-local multicast address because otherwise
  eapol frames can be unicast into the network behind the (authorized)
  port which is undesirable.

 I agree with Stephen, that based on the way it's likely people use
 linux bridges it seems like this is something that could be configured
 rather than simply dropping those frames without any chance to forward
 them.

Well Stephen is wrong in one thing that eapol need not be link local
multicast for 802.11, it's unicast there so the dropping of link local
packets doesn't help.

 There are probably quite a few people out there who will not
 expect this change, so it should be easy to change during runtime.

I'm not aware of any use for EAPOL frames traversing the network. I'm
also not aware of any proper 802.1X implementation for linux bridges but
I didn't do too much research yet. I don't see why people would rely on
EAPOL frames being bridged when the protocol is by definition local to a
link.

 Don't forget that a simple ebtables rule could also drop EAPOL if needed.

Indeed, but I'd prefer the bridge to do the right thing in absence of
configuration.

johannes


signature.asc
Description: This is a digitally signed message part


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Jonathan Corbet
Rusty said:

 Well, introduce an EXPORT_SYMBOL_INTERNAL().  It's a lot less code.  But 
 you'd 
 still need to show that people are having trouble knowing what APIs to use.

Might the recent discussion on the exporting of sys_open() and
sys_read() be an example here?  There would appear to be a consensus
that people should not have used those functions, but they are now
proving difficult to unexport.

Perhaps the best use of the namespace stuff might be for *future*
exports which are needed to help make the mainline kernel fully modular,
but which are not meant to be part of a more widely-used API?

jon
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Andi Kleen
On Tue, Nov 27, 2007 at 08:43:24AM -0700, Jonathan Corbet wrote:
 Rusty said:
 
  Well, introduce an EXPORT_SYMBOL_INTERNAL().  It's a lot less code.  But 
  you'd 
  still need to show that people are having trouble knowing what APIs to use.
 
 Might the recent discussion on the exporting of sys_open() and
 sys_read() be an example here?  There would appear to be a consensus
 that people should not have used those functions, but they are now
 proving difficult to unexport.

That is a good example yes.

 Perhaps the best use of the namespace stuff might be for *future*
 exports which are needed to help make the mainline kernel fully modular,
 but which are not meant to be part of a more widely-used API?

Not sure about future only, but yes that is its intention.
Thanks for putting it clearly.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-mm1 tg3 wake-on-lan oddity...

2007-11-27 Thread Michael Chan
[EMAIL PROTECTED] wrote:

 (a) the Dell factory default is WOL disabled and (b) 
 if it wasn't
 the default, I'd have *set* it to disabled, and (c) I even 
 went back and
 rebooted and checked the BIOS setting - disabled. Nonetheless:
 
 # ethtool eth0 | grep Wake
 Supports Wake-on: g
 Wake-on: g
 
 Is this expected behavior?
 

The new tg3 is supposed to follow the WoL setting in the
NVRAM, so this is not expected.  We'll have to look into
this.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][BRIDGE] Properly dereference the br_should_route_hook

2007-11-27 Thread Pavel Emelyanov
This hook is protected with the RCU, so simple

if (br_should_route_hook)
br_should_route_hook(...)

is not enough on some architectures.

Use the rcu_dereference/rcu_assign_pointer in this case.

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 3cedd4e..b42b192 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest)
 struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb)
 {
const unsigned char *dest = eth_hdr(skb)-h_dest;
+   typeof(br_should_route_hook) rhook;
 
if (!is_valid_ether_addr(eth_hdr(skb)-h_source))
goto drop;
@@ -147,9 +148,9 @@ struct sk_buff *br_handle_frame(struct net_bridge_port *p, 
struct sk_buff *skb)
 
switch (p-state) {
case BR_STATE_FORWARDING:
-
-   if (br_should_route_hook) {
-   if (br_should_route_hook(skb))
+   rhook = rcu_dereference(br_should_route_hook);
+   if (rhook != NULL) {
+   if (rhook(skb))
return skb;
dest = eth_hdr(skb)-h_dest;
}
diff --git a/net/bridge/netfilter/ebtable_broute.c 
b/net/bridge/netfilter/ebtable_broute.c
index e44519e..be6f186 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -70,13 +70,13 @@ static int __init ebtable_broute_init(void)
if (ret  0)
return ret;
/* see br_input.c */
-   br_should_route_hook = ebt_broute;
+   rcu_assign_pointer(br_should_route_hook, ebt_broute);
return ret;
 }
 
 static void __exit ebtable_broute_fini(void)
 {
-   br_should_route_hook = NULL;
+   rcu_assign_pointer(br_should_route_hook, NULL);
synchronize_net();
ebt_unregister_table(broute_table);
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][BRIDGE] Properly dereference the br_should_route_hook

2007-11-27 Thread Stephen Hemminger
On Tue, 27 Nov 2007 19:12:11 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:

 This hook is protected with the RCU, so simple
 
   if (br_should_route_hook)
   br_should_route_hook(...)
 
 is not enough on some architectures.
 
 Use the rcu_dereference/rcu_assign_pointer in this case.
 
 Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]
 
 ---
 
 diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
 index 3cedd4e..b42b192 100644
 --- a/net/bridge/br_input.c
 +++ b/net/bridge/br_input.c
 @@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest)
  struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff 
 *skb)
  {
   const unsigned char *dest = eth_hdr(skb)-h_dest;
 + typeof(br_should_route_hook) rhook;

Okay, but I don't like the typeof() magic. Resubmit with proper declartion.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][BRIDGE] Lost call to br_fdb_fini() in br_init() error path

2007-11-27 Thread Stephen Hemminger
On Tue, 27 Nov 2007 17:39:42 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:

 In case the br_netfilter_init() (or any subsequent call) 
 fails, the br_fdb_fini() must be called to free the allocated
 in br_fdb_init() br_fdb_cache kmem cache.
 
 Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]
 
 ---
 
 diff --git a/net/bridge/br.c b/net/bridge/br.c
 index 93867bb..a901828 100644
 --- a/net/bridge/br.c
 +++ b/net/bridge/br.c
 @@ -39,7 +39,7 @@ static int __init br_init(void)
  
   err = br_fdb_init();
   if (err)
 - goto err_out1;
 + goto err_out;
  
   err = br_netfilter_init();
   if (err)
 @@ -65,6 +65,8 @@ err_out3:
  err_out2:
   br_netfilter_fini();
  err_out1:
 + br_fdb_fini();
 +err_out:
   llc_sap_put(br_stp_sap);
   return err;
  }

Good catch, thanks I hope you didn't find this in live system.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH (resubmit)][BRIDGE] Properly dereference the br_should_route_hook

2007-11-27 Thread Pavel Emelyanov
This hook is protected with the RCU, so simple

if (br_should_route_hook)
br_should_route_hook(...)

is not enough on some architectures.

Use the rcu_dereference/rcu_assign_pointer in this case.

Fixed Stephen's comment concerning using the typeof().

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 3cedd4e..0ee79a7 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -122,6 +122,7 @@ static inline int is_link_local(const unsigned char *dest)
 struct sk_buff *br_handle_frame(struct net_bridge_port *p, struct sk_buff *skb)
 {
const unsigned char *dest = eth_hdr(skb)-h_dest;
+   int (*rhook)(struct sk_buff *skb);
 
if (!is_valid_ether_addr(eth_hdr(skb)-h_source))
goto drop;
@@ -147,9 +148,9 @@ struct sk_buff *br_handle_frame(struct net_bridge_port *p, 
struct sk_buff *skb)
 
switch (p-state) {
case BR_STATE_FORWARDING:
-
-   if (br_should_route_hook) {
-   if (br_should_route_hook(skb))
+   rhook = rcu_dereference(br_should_route_hook);
+   if (rhook != NULL) {
+   if (rhook(skb))
return skb;
dest = eth_hdr(skb)-h_dest;
}
diff --git a/net/bridge/netfilter/ebtable_broute.c 
b/net/bridge/netfilter/ebtable_broute.c
index e44519e..be6f186 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -70,13 +70,13 @@ static int __init ebtable_broute_init(void)
if (ret  0)
return ret;
/* see br_input.c */
-   br_should_route_hook = ebt_broute;
+   rcu_assign_pointer(br_should_route_hook, ebt_broute);
return ret;
 }
 
 static void __exit ebtable_broute_fini(void)
 {
-   br_should_route_hook = NULL;
+   rcu_assign_pointer(br_should_route_hook, NULL);
synchronize_net();
ebt_unregister_table(broute_table);
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] SGISEEQ: use cached memory access to make driver work on IP28

2007-11-27 Thread Ralf Baechle
On Sat, Nov 24, 2007 at 01:29:19PM +0100, Thomas Bogendoerfer wrote:

 Following patch is clearly 2.6.25 material and is needed to get SGI IP28
 machines supported.
 
 Thomas.
 
 SGI IP28 machines would need special treatment (enable adding addtional
 wait states) when accessing memory uncached. To avoid this pain I changed
 the driver to use only cached access to memory.
 
 Signed-off-by: Thomas Bogendoerfer [EMAIL PROTECTED]

IP28 is clearly a maximum weirdo beast.  Technically the patch looks fine
it's just a few stilistic issues such as there no reason for
DMA_SYNC_DESC_CPU and DMA_SYNC_DESC_DEV being macros so why not using
inlines.

Acked-by: Ralf Baechle [EMAIL PROTECTED]

  Ralf
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Christoph Hellwig
On Tue, Nov 27, 2007 at 08:43:24AM -0700, Jonathan Corbet wrote:
 Might the recent discussion on the exporting of sys_open() and
 sys_read() be an example here?  There would appear to be a consensus
 that people should not have used those functions, but they are now
 proving difficult to unexport.

They're not.  The exports aren't used anymore and could just go away
any time.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module

2007-11-27 Thread Ryousei Takano
 One more thing: your qdisc can only be used as root qdisc since it
 produces packets itself and thereby violates the rule that a qdisc
 can only hand out packets that were previously enqueued to it.
 Using it as a leaf qdisc can make the upper qdiscs qlen counter
 go negative, causing infinite dequeue-loops, so you should make
 sure that its only possibly to use as root qdisc by checking the
 parent. It would also be better to do something like netem
 (enqueue produced packets at the root) to make sure the qlen
 counter is always accurate.
 
I agree with you.
PSPacer should not use with other rate regulation qdiscs. But, 
I think that a combination of netem and PSPacer is a useful for
emulating networks. The following paper describes experimental
results using PSPacer with netem:

   Large Scale Gigabit Emulated Testbed for Grid Transport
   Evaluation, PFLDnet 2006.
   http://www.hpcc.jp/pfldnet2006/paper/s1_02.pdf

Best regards,
Ryousei Takano
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 01/01] ipv6: RFC4214 Support (v2.5)

2007-11-27 Thread Templin, Fred L
 

 -Original Message-
 From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
 Sent: Monday, November 26, 2007 10:01 AM
 To: [EMAIL PROTECTED]
 Cc: Templin, Fred L; netdev@vger.kernel.org; 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
 [EMAIL PROTECTED]
 Subject: Re: [PATCH 01/01] ipv6: RFC4214 Support (v2.5)
 
 In article 
 [EMAIL PROTECTED]
 eing.com (at Mon, 26 Nov 2007 09:16:16 -0800), Templin, 
 Fred L [EMAIL PROTECTED] says:
 
  From: Fred L. Templin [EMAIL PROTECTED]
  
  This patch includes support for the Intra-Site Automatic Tunnel
  Addressing Protocol (ISATAP) per RFC4214. It uses the SIT
  module, and is configured using extensions to the iproute2
  utility. The diffs are specific to the Linux 2.6.24-rc2 kernel
  distribution.
  
  This version includes the diff for ./include/linux/if.h which was
  missing in the v2.4 submission and is needed to make the
  patch compile. The patch has been installed, compiled and
  tested in a clean 2.6.24-rc2 kernel build area.
  
  Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
 Acked-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
 
 Note:
 With linux-2.6:
 | % patch -p1  /tmp/isatap.patch 
 | patching file include/linux/if.h
 | patching file include/linux/if_tunnel.h
 | patching file include/linux/in.h
 | patching file include/net/addrconf.h
 | patching file net/ipv6/addrconf.c
 | patching file net/ipv6/route.c
 | Hunk #1 succeeded at 1660 (offset -8 lines).
 | patching file net/ipv6/sit.c

Confirmed in a clean linux-2.6.24-rc3 build area:

  - Installed patch with identical results as above.
  - Verified that the patch installed correctly (it did).
  - Compiled, booted and tested.

Fred
[EMAIL PROTECTED] 

 With net-2.6.24:
 | % patch -p1  /tmp/isatap.patch
 | % patch -p1  /tmp/isatap.patch 
 | patching file include/linux/if.h
 | patching file include/linux/if_tunnel.h
 | patching file include/linux/in.h
 | patching file include/net/addrconf.h
 | patching file net/ipv6/addrconf.c
 | Hunk #1 succeeded at 378 (offset -1 lines).
 | Hunk #2 succeeded at 1441 (offset -1 lines).
 | Hunk #3 succeeded at 1479 (offset -1 lines).
 | Hunk #4 succeeded at 2210 (offset -1 lines).
 | patching file net/ipv6/route.c
 | Hunk #1 succeeded at 1727 (offset 59 lines).
 | patching file net/ipv6/sit.c
 
 --yoshfuji
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Adrian Bunk
On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote:
 On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote:
...
  No.  That's the wrong question.  What's the real upside?
 
 Explicitly documenting what comprises the kernel API (external,
 supported) and what comprises the kernel implementation (internal, not
 supported).
...

There is not, never was, and never will be, any supported external API 
of the kernel.
 
cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: wireless vs. alignment requirements

2007-11-27 Thread H. Peter Anvin

Stephen Hemminger wrote:

Herbert Xu wrote:

On Sat, Nov 24, 2007 at 02:49:36PM +0100, Johannes Berg wrote:
 

Right. I just didn't think that would be a valid value for an
architecture to set.



OK.  Let me clarify this a bit more.  We require at least one
of the following rules to be met:

* the IPv4/IPv6 header is aligned by 8 bytes on reception;
* or the platform provides unaligned exception handlers.

So if your platform violates both rules then it won't work with
the IP stack, simple as that.  Fortunately I don't think such a
platform exists currently on Linux.

Cheers,
  


Then what about hardware that can't dma ethernet to non-aligned address.
Sky2 hardware breaks if DMA is not 8 byte aligned.  IMHO the IP stack 
should handle any alignment, and do the appropriate memove if the CPU requires 
alignment.


I wrote a patch for the IP stack to realign packets if necessary at one 
point.  I should dredge it up again and submit it for collective flamage.


-hpa

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: wireless vs. alignment requirements

2007-11-27 Thread H. Peter Anvin

Herbert Xu wrote:


Luckily all sky2 users have been on x86 so far :)



Hardly so.  My previous employer was MIPS and we used it there (with my 
patch.)


-hpa
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Adrian Bunk
On Tue, Nov 27, 2007 at 10:02:22AM +0100, Andi Kleen wrote:
...
 That is EXPORT_SYMBOL already. The trouble is just that it covers
 too much. My patchkit is trying to limit it again for a specific
 use case -- exporting an internal interface to another module.
 Or rather a set of modules. 
 
 Standard example is TCP: TCP exports nearly everything and the
 single user is the TCP code in ipv6.ko. Instead those symbols should
 be limited to be only accessable to ipv6.ko. 
...

Let's forget about external modules that are anyway irrelevant for 
upstream kernel development.

Do you have past examples where this would have brought advantages 
for the upstream kernel justifying all the work required for creating 
and maintaining these namespaces?

IOW, where modules were submitted for upstream inclusion and merging 
them was impossible or much harder only because they were developed 
aginst the wrong API?

 -ANdi

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: e1000 driver problems

2007-11-27 Thread Kok, Auke

[moving this discussion to netdev, dropping lkml]

Lukas Hejtmanek wrote:
 On Tue, Nov 27, 2007 at 08:48:52AM -0800, Kok, Auke wrote:
 unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2
 kernel due to compilation errors.
 but the in-kernel version of e1000 supports the ich8 lan device just fine
 and can be builtin. also this kernel has the first release of e1000e which
 supports the ich9 onboard lan device.
 
 I'm afraid, I'm missing the point as you have stated that in-kernel drivers
 have problem with suspicious board hang...

my mistake, sorry for that confusion.

the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in 
the
code for a line that says:

adapter-tx_timeout_factor = 8;

change that number to 16 to cope with the problem on 10mbit link partners. 
However
this won't fix the issue on 100mbit partners and we need to investigate that if
that is the case.

Auke
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Tom Tucker

On Tue, 2007-11-27 at 18:15 +0100, Adrian Bunk wrote:
 On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote:
  On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote:
 ...
   No.  That's the wrong question.  What's the real upside?
  
  Explicitly documenting what comprises the kernel API (external,
  supported) and what comprises the kernel implementation (internal, not
  supported).
 ...
 
 There is not, never was, and never will be, any supported external API 
 of the kernel.

Philosophically I understand what you're saying, but in practical terms
there is the issue of managing core API like kmalloc. Although kmalloc
_could_ change, doing so would be extremely painful. In fact anyone who
proposed such a change would have to have a profoundly powerful argument
as to why it was necessary.

I think this patchset is an attempt to make it easier to identify and
review these kinds of interfaces.

  
 cu
 Adrian
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: e1000 driver problems

2007-11-27 Thread Lukas Hejtmanek
On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote:
  I'm afraid, I'm missing the point as you have stated that in-kernel drivers
  have problem with suspicious board hang...
 
 my mistake, sorry for that confusion.
 
 the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in 
 the
 code for a line that says:
 
   adapter-tx_timeout_factor = 8;
 
 change that number to 16 to cope with the problem on 10mbit link partners.
 However this won't fix the issue on 100mbit partners and we need to
 investigate that if that is the case.

100mbit is the issue in my case.

-- 
Lukáš Hejtmánek
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: e1000 driver problems

2007-11-27 Thread Kok, Auke
Lukas Hejtmanek wrote:
 On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote:
 I'm afraid, I'm missing the point as you have stated that in-kernel drivers
 have problem with suspicious board hang...
 my mistake, sorry for that confusion.

 the fake hangs on 82562/6 devices occur on 10mbit link only. You can check 
 in the
 code for a line that says:

  adapter-tx_timeout_factor = 8;

 change that number to 16 to cope with the problem on 10mbit link partners.
 However this won't fix the issue on 100mbit partners and we need to
 investigate that if that is the case.
 
 100mbit is the issue in my case.
 

can you see if your problem goes away with this patch?

---
e1000: increase tx timeout factor for 100mbit speeds

Signed-off-by: Auke Kok [EMAIL PROTECTED]

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index b7c3070..2e46a15 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2601,7 +2601,7 @@ e1000_watchdog(unsigned long data)
case SPEED_100:
txb2b = 0;
netdev-tx_queue_len = 100;
-   /* maybe add some timeout factor ? */
+   adapter-tx_timeout_factor = 4;
break;
}
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: wireless vs. alignment requirements

2007-11-27 Thread Stephen Hemminger
On Tue, 27 Nov 2007 09:16:07 -0800
H. Peter Anvin [EMAIL PROTECTED] wrote:

 Stephen Hemminger wrote:
  Herbert Xu wrote:
  On Sat, Nov 24, 2007 at 02:49:36PM +0100, Johannes Berg wrote:
   
  Right. I just didn't think that would be a valid value for an
  architecture to set.
  
 
  OK.  Let me clarify this a bit more.  We require at least one
  of the following rules to be met:
 
  * the IPv4/IPv6 header is aligned by 8 bytes on reception;
  * or the platform provides unaligned exception handlers.
 
  So if your platform violates both rules then it won't work with
  the IP stack, simple as that.  Fortunately I don't think such a
  platform exists currently on Linux.
 
  Cheers,

  
  Then what about hardware that can't dma ethernet to non-aligned address.
  Sky2 hardware breaks if DMA is not 8 byte aligned.  IMHO the IP stack 
  should handle any alignment, and do the appropriate memove if the CPU 
  requires 
  alignment.
 
 I wrote a patch for the IP stack to realign packets if necessary at one 
 point.  I should dredge it up again and submit it for collective flamage.
 
   -hpa
 

Is there any standard kernel config define for this platform can't do
unaligned accesses?

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ANNOUNCE] Open-FCoE - Fibre Channel over Ethernet Project

2007-11-27 Thread Love, Robert W
Hello Linux Community,

I'd like to introduce a new Open Source project named Open-FCoE.
FCoE is the acronym for Fibre Channel over Ethernet, which is the
encapsulation of Fibre Channel frames within Ethernet packets. FCoE will
allow systems with an Ethernet adapter and a Fibre Channel Forwarder to
login to a Fibre Channel fabric (the FCF is a gateway that bridges the
LAN and the SAN). That fabric login was previously reserved exclusively
for Fibre Channel HBAs. This technology reduces complexity in the data
center by aiding network convergence. It is targeted for 10Gps Ethernet
NICs but will work on any Ethernet NIC supporting pause frames. Our code
base provides a Fibre Channel protocol processing module as well as an
Ethernet based transport module. The Open-FC module acts as a LLD for
SCSI and the Open-FCoE transport uses net_device to send and receive
packets.

We've set up a web home for this project at www.Open-FCoE.org. Its
purpose is to support development and users; it's very light on content
right now. 

We have a lot of code and it will take time for us to harden and
improve upon it. We're still early in development but are making this
code available as soon as possible so that our project's development can
truly be open.


This is our code's organization as it is today.

1) git://open-fcoe.org/openfc/open-fcoe-upstream.git
  
  This repository is based on the SCSI 2.6.24 bug fix tree. Two
patches are applied to add our code and resolve any compatibility
problems between our code and 2.6.24.

2) git://open-fcoe.org/openfc/open-fcoe.git

   This repository contains our user space tools. It also contains
an out-of-kernel Makefile that allows a user to build the kernel modules
within this layout. Pulling our kernel code from open-fcoe-upstream into
this code base and then tarring it up provides an easily distributable
package.

3) git://open-fcoe.org/openfc/open-fcoe-misc.git

   This repository contains our SW target code. The SW target allows
someone to connect two systems back-to-back and run FCoE traffic from
one to another. Currently our target uses SCST which is not upstream.
This is causing us a problem because the target shares code with the
initiator.  Since the initiator is building with 2.6.24 and SCST isn't
we're in a bind.  A priority of ours will be to get the target back to a
better shape as it's our primary development tool.


We also have a lot of things to fix. Here are our initial concerns,

1) Kernel/Userspace interaction needs direction. Currently we're using
ioctls, which are no longer desirable. We're planning to change the
implementation to netlink. Also, some non-data path code will probably
need to move to user space. Open-iSCSI has spent a lot of time on its
kernel to user interaction; we will likely borrow from that model as
much as we can.

2) SW Target and its integration with the kernel.

3) Reduction of code, removal of unnecessary abstraction.

4) Non-GPL code. We're currently using a BSD queue.h and some CRC code
from BSD as well. We need to convert this code to use kernel code.

5) Documentation. Our basic documentation needs some help so people can
use the initiator and target easily.


We have a mailing list established on our website, but believe that the
community will want us to discuss this technology on the SCSI mailing
list. That is where we plan to start discussing this code.

We're looking forward to building a strong community around this
technology and code base. We're eager to start coding and improving our
code with any other community contributors!


Robert Love
Open-FCoE.org
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Adrian Bunk
On Tue, Nov 27, 2007 at 11:45:37AM -0600, Tom Tucker wrote:
 
 On Tue, 2007-11-27 at 18:15 +0100, Adrian Bunk wrote:
  On Mon, Nov 26, 2007 at 11:35:42PM -0600, Tom Tucker wrote:
   On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote:
  ...
No.  That's the wrong question.  What's the real upside?
   
   Explicitly documenting what comprises the kernel API (external,
   supported) and what comprises the kernel implementation (internal, not
   supported).
  ...
  
  There is not, never was, and never will be, any supported external API 
  of the kernel.
 
 Philosophically I understand what you're saying, but in practical terms
 there is the issue of managing core API like kmalloc. Although kmalloc
 _could_ change, doing so would be extremely painful. In fact anyone who
 proposed such a change would have to have a profoundly powerful argument
 as to why it was necessary.

The latter should at least in theory be required for all changes no 
matter how core they are...

 I think this patchset is an attempt to make it easier to identify and
 review these kinds of interfaces.

As long as the submitter fixes all in-kernel users these interfaces are 
not handled differently from interfaces with fewer users.

And I remember at least one commit that changed  1000 files because it 
changed a frequently used driver API. [1]

cu
Adrian

[1] commit 7d12e780e003f93433d49ce78cfedf4b4c52adc5

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Dave Jones
On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote:
 
  1) Why is everyone so concerned that export symbol space is large?
   - does it cost cpu or running memory?
   - does it cause bugs?
   - or are you just worried about evil modules?

To clarify something here, by evil, don't necessarily think binary only. 

Out of tree modules are frequently using symbols that they shouldn't be.
Because they get no peer-review here, they 'get away with it' for the most part.
Until distro vendors push rebased kernel updates that removed exports that
should never have been exported, and suddenly people like me get bombed
with Fedora broke my xyz driver mails.

Reducing the opportunity for people to screw things up is a good thing.
If a symbol is exported, most out-of-tree driver authors seem to
think its fair game to use it.

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


net_rx_action/NAPI oops [PATCH]

2007-11-27 Thread Robert Olsson

Hello!

I've discovered a bug while testing the new multiQ NAPI code. In hi-load 
situations when we take down an interface we get a kernel panic. The
oops is below.

From what I see this happens when driver does napi_disable() and clears
NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight 
a sort indirect test but that's now not enough to cover the load situation. 
where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still 
full quota. Latest git but I'll guess the is the same in all later kernels.
There might be different solutions... one variant is below:

Signed-off-by: Robert Olsson [EMAIL PROTECTED]

diff --git a/net/core/dev.c b/net/core/dev.c
index 043e2f8..1031233 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2207,7 +2207,7 @@ static void net_rx_action(struct softirq_action *h)
 * still owns the NAPI instance and therefore can
 * move the instance around on the list at-will.
 */
-   if (unlikely(work == weight))
+   if (unlikely(work == weight)  (test_bit(NAPI_STATE_SCHED, 
n-state)))
list_move_tail(n-poll_list, list);
 
netpoll_poll_unlock(have);


Cheers
--ro


labb:/# ifconfig  eth0 down 

BUG: unable to handle kernel paging request at virtual address 00100104
printing eip: c0433d67 *pde =  
Oops: 0002 [#1] SMP 
Modules linked in:

Pid: 4, comm: ksoftirqd/0 Not tainted (2.6.24-rc3bifrost-gb3664d45-dirty #32)
EIP: 0060:[c0433d67] EFLAGS: 00010046 CPU: 0
EIP is at net_rx_action+0x107/0x120
EAX: 00100100 EBX: f757d4e0 ECX: c200d334 EDX: 00200200
ESI: 0040 EDI: c200d334 EBP: 00ec ESP: f7c6bf78
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process ksoftirqd/0 (pid: 4, ti=f7c6a000 task=f7c58ab0 task.ti=f7c6a000)
Stack: c0236217 c200ce9c c200ce9c  fffcf892 0040 0005 c05b2a98 
   c0603e60 0008 c022a275  c06066c0 c06066c0 0246  
   c022a5e0  c022a327 c06066c0 c022a636 fffc  c02384f2 
Call Trace:
 [c0236217] __rcu_process_callbacks+0x107/0x190
 [c022a275] __do_softirq+0x75/0xf0
 [c022a5e0] ksoftirqd+0x0/0xd0
 [c022a327] do_softirq+0x37/0x40
 [c022a636] ksoftirqd+0x56/0xd0
 [c02384f2] kthread+0x42/0x70
 [c02384b0] kthread+0x0/0x70
 [c02039df] kernel_thread_helper+0x7/0x18
 ===
Code: 88 8c 52 c0 e8 4b 1d df ff e8 96 0c dd ff c7 05 64 7d 63 c0 01 00 00 00 
e9 61 ff ff ff 8d b4
 26 00 00 00 00 8b 03 8b 53 04 89 f9 89 50 04 89 02 89 d8 8b 57 04 e8 5a 34 
eb ff e9 4a ff ff ff
 90 
EIP: [c0433d67] net_rx_action+0x107/0x120 SS:ESP 0068:f7c6bf78
Kernel panic - not syncing: Fatal exception in interrupt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-mm1 tg3 wake-on-lan oddity...

2007-11-27 Thread Valdis . Kletnieks
On Tue, 27 Nov 2007 08:04:28 PST, Michael Chan said:
 [EMAIL PROTECTED] wrote:
 
  (a) the Dell factory default is WOL disabled and (b)
  if it wasn't
  the default, I'd have *set* it to disabled, and (c) I even
  went back and
  rebooted and checked the BIOS setting - disabled. Nonetheless:
 
  # ethtool eth0 | grep Wake
  Supports Wake-on: g
  Wake-on: g
 
  Is this expected behavior?
 
 
 The new tg3 is supposed to follow the WoL setting in the
 NVRAM, so this is not expected.  We'll have to look into
 this.

Any info that would help?  printk's to stick in tg3.c?  Dumping the
relevant bytes of NVRAM? etc?


pgpwUS9hBRdWf.pgp
Description: PGP signature


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Valdis . Kletnieks
On Tue, 27 Nov 2007 15:12:42 +0100, Andi Kleen said:
  OK, short of making IPv4 a module (which would be a worthy task :)
 
 At some point there were patches, it is probably not very difficult.
 But DaveM resisted at some point because he didn't want people
 to replace the network stack (although I personally don't have a problem
 with that)

Personally, I wouldn't find replacing the ipv4 stack very interesting.

However, stress-testing your system for IPv6-readiness by doing 'rmmod ipv4' :)

(Though I admit it's something I'd *try* if it was available, but certainly
not sufficient for a reason to do it...)


pgpAMJ4wTClYk.pgp
Description: PGP signature


Re: 2.6.23-mm1 tg3 wake-on-lan oddity...

2007-11-27 Thread Michael Chan
On Tue, 2007-11-27 at 01:35 -0800, [EMAIL PROTECTED] wrote:

 
 Issue:
 
 I (for unrelated reasons) run powertop, and it suggests I conserve power
 by doing 'ethtool -s eth0 wol d'.  I look at it, and think that it's daft,
 because (a) the Dell factory default is WOL disabled and (b) if it wasn't
 the default, I'd have *set* it to disabled, and (c) I even went back and
 rebooted and checked the BIOS setting - disabled. Nonetheless:
 
 # ethtool eth0 | grep Wake
 Supports Wake-on: g
 Wake-on: g
 
 Is this expected behavior?

What's happening is that there are 2 WoL settings: one in the BIOS and
one in the NIC's NVRAM.  For WoL to work, I think both settings have to
be enabled.  Apparently in this case, when you turn off WoL in the BIOS,
the NVRAM's WoL setting is unchanged, and will be seen by tg3 as
enabled.

Ideally, the BIOS should modify the NVRAM's setting when it is changed.
We will talk to Dell to get their opinion on this as this is very
confusing to the user.

Thanks.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Adrian Bunk
On Tue, Nov 27, 2007 at 02:00:37PM -0500, Dave Jones wrote:
 On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote:
  
   1) Why is everyone so concerned that export symbol space is large?
  - does it cost cpu or running memory?
  - does it cause bugs?
  - or are you just worried about evil modules?
 
 To clarify something here, by evil, don't necessarily think binary only. 
 
 Out of tree modules are frequently using symbols that they shouldn't be.
 Because they get no peer-review here, they 'get away with it' for the most 
 part.
 Until distro vendors push rebased kernel updates that removed exports that
 should never have been exported, and suddenly people like me get bombed
 with Fedora broke my xyz driver mails.
...

The real problem is that these drivers are not in the upstream kernel.

Are there common reasons why these drivers are not upstream?

   Dave

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Rick Jones

The real problem is that these drivers are not in the upstream kernel.

Are there common reasons why these drivers are not upstream?


One might be that upstream has not accepted them.  Anything doing or 
smelling of TOE comes to mind right away.


rick jones
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Dave Jones
On Tue, Nov 27, 2007 at 10:09:42PM +0100, Adrian Bunk wrote:
  On Tue, Nov 27, 2007 at 02:00:37PM -0500, Dave Jones wrote:
   On Mon, Nov 26, 2007 at 10:25:33AM -0800, Stephen Hemminger wrote:

 1) Why is everyone so concerned that export symbol space is large?
 - does it cost cpu or running memory?
 - does it cause bugs?
 - or are you just worried about evil modules?
   
   To clarify something here, by evil, don't necessarily think binary 
   only. 
   
   Out of tree modules are frequently using symbols that they shouldn't be.
   Because they get no peer-review here, they 'get away with it' for the most 
   part.
   Until distro vendors push rebased kernel updates that removed exports that
   should never have been exported, and suddenly people like me get bombed
   with Fedora broke my xyz driver mails.
  ...
  
  The real problem is that these drivers are not in the upstream kernel.

You're preaching to the choir.

  Are there common reasons why these drivers are not upstream?

It varies case by case.

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] cxgb - driver fixes.

2007-11-27 Thread Divy Le Ray

Jeff,

I'm submitting a patch series for inclusion in 2.6.24 for the cxgb driver.
The patches are built against Linus'git tree.

Here is a brief description:
- Ensure that GSO skbs have enough headroom before encapsulating them,
- Fix a crash in NAPI mode,
- Fix statistics accounting and report.

Cheers,
Divy


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] cxgb - fix T2 GSO

2007-11-27 Thread Divy Le Ray
From: Divy Le Ray [EMAIL PROTECTED]

The patch ensures that a GSO skb has enough headroom
to push an encapsulating cpl_tx_pkt_lso header.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/chelsio/cxgb2.c |3 ++-
 drivers/net/chelsio/sge.c   |   34 +++---
 drivers/net/chelsio/sge.h   |1 +
 3 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/drivers/net/chelsio/cxgb2.c b/drivers/net/chelsio/cxgb2.c
old mode 100644
new mode 100755
index 2dbf8dc..2461f91
--- a/drivers/net/chelsio/cxgb2.c
+++ b/drivers/net/chelsio/cxgb2.c
@@ -401,7 +401,8 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
TxTso,
RxVlan,
TxVlan,
-
+   TxNeedHeadroom, 
+   
/* Interrupt stats */
rx drops,
pure_rsps,
diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c
old mode 100644
new mode 100755
index 4436662..e8b1036
--- a/drivers/net/chelsio/sge.c
+++ b/drivers/net/chelsio/sge.c
@@ -991,6 +991,7 @@ void t1_sge_get_port_stats(const struct sge *sge, int port,
ss-tx_packets += st-tx_packets;
ss-tx_cso += st-tx_cso;
ss-tx_tso += st-tx_tso;
+   ss-tx_need_hdrroom += st-tx_need_hdrroom;
ss-vlan_xtract += st-vlan_xtract;
ss-vlan_insert += st-vlan_insert;
}
@@ -1848,7 +1849,8 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device 
*dev)
 {
struct adapter *adapter = dev-priv;
struct sge *sge = adapter-sge;
-   struct sge_port_stats *st = per_cpu_ptr(sge-port_stats[dev-if_port], 
smp_processor_id());
+   struct sge_port_stats *st = per_cpu_ptr(sge-port_stats[dev-if_port],
+   smp_processor_id());
struct cpl_tx_pkt *cpl;
struct sk_buff *orig_skb = skb;
int ret;
@@ -1856,6 +1858,18 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device 
*dev)
if (skb-protocol == htons(ETH_P_CPL5))
goto send;
 
+   /*
+* We are using a non-standard hard_header_len.
+* Allocate more header room in the rare cases it is not big enough.
+*/
+   if (unlikely(skb_headroom(skb)  dev-hard_header_len - ETH_HLEN)) {
+   skb = skb_realloc_headroom(skb, sizeof(struct cpl_tx_pkt_lso));
+   ++st-tx_need_hdrroom;
+   dev_kfree_skb_any(orig_skb);
+   if (!skb)
+   return NETDEV_TX_OK;
+   }
+
if (skb_shinfo(skb)-gso_size) {
int eth_type;
struct cpl_tx_pkt_lso *hdr;
@@ -1889,24 +1903,6 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device 
*dev)
return NETDEV_TX_OK;
}
 
-   /*
-* We are using a non-standard hard_header_len and some kernel
-* components, such as pktgen, do not handle it right.
-* Complain when this happens but try to fix things up.
-*/
-   if (unlikely(skb_headroom(skb)  dev-hard_header_len - 
ETH_HLEN)) {
-   pr_debug(%s: headroom %d header_len %d\n, dev-name,
-skb_headroom(skb), dev-hard_header_len);
-
-   if (net_ratelimit())
-   printk(KERN_ERR %s: inadequate headroom in 
-  Tx packet\n, dev-name);
-   skb = skb_realloc_headroom(skb, sizeof(*cpl));
-   dev_kfree_skb_any(orig_skb);
-   if (!skb)
-   return NETDEV_TX_OK;
-   }
-
if (!(adapter-flags  UDP_CSUM_CAPABLE) 
skb-ip_summed == CHECKSUM_PARTIAL 
ip_hdr(skb)-protocol == IPPROTO_UDP) {
diff --git a/drivers/net/chelsio/sge.h b/drivers/net/chelsio/sge.h
old mode 100644
new mode 100755
index 713d9c5..285bbb2
--- a/drivers/net/chelsio/sge.h
+++ b/drivers/net/chelsio/sge.h
@@ -64,6 +64,7 @@ struct sge_port_stats {
u64 tx_tso;  /* # of TSO requests */
u64 vlan_xtract; /* # of VLAN tag extractions */
u64 vlan_insert; /* # of VLAN tag insertions */
+   u64 tx_need_hdrroom; /* # of TX skbs in need of more header room */
 };
 
 struct sk_buff;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] cxgb - fix NAPI

2007-11-27 Thread Divy Le Ray
From: Divy Le Ray [EMAIL PROTECTED]

netif_rx_complete() should be called only 
when work_done  budget.

Signed-off-by: Divy Le ray [EMAIL PROTECTED]
---

 drivers/net/chelsio/sge.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c
old mode 100755
new mode 100644
index e8b1036..4b6258f
--- a/drivers/net/chelsio/sge.c
+++ b/drivers/net/chelsio/sge.c
@@ -1625,11 +1625,9 @@ int t1_poll(struct napi_struct *napi, int budget)
 {
struct adapter *adapter = container_of(napi, struct adapter, napi);
struct net_device *dev = adapter-port[0].dev;
-   int work_done;
-
-   work_done = process_responses(adapter, budget);
+   int work_done = process_responses(adapter, budget);
 
-   if (likely(!responses_pending(adapter))) {
+   if (likely(work_done  budget)) {
netif_rx_complete(dev, napi);
writel(adapter-sge-respQ.cidx,
   adapter-regs + A_SG_SLEEPING);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] cxgb - fix stats

2007-11-27 Thread Divy Le Ray
From: Divy Le Ray [EMAIL PROTECTED]

Fix MAC stats accounting. 
Fix get_stats.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/chelsio/cxgb2.c  |   67 +++--
 drivers/net/chelsio/pm3393.c |  112 +-
 drivers/net/chelsio/sge.c|4 --
 drivers/net/chelsio/sge.h|2 -
 4 files changed, 96 insertions(+), 89 deletions(-)

diff --git a/drivers/net/chelsio/cxgb2.c b/drivers/net/chelsio/cxgb2.c
index 2461f91..c597504 100755
--- a/drivers/net/chelsio/cxgb2.c
+++ b/drivers/net/chelsio/cxgb2.c
@@ -374,7 +374,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
TxInternalMACXmitError,
TxFramesWithExcessiveDeferral,
TxFCSErrors,
-
+   TxJumboFramesOk,
+   TxJumboOctetsOk,
+   
RxOctetsOK,
RxOctetsBad,
RxUnicastFramesOK,
@@ -392,11 +394,11 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
RxInRangeLengthErrors,
RxOutOfRangeLengthField,
RxFrameTooLongErrors,
+   RxJumboFramesOk,
+   RxJumboOctetsOk,
 
/* Port stats */
-   RxPackets,
RxCsumGood,
-   TxPackets,
TxCsumOffload,
TxTso,
RxVlan,
@@ -464,23 +466,56 @@ static void get_stats(struct net_device *dev, struct 
ethtool_stats *stats,
const struct cmac_statistics *s;
const struct sge_intr_counts *t;
struct sge_port_stats ss;
-   unsigned int len;
 
s = mac-ops-statistics_update(mac, MAC_STATS_UPDATE_FULL);
-
-   len = sizeof(u64)*(s-TxFCSErrors + 1 - s-TxOctetsOK);
-   memcpy(data, s-TxOctetsOK, len);
-   data += len;
-
-   len = sizeof(u64)*(s-RxFrameTooLongErrors + 1 - s-RxOctetsOK);
-   memcpy(data, s-RxOctetsOK, len);
-   data += len;
-
+   t = t1_sge_get_intr_counts(adapter-sge);
t1_sge_get_port_stats(adapter-sge, dev-if_port, ss);
-   memcpy(data, ss, sizeof(ss));
-   data += sizeof(ss);
 
-   t = t1_sge_get_intr_counts(adapter-sge);
+   *data++ = s-TxOctetsOK;
+   *data++ = s-TxOctetsBad;
+   *data++ = s-TxUnicastFramesOK;
+   *data++ = s-TxMulticastFramesOK;
+   *data++ = s-TxBroadcastFramesOK;
+   *data++ = s-TxPauseFrames;
+   *data++ = s-TxFramesWithDeferredXmissions;
+   *data++ = s-TxLateCollisions;
+   *data++ = s-TxTotalCollisions;
+   *data++ = s-TxFramesAbortedDueToXSCollisions;
+   *data++ = s-TxUnderrun;
+   *data++ = s-TxLengthErrors;
+   *data++ = s-TxInternalMACXmitError;
+   *data++ = s-TxFramesWithExcessiveDeferral;
+   *data++ = s-TxFCSErrors;
+   *data++ = s-TxJumboFramesOK;
+   *data++ = s-TxJumboOctetsOK;
+
+   *data++ = s-RxOctetsOK;
+   *data++ = s-RxOctetsBad;
+   *data++ = s-RxUnicastFramesOK;
+   *data++ = s-RxMulticastFramesOK;
+   *data++ = s-RxBroadcastFramesOK;
+   *data++ = s-RxPauseFrames;
+   *data++ = s-RxFCSErrors;
+   *data++ = s-RxAlignErrors;
+   *data++ = s-RxSymbolErrors;
+   *data++ = s-RxDataErrors;
+   *data++ = s-RxSequenceErrors;
+   *data++ = s-RxRuntErrors;
+   *data++ = s-RxJabberErrors;
+   *data++ = s-RxInternalMACRcvError;
+   *data++ = s-RxInRangeLengthErrors;
+   *data++ = s-RxOutOfRangeLengthField;
+   *data++ = s-RxFrameTooLongErrors;
+   *data++ = s-RxJumboFramesOK;
+   *data++ = s-RxJumboOctetsOK;
+
+   *data++ = ss.rx_cso_good;
+   *data++ = ss.tx_cso;
+   *data++ = ss.tx_tso;
+   *data++ = ss.vlan_xtract;
+   *data++ = ss.vlan_insert;
+   *data++ = ss.tx_need_hdrroom;
+   
*data++ = t-rx_drops;
*data++ = t-pure_rsps;
*data++ = t-unhandled_irqs;
diff --git a/drivers/net/chelsio/pm3393.c b/drivers/net/chelsio/pm3393.c
old mode 100644
new mode 100755
index 678778a..2117c4f
--- a/drivers/net/chelsio/pm3393.c
+++ b/drivers/net/chelsio/pm3393.c
@@ -45,7 +45,7 @@
 
 #include linux/crc32.h
 
-#define OFFSET(REG_ADDR)(REG_ADDR  2)
+#define OFFSET(REG_ADDR)((REG_ADDR)  2)
 
 /* Max frame size PM3393 can handle. Includes Ethernet header and CRC. */
 #define MAX_FRAME_SIZE  9600
@@ -428,69 +428,26 @@ static int pm3393_set_speed_duplex_fc(struct cmac *cmac, 
int speed, int duplex,
return 0;
 }
 
-static void pm3393_rmon_update(struct adapter *adapter, u32 offs, u64 *val,
-  int over)
-{
-   u32 val0, val1, val2;
-
-   t1_tpi_read(adapter, offs, val0);
-   t1_tpi_read(adapter, offs + 4, val1);
-   t1_tpi_read(adapter, offs + 8, val2);
-
-   *val = ~0ull  40;
-   *val |= val0  0x;
-   *val |= (val1  0x)  16;
-   *val |= (u64)(val2  0xff)  32;
-
-   if (over)
-   *val += 1ull  40;
+#define RMON_UPDATE(mac, name, stat_name) \
+{ \
+   t1_tpi_read((mac)-adapter, OFFSET(name), val0); \
+   t1_tpi_read((mac)-adapter, OFFSET((name)+1), val1); \
+   t1_tpi_read((mac)-adapter, OFFSET((name)+2), val2); \
+   

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Adrian Bunk
On Tue, Nov 27, 2007 at 01:15:23PM -0800, Rick Jones wrote:
 The real problem is that these drivers are not in the upstream kernel.

 Are there common reasons why these drivers are not upstream?

 One might be that upstream has not accepted them.  Anything doing or 
 smelling of TOE comes to mind right away.

Which modules doing or smelling of TOE do work with unmodified vendor 
kernels?

AFAIR TOE was both not a module and required some hooks in the network 
stack, so it's completely outside the scope of this thread.

 rick jones

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-mm1 tg3 wake-on-lan oddity...

2007-11-27 Thread Valdis . Kletnieks
On Tue, 27 Nov 2007 13:34:57 PST, Michael Chan said:

 Ideally, the BIOS should modify the NVRAM's setting when it is changed.
 We will talk to Dell to get their opinion on this as this is very
 confusing to the user.

That would certainly explain what I'm seeing, and I can certainly wait
if the answer is indeed Buggy BIOS, fixed in D820 A08 or A09 or whenever.
(If Dell cares, I'm at BIOS A07 already).

In the meantime, I just stuck an 'ethtool -s eth0 wol d' in /etc/rc.local
until a proper fix shows up.


pgprI27r6QKbi.pgp
Description: PGP signature


Re: net_rx_action/NAPI oops [PATCH]

2007-11-27 Thread Stephen Hemminger
On Tue, 27 Nov 2007 19:52:24 +0100
Robert Olsson [EMAIL PROTECTED] wrote:

 
 Hello!
 
 I've discovered a bug while testing the new multiQ NAPI code. In hi-load 
 situations when we take down an interface we get a kernel panic. The
 oops is below.
 
 From what I see this happens when driver does napi_disable() and clears
 NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight 
 a sort indirect test but that's now not enough to cover the load situation. 
 where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still 
 full quota. Latest git but I'll guess the is the same in all later kernels.
 There might be different solutions... one variant is below:

It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear 
NAPI_STATE_SCHED)
and do a full quota. That bug already had to be fixed in other drivers,
look like e1000 has same problem.


-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: net_rx_action/NAPI oops [PATCH]

2007-11-27 Thread Kok, Auke
Stephen Hemminger wrote:
 On Tue, 27 Nov 2007 19:52:24 +0100
 Robert Olsson [EMAIL PROTECTED] wrote:
 
 Hello!

 I've discovered a bug while testing the new multiQ NAPI code. In hi-load 
 situations when we take down an interface we get a kernel panic. The
 oops is below.

 From what I see this happens when driver does napi_disable() and clears
 NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight 
 a sort indirect test but that's now not enough to cover the load situation. 
 where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still 
 full quota. Latest git but I'll guess the is the same in all later kernels.
 There might be different solutions... one variant is below:
 
 It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear 
 NAPI_STATE_SCHED)
 and do a full quota. That bug already had to be fixed in other drivers,
 look like e1000 has same problem.

Stephen,

please enlighten me, can you e.g. show me a commit of other drivers where you
fixed this up?

Thanks,

Auke
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] sky2: don't use AER routines

2007-11-27 Thread Stephen Hemminger
Using PCIE advanced error recovery stuff creates more user problems than it's 
worth.
The AER stuff depends on MMCONFIG and in many configurations it just doesn't 
work.
Plus it doesn't add any real functionality to the driver. The sky2
driver handles its own errors fine as is.

This reverts 555382cbfc6d2187b53888190755e56f52308cd6

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- a/drivers/net/sky2.c2007-11-26 14:02:31.0 -0800
+++ b/drivers/net/sky2.c2007-11-26 14:02:36.0 -0800
@@ -31,7 +31,6 @@
 #include linux/etherdevice.h
 #include linux/ethtool.h
 #include linux/pci.h
-#include linux/aer.h
 #include linux/ip.h
 #include net/ip.h
 #include linux/tcp.h
@@ -2432,26 +2431,15 @@ static void sky2_hw_intr(struct sky2_hw 
 
if (status  Y2_IS_PCI_EXP) {
/* PCI-Express uncorrectable Error occurred */
-   int aer = pci_find_aer_capability(hw-pdev);
u32 err;
 
-   if (aer) {
-   pci_read_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS,
- err);
-   pci_cleanup_aer_uncorrect_error_status(pdev);
-   } else {
-   /* Either AER not configured, or not working
-* because of bad MMCONFIG, so just do recover
-* manually.
-*/
-   err = sky2_read32(hw, Y2_CFG_AER + 
PCI_ERR_UNCOR_STATUS);
-   sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS,
-0xul);
-   }
-
+   err = sky2_read32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS);
+   sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS,
+0xul);
if (net_ratelimit())
dev_err(pdev-dev, PCI Express error (0x%x)\n, err);
 
+   sky2_read32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS);
}
 
if (status  Y2_HWE_L1_MASK)
@@ -2806,24 +2794,13 @@ static void sky2_reset(struct sky2_hw *h
 
cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
if (cap) {
-   if (pci_find_aer_capability(pdev)) {
-   /* Check for advanced error reporting */
-   pci_cleanup_aer_uncorrect_error_status(pdev);
-   pci_cleanup_aer_correct_error_status(pdev);
-   } else {
-   dev_warn(pdev-dev,
-   PCI Express Advanced Error Reporting
-not configured or MMCONFIG problem?\n);
-
-   sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS,
-0xul);
-   }
+   sky2_write32(hw, Y2_CFG_AER + PCI_ERR_UNCOR_STATUS,
+0xul);
 
/* If error bit is stuck on ignore it */
if (sky2_read32(hw, B0_HWE_ISRC)  Y2_IS_PCI_EXP)
dev_info(pdev-dev, ignoring stuck error report 
bit\n);
-
-   else if (pci_enable_pcie_error_reporting(pdev))
+   else
hwe_mask |= Y2_IS_PCI_EXP;
}
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] sky2: turn of dynamic Tx watermark workaround (FE+ only)

2007-11-27 Thread Stephen Hemminger
Add workaround for issues FE+ (A0) transmit watermark.
This is copied verbatim from vendor driver sk98lin (10.22.4.3).
Don't have that chip version and no more information seems to be available.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- a/drivers/net/sky2.c2007-11-26 14:02:36.0 -0800
+++ b/drivers/net/sky2.c2007-11-26 14:28:47.0 -0800
@@ -845,6 +845,13 @@ static void sky2_mac_init(struct sky2_hw
sky2_set_tx_stfwd(hw, port);
}
 
+   if (hw-chip_id == CHIP_ID_YUKON_FE_P 
+   hw-chip_rev == CHIP_REV_YU_FE2_A0) {
+   /* disable dynamic watermark */
+   reg = sky2_read16(hw, SK_REG(port, TX_GMF_EA));
+   reg = ~TX_DYN_WM_ENA;
+   sky2_write16(hw, SK_REG(port, TX_GMF_EA), reg);
+   }
 }
 
 /* Assign Ram Buffer allocation to queue */
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Andi Kleen
 With my Enterprise hat on, I can see where Andi was coming from
 originally. 

For the record my original motivation was to fix the TCP exports everything
for ipv6.ko case cleanly. I later realized that it would be useful for the
ABI stability issues too, but it was really not my primary motivation.
This is why I didn't even mention that in the original patch description.

-Andi

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] sky2: revert to access PCI config via device space

2007-11-27 Thread Stephen Hemminger
Using the hardware window into PCI config space is more reliable
and smaller/faster than using the pci_config routines. It avoids issues
with MMCONFIG etc.

Reverts: 167f53d05fccb47b6eeadac7f6705b3f2f042d03

Please apply for 2.6.24

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- a/drivers/net/sky2.c2007-11-26 14:02:14.0 -0800
+++ b/drivers/net/sky2.c2007-11-26 14:02:31.0 -0800
@@ -240,22 +240,21 @@ static void sky2_power_on(struct sky2_hw
sky2_write8(hw, B2_Y2_CLK_GATE, 0);
 
if (hw-flags  SKY2_HW_ADV_POWER_CTL) {
-   struct pci_dev *pdev = hw-pdev;
u32 reg;
 
-   pci_write_config_dword(pdev, PCI_DEV_REG3, 0);
+   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
 
-   pci_read_config_dword(pdev, PCI_DEV_REG4, reg);
+   reg = sky2_pci_read32(hw, PCI_DEV_REG4);
/* set all bits to 0 except bits 15..12 and 8 */
reg = P_ASPM_CONTROL_MSK;
-   pci_write_config_dword(pdev, PCI_DEV_REG4, reg);
+   sky2_pci_write32(hw, PCI_DEV_REG4, reg);
 
-   pci_read_config_dword(pdev, PCI_DEV_REG5, reg);
+   reg = sky2_pci_read32(hw, PCI_DEV_REG5);
/* set all bits to 0 except bits 28  27 */
reg = P_CTL_TIM_VMAIN_AV_MSK;
-   pci_write_config_dword(pdev, PCI_DEV_REG5, reg);
+   sky2_pci_write32(hw, PCI_DEV_REG5, reg);
 
-   pci_write_config_dword(pdev, PCI_CFG_REG_1, 0);
+   sky2_pci_write32(hw, PCI_CFG_REG_1, 0);
 
/* Enable workaround for dev 4.107 on Yukon-Ultra  Extreme */
reg = sky2_read32(hw, B2_GP_IO);
@@ -619,12 +618,11 @@ static void sky2_phy_init(struct sky2_hw
 
 static void sky2_phy_power(struct sky2_hw *hw, unsigned port, int onoff)
 {
-   struct pci_dev *pdev = hw-pdev;
u32 reg1;
static const u32 phy_power[] = { PCI_Y2_PHY1_POWD, PCI_Y2_PHY2_POWD };
static const u32 coma_mode[] = { PCI_Y2_PHY1_COMA, PCI_Y2_PHY2_COMA };
 
-   pci_read_config_dword(pdev, PCI_DEV_REG1, reg1);
+   reg1 = sky2_pci_read32(hw, PCI_DEV_REG1);
/* Turn on/off phy power saving */
if (onoff)
reg1 = ~phy_power[port];
@@ -634,8 +632,8 @@ static void sky2_phy_power(struct sky2_h
if (onoff  hw-chip_id == CHIP_ID_YUKON_XL  hw-chip_rev  1)
reg1 |= coma_mode[port];
 
-   pci_write_config_dword(pdev, PCI_DEV_REG1, reg1);
-   pci_read_config_dword(pdev, PCI_DEV_REG1, reg1);
+   sky2_pci_write32(hw, PCI_DEV_REG1, reg1);
+   reg1 = sky2_pci_read32(hw, PCI_DEV_REG1);
 
udelay(100);
 }
@@ -704,9 +702,9 @@ static void sky2_wol_init(struct sky2_po
sky2_write16(hw, WOL_REGS(port, WOL_CTRL_STAT), ctrl);
 
/* Turn on legacy PCI-Express PME mode */
-   pci_read_config_dword(hw-pdev, PCI_DEV_REG1, reg1);
+   reg1 = sky2_pci_read32(hw, PCI_DEV_REG1);
reg1 |= PCI_Y2_PME_LEGACY;
-   pci_write_config_dword(hw-pdev, PCI_DEV_REG1, reg1);
+   sky2_pci_write32(hw, PCI_DEV_REG1, reg1);
 
/* block receiver */
sky2_write8(hw, SK_REG(port, RX_GMF_CTRL_T), GMF_RST_SET);
@@ -1322,9 +1320,10 @@ static int sky2_up(struct net_device *de
(cap = pci_find_capability(hw-pdev, PCI_CAP_ID_PCIX))) {
u16 cmd;
 
-   pci_read_config_word(hw-pdev, cap + PCI_X_CMD, cmd);
+   cmd = sky2_pci_read16(hw, cap + PCI_X_CMD);
cmd = ~PCI_X_CMD_MAX_SPLIT;
-   pci_write_config_word(hw-pdev, cap + PCI_X_CMD, cmd);
+   sky2_pci_write16(hw, cap + PCI_X_CMD, cmd);
+
}
 
if (netif_msg_ifup(sky2))
@@ -2422,12 +2421,12 @@ static void sky2_hw_intr(struct sky2_hw 
if (status  (Y2_IS_MST_ERR | Y2_IS_IRQ_STAT)) {
u16 pci_err;
 
-   pci_read_config_word(pdev, PCI_STATUS, pci_err);
+   pci_err = sky2_pci_read16(hw, PCI_STATUS);
if (net_ratelimit())
dev_err(pdev-dev, PCI hardware error (0x%x)\n,
pci_err);
 
-   pci_write_config_word(pdev, PCI_STATUS,
+   sky2_pci_write16(hw, PCI_STATUS,
  pci_err | PCI_STATUS_ERROR_BITS);
}
 
@@ -2699,13 +2698,10 @@ static inline u32 sky2_clk2us(const stru
 
 static int __devinit sky2_init(struct sky2_hw *hw)
 {
-   int rc;
u8 t8;
 
/* Enable all clocks and check for bad PCI access */
-   rc = pci_write_config_dword(hw-pdev, PCI_DEV_REG3, 0);
-   if (rc)
-   return rc;
+   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
 
sky2_write8(hw, B0_CTST, CS_RST_CLR);
 
@@ -2802,9 +2798,9 @@ static void sky2_reset(struct sky2_hw *h
sky2_write8(hw, B2_TST_CTRL1, TST_CFG_WRITE_ON);
 
/* clear PCI errors, if any */
-   pci_read_config_word(pdev, PCI_STATUS, 

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Valdis . Kletnieks
On Tue, 27 Nov 2007 22:09:42 +0100, Adrian Bunk said:

 Are there common reasons why these drivers are not upstream?

Well, on my laptop, I'm currently dragging along 3 out-of-tree kernel modules.
2 are well-known binary blobs so it's between me and the vendor, as usual.

The third is a USB webcam driver that happened to get caught at the wrong
end of the colorspace-conversion-in-kernel bunfight in the V4L playpen.
Somebody wants to figure out how to get the gspca drivers into the kernel,
they're at http://mxhaard.free.fr/download.html waiting for attention. ;)

(Don't look at *me*, I don't understand the code, or the bunfight - I just
happen to have one of the 244 supported webcams, and it works with that driver)


pgpRlM5JB5L1m.pgp
Description: PGP signature


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Andi Kleen
On Tue, Nov 27, 2007 at 03:00:22PM -0800, Stephen Hemminger wrote:
 On Tue, 27 Nov 2007 23:37:43 +0100
 Andi Kleen [EMAIL PROTECTED] wrote:
 
   With my Enterprise hat on, I can see where Andi was coming from
   originally. 
  
  For the record my original motivation was to fix the TCP exports everything
  for ipv6.ko case cleanly. I later realized that it would be useful for the
  ABI stability issues too, but it was really not my primary motivation.
  This is why I didn't even mention that in the original patch description.
 
 Since ipv6 can never be removed because it references itself, the whole 
 concept

AFAIK that is obsolete anyways. It was done because it was feared it would
be broken, but at least the broken cases I knew about were all fixed
a long time ago. Most likely it could be safely removed.

 of a modular ipv6 is flawed. 

Modules that cannot be unloaded are still useful. Standard case: Distributions
like to offer an option to not use ipv6 because that is popular workaround
for the common DNS server eats  queries and causes delays issue.
Forcing the user to rebuild the kernel for this wouldn't be practical.
If ipv6 wasn't modular that would be hard to do.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Jon Masters
On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote:
 On Monday 26 November 2007 17:15:44 Roland Dreier wrote:

  It seems pretty   
  clear to me that having a mechanism that requires modules to make
  explicit which (semi-)internal APIs makes reviewing easier
 
 Perhaps you've got lots of patches were people are using internal APIs they 
 shouldn't?

With my Enterprise hat on, I can see where Andi was coming from
originally. Just like we do in RHEL, SuSE have a concept of a kernel ABI
and we too have a concept of a whitelist of symbols - a subset of kernel
ABI that is approved for use by third parties (this is nothing to do
with licensing, this is to do with ABI stability we try to ensure).

As part of figuring out what should and should not be used by external
modules (outside of the core kernel), we've gained a bit of experience,
and it would be nice to be able to help others - this is nothing to do
with upstream ABI stability, but just to be of a public service by
documenting those functions that are never intended for modules but
which necessarily are exported because they have one or two users.

  , makes it 
  easier to communicate please don't use that API to module authors,
 
 Well, introduce an EXPORT_SYMBOL_INTERNAL().  It's a lot less code.  But 
 you'd 
 still need to show that people are having trouble knowing what APIs to use.

I suggested this exact idea privately at OLS. I still think it's the
best compromise, though I like bits of the namespace idea. An
EXPORT_SYMBOL_INTERNAL would indeed be trivial.

Jon.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] e1000: Fix NAPI state bug when Rx complete

2007-11-27 Thread Auke Kok
Don't exit polling when we have not yet used our budget, this causes
the NAPI system to end up with a messed up poll list.

Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/e1000/e1000_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index b7c3070..724f067 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3926,7 +3926,7 @@ e1000_clean(struct napi_struct *napi, int budget)
  work_done, budget);
 
/* If no Tx and not enough Rx work done, exit the polling mode */
-   if ((!tx_cleaned  (work_done  budget)) ||
+   if ((!tx_cleaned  (work_done == 0)) ||
   !netif_running(poll_dev)) {
 quit_polling:
if (likely(adapter-itr_setting  3))
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: net_rx_action/NAPI oops [PATCH]

2007-11-27 Thread Kok, Auke
Stephen Hemminger wrote:
 On Tue, 27 Nov 2007 14:34:44 -0800
 Kok, Auke [EMAIL PROTECTED] wrote:
 
 Stephen Hemminger wrote:
 On Tue, 27 Nov 2007 19:52:24 +0100
 Robert Olsson [EMAIL PROTECTED] wrote:

 Hello!

 I've discovered a bug while testing the new multiQ NAPI code. In hi-load 
 situations when we take down an interface we get a kernel panic. The
 oops is below.

 From what I see this happens when driver does napi_disable() and clears
 NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight 
 a sort indirect test but that's now not enough to cover the load 
 situation. 
 where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still 
 full quota. Latest git but I'll guess the is the same in all later kernels.
 There might be different solutions... one variant is below:
 It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear 
 NAPI_STATE_SCHED)
 and do a full quota. That bug already had to be fixed in other drivers,
 look like e1000 has same problem.
 Stephen,

 please enlighten me, can you e.g. show me a commit of other drivers where you
 fixed this up?

 Thanks,

 Auke
 
 Author: David S. Miller [EMAIL PROTECTED]  2007-10-11 18:08:29
 Committer: David S. Miller [EMAIL PROTECTED]  2007-10-11 18:08:29
 Parent: b9f2c0440d806e01968c3ed4def930a43be248ad ([netdrvr] Stop using legacy 
 hooks -self_test_count, -get_stats_count)
 Child:  266918303226cceac7eca38ced30f15f277bd89c ([SKY2]: status polling loop 
 (post merge))
 Branches: master, origin
 Follows: v2.6.23
 Precedes: v2.6.24-rc1
 
 [NET]: Fix NAPI completion handling in some drivers.
 
 In order for the list handling in net_rx_action() to be
 correct, drivers must follow certain rules as stated by
 this comment in net_rx_action():
 
   /* Drivers must not modify the NAPI state if they
* consume the entire weight.  In such cases this code
* still owns the NAPI instance and therefore can
* move the instance around on the list at-will.
*/
 
 A few drivers do not do this because they mix the budget checks
 with reading hardware state, resulting in crashes like the one
 reported by [EMAIL PROTECTED]
 
 BNX2 and TG3 are taken care of here, SKY2 fix is from Stephen
 Hemminger.
 
 Signed-off-by: David S. Miller [EMAIL PROTECTED]

OK, I'm not sure what went wrong there with e1000, but I'll send a patch in a 
second.

Robert, please give that patch a try (it fixes a crash that I had here as well)
and let us know if it works for you.

Auke
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-11-27 Thread Anil Veerabhadrappa

  Which ones were they exactly? I think JamesB wanted only common 
  transport values in the transport class. If it is driver specific then 
  it should go on the host or target or device with the scsi_host_template 
  attrs.
 
  
  It's a chicken  egg issue to put port mapper sysfs entry in scsi host
  attributes. Application won't see sysfs unless initiator creates an
 
 Sorry for the late response. I was on vacation.
 
 That is only with how you coded it today. I asked you to do something 
 like qla4xxx where the session and host are not so closely bound.

bnx2i is still using scsi host per iscsi session model, we plan to
migrate to scsi host per device model pretty soon. Previously you
indicated that you are working on this port, can you please share the
code? We will build on your work and bring this to closure.


 
  iSCSI session and driver can't create an iSCSI session without a tcp
 
 That is not right with how things are today even. The iscsi_session 
 struct can be created before the tcp connection. This was done because 
 we thought we were going to have to use only sysfs for all setup and 
 management (we ended up netlink and sysfs though).

You are right, software is flexible enough to allow creation of
session/connection and endpoint independently but so far user daemon
creates TCP endpoint before iSCSI session and bind them all together.
Any changes to this scheme will not be compatible with existing
distributions



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] net/bonding: Return nothing for not applicable values

2007-11-27 Thread =?utf-8?q?Ferenc_W=C3=A1gner?=
The previous code returned '\n' (that is, a single empty line)
from most files, with one exception (xmit_hash_policy), where
it returned 'NA\n'.  This patch consolidates each file to return
nothing at all if not applicable, not even a '\n'.

I find this behaviour more usual, more useful, more efficient
and shorter to code from both sides.

Signed-off-by: Ferenc Wágner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |   25 -
 1 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index a3f1b4a..6bb91e2 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -455,14 +455,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d,
  struct device_attribute *attr,
  char *buf)
 {
-   int count;
+   int count = 0;
struct bonding *bond = to_bond(d);
 
-   if ((bond-params.mode != BOND_MODE_XOR) 
-   (bond-params.mode != BOND_MODE_8023AD)) {
-   // Not Applicable
-   count = sprintf(buf, NA\n);
-   } else {
+   if ((bond-params.mode == BOND_MODE_XOR) ||
+   (bond-params.mode == BOND_MODE_8023AD)) {
count = sprintf(buf, %s %d\n,
xmit_hashtype_tbl[bond-params.xmit_policy].modename,
bond-params.xmit_policy);
@@ -1079,8 +1076,6 @@ static ssize_t bonding_show_primary(struct device *d,
 
if (bond-primary_slave)
count = sprintf(buf, %s\n, bond-primary_slave-dev-name);
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
@@ -1186,7 +1181,7 @@ static ssize_t bonding_show_active_slave(struct device *d,
 {
struct slave *curr;
struct bonding *bond = to_bond(d);
-   int count;
+   int count = 0;
 
read_lock(bond-curr_slave_lock);
curr = bond-curr_active_slave;
@@ -1194,8 +1189,6 @@ static ssize_t bonding_show_active_slave(struct device *d,
 
if (USES_PRIMARY(bond-params.mode)  curr)
count = sprintf(buf, %s\n, curr-dev-name);
-   else
-   count = sprintf(buf, \n);
return count;
 }
 
@@ -1309,8 +1302,6 @@ static ssize_t bonding_show_ad_aggregator(struct device 
*d,
struct ad_info ad_info;
count = sprintf(buf, %d\n, 
(bond_3ad_get_active_agg_info(bond, ad_info)) ?  0 : ad_info.aggregator_id);
}
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
@@ -1331,8 +1322,6 @@ static ssize_t bonding_show_ad_num_ports(struct device *d,
struct ad_info ad_info;
count = sprintf(buf, %d\n, 
(bond_3ad_get_active_agg_info(bond, ad_info)) ?  0: ad_info.ports);
}
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
@@ -1353,8 +1342,6 @@ static ssize_t bonding_show_ad_actor_key(struct device *d,
struct ad_info ad_info;
count = sprintf(buf, %d\n, 
(bond_3ad_get_active_agg_info(bond, ad_info)) ?  0 : ad_info.actor_key);
}
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
@@ -1375,8 +1362,6 @@ static ssize_t bonding_show_ad_partner_key(struct device 
*d,
struct ad_info ad_info;
count = sprintf(buf, %d\n, 
(bond_3ad_get_active_agg_info(bond, ad_info)) ?  0 : ad_info.partner_key);
}
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
@@ -1401,8 +1386,6 @@ static ssize_t bonding_show_ad_partner_mac(struct device 
*d,
print_mac(mac, ad_info.partner_system));
}
}
-   else
-   count = sprintf(buf, \n);
 
return count;
 }
-- 
1.4.4.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] net/bonding: Purely cosmetic: rename a local variable.

2007-11-27 Thread =?utf-8?q?Ferenc_W=C3=A1gner?=
Code for rendering multivalue sysfs files occurs three times
in this module.  Rename 'buffer' to 'buf' in the first, for
the sake of consistency.

Signed-off-by: Ferenc Wágner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 6bb91e2..5c31f5c 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -74,7 +74,7 @@ struct rw_semaphore bonding_rwsem;
  * show function for the bond_masters attribute.
  * The class parameter is ignored.
  */
-static ssize_t bonding_show_bonds(struct class *cls, char *buffer)
+static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 {
int res = 0;
struct bonding *bond;
@@ -86,13 +86,12 @@ static ssize_t bonding_show_bonds(struct class *cls, char 
*buffer)
/* not enough space for another interface name */
if ((PAGE_SIZE - res)  10)
res = PAGE_SIZE - 10;
-   res += sprintf(buffer + res, ++more++ );
+   res += sprintf(buf + res, ++more++ );
break;
}
-   res += sprintf(buffer + res, %s ,
-  bond-dev-name);
+   res += sprintf(buf + res, %s , bond-dev-name);
}
-   if (res) buffer[res-1] = '\n'; /* eat the leftover space */
+   if (res) buf[res-1] = '\n'; /* eat the leftover space */
up_read((bonding_rwsem));
return res;
 }
-- 
1.4.4.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bonding sysfs output

2007-11-27 Thread Wagner Ferenc
Andrew Morton [EMAIL PROTECTED] writes:

 On Tue, 27 Nov 2007 10:56:43 +0100 Ferenc Wagner [EMAIL PROTECTED] wrote:

 - raise patches against the latest Linus tree
 (ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/)
 
 I thought it was better to change to git.  Isn't it so?
 SubmittingPatches has nothing to say about that...
 Can I find collected best practices somewhere?  Which tree, which
 branch, how/when to rebase, format-patch, etc...

 gosh.  Documentation/Submit*,
 http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt,
 http://linux.yyz.us/patch-format.html, other places.  Probably people have
 written books about it by now.  But don't sweat it - you're close enough ;)

I wonder where the information got lost...  I miss docs on submitting
patches from git ONLY.  The general documentation is pretty good and
helpful, just doesn't treat git (not using git in general, but using
it for submitting patches to the Linux kernel).  On the other hand
there's a multitude of repositories to clone times a zillion branches
to follow.  Which should be the basis of the patches?  That's not very
clear.

Anyway, find them in my previous mails.  Too bad I realised just after
the fact that cosmetic changes should go first.  Hope it's mostly OK.
-- 
Regards,
Feri.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] Remove trailing NULs from network bonding sysfs interface.

2007-11-27 Thread =?utf-8?q?Ferenc_W=C3=A1gner?=
Also remove trailing spaces from multivalued files.

This fixes output like for example:

$ od -c /sys/class/net/bond0/bonding/slaves
000   e   t   h   -   l   e   f   t   e   t   h   -   r   i   g
020   h   t  \n  \0
025

It mostly entails deleting '+1'-s after sprintf() calls: the return value
of sprintf is the number of characters printed, without the closing NUL,
ie. exactly what the sysfs interface requires.  The three multivalue
cases are different, because they also have to swallow back a trailing
space.

Signed-off-by: Ferenc Wágner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |   66 +
 1 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index b29330d..a3f1b4a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -86,14 +86,13 @@ static ssize_t bonding_show_bonds(struct class *cls, char 
*buffer)
/* not enough space for another interface name */
if ((PAGE_SIZE - res)  10)
res = PAGE_SIZE - 10;
-   res += sprintf(buffer + res, ++more++);
+   res += sprintf(buffer + res, ++more++ );
break;
}
res += sprintf(buffer + res, %s ,
   bond-dev-name);
}
-   res += sprintf(buffer + res, \n);
-   res++;
+   if (res) buffer[res-1] = '\n'; /* eat the leftover space */
up_read((bonding_rwsem));
return res;
 }
@@ -235,14 +234,13 @@ static ssize_t bonding_show_slaves(struct device *d,
/* not enough space for another interface name */
if ((PAGE_SIZE - res)  10)
res = PAGE_SIZE - 10;
-   res += sprintf(buf + res, ++more++);
+   res += sprintf(buf + res, ++more++ );
break;
}
res += sprintf(buf + res, %s , slave-dev-name);
}
read_unlock(bond-lock);
-   res += sprintf(buf + res, \n);
-   res++;
+   if (res) buf[res-1] = '\n'; /* eat the leftover space */
return res;
 }
 
@@ -406,7 +404,7 @@ static ssize_t bonding_show_mode(struct device *d,
 
return sprintf(buf, %s %d\n,
bond_mode_tbl[bond-params.mode].modename,
-   bond-params.mode) + 1;
+   bond-params.mode);
 }
 
 static ssize_t bonding_store_mode(struct device *d,
@@ -463,11 +461,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d,
if ((bond-params.mode != BOND_MODE_XOR) 
(bond-params.mode != BOND_MODE_8023AD)) {
// Not Applicable
-   count = sprintf(buf, NA\n) + 1;
+   count = sprintf(buf, NA\n);
} else {
count = sprintf(buf, %s %d\n,
xmit_hashtype_tbl[bond-params.xmit_policy].modename,
-   bond-params.xmit_policy) + 1;
+   bond-params.xmit_policy);
}
 
return count;
@@ -527,7 +525,7 @@ static ssize_t bonding_show_arp_validate(struct device *d,
 
return sprintf(buf, %s %d\n,
   arp_validate_tbl[bond-params.arp_validate].modename,
-  bond-params.arp_validate) + 1;
+  bond-params.arp_validate);
 }
 
 static ssize_t bonding_store_arp_validate(struct device *d,
@@ -627,7 +625,7 @@ static ssize_t bonding_show_arp_interval(struct device *d,
 {
struct bonding *bond = to_bond(d);
 
-   return sprintf(buf, %d\n, bond-params.arp_interval) + 1;
+   return sprintf(buf, %d\n, bond-params.arp_interval);
 }
 
 static ssize_t bonding_store_arp_interval(struct device *d,
@@ -711,10 +709,7 @@ static ssize_t bonding_show_arp_targets(struct device *d,
res += sprintf(buf + res, %u.%u.%u.%u ,
   NIPQUAD(bond-params.arp_targets[i]));
}
-   if (res)
-   res--;  /* eat the leftover space */
-   res += sprintf(buf + res, \n);
-   res++;
+   if (res) buf[res-1] = '\n'; /* eat the leftover space */
return res;
 }
 
@@ -815,7 +810,7 @@ static ssize_t bonding_show_downdelay(struct device *d,
 {
struct bonding *bond = to_bond(d);
 
-   return sprintf(buf, %d\n, bond-params.downdelay * 
bond-params.miimon) + 1;
+   return sprintf(buf, %d\n, bond-params.downdelay * 
bond-params.miimon);
 }
 
 static ssize_t bonding_store_downdelay(struct device *d,
@@ -872,7 +867,7 @@ static ssize_t bonding_show_updelay(struct device *d,
 {
struct bonding *bond = to_bond(d);
 
-   return sprintf(buf, %d\n, bond-params.updelay * bond-params.miimon) 
+ 1;
+   return sprintf(buf, %d\n, bond-params.updelay * bond-params.miimon);
 
 }
 
@@ -936,7 

Re: [PATCH 6/7] [IPSEC]: Lock state when copying non-atomic fields to user-space

2007-11-27 Thread Masahide NAKAMURA
Herbert,

Monday 26 November 2007 20:07, Herbert Xu wrote:
 On Mon, Nov 26, 2007 at 11:18:45AM +0800, Herbert Xu wrote:
 
  I'm just going to revert this patch for 2.6.24 since we've lived
  with this race for so long anyway.
 
 Actually, instead of reverting it completely I'm just going to
 remove the newly added locks which should be just as effective.
 
 This would reduce the churn in the code as we'd be putting most
 of it back soon anyway.

With the patch you sent, the xfrm_state_walk() issue I reported
is solved at current net-2.6.25.

-- 
Masahide NAKAMURA
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-27 Thread Rick Jones

Adrian Bunk wrote:

On Tue, Nov 27, 2007 at 01:15:23PM -0800, Rick Jones wrote:


The real problem is that these drivers are not in the upstream kernel.

Are there common reasons why these drivers are not upstream?


One might be that upstream has not accepted them.  Anything doing or 
smelling of TOE comes to mind right away.



Which modules doing or smelling of TOE do work with unmodified vendor 
kernels?


At the very real risk of further demonstrating my Linux vocabulary 
limitations, I believe there is a Linux Sockets Acceleration 
module/whatnot for NetXen and related 10G NICs, and a cxgb3_toe (?) 
module for Chelsio 10G NICs.


rick jones
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


drivers/net/r6040.c warnings on x86_64

2007-11-27 Thread Andrew Morton
drivers/net/r6040.c: In function 'rx_buf_alloc':
drivers/net/r6040.c:262: warning: passing argument 2 of 'pci_map_single' makes 
pointer from integer without a cast
drivers/net/r6040.c: In function 'r6040_up':
drivers/net/r6040.c:631: warning: cast from pointer to integer of different size
drivers/net/r6040.c:632: warning: cast to pointer from integer of different size
drivers/net/r6040.c:637: warning: cast from pointer to integer of different size
drivers/net/r6040.c:638: warning: cast to pointer from integer of different size

can we fix these please?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html