[PATCH][UNIX] EOF on non-blocking SOCK_SEQPACKET

2007-11-27 Thread Florian Zumbiehl
Hi,

I am not absolutely sure whether this actually is a bug (as in: I've got
no clue what the standards say or what other implementations do), but at
least I was pretty surprised when I noticed that a recv() on a
non-blocking unix domain socket of type SOCK_SEQPACKET (which is connection
oriented, after all) where the remote end has closed the connection
returned -1 (EAGAIN) rather than 0 to indicate end of file.

This is a test case:

| #include sys/types.h
| #include unistd.h
| #include sys/socket.h
| #include sys/un.h
| #include fcntl.h
| #include string.h
| #include stdlib.h
| 
| int main(){
|   int sock;
|   struct sockaddr_un addr;
|   char buf[4096];
|   int pfds[2];
| 
|   pipe(pfds);
|   sock=socket(PF_UNIX,SOCK_SEQPACKET,0);
|   addr.sun_family=AF_UNIX;
|   strcpy(addr.sun_path,/tmp/foobar_testsock);
|   bind(sock,(struct sockaddr *)addr,sizeof(addr));
|   listen(sock,1);
|   if(fork()){
|   close(sock);
|   sock=socket(PF_UNIX,SOCK_SEQPACKET,0);
|   connect(sock,(struct sockaddr *)addr,sizeof(addr));
|   fcntl(sock,F_SETFL,fcntl(sock,F_GETFL)|O_NONBLOCK);
|   close(pfds[1]);
|   read(pfds[0],buf,sizeof(buf));
|   recv(sock,buf,sizeof(buf),0); // -- this one
|   }else accept(sock,NULL,NULL);
|   exit(0);
| }

If you try it, make sure /tmp/foobar_testsock doesn't exist.

The marked recv() returns -1 (EAGAIN) on 2.6.23.9. Below you find a
patch that fixes that.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a05c342..fa0aec5 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1632,8 +1632,13 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct 
socket *sock,
mutex_lock(u-readlock);
 
skb = skb_recv_datagram(sk, flags, noblock, err);
-   if (!skb)
+   if (!skb) {
+   unix_state_lock(sk);
+   if (sk-sk_type==SOCK_SEQPACKET  err==-EAGAIN  
(sk-sk_shutdownRCV_SHUTDOWN))
+   err=0; /* signal EOF on disconnected non-blocking 
SEQPACKET socket */
+   unix_state_unlock(sk);
goto out_unlock;
+   }
 
wake_up_interruptible(u-peer_wait);
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND][PATCH 1/3] PPPoE: improved hashing routine

2007-07-31 Thread Florian Zumbiehl
Hi,

  -static int hash_item(unsigned long sid, unsigned char *addr)
  +#if 8%PPPOE_HASH_BITS
  +#error 8 must be a multiple of PPPOE_HASH_BITS
  +#endif
 
 Since PPPOE_HASH_BITS is 4 I would think this check will break the
 build. :-)

Erm, I thought that 8 was 4*2, but maybe I didn't quite understand that
natural numbers business ;-)

 Anyways, I looked at this myself and half of the problem comes from
 the fact that sid is passed around as an unsigned long throughout
 this entire file yet the thing is just a __u16.
 
 So the first thing to fix is to use __u16 consistently for sid values.
 Then the sid hash looks obviously wrong and is easy to fix.
 
 Then you end up with a hash_item() that simply looks like:
 
 static unsigned int hash_item(__u16 sid, unsigned char *addr)
 {
   unsigned int hash;
 
   hash = (((unsigned int)addr[0]  24) |
   ((unsigned int)addr[1]  16) |
   ((unsigned int)addr[2]   8) |
   ((unsigned int)addr[3]   0));
 
   hash ^= (((unsigned int)addr[4]  8) |
((unsigned int)addr[5]  0));
 
   hash ^= sid;
 
   return ((hash ^ (hash  8) ^ (hash  16)) 
   (PPPOE_HASH_SIZE - 1));
 }
 
 which is what I've checked into my tree.

Erm, I'd say this not only produces different results than the old
version, but it also produces wrong results, in that it ignores quite
a bit of the data that's supposed to be hashed. If I didn't overlook
something, it only considers addr0x0f0f0f0f0f00 and sid0x0f0f, given
the right endianness of addr and that PPPOE_HASH_SIZE stays 16.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND][PATCH 1/3] PPPoE: improved hashing routine

2007-07-31 Thread Florian Zumbiehl
Hi,

  Erm, I'd say this not only produces different results than the old
  version, but it also produces wrong results, in that it ignores quite
  a bit of the data that's supposed to be hashed. If I didn't overlook
  something, it only considers addr0x0f0f0f0f0f00 and sid0x0f0f, given
  the right endianness of addr and that PPPOE_HASH_SIZE stays 16.
 
 You're right, I need to fix the shifts up.  How does this version
 look?
 
   hash ^= (hash  4) ^ (hash  12) ^ (hash  20);
   return (head ^ (hash  8) ^ (hash  24)) 
   (PPPOE_HASH_SIZE - 1);

Assuming that it was supposed to read s/head/hash/: Same disclaimers
apply, but I'd say this considers only addr0xff0fff0f000f and
sid0x0fff, so, well, yes, it's better, but still not quite what I
think it should be ;-)

 Actually it might be simpler and more efficient to just make
 PPPOE_HASH_SHIFT be 8.

SHIFT? SIZE? BITS?

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND][PATCH 1/3] PPPoE: improved hashing routine

2007-07-31 Thread Florian Zumbiehl
Hi,

   Actually it might be simpler and more efficient to just make
   PPPOE_HASH_SHIFT be 8.
  
  SHIFT? SIZE? BITS?
 
 You know what I meant :-)
 
 PPPOE_HASH_BITS.

Actually, I wasn't sure, for SHIFT looks more similar to SIZE
than to BITS, plus numbers are somewhat same order of magnitude
anyway, and I didn't quite see what you were up to, so ... whatever ;-)

 I guess otherwise we degenerate back to your original patch :)

Well, as I don't quite see what you were up to and as my patch doesn't
change anything about the semantics of the code, I'd at least prefer
that over your other suggestions so far ;-)

A few variations I tried back when I created the patch, using larger
things than a char for accumulating the pieces and then folding down
from that, turned out to be slower than what I finally submitted, at
least on the machines I tested it on. I didn't do comprehensive testing,
as it really doesn't matter, after all, but I think the version I
submitted is pretty fast, plus it's quite readable, and it keeps the
flexibility of different hash sizes, but still should allow the
compiler to optimize away the loops that allow for this flexibility.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND][PATCH 1/3] PPPoE: improved hashing routine

2007-07-29 Thread Florian Zumbiehl
Hi,

I'm not sure whether this is really worth it, but it looked so
extremely inefficient that I couldn't resist - so let's hope providers
will keep PPPoE around for a while, at least until terabit dsl ;-)

The new code produces the same results as the old version and is
~ 3 to 6 times faster for 4-bit hashes on the CPUs I tested.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9e51fcc..954328c 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -108,19 +108,24 @@ static inline int cmp_addr(struct pppoe_addr *a, unsigned 
long sid, char *addr)
(memcmp(a-remote,addr,ETH_ALEN) == 0));
 }
 
-static int hash_item(unsigned long sid, unsigned char *addr)
+#if 8%PPPOE_HASH_BITS
+#error 8 must be a multiple of PPPOE_HASH_BITS
+#endif
+
+static int hash_item(unsigned int sid, unsigned char *addr)
 {
-   char hash = 0;
-   int i, j;
+   unsigned char hash = 0;
+   unsigned int i;
 
-   for (i = 0; i  ETH_ALEN ; ++i) {
-   for (j = 0; j  8/PPPOE_HASH_BITS ; ++j) {
-   hash ^= addr[i]  ( j * PPPOE_HASH_BITS );
-   }
+   for (i = 0 ; i  ETH_ALEN ; i++) {
+   hash ^= addr[i];
+   }
+   for (i = 0 ; i  sizeof(sid_t)*8 ; i += 8 ){
+   hash ^= sidi;
+   }
+   for (i = 8 ; (i=1) = PPPOE_HASH_BITS ; ) {
+   hash ^= hashi;
}
-
-   for (i = 0; i  (sizeof(unsigned long)*8) / PPPOE_HASH_BITS ; ++i)
-   hash ^= sid  (i*PPPOE_HASH_BITS);
 
return hash  ( PPPOE_HASH_SIZE - 1 );
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND][PATCH 2/3] PPPoX/E: return ENOTTY on unknown ioctl requests

2007-07-29 Thread Florian Zumbiehl
Hi,

here another patch for the PPPoX/E code that makes sure that ENOTTY is
returned for unknown ioctl requests rather than 0 (and removes another
unneeded initializer which I didn't bother creating a separate patch for).

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 954328c..9554924 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -669,8 +669,8 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int val = 0;
-   int err = 0;
+   int val;
+   int err;
 
switch (cmd) {
case PPPIOCGMRU:
@@ -759,8 +759,9 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
err = 0;
break;
 
-   default:;
-   };
+   default:
+   err = -ENOTTY;
+   }
 
return err;
 }
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 3f8115d..51de561 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -72,7 +72,7 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int rc = 0;
+   int rc;
 
lock_sock(sk);
 
@@ -93,12 +93,9 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
break;
}
default:
-   if (pppox_protos[sk-sk_protocol]-ioctl)
-   rc = pppox_protos[sk-sk_protocol]-ioctl(sock, cmd,
- arg);
-
-   break;
-   };
+   rc = pppox_protos[sk-sk_protocol]-ioctl ?
+   pppox_protos[sk-sk_protocol]-ioctl(sock, cmd, arg) : 
-ENOTTY;
+   }
 
release_sock(sk);
return rc;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND][PATCH 3/3] PPPoE: move lock_sock() in pppoe_sendmsg() to the right location

2007-07-29 Thread Florian Zumbiehl
Hi,

and the last one for now: Acquire the sock lock in pppoe_sendmsg()
before accessing the sock - and in particular avoid releasing the lock
even though it hasn't been acquired.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9554924..eef8a5b 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -779,6 +779,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
struct net_device *dev;
char *start;
 
+   lock_sock(sk);
if (sock_flag(sk, SOCK_DEAD) || !(sk-sk_state  PPPOX_CONNECTED)) {
error = -ENOTCONN;
goto end;
@@ -789,8 +790,6 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
hdr.code = 0;
hdr.sid = po-num;
 
-   lock_sock(sk);
-
dev = po-pppoe_dev;
 
error = -EMSGSIZE;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] PPPoE: race between interface going down and release()

2007-04-20 Thread Florian Zumbiehl
Hi,

 This patch doesn't apply becuase in the actual pppoe.c code:

[...]

 So I'm having trouble figuring out what tree you generated
 that patch against :-)  Perhaps there was an earlier patch
 I missed or something.
 
 But I won't second guess and leave it to you to let me know
 what I should actually apply.

That one was against my first try of a fix. So, either you apply that
one first and then remove it again by applying that patch ;-) - or just
ignore those four patches completely and apply
[EMAIL PROTECTED] ff instead.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5 2.6.21-rc4] l2tp: pppol2tp core

2007-03-24 Thread Florian Zumbiehl
Hi,

 Florian Zumbiehl wrote:
 Hi,
 
 + * 251003 :Copied from pppoe.c version 0.6.9.
 
 you might want to have a look at the patches to the PPPoE code that were
 posted to netdev recently, as some of them seem to apply to code that's
 left over from pppoe.c.
 
 Do you mean this change?
 
  * 070228 : Fix to allow multiple sessions with same remote MAC and same
  *session id by including the local device ifindex in the
  *tuple identifying a session. This also ensures packets can't
  *  be injected into a session from interfaces other than the
  *  one   specified by userspace. Florian Zumbiehl 
  [EMAIL PROTECTED]
 
 The pppoe session hash table isn't used in the pppol2tp code so the 
 above change doesn't affect pppol2tp. I rechecked the latest pppoe.c and 
 I don't see other changes that haven't already been incorporated in 
 pppol2tp. Which specific code in pppol2tp do you think needs to be updated?

I meant those that haven't been committed yet (AFAIK):

http://www.spinics.net/lists/netdev/msg26926.html
http://www.spinics.net/lists/netdev/msg26927.html
http://www.spinics.net/lists/netdev/msg26928.html
http://www.spinics.net/lists/netdev/msg26751.html
http://www.spinics.net/lists/netdev/msg26752.html
http://www.spinics.net/lists/netdev/msg26753.html
http://www.spinics.net/lists/netdev/msg26754.html
http://www.spinics.net/lists/netdev/msg26787.html

(Plus the respective followup postings, of course.)

Some probably don't apply. I didn't have a very detailed look at your
code, but some certainly do apply.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] PPPoE: improved hashing routine

2007-03-13 Thread Florian Zumbiehl
Hi,

I'm not sure whether this is really worth it, but it looked so
extremely inefficient that I couldn't resist - so let's hope providers
will keep PPPoE around for a while, at least until terabit dsl ;-)

The new code produces the same results as the old version and is
~ 3 to 6 times faster for 4-bit hashes on the CPUs I tested.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9e51fcc..954328c 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -108,19 +108,24 @@ static inline int cmp_addr(struct pppoe_addr *a, unsigned 
long sid, char *addr)
(memcmp(a-remote,addr,ETH_ALEN) == 0));
 }
 
-static int hash_item(unsigned long sid, unsigned char *addr)
+#if 8%PPPOE_HASH_BITS
+#error 8 must be a multiple of PPPOE_HASH_BITS
+#endif
+
+static int hash_item(unsigned int sid, unsigned char *addr)
 {
-   char hash = 0;
-   int i, j;
+   unsigned char hash = 0;
+   unsigned int i;
 
-   for (i = 0; i  ETH_ALEN ; ++i) {
-   for (j = 0; j  8/PPPOE_HASH_BITS ; ++j) {
-   hash ^= addr[i]  ( j * PPPOE_HASH_BITS );
-   }
+   for (i = 0 ; i  ETH_ALEN ; i++) {
+   hash ^= addr[i];
+   }
+   for (i = 0 ; i  sizeof(sid_t)*8 ; i += 8 ){
+   hash ^= sidi;
+   }
+   for (i = 8 ; (i=1) = PPPOE_HASH_BITS ; ) {
+   hash ^= hashi;
}
-
-   for (i = 0; i  (sizeof(unsigned long)*8) / PPPOE_HASH_BITS ; ++i)
-   hash ^= sid  (i*PPPOE_HASH_BITS);
 
return hash  ( PPPOE_HASH_SIZE - 1 );
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] PPPoX/E: return ENOTTY on unknown ioctl requests

2007-03-13 Thread Florian Zumbiehl
Hi,

here another patch for the PPPoX/E code that makes sure that ENOTTY is
returned for unknown ioctl requests rather than 0 (and removes another
unneeded initializer which I didn't bother creating a separate patch for).

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 954328c..9554924 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -669,8 +669,8 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int val = 0;
-   int err = 0;
+   int val;
+   int err;
 
switch (cmd) {
case PPPIOCGMRU:
@@ -759,8 +759,9 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
err = 0;
break;
 
-   default:;
-   };
+   default:
+   err = -ENOTTY;
+   }
 
return err;
 }
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 3f8115d..51de561 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -72,7 +72,7 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int rc = 0;
+   int rc;
 
lock_sock(sk);
 
@@ -93,12 +93,9 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
break;
}
default:
-   if (pppox_protos[sk-sk_protocol]-ioctl)
-   rc = pppox_protos[sk-sk_protocol]-ioctl(sock, cmd,
- arg);
-
-   break;
-   };
+   rc = pppox_protos[sk-sk_protocol]-ioctl ?
+   pppox_protos[sk-sk_protocol]-ioctl(sock, cmd, arg) : 
-ENOTTY;
+   }
 
release_sock(sk);
return rc;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] PPPoE: move lock_sock() in pppoe_sendmsg() to the right location

2007-03-13 Thread Florian Zumbiehl
Hi,

and the last one for now: Acquire the sock lock in pppoe_sendmsg()
before accessing the sock - and in particular avoid releasing the lock
even though it hasn't been acquired.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9554924..eef8a5b 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -779,6 +779,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
struct net_device *dev;
char *start;
 
+   lock_sock(sk);
if (sock_flag(sk, SOCK_DEAD) || !(sk-sk_state  PPPOX_CONNECTED)) {
error = -ENOTCONN;
goto end;
@@ -789,8 +790,6 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
hdr.code = 0;
hdr.sid = po-num;
 
-   lock_sock(sk);
-
dev = po-pppoe_dev;
 
error = -EMSGSIZE;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] PPPoE: race between interface going down and release()

2007-03-12 Thread Florian Zumbiehl
Hi,

 Attached below is my take on how to address this problem.  
 This addresses any concerns you may have had about checking 
 po-pppoe_dev==NULL,
 because accesses to this field are now synchronized with pppoe_hash_lock.

That indeed looks like a much cleaner solution, so I'd certainly prefer
it over mine. Testing so far didn't turn up any problems, either.

 Once we can settle on a fix for this, I'll deal with the SID==0 issue 
 (trying to do that now would just cause patch conflicts).

Well, yeah, this stuff probably is more important than support
for SID 0 :-)

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] PPPoE: miscellaneous smaller cleanups

2007-03-10 Thread Florian Zumbiehl
Hi,

below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled accross while
reading the source.

Florian

---
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 13f5b22..18d1a4d 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -207,7 +207,7 @@ static inline struct pppox_sock *get_item(unsigned long sid,
 
 static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 {
-   struct net_device *dev = NULL;
+   struct net_device *dev;
int ifindex;
 
dev = dev_get_by_name(sp-sa_addr.pppoe.dev);
@@ -222,9 +222,6 @@ static inline int set_item(struct pppox_sock *po)
 {
int i;
 
-   if (!po)
-   return -EINVAL;
-
write_lock_bh(pppoe_hash_lock);
i = __set_item(po);
write_unlock_bh(pppoe_hash_lock);
@@ -344,7 +341,7 @@ static struct notifier_block pppoe_notifier = {
 static int pppoe_rcv_core(struct sock *sk, struct sk_buff *skb)
 {
struct pppox_sock *po = pppox_sk(sk);
-   struct pppox_sock *relay_po = NULL;
+   struct pppox_sock *relay_po;
 
if (sk-sk_state  PPPOX_BOUND) {
struct pppoe_hdr *ph = (struct pppoe_hdr *) skb-nh.raw;
@@ -514,7 +511,6 @@ static int pppoe_release(struct socket *sock)
 {
struct sock *sk = sock-sk;
struct pppox_sock *po;
-   int error = 0;
 
if (!sk)
return 0;
@@ -543,7 +539,7 @@ static int pppoe_release(struct socket *sock)
skb_queue_purge(sk-sk_receive_queue);
sock_put(sk);
 
-   return error;
+   return 0;
 }
 
 
@@ -762,10 +758,10 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
 static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
  struct msghdr *m, size_t total_len)
 {
-   struct sk_buff *skb = NULL;
+   struct sk_buff *skb;
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int error = 0;
+   int error;
struct pppoe_hdr hdr;
struct pppoe_hdr *ph;
struct net_device *dev;
@@ -929,10 +925,10 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct 
socket *sock,
  struct msghdr *m, size_t total_len, int flags)
 {
struct sock *sk = sock-sk;
-   struct sk_buff *skb = NULL;
+   struct sk_buff *skb;
int error = 0;
int len;
-   struct pppoe_hdr *ph = NULL;
+   struct pppoe_hdr *ph;
 
if (sk-sk_state  PPPOX_BOUND) {
error = -EIO;
@@ -949,7 +945,6 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket 
*sock,
m-msg_namelen = 0;
 
if (skb) {
-   error = 0;
ph = (struct pppoe_hdr *) skb-nh.raw;
len = ntohs(ph-length);
 
@@ -991,7 +986,7 @@ out:
 
 static __inline__ struct pppox_sock *pppoe_get_idx(loff_t pos)
 {
-   struct pppox_sock *po = NULL;
+   struct pppox_sock *po;
int i = 0;
 
for (; i  PPPOE_HASH_SIZE; i++) {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] PPPoE: race between interface going down and connect()

2007-03-10 Thread Florian Zumbiehl
Hi,

below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to
the device would be freed, and after that, connect() would put a new
entry into the hash table referring to the device that meanwhile is
down already - which also would cause unregister_netdevice() to wait
until the socket has been release()d.

This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...

Florian

---
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 18d1a4d..1aeac2c 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -218,17 +218,6 @@ static inline struct pppox_sock *get_item_by_addr(struct 
sockaddr_pppox *sp)
return get_item(sp-sa_addr.pppoe.sid, sp-sa_addr.pppoe.remote, 
ifindex);
 }
 
-static inline int set_item(struct pppox_sock *po)
-{
-   int i;
-
-   write_lock_bh(pppoe_hash_lock);
-   i = __set_item(po);
-   write_unlock_bh(pppoe_hash_lock);
-
-   return i;
-}
-
 static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, 
int ifindex)
 {
struct pppox_sock *ret;
@@ -595,14 +584,18 @@ static int pppoe_connect(struct socket *sock, struct 
sockaddr *uservaddr,
po-pppoe_dev = dev;
po-pppoe_ifindex = dev-ifindex;
 
-   if (!(dev-flags  IFF_UP))
+   write_lock_bh(pppoe_hash_lock);
+   if (!(dev-flags  IFF_UP)){
+   write_unlock_bh(pppoe_hash_lock);
goto err_put;
+   }
 
memcpy(po-pppoe_pa,
   sp-sa_addr.pppoe,
   sizeof(struct pppoe_addr));
 
-   error = set_item(po);
+   error = __set_item(po);
+   write_unlock_bh(pppoe_hash_lock);
if (error  0)
goto err_put;
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] PPPoE: memory leak when socket is release()d before PPPIOCGCHAN has been called on it

2007-03-10 Thread Florian Zumbiehl
Hi,

below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Florian

---
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 9315046..3f8115d 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -58,7 +58,7 @@ void pppox_unbind_sock(struct sock *sk)
 {
/* Clear connection to ppp device, if attached. */
 
-   if (sk-sk_state  (PPPOX_BOUND | PPPOX_ZOMBIE)) {
+   if (sk-sk_state  (PPPOX_BOUND | PPPOX_CONNECTED | PPPOX_ZOMBIE)) {
ppp_unregister_channel(pppox_sk(sk)-chan);
sk-sk_state = PPPOX_DEAD;
}
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] PPPoE: race between interface going down and release()

2007-03-10 Thread Florian Zumbiehl
Hi,

below you find the last patch for now. It (hopefully) fixes a race
between a socket being release()d and the interface it's using going
down. As pppoe_release() didn't lock the socket, and pppoe_flush_dev()
did the locking in the wrong place, pppoe_flush_dev() could set
po-pppoe_dev to NULL, which would then cause pppoe_release() to not
dev_put() the interface, but to still mark the socket as DEAD,
which in turn would cause pppoe_flush_dev() to not dev_put() the
interface, effectively leaking one reference to the device, thus making it
impossible to remove (ignoring the possibility of overflowing the reference
counter by repeated use of this race ;-).

The thing I'm not quite sure about is whether the outer

|if (po-pppoe_dev == dev) {

is actually reliable this way on SMP systems, as far as cache consistency
is concerned. I left it that way for now, as any alternative locking
strategies that would lock the socket before doing this comparison seemed
to be pretty complicated to implement because of the need to drop the
hash table lock before trying to acquire the socket lock, so I'd rather
be sure that this actually is a problem before I try to solve it ;-)

Florian

---
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 1aeac2c..f5abfff 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -253,7 +253,6 @@ static void pppoe_flush_dev(struct net_device *dev)
struct sock *sk = sk_pppox(po);
 
sock_hold(sk);
-   po-pppoe_dev = NULL;
 
/* We hold a reference to SK, now drop the
 * hash table lock so that we may attempt
@@ -263,12 +262,15 @@ static void pppoe_flush_dev(struct net_device *dev)
 
lock_sock(sk);
 
-   if (sk-sk_state 
-   (PPPOX_CONNECTED | PPPOX_BOUND)) {
-   pppox_unbind_sock(sk);
+   if(po-pppoe_dev==dev){
dev_put(dev);
-   sk-sk_state = PPPOX_ZOMBIE;
-   sk-sk_state_change(sk);
+   po-pppoe_dev = NULL;
+   if (sk-sk_state 
+   (PPPOX_CONNECTED | PPPOX_BOUND)) {
+   pppox_unbind_sock(sk);
+   sk-sk_state = PPPOX_ZOMBIE;
+   sk-sk_state_change(sk);
+   }
}
 
release_sock(sk);
@@ -504,8 +506,11 @@ static int pppoe_release(struct socket *sock)
if (!sk)
return 0;
 
-   if (sock_flag(sk, SOCK_DEAD))
+   lock_sock(sk);
+   if (sock_flag(sk, SOCK_DEAD)){
+   release_sock(sk);
return -EBADF;
+   }
 
pppox_unbind_sock(sk);
 
@@ -526,6 +531,7 @@ static int pppoe_release(struct socket *sock)
sock-sk = NULL;
 
skb_queue_purge(sk-sk_receive_queue);
+   release_sock(sk);
sock_put(sk);
 
return 0;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Session ID 0 with PPPoE

2007-03-07 Thread Florian Zumbiehl
Hi,

 In the current  code SID 0 indicates that the socket is to be un-bound.

That are the semantics used by the kernel code, yes - but even pppd uses
different semantics (which can't quite work, of course ...).

 Supporting unbinding of the socket was intended to permit the PPPoE
 session to be reconnected without closing/reopening the socket; which
 would mean that you'd have to re-bind the PPPoE/PPP channel bindings.
 Thus it is conceivable to swap or renegotiate PPPoE connection
 underneath a PPP connection, hypothetically if anyone ever considered
 doing so.  Is that worth it?  I don't know.  One could eliminate that
 disconnect behavior and I don't think anyone would care.

Well, if _you_ don't even know ... =:-)

Anyway, if it is not to be eliminated, it should be represented some way
differently, I guess. Which probably would break backwards compatibility
at the userspace API completely, of course, which is a bad thing[tm], so
possibly simply changing the semantics to what is already assumed by most
userspace applications might be the way to go.

 I'll conceed that a SID of 0 could appear from outer space.   I've never
 seen that happening.

Now, the question is, of course: How many samples is this based on
(both, number of connection attempts and number of different peer
implementations)?

 The only way I see this being an issue is if a
 PPPoE server insists on giving you SID 0 and only SID 0 repeatedly.  And
 I've never seen *that* happening.

So, what you are saying is that pppd/kernel-PPP/-PPPoE is so unstable that
you shouldn't ever be using it without some cron job/wrapper that takes
care of restarts anyway, so some additional bug that causes pppd to exit
unexpectedly doesn't matter?

That is to say: If I didn't overlook anything, pppd (the rp-pppoe plugin,
that is) warns you if the PADS' session id field is 0 that the AC was
somehow violating the RFC, but then goes on, using 0 as the session id
anyway - thus calling connect() with a session id field of 0, obviously
assuming that this will bind to session id 0 - which it doesn't.

Now, when it invokes the PPPIOCGCHAN ioctl on that seemingly successfully
connect()ed socket, it gets an ENOTCONN - and exits with a fatal error.
And exiting with a fatal error isn't quite what I'd expect pppd to do
when the peer is behaving correctly according to the RFC, since that will
not just cause a retry that might be likely to give a different session
id and thus would cause the connection to finally be established with
some delay, but rather will break network connectivity until manual
intervention.

Of course, you could argue that this is somehow a bug in pppd, as it
doesn't use the kernel API correctly. But I assume that fixing the
kernel to support sid 0 rather than fixing pppd to support sid 0
(which would mean implementing an automatic retry when assigned
sid 0 by the AC) is somehow the better solution.

 If you'd really like to pursue this, I'll be happy to review and ack
 patches in this regard.  However, I don't see what there is to be
 actually gained by pursuing this.   I'm open to being convinced; what is
 the motivation behind this?  If there is a real problem here I'll be
 glad to get involved in fixing it myself.

Now, what is a real problem? I haven't noticed this being a problem for
me yet, no. But I have been using userspace RP-PPPOE until recently and
I am using a cron job that takes care of restarts in case pppd exits for
some reason, so it's rather unlikely that I'd notice if it did happen.

But isn't a problem that does occur seldom, is difficult to reproduce,
and has more than just temporary effect much worse a problem than something
that causes a crash every half an hour and thus is simple to come by using
debugging and doesn't surprise you the very moment you need things to
simply work?

Actually, IMO the major question is: Is there any application that does
(intentionally) make use of this session rebinding/unbinding? Since
simply dropping that probably would be the easiest fix to implement.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Session ID 0 with PPPoE

2007-03-07 Thread Florian Zumbiehl
Hi,

 This change can be made; the unbinding behavior can be removed and SID 0
 can be made valid. I hope I was clear in my previous e-mail that I
 didn't object to this.

Not quite. But now I think I got it ;-)

 PPPoE connections are unstable.  Ethernet frames get dropped.  Things
 die randomly. And yes, you typically want to have a cron job or script
 re-spawning pppd on failure.  Making this change won't increase the
 reliability of these connections in any meaningful way, it won't break
 pppd either.  

I mean, you are right from my experience that you want to have that for
pppd (which is why I _do_ have that cron job) - but your rationale
doesn't look very convincing to me. IP networks are unstable, you know.
IP packets get dropped. Things (read: TCP peers) die randomly. But no,
I don't have a cron job that restarts the MTA or web server in case some
TCP peer died - because they don't exit when some TCP connection times
out or something.

The very same thing I'd expect from pppd. If I do specify the persist
and maxfail 0 options, I expect pppd not to exit even when it receives
complete garbage from the network, much less when it receives completely
valid packets as per the RFC. The only reasonable reasons for exiting IMO
are lack of system resources or missing system features/invalid
configuration. The only reason why I do have a cron job that takes care
of restarts are all the bugs I encountered in pppd so far that caused
it to exit because it used the kernel PPP API wrongly in case of reconnects
or because it leaked ptys until there were none available anymore, not
the fact that PPP(oE) is a potentially unreliable network protocol or
because modems do lose their connection at times.

 The only question I have is why is this important to you?  I'm simply
 curious as to what you are trying to accomplish; is this related to some
 other work you are doing or is it correctness as a virtue?

It isn't extremely important to me as it doesn't affect me. But if for
any reason, then because I hope that others will take care of bugs that
they stumble across before they become a potentially difficult to solve
problem for me, too.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Session ID 0 with PPPoE

2007-03-04 Thread Florian Zumbiehl
Hi,

 From the RFC:
 
 5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet
 
When the Access Concentrator receives a PADR packet, it prepares to
begin a PPP session.  It generates a unique SESSION_ID for the PPPoE
session and replies to the Host with a PADS packet.  The
DESTINATION_ADDR field is the unicast Ethernet address of the Host
that sent the PADR.  The CODE field is set to 0x65 and the SESSION_ID
MUST be set to the unique value generated for this PPPoE session.
 
The PADS packet contains exactly one TAG of TAG_TYPE Service-Name,
indicating the service under which Access Concentrator has accepted
the PPPoE session, and any number of other TAG types.
 
If the Access Concentrator does not like the Service-Name in the
PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE
Service-Name-Error (and any number of other TAG types).  In this case
the SESSION_ID MUST be set to 0x.
 
 
 
 As you can see from the last paragraph, a session id of 0 implies a
 rejection of the PADR.  Thus, you can't possibly get a PADS packet that
 completes and initiates a valid session if the session id is 0.
 
 Note that the RFC does not prohibit all other aspects of the PADS to be
 structured as if it were a valid success response; the only condition
 and requirement of a failure mode here is the session id.

| [...] then it MUST reply with a PADS containing a TAG of TAG_TYPE
| Service-Name-Error [...]

!?!

To my understanding, the indicator is the Service-Name-Error tag, and
the RFC only states that if such a tag is present (indicating that
the AC doesn't like the requested service name and thus rejects the
session request), the session id field must be 0x - not that the
session id field may not be 0x if this tag is not present (which
would indicate that this is a valid session).

 Also 0x is reserved for future use. Thus it cannot be used as a
 sentinel value to indicate an invalid session id.

Well, currently it could (IMO, a connect() specifying 0x as the
session ID should fail anyway as of now as it is not a valid session id
as per the RFC - and 0x in the session id field could be used to mean
basically anything at the protocol level in the future) - however that
probably wouldn't be a good choice for extensibility reasons: If at
some point, a protocol session id field of 0x does somehow mean
something that would sensibly be represented as 0x in the session
id field of the internal data structure, one would have to change the
code again. So I guess the session id simply shouldn't be overloaded,
not even with an indication of its validity.

 Changing this code would require that the user-space component be
 synchronized with this change; as the socket interface implies that 0 is
 an invalid/unbound session id. 

Well, either that or the indication as to whether the session id is
currently valid should be stored in some different way.

 Lots of badness will occur if 0 is allowed as a session id, and nothing
 will be gained because it can't possibly be a valid session id.

Well, if that was the case, sure. But I still don't see any reason why
it can't be.

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface

2007-03-03 Thread Florian Zumbiehl
Hi,

 From: Florian Zumbiehl [EMAIL PROTECTED]
 Date: Wed, 28 Feb 2007 13:38:44 +0100
 
  As noone seems to have an opinion on this: Here is a patch that does
  work for me and that should solve the problem as far as that is easily
  possible. It is based on the assumption that an interface's ifindex is
  basically an alias for a local MAC address, so incoming packets now are
  matched to sockets based on remote MAC, session id, and ifindex of the
  interface the packet came in on/the socket was bound to by connect().
 
 I agree with your analysis and have applied your patch.

Below you find a slightly changed version of the patch that avoids
a possible NULL pointer dereference in case pppoe_device_event()/
pppoe_flush_dev() dev_put()s dev and sets it to NULL before pppoe_connect()
tries to unbind from the previous address, in which case it would
dereference the NULL pointer in dev.

It now saves the ifindex in the socket's data structure upon connect(),
so that it's still available for finding the entry to remove from the
hash table in case pppoe_device_event() should have dropped the socket's
reference to dev.

 Another way to implement this would have been to store the
 pre-computed ifindex on the kernel side sockaddr.

Well, that probably depends on the intended semantics. There isn't any
documentation somewhere that specifies what the intended behaviour is,
is there?!

Florian

--- linux-2.6.20/drivers/net/pppoe.c.orig   2007-02-25 19:23:51.0 
+0100
+++ linux-2.6.20/drivers/net/pppoe.c2007-03-04 02:11:51.0 +0100
@@ -7,6 +7,12 @@
  *
  * Version:0.7.0
  *
+ * 070228 :Fix to allow multiple sessions with same remote MAC and same
+ * session id by including the local device ifindex in the
+ * tuple identifying a session. This also ensures packets can't
+ * be injected into a session from interfaces other than the one
+ * specified by userspace. Florian Zumbiehl [EMAIL PROTECTED]
+ * (Oh, BTW, this one is YYMMDD, in case you were wondering ...)
  * 220102 :Fix module use count on failure in pppoe_create, pppox_sk -acme
  * 030700 :Fixed connect logic to allow for disconnect.
  * 270700 :Fixed potential SMP problems; we must protect against
@@ -127,14 +133,14 @@
  *  Set/get/delete/rehash items  (internal versions)
  *
  **/
-static struct pppox_sock *__get_item(unsigned long sid, unsigned char *addr)
+static struct pppox_sock *__get_item(unsigned long sid, unsigned char *addr, 
int ifindex)
 {
int hash = hash_item(sid, addr);
struct pppox_sock *ret;
 
ret = item_hash_table[hash];
 
-   while (ret  !cmp_addr(ret-pppoe_pa, sid, addr))
+   while (ret  !(cmp_addr(ret-pppoe_pa, sid, addr)  
ret-pppoe_ifindex == ifindex))
ret = ret-next;
 
return ret;
@@ -147,21 +153,19 @@
 
ret = item_hash_table[hash];
while (ret) {
-   if (cmp_2_addr(ret-pppoe_pa, po-pppoe_pa))
+   if (cmp_2_addr(ret-pppoe_pa, po-pppoe_pa)  
ret-pppoe_ifindex == po-pppoe_ifindex)
return -EALREADY;
 
ret = ret-next;
}
 
-   if (!ret) {
-   po-next = item_hash_table[hash];
-   item_hash_table[hash] = po;
-   }
+   po-next = item_hash_table[hash];
+   item_hash_table[hash] = po;
 
return 0;
 }
 
-static struct pppox_sock *__delete_item(unsigned long sid, char *addr)
+static struct pppox_sock *__delete_item(unsigned long sid, char *addr, int 
ifindex)
 {
int hash = hash_item(sid, addr);
struct pppox_sock *ret, **src;
@@ -170,7 +174,7 @@
src = item_hash_table[hash];
 
while (ret) {
-   if (cmp_addr(ret-pppoe_pa, sid, addr)) {
+   if (cmp_addr(ret-pppoe_pa, sid, addr)  ret-pppoe_ifindex 
== ifindex) {
*src = ret-next;
break;
}
@@ -188,12 +192,12 @@
  *
  **/
 static inline struct pppox_sock *get_item(unsigned long sid,
-unsigned char *addr)
+unsigned char *addr, int ifindex)
 {
struct pppox_sock *po;
 
read_lock_bh(pppoe_hash_lock);
-   po = __get_item(sid, addr);
+   po = __get_item(sid, addr, ifindex);
if (po)
sock_hold(sk_pppox(po));
read_unlock_bh(pppoe_hash_lock);
@@ -203,7 +207,15 @@
 
 static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 {
-   return get_item(sp-sa_addr.pppoe.sid, sp-sa_addr.pppoe.remote);
+   struct net_device *dev = NULL;
+   int ifindex;
+
+   dev = dev_get_by_name(sp-sa_addr.pppoe.dev);
+   if(!dev)
+   return NULL;
+   ifindex = dev-ifindex;
+   dev_put(dev);
+   return

Session ID 0 with PPPoE

2007-03-03 Thread Florian Zumbiehl
Hi,

I noticed that the PPPoE code doesn't allow session id 0x to be used
for an actual session but rather considers 0 a special value denoting
that the socket is unbound. Now, when reading RFC 2516, I couldn't really
find anything that would forbid 0x as a session id. Only 0x is
reserved for future use and MUST NOT be used, while 0x is specified
as the only allowed value for the session id field on certain types of
packets, but neither can I find any statement that forbids 0x as
an ordinary session identifier, nor can I find any reasons that would
prevent PPPoE from functioning properly with a session id of 0x.

Does anyone of you see any reason why a server would not be allowed to
select 0x as the session id for a PPPoE session?

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface

2007-02-28 Thread Florian Zumbiehl
Hi,

 Well, your opinions are welcome. Plus any hints as to how to fix this.
 I'd tend to simply(?) add some more fields to the
 {hash,get,set,delete}_item() functions in drivers/net/pppoe.c.
 But maybe there is some better way?

As noone seems to have an opinion on this: Here is a patch that does
work for me and that should solve the problem as far as that is easily
possible. It is based on the assumption that an interface's ifindex is
basically an alias for a local MAC address, so incoming packets now are
matched to sockets based on remote MAC, session id, and ifindex of the
interface the packet came in on/the socket was bound to by connect().

For relayed packets, the socket that's used for relaying is selected
based on destination MAC, session ID and the interface index of the
interface whose name currently matches the name requested by userspace
as the relaying source interface. The relaying part of the patch is
untested.

Please note that I'd consider this a security fix for reasons outlined
in previous mails.

Florian

--- linux-2.6.20/drivers/net/pppoe.c.orig   2007-02-25 19:23:51.0 
+0100
+++ linux-2.6.20/drivers/net/pppoe.c2007-02-28 12:56:05.0 +0100
@@ -7,6 +7,12 @@
  *
  * Version:0.7.0
  *
+ * 070228 :Fix to allow multiple sessions with same remote MAC and same
+ * session id by including the local device ifindex in the
+ * tuple identifying a session. This also ensures packets can't
+ * be injected into a session from interfaces other than the one
+ * specified by userspace. Florian Zumbiehl [EMAIL PROTECTED]
+ * (Oh, BTW, this one is YYMMDD, in case you were wondering ...)
  * 220102 :Fix module use count on failure in pppoe_create, pppox_sk -acme
  * 030700 :Fixed connect logic to allow for disconnect.
  * 270700 :Fixed potential SMP problems; we must protect against
@@ -127,14 +133,14 @@
  *  Set/get/delete/rehash items  (internal versions)
  *
  **/
-static struct pppox_sock *__get_item(unsigned long sid, unsigned char *addr)
+static struct pppox_sock *__get_item(unsigned long sid, unsigned char *addr, 
int ifindex)
 {
int hash = hash_item(sid, addr);
struct pppox_sock *ret;
 
ret = item_hash_table[hash];
 
-   while (ret  !cmp_addr(ret-pppoe_pa, sid, addr))
+   while (ret  !(cmp_addr(ret-pppoe_pa, sid, addr)  
ret-pppoe_dev-ifindex == ifindex))
ret = ret-next;
 
return ret;
@@ -147,21 +153,19 @@
 
ret = item_hash_table[hash];
while (ret) {
-   if (cmp_2_addr(ret-pppoe_pa, po-pppoe_pa))
+   if (cmp_2_addr(ret-pppoe_pa, po-pppoe_pa)  
ret-pppoe_dev-ifindex == po-pppoe_dev-ifindex)
return -EALREADY;
 
ret = ret-next;
}
 
-   if (!ret) {
-   po-next = item_hash_table[hash];
-   item_hash_table[hash] = po;
-   }
+   po-next = item_hash_table[hash];
+   item_hash_table[hash] = po;
 
return 0;
 }
 
-static struct pppox_sock *__delete_item(unsigned long sid, char *addr)
+static struct pppox_sock *__delete_item(unsigned long sid, char *addr, int 
ifindex)
 {
int hash = hash_item(sid, addr);
struct pppox_sock *ret, **src;
@@ -170,7 +174,7 @@
src = item_hash_table[hash];
 
while (ret) {
-   if (cmp_addr(ret-pppoe_pa, sid, addr)) {
+   if (cmp_addr(ret-pppoe_pa, sid, addr)  
ret-pppoe_dev-ifindex == ifindex) {
*src = ret-next;
break;
}
@@ -188,12 +192,12 @@
  *
  **/
 static inline struct pppox_sock *get_item(unsigned long sid,
-unsigned char *addr)
+unsigned char *addr, int ifindex)
 {
struct pppox_sock *po;
 
read_lock_bh(pppoe_hash_lock);
-   po = __get_item(sid, addr);
+   po = __get_item(sid, addr, ifindex);
if (po)
sock_hold(sk_pppox(po));
read_unlock_bh(pppoe_hash_lock);
@@ -203,7 +207,15 @@
 
 static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 {
-   return get_item(sp-sa_addr.pppoe.sid, sp-sa_addr.pppoe.remote);
+   struct net_device *dev = NULL;
+   int ifindex;
+
+   dev = dev_get_by_name(sp-sa_addr.pppoe.dev);
+   if(!dev)
+   return NULL;
+   ifindex = dev-ifindex;
+   dev_put(dev);
+   return get_item(sp-sa_addr.pppoe.sid, sp-sa_addr.pppoe.remote, 
ifindex);
 }
 
 static inline int set_item(struct pppox_sock *po)
@@ -220,12 +232,12 @@
return i;
 }
 
-static inline struct pppox_sock *delete_item(unsigned long sid, char *addr)
+static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, 
int ifindex)
 {
struct

Re: Weird problem with PPPoE on tap interface

2007-02-26 Thread Florian Zumbiehl
Hi,

 I'm experiencing a pretty strange problem with kernel PPPoE on tap
 interfaces with a vanilla 2.6.20 kernel that prevents the PPP connection
 from being established:
[...]

Well, I guess I have found the source of the problem: The PPPoE code seems
to match inbound packets to sockets based only on source address and
session ID - so in the given scenario, packets are delivered to pppd
once when they arrive at the real ethernet interface and a second time
when they arrive at the tap interface. Now, this is kindof two problems,
actually:

1. This is not in compliance with RFC 2516, which states in section 4:

   | The SESSION_ID field is sixteen bits.  It is an unsigned value in
   | network byte order.  It's value is defined below for Discovery
   | packets.  The value is fixed for a given PPP session and, in fact,
   | defines a PPP session along with the Ethernet SOURCE_ADDR and
   | DESTINATION_ADDR.

   So, it would be legal for there to be multiple sessions with the same
   peer using the same session ID, as long as they are using different
   local MAC addresses - but the current PPPoE code would not be able to
   distinguish those.

   This probably is not a problem for me, but should be fixed anyway,
   IMO. (I say probably since I am actually using different local MAC
   addresses for multiple PPPoE sessions to the same DSL-AC, as T-Com
   doesn't allow for multiple sessions from the same MAC address -
   so they would be allowed by the RFC to use the same session ID for
   more than one of those sessions, even though I doubt that they're
   doing that.)

2. In addition to session ID, source and destination address, IMO, the
   incoming interface should be added to the key that's used for
   matching packets to sockets. This is probably somewhat of a design
   decision much like the weak ES vs. strong ES issue with IP, but
   AFAICS, the userspace-API works by binding to an interface index, not
   by binding to a MAC address (unlike IP socket binding), which is why
   I'd expect from pppd's perspective that only packets from that
   interface are delivered.

   Plus, from the end user's perspective, this might even be somewhat of
   a security problem. I for one would (well, I obviuosly _did_ :-)
   expect that if I do specify an interface name in pppd's config,
   pppd won't get into contact with anything that's going on on
   interfaces other than the one specified. As the involved MAC addresses
   probably usually aren't considered a secret, all you'd have to guess
   in order to interfere with/inject packets into PPPoE sessions on a
   different interface is the 16-bit session identifier, which might not
   even be random.

Well, your opinions are welcome. Plus any hints as to how to fix this.
I'd tend to simply(?) add some more fields to the
{hash,get,set,delete}_item() functions in drivers/net/pppoe.c.
But maybe there is some better way?

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Weird problem with PPPoE on tap interface

2007-02-25 Thread Florian Zumbiehl
Hi,

I'm experiencing a pretty strange problem with kernel PPPoE on tap
interfaces with a vanilla 2.6.20 kernel that prevents the PPP connection
from being established:

Every PPPoE session packet (that is, LCP, since it never gets to a stage
where any other session data is being exchanged) is delivered to pppd
twice. This can be seen from pppd's logging messages when debugging
is enabled, and strace confirmed that it indeed does read() the exact
same data twice in a row from the same file descriptor - even though
a tcpdump on the corresponding tap interface does show each packet only
once.

For confirming this, I used a program with a select() loop that simply
moves packets unchanged and without reordering back and forth between
a real ethernet interface in promiscuous mode and the tap interface
used by pppd.

What makes this even stranger, is, that the setup works perfectly (and
only a single copy of packets is delivered to pppd) if I simply replace
the tap interface in pppd's config with the real ethernet interface
that the tap interface was previously bridged to (it's an ISA NE2K
clone, BTW).

Finally, it also works perfectly when I use userspace rp-pppoe through
the tap interface.

So far, I also confirmed that in kernel space, the call to ppp_input()
in drivers/net/pppoe.c is executed as many times as pppd receives a
packet, so the problem must be somewhere before that.

Well, I'm gonna try to find out more - but if someone with some
more knowledge of the involved kernel code would be willing to help
with this in some way or another, that would certainly be
appreciated ;-) - if you do need any additional information, let me
know ...

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP not retransmitting supposedly lost segment

2006-04-23 Thread Florian Zumbiehl
Hi,

 This tells us that the kernel was indeed retransmitting but the
 packet didn't make it out of the interface according to your
 tcpdump.

Well, at least the tcp part of it, it seems, yep. The debugging I've
done so far also seems to confirm that the problem is somewhere beyond
the tcp code, but I don't have any real clue yet ...

 Do you have any firewall rules? Any tc rules?

ip6tables rules, yes, but none in the filter/OUTPUT chain (and policy
ACCEPT, of course) and only counting rules in mangle/OUTPUT (using
only user-defined chains and target RETURN, with policy ACCEPT, once
again).

So i'd wonder if that'd be the cause of the problem.

BTW, other TCP/IPv6 (IRC) connections active at the same time work
without problems - but each of them fails in that way now and
then ...

Well, all in all that probably doesn't help you much - but any
hints as to how/where to further debug are welcome :-)

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP not retransmitting supposedly lost segment

2006-04-09 Thread Florian Zumbiehl
Hi,

meanwhile I gathered some information from /proc/net/tcp6 on this
problem - in case anyone is interested :-)

| 1144560959:   8: :1223 
:1A0B 01 : 02:00097F82  
 10000 3188619 8 c1409260 89 10 20 2 2
| 1144560969:   8: :1223 
:1A0B 01 0016: 01:0074 0004 
 10000 3188619 9 c1409260 1424 10 18 1 2
| 1144560980:   8: :1223 
:1A0B 01 0016: 01:00FA 0005 
 10000 3188619 9 c1409260 2848 10 18 1 2
| 1144560990:   8: :1223 
:1A0B 01 0016: 01:05F2 0006 
 10000 3188619 9 c1409260 5696 10 18 1 2
| 1144561000:   8: :1223 
:1A0B 01 0016: 01:0205 0006 
 10000 3188619 9 c1409260 5696 10 18 1 2
| 1144561010:   8: :1223 
:1A0B 01 0016: 01:0FE4 0007 
 10000 3188619 9 c1409260 11392 10 18 1 2
| 1144561020:   8: :1223 
:1A0B 01 0016: 01:0BF6 0007 
 10000 3188619 9 c1409260 11392 10 18 1 2
| 1144561030:   8: :1223 
:1A0B 01 0016: 01:0809 0007 
 10000 3188619 9 c1409260 11392 10 20 1 2
| 1144561040:   8: :1223 
:1A0B 01 0016: 01:041B 0007 
 10000 3188619 9 c1409260 11392 10 20 1 2
| 1144561050:   8: :1223 
:1A0B 01 0016: 01:002E 0007 
 10000 3188619 9 c1409260 11392 10 20 1 2
| 1144561060:   8: :1223 
:1A0B 01 0016: 01:1FDA 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561070:   8: :1223 
:1A0B 01 0016: 01:1BE7 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561080:   8: :1223 
:1A0B 01 0016: 01:17F9 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561090:   8: :1223 
:1A0B 01 0016: 01:140B 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561100:   8: :1223 
:1A0B 01 0016: 01:101E 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561110:   8: :1223 
:1A0B 01 0016: 01:0C30 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561120:   8: :1223 
:1A0B 01 0016: 01:0842 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561130:   8: :1223 
:1A0B 01 0016: 01:0455 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561140:   8: :1223 
:1A0B 01 0016: 01:0067 0008 
 10000 3188619 9 c1409260 22784 10 20 1 2
| 1144561150:   8: :1223 
:1A0B 04 0017: 01:2B5A 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561161:   8: :1223 
:1A0B 04 0017: 01:276C 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561171:   8: :1223 
:1A0B 04 0017: 01:237E 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561181:   8: :1223 
:1A0B 04 0017: 01:1F91 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561191:   8: :1223 
:1A0B 04 0017: 01:1BA3 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561201:   8: :1223 
:1A0B 04 0017: 01:17B5 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561211:   8: :1223 
:1A0B 04 0017: 01:13C2 0001 
00 0 9 c1409260 3 10 18 1 2
| 1144561221:   8: 

TCP not retransmitting supposedly lost segment

2006-04-06 Thread Florian Zumbiehl
Hi,

I've been regularly losing my IRC connections going over my PPP/PPPoE/ADSL
IPv6 connection with static addresses for quite some time now. I haven't
found any obvious correlation with PPP reconnects or anything else one
would expect to cause this - now, I've had a look at it using tcpdump
and it looks to me pretty much like it's linux's TCP implementation that
fails to retransmit lost segments on some occasions.

The two occasions I captured so far look pretty similar, so here is
one of them:

| 03:34:56.061190 2002::1.6667  2001::1.2214: P 42984:43037(53) ack 17052 win 
5712 nop,nop,timestamp 2372706322 98778470
| 03:34:56.061470 2001::1.2214  2002::1.6667: . ack 43037 win 6888 
nop,nop,timestamp 98778484 2372706322
| 03:35:57.004547 2001::1.2214  2002::1.6667: P 17052:17073(21) ack 43037 win 
6888 nop,nop,timestamp 98793722 2372706322
| 03:38:58.062458 2001::1.2214  2002::1.6667: F 17073:17073(0) ack 43037 win 
6888 nop,nop,timestamp 98838992 2372706322
| 03:38:58.118258 2002::1.6667  2001::1.2214: . ack 17052 win 5712 
nop,nop,timestamp 2372730527 98778484,nop,nop,sack sack 1 {17073:17074} 
| 03:38:58.657355 2002::1.6667  2001::1.2214: P 43037:43059(22) ack 17052 win 
5712 nop,nop,timestamp 2372730581 98778484,nop,nop,sack sack 1 {17073:17074} 
| 03:38:58.657759 2001::1.2214  2002::1.6667: R 130103887:130103887(0) win 0

(The addresses are not the original ones, obviously.)

Note in particular the third and the fourth lines (which is when irssi
decides to close the connection because of a ping timeout):
No retransmission takes place for (more than) three minutes even though
the segment sent in the third line is not acknowledged.

Tcpdump was running on 2001::1 and it reported no dropped packets, so
I suppose that no packets are missing from this dump (unless there is
a bug in there(, too), of course ...).

2001::1 is running a linux kernel version 2.6.16, 2002::1 is an IRC
server i've got no control over - but I suppose that it's not at fault
here anyway, so that probably doesn't matter ...

I've previously been observing similar behaviour with 2.4 kernels
already (that is, IRC connections on that internet connection getting
lost, but I've never had a close look at it then), so I suppose it's
not just a faulty compilation or something like that ...

Well - in case there is anything I can provide you with that would
help you with fixing this, let me know (nope, no clue yet how to
reproduce this except for waiting for my IRC connections to time
out ;-) - simply dropping some packets in ip6tables's filter/OUTPUT
chain did not do the trick ...).

Florian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html