Re: [iproute PATCH 0/2] Netns performance improvements

2016-07-08 Thread Brian Haley

On 07/07/2016 01:28 PM, Rick Jones wrote:

On 07/07/2016 09:34 AM, Eric W. Biederman wrote:

Rick Jones  writes:

300 routers is far from the upper limit/goal.  Back in HP Public
Cloud, we were running as many as 700 routers per network node (*),
and more than four network nodes. (back then it was just the one
namespace per router and network). Mileage will of course vary based
on the "oomph" of one's network node(s).


To clarify processes for these routers and dhcp servers are created
with "ip netns exec"?


I believe so, but it would be good to have someone else confirm that, and speak
to your paragraph below.


Yes, the namespace is created and configured, then in the case of dhcp an 'ip 
netns exec $namespace dnsmasq ...' is run.  Routers typically have a small 
daemon running "inside" as well.



If that is the case and you are using this feature as effectively a
lightweight container and not lots of vrfs in a single network stack
then I suspect much larger gains can be had by creating a variant
of ip netns exec that avoids the mount propagation.


So you're thinking a new command like 'ip netns daemon $namespace ...' ?  Or if 
there's a better way with other tools today to accomplish this I'd be 
interested, as waiting for a new iproute2 to ripple through the distros could 
take a while.
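
As a rough illustration of what such a variant could look like with
today's system calls, here is a minimal sketch.  It assumes the namespace
was created with "ip netns add" (so a handle exists under /var/run/netns)
and that the daemon does not need the private /sys mount or the
/etc/netns/<name> overrides that "ip netns exec" normally sets up.  The
function names netns_path() and netns_exec_light() are made up for this
example and are not part of iproute2.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Directory where iproute2 keeps handles for named namespaces. */
#define NETNS_RUN_DIR "/var/run/netns"

/*
 * Build the path of a named namespace handle.
 * Returns 0 on success, -1 if the name is empty, contains a '/',
 * or the result does not fit in buf.
 */
int netns_path(char *buf, size_t len, const char *name)
{
	int n;

	if (name[0] == '\0' || strchr(name, '/'))
		return -1;
	n = snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical lightweight "exec in namespace": join the network
 * namespace and exec the command, without unsharing the mount
 * namespace or remounting anything.  Needs CAP_SYS_ADMIN.
 * Only returns (with -1) if something failed.
 */
int netns_exec_light(const char *name, char *const argv[])
{
	char path[4096];
	int fd;

	if (netns_path(path, sizeof(path), name) < 0)
		return -1;
	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;
	if (setns(fd, CLONE_NEWNET) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	execvp(argv[0], argv);
	return -1;
}
```

The trade-off is that /sys still reflects the original namespace, so this
only suits daemons that stick to sockets and netlink.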


-Brian


Re: [PATCH nf-next 0/9] netfilter: remove per-netns conntrack tables, part 1

2016-05-05 Thread Brian Haley

On 05/05/2016 06:36 PM, Florian Westphal wrote:

Brian Haley <brian.ha...@hpe.com> wrote:

I've seen cases where certain users are attacked, where the CT table is
filled such that we start seeing "nf_conntrack: table full, dropping packet"
messages (as expected).  But other users continue to function normally,
unaffected.  Is this still the case - each netns has some limit it can't
exceed?


The limit is global, the accounting per namespace.


So this is a change from existing.


No, see __nf_conntrack_alloc():

	if (nf_conntrack_max &&
	    unlikely(atomic_read(&net->ct.count) > nf_conntrack_max)) {
...

ct.count is whatever number of entries the namespace has allocated,
so max number of possible conntracks is always infinite if number
of net namespaces is unlimited (barring memory constraints, of course).


Ah yes, nf_conntrack_max is a global, thanks for setting me straight.  So I 
guess the tuning might just include increasing the bucket count in order to try 
and keep the number of items in each one small since there will be more entries 
in this single table now.
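
To make the accounting concrete, here is a small userspace model of the
check quoted above, plus the arithmetic behind the bucket tuning.  It is
only a sketch: struct netns_model, ct_alloc() and avg_chain_len() are
invented names, the limit of 4 is arbitrary, and the kernel's early-drop
path is left out.

```c
#include <stdbool.h>

/* Simplified stand-in for a namespace; not the kernel structure. */
struct netns_model {
	unsigned int ct_count;	/* entries accounted to this namespace */
};

/* Plays the role of the global nf_conntrack_max (arbitrary value). */
static unsigned int conntrack_max = 4;

/*
 * Model of the quoted check: the count is per namespace, the limit
 * it is compared against is global.  Returns false when the entry
 * would be refused ("table full, dropping packet").
 */
bool ct_alloc(struct netns_model *net)
{
	net->ct_count++;
	if (conntrack_max && net->ct_count > conntrack_max) {
		net->ct_count--;
		return false;
	}
	return true;
}

/* Average chain length (rounded up) for a single shared table. */
unsigned int avg_chain_len(unsigned int entries, unsigned int buckets)
{
	return (entries + buckets - 1) / buckets;
}
```

With this model each namespace can hold up to conntrack_max entries on
its own, so N namespaces can put N * conntrack_max entries into the one
table, which is why the bucket count matters more than before.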


Thanks,

-Brian


Re: [PATCH nf-next 0/9] netfilter: remove per-netns conntrack tables, part 1

2016-05-05 Thread Brian Haley

On 05/05/2016 04:54 PM, Florian Westphal wrote:

Brian Haley <brian.ha...@hpe.com> wrote:

Openstack networking creates virtual routers using namespaces for isolation
between users.  VETH pairs are used to connect the interfaces on these
routers to different networks, whether they are internal (private) or
external (public).  In most cases NAT is done inside the namespace as
packets move between the networks.

I've seen cases where certain users are attacked, where the CT table is
filled such that we start seeing "nf_conntrack: table full, dropping packet"
messages (as expected).  But other users continue to function normally,
unaffected.  Is this still the case - each netns has some limit it can't
exceed?


The limit is global, the accounting per namespace.


So this is a change from existing.


If the bucket count (net.netfilter.nf_conntrack_buckets) is high enough
to accommodate the expected load and no one can create an arbitrary number
of net namespaces things are fine.


In my case we can't control the number of namespaces: each user gets one 
when a virtual router is created.  We could change how we size things, but 
that doesn't stop one user from consuming more than their 1/N share of 
entries.  Typically we just increase the number of systems hosting these 
"routers" when we hit a limit, which decreases the netns count per node.



I haven't changed the way this works yet because I did not have a better
idea so far.


Creating a per-netns maximum seems doable, but maybe not practical from the 
accounting side of things.  Can't think of anything else at the moment.


-Brian



Re: [PATCH nf-next 0/9] netfilter: remove per-netns conntrack tables, part 1

2016-05-05 Thread Brian Haley

On 04/28/2016 01:13 PM, Florian Westphal wrote:

[ CCing netdev so netns folks can have a look too ]

This patch series removes the per-netns connection tracking tables.
All conntrack objects are then stored in one global table.

This avoids the infamous 'vmalloc' when lots of namespaces are used:
We no longer allocate a new conntrack table for each namespace (with 64k
size this saves 512kb of memory per netns).

- net namespace address is made part of conntrack hash, to spread
   conntracks over entire table even if netns has overlapping ip addresses.
- lookup and iterators use net_eq() to skip conntracks living in a different
   namespace.
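
The two points above can be sketched in userspace as follows.  This is a
toy model, not the patch itself: the 8-bucket table, struct ct_entry and
the multiplicative mixer in ct_hash() are all assumptions made for the
example (the kernel uses its own hash), and a plain pointer comparison
stands in for net_eq().

```c
#include <stddef.h>
#include <stdint.h>

#define CT_BUCKETS 8	/* tiny table, for illustration only */

struct ct_entry {
	const void *net;	/* owning namespace */
	uint32_t saddr, daddr;	/* simplified tuple */
	struct ct_entry *next;
};

static struct ct_entry *ct_table[CT_BUCKETS];

/* Toy hash: the namespace address is mixed in with the tuple. */
static uint32_t ct_hash(const void *net, uint32_t saddr, uint32_t daddr)
{
	uint64_t h = (uint64_t)(uintptr_t)net;

	h = (h ^ saddr) * 0x9E3779B97F4A7C15ULL;
	h = (h ^ daddr) * 0x9E3779B97F4A7C15ULL;
	return (uint32_t)(h >> 32) % CT_BUCKETS;
}

/* Add an entry to the single shared table. */
void ct_insert(struct ct_entry *e)
{
	uint32_t b = ct_hash(e->net, e->saddr, e->daddr);

	e->next = ct_table[b];
	ct_table[b] = e;
}

/* Look up a tuple, skipping entries that live in another namespace. */
struct ct_entry *ct_lookup(const void *net, uint32_t saddr, uint32_t daddr)
{
	struct ct_entry *e = ct_table[ct_hash(net, saddr, daddr)];

	for (; e; e = e->next)
		if (e->net == net && e->saddr == saddr && e->daddr == daddr)
			return e;
	return NULL;
}
```

Two namespaces using the same overlapping addresses get separate entries,
and a lookup from one never returns the other's.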


Hi Florian,

Question on this series.

Openstack networking creates virtual routers using namespaces for isolation 
between users.  VETH pairs are used to connect the interfaces on these routers 
to different networks, whether they are internal (private) or external (public). 
 In most cases NAT is done inside the namespace as packets move between the 
networks.


I've seen cases where certain users are attacked, where the CT table is filled 
such that we start seeing "nf_conntrack: table full, dropping packet" messages 
(as expected).  But other users continue to function normally, unaffected.  Is 
this still the case - each netns has some limit it can't exceed?  I didn't see 
it, but your comment in 9/9 seemed like something was there -  "we would start 
to 'over-subscribe' the affected/overlimit netns".


Thanks,

-Brian


Broadcom 5708 taking a few seconds to get link-up

2008-02-08 Thread Brian Haley

Hi Michael,

I'm working on a system that has two on-board 5708's.  We've noticed 
that it takes about 3 seconds for the link to come up - is this 
considered normal?  I've tried this with the latest davem tree with 
similar results to older kernels/drivers.


# uname -r
2.6.24

# ethtool -i eth3
driver: bnx2
version: 1.7.3
firmware-version: 1.9.3
bus-info: :42:00.0

# lspci -v
42:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 
Gigabit Ethernet (rev 11)

Subsystem: Hewlett-Packard Company Unknown device 7038
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 34
Memory at fa00 (64-bit, non-prefetchable) [size=32M]
Capabilities: [40] PCI-X non-bridge device
Capabilities: [48] Power Management version 2
Capabilities: [50] Vital Product Data
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ 
Queue=0/0 Enable-


# ip link set eth3 up; mii-tool eth3; sleep 1; mii-tool eth3; sleep 1; 
mii-tool eth3; sleep 1; mii-tool eth3

eth3: no link
eth3: no link
eth3: no link
eth3: negotiated 100baseTx-FD, link ok

Other drivers I've tried, e1000 and tg3, get link up in 1 second.  I'm 
asking because any packet I try to transmit out this interface before 
link-up never gets out.
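
As a stop-gap on the application side, one option is to wait for carrier
before sending.  The sketch below polls a sysfs-style carrier file; it
assumes the kernel exposes /sys/class/net/<ifname>/carrier for the
interface, and the helper names read_carrier() and wait_for_link() are
invented for this example.

```c
#include <stdio.h>
#include <time.h>

/*
 * Read a carrier file such as /sys/class/net/eth3/carrier.
 * Returns 1 for link up, 0 for link down, -1 if it cannot be read
 * or parsed.
 */
int read_carrier(const char *path)
{
	FILE *f = fopen(path, "r");
	int up = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &up) != 1)
		up = -1;
	fclose(f);
	return up;
}

/*
 * Poll every 100 ms until the link is up or timeout_ms has passed.
 * Returns 1 once the link is up, 0 on timeout.
 */
int wait_for_link(const char *path, int timeout_ms)
{
	const struct timespec step = { 0, 100 * 1000 * 1000 };
	int waited;

	for (waited = 0; waited <= timeout_ms; waited += 100) {
		if (read_carrier(path) == 1)
			return 1;
		nanosleep(&step, NULL);
	}
	return 0;
}
```

A caller would run something like
wait_for_link("/sys/class/net/eth3/carrier", 5000) after bringing the
interface up and only then start transmitting.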


Thanks for any info,

-Brian
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 support for NFS server

2008-01-17 Thread Brian Haley

Aurélien Charbon wrote:

Thanks for your comments.
Here is the patch with some cleanups.


Hi Aurelien,

Just two nits.


--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -400,6 +400,15 @@ static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
 		 a->s6_addr32[2] == htonl(0x0000ffff));
 }
 
+static inline void ipv6_addr_set_v4mapped(const __be32 addr,
+					  struct in6_addr *v4mapped)
+{
+	ipv6_addr_set(v4mapped,
+			0, 0,
+			htonl(0x0000ffff),
+			addr);
+}


I think Bruce wanted you to put as much on one line here as possible.


@@ -641,9 +668,24 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 int
 svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
-	struct sockaddr_in *sin = svc_addr_in(rqstp);
+	struct sockaddr_in *sin;
+	struct sockaddr_in6 *sin6, sin6_storage;
 	struct ip_map *ipm;
 
+	switch (rqstp->rq_addr.ss_family) {
+	case AF_INET:
+		sin = svc_addr_in(rqstp);
+		sin6 = &sin6_storage;
+		ipv6_addr_set(&sin6->sin6_addr, 0, 0,
+				htonl(0x0000ffff), sin->sin_addr.s_addr);
+		break;


ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &sin6->sin6_addr);

-Brian


Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2008-01-07 Thread Brian Haley

David Stevens wrote:

Yeah, that's what I get for typing in off-the-cuff code. What
I was thinking was the fl.oif assignment instead was:
if (!sk->sk_bound_dev_if &&
    (addr_type & IPV6_ADDR_MULTICAST))
	sk->sk_bound_dev_if = np->mcast_oif;

Which it is not, but maybe it could be, since this is a connect().


How about the simple patch below?  I just removed the EINVAL check from 
my original patch, but it accomplishes the same thing.



That patch looks better, but I'm wondering if we could just remove the
requirement that sin6_scope_id be set here if it's multicast, since it
is doing the following later in the code:

if (!fl.oif && (addr_type&IPV6_ADDR_MULTICAST))
	fl.oif = np->mcast_oif;

So, really, all we need to do is get through the LINKLOCAL section
without error in the multicast case and we can remove the redundant
multicast check there. I think that'd be simpler.


I don't think we can remove that check since it covers the non-multicast 
case.


-Brian

Signed-off-by: Brian Haley [EMAIL PROTECTED]
---
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2ed689a..5d4245a 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -123,11 +123,11 @@ ipv4_connected:
 				goto out;
 			}
 			sk->sk_bound_dev_if = usin->sin6_scope_id;
-			if (!sk->sk_bound_dev_if &&
-			    (addr_type & IPV6_ADDR_MULTICAST))
-				fl.oif = np->mcast_oif;
 		}
 
+		if (!sk->sk_bound_dev_if && (addr_type & IPV6_ADDR_MULTICAST))
+			sk->sk_bound_dev_if = np->mcast_oif;
+
 		/* Connect to link-local address requires an interface */
 		if (!sk->sk_bound_dev_if) {
 			err = -EINVAL;


Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2007-12-19 Thread Brian Haley

Hi David,

David Stevens wrote:

OK, I see what you're trying to fix now.

I think the scope_id checks are not quite right-- they
should be something like this:

if (addr_type&IPV6_ADDR_LINKLOCAL) {
	if (addr_len >= sizeof(struct sockaddr_in6)) {
		if (sk->sk_bound_dev_if && usin->sin6_scope_id &&
		    sk->sk_bound_dev_if != usin->sin6_scope_id) {
			err = -EINVAL;
			goto out;
		}
		if (usin->sin6_scope_id)
			sk->sk_bound_dev_if = usin->sin6_scope_id;
		if (!sk->sk_bound_dev_if &&
		    (addr_type & IPV6_ADDR_MULTICAST))
			fl.oif = np->mcast_oif;


This assignment will not get us past the next check...


	/* connect to the link-local address requires an interface */
	if (!sk->sk_bound_dev_if) {
		err = -EINVAL;
		goto out;
	}


... and even if it did, fl.oif is over-written by sk_bound_dev_if just a 
few lines down.



If I did an SO_BINDTODEVICE and specified sin6_scope_id,
then they better agree.
If I specified sin6_scope_id without SO_BINDTODEVICE, set
the device to that.
If I get this far without a device and it's multicast, use 
mcast_oif

If I get all through that and don't have a device, EINVAL.


You also need to check if mcast_oif matches sk_bound_dev_if here - it's 
actually done in the setsockopt() code already when we set it, so 
duplicating it here isn't that big a deal.


How about the following patch?  It does not set sk_bound_dev_if to 
mcast_oif, but does allow the connect() to succeed.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
---
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2ed689a..3226970 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -114,6 +114,8 @@ ipv4_connected:
 		goto out;
 	}
 
+	fl.oif = sk->sk_bound_dev_if;
+
 	if (addr_type&IPV6_ADDR_LINKLOCAL) {
 		if (addr_len >= sizeof(struct sockaddr_in6) &&
 		    usin->sin6_scope_id) {
@@ -122,14 +124,14 @@ ipv4_connected:
 				err = -EINVAL;
 				goto out;
 			}
-			sk->sk_bound_dev_if = usin->sin6_scope_id;
-			if (!sk->sk_bound_dev_if &&
-			    (addr_type & IPV6_ADDR_MULTICAST))
-				fl.oif = np->mcast_oif;
+			fl.oif = sk->sk_bound_dev_if = usin->sin6_scope_id;
 		}
 
+		if (!fl.oif && (addr_type & IPV6_ADDR_MULTICAST))
+			fl.oif = np->mcast_oif;
+
 		/* Connect to link-local address requires an interface */
-		if (!sk->sk_bound_dev_if) {
+		if (!fl.oif) {
 			err = -EINVAL;
 			goto out;
 		}
@@ -148,7 +150,6 @@ ipv4_connected:
 	fl.proto = sk->sk_protocol;
 	ipv6_addr_copy(&fl.fl6_dst, &np->daddr);
 	ipv6_addr_copy(&fl.fl6_src, &np->saddr);
-	fl.oif = sk->sk_bound_dev_if;
 	fl.fl_ip_dport = inet->dport;
 	fl.fl_ip_sport = inet->sport;
 


Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2007-12-19 Thread Brian Haley

David Stevens wrote:

Vlad Yasevich [EMAIL PROTECTED] wrote on 12/19/2007 07:20:53 AM:


But this still requires either a SO_BINDTODEVICE or sin6_scope_id.  This
means an application can call BINDTODEVICE(eth0), MULTICAST_IF(eth1),
issue a connect on a UDP socket and succeed?  Seems wrong to me.

Can you check section 6.7 of RFC 3542.


No, it requires one of SO_BINDTODEVICE, sin6_scope_id, or IPV6_MULTICAST_IF.
If you do an SO_BINDTODEVICE(eth0) and then an IPV6_MULTICAST_IF(eth1), the
IPV6_MULTICAST_IF will fail in setsockopt (EINVAL), because it requires a
match for bound sockets. I'm not sure if SO_BINDTODEVICE resets mcast_oif if
you do them in the reverse order, but that would be a bug in SO_BINDTODEVICE.


It doesn't, that was one way I tested my first patch by forcing a mis-match.


The precedence order as implemented already is:

SO_BINDTODEVICE is highest and always wins
sin6_scope_id next
IPV6_MULTICAST_IF

and the existing code has the rule that all link-local addresses require a
sin6_scope_id. The change (intended) is to relax the sin6_scope_id rule only
for link-local multicasts that have done either an SO_BINDTODEVICE or
IPV6_MULTICAST_IF already.


Yes, that was the intention of my patch.
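
For reference, the precedence and the intended relaxation can be written
down as a small standalone model.  This is only a sketch of the rules as
discussed in this thread for a link-local destination, not kernel code;
resolve_oif() and its parameters are names made up here, with 0 meaning
"not set" for each interface index.

```c
#include <errno.h>

/*
 * Pick the outgoing interface for connect() to a link-local address.
 *   bound_dev_if  - from SO_BINDTODEVICE
 *   scope_id      - sin6_scope_id passed to connect()
 *   mcast_oif     - from IPV6_MULTICAST_IF
 *   is_multicast  - non-zero if the destination is multicast
 * Returns the interface index, or -EINVAL if none can be chosen.
 */
int resolve_oif(int bound_dev_if, int scope_id, int mcast_oif,
		int is_multicast)
{
	/* If both are given, the bound device and scope id must agree. */
	if (bound_dev_if && scope_id && bound_dev_if != scope_id)
		return -EINVAL;
	if (bound_dev_if)
		return bound_dev_if;
	if (scope_id)
		return scope_id;
	/* Last resort, multicast only: fall back to IPV6_MULTICAST_IF. */
	if (is_multicast && mcast_oif)
		return mcast_oif;
	/* Connect to a link-local address requires an interface. */
	return -EINVAL;
}
```

So resolve_oif(0, 0, 4, 1) picks interface 4, while the same call for a
unicast link-local destination still fails with -EINVAL.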

-Brian


Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2007-12-19 Thread Brian Haley

David Stevens wrote:

Brian Haley [EMAIL PROTECTED] wrote on 12/19/2007 07:35:46 AM:
...

if (usin->sin6_scope_id)
	sk->sk_bound_dev_if = usin->sin6_scope_id;
if (!sk->sk_bound_dev_if &&
    (addr_type & IPV6_ADDR_MULTICAST))
	fl.oif = np->mcast_oif;

This assignment will not get us past the next check...


Yeah, that's what I get for typing in off-the-cuff code. What
I was thinking was the fl.oif assignment instead was:
if (!sk->sk_bound_dev_if &&
    (addr_type & IPV6_ADDR_MULTICAST))
	sk->sk_bound_dev_if = np->mcast_oif;

Which it is not, but maybe it could be, since this is a connect().


My original patch did this, but also checked for a possible mis-match 
with sk_bound_dev_if - it would actually wind-up setting it to the same 
value if it was already set correctly.



That patch looks better, but I'm wondering if we could just remove the
requirement that sin6_scope_id be set here if it's multicast, since it
is doing the following later in the code:

if (!fl.oif && (addr_type&IPV6_ADDR_MULTICAST))
	fl.oif = np->mcast_oif;


We would still have to check np->mcast_oif is set in the link-local case 
since we shouldn't be getting here with a zero.



So, really, all we need to do is get through the LINKLOCAL section
without error in the multicast case and we can remove the redundant
multicast check there. I think that'd be simpler.

I also note that sin6_scope_id appears not to be honored at all in
the non-linklocal case, which may be correct, but surprises me.

I want to look a little more at this; I know you have a customer
issue, so I'll make it quick.


Don't worry about that, they can wait, and I'm leaving for 10 days 
anyways...


-Brian


[PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2007-12-18 Thread Brian Haley

Trying to connect() to an IPv6 link-local multicast address by
specifying the outgoing multicast interface doesn't work, you have to
bind to a device first with an SO_BINDTODEVICE setsockopt() call.  This
patch allows the IPV6_MULTICAST_IF setting to also control which
interface should be used for the connection, as specified in RFC 3493.

Signed-off-by: Brian Haley [EMAIL PROTECTED]
---

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2ed689a..0b1e7eb 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -123,9 +123,15 @@ ipv4_connected:
 				goto out;
 			}
 			sk->sk_bound_dev_if = usin->sin6_scope_id;
-			if (!sk->sk_bound_dev_if &&
-			    (addr_type & IPV6_ADDR_MULTICAST))
-				fl.oif = np->mcast_oif;
+		}
+
+		if ((addr_type & IPV6_ADDR_MULTICAST) && np->mcast_oif) {
+			if (sk->sk_bound_dev_if &&
+			    sk->sk_bound_dev_if != np->mcast_oif) {
+				err = -EINVAL;
+				goto out;
+			}
+			sk->sk_bound_dev_if = np->mcast_oif;
 		}
 
 		/* Connect to link-local address requires an interface */



Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()

2007-12-18 Thread Brian Haley

David Stevens wrote:

Brian Haley [EMAIL PROTECTED] wrote on 12/18/2007 12:57:54 PM:


Trying to connect() to an IPv6 link-local multicast address by
specifying the outgoing multicast interface doesn't work, you have to
bind to a device first with an SO_BINDTODEVICE setsockopt() call.


Other OSes allow this operation, like FreeBSD, Tru64 UNIX and Solaris.


No, you simply have to specify sin6_scope_id for link-scope
addresses, like you do in unicast cases.


But isn't this why IPV6_MULTICAST_IF exists?  So you don't have to bind 
to an interface or use the scope id?  RFC 3493 does not mention having 
to set a scope id in order to send multicast packets:


   IPv6 applications may send multicast packets by simply specifying an
   IPv6 multicast address as the destination address, for example in the
   destination address argument of the sendto() function.


Your patch requires them
to match (if specified), but I don't think IPV6_MULTICAST_IF should
override or require a match for a valid sin6_scope_id (or be an error).


The patch won't override sk_bound_dev_if, or sin6_scope_id, it's a last 
resort for link-local multicast.  As far as matching, I think they 
should if you set both SO_BINDTODEVICE/sin6_scope_id and 
IPV6_MULTICAST_IF.  I can relax that check if you like.


The one thing my patch does do is set sk_bound_dev_if, which it never 
did - that seemed like the right thing to do since that's what the scope 
id path does, and makes sure we always continue to use this interface.



If I read it correctly, the existing code uses IPV6_MULTICAST_IF
if the sin6_scope_id is not set, otherwise honors the interface specified
in the connect. That seems like correct behaviour to me, and RFC 3493
doesn't address the relative precedence of the two that I see. This is
in the linklocal branch, and all unicast linklocals require specifying
sin6_scope_id. Multicast doesn't require a scope_id in the case where
you've done an IPV6_MULTICAST_IF, but it should still allow a different
scope_id when you have used IPV6_MULTICAST_IF.


The IPV6_ADDR_MULTICAST check is inside the sin6_scope_id if() 
statement, so will never get checked if the scope hasn't been specified, 
that's the bug.  Since that isn't required for multicast we always get 
an EINVAL here.
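
For anyone who wants to reproduce it, the application pattern in question
looks roughly like the sketch below: set IPV6_MULTICAST_IF, then connect()
a UDP socket to a link-local multicast group with sin6_scope_id left at
zero.  The helper names make_mcast_dest() and connect_mcast() are invented
for this example; the interface name, group and port are whatever the
caller passes in.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fill in a destination for the given group and port.
 * sin6_scope_id is deliberately left at 0.
 * Returns 0 on success, -1 if the group does not parse.
 */
int make_mcast_dest(struct sockaddr_in6 *dst, const char *group,
		    unsigned short port)
{
	memset(dst, 0, sizeof(*dst));
	dst->sin6_family = AF_INET6;
	dst->sin6_port = htons(port);
	return inet_pton(AF_INET6, group, &dst->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Select the outgoing interface with IPV6_MULTICAST_IF only, then
 * connect().  Returns the connected socket, or -1 on any failure.
 */
int connect_mcast(const char *ifname, const char *group,
		  unsigned short port)
{
	struct sockaddr_in6 dst;
	unsigned int ifindex = if_nametoindex(ifname);
	int fd;

	if (ifindex == 0 || make_mcast_dest(&dst, group, port) < 0)
		return -1;
	fd = socket(AF_INET6, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_IF,
		       &ifindex, sizeof(ifindex)) < 0 ||
	    connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With the behaviour described above, a call such as
connect_mcast("eth0", "ff02::1", 5000) is the one that comes back with
EINVAL from connect().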



Do you have application code that you believe is correct that
doesn't work?


Yes, a customer does.

-Brian


Re: [PATCH] IPv6 support for NFS server

2007-12-11 Thread Brian Haley

Hi Aurelien,

Aurélien Charbon wrote:


Here is a cleanup for the ip_map caching patch in nfs server.

It prepares for IPv6 text-based mounts and exports.

Tests: tested with only IPv4 network and basic nfs ops (mount, file 
creation and modification)


In an email back on October 29th I sent-out a similar patch with a new 
ipv6_addr_set_v4mapped() inline - it might be useful to pull that piece 
into your patch since it cleans it up a bit to get rid of the 
ipv6_addr_set() calls.  I can re-send you that patch off-line if you 
can't find it.
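
For readers without that patch to hand, a userspace analogue of the
helper shows what it does.  This is a sketch using the standard sockets
headers rather than the kernel inline; addr_set_v4mapped() is a name
chosen for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4
 * address that is already in network byte order.
 */
void addr_set_v4mapped(struct in_addr v4, struct in6_addr *v4mapped)
{
	memset(v4mapped, 0, sizeof(*v4mapped));	/* bytes 0-9 are zero */
	v4mapped->s6_addr[10] = 0xff;
	v4mapped->s6_addr[11] = 0xff;
	memcpy(&v4mapped->s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
}
```

Having one helper means each caller no longer has to spell out the
0, 0, htonl(0x0000ffff) arguments to ipv6_addr_set() by hand.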


-Brian


Re: [PATCH 2/2] NFS: handle IPv6 addresses in nfs ctl

2007-10-30 Thread Brian Haley

Aurélien Charbon wrote:
Here is a second missing part of the IPv6 support in NFS server code 
concerning the knfsd syscall interface.

It updates write_getfs and write_getfd to accept IPv6 addresses.

Applies on a kernel including ip_map cache modifications


Both patches still have bugs, I think the patch I sent yesterday fixed 
them all, so I would recommend using that instead.  Of course Neil's 
comment possibly trumps all that anyways...


-Brian


Re: [PATCH 2/2] NFS: handle IPv6 addresses in nfs ctl

2007-10-29 Thread Brian Haley

Hi Aurelien,

This is a combination of your two patches into one, cleaned-up with all 
my complaints.  I also added an ipv6_addr_set_v4mapped() inline after 
someone else here convinced me it gets rid of a lot of cruft from the 
code.  The DCCP, etc. code can be cleaned-up later if this gets accepted.


I have only compile-tested this, hoping you can test the functionality.

-Brian


Add IPv6 support to NFS server ip_map caching code and knfsd syscall 
interface - write_getfs() and write_getfd() will now accept IPv6 addresses.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 66d0aeb..c47ba77 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -35,6 +35,7 @@
 #include <linux/lockd/bind.h>
 #include <linux/sunrpc/msg_prot.h>
 #include <linux/sunrpc/gss_api.h>
+#include <net/ipv6.h>
 
 #define NFSDDBG_FACILITY	NFSDDBG_EXPORT
 
@@ -1556,6 +1557,7 @@ exp_addclient(struct nfsctl_client *ncp)
 {
 	struct auth_domain	*dom;
 	int			i, err;
+	struct in6_addr		in6;
 
 	/* First, consistency check. */
 	err = -EINVAL;
@@ -1574,9 +1576,10 @@ exp_addclient(struct nfsctl_client *ncp)
 		goto out_unlock;
 
 	/* Insert client into hashtable. */
-	for (i = 0; i < ncp->cl_naddr; i++)
-		auth_unix_add_addr(ncp->cl_addrlist[i], dom);
-
+	for (i = 0; i < ncp->cl_naddr; i++) {
+		ipv6_addr_set_v4mapped(ncp->cl_addrlist[i].s_addr, &in6);
+		auth_unix_add_addr(&in6, dom);
+	}
 	auth_unix_forget_old(dom);
 	auth_domain_put(dom);
 
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 77dc989..5cb5f0d 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -37,6 +37,7 @@
 #include <linux/nfsd/syscall.h>
 
 #include <asm/uaccess.h>
+#include <net/ipv6.h>
 
 /*
  *	We have a single directory with 9 nodes in it.
@@ -219,24 +220,37 @@ static ssize_t write_getfs(struct file *file, char *buf, size_t size)
 {
 	struct nfsctl_fsparm *data;
 	struct sockaddr_in *sin;
+	struct sockaddr_in6 *sin6;
 	struct auth_domain *clp;
 	int err = 0;
 	struct knfsd_fh *res;
+	struct in6_addr in6;
 
 	if (size < sizeof(*data))
 		return -EINVAL;
 	data = (struct nfsctl_fsparm*)buf;
 	err = -EPROTONOSUPPORT;
-	if (data->gd_addr.sa_family != AF_INET)
+	switch (data->gd_addr.sa_family) {
+	case AF_INET:
+		sin = (struct sockaddr_in *)&data->gd_addr;
+		ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &in6);
+		break;
+	case AF_INET6:
+		sin6 = (struct sockaddr_in6 *)&data->gd_addr;
+		ipv6_addr_copy(&in6, &sin6->sin6_addr);
+		break;
+	default:
 		goto out;
-	sin = (struct sockaddr_in *)&data->gd_addr;
+	}
+
 	if (data->gd_maxlen > NFS3_FHSIZE)
 		data->gd_maxlen = NFS3_FHSIZE;
 
 	res = (struct knfsd_fh*)buf;
 
 	exp_readlock();
-	if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+	if (!(clp = auth_unix_lookup(&in6)))
 		err = -EPERM;
 	else {
 		err = exp_rootfh(clp, data->gd_path, res, data->gd_maxlen);
@@ -253,25 +267,41 @@ static ssize_t write_getfd(struct file *file, char *buf, size_t size)
 {
 	struct nfsctl_fdparm *data;
 	struct sockaddr_in *sin;
+	struct sockaddr_in6 *sin6;
 	struct auth_domain *clp;
 	int err = 0;
 	struct knfsd_fh fh;
 	char *res;
+	struct in6_addr in6;
 
 	if (size < sizeof(*data))
 		return -EINVAL;
 	data = (struct nfsctl_fdparm*)buf;
 	err = -EPROTONOSUPPORT;
-	if (data->gd_addr.sa_family != AF_INET)
+	if (data->gd_addr.sa_family != AF_INET &&
+	    data->gd_addr.sa_family != AF_INET6)
 		goto out;
 	err = -EINVAL;
 	if (data->gd_version < 2 || data->gd_version > NFSSVC_MAXVERS)
 		goto out;
 
 	res = buf;
-	sin = (struct sockaddr_in *)&data->gd_addr;
 	exp_readlock();
-	if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+	switch (data->gd_addr.sa_family) {
+	case AF_INET:
+		sin = (struct sockaddr_in *)&data->gd_addr;
+		ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &in6);
+		break;
+	case AF_INET6:
+		sin6 = (struct sockaddr_in6 *)&data->gd_addr;
+		ipv6_addr_copy(&in6, &sin6->sin6_addr);
+		break;
+	default:
+		goto out;
+	}
+
+	if (!(clp = auth_unix_lookup(&in6)))
 		err = -EPERM;
 	else {
 		err = exp_rootfh(clp, data->gd_path, &fh, NFS_FHSIZE);
diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index 22e1ef8..64ecb93 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -15,6 +15,7 @@
 #include <linux/sunrpc/msg_prot.h>
 #include <linux/sunrpc/cache.h>
 #include <linux/hash.h>
+#include <net/ipv6.h>
 
 #define SVC_CRED_NGROUPS	32
 struct svc_cred {
@@ -120,10 +121,10 @@ extern void	svc_auth_unregister(rpc_authflavor_t flavor);
 
 extern struct auth_domain *unix_domain_find(char *name);
 extern void auth_domain_put(struct auth_domain *item);
-extern int auth_unix_add_addr(struct in_addr addr, struct auth_domain *dom);
+extern int auth_unix_add_addr(struct in6_addr *addr, struct auth_domain *dom);
 extern struct auth_domain *auth_domain_lookup(char *name, struct auth_domain *new);
 extern struct auth_domain *auth_domain_find(char *name);
-extern struct auth_domain *auth_unix_lookup(struct in_addr addr);
+extern struct auth_domain *auth_unix_lookup(struct in6_addr *addr);
 extern int

Re: Configuring the same IP on multiple addresses

2007-10-29 Thread Brian Haley

David Miller wrote:

From: David Miller [EMAIL PROTECTED]
Date: Mon, 29 Oct 2007 15:25:59 -0700 (PDT)


Can you guys please just state upfront what virtualization
issue is made more difficult by features you want to remove?


Sorry, I mentioned virtualization because that's been the
largest majority of the cases being presented lately.

I suspect in your case it's some multicast or SCTP thing :-)


It's actually neither in this case :)

We have customers migrating from BSD stacks to Linux.  They notice all 
the differences in the sockets API, sometimes even find bugs, and we fix 
them and send patches upstream.  They also do stupid things like 
duplicate address configurations on two interfaces in different subnets.


IPv6 was the curious one for us here since it falls into an RFC gray 
area - addresses are assigned to interfaces, not hosts (RFC 4291), but 
they should be tested for uniqueness before being assigned (RFC 4862). 
This address didn't pass the uniqueness test, although it did pass DAD 
because the links were different.  We couldn't find another OS for a 
host or router (including IOS) that allows this, hence the question.


Thanks, and sorry if it's just another waste of your time to explain it.

-Brian


Re: [PATCH 1/2] NFS: change the ip_map cache code to handle IPv6 addresses

2007-10-24 Thread Brian Haley

Hi Aurelien,

I think you're almost there, at least with my comments :)


linux-2.6.23-ipmap/include/net/ipv6.h
--- linux-2.6.23-haley/include/net/ipv6.h	2007-10-22 09:42:58.0 +0200
+++ linux-2.6.23-ipmap/include/net/ipv6.h	2007-10-22 10:10:59.0 +0200
@@ -21,6 +21,7 @@
 #include <net/ndisc.h>
 #include <net/flow.h>
 #include <net/snmp.h>
+#include <linux/in.h>

 #define SIN6_LEN_RFC2133	24

@@ -167,6 +168,12 @@ DECLARE_SNMP_STAT(struct udp_mib, udplit
 	if (is_udplite) SNMP_INC_STATS_USER(udplite_stats_in6, field); \
 	else SNMP_INC_STATS_USER(udp_stats_in6, field);} while(0)

+#define IS_ADDR_MAPPED(a) \
+	(((uint32_t *) (a))[0] == 0 \
+	&& ((uint32_t *) (a))[1] == 0 \
+	&& (((uint32_t *) (a))[2] == 0 \
+	|| ((uint32_t *) (a))[2] == htonl(0x0000ffff)))
+
 struct ip6_ra_chain
 {
 	struct ip6_ra_chain *next;
@@ -380,7 +387,7 @@ static inline int ipv6_addr_any(const st
 static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
 {
 	return ((a->s6_addr32[0] | a->s6_addr32[1]) == 0 &&
-		 a->s6_addr32[2] == htonl(0x0000ffff));
+		a->s6_addr32[2] == htonl(0x0000ffff));
 }


You don't need to touch ipv6.h at all, IS_ADDR_MAPPED is unused and the 
other is removing a space.



static void ip_map_init(struct cache_head *cnew, struct cache_head *citem)
{
@@ -125,7 +133,7 @@ static void ip_map_init(struct cache_hea
 	struct ip_map *item = container_of(citem, struct ip_map, h);

 	strcpy(new->m_class, item->m_class);
-	new->m_addr.s_addr = item->m_addr.s_addr;
+	ipv6_addr_copy(&(new->m_addr), &(item->m_addr));


Extra () here.


@@ -651,7 +694,7 @@ svcauth_unix_set_client(struct svc_rqst
 	ipm = ip_map_cached_get(rqstp);
 	if (ipm == NULL)
 		ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
-				    sin->sin_addr);
+				    &(sin6->sin6_addr));


Extra () here.

-Brian


Re: [PATCH 2/2] NFS: handle IPv6 addresses in nfs ctl

2007-10-24 Thread Brian Haley

Hi Aurelien,

Again, a few more comments.

I might just modify these in my own tree and send out a patch that 
combines both into one, it might be less work.



@@ -229,9 +229,20 @@ static ssize_t write_getfs(struct file *
return -EINVAL;
data = (struct nfsctl_fsparm*)buf;
err = -EPROTONOSUPPORT;
-   if (data->gd_addr.sa_family != AF_INET)
+   switch (data->gd_addr.sa_family) {
+   case AF_INET6:
+   sin6 = sin6_storage;


This should be:

in6 = sin6_storage;


+   sin6 = (struct sockaddr_in6 *)data-gd_addr;
+		ipv6_addr_copy(in6, (sin6-sin6_addr)); 


Extra () here.

-	if (!(clp = auth_unix_lookup(in6))) 
+	switch (data-gd_addr.sa_family) {

+   case AF_INET:
+   /* IPv6 address mapping */
+   ipv6_addr_set(in6, 0, 0, htonl(0x), ((struct sockaddr_in 
*)data-gd_addr)-sin_addr.s_addr);
+   break;
+   case AF_INET6:
+   sin6 = sin6_storage;


This should be:

in6 = sin6_storage;


+   sin6 = (struct sockaddr_in6 *)data-gd_addr;
+   ipv6_addr_copy(in6, (sin6-sin6_addr));


Extra () here.

-Brian


Re: multicast: bug or feature

2007-10-19 Thread Brian Haley

Hi David,

David Stevens wrote:

From looking at the code, it appears that validate
source is failing just because of the rp_filter. Do you have
rp_filter set to nonzero?
If so, it may do what you want just by setting that
to 0:

sysctl -w net.ipv4.conf.all.rp_filter=0


rp_filter is set to zero, it's the if (res.type != RTN_UNICAST) check 
in fib_validate_source() that's doing it.  If I add a new 
accept_local_addr sysctl to ipv4_devconf to allow RTN_LOCAL here, 
everything works just fine.  I just don't know how palatable that would 
be to upstream...


-Brian


Re: [PATCH 2/2] NFS: handle IPv6 addresses in nfs ctl

2007-10-12 Thread Brian Haley

Hi Aurelien,

Comments in-line.

Aurélien Charbon wrote:
Here is a second missing part of the IPv6 support in NFS server code 
concerning knfd syscall interface.



-struct sockaddr_in *sin;
+struct sockaddr_in6 *sin, sin6_storage;


Nit, should call this sin6 now.


@@ -228,9 +228,20 @@ static ssize_t write_getfs(struct file *
return -EINVAL;
data = (struct nfsctl_fsparm*)buf;
err = -EPROTONOSUPPORT;
-if (data->gd_addr.sa_family != AF_INET)
+sin = &sin6_storage;


This should be moved in the AF_INET case.


+switch (data->gd_addr.sa_family) {
+case AF_INET6:
+sin = (struct sockaddr_in6 *)&data->gd_addr;
+in6 = sin->sin6_addr;


in6 is a structure, not a pointer.  If you want to do this you have to 
use ipv6_addr_copy().
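
The structure-versus-pointer point can be tried out in userspace.  This is 
only a minimal sketch, assuming the glibc <netinet/in.h> definitions; 
copy_in6() and extract_addr() are hypothetical names, with copy_in6() 
standing in for the kernel's ipv6_addr_copy(), which is not available 
outside the kernel.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical userspace stand-in for the kernel's ipv6_addr_copy():
 * copies the 16 address bytes from *src into *dst. */
static void copy_in6(struct in6_addr *dst, const struct in6_addr *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Copies the address out of a sockaddr_in6 into caller-owned storage,
 * so the result stays valid even if the sockaddr is later modified. */
static void extract_addr(const struct sockaddr_in6 *sin6, struct in6_addr *out)
{
	copy_in6(out, &sin6->sin6_addr);
}
```

The caller owns a real struct in6_addr and passes its address, so the copy 
does not alias the request buffer.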



+case AF_INET:
+/* Map v4 address into v6 structure */
+ipv6_addr_v4map(((struct sockaddr_in *)&data->gd_addr)->sin_addr, in6);


ipv6_addr_set(...)


@@ -257,7 +265,7 @@ static ssize_t write_getfs(struct file *
static ssize_t write_getfd(struct file *file, char *buf, size_t size)
{
struct nfsctl_fdparm *data;
-struct sockaddr_in *sin;
+struct sockaddr_in6 *sin, sin6_storage;


Nit, sin - sin6.


@@ -268,18 +276,29 @@ static ssize_t write_getfd(struct file *
return -EINVAL;
data = (struct nfsctl_fdparm*)buf;
err = -EPROTONOSUPPORT;
-if (data->gd_addr.sa_family != AF_INET)
+if (data->gd_addr.sa_family != AF_INET &&
+data->gd_addr.sa_family != AF_INET6)
goto out;
err = -EINVAL;
if (data->gd_version < 2 || data->gd_version > NFSSVC_MAXVERS)
goto out;

res = buf;
-sin = (struct sockaddr_in *)&data->gd_addr;
+sin = &sin6_storage;


Move in AF_INET case.


-/* IPv6 address mapping */
-ipv6_addr_v4map(sin->sin_addr, in6);
+switch (data->gd_addr.sa_family) {
+case AF_INET:
+/* IPv6 address mapping */
+ipv6_addr_v4map(((struct sockaddr_in *)&data->gd_addr)->sin_addr, in6);


Use ipv6_addr_set(...)


+break;
+case AF_INET6:
+sin = (struct sockaddr_in6 *)&data->gd_addr;
+in6 = sin->sin6_addr;


Must use ipv6_addr_copy() here too.

-Brian


Re: [PATCH 1/2] NFS: change the ip_map cache code to handle IPv6 addresses

2007-10-12 Thread Brian Haley

Hi Aurelien,

There were some of my comments you haven't addressed yet, comments in-line.

Aurélien Charbon wrote:

Here is a patch for the ip_map caching code part in nfs server.



+for (i = 0; i < ncp->cl_naddr; i++) {
+/* Mapping address */
+ipv6_addr_v4map(ncp->cl_addrlist[i], addr6);


ipv6_addr_set(&addr6, 0, 0, htonl(0x0000FFFF), ncp->cl_addrlist[i]);


+/* IPv6 address mapping */
+ipv6_addr_v4map(sin->sin_addr, in6);


ipv6_addr_set(&in6, 0, 0, htonl(0x0000FFFF), sin->sin_addr);


+/* IPv6 address mapping */
+ipv6_addr_v4map(sin->sin_addr, in6);


ipv6_addr_set(&in6, 0, 0, htonl(0x0000FFFF), sin->sin_addr);


+#define IS_ADDR_MAPPED(a) \
+(((uint32_t *) (a))[0] == 0\
+&& ((uint32_t *) (a))[1] == 0\
+&& (((uint32_t *) (a))[2] == 0\
+|| ((uint32_t *) (a))[2] == htonl(0x0000FFFF)))


This is unused, can go away.

+static inline void ipv6_addr_v4map(const struct in_addr a1, struct 
in6_addr a2)

+{
+a2.s6_addr32[0] = 0;
+a2.s6_addr32[1] = 0;
+a2.s6_addr32[2] = htonl(0x0000FFFF);
+a2.s6_addr32[3] = (uint32_t)a1.s_addr;
+}


If you use ipv6_addr_set() everywhere you don't need this.
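
For readers following along outside the kernel, the mapping itself is 
small.  The sketch below is a userspace illustration only: 
make_v4mapped() is a hypothetical helper, not a kernel or libc function, 
and it assumes the usual ::ffff:a.b.c.d layout for IPv4-mapped addresses.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the IPv4-mapped form ::ffff:a.b.c.d of
 * v4 into *out.  v4.s_addr is expected in network byte order, as
 * produced by inet_pton(). */
static void make_v4mapped(struct in6_addr *out, struct in_addr v4)
{
	/* 80 zero bits followed by 16 one bits */
	static const uint8_t prefix[12] = {
		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
	};

	memcpy(out->s6_addr, prefix, sizeof(prefix));
	/* the last four bytes carry the IPv4 address unchanged */
	memcpy(out->s6_addr + 12, &v4.s_addr, sizeof(v4.s_addr));
}
```

The result can be checked with the standard IN6_IS_ADDR_V4MAPPED() macro 
from <netinet/in.h>.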


static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
{
return ((a->s6_addr32[0] | a->s6_addr32[1]) == 0 &&
- a->s6_addr32[2] == htonl(0x0000ffff));
+a->s6_addr32[2] == htonl(0x0000ffff));
}


Guessing you changed a tab to a space, unnecessary.


-static struct ip_map *ip_map_lookup(char *class, struct in_addr addr);
+static struct ip_map *ip_map_lookup(char *class, struct in6_addr addr);


I still think you should pass a pointer here.


-int auth_unix_add_addr(struct in_addr addr, struct auth_domain *dom)
+int auth_unix_add_addr(struct in6_addr addr, struct auth_domain *dom)


And here.


-struct auth_domain *auth_unix_lookup(struct in_addr addr)
+struct auth_domain *auth_unix_lookup(struct in6_addr addr)


And here.

-Brian


[IPv6] Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2

2007-10-11 Thread Brian Haley

Hi,

From RFC 3493, Section 5.2:

  IPV6_MULTICAST_IF

 Set the interface to use for outgoing multicast packets.  The
 argument is the index of the interface to use.  If the
 interface index is specified as zero, the system selects the
 interface (for example, by looking up the address in a routing
 table and using the resulting interface).

This patch adds support for (index == 0) to reset the value to its 
original state, allowing the system to choose the best interface.  IPv4 
already behaves this way.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 532425d..1334fc1 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -539,12 +539,15 @@ done:
 	case IPV6_MULTICAST_IF:
 		if (sk->sk_type == SOCK_STREAM)
 			goto e_inval;
-		if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != val)
-			goto e_inval;
 
-		if (__dev_get_by_index(&init_net, val) == NULL) {
-			retv = -ENODEV;
-			break;
+		if (val) {
+			if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != val)
+				goto e_inval;
+
+			if (__dev_get_by_index(&init_net, val) == NULL) {
+				retv = -ENODEV;
+				break;
+			}
 		}
 		np->mcast_oif = val;
 		retv = 0;


Re: [PATCH] division-by-zero in inet_csk_get_port

2007-10-10 Thread Brian Haley

Anton Arapov wrote:

  So, now the way suggested by Denis looks reasonable.

  What do you think?


If that's the case then you should fix __udp_lib_get_port() the same way.

Prevent division by zero in __udp_lib_get_port() when only one 
unsecured port is available.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ef4d901..61faa38 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -150,10 +150,11 @@ int __udp_lib_get_port(struct sock *sk, unsigned short snum,
 		int i;
 		int low = sysctl_local_port_range[0];
 		int high = sysctl_local_port_range[1];
+		int remaining = (high - low) + 1;
 		unsigned rover, best, best_size_so_far;
 
 		best_size_so_far = UINT_MAX;
-		best = rover = net_random() % (high - low) + low;
+		best = rover = net_random() % remaining + low;
 
 		/* 1st pass: look for empty (or shortest) hash chain */
 		for (i = 0; i < UDP_HTABLE_SIZE; i++) {
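
The arithmetic behind the fix can be checked in isolation.  The following 
is a minimal userspace sketch, not the kernel code: pick_start_port() is 
a hypothetical helper and the random value is passed in by the caller 
instead of coming from net_random().  With low == high the old expression 
computes a modulus by (high - low), which is zero; counting the range 
inclusively avoids that.

```c
/* Hypothetical helper mirroring the arithmetic in the patch: picks a
 * starting port in the inclusive range [low, high] from a
 * caller-supplied random value.  Returns -1 if the range is empty. */
static int pick_start_port(int low, int high, unsigned int rnd)
{
	int remaining;

	if (high < low)
		return -1;

	/* inclusive count of ports, so it is at least 1 for a valid range */
	remaining = (high - low) + 1;

	return (int)(rnd % (unsigned int)remaining) + low;
}
```

A single-port range such as [1024, 1024] now always yields 1024 rather 
than dividing by zero.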


[IPv6] Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493

2007-10-10 Thread Brian Haley

Hi,

From RFC 3493, Section 5.2:

  IPV6_MULTICAST_IF

 Set the interface to use for outgoing multicast packets.  The
 argument is the index of the interface to use.  If the
 interface index is specified as zero, the system selects the
 interface (for example, by looking up the address in a routing
 table and using the resulting interface).

This patch adds support for (index == 0) to reset the value to its 
original state, allowing the system to choose the best interface.  IPv4 
already behaves this way.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 532425d..309284e 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -539,6 +539,13 @@ done:
 	case IPV6_MULTICAST_IF:
 		if (sk->sk_type == SOCK_STREAM)
 			goto e_inval;
+
+		if (val == 0) {
+			np->mcast_oif = 0;
+			retv = 0;
+			break;
+		}
+
 		if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != val)
 			goto e_inval;
 


Re: [RFC] more robust inet range checking

2007-10-10 Thread Brian Haley

Stephen Hemminger wrote:

 int inet_csk_bind_conflict(const struct sock *sk,
   const struct inet_bind_bucket *tb)
@@ -77,10 +90,11 @@ int inet_csk_get_port(struct inet_hashin
 
 	local_bh_disable();

if (!snum) {
-   int low = sysctl_local_port_range[0];
-   int high = sysctl_local_port_range[1];
-   int remaining = (high - low) + 1;
-   int rover = net_random() % (high - low) + low;
+   int remaining, range[2], rover;
+
+   inet_get_local_port_range(range);
+   remaining = range[1] - range[0];
+   rover = net_random() % (range[1] - range[0]) + range[0];


nit-pick:
rover = net_random() % remaining + range[0];


--- a/net/ipv4/udp.c2007-10-10 08:27:00.0 -0700
+++ b/net/ipv4/udp.c2007-10-10 09:44:35.0 -0700
@@ -147,13 +147,13 @@ int __udp_lib_get_port(struct sock *sk, 
 	write_lock_bh(&udp_hash_lock);
 
 	if (!snum) {

-   int i;
-   int low = sysctl_local_port_range[0];
-   int high = sysctl_local_port_range[1];
+   int i, range[2];
unsigned rover, best, best_size_so_far;


Should these be signed ints?  They're the only ones that are unsigned, 
but I don't know why.



--- a/net/sctp/protocol.c   2007-10-10 08:27:00.0 -0700
+++ b/net/sctp/protocol.c   2007-10-10 09:58:21.0 -0700
@@ -1173,7 +1173,6 @@ SCTP_STATIC __init int sctp_init(void)
}
 
 	spin_lock_init(&sctp_port_alloc_lock);

-   sctp_port_rover = sysctl_local_port_range[0] - 1;


I think you can remove the port_rover definition in sctp/structs.h and 
also the lock that protects it.  Patch below for that which can be 
applied on-top of yours.


-Brian


Remove SCTP port_rover and port_alloc_lock as they're no longer required.

Signed-off-by: Brian Haley [EMAIL PROTECTED]

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 448f713..c1a083c 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -197,8 +197,6 @@ extern struct sctp_globals {
 
 	/* This is the sctp port control hash.	*/
 	int port_hashsize;
-	int port_rover;
-	spinlock_t port_alloc_lock;  /* Protects port_rover. */
 	struct sctp_bind_hashbucket *port_hashtable;
 
 	/* This is the global local address list.
@@ -245,8 +243,6 @@ extern struct sctp_globals {
 #define sctp_assoc_hashsize		(sctp_globals.assoc_hashsize)
 #define sctp_assoc_hashtable		(sctp_globals.assoc_hashtable)
 #define sctp_port_hashsize		(sctp_globals.port_hashsize)
-#define sctp_port_rover			(sctp_globals.port_rover)
-#define sctp_port_alloc_lock		(sctp_globals.port_alloc_lock)
 #define sctp_port_hashtable		(sctp_globals.port_hashtable)
 #define sctp_local_addr_list		(sctp_globals.local_addr_list)
 #define sctp_local_addr_lock		(sctp_globals.addr_list_lock)
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 80df457..81b26c5 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1172,8 +1172,6 @@ SCTP_STATIC __init int sctp_init(void)
 		sctp_port_hashtable[i].chain = NULL;
 	}
 
-	spin_lock_init(&sctp_port_alloc_lock);
-
 	printk(KERN_INFO "SCTP: Hash tables configured "
 			 "(established %d bind %d)\n",
 		sctp_assoc_hashsize, sctp_port_hashsize);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index e1e2d2c..293200d 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -5321,7 +5321,6 @@ static long sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 		remaining = range[1] - range[0];
 		rover = net_random() % remaining + range[0];
 
-		sctp_spin_lock(&sctp_port_alloc_lock);
 		do {
 			rover++;
 			if ((rover < range[0]) || (rover > range[1]))
@@ -5337,7 +5336,6 @@ static long sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 		next:
 			sctp_spin_unlock(&head->lock);
 		} while (--remaining > 0);
-		sctp_spin_unlock(&sctp_port_alloc_lock);
 
 		/* Exhausted local port range during search? */
 		ret = 1;


Re: [IPv6] Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493

2007-10-10 Thread Brian Haley

David Stevens wrote:

What about just checking for 0 in the later test?

if (val && __dev_get_by_index(val) == NULL) {


We could fail the next check right before that though:

  if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != val)
  goto e_inval;

I just mimicked what the IPv4 code does in do_ip_setsockopt().
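
The ordering issue can be modelled as a pure function.  This is only a 
sketch under stated assumptions: check_mcast_if() and its parameters are 
hypothetical, and the device lookup is reduced to a flag supplied by the 
caller.  It shows why zero has to be handled before the bound-device 
check: a socket bound to interface 2 that passes val == 0 would otherwise 
hit the -EINVAL path first.

```c
#include <errno.h>

/* Hypothetical model of the IPV6_MULTICAST_IF checks.
 *   bound_if   - interface the socket is bound to, 0 if unbound
 *   val        - requested interface index, 0 meaning "system chooses"
 *   dev_exists - nonzero if val names an existing device
 * Returns 0 on success or a negative errno value. */
static int check_mcast_if(int bound_if, int val, int dev_exists)
{
	if (val == 0)
		return 0;		/* reset: skip both later checks */

	if (bound_if && bound_if != val)
		return -EINVAL;		/* conflicts with the bound device */

	if (!dev_exists)
		return -ENODEV;

	return 0;
}
```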

-Brian


[IPv6] Fix ICMPv6 redirect handling with target multicast address, try 3

2007-10-03 Thread Brian Haley
When the ICMPv6 Target address is multicast, Linux processes the 
redirect instead of dropping it.  The problem is in this code in 
ndisc_redirect_rcv():


	if (ipv6_addr_equal(dest, target)) {
		on_link = 1;
	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
		ND_PRINTK2(KERN_WARNING
			   "ICMPv6 Redirect: target address is not link-local.\n");
		return;
	}

This second check will succeed if the Target address is, for example, 
FF02::1 because it has link-local scope.  Instead, it should be checking 
if it's a unicast link-local address, as stated in RFC 2461/4861 Section 
8.1:


  - The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's 
implied.


This bug is preventing Linux kernels from achieving IPv6 Logo Phase II 
certification because of a recent error that was found in the TAHI test 
suite - Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the 
multicast address in the Destination field instead of Target field, so 
we were passing the test.  This won't be the case anymore.


The patch below fixes this problem, and also fixes ndisc_send_redirect() 
to not send an invalid redirect with a multicast address in the Target 
field.  I re-ran the TAHI Neighbor Discovery section to make sure Linux 
passes all 245 tests now.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
Acked-by: David L Stevens [EMAIL PROTECTED]
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 74c4d8d..b761dbe 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1267,9 +1267,10 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 
 	if (ipv6_addr_equal(dest, target)) {
 		on_link = 1;
-	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	} else if (ipv6_addr_type(target) !=
+		   (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			   "ICMPv6 Redirect: target address is not link-local.\n");
+			   "ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
 
@@ -1343,9 +1344,9 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 
 	if (!ipv6_addr_equal(&ipv6_hdr(skb)->daddr, target) &&
-	    !(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	    ipv6_addr_type(target) != (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			"ICMPv6 Redirect: target address is not link-local.\n");
+			"ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
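
The distinction the patch draws between link-local scope and link-local 
unicast can be reproduced in userspace.  This is a rough analogue, not 
the kernel check: target_is_linklocal_unicast() is a hypothetical helper 
built on the standard IN6_IS_ADDR_* macros from <netinet/in.h>.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical userspace analogue of the fixed check: returns 1 only
 * when text parses as a link-local unicast address (fe80::/10), which
 * is what RFC 4861 Section 8.1 expects in a redirect's Target field
 * when the target is a router. */
static int target_is_linklocal_unicast(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return 0;

	/* IN6_IS_ADDR_LINKLOCAL matches fe80::/10 only; the multicast
	 * test is redundant with it but makes the intent explicit. */
	return IN6_IS_ADDR_LINKLOCAL(&a) && !IN6_IS_ADDR_MULTICAST(&a);
}
```

With this helper fe80::1 is accepted, while ff02::1 (link-local scope, 
but multicast) and a global address are rejected.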
 


[IPv6] Fix ICMPv6 redirect handling with target multicast address

2007-10-02 Thread Brian Haley
When the ICMPv6 Target address is multicast, Linux processes the 
redirect instead of dropping it.  The problem is in this code in 
ndisc_redirect_rcv():


	if (ipv6_addr_equal(dest, target)) {
		on_link = 1;
	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
		ND_PRINTK2(KERN_WARNING
			   "ICMPv6 Redirect: target address is not link-local.\n");
		return;
	}

This second check will succeed if the Target address is, for example, 
FF02::1 because it has link-local scope.  Instead, it should be checking 
if it's a unicast link-local address, as stated in RFC 2461/4861 Section 
8.1:


  - The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's 
implied.


This bug is preventing Linux kernels from achieving IPv6 Logo Phase II 
certification because of a recent error that was found in the TAHI test 
suite - Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the 
multicast address in the Destination field instead of Target field, so 
we were passing the test.  This won't be the case anymore.


The patch below fixes this problem, and also fixes ndisc_send_redirect() 
to not send an invalid redirect with a multicast address in the Target 
field.  I re-ran the TAHI Neighbor Discovery section to make sure Linux 
passes all 245 tests now.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 74c4d8d..a0a6406 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1267,7 +1267,8 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 
 	if (ipv6_addr_equal(dest, target)) {
 		on_link = 1;
-	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	} else if (ipv6_addr_type(target) !=
+		   (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
 			   "ICMPv6 Redirect: target address is not link-local.\n");
 		return;
@@ -1343,7 +1344,7 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 
 	if (!ipv6_addr_equal(&ipv6_hdr(skb)->daddr, target) &&
-	    !(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	    ipv6_addr_type(target) != (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
 			"ICMPv6 Redirect: target address is not link-local.\n");
 		return;


Re: [IPv6] Fix ICMPv6 redirect handling with target multicast address

2007-10-02 Thread Brian Haley

Hi David,

David Stevens wrote:
ipv6_addr_type() returns a mask, so checking for equality will fail to
match if any other (irrelevant) attributes are set. How about using
bitwise operators for that?


ipv6_addr_type() does return a mask, but there's a lot of code that just 
checks for equality since some things are mutually-exclusive - this code 
is actually identical to what ip6_route_add() does.  I don't 
particularly like this duality, but it's there - I'd gladly volunteer to 
clean this up everywhere if I didn't think there might be some 
performance reason it was done like that.
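
The difference between the three styles of test is easy to see with a few 
flag values.  The sketch below is illustrative only: the T_* constants 
and the *_accepts() helpers are names local to this example, modelled on 
the kernel's IPV6_ADDR_* type bits rather than taken from a header.

```c
/* Illustrative type bits, local to this sketch. */
#define T_UNICAST	0x0001u
#define T_MULTICAST	0x0002u
#define T_LINKLOCAL	0x0020u

/* Original test: any address with link-local scope passes. */
static int old_check_accepts(unsigned int type)
{
	return (type & T_LINKLOCAL) != 0;
}

/* Equality test: passes only for exactly UNICAST|LINKLOCAL. */
static int equality_check_accepts(unsigned int type)
{
	return type == (T_UNICAST | T_LINKLOCAL);
}

/* Bitwise test: both bits must be set, other bits are ignored. */
static int bitwise_check_accepts(unsigned int type)
{
	const unsigned int want = T_UNICAST | T_LINKLOCAL;

	return (type & want) == want;
}
```

A link-local multicast type (T_MULTICAST | T_LINKLOCAL) passes only the 
original test.  The equality and bitwise forms differ only when some 
additional bit is set alongside UNICAST|LINKLOCAL, which is the case the 
review comment is worried about.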



Also, the error message is no longer descriptive of the
failure if it's a link-local multicast, but you could make it "target
address is not link-local unicast.\n" (in both places).


I can do that, thanks.

-Brian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [IPV6] Fix ICMPv6 redirect handling with target multicast address

2007-10-01 Thread Brian Haley
Hi,

YOSHIFUJI Hideaki / 吉藤英明 wrote:
 I think it'd also be better if you add the check to be:

 if (ipv6_addr_type(target) & 
 (IPV6_ADDR_LINKLOCAL|IPV6_ADDR_UNICAST))

 or something along those lines, rather than reproducing ipv6_addr_type() 
 code
 separately in a new ipv6_addr_linklocal() function.
 
 I'm fine with the idea of the fix itself.

Ok, in both the receive and send code?

 Please use ipv6_addr_type() so far and convert other users as well
 to ipv6_addr_linklocal() in another patch.

I'll re-do the patch.

-Brian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[IPV6] Fix ICMPv6 redirect handling with target multicast address

2007-09-28 Thread Brian Haley
When the ICMPv6 Target address is multicast, Linux processes the 
redirect instead of dropping it.  The problem is in this code in 
ndisc_redirect_rcv():


	if (ipv6_addr_equal(dest, target)) {
		on_link = 1;
	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
		ND_PRINTK2(KERN_WARNING
			   "ICMPv6 Redirect: target address is not link-local.\n");
		return;
	}

This second check will succeed if the Target address is, for example, 
FF02::1 because it has link-local scope.  Instead, it should be checking 
if it's a unicast link-local address, as stated in RFC 2461/4861 Section 
8.1:


  - The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's 
implied.


This bug is preventing Linux kernels from achieving IPv6 Logo Phase II 
certification because of a recent error that was found in the TAHI test 
suite - Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the 
multicast address in the Destination field instead of Target field, so 
we were passing the test.  This won't be the case anymore.


The patch below fixes this problem, and also fixes ndisc_send_redirect() 
to not send an invalid redirect with a multicast address in the Target 
field.  I re-ran the TAHI Neighbor Discovery section to make sure Linux 
passes all 245 tests now.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 31b3f1b..4f47d29 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -368,6 +368,11 @@ static inline int ipv6_prefix_equal(const struct in6_addr *a1,
    prefixlen);
 }
 
+static inline int ipv6_addr_linklocal(const struct in6_addr *a)
+{
+	return ((a->s6_addr32[0] & htonl(0xFFC00000)) == htonl(0xFE800000));
+}
+
 static inline int ipv6_addr_any(const struct in6_addr *a)
 {
 	return ((a->s6_addr32[0] | a->s6_addr32[1] | 
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 74c4d8d..8f953a7 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1267,7 +1267,7 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 
 	if (ipv6_addr_equal(dest, target)) {
 		on_link = 1;
-	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	} else if (!ipv6_addr_linklocal(target)) {
 		ND_PRINTK2(KERN_WARNING
 			   "ICMPv6 Redirect: target address is not link-local.\n");
 		return;
@@ -1343,7 +1343,7 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 
 	if (!ipv6_addr_equal(&ipv6_hdr(skb)->daddr, target) &&
-	    !(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	    !ipv6_addr_linklocal(target)) {
 		ND_PRINTK2(KERN_WARNING
 			"ICMPv6 Redirect: target address is not link-local.\n");
 		return;


Re: [PATCH 1/1] NFS: change the ip_map cache code to handle IPv6 addresses

2007-09-06 Thread Brian Haley

Hi Aurelien,

More comments.

Aurélien Charbon wrote:

This is a small part of missing pieces of IPv6 support for the server.
It deals with the ip_map caching code part.



/* Insert client into hashtable. */
-   for (i = 0; i < ncp->cl_naddr; i++)
-   auth_unix_add_addr(ncp->cl_addrlist[i], dom);
-
+   for (i = 0; i < ncp->cl_naddr; i++) {
+   /* Mapping address */
+   ipv6_addr_v4map(ncp->cl_addrlist[i], addr6);


ipv6_addr_set(&addr6, 0, 0, htonl(0x0000FFFF), ncp->cl_addrlist[i]);
See below.


@@ -236,7 +237,11 @@ static ssize_t write_getfs(struct file *
res = (struct knfsd_fh*)buf;
 
 	exp_readlock();

-   if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+   /* IPv6 address mapping */
+   ipv6_addr_v4map(sin->sin_addr, in6);


ipv6_addr_set(&in6, 0, 0, htonl(0x0000FFFF), sin->sin_addr);
See below.


@@ -271,7 +277,11 @@ static ssize_t write_getfd(struct file *
res = buf;
sin = (struct sockaddr_in *)&data->gd_addr;
exp_readlock();
-   if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+   /* IPv6 address mapping */
+   ipv6_addr_v4map(sin->sin_addr, in6);


ipv6_addr_set(&in6, 0, 0, htonl(0x0000FFFF), sin->sin_addr);
See below.


+#define IS_ADDR_MAPPED(a) \
+   (((uint32_t *) (a))[0] == 0 \
+&& ((uint32_t *) (a))[1] == 0   \
+&& (((uint32_t *) (a))[2] == 0  \
+   || ((uint32_t *) (a))[2] == htonl(0x0000FFFF)))


Can go away, right?


+static inline void ipv6_addr_v4map(const struct in_addr a1, struct in6_addr a2)
+{
+   a2.s6_addr32[0] = 0;
+   a2.s6_addr32[1] = 0;
+   a2.s6_addr32[2] = htonl(0x0000FFFF);
+   a2.s6_addr32[3] = (uint32_t)a1.s_addr;
+}


This can go away.  Looking at other code that does this - TCP, UDP, 
DCCP, they just call ipv6_addr_set() directly.



+static inline int hash_ip6(struct in6_addr ip)
+{
+   return (hash_ip(ip.s6_addr32[0]) ^
+   hash_ip(ip.s6_addr32[1]) ^
+   hash_ip(ip.s6_addr32[2]) ^
+   hash_ip(ip.s6_addr32[3]));
+}


Should probably use a pointer to the address (*ip), probably doesn't 
matter that much since it's an inline.
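
A pointer-taking variant might look like the sketch below.  It is a 
userspace illustration only: fold_in6() is a hypothetical name, and it 
XORs the four 32-bit words directly instead of running each one through 
the NFS hash_ip() helper as the patch does.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fold of an IPv6 address into 32 bits.  Taking a const
 * pointer avoids copying the 16-byte structure at each call site. */
static uint32_t fold_in6(const struct in6_addr *ip)
{
	uint32_t w[4];

	/* memcpy sidesteps alignment and aliasing concerns */
	memcpy(w, ip->s6_addr, sizeof(w));

	return w[0] ^ w[1] ^ w[2] ^ w[3];
}
```

A plain XOR fold collides whenever two words are equal, so a real hash 
would still want a per-word mixing step in front of it.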



@@ -151,20 +159,22 @@ static void ip_map_request(struct cache_
 {
char text_addr[20];


This needs to be at least 40 since you're passing that to snprintf() below.


+   if (ipv6_addr_v4mapped(&(im->m_addr))) {
+   snprintf(text_addr, 20, NIPQUAD_FMT,
+   ntohl(im->m_addr.s6_addr32[3]) >> 24 & 0xff,
+   ntohl(im->m_addr.s6_addr32[3]) >> 16 & 0xff,
+   ntohl(im->m_addr.s6_addr32[3]) >>  8 & 0xff,
+   ntohl(im->m_addr.s6_addr32[3]) >>  0 & 0xff);
+   } else {
+   snprintf(text_addr, 40, NIP6_FMT, NIP6(im->m_addr));
+   }


Here ---^^


-static struct ip_map *ip_map_lookup(char *class, struct in_addr addr)
+static struct ip_map *ip_map_lookup(char *class, struct in6_addr addr)


Maybe you should pass a pointer to the address (*addr) to avoid passing 
it on the stack.



-int auth_unix_add_addr(struct in_addr addr, struct auth_domain *dom)
+int auth_unix_add_addr(struct in6_addr addr, struct auth_domain *dom)


Here too.


-struct auth_domain *auth_unix_lookup(struct in_addr addr)
+struct auth_domain *auth_unix_lookup(struct in6_addr addr)


Here too.


@@ -641,7 +669,19 @@ static int unix_gid_find(uid_t uid, stru
 int
 svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
-   struct sockaddr_in *sin = svc_addr_in(rqstp);
+   struct sockaddr_in *sin;
+   struct sockaddr_in6 *sin6;


Will need to change this to something like:

 +  struct sockaddr_in6 *sin6, sin6_storage;

See below.


+
+   switch (rqstp->rq_addr.ss_family) {
+   default:
+   BUG();
+   case AF_INET:
+   sin = svc_addr_in(rqstp);
+   ipv6_addr_v4map(sin->sin_addr, sin6->sin6_addr);


sin6 here is uninitialized, and in order to create a mapped address 
you'll need to allocate storage space on the stack to hold it.  New code 
would be:


sin6 = &sin6_storage;
ipv6_addr_set(&sin6->sin6_addr, 0, 0, htonl(0x0000FFFF),
sin->sin_addr);

gcc really should have complained about that...

Maybe in the future rq_addr can just be in the correct form, but there's 
a lot of other code that would need to change for that.


-Brian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] NFS: change the ip_map cache code to handle IPv6 addresses

2007-08-23 Thread Brian Haley

Hi Aurelien,

Aurélien Charbon wrote:
According to Neil's comments, I have tried to correct the mistakes of my 
first sending


I have some more comments.


@@ -1559,6 +1560,7 @@ exp_addclient(struct nfsctl_client *ncp)
{
struct auth_domain *dom;
int i, err;
+struct in6_addr addr6;


Indentation looks wrong.

diff -p -u -r -N linux-2.6.23-rc3/fs/nfsd/nfsctl.c linux-2.6.23-rc3-IPv6-ipmap-cache/fs/nfsd/nfsctl.c
--- linux-2.6.23-rc3/fs/nfsd/nfsctl.c 2007-08-23 13:18:16.0 +0200
+++ linux-2.6.23-rc3-IPv6-ipmap-cache/fs/nfsd/nfsctl.c 2007-08-23 13:25:28.0 +0200

@@ -222,7 +222,7 @@ static ssize_t write_getfs(struct file *
struct auth_domain *clp;
int err = 0;
struct knfsd_fh *res;
-
+struct in6_addr in6;


Indentation.


if (size < sizeof(*data))
return -EINVAL;
data = (struct nfsctl_fsparm*)buf;
@@ -236,7 +236,14 @@ static ssize_t write_getfs(struct file *
res = (struct knfsd_fh*)buf;

exp_readlock();
-if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+/* IPv6 address mapping */
+in6.s6_addr32[0] = 0;
+in6.s6_addr32[1] = 0;
+in6.s6_addr32[2] = htonl(0x0000FFFF);
+in6.s6_addr32[3] = (uint32_t)sin->sin_addr.s_addr;


Why didn't you use your new ipv6_addr_map() inline here?


@@ -253,6 +260,7 @@ static ssize_t write_getfd(struct file *
{
struct nfsctl_fdparm *data;
struct sockaddr_in *sin;
+struct in6_addr in6;


Indentation.


@@ -271,7 +279,14 @@ static ssize_t write_getfd(struct file *
res = buf;
sin = (struct sockaddr_in *)&data->gd_addr;
exp_readlock();
-if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+/* IPv6 address mapping */
+in6.s6_addr32[0] = 0;
+in6.s6_addr32[1] = 0;
+in6.s6_addr32[2] = htonl(0x0000FFFF);
+in6.s6_addr32[3] = (uint32_t)sin->sin_addr.s_addr;


Why didn't you use your new ipv6_addr_map() inline here too?

diff -p -u -r -N linux-2.6.23-rc3/include/net/ipv6.h linux-2.6.23-rc3-IPv6-ipmap-cache/include/net/ipv6.h
--- linux-2.6.23-rc3/include/net/ipv6.h 2007-08-23 13:18:23.0 +0200
+++ linux-2.6.23-rc3-IPv6-ipmap-cache/include/net/ipv6.h 2007-08-23 13:25:28.0 +0200

@@ -21,6 +21,7 @@
#include <net/ndisc.h>
#include <net/flow.h>
#include <net/snmp.h>
+#include <linux/in.h>

#define SIN6_LEN_RFC2133 24

@@ -167,6 +168,12 @@ DECLARE_SNMP_STAT(struct udp_mib, udplit
if (is_udplite) SNMP_INC_STATS_USER(udplite_stats_in6, field); \
else SNMP_INC_STATS_USER(udp_stats_in6, field);} while(0)

+#define IS_ADDR_MAPPED(a) \
+(((uint32_t *) (a))[0] == 0\
+&& ((uint32_t *) (a))[1] == 0\
+&& (((uint32_t *) (a))[2] == 0\
+|| ((uint32_t *) (a))[2] == htonl(0x0000FFFF)))


I need to update a patch of mine that added a v4-mapped inline, let me 
send that out.  In the kernel you should use u32 too, is that why you 
needed to include linux/net.h?



+/* Maps a IPv4 address into a wright IPv6 address */
+static inline int ipv6_addr_map(const struct in_addr a1, struct 
in6_addr a2)

+{
+a2.s6_addr32[0] = 0;
+a2.s6_addr32[1] = 0;
+a2.s6_addr32[2] = htonl(0x0000FFFF);
+a2.s6_addr32[3] = (uint32_t)a1.s_addr;
+return 0;
+}


This can be void, no one ever checks the return status.  Maybe change the 
name to ipv6_addr_v4map() too?



@@ -84,7 +85,7 @@ static void svcauth_unix_domain_release(
struct ip_map {
struct cache_head h;
char m_class[8]; /* e.g. "nfsd" */
-struct in_addr m_addr;
+struct in6_addr m_addr;


Indentation.


static void ip_map_init(struct cache_head *cnew, struct cache_head *citem)
{
@@ -125,7 +133,7 @@ static void ip_map_init(struct cache_hea
struct ip_map *item = container_of(citem, struct ip_map, h);

strcpy(new->m_class, item->m_class);
-new->m_addr.s_addr = item->m_addr.s_addr;
+memcpy(&(new->m_addr), &(item->m_addr), sizeof(struct in6_addr));


Use ipv6_addr_copy().


@@ -151,20 +159,22 @@ static void ip_map_request(struct cache_
{
char text_addr[20];
struct ip_map *im = container_of(h, struct ip_map, h);
-__be32 addr = im->m_addr.s_addr;
-
-snprintf(text_addr, 20, "%u.%u.%u.%u",
- ntohl(addr) >> 24 & 0xff,
- ntohl(addr) >> 16 & 0xff,
- ntohl(addr) >>  8 & 0xff,
- ntohl(addr) >>  0 & 0xff);

+if (IS_ADDR_MAPPED(im->m_addr.s6_addr32)) {
+snprintf(text_addr, 20, NIPQUAD_FMT,
+ntohl(im->m_addr.s6_addr32[3]) >> 24 & 0xff,
+ntohl(im->m_addr.s6_addr32[3]) >> 16 & 0xff,
+ntohl(im->m_addr.s6_addr32[3]) >>  8 & 0xff,
+ntohl(im->m_addr.s6_addr32[3]) >>  0 & 0xff);
+} else {
+snprintf(text_addr, 20, NIP6_FMT, NIP6(im->m_addr));
+}
+}


You'll need more than 20 bytes to print an IPv6 address, I'd make this 
at least 44 to account for some fluff.  Surprised you didn't crash 
during testing.
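
The size requirement is easy to confirm in userspace.  The sketch below 
is an illustration, not the kernel code: format_in6_full() is a 
hypothetical helper that prints all eight groups in full, which is 
assumed here to match what NIP6_FMT produces.  The full form needs 39 
characters plus the terminating NUL, so a 20-byte buffer is truncated by 
snprintf() rather than overrun.

```c
#include <netinet/in.h>
#include <stdio.h>

/* Hypothetical helper: formats all eight 16-bit groups of *a as four
 * hex digits each, separated by colons.  Returns the number of
 * characters the full text needs, excluding the NUL, exactly as
 * snprintf() reports it. */
static int format_in6_full(char *buf, size_t len, const struct in6_addr *a)
{
	const unsigned char *b = a->s6_addr;

	return snprintf(buf, len,
			"%02x%02x:%02x%02x:%02x%02x:%02x%02x:"
			"%02x%02x:%02x%02x:%02x%02x:%02x%02x",
			b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
			b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

In userspace code that uses inet_ntop() instead, the standard 
INET6_ADDRSTRLEN constant is the usual choice for the buffer size.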



static int ip_map_parse(struct cache_detail *cd,
@@ -175,10 +185,10 @@ static int ip_map_parse(struct cache_det
 

[IPv6] Add v4mapped address inline

2007-08-23 Thread Brian Haley

Add v4mapped address inline to avoid calls to ipv6_addr_type().
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 9059e0e..c2b6c11 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -418,6 +418,12 @@ static inline int ipv6_addr_diff(const struct in6_addr *a1, const struct in6_add
 	return __ipv6_addr_diff(a1, a2, sizeof(struct in6_addr));
 }
 
+static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
+{
+	return ((a->s6_addr32[0] | a->s6_addr32[1]) == 0 &&
+		 a->s6_addr32[2] == htonl(0x0000ffff));
+}
+
 /*
  *	Prototypes exported by ipv6
  */
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 761a910..92d8119 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -249,7 +249,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			}
 
 			if (ipv6_only_sock(sk) ||
-			    !(ipv6_addr_type(&np->daddr) & IPV6_ADDR_MAPPED)) {
+			    !ipv6_addr_v4mapped(&np->daddr)) {
 				retv = -EADDRNOTAVAIL;
 				break;
 			}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0f7defb..d5c0175 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -697,7 +697,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char __user *optval,
 	if (!cmd.tcpm_keylen) {
 		if (!tcp_sk(sk)->md5sig_info)
 			return -ENOENT;
-		if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED)
+		if (ipv6_addr_v4mapped(&sin6->sin6_addr))
 			return tcp_v4_md5_do_del(sk, sin6->sin6_addr.s6_addr32[3]);
 		return tcp_v6_md5_do_del(sk, &sin6->sin6_addr);
 	}
@@ -720,7 +720,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char __user *optval,
 	newkey = kmemdup(cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
 	if (!newkey)
 		return -ENOMEM;
-	if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED) {
+	if (ipv6_addr_v4mapped(&sin6->sin6_addr)) {
 		return tcp_v4_md5_do_add(sk, sin6->sin6_addr.s6_addr32[3],
 					 newkey, cmd.tcpm_keylen);
 	}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4210951..3e0ca15 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -610,7 +610,7 @@ int udpv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 		daddr = NULL;
 
 	if (daddr) {
-		if (ipv6_addr_type(daddr) == IPV6_ADDR_MAPPED) {
+		if (ipv6_addr_v4mapped(daddr)) {
 			struct sockaddr_in sin;
 			sin.sin_family = AF_INET;
 			sin.sin_port = sin6 ? sin6->sin6_port : inet->dport;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index f8aa23d..cd57a51 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -481,7 +481,7 @@ static int sctp_v6_cmp_addr(const union sctp_addr *addr1,
 	if (addr1->sa.sa_family != addr2->sa.sa_family) {
 		if (addr1->sa.sa_family == AF_INET &&
 		    addr2->sa.sa_family == AF_INET6 &&
-		    IPV6_ADDR_MAPPED == ipv6_addr_type(&addr2->v6.sin6_addr)) {
+		    ipv6_addr_v4mapped(&addr2->v6.sin6_addr)) {
 			if (addr2->v6.sin6_port == addr1->v4.sin_port &&
 			    addr2->v6.sin6_addr.s6_addr32[3] ==
 			    addr1->v4.sin_addr.s_addr)
@@ -489,7 +489,7 @@ static int sctp_v6_cmp_addr(const union sctp_addr *addr1,
 		}
 		if (addr2->sa.sa_family == AF_INET &&
 		    addr1->sa.sa_family == AF_INET6 &&
-		    IPV6_ADDR_MAPPED == ipv6_addr_type(&addr1->v6.sin6_addr)) {
+		    ipv6_addr_v4mapped(&addr1->v6.sin6_addr)) {
 			if (addr1->v6.sin6_port == addr2->v4.sin_port &&
 			    addr1->v6.sin6_addr.s6_addr32[3] ==
 			    addr2->v4.sin_addr.s_addr)


Re: [IPv6] Add v4mapped address inline

2007-08-23 Thread Brian Haley

YOSHIFUJI Hideaki /  wrote:

Please put this just after ipv6_addr_any(), not after
ipv6_addr_diff().


Ok, updated patch attached.

-Brian


Add v4mapped address inline to avoid calls to ipv6_addr_type().

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 9059e0e..37bdb25 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -377,6 +377,12 @@ static inline int ipv6_addr_any(const struct in6_addr *a)
 		 a->s6_addr32[2] | a->s6_addr32[3] ) == 0);
 }
 
+static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
+{
+	return ((a->s6_addr32[0] | a->s6_addr32[1]) == 0 &&
+		 a->s6_addr32[2] == htonl(0x0000FFFF));
+}
+
 /*
  * find the first different bit between two addresses
  * length of address must be a multiple of 32bits
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 761a910..92d8119 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -249,7 +249,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			}
 
 			if (ipv6_only_sock(sk) ||
-			    !(ipv6_addr_type(&np->daddr) & IPV6_ADDR_MAPPED)) {
+			    !ipv6_addr_v4mapped(&np->daddr)) {
 				retv = -EADDRNOTAVAIL;
 				break;
 			}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0f7defb..d5c0175 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -697,7 +697,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char __user *optval,
 	if (!cmd.tcpm_keylen) {
 		if (!tcp_sk(sk)->md5sig_info)
 			return -ENOENT;
-		if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED)
+		if (ipv6_addr_v4mapped(&sin6->sin6_addr))
 			return tcp_v4_md5_do_del(sk, sin6->sin6_addr.s6_addr32[3]);
 		return tcp_v6_md5_do_del(sk, &sin6->sin6_addr);
 	}
@@ -720,7 +720,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char __user *optval,
 	newkey = kmemdup(cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
 	if (!newkey)
 		return -ENOMEM;
-	if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED) {
+	if (ipv6_addr_v4mapped(&sin6->sin6_addr)) {
 		return tcp_v4_md5_do_add(sk, sin6->sin6_addr.s6_addr32[3],
 					 newkey, cmd.tcpm_keylen);
 	}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4210951..3e0ca15 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -610,7 +610,7 @@ int udpv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 		daddr = NULL;
 
 	if (daddr) {
-		if (ipv6_addr_type(daddr) == IPV6_ADDR_MAPPED) {
+		if (ipv6_addr_v4mapped(daddr)) {
 			struct sockaddr_in sin;
 			sin.sin_family = AF_INET;
 			sin.sin_port = sin6 ? sin6->sin6_port : inet->dport;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index f8aa23d..cd57a51 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -481,7 +481,7 @@ static int sctp_v6_cmp_addr(const union sctp_addr *addr1,
 	if (addr1->sa.sa_family != addr2->sa.sa_family) {
 		if (addr1->sa.sa_family == AF_INET &&
 		    addr2->sa.sa_family == AF_INET6 &&
-		    IPV6_ADDR_MAPPED == ipv6_addr_type(&addr2->v6.sin6_addr)) {
+		    ipv6_addr_v4mapped(&addr2->v6.sin6_addr)) {
 			if (addr2->v6.sin6_port == addr1->v4.sin_port &&
 			    addr2->v6.sin6_addr.s6_addr32[3] ==
 			    addr1->v4.sin_addr.s_addr)
@@ -489,7 +489,7 @@ static int sctp_v6_cmp_addr(const union sctp_addr *addr1,
 		}
 		if (addr2->sa.sa_family == AF_INET &&
 		    addr1->sa.sa_family == AF_INET6 &&
-		    IPV6_ADDR_MAPPED == ipv6_addr_type(&addr1->v6.sin6_addr)) {
+		    ipv6_addr_v4mapped(&addr1->v6.sin6_addr)) {
 			if (addr1->v6.sin6_port == addr2->v4.sin_port &&
 			    addr1->v6.sin6_addr.s6_addr32[3] ==
 			    addr2->v4.sin_addr.s_addr)


Re: [GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread Brian Haley

Thomas Graf wrote:

@@ -217,14 +229,8 @@ EXPORT_SYMBOL(genl_register_mc_group);
 void genl_unregister_mc_group(struct genl_family *family,
  struct genl_multicast_group *grp)
 {
-   BUG_ON(grp->family != family);
genl_lock();
-   netlink_clear_multicast_users(genl_sock, grp->id);
-   clear_bit(grp->id, mc_groups);
-   list_del(&grp->list);
-   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
-   grp->id = 0;
-   grp->family = NULL;
+   genl_unregister_mc_group(family, grp);
genl_unlock();
 }


Shouldn't this be __genl_unregister_mc_group(family, grp) ?
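To spell out why the name matters: as quoted, the function would call itself after taking the lock. A rough user-space sketch of the usual split between a locking wrapper and a lock-free helper follows. The toy lock, struct group and both function names are made up for illustration, and the helper is named with a _locked suffix here because a leading double underscore is reserved in user-space C.

```c
#include <assert.h>

/* Toy non-recursive lock: taking it twice is the bug being illustrated. */
static int lock_held;

static void toy_lock(void)
{
	assert(!lock_held);	/* a real mutex would deadlock here */
	lock_held = 1;
}

static void toy_unlock(void)
{
	assert(lock_held);
	lock_held = 0;
}

struct group {
	int id;
};

/* Helper variant: the caller must already hold the lock. */
static void unregister_group_locked(struct group *grp)
{
	assert(lock_held);
	grp->id = 0;
}

/* Public variant: takes the lock, then calls the helper, never itself. */
static void unregister_group(struct group *grp)
{
	toy_lock();
	unregister_group_locked(grp);
	toy_unlock();
}
```

If unregister_group() called itself instead of the helper, the second toy_lock() would trip the assertion, which stands in for the recursion and deadlock the real code would hit.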

-Brian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 4/8] Use menuconfig objects: SCTP

2007-05-11 Thread Brian Haley

[EMAIL PROTECTED] wrote:

diff -puN net/sctp/Kconfig~use-menuconfig-objects-sctp net/sctp/Kconfig
--- a/net/sctp/Kconfig~use-menuconfig-objects-sctp
+++ a/net/sctp/Kconfig
@@ -2,11 +2,9 @@
 # SCTP configuration
 #
 
-menu "SCTP Configuration (EXPERIMENTAL)"

-   depends on INET && EXPERIMENTAL
-
-config IP_SCTP
+menuconfig IP_SCTP
tristate "The SCTP Protocol (EXPERIMENTAL)"
+   depends on NET && EXPERIMENTAL
depends on IPV6 || IPV6=n
select CRYPTO if SCTP_HMAC_SHA1 || SCTP_HMAC_MD5
select CRYPTO_HMAC if SCTP_HMAC_SHA1 || SCTP_HMAC_MD5


This just changed "INET && EXPERIMENTAL" to "NET && EXPERIMENTAL"; I don't 
know if that was intended.


-Brian


Re: [RFC, PATCH] IPV6 : add 64 bits components in struct in6_addr to speedup ipv6_addr_equal() ipv6_addr_any()

2007-04-30 Thread Brian Haley

Eric Dumazet wrote:

On 64bit arches, we can speedup some IPV6 addresses compares, using 64 bits 
fields in struct in6_addr.



diff --git a/include/linux/in6.h b/include/linux/in6.h
index 2a61c82..a4241a6 100644
--- a/include/linux/in6.h
+++ b/include/linux/in6.h
@@ -34,10 +34,12 @@ struct in6_addr
__u8    u6_addr8[16];
__be16  u6_addr16[8];
__be32  u6_addr32[4];
+   __be64  u6_addr64[2];
} in6_u;
 #define s6_addr    in6_u.u6_addr8
 #define s6_addr16  in6_u.u6_addr16
 #define s6_addr32  in6_u.u6_addr32
+#define s6_addr64  in6_u.u6_addr64
 };


I also had this idea back in 2004:

 I will eventually do a 64-bit comparison to see if putting an
 #ifdef CONFIG_64BIT is worth it.

 No, because we cannot assume 64bit alignment.

 --yoshfuji

The problem is that drivers don't necessarily align the address on the 
correct boundary, so on some 64-bit arches this could be fatal.  There 
are ways around it (I did this in a previous life), but you'd need to copy 
the addresses and hide them in the skb in the rare case, neither of 
which is a great thing to do.
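For illustration only, here is the "copy the addresses" idea in miniature as user-space C. addr_equal64() is a hypothetical name; the sketch assumes nothing about how the kernel would do this, and it leaves the skb side out entirely. Copying each address into aligned locals first means no misaligned 64-bit load is issued on the caller's pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical alignment-safe comparison of two 16-byte addresses.
 * The inputs may sit at any byte offset; the 64-bit compares are done
 * on local copies that the compiler aligns properly.
 */
static int addr_equal64(const void *p1, const void *p2)
{
	uint64_t a[2], b[2];

	assert(p1 != NULL && p2 != NULL);
	memcpy(a, p1, sizeof(a));
	memcpy(b, p2, sizeof(b));
	return ((a[0] ^ b[0]) | (a[1] ^ b[1])) == 0;
}
```

The copy is the cost being weighed above: it makes the compare safe at any alignment, but it adds work to every comparison.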


Unless Yoshifuji has a better solution...

-Brian


Re: [PATCH 1/4] [IPv6] Add link and site-local scope inline

2007-04-06 Thread Brian Haley



YOSHIFUJI Hideaki /  wrote:

In article [EMAIL PROTECTED] (at Thu, 05 Apr 2007 23:21:05 -0400), Brian Haley 
[EMAIL PROTECTED] says:


Add link and site-local scope inline to avoid calls to ipv6_addr_type().



I disagree.  Multicast scopes should also be handled appropriately.


Yes, I totally missed that 
ipv6_addr_scope2type(IPV6_ADDR_MC_SCOPE(addr)) in __ipv6_addr_type(), so 
the linklocal inline probably isn't worth it since it would have to be 
something like:


static inline int ipv6_addr_scope_linklocal(const struct in6_addr *a)
{
return ((a->s6_addr32[0] & htonl(0xFFC00000)) == htonl(0xFE800000) ||
((a->s6_addr32[0] & htonl(0xFF000000)) == htonl(0xFF000000) &&
 ((a)->s6_addr[1] & 0x0f) == IPV6_ADDR_SCOPE_LINKLOCAL));
}

That's not that clean an inline anymore, but still doable...

I'll clean-up the rest based on your comments and re-send.

-Brian


Re: [PATCH 4/4] Add loopback address type inline

2007-04-06 Thread Brian Haley

YOSHIFUJI Hideaki /  wrote:

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 32c6398..06ee92d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1067,7 +1067,6 @@ int ip6_route_add(struct fib6_config *cfg)
struct net_device *dev = NULL;
struct inet6_dev *idev = NULL;
struct fib6_table *table;
-   int addr_type;

if (cfg->fc_dst_len > 128 || cfg->fc_src_len > 128)
return -EINVAL;
@@ -1108,9 +1107,7 @@ int ip6_route_add(struct fib6_config *cfg)
cfg->fc_protocol = RTPROT_BOOT;
rt->rt6i_protocol = cfg->fc_protocol;

-   addr_type = ipv6_addr_type(&cfg->fc_dst);
-
-   if (addr_type & IPV6_ADDR_MULTICAST)
+   if (ipv6_addr_type_multicast(&cfg->fc_dst))
rt->u.dst.input = ip6_mc_input;
else
rt->u.dst.input = ip6_forward;


different commit...


This and the previous patch were layered, and I couldn't add the rest of 
this change without the loopback inline:



@@ -1133,7 +1130,8 @@ int ip6_route_add(struct fib6_config *cfg)
   they would result in kernel looping; promote them to reject routes
 */
if ((cfg->fc_flags & RTF_REJECT) ||
-   (dev && (dev->flags&IFF_LOOPBACK) && !(addr_type&IPV6_ADDR_LOOPBACK))) {
+   (dev && (dev->flags&IFF_LOOPBACK) &&
+!ipv6_addr_loopback(&cfg->fc_dst))) {
/* hold loopback dev/idev if we haven't done so. */
if (dev != loopback_dev) {
if (dev) {


because they both used addr_type.

I'll put this all in one patch together next time so it's more obvious.

-Brian


[PATCH 2/4] [IPv6] Add multicast address type inline

2007-04-05 Thread Brian Haley

Add multicast address type inline to avoid calls to ipv6_addr_type().

Signed-off-by: Brian Haley [EMAIL PROTECTED]
---
 include/net/ipv6.h|5 +
 net/ipv6/icmp.c   |   12 
 net/ipv6/ip6_tunnel.c |4 ++--
 net/ipv6/route.c  |4 ++--
 4 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d473789..a888b0e 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -439,6 +439,11 @@ static inline int ipv6_addr_scope_sitelocal(const struct in6_addr *a)
return ((a->s6_addr32[0] & htonl(0xFFC00000)) == htonl(0xFEC00000));
 }

+static inline int ipv6_addr_type_multicast(const struct in6_addr *a)
+{
+   return ((a->s6_addr32[0] & htonl(0xFF000000)) == htonl(0xFF000000));
+}
+
 /*
  * Prototypes exported by ipv6
  */
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index e94992a..709037f 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -312,7 +312,6 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, 
__u32 info,
struct flowi fl;
struct icmpv6_msg msg;
int iif = 0;
-   int addr_type = 0;
int len;
int hlimit, tclass;
int err = 0;
@@ -327,8 +326,6 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, 
__u32 info,
 *  Rule (e.1) is enforced by not using icmpv6_send
 *  in any code that processes icmp errors.
 */
-   addr_type = ipv6_addr_type(&hdr->daddr);
-
if (ipv6_chk_addr(&hdr->daddr, skb->dev, 0))
saddr = &hdr->daddr;

@@ -336,7 +333,7 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, 
__u32 info,
 *  Dest addr check
 */

-   if ((addr_type & IPV6_ADDR_MULTICAST || skb->pkt_type != PACKET_HOST)) {
+   if (ipv6_addr_type_multicast(&hdr->daddr) || skb->pkt_type != PACKET_HOST) {
if (type != ICMPV6_PKT_TOOBIG &&
!(type == ICMPV6_PARAMPROB &&
  code == ICMPV6_UNK_OPTION &&
@@ -346,13 +343,11 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, 
__u32 info,
saddr = NULL;
}

-   addr_type = ipv6_addr_type(&hdr->saddr);
-
/*
 *  Source addr check
 */

-   if (addr_type & IPV6_ADDR_LINKLOCAL)
+   if (ipv6_addr_scope_linklocal(&hdr->saddr))
iif = skb->dev->ifindex;

/*
@@ -361,7 +356,8 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, 
__u32 info,
 *  We check unspecified / multicast addresses here,
 *  and anycast addresses will be checked later.
 */
-   if ((addr_type == IPV6_ADDR_ANY) || (addr_type & IPV6_ADDR_MULTICAST)) {
+   if (ipv6_addr_any(&hdr->saddr) ||
+   ipv6_addr_type_multicast(&hdr->saddr)) {
LIMIT_NETDEBUG(KERN_DEBUG "icmpv6_send: addr_any/mcast source\n");
return;
}
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0902fb..0dd1f63 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -,8 +,8 @@ static void ip6_tnl_link_config(struct ip6_tnl *t)
dev->iflink = p->link;

if (p->flags & IP6_TNL_F_CAP_XMIT) {
-   int strict = (ipv6_addr_type(&p->raddr) &
- (IPV6_ADDR_MULTICAST|IPV6_ADDR_LINKLOCAL));
+   int strict = ipv6_addr_type_multicast(&p->raddr) ||
+ipv6_addr_scope_linklocal(&p->raddr);

struct rt6_info *rt = rt6_lookup(&p->raddr, &p->laddr,
 p->link, strict);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 53d79ac..32c6398 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -227,8 +227,8 @@ static __inline__ int rt6_check_expired(const struct 
rt6_info *rt)

 static inline int rt6_need_strict(struct in6_addr *daddr)
 {
-   return (ipv6_addr_type(daddr) &
-   (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL));
+   return (ipv6_addr_is_multicast(daddr) ||
+   ipv6_addr_scope_linklocal(daddr));
 }

 /*



[PATCH 0/4] [IPv6] Add new scope and address type inlines

2007-04-05 Thread Brian Haley

[sorry if anyone got these multiple times, git-send-email weirdness]

This set of patches adds new IPv6 scope and address type inlines
to both clean-up the code (inspired by Arnaldo's skb cleanup) and reduce
calls to ipv6_addr_type() when we can just compare the address directly.
No functionality is changed.
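As a rough illustration of what "compare the address directly" means, here is a user-space sketch of two such checks done bytewise. addr_v4mapped(), addr_multicast() and parse6() are names invented for this example and are not the inlines from the patches.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* v4-mapped: first 80 bits zero, next 16 bits all ones (::ffff:a.b.c.d). */
static int addr_v4mapped(const struct in6_addr *a)
{
	static const unsigned char prefix[12] = {
		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
	};

	assert(a != NULL);
	return memcmp(a->s6_addr, prefix, sizeof(prefix)) == 0;
}

/* multicast: ff00::/8 */
static int addr_multicast(const struct in6_addr *a)
{
	assert(a != NULL);
	return a->s6_addr[0] == 0xff;
}

/* Test helper: parse text that is known to be a valid IPv6 address. */
static struct in6_addr parse6(const char *text)
{
	struct in6_addr a;
	int ok = inet_pton(AF_INET6, text, &a);

	assert(ok == 1);
	(void)ok;
	return a;
}
```

Each check touches only the bytes it needs, which is the saving over classifying the whole address first and then masking the result.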

I'm only cc'ing the DCCP and LkSCTP lists on the patches that actually
touch their code.

-Brian



[PATCH 3/4] Add mapped address type inline

2007-04-05 Thread Brian Haley

Add mapped address type inline to avoid calls to ipv6_addr_type().

Signed-off-by: Brian Haley [EMAIL PROTECTED]
---
 include/net/ipv6.h   |6 ++
 net/ipv6/ip6_flowlabel.c |6 ++
 net/ipv6/ipv6_sockglue.c |2 +-
 net/ipv6/tcp_ipv6.c  |   13 +
 net/ipv6/udp.c   |2 +-
 net/sctp/ipv6.c  |4 ++--
 6 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index a888b0e..f3e13db 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -444,6 +444,12 @@ static inline int ipv6_addr_type_multicast(const struct in6_addr *a)
return ((a->s6_addr32[0] & htonl(0xFF000000)) == htonl(0xFF000000));
 }

+static inline int ipv6_addr_type_mapped(const struct in6_addr *a)
+{
+   return ((a->s6_addr32[0] | a->s6_addr32[1]) == 0 &&
+a->s6_addr32[2] == htonl(0x0000FFFF));
+}
+
 /*
  * Prototypes exported by ipv6
  */
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index c206a15..b1bd088 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -282,7 +282,6 @@ fl_create(struct in6_flowlabel_req *freq, char __user 
*optval, int optlen, int *
 {
struct ip6_flowlabel *fl;
int olen;
-   int addr_type;
int err;

err = -ENOMEM;
@@ -328,9 +327,8 @@ fl_create(struct in6_flowlabel_req *freq, char __user 
*optval, int optlen, int *
if (err)
goto done;
fl->share = freq->flr_share;
-   addr_type = ipv6_addr_type(&freq->flr_dst);
-   if ((addr_type&IPV6_ADDR_MAPPED)
-   || addr_type == IPV6_ADDR_ANY) {
+   if (ipv6_addr_type_mapped(&freq->flr_dst) ||
+   ipv6_addr_any(&freq->flr_dst)) {
err = -EINVAL;
goto done;
}
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index aa3d07c..d83e982 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -249,7 +249,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, 
int optname,
}

if (ipv6_only_sock(sk) ||
-   !(ipv6_addr_type(&np->daddr) & IPV6_ADDR_MAPPED)) {
+   !ipv6_addr_type_mapped(&np->daddr)) {
retv = -EADDRNOTAVAIL;
break;
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 537978c..a47d23d 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -132,7 +132,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr 
*uaddr,
struct in6_addr *saddr = NULL, *final_p = NULL, final;
struct flowi fl;
struct dst_entry *dst;
-   int addr_type;
int err;

if (addr_len < SIN6_LEN_RFC2133)
@@ -163,12 +162,10 @@ static int tcp_v6_connect(struct sock *sk, struct 
sockaddr *uaddr,
if(ipv6_addr_any(&usin->sin6_addr))
usin->sin6_addr.s6_addr[15] = 0x1;

-   addr_type = ipv6_addr_type(&usin->sin6_addr);
-
-   if(addr_type & IPV6_ADDR_MULTICAST)
+   if (ipv6_addr_type_multicast(&usin->sin6_addr))
return -ENETUNREACH;

-   if (addr_type&IPV6_ADDR_LINKLOCAL) {
+   if (ipv6_addr_scope_linklocal(&usin->sin6_addr)) {
if (addr_len >= sizeof(struct sockaddr_in6) &&
usin->sin6_scope_id) {
/* If interface is set while binding, indices
@@ -200,7 +197,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr 
*uaddr,
 *  TCP over IPv4
 */

-   if (addr_type == IPV6_ADDR_MAPPED) {
+   if (ipv6_addr_type_mapped(&usin->sin6_addr)) {
u32 exthdrlen = icsk->icsk_ext_hdr_len;
struct sockaddr_in sin;

@@ -703,7 +703,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char 
__user *optval,
if (!cmd.tcpm_keylen) {
if (!tcp_sk(sk)->md5sig_info)
return -ENOENT;
-   if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED)
+   if (ipv6_addr_type_mapped(&sin6->sin6_addr))
return tcp_v4_md5_do_del(sk, sin6->sin6_addr.s6_addr32[3]);
return tcp_v6_md5_do_del(sk, &sin6->sin6_addr);
}
@@ -725,7 +725,7 @@ static int tcp_v6_parse_md5_keys (struct sock *sk, char 
__user *optval,
newkey = kmemdup(cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
if (!newkey)
return -ENOMEM;
-   if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_MAPPED) {
+   if (ipv6_addr_type_mapped(&sin6->sin6_addr)) {
return tcp_v4_md5_do_add(sk, sin6->sin6_addr.s6_addr32[3],
 newkey, cmd.tcpm_keylen);
}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index c0b5fe3..6636431 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -610,7 +610,7 @@ int udpv6_sendmsg(struct kiocb *iocb, struct sock *sk,
daddr = NULL;

if (daddr

[PATCH 1/4] [IPv6] Add link and site-local scope inline

2007-04-05 Thread Brian Haley

Add link and site-local scope inline to avoid calls to ipv6_addr_type().

Signed-off-by: Brian Haley [EMAIL PROTECTED]
---
 include/net/ipv6.h   |   10 ++
 net/dccp/ipv6.c  |2 +-
 net/ipv6/addrconf.c  |6 +++---
 net/ipv6/af_inet6.c  |2 +-
 net/ipv6/datagram.c  |   11 ---
 net/ipv6/inet6_connection_sock.c |2 +-
 net/ipv6/ip6_output.c|2 +-
 net/ipv6/mcast.c |8 +++-
 net/ipv6/ndisc.c |8 
 net/ipv6/raw.c   |4 ++--
 net/ipv6/tcp_ipv6.c  |2 +-
 net/ipv6/udp.c   |4 ++--
 net/sctp/ipv6.c  |   16 +++-
 13 files changed, 40 insertions(+), 37 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 00328b7..d473789 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -429,6 +429,16 @@ static inline int ipv6_addr_diff(const struct in6_addr *a1, const struct in6_add
return __ipv6_addr_diff(a1, a2, sizeof(struct in6_addr));
 }

+static inline int ipv6_addr_scope_linklocal(const struct in6_addr *a)
+{
+   return ((a->s6_addr32[0] & htonl(0xFFC00000)) == htonl(0xFE800000));
+}
+
+static inline int ipv6_addr_scope_sitelocal(const struct in6_addr *a)
+{
+   return ((a->s6_addr32[0] & htonl(0xFFC00000)) == htonl(0xFEC00000));
+}
+
 /*
  * Prototypes exported by ipv6
  */
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 64eac25..14a0f12 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -476,7 +476,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct 
sk_buff *skb)

/* So that link locals have meaning */
if (!sk->sk_bound_dev_if &&
-   ipv6_addr_type(&ireq6->rmt_addr) & IPV6_ADDR_LINKLOCAL)
+   ipv6_addr_scope_linklocal(&ireq6->rmt_addr))
ireq6->iif = inet6_iif(skb);

/*
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 47d3adf..2d4fe24 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2634,7 +2634,7 @@ static void addrconf_dad_completed(struct inet6_ifaddr 
*ifp)
if (ifp->idev->cnf.forwarding == 0 &&
ifp->idev->cnf.rtr_solicits > 0 &&
(dev->flags&IFF_LOOPBACK) == 0 &&
-   (ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL)) {
+   ipv6_addr_scope_linklocal(&ifp->addr)) {
struct in6_addr all_routers;

ipv6_addr_all_routers(&all_routers);
@@ -3155,7 +3155,7 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, 
struct ifmcaddr6 *ifmca,
u8 scope = RT_SCOPE_UNIVERSE;
int ifindex = ifmca->idev->dev->ifindex;

-   if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE)
+   if (ipv6_addr_scope_sitelocal(&ifmca->mca_addr))
scope = RT_SCOPE_SITE;

nlh = nlmsg_put(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags);
@@ -3180,7 +3180,7 @@ static int inet6_fill_ifacaddr(struct sk_buff *skb, 
struct ifacaddr6 *ifaca,
u8 scope = RT_SCOPE_UNIVERSE;
int ifindex = ifaca->aca_idev->dev->ifindex;

-   if (ipv6_addr_scope(&ifaca->aca_addr) & IFA_SITE)
+   if (ipv6_addr_scope_sitelocal(&ifaca->aca_addr))
scope = RT_SCOPE_SITE;

nlh = nlmsg_put(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index df31cdd..24618cf 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -431,7 +431,7 @@ int inet6_getname(struct socket *sock, struct sockaddr 
*uaddr,

sin->sin6_port = inet->sport;
}
-   if (ipv6_addr_type(&sin->sin6_addr) & IPV6_ADDR_LINKLOCAL)
+   if (ipv6_addr_scope_linklocal(&sin->sin6_addr))
sin->sin6_scope_id = sk->sk_bound_dev_if;
*uaddr_len = sizeof(*sin);
return(0);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 4a355fe..a8612b2 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -323,7 +323,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, 
int len)
sin->sin6_flowinfo =
(*(__be32 *)(nh + serr->addr_offset - 24) &
 IPV6_FLOWINFO_MASK);
-   if (ipv6_addr_type(&sin->sin6_addr) & IPV6_ADDR_LINKLOCAL)
+   if (ipv6_addr_scope_linklocal(&sin->sin6_addr))
sin->sin6_scope_id = IP6CB(skb)->iif;
} else {
ipv6_addr_set(&sin->sin6_addr, 0, 0,
@@ -343,7 +343,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, 
int len)
ipv6_addr_copy(&sin->sin6_addr, &ipv6_hdr(skb)->saddr);
if (np->rxopt.all)
datagram_recv_ctl(sk, msg, skb);
-   if (ipv6_addr_type(&sin->sin6_addr) & IPV6_ADDR_LINKLOCAL)
+   if (ipv6_addr_scope_linklocal(&sin->sin6_addr

Re: [PATCH 3/3] bonding: Improve IGMP join processing

2007-03-06 Thread Brian Haley

Andy Gospodarek wrote:

If we are easily able to differentiate between the multicast addresses
in the mc_list as to which are for ipv4 and which are for ipv6 then it
would be easy to call-out to something in the ipv6 mcast code when
needed instead of always calling out to ipv4 code.


I've been unable to figure out exactly what you're referring to in the 
code (bond_main.c), it seems to failover all multicast addresses, 
regardless of what address family they are.  I might have missed 
something in 4K lines of code though?


-Brian


Re: [PATCH 3/3] bonding: Improve IGMP join processing

2007-03-01 Thread Brian Haley

Jay Vosburgh wrote:

My only concern is that this code assumes all mcast addresses stored in
the dev->mc_list list are IPv4 IGMP mcast addresses and nothing was done
for IPv6.

But this is much better than what we have now, so... 


Agreed, but there's no IPv6 support anywhere in bonding at
present (for unicast or multicast), so this isn't really a loss.


So forgive my naive question, but what would it take to make IPv6 work? 
 I know DAD fails on a test setup I have, but I haven't dug-into why 
this is (I can guess), and I'd like to see it working.  I'm willing to 
help, even if just to get it limping along.


-Brian


Re: [PATCH] IPv6: Implement RFC 4429 Optimistic Duplicate Address Detection

2007-02-05 Thread Brian Haley

Please, if you think you can find a way for us to do optimistic dad flags as
opt-in, rather than masked out, I'm all for it.  Thanks!


This patch should apply on-top of yours, if you want I can send the 
whole thing out too.  I've only compile-tested it, so don't know if it 
behaves the same as your original.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c341371..ddac8b0 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -593,13 +593,8 @@ ipv6_add_addr(struct inet6_dev *idev, co
 	ifa->cstamp = ifa->tstamp = jiffies;
 
 	ifa->rt = rt;
-#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
-	if (!idev->cnf.optimistic_dad || ipv6_devconf.forwarding ||
-	   (ifa->rt->rt6i_nexthop == NULL))
+	if (rt->rt6i_nexthop == NULL)
 		ifa->flags &= ~IFA_F_OPTIMISTIC;
-#else
-	ifa->flags &= ~IFA_F_OPTIMISTIC;
-#endif
 	ifa->idev = idev;
 	in6_dev_hold(idev);
 	/* For caller */
@@ -776,6 +771,7 @@ static int ipv6_create_tempaddr(struct i
 	int tmp_plen;
 	int ret = 0;
 	int max_addresses;
+	u32 addr_flags;
 
 	write_lock(&idev->lock);
 	if (ift) {
@@ -833,11 +829,17 @@ retry:
 	spin_unlock_bh(&ifp->lock);
 
 	write_unlock(&idev->lock);
+
+	addr_flags = IFA_F_TEMPORARY;
+	/* set in addrconf_prefix_rcv() */
+	if (ifp->flags & IFA_F_OPTIMISTIC)
+		addr_flags |= IFA_F_OPTIMISTIC;
+
 	ift = !max_addresses ||
 	      ipv6_count_addresses(idev) < max_addresses ?
 		ipv6_add_addr(idev, &addr, tmp_plen,
 			      ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
-			      IFA_F_TEMPORARY|IFA_F_OPTIMISTIC) : NULL;
+			      addr_flags) : NULL;
 	if (!ift || IS_ERR(ift)) {
 		in6_ifa_put(ifp);
 		in6_dev_put(idev);
@@ -1746,6 +1748,13 @@ ok:
 
 		if (ifp == NULL && valid_lft) {
 			int max_addresses = in6_dev->cnf.max_addresses;
+			u32 addr_flags = 0;
+
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+			if (in6_dev->cnf.optimistic_dad &&
+			    !ipv6_devconf.forwarding)
+				addr_flags = IFA_F_OPTIMISTIC;
+#endif
 
 			/* Do not allow to create too much of autoconfigured
 			 * addresses; this would be too easy way to crash kernel.
@@ -1753,7 +1762,8 @@ ok:
 			if (!max_addresses ||
 			    ipv6_count_addresses(in6_dev) < max_addresses)
 				ifp = ipv6_add_addr(in6_dev, &addr, pinfo->prefix_len,
-						    addr_type&IPV6_ADDR_SCOPE_MASK, 0);
+						    addr_type&IPV6_ADDR_SCOPE_MASK,
+						    addr_flags);
 
 			if (!ifp || IS_ERR(ifp)) {
 				in6_dev_put(in6_dev);
@@ -1762,10 +1772,6 @@ ok:
 
 			update_lft = create = 1;
 			ifp->cstamp = jiffies;
-#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
-			if (ifp->idev->cnf.optimistic_dad)
-				ifp->flags |= IFA_F_OPTIMISTIC;
-#endif
 			addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT);
 		}
 
@@ -2141,9 +2147,16 @@ static void init_loopback(struct net_dev
 static void addrconf_add_linklocal(struct inet6_dev *idev, struct in6_addr *addr)
 {
 	struct inet6_ifaddr * ifp;
+	u32 addr_flags = IFA_F_PERMANENT;
+
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+	if (idev->cnf.optimistic_dad &&
+	    !ipv6_devconf.forwarding)
+		addr_flags |= IFA_F_OPTIMISTIC;
+#endif
+
 
-	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, 
-		IFA_F_PERMANENT|IFA_F_OPTIMISTIC);
+	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, addr_flags);
 	if (!IS_ERR(ifp)) {
 		addrconf_dad_start(ifp, 0);
 		in6_ifa_put(ifp);


Re: [PATCH] IPv6: Implement RFC 4429 Optimistic Duplicate Address Detection

2007-02-02 Thread Brian Haley

Hi Neil,


@@ -830,7 +836,8 @@ retry:
ift = !max_addresses ||
 	  ipv6_count_addresses(idev) < max_addresses ?
 		ipv6_add_addr(idev, &addr, tmp_plen,
-			  ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK, IFA_F_TEMPORARY) : NULL;
+			  ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
+			  IFA_F_TEMPORARY|IFA_F_OPTIMISTIC) : NULL;


So why are you always adding these as optimistic now?  Shouldn't this be 
triggering off idev->cnf.optimistic_dad?  I know you're clearing it in 
ipv6_add_addr(), but I liked Vlad's suggestion of not setting it 
initially since this way seems backwards.



@@ -2123,7 +2142,8 @@ static void addrconf_add_linklocal(struct inet6_dev 
*idev, struct in6_addr *addr
 {
struct inet6_ifaddr * ifp;
 
-	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT);
+	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, 
+		IFA_F_PERMANENT|IFA_F_OPTIMISTIC);


Here too.

-Brian


Re: [PATCH] IPv6: Implement RFC 4429 Optimistic Duplicate Address Detection

2007-02-02 Thread Brian Haley

Hi Vlad,

Vlad Yasevich wrote:

Brian Haley wrote:

Hi Neil,


@@ -830,7 +836,8 @@ retry:
 ift = !max_addresses ||
   ipv6_count_addresses(idev) < max_addresses ?
ipv6_add_addr(idev, &addr, tmp_plen,
-  ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK, IFA_F_TEMPORARY) : NULL;
+  ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
+  IFA_F_TEMPORARY|IFA_F_OPTIMISTIC) : NULL;


Hi Brian


So why are you always adding these as optimistic now?  Shouldn't this be
triggering off idev->cnf.optimistic_dad?  I know you're clearing it in
ipv6_add_addr(), but I liked Vlad's suggestion of not setting it
initially since this way seems backwards.


The troubling case seems to be manually configured addresses (inet6_addr_add()).
If we can clearly and easily distinguish between this case of address
and all the other ones, then we can simply set the flag in ipv6_add_addr, like
we set the tentative flag.


Right, I guess maybe I'm trying to figure out what 
idev->cnf.optimistic_dad means:


1. Interface supports OPTIMISTIC addresses
2. All auto-configured addresses on interface are OPTIMISTIC
3. ???

All other addresses are created w/out OPTIMISTIC set.

I think manually-configured addresses can be tagged as OPTIMISTIC just 
like MIPv6 Home Addresses are if we just change this line in 
inet6_rtm_newaddr():


 ifa_flags = ifm->ifa_flags & (IFA_F_NODAD | IFA_F_HOMEADDRESS);
--
 ifa_flags = ifm->ifa_flags & (IFA_F_NODAD | IFA_F_HOMEADDRESS | 
IFA_F_OPTIMISTIC);


and tweak the rest of the code, but that doesn't cover the 
addrconf_add_ifaddr() codepath via ioctl(SIOCSIFADDR).


I can generate a patch based-on Neil's, but it will take me until Monday 
to get it out.


-Brian


Re: [PATCH 2/4] [SCTP]: Verify some mandatory parameters.

2007-01-17 Thread Brian Haley

--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -462,24 +461,6 @@ sctp_disposition_t sctp_sf_do_5_1C_ack(const struct 
sctp_endpoint *ep,



-   if (!init_tag) {
-   struct sctp_chunk *reply = sctp_make_abort(asoc, chunk, 0);
-   if (!reply)
-   goto nomem;


This introduced a compiler warning, easily fixed.

-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index fce1f60..fbbc9e6 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -531,9 +531,6 @@ sctp_disposition_t sctp_sf_do_5_1C_ack(c
 			SCTP_CHUNK(err_chunk));
 
 	return SCTP_DISPOSITION_CONSUME;
-
-nomem:
-	return SCTP_DISPOSITION_NOMEM;
 }
 
 /*


Re: [Bugme-new] [Bug 7665] New: getsockopt(IPV6_*CAST_HOPS) returns -1

2006-12-13 Thread Brian Haley

David Miller wrote:

I wonder, since the most accurate return value is tied to the route,
what is expected of this getsockopt() before a socket's identity
(and therefore route) is known?


A search for RTAX_HOPLIMIT found very little code that ever sets it, 
iproute2 was the only important one, so the interface/system default is 
probably the only one that's ever used.


-Brian


Re: Fw: [Bugme-new] [Bug 7665] New: getsockopt(IPV6_*CAST_HOPS) returns -1

2006-12-11 Thread Brian Haley

Andrew Morton wrote:

Where fd is a socket (datagram or raw) with IPv6 protocol family,
getsockopt(fd, IPPROTO_IPV6, IPV6_UNICAST_HOPS, ...) succeeds, but the returned 
hop limit is -1. connect()'ing the socket first does not solve the problem.


An IPv6 socket's hoplimit value is not set at creation time, instead, 
the hoplimit in an outgoing packet is set dynamically at transmit time 
to one of the following (in this order):


1. Hoplimit route metric (if set)
2. Outgoing interface value (/proc/sys/net/ipv6/conf/ethX/hop_limit)
3. Global IPv6 value (/proc/sys/net/ipv6/conf/all/hop_limit)

A setsockopt() value *will* override this.

Some *nixes have a different behavior and do set it at socket() creation 
time.
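
The precedence described above can be sketched in plain C.  This is a
userspace illustration only, not the kernel code: the struct and
function names are hypothetical, and -1 stands for "not set", matching
the -1 that getsockopt() reports in the bug.

```c
/*
 * Hypothetical sketch of the hop limit precedence described above.
 * A value of -1 means "not set" for that source.
 */
struct hoplimit_sources {
	int sockopt;      /* value from setsockopt(), or -1 */
	int route_metric; /* hoplimit route metric, or -1 */
	int iface;        /* outgoing interface hop_limit, or -1 */
	int global;       /* global IPv6 hop_limit, assumed always set */
};

static int effective_hoplimit(const struct hoplimit_sources *s)
{
	if (s->sockopt >= 0)		/* an explicit setsockopt() wins */
		return s->sockopt;
	if (s->route_metric >= 0)	/* 1. route metric, if set */
		return s->route_metric;
	if (s->iface >= 0)		/* 2. outgoing interface value */
		return s->iface;
	return s->global;		/* 3. global default */
}
```

With nothing but the global value set, the sketch returns the global
default; each more specific source overrides the ones below it.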


-Brian


Re: [PATCH 2/2] [IPV6] RAW: Add checksum default defines for mobility header.

2006-12-05 Thread Brian Haley

YOSHIFUJI Hideaki wrote:

Add checksum default defines for mobility header(MH).
As the result kernel's behavior is to handle MH checksum
as default.


I'd like to hold this on.  I need to check RFC.


That looks correct according to RFC 3775.

-Brian


Re: Network virtualization/isolation

2006-11-29 Thread Brian Haley

Eric W. Biederman wrote:

I think for cases across network socket namespaces it should
be a matter for the rules, to decide if the connection should
happen and what error code to return if the connection does not
happen.

There is a potential in this to have an ambiguous case where two
applications can be listening for connections on the same socket
on the same port and both will allow the connection.  If that
is the case I believe the proper definition is the first socket
that we find that will accept the connection gets the connection.


Wouldn't you want to catch this at bind() and/or configuration time and 
fail?  Having overlapping namespaces/rules seems undesirable since, as 
Herbert said, it can get you unexpected behaviour.



I think with the appropriate set of rules it provides what is needed
for application migration.  I.e. 127.0.0.1 can be filtered so that
you can only connect to sockets in your current container.

It does get a little odd because it does allow for the possibility
that you can have multiple connected sockets with same source ip,
source port, destination ip, destination port.  If the rules are
setup appropriately.  I don't see that peculiarity being visible on
the outside network so it shouldn't be a problem.


So if they're using the same protocol (eg TCP), how is it decided which 
one gets an incoming packet?  Maybe I'm missing something as I don't 
understand your inside/outside network reference - is that to the 
loopback address comment in the previous paragraph?


Thanks,

-Brian


Re: [PATCH] IPv6: only modify checksum for UDP

2006-11-11 Thread Brian Haley

This is a discussion about Brian's SECOND PATCH which needs
fixups.


Please forget about the second patch, optimizing this code path isn't 
worth it if the -1 trick doesn't work in all cases.


-Brian


[PATCH] IPv6: only modify checksum for UDP

2006-11-10 Thread Brian Haley
Only change upper-layer checksum from 0 to 0xffff for UDP (as RFC 768 
states), not for others as RFC 4443 doesn't require it.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 81bd45b..dbb9b1f 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -246,8 +246,6 @@ static int icmpv6_push_pending_frames(st
 	   len, fl->proto, tmp_csum);
 		icmp6h->icmp6_cksum = tmp_csum;
 	}
-	if (icmp6h->icmp6_cksum == 0)
-		icmp6h->icmp6_cksum = -1;
 	ip6_push_pending_frames(sk);
 out:
 	return err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 6bc6655..baf7b82 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -536,7 +536,7 @@ static int rawv6_push_pending_frames(str
    &fl->fl6_dst,
    total_len, fl->proto, tmp_csum);
 
-	if (tmp_csum == 0)
+	if (tmp_csum == 0 && fl->proto == IPPROTO_UDP)
 		tmp_csum = -1;
 
 	csum = tmp_csum;


[PATCH] IPv6: optimize echo reply checksum calculation

2006-11-10 Thread Brian Haley
Since the only difference between echo requests and echo replies is the 
ICMPv6 type value (which is a difference of 1), just subtracting one 
from the request checksum will result in the correct checksum for the reply.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index dbb9b1f..ee04610 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -212,7 +212,7 @@ static __inline__ int opt_unrec(struct s
 	return (*op & 0xC0) == 0x80;
 }
 
-static int icmpv6_push_pending_frames(struct sock *sk, struct flowi *fl, struct icmp6hdr *thdr, int len)
+static int icmpv6_push_pending_frames(struct sock *sk, struct flowi *fl, struct icmp6hdr *thdr, int len, int cksum_needed)
 {
 	struct sk_buff *skb;
 	struct icmp6hdr *icmp6h;
@@ -223,7 +223,9 @@ static int icmpv6_push_pending_frames(st
 
 	icmp6h = (struct icmp6hdr*) skb->h.raw;
 	memcpy(icmp6h, thdr, sizeof(struct icmp6hdr));
-	icmp6h->icmp6_cksum = 0;
+
+	if (!cksum_needed)
+		goto sendit;
 
 	if (skb_queue_len(&sk->sk_write_queue) == 1) {
 		skb->csum = csum_partial((char *)icmp6h,
@@ -246,6 +248,7 @@ static int icmpv6_push_pending_frames(st
 	   len, fl->proto, tmp_csum);
 		icmp6h->icmp6_cksum = tmp_csum;
 	}
+sendit:
 	ip6_push_pending_frames(sk);
 out:
 	return err;
@@ -451,7 +454,7 @@ void icmpv6_send(struct sk_buff *skb, in
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
-	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, len + sizeof(struct icmp6hdr));
+	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, len + sizeof(struct icmp6hdr), 1);
 
 	if (type >= ICMPV6_DEST_UNREACH && type <= ICMPV6_PARAMPROB)
 		ICMP6_INC_STATS_OFFSET_BH(idev, ICMP6_MIB_OUTDESTUNREACHS, type - ICMPV6_DEST_UNREACH);
@@ -489,6 +492,14 @@ static void icmpv6_echo_reply(struct sk_
 	memcpy(&tmp_hdr, icmph, sizeof(tmp_hdr));
 	tmp_hdr.icmp6_type = ICMPV6_ECHO_REPLY;
 
+	/*
+	 * The only difference between echo requests and echo replies is the
+	 * ICMPv6 type value (which is a difference of 1).  So if we subtract
+	 * one from the request checksum, it will result in the correct
+	 * checksum for the reply.
+	 */
+	tmp_hdr.icmp6_cksum--;
+
 	memset(&fl, 0, sizeof(fl));
 	fl.proto = IPPROTO_ICMPV6;
 	ipv6_addr_copy(&fl.fl6_dst, &skb->nh.ipv6h->saddr);
@@ -540,7 +551,7 @@ static void icmpv6_echo_reply(struct sk_
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
-	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, skb->len + sizeof(struct icmp6hdr));
+	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, skb->len + sizeof(struct icmp6hdr), 0);
 
 ICMP6_INC_STATS_BH(idev, ICMP6_MIB_OUTECHOREPLIES);
 ICMP6_INC_STATS_BH(idev, ICMP6_MIB_OUTMSGS);


Re: [PATCH] IPv6: optimize echo reply checksum calculation

2006-11-10 Thread Brian Haley

Al Viro wrote:

On Fri, Nov 10, 2006 at 11:25:53AM -0500, Brian Haley wrote:
Since the only difference between echo requests and echo replies is the 
ICMPv6 type value (which is a difference of 1), just subtracting one 
from the request checksum will result in the correct checksum for the reply.


Um, no.  That will *not* result in correct checksum.  Please, RTFRFC 1071.


I verified this works for echo request/reply on my IA64 box, 
double-checked with ethereal/wireshark.  Is there something specific in 
RFC 1071 that I should be looking for?


-Brian


Re: [PATCH] IPv6: optimize echo reply checksum calculation

2006-11-10 Thread Brian Haley

Al Viro wrote:

so -= 1 is broken even on ia64 and it's *always* broken on big-endian
boxen.


It's not broken in ia64, I've tested that, just don't have an x86 for 
testing right now.  Can you please apply these changes and prove it's 
broken?  This little trick has been done in other UNIXes for years 
without any problems.


-Brian


Re: [PATCH] IPv6: optimize echo reply checksum calculation

2006-11-10 Thread Brian Haley

Al Viro wrote:

On Fri, Nov 10, 2006 at 02:04:32PM -0500, Brian Haley wrote:

Al Viro wrote:

so -= 1 is broken even on ia64 and it's *always* broken on big-endian
boxen.
It's not broken in ia64, I've tested that, just don't have an x86 for 
testing right now.  Can you please apply these changes and prove it's 
broken?  This little trick has been done in other UNIXes for years 
without any problems.


Could you fscking read what you've replied to?  Your -=1 will turn 0
into 0xffff instead of correct 0xfffe.  IOW, it's broken in 1:65536
cases.


I looked again at your previous email:


Note that even on little-endian you want
3 -> 2
2 -> 1
1 -> 0xffff
0 -> 0xfffe


That doesn't look right to me, but I'll take your word that there's one 
edge case out there I don't see (even though this worked on Alpha). 
Forget about the patch then.
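
For readers following the thread, the edge case can be reproduced in
userspace.  The sketch below is an illustration under stated
assumptions, not kernel code: all function names are hypothetical, the
words are written as a little-endian host would load them (so the ICMPv6
type is the low byte, 0x0080 for a request and 0x0081 for a reply), and
the incremental update follows RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m').

```c
#include <stddef.h>
#include <stdint.h>

/* Fold carries back into the low 16 bits (end-around carry). */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full one's complement checksum over 16-bit words; the checksum
 * field itself is left out of the array (i.e. treated as zero). */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~csum_fold(sum);
}

/* Incremental update per RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t csum_update(uint16_t hc, uint16_t m_old, uint16_t m_new)
{
	uint32_t sum = (uint16_t)~hc;

	sum += (uint16_t)~m_old;
	sum += m_new;
	return (uint16_t)~csum_fold(sum);
}

/* The shortcut from the patch: a plain two's complement decrement. */
static uint16_t csum_naive_decrement(uint16_t hc)
{
	return (uint16_t)(hc - 1);
}
```

For a request whose words are { 0x0080, 0x1234 } all three approaches
agree on the reply checksum.  For { 0x0080, 0xff7f } the request
checksum is 0; the full recomputation and csum_update() both give
0xfffe for the reply, while the plain decrement gives 0xffff.  On a
big-endian host the type is the high byte of the loaded word, so a
decrement by one does not correspond to the change at all.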


-Brian


Re: why do we mangle checksums for v6 ICMP?

2006-11-09 Thread Brian Haley

Hi Al,

Al Viro wrote:

AFAICS, the rules are:

(1) checksum is 16-bit one's complement of the one's complement sum of
relevant 16bit words.

(2) for v4 UDP all-zeroes has special meaning - no checksum; if you get
it from (1), send all-ones instead.

(3) for v6 UDP we have the same remapping as in (2), but all-zeroes has
different meaning - not ignore checksum as in v4, but reject the
packet.

(4) there is no (4).

IOW, nobody except UDP has any business doing that 0->0xffff
replacement.  However, we have
   if (icmp6h->icmp6_cksum == 0)
   icmp6h->icmp6_cksum = -1;


This doesn't look necessary, RFCs 4443/2463 don't mention it being 
necessary, and BSD doesn't do it either.  I'll cook-up a patch to remove 
that since I was doing some other mods in that codepath.



and similar in net/ipv6/raw.c


Maybe here it only needs to be done if (fl->proto == IPPROTO_UDP)?
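
A minimal userspace sketch of that rule, assuming a hypothetical helper
name (finalize_csum) and the protocol constants from <netinet/in.h>:
only UDP remaps a computed checksum of zero to all-ones, every other
protocol keeps whatever was computed.

```c
#include <stdint.h>
#include <netinet/in.h>	/* IPPROTO_UDP, IPPROTO_ICMPV6 */

/*
 * Hypothetical helper: apply the "0 means no checksum" remapping
 * for UDP only, leaving other protocols' checksums untouched.
 */
static uint16_t finalize_csum(int proto, uint16_t csum)
{
	if (csum == 0 && proto == IPPROTO_UDP)
		return 0xffff;
	return csum;
}
```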

-Brian


Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver

2006-09-25 Thread Brian Haley

Joerg Roedel wrote:

Is there something in the RFC that suggests that a byte order other than
'network order' is possible/acceptable there?


No. The RFC states nothing at all about byte- or bitorder. That is why
the RFC is ambiguous at this point.


RFC 791 (IPv4) Appendix B does give instructions on byte ordering for 
all IPv4 headers and data, and RFC 791 is listed in the References for 
RFC 3378.  I noticed this is only Informational, not a Standards track 
document, so I guess the non-interoperable implementations kind of go 
with the territory.
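
As an illustration of that reading, here is a sketch of building the
16-bit EtherIP header (4-bit version of 3 followed by 12 reserved zero
bits, per RFC 3378) with the most significant bit transmitted first.
The function and macro names are hypothetical, and this is not taken
from the driver under discussion.

```c
#include <stdint.h>

#define ETHERIP_VERSION 3	/* version value required by RFC 3378 */

/*
 * Write the header byte by byte so the result does not depend on
 * host byte order: version in the top four bits of the first
 * byte, the 12 reserved bits all zero.
 */
static void etherip_write_header(uint8_t out[2])
{
	uint16_t hdr = (uint16_t)(ETHERIP_VERSION << 12);

	out[0] = (uint8_t)(hdr >> 8);
	out[1] = (uint8_t)(hdr & 0xff);
}
```

Under this interpretation the header goes out as the bytes 0x30 0x00.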


-Brian


[PATCH] make some netfilter globals __read_mostly

2006-09-19 Thread Brian Haley

Make some netfilter globals __read_mostly at the request of Patrick McHardy.

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c
index aa45917..2370245 100644
--- a/net/ipv4/netfilter/ip_conntrack_core.c
+++ b/net/ipv4/netfilter/ip_conntrack_core.c
@@ -64,17 +64,17 @@ atomic_t ip_conntrack_count = ATOMIC_INI
 
 void (*ip_conntrack_destroyed)(struct ip_conntrack *conntrack) = NULL;
 LIST_HEAD(ip_conntrack_expect_list);
-struct ip_conntrack_protocol *ip_ct_protos[MAX_IP_CT_PROTO];
+struct ip_conntrack_protocol *ip_ct_protos[MAX_IP_CT_PROTO] __read_mostly;
 static LIST_HEAD(helpers);
 unsigned int ip_conntrack_htable_size = 0;
 int ip_conntrack_max;
-struct list_head *ip_conntrack_hash;
+struct list_head *ip_conntrack_hash __read_mostly;
 static kmem_cache_t *ip_conntrack_cachep __read_mostly;
 static kmem_cache_t *ip_conntrack_expect_cachep __read_mostly;
 struct ip_conntrack ip_conntrack_untracked;
 unsigned int ip_ct_log_invalid;
 static LIST_HEAD(unconfirmed);
-static int ip_conntrack_vmalloc;
+static int ip_conntrack_vmalloc __read_mostly;
 
 static unsigned int ip_conntrack_next_id;
 static unsigned int ip_conntrack_expect_next_id;
diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c
index 276a964..de680e5 100644
--- a/net/ipv4/netfilter/ip_queue.c
+++ b/net/ipv4/netfilter/ip_queue.c
@@ -52,15 +52,15 @@ struct ipq_queue_entry {
 
 typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long);
 
-static unsigned char copy_mode = IPQ_COPY_NONE;
+static unsigned char copy_mode __read_mostly = IPQ_COPY_NONE;
 static unsigned int queue_maxlen = IPQ_QMAX_DEFAULT;
 static DEFINE_RWLOCK(queue_lock);
-static int peer_pid;
-static unsigned int copy_range;
+static int peer_pid __read_mostly;
+static unsigned int copy_range __read_mostly;
 static unsigned int queue_total;
 static unsigned int queue_dropped = 0;
 static unsigned int queue_user_dropped = 0;
-static struct sock *ipqnl;
+static struct sock *ipqnl __read_mostly;
 static LIST_HEAD(queue_list);
 static DEFINE_MUTEX(ipqnl_mutex);
 
diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c
index c01c126..c74d1cc 100644
--- a/net/ipv6/netfilter/ip6_queue.c
+++ b/net/ipv6/netfilter/ip6_queue.c
@@ -56,15 +56,15 @@ struct ipq_queue_entry {
 
 typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long);
 
-static unsigned char copy_mode = IPQ_COPY_NONE;
+static unsigned char copy_mode __read_mostly = IPQ_COPY_NONE;
 static unsigned int queue_maxlen = IPQ_QMAX_DEFAULT;
 static DEFINE_RWLOCK(queue_lock);
-static int peer_pid;
-static unsigned int copy_range;
+static int peer_pid __read_mostly;
+static unsigned int copy_range __read_mostly;
 static unsigned int queue_total;
 static unsigned int queue_dropped = 0;
 static unsigned int queue_user_dropped = 0;
-static struct sock *ipqnl;
+static struct sock *ipqnl __read_mostly;
 static LIST_HEAD(queue_list);
 static DEFINE_MUTEX(ipqnl_mutex);
 
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 8f22619..d50c52d 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -74,17 +74,17 @@ atomic_t nf_conntrack_count = ATOMIC_INI
 
 void (*nf_conntrack_destroyed)(struct nf_conn *conntrack) = NULL;
 LIST_HEAD(nf_conntrack_expect_list);
-struct nf_conntrack_protocol **nf_ct_protos[PF_MAX];
-struct nf_conntrack_l3proto *nf_ct_l3protos[PF_MAX];
+struct nf_conntrack_protocol **nf_ct_protos[PF_MAX] __read_mostly;
+struct nf_conntrack_l3proto *nf_ct_l3protos[PF_MAX] __read_mostly;
 static LIST_HEAD(helpers);
 unsigned int nf_conntrack_htable_size = 0;
 int nf_conntrack_max;
-struct list_head *nf_conntrack_hash;
-static kmem_cache_t *nf_conntrack_expect_cachep;
+struct list_head *nf_conntrack_hash __read_mostly;
+static kmem_cache_t *nf_conntrack_expect_cachep __read_mostly;
 struct nf_conn nf_conntrack_untracked;
 unsigned int nf_ct_log_invalid;
 static LIST_HEAD(unconfirmed);
-static int nf_conntrack_vmalloc;
+static int nf_conntrack_vmalloc __read_mostly;
 
 static unsigned int nf_conntrack_next_id;
 static unsigned int nf_conntrack_expect_next_id;


Re: [PATCH] change netfilter tunables to __read_mostly

2006-09-01 Thread Brian Haley

Patrick McHardy wrote:

Patrick McHardy wrote:

Brian Haley wrote:


Change some netfilter tunables to __read_mostly.  Also fixed some
incorrect file reference comments while I was in there.


Please send these kind of patches for netfilter to me so I can
make sure they don't clash with other pending patches.



(this will be my last __read_mostly patch unless someone points out
something else that needs it)


This seems to be a bit random, there are quite a few more candidates for
__read_mostly right next to the ones you marked.



Ah sorry, I didn't really get why you chose these :)

The patch doesn't clash with anything in my queue, so ACK from me.


Sorry, next time I'll send them to you, or 
[EMAIL PROTECTED]  I'll cook-up another patch for the 
others you mentioned and send it out.


-Brian


[PATCH] change max_dgram_qlen sysctl to __read_mostly

2006-08-31 Thread Brian Haley

Change AF_UNIX sysctl_unix_max_dgram_qlen to __read_mostly.

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 7c91c20..b43a278 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -117,7 +117,7 @@
 #include <net/checksum.h>
 #include <linux/security.h>
 
-int sysctl_unix_max_dgram_qlen = 10;
+int sysctl_unix_max_dgram_qlen __read_mostly = 10;
 
 struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 DEFINE_SPINLOCK(unix_table_lock);


[PATCH] change somaxconn sysctl to __read_mostly

2006-08-31 Thread Brian Haley

Change sysctl_somaxconn to __read_mostly.

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/socket.c b/net/socket.c
index f4d143c..e3d67fe 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1336,7 +1336,7 @@ asmlinkage long sys_bind(int fd, struct 
  *	ready for listening.
  */
 
-int sysctl_somaxconn = SOMAXCONN;
+int sysctl_somaxconn __read_mostly = SOMAXCONN;
 
 asmlinkage long sys_listen(int fd, int backlog)
 {


Re: [GIT PATCH] IPv6 Updates for net-2.6.19

2006-08-25 Thread Brian Haley

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c9f74c1..9b50e0c 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -703,6 +703,7 @@ void ip6_route_input(struct sk_buff *skb
.ip6_u = {
.daddr = iph->daddr,
.saddr = iph->saddr,
+   .fwmark = skb->nfmark,
.flowlabel = (* (u32 *) iph)&IPV6_FLOWINFO_MASK,
},
},


I can't build the latest 2.6.19-git with this patch, skb->nfmark 
requires CONFIG_NETFILTER, which isn't in my .config.  The obvious 
workaround is the patch below, but that might not be what you want.  Can 
send my .config if you need it.


-Brian

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 9b50e0c..dc880cc 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -703,7 +703,9 @@ void ip6_route_input(struct sk_buff *skb
 			.ip6_u = {
 .daddr = iph->daddr,
 .saddr = iph->saddr,
+#ifdef CONFIG_NETFILTER
 .fwmark = skb->nfmark,
+#endif
 .flowlabel = (* (u32 *) iph)&IPV6_FLOWINFO_MASK,
 			},
 		},


Re: [PATCH 22/44] [IPV6]: Find option offset by type.

2006-08-23 Thread Brian Haley

YOSHIFUJI Hideaki wrote:

From: Masahide NAKAMURA [EMAIL PROTECTED]

This is a helper to search option offset from extension header which
can carry TLV option like destination options header.
Mobile IPv6 home address option will use it.
Based on MIPL2 kernel patch.



+   /* not_found */
+   return -1;
+ bad:
+   return -1;
+}


You could change this to:

/* not_found */
   bad:
return -1;
   }
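
To show the merged exit path in a complete function, here is a
userspace sketch of a TLV option search.  It is not the helper from the
patch: the function name, its arguments and the bounds checks are
assumptions made for the example.  It assumes the usual IPv6 option
layout of one type byte, one length byte and then the data, with Pad1
(type 0) being a single byte.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the offset of the first option of
 * the given type, or -1 if it is absent or the buffer is malformed.
 */
static int find_tlv_offset(const uint8_t *opts, size_t len, uint8_t type)
{
	size_t off = 0;

	while (off < len) {
		size_t optlen;

		if (opts[off] == type)
			return (int)off;
		if (opts[off] == 0) {	/* Pad1: one byte, no length field */
			off++;
			continue;
		}
		if (off + 1 >= len)
			goto bad;	/* length byte is missing */
		optlen = (size_t)opts[off + 1] + 2;
		if (optlen > len - off)
			goto bad;	/* option runs past the buffer */
		off += optlen;
	}
	/* not_found */
bad:
	return -1;
}
```

Both the not-found case and the malformed case fall through to the same
return, which is the simplification suggested above.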

-Brian


[PATCH] Change some sysctl variables to __read_mostly

2006-08-09 Thread Brian Haley

Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly.

Couldn't actually measure any performance increase while testing (.3% I 
consider noise), but seems like the right thing to do.
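
For context, the annotation works by placing the variable in a separate
data section so that rarely-written variables sit together, away from
frequently-written ones.  The sketch below imitates that in userspace.
The demo_ names are hypothetical, the section name is an assumption
modelled on the kernel's of that era, and the attribute is only applied
when building with GCC or Clang for an ELF target.

```c
/*
 * Userspace imitation of a read-mostly annotation.  On non-ELF
 * targets or other compilers the macro expands to nothing.
 */
#if defined(__GNUC__) && defined(__ELF__)
#define demo_read_mostly __attribute__((__section__(".data.read_mostly")))
#else
#define demo_read_mostly
#endif

/* Hypothetical tunable: written rarely, read on every call. */
int demo_max_dgram_qlen demo_read_mostly = 10;

int demo_get_max_dgram_qlen(void)
{
	return demo_max_dgram_qlen;
}
```

The annotation does not change how the variable is used; it only
affects where the linker places it.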


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index d273cad..e3d8d9b 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2449,7 +2449,7 @@ static struct neigh_sysctl_table {
 	ctl_table		neigh_neigh_dir[2];
 	ctl_table		neigh_proto_dir[2];
 	ctl_table		neigh_root_dir[2];
-} neigh_sysctl_template = {
+} neigh_sysctl_template __read_mostly = {
 	.neigh_vars = {
 		{
 			.ctl_name	= NET_NEIGH_MCAST_SOLICIT,
diff --git a/net/core/sock.c b/net/core/sock.c
index b67d868..cfaf090 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -187,13 +187,13 @@ static struct lock_class_key af_callback
 #define SK_RMEM_MAX		(_SK_MEM_OVERHEAD * _SK_MEM_PACKETS)
 
 /* Run time adjustable parameters. */
-__u32 sysctl_wmem_max = SK_WMEM_MAX;
-__u32 sysctl_rmem_max = SK_RMEM_MAX;
-__u32 sysctl_wmem_default = SK_WMEM_MAX;
-__u32 sysctl_rmem_default = SK_RMEM_MAX;
+__u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
+__u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX;
+__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
+__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
 
 /* Maximal space eaten by iovec or ancilliary data plus some space */
-int sysctl_optmem_max = sizeof(unsigned long)*(2*UIO_MAXIOV + 512);
+int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
 
 static int sock_set_timeout(long *timeo_p, char __user *optval, int optlen)
 {
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index fc40da3..701960e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -392,7 +392,7 @@ int inet_release(struct socket *sock)
 }
 
 /* It is off by default, see below. */
-int sysctl_ip_nonlocal_bind;
+int sysctl_ip_nonlocal_bind __read_mostly;
 
 int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 {
@@ -988,7 +988,7 @@ void inet_unregister_protosw(struct inet
  *  Shall we try to damage output packets if routing dev changes?
  */
 
-int sysctl_ip_dynaddr;
+int sysctl_ip_dynaddr __read_mostly;
 
 static int inet_sk_reselect_saddr(struct sock *sk)
 {
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 6d223e5..c2ad07e 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -187,11 +187,11 @@ struct icmp_err icmp_err_convert[] = {
 };
 
 /* Control parameters for ECHO replies. */
-int sysctl_icmp_echo_ignore_all;
-int sysctl_icmp_echo_ignore_broadcasts = 1;
+int sysctl_icmp_echo_ignore_all __read_mostly;
+int sysctl_icmp_echo_ignore_broadcasts __read_mostly = 1;
 
 /* Control parameter - ignore bogus broadcast responses? */
-int sysctl_icmp_ignore_bogus_error_responses = 1;
+int sysctl_icmp_ignore_bogus_error_responses __read_mostly = 1;
 
 /*
  * 	Configurable global rate limit.
@@ -205,9 +205,9 @@ int sysctl_icmp_ignore_bogus_error_respo
  *	time exceeded (11), parameter problem (12)
  */
 
-int sysctl_icmp_ratelimit = 1 * HZ;
-int sysctl_icmp_ratemask = 0x1818;
-int sysctl_icmp_errors_use_inbound_ifaddr;
+int sysctl_icmp_ratelimit __read_mostly = 1 * HZ;
+int sysctl_icmp_ratemask __read_mostly = 0x1818;
+int sysctl_icmp_errors_use_inbound_ifaddr __read_mostly;
 
 /*
  *	ICMP control array. This specifies what to do with each ICMP.
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index f84c4d0..48d705e 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1397,8 +1397,8 @@ static struct in_device * ip_mc_find_dev
 /*
  *	Join a socket to a group
  */
-int sysctl_igmp_max_memberships = IP_MAX_MEMBERSHIPS;
-int sysctl_igmp_max_msf = IP_MAX_MSF;
+int sysctl_igmp_max_memberships __read_mostly = IP_MAX_MEMBERSHIPS;
+int sysctl_igmp_max_msf __read_mostly = IP_MAX_MSF;
 
 
 static int ip_mc_del1_src(struct ip_mc_list *pmc, int sfmode,
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 8d7f107..165d728 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -54,15 +54,15 @@
  * even the most extreme cases without allowing an attacker to measurably
  * harm machine performance.
  */
-int sysctl_ipfrag_high_thresh = 256*1024;
-int sysctl_ipfrag_low_thresh = 192*1024;
+int sysctl_ipfrag_high_thresh __read_mostly = 256*1024;
+int sysctl_ipfrag_low_thresh __read_mostly = 192*1024;
 
-int sysctl_ipfrag_max_dist = 64;
+int sysctl_ipfrag_max_dist __read_mostly = 64;
 
 /* Important NOTE! Fragment queue must be destroyed before MSL expires.
  * RFC791 is wrong proposing to prolongate timer each fragment arrival by TTL.
  */
-int sysctl_ipfrag_time = IP_FRAG_TIME;
+int sysctl_ipfrag_time __read_mostly = IP_FRAG_TIME;
 
 struct ipfrag_skb_cb
 {
@@ -130,7 +130,7 @@ static unsigned int ipqhashfn(u16 id, u3
 }
 
 static struct timer_list ipfrag_secret_timer;
-int sysctl_ipfrag_secret_interval = 10 * 60 * HZ;
+int sysctl_ipfrag_secret_interval __read_mostly = 10 * 60 * HZ;
 
 static void

Re: [PATCH] ipv4: don't call upper-layer disconnect function if not connected

2006-08-02 Thread Brian Haley



The socket could have been bind()'d to, in which case it will
not move to connected state and we still need to invoke
the disconnect methods such as udp_disconnect() to clear out
that binding.


Ok.


You seem to be groveling in random areas of the ipv4 and ipv6 stack,
what are you working on?


Was looking into a customer-reported memory leak that seemed to be in 
this code path.  It wasn't, but this tweak seemed sane at the time.


-Brian


Re: [PATCH] IPv6: only set err in rawv6_bind() when necessary

2006-08-02 Thread Brian Haley



Every other path going from this location in rawv6_bind()
will clear err to zero, so your patch also doesn't fix any
bug.


I knew it didn't fix a bug, I just hadn't noticed the C idiom you 
pointed-out until I knew to look for it.  rawv6_bind() even does this, duh.


-Brian


[PATCH] IPv6: only set err in rawv6_bind() when necessary

2006-08-01 Thread Brian Haley
The variable 'err' is set in rawv6_bind() before the address check fails 
instead of after; move the assignment inside the if() statement.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8a30cd8..072b28b 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -240,10 +240,10 @@ static int rawv6_bind(struct sock *sk, s
 		 */
 		v4addr = LOOPBACK4_IPV6;
 		if (!(addr_type & IPV6_ADDR_MULTICAST))	{
-			err = -EADDRNOTAVAIL;
 			if (!ipv6_chk_addr(&addr->sin6_addr, dev, 0)) {
 if (dev)
 	dev_put(dev);
+err = -EADDRNOTAVAIL;
 goto out;
 			}
 		}


[PATCH] ipv4: don't call upper-layer disconnect function if not connected

2006-08-01 Thread Brian Haley
Calling connect() with AF_UNSPEC will disconnect a socket, but we don't 
need to do any work if the socket isn't currently connected.
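
From userspace, the disconnect being discussed looks like the sketch
below.  The helper name is hypothetical; the only calls used are the
standard connect() and memset().  Behaviour on an unconnected socket is
what I would expect on Linux and may differ on other systems.

```c
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: dissolve a datagram socket's association
 * by calling connect() with an AF_UNSPEC address.
 */
static int dgram_disconnect(int fd)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return connect(fd, &sa, sizeof(sa));
}
```

A caller would typically connect() a UDP socket to a peer, exchange
datagrams, and then call dgram_disconnect(fd) to go back to receiving
from any address.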


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c84a320..b294b92 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -480,12 +480,16 @@ int inet_dgram_connect(struct socket *so
 {
 	struct sock *sk = sock->sk;
 
-	if (uaddr->sa_family == AF_UNSPEC)
-		return sk->sk_prot->disconnect(sk, flags);
+	if (uaddr->sa_family == AF_UNSPEC) {
+		if (sock->state != SS_UNCONNECTED)
+			return sk->sk_prot->disconnect(sk, flags);
+		else
+			return 0;
+	}
 
 	if (!inet_sk(sk)->num && inet_autobind(sk))
 		return -EAGAIN;
-	return sk->sk_prot->connect(sk, (struct sockaddr *)uaddr, addr_len);
+	return sk->sk_prot->connect(sk, uaddr, addr_len);
 }
 
 static long inet_wait_for_connect(struct sock *sk, long timeo)
@@ -525,8 +529,11 @@ int inet_stream_connect(struct socket *s
 	lock_sock(sk);
 
 	if (uaddr->sa_family == AF_UNSPEC) {
-		err = sk->sk_prot->disconnect(sk, flags);
-		sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
+		if (sock->state != SS_UNCONNECTED) {
+			err = sk->sk_prot->disconnect(sk, flags);
+			sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
+		} else
+			err = 0;
 		goto out;
 	}
 


Re: [PATCH] s2io: netpoll support

2006-06-26 Thread Brian Haley

Ravinandan Arakali wrote:

Since the poll_controller entry point will be used by utilities such as
netdump, I am thinking we need to clear Tx interrupts as well here.
Did you get a chance to test this patch with netdump ?


No, I've only been testing Kgdb over Ethernet on Debian, I think netdump 
is a Red Hat thing.  I can test any changes you'd want to make with this 
patch if you post the diffs.


-Brian



-Original Message-
From: Brian Haley [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 15, 2006 11:37 AM
To: netdev@vger.kernel.org
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; Ananda. Raju
(E-mail); Leonid. Grossman (E-mail)
Subject: Re: [PATCH] s2io: netpoll support


This adds netpoll support for things like netconsole/kgdboe to the s2io
10GbE driver.

Signed-off-by: Brian Haley [EMAIL PROTECTED]



Re: [PATCH] s2io: netpoll support

2006-06-15 Thread Brian Haley

This adds netpoll support for things like netconsole/kgdboe to the s2io
10GbE driver.

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index 79208f4..f2b8dba 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -2575,6 +2575,50 @@ no_rx:
 #endif
 
 /**
+ * s2io_netpoll - Rx interrupt service handler for netpoll support
+ * @dev : pointer to the device structure.
+ * Description:
+ * Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void s2io_netpoll(struct net_device *dev)
+{
+	nic_t *nic = dev->priv;
+	mac_info_t *mac_control;
+	struct config_param *config;
+	XENA_dev_config_t __iomem *bar0 = nic->bar0;
+	u64 val64;
+	int i;
+
+	disable_irq(dev->irq);
+
+	atomic_inc(&nic->isr_cnt);
+	mac_control = &nic->mac_control;
+	config = &nic->config;
+
+	val64 = readq(&bar0->rx_traffic_int);
+	writeq(val64, &bar0->rx_traffic_int);
+
+	for (i = 0; i < config->rx_ring_num; i++)
+		rx_intr_handler(&mac_control->rings[i]);
+
+	for (i = 0; i < config->rx_ring_num; i++) {
+		if (fill_rx_buffers(nic, i) == -ENOMEM) {
+			DBG_PRINT(ERR_DBG, "%s:Out of memory", dev->name);
+			DBG_PRINT(ERR_DBG, " in Rx Netpoll!!\n");
+			break;
+		}
+	}
+	atomic_dec(&nic->isr_cnt);
+	enable_irq(dev->irq);
+	return;
+}
+#endif
+
+/**
  *  rx_intr_handler - Rx interrupt handler
  *  @nic: device private variable.
  *  Description:
@@ -6210,6 +6254,10 @@ Defaulting to INTA\n");
 	dev->weight = 32;
 #endif
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = s2io_netpoll;
+#endif
+
 	dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM;
 	if (sp->high_dma_flag == TRUE)
 		dev->features |= NETIF_F_HIGHDMA;


[PATCH] ipv6: order addresses by scope

2006-06-09 Thread Brian Haley
If IPv6 addresses are ordered by scope, then ipv6_dev_get_saddr() can 
break-out of the device addr_list for() loop when the candidate source 
address scope is less than the destination address scope.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 445006e..e1d6a6f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -509,6 +509,25 @@ void inet6_ifa_finish_destroy(struct ine
 	kfree(ifp);
 }
 
+static void
+ipv6_link_dev_addr(struct inet6_dev *idev, struct inet6_ifaddr *ifp)
+{
+	struct inet6_ifaddr *ifa, **ifap;
+
+	/*
+	 * Each device address list is sorted in order of scope -
+	 * global before linklocal.
+	 */
+	for (ifap = &idev->addr_list; (ifa = *ifap) != NULL;
+	     ifap = &ifa->if_next) {
+		if (ifp->scope < ifa->scope)
+			break;
+	}
+
+	ifp->if_next = *ifap;
+	*ifap = ifp;
+}
+
 /* On success it returns ifp with increased reference count */
 
 static struct inet6_ifaddr *
@@ -574,8 +593,7 @@ ipv6_add_addr(struct inet6_dev *idev, co
 
 	write_lock(&idev->lock);
 	/* Add to inet6_dev unicast addr list. */
-	ifa->if_next = idev->addr_list;
-	idev->addr_list = ifa;
+	ipv6_link_dev_addr(idev, ifa);
 
 #ifdef CONFIG_IPV6_PRIVACY
 	if (ifa->flags&IFA_F_TEMPORARY) {
@@ -982,7 +1000,7 @@ int ipv6_dev_get_saddr(struct net_device
 					continue;
 			} else if (score.scope < hiscore.scope) {
 				if (score.scope < daddr_scope)
-					continue;
+					break; /* addresses sorted by scope */
 				else {
 					score.rule = 2;
 					goto record_it;
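
The idea behind this patch can be tried outside the kernel. Below is a minimal userspace sketch, not the kernel code: `struct addr`, `link_sorted()` and `count_examined()` are hypothetical names, and the scope is assumed to be a plain integer where a larger value means a wider scope, so the list is kept in descending order. It shows a pointer-to-pointer sorted insert and how a sorted list lets a scan stop at the first entry that is too narrow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address entry; larger scope = wider reach. */
struct addr {
	int scope;
	struct addr *next;
};

/* Insert 'n' so that the list stays sorted by descending scope. */
static void link_sorted(struct addr **head, struct addr *n)
{
	struct addr **pp, *a;

	assert(head != NULL && n != NULL);

	/* Walk the link fields themselves so no "previous" pointer is needed. */
	for (pp = head; (a = *pp) != NULL; pp = &a->next) {
		if (n->scope >= a->scope)
			break;
	}
	n->next = *pp;
	*pp = n;
}

/*
 * Count how many entries a scan has to look at before it can give up
 * on finding further candidates with scope >= min_scope.
 */
static int count_examined(const struct addr *head, int min_scope)
{
	int examined = 0;

	for (; head != NULL; head = head->next) {
		examined++;
		if (head->scope < min_scope)
			break;	/* sorted: every later entry is narrower too */
	}
	return examined;
}
```

With an unsorted list the scan would have to `continue` past each narrow entry and visit the whole list; the sorted insert is what makes the early `break` safe.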


[PATCH] s2io: netpoll support

2006-06-08 Thread Brian Haley
This adds netpoll support for things like netconsole/kgdboe to the s2io 
10GbE driver.


This duplicates some code from s2io_poll() as I wanted to be 
least-invasive, someone from Neterion might have other thoughts?


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index 79208f4..c2c5f46 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -2575,6 +2575,53 @@ no_rx:
 #endif
 
 /**
+ * s2io_netpoll - Rx interrupt service handler for netpoll support
+ * @dev : pointer to the device structure.
+ * Description:
+ * Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void s2io_netpoll(struct net_device *dev)
+{
+	nic_t *nic = dev->priv;
+	mac_info_t *mac_control;
+	struct config_param *config;
+	XENA_dev_config_t __iomem *bar0 = nic->bar0;
+	u64 val64;
+	int i;
+
+	atomic_inc(&nic->isr_cnt);
+
+	/*  Disable all interrupts */
+	en_dis_able_nic_intrs(nic, ENA_ALL_INTRS, DISABLE_INTRS);
+
+	mac_control = &nic->mac_control;
+	config = &nic->config;
+
+	val64 = readq(&bar0->rx_traffic_int);
+	writeq(val64, &bar0->rx_traffic_int);
+
+	for (i = 0; i < config->rx_ring_num; i++)
+		rx_intr_handler(&mac_control->rings[i]);
+
+	for (i = 0; i < config->rx_ring_num; i++) {
+		if (fill_rx_buffers(nic, i) == -ENOMEM) {
+			DBG_PRINT(ERR_DBG, "%s:Out of memory", dev->name);
+			DBG_PRINT(ERR_DBG, " in Rx Netpoll!!\n");
+			break;
+		}
+	}
+	/* Re-enable all interrupts */
+	en_dis_able_nic_intrs(nic, ENA_ALL_INTRS, ENABLE_INTRS);
+	atomic_dec(&nic->isr_cnt);
+	return;
+}
+#endif
+
+/**
  *  rx_intr_handler - Rx interrupt handler
  *  @nic: device private variable.
  *  Description:
@@ -6210,6 +6257,10 @@ Defaulting to INTA\n");
 	dev->weight = 32;
 #endif
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = s2io_netpoll;
+#endif
+
 	dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM;
 	if (sp->high_dma_flag == TRUE)
 		dev->features |= NETIF_F_HIGHDMA;


Re: reminder, 2.6.18 window...

2006-05-24 Thread Brian Haley

jamal wrote:

On Wed, 2006-24-05 at 12:14 -0700, Phil Dibowitz wrote:


Right, I'm aware there are other ways of doing this - I've written scripts to
record hundreds of numbers, and then subtract them from each other. But
those scripts are workarounds 


I don't have any problem with Phil's changes.


It is not a work around, _it is design intent_. It is what network
management tools have been expecting since the days of the caveman.
These stats are supposed to be monotonically increasing; if that
behavior is contradicted, a rollover of the counters is assumed.


So how is this different than if an SNMP station probes my system, then 
I reboot, then they probe again.  Things will seem to have gone 
backwards, but they deal with that just fine.



for a feature _lacking_ in the kernel. A
feature that, as I've mentioned, is supported on any piece of networking gear
(and of course, lets not forget there's a specific option in the kernel config
*just* for behave like a router).



Can you provide some link to a vendor that allows resetting ethernet
stats? I am almost certain, if they do they will have something or other
which indicates that such a reset happened.


DEC/Compaq/HP has allowed this on Tru64 UNIX since 1999 because we had 
customers that wanted it, and no one ever complained about complications with 
SNMP.  We did save the last time the stats were zeroed in the struct for 
posterity, but that was never get-able via SNMP:


-- netstat -I tu0 -s

tu0 Ethernet counters at Wed May 24 16:30:05 2006

  609415 seconds since last zeroed
  3943458720 bytes received
   113576310 bytes sent
...

Maybe saving a ztime would make people happier?


It is also easier for cisco
to have a non-standard feature as of IOS 15.16 which could support such
behavior because they bundle everything including network management
tools.


I never received any free management tools with my Cisco routers :) , 
they charge big bucks for that stuff!


If my patch was invasive and broke things, 


It _does break_ things for all known management apps.


Can anyone show a management app this breaks?

-Brian


[PATCH] netfilter: cannot build latest kernel with CONFIG_NETFILTER=y/m on IA64

2006-04-07 Thread Brian Haley
Can't build latest 2.6 kernel with CONFIG_NETFILTER=y/m on IA64, there's 
a missing #include in net/ipv6/netfilter.c


net/ipv6/netfilter.c: In function `nf_ip6_checksum':
net/ipv6/netfilter.c:92: warning: implicit declaration of function 
`csum_ipv6_magic'

  LD  net/ipv6/ipv6.o
  LD  net/ipv6/built-in.o
  LD  net/built-in.o
  GEN .version
  CHK include/linux/compile.h
  UPD include/linux/compile.h
  CC  init/version.o
  LD  init/built-in.o
  LD  .tmp_vmlinux1
net/built-in.o(.text+0x168812): In function `nf_ip6_checksum':
include/net/checksum.h:67: undefined reference to `csum_ipv6_magic'
net/built-in.o(.text+0x168892):include/net/checksum.h:67: undefined 
reference to `csum_ipv6_magic'


Patch against today's net-2.6.git tree.

-Brian

Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 3e9ecfa..395a417 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -7,6 +7,7 @@
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
 #include <net/xfrm.h>
+#include <net/ip6_checksum.h>
 
 int ip6_route_me_harder(struct sk_buff *skb)
 {


[PATCH] [IPV6]: fix ipv6_saddr_score struct element

2006-02-28 Thread Brian Haley
The scope element in the ipv6_saddr_score struct used in 
ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope() 
returns a signed integer (and can return -1).


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -834,7 +834,7 @@ struct ipv6_saddr_score {
 	int		addr_type;
 	unsigned int	attrs;
 	int		matchlen;
-	unsigned int	scope;
+	int		scope;
 	unsigned int	rule;
 };
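
Why the signedness matters can be shown with a small userspace sketch. Nothing here is kernel code: `lookup_scope()`, `below_limit_unsigned()`, `below_limit_signed()` and `scope_demo()` are hypothetical names, and the values 2 and 14 are arbitrary example scopes. The point is only that a -1 result stored in an unsigned field turns into a very large number and then compares the wrong way.

```c
#include <assert.h>

/* Hypothetical scope lookup that reports failure as -1. */
static int lookup_scope(int known)
{
	return known ? 2 : -1;
}

/* Compare after storing the scope in an unsigned field (the old layout). */
static int below_limit_unsigned(int scope, unsigned int limit)
{
	unsigned int stored = (unsigned int)scope;	/* -1 becomes UINT_MAX */

	return stored < limit;
}

/* Compare after storing the scope in a signed field (the new layout). */
static int below_limit_signed(int scope, int limit)
{
	int stored = scope;

	return stored < limit;
}

/* Small self-check showing the difference for the -1 case. */
static void scope_demo(void)
{
	int bad = lookup_scope(0);

	assert(bad == -1);
	assert(below_limit_unsigned(bad, 14) == 0);	/* looks "wider" than 14 */
	assert(below_limit_signed(bad, 14) == 1);	/* correctly below 14 */
}
```

Calling `scope_demo()` from a test program passes silently; for ordinary non-negative scopes both helpers agree, so only the error value is affected.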
 


Re: [GIT PATCH] RFC3484 compliant source address selection

2005-11-08 Thread Brian Haley

Hi Yoshifuji,


 /*
+ * find the first different bit between two addresses
+ * length of address must be a multiple of 32bits
+ */
+static inline int __ipv6_addr_diff(const void *token1, const void *token2, int addrlen)
+{
+	const __u32 *a1 = token1, *a2 = token2;
+	int i;
+
+	addrlen >>= 2;
+
+	for (i = 0; i < addrlen; i++) {
+		__u32 xb = a1[i] ^ a2[i];
+		if (xb) {
+			int j = 31;
+
+			xb = ntohl(xb);
+			while ((xb & (1 << j)) == 0)
+				j--;
+
+			return (i * 32 + 31 - j);
+		}
+	}


I did a few performance measurements in userspace on this function, as 
was done for ipv6_addr_equal() a year ago, and found that the following patch 
speeds it up a little, since it avoids the exclusive-OR (which can be 
expensive) except when necessary.  The line numbers are wrong since I 
couldn't get to your git:// through our firewall.  I'll look at the 
other diffs more closely later.


Thanks,

-Brian
Signed-off-by: Brian Haley [EMAIL PROTECTED]

*** adiff.h 2005-11-08 16:32:00.0 -0500
--- bdiff.h 2005-11-08 16:32:47.0 -0500
***************
*** 6,13 ****
  	addrlen >>= 2;
  
  	for (i = 0; i < addrlen; i++) {
! 		__u32 xb = a1[i] ^ a2[i];
! 		if (xb) {
  			int j = 31;
  
  			xb = ntohl(xb);
--- 6,13 ----
  	addrlen >>= 2;
  
  	for (i = 0; i < addrlen; i++) {
! 		if (a1[i] != a2[i]) {
! 			__u32 xb = a1[i] ^ a2[i];
  			int j = 31;
  
  			xb = ntohl(xb);
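
For anyone who wants to repeat the userspace measurement, here is a minimal standalone sketch of the patched variant. It is an approximation, not the kernel function: `addr_diff()` is a hypothetical name, `uint32_t` stands in for `__u32`, and the value returned for equal addresses (the length in bits) is an assumption, since the quoted hunk ends before that part of the function.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* ntohl(), htonl() */

/*
 * Return the index of the first bit that differs between two addresses
 * held in network byte order (0 = most significant bit of the first
 * byte). 'addrlen' is in bytes and must be a multiple of 4.
 */
static int addr_diff(const void *token1, const void *token2, int addrlen)
{
	const uint32_t *a1 = token1, *a2 = token2;
	int i;

	assert(addrlen % 4 == 0);
	addrlen >>= 2;	/* bytes -> 32-bit words */

	for (i = 0; i < addrlen; i++) {
		/* Cheap compare first; XOR only when the words differ. */
		if (a1[i] != a2[i]) {
			uint32_t xb = ntohl(a1[i] ^ a2[i]);
			int j = 31;

			/* xb is non-zero here, so this loop terminates. */
			while ((xb & (UINT32_C(1) << j)) == 0)
				j--;

			return i * 32 + 31 - j;
		}
	}

	/* Assumed convention for equal addresses: the length in bits. */
	return addrlen << 5;
}
```

Timing this against a version that always computes the XOR, over a mix of equal and nearly equal addresses, is one way to check whether the difference shows up on a given machine.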


Re: [PATCH 1/2] [IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542).

2005-09-09 Thread Brian Haley

Trying again to get this to netdev, sorry if duplicate...


Subject: [PATCH 1/2] [IPV6]: Support several new sockopt / ancillary data in 
Advanced API (RFC3542).


Thanks for getting a lot of this update done.


+/* RFC3542 advanced socket options (50-67) */
+#define IPV6_RECVPKTINFO   50
+#define IPV6_PKTINFO   51
+#if 0
+#define IPV6_RECVPATHMTU   52
+#define IPV6_PATHMTU   53
+#define IPV6_DONTFRAG  54
+#define IPV6_USE_MIN_MTU   55
+#endif
+#define IPV6_RECVHOPOPTS   56
+#define IPV6_HOPOPTS   57
+#if 0
+#define IPV6_RECVRTHDRDSTOPTS  58  /* Unused, see net/ipv6/datagram.c */
+#endif
+#define IPV6_RTHDRDSTOPTS  59
+#define IPV6_RECVRTHDR 60
+#define IPV6_RTHDR 61
+#define IPV6_RECVDSTOPTS   62
+#define IPV6_DSTOPTS   63
+#define IPV6_RECVHOPLIMIT  64
+#define IPV6_HOPLIMIT  65
+#if 0
+#define IPV6_RECVTCLASS    66
+#define IPV6_TCLASS        67
+#endif
+
 #endif


So some of these are defined, but never used, for example all of *MTU
and DONTFRAG, are these being worked on or do you want some patches?

I've also attached a patch I sent out last October to bring the Type 0
routing header inline with 3542, but I've left the addr field since it
doesn't take up any space and makes the code that uses it cleaner.  This
of course also requires changes in glibc (?) for
/usr/include/netinet/ip6.h, etc.

Thanks,

-Brian

Signed-off-by: Brian Haley [EMAIL PROTECTED]

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -68,7 +68,7 @@ struct ipv6_opt_hdr {
 
 struct rt0_hdr {
 	struct ipv6_rt_hdr	rt_hdr;
-	__u32			bitmap;		/* strict/loose bit map */
+	__u32			reserved;
 	struct in6_addr		addr[0];
 
 #define rt0_type		rt_hdr.type
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -404,8 +404,7 @@ ipv6_invert_rthdr(struct sock *sk, struc
 
 	memcpy(opt->srcrt, hdr, sizeof(*hdr));
 	irthdr = (struct rt0_hdr*)opt->srcrt;
-	/* Obsolete field, MBZ, when originated by us */
-	irthdr->bitmap = 0;
+	irthdr->reserved = 0;
 	opt->srcrt->segments_left = n;
 	for (i=0; i<n; i++)
 		memcpy(irthdr->addr+i, rthdr->addr+(n-1-i), 16);
diff --git a/net/ipv6/netfilter/ip6t_rt.c b/net/ipv6/netfilter/ip6t_rt.c
--- a/net/ipv6/netfilter/ip6t_rt.c
+++ b/net/ipv6/netfilter/ip6t_rt.c
@@ -161,8 +161,8 @@ match(const struct sk_buff *skb,
 	       ((rtinfo->hdrlen == hdrlen) ^
 	       !!(rtinfo->invflags & IP6T_RT_INV_LEN))));
 	DEBUGP("res %02X %02X %02X ", 
-   		(rtinfo->flags & IP6T_RT_RES), ((struct rt0_hdr *)rh)->bitmap,
-   		!((rtinfo->flags & IP6T_RT_RES) && (((struct rt0_hdr *)rh)->bitmap)));
+   		(rtinfo->flags & IP6T_RT_RES), ((struct rt0_hdr *)rh)->reserved,
+   		!((rtinfo->flags & IP6T_RT_RES) && (((struct rt0_hdr *)rh)->reserved)));
 
 	ret = (rh != NULL)
 		&&
@@ -179,12 +179,12 @@ match(const struct sk_buff *skb,
 	       !!(rtinfo->invflags & IP6T_RT_INV_TYP)));
 
 	if (ret && (rtinfo->flags & IP6T_RT_RES)) {
-		u_int32_t *bp, _bitmap;
-		bp = skb_header_pointer(skb,
-					ptr + offsetof(struct rt0_hdr, bitmap),
-					sizeof(_bitmap), &_bitmap);
+		u_int32_t *rp, _reserved;
+		rp = skb_header_pointer(skb,
+					ptr + offsetof(struct rt0_hdr, reserved),
+					sizeof(_reserved), &_reserved);
 
-		ret = (*bp == 0);
+		ret = (*rp == 0);
 	}
 
 	DEBUGP("#%d ",rtinfo->addrnr);


Re: [PATCH 2/2][NET] cleanup INET_REFCNT_DEBUG code

2005-07-21 Thread Brian Haley

Arnaldo Carvalho de Melo wrote:


--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -486,6 +486,8 @@ extern int sk_wait_data(struct sock *sk,
 
 struct request_sock_ops;
 
+#undef SOCK_REFCNT_DEBUG

+


Why are you doing this here?  Leftover from debugging?

-Brian