date:20060823

Re: The Proposed Linux kevent API (was: Re: [take12 0/3] kevent: Generic event handling mechanism.)

2006-08-23 Thread Evgeniy Polyakov

On Tue, Aug 22, 2006 at 06:36:07PM -0700, Nicholas Miell ([EMAIL PROTECTED]) 
wrote:
 == The Proposed Linux kevent API == 
 
 The proposed Linux kevent API is a new unified event handling
 interface, similar in spirit to Windows completion ports and Solaris
 completion ports and similar in fact to the FreeBSD/OS X kqueue
 interface.
 
 Using a single kernel call, a thread can wait for all possible event
 types that the kernel can generate, instead of past interfaces that
 only allow you to wait for specific subsets of events (e.g. POSIX
 sigevent completions are limited only to AIO completion, timer expiry,
 and the arrival of new messages to a message queue, while epoll_wait
 is just a more efficient method of doing a traditional Unix select or
 poll).
 
 Instead of evolving the struct sigevent notification methods to allow
 you to continue using standard POSIX interfaces like lio_listio(),
 mq_notify() or timer_create() while queuing completion notifications
 to a kevent completion queue (much the way the Solaris port API is
 designed, or the API proposed by Ulrich Drepper in "The
 Need for Asynchronous, Zero-Copy Network I/O" as found at
 http://people.redhat.com/drepper/newni.pdf ), kevent chooses to
 follow the FreeBSD route and introduce an entirely new and
 incompatible method of requesting and reporting event notifications
 (while also managing to be incompatible with FreeBSD's kqueue).
 
 This is done through the introduction of two new syscalls and a
 variety of supporting datatypes. The first function, kevent_ctl(), is
 used to create and manipulate kevent queues, while the second,
 kevent_get_events(), is used to wait for new events.
 
 
 They operate as follows:
 
 int kevent_ctl(int fd, unsigned int cmd, unsigned int num, void *arg);
 
 fd is the file descriptor referring to the kevent queue to
 manipulate. It is ignored if the cmd parameter is KEVENT_CTL_INIT.
 
 cmd is the requested operation. It can be one of the following:
 
   KEVENT_CTL_INIT - create a new kevent queue and return its file
   descriptor. The fd, num, and arg parameters are ignored.
 
   KEVENT_CTL_ADD, KEVENT_CTL_MODIFY, KEVENT_CTL_REMOVE - add new,
   modify existing, or remove existing event notification
   requests.
 
 num is the number of struct ukevent in the array pointed to by arg
 
 arg is an array of struct ukevent. Why it is of type void* and not 
   struct ukevent* is a mystery.
 
 When called, kevent_ctl will carry out the operation specified in the
 cmd parameter.
 
 
 int kevent_get_events(int ctl_fd, unsigned int min_nr,
   unsigned int max_nr, unsigned int timeout,
   void *buf, unsigned flags)
 
 ctl_fd is the file descriptor referring to the kevent queue.
 
 min_nr is the minimum number of completed events that
kevent_get_events will block waiting for.
 
 max_nr is the number of struct ukevent in buf.
 
 timeout is the number of milliseconds to wait before returning less
   than min_nr events. If this is -1, I *think* it'll wait
   indefinitely, but I'm not sure that msecs_to_jiffies(-1) ends
   up being MAX_SCHEDULE_TIMEOUT

You forgot the case of a non-blocking file descriptor.
Here is the comment from the code:

 * In nonblocking mode it returns as many events as possible, but not more than @max_nr.
 * In blocking mode it waits until timeout or if at least @min_nr events are ready.
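
The two modes in that comment can be modelled as a small predicate. This is only a sketch under stated assumptions: kevent_wait_done() and kevent_copy_count() are hypothetical helper names, not functions from the patch, and the model ignores signals and error paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of when kevent_get_events() stops waiting.
 * ready:     number of completed events currently queued
 * min_nr:    minimum number of events the caller asked for
 * nonblock:  true if the queue descriptor is in O_NONBLOCK mode
 * timed_out: true once the timeout has expired
 */
static bool kevent_wait_done(unsigned int ready, unsigned int min_nr,
                             bool nonblock, bool timed_out)
{
	if (nonblock)
		return true;	/* never sleep in nonblocking mode */
	return timed_out || ready >= min_nr;
}

/* Number of events copied to the user buffer: never more than max_nr. */
static unsigned int kevent_copy_count(unsigned int ready, unsigned int max_nr)
{
	return ready < max_nr ? ready : max_nr;
}
```

With min_nr = 2 and only one event ready, a blocking caller keeps waiting until the timeout fires, while a nonblocking caller returns immediately with that one event.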

 buf is a pointer to an array of struct ukevent. Why it is of type void*
 and not struct ukevent* is a mystery.
 
 flags is unused.
 
 When called, kevent_get_events will wait timeout milliseconds for at
 least min_nr completed events, copying completed struct ukevents to
 buf and deleting any KEVENT_REQ_ONESHOT event requests.
 
 
 The bulk of the interface is entirely done through the ukevent struct.
 It is used to add event requests, modify existing event requests,
 specify which event requests to remove, and return completed events.
 
 struct ukevent contains the following members:
 
 struct kevent_id id
This is described as containing the socket number, file
descriptor and so on, which I take to mean it contains an fd,
however for some mysterious reason struct kevent_id contains
__u32 raw[2] and (for KEVENT_POLL events) the actual fd is
placed in raw[0] and raw[1] is never mentioned except to
faithfully copy it around.
 
For KEVENT_TIMER events, raw[0] contains a relative time in
milliseconds and raw[1] is still not used.
 
Why the struct member is called raw remains a mystery.

If you had followed the previous patchsets, you would have seen that there
were network AIO, fs IO and fs-inotify-like notifications.
Some of them use those fields.
I used two u32 numbers so they can be unioned with a pointer, like the user data is.
That pointer should be obtained through Ulrich's dma_alloc() and
friends.

 __u32 type
   The actual event type, either KEVENT_POLL for fd polling or

Re: [PATCH] locking bug in fib_semantics.c

2006-08-23 Thread Jarek Poplawski

On Tue, Aug 22, 2006 at 12:35:56PM +0200, Jarek Poplawski wrote:
... 
 Hello,
 I've found it at last but on that occasion I've got some
 doubt according to rcu_read_lock and rcu_call treatment:
...

Actually there is one more doubt (a bug really, but
not a very probable one): proc file reading is done without any
locking in fib_hash.c, so if somebody uses programs
which do that often, he could have problems when
adding or deleting a route at the wrong time. If this
is ever changed, fz_nent should also be incremented and
decremented under the lock, I think.

Jarek P.
 
PS: linux-2.6.18-rc4
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/18] d80211: LED triggers

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 09:54 -0700, Jouni Malinen wrote:

 Is someone using these or planning on using them? I have been open to
 just removing all code due to lack of active use.

Some people on powerbooks want to use the front LED for wireless
activity instead of other things, and a few weeks ago I wrote a LED
class device for it so they could hook it up now :)

johannes

Re: [PATCH 08/18] d80211: clean up exports

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 09:44 -0700, Jouni Malinen wrote:

 Moving the EXPORT_SYMBOL definitions sounds good, but I would like to
 keep changes between EXPORT_SYMBOL and EXPORT_SYMBOL_GPL separate from
 this kind of cleanup. In addition, I'm not personally a huge fan of the
 EXPORT_SYMBOL_GPL in the first place since I believe the GPL should
 cover this without additional changes in the source code. In other
 words, I would prefer that the EXPORT_SYMBOL would not be changed to
 EXPORT_SYMBOL_GPL here.

I just intended making it _GPL as an additional deterrent since you
practically need the internal ieee80211_i.h header for any kind of rate
control algorithm. I'm fine with dropping them, however.

johannes

Re: [PATCH 07/18] d80211: get rid of sta_aid in favour of keeping track of TIM

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 20:36 +0200, Jiri Benc wrote:
 On Mon, 21 Aug 2006 09:41:14 +0200, Johannes Berg wrote:
  I think this is not correct if a STA is removed for which packets
  are buffered, but if it is still wrong then that case was never
  correct to start with if the hw has a set_tim callback.
 
 You're right, good catch.

:)

  +   /* 251 = max size of tim bitmap in beacon */
  +   for (i = 0; i < 251; i++) {
 
 Please, use a constant here.

Yeah, good point.

  +   u8 tim[sizeof(unsigned long)*BITS_TO_LONGS(MAX_AID_TABLE_SIZE+1)];
 
 Hm, adding spaces here would extend the line above 80 characters... But
 this way it doesn't look good. What to do here? I'd prefer leaving the
 line a little over 80 chars in this case. What do you think?

Heh, didn't even really think about that. I can throw in a few spaces.

  @@ -424,13 +424,6 @@ void sta_info_remove_aid_ptr(struct sta_
  	sdata = IEEE80211_DEV_TO_SUB_IF(sta->dev);
  	if (sta->aid <= 0 || !sdata->bss)
  		return;
  -
  -	sdata->bss->sta_aid[sta->aid - 1] = NULL;
  -	if (sta->aid == sdata->bss->max_aid) {
  -		while (sdata->bss->max_aid > 0 &&
  -		       !sdata->bss->sta_aid[sdata->bss->max_aid - 1])
  -			sdata->bss->max_aid--;
  -	}
   }
 
 Why are you not calling bss_tim_clear here? Am I missing something?

Dunno. I probably just looked at the code and thought 'oh, all this does
is update max_aid, let me get rid of it' :)

 Also, adding hw->set_tim call here should fix the problem you described
 at the beginning of the mail.

Yeah, I guess so.

johannes

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Tue, Aug 22, 2006 at 04:46:19PM -0700, Ulrich Drepper ([EMAIL PROTECTED]) 
wrote:
 DaveM says there are example programs for the current interfaces.  I
 must admit I haven't seen those either.  So if possible, point the world
 to them again.  If you do that now I'll review everything and write up
 my recommendations re the interface before Monday.

Attached typical usage for inode and timer events.
Network AIO was implemented as separated syscalls.

 -- 
 ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
 



-- 
Evgeniy Polyakov
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/time.h>

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#include <linux/unistd.h>
#include <linux/types.h>
#include <linux/ukevent.h>

#define _syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
type name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4);\
}

#define _syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
  type5,arg5) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4, arg5);\
}

#define _syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
  type5,arg5,type6,arg6) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5, type6 arg6) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4, arg5, arg6);\
}

_syscall4(int, kevent_ctl, int, arg1, unsigned int, argv2, unsigned int, argv3, 
void *, argv4);
_syscall6(int, kevent_get_events, int, arg1, unsigned int, argv2, unsigned int, 
argv3, unsigned int, argv4, void *, argv5, unsigned, arg6);

#define ulog(f, a...) fprintf(stderr, f, ##a)
#define ulog_err(f, a...) ulog(f ": %s [%d].\n", ##a, strerror(errno), errno)

static void usage(char *p)
{
	ulog("Usage: %s -t type -e event -o oneshot -p path -n wait_num -h\n", p);
}

static int get_id(int type, char *path)
{
	int ret = -1;

	switch (type) {
		case KEVENT_TIMER:
			ret = 3000;
			break;
		case KEVENT_INODE:
			ret = open(path, O_RDONLY);
			break;
	}

	return ret;
}

int main(int argc, char *argv[])
{
	int ch, fd, err, type, event, oneshot, i, num, wait_num;
	char *path;
	char buf[4096];
	struct ukevent *uk;
	struct timeval tm1, tm2;

	path = NULL;
	type = event = -1;
	oneshot = 0;
	wait_num = 10;

	while ((ch = getopt(argc, argv, "p:t:e:o:n:h")) > 0) {
		switch (ch) {
			case 'n':
				wait_num = atoi(optarg);
				break;
			case 'p':
				path = optarg;
				break;
			case 't':
				type = atoi(optarg);
				break;
			case 'e':
				event = atoi(optarg);
				break;
			case 'o':
				oneshot = atoi(optarg);
				break;
			default:
				usage(argv[0]);
				return -1;
		}
	}

	if (event == -1 || type == -1 || (type == KEVENT_INODE && !path)) {
		ulog("You need at least -t -e parameters and -p for inode notifications.\n");
		usage(argv[0]);
		return -1;
	}

	fd = kevent_ctl(0, KEVENT_CTL_INIT, 1, NULL);
	if (fd == -1) {
		ulog_err("Failed create kevent control block");
		return -1;
	}

	memset(buf, 0, sizeof(buf));

	gettimeofday(&tm1, NULL);

	num = 1;
	for (i=0; i<num; ++i) {
		uk = (struct ukevent *)buf;
		uk->event = event;
		uk->type = type;
		if (oneshot)
			uk->req_flags |= KEVENT_REQ_ONESHOT;
		uk->user[0] = i;
		uk->id.raw[0] = get_id(uk->type, path);

		err = kevent_ctl(fd, KEVENT_CTL_ADD, 1, uk);
		if (err < 0) {
			ulog_err("Failed to perform control operation: type=%d, event=%d, oneshot=%d", type, event, oneshot);
			close(fd);
			return err;
		}
		ulog("%s: err: %d.\n", __func__, err);
		if (err) {
			ulog("%d: ret_flags: 0x%x, ret_data: %u %d.\n", i,
					uk->ret_flags, uk->ret_data[0], (int)uk->ret_data[1]);
		}
	}

	gettimeofday(&tm2, NULL);

ulog(%08ld.%08ld: Load:

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 02:43:50AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
wrote:
 Actually, I didn't miss that, it is an orthogonal issue. A timespec
 timeout parameter for the syscall does not imply the use of timespec
 in any timer event, etc. Nor is there any timespec timer in kqueue's
 struct kevent, which is the only (interface related) thing that will
 be exposed.

void * in structure exported to userspace is forbidden.
long in syscall requires wrapper in per-arch code (although that
workaround _is_ there, it does not mean that broken interface should 
be used).
poll uses milliseconds - it is perfectly ok.

 Rakshasa

-- 
Evgeniy Polyakov

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Andrew Morton

On Wed, 23 Aug 2006 10:56:59 +0400
Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 On Wed, Aug 23, 2006 at 02:43:50AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
 wrote:
  Actually, I didn't miss that, it is an orthogonal issue. A timespec
  timeout parameter for the syscall does not imply the use of timespec
  in any timer event, etc. Nor is there any timespec timer in kqueue's
  struct kevent, which is the only (interface related) thing that will
  be exposed.
 
 void * in structure exported to userspace is forbidden.
 long in syscall requires wrapper in per-arch code (although that
 workaround _is_ there, it does not mean that broken interface should 
 be used).
 poll uses milliseconds - it is perfectly ok.

I wonder whether designing-in a millisecond granularity is the right thing
to do.  If in a few years the kernel is running tickless with high-res clock
interrupt sources, that might look a bit lumpy.

Switching it to a __u64 nanosecond counter would be basically free on
64-bit machines, and not very expensive on 32-bit, no?

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 12:07:58AM -0700, Andrew Morton ([EMAIL PROTECTED]) 
wrote:
 On Wed, 23 Aug 2006 10:56:59 +0400
 Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 
  On Wed, Aug 23, 2006 at 02:43:50AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
  wrote:
   Actually, I didn't miss that, it is an orthogonal issue. A timespec
   timeout parameter for the syscall does not imply the use of timespec
   in any timer event, etc. Nor is there any timespec timer in kqueue's
   struct kevent, which is the only (interface related) thing that will
   be exposed.
  
  void * in structure exported to userspace is forbidden.
  long in syscall requires wrapper in per-arch code (although that
  workaround _is_ there, it does not mean that broken interface should 
  be used).
  poll uses milliseconds - it is perfectly ok.
 
 I wonder whether designing-in a millisecond granularity is the right thing
 to do.  If in a few years the kernel is running tickless with high-res clock
 interrupt sources, that might look a bit lumpy.
 
 Switching it to a __u64 nanosecond counter would be basically free on
 64-bit machines, and not very expensive on 32-bit, no?

Then let's put a structure there with 64-bit seconds and nanoseconds,
similar to timespec, but without longs.
What do you think?

-- 
Evgeniy Polyakov

Re: [patch 3/5] d80211: fix interface removal

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 10:33 -0700, David Kimdon wrote:
 + if (param->u.if_info.type == HOSTAP_IF_WDS) {
 + type = IEEE80211_IF_TYPE_WDS;
 + } else if (param->u.if_info.type == HOSTAP_IF_VLAN) {
 + type = IEEE80211_IF_TYPE_VLAN;
 + } else if (param->u.if_info.type == HOSTAP_IF_BSS) {
 + type = IEEE80211_IF_TYPE_AP;
 + } else if (param->u.if_info.type == HOSTAP_IF_STA) {
 + type = IEEE80211_IF_TYPE_STA;
 + } else {
 +return -EINVAL;
   }

IMHO that'd look better as a switch(). Or maybe even a small static
array to map them and just some bounds checking code?

Also, spaces instead of tab on the last added line.
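
A rough sketch of the static-array variant, with bounds checking. The numeric enum values below are placeholders chosen for the example (the real constants live in the hostapd and d80211 headers), and map_if_type() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values for illustration only. */
enum { HOSTAP_IF_WDS = 1, HOSTAP_IF_VLAN, HOSTAP_IF_BSS, HOSTAP_IF_STA };
enum { IEEE80211_IF_TYPE_AP = 1, IEEE80211_IF_TYPE_STA,
       IEEE80211_IF_TYPE_WDS, IEEE80211_IF_TYPE_VLAN };

/* Index: hostapd interface type; value: d80211 type (0 means unmapped). */
static const int hostap_to_d80211[] = {
	[HOSTAP_IF_WDS]  = IEEE80211_IF_TYPE_WDS,
	[HOSTAP_IF_VLAN] = IEEE80211_IF_TYPE_VLAN,
	[HOSTAP_IF_BSS]  = IEEE80211_IF_TYPE_AP,
	[HOSTAP_IF_STA]  = IEEE80211_IF_TYPE_STA,
};

/* Returns the mapped type, or -EINVAL for out-of-range or unmapped input. */
static int map_if_type(unsigned int hostap_type)
{
	size_t n = sizeof(hostap_to_d80211) / sizeof(hostap_to_d80211[0]);

	if (hostap_type >= n || hostap_to_d80211[hostap_type] == 0)
		return -EINVAL;
	return hostap_to_d80211[hostap_type];
}
```

This only works as written if 0 is never a valid target type; otherwise the table would need a separate "valid" marker.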

johannes

Re: [patch 5/5] d80211: add ioctl to stop data frame tx

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 10:34 -0700, David Kimdon wrote:

 This ioctl is used when radar is detected on a channel.  Data frames must stop
 but management frames must be allowed to continue for some time to communicate
 the channel switch to stations.

Which does lead to the question: How are you detecting radar in
userspace in the first place??

 +   if (unlikely(local->stop_data_frame_tx)) {
 +       struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
 +       u16 fc = le16_to_cpu(hdr->frame_control);
 +       if ((fc & IEEE80211_FCTL_FTYPE) == IEEE80211_FTYPE_DATA) {
 +           dev_kfree_skb(skb);
 +           return 0;
 +       }
 +   }

Should that really drop data frames dead on the floor? And wouldn't it
make sense to stop the networking layer from injecting more data into the
stack when stop_data_frame_tx is enabled?
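
For reference, the frame-type test in the quoted hunk boils down to masking the type bits of the frame control field. A standalone sketch, using the usual 802.11 mask and type values; should_drop() is a hypothetical helper that only mirrors the policy in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IEEE80211_FCTL_FTYPE  0x000c  /* frame type bits of frame control */
#define IEEE80211_FTYPE_MGMT  0x0000
#define IEEE80211_FTYPE_DATA  0x0008

/* fc is the frame control field already converted to host byte order. */
static bool is_data_frame(uint16_t fc)
{
	return (fc & IEEE80211_FCTL_FTYPE) == IEEE80211_FTYPE_DATA;
}

/* Hypothetical policy helper: drop only data frames while tx is stopped. */
static bool should_drop(uint16_t fc, bool stop_data_frame_tx)
{
	return stop_data_frame_tx && is_data_frame(fc);
}
```

A beacon (frame control 0x0080) is a management frame and passes through, while plain data (0x0008) and QoS data (0x0088) are dropped once the flag is set.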

 +static int ieee80211_ioctl_set_stop_data_frame_tx(struct net_device *dev,
 + int val)
 +{
 + struct ieee80211_local *local = dev->ieee80211_ptr;
 +local->stop_data_frame_tx = val;
 +return 0;
 +}

Again, whitespace damaged. Yes, I know it's hard to code in there with
any smart editor that thinks it knows what to do based on the
surroundings because those may also contain whitespace...


 +ret = ieee80211_ioctl_set_stop_data_frame_tx(dev, value);

Ditto.

johannes

Re: [PATCH] Add wireless statics to bcm43xx-d80211

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 16:15 -0500, Larry Finger wrote:
  +  int maxssi;
  
  Why is maxssi here? Can it really change between received frames?
 
 No it cannot change between frames; however, the max value can be very different for different
 drivers using d80211. On the bcm43xx, it is 60; whereas 100 seems to be a better value for the
 rt2x00 chips. Adding it here seemed like a good way to handle this situation. Do you suggest
 something else?

I think the question was intended to be: why is it in
ieee80211_rx_status and not in struct ieee80211_hw?

 Again to pass the differing values for different drivers.

Ditto here. Just stick it into struct ieee80211_hw instead.

  I would suggest using -110 dBm as a floor (to be compatible with RCPI
  definition, see mail from Simon Barber describing it). Or is there any
  particular reason for -104 dBm?
 
 It is the value previously used in the softmac version of bcm43xx. A value of -110 would obviously
 be better.

Who maintains the softmac version now? :P
I'd suggest just changing it there too for consistency.

johannes
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] bcm43xx-softmac - set correct value in mac_suspended for ifdown/ifup sequence

2006-08-23 Thread Michael Buesch

On Wednesday 23 August 2006 00:07, Larry Finger wrote:
 John,
 
 Please apply this to wireless-2.6.
 
 Michael - bcm43xx-d80211 probably needs this as well.
 
 Larry
 
 ---
 
 When bcm43xx-softmac is given an ifdown/ifup sequence, the value for bcm->mac_suspended ends up
 wrong, which leads to a large number of assert(bcm->mac_suspended >= 0) messages. This one-line patch
 fixes this problem.

I think the following is the correct fix for the issue.
It is already in the d80211 branch. (Seems like it got lost somehow).
Can you test this?

Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===================================================================
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c	2006-08-23 10:00:27.0 +0200
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c	2006-08-23 10:01:45.0 +0200
@@ -3349,6 +3349,8 @@
 	memset(bcm->dma_reason, 0, sizeof(bcm->dma_reason));
 	bcm->irq_savedstate = BCM43xx_IRQ_INITIAL;
 
+	bcm->mac_suspended = 1;
+
 	/* Noise calculation context */
 	memset(&bcm->noisecalc, 0, sizeof(bcm->noisecalc));
 

-- 
Greetings Michael.

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread David Miller

From: Andrew Morton [EMAIL PROTECTED]
Date: Wed, 23 Aug 2006 00:07:58 -0700

 I wonder whether designing-in a millisecond granularity is the right thing
 to do.  If in a few years the kernel is running tickless with high-res clock
 interrupt sources, that might look a bit lumpy.

 Switching it to a __u64 nanosecond counter would be basically free on
 64-bit machines, and not very expensive on 32-bit, no?

If it ends up in a structure we'll need to use the aligned_u64 type
in order to avoid problems with 32-bit x86 binaries running on 64-bit
kernels.
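
To illustrate the layout problem: on 32-bit x86 a 64-bit member only needs 4-byte alignment, so the same structure can have different offsets for a 32-bit binary and a 64-bit kernel. The sketch below uses made-up structure and field names; as far as I know the kernel's aligned_u64 is a __u64 typedef carrying the same aligned(8) attribute used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain layout: 'ns' may land at offset 4 on i386 and at offset 8 on x86-64. */
struct timeout_plain {
	uint32_t flags;
	uint64_t ns;
};

/* Forcing 8-byte alignment gives offset 8 and size 16 on both ABIs. */
struct timeout_aligned {
	uint32_t flags;
	uint64_t ns __attribute__((aligned(8)));
};
```

Checking offsetof(struct timeout_aligned, ns) in a 32-bit and a 64-bit build should give 8 in both, which is what keeps a 32-bit binary and a 64-bit kernel in agreement without a compat translation layer.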

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Ian McDonald


I wonder whether designing-in a millisecond granularity is the right thing
to do.  If in a few years the kernel is running tickless with high-res clock
interrupt sources, that might look a bit lumpy.


I'd second that - when working on DCCP I've done a lot of the work in
microseconds and it made quite a difference instead of milliseconds
because of its design.

I haven't followed kevents in great detail but it sounds like
something that could be useful for me with higher resolution timers
than milliseconds.
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 12:07:58AM -0700, Andrew Morton ([EMAIL PROTECTED]) 
wrote:
 On Wed, 23 Aug 2006 10:56:59 +0400
 Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 
  On Wed, Aug 23, 2006 at 02:43:50AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
  wrote:
   Actually, I didn't miss that, it is an orthogonal issue. A timespec
   timeout parameter for the syscall does not imply the use of timespec
   in any timer event, etc. Nor is there any timespec timer in kqueue's
   struct kevent, which is the only (interface related) thing that will
   be exposed.
  
  void * in structure exported to userspace is forbidden.
  long in syscall requires wrapper in per-arch code (although that
  workaround _is_ there, it does not mean that broken interface should 
  be used).
  poll uses milliseconds - it is perfectly ok.
 
 I wonder whether designing-in a millisecond granularity is the right thing
 to do.  If in a few years the kernel is running tickless with high-res clock
 interrupt sources, that might look a bit lumpy.
 
 Switching it to a __u64 nanosecond counter would be basically free on
 64-bit machines, and not very expensive on 32-bit, no?

I can use nanoseconds for the timer interval too (with aligned_u64 as David
mentioned), and for the timeout value as well - 64-bit nanoseconds gives a
range of about 584 years, which is probably enough.
Structures with u64 are really not such a good idea.
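
For scale, the range of an unsigned 64-bit nanosecond counter can be checked with a few lines of arithmetic. A minimal sketch, assuming 365-day years; u64_nsec_range_years() is just an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Whole 365-day years representable by a 64-bit nanosecond counter. */
static uint64_t u64_nsec_range_years(void)
{
	uint64_t seconds = UINT64_MAX / NSEC_PER_SEC;	/* 18446744073 */

	return seconds / (365ULL * 24 * 60 * 60);	/* 31536000 s per year */
}
```

This returns 584, so overflow is not a practical concern for either a timeout or a timer interval.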

-- 
Evgeniy Polyakov

Re: The Proposed Linux kevent API (was: Re: [take12 0/3] kevent: Generic event handling mechanism.)

2006-08-23 Thread Nicholas Miell

On Wed, 2006-08-23 at 10:22 +0400, Evgeniy Polyakov wrote:
 On Tue, Aug 22, 2006 at 06:36:07PM -0700, Nicholas Miell ([EMAIL PROTECTED]) 
 wrote:
  int kevent_get_events(int ctl_fd, unsigned int min_nr,
  unsigned int max_nr, unsigned int timeout,
  void *buf, unsigned flags)
  
  ctl_fd is the file descriptor referring to the kevent queue.
  
  min_nr is the minimum number of completed events that
 kevent_get_events will block waiting for.
  
  max_nr is the number of struct ukevent in buf.
  
  timeout is the number of milliseconds to wait before returning less
  than min_nr events. If this is -1, I *think* it'll wait
  indefinitely, but I'm not sure that msecs_to_jiffies(-1) ends
  up being MAX_SCHEDULE_TIMEOUT
 
 You forgot the case of a non-blocking file descriptor.
 Here is the comment from the code:
 
  * In nonblocking mode it returns as many events as possible, but not more than @max_nr.
  * In blocking mode it waits until timeout or if at least @min_nr events are ready.

I missed that, but why bother with O_NONBLOCK? It appears to make the
timeout parameter completely unnecessary, which means you could just
make timeout = 0 give you the nonblocking behavior, and non-zero the
blocking behavior (leaving -1 as wait forever).
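
Spelled out, that convention would look something like the sketch below. The enum and classify_timeout() are hypothetical names, and the parameter is a signed int here so that -1 is representable, whereas the proposed syscall takes an unsigned int.

```c
#include <assert.h>

enum wait_mode { WAIT_NONE, WAIT_BOUNDED, WAIT_FOREVER };

/* Hypothetical classification of the timeout argument (milliseconds). */
static enum wait_mode classify_timeout(int timeout_ms)
{
	if (timeout_ms == 0)
		return WAIT_NONE;	/* poll: return whatever is ready */
	if (timeout_ms == -1)
		return WAIT_FOREVER;	/* block until min_nr events arrive */
	return WAIT_BOUNDED;		/* block for at most timeout_ms */
}
```

This mirrors how poll() treats its timeout argument, so no separate O_NONBLOCK state on the queue descriptor is needed.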

  buf is a pointer to an array of struct ukevent. Why it is of type void*
  and not struct ukevent* is a mystery.
  
  flags is unused.
  
  When called, kevent_get_events will wait timeout milliseconds for at
  least min_nr completed events, copying completed struct ukevents to
  buf and deleting any KEVENT_REQ_ONESHOT event requests.
  
  
  The bulk of the interface is entirely done through the ukevent struct.
  It is used to add event requests, modify existing event requests,
  specify which event requests to remove, and return completed events.
  
  struct ukevent contains the following members:
  
  struct kevent_id id
 This is described as containing the socket number, file
 descriptor and so on, which I take to mean it contains an fd,
 however for some mysterious reason struct kevent_id contains
 __u32 raw[2] and (for KEVENT_POLL events) the actual fd is
 placed in raw[0] and raw[1] is never mentioned except to
 faithfully copy it around.
  
 For KEVENT_TIMER events, raw[0] contains a relative time in
 milliseconds and raw[1] is still not used.
  
 Why the struct member is called raw remains a mystery.
 
 If you had followed the previous patchsets, you would have seen that there
 were network AIO, fs IO and fs-inotify-like notifications.
 Some of them use those fields.
 I used two u32 numbers so they can be unioned with a pointer, like the user data is.
 That pointer should be obtained through Ulrich's dma_alloc() and
 friends.
 
  __u32 type
The actual event type, either KEVENT_POLL for fd polling or
KEVENT_TIMER for timers.
  
  __u32 event
For events of type KEVENT_POLL, event contains the polling flags
of interest (i.e. POLLIN, POLLPRI, POLLOUT, POLLERR, POLLHUP,
POLLNVAL).
  
For events of type KEVENT_TIMER, event is ignored.
  
  __u32 req_flags
Per-event request flags. Currently, this may be 0 or
KEVENT_REQ_ONESHOT to specify that the event be removed after it
is fired.
  
  __u32 ret_flags
Per-event return flags. This may be 0 or a combination of
KEVENT_RET_DONE if the event has completed or
KEVENT_RET_BROKEN if the event is broken, which I take to mean
any sort of error condition. DONE|BROKEN is a valid state, but I
don't really know what it means.
 
 DONE means that event processing is completed and it can be read back to
 userspace, if in addition it contains BROKEN it means that kevent is
 broken.

So KEVENT_RET_DONE is purely an internal thing? And what does
KEVENT_RET_BROKEN mean, exactly?

  __u32 ret_data[2]
Event return data. This is unused by KEVENT_POLL events, while
KEVENT_TIMER inexplicably places jiffies in ret_data[0]. If the
event is broken, an error code is placed in ret_data[1].
 
 Each kevent user can place here any hints it wants, for example network
 socket notifications place there length of the accept queue and so on.

I didn't document what it could theoretically be used for, just what it
is actually used for.

 In error condition error is placed there too.
 
  union { __u32 user[2]; void *ptr; }
An anonymous union (which is a fairly recent C addition)
containing data saved for the user and otherwise ignored by the
kernel.
  
  For KEVENT_CTL_ADD, all fields relevant to the event type must be
  filled (id, type, possibly event, req_flags). After kevent_ctl(...,
  KEVENT_CTL_ADD, ...) returns each struct's ret_flags should be
  checked to see if the event is already broken or done.
  
  For KEVENT_CTL_MODIFY, the id, req_flags, and user and event fields
  must be set and an existing kevent request

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Nicholas Miell

On Wed, 2006-08-23 at 00:35 -0700, David Miller wrote:
 From: Andrew Morton [EMAIL PROTECTED]
 Date: Wed, 23 Aug 2006 00:07:58 -0700

  I wonder whether designing-in a millisecond granularity is the right thing
  to do.  If in a few years the kernel is running tickless with high-res clock
  interrupt sources, that might look a bit lumpy.

  Switching it to a __u64 nanosecond counter would be basically free on
  64-bit machines, and not very expensive on 32-bit, no?

 If it ends up in a structure we'll need to use the aligned_u64 type
 in order to avoid problems with 32-bit x86 binaries running on 64-bit
 kernels.

Perhaps

struct timespec64
{
uint64_t tv_sec __attribute__((aligned(8)));
uint32_t tv_nsec;
}

with a snide remark about gcc in the comments?

-- 
Nicholas Miell [EMAIL PROTECTED]


Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Jari Sundell


On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:

void * in structure exported to userspace is forbidden.


The only void * I see belongs to the user (udata); perhaps you are
talking about something different?


long in syscall requires wrapper in per-arch code (although that
workaround _is_ there, it does not mean that broken interface should
be used).
poll uses milliseconds - it is perfectly ok.


The kernel is there to hide those ugly implementation details from the
user, so I don't care that much about a workaround being required in
some cases. More important, IMHO is consistency with the POSIX system
calls.

I guess as long as you use usec, at least it won't be a pain to use.

Rakshasa

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 10:22:06AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
wrote:
 On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 void * in structure exported to userspace is forbidden.
 
 Only void * I'm seeing belongs to the user, (udata) perhaps you are
 talking of something different?

Yes, exactly about it.

I put union {
    u32 a[2];
    void *b;
}
especially to eliminate that problem.

And I'm not that sure about stuff like uptr_t or how they call pointers
in userspace and kernelspace.
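
A minimal userspace sketch of why such a union helps, assuming pointers are either 4 or 8 bytes wide: the two u32 words fix the member's size at 8 bytes, so the enclosing structure has the same size for 32-bit and 64-bit callers. The names kevent_user_data and user_data_set_ptr are made up for this illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the user-data member described above. */
union kevent_user_data {
    uint32_t a[2];
    void *b;
};

/* A 4-byte pointer fits inside a[2]; an 8-byte pointer exactly covers it. */
_Static_assert(sizeof(union kevent_user_data) == 8,
               "same size for 32-bit and 64-bit callers");

/* Store a pointer, clearing both words first so no byte is left undefined. */
static void user_data_set_ptr(union kevent_user_data *u, void *p)
{
    u->a[0] = 0;
    u->a[1] = 0;
    u->b = p;
}
```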

 long in syscall requires wrapper in per-arch code (although that
 workaround _is_ there, it does not mean that broken interface should
 be used).
 poll uses milliseconds - it is perfectly ok.
 
 The kernel is there to hide those ugly implementation details from the
 user, so I don't care that much about a workaround being required in
 some cases. More important, IMHO is consistency with the POSIX system
 calls.
 
 I guess as long as you use usec, at least it won't be a pain to use.

Andrew suggested using nanoseconds there in a u64 variable.
I think it is ok.

 Rakshasa

-- 
Evgeniy Polyakov

Re: [take12 1/3] kevent: Core files.

2006-08-23 Thread Eric Dumazet

Hello Evgeniy

I have one comment/suggestion (minor detail, your work is very good)

I suggest to add one item in kevent_registered_callbacks[], so that 
kevent_registered_callbacks[KEVENT_MAX] is valid and can act as a fallback.

In kevent_add_callbacks() you could replace the eventual NULL pointers by 
kevent_break() in 
kevent_registered_callbacks[pos].{callback, enqueue, dequeue}
like :

+int kevent_add_callbacks(const struct kevent_callbacks *cb, unsigned int pos)
+{
+   struct kevent_callbacks *p = &kevent_registered_callbacks[pos];
+   if (pos >= KEVENT_MAX)
+       return -EINVAL;
+   p->enqueue = (cb->enqueue) ? cb->enqueue : kevent_break;
+   p->dequeue = (cb->dequeue) ? cb->dequeue : kevent_break;
+   p->callback = (cb->callback) ? cb->callback : kevent_break;
+   printk(KERN_INFO "KEVENT: Added callbacks for type %u.\n", pos);
+   return 0;
+}

(I also added a const qualifier in first function argument, and unsigned int 
pos so that the if (pos >= KEVENT_MAX) test catches 'negative' values)

Then you change kevent_break() to return -EINVAL instead of 0.

+int kevent_break(struct kevent *k)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&k->ulock, flags);
+   k->event.ret_flags |= KEVENT_RET_BROKEN;
+   spin_unlock_irqrestore(&k->ulock, flags);
+   return -EINVAL;
+}

Then avoid the tests in kevent_enqueue()

+int kevent_enqueue(struct kevent *k)
+{
+   return k->callbacks.enqueue(k);
+}

And avoid the tests in  kevent_dequeue()

+int kevent_dequeue(struct kevent *k)
+{
+   return k->callbacks.dequeue(k);
+}

And change kevent_init() to

+int kevent_init(struct kevent *k)
+{
+   spin_lock_init(&k->ulock);
+   k->flags = 0;
+
+   if (unlikely(k->event.type >= KEVENT_MAX))
+       k->event.type = KEVENT_MAX;
+
+
+   k->callbacks = kevent_registered_callbacks[k->event.type];
+   if (unlikely(k->callbacks.callback == kevent_break))
+       return kevent_break(k);
+
+   return 0;
+}
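
The idea behind the pieces above can be tried outside the kernel. The following is only a userspace sketch of the same pattern, not the kevent code: the table has one extra slot that acts as the catch-all, every missing handler is replaced by a "break" function that returns -EINVAL, and the dispatch helpers then need no NULL checks. All names (ev, ev_table, ev_break, ev_ok, EV_MAX) are hypothetical, and locking is left out.

```c
#include <errno.h>
#include <stddef.h>

#define EV_MAX 4   /* hypothetical number of real event types */

struct ev;
typedef int (*ev_fn)(struct ev *);

struct ev_callbacks {
    ev_fn enqueue, dequeue, callback;
};

struct ev {
    unsigned int type;
    int broken;
    struct ev_callbacks cb;
};

/* Fallback for every missing handler: mark the event broken, report an error. */
static int ev_break(struct ev *e)
{
    e->broken = 1;
    return -EINVAL;
}

/* Sample handler that always succeeds, used only to exercise the table. */
static int ev_ok(struct ev *e)
{
    (void)e;
    return 0;
}

/* EV_MAX + 1 entries: index EV_MAX is the slot for unknown types. */
#define EV_BROKEN { ev_break, ev_break, ev_break }
static struct ev_callbacks ev_table[EV_MAX + 1] = {
    EV_BROKEN, EV_BROKEN, EV_BROKEN, EV_BROKEN, EV_BROKEN,
};

/* Register handlers for one type, substituting ev_break for NULL entries. */
static int ev_add_callbacks(const struct ev_callbacks *cb, unsigned int pos)
{
    if (pos >= EV_MAX)
        return -EINVAL;
    ev_table[pos].enqueue  = cb->enqueue  ? cb->enqueue  : ev_break;
    ev_table[pos].dequeue  = cb->dequeue  ? cb->dequeue  : ev_break;
    ev_table[pos].callback = cb->callback ? cb->callback : ev_break;
    return 0;
}

/* Clamp unknown types to the fallback slot, then copy the handlers. */
static int ev_init(struct ev *e)
{
    e->broken = 0;
    if (e->type >= EV_MAX)
        e->type = EV_MAX;
    e->cb = ev_table[e->type];
    if (e->cb.callback == ev_break)
        return ev_break(e);
    return 0;
}

/* No NULL test needed: the pointer is always valid. */
static int ev_enqueue(struct ev *e)
{
    return e->cb.enqueue(e);
}
```

The trade-off is one extra table entry and an indirect call on the error path, in exchange for removing a branch from every enqueue and dequeue.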



Eric Dumazet

Re: [take12 1/3] kevent: Core files.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 10:51:36AM +0200, Eric Dumazet ([EMAIL PROTECTED]) 
wrote:
 Hello Evgeniy

Hi Eric.
 
 I have one comment/suggestion (minor detail, your work is very good)
 
 I suggest to add one item in kevent_registered_callbacks[], so that 
 kevent_registered_callbacks[KEVENT_MAX] is valid and can act as a fallback.

Sounds good, could you please send appliable patch with proper
signed-off line?

 Eric Dumazet

-- 
Evgeniy Polyakov

Re: [take12 1/3] kevent: Core files.

2006-08-23 Thread Eric Dumazet

On Wednesday 23 August 2006 11:18, Evgeniy Polyakov wrote:
 On Wed, Aug 23, 2006 at 10:51:36AM +0200, Eric Dumazet ([EMAIL PROTECTED]) 
wrote:
  Hello Evgeniy

 Hi Eric.

  I have one comment/suggestion (minor detail, your work is very good)
 
  I suggest to add one item in kevent_registered_callbacks[], so that
  kevent_registered_callbacks[KEVENT_MAX] is valid and can act as a
  fallback.

 Sounds good, could you please send appliable patch with proper
 signed-off line?

Unfortunately not at this moment, I'm quite busy at work, my boss will kill 
me :( .
If you find this good, please add it to your next patch submission or forget 
it. 

Thank you
Eric


Re: [take12 1/3] kevent: Core files.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 11:23:52AM +0200, Eric Dumazet ([EMAIL PROTECTED]) 
wrote:
  Sounds good, could you please send appliable patch with proper
  signed-off line?
 
 Unfortunately not at this moment, I'm quite busy at work, my boss will kill 
 me :( .
 If you find this good, please add it to your next patch submission or forget 
 it. 

Ok, I will try to get it from pieces in e-mail.

 Thank you
 Eric

-- 
Evgeniy Polyakov

[2.6.19 PATCH 0/7] ehea: IBM eHEA Ethernet Device Driver

2006-08-23 Thread Jan-Bernd Themann

Hi,

in this latest version of the IBM eHEA Ethernet Device Driver
we removed all unnecessary variable initializations and did some
code cleanup. We hope we have addressed all of your suggestions.
Please feel free to send us further feedback. We highly
appreciate your efforts.

Thanks,
Jan-Bernd

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
Changelog-by:  Jan-Bernd Themann [EMAIL PROTECTED]

Differences to patch set http://www.spinics.net/lists/netdev/msg12702.html

Changelog:

- Unnecessary variable initializations removed
- Promiscuous mode support included

 drivers/net/Kconfig             |    9
 drivers/net/Makefile            |    1
 drivers/net/ehea/Makefile       |    7
 drivers/net/ehea/ehea.h         |  438
 drivers/net/ehea/ehea_ethtool.c |  244
 drivers/net/ehea/ehea_hcall.h   |   51
 drivers/net/ehea/ehea_hw.h      |  290
 drivers/net/ehea/ehea_main.c    | 2677
 drivers/net/ehea/ehea_phyp.c    |  784
 drivers/net/ehea/ehea_phyp.h    |  463
 drivers/net/ehea/ehea_qmr.c     |  607
 drivers/net/ehea/ehea_qmr.h     |  361
 12 files changed, 5932 insertions(+)

[2.6.19 PATCH 3/7] ehea: queue management

2006-08-23 Thread Jan-Bernd Themann

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED] 


 drivers/net/ehea/ehea_qmr.c |  607 
 drivers/net/ehea/ehea_qmr.h |  361 ++
 2 files changed, 968 insertions(+)



--- linux-2.6.18-rc4-git1-orig/drivers/net/ehea/ehea_qmr.c  1969-12-31 16:00:00.0 -0800
+++ kernel/drivers/net/ehea/ehea_qmr.c  2006-08-23 02:01:00.282229545 -0700
@@ -0,0 +1,607 @@
+/*
+ *  linux/drivers/net/ehea/ehea_qmr.c
+ *
+ *  eHEA ethernet device driver for IBM eServer System p
+ *
+ *  (C) Copyright IBM Corp. 2006
+ *
+ *  Authors:
+ *   Christoph Raisch [EMAIL PROTECTED]
+ *   Jan-Bernd Themann [EMAIL PROTECTED]
+ *   Thomas Klein [EMAIL PROTECTED]
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include "ehea.h"
+#include "ehea_phyp.h"
+#include "ehea_qmr.h"
+
+static void *hw_qpageit_get_inc(struct hw_queue *queue)
+{
+   void *retvalue = hw_qeit_get(queue);
+
+   queue->current_q_offset += queue->pagesize;
+   if (queue->current_q_offset > queue->queue_length) {
+       queue->current_q_offset -= queue->pagesize;
+       retvalue = NULL;
+   } else if (((u64) retvalue) & (EHEA_PAGESIZE-1)) {
+       ehea_error("not on pageboundary");
+       retvalue = NULL;
+   }
+   return retvalue;
+}
+
+static int hw_queue_ctor(struct hw_queue *queue, const u32 nr_of_pages,
+                         const u32 pagesize, const u32 qe_size)
+{
+   int pages_per_kpage = PAGE_SIZE / pagesize;
+   int i, k;
+
+   if ((pagesize > PAGE_SIZE) || (!pages_per_kpage)) {
+       ehea_error("pagesize conflict! kernel pagesize=%d, "
+                  "ehea pagesize=%d", (int)PAGE_SIZE, (int)pagesize);
+       return -EINVAL;
+   }
+
+   queue->queue_length = nr_of_pages * pagesize;
+   queue->queue_pages = kmalloc(nr_of_pages * sizeof(void*), GFP_KERNEL);
+   if (!queue->queue_pages) {
+       ehea_error("no mem for queue_pages");
+       return -ENOMEM;
+   }
+
+   /*
+    * allocate pages for queue:
+    * outer loop allocates whole kernel pages (page aligned) and
+    * inner loop divides a kernel page into smaller hea queue pages
+    */
+   i = 0;
+   while (i < nr_of_pages) {
+       u8 *kpage = (u8*)get_zeroed_page(GFP_KERNEL);
+       if (!kpage)
+           goto exit0;
+       for (k = 0; k < pages_per_kpage && i < nr_of_pages; k++) {
+           (queue->queue_pages)[i] = (struct ehea_page *)kpage;
+           kpage += pagesize;
+           i++;
+       }
+   }
+
+   queue->current_q_offset = 0;
+   queue->qe_size = qe_size;
+   queue->pagesize = pagesize;
+   queue->toggle_state = 1;
+
+   return 0;
+
+exit0:
+   for (i = 0; i < nr_of_pages; i += pages_per_kpage) {
+       if (!(queue->queue_pages)[i])
+           break;
+       free_page((unsigned long)(queue->queue_pages)[i]);
+   }
+   return -ENOMEM;
+}
+
+static void hw_queue_dtor(struct hw_queue *queue)
+{
+   int pages_per_kpage = PAGE_SIZE / queue->pagesize;
+   int i, nr_pages;
+
+   if (!queue || !queue->queue_pages)
+       return;
+
+   nr_pages = queue->queue_length / queue->pagesize;
+
+   for (i = 0; i < nr_pages; i += pages_per_kpage)
+       free_page((unsigned long)(queue->queue_pages)[i]);
+
+   kfree(queue->queue_pages);
+}
+
+struct ehea_cq *ehea_create_cq(struct ehea_adapter *adapter,
+                               int nr_of_cqe, u64 eq_handle, u32 cq_token)
+{
+   struct ehea_cq *cq;
+   struct h_epa epa;
+   u64 *cq_handle_ref, hret, rpage;
+   u32 act_nr_of_entries, act_pages, counter;
+   int ret;
+   void *vpage;
+
+   cq = kzalloc(sizeof(*cq), GFP_KERNEL);
+   if (!cq) {
+       ehea_error("no mem for cq");
+       goto create_cq_exit0;
+   }
+
+   cq->attr.max_nr_of_cqes = nr_of_cqe;
+   cq->attr.cq_token = cq_token;
+   cq->attr.eq_handle = eq_handle;
+
+   cq->adapter = adapter;
+
+   cq_handle_ref = &cq->fw_handle;
+   act_nr_of_entries = 0;
+   act_pages = 0;
+
+   hret = ehea_h_alloc_resource_cq(adapter->handle, cq, &cq->attr,
+
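
The page-splitting loop in hw_queue_ctor() above can be sketched in userspace as follows. This is an illustration only: calloc() stands in for kmalloc() and get_zeroed_page(), KPAGE_SIZE is an assumed kernel page size, and the hwq names are invented. Unlike the patch, the sketch zero-initialises the pointer array so that the cleanup loop can stop at the first unused slot, and it frees the array on the error path; both are choices of this sketch.

```c
#include <stdlib.h>

#define KPAGE_SIZE 4096u   /* assumed kernel page size for this sketch */

struct hwq {
    void **pages;          /* one pointer per hardware queue page */
    unsigned int nr_pages;
    unsigned int pagesize;
};

/* Free every backing allocation (one per KPAGE_SIZE chunk), then the array. */
static void hwq_dtor(struct hwq *q)
{
    unsigned int per_kpage, i;

    if (!q || !q->pages)
        return;
    per_kpage = KPAGE_SIZE / q->pagesize;
    for (i = 0; i < q->nr_pages && q->pages[i]; i += per_kpage)
        free(q->pages[i]);
    free(q->pages);
    q->pages = NULL;
}

/* Outer loop allocates whole chunks, inner loop carves queue pages from them. */
static int hwq_ctor(struct hwq *q, unsigned int nr_pages, unsigned int pagesize)
{
    unsigned int per_kpage, i = 0, k;

    if (nr_pages == 0 || pagesize == 0 || pagesize > KPAGE_SIZE)
        return -1;
    per_kpage = KPAGE_SIZE / pagesize;

    q->pages = calloc(nr_pages, sizeof(void *));
    if (!q->pages)
        return -1;
    q->nr_pages = nr_pages;
    q->pagesize = pagesize;

    while (i < nr_pages) {
        unsigned char *kpage = calloc(1, KPAGE_SIZE);

        if (!kpage)
            goto fail;
        for (k = 0; k < per_kpage && i < nr_pages; k++) {
            q->pages[i++] = kpage;
            kpage += pagesize;
        }
    }
    return 0;

fail:
    hwq_dtor(q);
    return -1;
}
```

With five 1024-byte queue pages, the first four entries point into one 4096-byte chunk and the fifth starts a second chunk, which is why the destructor steps through the array in strides of per_kpage.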

[2.6.19 PATCH 6/7] ehea: eHEA Makefile

2006-08-23 Thread Jan-Bernd Themann

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED] 


 drivers/net/ehea/Makefile |7 +++
 1 file changed, 7 insertions(+)



--- linux-2.6.18-rc4-git1-orig/drivers/net/ehea/Makefile    1969-12-31 16:00:00.0 -0800
+++ kernel/drivers/net/ehea/Makefile    2006-08-23 02:00:58.124819734 -0700
@@ -0,0 +1,7 @@
+#
+# Makefile for the eHEA ethernet device driver for IBM eServer System p
+#
+
+ehea-y = ehea_main.o ehea_phyp.o ehea_qmr.o ehea_ethtool.o ehea_phyp.o
+obj-$(CONFIG_EHEA) += ehea.o
+

Re: [RFC] add nl80211

2006-08-23 Thread Johannes Berg

On Tue, 2006-08-22 at 15:52 +0200, Johannes Berg wrote:
 + dev = dev_get_by_index(ifindex);
 + result = nl80211_drv_by_priv_locked(dev->ieee80211_ptr);
 + dev_put(dev);
 + if (result)
 + return result;
 + err = -ENODEV;

Doh, bug (not checking dev != NULL) fixed in my local patch, will submit
new ones later.

johannes

Re: [PATCH 16/18] d80211: get rid of MICHAEL_MIC_HWACCEL define

2006-08-23 Thread Jiri Benc

On Wed, 23 Aug 2006 09:05:30 +0200, Johannes Berg wrote:
 On Tue, 2006-08-22 at 21:00 +0200, Jiri Benc wrote:
  When you're touching this, could you #ifdef out wpa_test when
  CONFIG_HOSTAPD_WPA_TESTING is not defined? This could be a part of this
  patch.
 
 I thought about it but deemed it too ugly. And wouldn't the compiler
 optimize it away anyway? Thing is, it's used in tests, and having
 #ifdefs within if (...) tests is hugely ugly.

#ifdefs within if tests are ugly, I agree. I thought about restructuring
these ifs. Okay, we will go with your patch and address this later.
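
One common way to restructure such tests, shown here only as a sketch with invented names (CONFIG_WPA_TESTING, test_state, wpa_test_active and process_frame are not taken from d80211), is to hide the #ifdef behind a small inline helper. When the option is off the helper returns a constant, so the compiler can discard the branch and the if statements themselves stay free of preprocessor lines.

```c
#include <stdbool.h>

/* Hypothetical test state; not a d80211 structure. */
struct test_state {
    int wpa_trigger;
};

#ifdef CONFIG_WPA_TESTING
/* Testing build: the trigger is really consulted. */
static inline bool wpa_test_active(const struct test_state *s)
{
    return s->wpa_trigger != 0;
}
#else
/* Normal build: constant false, so callers' branches can be optimized out. */
static inline bool wpa_test_active(const struct test_state *s)
{
    (void)s;
    return false;
}
#endif

/* The caller contains no #ifdef at all. */
static int process_frame(const struct test_state *s, int frame_ok)
{
    if (wpa_test_active(s))
        return -1;              /* injected test failure */
    return frame_ok ? 0 : -2;
}
```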

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: [PATCH] Add wireless statics to bcm43xx-d80211

2006-08-23 Thread Jiri Benc

On Tue, 22 Aug 2006 16:15:47 -0500, Larry Finger wrote:
 No it cannot change between frames; however, the max value can be very
 different for different drivers using d80211. On the bcm43xx, it is 60;
 whereas 100 seems to be a better value for the rt2x00 chips. Adding it
 here seemed like a good way to handle this situation. Do you suggest
 something else?

As Johannes already said, it should be in ieee80211_hw. 

  [...]
  --- a/net/d80211/ieee80211_i.h
  +++ b/net/d80211/ieee80211_i.h
  @@ -337,6 +337,9 @@ struct ieee80211_local {
 struct net_device *apdev; /* wlan#ap - management frames (hostapd) */
 int open_count;
 int monitors;
  +  int link_quality;
  +  int noise;
  +  struct iw_statistics wstats;
  
  Why are these three variables in ieee80211_local? They are not used
  anywhere.
 
 You are right about the first two; however, wstats is used in the new
 routine ieee80211_get_wireless_stats.

Oh, get_wireless_stats returns struct iw_statistics, so it needs to be
allocated all the time. Grr, that's stupid...

Okay, wstats really needs to be in ieee80211_local then.

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

[PATCH] d80211: clean up exports

2006-08-23 Thread Johannes Berg

This puts all EXPORT_SYMBOL() macros along with the function being exported.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

--- wireless-dev.orig/net/d80211/ieee80211.c    2006-08-23 10:35:03.0 +0200
+++ wireless-dev/net/d80211/ieee80211.c 2006-08-23 10:35:03.0 +0200
@@ -267,6 +267,7 @@ int ieee80211_get_hdrlen(u16 fc)
 
return hdrlen;
 }
+EXPORT_SYMBOL(ieee80211_get_hdrlen);
 
 
 int ieee80211_get_hdrlen_from_skb(struct sk_buff *skb)
@@ -281,6 +282,7 @@ int ieee80211_get_hdrlen_from_skb(struct
return 0;
return hdrlen;
 }
+EXPORT_SYMBOL(ieee80211_get_hdrlen_from_skb);
 
 
 #ifdef IEEE80211_VERBOSE_DEBUG_FRAME_DUMP
@@ -1816,6 +1818,7 @@ struct sk_buff * ieee80211_beacon_get(st
    ap->num_beacons++;
return skb;
 }
+EXPORT_SYMBOL(ieee80211_beacon_get);
 
 
 struct sk_buff *
@@ -1882,6 +1885,7 @@ ieee80211_get_buffered_bc(struct net_dev
 
return skb;
 }
+EXPORT_SYMBOL(ieee80211_get_buffered_bc);
 
 static int __ieee80211_if_config(struct net_device *dev,
 struct sk_buff *beacon)
@@ -1966,6 +1970,7 @@ struct ieee80211_conf *ieee80211_get_hw_
    struct ieee80211_local *local = dev->ieee80211_ptr;
    return &local->conf;
 }
+EXPORT_SYMBOL(ieee80211_get_hw_conf);
 
 
 static int ieee80211_change_mtu(struct net_device *dev, int new_mtu)
@@ -2097,6 +2102,7 @@ struct dev_mc_list *ieee80211_get_mc_lis
*ptr = sdata;
return mc;
 }
+EXPORT_SYMBOL(ieee80211_get_mc_list_item);
 
 static struct net_device_stats *ieee80211_get_stats(struct net_device *dev)
 {
@@ -2644,6 +2650,7 @@ int ieee80211_radar_status(struct net_de
ieee80211_rx_mgmt(dev, skb, 0, ieee80211_msg_radar);
return 0;
 }
+EXPORT_SYMBOL(ieee80211_radar_status);
 
 
 int ieee80211_set_aid_for_sta(struct net_device *dev, u8 *peer_address,
@@ -2667,6 +2674,7 @@ int ieee80211_set_aid_for_sta(struct net
 ieee80211_rx_mgmt(dev, skb, 0, ieee80211_msg_set_aid_for_sta);
 return 0;
 }
+EXPORT_SYMBOL(ieee80211_set_aid_for_sta);
 
 
 static void ap_sta_ps_start(struct net_device *dev, struct sta_info *sta)
@@ -3692,6 +3700,7 @@ void __ieee80211_rx(struct net_device *d
if (sta)
sta_info_put(sta);
 }
+EXPORT_SYMBOL(__ieee80211_rx);
 
 
 static ieee80211_txrx_result
@@ -3865,6 +3874,7 @@ void ieee80211_rx_irqsafe(struct net_dev
    skb_queue_tail(&local->skb_queue, skb);
    tasklet_schedule(&local->tasklet);
 }
+EXPORT_SYMBOL(ieee80211_rx_irqsafe);
 
 
 void ieee80211_tx_status_irqsafe(struct net_device *dev, struct sk_buff *skb,
@@ -3894,6 +3904,7 @@ void ieee80211_tx_status_irqsafe(struct 
}
    tasklet_schedule(&local->tasklet);
 }
+EXPORT_SYMBOL(ieee80211_tx_status_irqsafe);
 
 
 static void ieee80211_tasklet_handler(unsigned long data)
@@ -4145,6 +4156,7 @@ void ieee80211_tx_status(struct net_devi
 /* Send frame to hostapd */
 ieee80211_rx_mgmt(dev, skb, NULL, msg_type);
 }
+EXPORT_SYMBOL(ieee80211_tx_status);
 
 
 /* TODO: implement register/unregister functions for adding TX/RX handlers
@@ -4397,6 +4409,7 @@ struct net_device *ieee80211_alloc_hw(si
 
return mdev;
 }
+EXPORT_SYMBOL(ieee80211_alloc_hw);
 
 
 int ieee80211_register_hw(struct net_device *dev, struct ieee80211_hw *hw)
@@ -4508,6 +4521,7 @@ fail_sysfs:
ieee80211_dev_free_index(local);
return result;
 }
+EXPORT_SYMBOL(ieee80211_register_hw);
 
 int ieee80211_update_hw(struct net_device *dev, struct ieee80211_hw *hw)
 {
@@ -4539,6 +4553,7 @@ int ieee80211_update_hw(struct net_devic
 
return 0;
 }
+EXPORT_SYMBOL(ieee80211_update_hw);
 
 
 void ieee80211_unregister_hw(struct net_device *dev)
@@ -4599,6 +4614,7 @@ void ieee80211_unregister_hw(struct net_
ieee80211_dev_free_index(local);
ieee80211_led_exit(local);
 }
+EXPORT_SYMBOL(ieee80211_unregister_hw);
 
 void ieee80211_free_hw(struct net_device *dev)
 {
@@ -4608,6 +4624,7 @@ void ieee80211_free_hw(struct net_device
ieee80211_wep_free(local);
ieee80211_dev_free(local);
 }
+EXPORT_SYMBOL(ieee80211_free_hw);
 
 void ieee80211_release_hw(struct ieee80211_local *local)
 {
@@ -4654,6 +4671,7 @@ int ieee80211_netif_oper(struct net_devi
 
 return 0;
 }
+EXPORT_SYMBOL(ieee80211_netif_oper);
 
 void ieee80211_wake_queue(struct net_device *dev, int queue)
 {
@@ -4668,6 +4686,7 @@ void ieee80211_wake_queue(struct net_dev
__netif_schedule(dev);
}
 }
+EXPORT_SYMBOL(ieee80211_wake_queue);
 
 void ieee80211_stop_queue(struct net_device *dev, int queue)
 {
@@ -4675,6 +4694,7 @@ void ieee80211_stop_queue(struct net_dev
 
    set_bit(IEEE80211_LINK_STATE_XOFF, &local->state[queue]);
 }
+EXPORT_SYMBOL(ieee80211_stop_queue);
 
 void ieee80211_start_queues(struct net_device *dev)
 {
@@ -4684,12 +4704,14 @@ void ieee80211_start_queues(struct net_d
    for (i = 0; i < local->hw->queues; i++)
        clear_bit(IEEE80211_LINK_STATE_XOFF, &local->state[i]);
 }

Re: [RFT] sky2: transmit complete alternative

2006-08-23 Thread Jon Wikne


Stephen Hemminger wrote:

Does the following get rid of the hang?

Recode the transmit completion handling to avoid races between the hardware
status report mechanism and the interrupt handler. Rather than relying on
the index value in the status ring, read the chip register and cleanup
all completed transmits. 


Reduce the transmit lock window smaller to allow more parallelism.


[ patch ]

I'm afraid not. :-(

1) The number of bytes before the hang occurs seems to have _decreased_
   rather than the opposite. This observation is, however, a question
   of statistics, and I have better statistics for the previous version.

2) Previously, the sequence /sbin/ifdown eth0 -> /sbin/ifup eth0 made
   the driver recover. Now, the latter of these commands hangs the
   whole system. The recovery is now /sbin/ifdown eth0 -> rmmod sky2
   -> modprobe sky2


-- Jon


[PATCH] d80211: get rid of sta_aid in favour of keeping track of TIM

2006-08-23 Thread Johannes Berg

This patch gets rid of the HUGE sta_aid array that was there in the
access point structure and instead keeps track of the TIM. Also
reduces stack usage of the ieee80211_beacon_add_tim() function
considerably, and fixes a bug where removing a station that had
frames buffered wouldn't update the hardware TIM (if necessary).

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

--- wireless-dev.orig/net/d80211/ieee80211.c    2006-08-23 10:35:02.0 +0200
+++ wireless-dev/net/d80211/ieee80211.c 2006-08-23 10:35:03.0 +0200
@@ -22,6 +22,7 @@
 #include <linux/mutex.h>
 #include <net/iw_handler.h>
 #include <linux/compiler.h>
+#include <linux/bitmap.h>
 
 #include <net/d80211.h>
 #include <net/d80211_common.h>
@@ -1044,8 +1045,12 @@ ieee80211_tx_h_unicast_ps_buf(struct iee
    } else
        tx->local->total_ps_buffered++;
    /* Queue frame to be sent after STA sends an PS Poll frame */
-   if (skb_queue_empty(&sta->ps_tx_buf) && tx->local->hw->set_tim)
-       tx->local->hw->set_tim(tx->dev, sta->aid, 1);
+   if (skb_queue_empty(&sta->ps_tx_buf)) {
+       if (tx->local->hw->set_tim)
+           tx->local->hw->set_tim(tx->dev, sta->aid, 1);
+       if (tx->sdata->bss)
+           bss_tim_set(tx->local, tx->sdata->bss, sta->aid);
+   }
    pkt_data = (struct ieee80211_tx_packet_data *)tx->skb->cb;
    pkt_data->jiffies = jiffies;
    skb_queue_tail(&sta->ps_tx_buf, tx->skb);
@@ -1676,25 +1681,16 @@ static void ieee80211_beacon_add_tim(str
 {
    u8 *pos, *tim;
    int aid0 = 0;
-   int i, num_bits = 0, n1, n2;
-   u8 bitmap[251];
+   int i, have_bits = 0, n1, n2;
 
    /* Generate bitmap for TIM only if there are any STAs in power save
     * mode. */
-   if (atomic_read(&bss->num_sta_ps) > 0 && bss->max_aid > 0) {
-       memset(bitmap, 0, sizeof(bitmap));
-       spin_lock_bh(&local->sta_lock);
-       for (i = 0; i < bss->max_aid; i++) {
-           if (bss->sta_aid[i] &&
-               (!skb_queue_empty(&bss->sta_aid[i]->ps_tx_buf) ||
-                !skb_queue_empty(&bss->sta_aid[i]->tx_filtered)))
-           {
-               bitmap[(i + 1) / 8] |= 1 << (i + 1) % 8;
-               num_bits++;
-           }
-       }
-       spin_unlock_bh(&local->sta_lock);
-   }
+   spin_lock_bh(&local->sta_lock);
+   if (atomic_read(&bss->num_sta_ps) > 0)
+       /* in the hope that this is faster than
+        * checking byte-for-byte */
+       have_bits = !bitmap_empty((unsigned long*)bss->tim,
+                                 MAX_AID_TABLE_SIZE+1);
 
    if (bss->dtim_count == 0)
        bss->dtim_count = bss->dtim_period - 1;
@@ -1707,40 +1703,40 @@ static void ieee80211_beacon_add_tim(str
    *pos++ = bss->dtim_count;
    *pos++ = bss->dtim_period;
 
-   if (bss->dtim_count == 0 && !skb_queue_empty(&bss->ps_bc_buf)) {
+   if (bss->dtim_count == 0 && !skb_queue_empty(&bss->ps_bc_buf))
        aid0 = 1;
-   }
 
-   if (num_bits) {
+   if (have_bits) {
        /* Find largest even number N1 so that bits numbered 1 through
         * (N1 x 8) - 1 in the bitmap are 0 and number N2 so that bits
         * (N2 + 1) x 8 through 2007 are 0. */
        n1 = 0;
-       for (i = 0; i < sizeof(bitmap); i++) {
-           if (bitmap[i]) {
+       for (i = 0; i < IEEE80211_MAX_TIM_LEN; i++) {
+           if (bss->tim[i]) {
                n1 = i & 0xfe;
                break;
            }
        }
        n2 = n1;
-       for (i = sizeof(bitmap) - 1; i >= n1; i--) {
-           if (bitmap[i]) {
+       for (i = IEEE80211_MAX_TIM_LEN - 1; i >= n1; i--) {
+           if (bss->tim[i]) {
                n2 = i;
                break;
            }
        }
 
        /* Bitmap control */
-       *pos++ = n1 | (aid0 ? 1 : 0);
+       *pos++ = n1 | aid0;
        /* Part Virt Bitmap */
-       memcpy(pos, bitmap + n1, n2 - n1 + 1);
+       memcpy(pos, bss->tim + n1, n2 - n1 + 1);
 
        tim[1] = n2 - n1 + 4;
        skb_put(skb, n2 - n1);
    } else {
-       *pos++ = aid0 ? 1 : 0; /* Bitmap control */
+       *pos++ = aid0; /* Bitmap control */
        *pos++ = 0; /* Part Virt Bitmap */
    }
+   spin_unlock_bh(&local->sta_lock);
 }
 
 
@@ -2702,8 +2698,12 @@ static int ap_sta_ps_end(struct net_devi
    atomic_dec(&sdata->bss->num_sta_ps);
    sta->flags &= ~(WLAN_STA_PS | WLAN_STA_TIM);
    sta->pspoll = 0;
-   if

Re: [PATCH] d80211: get rid of sta_aid in favour of keeping track of TIM

2006-08-23 Thread Johannes Berg

On Wed, 2006-08-23 at 12:04 +0200, Johannes Berg wrote:

 +/* miscellaneous IEEE 802.11 constants */
  #define IEEE80211_MAX_FRAG_THRESHOLD 2346
  #define IEEE80211_MAX_RTS_THRESHOLD 2347
 +#define IEEE80211_MAX_TIM_LEN 251

Nice, but I'm using it wrongly :) I'll send a fixed patch.

johannes

[PATCH ] d80211: get rid of sta_aid in favour of keeping track of TIM

2006-08-23 Thread Johannes Berg

This patch gets rid of the HUGE sta_aid array that was there in the
access point structure and instead keeps track of the TIM. Also
reduces stack usage of the ieee80211_beacon_add_tim() function
considerably, and fixes a bug where removing a station that had
frames buffered wouldn't update the hardware TIM (if necessary).

It also removes the MAX_AID_TABLE_SIZE pseudo-configuration option
(it was a define with a comment indicating it could be changed)
since now having all AIDs available is no longer expensive.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

--- wireless-dev.orig/net/d80211/ieee80211.c    2006-08-23 10:58:26.0 +0200
+++ wireless-dev/net/d80211/ieee80211.c 2006-08-23 12:09:53.0 +0200
@@ -22,6 +22,7 @@
 #include <linux/mutex.h>
 #include <net/iw_handler.h>
 #include <linux/compiler.h>
+#include <linux/bitmap.h>
 
 #include <net/d80211.h>
 #include <net/d80211_common.h>
@@ -1044,8 +1045,12 @@ ieee80211_tx_h_unicast_ps_buf(struct iee
    } else
        tx->local->total_ps_buffered++;
    /* Queue frame to be sent after STA sends an PS Poll frame */
-   if (skb_queue_empty(&sta->ps_tx_buf) && tx->local->hw->set_tim)
-       tx->local->hw->set_tim(tx->dev, sta->aid, 1);
+   if (skb_queue_empty(&sta->ps_tx_buf)) {
+       if (tx->local->hw->set_tim)
+           tx->local->hw->set_tim(tx->dev, sta->aid, 1);
+       if (tx->sdata->bss)
+           bss_tim_set(tx->local, tx->sdata->bss, sta->aid);
+   }
    pkt_data = (struct ieee80211_tx_packet_data *)tx->skb->cb;
    pkt_data->jiffies = jiffies;
    skb_queue_tail(&sta->ps_tx_buf, tx->skb);
@@ -1676,25 +1681,16 @@ static void ieee80211_beacon_add_tim(str
 {
    u8 *pos, *tim;
    int aid0 = 0;
-   int i, num_bits = 0, n1, n2;
-   u8 bitmap[251];
+   int i, have_bits = 0, n1, n2;
 
    /* Generate bitmap for TIM only if there are any STAs in power save
     * mode. */
-   if (atomic_read(&bss->num_sta_ps) > 0 && bss->max_aid > 0) {
-       memset(bitmap, 0, sizeof(bitmap));
-       spin_lock_bh(&local->sta_lock);
-       for (i = 0; i < bss->max_aid; i++) {
-           if (bss->sta_aid[i] &&
-               (!skb_queue_empty(&bss->sta_aid[i]->ps_tx_buf) ||
-                !skb_queue_empty(&bss->sta_aid[i]->tx_filtered)))
-           {
-               bitmap[(i + 1) / 8] |= 1 << (i + 1) % 8;
-               num_bits++;
-           }
-       }
-       spin_unlock_bh(&local->sta_lock);
-   }
+   spin_lock_bh(&local->sta_lock);
+   if (atomic_read(&bss->num_sta_ps) > 0)
+       /* in the hope that this is faster than
+        * checking byte-for-byte */
+       have_bits = !bitmap_empty((unsigned long*)bss->tim,
+                                 IEEE80211_MAX_AID+1);
 
    if (bss->dtim_count == 0)
        bss->dtim_count = bss->dtim_period - 1;
@@ -1707,40 +1703,40 @@ static void ieee80211_beacon_add_tim(str
    *pos++ = bss->dtim_count;
    *pos++ = bss->dtim_period;
 
-   if (bss->dtim_count == 0 && !skb_queue_empty(&bss->ps_bc_buf)) {
+   if (bss->dtim_count == 0 && !skb_queue_empty(&bss->ps_bc_buf))
        aid0 = 1;
-   }
 
-   if (num_bits) {
+   if (have_bits) {
        /* Find largest even number N1 so that bits numbered 1 through
         * (N1 x 8) - 1 in the bitmap are 0 and number N2 so that bits
         * (N2 + 1) x 8 through 2007 are 0. */
        n1 = 0;
-       for (i = 0; i < sizeof(bitmap); i++) {
-           if (bitmap[i]) {
+       for (i = 0; i < IEEE80211_MAX_TIM_LEN; i++) {
+           if (bss->tim[i]) {
                n1 = i & 0xfe;
                break;
            }
        }
        n2 = n1;
-       for (i = sizeof(bitmap) - 1; i >= n1; i--) {
-           if (bitmap[i]) {
+       for (i = IEEE80211_MAX_TIM_LEN - 1; i >= n1; i--) {
+           if (bss->tim[i]) {
                n2 = i;
                break;
            }
        }
 
        /* Bitmap control */
-       *pos++ = n1 | (aid0 ? 1 : 0);
+       *pos++ = n1 | aid0;
        /* Part Virt Bitmap */
-       memcpy(pos, bitmap + n1, n2 - n1 + 1);
+       memcpy(pos, bss->tim + n1, n2 - n1 + 1);
 
        tim[1] = n2 - n1 + 4;
        skb_put(skb, n2 - n1);
    } else {
-       *pos++ = aid0 ? 1 : 0; /* Bitmap control */
+       *pos++ = aid0; /* Bitmap control */
        *pos++ = 0; /* Part Virt Bitmap */
    }
+   spin_unlock_bh(&local->sta_lock);
 }
 
 
@@ -2702,8 +2698,12 @@
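
The bookkeeping this patch moves to can be sketched in plain C. The following is not the d80211 code: tim_set(), tim_clear() and tim_window() are hypothetical helpers, TIM_LEN mirrors the IEEE80211_MAX_TIM_LEN value of 251 from the patch, and locking is omitted. It keeps one bit per AID in a persistent bitmap and computes the N1/N2 window of the partial virtual bitmap the same way the loops in ieee80211_beacon_add_tim() do.

```c
#include <stdint.h>

#define TIM_LEN 251   /* mirrors IEEE80211_MAX_TIM_LEN from the patch */

/* Mark that frames are buffered for the station with this AID. */
static void tim_set(uint8_t *tim, unsigned int aid)
{
    if (aid < TIM_LEN * 8)
        tim[aid / 8] |= (uint8_t)(1u << (aid % 8));
}

/* Clear the bit again once the station's buffers are empty. */
static void tim_clear(uint8_t *tim, unsigned int aid)
{
    if (aid < TIM_LEN * 8)
        tim[aid / 8] &= (uint8_t)~(1u << (aid % 8));
}

/*
 * Find the byte window [n1, n2] that must be sent in the beacon:
 * n1 is the first non-zero byte rounded down to an even index,
 * n2 is the last non-zero byte. Returns the window length in bytes,
 * or 0 (leaving n1/n2 untouched) when no bit is set.
 */
static int tim_window(const uint8_t *tim, int *n1_out, int *n2_out)
{
    int i, n1 = -1, n2;

    for (i = 0; i < TIM_LEN; i++) {
        if (tim[i]) {
            n1 = i & 0xfe;
            break;
        }
    }
    if (n1 < 0)
        return 0;

    n2 = n1;
    for (i = TIM_LEN - 1; i >= n1; i--) {
        if (tim[i]) {
            n2 = i;
            break;
        }
    }
    *n1_out = n1;
    *n2_out = n2;
    return n2 - n1 + 1;
}
```

For example, with only AID 24 set, the bit lives in byte 3, so the window is bytes 2 through 3; adding AID 1 widens it to bytes 0 through 3.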

[PATCH] b44: fix eeprom endianess issue

2006-08-23 Thread Michael Buesch

Hi Andrew,

Please apply this patch to -mm for testing.
I think in the long term we want to convert b44 to use the
new ssb backend driver, which would also fix the issue, but
for now I think this small fix is best.

Please note that this patch is only compile tested, as
I don't have a b44 device.

--

This fixes eeprom read on big-endian architectures.

Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: linux-2.6/drivers/net/b44.c
===================================================================
--- linux-2.6.orig/drivers/net/b44.c    2006-08-22 11:27:56.0 +0200
+++ linux-2.6/drivers/net/b44.c 2006-08-23 12:26:31.0 +0200
@@ -2055,7 +2055,7 @@
    u16 *ptr = (u16 *) data;
 
    for (i = 0; i < 128; i += 2)
-       ptr[i / 2] = readw(bp->regs + 4096 + i);
+       ptr[i / 2] = cpu_to_le16(readw(bp->regs + 4096 + i));
 
return 0;
 }
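
To illustrate what the conversion achieves, here is a small userspace sketch under the assumption that the 16-bit values have already been read in host byte order. It writes each value into a byte buffer in little-endian order, so the buffer contents are identical on little-endian and big-endian hosts. The helpers store_le16() and eeprom_copy() are hypothetical; the patch itself uses the kernel's cpu_to_le16().

```c
#include <stddef.h>
#include <stdint.h>

/* Store a host-order 16-bit value as two little-endian bytes. */
static void store_le16(uint8_t *dst, uint16_t v)
{
    dst[0] = (uint8_t)(v & 0xff);   /* low byte first */
    dst[1] = (uint8_t)(v >> 8);     /* high byte second */
}

/* regs stands in for the values the driver would obtain from readw(). */
static void eeprom_copy(uint8_t *data, const uint16_t *regs, size_t nwords)
{
    size_t i;

    for (i = 0; i < nwords; i++)
        store_le16(data + 2 * i, regs[i]);
}
```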


-- 
Greetings Michael.

Re: [PATCH] b44: fix eeprom endianess issue

2006-08-23 Thread Michael Buesch

Oh, I think I should have CCed Jeff ;)
Sorry.

On Wednesday 23 August 2006 12:32, Michael Buesch wrote:
 Hi Andrew,
 
 Please apply this patch to -mm for testing.
 I think in the long term we want to convert b44 to use the
 new ssb backend driver, which would also fix the issue, but
 for now I think this small fix is best.
 
 Please note that this test is only compile tested, as
 I don't have a b44 device.
 
 --
 
 This fixes eeprom read on big-endian architectures.
 
 Signed-off-by: Michael Buesch [EMAIL PROTECTED]
 
 Index: linux-2.6/drivers/net/b44.c
 ===
 --- linux-2.6.orig/drivers/net/b44.c  2006-08-22 11:27:56.0 +0200
 +++ linux-2.6/drivers/net/b44.c   2006-08-23 12:26:31.0 +0200
 @@ -2055,7 +2055,7 @@
   u16 *ptr = (u16 *) data;
  
    for (i = 0; i < 128; i += 2)
 -      ptr[i / 2] = readw(bp->regs + 4096 + i);
 +      ptr[i / 2] = cpu_to_le16(readw(bp->regs + 4096 + i));
  
   return 0;
  }
 
 

-- 
Greetings Michael.

multicast group memberships purge on interface delete

2006-08-23 Thread Michal Ruzicka


Hello there,
I've got the following question/suggestion:

The situation today:
When an interface is deleted and some multicast groups happen to have been
joined on it, only the interface's list of multicast memberships is deleted.
The sockets through which the groups were joined and, more importantly, their
associated multicast membership lists are left untouched. This makes it
difficult for the function that handles leaving multicast groups on a socket
to decide what to do with groups that were joined on such an interface (one
that no longer exists). The present implementation is a kind of best guess
(and nothing better can probably be done about that). It may even fail to
leave an affected group (a group that was joined on a deleted interface)
completely and thus block a slot in the socket's multicast membership list,
whose size is purposely limited.


My question/suggestion:
Would it be feasible to drop the relevant entries from sockets' multicast
membership lists on interface delete? Yes, I do realize it would require
walking through a number of sockets to see if there is any multicast entry
for the interface in question to delete. But this could be optimized by
maintaining a list of sockets that have a multicast group joined on the
interface (and keeping a pointer to this list in the device structure). This
would ease the job of the function handling leaving multicast groups, make
its behaviour more deterministic, and make possible errors reported by it more
meaningful/reliable.


Notes:
- The suggested approach is reportedly taken by other OSes (notably NetBSD).
The fact that Linux doesn't behave the same poses a problem for cross-platform
software, because the behaviour of different systems differs in one more detail.
- The suggested list of sockets that have a multicast group joined on the
interface could probably also be of some help when maintaining the
per-interface multicast source filter list or per-interface multicast
reception state as per RFC 3376 (IGMPv3) section 3.2.


Thanks
Michal Ruzicka 


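
The suggestion in this message (keep, per interface, a list of the memberships joined on it, and purge them when the interface goes away) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct membership, struct sock_mc, struct iface and all function names are hypothetical and do not correspond to kernel structures, and leaving a single group (which would need list unlinking) is omitted.

```c
#include <stddef.h>

#define MAX_MEMBERSHIPS 4	/* illustrative per-socket limit */

struct iface;

/* One joined group, owned by a socket and linked onto its interface. */
struct membership {
	struct iface *ifp;		/* interface the group was joined on */
	unsigned int group;		/* group address */
	struct membership *next_on_if;	/* next membership on the same interface */
	int in_use;			/* slot occupied? */
};

struct sock_mc {
	struct membership slots[MAX_MEMBERSHIPS];
};

struct iface {
	struct membership *joined;	/* every membership joined on this interface */
};

/* Join: claim a free slot in the socket and hook it onto the interface list. */
static int mc_join(struct sock_mc *sk, struct iface *ifp, unsigned int group)
{
	int i;

	for (i = 0; i < MAX_MEMBERSHIPS; i++) {
		struct membership *m = &sk->slots[i];

		if (m->in_use)
			continue;
		m->ifp = ifp;
		m->group = group;
		m->in_use = 1;
		m->next_on_if = ifp->joined;
		ifp->joined = m;
		return 0;
	}
	return -1;	/* membership list full */
}

/* Interface delete: release every socket slot that referenced it.
 * Returns the number of memberships dropped. */
static int iface_purge(struct iface *ifp)
{
	struct membership *m = ifp->joined;
	int dropped = 0;

	while (m) {
		struct membership *next = m->next_on_if;

		m->in_use = 0;
		m->ifp = NULL;
		m->next_on_if = NULL;
		dropped++;
		m = next;
	}
	ifp->joined = NULL;
	return dropped;
}

/* Number of occupied slots on a socket. */
static int mc_count(const struct sock_mc *sk)
{
	int i, n = 0;

	for (i = 0; i < MAX_MEMBERSHIPS; i++)
		n += sk->slots[i].in_use;
	return n;
}
```

The purge visits only memberships that actually reference the deleted interface, so no scan over all sockets is needed and no slot stays blocked afterwards.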

[2.6.19 PATCH 4/7] ehea: ethtool interface

2006-08-23 Thread Jan-Bernd Themann

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED] 


 drivers/net/ehea/ehea_ethtool.c |  244 
 1 file changed, 244 insertions(+)



--- linux-2.6.18-rc4-git1-orig/drivers/net/ehea/ehea_ethtool.c  1969-12-31 
16:00:00.0 -0800
+++ kernel/drivers/net/ehea/ehea_ethtool.c  2006-08-23 02:01:00.363230269 
-0700
@@ -0,0 +1,244 @@
+/*
+ *  linux/drivers/net/ehea/ehea_ethtool.c
+ *
+ *  eHEA ethernet device driver for IBM eServer System p
+ *
+ *  (C) Copyright IBM Corp. 2006
+ *
+ *  Authors:
+ *   Christoph Raisch [EMAIL PROTECTED]
+ *   Jan-Bernd Themann [EMAIL PROTECTED]
+ *   Thomas Klein [EMAIL PROTECTED]
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include "ehea.h"
+#include "ehea_phyp.h"
+
+
+static int netdev_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+	u64 hret;
+	struct ehea_port *port = netdev_priv(dev);
+	struct ehea_adapter *adapter = port->adapter;
+	struct hcp_ehea_port_cb4 *cb4;
+
+	cb4 = kzalloc(H_CB_ALIGNMENT, GFP_KERNEL);
+	if (!cb4) {
+		ehea_error("no mem for cb4");
+		return -ENOMEM;
+	}
+
+	hret = ehea_h_query_ehea_port(adapter->handle, port->logical_port_id,
+				      H_PORT_CB4, H_PORT_CB4_ALL, cb4);
+	if (hret != H_SUCCESS) {
+		ehea_error("query_ehea_port failed");
+		kfree(cb4);
+		return -EIO;
+	}
+
+	if (netif_msg_hw(port))
+		ehea_dump(cb4, sizeof(*cb4), "netdev_get_settings");
+
+	if (netif_carrier_ok(dev)) {
+		switch(cb4->port_speed){
+		case H_PORT_SPEED_10M_H:
+			cmd->speed = SPEED_10;
+			cmd->duplex = DUPLEX_HALF;
+			break;
+		case H_PORT_SPEED_10M_F:
+			cmd->speed = SPEED_10;
+			cmd->duplex = DUPLEX_FULL;
+			break;
+		case H_PORT_SPEED_100M_H:
+			cmd->speed = SPEED_100;
+			cmd->duplex = DUPLEX_HALF;
+			break;
+		case H_PORT_SPEED_100M_F:
+			cmd->speed = SPEED_100;
+			cmd->duplex = DUPLEX_FULL;
+			break;
+		case H_PORT_SPEED_1G_F:
+			cmd->speed = SPEED_1000;
+			cmd->duplex = DUPLEX_FULL;
+			break;
+		case H_PORT_SPEED_10G_F:
+			cmd->speed = SPEED_10000;
+			cmd->duplex = DUPLEX_FULL;
+			break;
+		}
+	} else {
+		cmd->speed = -1;
+		cmd->duplex = -1;
+	}
+
+	cmd->supported = (SUPPORTED_10000baseT_Full | SUPPORTED_1000baseT_Full
+		       | SUPPORTED_100baseT_Full |  SUPPORTED_100baseT_Half
+		       | SUPPORTED_10baseT_Full | SUPPORTED_10baseT_Half
+		       | SUPPORTED_Autoneg | SUPPORTED_FIBRE);
+
+	cmd->advertising = (ADVERTISED_10000baseT_Full | ADVERTISED_Autoneg
+			 | ADVERTISED_FIBRE);
+
+	cmd->port = PORT_FIBRE;
+	cmd->autoneg = AUTONEG_ENABLE;
+
+	kfree(cb4);
+	return 0;
+}
+
+static void netdev_get_drvinfo(struct net_device *dev,
+			       struct ethtool_drvinfo *info)
+{
+	strlcpy(info->driver, DRV_NAME, sizeof(info->driver) - 1);
+	strlcpy(info->version, DRV_VERSION, sizeof(info->version) - 1);
+}
+
+static u32 netdev_get_msglevel(struct net_device *dev)
+{
+	struct ehea_port *port = netdev_priv(dev);
+	return port->msg_enable;
+}
+
+static void netdev_set_msglevel(struct net_device *dev, u32 value)
+{
+	struct ehea_port *port = netdev_priv(dev);
+	port->msg_enable = value;
+}
+
+static char ehea_ethtool_stats_keys[][ETH_GSTRING_LEN] = {
+	{"poll_max_processed"},
+	{"queue_stopped"},
+	{"min_swqe_avail"},
+	{"poll_receive_err"},
+	{"pkt_send"},
+	{"pkt_xmit"},
+	{"send_tasklet"},
+	{"ehea_poll"},
+	{"nwqe"},
+	{"swqe_available_0"},
+	{"sig_comp_iv"},
+	{"rxo"},
+	{"rx64"},
+	{"rx65"},
+	{"rx128"},
+	{"rx256"},
+	{"rx512"},
+	{"rx1024"},
+	{"txo"},
+

[2.6.19 PATCH 7/7] ehea: Makefile Kconfig

2006-08-23 Thread Jan-Bernd Themann

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED] 


 drivers/net/Kconfig  |9 +
 drivers/net/Makefile |1 +
 2 files changed, 10 insertions(+)



diff -Nurp -X dontdiff linux-2.6.18-rc4-git1/drivers/net/Kconfig 
patched_kernel/drivers/net/Kconfig
--- linux-2.6.18-rc4-git1/drivers/net/Kconfig   2006-08-06 11:20:11.0 
-0700
+++ patched_kernel/drivers/net/Kconfig  2006-08-22 06:00:49.545435280 -0700
@@ -2277,6 +2277,15 @@ config CHELSIO_T1
   To compile this driver as a module, choose M here: the module
   will be called cxgb.
 
+config EHEA
+	tristate "eHEA Ethernet support"
+	depends on IBMEBUS
+	---help---
+	  This driver supports the IBM pSeries eHEA ethernet adapter.
+
+	  To compile the driver as a module, choose M here. The module
+	  will be called ehea.
+
 config IXGB
 	tristate "Intel(R) PRO/10GbE support"
 	depends on PCI
diff -Nurp -X dontdiff linux-2.6.18-rc4-git1/drivers/net/Makefile 
patched_kernel/drivers/net/Makefile
--- linux-2.6.18-rc4-git1/drivers/net/Makefile  2006-08-06 11:20:11.0 
-0700
+++ patched_kernel/drivers/net/Makefile 2006-08-22 05:53:59.254861851 -0700
@@ -10,6 +10,7 @@ obj-$(CONFIG_E1000) += e1000/
 obj-$(CONFIG_IBM_EMAC) += ibm_emac/
 obj-$(CONFIG_IXGB) += ixgb/
 obj-$(CONFIG_CHELSIO_T1) += chelsio/
+obj-$(CONFIG_EHEA) += ehea/
 obj-$(CONFIG_BONDING) += bonding/
 obj-$(CONFIG_GIANFAR) += gianfar_driver.o
 

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Jari Sundell


On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:

On Wed, Aug 23, 2006 at 10:22:06AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
wrote:
 On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 void * in structure exported to userspace is forbidden.

 Only void * I'm seeing belongs to the user, (udata) perhaps you are
 talking of something different?

Yes, exactly about it.

I put union {
u32 a[2];
void *b;
}
especially to eliminate that problem.


It's just random data of a known maximum size appended to the struct,
I'm sure you can find a clean way to handle it. If you mangle the
first variable name in your union, you'll end up with something that
should be usable instead of udata.


And I'm not that sure about stuff like uptr_t or how they call pointers
in userspace and kernelspace.


Well, I can't find any use of pointers in your struct ukevent, nor in
any of the kqueue events in my man page. So if this is a deficit it
applies to both, I guess?


ukevent is aligned to 8 bytes already (its size is selected to be 40 bytes),
so it should not be a problem.

 Eric


Even if it is so, wouldn't it be better to be explicit about it?

Rakshasa

[2.6.19 PATCH 2/7] ehea: pHYP interface

2006-08-23 Thread Jan-Bernd Themann

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED] 


 drivers/net/ehea/ehea_hcall.h |   51 ++
 drivers/net/ehea/ehea_phyp.c  |  784 ++
 drivers/net/ehea/ehea_phyp.h  |  463 
 3 files changed, 1298 insertions(+)



--- linux-2.6.18-rc4-git1-orig/drivers/net/ehea/ehea_phyp.c 1969-12-31 
16:00:00.0 -0800
+++ kernel/drivers/net/ehea/ehea_phyp.c 2006-08-23 02:01:00.086227795 -0700
@@ -0,0 +1,784 @@
+/*
+ *  linux/drivers/net/ehea/ehea_phyp.c
+ *
+ *  eHEA ethernet device driver for IBM eServer System p
+ *
+ *  (C) Copyright IBM Corp. 2006
+ *
+ *  Authors:
+ *   Christoph Raisch [EMAIL PROTECTED]
+ *   Jan-Bernd Themann [EMAIL PROTECTED]
+ *   Thomas Klein [EMAIL PROTECTED]
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include "ehea_phyp.h"
+
+
+static inline u16 get_order_of_qentries(u16 queue_entries)
+{
+	u8 ld = 1;		/*  logarithmus dualis */
+	while (((1U << ld) - 1) < queue_entries)
+		ld++;
+	return ld - 1;
+}
+
+
+/* Defines for H_CALL H_ALLOC_RESOURCE */
+#define H_ALL_RES_TYPE_QP	1
+#define H_ALL_RES_TYPE_CQ	2
+#define H_ALL_RES_TYPE_EQ	3
+#define H_ALL_RES_TYPE_MR	5
+#define H_ALL_RES_TYPE_MW	6
+
+static long ehea_hcall_9arg_9ret(unsigned long opcode,
+				 unsigned long arg1, unsigned long arg2,
+				 unsigned long arg3, unsigned long arg4,
+				 unsigned long arg5, unsigned long arg6,
+				 unsigned long arg7, unsigned long arg8,
+				 unsigned long arg9, unsigned long *out1,
+				 unsigned long *out2,unsigned long *out3,
+				 unsigned long *out4,unsigned long *out5,
+				 unsigned long *out6,unsigned long *out7,
+				 unsigned long *out8,unsigned long *out9)
+{
+	long hret;
+	int i, sleep_msecs;
+
+	for (i = 0; i < 5; i++) {
+		hret = plpar_hcall_9arg_9ret(opcode,arg1, arg2, arg3, arg4,
+					     arg5, arg6, arg7, arg8, arg9, out1,
+					     out2, out3, out4, out5, out6, out7,
+					     out8, out9);
+		if (H_IS_LONG_BUSY(hret)) {
+			sleep_msecs = get_longbusy_msecs(hret);
+			msleep_interruptible(sleep_msecs);
+			continue;
+		}
+
+		if (hret < H_SUCCESS)
+			ehea_error("op=%lx hret=%lx "
+				   "i1=%lx i2=%lx i3=%lx i4=%lx i5=%lx i6=%lx "
+				   "i7=%lx i8=%lx i9=%lx "
+				   "o1=%lx o2=%lx o3=%lx o4=%lx o5=%lx o6=%lx "
+				   "o7=%lx o8=%lx o9=%lx",
+				   opcode, hret, arg1, arg2, arg3, arg4, arg5,
+				   arg6, arg7, arg8, arg9, *out1, *out2, *out3,
+				   *out4, *out5, *out6, *out7, *out8, *out9);
+		return hret;
+	}
+	return H_BUSY;
+}
+
+u64 ehea_h_query_ehea_qp(const u64 hcp_adapter_handle, const u8 qp_category,
+			 const u64 qp_handle, const u64 sel_mask, void *cb_addr)
+{
+	u64 dummy;
+
+	if ((((u64)cb_addr) & (PAGE_SIZE - 1)) != 0) {
+		ehea_error("not on pageboundary");
+		return H_PARAMETER;
+	}
+
+	return ehea_hcall_9arg_9ret(H_QUERY_HEA_QP,
+				    hcp_adapter_handle,		/* R4 */
+				    qp_category,		/* R5 */
+				    qp_handle,			/* R6 */
+				    sel_mask,			/* R7 */
+				    virt_to_abs(cb_addr),	/* R8 */
+				    0, 0, 0, 0,			/* R9-R12 */
+				    &dummy,			/* R4 */
+				    &dummy,			/* R5 */
+				    &dummy,			/* R6 */
+
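
As a reading aid for get_order_of_qentries() in the patch above, here is a userspace copy with standard integer types. It assumes the operators lost in this archive copy are `<<` and `<`; the name order_of_qentries() is chosen here for illustration. Under that assumption the helper returns floor(log2(n)) for n >= 1 (and 0 for n == 0).

```c
#include <stdint.h>

/* Userspace copy of the queue-order helper: grows ld until 2^ld - 1 is no
 * longer smaller than queue_entries, then steps back by one. */
static inline uint16_t order_of_qentries(uint16_t queue_entries)
{
	uint8_t ld = 1;		/* candidate binary logarithm */

	while (((1U << ld) - 1) < queue_entries)
		ld++;
	return ld - 1;
}
```

Note the rounding direction: a request for 1023 entries yields order 9 (512), not 10, so callers that need at least the requested number of entries would have to round up themselves.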

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Andi Kleen

Evgeniy Polyakov [EMAIL PROTECTED] writes:
 
 Let's then place there a structure with 64bit seconds and nanoseconds,
 similar to timespec, but without longs there.

You need 64bit (or at least more than 32bit) for the seconds,
otherwise you add a y2038 problem which would be sad in new code.
Remember you might be still alive then ;-)

Ok one could argue that on 32bit architectures 2038 is so deeply
embedded that it doesn't make much difference, but I still
think it would be better to not readd it to new interfaces there.

64bit longs on 32bit is fine, as long as you use aligned_u64,
never long long or u64 (which has varying alignment between i386 and x86-64)

-Andi
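
The alignment point made in this message can be checked with a small layout test. This is a sketch, assuming GCC or Clang: my_aligned_u64 is a userspace stand-in for the kernel's aligned_u64, and the two structs are invented for the demonstration. A plain 64-bit member following a 32-bit one lands at offset 4 on i386 but at offset 8 on x86-64; forcing 8-byte alignment gives the same layout on both.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's aligned_u64 (GCC/Clang attribute syntax). */
typedef uint64_t __attribute__((aligned(8))) my_aligned_u64;

/* Layout depends on the ABI: nsec sits at offset 4 on i386, 8 on x86-64. */
struct ts_plain {
	uint32_t flags;
	uint64_t nsec;
};

/* Layout is fixed: nsec sits at offset 8 and the struct is 16 bytes
 * on both ABIs, so 32-bit userspace and a 64-bit kernel agree. */
struct ts_aligned {
	uint32_t flags;
	my_aligned_u64 nsec;
};

_Static_assert(offsetof(struct ts_aligned, nsec) == 8,
	       "aligned member must start on an 8-byte boundary");
_Static_assert(sizeof(struct ts_aligned) == 16,
	       "aligned struct must have a fixed size");
```

Compiling the same file with -m32 and -m64 and printing offsetof(struct ts_plain, nsec) shows the difference for the plain variant.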

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 11:58:20AM +0200, Andi Kleen ([EMAIL PROTECTED]) wrote:
 Evgeniy Polyakov [EMAIL PROTECTED] writes:
  
  Let's then place there a structure with 64bit seconds and nanoseconds,
  similar to timespec, but without longs there.
 
 You need 64bit (or at least more than 32bit) for the seconds,
 otherwise you add a y2038 problem which would be sad in new code.
 Remember you might be still alive then ;-)

I hope so :)

 Ok one could argue that on 32bit architectures 2038 is so deeply
 embedded that it doesn't make much difference, but I still
 think it would be better to not readd it to new interfaces there.
 
 64bit longs on 32bit is fine, as long as you use aligned_u64,
 never long long or u64 (which has varying alignment between i386 and x86-64)

Btw, aligned_u64 is not exported to userspace.
I committed a change with __u64 nanoseconds without any structures.
Do we really need a structure?

 -Andi

-- 
Evgeniy Polyakov

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 11:49:22AM +0200, Jari Sundell ([EMAIL PROTECTED]) 
wrote:
  Only void * I'm seeing belongs to the user, (udata) perhaps you are
  talking of something different?
 
 Yes, exactly about it.
 
 I put union {
 u32 a[2];
 void *b;
 }
 especially to eliminate that problem.
 
 It's just random data of a known maximum size appended to the struct,
 I'm sure you can find a clean way to handle it. If you mangle the
 first variable name in your union, you'll end up with something that
 should be usable instead of udata.

If there will be usual pointer, size of the whole structure will be
different in kernel and userspace.

 And I'm not that sure about stuff like uptr_t or how they call pointers
 in userspace and kernelspace.
 
 Well, I can't find any use of pointers in your struct ukevent, nor in
 any of the kqueue events in my man page. So if this is a deficit it
 applies to both, I guess?

No, it will change sizes of the structure in kernelspace and userspace,
so they just can not communicate.

 ukevent is aligned to 8 bytes already (its size is selected to be 40 bytes),
 so it should not be a problem.
 
  Eric
 
 Even if it is so, wouldn't it be better to be explicit about it?

Ok, I will add a comment about it.

 Rakshasa

-- 
Evgeniy Polyakov

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Jari Sundell


On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:


No, it will change sizes of the structure in kernelspace and userspace,
so they just can not communicate.


struct kevent {
 uintptr_t ident;/* identifier for this event */
 short filter;   /* filter for event */
 u_short   flags;/* action flags for kqueue */
 u_int fflags;   /* filter flag value */

 union {
   u32   _data_padding[2];
   intptr_t  data; /* filter data value */
 };

 union {
   u32   _udata_padding[2];
   void  *udata;   /* opaque user data identifier */
 };
};

I'm not missing anything obvious here, I hope.

Rakshasa
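
The size argument exchanged in this sub-thread can be made concrete with a compile-time check. A minimal sketch, with invented struct names (ev_plain, ev_padded) rather than the real ukevent or kevent layouts: a bare pointer member makes the struct size depend on the ABI, while overlaying it with two 32-bit words pins the member to 8 bytes.

```c
#include <stdint.h>

/* Size differs between ABIs: 12 bytes with 4-byte pointers,
 * 16 bytes with 8-byte pointers. */
struct ev_plain {
	uint32_t id;
	uint32_t flags;
	void *udata;
};

/* The padding array forces the union to 8 bytes, so the struct is
 * 16 bytes whether pointers are 4 or 8 bytes wide. */
struct ev_padded {
	uint32_t id;
	uint32_t flags;
	union {
		uint32_t _udata_padding[2];
		void *udata;	/* opaque user data */
	};
};

_Static_assert(sizeof(struct ev_padded) == 16,
	       "padded event must have an ABI-independent size");
```

The union fixes only the size. On a 32-bit big-endian system the pointer overlays the first padding word, so a 64-bit kernel must still treat the member as opaque data that it hands back unchanged.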

[RFC][PATCH 0/3] net: a lighter UDP-Lite (RFC 3828)

2006-08-23 Thread gerrit

[NET/IPV4]: a lighter UDP-Lite (RFC 3828)

This is a revised RFC resubmission of the UDP-Lite code which, thanks
to suggestions by David Miller, is now drastically reduced in size:

   ``A fully functional UDP-Lite module in a mere 209 lines !''

I feel that not much more can be removed without making the code obfuscated,
but would like to challenge people on this list to look out for further 
possible integration and reductions. 

I would further like to hear suggestions for a common naming scheme, after 
some of the UDP functions have been made generic, shared between both UDP
and UDP-Lite. 

I will wait with the UDP(-Lite)v6 part until feedback and comments have been
received: the v6-side will mirror the format of the v4-side.

To get a quick idea of what is happening, it is best to start with udplite.c,
since this also lists all the shared functions. This file is #included into 
udp.c -- I did want to keep functionally different blocks of code logically 
separate, but could not see the need for separate compilation.

A detailed changelog is included below. 

The code has been tested over several days on i686, i386-SMP, AMD,
and sparc64 platforms; using various userland and kernel applications
such as multicast streaming, DNS, socket programs, NFS client/server
(different file sizes); and on hardware with TX/RX UDP checksums (tg3).

The enclosed patch can be applied to Torvalds' tree. Application code for
testing is at
http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz


*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Important things that need to be resolved
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

a)  Naming scheme: Several functions are now generic, shared between UDP and
    UDP-Lite. They have not been given new names so far. Which naming scheme
    should be used?
    [e.g. `udpl_checksum_complete()' instead of `udp_checksum_complete()' ?]

b)  udp_v{4,6}_get_port(): raised earlier, this function appears almost
    identical in two places. There has been discussion and resubmission, but
    no final opinion yet. Can people please decide whether the suggested
    integration is OK or not -- the single get_port algorithm now has a total
    of four customers: udp4, udplite4, udp6, udplite6.

c)  Code cosmetics: I have left out any cosmetic changes for later, to
    minimize patch size. But eventually I would like to tidy up the code, in
    particular add more documentation to the structs and some of the (shared)
    functions. Suggestions?

d)  Shared udp_hash_lock: Is it worth implementing separate rwlocks for UDP
    and UDP-Lite? This would make the code quite a lot more complicated, and
    disadvantages will only occur in the borderline case when many UDP
    applications have to compete at the same time with many UDP-Lite
    applications. But will this result in noticeable performance loss at all?

 C h a n g e l o g

1/ Code integration.

 The patch follows David's suggestions. Additionally, the implementation
 was made simpler by exploiting the new `pcflag' member which struct udp_sock
 now contains. This flag can only be set by UDP-Lite and so uniquely
 distinguishes UDP and UDP-Lite sockets. On UDP sockets, pcflag will
 always be 0 since the structure is zeroed out upon allocation.


2/ No separate UDP-Lite header.

 UDP-Lite does not really define a new header structure; rather, it
 re-interprets the `len' header field of UDP with different semantics.
 Therefore, a separate `struct udplitehdr' is not really necessary and hides
 the fact that 75% of the header fields have exactly the same meaning. Thus
 UDP-Lite now also uses `struct udphdr'; the semantic difference is taken
 care of by the code.


3/ Code-sharing.

 The following functions can now be shared due to reliance on common structures:
   * udp_disconnect() (thanks to unified struct udp_sock )
   * udp_v4_mcast_next()  (thanks to unified struct udp_sock )
   * udp_getsockopt() (thanks to unified struct udp_sock )
   * do_udp_getsockopt()  (thanks to unified struct udp_sock )
   * compat_udp_getsockopt()  (thanks to unified struct udp_sock )
   * udp_encap_rcv()  (thanks to unified struct udphdr )
   * udp_ioctl()  (thanks to unifying both structures)

 The following functions have been turned into parameterised ones:

   * udp_v4_get_port()  -  parameterised as __udp_get_port()
   * udp_v4_lookup()-  parameterised as __udp_lookup()
   * udp_err()  -  parameterised as __udp_err()
   * udp_v4_mcast_deliver() -  parameterised as __udp_mcast_deliver()
 This was possible thanks to common use of udp_v4_mcast_next() (see above).
   * udp_lport_inuse()  -  parameterised as __udp_lport_inuse()
 This function is unnecessary in net/udp.h ! See earlier patch / discussion
 on udp_get_port() in
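
To make the re-interpreted `len' field from changelog item 2/ concrete, the receive-side coverage rules of RFC 3828 can be sketched as follows. This is an illustrative userspace function, not code from the patch; udplite_coverage() and UDP_HDR_LEN are names chosen here. Per the RFC, a coverage value of 0 means the whole packet is covered, values 1 to 7 are illegal because the 8-byte header must always be covered, and a value larger than the packet length is illegal as well.

```c
#include <stdint.h>

#define UDP_HDR_LEN 8	/* UDP and UDP-Lite headers are both 8 bytes */

/*
 * cscov:   checksum coverage field from the header, in host byte order
 * pkt_len: UDP-Lite packet length (header + payload) as given by IP
 *
 * Returns the number of bytes the checksum covers, or -1 if the
 * packet must be discarded.
 */
static int udplite_coverage(uint16_t cscov, unsigned int pkt_len)
{
	if (pkt_len < UDP_HDR_LEN)
		return -1;		/* too short to hold a header */
	if (cscov == 0)
		return (int)pkt_len;	/* 0 means: entire packet covered */
	if (cscov < UDP_HDR_LEN)
		return -1;		/* header must always be covered */
	if (cscov > pkt_len)
		return -1;		/* coverage exceeds the packet */
	return cscov;
}
```

A result smaller than pkt_len is the partial-coverage case, in which the checksum is computed over only the first cscov bytes.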

[RFC][PATCH 2/3] net/ipv4: UDP and generic UDP(-Lite) processing

2006-08-23 Thread gerrit

[Net/IPv4]: Modifications to the UDP module and generic UDP/-Lite processing.


Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---

 include/net/udp.h |   72 +-
 net/ipv4/udp.c|  607 --
 2 files changed, 477 insertions(+), 202 deletions(-)


diff --git a/include/net/udp.h b/include/net/udp.h
index 766fba1..5dcdd53 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -26,9 +26,48 @@ #include <linux/list.h>
 #include <net/inet_sock.h>
 #include <net/sock.h>
 #include <net/snmp.h>
+#include <net/ip.h>
+#include <linux/ipv6.h>
 #include <linux/seq_file.h>
 
 #define UDP_HTABLE_SIZE		128
+#include <net/udplite.h>
+
+/**
+ * struct udp_skb_cb  -  UDP(-Lite) private variables
+ *
+ * @header:  private variables used by IPv4/IPv6
+ * @cscov:   checksum coverage length (UDP-Lite only)
+ * @partial_cov: if set indicates partial csum coverage
+ */
+struct udp_skb_cb {
+	union {
+		struct inet_skb_parm	h4;
+#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
+		struct inet6_skb_parm	h6;
+#endif
+	} header;
+	__u16		cscov;
+	__u8		partial_cov;
+};
+#define UDP_SKB_CB(__skb)	((struct udp_skb_cb *)((__skb)->cb))
+
+/*
+ * Generic checksumming  routines for UDP(-Lite) v4 and v6
+ */
+static inline u16  __udp_checksum_complete(struct sk_buff *skb)
+{
+	if (! UDP_SKB_CB(skb)->partial_cov)
+		return __skb_checksum_complete(skb);
+	return  csum_fold(skb_checksum(skb, 0, UDP_SKB_CB(skb)->cscov,
+			  skb->csum));
+}
+
+static __inline__ int udp_checksum_complete(struct sk_buff *skb)
+{
+	return skb->ip_summed != CHECKSUM_UNNECESSARY &&
+		__udp_checksum_complete(skb);
+}
 
 /* udp.c: This needs to be shared by v4 and v6 because the lookup
  *and hashing code needs to work with different AF's yet
@@ -39,16 +78,25 @@ extern rwlock_t udp_hash_lock;
 
 extern int udp_port_rover;
 
-static inline int udp_lport_inuse(u16 num)
+/*
+ * XXX: the following two functions do not have to be here. The only
+ * other user of udp_lport_inuse is  net/ipv6/udp.c -- whose get_port
+ * is almost fully identical with UDPv4's. -grrtrr
+ */
+static inline int __udp_lport_inuse(u16 num, struct hlist_head udptable[])
 {
struct sock *sk;
struct hlist_node *node;
 
-	sk_for_each(sk, node, &udp_hash[num & (UDP_HTABLE_SIZE - 1)])
+	sk_for_each(sk, node, &udptable[num & (UDP_HTABLE_SIZE - 1)])
 		if (inet_sk(sk)->num == num)
 			return 1;
return 0;
 }
+static __inline__ int udp_lport_inuse(u16 num)
+{
+   return __udp_lport_inuse(num, udp_hash);
+}
 
 /* Note: this must match 'valbool' in sock_setsockopt */
 #define UDP_CSUM_NOXMIT1
@@ -75,21 +123,35 @@ extern unsigned int udp_poll(struct file
 poll_table *wait);
 
 DECLARE_SNMP_STAT(struct udp_mib, udp_statistics);
-#define UDP_INC_STATS(field)		SNMP_INC_STATS(udp_statistics, field)
-#define UDP_INC_STATS_BH(field)	SNMP_INC_STATS_BH(udp_statistics, field)
-#define UDP_INC_STATS_USER(field) 	SNMP_INC_STATS_USER(udp_statistics, field)
+/*
+ * SNMP statistics for UDP and UDP-Lite
+ */
+#define UDP_INC_STATS(field, is_udplite)				\
+	if (is_udplite) SNMP_INC_STATS(udplite_statistics, field);	\
+	else		SNMP_INC_STATS(udp_statistics, field);
+#define UDP_INC_STATS_USER(field, is_udplite)				\
+	if (is_udplite) SNMP_INC_STATS_USER(udplite_statistics, field);	\
+	else		SNMP_INC_STATS_USER(udp_statistics, field);
+#define UDP_INC_STATS_BH(field, is_udplite)				\
+	if (is_udplite) SNMP_INC_STATS_BH(udplite_statistics, field);	\
+	else		SNMP_INC_STATS_BH(udp_statistics, field);
+#define UDP_DEC_STATS_BH(field, is_udplite)				\
+	if (is_udplite) SNMP_DEC_STATS_BH(udplite_statistics, field);	\
+	else		SNMP_DEC_STATS_BH(udp_statistics, field);
 
 /* /proc */
 struct udp_seq_afinfo {
struct module   *owner;
char*name;
sa_family_t family;
+   struct hlist_head   *hashtable;
int (*seq_show) (struct seq_file *m, void *v);
struct file_operations  *seq_fops;
 };
 
 struct udp_iter_state {
sa_family_t family;
+   struct hlist_head   *hashtable;
int bucket;
struct seq_operations   seq_ops;
 };
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index f136cec..a8c04d5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -92,10 +92,8 @@ #include <linux/errno.h>
 #include <linux/timer.h>
 #include <linux/mm.h>
 #include <linux/inet.h>
-#include <linux/ipv6.h>
 #include <linux/netdevice.h>
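
One observation on the two-way UDP_INC_STATS*() macros in this patch: they expand to a bare if/else followed by a semicolon, which does not compose with an unbraced if/else at the call site. A common alternative, sketched here with hypothetical counter arrays rather than the kernel's SNMP macros, wraps the body in do { } while (0) so that the macro behaves like a single statement.

```c
/* Illustrative counters; the real code uses per-CPU SNMP MIB arrays. */
enum { STAT_INDATAGRAMS, STAT_INERRORS, STAT_MAX };

static unsigned long udp_stats[STAT_MAX];
static unsigned long udplite_stats[STAT_MAX];

/* Statement-like macro: safe inside unbraced if/else at the call site. */
#define INC_STATS(field, is_udplite)			\
	do {						\
		if (is_udplite)				\
			udplite_stats[field]++;		\
		else					\
			udp_stats[field]++;		\
	} while (0)

/* Unbraced if/else around the macro compiles and pairs as written. */
static void count_packet(int ok, int is_udplite)
{
	if (ok)
		INC_STATS(STAT_INDATAGRAMS, is_udplite);
	else
		INC_STATS(STAT_INERRORS, is_udplite);
}
```

With the bare if/else form, the same count_packet() body would fail to compile, because the stray semicolon ends the inner statement before the outer else.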

[RFC][PATCH 1/3] net/ipv4: UDP-Lite extensions

2006-08-23 Thread gerrit

[Net/IPv4]: UDP-Lite standalone support and shared UDP/-Lite socket structure.


Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---

 include/linux/udp.h   |   19 
 include/net/udplite.h |   35 
 net/ipv4/udplite.c|  209 ++
 3 files changed, 262 insertions(+), 1 deletion(-)


diff --git a/include/linux/udp.h b/include/linux/udp.h
index 90223f0..76fb1a1 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -19,6 +19,12 @@ #define _LINUX_UDP_H
 
 #include <linux/types.h>
 
+/**
+ *	struct udphdr  -  UDP/-Lite header
+ *
+ *	UDP (RFC 768) and UDP-Lite (RFC 3828) share the same header structure,
+ *	the only difference is that UDP-Lite interprets `len' as checksum coverage.
+ */
 struct udphdr {
__u16   source;
__u16   dest;
@@ -50,13 +56,24 @@ struct udp_sock {
 * when the socket is uncorked.
 */
 	__u16		 len;		/* total length of pending frames */
+/*
+ * Fields specific to UDP-Lite.
+ */
+	__u16		 pcslen;
+	__u16		 pcrlen;
+/* indicator bits used by pcflag: */
+#define UDPLITE_BIT      0x1		/* set by udplite proto init function */
+#define UDPLITE_SEND_CC  0x2		/* set via udplite setsockopt         */
+#define UDPLITE_RECV_CC  0x4		/* set via udplite setsockopt         */
+	__u8		 pcflag;	/* marks socket as UDP-Lite if > 0    */
 };
 
 static inline struct udp_sock *udp_sk(const struct sock *sk)
 {
 	return (struct udp_sock *)sk;
 }
+#define IS_UDPLITE(__sk) (udp_sk(__sk)->pcflag)
 
-#endif
+#endif /* __KERNEL__   */
 
 #endif /* _LINUX_UDP_H */
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
new file mode 100644
index 0000000..f5597e9
--- /dev/null
+++ b/net/ipv4/udplite.c
@@ -0,0 +1,209 @@
+/*
+ *  UDPLITE An implementation of the UDP-Lite protocol (RFC 3828).
+ *
+ *  Version:$Id: udplite.c,v 1.22 2006/08/22 13:01:52 gerrit Exp gerrit $
+ *
+ *  Authors:Gerrit Renker   [EMAIL PROTECTED]
+ *
+ *  Changes:
+ *  Fixes:
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+struct hlist_head	udplite_hash[UDP_HTABLE_SIZE];
+int			udplite_port_rover;
+DEFINE_SNMP_STAT(struct udp_mib, udplite_statistics)	__read_mostly;
+
+/* these functions are called by UDP-Lite with protocol-specific parameters */
+static int __udp_get_port(struct sock *, unsigned short,
+			  struct hlist_head *, int *);
+static struct sock *__udp_lookup(u32 , u16, u32, u16, int, struct hlist_head *);
+static int __udp_mcast_deliver(struct sk_buff *, struct udphdr *,
+			       u32, u32, struct hlist_head * );
+static int __udp_common_rcv(struct sk_buff *, int is_udplite);
+static void __udp_err(struct sk_buff *, u32, struct hlist_head *);
+#ifdef CONFIG_PROC_FS
+static int udp4_seq_show(struct seq_file *, void *);
+#endif
+
+/*
+ * Designate sk as UDP-Lite socket
+ */
+static inline int udplite_sk_init(struct sock *sk)
+{
+	udp_sk(sk)->pcflag = UDPLITE_BIT;
+	return 0;
+}
+
+static __inline__ int udplite_v4_get_port(struct sock *sk, unsigned short snum)
+{
+	return  __udp_get_port(sk, snum, udplite_hash, &udplite_port_rover);
+}
+
+static __inline__ struct sock *udplite_v4_lookup(u32 saddr, u16 sport,
+u32 daddr, u16 dport, int dif)
+{
+   return __udp_lookup(saddr, sport, daddr, dport, dif, udplite_hash);
+}
+
+static __inline__ int udplite_v4_mcast_deliver(struct sk_buff *skb,
+   struct udphdr *uh, u32 saddr, u32 daddr)
+{
+   return __udp_mcast_deliver(skb, uh, saddr, daddr, udplite_hash);
+}
+
+__inline__ int udplite_rcv(struct sk_buff *skb)
+{
+   return __udp_common_rcv(skb, 1);
+}
+
+__inline__ void udplite_err(struct sk_buff *skb, u32 info)
+{
+   return __udp_err(skb, info, udplite_hash);
+}
+
+static int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh,
+unsigned short len, u32 saddr, u32 daddr)
+{
+   u16 cscov;
+
+/* In UDPv4 a zero checksum means that the transmitter generated no
+ * checksum. UDP-Lite (like IPv6) mandates checksums, hence packets
+ * with a zero checksum field are illegal.
*/
+	if (uh->check == 0) {
+		LIMIT_NETDEBUG(KERN_DEBUG "UDPLITE: zeroed csum field "
+			"(%d.%d.%d.%d:%d -> %d.%d.%d.%d:%d)\n", NIPQUAD(saddr),
+			ntohs(uh->source), NIPQUAD(daddr), ntohs(uh->dest));
+		return 0;
+	}
+
+

[RFC][PATCH 3/3] net/ipv4: misc. support files

2006-08-23 Thread gerrit

[Net/IPv4]: Miscellaneous changes which complete the 
v4 support for UDP-Lite.


Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---

 include/linux/in.h |1 +
 include/linux/snmp.h   |   14 ++
 include/linux/socket.h |1 +
 include/net/snmp.h |2 ++
 include/net/xfrm.h |2 ++
 net/ipv4/af_inet.c |   15 ++-
 net/ipv4/proc.c|   16 ++--
 net/ipv6/udp.c |1 +
 8 files changed, 49 insertions(+), 3 deletions(-)


diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c84a320..21038ec 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1223,10 +1223,14 @@ static int __init init_ipv4_mibs(void)
tcp_statistics[1] = alloc_percpu(struct tcp_mib);
udp_statistics[0] = alloc_percpu(struct udp_mib);
udp_statistics[1] = alloc_percpu(struct udp_mib);
+   udplite_statistics[0] = alloc_percpu(struct udp_mib);
+   udplite_statistics[1] = alloc_percpu(struct udp_mib);
+
 	if (!
 	    (net_statistics[0] && net_statistics[1] && ip_statistics[0]
 	     && ip_statistics[1] && tcp_statistics[0] && tcp_statistics[1]
-	     && udp_statistics[0] && udp_statistics[1]))
+	     && udp_statistics[0] && udp_statistics[1]
+	     && udplite_statistics[0] && udplite_statistics[1] ) )
 		return -ENOMEM;
 
(void) tcp_mib_init();
@@ -1300,6 +1304,11 @@ #endif
inet_register_protosw(q);
 
/*
+*  Add UDP-Lite (RFC 3828)
+*/
+   udplite4_register();
+
+   /*
 *  Set the ARP module up
 */
 
@@ -1367,6 +1376,8 @@ static int __init ipv4_proc_init(void)
goto out_tcp;
if (udp4_proc_init())
goto out_udp;
+   if (udplite4_proc_init())
+   goto out_udplite;
if (fib_proc_init())
goto out_fib;
if (ip_misc_proc_init())
@@ -1376,6 +1387,8 @@ out:
 out_misc:
fib_proc_exit();
 out_fib:
+   udplite4_proc_exit();
+out_udplite:
udp4_proc_exit();
 out_udp:
tcp4_proc_exit();
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index d61e2a9..c93f091 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -66,9 +66,10 @@ static int sockstat_seq_show(struct seq_
           tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
           atomic_read(&tcp_memory_allocated));
    seq_printf(seq, "UDP: inuse %d\n", fold_prot_inuse(&udp_prot));
+   seq_printf(seq, "UDPLITE: inuse %d\n", fold_prot_inuse(&udplite_prot));
    seq_printf(seq, "RAW: inuse %d\n", fold_prot_inuse(&raw_prot));
-   seq_printf(seq,  "FRAG: inuse %d memory %d\n", ip_frag_nqueues,
-          atomic_read(&ip_frag_mem));
+   seq_printf(seq, "FRAG: inuse %d memory %d\n", ip_frag_nqueues,
+               atomic_read(&ip_frag_mem));
return 0;
 }
 
@@ -302,6 +303,17 @@ static int snmp_seq_show(struct seq_file
   fold_field((void **) udp_statistics, 
  snmp4_udp_list[i].entry));
 
+   /* the UDP and UDP-Lite MIBs are the same */
+   seq_puts(seq, "\nUdpLite:");
+   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+       seq_printf(seq, " %s", snmp4_udp_list[i].name);
+
+   seq_puts(seq, "\nUdpLite:");
+   for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+       seq_printf(seq, " %lu",
+              fold_field((void **) udplite_statistics,
+                 snmp4_udp_list[i].entry) );
+
seq_putc(seq, '\n');
return 0;
 }
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3d54f24..51efd04 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1051,6 +1051,7 @@ static struct udp_seq_afinfo udp6_seq_af
    .owner      = THIS_MODULE,
    .name       = "udp6",
    .family     = AF_INET6,
+   .hashtable  = udp_hash,
    .seq_show   = udp6_seq_show,
    .seq_fops   = &udp6_seq_fops,
 };
diff --git a/include/linux/in.h b/include/linux/in.h
index 94f557f..5ada82e 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -44,6 +44,7 @@ enum {
 
   IPPROTO_COMP   = 108,    /* Compression Header protocol */
   IPPROTO_SCTP   = 132,    /* Stream Control Transport Protocol */
+  IPPROTO_UDPLITE = 136,   /* UDP-Lite (RFC 3828)  */
 
   IPPROTO_RAW   = 255, /* Raw IP packets   */
   IPPROTO_MAX
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 3614090..592b666 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -264,6 +264,7 @@ #define SOL_UDP 17
 #define SOL_IPV6       41
 #define SOL_ICMPV6     58
 #define SOL_SCTP       132
+#define SOL_UDPLITE    136 /* UDP-Lite (RFC 3828) */
 #define SOL_RAW        255
 #define SOL_IPX        256
 #define SOL_AX25       257
diff --git

[take13 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov


Generic event handling mechanism.

Changes from 'take12' patchset:
 * remove non-chardev interface for initialization
 * use pointer to kevent_mring instead of unsigned longs
 * use aligned 64bit type in raw user data (can be used by high-res timer
   if needed)
 * simplified enqueue/dequeue callbacks and kevent initialization
 * use nanoseconds for timeout
 * put number of milliseconds into timer's return data
 * move some definitions into user-visible header
 * removed filenames from comments

Changes from 'take11' patchset:
 * include missing headers into patchset
 * some trivial code cleanups (use goto instead of if/else games and so on)
 * some whitespace cleanups
 * check for ready_callback() callback before main loop, which should save
   us some ticks

Changes from 'take10' patchset:
 * removed non-existent prototypes
 * added helper function for kevent_registered_callbacks
 * fixed comment lines longer than 80 characters
 * added a header shared between userspace and kernelspace instead of
   embedding both in one header
 * core restructuring to remove forward declarations
 * s o m e w h i t e s p a c e c o d y n g s t y l e c l e a n u p
 * use vm_insert_page() instead of remap_pfn_range()

Changes from 'take9' patchset:
 * fixed -nopage method

Changes from 'take8' patchset:
 * fixed mmap release bug
 * use module_init() instead of late_initcall()
 * use better structures for timer notifications

Changes from 'take7' patchset:
 * new mmap interface (not tested, waiting for other changes to be acked)
    - use nopage() method to dynamically substitute pages
    - allocate new page for events only when a newly added kevent requires it
    - do not use ugly index dereferencing, use structure instead
    - reduced amount of data in the ring (id and flags),
      maximum 12 pages on x86 per kevent fd

Changes from 'take6' patchset:
 * a lot of comments!
 * do not use list poisoning to detect that an entry is in the list
 * return number of ready kevents even if copy*user() fails
 * strict check for number of kevents in syscall
 * use ARRAY_SIZE for array size calculation
 * changed superblock magic number
 * use SLAB_PANIC instead of direct panic() call
 * changed -E* return values
 * a lot of small cleanups and indent fixes

Changes from 'take5' patchset:
 * removed compilation warnings about unused variables when lockdep is not
   turned on
 * do not use internal socket structures, use appropriate (exported) wrappers
   instead
 * removed default 1 second timeout
 * removed AIO stuff from patchset

Changes from 'take4' patchset:
 * use miscdevice instead of chardevice
 * comments fixes

Changes from 'take3' patchset:
 * removed serializing mutex from kevent_user_wait()
 * moved storage list processing to RCU
 * removed lockdep screaming - all storage locks are initialized in the same
   function, so lockdep was taught to differentiate between various cases
 * remove kevent from storage if it is marked as broken after callback
 * fixed a typo in the mmaped buffer implementation which would end up in a
   wrong index calculation

Changes from 'take2' patchset:
 * split kevent_finish_user() to locked and unlocked variants
 * do not use KEVENT_STAT ifdefs, use inline functions instead
 * use array of callbacks of each type instead of each kevent callback
   initialization
 * changed name of ukevent guarding lock
 * use only one kevent lock in kevent_user for all hash buckets instead of
   per-bucket locks
 * do not use kevent_user_ctl structure, instead provide needed arguments as
   syscall parameters
 * various indent cleanups
 * added optimisation, which is aimed to help when a lot of kevents are being
   copied from userspace
 * mapped buffer (initial) implementation (no userspace yet)

Changes from 'take1' patchset:
 - rebased against 2.6.18-git tree
 - removed ioctl controlling
 - added new syscall kevent_get_events(int fd, unsigned int min_nr,
   unsigned int max_nr, unsigned int timeout, void __user *buf,
   unsigned flags)
 - use old syscall kevent_ctl for creation/removing, modification and initial
   kevent initialization
 - use mutexes instead of semaphores
 - added file descriptor check and return error if provided descriptor does
   not match kevent file operations
 - various indent fixes
 - removed aio_sendfile() declarations.

Thank you.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[take13 2/3] kevent: poll/select() notifications.

2006-08-23 Thread Evgeniy Polyakov


poll/select() notifications.

This patch includes generic poll/select and timer notifications.

kevent_poll works similarly to epoll and has the same issues (the callback
is invoked not from the internal state machine of the caller, but through
process wakeup).

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2561020..76b3039 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -236,6 +236,7 @@ #include <linux/prio_tree.h>
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/mutex.h>
+#include <linux/kevent.h>
 
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
@@ -698,6 +699,9 @@ #ifdef CONFIG_EPOLL
    struct list_head        f_ep_links;
    spinlock_t              f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+   struct kevent_storage   st;
+#endif
    struct address_space    *f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c
new file mode 100644
index 000..b051784
--- /dev/null
+++ b/kernel/kevent/kevent_poll.c
@@ -0,0 +1,222 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>
+#include <linux/kevent.h>
+#include <linux/poll.h>
+#include <linux/fs.h>
+
+static kmem_cache_t *kevent_poll_container_cache;
+static kmem_cache_t *kevent_poll_priv_cache;
+
+struct kevent_poll_ctl
+{
+   struct poll_table_struct    pt;
+   struct kevent               *k;
+};
+
+struct kevent_poll_wait_container
+{
+   struct list_head            container_entry;
+   wait_queue_head_t           *whead;
+   wait_queue_t                wait;
+   struct kevent               *k;
+};
+
+struct kevent_poll_private
+{
+   struct list_head            container_list;
+   spinlock_t                  container_lock;
+};
+
+static int kevent_poll_enqueue(struct kevent *k);
+static int kevent_poll_dequeue(struct kevent *k);
+static int kevent_poll_callback(struct kevent *k);
+
+static int kevent_poll_wait_callback(wait_queue_t *wait,
+       unsigned mode, int sync, void *key)
+{
+   struct kevent_poll_wait_container *cont =
+       container_of(wait, struct kevent_poll_wait_container, wait);
+   struct kevent *k = cont->k;
+   struct file *file = k->st->origin;
+   u32 revents;
+
+   revents = file->f_op->poll(file, NULL);
+
+   kevent_storage_ready(k->st, NULL, revents);
+
+   return 0;
+}
+
+static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead,
+       struct poll_table_struct *poll_table)
+{
+   struct kevent *k =
+       container_of(poll_table, struct kevent_poll_ctl, pt)->k;
+   struct kevent_poll_private *priv = k->priv;
+   struct kevent_poll_wait_container *cont;
+   unsigned long flags;
+
+   cont = kmem_cache_alloc(kevent_poll_container_cache, SLAB_KERNEL);
+   if (!cont) {
+       kevent_break(k);
+       return;
+   }
+
+   cont->k = k;
+   init_waitqueue_func_entry(&cont->wait, kevent_poll_wait_callback);
+   cont->whead = whead;
+
+   spin_lock_irqsave(&priv->container_lock, flags);
+   list_add_tail(&cont->container_entry, &priv->container_list);
+   spin_unlock_irqrestore(&priv->container_lock, flags);
+
+   add_wait_queue(whead, &cont->wait);
+}
+
+static int kevent_poll_enqueue(struct kevent *k)
+{
+   struct file *file;
+   int err, ready = 0;
+   unsigned int revents;
+   struct kevent_poll_ctl ctl;
+   struct kevent_poll_private *priv;
+
+   file = fget(k->event.id.raw[0]);
+   if (!file)
+       return -ENODEV;
+
+   err = -EINVAL;
+   if (!file->f_op || !file->f_op->poll)
+       goto err_out_fput;
+
+   err = -ENOMEM;
+   priv = kmem_cache_alloc(kevent_poll_priv_cache, SLAB_KERNEL);
+   if (!priv)
+       goto err_out_fput;
+
+   spin_lock_init(&priv->container_lock);
+   INIT_LIST_HEAD(&priv->container_list);
+
+   k->priv = priv;
+
+   ctl.k = k;
+   init_poll_funcptr(&ctl.pt, &kevent_poll_qproc);
+
+   err = kevent_storage_enqueue(&file->st, k);
+   if (err)
+       goto err_out_free;
+
+   revents = file->f_op->poll(file,

[take13 1/3] kevent: Core files.

2006-08-23 Thread Evgeniy Polyakov


Core files.

This patch includes core kevent files:
 - userspace controlling
 - kernelspace interfaces
 - initialization
 - notification state machines

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/arch/i386/kernel/syscall_table.S b/arch/i386/kernel/syscall_table.S
index dd63d47..091ff42 100644
--- a/arch/i386/kernel/syscall_table.S
+++ b/arch/i386/kernel/syscall_table.S
@@ -317,3 +317,5 @@ ENTRY(sys_call_table)
.long sys_tee   /* 315 */
.long sys_vmsplice
.long sys_move_pages
+   .long sys_kevent_get_events
+   .long sys_kevent_ctl
diff --git a/arch/x86_64/ia32/ia32entry.S b/arch/x86_64/ia32/ia32entry.S
index 5d4a7d1..b2af4a8 100644
--- a/arch/x86_64/ia32/ia32entry.S
+++ b/arch/x86_64/ia32/ia32entry.S
@@ -713,4 +713,6 @@ #endif
.quad sys_tee
.quad compat_sys_vmsplice
.quad compat_sys_move_pages
+   .quad sys_kevent_get_events
+   .quad sys_kevent_ctl
 ia32_syscall_end:  
diff --git a/include/asm-i386/unistd.h b/include/asm-i386/unistd.h
index fc1c8dd..c9dde13 100644
--- a/include/asm-i386/unistd.h
+++ b/include/asm-i386/unistd.h
@@ -323,10 +323,12 @@ #define __NR_sync_file_range  314
 #define __NR_tee               315
 #define __NR_vmsplice          316
 #define __NR_move_pages        317
+#define __NR_kevent_get_events 318
+#define __NR_kevent_ctl        319
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 318
+#define NR_syscalls 320
 
 /*
  * user-visible error numbers are in the range -1 - -128: see
diff --git a/include/asm-x86_64/unistd.h b/include/asm-x86_64/unistd.h
index 94387c9..61363e0 100644
--- a/include/asm-x86_64/unistd.h
+++ b/include/asm-x86_64/unistd.h
@@ -619,10 +619,14 @@ #define __NR_vmsplice 278
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages        279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_kevent_get_events 280
+__SYSCALL(__NR_kevent_get_events, sys_kevent_get_events)
+#define __NR_kevent_ctl        281
+__SYSCALL(__NR_kevent_ctl, sys_kevent_ctl)
 
 #ifdef __KERNEL__
 
-#define __NR_syscall_max __NR_move_pages
+#define __NR_syscall_max __NR_kevent_ctl
 
 #ifndef __NO_STUBS
 
diff --git a/include/linux/kevent.h b/include/linux/kevent.h
new file mode 100644
index 000..fa282ac
--- /dev/null
+++ b/include/linux/kevent.h
@@ -0,0 +1,173 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __KEVENT_H
+#define __KEVENT_H
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/wait.h>
+#include <linux/net.h>
+#include <linux/rcupdate.h>
+#include <linux/kevent_storage.h>
+#include <linux/ukevent.h>
+
+#define KEVENT_MIN_BUFFS_ALLOC 3
+
+struct kevent;
+struct kevent_storage;
+typedef int (* kevent_callback_t)(struct kevent *);
+
+/* @callback is called each time new event has been caught. */
+/* @enqueue is called each time new event is queued. */
+/* @dequeue is called each time event is dequeued. */
+
+struct kevent_callbacks {
+   kevent_callback_t   callback, enqueue, dequeue;
+};
+
+#define KEVENT_READY       0x1
+#define KEVENT_STORAGE     0x2
+#define KEVENT_USER        0x4
+
+struct kevent
+{
+   /* Used for kevent freeing.*/
+   struct rcu_head         rcu_head;
+   struct ukevent          event;
+   /* This lock protects ukevent manipulations, e.g. ret_flags changes. */
+   spinlock_t              ulock;
+
+   /* Entry of user's queue. */
+   struct list_head        kevent_entry;
+   /* Entry of origin's queue. */
+   struct list_head        storage_entry;
+   /* Entry of user's ready. */
+   struct list_head        ready_entry;
+
+   u32                     flags;
+
+   /* User who requested this kevent. */
+   struct kevent_user      *user;
+   /* Kevent container. */
+   struct kevent_storage   *st;
+
+   struct kevent_callbacks callbacks;
+
+   /* Private data for different storages.
+    * poll()/select storage has a list of wait_queue_t containers
+    * for each ->poll() { poll_wait()' } here.
+    */
+   void                    *priv;
+};
+
+#define

[PATCH] d80211: add ieee80211_stop_queues()

2006-08-23 Thread Michael Buesch

Add ieee80211_stop_queues() to stop all queues
with a single call.
I will submit a patch for bcm43xx to use this function
as soon as this gets merged.

Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: wireless-dev/include/net/d80211.h
===
--- wireless-dev.orig/include/net/d80211.h  2006-08-19 18:26:05.0 +0200
+++ wireless-dev/include/net/d80211.h   2006-08-23 13:38:41.0 +0200
@@ -826,6 +826,15 @@
 void ieee80211_start_queues(struct net_device *dev);
 
 /**
+ * ieee80211_stop_queues - stop all queues
+ * @dev: pointer to $struct net_device as obtained from
+ *   ieee80211_alloc_hw().
+ *
+ * Drivers should use this function instead of netif_stop_queue.
+ */
+void ieee80211_stop_queues(struct net_device *dev);
+
+/**
  * ieee80211_get_mc_list_item - iteration over items in multicast list
  * @dev: pointer to struct net_device as obtained from
  * ieee80211_alloc_hw().
Index: wireless-dev/net/d80211/ieee80211.c
===
--- wireless-dev.orig/net/d80211/ieee80211.c    2006-08-19 18:26:05.0 +0200
+++ wireless-dev/net/d80211/ieee80211.c 2006-08-23 13:41:34.0 +0200
@@ -4690,6 +4690,15 @@
    clear_bit(IEEE80211_LINK_STATE_XOFF, &local->state[i]);
 }
 
+void ieee80211_stop_queues(struct net_device *dev)
+{
+   struct ieee80211_local *local = dev->ieee80211_ptr;
+   int i;
+
+   for (i = 0; i < local->hw->queues; i++)
+       ieee80211_stop_queue(dev, i);
+}
+
 void * ieee80211_dev_hw_data(struct net_device *dev)
 {
    struct ieee80211_local *local = dev->ieee80211_ptr;
@@ -4819,6 +4828,7 @@
 EXPORT_SYMBOL(ieee80211_wake_queue);
 EXPORT_SYMBOL(ieee80211_stop_queue);
 EXPORT_SYMBOL(ieee80211_start_queues);
+EXPORT_SYMBOL(ieee80211_stop_queues);
 EXPORT_SYMBOL(ieee80211_dev_hw_data);
 EXPORT_SYMBOL(ieee80211_dev_stats);
 EXPORT_SYMBOL(ieee80211_get_hw_conf);

-- 
Greetings Michael.

Re: multicast group memberships purge on interface delete

2006-08-23 Thread jamal

On Wed, 2006-23-08 at 13:08 +0200, Michal Ruzicka wrote:

 My question/suggestion:
 Would it be feasible to drop the relevant entries from sockets' multicast
 membership lists on interface delete? Yes, I do realize it would require
 walking through a number of sockets to see if there is any multicast entry
 for the interface in question to delete. But this could be optimized by
 maintaining a list of sockets that have a multicast group joined on the
 interface (and keeping a pointer to this list in the device structure).
 This would ease the job of the function handling leaving multicast groups,
 make its behaviour more deterministic and make possible errors reported by
 it more meaningful/reliable.
 

You should be able to fix it in the kernel by listening to events of
the interface/device disappearing. By disappearing I think you meant
the netdevice was totally rmmod-ed? The challenge is to make the app
also aware of you taking away the group from underneath them (that's why
I said "fix it").

These events are also available in user space via netlink, so
alternatively your app could listen to them and do the group leaves
instead of the kernel.
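
To illustrate the user-space route, here is a minimal sketch, assuming a
Linux rtnetlink socket subscribed to the RTMGRP_LINK group. The helper
names open_link_monitor() and scan_deleted_links() are hypothetical and
only serve the example; they do not come from any existing application.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: open a rtnetlink socket that receives link events.
 * Returns the descriptor, or -1 on failure. */
int open_link_monitor(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK;	/* link add/delete/state notifications */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: walk one received datagram and store the ifindex of
 * every deleted interface in gone[]. Returns the number of entries stored. */
int scan_deleted_links(void *buf, int len, int *gone, int max)
{
	struct nlmsghdr *nh;
	int n = 0;

	assert(buf != NULL && gone != NULL);

	for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		struct ifinfomsg *ifi;

		if (nh->nlmsg_type != RTM_DELLINK)
			continue;
		if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
			continue;	/* too short to carry an ifinfomsg */
		ifi = NLMSG_DATA(nh);
		if (n < max)
			gone[n++] = ifi->ifi_index;
	}
	return n;
}
```

An application would recv() on the descriptor in its event loop, pass each
datagram to scan_deleted_links(), and then drop its own bookkeeping for
memberships on the reported interface indexes.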

cheers,
jamal



Re: multicast group memberships purge on interface delete

2006-08-23 Thread Michal Růžička

 You should be able to fix it in the kernel by listening to events of
 the interface/device disappearing.

Interesting, I had thought that it would have to be done explicitly by the
interface cleanup code; this approach looks promising to me.

 By disappearing i think you meant
 the netdevice was totally rmmod-ed?

No need to rmmod anything, just think of ppp or gre interfaces, which come
and go without any modules loading/unloading. But yes, the rmmod would
probably be needed in the case of, for example, an ethernet device.

 The challenge is to make the app
 also aware of you taking away the group from underneath them (thats why
 i said fix it)

I don't see this as any challenge, as the applications could just assume
that any memberships on deleted interfaces have been dropped implicitly by
the kernel.

(This should be no problem for them provided that they keep track of
the interfaces present on the system, which they should do anyway, or
otherwise they could end up listening to just a part of the multicast
traffic they are interested in.)

 These events are also available in user space via netlink, so
 alternatively your app could listen to them and do the group leaves
 instead of the kernel.

In fact I had proposed that on the application mailing list (the
application is the quagga, formerly zebra, routing suite, to be specific),
but the people there disliked it because, for example, NetBSD (as I noted
in my previous post) does the group leaves implicitly on interface delete,
and the explicit group leaves fail there (and reportedly on other OSes too).
Sure, this can be solved by some conditional compilation.
This is why my post was more a theoretical design question/suggestion than
a feature request (or a bug report).

In this sense, what do you think about the possible benefit of the proposed
approach for maintaining the per-interface multicast reception state?

 cheers,
 jamal

Thanks
Michal



Re: [PATCH] IP1000A: IC Plus update 2006-08-22

2006-08-23 Thread Francois Romieu

Francois Romieu [EMAIL PROTECTED] :
[...]
 or as a serie of patches at:
 http://www.fr.zoreil.com/linux/2.6.x/2.6.18-rc4/ip1000

Typo. It should be:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.18-rc4/ip1000

-- 
Ueimor

call panic if nl_table allocation fails

2006-08-23 Thread Akinobu Mita

This patch makes the kernel panic if the initialization of nl_table fails
in initcalls. That is better than hitting a use-after-free crash later.

Cc: Patrick McHardy [EMAIL PROTECTED]
Cc: David Miller [EMAIL PROTECTED]
Signed-off-by: Akinobu Mita [EMAIL PROTECTED]

Index: work-failmalloc/net/netlink/af_netlink.c
===
--- work-failmalloc.orig/net/netlink/af_netlink.c
+++ work-failmalloc/net/netlink/af_netlink.c
@@ -1273,8 +1273,7 @@ netlink_kernel_create(int unit, unsigned
struct netlink_sock *nlk;
unsigned long *listeners = NULL;
 
-   if (!nl_table)
-   return NULL;
+   BUG_ON(!nl_table);
 
    if (unit<0 || unit>=MAX_LINKS)
return NULL;
@@ -1745,11 +1744,8 @@ static int __init netlink_proto_init(voi
netlink_skb_parms_too_large();
 
nl_table = kcalloc(MAX_LINKS, sizeof(*nl_table), GFP_KERNEL);
-   if (!nl_table) {
-enomem:
-   printk(KERN_CRIT "netlink_init: Cannot allocate nl_table\n");
-   return -ENOMEM;
-   }
+   if (!nl_table)
+   goto panic;
 
    if (num_physpages >= (128 * 1024))
        max = num_physpages >> (21 - PAGE_SHIFT);
@@ -1769,7 +1765,7 @@ enomem:
                nl_pid_hash_free(nl_table[i].hash.table,
                         1 * sizeof(*hash->table));
            kfree(nl_table);
-           goto enomem;
+           goto panic;
        }
        memset(hash->table, 0, 1 * sizeof(*hash->table));
        hash->max_shift = order;
@@ -1786,6 +1782,8 @@ enomem:
rtnetlink_init();
 out:
return err;
+panic:
+   panic("netlink_init: Cannot allocate nl_table\n");
 }
 
 core_initcall(netlink_proto_init);

Re: [take13 1/3] kevent: Core files.

2006-08-23 Thread Eric Dumazet

Again Evgeniy I really begin to like kevent :)

On Wednesday 23 August 2006 13:24, Evgeniy Polyakov wrote:
+struct kevent
+{
+   /* Used for kevent freeing.*/
+   struct rcu_head         rcu_head;
+   struct ukevent          event;
+   /* This lock protects ukevent manipulations, e.g. ret_flags changes. */
+   spinlock_t              ulock;
+
+   /* Entry of user's queue. */
+   struct list_head        kevent_entry;
+   /* Entry of origin's queue. */
+   struct list_head        storage_entry;
+   /* Entry of user's ready. */
+   struct list_head        ready_entry;
+
+   u32                     flags;
+
+   /* User who requested this kevent. */
+   struct kevent_user      *user;
+   /* Kevent container. */
+   struct kevent_storage   *st;
+
+   struct kevent_callbacks callbacks;
+
+   /* Private data for different storages.
+    * poll()/select storage has a list of wait_queue_t containers
+    * for each ->poll() { poll_wait()' } here.
+    */
+   void                    *priv;
+};

I wonder if you can reorder fields in this structure, so that 'read mostly' 
fields are grouped together, maybe in a different cache line.
This should help reduce false sharing in SMP.
read mostly fields are (but you know better than me) : callbacks, rcu_head, 
priv, user, event, ...
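
To make the suggestion concrete, here is a minimal user-space sketch,
assuming a 64-byte cache line. struct kevent_sketch is a hypothetical,
heavily simplified stand-in for struct kevent, not the layout from the
patch; in the kernel the same effect is usually obtained with
____cacheline_aligned_in_smp rather than C11 _Alignas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumption: common x86 cache line size */

/* Hypothetical, simplified stand-in for struct kevent. */
struct kevent_sketch {
	/* Read-mostly: set up at creation time, then only read. */
	void		*user;
	void		*priv;
	int		(*callback)(struct kevent_sketch *);

	/* Written on every event: pushed onto its own cache line. */
	_Alignas(CACHE_LINE) uint32_t	flags;
	uint32_t	ret_flags;
};

/* The frequently written fields must start a fresh cache line. */
static_assert(offsetof(struct kevent_sketch, flags) % CACHE_LINE == 0,
	      "hot fields must start a cache line");

/* Returns nonzero if two member offsets land in the same cache line. */
int same_cache_line(size_t a, size_t b)
{
	return a / CACHE_LINE == b / CACHE_LINE;
}
```

With this layout a CPU that only reads user, priv or callback does not have
its cache line invalidated each time another CPU updates flags.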


+#define KEVENT_MAX_EVENTS  4096

Could you please tell me (Forgive me if you already clarified this point) , 
what happens if the number of queued events reaches this value ?


+int kevent_init(struct kevent *k)
+{
+   spin_lock_init(&k->ulock);
+   k->flags = 0;
+
+   if (unlikely(k->event.type >= KEVENT_MAX))
+   return kevent_break(k);
+

As long you are sure we cannot call kevent_enqueue()/kevent_dequeue() after a 
failed kevent_init() it should be fine.

+int kevent_add_callbacks(const struct kevent_callbacks *cb, int pos)
+{
+   struct kevent_callbacks *p;
+
+   if (pos >= KEVENT_MAX)
+       return -EINVAL;

If a negative pos is used here we might crash. KEVENT_MAX is signed too, so
the comparison is done on signed values and a negative pos passes the test.
If we consider that callers always give a sane value, the test can be dropped.
If we consider that callers may be wrong, then we must do a correct test.
If you don't want to change the function prototype, then change the test to:

if ((unsigned)pos >= KEVENT_MAX)
    return -EINVAL;

Some people on lkml will prefer:
if (pos < 0 || pos >= KEVENT_MAX)
    return -EINVAL;
or
#define KEVENT_MAX 6U /* unsigned constant */
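
The difference between the two tests can be checked in user space. This is
a small sketch, assuming KEVENT_MAX is 6 as in the last suggestion; the
names KEVENT_MAX_SKETCH, pos_ok_signed() and pos_ok_unsigned() are
hypothetical.

```c
#include <assert.h>

#define KEVENT_MAX_SKETCH 6	/* assumed value, signed like the reviewed code */

/* Mirrors the reviewed test: only the upper bound is checked, so a
 * negative pos is accepted. */
int pos_ok_signed(int pos)
{
	return !(pos >= KEVENT_MAX_SKETCH);
}

/* The cast turns negative values into very large unsigned ones, so a single
 * comparison rejects both negative and too-large positions. */
int pos_ok_unsigned(int pos)
{
	return !((unsigned)pos >= KEVENT_MAX_SKETCH);
}
```

pos_ok_signed(-1) accepts the index while pos_ok_unsigned(-1) rejects it;
both agree for every value from 0 upwards.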

+static kmem_cache_t *kevent_cache;

You probably want to add __read_mostly here to avoid false sharing.

+static kmem_cache_t *kevent_cache __read_mostly;

Same for other caches :
+static kmem_cache_t *kevent_poll_container_cache;
+static kmem_cache_t *kevent_poll_priv_cache;


About the hash table :

+struct kevent_user
+{
+   struct list_head    kevent_list[KEVENT_HASH_MASK+1];
+   spinlock_t          kevent_lock;

epoll used to use a hash table too (its size was configurable at init time),
and was converted to an RB-tree for good reasons (to prevent a user from
allocating a big hash table in pinned memory and causing a DoS).
Are you sure a process handling one million sockets will succeed using kevent
instead of epoll?

Do you have a pointer to sample source code using the mmap()/kevent interface?
It's not clear to me how we can use it (and how we notice that a full wrap
occurred; could the user app miss x*KEVENT_MAX_EVENTS events?). Must we still
use a syscall to dequeue events?

In particular you state sizeof(mukevent) is 40, while it's 12:

+/*
+ * Note that kevents does not exactly fill the page (each mukevent is 40 bytes),
+ * so we reuse 4 bytes at the begining of the first page to store index.
+ * Take that into account if you want to change size of struct ukevent.
+ */

+struct mukevent
+{
+   struct kevent_id    id;         /* size()=8 */
+   __u32               ret_flags;  /* size()=4 */
+};
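
The arithmetic can be checked with a small user-space sketch. It assumes
that struct kevent_id is two 32-bit words (8 bytes, as annotated above) and
that pages are 4096 bytes; the *_sketch names and PAGE_BYTES are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES  4096u	/* assumption: 4 KiB pages */
#define INDEX_BYTES 4u		/* index stored at the start of the first page */

/* Hypothetical stand-ins for the structures quoted above. */
struct kevent_id_sketch {
	uint32_t raw[2];			/* 8 bytes */
};

struct mukevent_sketch {
	struct kevent_id_sketch	id;		/* 8 bytes */
	uint32_t		ret_flags;	/* 4 bytes */
};

static_assert(sizeof(struct mukevent_sketch) == 12,
	      "mukevent is 12 bytes with this layout, not 40");

/* Number of entries that fit in the first page after the 4-byte index. */
unsigned int mukevents_in_first_page(void)
{
	return (PAGE_BYTES - INDEX_BYTES) / sizeof(struct mukevent_sketch);
}
```

With 12-byte entries, 341 of them plus the 4-byte index fill a 4096-byte
page exactly, which fits the "reuse 4 bytes" remark in the comment better
than the stated 40 bytes does.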



Thank you
Eric

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Jari Sundell


On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:

We still do not know what uintptr_t is, and it looks like it is a pointer,
which is forbidden. Those numbers are not enough to make network AIO.
And actually is not compatible with kqueue already, so you will need to
write your own parser to convert your parameters into above structure.


7.18.1.4 Integer types capable of holding object pointers

1 The following type designates a signed integer type with the property
that any valid pointer to void can be converted to this type, then
converted back to pointer to void, and the result will compare equal to
the original pointer:

Dunno if this means that x86-64 needs yet another typedef, or if using
long for intptr_t is incorrect. But assuming a different integer type
was used instead of intptr_t, that is known to be able to hold a
pointer, would there still be any problems?

I'm unable to see anything specific about AIO in your kevent patch
that these modifications wouldn't support.

Rakshasa

Re: [take12 0/3] kevent: Generic event handling mechanism.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 02:55:47PM +0200, Jari Sundell ([EMAIL PROTECTED]) 
wrote:
 On 8/23/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 We still do not know what uintptr_t is, and it looks like it is a pointer,
 which is forbidden. Those numbers are not enough to make network AIO.
 And actually is not compatible with kqueue already, so you will need to
 write your own parser to convert your parameters into above structure.
 
 7.18.1.4 Integer types capable of holding object pointers
 
 1 The following type designates a signed integer type with the
 property that any valid
 pointer to void can be converted to this type, then converted back to
 pointer to void,
 and the result will compare equal to the original pointer:
 
 Dunno if this means that x86-64 needs yet another typedef, or if using
 long for intptr_t is incorrect. But assuming a different integer type
 was used instead of intptr_t, that is known to be able to hold a
 pointer, would there still be any problems?

stdint.h

/* Types for `void *' pointers.  */
#if __WORDSIZE == 64
# ifndef __intptr_t_defined
typedef long int		intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned long int	uintptr_t;
#else
# ifndef __intptr_t_defined
typedef int			intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned int		uintptr_t;
#endif

which means that with a 32bit userspace it will be only 32 bits wide.
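
One way around the varying width is to carry the pointer in a fixed 64-bit
field, in line with the "aligned 64bit type in raw user data" item from the
take13 changelog. A minimal sketch follows; struct user_data_sketch,
pack_user_ptr() and unpack_user_ptr() are hypothetical names, not the
patch's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-data field: always 64 bits wide, regardless of whether
 * userspace is built as 32-bit or 64-bit. */
struct user_data_sketch {
	uint64_t raw;
};

/* Widen the pointer through uintptr_t so the conversion is well defined. */
struct user_data_sketch pack_user_ptr(void *p)
{
	struct user_data_sketch d;

	d.raw = (uint64_t)(uintptr_t)p;
	return d;
}

/* Narrow back to the native pointer width; lossless for values that were
 * produced by pack_user_ptr() in the same process. */
void *unpack_user_ptr(struct user_data_sketch d)
{
	return (void *)(uintptr_t)d.raw;
}
```

The structure has the same size and layout for 32-bit and 64-bit userspace,
so the kernel never has to guess how wide the caller's uintptr_t is.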

 I'm unable to see anything specific about AIO in your kevent patch
 that these modifications wouldn't support.

I was asked to postpone the AIO stuff for now; you can find it in previous
patchsets sent about a week or two ago.

 Rakshasa

-- 
Evgeniy Polyakov

[PATCH 1/2] Add Sonics Silicon Backplane driver

2006-08-23 Thread Michael Buesch

This patch adds a Sonics Silicon Backplane driver backend
that can be used by ssb based device drivers such as bcm43xx
and b44.

Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: wireless-dev/drivers/misc/Kconfig
===
--- wireless-dev.orig/drivers/misc/Kconfig  2006-08-21 22:45:56.0 +0200
+++ wireless-dev/drivers/misc/Kconfig   2006-08-22 21:07:01.0 +0200
@@ -28,5 +28,9 @@
 
  If unsure, say N.
 
+config SONICS_SILICON_BACKPLANE
+   tristate
+   depends on PCI
+
 endmenu
 
Index: wireless-dev/drivers/misc/Makefile
===
--- wireless-dev.orig/drivers/misc/Makefile 2006-08-21 22:45:56.0 +0200
+++ wireless-dev/drivers/misc/Makefile  2006-08-21 22:47:10.0 +0200
@@ -3,5 +3,6 @@
 #
 obj- := misc.o # Dummy rule to force built-in.o to be made
 
-obj-$(CONFIG_IBM_ASM)  += ibmasm/
-obj-$(CONFIG_HDPU_FEATURES)+= hdpuftrs/
+obj-$(CONFIG_IBM_ASM)  += ibmasm/
+obj-$(CONFIG_HDPU_FEATURES)+= hdpuftrs/
+obj-$(CONFIG_SONICS_SILICON_BACKPLANE) += ssb.o
Index: wireless-dev/drivers/misc/ssb.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ wireless-dev/drivers/misc/ssb.c 2006-08-23 10:58:06.0 +0200
@@ -0,0 +1,1015 @@
+/*
+ * Sonics Silicon Backplane backend.
+ *
+ * Copyright (C) 2005-2006 Michael Buesch [EMAIL PROTECTED]
+ * Copyright (C) 2005 Martin Langer [EMAIL PROTECTED]
+ * Copyright (C) 2005 Stefano Brivio [EMAIL PROTECTED]
+ * Copyright (C) 2005 Danny van Dyk [EMAIL PROTECTED]
+ * Copyright (C) 2005 Andreas Jaggi [EMAIL PROTECTED]
+ *
+ * Derived from the Broadcom 4400 device driver.
+ * Copyright (C) 2002 David S. Miller (davem@redhat.com)
+ * Fixed by Pekka Pietikainen ([EMAIL PROTECTED])
+ * Copyright (C) 2006 Broadcom Corporation.
+ *
+ * Licensed under the GNU/GPL. See COPYING for details.
+ */
+
+#include <linux/ssb.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+
+
+#define SSB_DEBUG  0
+#define PFX  "ssb: "
+
+
+#if SSB_DEBUG
+# define dprintk(f, x...)  do { printk(f ,##x); } while (0)
+# define assert(expr) \
+   do {                                                        \
+       if (unlikely(!(expr))) {                                \
+           printk(KERN_ERR PFX "ASSERTION FAILED (%s) at: %s:%d:%s()\n", \
+                  #expr, __FILE__, __LINE__, __FUNCTION__);    \
+       }                                                       \
+   } while (0)
+#else
+# define dprintk(f, x...)  do { /* nothing */ } while (0)
+# define assert(expr)  do { if (expr) { /* nothing */ } } while (0)
+#endif
+
+
+static inline int ssb_pci_read_config32(struct ssb *ssb, int offset,
+   u32 *value)
+{
+   return pci_read_config_dword(ssb->pci_dev, offset, value);
+}
+
+static inline int ssb_pci_read_config16(struct ssb *ssb, int offset,
+   u16 *value)
+{
+   return pci_read_config_word(ssb->pci_dev, offset, value);
+}
+
+static inline int ssb_pci_write_config32(struct ssb *ssb, int offset,
+u32 value)
+{
+   return pci_write_config_dword(ssb->pci_dev, offset, value);
+}
+
+static inline u32 ssb_read32(struct ssb *ssb, u16 offset)
+{
+   return ioread32(ssb->mmio + offset + ssb_core_offset(ssb));
+}
+
+static inline void ssb_write32(struct ssb *ssb, u16 offset,
+  u32 value)
+{
+   iowrite32(value, ssb->mmio + offset + ssb_core_offset(ssb));
+}
+
+static inline u16 ssb_read16(struct ssb *ssb, u16 offset)
+{
+   return ioread16(ssb->mmio + offset + ssb_core_offset(ssb));
+}
+
+static inline void ssb_write16(struct ssb *ssb, u16 offset,
+  u16 value)
+{
+   iowrite16(value, ssb->mmio + offset + ssb_core_offset(ssb));
+}
+
+
+static inline u8 ssb_crc8(u8 crc, u8 data)
+{
+   /* Polynomial:   x^8 + x^7 + x^6 + x^4 + x^2 + 1   */
+   static const u8 t[] = {
+   0x00, 0xF7, 0xB9, 0x4E, 0x25, 0xD2, 0x9C, 0x6B,
+   0x4A, 0xBD, 0xF3, 0x04, 0x6F, 0x98, 0xD6, 0x21,
+   0x94, 0x63, 0x2D, 0xDA, 0xB1, 0x46, 0x08, 0xFF,
+   0xDE, 0x29, 0x67, 0x90, 0xFB, 0x0C, 0x42, 0xB5,
+   0x7F, 0x88, 0xC6, 0x31, 0x5A, 0xAD, 0xE3, 0x14,
+   0x35, 0xC2, 0x8C, 0x7B, 0x10, 0xE7, 0xA9, 0x5E,
+   0xEB, 0x1C, 0x52, 0xA5, 0xCE, 0x39, 0x77, 0x80,
+   0xA1, 0x56, 0x18, 0xEF, 0x84, 0x73, 0x3D, 0xCA,
+   0xFE, 0x09, 0x47, 0xB0, 0xDB, 0x2C, 0x62, 0x95,
+   0xB4, 0x43, 0x0D, 0xFA, 0x91, 0x66, 0x28, 0xDF,
+   0x6A, 0x9D, 0xD3, 0x24, 0x4F, 0xB8, 0xF6, 0x01,
+   0x20, 0xD7, 0x99, 0x6E, 0x05, 0xF2, 0xBC, 0x4B,
+
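
(The patch text is cut off in the archive at this point.) As a cross-check of the polynomial named in the comment, a bit-at-a-time version can be sketched as follows. It assumes the table is for the bit-reversed (LSB-first) form of x^8 + x^7 + x^6 + x^4 + x^2 + 1, i.e. 0xAB, and that the table is indexed by crc ^ data; neither is shown in the truncated excerpt. The names crc8_bitwise() and crc8_selfcheck() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x^8 + x^7 + x^6 + x^4 + x^2 + 1 in bit-reversed (LSB-first) form. */
#define CRC8_POLY_REFLECTED 0xAB

/* Bit-at-a-time CRC-8 update for one data byte. */
static uint8_t crc8_bitwise(uint8_t crc, uint8_t data)
{
	int bit;

	crc ^= data;
	for (bit = 0; bit < 8; bit++) {
		if (crc & 1)
			crc = (uint8_t)((crc >> 1) ^ CRC8_POLY_REFLECTED);
		else
			crc >>= 1;
	}
	return crc;
}

/* Compare against the first table entries quoted in the patch. */
static void crc8_selfcheck(void)
{
	assert(crc8_bitwise(0, 0x00) == 0x00);
	assert(crc8_bitwise(0, 0x01) == 0xF7);
	assert(crc8_bitwise(0, 0x02) == 0xB9);
}
```

The values produced for inputs 0x01 and 0x02 match the second and third table entries above (0xF7, 0xB9), which supports the assumption about bit order.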

[PATCH 0/2] Sonics Silicon Backplane driver

2006-08-23 Thread Michael Buesch

Hi,

This patch series adds a low-level driver for the
Sonics Silicon Backplane (ssb for short). The ssb is used
in some Broadcom devices such as bcm43xx and bcm44xx.

The ssb is a common backplane that hosts several operating
cores and is responsible for managing them.

John, please apply this patch series to wireless-dev.

-- 
Greetings Michael.

Re: [PATCH 1/2] Add Sonics Silicon Backplane driver

2006-08-23 Thread Michael Buesch

On Wednesday 23 August 2006 12:59, Martin Michlmayr wrote:
 * Michael Buesch [EMAIL PROTECTED] [2006-08-23 11:59]:
  +#define SSB_SPROM1_OEM 0x1076  /* 8 bytes OEM string (rev 1 only) */
  +/* SPROM Revision 2 (inherits from rev 1) */
  +#define SSB_SPROM2_BFLHI   0x1038  /* Boardflags (high 16 bits) */
  +#define SSB_SPROM2_MAXP_A  0x103A  /* A-PHY Max Power */
  +#define  SSB_SPROM2_MAXP_A_HI  0x00FF  /* Max Power High */
  +#define  SSB_SPROM2_MAXP_A_LO  0x1100  /* Max Power Low */
  +#define  SSB_SPROM2_MAXP_A_LO_SHIFT  8
  +#define SSB_SPROM2_PA1LOB0 0x103C  /* A-PHY PowerAmplifier Low Settings */
  +#define SSB_SPROM2_PA1LOB1 0x103E  /* A-PHY PowerAmplifier Low Settings */
  +#define SSB_SPROM2_PA1LOB2 0x1040  /* A-PHY PowerAmplifier Low Settings */
  +#define SSB_SPROM2_PA1HIB0 0x1042  /* A-PHY PowerAmplifier High Settings */
  +#define SSB_SPROM2_PA1HIB1 0x1044  /* A-PHY PowerAmplifier High Settings */
  +#define SSB_SPROM2_PA1HIB2 0x1046  /* A-PHY PowerAmplifier High Settings */
  +#define SSB_SPROM2_OPO 0x1078  /* OFDM Power Offset from CCK Level */
  +#define  SSB_SPROM2_OPO_VALUE  0x00FF
  +#define  SSB_SPROM2_OPO_UNUSED 0xFF00
  +#define SSB_SPROM2_CCODE   0x107C  /* Two char Country Code */
 
 This is badly formatted.  Can you repost, please.

Uhm, no? I don't think the formatting is broken.
Please _apply_ that diff and look again at the resulting file.

(I think the issue is the tabs, which go crazy when the additional diff
plus character is inserted at the beginning of the line.)

-- 
Greetings Michael.

Re: [PATCH 1/2] Add Sonics Silicon Backplane driver

2006-08-23 Thread Martin Michlmayr

* Michael Buesch [EMAIL PROTECTED] [2006-08-23 11:59]:
 +#define SSB_SPROM1_OEM   0x1076  /* 8 bytes OEM string (rev 1 only) */
 +/* SPROM Revision 2 (inherits from rev 1) */
 +#define SSB_SPROM2_BFLHI 0x1038  /* Boardflags (high 16 bits) */
 +#define SSB_SPROM2_MAXP_A0x103A  /* A-PHY Max Power */
 +#define  SSB_SPROM2_MAXP_A_HI0x00FF  /* Max Power High */
 +#define  SSB_SPROM2_MAXP_A_LO0x1100  /* Max Power Low */
 +#define  SSB_SPROM2_MAXP_A_LO_SHIFT  8
 +#define SSB_SPROM2_PA1LOB0   0x103C  /* A-PHY PowerAmplifier Low Settings */
 +#define SSB_SPROM2_PA1LOB1   0x103E  /* A-PHY PowerAmplifier Low Settings */
 +#define SSB_SPROM2_PA1LOB2   0x1040  /* A-PHY PowerAmplifier Low Settings */
 +#define SSB_SPROM2_PA1HIB0   0x1042  /* A-PHY PowerAmplifier High Settings */
 +#define SSB_SPROM2_PA1HIB1   0x1044  /* A-PHY PowerAmplifier High Settings */
 +#define SSB_SPROM2_PA1HIB2   0x1046  /* A-PHY PowerAmplifier High Settings */
 +#define SSB_SPROM2_OPO   0x1078  /* OFDM Power Offset from CCK Level */
 +#define  SSB_SPROM2_OPO_VALUE0x00FF
 +#define  SSB_SPROM2_OPO_UNUSED   0xFF00
 +#define SSB_SPROM2_CCODE 0x107C  /* Two char Country Code */

This is badly formatted.  Can you repost, please.
-- 
Martin Michlmayr
[EMAIL PROTECTED]

Re: [PATCH 1/2] Add Sonics Silicon Backplane driver

2006-08-23 Thread Johannes Berg

On Wed, 2006-08-23 at 11:59 +0100, Martin Michlmayr wrote:
  +#define SSB_SPROM2_OPO 0x1078  /* OFDM Power Offset from CCK Level */
  +#define  SSB_SPROM2_OPO_VALUE  0x00FF
  +#define  SSB_SPROM2_OPO_UNUSED 0xFF00
  +#define SSB_SPROM2_CCODE   0x107C  /* Two char Country Code */
 
 This is badly formatted.  Can you repost, please.

That's just because of the tabs, if the + in front is removed it's
formatted fine. And the indentation is supposed to be like that I'd
think.

johannes
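
The tab effect described above can be sketched as follows, assuming 8-column tab stops. The function name display_width() and the sample strings are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TAB_WIDTH 8

/* Column reached after displaying s with a tab stop every TAB_WIDTH columns. */
static size_t display_width(const char *s)
{
	size_t col = 0;

	assert(s != NULL);
	for (; *s; s++) {
		if (*s == '\t')
			col = (col / TAB_WIDTH + 1) * TAB_WIDTH;	/* jump to next stop */
		else
			col++;
	}
	return col;
}
```

A 7-character token followed by a tab ends at column 8, but with a leading '+' the same token already occupies 8 columns, so the tab jumps to column 16. Lines whose text happens to end just before a tab stop therefore shift by a full stop in the diff while neighbouring lines do not, which makes a correctly aligned file look misaligned when quoted.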

Re: [PATCH 1/2] Add Sonics Silicon Backplane driver

2006-08-23 Thread Martin Michlmayr

* Johannes Berg [EMAIL PROTECTED] [2006-08-23 13:04]:
   +#define  SSB_SPROM2_OPO_VALUE0x00FF
   +#define  SSB_SPROM2_OPO_UNUSED   0xFF00
   +#define SSB_SPROM2_CCODE 0x107C  /* Two char Country Code */
  
  This is badly formatted.  Can you repost, please.
 
 That's just because of the tabs, if the + in front is removed it's
 formatted fine. And the indentation is supposed to be like that I'd
 think.

Yes, sorry, I didn't realize that registers and register values have
different indentation.
-- 
Martin Michlmayr
http://www.cyrius.com/

Re: [take13 1/3] kevent: Core files.

2006-08-23 Thread Evgeniy Polyakov

On Wed, Aug 23, 2006 at 05:27:53PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 One can find it in archive on homepage
 http://tservice.net.ru/~s0mbre/old/?section=projects&item=kevent 
 or attached.

Now it is really attached.

-- 
Evgeniy Polyakov
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <sys/mman.h>

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#include <linux/unistd.h>
#include <linux/types.h>

#define PAGE_SIZE   4096
#include <linux/ukevent.h>

#define _syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
type name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4);\
}

#define _syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
  type5,arg5) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4, arg5);\
}

#define _syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
  type5,arg5,type6,arg6) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5, type6 arg6) \
{\
return syscall(__NR_##name, arg1, arg2, arg3, arg4, arg5, arg6);\
}

_syscall4(int, kevent_ctl, int, arg1, unsigned int, argv2, unsigned int, argv3, void *, argv4);
_syscall6(int, kevent_get_events, int, arg1, unsigned int, argv2, unsigned int, argv3, __u64, argv4, void *, argv5, unsigned, arg6);

#define ulog(f, a...) fprintf(stderr, f, ##a)
#define ulog_err(f, a...) ulog(f ": %s [%d].\n", ##a, strerror(errno), errno)

static void usage(char *p)
{
ulog("Usage: %s -t type -e event -o oneshot -p path -n wait_num -f kevent_file -h\n", p);
}

static int get_id(int type, char *path)
{
int ret = -1;

switch (type) {
case KEVENT_TIMER:
ret = 3000;
break;
case KEVENT_INODE:
ret = open(path, O_RDONLY);
break;
}

return ret;
}

static void *evtest_mmap(int fd, off_t *offset, unsigned int number)
{
void *start, *ptr;
off_t o = *offset;

start = NULL;

ptr = mmap(start, PAGE_SIZE*number, PROT_READ, MAP_SHARED, fd, o*PAGE_SIZE);
if (ptr == MAP_FAILED) {
ulog_err("Failed to mmap: start: %p, number: %u, offset: %lu", start, number, o);
return NULL;
}

printf("mmap: ptr: %p, start: %p, number: %u, offset: %lu.\n", ptr, start, number, o);
*offset =  o + number;
return ptr;
}

int main(int argc, char *argv[])
{
int ch, fd, err, type, event, oneshot, wait_num, number;
unsigned int i, num, old_idx;
char *path, *file;
char buf[4096];
struct ukevent *uk;
struct kevent_mring *ring;
off_t offset;

path = NULL;
type = event = -1;
oneshot = 0;
wait_num = 10;
offset = 0;
number = 1;
old_idx = 0;
file = "/dev/kevent";

while ((ch = getopt(argc, argv, "f:p:t:e:o:n:h")) > 0) {
switch (ch) {
case 'f':
file = optarg;
break;
case 'n':
wait_num = atoi(optarg);
break;
case 'p':
path = optarg;
break;
case 't':
type = atoi(optarg);
break;
case 'e':
event = atoi(optarg);
break;
case 'o':
oneshot = atoi(optarg);
break;
default:
usage(argv[0]);
return -1;
}
}

if (event == -1 || type == -1 || (type == KEVENT_INODE && !path)) {
ulog("You need at least -t -e parameters and -p for inode notifications.\n");
usage(argv[0]);
return -1;
}

fd = open(file, O_RDWR);
if (fd == -1) {
ulog_err("Failed create kevent control block using file %s", file);
return -1;
}

ring = evtest_mmap(fd, &offset, number);
if (!ring)
return -1;

memset(buf, 0, sizeof(buf));

num = 1;
for (i=0; i<num; ++i) {
uk = (struct ukevent *)buf;
uk->event = event;
uk->type = type;
if (oneshot)
uk->req_flags |= KEVENT_REQ_ONESHOT;
uk->user[0] = i;
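
(The attached program is cut off in the archive at this point.) The _syscallN macros near the top of the program are thin wrappers around syscall(2). A minimal sketch of the same pattern, using a syscall number every Linux system already has instead of the kevent ones, might look like this. The name raw_getpid() is hypothetical, and the example assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid through the generic syscall(2) entry point. */
static pid_t raw_getpid(void)
{
	long ret = syscall(SYS_getpid);

	assert(ret > 0);	/* getpid cannot fail */
	return (pid_t)ret;
}
```

For a syscall that is not yet in the installed headers, such as the proposed kevent_ctl, the number would have to come from the patched kernel headers; syscall() then returns -1 with errno set to ENOSYS on kernels that lack it.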

Re: call panic if nl_table allocation fails

2006-08-23 Thread James Morris

On Wed, 23 Aug 2006, Akinobu Mita wrote:

 This patch makes crash happen if initialization of nl_table fails
 in initcalls. It is better than getting use after free crash later.

   nl_table = kcalloc(MAX_LINKS, sizeof(*nl_table), GFP_KERNEL);

Perhaps it'd be better to declare this as an array rather than allocating 
it at runtime.



- James
-- 
James Morris
[EMAIL PROTECTED]

Re: call panic if nl_table allocation fails

2006-08-23 Thread Patrick McHardy

James Morris wrote:
 On Wed, 23 Aug 2006, Akinobu Mita wrote:
 
 
This patch makes crash happen if initialization of nl_table fails
in initcalls. It is better than getting use after free crash later.
 
 
  nl_table = kcalloc(MAX_LINKS, sizeof(*nl_table), GFP_KERNEL);
 
 
 Perhaps it'd be better to declare this as an array rather than allocating 
 it at runtime.

That would still leave the MAX_LINKS allocations for the pid hashes
which need to be allocated because they are dynamically sized. We
could delay the pid hash allocations until the first bind happens
of course, but I doubt it would be worth it since they start with
just a single bucket.
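
A userspace sketch of the trade-off discussed above, with hypothetical names and an illustrative MAX_LINKS value: the fixed-size table can be a static array that cannot fail, while the per-entry hash buckets still need a runtime allocation because they grow later.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LINKS 32	/* illustrative size, not taken from the kernel */

struct pid_hash {
	unsigned int nbuckets;
	void **buckets;		/* resized at runtime, so it must be allocated */
};

/* Fixed-size table: declared statically, so this part cannot fail. */
static struct pid_hash table[MAX_LINKS];

/* Returns 0 on success, -1 if any per-entry allocation fails. */
static int table_init(void)
{
	int i;

	assert(table[0].buckets == NULL);	/* must not be initialised twice */
	for (i = 0; i < MAX_LINKS; i++) {
		table[i].buckets = calloc(1, sizeof(*table[i].buckets));
		if (!table[i].buckets)
			goto fail;
		table[i].nbuckets = 1;	/* start with a single bucket */
	}
	return 0;

fail:
	/* Undo the allocations made so far. */
	while (i-- > 0) {
		free(table[i].buckets);
		table[i].buckets = NULL;
		table[i].nbuckets = 0;
	}
	return -1;
}
```

Even with the static array, the initialisation can still fail, so the question of what to do on failure (panic or return an error) remains.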


Re: [RFC][PATCH 1/3] net/ipv4: UDP-Lite extensions

2006-08-23 Thread James Morris

On Wed, 23 Aug 2006, [EMAIL PROTECTED] wrote:

 +void __init udplite4_register(void)
 +{
 + if (proto_register(&udplite_prot, 1))
 + goto out_register_err;
 +
 + if (inet_add_protocol(&udplite_protocol, IPPROTO_UDPLITE) < 0)
 + goto out_unregister_proto;
 +
 + inet_register_protosw(&udplite4_protosw);
 +
 + return;
 +
 +out_unregister_proto:
 + proto_unregister(&udplite_prot);
 +out_register_err:
 + printk(KERN_ERR "udplite4_register: Cannot add UDP-Lite protocol\n");
 +}

Other protocols and network components call panic() if they fail during boot 
initialization.  Not sure if this is a great thing, but it raises the 
issue of whether udp-lite should remain consistent here.



- James
-- 
James Morris
[EMAIL PROTECTED]
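
For readers unfamiliar with the goto-based unwinding used in the quoted function, here is a userspace sketch with hypothetical stand-in steps. Unlike the quoted code it returns an error code, which is an assumption made here so that the caller could decide between logging and panicking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the registration steps; each can be forced to fail. */
static bool fail_step[2];
static int registered;	/* number of steps currently registered */

static int step_register(int n)
{
	if (fail_step[n])
		return -1;
	registered++;
	return 0;
}

static void step_unregister(void)
{
	assert(registered > 0);
	registered--;
}

/* Register two steps; on failure undo whatever succeeded and report it. */
static int demo_register(void)
{
	if (step_register(0))
		goto out_err;
	if (step_register(1) < 0)
		goto out_unregister_first;
	return 0;

out_unregister_first:
	step_unregister();
out_err:
	return -1;
}
```

The labels are ordered so that a later failure falls through the cleanup for every earlier step, which keeps each step's undo code in exactly one place.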

Re: multicast group memberships purge on interface delete

2006-08-23 Thread jamal

On Wed, 2006-23-08 at 15:29 +0200, Michal Růžička wrote:

 No need to rmmod anything, just think of ppp or gre interfaces which come and
 go without any modules loading/unloading. But yes, the rmmod would probably be
 needed in case of, for example, an ethernet device.
 

Ok - same effect, i.e. the same events would be generated if a gre
disappears or an ethernet is rmmoded.

  The challenge is to make the app
  also aware of you taking away the group from underneath them (that's why
  I said fix it)
 
 
 I don't see this as any challenge, as the applications could just assume that
 any memberships on deleted interfaces have been dropped implicitly by the
 kernel.

How would they know that the interface has been deleted?
If you have the answer to that, then why don't you do the
unsubscriptions/leaves as well?

 (This should be no problem for them provided that they keep track of
 the interfaces present on the system, which they should anyway or otherwise
 they could end up listening to just a part of the multicast traffic they are
 interested in.)
 

Right. So does your app do this?

 In fact I had proposed that on the application mailing list (the application
 is quagga, formerly zebra routing suite, to be specific) but the people there
 disliked it because, for example, NetBSD (as I noted in my previous post)
 does the group leaves implicitly on interface delete and the explicit group
 leaves fail there (and reportedly on other OSes too).
 Sure, this can be solved by some conditional compilation.
 This is why my post was more a theoretical design question/suggestion than
 a feature request (or a bug report).
 
 In this sense, what do you think about the possible benefit of the proposed
 approach for maintaining the per-interface multicast reception state?

An argument can be made that if you joined the groups from the app,
then the app should be responsible for unsubscribing, i.e. this is a policy
decision.
You could have the kernel implement your policy as you described, but in
my view you would have to tell it. And conditional compilation or some
way of telling the kernel would fit in such a case.
There is probably a good reason why NetBSD insists on doing it in the
kernel; do you know what this reason is?

cheers,
jamal




[PATCH 20/44] [IPV6] MIP6: Add inbound interface of routing header type 2.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add inbound interface of routing header type 2 for Mobile IPv6.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/addrconf.h |7 +
 net/ipv6/exthdrs.c |   69 ++--
 2 files changed, 68 insertions(+), 8 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 3d71251..5fc8627 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -61,6 +61,13 @@ extern int   addrconf_set_dstaddr(void _
 extern int ipv6_chk_addr(struct in6_addr *addr,
  struct net_device *dev,
  int strict);
+/* XXX: this is a placeholder till addrconf supports */
+#ifdef CONFIG_IPV6_MIP6
+static inline int ipv6_chk_home_addr(struct in6_addr *addr)
+{
+   return 0;
+}
+#endif
 extern struct inet6_ifaddr *   ipv6_get_ifaddr(struct in6_addr *addr,
struct net_device *dev,
int strict);
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index fe3f737..8d4af75 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -43,6 +43,9 @@ #include <net/rawv6.h>
 #include <net/ndisc.h>
 #include <net/ip6_route.h>
 #include <net/addrconf.h>
+#ifdef CONFIG_IPV6_MIP6
+#include <net/xfrm.h>
+#endif
 
 #include <asm/uaccess.h>
 
@@ -219,7 +222,7 @@ static int ipv6_rthdr_rcv(struct sk_buff
 {
struct sk_buff *skb = *skbp;
struct inet6_skb_parm *opt = IP6CB(skb);
-   struct in6_addr *addr;
+   struct in6_addr *addr = NULL;
struct in6_addr daddr;
int n, i;
 
@@ -244,6 +247,23 @@ static int ipv6_rthdr_rcv(struct sk_buff
 
 looped_back:
if (hdr->segments_left == 0) {
+   switch (hdr->type) {
+#ifdef CONFIG_IPV6_MIP6
+   case IPV6_SRCRT_TYPE_2:
+   /* Silently discard type 2 header unless it was
+* processed by own
+*/
+   if (!addr) {
+   IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS);
+   kfree_skb(skb);
+   return -1;
+   }
+   break;
+#endif
+   default:
+   break;
+   }
+
opt->lastopt = skb->h.raw - skb->nh.raw;
opt->srcrt = skb->h.raw - skb->nh.raw;
skb->h.raw += (hdr->hdrlen + 1) << 3;
@@ -253,17 +273,29 @@ looped_back:
return 1;
}
 
-   if (hdr->type != IPV6_SRCRT_TYPE_0) {
+   switch (hdr->type) {
+   case IPV6_SRCRT_TYPE_0:
+   if (hdr->hdrlen & 0x01) {
+   IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS);
+   icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->hdrlen) - skb->nh.raw);
+   return -1;
+   }
+   break;
+#ifdef CONFIG_IPV6_MIP6
+   case IPV6_SRCRT_TYPE_2:
+   /* Silently discard invalid RTH type 2 */
+   if (hdr->hdrlen != 2 || hdr->segments_left != 1) {
+   IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS);
+   kfree_skb(skb);
+   return -1;
+   }
+   break;
+#endif
+   default:
IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS);
icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->type) - skb->nh.raw);
return -1;
}
-   
-   if (hdr->hdrlen & 0x01) {
-   IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS);
-   icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->hdrlen) - skb->nh.raw);
-   return -1;
-   }
 
/*
 *  This is the routing header forwarding algorithm from
@@ -303,6 +335,27 @@ looped_back:
addr = rthdr->addr;
addr += i - 1;
 
+   switch (hdr->type) {
+#ifdef CONFIG_IPV6_MIP6
+   case IPV6_SRCRT_TYPE_2:
+   if (xfrm6_input_addr(skb, (xfrm_address_t *)addr,
+(xfrm_address_t *)&skb->nh.ipv6h->saddr,
+IPPROTO_ROUTING) < 0) {
+   IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS);
+   kfree_skb(skb);
+   return -1;
+   }
+   if (!ipv6_chk_home_addr(addr)) {
+   IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS);
+   kfree_skb(skb);
+   return -1;
+   }
+   break;
+#endif
+   default:
+   break;
+   }
+
if (ipv6_addr_is_multicast(addr)) {
IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS);

[PATCH 23/44] [IPV6]: Allow to replace skbuff by TLV parser.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

When receiving a Mobile IPv6 home address option, which is a TLV carried
by the destination options header, the kernel will try to mangle the source
address of the packet. Because the skbuff may be cloned, the parser must be
able to replace it, just as in the routing header case.
This is a framework that allows a TLV parser to replace the
inbound skbuff pointer.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/ipv6.h   |2 +-
 net/ipv6/exthdrs.c   |   29 +++--
 net/ipv6/ip6_input.c |2 +-
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index c4ea127..8e6ec60 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -229,7 +229,7 @@ extern int  ip6_ra_control(struct sock
   void (*destructor)(struct sock 
*));
 
 
-extern int ipv6_parse_hopopts(struct sk_buff *skb);
+extern int ipv6_parse_hopopts(struct sk_buff **skbp);
 
 extern struct ipv6_txoptions *  ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt);
 extern struct ipv6_txoptions * ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 72e8175..fc972c1 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -102,7 +102,7 @@ int ipv6_find_tlv(struct sk_buff *skb, i
 
 struct tlvtype_proc {
int type;
-   int (*func)(struct sk_buff *skb, int offset);
+   int (*func)(struct sk_buff **skbp, int offset);
 };
 
 /*
@@ -111,8 +111,10 @@ struct tlvtype_proc {
 
 /* An unknown option is detected, decide what to do */
 
-static int ip6_tlvopt_unknown(struct sk_buff *skb, int optoff)
+static int ip6_tlvopt_unknown(struct sk_buff **skbp, int optoff)
 {
+   struct sk_buff *skb = *skbp;
+
switch ((skb->nh.raw[optoff] & 0xC0) >> 6) {
case 0: /* ignore */
return 1;
@@ -137,8 +139,9 @@ static int ip6_tlvopt_unknown(struct sk_
 
 /* Parse tlv encoded option header (hop-by-hop or destination) */
 
-static int ip6_parse_tlv(struct tlvtype_proc *procs, struct sk_buff *skb)
+static int ip6_parse_tlv(struct tlvtype_proc *procs, struct sk_buff **skbp)
 {
+   struct sk_buff *skb = *skbp;
struct tlvtype_proc *curr;
int off = skb->h.raw - skb->nh.raw;
int len = ((skb->h.raw[1]+1)<<3);
@@ -168,13 +171,13 @@ static int ip6_parse_tlv(struct tlvtype_
/* type specific length/alignment 
   checks will be performed in the 
   func(). */
-   if (curr->func(skb, off) == 0)
+   if (curr->func(skbp, off) == 0)
return 0;
break;
}
}
if (curr->type < 0) {
-   if (ip6_tlvopt_unknown(skb, off) == 0)
+   if (ip6_tlvopt_unknown(skbp, off) == 0)
return 0;
}
break;
@@ -213,7 +216,8 @@ static int ipv6_destopt_rcv(struct sk_bu
opt->lastopt = skb->h.raw - skb->nh.raw;
opt->dst1 = skb->h.raw - skb->nh.raw;
 
-   if (ip6_parse_tlv(tlvprocdestopt_lst, skb)) {
+   if (ip6_parse_tlv(tlvprocdestopt_lst, skbp)) {
+   skb = *skbp;
skb->h.raw += ((skb->h.raw[1]+1)<<3);
opt->nhoff = opt->dst1;
return 1;
@@ -517,8 +521,10 @@ EXPORT_SYMBOL_GPL(ipv6_invert_rthdr);
 
 /* Router Alert as of RFC 2711 */
 
-static int ipv6_hop_ra(struct sk_buff *skb, int optoff)
+static int ipv6_hop_ra(struct sk_buff **skbp, int optoff)
 {
+   struct sk_buff *skb = *skbp;
+
if (skb->nh.raw[optoff+1] == 2) {
IP6CB(skb)->ra = optoff;
return 1;
@@ -531,8 +537,9 @@ static int ipv6_hop_ra(struct sk_buff *s
 
 /* Jumbo payload */
 
-static int ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
+static int ipv6_hop_jumbo(struct sk_buff **skbp, int optoff)
 {
+   struct sk_buff *skb = *skbp;
u32 pkt_len;
 
if (skb->nh.raw[optoff+1] != 4 || (optoff&3) != 2) {
@@ -581,8 +588,9 @@ static struct tlvtype_proc tlvprochopopt
{ -1, }
 };
 
-int ipv6_parse_hopopts(struct sk_buff *skb)
+int ipv6_parse_hopopts(struct sk_buff **skbp)
 {
+   struct sk_buff *skb = *skbp;
struct inet6_skb_parm *opt = IP6CB(skb);
 
/*
@@ -598,7 +606,8 @@ int ipv6_parse_hopopts(struct sk_buff *s
}
 
opt->hop = sizeof(struct ipv6hdr);
-   if (ip6_parse_tlv(tlvprochopopt_lst, skb)) {
+   if (ip6_parse_tlv(tlvprochopopt_lst, skbp)) {
+   skb =
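
(The patch text is cut off in the archive at this point.) The reason for passing a pointer to the pointer can be shown with a small userspace analogue: if the callee has to replace a shared buffer with a private copy, the caller only sees the new buffer when the callee can write the pointer back. struct buf and buf_unshare() are hypothetical stand-ins for the skb handling, not kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted buffer standing in for an skb. */
struct buf {
	int users;
	char data[16];
};

/*
 * Make *bp safe to modify. If the buffer is shared, switch to a private
 * copy and store the new pointer back through bp.
 * Returns 0 on success, -1 if the copy cannot be allocated.
 */
static int buf_unshare(struct buf **bp)
{
	struct buf *old = *bp, *copy;

	assert(old != NULL && old->users > 0);
	if (old->users == 1)
		return 0;	/* already private */

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	memcpy(copy->data, old->data, sizeof(copy->data));
	copy->users = 1;
	old->users--;		/* drop our reference to the shared buffer */
	*bp = copy;
	return 0;
}
```

After such a call the caller must reload any local copy of the pointer, which is what the added "skb = *skbp;" lines in the patch do.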

[PATCH 12/44] [XFRM] STATE: Add a hook to obtain local/remote outbound address.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

In IPsec tunnel mode, outbound transformation replaces both the source and
destination addresses with the state's end-point addresses at the same time.
Mobile IPv6 route optimization also needs to change them, but with the
following differences:
 - the resulting address is not an end-point but a care-of address
 - only one of source or destination is replaced for each state
This hook is a common platform for changing the outbound address.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/xfrm.h  |2 ++
 net/ipv6/xfrm6_policy.c |   20 ++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 6d0dafb..cbfacdb 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -267,6 +267,8 @@ struct xfrm_type
int (*input)(struct xfrm_state *, struct sk_buff *skb);
int (*output)(struct xfrm_state *, struct sk_buff *pskb);
int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **);
+   xfrm_address_t  *(*local_addr)(struct xfrm_state *, xfrm_address_t *);
+   xfrm_address_t  *(*remote_addr)(struct xfrm_state *, xfrm_address_t *);
/* Estimate maximal size of result of transformation of a dgram */
u32 (*get_max_size)(struct xfrm_state *, int size);
 };
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 81355bb..9328fc8 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -59,6 +59,22 @@ __xfrm6_find_bundle(struct flowi *fl, st
return dst;
 }
 
+static inline struct in6_addr*
+__xfrm6_bundle_addr_remote(struct xfrm_state *x, struct in6_addr *addr)
+{
+   return (x->type->remote_addr) ?
+   (struct in6_addr*)x->type->remote_addr(x, (xfrm_address_t *)addr) :
+   (struct in6_addr*)&x->id.daddr;
+}
+
+static inline struct in6_addr*
+__xfrm6_bundle_addr_local(struct xfrm_state *x, struct in6_addr *addr)
+{
+   return (x->type->local_addr) ?
+   (struct in6_addr*)x->type->local_addr(x, (xfrm_address_t *)addr) :
+   (struct in6_addr*)&x->props.saddr;
+}
+
 /* Allocate chain of dst_entry's, attach known xfrm's, calculate
  * all the metrics... Shortly, bundle a bundle.
  */
@@ -115,8 +131,8 @@ __xfrm6_bundle_create(struct xfrm_policy
dst1->next = dst_prev;
dst_prev = dst1;
if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) {
-   remote = (struct in6_addr*)&xfrm[i]->id.daddr;
-   local  = (struct in6_addr*)&xfrm[i]->props.saddr;
+   remote = __xfrm6_bundle_addr_remote(xfrm[i], remote);
+   local  = __xfrm6_bundle_addr_local(xfrm[i], local);
tunnel = 1;
}
header_len += xfrm[i]->props.header_len;
-- 
1.4.0
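
The pattern this patch introduces, an optional hook in an ops table with a fallback when the hook is absent, can be sketched in isolation as follows. All names here (struct state, struct state_type, bundle_addr_remote, keep_addr) and the use of int as the address type are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

struct state;

/* Hypothetical ops table: the hook is optional and may be left NULL. */
struct state_type {
	const int *(*remote_addr)(const struct state *x, const int *addr);
};

struct state {
	const struct state_type *type;
	int daddr;	/* stand-in for the state's end-point address */
};

/* Use the type's hook when present, otherwise the end-point address. */
static const int *bundle_addr_remote(const struct state *x, const int *addr)
{
	assert(x != NULL && x->type != NULL);
	return x->type->remote_addr ? x->type->remote_addr(x, addr) : &x->daddr;
}

/* Example hook: keep whatever address the caller already has. */
static const int *keep_addr(const struct state *x, const int *addr)
{
	(void)x;
	return addr;
}
```

Types that do not set the hook keep the old tunnel-mode behaviour unchanged, so existing xfrm types need no modification.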


[PATCH 35/44] [XFRM]: Trace which secpath state is reject factor.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

For Mobile IPv6 usage, it is necessary to trace which secpath state caused the
rejection in order to notify user space (so that it knows which address cannot
be used for route-optimized communication).
Based on MIPL2 kernel patch.

This patch was also written by: Henrik Petander [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/xfrm.h |1 +
 net/xfrm/xfrm_policy.c |   55 ++--
 2 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index bf6daaa..276884f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -274,6 +274,7 @@ #define XFRM_TYPE_NON_FRAGMENT  1
void(*destructor)(struct xfrm_state *);
int (*input)(struct xfrm_state *, struct sk_buff *skb);
int (*output)(struct xfrm_state *, struct sk_buff *pskb);
+   int (*reject)(struct xfrm_state *, struct sk_buff *, struct flowi *);
int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **);
xfrm_address_t  *(*local_addr)(struct xfrm_state *, xfrm_address_t *);
xfrm_address_t  *(*remote_addr)(struct xfrm_state *, xfrm_address_t *);
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index c009d6f..7b446a9 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -988,6 +988,23 @@ error:
 }
 EXPORT_SYMBOL(xfrm_lookup);
 
+static inline int
+xfrm_secpath_reject(int idx, struct sk_buff *skb, struct flowi *fl)
+{
+   struct xfrm_state *x;
+   int err;
+
+   if (!skb->sp || idx < 0 || idx >= skb->sp->len)
+   return 0;
+   x = skb->sp->xvec[idx];
+   if (!x->type->reject)
+   return 0;
+   xfrm_state_hold(x);
+   err = x->type->reject(x, skb, fl);
+   xfrm_state_put(x);
+   return err;
+}
+
 /* When skb is transformed back to its native form, we have to
  * check policy restrictions. At the moment we make this in maximally
  * stupid way. Shame on me. :-) Of course, connected sockets must
@@ -1010,6 +1027,13 @@ xfrm_state_ok(struct xfrm_tmpl *tmpl, st
  xfrm_state_addr_cmp(tmpl, x, family));
 }
 
+/*
+ * A value >= 0 is returned when validation succeeds (either a bypass
+ * because of optional transport mode, or the next index after the matched
+ * secpath state for the template).
+ * -1 is returned when no matching template is found.
+ * Otherwise -2 - errored_index is returned.
+ */
 static inline int
 xfrm_policy_ok(struct xfrm_tmpl *tmpl, struct sec_path *sp, int start,
   unsigned short family)
@@ -1024,8 +1048,11 @@ xfrm_policy_ok(struct xfrm_tmpl *tmpl, s
for (; idx < sp->len; idx++) {
if (xfrm_state_ok(tmpl, sp->xvec[idx], family))
return ++idx;
-   if (sp->xvec[idx]->props.mode != XFRM_MODE_TRANSPORT)
+   if (sp->xvec[idx]->props.mode != XFRM_MODE_TRANSPORT) {
+   if (start == -1)
+   start = -2-idx;
break;
+   }
}
return start;
 }
@@ -1046,11 +1073,14 @@ xfrm_decode_session(struct sk_buff *skb,
 }
 EXPORT_SYMBOL(xfrm_decode_session);
 
-static inline int secpath_has_nontransport(struct sec_path *sp, int k)
+static inline int secpath_has_nontransport(struct sec_path *sp, int k, int *idxp)
 {
for (; k < sp->len; k++) {
-   if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT)
+   if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) {
+   if (idxp)
+   *idxp = k;
return 1;
+   }
}
 
return 0;
@@ -1062,6 +1092,8 @@ int __xfrm_policy_check(struct sock *sk,
 	struct xfrm_policy *pol;
 	struct flowi fl;
 	u8 fl_dir = policy_to_flow_dir(dir);
+	int xerr_idx = -1;
+	int *xerr_idxp = &xerr_idx;
 
 	if (xfrm_decode_session(skb, &fl, family) < 0)
 		return 0;
@@ -1086,8 +1118,13 @@ int __xfrm_policy_check(struct sock *sk,
 		pol = flow_cache_lookup(&fl, family, fl_dir,
 					xfrm_policy_lookup);
 
-	if (!pol)
-		return !skb->sp || !secpath_has_nontransport(skb->sp, 0);
+	if (!pol) {
+		if (skb->sp && secpath_has_nontransport(skb->sp, 0, xerr_idxp)) {
+			xfrm_secpath_reject(xerr_idx, skb, &fl);
+			return 0;
+		}
+		return 1;
+	}
 
 	pol->curlft.use_time = (unsigned long)xtime.tv_sec;
 
@@ -1107,11 +1144,14 @@ int __xfrm_policy_check(struct sock *sk,
 */
 	for (i = pol->xfrm_nr-1, k = 0; i >= 0; i--) {
 		k = xfrm_policy_ok(pol->xfrm_vec+i,
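
The return-value convention documented above for xfrm_policy_ok() packs three outcomes into one int. The following is a minimal user-space sketch of that encoding, not kernel code: the enum and the names encode_reject() and classify() are hypothetical, and only the numeric convention (0 or more, -1, "-2 - errored_index") is taken from the patch comment.

```c
#include <assert.h>

/* Hypothetical names; only the numeric convention comes from the patch. */
enum tmpl_result { TMPL_MATCHED, TMPL_NO_MATCH, TMPL_REJECTED };

/* Encode a failure at secpath index idx (idx >= 0) as -2 - idx. */
static int encode_reject(int idx)
{
	assert(idx >= 0);
	return -2 - idx;
}

/*
 * Classify a return value. On TMPL_REJECTED, *idx receives the secpath
 * index that caused the failure; otherwise *idx is left untouched.
 */
static enum tmpl_result classify(int ret, int *idx)
{
	if (ret >= 0)
		return TMPL_MATCHED;	/* ret is the next index to examine */
	if (ret == -1)
		return TMPL_NO_MATCH;
	*idx = -(2 + ret);		/* inverse of -2 - idx */
	return TMPL_REJECTED;
}
```

Because -1 is reserved for "no matching template", error indexes start at -2, so index 0 stays distinguishable from both success and the plain no-match case.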

[PATCH 42/44] [XFRM] POLICY: Support netlink socket interface for sub policy.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Sub policies can be configured through the netlink socket interface.
PF_KEY handles main policies only; supporting sub policies there is a TODO.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/xfrm.h |7 +++
 include/net/xfrm.h   |1 
 net/key/af_key.c |   18 +--
 net/xfrm/xfrm_user.c |  134 +-
 4 files changed, 142 insertions(+), 18 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 492fb98..14ecd19 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -230,6 +230,12 @@ enum xfrm_ae_ftype_t {
 #define XFRM_AE_MAX (__XFRM_AE_MAX - 1)
 };
 
+struct xfrm_userpolicy_type {
+	__u8		type;
+	__u16		reserved1;
+	__u8		reserved2;
+};
+
 /* Netlink message attributes.  */
 enum xfrm_attr_type_t {
XFRMA_UNSPEC,
@@ -248,6 +254,7 @@ enum xfrm_attr_type_t {
XFRMA_SRCADDR,  /* xfrm_address_t */
XFRMA_COADDR,   /* xfrm_address_t */
XFRMA_LASTUSED,
+   XFRMA_POLICY_TYPE,  /* struct xfrm_userpolicy_type */
__XFRMA_MAX
 
 #define XFRMA_MAX (__XFRMA_MAX - 1)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index aab31a2..0f1117d 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -204,6 +204,7 @@ struct km_event
u32 proto;
u32 byid;
u32 aevent;
+   u32 type;
} data;
 
u32 seq;
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 19e047b..83b443d 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1731,7 +1731,8 @@ static u32 gen_reqid(void)
++reqid;
if (reqid == 0)
reqid = IPSEC_MANUAL_REQID_MAX+1;
-   if (xfrm_policy_walk(check_reqid, (void*)reqid) != -EEXIST)
+   if (xfrm_policy_walk(XFRM_POLICY_TYPE_MAIN, check_reqid,
+(void*)reqid) != -EEXIST)
return reqid;
} while (reqid != start);
return 0;
@@ -2268,7 +2269,8 @@ static int pfkey_spddelete(struct sock *
 		return err;
 	}
 
-	xp = xfrm_policy_bysel_ctx(pol->sadb_x_policy_dir-1, &sel, tmp.security, 1);
+	xp = xfrm_policy_bysel_ctx(XFRM_POLICY_TYPE_MAIN, pol->sadb_x_policy_dir-1,
+				   &sel, tmp.security, 1);
 	security_xfrm_policy_free(&tmp);
if (xp == NULL)
return -ENOENT;
@@ -2330,7 +2332,7 @@ static int pfkey_spdget(struct sock *sk,
 	if (dir >= XFRM_POLICY_MAX)
 		return -EINVAL;
 
-	xp = xfrm_policy_byid(dir, pol->sadb_x_policy_id,
+	xp = xfrm_policy_byid(XFRM_POLICY_TYPE_MAIN, dir, pol->sadb_x_policy_id,
 			      hdr->sadb_msg_type == SADB_X_SPDDELETE2);
if (xp == NULL)
return -ENOENT;
@@ -2378,7 +2380,7 @@ static int pfkey_spddump(struct sock *sk
 {
 	struct pfkey_dump_data data = { .skb = skb, .hdr = hdr, .sk = sk };
 
-	return xfrm_policy_walk(dump_sp, &data);
+	return xfrm_policy_walk(XFRM_POLICY_TYPE_MAIN, dump_sp, &data);
 }
 
 static int key_notify_policy_flush(struct km_event *c)
@@ -2405,7 +2407,8 @@ static int pfkey_spdflush(struct sock *s
 {
struct km_event c;
 
-   xfrm_policy_flush();
+   xfrm_policy_flush(XFRM_POLICY_TYPE_MAIN);
+   c.data.type = XFRM_POLICY_TYPE_MAIN;
 	c.event = XFRM_MSG_FLUSHPOLICY;
 	c.pid = hdr->sadb_msg_pid;
 	c.seq = hdr->sadb_msg_seq;
@@ -2667,6 +2670,9 @@ static int pfkey_send_notify(struct xfrm
 
 static int pfkey_send_policy_notify(struct xfrm_policy *xp, int dir, struct km_event *c)
 {
+	if (xp && xp->type != XFRM_POLICY_TYPE_MAIN)
+		return 0;
+
 	switch (c->event) {
 	case XFRM_MSG_POLEXPIRE:
 		return key_notify_policy_expire(xp, c);
@@ -2675,6 +2681,8 @@ static int pfkey_send_policy_notify(stru
 	case XFRM_MSG_UPDPOLICY:
 		return key_notify_policy(xp, dir, c);
 	case XFRM_MSG_FLUSHPOLICY:
+		if (c->data.type != XFRM_POLICY_TYPE_MAIN)
+			break;
 		return key_notify_policy_flush(c);
 	default:
 		printk("pfkey: Unknown policy event %d\n", c->event);
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index a4a4dd6..a096586 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -784,6 +784,22 @@ static int verify_policy_dir(__u8 dir)
return 0;
 }
 
+static int verify_policy_type(__u8 type)
+{
+   switch (type) {
+   case XFRM_POLICY_TYPE_MAIN:
+#ifdef CONFIG_XFRM_SUB_POLICY
+   case XFRM_POLICY_TYPE_SUB:
+#endif
+   break;
+
+   default:
+   return -EINVAL;
+   };
+
+   return 0;
+}
+
 static int verify_newpolicy_info(struct xfrm_userpolicy_info *p)
 {
 	switch (p->share) {
@@

[PATCH 6/44] [XFRM] STATE: Search by address using source address list.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add support for searching transformation states by address, using a
source address list, for Mobile IPv6.
To make this usable from user space, also add a netlink attribute type
(XFRMA_SRCADDR) that carries the source address as an xfrm state option.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/xfrm.h   |1 +
 include/net/xfrm.h |2 ++
 net/ipv4/xfrm4_state.c |9 +++
 net/ipv6/xfrm6_state.c |   21 +
 net/xfrm/xfrm_state.c  |   37 +++---
 net/xfrm/xfrm_user.c   |   59 +++-
 6 files changed, 119 insertions(+), 10 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 5154064..66343d3 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -234,6 +234,7 @@ enum xfrm_attr_type_t {
XFRMA_REPLAY_VAL,
XFRMA_REPLAY_THRESH,
XFRMA_ETIMER_THRESH,
+   XFRMA_SRCADDR,  /* xfrm_address_t */
__XFRMA_MAX
 
 #define XFRMA_MAX (__XFRMA_MAX - 1)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 4933f46..bd51224 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -245,6 +245,7 @@ struct xfrm_state_afinfo {
 						struct xfrm_tmpl *tmpl,
 						xfrm_address_t *daddr, xfrm_address_t *saddr);
 	struct xfrm_state	*(*state_lookup)(xfrm_address_t *daddr, u32 spi, u8 proto);
+	struct xfrm_state	*(*state_lookup_byaddr)(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto);
 	struct xfrm_state	*(*find_acq)(u8 mode, u32 reqid, u8 proto,
 					     xfrm_address_t *daddr, xfrm_address_t *saddr,
 					     int create);
@@ -937,6 +938,7 @@ extern void xfrm_state_insert(struct xfr
 extern int xfrm_state_add(struct xfrm_state *x);
 extern int xfrm_state_update(struct xfrm_state *x);
 extern struct xfrm_state *xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family);
+extern struct xfrm_state *xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family);
 extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
 extern void xfrm_state_flush(u8 proto);
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
index c56b258..616be13 100644
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -80,6 +80,14 @@ __xfrm4_state_lookup(xfrm_address_t *dad
return NULL;
 }
 
+/* placeholder until ipv4's code is written */
+static struct xfrm_state *
+__xfrm4_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr,
+   u8 proto)
+{
+   return NULL;
+}
+
 static struct xfrm_state *
 __xfrm4_find_acq(u8 mode, u32 reqid, u8 proto, 
 xfrm_address_t *daddr, xfrm_address_t *saddr, 
@@ -137,6 +145,7 @@ static struct xfrm_state_afinfo xfrm4_st
.init_flags = xfrm4_init_flags,
.init_tempsel   = __xfrm4_init_tempsel,
.state_lookup   = __xfrm4_state_lookup,
+   .state_lookup_byaddr= __xfrm4_state_lookup_byaddr,
.find_acq   = __xfrm4_find_acq,
 };
 
diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index 2fb0785..9c95b9d 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -64,6 +64,26 @@ __xfrm6_init_tempsel(struct xfrm_state *
 }
 
 static struct xfrm_state *
+__xfrm6_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr,
+			    u8 proto)
+{
+	struct xfrm_state *x = NULL;
+	unsigned h;
+
+	h = __xfrm6_src_hash(saddr);
+	list_for_each_entry(x, xfrm6_state_afinfo.state_bysrc+h, bysrc) {
+		if (x->props.family == AF_INET6 &&
+		    ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) &&
+		    ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6) &&
+		    proto == x->id.proto) {
+			xfrm_state_hold(x);
+			return x;
+		}
+	}
+	return NULL;
+}
+
+static struct xfrm_state *
 __xfrm6_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto)
 {
unsigned h = __xfrm6_spi_hash(daddr, spi, proto);
@@ -140,6 +160,7 @@ static struct xfrm_state_afinfo xfrm6_st
.family = AF_INET6,
.init_tempsel   = __xfrm6_init_tempsel,
.state_lookup   = __xfrm6_state_lookup,
+   .state_lookup_byaddr= __xfrm6_state_lookup_byaddr,
.find_acq   = __xfrm6_find_acq,
 };
 
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 2a99928..11f480b 100644
--- a/net/xfrm/xfrm_state.c
+++
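
The lookup added by this patch hashes the source address to pick a bucket, then compares destination address, source address, and protocol on each entry in that bucket. Below is a rough user-space sketch of that idea. Everything in it (struct addr6, struct state, bysrc, src_hash(), NBUCKETS) is a hypothetical stand-in; the kernel's __xfrm6_src_hash() and list handling differ, and the kernel additionally takes a reference on the state it returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8	/* illustrative table size, not the kernel's */

struct addr6 { unsigned char b[16]; };

struct state {
	struct addr6 daddr, saddr;
	unsigned char proto;
	struct state *next;	/* chain within one source-address bucket */
};

static struct state *bysrc[NBUCKETS];

/* Toy hash over the source address only. */
static unsigned int src_hash(const struct addr6 *a)
{
	unsigned int h = 0;

	for (size_t i = 0; i < sizeof(a->b); i++)
		h = h * 31 + a->b[i];
	return h % NBUCKETS;
}

static void state_insert(struct state *x)
{
	unsigned int h = src_hash(&x->saddr);

	x->next = bysrc[h];
	bysrc[h] = x;
}

/* Walk the bucket chosen by saddr; match on daddr, saddr and proto. */
static struct state *lookup_byaddr(const struct addr6 *daddr,
				   const struct addr6 *saddr,
				   unsigned char proto)
{
	assert(daddr != NULL && saddr != NULL);
	for (struct state *x = bysrc[src_hash(saddr)]; x; x = x->next)
		if (x->proto == proto &&
		    !memcmp(&x->daddr, daddr, sizeof(*daddr)) &&
		    !memcmp(&x->saddr, saddr, sizeof(*saddr)))
			return x;
	return NULL;
}
```

Unlike the SPI-based lookup, no SPI takes part in the match, which suits Mobile IPv6 states (routing header, destination options) that are created with an SPI of zero.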

[PATCH 29/44] [IPV6] MIP6: Add destination options header transformation.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add destination options header transformation for Mobile IPv6.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Noriaki TAKAMIYA [EMAIL PROTECTED]
Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/mip6.h |3 +
 net/ipv6/mip6.c|  167 
 2 files changed, 170 insertions(+), 0 deletions(-)

diff --git a/include/net/mip6.h b/include/net/mip6.h
index 644b8b6..42b65ba 100644
--- a/include/net/mip6.h
+++ b/include/net/mip6.h
@@ -25,6 +25,9 @@
 #ifndef _NET_MIP6_H
 #define _NET_MIP6_H
 
+#define MIP6_OPT_PAD_1 0
+#define MIP6_OPT_PAD_N 1
+
 extern int mip6_init(void);
 extern void mip6_fini(void);
 
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 63e548b..a8adf89 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -35,6 +35,165 @@ static xfrm_address_t *mip6_xfrm_addr(st
return x-coaddr;
 }
 
+static inline unsigned int calc_padlen(unsigned int len, unsigned int n)
+{
+	return (n - len + 16) & 0x7;
+}
+
+static inline void *mip6_padn(__u8 *data, __u8 padlen)
+{
+	if (!data)
+		return NULL;
+	if (padlen == 1) {
+		data[0] = MIP6_OPT_PAD_1;
+	} else if (padlen > 1) {
+		data[0] = MIP6_OPT_PAD_N;
+		data[1] = padlen - 2;
+		if (padlen > 2)
+			memset(data+2, 0, data[1]);
+	}
+	return data + padlen;
+}
+
+static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph = skb->nh.ipv6h;
+	struct ipv6_destopt_hdr *destopt = (struct ipv6_destopt_hdr *)skb->data;
+
+	if (!ipv6_addr_equal(&iph->saddr, (struct in6_addr *)x->coaddr) &&
+	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
+		return -ENOENT;
+
+	return destopt->nexthdr;
+}
+
+/* Destination Option Header is inserted.
+ * IP Header's src address is replaced with Home Address Option in
+ * Destination Option Header.
+ */
+static int mip6_destopt_output(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph;
+	struct ipv6_destopt_hdr *dstopt;
+	struct ipv6_destopt_hao *hao;
+	u8 nexthdr;
+	int len;
+
+	iph = (struct ipv6hdr *)skb->data;
+	iph->payload_len = htons(skb->len - sizeof(*iph));
+
+	nexthdr = *skb->nh.raw;
+	*skb->nh.raw = IPPROTO_DSTOPTS;
+
+	dstopt = (struct ipv6_destopt_hdr *)skb->h.raw;
+	dstopt->nexthdr = nexthdr;
+
+	hao = mip6_padn((char *)(dstopt + 1),
+			calc_padlen(sizeof(*dstopt), 6));
+
+	hao->type = IPV6_TLV_HAO;
+	hao->length = sizeof(*hao) - 2;
+	BUG_TRAP(hao->length == 16);
+
+	len = ((char *)hao - (char *)dstopt) + sizeof(*hao);
+
+	memcpy(&hao->addr, &iph->saddr, sizeof(hao->addr));
+	memcpy(&iph->saddr, x->coaddr, sizeof(iph->saddr));
+
+	BUG_TRAP(len == x->props.header_len);
+	dstopt->hdrlen = (x->props.header_len >> 3) - 1;
+
+	return 0;
+}
+
+static int mip6_destopt_offset(struct xfrm_state *x, struct sk_buff *skb,
+			       u8 **nexthdr)
+{
+	u16 offset = sizeof(struct ipv6hdr);
+	struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(skb->nh.ipv6h + 1);
+	unsigned int packet_len = skb->tail - skb->nh.raw;
+	int found_rhdr = 0;
+
+	*nexthdr = &skb->nh.ipv6h->nexthdr;
+
+	while (offset + 1 <= packet_len) {
+
+		switch (**nexthdr) {
+		case NEXTHDR_HOP:
+			break;
+		case NEXTHDR_ROUTING:
+			found_rhdr = 1;
+			break;
+		case NEXTHDR_DEST:
+			/*
+			 * HAO MUST NOT appear more than once.
+			 * XXX: It is better to try to find by the end of
+			 * XXX: packet if HAO exists.
+			 */
+			if (ipv6_find_tlv(skb, offset, IPV6_TLV_HAO) >= 0) {
+				LIMIT_NETDEBUG(KERN_WARNING "mip6: hao exists already, override\n");
+				return offset;
+			}
+
+			if (found_rhdr)
+				return offset;
+
+			break;
+		default:
+			return offset;
+		}
+
+		offset += ipv6_optlen(exthdr);
+		*nexthdr = &exthdr->nexthdr;
+		exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset);
+	}
+
+	return offset;
+}
+
+static int mip6_destopt_init_state(struct xfrm_state *x)
+{
+	if (x->id.spi) {
+		printk(KERN_INFO "%s: spi is not 0: %u\n", __FUNCTION__,
+		       x->id.spi);
+		return -EINVAL;
+	}
+	if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) {
+
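
The padding helpers in this patch place the Home Address option at an offset of the form 8n+6 inside the destination options header, which is the alignment Mobile IPv6 requires for that option. The following is a user-space sketch of the same arithmetic. The function bodies mirror calc_padlen() and mip6_padn() from the patch, but the unprefixed names, the buffer handling, and the added assert are assumptions made so that it compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPT_PAD_1 0	/* Pad1: a single zero byte */
#define OPT_PAD_N 1	/* PadN: type, length, then `length` zero bytes */

/*
 * Number of padding bytes needed after `len` bytes so that the next
 * option starts at an offset congruent to n modulo 8 (n < 8).
 */
static unsigned int calc_padlen(unsigned int len, unsigned int n)
{
	return (n - len + 16) & 0x7;
}

/* Write padlen bytes of padding at data; return the byte after it. */
static unsigned char *padn(unsigned char *data, unsigned int padlen)
{
	assert(data != NULL && padlen < 8);
	if (padlen == 1) {
		data[0] = OPT_PAD_1;
	} else if (padlen > 1) {
		data[0] = OPT_PAD_N;
		data[1] = (unsigned char)(padlen - 2);
		memset(data + 2, 0, padlen - 2);
	}
	return data + padlen;
}
```

With the 2-byte destination options header (next header, header length), calc_padlen(2, 6) gives 4, so a PadN option with length 2 fills offsets 2 to 5 and the option starts at offset 6. The 18-byte option (type, length, 16-byte address) then ends at byte 24, which makes the hdrlen field (24 >> 3) - 1 = 2.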

[PATCH 40/44] [XFRM] POLICY: sub policy support.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Introduce sub policies. A main policy and a sub policy can both be applied
to the same flow. (The policy type that the current kernel uses is named
main.)
A second, separately managed set of transformation policies is required
to keep the lifetimes of IPsec and Mobile IPv6 policies separate.
The policy that lives in the kernel for the shorter time should be the sub
policy; normally main is for IPsec and sub is for Mobile IPv6.
(Two IPsec policies kept in different databases can be used this way, too.)

Limitations and TODOs:
 - Sub policy is not supported for per-socket policies (a per-socket policy
   is always inserted as main).
 - The current kernel caches outbound bundles by flowi to skip the policy
   database search. This patch disables that cache only when two policies
   are used and the first matched one is a bypass, because neither the
   flowi nor the bundle records the transformation template size.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/xfrm.h   |7 +
 include/net/xfrm.h |   45 +++--
 net/xfrm/xfrm_policy.c |  252 +---
 3 files changed, 260 insertions(+), 44 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 4009f44..492fb98 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -104,6 +104,13 @@ struct xfrm_stats {
 
 enum
 {
+	XFRM_POLICY_TYPE_MAIN	= 0,
+	XFRM_POLICY_TYPE_SUB	= 1,
+	XFRM_POLICY_TYPE_MAX	= 2
+};
+
+enum
+{
 	XFRM_POLICY_IN	= 0,
 	XFRM_POLICY_OUT	= 1,
 	XFRM_POLICY_FWD	= 2,
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 00784d9..5bd6beb 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -342,6 +342,7 @@ struct xfrm_policy
 	atomic_t		refcnt;
 	struct timer_list	timer;
 
+	u8			type;
 	u32			priority;
 	u32			index;
 	struct xfrm_selector	selector;
@@ -390,6 +391,19 @@ extern int xfrm_unregister_km(struct xfr
 
 
 extern struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2];
+#ifdef CONFIG_XFRM_SUB_POLICY
+extern struct xfrm_policy *xfrm_policy_list_sub[XFRM_POLICY_MAX*2];
+
+static inline int xfrm_policy_lists_empty(int dir)
+{
+	return (!xfrm_policy_list[dir] && !xfrm_policy_list_sub[dir]);
+}
+#else
+static inline int xfrm_policy_lists_empty(int dir)
+{
+	return (!xfrm_policy_list[dir]);
+}
+#endif
 
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 {
@@ -405,6 +419,20 @@ static inline void xfrm_pol_put(struct x
__xfrm_policy_destroy(policy);
 }
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols)
+{
+	int i;
+	for (i = npols - 1; i >= 0; --i)
+		xfrm_pol_put(pols[i]);
+}
+#else
+static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols)
+{
+	xfrm_pol_put(pols[0]);
+}
+#endif
+
 #define XFRM_DST_HSIZE 1024
 
 static __inline__
@@ -738,8 +766,8 @@ static inline int xfrm_policy_check(stru
 {
 	if (sk && sk->sk_policy[XFRM_POLICY_IN])
 		return __xfrm_policy_check(sk, dir, skb, family);
-	
-	return	(!xfrm_policy_list[dir] && !skb->sp) ||
+
+	return	(xfrm_policy_lists_empty(dir) && !skb->sp) ||
 		(skb->dst->flags & DST_NOPOLICY) ||
 		__xfrm_policy_check(sk, dir, skb, family);
 }
@@ -759,7 +787,7 @@ extern int __xfrm_route_forward(struct s
 
 static inline int xfrm_route_forward(struct sk_buff *skb, unsigned short family)
 {
-	return	!xfrm_policy_list[XFRM_POLICY_OUT] ||
+	return	xfrm_policy_lists_empty(XFRM_POLICY_OUT) ||
 		(skb->dst->flags & DST_NOXFRM) ||
 		__xfrm_route_forward(skb, family);
 }
@@ -1023,18 +1051,19 @@ static inline int xfrm_dst_lookup(struct
 #endif
 
 struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp);
-extern int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), void *);
+extern int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*), void *);
 int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl);
-struct xfrm_policy *xfrm_policy_bysel_ctx(int dir, struct xfrm_selector *sel,
+struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int dir,
+ struct xfrm_selector *sel,
  struct xfrm_sec_ctx *ctx, int delete);
-struct xfrm_policy *xfrm_policy_byid(int dir, u32 id, int delete);
-void xfrm_policy_flush(void);
+struct xfrm_policy *xfrm_policy_byid(u8, int dir, u32 id, int delete);
+void xfrm_policy_flush(u8 type);
 u32 xfrm_get_acqseq(void);
 void xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi);
 struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, 
  xfrm_address_t *daddr, xfrm_address_t *saddr, 
  int create, unsigned
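
The two helpers this patch adds to include/net/xfrm.h, xfrm_policy_lists_empty() and xfrm_pols_put(), can be illustrated outside the kernel. This is a simplified sketch under stated assumptions: struct policy, policy_list, and policy_insert() are hypothetical, a plain int stands in for the atomic reference count, and a two-dimensional array replaces the patch's two parallel arrays purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

enum { POLICY_TYPE_MAIN = 0, POLICY_TYPE_SUB = 1, POLICY_TYPE_MAX = 2 };
enum { POLICY_IN = 0, POLICY_OUT = 1, POLICY_FWD = 2, POLICY_DIR_MAX = 3 };

struct policy {
	int refcnt;		/* stand-in for the kernel's atomic_t */
	struct policy *next;
};

/* One list head per (type, direction). */
static struct policy *policy_list[POLICY_TYPE_MAX][POLICY_DIR_MAX];

static void policy_insert(int type, int dir, struct policy *p)
{
	assert(type >= 0 && type < POLICY_TYPE_MAX);
	assert(dir >= 0 && dir < POLICY_DIR_MAX);
	p->next = policy_list[type][dir];
	policy_list[type][dir] = p;
}

/* Fast path: nonzero only if no policy of any type exists for dir. */
static int policy_lists_empty(int dir)
{
	for (int type = 0; type < POLICY_TYPE_MAX; type++)
		if (policy_list[type][dir])
			return 0;
	return 1;
}

/* Drop one reference on each policy of a flow, last one first. */
static void pols_put(struct policy **pols, int npols)
{
	for (int i = npols - 1; i >= 0; --i)
		pols[i]->refcnt--;
}
```

The point of the emptiness helper is that callers such as xfrm_policy_check() may skip the policy lookup only when both the main and the sub list are empty for that direction.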

[PATCH 9/44] [XFRM]: Restrict authentication algorithm only when inbound transformation protocol is IPsec.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Mobile IPv6 uses the routing header or the destination options header, neither
of which needs the authentication algorithm comparison, so the comparison is
now done only for IPsec templates.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/xfrm/xfrm_policy.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index dd8e543..66cd501 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1004,7 +1004,8 @@ xfrm_state_ok(struct xfrm_tmpl *tmpl, st
 		(x->id.spi == tmpl->id.spi || !tmpl->id.spi) &&
 		(x->props.reqid == tmpl->reqid || !tmpl->reqid) &&
 		x->props.mode == tmpl->mode &&
-		(tmpl->aalgos & (1<<x->props.aalgo)) &&
+		((tmpl->aalgos & (1<<x->props.aalgo)) ||
+		 !(xfrm_id_proto_match(tmpl->id.proto, IPSEC_PROTO_ANY))) &&
 		!(x->props.mode != XFRM_MODE_TRANSPORT &&
 		  xfrm_state_addr_cmp(tmpl, x, family));
 }
-- 
1.4.0

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
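
The condition changed in PATCH 9/44 can be read as "the algorithm bitmask only matters for IPsec protocols". Here is a small stand-alone sketch of that predicate. The helper is_ipsec_proto() is a hypothetical stand-in for xfrm_id_proto_match(proto, IPSEC_PROTO_ANY), assumed here to be true for AH, ESP and IPComp; the protocol numbers are the IANA-assigned values.

```c
#include <assert.h>
#include <stdint.h>

#define PROTO_ROUTING	43	/* IPv6 routing header */
#define PROTO_ESP	50
#define PROTO_AH	51
#define PROTO_DSTOPTS	60	/* IPv6 destination options header */
#define PROTO_COMP	108	/* IPComp */

/* Hypothetical stand-in for xfrm_id_proto_match(proto, IPSEC_PROTO_ANY). */
static int is_ipsec_proto(uint8_t proto)
{
	return proto == PROTO_AH || proto == PROTO_ESP || proto == PROTO_COMP;
}

/*
 * The state's authentication algorithm must be allowed by the template's
 * bitmask, unless the template is not an IPsec template at all.
 */
static int aalgo_ok(uint32_t tmpl_aalgos, uint8_t tmpl_proto,
		    uint8_t state_aalgo)
{
	assert(state_aalgo < 32);
	return ((tmpl_aalgos >> state_aalgo) & 1u) != 0 ||
	       !is_ipsec_proto(tmpl_proto);
}
```

For example, a destination options template with an empty algorithm mask still matches, while an ESP template with the same empty mask does not.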

[PATCH 0/44] Mobile IPv6 Platform, Take 2 (for net-2.6.19)

2006-08-23 Thread YOSHIFUJI Hideaki

From: Hideaki YOSHIFUJI [EMAIL PROTECTED]

This is an update of the patch series that introduces Mobile IPv6 (MIPv6)
correspondent node (CN) support, based on the MIPL2 kernel patch.

Major differences from the last reviewed patches are:

 o Updates from netdev comments
   - Remove CONFIG_XFRM_ADVANCED
   - Make more inline functions for readability
   - Reorder the patches by feature rather than by file
   - Use a new netlink attribute type instead of changing the
     public structure exposed to user space

 o New patch
   - [PATCH 16/44] [XFRM] IPV6: Restrict bundle reusing

The transformation mode currently selects between IPsec transport and tunnel.
Mobile IPv6 requires two more modes: route optimization and
inbound trigger.

It is also available at:

git://git.skbuff.net/gitroot/yoshfuji/net-2.6.19-20060821-mip6cn-20060823

P.S.

There is a separate discussion about improving the XFRM state hash.
Our patches (the source address list part) will conflict with that
work, but we have not taken it into account so far because it was not
in the tree.


[PATCH 1/44] [XFRM]: Add XFRM_MODE_xxx for future use.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

The transformation mode currently selects either IPsec transport or tunnel.
Mobile IPv6 requires two more modes: route optimization and inbound
trigger.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/xfrm.h|6 --
 include/net/xfrm.h  |2 +-
 net/ipv4/ah4.c  |2 +-
 net/ipv4/esp4.c |6 +++---
 net/ipv4/ipcomp.c   |8 
 net/ipv4/xfrm4_input.c  |2 +-
 net/ipv4/xfrm4_output.c |4 ++--
 net/ipv4/xfrm4_policy.c |2 +-
 net/ipv4/xfrm4_state.c  |2 +-
 net/ipv4/xfrm4_tunnel.c |2 +-
 net/ipv6/ah6.c  |2 +-
 net/ipv6/esp6.c |4 ++--
 net/ipv6/ipcomp6.c  |6 +++---
 net/ipv6/xfrm6_input.c  |2 +-
 net/ipv6/xfrm6_output.c |4 ++--
 net/ipv6/xfrm6_policy.c |2 +-
 net/ipv6/xfrm6_state.c  |2 +-
 net/ipv6/xfrm6_tunnel.c |2 +-
 net/key/af_key.c|6 +++---
 net/xfrm/xfrm_policy.c  |   11 ++-
 net/xfrm/xfrm_user.c|4 ++--
 21 files changed, 42 insertions(+), 39 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 46a15c7..5154064 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -120,7 +120,9 @@ enum
 
 #define XFRM_MODE_TRANSPORT 0
 #define XFRM_MODE_TUNNEL 1
-#define XFRM_MODE_MAX 2
+#define XFRM_MODE_ROUTEOPTIMIZATION 2
+#define XFRM_MODE_IN_TRIGGER 3
+#define XFRM_MODE_MAX 4
 
 /* Netlink configuration messages.  */
 enum {
@@ -247,7 +249,7 @@ struct xfrm_usersa_info {
 	__u32				seq;
 	__u32				reqid;
 	__u16				family;
-	__u8				mode; /* 0=transport,1=tunnel */
+	__u8				mode;		/* XFRM_MODE_xxx */
__u8replay_window;
__u8flags;
 #define XFRM_STATE_NOECN   1
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index f724d3f..f47965b 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -299,7 +299,7 @@ struct xfrm_tmpl
 
__u32   reqid;
 
-/* Mode: transport/tunnel */
+/* Mode: transport, tunnel etc. */
__u8mode;
 
 /* Sharing mode: unique, this session only, this user only etc. */
diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index 506d7a2..a82e2f0 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -253,7 +253,7 @@ static int ah_init_state(struct xfrm_sta
goto error;

 	x->props.header_len = XFRM_ALIGN8(sizeof(struct ip_auth_hdr) + ahp->icv_trunc_len);
-	if (x->props.mode)
+	if (x->props.mode == XFRM_MODE_TUNNEL)
 		x->props.header_len += sizeof(struct iphdr);
 	x->data = ahp;
 
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index fc2f8ce..f5057dc 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -237,7 +237,7 @@ static int esp_input(struct xfrm_state *
 *as per draft-ietf-ipsec-udp-encaps-06,
 *section 3.1.2
 		 */
-		if (!x->props.mode)
+		if (x->props.mode == XFRM_MODE_TRANSPORT)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 	}
 
@@ -256,7 +256,7 @@ static u32 esp4_get_max_size(struct xfrm
 	struct esp_data *esp = x->data;
 	u32 blksize = ALIGN(crypto_tfm_alg_blocksize(esp->conf.tfm), 4);
 
-	if (x->props.mode) {
+	if (x->props.mode == XFRM_MODE_TUNNEL) {
 		mtu = ALIGN(mtu + 2, blksize);
} else {
/* The worst case. */
@@ -368,7 +368,7 @@ static int esp_init_state(struct xfrm_st
 	if (crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len))
 		goto error;
 	x->props.header_len = sizeof(struct ip_esp_hdr) + esp->conf.ivlen;
-	if (x->props.mode)
+	if (x->props.mode == XFRM_MODE_TUNNEL)
 		x->props.header_len += sizeof(struct iphdr);
 	if (x->encap) {
 		struct xfrm_encap_tmpl *encap = x->encap;
diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c
index a0c28b2..b163ebc 100644
--- a/net/ipv4/ipcomp.c
+++ b/net/ipv4/ipcomp.c
@@ -176,7 +176,7 @@ static int ipcomp_output(struct xfrm_sta
return 0;
 
 out_ok:
-	if (x->props.mode)
+	if (x->props.mode == XFRM_MODE_TUNNEL)
 		ip_send_check(iph);
 	return 0;
 }
@@ -216,7 +216,7 @@ static struct xfrm_state *ipcomp_tunnel_
 	t->id.daddr.a4 = x->id.daddr.a4;
 	memcpy(&t->sel, &x->sel, sizeof(t->sel));
 	t->props.family = AF_INET;
-	t->props.mode = 1;
+	t->props.mode = XFRM_MODE_TUNNEL;
 	t->props.saddr.a4 = x->props.saddr.a4;
 	t->props.flags = x->props.flags;
 
@@ -415,7 +415,7 @@ static int ipcomp_init_state(struct xfrm
goto out;
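
Most of PATCH 1/44 replaces truthiness tests on the mode field with explicit comparisons. The sketch below shows why that matters once modes 2 and 3 exist. The XFRM_MODE_* values are taken from the patch; the two helper functions and OUTER_IPV4_HDR_LEN are hypothetical and only model the "add an outer IPv4 header in tunnel mode" computation seen in ah4.c and esp4.c.

```c
#include <assert.h>

#define XFRM_MODE_TRANSPORT		0
#define XFRM_MODE_TUNNEL		1
#define XFRM_MODE_ROUTEOPTIMIZATION	2
#define XFRM_MODE_IN_TRIGGER		3

#define OUTER_IPV4_HDR_LEN 20u	/* IPv4 header without options */

/* Old style: any nonzero mode is treated as tunnel mode. */
static unsigned int header_len_truthy(unsigned int base, int mode)
{
	if (mode)
		base += OUTER_IPV4_HDR_LEN;
	return base;
}

/* New style: only tunnel mode adds the outer header. */
static unsigned int header_len_explicit(unsigned int base, int mode)
{
	assert(mode >= XFRM_MODE_TRANSPORT && mode <= XFRM_MODE_IN_TRIGGER);
	if (mode == XFRM_MODE_TUNNEL)
		base += OUTER_IPV4_HDR_LEN;
	return base;
}
```

Both versions agree for transport and tunnel mode; they diverge for the new modes, where the old test would wrongly reserve room for an outer header.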

[PATCH 8/44] [XFRM] STATE: Introduce route optimization mode.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Route optimization mode is used with the routing header and the destination
options header for Mobile IPv6.
On output it makes room for the header, as IPsec transport mode does. On input
it does nothing, because the functions in exthdrs.c are responsible for
updating the skbuff information for these headers.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/Kconfig |7 +++
 net/ipv6/Makefile|1 
 net/ipv6/xfrm6_mode_ro.c |   94 ++
 3 files changed, 102 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 540e800..5067665 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -126,6 +126,13 @@ config INET6_XFRM_MODE_TUNNEL
 
  If unsure, say Y.
 
+config INET6_XFRM_MODE_ROUTEOPTIMIZATION
+	tristate "IPv6: MIPv6 route optimization mode (EXPERIMENTAL)"
+	depends on IPV6 && EXPERIMENTAL
+	select XFRM
+	---help---
+	  Support for MIPv6 route optimization mode.
+
 config IPV6_TUNNEL
 	tristate "IPv6: IPv6-in-IPv6 tunnel"
 	select INET6_TUNNEL
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 9eebf60..87e912e 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_INET6_XFRM_TUNNEL) += xfrm6
 obj-$(CONFIG_INET6_TUNNEL) += tunnel6.o
 obj-$(CONFIG_INET6_XFRM_MODE_TRANSPORT) += xfrm6_mode_transport.o
 obj-$(CONFIG_INET6_XFRM_MODE_TUNNEL) += xfrm6_mode_tunnel.o
+obj-$(CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION) += xfrm6_mode_ro.o
 obj-$(CONFIG_NETFILTER)+= netfilter/
 
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
diff --git a/net/ipv6/xfrm6_mode_ro.c b/net/ipv6/xfrm6_mode_ro.c
new file mode 100644
index 000..c11c335
--- /dev/null
+++ b/net/ipv6/xfrm6_mode_ro.c
@@ -0,0 +1,94 @@
+/*
+ * xfrm6_mode_ro.c - Route optimization mode for IPv6.
+ *
+ * Copyright (C)2003-2006 Helsinki University of Technology
+ * Copyright (C)2003-2006 USAGI/WIDE Project
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+/*
+ * Authors:
+ * Noriaki TAKAMIYA @USAGI
+ * Masahide NAKAMURA @USAGI
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/stringify.h>
+#include <net/ipv6.h>
+#include <net/xfrm.h>
+
+/* Add route optimization header space.
+ *
+ * The IP header and mutable extension headers will be moved forward to make
+ * space for the route optimization header.
+ *
+ * On exit, skb->h will be set to the start of the encapsulation header to be
+ * filled in by x->type->output and skb->nh will be set to the nextheader field
+ * of the extension header directly preceding the encapsulation header, or in
+ * its absence, that of the top IP header.  The value of skb->data will always
+ * point to the top IP header.
+ */
+static int xfrm6_ro_output(struct sk_buff *skb)
+{
+	struct xfrm_state *x = skb->dst->xfrm;
+	struct ipv6hdr *iph;
+	u8 *prevhdr;
+	int hdr_len;
+
+	skb_push(skb, x->props.header_len);
+	iph = skb->nh.ipv6h;
+
+	hdr_len = x->type->hdr_offset(x, skb, &prevhdr);
+	skb->nh.raw = prevhdr - x->props.header_len;
+	skb->h.raw = skb->data + hdr_len;
+	memmove(skb->data, iph, hdr_len);
+	return 0;
+}
+
+/*
+ * Do nothing about routing optimization header unlike IPsec.
+ */
+static int xfrm6_ro_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+   return 0;
+}
+
+static struct xfrm_mode xfrm6_ro_mode = {
+   .input = xfrm6_ro_input,
+   .output = xfrm6_ro_output,
+   .owner = THIS_MODULE,
+   .encap = XFRM_MODE_ROUTEOPTIMIZATION,
+};
+
+static int __init xfrm6_ro_init(void)
+{
+	return xfrm_register_mode(&xfrm6_ro_mode, AF_INET6);
+}
+
+static void __exit xfrm6_ro_exit(void)
+{
+	int err;
+
+	err = xfrm_unregister_mode(&xfrm6_ro_mode, AF_INET6);
+	BUG_ON(err);
+}
+
+module_init(xfrm6_ro_init);
+module_exit(xfrm6_ro_exit);
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_XFRM_MODE(AF_INET6, XFRM_MODE_ROUTEOPTIMIZATION);
-- 
1.4.0
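
The output path of the route optimization mode extends the packet at the front and then slides the IPv6 header (plus any extension headers that must stay ahead of the new one) into the fresh space, leaving a gap for the new header. Here is a user-space sketch of just that buffer manipulation on a flat byte array. The function open_header_gap() and its parameters are hypothetical; in the kernel the same roles are played by skb_push(), the hdr_offset() callback, and memmove().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * buf holds `room` unused bytes followed by the packet, which starts at
 * buf + room. Slide the first hdr_len bytes of the packet to the very
 * front and return a pointer to the gap of `room` bytes that opens up
 * behind them, where the new extension header is to be written.
 */
static unsigned char *open_header_gap(unsigned char *buf, size_t room,
				      size_t hdr_len)
{
	assert(buf != NULL);
	/* memmove, not memcpy: source and destination may overlap. */
	memmove(buf, buf + room, hdr_len);
	return buf + hdr_len;
}
```

The bytes after the gap (the rest of the packet) are not touched, so only the leading headers are copied regardless of payload size.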


[PATCH 34/44] [IPV6] MIP6: Transformation support mobility header.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add mobility header support to the transformation code: the mobility header
type is decoded into the flow and used in place of a port.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/xfrm.h  |5 +
 net/ipv6/xfrm6_policy.c |   15 +++
 2 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 2078d84..bf6daaa 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -547,6 +547,11 @@ u16 xfrm_flowi_sport(struct flowi *fl)
 	case IPPROTO_ICMPV6:
 		port = htons(fl->fl_icmp_type);
 		break;
+#ifdef CONFIG_IPV6_MIP6
+	case IPPROTO_MH:
+		port = htons(fl->fl_mh_type);
+		break;
+#endif
 	default:
 		port = 0;	/*XXX*/
 	}
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 729b474..98c2fe4 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -18,6 +18,9 @@ #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
+#ifdef CONFIG_IPV6_MIP6
+#include <net/mip6.h>
+#endif
 
 static struct dst_ops xfrm6_dst_ops;
 static struct xfrm_policy_afinfo xfrm6_policy_afinfo;
@@ -270,6 +273,18 @@ _decode_session6(struct sk_buff *skb, st
 			fl->proto = nexthdr;
 			return;
 
+#ifdef CONFIG_IPV6_MIP6
+		case IPPROTO_MH:
+			if (pskb_may_pull(skb, skb->nh.raw + offset + 3 - skb->data)) {
+				struct ip6_mh *mh;
+				mh = (struct ip6_mh *)exthdr;
+
+				fl->fl_mh_type = mh->ip6mh_type;
+			}
+			fl->proto = nexthdr;
+			return;
+#endif
+
/* XXX Why are there these headers? */
case IPPROTO_AH:
case IPPROTO_ESP:
-- 
1.4.0
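
PATCH 34/44 lets selectors match on the mobility header type by treating it the way the ICMPv6 type is already treated: as a 16-bit "source port" in network byte order. The following is a minimal user-space sketch of that mapping. struct flow and flow_sport() are hypothetical simplifications of struct flowi and xfrm_flowi_sport(); the protocol numbers are the IANA-assigned values.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_TCP	6
#define PROTO_UDP	17
#define PROTO_ICMPV6	58
#define PROTO_MH	135	/* Mobility Header */

struct flow {
	uint8_t  proto;
	uint16_t sport;		/* network byte order, TCP/UDP only */
	uint8_t  icmp_type;	/* ICMPv6 only */
	uint8_t  mh_type;	/* Mobility Header only */
};

/* Value a selector compares against its source port field. */
static uint16_t flow_sport(const struct flow *fl)
{
	assert(fl != NULL);
	switch (fl->proto) {
	case PROTO_TCP:
	case PROTO_UDP:
		return fl->sport;
	case PROTO_ICMPV6:
		return htons(fl->icmp_type);
	case PROTO_MH:
		return htons(fl->mh_type);
	default:
		return 0;
	}
}
```

A policy selector can then single out, say, Binding Update messages (mobility header type 5) with the same port-matching logic it already uses for TCP and UDP.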


[PATCH 41/44] [XFRM]: Add sorting interface for state and template.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

When two transformation policies apply, they have to be merged.
This patch adds the infrastructure to sort states for outbound and
templates for inbound.
It will be used when Mobile IPv6 and IPsec are in use at the same time.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/xfrm.h |   20 
 net/xfrm/xfrm_policy.c |   16 ++--
 net/xfrm/xfrm_state.c  |   38 ++
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 5bd6beb..aab31a2 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -255,6 +255,8 @@ struct xfrm_state_afinfo {
struct xfrm_state   *(*find_acq)(u8 mode, u32 reqid, u8 proto, 
 xfrm_address_t *daddr, 
xfrm_address_t *saddr, 
 int create);
+   int (*tmpl_sort)(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n);
+   int (*state_sort)(struct xfrm_state **dst, struct xfrm_state **src, int n);
 };
 
 extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo);
@@ -1002,6 +1004,24 @@ extern int xfrm_state_add(struct xfrm_st
 extern int xfrm_state_update(struct xfrm_state *x);
 extern struct xfrm_state *xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family);
 extern struct xfrm_state *xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family);
+#ifdef CONFIG_XFRM_SUB_POLICY
+extern int xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src,
+ int n, unsigned short family);
+extern int xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src,
+  int n, unsigned short family);
+#else
+static inline int xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src,
+int n, unsigned short family)
+{
+   return -ENOSYS;
+}
+
+static inline int xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src,
+ int n, unsigned short family)
+{
+   return -ENOSYS;
+}
+#endif
 extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
 extern void xfrm_state_flush(u8 proto);
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 65e84b1..a1be1b5 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -861,6 +861,8 @@ xfrm_tmpl_resolve(struct xfrm_policy **p
  struct xfrm_state **xfrm,
  unsigned short family)
 {
+   struct xfrm_state *tp[XFRM_MAX_DEPTH];
+   struct xfrm_state **tpp = (npols > 1) ? tp : xfrm;
int cnx = 0;
int error;
int ret;
@@ -871,7 +873,8 @@ xfrm_tmpl_resolve(struct xfrm_policy **p
error = -ENOBUFS;
goto fail;
}
-   ret = xfrm_tmpl_resolve_one(pols[i], fl, &xfrm[cnx], family);
+
+   ret = xfrm_tmpl_resolve_one(pols[i], fl, &tpp[cnx], family);
if (ret < 0) {
error = ret;
goto fail;
@@ -879,11 +882,15 @@ xfrm_tmpl_resolve(struct xfrm_policy **p
cnx += ret;
}
 
+   /* found states are sorted for outbound processing */
+   if (npols > 1)
+   xfrm_state_sort(xfrm, tpp, cnx, family);
+
return cnx;
 
  fail:
for (cnx--; cnx>=0; cnx--)
-   xfrm_state_put(xfrm[cnx]);
+   xfrm_state_put(tpp[cnx]);
return error;
 
 }
@@ -1280,6 +1287,7 @@ #endif
struct sec_path *sp;
static struct sec_path dummy;
struct xfrm_tmpl *tp[XFRM_MAX_DEPTH];
+   struct xfrm_tmpl *stp[XFRM_MAX_DEPTH];
struct xfrm_tmpl **tpp = tp;
int ti = 0;
int i, k;
@@ -1297,6 +1305,10 @@ #endif
tpp[ti++] = &pols[pi]->xfrm_vec[i];
}
xfrm_nr = ti;
+   if (npols > 1) {
+   xfrm_tmpl_sort(stp, tpp, xfrm_nr, family);
+   tpp = stp;
+   }
 
/* For each tunnel xfrm, find the first matching tmpl.
 * For each tmpl before that, find corresponding xfrm.
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index a26ef69..622e92a 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -728,6 +728,44 @@ xfrm_find_acq(u8 mode, u32 reqid, u8 pro
 }
 EXPORT_SYMBOL(xfrm_find_acq);
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+int
+xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n,
+  unsigned short family)
+{
+   int err = 0;
+   struct xfrm_state_afinfo *afinfo =

[PATCH 38/44] [IPV6] MIP6: Ignore to report if mobility headers is rejected.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Do not report to user-space when a known mobility header is rejected by
the destination options header transformation.
The Mobile IPv6 specification (RFC 3775) says that a mobility header
may be used with a destination options header carrying a home address option
only for a binding update message. Other message types cannot be used this way,
and a node must silently drop such a packet (and must not send a binding error)
when it receives one.
To achieve this, (1) the application should use a transformation policy and
wild-card states to catch binding update messages ahead of other packets, and
(2) the kernel must not report the reject to user-space, so that the
application does not send a binding error message.
This patch implements (2).
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/mip6.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 31445d0..7085403 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -234,6 +234,9 @@ static int mip6_destopt_reject(struct xf
struct timeval stamp;
int err = 0;
 
+   if (unlikely(fl->proto == IPPROTO_MH && fl->fl_mh_type <= IP6_MH_TYPE_MAX))
+   goto out;
+
if (likely(opt->dsthao)) {
offset = ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO);
if (likely(offset >= 0))
-- 
1.4.0


[PATCH 33/44] [IPV6] MIP6: Add sending mobility header functions through raw socket.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

The mobility header is built by user-space and sent through a raw socket.
The kernel only extracts its type into the flow.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/raw.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 2178a2a..87b0adc 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -609,6 +609,9 @@ static void rawv6_probe_proto_opt(struct
struct iovec *iov;
u8 __user *type = NULL;
u8 __user *code = NULL;
+#ifdef CONFIG_IPV6_MIP6
+   u8 len = 0;
+#endif
int probed = 0;
int i;
 
@@ -640,6 +643,20 @@ static void rawv6_probe_proto_opt(struct
probed = 1;
}
break;
+#ifdef CONFIG_IPV6_MIP6
+   case IPPROTO_MH:
+   if (iov->iov_base && iov->iov_len < 1)
+   break;
+   /* check if type field is readable or not. */
+   if (iov->iov_len > 2 - len) {
+   u8 __user *p = iov->iov_base;
+   get_user(fl->fl_mh_type, &p[2 - len]);
+   probed = 1;
+   } else
+   len += iov-iov_len;
+
+   break;
+#endif
default:
probed = 1;
break;
-- 
1.4.0
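
The probe above has to cope with the MH type byte (offset 2 in the mobility header) landing in any segment of the caller's iovec array. A minimal user-space sketch of that scan follows, assuming plain memory instead of __user pointers; the function name and MH_TYPE_OFFSET are illustrative, not kernel symbols, and empty segments are simply skipped here, which is a simplification of the kernel loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec */

/* Offset of the MH Type field: payload proto (1), header len (1), then type. */
#define MH_TYPE_OFFSET 2

/*
 * Look for the byte at MH_TYPE_OFFSET in a scattered buffer.
 * Returns 1 and stores the byte in *type if present, 0 otherwise.
 */
int mh_probe_type(const struct iovec *iov, int iovcnt, uint8_t *type)
{
    size_t skipped = 0;   /* bytes covered by earlier segments, never > 2 */
    int i;

    for (i = 0; i < iovcnt; i++) {
        if (iov[i].iov_base == NULL || iov[i].iov_len == 0)
            continue;
        if (iov[i].iov_len > MH_TYPE_OFFSET - skipped) {
            const uint8_t *p = iov[i].iov_base;
            *type = p[MH_TYPE_OFFSET - skipped];
            return 1;
        }
        skipped += iov[i].iov_len;
    }
    return 0;
}
```

The running count plays the role of `len` in the patch: once two bytes have been skipped, the next non-empty segment starts exactly at the type field.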


[PATCH 44/44] [XFRM] IPV6: Support Mobile IPv6 extension headers sorting.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Support Mobile IPv6 extension headers sorting for two transformation policies.
On the outbound path, Mobile IPv6 extension headers should be placed after
IPsec transport mode headers, but before transport mode AH.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/xfrm6_state.c |   28 ++--
 1 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index e0b8f3c..6269584 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -173,7 +173,19 @@ __xfrm6_state_sort(struct xfrm_state **d
if (j == n)
goto end;
 
-   /* XXX: Rule 2: select MIPv6 RO or inbound trigger */
+   /* Rule 2: select MIPv6 RO or inbound trigger */
+#ifdef CONFIG_IPV6_MIP6
+   for (i = 0; i < n; i++) {
+   if (src[i] &&
+   (src[i]->props.mode == XFRM_MODE_ROUTEOPTIMIZATION ||
+src[i]->props.mode == XFRM_MODE_IN_TRIGGER)) {
+   dst[j++] = src[i];
+   src[i] = NULL;
+   }
+   }
+   if (j == n)
+   goto end;
+#endif
 
/* Rule 3: select IPsec transport AH */
for (i = 0; i < n; i++) {
@@ -226,7 +238,19 @@ __xfrm6_tmpl_sort(struct xfrm_tmpl **dst
if (j == n)
goto end;
 
-   /* XXX: Rule 2: select MIPv6 RO or inbound trigger */
+   /* Rule 2: select MIPv6 RO or inbound trigger */
+#ifdef CONFIG_IPV6_MIP6
+   for (i = 0; i < n; i++) {
+   if (src[i] &&
+   (src[i]->mode == XFRM_MODE_ROUTEOPTIMIZATION ||
+src[i]->mode == XFRM_MODE_IN_TRIGGER)) {
+   dst[j++] = src[i];
+   src[i] = NULL;
+   }
+   }
+   if (j == n)
+   goto end;
+#endif
 
/* Rule 3: select IPsec tunnel */
for (i = 0; i < n; i++) {
-- 
1.4.0
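
The sort in these hunks is a sequence of passes, one per rule, each moving matching entries from src[] to dst[] and clearing them in src[]. A self-contained sketch of that pattern is below. The kind tags, the struct and SKETCH_MAX_DEPTH are hypothetical; the diff only shows rules 2 and 3 (with a different rule 3 for templates), so the first and last rule here are assumptions made to complete the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEPTH 6   /* hypothetical bound on the array size */

/* Hypothetical tags, listed in the order the passes pick them. */
enum sketch_kind {
    SK_TRANSPORT,      /* assumed rule 1 */
    SK_MIP6,           /* rule 2: route optimization / inbound trigger */
    SK_TRANSPORT_AH,   /* rule 3 for states */
    SK_TUNNEL          /* assumed last rule */
};

struct sketch_state {
    enum sketch_kind kind;
};

/* Move src[0..n) into dst[] grouped by rule; consumed src slots become NULL. */
int sketch_state_sort(struct sketch_state **dst, struct sketch_state **src, int n)
{
    static const enum sketch_kind order[] = {
        SK_TRANSPORT, SK_MIP6, SK_TRANSPORT_AH, SK_TUNNEL
    };
    size_t r;
    int i, j = 0;

    assert(n >= 0 && n <= SKETCH_MAX_DEPTH);
    for (r = 0; r < sizeof(order) / sizeof(order[0]) && j < n; r++) {
        for (i = 0; i < n; i++) {
            if (src[i] && src[i]->kind == order[r]) {
                dst[j++] = src[i];
                src[i] = NULL;
            }
        }
    }
    return j;   /* number of entries placed */
}
```

Each pass keeps the relative order of the entries it picks, so the result is a stable grouping rather than a general-purpose sort.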


[PATCH 25/44] [IPV6] MIP6: Add inbound interface of home address option.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add the inbound handler for the home address option by registering it in the
TLV table for the destination options header.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/ipv6.h |3 ++
 net/ipv6/exthdrs.c   |   84 +-
 2 files changed, 86 insertions(+), 1 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index c1601ef..4613a38 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -226,6 +226,9 @@ struct inet6_skb_parm {
__u16   dst0;
__u16   srcrt;
__u16   dst1;
+#ifdef CONFIG_IPV6_MIP6
+   __u16   dsthao;
+#endif
__u16   lastopt;
__u32   nhoff;
__u16   flags;
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index fc972c1..077f626 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -196,8 +196,80 @@ bad:
   Destination options header.
  */
 
+#ifdef CONFIG_IPV6_MIP6
+static int ipv6_dest_hao(struct sk_buff **skbp, int optoff)
+{
+   struct sk_buff *skb = *skbp;
+   struct ipv6_destopt_hao *hao;
+   struct inet6_skb_parm *opt = IP6CB(skb);
+   struct ipv6hdr *ipv6h = (struct ipv6hdr *)skb->nh.raw;
+   struct in6_addr tmp_addr;
+   int ret;
+
+   if (opt->dsthao) {
+   LIMIT_NETDEBUG(KERN_DEBUG "hao duplicated\n");
+   goto discard;
+   }
+   opt->dsthao = opt->dst1;
+   opt->dst1 = 0;
+
+   hao = (struct ipv6_destopt_hao *)(skb->nh.raw + optoff);
+
+   if (hao->length != 16) {
+   LIMIT_NETDEBUG(
+   KERN_DEBUG "hao invalid option length = %d\n", hao->length);
+   goto discard;
+   }
+
+   if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) {
+   LIMIT_NETDEBUG(
+   KERN_DEBUG "hao is not an unicast addr: " NIP6_FMT "\n", NIP6(hao->addr));
+   goto discard;
+   }
+
+   ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr,
+  (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS);
+   if (unlikely(ret < 0))
+   goto discard;
+
+   if (skb_cloned(skb)) {
+   struct sk_buff *skb2 = skb_copy(skb, GFP_ATOMIC);
+   if (skb2 == NULL)
+   goto discard;
+
+   kfree_skb(skb);
+
+   /* update all variable using below by copied skbuff */
+   *skbp = skb = skb2;
+   hao = (struct ipv6_destopt_hao *)(skb2->nh.raw + optoff);
+   ipv6h = (struct ipv6hdr *)skb2->nh.raw;
+   }
+
+   if (skb->ip_summed == CHECKSUM_COMPLETE)
+   skb->ip_summed = CHECKSUM_NONE;
+
+   ipv6_addr_copy(&tmp_addr, &ipv6h->saddr);
+   ipv6_addr_copy(&ipv6h->saddr, &hao->addr);
+   ipv6_addr_copy(&hao->addr, &tmp_addr);
+
+   if (skb->tstamp.off_sec == 0)
+   __net_timestamp(skb);
+
+   return 1;
+
+ discard:
+   kfree_skb(skb);
+   return 0;
+}
+#endif
+
 static struct tlvtype_proc tlvprocdestopt_lst[] = {
-   /* No destination options are defined now */
+#ifdef CONFIG_IPV6_MIP6
+   {
+   .type   = IPV6_TLV_HAO,
+   .func   = ipv6_dest_hao,
+   },
+#endif
{-1,NULL}
 };
 
@@ -205,6 +277,9 @@ static int ipv6_destopt_rcv(struct sk_bu
 {
struct sk_buff *skb = *skbp;
struct inet6_skb_parm *opt = IP6CB(skb);
+#ifdef CONFIG_IPV6_MIP6
+   __u16 dstbuf;
+#endif
 
if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) ||
!pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) {
@@ -215,11 +290,18 @@ static int ipv6_destopt_rcv(struct sk_bu
 
opt->lastopt = skb->h.raw - skb->nh.raw;
opt->dst1 = skb->h.raw - skb->nh.raw;
+#ifdef CONFIG_IPV6_MIP6
+   dstbuf = opt->dst1;
+#endif
 
if (ip6_parse_tlv(tlvprocdestopt_lst, skbp)) {
skb = *skbp;
skb->h.raw += ((skb->h.raw[1]+1)<<3);
+#ifdef CONFIG_IPV6_MIP6
+   opt->nhoff = dstbuf;
+#else
opt->nhoff = opt->dst1;
+#endif
return 1;
}
 
-- 
1.4.0
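
The core of ipv6_dest_hao() is a length check followed by a swap of the packet's source address with the address in the option. Here is a stripped-down sketch of just that step, assuming plain structs in place of sk_buff handling and leaving out the unicast check, the xfrm lookup and the clone handling; all names are illustrative.

```c
#include <stdint.h>

/* Stand-in for struct in6_addr. */
struct addr6 {
    uint8_t b[16];
};

/* Home Address option as laid out in a destination options header. */
struct hao_sketch {
    uint8_t type;        /* 0xC9 (201) for the Home Address option */
    uint8_t length;      /* must be 16 */
    struct addr6 addr;   /* home address */
};

/*
 * Swap the source address with the home address carried in the option.
 * Returns 0 on success, -1 if the option length is wrong.
 */
int hao_swap(struct addr6 *saddr, struct hao_sketch *hao)
{
    struct addr6 tmp;

    if (hao->length != 16)
        return -1;

    tmp = *saddr;
    *saddr = hao->addr;
    hao->addr = tmp;
    return 0;
}
```

Because the operation is a plain exchange, calling it a second time restores the addresses as received on the wire, which is what a later step needs when an ICMPv6 error has to go back to the care-of address.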


[PATCH 36/44] [XFRM]: Introduce XFRM_MSG_REPORT.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

XFRM_MSG_REPORT is a message that notifies user-space of a state protocol and selector
from the kernel.
Mobile IPv6 will use it when an inbound reject occurs during route optimization,
to let user-space know that a binding error is required.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/xfrm.h  |   12 
 include/net/xfrm.h|2 ++
 net/xfrm/xfrm_state.c |   19 +++
 net/xfrm/xfrm_user.c  |   46 ++
 4 files changed, 79 insertions(+), 0 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 1d8c1f2..4009f44 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -166,6 +166,10 @@ #define XFRM_MSG_FLUSHPOLICY XFRM_MSG_FL
 #define XFRM_MSG_NEWAE XFRM_MSG_NEWAE
XFRM_MSG_GETAE,
 #define XFRM_MSG_GETAE XFRM_MSG_GETAE
+
+   XFRM_MSG_REPORT,
+#define XFRM_MSG_REPORT XFRM_MSG_REPORT
+
__XFRM_MSG_MAX
 };
 #define XFRM_MSG_MAX (__XFRM_MSG_MAX - 1)
@@ -325,12 +329,18 @@ struct xfrm_usersa_flush {
__u8 proto;
 };
 
+struct xfrm_user_report {
+   __u8 proto;
+   struct xfrm_selector sel;
+};
+
 #ifndef __KERNEL__
 /* backwards compatibility for userspace */
 #define XFRMGRP_ACQUIRE 1
 #define XFRMGRP_EXPIRE 2
 #define XFRMGRP_SA 4
 #define XFRMGRP_POLICY 8
+#define XFRMGRP_REPORT 0x10
 #endif
 
 enum xfrm_nlgroups {
@@ -346,6 +356,8 @@ #define XFRMNLGRP_SA XFRMNLGRP_SA
 #define XFRMNLGRP_POLICY   XFRMNLGRP_POLICY
XFRMNLGRP_AEVENTS,
 #define XFRMNLGRP_AEVENTS  XFRMNLGRP_AEVENTS
+   XFRMNLGRP_REPORT,
+#define XFRMNLGRP_REPORT   XFRMNLGRP_REPORT
__XFRMNLGRP_MAX
 };
 #define XFRMNLGRP_MAX  (__XFRMNLGRP_MAX - 1)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 276884f..00784d9 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -382,6 +382,7 @@ struct xfrm_mgr
struct xfrm_policy  *(*compile_policy)(struct sock *sk, int opt, u8 *data, int len, int *dir);
int (*new_mapping)(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport);
int (*notify_policy)(struct xfrm_policy *x, int dir, struct km_event *c);
+   int (*report)(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr);
 };
 
 extern int xfrm_register_km(struct xfrm_mgr *km);
@@ -1043,6 +1044,7 @@ extern void xfrm_init_pmtu(struct dst_en
 extern wait_queue_head_t km_waitq;
 extern int km_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport);
 extern void km_policy_expired(struct xfrm_policy *pol, int dir, int hard, u32 pid);
+extern int km_report(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr);
 
 extern void xfrm_input_init(void);
 extern int xfrm_parse_spi(struct sk_buff *skb, u8 nexthdr, u32 *spi, u32 *seq);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 3da89c0..a26ef69 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1055,6 +1055,25 @@ void km_policy_expired(struct xfrm_polic
 }
 EXPORT_SYMBOL(km_policy_expired);
 
+int km_report(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr)
+{
+   int err = -EINVAL;
+   int ret;
+   struct xfrm_mgr *km;
+
+   read_lock(&xfrm_km_lock);
+   list_for_each_entry(km, &xfrm_km_list, list) {
+   if (km->report) {
+   ret = km->report(proto, sel, addr);
+   if (!ret)
+   err = ret;
+   }
+   }
+   read_unlock(&xfrm_km_lock);
+   return err;
+}
+EXPORT_SYMBOL(km_report);
+
 int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen)
 {
int err;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index ae8dc6b..a4a4dd6 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1489,6 +1489,7 @@ static const int xfrm_msg_min[XFRM_NR_MS
[XFRM_MSG_FLUSHPOLICY - XFRM_MSG_BASE] = NLMSG_LENGTH(0),
[XFRM_MSG_NEWAE   - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id),
[XFRM_MSG_GETAE   - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id),
+   [XFRM_MSG_REPORT  - XFRM_MSG_BASE] = XMSGSIZE(xfrm_user_report),
 };
 
 #undef XMSGSIZE
@@ -2056,12 +2057,57 @@ static int xfrm_send_policy_notify(struc
 
 }
 
+static int build_report(struct sk_buff *skb, u8 proto,
+   struct xfrm_selector *sel, xfrm_address_t *addr)
+{
+   struct xfrm_user_report *ur;
+   struct nlmsghdr *nlh;
+   unsigned char *b = skb->tail;
+
+   nlh = NLMSG_PUT(skb, 0, 0, XFRM_MSG_REPORT, sizeof(*ur));
+   ur = NLMSG_DATA(nlh);
+   nlh->nlmsg_flags = 0;
+
+   ur->proto = proto;
+   memcpy(&ur->sel, sel, sizeof(ur->sel));
+
+   if

[PATCH Take 2] bcm43xx-softmac - set correct value in mac_suspended for ifdown/ifup sequence

2006-08-23 Thread Larry Finger


John,

This is a better solution to the issue. Please disregard the patch sent yesterday.

Larry

---

When bcm43xx-softmac is given an ifdown/ifup sequence, the value for bcm->mac_suspended ends up
wrong, which leads to a large number of assert(bcm->mac_suspended >= 0) messages. This one-line patch
fixes this problem.

I think the following is the correct fix for the issue.
It is already in the d80211 branch. (Seems like it got lost somehow).

Signed-off-by: Michael Buesch [EMAIL PROTECTED]
Signed-Off-By: Larry Finger [EMAIL PROTECTED]

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c   2006-08-23 10:00:27.0 +0200
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-08-23 10:01:45.0 +0200
@@ -3349,6 +3349,8 @@
memset(bcm->dma_reason, 0, sizeof(bcm->dma_reason));
bcm->irq_savedstate = BCM43xx_IRQ_INITIAL;

+   bcm->mac_suspended = 1;
+
/* Noise calculation context */
memset(&bcm->noisecalc, 0, sizeof(bcm->noisecalc));


--
Greetings Michael.
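
To see why the initial value matters, here is a small sketch of a nesting counter of the kind the assert message suggests. It is an assumption about how the driver uses mac_suspended (suspend increments, enable decrements and asserts the result is not negative); the struct and function names are invented for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant part of the driver state. */
struct mac_sketch {
    int mac_suspended;   /* number of outstanding suspend requests */
};

/* Per-ifup initialisation: the MAC counts as suspended once until enabled. */
void mac_sketch_init(struct mac_sketch *m)
{
    m->mac_suspended = 1;
}

void mac_sketch_suspend(struct mac_sketch *m)
{
    m->mac_suspended++;
}

/* Returns 1 when the last suspend request is released and the MAC may run. */
int mac_sketch_enable(struct mac_sketch *m)
{
    m->mac_suspended--;
    assert(m->mac_suspended >= 0);
    return m->mac_suspended == 0;
}
```

If the init step did not reset the counter, a second ifup would start from 0 left over from the previous run, and the first enable would drive it to -1 and trip the assertion.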


[PATCH 32/44] [IPV6] MIP6: Add receiving mobility header functions through raw socket.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Like ICMPv6, the mobility header is handled through a raw socket.
On the inbound path, the kernel only checks whether an ICMPv6 error should be
sent in reply.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]
This patch was also written by: Antti Tuominen [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/mip6.h |4 +++
 net/ipv6/mip6.c|   83 
 net/ipv6/raw.c |   29 ++
 3 files changed, 115 insertions(+), 1 deletions(-)

diff --git a/include/net/mip6.h b/include/net/mip6.h
index fd43178..68263c6 100644
--- a/include/net/mip6.h
+++ b/include/net/mip6.h
@@ -25,6 +25,9 @@
 #ifndef _NET_MIP6_H
 #define _NET_MIP6_H
 
+#include <linux/skbuff.h>
+#include <net/sock.h>
+
 #define MIP6_OPT_PAD_1 0
 #define MIP6_OPT_PAD_N 1
 
@@ -53,5 +56,6 @@ #define IP6_MH_TYPE_MAX   IP6_MH_TYPE_BER
 
 extern int mip6_init(void);
 extern void mip6_fini(void);
+extern int mip6_mh_filter(struct sock *sk, struct sk_buff *skb);
 
 #endif
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index a8adf89..7b5f893 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -26,7 +26,10 @@ #include <linux/config.h>
 #include <linux/module.h>
 #include <linux/skbuff.h>
 #include <linux/ipv6.h>
+#include <linux/icmpv6.h>
+#include <net/sock.h>
 #include <net/ipv6.h>
+#include <net/ip6_checksum.h>
 #include <net/xfrm.h>
 #include <net/mip6.h>
 
@@ -55,6 +58,86 @@ static inline void *mip6_padn(__u8 *data
return data + padlen;
 }
 
+static inline void mip6_param_prob(struct sk_buff *skb, int code, int pos)
+{
+   icmpv6_send(skb, ICMPV6_PARAMPROB, code, pos, skb->dev);
+}
+
+static int mip6_mh_len(int type)
+{
+   int len = 0;
+
+   switch (type) {
+   case IP6_MH_TYPE_BRR:
+   len = 0;
+   break;
+   case IP6_MH_TYPE_HOTI:
+   case IP6_MH_TYPE_COTI:
+   case IP6_MH_TYPE_BU:
+   case IP6_MH_TYPE_BACK:
+   len = 1;
+   break;
+   case IP6_MH_TYPE_HOT:
+   case IP6_MH_TYPE_COT:
+   case IP6_MH_TYPE_BERROR:
+   len = 2;
+   break;
+   }
+   return len;
+}
+
+int mip6_mh_filter(struct sock *sk, struct sk_buff *skb)
+{
+   struct ip6_mh *mh;
+   int mhlen;
+
+   if (!pskb_may_pull(skb, (skb->h.raw - skb->data) + 8) ||
+   !pskb_may_pull(skb, (skb->h.raw - skb->data) + ((skb->h.raw[1] + 1) << 3)))
+   return -1;
+
+   mh = (struct ip6_mh *)skb->h.raw;
+
+   if (mh->ip6mh_hdrlen < mip6_mh_len(mh->ip6mh_type)) {
+   LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH message too short: %d vs >=%d\n",
+  mh->ip6mh_hdrlen, mip6_mh_len(mh->ip6mh_type));
+   mip6_param_prob(skb, 0, (&mh->ip6mh_hdrlen) - skb->nh.raw);
+   return -1;
+   }
+   mhlen = (mh->ip6mh_hdrlen + 1) << 3;
+
+   if (skb->ip_summed == CHECKSUM_COMPLETE) {
+   skb->ip_summed = CHECKSUM_UNNECESSARY;
+   if (csum_ipv6_magic(&skb->nh.ipv6h->saddr,
+   &skb->nh.ipv6h->daddr,
+   mhlen, IPPROTO_MH,
+   skb->csum)) {
+   LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH hw checksum failed\n");
+   skb->ip_summed = CHECKSUM_NONE;
+   }
+   }
+   if (skb->ip_summed == CHECKSUM_NONE) {
+   if (csum_ipv6_magic(&skb->nh.ipv6h->saddr,
+   &skb->nh.ipv6h->daddr,
+   mhlen, IPPROTO_MH,
+   skb_checksum(skb, 0, mhlen, 0))) {
+   LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH checksum failed [%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x > %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x]\n",
+  NIP6(skb->nh.ipv6h->saddr),
+  NIP6(skb->nh.ipv6h->daddr));
+   return -1;
+   }
+   skb->ip_summed = CHECKSUM_UNNECESSARY;
+   }
+
+   if (mh->ip6mh_proto != IPPROTO_NONE) {
+   LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH invalid payload proto = %d\n",
+  mh->ip6mh_proto);
+   mip6_param_prob(skb, 0, (&mh->ip6mh_proto) - skb->nh.raw);
+   return -1;
+   }
+
+   return 0;
+}
+
 static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb)
 {
struct ipv6hdr *iph = skb->nh.ipv6h;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index bf55b5b..2178a2a 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -50,6 +50,9 @@ #include <net/transp_v6.h>
 #include <net/udp.h>
 #include <net/inet_common.h>
 #include <net/tcp_states.h>
+#ifdef CONFIG_IPV6_MIP6
+#include <net/mip6.h>
+#endif
 
 #include <net/rawv6.h>
 #include <net/xfrm.h>
@@ -169,8 +172,32 @@ int
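
The length handling in mip6_mh_filter() can be tried outside the kernel. The sketch below reproduces the per-type minimum from mip6_mh_len() and the (hdrlen + 1) << 3 conversion to octets; the MH_ enum values follow the RFC 3775 type numbering, while the function and enum names themselves are made up for this example.

```c
#include <stdint.h>

/* Mobility Header types as numbered in RFC 3775. */
enum {
    MH_BRR = 0, MH_HOTI = 1, MH_COTI = 2, MH_HOT = 3,
    MH_COT = 4, MH_BU = 5, MH_BACK = 6, MH_BERROR = 7
};

/* Minimum Header Len value per type, in 8-octet units beyond the first 8. */
int mh_min_hdrlen(int type)
{
    switch (type) {
    case MH_HOTI:
    case MH_COTI:
    case MH_BU:
    case MH_BACK:
        return 1;
    case MH_HOT:
    case MH_COT:
    case MH_BERROR:
        return 2;
    default:            /* MH_BRR and unknown types */
        return 0;
    }
}

/* Total header length in octets for a given Header Len field. */
int mh_total_len(uint8_t hdrlen)
{
    return (hdrlen + 1) << 3;
}

/* 1 if the advertised length is long enough for the message type. */
int mh_len_ok(int type, uint8_t hdrlen)
{
    return hdrlen >= mh_min_hdrlen(type);
}
```

For example, a binding update with Header Len 1 is 16 octets long and passes the check, while a Home Test message with the same Header Len is one unit short and would be answered with a parameter problem.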

[PATCH 26/44] [IPV6] MIP6: Revert address to send ICMPv6 error.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

On receipt, the IPv6 source address of a packet is replaced with the
home address option carried by the destination options header.
To send an ICMPv6 error back, the original address, the one received on the wire,
should be used. This function checks whether such a header is present
and swaps the addresses back.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/icmp.c |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 4a6e911..37364a7 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -273,6 +273,29 @@ static int icmpv6_getfrag(void *from, ch
return 0;
 }
 
+#ifdef CONFIG_IPV6_MIP6
+static void mip6_addr_swap(struct sk_buff *skb)
+{
+   struct ipv6hdr *iph = skb->nh.ipv6h;
+   struct inet6_skb_parm *opt = IP6CB(skb);
+   struct ipv6_destopt_hao *hao;
+   struct in6_addr tmp;
+   int off;
+
+   if (opt->dsthao) {
+   off = ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO);
+   if (likely(off >= 0)) {
+   hao = (struct ipv6_destopt_hao *)(skb->nh.raw + off);
+   ipv6_addr_copy(&tmp, &iph->saddr);
+   ipv6_addr_copy(&iph->saddr, &hao->addr);
+   ipv6_addr_copy(&hao->addr, &tmp);
+   }
+   }
+}
+#else
+static inline void mip6_addr_swap(struct sk_buff *skb) {}
+#endif
+
 /*
  * Send an ICMP message in response to a packet in error
  */
@@ -350,6 +373,8 @@ void icmpv6_send(struct sk_buff *skb, in
return;
}
 
+   mip6_addr_swap(skb);
+
memset(&fl, 0, sizeof(fl));
fl.proto = IPPROTO_ICMPV6;
ipv6_addr_copy(&fl.fl6_dst, &hdr->saddr);
-- 
1.4.0


[PATCH 18/44] [IPV6]: Add Kconfig to enable Mobile IPv6.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add Kconfig to enable Mobile IPv6.
Based on MIPL2 kernel patch.

Signed-off-by: Noriaki TAKAMIYA [EMAIL PROTECTED]
Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/ipv6/Kconfig |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 5067665..82ea1dc 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -97,6 +97,15 @@ config INET6_IPCOMP
 
  If unsure, say Y.
 
+config IPV6_MIP6
+   bool "IPv6: Mobility (EXPERIMENTAL)"
+   depends on IPV6 && EXPERIMENTAL
+   select XFRM
+   ---help---
+ Support for IPv6 Mobility described in RFC 3775.
+
+ If unsure, say N.
+
 config INET6_XFRM_TUNNEL
tristate
select INET6_TUNNEL
-- 
1.4.0


[PATCH 7/44] [XFRM] STATE: Add a hook to find offset to be inserted header in outbound.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

In the current kernel, IPv6 IPsec uses ip6_find_1stfragopt() to find the offset at
which a header is inserted on the outbound path in transport mode. (BTW, no such
hook may be needed for the IPv4 case.)
Mobile IPv6 requires different logic for the routing header and for the destination
options header. This patch adds a common hook for computing the offset and converts
IPsec to use it.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/net/xfrm.h  |3 +++
 net/ipv6/ah6.c  |3 ++-
 net/ipv6/esp6.c |3 ++-
 net/ipv6/ipcomp6.c  |1 +
 net/ipv6/ipv6_syms.c|1 +
 net/ipv6/xfrm6_mode_transport.c |2 +-
 net/ipv6/xfrm6_output.c |6 ++
 7 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index bd51224..a936423 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -266,6 +266,7 @@ struct xfrm_type
void(*destructor)(struct xfrm_state *);
int (*input)(struct xfrm_state *, struct sk_buff *skb);
int (*output)(struct xfrm_state *, struct sk_buff *pskb);
+   int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **);
/* Estimate maximal size of result of transformation of a dgram */
u32 (*get_max_size)(struct xfrm_state *, int size);
 };
@@ -960,6 +961,8 @@ extern u32 xfrm6_tunnel_alloc_spi(xfrm_a
 extern void xfrm6_tunnel_free_spi(xfrm_address_t *saddr);
 extern u32 xfrm6_tunnel_spi_lookup(xfrm_address_t *saddr);
 extern int xfrm6_output(struct sk_buff *skb);
+extern int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb,
+u8 **prevhdr);
 
 #ifdef CONFIG_XFRM
 extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type);
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index b085562..ab90b2d 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -424,7 +424,8 @@ static struct xfrm_type ah6_type =
.init_state = ah6_init_state,
.destructor = ah6_destroy,
.input  = ah6_input,
-   .output = ah6_output
+   .output = ah6_output,
+   .hdr_offset = xfrm6_find_1stfragopt,
 };
 
 static struct inet6_protocol ah6_protocol = {
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9231981..0e2b372 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -365,7 +365,8 @@ static struct xfrm_type esp6_type =
.destructor = esp6_destroy,
.get_max_size   = esp6_get_max_size,
.input  = esp6_input,
-   .output = esp6_output
+   .output = esp6_output,
+   .hdr_offset = xfrm6_find_1stfragopt,
 };
 
 static struct inet6_protocol esp6_protocol = {
diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c
index 1578529..8669146 100644
--- a/net/ipv6/ipcomp6.c
+++ b/net/ipv6/ipcomp6.c
@@ -460,6 +460,7 @@ static struct xfrm_type ipcomp6_type = 
.destructor = ipcomp6_destroy,
.input  = ipcomp6_input,
.output = ipcomp6_output,
+   .hdr_offset = xfrm6_find_1stfragopt,
 };
 
 static struct inet6_protocol ipcomp6_protocol = 
diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c
index dd4d1ce..e1a7416 100644
--- a/net/ipv6/ipv6_syms.c
+++ b/net/ipv6/ipv6_syms.c
@@ -31,6 +31,7 @@ EXPORT_SYMBOL(ipv6_chk_addr);
 EXPORT_SYMBOL(in6_dev_finish_destroy);
 #ifdef CONFIG_XFRM
 EXPORT_SYMBOL(xfrm6_rcv);
+EXPORT_SYMBOL(xfrm6_find_1stfragopt);
 #endif
 EXPORT_SYMBOL(rt6_lookup);
 EXPORT_SYMBOL(ipv6_push_nfrag_opts);
diff --git a/net/ipv6/xfrm6_mode_transport.c b/net/ipv6/xfrm6_mode_transport.c
index 711d713..a5dce21 100644
--- a/net/ipv6/xfrm6_mode_transport.c
+++ b/net/ipv6/xfrm6_mode_transport.c
@@ -35,7 +35,7 @@ static int xfrm6_transport_output(struct
skb_push(skb, x->props.header_len);
iph = skb->nh.ipv6h;
 
-   hdr_len = ip6_find_1stfragopt(skb, &prevhdr);
+   hdr_len = x->type->hdr_offset(x, skb, &prevhdr);
skb->nh.raw = prevhdr - x->props.header_len;
skb->h.raw = skb->data + hdr_len;
memmove(skb-data, iph, hdr_len);
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 26f1886..b4628fb 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -17,6 +17,12 @@ #include <linux/netfilter_ipv6.h>
 #include <net/ipv6.h>
 #include <net/xfrm.h>
 
+int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb,
+ u8 **prevhdr)
+{
+   return ip6_find_1stfragopt(skb, prevhdr);
+}
+
 static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
int mtu, ret = 0;
-- 
1.4.0
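
The change boils down to replacing a direct call with a per-type function pointer, so that each transformation type can decide where its header goes. A compact sketch of that dispatch follows, assuming a raw byte buffer instead of an sk_buff and a default hook that does not walk extension headers at all; every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN      40   /* fixed IPv6 header size in octets */
#define IPV6_NEXTHDR_OFF   6   /* offset of the Next Header field */

struct state_sketch;

/* Per-type operations; only the offset hook is modelled here. */
struct type_sketch {
    int (*hdr_offset)(struct state_sketch *x, uint8_t *pkt, uint8_t **prevhdr);
};

struct state_sketch {
    const struct type_sketch *type;
};

/* Simplified default hook: insert right after the fixed IPv6 header. */
int sketch_after_ipv6_hdr(struct state_sketch *x, uint8_t *pkt, uint8_t **prevhdr)
{
    (void)x;
    *prevhdr = pkt + IPV6_NEXTHDR_OFF;   /* field to patch with the new protocol */
    return IPV6_HDR_LEN;
}

const struct type_sketch sketch_esp_type = {
    .hdr_offset = sketch_after_ipv6_hdr,
};

/* Transport-mode output asks the type where the new header goes. */
int sketch_transport_offset(struct state_sketch *x, uint8_t *pkt, uint8_t **prevhdr)
{
    return x->type->hdr_offset(x, pkt, prevhdr);
}
```

A Mobile IPv6 type could then plug in its own hook without the transport-mode code knowing anything about routing or destination options headers.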


[PATCH 19/44] [IPV6] MIP6: Add routing header type 2 definition.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add routing header type 2 definition for Mobile IPv6.
Based on MIPL2 kernel patch.

Signed-off-by: Noriaki TAKAMIYA [EMAIL PROTECTED]
Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 include/linux/ipv6.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 297853c..14be2db 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -29,6 +29,7 @@ struct in6_ifreq {
 
 #define IPV6_SRCRT_STRICT  0x01/* this hop must be a neighbor  */
 #define IPV6_SRCRT_TYPE_0  0   /* IPv6 type 0 Routing Header   */
+#define IPV6_SRCRT_TYPE_2  2   /* IPv6 type 2 Routing Header   */
 
 /*
  * routing header
@@ -73,6 +74,18 @@ struct rt0_hdr {
 #define rt0_type   rt_hdr.type
 };
 
+/*
+ * routing header type 2
+ */
+
+struct rt2_hdr {
+   struct ipv6_rt_hdr  rt_hdr;
+   __u32   reserved;
+   struct in6_addr addr;
+
+#define rt2_type   rt_hdr.type
+};
+
 struct ipv6_auth_hdr {
__u8  nexthdr;
__u8  hdrlen;   /* This one is measured in 32 bit units! */
-- 
1.4.0
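
As a quick sanity check of the layout, the struct can be mirrored in user space and filled in. The sketch assumes the usual four one-byte fields of the generic routing header and the RFC 3775 values for a type 2 header (Hdr Ext Len 2, Segments Left 1); the _sketch names and the fill function are illustrative only.

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>   /* struct in6_addr */

/* Generic routing header prefix (stand-in for struct ipv6_rt_hdr). */
struct rt_hdr_sketch {
    uint8_t nexthdr;
    uint8_t hdrlen;          /* 8-octet units, not counting the first 8 octets */
    uint8_t type;
    uint8_t segments_left;
};

/* Mirror of the rt2_hdr layout from the patch. */
struct rt2_hdr_sketch {
    struct rt_hdr_sketch rt_hdr;
    uint32_t reserved;
    struct in6_addr addr;    /* home address */
};

_Static_assert(sizeof(struct rt2_hdr_sketch) == 24,
               "type 2 routing header should be 24 octets");

/* Fill in a type 2 routing header carrying one home address. */
void rt2_hdr_fill(struct rt2_hdr_sketch *rt2, uint8_t nexthdr,
                  const struct in6_addr *home)
{
    memset(rt2, 0, sizeof(*rt2));
    rt2->rt_hdr.nexthdr = nexthdr;
    rt2->rt_hdr.hdrlen = (sizeof(*rt2) >> 3) - 1;   /* 24 / 8 - 1 */
    rt2->rt_hdr.type = 2;
    rt2->rt_hdr.segments_left = 1;
    rt2->addr = *home;
}
```

The header always carries exactly one address, which is why the length field is a constant rather than something computed per packet.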


[PATCH 17/44] [XFRM]: Fix message about transformation user interface.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Transformation user interface is not only for IPsec.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
 net/xfrm/Kconfig |6 +++---
 net/xfrm/xfrm_user.c |2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig
index 0c1c043..43228f7 100644
--- a/net/xfrm/Kconfig
+++ b/net/xfrm/Kconfig
@@ -6,11 +6,11 @@ config XFRM
depends on NET
 
 config XFRM_USER
-   tristate "IPsec user configuration interface"
+   tristate "Transformation user configuration interface"
    depends on INET && XFRM
---help---
- Support for IPsec user configuration interface used
- by native Linux tools.
+ Support for Transformation(XFRM) user configuration interface
+ like IPsec used by native Linux tools.
 
  If unsure, say Y.
 
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 6511085..642ec66 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2052,7 +2052,7 @@ static int __init xfrm_user_init(void)
 {
struct sock *nlsk;
 
-   printk(KERN_INFO "Initializing IPsec netlink socket\n");
+   printk(KERN_INFO "Initializing XFRM netlink socket\n");
 
nlsk = netlink_kernel_create(NETLINK_XFRM, XFRMNLGRP_MAX,
 xfrm_netlink_rcv, THIS_MODULE);
-- 
1.4.0


[PATCH 27/44] [IPV6] IPSEC: Support sending with Mobile IPv6 extension headers.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Mobile IPv6 defines the home address option as an option of the
destination options header. Because it is placed before the fragment
header, ip6_find_1stfragopt() is fixed to know about it.
The home address option also carries the final source address of the
flow, so the outbound AH calculation must take care of it, as in the
routing header case.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
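
A note for readers of the archive, not part of the patch: the address swap
that the description above refers to can be sketched in userspace as below.
This is a simplified illustration that works on raw byte buffers. The
demo_* names are hypothetical; the option type values (0 for Pad1, 201 for
the home address option) are the ones assigned in RFC 2460 and RFC 3775.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TLV_PAD1 0     /* single-octet padding, has no length field */
#define DEMO_TLV_HAO  201   /* home address option */

/*
 * Walk the TLV options of a destination options header and, if a home
 * address option is found, swap its address with the packet source
 * address. opts points at the start of the header (nexthdr, hdrlen,
 * then options) and len is the header length in octets.
 * Returns 0 on success, -1 if the options are malformed.
 */
int demo_swap_hao(uint8_t *opts, size_t len, uint8_t saddr[16])
{
    size_t off = 2;     /* skip the nexthdr and hdrlen octets */

    assert(opts != NULL && saddr != NULL);
    if (len < 2)
        return -1;

    while (off < len) {
        size_t optlen;

        if (opts[off] == DEMO_TLV_PAD1) {
            off += 1;
            continue;
        }
        if (len - off < 2)
            return -1;
        optlen = (size_t)opts[off + 1] + 2;
        if (len - off < optlen)
            return -1;

        if (opts[off] == DEMO_TLV_HAO) {
            uint8_t tmp[16];

            if (opts[off + 1] != 16)
                return -1;      /* option data must be exactly one address */
            memcpy(tmp, &opts[off + 2], 16);
            memcpy(&opts[off + 2], saddr, 16);
            memcpy(saddr, tmp, 16);
        }
        off += optlen;
    }
    return 0;
}
```

Unlike this sketch, ipv6_rearrange_destopt() in the patch returns void, so a
malformed option there just stops the walk instead of reporting an error.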
 net/ipv6/ah6.c|  109 +
 net/ipv6/ip6_output.c |   18 ++--
 2 files changed, 122 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index ab90b2d..96f36fd 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -74,6 +74,68 @@ bad:
return 0;
 }
 
+#ifdef CONFIG_IPV6_MIP6
+/**
+ * ipv6_rearrange_destopt - rearrange IPv6 destination options header
+ * @iph: IPv6 header
+ * @destopt: destination options header
+ */
+static void ipv6_rearrange_destopt(struct ipv6hdr *iph, struct ipv6_opt_hdr *destopt)
+{
+   u8 *opt = (u8 *)destopt;
+   int len = ipv6_optlen(destopt);
+   int off = 0;
+   int optlen = 0;
+
+   off += 2;
+   len -= 2;
+
+   while (len > 0) {
+
+   switch (opt[off]) {
+
+   case IPV6_TLV_PAD0:
+   optlen = 1;
+   break;
+   default:
+   if (len < 2)
+   goto bad;
+   optlen = opt[off+1]+2;
+   if (len < optlen)
+   goto bad;
+
+   /* Rearrange the source address in @iph and the
+* addresses in home address option for final source.
+* See 11.3.2 of RFC 3775 for details.
+*/
+   if (opt[off] == IPV6_TLV_HAO) {
+   struct in6_addr final_addr;
+   struct ipv6_destopt_hao *hao;
+
+   hao = (struct ipv6_destopt_hao *)&opt[off];
+   if (hao->length != sizeof(hao->addr)) {
+   if (net_ratelimit())
+   printk(KERN_WARNING "destopt hao: invalid header length: %u\n", hao->length);
+   goto bad;
+   }
+   ipv6_addr_copy(&final_addr, &hao->addr);
+   ipv6_addr_copy(&hao->addr, &iph->saddr);
+   ipv6_addr_copy(&iph->saddr, &final_addr);
+   }
+   break;
+   }
+
+   off += optlen;
+   len -= optlen;
+   }
+   if (len == 0)
+   return;
+
+bad:
+   return;
+}
+#endif
+
 /**
  * ipv6_rearrange_rthdr - rearrange IPv6 routing header
  * @iph: IPv6 header
@@ -113,7 +175,11 @@ static void ipv6_rearrange_rthdr(struct 
ipv6_addr_copy(&iph->daddr, &final_addr);
 }
 
+#ifdef CONFIG_IPV6_MIP6
+static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len, int dir)
+#else
 static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len)
+#endif
 {
union {
struct ipv6hdr *iph;
@@ -128,6 +194,28 @@ static int ipv6_clear_mutable_options(st
 
while (exthdr.raw < end) {
switch (nexthdr) {
+#ifdef CONFIG_IPV6_MIP6
+   case NEXTHDR_HOP:
+   if (!zero_out_mutable_opts(exthdr.opth)) {
+   LIMIT_NETDEBUG(
+   KERN_WARNING "overrun %sopts\n",
+   nexthdr == NEXTHDR_HOP ?
+   "hop" : "dest");
+   return -EINVAL;
+   }
+   break;
+   case NEXTHDR_DEST:
+   if (dir == XFRM_POLICY_OUT)
+   ipv6_rearrange_destopt(iph, exthdr.opth);
+   if (!zero_out_mutable_opts(exthdr.opth)) {
+   LIMIT_NETDEBUG(
+   KERN_WARNING "overrun %sopts\n",
+   nexthdr == NEXTHDR_HOP ?
+   "hop" : "dest");
+   return -EINVAL;
+   }
+   break;
+#else
case NEXTHDR_HOP:
case NEXTHDR_DEST:
if (!zero_out_mutable_opts(exthdr.opth)) {
@@ -138,6 +226,7 @@ static int ipv6_clear_mutable_options(st
return -EINVAL;
}
break;
+#endif
 
case NEXTHDR_ROUTING:
ipv6_rearrange_rthdr(iph,

[PATCH 31/44] [IPV6] MIP6: Add Mobility header definition.

2006-08-23 Thread YOSHIFUJI Hideaki

From: Masahide NAKAMURA [EMAIL PROTECTED]

Add Mobility header definition for Mobile IPv6.
Based on MIPL2 kernel patch.

This patch was also written by: Antti Tuominen [EMAIL PROTECTED]

Signed-off-by: Masahide NAKAMURA [EMAIL PROTECTED]
Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
---
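
A note for readers of the archive, not part of the patch: a minimal userspace
sketch of how the fixed part of the Mobility Header and the IP6_MH_TYPE_*
values might be used to classify a received message. The demo_* names are
hypothetical, the struct mirrors struct ip6_mh from this patch, and the
name strings are taken from the comments on the type definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of struct ip6_mh from the patch. */
struct demo_ip6_mh {
    uint8_t  ip6mh_proto;
    uint8_t  ip6mh_hdrlen;
    uint8_t  ip6mh_type;
    uint8_t  ip6mh_reserved;
    uint16_t ip6mh_cksum;
    uint8_t  data[];            /* type specific message follows */
} __attribute__((__packed__));

#define DEMO_MH_TYPE_MAX 7      /* same value as IP6_MH_TYPE_BERROR */

/*
 * Return a printable name for the Mobility Header type found in buf,
 * or NULL if the buffer is shorter than the fixed header or the type
 * is above DEMO_MH_TYPE_MAX.
 */
const char *demo_mh_type_name(const uint8_t *buf, size_t len)
{
    static const char *const names[DEMO_MH_TYPE_MAX + 1] = {
        "Binding Refresh Request",
        "HOTI Message",
        "COTI Message",
        "HOT Message",
        "COT Message",
        "Binding Update",
        "Binding ACK",
        "Binding Error",
    };
    uint8_t type;

    if (buf == NULL || len < sizeof(struct demo_ip6_mh))
        return NULL;

    /* Read the type octet by offset so no cast of buf is needed. */
    type = buf[offsetof(struct demo_ip6_mh, ip6mh_type)];
    if (type > DEMO_MH_TYPE_MAX)
        return NULL;
    return names[type];
}
```

A bounds check against the maximum type is presumably what
IP6_MH_TYPE_MAX is intended for, although this patch only adds the
definitions.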
 include/linux/in6.h |1 +
 include/net/flow.h  |9 +
 include/net/ipv6.h  |1 +
 include/net/mip6.h  |   23 +++
 4 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/include/linux/in6.h b/include/linux/in6.h
index 086ec2a..d776829 100644
--- a/include/linux/in6.h
+++ b/include/linux/in6.h
@@ -134,6 +134,7 @@ #define IPPROTO_FRAGMENT44  /* IPv6 frag
 #define IPPROTO_ICMPV6 58  /* ICMPv6   */
 #define IPPROTO_NONE   59  /* IPv6 no next header  */
 #define IPPROTO_DSTOPTS    60  /* IPv6 destination options */
+#define IPPROTO_MH 135 /* IPv6 mobility header */
 
 /*
  * IPv6 TLV options.
diff --git a/include/net/flow.h b/include/net/flow.h
index 21d988b..e052291 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -72,12 +72,21 @@ #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x0
} dnports;
 
__u32   spi;
+
+#ifdef CONFIG_IPV6_MIP6
+   struct {
+   __u8    type;
+   } mht;
+#endif
} uli_u;
 #define fl_ip_sport    uli_u.ports.sport
 #define fl_ip_dport    uli_u.ports.dport
 #define fl_icmp_type   uli_u.icmpt.type
 #define fl_icmp_code   uli_u.icmpt.code
 #define fl_ipsec_spi   uli_u.spi
+#ifdef CONFIG_IPV6_MIP6
+#define fl_mh_type uli_u.mht.type
+#endif
__u32   secid;  /* used by xfrm; see secid.txt */
 } __attribute__((__aligned__(BITS_PER_LONG/8)));
 
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 8e6ec60..72bf47b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -40,6 +40,7 @@ #define NEXTHDR_AUTH  51  /* Authenticati
 #define NEXTHDR_ICMP   58  /* ICMP for IPv6. */
 #define NEXTHDR_NONE   59  /* No next header */
 #define NEXTHDR_DEST   60  /* Destination options header. */
+#define NEXTHDR_MOBILITY   135 /* Mobility header. */
 
 #define NEXTHDR_MAX    255
 
diff --git a/include/net/mip6.h b/include/net/mip6.h
index 42b65ba..fd43178 100644
--- a/include/net/mip6.h
+++ b/include/net/mip6.h
@@ -28,6 +28,29 @@ #define _NET_MIP6_H
 #define MIP6_OPT_PAD_1 0
 #define MIP6_OPT_PAD_N 1
 
+/*
+ * Mobility Header
+ */
+struct ip6_mh {
+   __u8    ip6mh_proto;
+   __u8    ip6mh_hdrlen;
+   __u8    ip6mh_type;
+   __u8    ip6mh_reserved;
+   __u16   ip6mh_cksum;
+   /* Followed by type specific messages */
+   __u8    data[0];
+} __attribute__ ((__packed__));
+
+#define IP6_MH_TYPE_BRR    0   /* Binding Refresh Request */
+#define IP6_MH_TYPE_HOTI   1   /* HOTI Message   */
+#define IP6_MH_TYPE_COTI   2   /* COTI Message  */
+#define IP6_MH_TYPE_HOT    3   /* HOT Message   */
+#define IP6_MH_TYPE_COT    4   /* COT Message  */
+#define IP6_MH_TYPE_BU 5   /* Binding Update */
+#define IP6_MH_TYPE_BACK   6   /* Binding ACK */
+#define IP6_MH_TYPE_BERROR 7   /* Binding Error */
+#define IP6_MH_TYPE_MAX    IP6_MH_TYPE_BERROR
+
 extern int mip6_init(void);
 extern void mip6_fini(void);
 
-- 
1.4.0
