Re: [PATCH 3/3] infiniband: rxe: fix 32-bit build warnings

2016-06-13 Thread Moni Shoua
thanks. will be applied to next series.

On Mon, Jun 13, 2016 at 3:54 PM, Arnd Bergmann  wrote:
> The new rxe infinband driver passes around pointers that have been
> converted to 64-bit integers. This is valid, but causes compile-time
> warnings on all 32-bit architectures:
>
> infiniband/hw/rxe/rxe_dma.c: In function 'rxe_dma_map_single':
> infiniband/hw/rxe/rxe_dma.c:49:9: error: cast from pointer to integer of 
> different size [-Werror=pointer-to-int-cast]
>   return (u64)cpu_addr;
>  ^
> infiniband/hw/rxe/rxe_dma.c: In function 'rxe_dma_map_page':
> infiniband/hw/rxe/rxe_dma.c:73:9: error: cast from pointer to integer of 
> different size [-Werror=pointer-to-int-cast]
> infiniband/hw/rxe/rxe_dma.c: In function 'rxe_map_sg':
> infiniband/hw/rxe/rxe_dma.c:99:10: error: cast from pointer to integer of 
> different size [-Werror=pointer-to-int-cast]
> infiniband/hw/rxe/rxe_dma.c: In function 'rxe_dma_alloc_coherent':
> infiniband/hw/rxe/rxe_dma.c:143:17: error: cast from pointer to integer of 
> different size [-Werror=pointer-to-int-cast]
>
> This changes the cast to use 'uintptr_t', which can always be
> cast to and from pointer, and can be assigned to and from 64-bit
> integers.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/infiniband/hw/rxe/rxe_dma.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/hw/rxe/rxe_dma.c 
> b/drivers/infiniband/hw/rxe/rxe_dma.c
> index f080bc5bda43..7634c1a81b2b 100644
> --- a/drivers/infiniband/hw/rxe/rxe_dma.c
> +++ b/drivers/infiniband/hw/rxe/rxe_dma.c
> @@ -46,7 +46,7 @@ static u64 rxe_dma_map_single(struct ib_device *dev,
>   enum dma_data_direction direction)
>  {
> WARN_ON(!valid_dma_direction(direction));
> -   return (u64)cpu_addr;
> +   return (uintptr_t)cpu_addr;
>  }
>
>  static void rxe_dma_unmap_single(struct ib_device *dev,
> @@ -70,7 +70,7 @@ static u64 rxe_dma_map_page(struct ib_device *dev,
> goto done;
> }
>
> -   addr = (u64)page_address(page);
> +   addr = (uintptr_t)page_address(page);
> if (addr)
> addr += offset;
>
> @@ -96,7 +96,7 @@ static int rxe_map_sg(struct ib_device *dev, struct 
> scatterlist *sgl,
> WARN_ON(!valid_dma_direction(direction));
>
> for_each_sg(sgl, sg, nents, i) {
> -   addr = (u64)page_address(sg_page(sg));
> +   addr = (uintptr_t)page_address(sg_page(sg));
> if (!addr) {
> ret = 0;
> break;
> @@ -140,7 +140,7 @@ static void *rxe_dma_alloc_coherent(struct ib_device 
> *dev, size_t size,
> addr = page_address(p);
>
> if (dma_handle)
> -   *dma_handle = (u64)addr;
> +   *dma_handle = (uintptr_t)addr;
>
> return addr;
>  }
> --
> 2.7.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] infiniband: rxe: avoid 64-bit division

2016-06-13 Thread Moni Shoua
thanks. will be applied to next series

On Mon, Jun 13, 2016 at 4:18 PM, Leon Romanovsky  wrote:
> On Mon, Jun 13, 2016 at 02:54:53PM +0200, Arnd Bergmann wrote:
>> The rxe driver fails to build on 32-bit because of a 64-bit division:
>>
>> In function `rxe_qp_from_attr':
>> :(.text+0x53158): undefined reference to `__aeabi_uldivmod'
>>
>> We can easily avoid this division by converting the nanosecond value
>> into jiffies directly rather than converting to microseconds first.
>>
>> Signed-off-by: Arnd Bergmann 
>
> Thanks Arnd,
> All three patches are applied on topic/rxe now.


Re: [PATCH 2/3] infiniband: rxe: add UDP_TUNNEL dependency

2016-06-13 Thread Moni Shoua
thanks. will be applied to next series

On Mon, Jun 13, 2016 at 3:54 PM, Arnd Bergmann  wrote:
> The newly added rxe driver links against the UDP tunneling code,
> which causes build errors when CONFIG_UDP_TUNNEL is disabled:
>
> ERROR: "setup_udp_tunnel_sock" [drivers/infiniband/hw/rxe/ib_rxe.ko] 
> undefined!
> ERROR: "udp_tunnel_sock_release" [drivers/infiniband/hw/rxe/ib_rxe.ko] 
> undefined!
> ERROR: "udp_sock_create4" [drivers/infiniband/hw/rxe/ib_rxe.ko] undefined!
>
> This adds a Kconfig dependency to prevent the invalid configuration.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/infiniband/hw/rxe/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/infiniband/hw/rxe/Kconfig 
> b/drivers/infiniband/hw/rxe/Kconfig
> index 649b7be11eb8..a199d0df31d0 100644
> --- a/drivers/infiniband/hw/rxe/Kconfig
> +++ b/drivers/infiniband/hw/rxe/Kconfig
> @@ -1,6 +1,7 @@
>  config INFINIBAND_RXE
> tristate "Software RDMA over Ethernet (RoCE) driver"
> depends on INET && PCI && INFINIBAND
> +   depends on NET_UDP_TUNNEL
> ---help---
> This driver implements the InfiniBand RDMA transport over
> the Linux network stack. It enables a system with a
> --
> 2.7.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Possible bug in bonding

2007-11-12 Thread Moni Shoua

Hi,

As reported last week by Or Gerlitz and confirmed by me, there is a kernel 
crash 
when trying to unenslave all ib slaves (which leads to bonding master 
destruction).
I also found that it happens with Ethernet slaves if following the steps:

1. unenslaving all slaves via sysfs
2. deleting the bonding master via sysfs


I see that in commit 6603a6f25e4bca922a7dfbf0bf03072d98850176 
the order of the bonding master destruction sequence was changed 
in 2 places (a part of the commitdiff is below).

Although I might be missing something it looks like a bug to me
to dereference the dev pointer after unregistering it. Am I right?

I also see that the kernel crash doesn't happen when reverting from this 
(part of) the patch.

I'd like to have your opinion please.

thanks
 
MoniS

-

commit 6603a6f25e4bca922a7dfbf0bf03072d98850176
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:50 2007 -0700

bonding: Convert more locks to _bh, acquire rtnl, for new locking

Convert more lock acquisitions to _bh flavor to avoid deadlock
with workqueue activity and add acquisition of RTNL in appropriate places.
Affects ALB mode, as well as core bonding functions and sysfs.

Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
Signed-off-by: Jay Vosburgh [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]


--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1846,9 +1846,9 @@ int bond_release(struct net_device *bond
 */
 void bond_destroy(struct bonding *bond)
 {
+   unregister_netdevice(bond-dev);
bond_deinit(bond-dev);
bond_destroy_sysfs_entry(bond);
-   unregister_netdevice(bond-dev);
 }

.
.
.

@@ -4473,8 +4473,8 @@ static void bond_free_all(void)
bond_mc_list_destroy(bond);
/* Release the bonded slaves */
bond_release_all(bond_dev);
-   bond_deinit(bond_dev);
unregister_netdevice(bond_dev);
+   bond_deinit(bond_dev);
}
 
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] bonding: fix rtnl locking merge error

2007-11-07 Thread Moni Shoua
Jay Vosburgh wrote:
   Looks like I incorrectly merged one of the rtnl lock changes,
 so that one function, bonding_show_active_slave, held rtnl but didn't
 release it, and another, bonding_store_active_slave, never held rtnl but
 did release it.
 
   Fixed so the first function doesn't mess with rtnl, and the
 second correctly acquires and releases rtnl.
 
   Bug reported by Moni Shoua [EMAIL PROTECTED]

I ran some shallow tests and it seems that the patch fixes  the problem.

thanks

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bonding / 2.6.24-rc1 issues

2007-11-07 Thread Moni Shoua
Or Gerlitz wrote:
 Jay, Moni
 
 I did some tests with 2.6.24-rc1 and the first patch to bonding that Jay
 sent last night to netdev. Basic operation and fail over work fine.
 However, I see some crashes which are somehow related to destroying the
 bond when the slaves are ipoib ones, I don't see similar crashes when
 enslaving ethernet devices (Broadcom Corporation NetXtreme BCM5704
 Gigabit Ethernet (rev 03)), my compressed dot config is attached.
 
 The first type of oops is when I just do modprobe -r bonding after
 enslavement of the ipoib devices:
 
 
 the second type of oops is when I modprobe -r ib_ipoib after
 enslavement. I was not able to test this one with ethernet as the tg3
 code is built into my kernel

I couldn't reproduce the first oops. However, after applying Jay's
fixes for 2.6.24-rc1 I managed to reproduce the second oops.
I will try to look into it.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: problems with ib-bonding of 2.6.24-rc1

2007-11-06 Thread Moni Shoua
Same thing happens with Ethernet slave - Intel 82546GB Gigabit Ethernet 
Controller (rev 03)

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: problems with ib-bonding of 2.6.24-rc1

2007-11-05 Thread Moni Shoua
Moni Shoua wrote:
 Basically, what I see is that after a while commands like ifconfig or ip 
 stucks.
 I only use sysfs  to configure bonding (which also stucks after a while).
 

Maybe some extra information might help

This is what happens in the kernel while the command 'ip' is stucked in user 
level.

bond0 S 8057eb80 0  4799  2
 8100060e7ee0 0046  0286
 810004259820 81000160d760 810004259a28 00020019
  80244f94  
Call Trace:
 [80244f94] queue_delayed_work_on+0xaf/0xbe
 [880597c7] :bonding:bond_mii_monitor+0x0/0x9a
 [80244fe8] worker_thread+0x0/0xe4
 [80245089] worker_thread+0xa1/0xe4
 [8024833b] autoremove_wake_function+0x0/0x2e
 [80248213] kthread+0x47/0x74
 [8020c458] child_rip+0xa/0x12
 [802481cc] kthread+0x0/0x74
 [8020c44e] child_rip+0x0/0x12

ipD 8057eb80 0  4842   4841
 81000615bb88 0082  80289b5c
 810006123140 8100016a77e0 810006123348 0003045c45a8
    
Call Trace:
 [80289b5c] __link_path_walk+0xb97/0xcec
 [80563a64] __mutex_lock_slowpath+0x68/0xa3
 [805638f6] mutex_lock+0x1a/0x1e
 [804ecffb] rtnetlink_rcv+0x9/0x1e
 [804f30e6] netlink_unicast+0x1c3/0x236
 [804f3a69] netlink_sendmsg+0x296/0x2a9
 [80266235] zone_statistics+0x3f/0x60
 [804d9806] sock_sendmsg+0xcb/0xe3
 [8024833b] autoremove_wake_function+0x0/0x2e
 [80267960] __do_fault+0x35e/0x398
 [804d8d09] move_addr_to_kernel+0x25/0x36
 [804d9bd0] sys_sendto+0x128/0x151
 [804da686] sys_getsockname+0x72/0xa2
 [8020b63e] system_call+0x7e/0x83


I also attach my .config
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc1
# Tue Oct 30 09:49:25 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
# CONFIG_CGROUPS is not set
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
# CONFIG_FAIR_CGROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_KMOD is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_BLK_DEV_BSG is not set
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED=cfq

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC

Re: problems with ib-bonding of 2.6.24-rc1

2007-11-05 Thread Moni Shoua
Jay Vosburgh wrote:
 Moni Shoua [EMAIL PROTECTED] wrote:
 
 Basically, what I see is that after a while commands like ifconfig or ip 
 stucks.
 I only use sysfs  to configure bonding (which also stucks after a while).
 
   I've fooled with setting various things in bonding in the
 current linux-2.6 git kernel, and I'm not seeing the failure you
 describe.  Can you provide some step by step instructions, including the
 type of system, bonding mode, options, number and type of slaves, etc,
 to induce the failure?
 
   -J
 
 ---
   -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
Machine
---
ARCH:   x86_64
OS: Redhat EL Server 5

bonding
---
mod. options:   none
mode:   1 (active-backup)
miimon: 100
slaves: IP over InfiniBand

Below is a scenario that ends up with what I described

[EMAIL PROTECTED] root]# ifconfig -a
eth0  Link encap:Ethernet  HWaddr 00:04:23:B3:25:C4
  inet addr:172.30.3.234  Bcast:172.30.255.255  Mask:255.255.0.0
  inet6 addr: fe80::204:23ff:feb3:25c4/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:333219 errors:0 dropped:0 overruns:0 frame:0
  TX packets:130880 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:100
  RX bytes:248023834 (236.5 MiB)  TX bytes:35230500 (33.5 MiB)
  Base address:0xdc00 Memory:fcea-fcec

eth1  Link encap:Ethernet  HWaddr 00:04:23:B3:25:C5
  BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  Base address:0xdc80 Memory:fcee-fcf0

ib0   Link encap:InfiniBand  HWaddr 
00:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
  BROADCAST MULTICAST  MTU:2044  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

ib1   Link encap:InfiniBand  HWaddr 
00:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
  BROADCAST MULTICAST  MTU:2044  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

loLink encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:44 errors:0 dropped:0 overruns:0 frame:0
  TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:7145 (6.9 KiB)  TX bytes:7145 (6.9 KiB)

sit0  Link encap:IPv6-in-IPv4
  NOARP  MTU:1480  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

[EMAIL PROTECTED] root]# modprobe bonding
[EMAIL PROTECTED] root]# echo 1  /sys/class/net/bond0/bonding/mode
[EMAIL PROTECTED] root]# echo 100  /sys/class/net/bond0/bonding/miimon
[EMAIL PROTECTED] root]# echo +ib0  /sys/class/net/bond0/bonding/slaves
[EMAIL PROTECTED] root]# echo +ib1  /sys/class/net/bond0/bonding/slaves
[EMAIL PROTECTED] root]# ifconfig bond0 192.168.3.234
[EMAIL PROTECTED] root]# ip a s
1: lo: LOOPBACK,UP,LOWER_UP mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever
2: eth0: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:04:23:b3:25:c4 brd ff:ff:ff:ff:ff:ff
inet 172.30.3.234/16 brd 172.30.255.255 scope global eth0
inet6 fe80::204:23ff:feb3:25c4/64 scope link
   valid_lft forever preferred_lft forever
3: eth1: BROADCAST,MULTICAST mtu 1500 qdisc noop qlen 1000
link/ether 00:04:23:b3:25:c5 brd ff:ff:ff:ff:ff:ff
4: sit0: NOARP mtu 1480 qdisc noop
link/sit 0.0.0.0 brd 0.0.0.0
5: ib0: BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP mtu 1500 qdisc pfifo_fast 
master bond0 qlen 128
link/infiniband 00:00:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:96:e8:a9 
brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
6: ib1: BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP mtu 1500 qdisc pfifo_fast 
master bond0 qlen 128
link/infiniband 00:00:04:05:fe:80:00:00:00:00:00:00:00:08:f1:04:03:96:e8:aa 
brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
8: bond0: BROADCAST,MULTICAST,MASTER,UP,LOWER_UP mtu 1500 qdisc noqueue
link/infiniband 00:00:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:96:e8:a9 
brd 00:ff:ff:ff:ff:12:40

problems with ib-bonding of 2.6.24-rc1

2007-11-04 Thread Moni Shoua
Hi,
I've been doing some tests for bonding of 2.6.24-rc1 and noticed some problems.
My first goal was to see how bonding works with IPoIB slaves but I also tried it
with Ethernet.

Basically, what I see is that after a while commands like ifconfig or ip stucks.
I only use sysfs  to configure bonding (which also stucks after a while).

After stripping the list of commits below from the code I see no problems.

Does anybody else have the same problem?

thanks
 MoniS



commit d0e81b7e2246a41d068ecaf15aac9de570816d63
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:51 2007 -0700

bonding: Acquire correct locks in alb for promisc change

--
commit 6603a6f25e4bca922a7dfbf0bf03072d98850176
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:50 2007 -0700

bonding: Convert more locks to _bh, acquire rtnl, for new locking

--
commit 059fe7a578fba5bbb0fdc0365bfcf6218fa25eb0
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:49 2007 -0700

bonding: Convert locks to _bh, rework alb locking for new locking

--
commit 0b0eef66419e9abe6fd62bc958ab7cd0a18f858e
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:48 2007 -0700

bonding: Convert miimon to new locking

--
commit cf5f9044934658dd3ffc628a60cd37c70f8168b1
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:47 2007 -0700

bonding: Convert balance-rr transmit to new locking

--
commit 1b76b31693d4a6088dec104ff6a6ead54081a3c2
Author: Jay Vosburgh [EMAIL PROTECTED]
Date:   Wed Oct 17 17:37:45 2007 -0700

Convert bonding timers to workqueues

--
commit 3a4fa0a25da81600ea0bcd75692ae8ca6050d165
Author: Robert P. J. Day [EMAIL PROTECTED]
Date:   Fri Oct 19 23:10:43 2007 +0200

Fix misspellings of system, controller, interrupt and necessary.

--
commit 1c3f0b8e07de78a86f2dce911f5e245845ce40a8
Author: Mathieu Desnoyers [EMAIL PROTECTED]
Date:   Thu Oct 18 23:41:04 2007 -0700

Change struct marker users


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH linux-2.6] bonding: two small fixes for IPoIB support

2007-10-16 Thread Moni Shoua
Jay Vosburgh wrote:
   Two small fixes to IPoIB support for bonding:
 
   1- copy header_ops from slave to bonding for IPoIB slaves
   2- move release and destroy logic to UNREGISTER from GOING_DOWN
  notifier to avoid double release
 
   Set bonding to version 3.2.1.
 
 Signed-off-by: Moni Shoua monis at voltaire.com
 Signed-off-by: Jay Vosburgh [EMAIL PROTECTED]
 
 ---
  drivers/net/bonding/bond_main.c |   11 +--
  drivers/net/bonding/bonding.h   |4 ++--
  2 files changed, 7 insertions(+), 8 deletions(-)
 
 diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
 index db80f24..6f85cc3 100644
 --- a/drivers/net/bonding/bond_main.c
 +++ b/drivers/net/bonding/bond_main.c
 @@ -1263,6 +1263,7 @@ static void bond_setup_by_slave(struct net_device 
 *bond_dev,
   struct bonding *bond = bond_dev-priv;
  
   bond_dev-neigh_setup   = slave_dev-neigh_setup;
 + bond_dev-header_ops= slave_dev-header_ops;
  
   bond_dev-type  = slave_dev-type;
   bond_dev-hard_header_len   = slave_dev-hard_header_len;
 @@ -3351,7 +3352,10 @@ static int bond_slave_netdev_event(unsigned long 
 event, struct net_device *slave
   switch (event) {
   case NETDEV_UNREGISTER:
   if (bond_dev) {
 - bond_release(bond_dev, slave_dev);
 + if (bond-setup_by_slave)
 + bond_release_and_destroy(bond_dev, slave_dev);
 + else
 + bond_release(bond_dev, slave_dev);
   }
   break;
   case NETDEV_CHANGE:
 @@ -3366,11 +3370,6 @@ static int bond_slave_netdev_event(unsigned long 
 event, struct net_device *slave
* ... Or is it this?
*/
   break;
 - case NETDEV_GOING_DOWN:
 - dprintk(slave %s is going down\n, slave_dev-name);
 - if (bond-setup_by_slave)
 - bond_release_and_destroy(bond_dev, slave_dev);
 - break;
   case NETDEV_CHANGEMTU:
   /*
* TODO: Should slaves be allowed to
 diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
 index a8bbd56..b818060 100644
 --- a/drivers/net/bonding/bonding.h
 +++ b/drivers/net/bonding/bonding.h
 @@ -22,8 +22,8 @@
  #include bond_3ad.h
  #include bond_alb.h
  
 -#define DRV_VERSION  3.2.0
 -#define DRV_RELDATE  September 13, 2007
 +#define DRV_VERSION  3.2.1
 +#define DRV_RELDATE  October 15, 2007
  #define DRV_NAME bonding
  #define DRV_DESCRIPTION  Ethernet Channel Bonding Driver
  
Jay,
Thanks for this work.

Jeff,
Thanks for applying.
I noticed that this patch isn't applied. It includes important fixes. 
Can you please apply it also?


MoniS


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 0/8] net/bonding: ADD IPoIB support for the bonding driver

2007-10-15 Thread Moni Shoua
This is the 7th version of this patch series. See link to V6 below.

Changes from the previous version
-

* Some patches required modifications to remove offsets so they can be applied 
with git-apply
* Patch #3 was first modified by Jay and later by me to make it work with 
header_ops
* patch #8 was changed to fix the problem that caused 'ifconfig down' to stuck 
(dev_close was called twice)

Jay,
I removed the Acked-by lines from patches 3  8. Can you please add them back 
after you approve?

thanks
MoniS

Link to V6: 

http://lists.openfabrics.org/pipermail/general/2007-September/041139.html


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 1/8] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-15 Thread Moni Shoua

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb-dst-neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb-dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n-dev is an ipoib device and hence netdev_priv(n-dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n-dev-flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
Acked-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   24 +++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 ++-
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 6545fa7..1b3327a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -349,6 +349,7 @@ #endif
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -365,7 +366,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index e072f3c..cae026c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -517,7 +517,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb-dst-neighbour);
+   neigh = ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
if (!neigh) {
++dev-stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -817,6 +817,13 @@ static void ipoib_neigh_cleanup(struct n
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
+   neigh = *to_ipoib_neigh(n);
+   if (neigh) {
+   priv = netdev_priv(neigh-dev);
+   ipoib_dbg(priv, neigh_destructor for bonding device: %s\n,
+ n-dev-name);
+   } else
+   return;
ipoib_dbg(priv,
  neigh_cleanup for %06x  IPOIB_GID_FMT \n,
  IPOIB_QPN(n-ha),
@@ -824,13 +831,10 @@ static void ipoib_neigh_cleanup(struct n
 
spin_lock_irqsave(priv-lock, flags);
 
-   neigh = *to_ipoib_neigh(n);
-   if (neigh) {
-   if (neigh-ah)
-   ah = neigh-ah;
-   list_del(neigh-list);
-   ipoib_neigh_free(n-dev, neigh);
-   }
+   if (neigh-ah)
+   ah = neigh-ah;
+   list_del(neigh-list);
+   ipoib_neigh_free(n-dev, neigh);
 
spin_unlock_irqrestore(priv-lock, flags);
 
@@ -838,7 +842,8 @@ static void ipoib_neigh_cleanup(struct n
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -847,6 +852,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh-neighbour = neighbour;
+   neigh-dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(neigh-queue);
ipoib_cm_set(neigh, NULL);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 
b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 827820e..9bcfc7a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -705,7 +705,8 @@ out:
if (skb-dst
skb-dst-neighbour 
!*to_ipoib_neigh(skb-dst-neighbour)) {
-   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour);
+   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour

Re: [ofa-general] [PATCH V7 0/8] net/bonding: ADD IPoIB support for the bonding driver

2007-10-15 Thread Moni Shoua
Moni Shoua wrote:
 This is the 7th version of this patch series. See link to V6 below.
 

I forgot to mention that the patches are relative to 
jgarzik/netdev-2.6.git#master.
I couldn't compile the 2.6.24 or the upstream branches so I used master branch 
to test the fixes.



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 2/8] IB/ipoib: Verify address handle validity on send

2007-10-15 Thread Moni Shoua

When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
Acked-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index cae026c..362610d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -692,9 +692,10 @@ static int ipoib_start_xmit(struct sk_bu
goto out;
}
} else if (neigh-ah) {
-   if (unlikely(memcmp(neigh-dgid.raw,
+   if (unlikely((memcmp(neigh-dgid.raw,
skb-dst-neighbour-ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid))) ||
+(neigh-dev != dev))) {
spin_lock(priv-lock);
/*
 * It's safe to call ipoib_put_ah() inside

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 5/8] net/bonding: Enable IP multicast for bonding IPoIB devices

2007-10-15 Thread Moni Shoua

Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast 
groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding 
device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another 
ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
Acked-by: Jay Vosburgh [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |5 +++--
 drivers/net/bonding/bond_sysfs.c |6 ++
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 32dc75e..d7e43ba 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1281,8 +1281,9 @@ int bond_enslave(struct net_device *bond
 
/* bond must be initialized by bond_open() before enslaving */
if (!(bond_dev-flags  IFF_UP)) {
-   dprintk(Error, master_dev is not up\n);
-   return -EPERM;
+   printk(KERN_WARNING DRV_NAME
+%s: master_dev is not up in bond_enslave\n,
+   bond_dev-name);
}
 
/* already enslaved */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 6f49ca7..ca4e429 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
/* Quick sanity check -- is the bond interface up? */
if (!(bond-dev-flags  IFF_UP)) {
-   printk(KERN_ERR DRV_NAME
-  : %s: Unable to update slaves because interface is 
down.\n,
+   printk(KERN_WARNING DRV_NAME
+  : %s: doing slave updates when interface is down.\n,
   bond-dev-name);
-   ret = -EPERM;
-   goto out;
}
 
/* Note:  We can't hold bond-lock here, as bond_create grabs it. */

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 6/8] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-10-15 Thread Moni Shoua

bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua monis at voltaire.com
Acked-by: Jay Vosburgh [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |3 ++-
 drivers/net/bonding/bond_sysfs.c |   10 --
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d7e43ba..3f082dc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1225,7 +1225,8 @@ static int bond_compute_features(struct 
struct slave *slave;
struct net_device *bond_dev = bond-dev;
unsigned long features = bond_dev-features;
-   unsigned short max_hard_header_len = ETH_HLEN;
+   unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+   bond_dev-hard_header_len);
int i;
 
features = ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index ca4e429..583c568 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -260,6 +260,7 @@ static ssize_t bonding_store_slaves(stru
char command[IFNAMSIZ + 1] = { 0, };
char *ifname;
int i, res, found, ret = count;
+   u32 original_mtu;
struct slave *slave;
struct net_device *dev = NULL;
struct bonding *bond = to_bond(d);
@@ -325,6 +326,7 @@ static ssize_t bonding_store_slaves(stru
}
 
/* Set the slave's MTU to match the bond */
+   original_mtu = dev-mtu;
if (dev-mtu != bond-dev-mtu) {
if (dev-change_mtu) {
res = dev-change_mtu(dev,
@@ -339,6 +341,9 @@ static ssize_t bonding_store_slaves(stru
}
rtnl_lock();
res = bond_enslave(bond-dev, dev);
+   bond_for_each_slave(bond, slave, i)
+   if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0)
+   slave-original_mtu = original_mtu;
rtnl_unlock();
if (res) {
ret = res;
@@ -351,6 +356,7 @@ static ssize_t bonding_store_slaves(stru
bond_for_each_slave(bond, slave, i)
if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0) {
dev = slave-dev;
+   original_mtu = slave-original_mtu;
break;
}
if (dev) {
@@ -365,9 +371,9 @@ static ssize_t bonding_store_slaves(stru
}
/* set the slave MTU to the default */
if (dev-change_mtu) {
-   dev-change_mtu(dev, 1500);
+   dev-change_mtu(dev, original_mtu);
} else {
-   dev-mtu = 1500;
+   dev-mtu = original_mtu;
}
}
else {
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 5011ba9..ad9c632 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -156,6 +156,7 @@ struct slave {
s8 link;/* one of BOND_LINK_ */
s8 state;   /* one of BOND_STATE_ */
u32original_flags;
+   u32original_mtu;
u32link_failure_count;
u16speed;
u8 duplex;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


PATCH V6 7/8] net/bonding: Delay sending of gratuitous ARP to avoid failure

2007-10-15 Thread Moni Shoua

Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev-state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua monis at voltaire.com
Acked-by: Jay Vosburgh [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   24 +---
 drivers/net/bonding/bonding.h   |1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3f082dc..c017042 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1103,8 +1103,14 @@ void bond_change_active_slave(struct bon
if (new_active  !bond-do_set_mac_addr)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
-
-   bond_send_gratuitous_arp(bond);
+   if (bond-curr_active_slave 
+   test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state)) {
+   dprintk(delaying gratuitous arp on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond-send_grat_arp = 1;
+   } else
+   bond_send_gratuitous_arp(bond);
}
 }
 
@@ -2074,6 +2080,17 @@ void bond_mii_monitor(struct net_device 
 * program could monitor the link itself if needed.
 */
 
+   if (bond-send_grat_arp) {
+   if (bond-curr_active_slave  
test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state))
+   dprintk(Needs to send gratuitous arp but not yet\n);
+   else {
+   dprintk(sending delayed gratuitous arp on on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond_send_gratuitous_arp(bond);
+   bond-send_grat_arp = 0;
+   }
+   }
read_lock(bond-curr_slave_lock);
oldcurrent = bond-curr_active_slave;
read_unlock(bond-curr_slave_lock);
@@ -2475,7 +2492,7 @@ static void bond_send_gratuitous_arp(str
 
if (bond-master_ip) {
bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
- bond-master_ip, 0);
+   bond-master_ip, 0);
}
 
list_for_each_entry(vlan, bond-vlan_list, vlan_list) {
@@ -4281,6 +4298,7 @@ static int bond_init(struct net_device *
bond-current_arp_slave = NULL;
bond-primary_slave = NULL;
bond-dev = bond_dev;
+   bond-send_grat_arp = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index ad9c632..e0e06a8 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -187,6 +187,7 @@ struct bonding {
struct   timer_list arp_timer;
s8   kill_timers;
s8   do_set_mac_addr;
+   s8   send_grat_arp;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V7 8/8] net/bonding: Destroy bonding master when last slave is gone

2007-10-15 Thread Moni Shoua

When bonding enslaves non Ethernet devices it takes pointers to functions
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved
because we don't know if the pointers are still valid.  Destroying the bond 
when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c  |   37 -
 drivers/net/bonding/bond_sysfs.c |9 +
 drivers/net/bonding/bonding.h|3 +++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c017042..23edf18 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1257,6 +1257,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
 {
+   struct bonding *bond = bond_dev-priv;
bond_dev-neigh_setup   = slave_dev-neigh_setup;
 
bond_dev-type  = slave_dev-type;
@@ -1266,6 +1267,7 @@ static void bond_setup_by_slave(struct n
 
memcpy(bond_dev-broadcast, slave_dev-broadcast,
slave_dev-addr_len);
+   bond-setup_by_slave = 1;
 }
 
 /* enslave device slave to bond device master */
@@ -1829,6 +1831,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+   bond_deinit(bond-dev);
+   bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond-dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device 
*slave_dev)
+{
+   struct bonding *bond = bond_dev-priv;
+   int ret;
+
+   ret = bond_release(bond_dev, slave_dev);
+   if ((ret == 0)  (bond-slave_cnt == 0)) {
+   printk(KERN_INFO DRV_NAME : %s: destroying bond %s.\n,
+  bond_dev-name, bond_dev-name);
+   bond_destroy(bond);
+   }
+   return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3311,7 +3342,10 @@ static int bond_slave_netdev_event(unsig
switch (event) {
case NETDEV_UNREGISTER:
if (bond_dev) {
-   bond_release(bond_dev, slave_dev);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   else
+   bond_release(bond_dev, slave_dev);
}
break;
case NETDEV_CHANGE:
@@ -4299,6 +4333,7 @@ static int bond_init(struct net_device *
bond-primary_slave = NULL;
bond-dev = bond_dev;
bond-send_grat_arp = 0;
+   bond-setup_by_slave = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 583c568..b5d2a13 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
printk(KERN_INFO DRV_NAME
: %s is being deleted...\n,
bond-dev-name);
-   bond_deinit(bond-dev);
-   bond_destroy_sysfs_entry(bond);
-   unregister_netdevice(bond-dev);
+   bond_destroy(bond);
rtnl_unlock();
goto out;
}
@@ -363,7 +361,10 @@ static ssize_t bonding_store_slaves(stru
printk(KERN_INFO DRV_NAME : %s: Removing slave %s\n,
bond-dev-name, dev-name);
rtnl_lock();
-   res = bond_release(bond-dev, dev);
+   if (bond-setup_by_slave)
+   res = bond_release_and_destroy(bond-dev, dev);
+   else
+   res = bond_release(bond-dev, dev);
rtnl_unlock();
if (res) {
ret = res;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index e0e06a8..85e579b 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -188,6 +188,7 @@ struct bonding {
s8   kill_timers;
s8   do_set_mac_addr;
s8   send_grat_arp;
+   s8   setup_by_slave

Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-14 Thread Moni Shoua
Roland Dreier wrote:
   It happens only when ib interfaces are slaves of a bonding device.
   I thought before that the stuck is in napi_disable() but it's almost right.
   I put prints before and after call to napi_disable and see that it is 
 called twice.
   I'll try to investigate in this direction.
   
   ib0: stopping interface
   ib0: before napi_disable
   ib0: after napi_disable
   ib0: downing ib_dev
   ib0: All sends and receives done.
   ib0: stopping interface
   ib0: before napi_disable
 
 Yes, two napi_disable()s in a row without a matching napi_enable()
 will deadlock.  I guess the question is why the ipoib interface is
 being stopped twice.
 
 If you just take the net-2.6.24 tree (without bonding patches), does
 bonding for ethernet interfaces work OK, or is there a similar problem
 with double napi_disable()?  How about bonding of ethernet after this
 batch of bonding patches?
 
  - R.

Ok, I think I know what happens here.
When bonding gets an NETDEV_GOING_DONW event it releases the slave and 
by the way closes the slave device (this is a new code). ifconfig on the other 
hand
closes the deivice one more time and this is why we see 2 napi_disable() in a 
row.

The fix in my opinion is in bonding - it should react to NETDEV_UNREGISTER and 
not to NETDEV_GOING_DONW.
I want to test this point and if it's good I'll submit new patches.


thanks
  MoniS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-13 Thread Moni Shoua
I will be near my lab only tomorrow...
I will check this and let you know.

On 10/11/07, Roland Dreier [EMAIL PROTECTED] wrote:
   It happens only when ib interfaces are slaves of a bonding device.
   I thought before that the stuck is in napi_disable() but it's almost right.
   I put prints before and after call to napi_disable and see that it is 
 called twice.
   I'll try to investigate in this direction.
  
   ib0: stopping interface
   ib0: before napi_disable
   ib0: after napi_disable
   ib0: downing ib_dev
   ib0: All sends and receives done.
   ib0: stopping interface
   ib0: before napi_disable

 Yes, two napi_disable()s in a row without a matching napi_enable()
 will deadlock.  I guess the question is why the ipoib interface is
 being stopped twice.

 If you just take the net-2.6.24 tree (without bonding patches), does
 bonding for ethernet interfaces work OK, or is there a similar problem
 with double napi_disable()?  How about bonding of ethernet after this
 batch of bonding patches?

  - R.
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-11 Thread Moni Shoua
Roland Dreier wrote:
   I also ran a test for the code in the branch of 2.6.24 and found a problem.
   I see that ifconfig down doesn't return (for IPoIB interfaces) and it's 
 stuck in napi_disable() in the kernel (any idea why?)
 
 For what it's worth, I took the upstream 2.6.23 git tree and merged in
 Dave's latest net-2.6.24 tree and my latest for-2.6.24 tree and tried
 that.  I brought up an IPoIB interface, sent a few pings, and did
 ifconfig down, and it worked fine.
 
 Can you try the same thing without the bonding patches to see if your
 setup works OK too?
 
 Also can you give more details about what you do to get ifconfig down stuck?
 
  - R.

Without bonding ifconfig down works fine. 
It happens only when ib interfaces are slaves of a bonding device.
I thought before that the stuck is in napi_disable() but it's almost right.
I put prints before and after call to napi_disable and see that it is called 
twice.
I'll try to investigate in this direction.

ib0: stopping interface
ib0: before napi_disable
ib0: after napi_disable
ib0: downing ib_dev
ib0: All sends and receives done.
ib0: stopping interface
ib0: before napi_disable



There is also a dump of the kernel log after 'echo t  /proc/sysrq-trigger' 
(for ifconfig)

SysRq : Show State

ifconfig  S  0  6311   6099
 810034f49d18 0086  
 810037e747c0 810037e747c0 00013481e000 81003a851a78
 81003a851840 3b0c8c00  802358ee
Call Trace:
 [8023cc89] lock_timer_base+0x24/0x49
 [80403754] schedule_timeout+0x8a/0xad
 [8023d241] process_timeout+0x0/0x5
 [8023d6ec] msleep_interruptible+0x11/0x39
 [884081a7] :ib_ipoib:ipoib_stop+0x64/0x12c
 [8039fc07] dev_close+0x3e/0x56
 [803a1c31] dev_change_flags+0xa7/0x15f
 [803e5bee] devinet_ioctl+0x293/0x5ed
 [803e775b] inet_ioctl+0x7f/0x9d
 [80395b2e] sock_ioctl+0x0/0x1fe
 [80395d08] sock_ioctl+0x1da/0x1fe
 [802947d9] do_ioctl+0x29/0x6f
 [80294a75] vfs_ioctl+0x256/0x267
 [80294adf] sys_ioctl+0x59/0x7a
 [8020bc0e] system_call+0x7e/0x83


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-10 Thread Moni Shoua
Jay Vosburgh wrote:
 David Miller [EMAIL PROTECTED] wrote:
 
 From: Jeff Garzik [EMAIL PROTECTED]
 Date: Tue, 09 Oct 2007 20:56:35 -0400

 Jeff Garzik wrote:
 applied patches 1-9

 the only thing that was a hiccup during submission is that your email 
 subject lines did not contain a notion of ordering [PATCH 1/9]  
 But other than that, the git-send-email went flawlessly.
 unfortunately it does not seem to build flawlessly:
 Yeah it doesn't handle Stephen Hemmingers headerops change
 in net-2.6.24
 
   Gaah.  I'll sort it out and repost.
 
   -J
 
 ---
   -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]

Hi Jay, Jeff
Thanks for the help with making the patch work compile under 2.6.24.
However, patch #3 has a missing line in bond_setup_by_slave that should look 
like this

bond_dev-header_ops= slave_dev-header_ops;

I rewrote the patch and also fixed patch #8 that became broken.

I would send the new patches now but there is more
I also ran a test for the code in the branch of 2.6.24 and found a problem.
I see that ifconfig down doesn't return (for IPoIB interfaces) and it's stuck 
in napi_disable() in the kernel (any idea why?)

I am trying to solve it now so I'd like to wait a short time before applying 
these patches. 
I guess that I'll need to add something.



thanks
   MoniS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver

2007-10-09 Thread Moni Shoua
Jay Vosburgh wrote:
 Jeff Garzik [EMAIL PROTECTED] wrote:
 
 Moni Shoua wrote:
 Jay Vosburgh wrote:
 ACK patches 3 - 9.

Roland, are you comfortable with the IB changes in patches 1 and 2?

Jeff, when Roland acks patches 1 and 2, please apply all 9.

-J
 Hi Jeff,
 Roland acked the IPoIB patches. If you haven't done so already can you 
 please apply them?
 I'm not sure when 2.6.24 is going to open and I'm afraid to miss it.
 hrm, I don't see them in my inbox for some reason.  can someone bounce
 them to me?  or give me a git tree to pull from?
 
   Moni, can you repost the patch series to Jeff, and put the
 appropriate Acked-by lines in for myself (patches 3 - 8) and Roland
 (patches 1 and 2)?  You can probably leave off the netdev and
 openfabrics lists, but cc me.
 
   -J
 
 ---
   -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
 
Hi Jeff,
I don't commits of the patches in 
http://git.kernel.org/?p=linux/kernel/git/jgarzik/netdev-2.6.git;a=summary
(I hope that I'm looking in the right place).

Did you get them?

thanks
MoniS


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver

2007-10-02 Thread Moni Shoua
Jay Vosburgh wrote:
 ACK patches 3 - 9.
 
   Roland, are you comfortable with the IB changes in patches 1 and 2?
 
   Jeff, when Roland acks patches 1 and 2, please apply all 9.
 
   -J

Hi Jeff,
Roland acked the IPoIB patches. If you haven't done so already can you please 
apply them?
I'm not sure when 2.6.24 is going to open and I'm afraid to miss it.

thanks 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] [PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver

2007-09-25 Thread Moni Shoua
Jay,
I think that all comments to the patches were discussed and handled.
If you agree, can you please push then to the networking tree so they will be
merged into 2.6.24?
This includes the IPoIB patches (agreed with Roland).
Note that there are *no* patches to net/core (like in V5).

thanks

 MoniS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister

2007-09-24 Thread Moni Shoua
Roland Dreier wrote:
   The action in bonding to a detach of slave is to unregister the master 
 (see patch 10).
   This can't be done from the context of unregister_netdevice itself (it is 
 protected by rtnl_lock).
 
 I'm confused.  Your patch has:
 
   +  ipoib_slave_detach(cpriv-dev);
  unregister_netdev(cpriv-dev);
 
 And ipoib_slave_detach() is:
 
   +static inline void ipoib_slave_detach(struct net_device *dev)
   +{
   +  rtnl_lock();
   +  netdev_slave_detach(dev);
   +  rtnl_unlock();
   +}
 
 so you are calling netdev_slave_detach() with the rtnl lock held.
 Why can't you make the same call from the start of unregister_netdevice()?
 
 Anyway, if the rtnl lock is a problem, can you just add the call to
 netdev_slave_detach() to unregister_netdev() before it takes the rtnl lock?
 
  - R.
 

Your comment made me do a little rethinking.
In bonding, device is released by calling unregister_netdevice() that doesn't 
take the rtnl_lock (unlike unregister_netdev() that does). I guess that this 
made me 
confused to think that this is not possible. So, I guess I could put 
the detach notification in unregister_netedev() and the reaction to the 
notification 
in the bonding driver would not block.
However, I looked one more time at the code of unregister_netdevice() and found 
out that
nothing prevents from calling unregister_netdevice() again when the 
notification NETDEV_GOING_DOWN
is sent. I tried that and it works.
I have a new set of patches without sending a slave detach and I will send it 
soon.

Thanks for the comment Roland. It makes this patch simpler.

I'd also like to give a credit to Jay for the idea of using NETDEV_GOING_DOWN 
notification
instead of NETDEV_CHANGE+IFF_SLAVE_DETACH. He suggested it a while ago but I 
wrongly thought that
it wouldn't work.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver

2007-09-24 Thread Moni Shoua
This patch series is the sixth version (see below link to V5) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

Patches 1-8 were originally submitted in V5 and patch 9 is an addition by Jay.


Major changes from the previous version:


1. Remove the patches to net/core. Bonding will use the NETDEV_GOING_DOWN 
notification
   instead of NETDEV_CHANGE+IFF_SLAVE_DETACH. This reduces the amount of 
patches from 11
   to 9.

Links to earlier discussion:


1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. V5 series
http://lists.openfabrics.org/pipermail/general/2007-September/040996.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 1/9] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-09-24 Thread Moni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb-dst-neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb-dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n-dev is an ipoib device and hence netdev_priv(n-dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n-dev-flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   24 +++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 ++-
 3 files changed, 20 insertions(+), 11 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h   2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h2007-09-18 
17:09:26.534874404 +0200
@@ -328,6 +328,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-09-18 
17:23:54.725744661 +0200
@@ -511,7 +511,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb-dst-neighbour);
+   neigh = ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
if (!neigh) {
++priv-stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -830,6 +830,13 @@ static void ipoib_neigh_cleanup(struct n
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
+   neigh = *to_ipoib_neigh(n);
+   if (neigh) {
+   priv = netdev_priv(neigh-dev);
+   ipoib_dbg(priv, neigh_destructor for bonding device: %s\n,
+ n-dev-name);
+   } else
+   return;
ipoib_dbg(priv,
  neigh_cleanup for %06x  IPOIB_GID_FMT \n,
  IPOIB_QPN(n-ha),
@@ -837,13 +844,10 @@ static void ipoib_neigh_cleanup(struct n
 
spin_lock_irqsave(priv-lock, flags);
 
-   neigh = *to_ipoib_neigh(n);
-   if (neigh) {
-   if (neigh-ah)
-   ah = neigh-ah;
-   list_del(neigh-list);
-   ipoib_neigh_free(n-dev, neigh);
-   }
+   if (neigh-ah)
+   ah = neigh-ah;
+   list_del(neigh-list);
+   ipoib_neigh_free(n-dev, neigh);
 
spin_unlock_irqrestore(priv-lock, flags);
 
@@ -851,7 +855,8 @@ static void ipoib_neigh_cleanup(struct n
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -860,6 +865,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh-neighbour = neighbour;
+   neigh-dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(neigh-queue);
ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c  2007-09-18 
17:09:26.536874045 +0200
@@ -727,7 +727,8 @@ out:
if (skb-dst
skb-dst-neighbour 
!*to_ipoib_neigh(skb-dst-neighbour

[PATCH V6 2/9] IB/ipoib: Verify address handle validity on send

2007-09-24 Thread Moni Shoua
When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-09-18 
17:09:26.535874225 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-09-18 
17:10:22.375853147 +0200
@@ -686,9 +686,10 @@ static int ipoib_start_xmit(struct sk_bu
goto out;
}
} else if (neigh-ah) {
-   if (unlikely(memcmp(neigh-dgid.raw,
+   if (unlikely((memcmp(neigh-dgid.raw,
skb-dst-neighbour-ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid))) ||
+(neigh-dev != dev))) {
spin_lock(priv-lock);
/*
 * It's safe to call ipoib_put_ah() inside

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 3/9] net/bonding: Enable bonding to enslave non ARPHRD_ETHER

2007-09-24 Thread Moni Shoua
This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by 
ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c |   39 +++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:08:59.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:13.424688411 
+0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+   struct net_device *slave_dev)
+{
+   bond_dev-hard_header   = slave_dev-hard_header;
+   bond_dev-rebuild_header= slave_dev-rebuild_header;
+   bond_dev-hard_header_cache = slave_dev-hard_header_cache;
+   bond_dev-header_cache_update   = slave_dev-header_cache_update;
+   bond_dev-hard_header_parse = slave_dev-hard_header_parse;
+
+   bond_dev-neigh_setup   = slave_dev-neigh_setup;
+
+   bond_dev-type  = slave_dev-type;
+   bond_dev-hard_header_len   = slave_dev-hard_header_len;
+   bond_dev-addr_len  = slave_dev-addr_len;
+
+   memcpy(bond_dev-broadcast, slave_dev-broadcast,
+   slave_dev-addr_len);
+}
+
 /* enslave device slave to bond device master */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
goto err_undo_flags;
}
 
+   /* set bonding device ether type by slave - bonding netdevices are
+* created with ether_setup, so when the slave type is not ARPHRD_ETHER
+* there is a need to override some of the type dependent attribs/funcs.
+*
+* bond ether type mutual exclusion - don't allow slaves of dissimilar
+* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same 
bond
+*/
+   if (bond-slave_cnt == 0) {
+   if (slave_dev-type != ARPHRD_ETHER)
+   bond_setup_by_slave(bond_dev, slave_dev);
+   } else if (bond_dev-type != slave_dev-type) {
+   printk(KERN_ERR DRV_NAME : %s ether type (%d) is different 
+   from other slaves (%d), can not enslave it.\n,
+   slave_dev-name,
+   slave_dev-type, bond_dev-type);
+   res = -EINVAL;
+   goto err_undo_flags;
+   }
+
if (slave_dev-set_mac_address == NULL) {
printk(KERN_ERR DRV_NAME
: %s: Error: The slave device you specified does 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 4/9] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()

2007-09-24 Thread Moni Shoua
This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c |   87 +++-
 drivers/net/bonding/bonding.h   |1 
 2 files changed, 60 insertions(+), 28 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:13.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:41.971632881 
+0300
@@ -1095,6 +1095,14 @@ void bond_change_active_slave(struct bon
if (new_active) {
bond_set_slave_active_flags(new_active);
}
+
+   /* when bonding does not set the slave MAC address, the bond MAC
+* address is the one of the active slave.
+*/
+   if (new_active  !bond-do_set_mac_addr)
+   memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
+   new_active-dev-addr_len);
+
bond_send_gratuitous_arp(bond);
}
 }
@@ -1351,13 +1359,22 @@ int bond_enslave(struct net_device *bond
}
 
if (slave_dev-set_mac_address == NULL) {
-   printk(KERN_ERR DRV_NAME
-   : %s: Error: The slave device you specified does 
-   not support setting the MAC address. 
-   Your kernel likely does not support slave 
-   devices.\n, bond_dev-name);
-   res = -EOPNOTSUPP;
-   goto err_undo_flags;
+   if (bond-slave_cnt == 0) {
+   printk(KERN_WARNING DRV_NAME
+   : %s: Warning: The first slave device you 
+   specified does not support setting the MAC 
+   address. This bond MAC address would be that 
+   of the active slave.\n, bond_dev-name);
+   bond-do_set_mac_addr = 0;
+   } else if (bond-do_set_mac_addr) {
+   printk(KERN_ERR DRV_NAME
+   : %s: Error: The slave device you specified 
+   does not support setting the MAC addres,.
+   but this bond uses this practice. \n
+   , bond_dev-name);
+   res = -EOPNOTSUPP;
+   goto err_undo_flags;
+   }
}
 
new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1378,16 +1395,18 @@ int bond_enslave(struct net_device *bond
 */
memcpy(new_slave-perm_hwaddr, slave_dev-dev_addr, ETH_ALEN);
 
-   /*
-* Set slave to master's mac address.  The application already
-* set the master's mac address to that of the first slave
-*/
-   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
-   addr.sa_family = slave_dev-type;
-   res = dev_set_mac_address(slave_dev, addr);
-   if (res) {
-   dprintk(Error %d calling set_mac_address\n, res);
-   goto err_free;
+   if (bond-do_set_mac_addr) {
+   /*
+* Set slave to master's mac address.  The application already
+* set the master's mac address to that of the first slave
+*/
+   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
+   addr.sa_family = slave_dev-type;
+   res = dev_set_mac_address(slave_dev, addr);
+   if (res) {
+   dprintk(Error %d calling set_mac_address\n, res);
+   goto err_free;
+   }
}
 
res = netdev_set_master(slave_dev, bond_dev);
@@ -1612,9 +1631,11 @@ err_close:
dev_close(slave_dev);
 
 err_restore_mac:
-   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family = slave_dev-type;
-   dev_set_mac_address(slave_dev, addr);
+   if (bond-do_set_mac_addr) {
+   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
+   addr.sa_family = slave_dev-type;
+   dev_set_mac_address(slave_dev, addr);
+   }
 
 err_free:
kfree(new_slave);
@@ -1792,10 +1813,12 @@ int bond_release(struct net_device *bond
/* close slave before restoring its mac address */
dev_close(slave_dev);
 
-   /* restore original (permanent) mac address */
-   memcpy(addr.sa_data, slave-perm_hwaddr, ETH_ALEN

[PATCH V6 5/9] net/bonding: Enable IP multicast for bonding IPoIB devices

2007-09-24 Thread Moni Shoua
Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast 
groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding 
device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another 
ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c  |5 +++--
 drivers/net/bonding/bond_sysfs.c |6 ++
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:41.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:55:48.431862446 
+0300
@@ -1285,8 +1285,9 @@ int bond_enslave(struct net_device *bond
 
/* bond must be initialized by bond_open() before enslaving */
if (!(bond_dev-flags  IFF_UP)) {
-   dprintk(Error, master_dev is not up\n);
-   return -EPERM;
+   printk(KERN_WARNING DRV_NAME
+%s: master_dev is not up in bond_enslave\n,
+   bond_dev-name);
}
 
/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-08-15 
10:08:58.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-08-15 10:55:48.432862269 
+0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
/* Quick sanity check -- is the bond interface up? */
if (!(bond-dev-flags  IFF_UP)) {
-   printk(KERN_ERR DRV_NAME
-  : %s: Unable to update slaves because interface is 
down.\n,
+   printk(KERN_WARNING DRV_NAME
+  : %s: doing slave updates when interface is down.\n,
   bond-dev-name);
-   ret = -EPERM;
-   goto out;
}
 
/* Note:  We can't hold bond-lock here, as bond_create grabs it. */

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 6/9] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-09-24 Thread Moni Shoua
bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c  |3 ++-
 drivers/net/bonding/bond_sysfs.c |   10 --
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-09-24 
12:52:33.0 +0200
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-09-24 12:57:33.411459811 
+0200
@@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
struct slave *slave;
struct net_device *bond_dev = bond-dev;
unsigned long features = bond_dev-features;
-   unsigned short max_hard_header_len = ETH_HLEN;
+   unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+   bond_dev-hard_header_len);
int i;
 
features = ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-09-24 
12:55:09.0 +0200
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-09-24 13:00:23.752680721 
+0200
@@ -260,6 +260,7 @@ static ssize_t bonding_store_slaves(stru
char command[IFNAMSIZ + 1] = { 0, };
char *ifname;
int i, res, found, ret = count;
+   u32 original_mtu;
struct slave *slave;
struct net_device *dev = NULL;
struct bonding *bond = to_bond(d);
@@ -325,6 +326,7 @@ static ssize_t bonding_store_slaves(stru
}
 
/* Set the slave's MTU to match the bond */
+   original_mtu = dev-mtu;
if (dev-mtu != bond-dev-mtu) {
if (dev-change_mtu) {
res = dev-change_mtu(dev,
@@ -339,6 +341,9 @@ static ssize_t bonding_store_slaves(stru
}
rtnl_lock();
res = bond_enslave(bond-dev, dev);
+   bond_for_each_slave(bond, slave, i)
+   if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0)
+   slave-original_mtu = original_mtu;
rtnl_unlock();
if (res) {
ret = res;
@@ -351,6 +356,7 @@ static ssize_t bonding_store_slaves(stru
bond_for_each_slave(bond, slave, i)
if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0) {
dev = slave-dev;
+   original_mtu = slave-original_mtu;
break;
}
if (dev) {
@@ -365,9 +371,9 @@ static ssize_t bonding_store_slaves(stru
}
/* set the slave MTU to the default */
if (dev-change_mtu) {
-   dev-change_mtu(dev, 1500);
+   dev-change_mtu(dev, original_mtu);
} else {
-   dev-mtu = 1500;
+   dev-mtu = original_mtu;
}
}
else {
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-09-24 12:55:09.0 
+0200
+++ net-2.6/drivers/net/bonding/bonding.h   2007-09-24 12:57:33.412459636 
+0200
@@ -156,6 +156,7 @@ struct slave {
s8 link;/* one of BOND_LINK_ */
s8 state;   /* one of BOND_STATE_ */
u32original_flags;
+   u32original_mtu;
u32link_failure_count;
u16speed;
u8 duplex;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


PATCH V6 7/9] net/bonding: Delay sending of gratuitous ARP to avoid failure

2007-09-24 Thread Moni Shoua
Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev-state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c |   24 +---
 drivers/net/bonding/bonding.h   |1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:56:33.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 11:04:37.221123652 
+0300
@@ -1102,8 +1102,14 @@ void bond_change_active_slave(struct bon
if (new_active  !bond-do_set_mac_addr)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
-
-   bond_send_gratuitous_arp(bond);
+   if (bond-curr_active_slave 
+   test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state)) {
+   dprintk(delaying gratuitous arp on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond-send_grat_arp = 1;
+   } else
+   bond_send_gratuitous_arp(bond);
}
 }
 
@@ -2083,6 +2089,17 @@ void bond_mii_monitor(struct net_device 
 * program could monitor the link itself if needed.
 */
 
+   if (bond-send_grat_arp) {
+   if (bond-curr_active_slave  
test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state))
+   dprintk(Needs to send gratuitous arp but not yet\n);
+   else {
+   dprintk(sending delayed gratuitous arp on on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond_send_gratuitous_arp(bond);
+   bond-send_grat_arp = 0;
+   }
+   }
read_lock(bond-curr_slave_lock);
oldcurrent = bond-curr_active_slave;
read_unlock(bond-curr_slave_lock);
@@ -2484,7 +2501,7 @@ static void bond_send_gratuitous_arp(str
 
if (bond-master_ip) {
bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
- bond-master_ip, 0);
+   bond-master_ip, 0);
}
 
list_for_each_entry(vlan, bond-vlan_list, vlan_list) {
@@ -4293,6 +4310,7 @@ static int bond_init(struct net_device *
bond-current_arp_slave = NULL;
bond-primary_slave = NULL;
bond-dev = bond_dev;
+   bond-send_grat_arp = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-15 10:56:33.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-15 11:05:41.516451497 
+0300
@@ -187,6 +187,7 @@ struct bonding {
struct   timer_list arp_timer;
s8   kill_timers;
s8   do_set_mac_addr;
+   s8   send_grat_arp;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V6 8/9] net/bonding: Destroy bonding master when last slave is gone

2007-09-24 Thread Moni Shoua
When bonding enslaves non Ethernet devices it takes pointers to functions 
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved 
because we don't know if the pointers are still valid.  Destroying the bond 
when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c  |   37 +
 drivers/net/bonding/bond_sysfs.c |9 +
 drivers/net/bonding/bonding.h|3 +++
 3 files changed, 45 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-09-24 
14:01:24.055441842 +0200
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-09-24 14:05:05.658979207 
+0200
@@ -1256,6 +1256,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
 {
+   struct bonding *bond = bond_dev-priv;
bond_dev-hard_header   = slave_dev-hard_header;
bond_dev-rebuild_header= slave_dev-rebuild_header;
bond_dev-hard_header_cache = slave_dev-hard_header_cache;
@@ -1270,6 +1271,7 @@ static void bond_setup_by_slave(struct n
 
memcpy(bond_dev-broadcast, slave_dev-broadcast,
slave_dev-addr_len);
+   bond-setup_by_slave = 1;
 }
 
 /* enslave device slave to bond device master */
@@ -1838,6 +1840,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+   bond_deinit(bond-dev);
+   bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond-dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device 
*slave_dev)
+{
+   struct bonding *bond = bond_dev-priv;
+   int ret;
+
+   ret = bond_release(bond_dev, slave_dev);
+   if ((ret == 0)  (bond-slave_cnt == 0)) {
+   printk(KERN_INFO DRV_NAME  %s: destroying bond %s.\n,
+   bond_dev-name);
+   bond_destroy(bond);
+   }
+   return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3337,6 +3368,11 @@ static int bond_slave_netdev_event(unsig
 * ... Or is it this?
 */
break;
+   case NETDEV_GOING_DOWN:
+   dprintk(slave %s is going down\n, slave_dev-name);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   break;
case NETDEV_CHANGEMTU:
/*
 * TODO: Should slaves be allowed to
@@ -4311,6 +4347,7 @@ static int bond_init(struct net_device *
bond-primary_slave = NULL;
bond-dev = bond_dev;
bond-send_grat_arp = 0;
+   bond-setup_by_slave = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-09-24 14:01:24.055441842 
+0200
+++ net-2.6/drivers/net/bonding/bonding.h   2007-09-24 14:01:24.627340013 
+0200
@@ -188,6 +188,7 @@ struct bonding {
s8   kill_timers;
s8   do_set_mac_addr;
s8   send_grat_arp;
+   s8   setup_by_slave;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;
@@ -295,6 +296,8 @@ static inline void bond_unset_master_alb
 struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry 
*curr);
 int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct 
net_device *slave_dev);
 int bond_create(char *name, struct bond_params *params, struct bonding 
**newbond);
+void bond_destroy(struct bonding *bond);
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device 
*slave_dev);
 void bond_deinit(struct net_device *bond_dev);
 int bond_create_sysfs(void);
 void bond_destroy_sysfs(void);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-09-24 
14:01:23.523536550 +0200
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-09-24 14:01:24.628339835 
+0200
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
printk(KERN_INFO DRV_NAME
: %s

[PATCH 9/9] bonding: Optionally allow ethernet slaves to keep own MAC

2007-09-24 Thread Moni Shoua
Update the don't change MAC of slaves functionality added in
previous changes to be a generic option, rather than something tied to IB
devices, as it's occasionally useful for regular ethernet devices as well.

Adds fail_over_mac option (which is automatically enabled for IB
slaves), applicable only to active-backup mode.

Includes documentation update.

Updates bonding driver version to 3.2.0.

Signed-off-by: Jay Vosburgh [EMAIL PROTECTED]
---
 Documentation/networking/bonding.txt |   33 +++
 drivers/net/bonding/bond_main.c  |   57 +
 drivers/net/bonding/bond_sysfs.c |   49 +
 drivers/net/bonding/bonding.h|6 ++--
 4 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/Documentation/networking/bonding.txt 
b/Documentation/networking/bonding.txt
index 1da5666..1134062 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -281,6 +281,39 @@ downdelay
will be rounded down to the nearest multiple.  The default
value is 0.
 
+fail_over_mac
+
+   Specifies whether active-backup mode should set all slaves to
+   the same MAC address (the traditional behavior), or, when
+   enabled, change the bond's MAC address when changing the
+   active interface (i.e., fail over the MAC address itself).
+
+   Fail over MAC is useful for devices that cannot ever alter
+   their MAC address, or for devices that refuse incoming
+   broadcasts with their own source MAC (which interferes with
+   the ARP monitor).
+
+   The down side of fail over MAC is that every device on the
+   network must be updated via gratuitous ARP, vs. just updating
+   a switch or set of switches (which often takes place for any
+   traffic, not just ARP traffic, if the switch snoops incoming
+   traffic to update its tables) for the traditional method.  If
+   the gratuitous ARP is lost, communication may be disrupted.
+
+   When fail over MAC is used in conjuction with the mii monitor,
+   devices which assert link up prior to being able to actually
+   transmit and receive are particularly susecptible to loss of
+   the gratuitous ARP, and an appropriate updelay setting may be
+   required.
+
+   A value of 0 disables fail over MAC, and is the default.  A
+   value of 1 enables fail over MAC.  This option is enabled
+   automatically if the first slave added cannot change its MAC
+   address.  This option may be modified via sysfs only when no
+   slaves are present in the bond.
+
+   This option was added in bonding version 3.2.0.
+
 lacp_rate
 
Option specifying the rate in which we'll ask our link partner
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77caca3..c01ff9d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -97,6 +97,7 @@ static char *xmit_hash_policy = NULL;
 static int arp_interval = BOND_LINK_ARP_INTERV;
 static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, };
 static char *arp_validate = NULL;
+static int fail_over_mac = 0;
 struct bond_params bonding_defaults;
 
 module_param(max_bonds, int, 0);
@@ -130,6 +131,8 @@ module_param_array(arp_ip_target, charp, NULL, 0);
 MODULE_PARM_DESC(arp_ip_target, arp targets in n.n.n.n form);
 module_param(arp_validate, charp, 0);
 MODULE_PARM_DESC(arp_validate, validate src/dst of ARP probes: none 
(default), active, backup or all);
+module_param(fail_over_mac, int, 0);
+MODULE_PARM_DESC(fail_over_mac, For active-backup, do not set all slaves to 
the same MAC.  0 of off (default), 1 for on.);
 
 /*- Global variables */
 
@@ -1099,7 +1102,7 @@ void bond_change_active_slave(struct bonding *bond, 
struct slave *new_active)
/* when bonding does not set the slave MAC address, the bond MAC
 * address is the one of the active slave.
 */
-   if (new_active  !bond-do_set_mac_addr)
+   if (new_active  bond-params.fail_over_mac)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
if (bond-curr_active_slave 
@@ -1371,16 +1374,16 @@ int bond_enslave(struct net_device *bond_dev, struct 
net_device *slave_dev)
if (slave_dev-set_mac_address == NULL) {
if (bond-slave_cnt == 0) {
printk(KERN_WARNING DRV_NAME
-   : %s: Warning: The first slave device you 
-   specified does not support setting the MAC 
-   address. This bond MAC address would be that 
-   of the active slave.\n, bond_dev-name);
-   bond-do_set_mac_addr = 0;
-   } else 

Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister

2007-09-23 Thread Moni Shoua
Roland Dreier wrote:
   +  ipoib_slave_detach(cpriv-dev);
  unregister_netdev(cpriv-dev);
 
 Maybe you already answered this before, but I'm still not clear why
 this notifier call can't just be added to the start of
 unregister_netdevice(), so we can avoid having driver needing to know
 anything about bonding internals?
 
  - R.

The action in bonding to a detach of slave is to unregister the master (see 
patch 10).
This can't be done from the context of unregister_netdevice itself (it is 
protected by rtnl_lock).

That's why I had to notify the detach before unregister begins.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 1/11] net/core: add a netdev notification for slave detach

2007-09-20 Thread Moni Shoua
A slave of a bonding master that wants to send a notification before
going down should call netdev_slave_detach(). The handling of this notification
will be done outside the context of unregister_netdevice() which is sometimes
necessary, as with IPoIB slave for example.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 include/linux/if.h |1 +
 net/core/dev.c |   20 
 2 files changed, 21 insertions(+)

Index: net-2.6/net/core/dev.c
===
--- net-2.6.orig/net/core/dev.c 2007-09-20 08:04:47.164051688 +0200
+++ net-2.6/net/core/dev.c  2007-09-20 09:20:21.493060579 +0200
@@ -2588,6 +2588,25 @@ int netdev_set_master(struct net_device 
return 0;
 }
 
+/**
+ * netdev_slave_detach -   notify that slave is about to detach 
from master
+ * @slave: slave device
+ *
+ * Raise a flag that slave is about to detach from master
+ * and notify the netdev  chain.
+ * The caller must hold the rtnl_mutex.
+ */
+
+int netdev_slave_detach(struct net_device *slave)
+{
+   int ret = 0;
+   if (slave-flags  IFF_SLAVE) {
+   slave-priv_flags |= IFF_SLAVE_DETACH;
+   ret = call_netdevice_notifiers(NETDEV_CHANGE, slave);
+   }
+   return ret;
+}
+
 static void __dev_set_promiscuity(struct net_device *dev, int inc)
 {
unsigned short old_flags = dev-flags;
@@ -4120,6 +4139,7 @@ EXPORT_SYMBOL(dev_set_mac_address);
 EXPORT_SYMBOL(free_netdev);
 EXPORT_SYMBOL(netdev_boot_setup_check);
 EXPORT_SYMBOL(netdev_set_master);
+EXPORT_SYMBOL(netdev_slave_detach);
 EXPORT_SYMBOL(netdev_state_change);
 EXPORT_SYMBOL(netif_receive_skb);
 EXPORT_SYMBOL(netif_rx);
Index: net-2.6/include/linux/if.h
===
--- net-2.6.orig/include/linux/if.h 2007-09-20 08:04:47.164051688 +0200
+++ net-2.6/include/linux/if.h  2007-09-20 08:15:29.577729301 +0200
@@ -61,6 +61,7 @@
 #define IFF_MASTER_ALB 0x10/* bonding master, balance-alb. */
 #define IFF_BONDING0x20/* bonding master or slave  */
 #define IFF_SLAVE_NEEDARP 0x40 /* need ARPs for validation */
+#define IFF_SLAVE_DETACH 0x80  /* slave is about to unregister */
 
 #define IF_GET_IFACE   0x0001  /* for querying only */
 #define IF_GET_PROTO   0x0002

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister

2007-09-20 Thread Moni Shoua
When the bonding device enslaves IPoIB devices it takes pointers to
functions in the ib_ipoib module. This is fine as long as the ib_ipoib
nodule remains loaded while the references to its functions exist.
So, to help bonding do a cleanup on time, when the IPoIB net device is a 
slave of a bonding master, let the master know that the IPoIB device is
about to unregister (but before calling unregister).

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/infiniband/ulp/ipoib/ipoib.h  |7 +++
 drivers/infiniband/ulp/ipoib/ipoib_main.c |3 +++
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c |1 +
 3 files changed, 11 insertions(+)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-09-20 
08:35:34.0 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-09-20 
14:20:16.495147879 +0200
@@ -48,6 +48,7 @@
 #include linux/in.h
 
 #include net/dst.h
+#include linux/netdevice.h
 
 MODULE_AUTHOR(Roland Dreier);
 MODULE_DESCRIPTION(IP-over-InfiniBand net driver);
@@ -921,6 +922,7 @@ void ipoib_dev_cleanup(struct net_device
 
/* Delete any child interfaces first */
list_for_each_entry_safe(cpriv, tcpriv, priv-child_intfs, list) {
+   ipoib_slave_detach(cpriv-dev);
unregister_netdev(cpriv-dev);
ipoib_dev_cleanup(cpriv-dev);
free_netdev(cpriv-dev);
@@ -1208,6 +1210,7 @@ static void ipoib_remove_one(struct ib_d
ib_unregister_event_handler(priv-event_handler);
flush_scheduled_work();
 
+   ipoib_slave_detach(priv-dev);
unregister_netdev(priv-dev);
ipoib_dev_cleanup(priv-dev);
free_netdev(priv-dev);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_vlan.c  2007-09-20 
09:26:11.0 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_vlan.c   2007-09-20 
09:27:20.182709679 +0200
@@ -157,6 +157,7 @@ int ipoib_vlan_delete(struct net_device 
mutex_lock(ppriv-vlan_mutex);
list_for_each_entry_safe(priv, tpriv, ppriv-child_intfs, list) {
if (priv-pkey == pkey) {
+   ipoib_slave_detach(priv-dev);
unregister_netdev(priv-dev);
ipoib_dev_cleanup(priv-dev);
list_del(priv-list);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h   2007-09-20 
12:18:56.0 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h2007-09-20 
14:21:47.385972207 +0200
@@ -570,6 +570,13 @@ static inline void ipoib_cm_handle_rx_wc
 
 #endif
 
+static inline void ipoib_slave_detach(struct net_device *dev)
+{
+   rtnl_lock();
+   netdev_slave_detach(dev);
+   rtnl_unlock();
+}
+
 #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
 void ipoib_create_debug_files(struct net_device *dev);
 void ipoib_delete_debug_files(struct net_device *dev);

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 3/11] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-09-20 Thread Moni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb-dst-neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb-dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n-dev is an ipoib device and hence netdev_priv(n-dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n-dev-flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   24 +++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 ++-
 3 files changed, 20 insertions(+), 11 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h   2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h2007-09-18 
17:09:26.534874404 +0200
@@ -328,6 +328,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-09-18 
17:23:54.725744661 +0200
@@ -511,7 +511,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb-dst-neighbour);
+   neigh = ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
if (!neigh) {
++priv-stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -830,6 +830,13 @@ static void ipoib_neigh_cleanup(struct n
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
+   neigh = *to_ipoib_neigh(n);
+   if (neigh) {
+   priv = netdev_priv(neigh-dev);
+   ipoib_dbg(priv, neigh_destructor for bonding device: %s\n,
+ n-dev-name);
+   } else
+   return;
ipoib_dbg(priv,
  neigh_cleanup for %06x  IPOIB_GID_FMT \n,
  IPOIB_QPN(n-ha),
@@ -837,13 +844,10 @@ static void ipoib_neigh_cleanup(struct n
 
spin_lock_irqsave(priv-lock, flags);
 
-   neigh = *to_ipoib_neigh(n);
-   if (neigh) {
-   if (neigh-ah)
-   ah = neigh-ah;
-   list_del(neigh-list);
-   ipoib_neigh_free(n-dev, neigh);
-   }
+   if (neigh-ah)
+   ah = neigh-ah;
+   list_del(neigh-list);
+   ipoib_neigh_free(n-dev, neigh);
 
spin_unlock_irqrestore(priv-lock, flags);
 
@@ -851,7 +855,8 @@ static void ipoib_neigh_cleanup(struct n
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -860,6 +865,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh-neighbour = neighbour;
+   neigh-dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(neigh-queue);
ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2007-09-18 
17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c  2007-09-18 
17:09:26.536874045 +0200
@@ -727,7 +727,8 @@ out:
if (skb-dst
skb-dst-neighbour 
!*to_ipoib_neigh(skb-dst-neighbour

[PATCH V5 4/11] IB/ipoib: Verify address handle validity on send

2007-09-20 Thread Moni Shoua
When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-09-18 
17:09:26.535874225 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-09-18 
17:10:22.375853147 +0200
@@ -686,9 +686,10 @@ static int ipoib_start_xmit(struct sk_bu
goto out;
}
} else if (neigh-ah) {
-   if (unlikely(memcmp(neigh-dgid.raw,
+   if (unlikely((memcmp(neigh-dgid.raw,
skb-dst-neighbour-ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid))) ||
+(neigh-dev != dev))) {
spin_lock(priv-lock);
/*
 * It's safe to call ipoib_put_ah() inside

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER

2007-09-20 Thread Moni Shoua
This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by 
ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c |   39 +++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:08:59.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:13.424688411 
+0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+   struct net_device *slave_dev)
+{
+   bond_dev-hard_header   = slave_dev-hard_header;
+   bond_dev-rebuild_header= slave_dev-rebuild_header;
+   bond_dev-hard_header_cache = slave_dev-hard_header_cache;
+   bond_dev-header_cache_update   = slave_dev-header_cache_update;
+   bond_dev-hard_header_parse = slave_dev-hard_header_parse;
+
+   bond_dev-neigh_setup   = slave_dev-neigh_setup;
+
+   bond_dev-type  = slave_dev-type;
+   bond_dev-hard_header_len   = slave_dev-hard_header_len;
+   bond_dev-addr_len  = slave_dev-addr_len;
+
+   memcpy(bond_dev-broadcast, slave_dev-broadcast,
+   slave_dev-addr_len);
+}
+
 /* enslave device slave to bond device master */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
goto err_undo_flags;
}
 
+   /* set bonding device ether type by slave - bonding netdevices are
+* created with ether_setup, so when the slave type is not ARPHRD_ETHER
+* there is a need to override some of the type dependent attribs/funcs.
+*
+* bond ether type mutual exclusion - don't allow slaves of dissimilar
+* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same 
bond
+*/
+   if (bond-slave_cnt == 0) {
+   if (slave_dev-type != ARPHRD_ETHER)
+   bond_setup_by_slave(bond_dev, slave_dev);
+   } else if (bond_dev-type != slave_dev-type) {
+   printk(KERN_ERR DRV_NAME : %s ether type (%d) is different 
+   from other slaves (%d), can not enslave it.\n,
+   slave_dev-name,
+   slave_dev-type, bond_dev-type);
+   res = -EINVAL;
+   goto err_undo_flags;
+   }
+
if (slave_dev-set_mac_address == NULL) {
printk(KERN_ERR DRV_NAME
: %s: Error: The slave device you specified does 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 6/11] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()

2007-09-20 Thread Moni Shoua
This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c |   87 +++-
 drivers/net/bonding/bonding.h   |1 
 2 files changed, 60 insertions(+), 28 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:13.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:41.971632881 
+0300
@@ -1095,6 +1095,14 @@ void bond_change_active_slave(struct bon
if (new_active) {
bond_set_slave_active_flags(new_active);
}
+
+   /* when bonding does not set the slave MAC address, the bond MAC
+* address is the one of the active slave.
+*/
+   if (new_active  !bond-do_set_mac_addr)
+   memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
+   new_active-dev-addr_len);
+
bond_send_gratuitous_arp(bond);
}
 }
@@ -1351,13 +1359,22 @@ int bond_enslave(struct net_device *bond
}
 
if (slave_dev-set_mac_address == NULL) {
-   printk(KERN_ERR DRV_NAME
-   : %s: Error: The slave device you specified does 
-   not support setting the MAC address. 
-   Your kernel likely does not support slave 
-   devices.\n, bond_dev-name);
-   res = -EOPNOTSUPP;
-   goto err_undo_flags;
+   if (bond-slave_cnt == 0) {
+   printk(KERN_WARNING DRV_NAME
+   : %s: Warning: The first slave device you 
+   specified does not support setting the MAC 
+   address. This bond MAC address would be that 
+   of the active slave.\n, bond_dev-name);
+   bond-do_set_mac_addr = 0;
+   } else if (bond-do_set_mac_addr) {
+   printk(KERN_ERR DRV_NAME
+   : %s: Error: The slave device you specified 
+   does not support setting the MAC addres,.
+   but this bond uses this practice. \n
+   , bond_dev-name);
+   res = -EOPNOTSUPP;
+   goto err_undo_flags;
+   }
}
 
new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1378,16 +1395,18 @@ int bond_enslave(struct net_device *bond
 */
memcpy(new_slave-perm_hwaddr, slave_dev-dev_addr, ETH_ALEN);
 
-   /*
-* Set slave to master's mac address.  The application already
-* set the master's mac address to that of the first slave
-*/
-   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
-   addr.sa_family = slave_dev-type;
-   res = dev_set_mac_address(slave_dev, addr);
-   if (res) {
-   dprintk(Error %d calling set_mac_address\n, res);
-   goto err_free;
+   if (bond-do_set_mac_addr) {
+   /*
+* Set slave to master's mac address.  The application already
+* set the master's mac address to that of the first slave
+*/
+   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
+   addr.sa_family = slave_dev-type;
+   res = dev_set_mac_address(slave_dev, addr);
+   if (res) {
+   dprintk(Error %d calling set_mac_address\n, res);
+   goto err_free;
+   }
}
 
res = netdev_set_master(slave_dev, bond_dev);
@@ -1612,9 +1631,11 @@ err_close:
dev_close(slave_dev);
 
 err_restore_mac:
-   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family = slave_dev-type;
-   dev_set_mac_address(slave_dev, addr);
+   if (bond-do_set_mac_addr) {
+   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
+   addr.sa_family = slave_dev-type;
+   dev_set_mac_address(slave_dev, addr);
+   }
 
 err_free:
kfree(new_slave);
@@ -1792,10 +1813,12 @@ int bond_release(struct net_device *bond
/* close slave before restoring its mac address */
dev_close(slave_dev);
 
-   /* restore original (permanent) mac address */
-   memcpy(addr.sa_data, slave-perm_hwaddr, ETH_ALEN

[PATCH V5 7/11] net/bonding: Enable IP multicast for bonding IPoIB devices

2007-09-20 Thread Moni Shoua
Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast 
groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding 
device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another 
ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c  |5 +++--
 drivers/net/bonding/bond_sysfs.c |6 ++
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:41.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:55:48.431862446 
+0300
@@ -1285,8 +1285,9 @@ int bond_enslave(struct net_device *bond
 
/* bond must be initialized by bond_open() before enslaving */
if (!(bond_dev-flags  IFF_UP)) {
-   dprintk(Error, master_dev is not up\n);
-   return -EPERM;
+   printk(KERN_WARNING DRV_NAME
+%s: master_dev is not up in bond_enslave\n,
+   bond_dev-name);
}
 
/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-08-15 
10:08:58.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-08-15 10:55:48.432862269 
+0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
/* Quick sanity check -- is the bond interface up? */
if (!(bond-dev-flags  IFF_UP)) {
-   printk(KERN_ERR DRV_NAME
-  : %s: Unable to update slaves because interface is 
down.\n,
+   printk(KERN_WARNING DRV_NAME
+  : %s: doing slave updates when interface is down.\n,
   bond-dev-name);
-   ret = -EPERM;
-   goto out;
}
 
/* Note:  We can't hold bond-lock here, as bond_create grabs it. */

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver

2007-09-20 Thread Moni Shoua
This patch series is the fifth version (see below link to V4) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

Patches 1-10 were originally submitted in V4 and patch 11 is an addition by Jay.

Jay,
The bonding patches you acked remain unchanged while I guess I sitll need
to get an official ack by Roland for the IPoIB patches.
Is it OK with you to push the entire series to the networking tree?
Roland has already agreed to do so.


Major changes from the previous version:

1. Style changes
2. IPoIB - notify slave detach on vlan delete
3. Add function to net/core for slave detach instead of having it only in
   ib/ipoib
4. IPoIB - handle ib device and bonding device the same way in neigh_cleanup
   function

Links to earlier discussion:


1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. V4 series
http://lists.openfabrics.org/pipermail/general/2007-August/039825.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 8/11] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-09-20 Thread Moni Shoua
bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c  |3 ++-
 drivers/net/bonding/bond_sysfs.c |   19 +--
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:55:48.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-20 14:29:11.911298577 
+0300
@@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
struct slave *slave;
struct net_device *bond_dev = bond-dev;
unsigned long features = bond_dev-features;
-   unsigned short max_hard_header_len = ETH_HLEN;
+   unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+   bond_dev-hard_header_len);
int i;
 
features = ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-08-15 
10:55:48.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-08-15 12:14:41.152469089 
+0300
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
printk(KERN_INFO DRV_NAME
: %s is being deleted...\n,
bond-dev-name);
-   bond_deinit(bond-dev);
-   bond_destroy_sysfs_entry(bond);
-   unregister_netdevice(bond-dev);
+   bond_destroy(bond);
rtnl_unlock();
goto out;
}
@@ -260,6 +258,7 @@ static ssize_t bonding_store_slaves(stru
char command[IFNAMSIZ + 1] = { 0, };
char *ifname;
int i, res, found, ret = count;
+   u32 original_mtu;
struct slave *slave;
struct net_device *dev = NULL;
struct bonding *bond = to_bond(d);
@@ -325,6 +324,7 @@ static ssize_t bonding_store_slaves(stru
}
 
/* Set the slave's MTU to match the bond */
+   original_mtu = dev-mtu;
if (dev-mtu != bond-dev-mtu) {
if (dev-change_mtu) {
res = dev-change_mtu(dev,
@@ -339,6 +339,9 @@ static ssize_t bonding_store_slaves(stru
}
rtnl_lock();
res = bond_enslave(bond-dev, dev);
+   bond_for_each_slave(bond, slave, i)
+   if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0)
+   slave-original_mtu = original_mtu;
rtnl_unlock();
if (res) {
ret = res;
@@ -351,13 +354,17 @@ static ssize_t bonding_store_slaves(stru
bond_for_each_slave(bond, slave, i)
if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0) {
dev = slave-dev;
+   original_mtu = slave-original_mtu;
break;
}
if (dev) {
printk(KERN_INFO DRV_NAME : %s: Removing slave %s\n,
bond-dev-name, dev-name);
rtnl_lock();
-   res = bond_release(bond-dev, dev);
+   if (bond-setup_by_slave)
+   res = bond_release_and_destroy(bond-dev, dev);
+   else
+   res = bond_release(bond-dev, dev);
rtnl_unlock();
if (res) {
ret = res;
@@ -365,9 +372,9 @@ static ssize_t bonding_store_slaves(stru
}
/* set the slave MTU to the default */
if (dev-change_mtu) {
-   dev-change_mtu(dev, 1500);
+   dev-change_mtu(dev, original_mtu);
} else {
-   dev-mtu = 1500;
+   dev-mtu = original_mtu;
}
}
else {
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-15 10:55:34.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-20 14:29:11.912298402 
+0300
@@ -156,6 +156,7 @@ struct slave {
s8 link;/* one

PATCH V5 9/11] net/bonding: Delay sending of gratuitous ARP to avoid failure

2007-09-20 Thread Moni Shoua
Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev-state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c |   24 +---
 drivers/net/bonding/bonding.h   |1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:56:33.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 11:04:37.221123652 
+0300
@@ -1102,8 +1102,14 @@ void bond_change_active_slave(struct bon
if (new_active  !bond-do_set_mac_addr)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
-
-   bond_send_gratuitous_arp(bond);
+   if (bond-curr_active_slave 
+   test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state)) {
+   dprintk(delaying gratuitous arp on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond-send_grat_arp = 1;
+   } else
+   bond_send_gratuitous_arp(bond);
}
 }
 
@@ -2083,6 +2089,17 @@ void bond_mii_monitor(struct net_device 
 * program could monitor the link itself if needed.
 */
 
+   if (bond-send_grat_arp) {
+   if (bond-curr_active_slave  
test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state))
+   dprintk(Needs to send gratuitous arp but not yet\n);
+   else {
+   dprintk(sending delayed gratuitous arp on on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond_send_gratuitous_arp(bond);
+   bond-send_grat_arp = 0;
+   }
+   }
read_lock(bond-curr_slave_lock);
oldcurrent = bond-curr_active_slave;
read_unlock(bond-curr_slave_lock);
@@ -2484,7 +2501,7 @@ static void bond_send_gratuitous_arp(str
 
if (bond-master_ip) {
bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
- bond-master_ip, 0);
+   bond-master_ip, 0);
}
 
list_for_each_entry(vlan, bond-vlan_list, vlan_list) {
@@ -4293,6 +4310,7 @@ static int bond_init(struct net_device *
bond-current_arp_slave = NULL;
bond-primary_slave = NULL;
bond-dev = bond_dev;
+   bond-send_grat_arp = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-15 10:56:33.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-15 11:05:41.516451497 
+0300
@@ -187,6 +187,7 @@ struct bonding {
struct   timer_list arp_timer;
s8   kill_timers;
s8   do_set_mac_addr;
+   s8   send_grat_arp;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 10/11] net/bonding: Destroy bonding master when last slave is gone

2007-09-20 Thread Moni Shoua
When bonding enslaves non Ethernet devices it takes pointers to functions 
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved 
because we don't know if the pointers are still valid.  Destroying the bond 
when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua monis at voltaire.com
---
 drivers/net/bonding/bond_main.c |   45 +++-
 drivers/net/bonding/bonding.h   |3 ++
 2 files changed, 47 insertions(+), 1 deletion(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-20 
14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-20 14:43:17.850571535 
+0300
@@ -1256,6 +1256,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
 {
+   struct bonding *bond = bond_dev-priv;
bond_dev-hard_header   = slave_dev-hard_header;
bond_dev-rebuild_header= slave_dev-rebuild_header;
bond_dev-hard_header_cache = slave_dev-hard_header_cache;
@@ -1270,6 +1271,7 @@ static void bond_setup_by_slave(struct n
 
memcpy(bond_dev-broadcast, slave_dev-broadcast,
slave_dev-addr_len);
+   bond-setup_by_slave = 1;
 }
 
 /* enslave device slave to bond device master */
@@ -1838,6 +1840,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+   bond_deinit(bond-dev);
+   bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond-dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device 
*slave_dev)
+{
+   struct bonding *bond = bond_dev-priv;
+   int ret;
+
+   ret = bond_release(bond_dev, slave_dev);
+   if ((ret == 0)  (bond-slave_cnt == 0)) {
+   printk(KERN_INFO DRV_NAME  %s: destroying bond for.\n,
+   bond_dev-name);
+   bond_destroy(bond);
+   }
+   return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3322,7 +3353,11 @@ static int bond_slave_netdev_event(unsig
switch (event) {
case NETDEV_UNREGISTER:
if (bond_dev) {
-   bond_release(bond_dev, slave_dev);
+   dprintk(slave %s unregisters\n, slave_dev-name);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   else
+   bond_release(bond_dev, slave_dev);
}
break;
case NETDEV_CHANGE:
@@ -3331,6 +3366,13 @@ static int bond_slave_netdev_event(unsig
 * sets up a hierarchical bond, then rmmod's
 * one of the slave bonding devices?
 */
+   if (slave_dev-priv_flags  IFF_SLAVE_DETACH) {
+   dprintk(slave %s detaching\n, slave_dev-name);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   else
+   bond_release(bond_dev, slave_dev);
+   }
break;
case NETDEV_DOWN:
/*
@@ -4311,6 +4353,7 @@ static int bond_init(struct net_device *
bond-primary_slave = NULL;
bond-dev = bond_dev;
bond-send_grat_arp = 0;
+   bond-setup_by_slave = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-20 14:43:17.123702132 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-20 14:47:52.845180870 
+0300
@@ -188,6 +188,7 @@ struct bonding {
s8   kill_timers;
s8   do_set_mac_addr;
s8   send_grat_arp;
+   s8   setup_by_slave;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;
@@ -295,6 +296,8 @@ static inline void bond_unset_master_alb
 struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry 
*curr);
 int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct 
net_device *slave_dev);
 int bond_create(char *name, struct bond_params *params, struct bonding 
**newbond);
+void

[PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER

2007-09-20 Thread Moni Shoua
This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by 
ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua monis at voltaire.com
Signed-off-by: Or Gerlitz ogerlitz at voltaire.com
---
 drivers/net/bonding/bond_main.c |   39 +++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:08:59.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:13.424688411 
+0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+   struct net_device *slave_dev)
+{
+   bond_dev-hard_header   = slave_dev-hard_header;
+   bond_dev-rebuild_header= slave_dev-rebuild_header;
+   bond_dev-hard_header_cache = slave_dev-hard_header_cache;
+   bond_dev-header_cache_update   = slave_dev-header_cache_update;
+   bond_dev-hard_header_parse = slave_dev-hard_header_parse;
+
+   bond_dev-neigh_setup   = slave_dev-neigh_setup;
+
+   bond_dev-type  = slave_dev-type;
+   bond_dev-hard_header_len   = slave_dev-hard_header_len;
+   bond_dev-addr_len  = slave_dev-addr_len;
+
+   memcpy(bond_dev-broadcast, slave_dev-broadcast,
+   slave_dev-addr_len);
+}
+
 /* enslave device slave to bond device master */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
goto err_undo_flags;
}
 
+   /* set bonding device ether type by slave - bonding netdevices are
+* created with ether_setup, so when the slave type is not ARPHRD_ETHER
+* there is a need to override some of the type dependent attribs/funcs.
+*
+* bond ether type mutual exclusion - don't allow slaves of dissimilar
+* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same 
bond
+*/
+   if (bond-slave_cnt == 0) {
+   if (slave_dev-type != ARPHRD_ETHER)
+   bond_setup_by_slave(bond_dev, slave_dev);
+   } else if (bond_dev-type != slave_dev-type) {
+   printk(KERN_ERR DRV_NAME : %s ether type (%d) is different 
+   from other slaves (%d), can not enslave it.\n,
+   slave_dev-name,
+   slave_dev-type, bond_dev-type);
+   res = -EINVAL;
+   goto err_undo_flags;
+   }
+
if (slave_dev-set_mac_address == NULL) {
printk(KERN_ERR DRV_NAME
: %s: Error: The slave device you specified does 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/11] bonding: Optionally allow ethernet slaves to keep own MAC

2007-09-20 Thread Moni Shoua
Update the don't change MAC of slaves functionality added in
previous changes to be a generic option, rather than something tied to IB
devices, as it's occasionally useful for regular ethernet devices as well.

Adds fail_over_mac option (which is automatically enabled for IB
slaves), applicable only to active-backup mode.

Includes documentation update.

Updates bonding driver version to 3.2.0.

Signed-off-by: Jay Vosburgh [EMAIL PROTECTED]
---
 Documentation/networking/bonding.txt |   33 +++
 drivers/net/bonding/bond_main.c  |   57 +
 drivers/net/bonding/bond_sysfs.c |   49 +
 drivers/net/bonding/bonding.h|6 ++--
 4 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/Documentation/networking/bonding.txt 
b/Documentation/networking/bonding.txt
index 1da5666..1134062 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -281,6 +281,39 @@ downdelay
will be rounded down to the nearest multiple.  The default
value is 0.
 
+fail_over_mac
+
+   Specifies whether active-backup mode should set all slaves to
+   the same MAC address (the traditional behavior), or, when
+   enabled, change the bond's MAC address when changing the
+   active interface (i.e., fail over the MAC address itself).
+
+   Fail over MAC is useful for devices that cannot ever alter
+   their MAC address, or for devices that refuse incoming
+   broadcasts with their own source MAC (which interferes with
+   the ARP monitor).
+
+   The down side of fail over MAC is that every device on the
+   network must be updated via gratuitous ARP, vs. just updating
+   a switch or set of switches (which often takes place for any
+   traffic, not just ARP traffic, if the switch snoops incoming
+   traffic to update its tables) for the traditional method.  If
+   the gratuitous ARP is lost, communication may be disrupted.
+
+   When fail over MAC is used in conjuction with the mii monitor,
+   devices which assert link up prior to being able to actually
+   transmit and receive are particularly susecptible to loss of
+   the gratuitous ARP, and an appropriate updelay setting may be
+   required.
+
+   A value of 0 disables fail over MAC, and is the default.  A
+   value of 1 enables fail over MAC.  This option is enabled
+   automatically if the first slave added cannot change its MAC
+   address.  This option may be modified via sysfs only when no
+   slaves are present in the bond.
+
+   This option was added in bonding version 3.2.0.
+
 lacp_rate
 
Option specifying the rate in which we'll ask our link partner
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77caca3..c01ff9d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -97,6 +97,7 @@ static char *xmit_hash_policy = NULL;
 static int arp_interval = BOND_LINK_ARP_INTERV;
 static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, };
 static char *arp_validate = NULL;
+static int fail_over_mac = 0;
 struct bond_params bonding_defaults;
 
 module_param(max_bonds, int, 0);
@@ -130,6 +131,8 @@ module_param_array(arp_ip_target, charp, NULL, 0);
 MODULE_PARM_DESC(arp_ip_target, arp targets in n.n.n.n form);
 module_param(arp_validate, charp, 0);
 MODULE_PARM_DESC(arp_validate, validate src/dst of ARP probes: none 
(default), active, backup or all);
+module_param(fail_over_mac, int, 0);
+MODULE_PARM_DESC(fail_over_mac, For active-backup, do not set all slaves to 
the same MAC.  0 of off (default), 1 for on.);
 
 /*- Global variables */
 
@@ -1099,7 +1102,7 @@ void bond_change_active_slave(struct bonding *bond, 
struct slave *new_active)
/* when bonding does not set the slave MAC address, the bond MAC
 * address is the one of the active slave.
 */
-   if (new_active  !bond-do_set_mac_addr)
+   if (new_active  bond-params.fail_over_mac)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
if (bond-curr_active_slave 
@@ -1371,16 +1374,16 @@ int bond_enslave(struct net_device *bond_dev, struct 
net_device *slave_dev)
if (slave_dev-set_mac_address == NULL) {
if (bond-slave_cnt == 0) {
printk(KERN_WARNING DRV_NAME
-   : %s: Warning: The first slave device you 
-   specified does not support setting the MAC 
-   address. This bond MAC address would be that 
-   of the active slave.\n, bond_dev-name);
-   bond-do_set_mac_addr = 0;
-   } else 

Re: [ofa-general] Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-19 Thread Moni Shoua

Roland, Jay, 
Thanks a lot for the comments.
I'd like to summarize the points raised so far

1. Reduce the indentation in patch #4 (Roland)
I will resend
2. Remove the if (n-dev-flags  IFF_MASTER) from patch #3 (Roland)
I will resend
3. Consider making ipoib_slave_detach() net/core/dev.c (Roland, Jay)
I think that this is a good idea. I can make the patch (and necessary 
changes to 
the other patches) assuming this is agreed by all.
4. Change header for patch #1 (Roland)
I will resend
5. Use NETDEV_GOING_DOWN and not NETDEV_CHANGE + IFF_SLAVE_DETACH (Jay)
The  NETDEV_GOING_DOWN event is sent in the contex of 
unregister_netdevice() 
Since the action in bonding to the event should be unregister the 
bonding master
it is not possible to do so. bonding needs to know about the slave 
detach earlier.
6. call notifiers from unregister_netdev()
See answer to 5.
7. missing call to notifiers in ipoib_vlan_delete() (Roland)
It seems like you're right. I will fix and resend.

I think that if there are no other comments, I will submit the entire 11 
patches 
again (with changes) to make it easier to merge into the kernel. Since the most 
of the 
content in the patch series is in bonding I thought it would be right that Jay 
will 
push all the patches to the networking git. Is it OK with you Roland?




 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver

2007-09-10 Thread Moni Shoua
Hi all,
This patch series is a bit neglected.
Since our goal is to have bonding support for IPoIB in kernel 2.6.24 it is 
very important for us to get comments soon.

We would appreciate if you take some time to look at this and help us push this 
code upstream.

thanks

 MoniS  


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone

2007-09-02 Thread Moni Shoua
Jay Vosburgh wrote:
 Moni Shoua [EMAIL PROTECTED] wrote:
 
 Jay Vosburgh wrote:
 Moni Shoua [EMAIL PROTECTED] wrote:

 When bonding enslaves non Ethernet devices it takes pointers to functions 
 in the module that owns the slaves. In this case it becomes unsafe
 to keep the bonding master registered after last slave was unenslaved 
 because we don't know if the pointers are still valid.  Destroying the 
 bond when slave_cnt is zero
 ensures that these functions be used anymore.
 Would it not be simpler to run the bonding master through
 ether_setup() again when the final slave is released (to reset all of
 the pointers to their ethernet values)?  I'm presuming here the
 pointers of questionable validity are the ones set in the
 bond_setup_by_slave() copied from the slave_dev-hard_header, et al.

 Having the bonding master disappear (but only sometimes) after
 the last slave is removed is a semantic change I'd rather not introduce
 if it's not necessary.
 Thanks for the comments.

 Having the master disappear is one way I could think of to solve the problem 
 of leaving
 the bonding module with pointers to illegal addresses.
 The other way is to increase the usage count, with try_module_get(), of the 
 module which owns of the slave.
 To do that I  have to restore the field  owner in structure net_device (it 
 was removed in 2.6).
 
   What I was asking above is really whether or not it's feasible
 to simply reset the affected pointers back to the ethernet values from
 ether_setup().  I would think this should return the bonding master back
 to the original state it started in before any slaves were added.
 Unless I'm missing something; I'm willing to believe there's some
 IB-specific tidbit I'm unaware of that makes this more complicated than
 it seems.

It's possible to reset the bonding master by calling ether_setup but with one 
exception: its neighbors.
When enslaving IPoIB devices, the bonding master neighbors point to a 
destructor function in the ib_ipoib module.
When ib_ipoib goes down the neighbors of the bonding master still exist and 
when their turn come to die they will
try to access this function and the kernel will crash. This is why I want to 
destroy the bonding master before ib_ipoib
is unloaded (to kill its neighbors). For any other issue (i.e. taken pointer), 
ether_setup would solve the problem.

 
   This presumes that I'm correct in thinking that the pointers
 you're talking about (as being unsafe after removal of last slave) are
 the ones copied in your new function bond_setup_by_slave().
 
   I don't think it's desirable to acquire a reference to the slave
 driver module.
 
   -J
 
 ---
   -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-08-29 Thread Moni Shoua
Jay Vosburgh wrote:
 Moni Shoua [EMAIL PROTECTED] wrote:
 
 bonding sometimes uses Ethernet constants (such as MTU and address length) 
 which
 are not good when it enslaves non Ethernet devices (such as InfiniBand).

 Signed-off-by: Moni Shoua [EMAIL PROTECTED]
 ---
 drivers/net/bonding/bond_main.c  |3 ++-
 drivers/net/bonding/bond_sysfs.c |   19 +--
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

 Index: net-2.6/drivers/net/bonding/bond_main.c
 ===
 --- net-2.6.orig/drivers/net/bonding/bond_main.c 2007-08-15 
 10:55:48.0 +0300
 +++ net-2.6/drivers/net/bonding/bond_main.c  2007-08-20 14:29:11.911298577 
 +0300
 @@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
  struct slave *slave;
  struct net_device *bond_dev = bond-dev;
  unsigned long features = bond_dev-features;
 -unsigned short max_hard_header_len = ETH_HLEN;
 +unsigned short max_hard_header_len = max((u16)ETH_HLEN,
 +bond_dev-hard_header_len);
 
   Since non-IB bonding masters are run through ether_setup, which
 sets hard_header_len to ETH_HLEN, the max() is probably unnecessary, and
 I think this could just be bond_dev-hard_header_len.
 
This is true except for the case where there are no slaves left.
In that case max_hard_header_len has equals to the initialization value.


   bond_for_each_slave(bond, slave, i) {
features = netdev_compute_features(features,
   slave-dev-features);
if (slave-dev-hard_header_len  max_hard_header_len)
max_hard_header_len = slave-dev-hard_header_len;
}

features |= (bond_dev-features  BOND_VLAN_FEATURES);
bond_dev-features = features;
bond_dev-hard_header_len = max_hard_header_len;


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone

2007-08-29 Thread Moni Shoua
Jay Vosburgh wrote:
 Moni Shoua [EMAIL PROTECTED] wrote:
 
 When bonding enslaves non Ethernet devices it takes pointers to functions 
 in the module that owns the slaves. In this case it becomes unsafe
 to keep the bonding master registered after last slave was unenslaved 
 because we don't know if the pointers are still valid.  Destroying the bond 
 when slave_cnt is zero
 ensures that these functions be used anymore.
 
   Would it not be simpler to run the bonding master through
 ether_setup() again when the final slave is released (to reset all of
 the pointers to their ethernet values)?  I'm presuming here the
 pointers of questionable validity are the ones set in the
 bond_setup_by_slave() copied from the slave_dev-hard_header, et al.
 
   Having the bonding master disappear (but only sometimes) after
 the last slave is removed is a semantic change I'd rather not introduce
 if it's not necessary.

Thanks for the comments.

Having the master disappear is one way I could think of to solve the problem of 
leaving
the bonding module with pointers to illegal addresses.
The other way is to increase the usage count, with try_module_get(), of the 
module which owns of the slave.
To do that I  have to restore the field  owner in structure net_device (it was 
removed in 2.6).
Do you prefer the second approach? I wasn't sure about that.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver

2007-08-20 Thread Moni Shoua
This patch series is the fourth version (see below link to V3) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

The motivation is to enable the bonding driver on its HA mode to work with 
the IP over Infiniband (IPoIB) driver. With these patches I was able to enslave 
IPoIB netdevices and run TCP, UDP, IP (UDP) Multicast and ICMP traffic with 
fail-over and fail-back working fine. The working environment was the net-2.6 
git. 

More over, as IPoIB is also the IB ARP provider for the RDMA CM driver which 
is used by native IB ULPs whose addressing scheme is based on IP (e.g. iSER, 
SDP, Lustre, NFSoRDMA, RDS), bonding support for IPoIB devices **enables** HA 
for these ULPs. This holds as when the ULP is informed by the IB HW on the 
failure of the current IB connection, it just need to reconnect, where the 
bonding device will now issue the IB ARP over the active IPoIB slave. 

This series also includes patches to the IPoIB driver that fix some fix 
some neighboring related issues. 

Major changes from the previous version:

1) Addressing the issue of safety when unloading the IPoIB module before
the bonding module
2) style changes


Links to earlier discussion: 

1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. A discussion in openfabrics regarding changes in the IPoIB that 
enable using it as a slave for bonding.
http://lists.openfabrics.org/pipermail/general/2007-July/038914.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag

2007-08-20 Thread Moni Shoua
Export the call to raw_notifier_call_chain so modules can send notifications
on netdev events to the netdev_chain.
Add IFF_SLAVE_DETACH to the list of priv_flags for net_device.
This flag is set by a slave that is about to unregisster from the kernel.

Both changes are used in bonding slaves that wish to inform the bonding master
about coming detachment.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 include/linux/if.h |1 +
 net/core/dev.c |1 +
 2 files changed, 2 insertions(+)

Index: net-2.6/net/core/dev.c
===
--- net-2.6.orig/net/core/dev.c 2007-08-15 10:09:02.0 +0300
+++ net-2.6/net/core/dev.c  2007-08-15 10:53:00.832543390 +0300
@@ -1148,6 +1148,7 @@ int call_netdevice_notifiers(unsigned lo
 {
return raw_notifier_call_chain(netdev_chain, val, v);
 }
+EXPORT_SYMBOL(call_netdevice_notifiers);
 
 /* When  0 there are consumers of rx skb time stamps */
 static atomic_t netstamp_needed = ATOMIC_INIT(0);
Index: net-2.6/include/linux/if.h
===
--- net-2.6.orig/include/linux/if.h 2007-08-20 14:30:39.0 +0300
+++ net-2.6/include/linux/if.h  2007-08-20 14:31:06.625174369 +0300
@@ -61,6 +61,7 @@
 #define IFF_MASTER_ALB 0x10/* bonding master, balance-alb. */
 #define IFF_BONDING0x20/* bonding master or slave  */
 #define IFF_SLAVE_NEEDARP 0x40 /* need ARPs for validation */
+#define IFF_SLAVE_DETACH 0x80  /* slave is about to unregister */
 
 #define IF_GET_IFACE   0x0001  /* for querying only */
 #define IF_GET_PROTO   0x0002

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 2/10] IB/ipoib: Notify the world before doing unregister

2007-08-20 Thread Moni Shoua
When the bonding device enslaves IPoIB devices it takes pointers to
functions in the ib_ipoib module. This is fine as long as the ib_ipoib
nodule remains loaded while the references to its functions exist.
So, to help bonding do a cleanup on time, when the IPoIB net device is a 
slave of a bonding master, let the master know that the IPoIB device is
about to unregister (but before calling unregister).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |   15 +++
 1 files changed, 15 insertions(+)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-08-20 
14:29:29.522209580 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-08-20 
14:43:03.432162133 +0300
@@ -48,6 +48,7 @@
 #include linux/in.h
 
 #include net/dst.h
+#include linux/netdevice.h
 
 MODULE_AUTHOR(Roland Dreier);
 MODULE_DESCRIPTION(IP-over-InfiniBand net driver);
@@ -772,6 +773,18 @@ static void ipoib_timeout(struct net_dev
/* XXX reset QP, etc. */
 }
 
+static int ipoib_slave_detach(struct net_device *dev)
+{
+   int ret = 0;
+   if (dev-flags  IFF_SLAVE) {
+   dev-priv_flags |= IFF_SLAVE_DETACH;
+   rtnl_lock();
+   ret = call_netdevice_notifiers(NETDEV_CHANGE, dev);
+   rtnl_unlock();
+   }
+   return ret;
+}
+
 static int ipoib_hard_header(struct sk_buff *skb,
 struct net_device *dev,
 unsigned short type,
@@ -921,6 +934,7 @@ void ipoib_dev_cleanup(struct net_device
 
/* Delete any child interfaces first */
list_for_each_entry_safe(cpriv, tcpriv, priv-child_intfs, list) {
+   ipoib_slave_detach(cpriv-dev);
unregister_netdev(cpriv-dev);
ipoib_dev_cleanup(cpriv-dev);
free_netdev(cpriv-dev);
@@ -1208,6 +1222,7 @@ static void ipoib_remove_one(struct ib_d
ib_unregister_event_handler(priv-event_handler);
flush_scheduled_work();
 
+   ipoib_slave_detach(priv-dev);
unregister_netdev(priv-dev);
ipoib_dev_cleanup(priv-dev);
free_netdev(priv-dev);

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 3/10] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-08-20 Thread Moni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb-dst-neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb-dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n-dev is an ipoib device and hence netdev_priv(n-dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n-dev-flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   17 +++--
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 ++-
 3 files changed, 20 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h   2007-08-15 
10:09:00.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h2007-08-15 
10:53:52.756348574 +0300
@@ -328,6 +328,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-08-15 
10:53:28.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-08-15 
10:53:52.757348397 +0300
@@ -511,7 +511,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb-dst-neighbour);
+   neigh = ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
if (!neigh) {
++priv-stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -830,6 +830,17 @@ static void ipoib_neigh_cleanup(struct n
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
+   if (n-dev-flags  IFF_MASTER) {
+   /* n-dev is not an IPoIB device and we have
+   to take priv from elsewhere */
+   neigh = *to_ipoib_neigh(n);
+   if (neigh) {
+   priv = netdev_priv(neigh-dev);
+   ipoib_dbg(priv, neigh_destructor for bonding device: 
%s\n,
+ n-dev-name);
+   } else
+   return;
+   }
ipoib_dbg(priv,
  neigh_cleanup for %06x  IPOIB_GID_FMT \n,
  IPOIB_QPN(n-ha),
@@ -851,7 +862,8 @@ static void ipoib_neigh_cleanup(struct n
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -860,6 +872,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh-neighbour = neighbour;
+   neigh-dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(neigh-queue);
ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2007-08-15 
10:09:00.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c  2007-08-15 
10:53:52.758348220 +0300
@@ -727,7 +727,8 @@ out:
if (skb-dst
skb-dst-neighbour 
!*to_ipoib_neigh(skb-dst-neighbour)) {
-   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour);
+   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour,
+   
skb-dev);
 
if (neigh

[PATCH V4 4/10] IB/ipoib: Verify address handle validity on send

2007-08-20 Thread Moni Shoua
When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-08-15 
10:53:52.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-08-15 
10:54:03.959364640 +0300
@@ -686,9 +686,10 @@ static int ipoib_start_xmit(struct sk_bu
goto out;
}
} else if (neigh-ah) {
-   if (unlikely(memcmp(neigh-dgid.raw,
+   if (unlikely((memcmp(neigh-dgid.raw,
skb-dst-neighbour-ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid))) ||
+(neigh-dev != dev))) {
spin_lock(priv-lock);
/*
 * It's safe to call ipoib_put_ah() inside

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 5/10] net/bonding: Enable bonding to enslave non ARPHRD_ETHER

2007-08-20 Thread Moni Shoua
This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by 
ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   39 +++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:08:59.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:13.424688411 
+0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+   struct net_device *slave_dev)
+{
+   bond_dev-hard_header   = slave_dev-hard_header;
+   bond_dev-rebuild_header= slave_dev-rebuild_header;
+   bond_dev-hard_header_cache = slave_dev-hard_header_cache;
+   bond_dev-header_cache_update   = slave_dev-header_cache_update;
+   bond_dev-hard_header_parse = slave_dev-hard_header_parse;
+
+   bond_dev-neigh_setup   = slave_dev-neigh_setup;
+
+   bond_dev-type  = slave_dev-type;
+   bond_dev-hard_header_len   = slave_dev-hard_header_len;
+   bond_dev-addr_len  = slave_dev-addr_len;
+
+   memcpy(bond_dev-broadcast, slave_dev-broadcast,
+   slave_dev-addr_len);
+}
+
 /* enslave device slave to bond device master */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
goto err_undo_flags;
}
 
+   /* set bonding device ether type by slave - bonding netdevices are
+* created with ether_setup, so when the slave type is not ARPHRD_ETHER
+* there is a need to override some of the type dependent attribs/funcs.
+*
+* bond ether type mutual exclusion - don't allow slaves of dissimilar
+* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same 
bond
+*/
+   if (bond-slave_cnt == 0) {
+   if (slave_dev-type != ARPHRD_ETHER)
+   bond_setup_by_slave(bond_dev, slave_dev);
+   } else if (bond_dev-type != slave_dev-type) {
+   printk(KERN_ERR DRV_NAME : %s ether type (%d) is different 
+   from other slaves (%d), can not enslave it.\n,
+   slave_dev-name,
+   slave_dev-type, bond_dev-type);
+   res = -EINVAL;
+   goto err_undo_flags;
+   }
+
if (slave_dev-set_mac_address == NULL) {
printk(KERN_ERR DRV_NAME
: %s: Error: The slave device you specified does 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 6/10] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()

2007-08-20 Thread Moni Shoua
This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   87 +++-
 drivers/net/bonding/bonding.h   |1 
 2 files changed, 60 insertions(+), 28 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:13.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:54:41.971632881 
+0300
@@ -1095,6 +1095,14 @@ void bond_change_active_slave(struct bon
if (new_active) {
bond_set_slave_active_flags(new_active);
}
+
+   /* when bonding does not set the slave MAC address, the bond MAC
+* address is the one of the active slave.
+*/
+   if (new_active  !bond-do_set_mac_addr)
+   memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
+   new_active-dev-addr_len);
+
bond_send_gratuitous_arp(bond);
}
 }
@@ -1351,13 +1359,22 @@ int bond_enslave(struct net_device *bond
}
 
if (slave_dev-set_mac_address == NULL) {
-   printk(KERN_ERR DRV_NAME
-   : %s: Error: The slave device you specified does 
-   not support setting the MAC address. 
-   Your kernel likely does not support slave 
-   devices.\n, bond_dev-name);
-   res = -EOPNOTSUPP;
-   goto err_undo_flags;
+   if (bond-slave_cnt == 0) {
+   printk(KERN_WARNING DRV_NAME
+   : %s: Warning: The first slave device you 
+   specified does not support setting the MAC 
+   address. This bond MAC address would be that 
+   of the active slave.\n, bond_dev-name);
+   bond-do_set_mac_addr = 0;
+   } else if (bond-do_set_mac_addr) {
+   printk(KERN_ERR DRV_NAME
+   : %s: Error: The slave device you specified 
+   does not support setting the MAC addres,.
+   but this bond uses this practice. \n
+   , bond_dev-name);
+   res = -EOPNOTSUPP;
+   goto err_undo_flags;
+   }
}
 
new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1378,16 +1395,18 @@ int bond_enslave(struct net_device *bond
 */
memcpy(new_slave-perm_hwaddr, slave_dev-dev_addr, ETH_ALEN);
 
-   /*
-* Set slave to master's mac address.  The application already
-* set the master's mac address to that of the first slave
-*/
-   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
-   addr.sa_family = slave_dev-type;
-   res = dev_set_mac_address(slave_dev, addr);
-   if (res) {
-   dprintk(Error %d calling set_mac_address\n, res);
-   goto err_free;
+   if (bond-do_set_mac_addr) {
+   /*
+* Set slave to master's mac address.  The application already
+* set the master's mac address to that of the first slave
+*/
+   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
+   addr.sa_family = slave_dev-type;
+   res = dev_set_mac_address(slave_dev, addr);
+   if (res) {
+   dprintk(Error %d calling set_mac_address\n, res);
+   goto err_free;
+   }
}
 
res = netdev_set_master(slave_dev, bond_dev);
@@ -1612,9 +1631,11 @@ err_close:
dev_close(slave_dev);
 
 err_restore_mac:
-   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family = slave_dev-type;
-   dev_set_mac_address(slave_dev, addr);
+   if (bond-do_set_mac_addr) {
+   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
+   addr.sa_family = slave_dev-type;
+   dev_set_mac_address(slave_dev, addr);
+   }
 
 err_free:
kfree(new_slave);
@@ -1792,10 +1813,12 @@ int bond_release(struct net_device *bond
/* close slave before restoring its mac address */
dev_close(slave_dev);
 
-   /* restore original (permanent) mac address */
-   memcpy(addr.sa_data, slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family

[PATCH V4 7/10] net/bonding: Enable IP multicast for bonding IPoIB devices

2007-08-20 Thread Moni Shoua
Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast 
groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding 
device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another 
ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |5 +++--
 drivers/net/bonding/bond_sysfs.c |6 ++
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:54:41.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 10:55:48.431862446 
+0300
@@ -1285,8 +1285,9 @@ int bond_enslave(struct net_device *bond
 
/* bond must be initialized by bond_open() before enslaving */
if (!(bond_dev-flags  IFF_UP)) {
-   dprintk(Error, master_dev is not up\n);
-   return -EPERM;
+   printk(KERN_WARNING DRV_NAME
+%s: master_dev is not up in bond_enslave\n,
+   bond_dev-name);
}
 
/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-08-15 
10:08:58.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-08-15 10:55:48.432862269 
+0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
/* Quick sanity check -- is the bond interface up? */
if (!(bond-dev-flags  IFF_UP)) {
-   printk(KERN_ERR DRV_NAME
-  : %s: Unable to update slaves because interface is 
down.\n,
+   printk(KERN_WARNING DRV_NAME
+  : %s: doing slave updates when interface is down.\n,
   bond-dev-name);
-   ret = -EPERM;
-   goto out;
}
 
/* Note:  We can't hold bond-lock here, as bond_create grabs it. */

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-08-20 Thread Moni Shoua
bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |3 ++-
 drivers/net/bonding/bond_sysfs.c |   19 +--
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:55:48.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-20 14:29:11.911298577 
+0300
@@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
struct slave *slave;
struct net_device *bond_dev = bond-dev;
unsigned long features = bond_dev-features;
-   unsigned short max_hard_header_len = ETH_HLEN;
+   unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+   bond_dev-hard_header_len);
int i;
 
features = ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-08-15 
10:55:48.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-08-15 12:14:41.152469089 
+0300
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
printk(KERN_INFO DRV_NAME
: %s is being deleted...\n,
bond-dev-name);
-   bond_deinit(bond-dev);
-   bond_destroy_sysfs_entry(bond);
-   unregister_netdevice(bond-dev);
+   bond_destroy(bond);
rtnl_unlock();
goto out;
}
@@ -260,6 +258,7 @@ static ssize_t bonding_store_slaves(stru
char command[IFNAMSIZ + 1] = { 0, };
char *ifname;
int i, res, found, ret = count;
+   u32 original_mtu;
struct slave *slave;
struct net_device *dev = NULL;
struct bonding *bond = to_bond(d);
@@ -325,6 +324,7 @@ static ssize_t bonding_store_slaves(stru
}
 
/* Set the slave's MTU to match the bond */
+   original_mtu = dev-mtu;
if (dev-mtu != bond-dev-mtu) {
if (dev-change_mtu) {
res = dev-change_mtu(dev,
@@ -339,6 +339,9 @@ static ssize_t bonding_store_slaves(stru
}
rtnl_lock();
res = bond_enslave(bond-dev, dev);
+   bond_for_each_slave(bond, slave, i)
+   if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0)
+   slave-original_mtu = original_mtu;
rtnl_unlock();
if (res) {
ret = res;
@@ -351,13 +354,17 @@ static ssize_t bonding_store_slaves(stru
bond_for_each_slave(bond, slave, i)
if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0) {
dev = slave-dev;
+   original_mtu = slave-original_mtu;
break;
}
if (dev) {
printk(KERN_INFO DRV_NAME : %s: Removing slave %s\n,
bond-dev-name, dev-name);
rtnl_lock();
-   res = bond_release(bond-dev, dev);
+   if (bond-setup_by_slave)
+   res = bond_release_and_destroy(bond-dev, dev);
+   else
+   res = bond_release(bond-dev, dev);
rtnl_unlock();
if (res) {
ret = res;
@@ -365,9 +372,9 @@ static ssize_t bonding_store_slaves(stru
}
/* set the slave MTU to the default */
if (dev-change_mtu) {
-   dev-change_mtu(dev, 1500);
+   dev-change_mtu(dev, original_mtu);
} else {
-   dev-mtu = 1500;
+   dev-mtu = original_mtu;
}
}
else {
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-15 10:55:34.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-20 14:29:11.912298402 
+0300
@@ -156,6 +156,7 @@ struct slave {
s8 link;/* one

PATCH V4 9/10] net/bonding: Delay sending of gratuitous ARP to avoid failure

2007-08-20 Thread Moni Shoua
Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev-state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   24 +---
 drivers/net/bonding/bonding.h   |1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-15 
10:56:33.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-15 11:04:37.221123652 
+0300
@@ -1102,8 +1102,14 @@ void bond_change_active_slave(struct bon
if (new_active  !bond-do_set_mac_addr)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
-
-   bond_send_gratuitous_arp(bond);
+   if (bond-curr_active_slave 
+   test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state)) {
+   dprintk(delaying gratuitous arp on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond-send_grat_arp = 1;
+   } else
+   bond_send_gratuitous_arp(bond);
}
 }
 
@@ -2083,6 +2089,17 @@ void bond_mii_monitor(struct net_device 
 * program could monitor the link itself if needed.
 */
 
+   if (bond-send_grat_arp) {
+   if (bond-curr_active_slave  
test_bit(__LINK_STATE_LINKWATCH_PENDING,
+   bond-curr_active_slave-dev-state))
+   dprintk(Needs to send gratuitous arp but not yet\n);
+   else {
+   dprintk(sending delayed gratuitous arp on on %s\n,
+   bond-curr_active_slave-dev-name);
+   bond_send_gratuitous_arp(bond);
+   bond-send_grat_arp = 0;
+   }
+   }
read_lock(bond-curr_slave_lock);
oldcurrent = bond-curr_active_slave;
read_unlock(bond-curr_slave_lock);
@@ -2484,7 +2501,7 @@ static void bond_send_gratuitous_arp(str
 
if (bond-master_ip) {
bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
- bond-master_ip, 0);
+   bond-master_ip, 0);
}
 
list_for_each_entry(vlan, bond-vlan_list, vlan_list) {
@@ -4293,6 +4310,7 @@ static int bond_init(struct net_device *
bond-current_arp_slave = NULL;
bond-primary_slave = NULL;
bond-dev = bond_dev;
+   bond-send_grat_arp = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-15 10:56:33.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-15 11:05:41.516451497 
+0300
@@ -187,6 +187,7 @@ struct bonding {
struct   timer_list arp_timer;
s8   kill_timers;
s8   do_set_mac_addr;
+   s8   send_grat_arp;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone

2007-08-20 Thread Moni Shoua
When bonding enslaves non Ethernet devices it takes pointers to functions 
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved 
because we don't know if the pointers are still valid.  Destroying the bond 
when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   45 +++-
 drivers/net/bonding/bonding.h   |3 ++
 2 files changed, 47 insertions(+), 1 deletion(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-08-20 
14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-08-20 14:43:17.850571535 
+0300
@@ -1256,6 +1256,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
 {
+   struct bonding *bond = bond_dev-priv;
bond_dev-hard_header   = slave_dev-hard_header;
bond_dev-rebuild_header= slave_dev-rebuild_header;
bond_dev-hard_header_cache = slave_dev-hard_header_cache;
@@ -1270,6 +1271,7 @@ static void bond_setup_by_slave(struct n
 
memcpy(bond_dev-broadcast, slave_dev-broadcast,
slave_dev-addr_len);
+   bond-setup_by_slave = 1;
 }
 
 /* enslave device slave to bond device master */
@@ -1838,6 +1840,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+   bond_deinit(bond-dev);
+   bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond-dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device 
*slave_dev)
+{
+   struct bonding *bond = bond_dev-priv;
+   int ret;
+
+   ret = bond_release(bond_dev, slave_dev);
+   if ((ret == 0)  (bond-slave_cnt == 0)) {
+   printk(KERN_INFO DRV_NAME  %s: destroying bond for.\n,
+   bond_dev-name);
+   bond_destroy(bond);
+   }
+   return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3322,7 +3353,11 @@ static int bond_slave_netdev_event(unsig
switch (event) {
case NETDEV_UNREGISTER:
if (bond_dev) {
-   bond_release(bond_dev, slave_dev);
+   dprintk(slave %s unregisters\n, slave_dev-name);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   else
+   bond_release(bond_dev, slave_dev);
}
break;
case NETDEV_CHANGE:
@@ -3331,6 +3366,13 @@ static int bond_slave_netdev_event(unsig
 * sets up a hierarchical bond, then rmmod's
 * one of the slave bonding devices?
 */
+   if (slave_dev-priv_flags  IFF_SLAVE_DETACH) {
+   dprintk(slave %s detaching\n, slave_dev-name);
+   if (bond-setup_by_slave)
+   bond_release_and_destroy(bond_dev, slave_dev);
+   else
+   bond_release(bond_dev, slave_dev);
+   }
break;
case NETDEV_DOWN:
/*
@@ -4311,6 +4353,7 @@ static int bond_init(struct net_device *
bond-primary_slave = NULL;
bond-dev = bond_dev;
bond-send_grat_arp = 0;
+   bond-setup_by_slave = 0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-08-20 14:43:17.123702132 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-08-20 14:47:52.845180870 
+0300
@@ -188,6 +188,7 @@ struct bonding {
s8   kill_timers;
s8   do_set_mac_addr;
s8   send_grat_arp;
+   s8   setup_by_slave;
struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
struct   proc_dir_entry *proc_entry;
@@ -295,6 +296,8 @@ static inline void bond_unset_master_alb
 struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry 
*curr);
 int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct 
net_device *slave_dev);
 int bond_create(char *name, struct bond_params *params, struct bonding 
**newbond);
+void bond_destroy

Re: [ofa-general] Re: Re: Re: [PATCH V3 0/7] net/bonding: ADD IPoIB support for the bonding driver

2007-08-01 Thread Moni Shoua

 It's always wrong to copy symbols from another module without
 referencing it.

Michael,
It seems like the preferred approach is to prevent ib_ipoib from being 
unloaded while bonding is on top it, right?
It seems like it would handle all safety issues (not just neigh cleanup).


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH V3 0/7] net/bonding: ADD IPoIB support for the bonding driver

2007-07-31 Thread Moni Shoua
Roland Dreier wrote:
   1. When bonding enslaves an IPoIB device the bonding neighbor holds a 
   reference to a cleanup function in the IPoIB drives. This makes it unsafe 
 to 
   unload the IPoIB module if there are bonding neighbors in the air. So, to 
   avoid this race one must unload bonding before unloading IPoIB. 
 
 I think we really want to resolve this somehow.  Getting an oops by
 doing modprobe -r ipoib isn't that friendly.
 
You are right and we want to resolve that.
One way is to clean the neigh destructor function from all IPoIB neighs.
The other way is to prevent ipoib unload if device is a slave or is referenced 
from 
somewhere else.

I guess I would like an advice here.
 Also, what happened to the problem of having an address handle
 belonging to the wrong device on bond failover?  Did you figure out a
 way to fix that one?
This is what patch 2 handles.
 
  - R.
 ___
 general mailing list
 [EMAIL PROTECTED]
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
 
 To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 1/7] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-07-30 Thread Moni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb-dst-neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb-dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n-dev is an ipoib device and hence netdev_priv(n-dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n-dev-flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   17 +++--
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |2 +-
 3 files changed, 19 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h   2007-07-25 
14:56:13.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h2007-07-25 
14:57:48.095724495 +0300
@@ -328,6 +328,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-07-25 
14:56:13.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-07-25 
15:03:11.291291271 +0300
@@ -510,7 +510,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb-dst-neighbour);
+   neigh = ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
if (!neigh) {
++priv-stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -817,6 +817,16 @@ static void ipoib_neigh_cleanup(struct n
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
+   if (n-dev-flags  IFF_MASTER) {
+   /* n-dev is not an IPoIB device and we have to take priv from 
elsewhere */
+   neigh = *to_ipoib_neigh(n);
+   if (neigh){
+   priv = netdev_priv(neigh-dev);
+   ipoib_dbg(priv, neigh_destructor for bonding device: 
%s\n,
+ n-dev-name);
+   } else
+   return;
+   }
ipoib_dbg(priv,
  neigh_cleanup for %06x  IPOIB_GID_FMT \n,
  IPOIB_QPN(n-ha),
@@ -838,7 +848,9 @@ static void ipoib_neigh_cleanup(struct n
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
+
 {
struct ipoib_neigh *neigh;
 
@@ -847,6 +859,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh-neighbour = neighbour;
+   neigh-dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(neigh-queue);
ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2007-07-25 
14:56:13.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c  2007-07-25 
14:57:48.097724142 +0300
@@ -727,7 +727,7 @@ out:
if (skb-dst
skb-dst-neighbour 
!*to_ipoib_neigh(skb-dst-neighbour)) {
-   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour);
+   struct ipoib_neigh *neigh = 
ipoib_neigh_alloc(skb-dst-neighbour, skb-dev);
 
if (neigh) {
kref_get(mcast-ah-ref);

-
To unsubscribe from this list: send the line

[PATCH V3 2/7] IB/ipoib: Verify address handle validity on send

2007-07-30 Thread Moni Shoua
When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the 
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-07-25 
14:57:48.0 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-07-25 
15:02:55.525131034 +0300
@@ -685,9 +685,10 @@ static int ipoib_start_xmit(struct sk_bu
goto out;
}
} else if (neigh-ah) {
-   if (unlikely(memcmp(neigh-dgid.raw,
+   if (unlikely((memcmp(neigh-dgid.raw,
skb-dst-neighbour-ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid))) ||
+(neigh-dev != dev))) {
spin_lock(priv-lock);
/*
 * It's safe to call ipoib_put_ah() inside

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 3/7] net/bonding: Enable bonding to enslave non ARPHRD_ETHER

2007-07-30 Thread Moni Shoua
This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by 
ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   38 ++
 1 files changed, 38 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-07-25 
15:02:10.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-29 16:24:30.913343981 
+0300
@@ -1277,6 +1277,26 @@ static int bond_compute_features(struct 
return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+   struct net_device *slave_dev)
+{
+   bond_dev-hard_header   = slave_dev-hard_header;
+   bond_dev-rebuild_header= slave_dev-rebuild_header;
+   bond_dev-hard_header_cache = slave_dev-hard_header_cache;
+   bond_dev-header_cache_update   = slave_dev-header_cache_update;
+   bond_dev-hard_header_parse = slave_dev-hard_header_parse;
+
+   bond_dev-neigh_setup   = slave_dev-neigh_setup;
+
+   bond_dev-type  = slave_dev-type;
+   bond_dev-hard_header_len   = slave_dev-hard_header_len;
+   bond_dev-addr_len  = slave_dev-addr_len;
+
+   memcpy(bond_dev-broadcast, slave_dev-broadcast,
+   slave_dev-addr_len);
+}
+
 /* enslave device slave to bond device master */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1351,6 +1371,24 @@ int bond_enslave(struct net_device *bond
goto err_undo_flags;
}
 
+   /* set bonding device ether type by slave - bonding netdevices are
+* created with ether_setup, so when the slave type is not ARPHRD_ETHER
+* there is a need to override some of the type dependent attribs/funcs.
+*
+* bond ether type mutual exclusion - don't allow slaves of dissimilar
+* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same 
bond
+*/
+   if (bond-slave_cnt == 0) {
+   if (slave_dev-type != ARPHRD_ETHER)
+   bond_setup_by_slave(bond_dev, slave_dev);
+   } else if (bond_dev-type != slave_dev-type) {
+   printk(KERN_ERR DRV_NAME : %s ether type (%d) is different 
from 
+   other slaves (%d), can not enslave it.\n, 
slave_dev-name,
+   slave_dev-type, bond_dev-type);
+   res = -EINVAL;
+   goto err_undo_flags;
+   }
+
if (slave_dev-set_mac_address == NULL) {
printk(KERN_ERR DRV_NAME
: %s: Error: The slave device you specified does 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 4/7] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()

2007-07-30 Thread Moni Shoua
This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   88 +++-
 drivers/net/bonding/bonding.h   |1 

 2 files changed, 61 insertions(+), 28 deletions(-)
Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-07-29 
16:24:30.913343981 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-29 16:36:53.234602471 
+0300
@@ -1127,6 +1127,14 @@ void bond_change_active_slave(struct bon
if (new_active) {
bond_set_slave_active_flags(new_active);
}
+
+   /* when bonding does not set the slave MAC address, the bond MAC
+* address is the one of the active slave.
+*/
+   if (new_active  !bond-do_set_mac_addr)
+   memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
+   new_active-dev-addr_len);
+
bond_send_gratuitous_arp(bond);
}
 }
@@ -1390,13 +1398,22 @@ int bond_enslave(struct net_device *bond
}
 
if (slave_dev-set_mac_address == NULL) {
-   printk(KERN_ERR DRV_NAME
-   : %s: Error: The slave device you specified does 
-   not support setting the MAC address. 
-   Your kernel likely does not support slave 
-   devices.\n, bond_dev-name);
-   res = -EOPNOTSUPP;
-   goto err_undo_flags;
+   if (bond-slave_cnt == 0) {
+   printk(KERN_WARNING DRV_NAME
+   : %s: Warning: The first slave device you 
+   specified does not support setting the MAC 
+   address. This bond MAC address would be that 
+   of the active slave.\n, bond_dev-name);
+   bond-do_set_mac_addr = 0;
+   } else if (bond-do_set_mac_addr) {
+   printk(KERN_ERR DRV_NAME
+   : %s: Error: The slave device you specified 
+   does not support setting the MAC addres,.
+   but this bond uses this practice. \n
+   , bond_dev-name);
+   res = -EOPNOTSUPP;
+   goto err_undo_flags;
+   }
}
 
new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1417,16 +1434,18 @@ int bond_enslave(struct net_device *bond
 */
memcpy(new_slave-perm_hwaddr, slave_dev-dev_addr, ETH_ALEN);
 
-   /*
-* Set slave to master's mac address.  The application already
-* set the master's mac address to that of the first slave
-*/
-   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
-   addr.sa_family = slave_dev-type;
-   res = dev_set_mac_address(slave_dev, addr);
-   if (res) {
-   dprintk(Error %d calling set_mac_address\n, res);
-   goto err_free;
+   if (bond-do_set_mac_addr) {
+   /*
+* Set slave to master's mac address.  The application already
+* set the master's mac address to that of the first slave
+*/
+   memcpy(addr.sa_data, bond_dev-dev_addr, bond_dev-addr_len);
+   addr.sa_family = slave_dev-type;
+   res = dev_set_mac_address(slave_dev, addr);
+   if (res) {
+   dprintk(Error %d calling set_mac_address\n, res);
+   goto err_free;
+   }
}
 
res = netdev_set_master(slave_dev, bond_dev);
@@ -1651,9 +1670,11 @@ err_close:
dev_close(slave_dev);
 
 err_restore_mac:
-   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family = slave_dev-type;
-   dev_set_mac_address(slave_dev, addr);
+   if (bond-do_set_mac_addr) {
+   memcpy(addr.sa_data, new_slave-perm_hwaddr, ETH_ALEN);
+   addr.sa_family = slave_dev-type;
+   dev_set_mac_address(slave_dev, addr);
+   }
 
 err_free:
kfree(new_slave);
@@ -1831,10 +1852,12 @@ int bond_release(struct net_device *bond
/* close slave before restoring its mac address */
dev_close(slave_dev);
 
-   /* restore original (permanent) mac address */
-   memcpy(addr.sa_data, slave-perm_hwaddr, ETH_ALEN);
-   addr.sa_family

[PATCH V3 5/7] net/bonding: Enable IP multicast for bonding IPoIB devices

2007-07-30 Thread Moni Shoua

Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast 
groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding 
device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another 
ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
Signed-off-by: Or Gerlitz [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |4 ++--
 drivers/net/bonding/bond_sysfs.c |6 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-07-25 
15:04:50.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-25 15:06:17.175820632 
+0300
@@ -1325,8 +1325,8 @@ int bond_enslave(struct net_device *bond
 
/* bond must be initialized by bond_open() before enslaving */
if (!(bond_dev-flags  IFF_UP)) {
-   dprintk(Error, master_dev is not up\n);
-   return -EPERM;
+   printk(KERN_WARNING DRV_NAME
+%s: master_dev is not up in bond_enslave\n, 
bond_dev-name);
}
 
/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-07-25 
14:18:12.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-07-25 15:06:17.176820452 
+0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
/* Quick sanity check -- is the bond interface up? */
if (!(bond-dev-flags  IFF_UP)) {
-   printk(KERN_ERR DRV_NAME
-  : %s: Unable to update slaves because interface is 
down.\n,
+   printk(KERN_WARNING DRV_NAME
+  : %s: doing slave updates when interface is down.\n,
   bond-dev-name);
-   ret = -EPERM;
-   goto out;
}
 
/* Note:  We can't hold bond-lock here, as bond_create grabs it. */

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 0/7] net/bonding: ADD IPoIB support for the bonding driver

2007-07-30 Thread Moni Shoua
This patch series is the third version (see below link to V2) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

The motivation is to enable the bonding driver on its HA mode to work with 
the IP over Infiniband (IPoIB) driver. With these patches I was able to enslave 
IPoIB netdevices and run TCP, UDP, IP (UDP) Multicast and ICMP traffic with 
fail-over and fail-back working fine. The working environment was the net-2.6 
git. 

More over, as IPoIB is also the IB ARP provider for the RDMA CM driver which 
is used by native IB ULPs whose addressing scheme is based on IP (e.g. iSER, 
SDP, Lustre, NFSoRDMA, RDS), bonding support for IPoIB devices **enables** HA 
for these ULPs. This holds as when the ULP is informed by the IB HW on the 
failure of the current IB connection, it just need to reconnect, where the 
bonding device will now issue the IB ARP over the active IPoIB slave. 

This series also includes patches to the IPoIB driver that fix some fix 
some neighboring related issues. 

There are still 2 open issues here: 
1. When bonding enslaves an IPoIB device the bonding neighbor holds a 
reference to a cleanup function in the IPoIB drives. This makes it unsafe to 
unload the IPoIB module if there are bonding neighbors in the air. So, to 
avoid this race one must unload bonding before unloading IPoIB. 

2. Patch No. 7 is a workaround to a problem where gratuitous were not sent 
quite often.
I didn't find something better that fixes this and I would 
appreciate advices and comments regarding it. However, this doesn't seem to me 
as 
an issue related exclusively to IPoIB. 

Links to earlier discussion: 

1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. A discussion in openfabrics regarding changes in the IPoIB that 
enable using it as a slave for bonding.
http://lists.openfabrics.org/pipermail/general/2007-March/034033.html



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 6/7] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device

2007-07-30 Thread Moni Shoua

bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c  |2 +-
 drivers/net/bonding/bond_sysfs.c |   10 --
 drivers/net/bonding/bonding.h|1 +
 3 files changed, 10 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-07-25 
15:06:17.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-25 15:33:25.012883360 
+0300
@@ -1255,7 +1255,7 @@ static int bond_compute_features(struct 
unsigned long features = BOND_INTERSECT_FEATURES;
struct slave *slave;
struct net_device *bond_dev = bond-dev;
-   unsigned short max_hard_header_len = ETH_HLEN;
+   u16 max_hard_header_len = max((u16)ETH_HLEN, bond_dev-hard_header_len);
int i;
 
bond_for_each_slave(bond, slave, i) {
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c   2007-07-25 
15:06:17.0 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c2007-07-25 15:20:10.224527636 
+0300
@@ -260,6 +260,7 @@ static ssize_t bonding_store_slaves(stru
char command[IFNAMSIZ + 1] = { 0, };
char *ifname;
int i, res, found, ret = count;
+   u32 original_mtu;
struct slave *slave;
struct net_device *dev = NULL;
struct bonding *bond = to_bond(d);
@@ -325,6 +326,7 @@ static ssize_t bonding_store_slaves(stru
}
 
/* Set the slave's MTU to match the bond */
+   original_mtu = dev-mtu;
if (dev-mtu != bond-dev-mtu) {
if (dev-change_mtu) {
res = dev-change_mtu(dev,
@@ -339,6 +341,9 @@ static ssize_t bonding_store_slaves(stru
}
rtnl_lock();
res = bond_enslave(bond-dev, dev);
+   bond_for_each_slave(bond, slave, i)
+   if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0)
+   slave-original_mtu=original_mtu;
rtnl_unlock();
if (res) {
ret = res;
@@ -351,6 +356,7 @@ static ssize_t bonding_store_slaves(stru
bond_for_each_slave(bond, slave, i)
if (strnicmp(slave-dev-name, ifname, IFNAMSIZ) == 0) {
dev = slave-dev;
+   original_mtu = slave-original_mtu;
break;
}
if (dev) {
@@ -365,9 +371,9 @@ static ssize_t bonding_store_slaves(stru
}
/* set the slave MTU to the default */
if (dev-change_mtu) {
-   dev-change_mtu(dev, 1500);
+   dev-change_mtu(dev, original_mtu);
} else {
-   dev-mtu = 1500;
+   dev-mtu = original_mtu;
}
}
else {
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-07-25 15:03:32.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-07-25 15:20:10.223527810 
+0300
@@ -156,6 +156,7 @@ struct slave {
s8 link;/* one of BOND_LINK_ */
s8 state;   /* one of BOND_STATE_ */
u32original_flags;
+   u32original_mtu;
u32link_failure_count;
u16speed;
u8 duplex;

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 7/7] net/bonding: Delay sending of gratuitous ARP to avoid failure

2007-07-30 Thread Moni Shoua

Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev-state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_main.c |   25 +
 drivers/net/bonding/bonding.h   |1 +
 2 files changed, 22 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===
--- net-2.6.orig/drivers/net/bonding/bond_main.c2007-07-25 
15:33:25.0 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-26 18:42:59.296296622 
+0300
@@ -1134,8 +1134,13 @@ void bond_change_active_slave(struct bon
if (new_active  !bond-do_set_mac_addr)
memcpy(bond-dev-dev_addr,  new_active-dev-dev_addr,
new_active-dev-addr_len);
-
-   bond_send_gratuitous_arp(bond);
+   if (bond-curr_active_slave 
+   test_bit(__LINK_STATE_LINKWATCH_PENDING, 
bond-curr_active_slave-dev-state)){
+   dprintk(delaying gratuitous arp on 
%s\n,bond-curr_active_slave-dev-name);
+   bond-send_grat_arp=1;
+   }else{
+   bond_send_gratuitous_arp(bond);
+   }
}
 }
 
@@ -2120,6 +2125,15 @@ void bond_mii_monitor(struct net_device 
 * program could monitor the link itself if needed.
 */
 
+   if (bond-send_grat_arp) {
+   if (bond-curr_active_slave  
test_bit(__LINK_STATE_LINKWATCH_PENDING, bond-curr_active_slave-dev-state))
+   dprintk(Needs to send gratuitous arp but not 
yet\n,__FUNCTION__);
+   else {
+   dprintk(sending delayed gratuitous arp on 
ond-curr_active_slave-dev-name\n);
+   bond_send_gratuitous_arp(bond);
+   bond-send_grat_arp=0;
+   }
+   }
read_lock(bond-curr_slave_lock);
oldcurrent = bond-curr_active_slave;
read_unlock(bond-curr_slave_lock);
@@ -2513,6 +2527,7 @@ static void bond_send_gratuitous_arp(str
struct slave *slave = bond-curr_active_slave;
struct vlan_entry *vlan;
struct net_device *vlan_dev;
+   int i;
 
dprintk(bond_send_grat_arp: bond %s slave %s\n, bond-dev-name,
slave ? slave-dev-name : NULL);
@@ -2520,8 +2535,9 @@ static void bond_send_gratuitous_arp(str
return;
 
if (bond-master_ip) {
-   bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
- bond-master_ip, 0);
+   for (i=0;i3;i++)
+   bond_arp_send(slave-dev, ARPOP_REPLY, bond-master_ip,
+ bond-master_ip, 0);
}
 
list_for_each_entry(vlan, bond-vlan_list, vlan_list) {
@@ -4331,6 +4347,7 @@ static int bond_init(struct net_device *
bond-current_arp_slave = NULL;
bond-primary_slave = NULL;
bond-dev = bond_dev;
+   bond-send_grat_arp=0;
INIT_LIST_HEAD(bond-vlan_list);
 
/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===
--- net-2.6.orig/drivers/net/bonding/bonding.h  2007-07-25 15:20:10.0 
+0300
+++ net-2.6/drivers/net/bonding/bonding.h   2007-07-26 18:42:43.652087660 
+0300
@@ -203,6 +203,7 @@ struct bonding {
struct   vlan_group *vlgrp;
struct   packet_type arp_mon_pt;
s8   do_set_mac_addr;
+   int  send_grat_arp;
 };
 
 /**

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html