[ofa-general] [query] openSM routing algorithms
In the latest openSM release, I could see it supports four different algorithms (Min-hop being the default). I want to know in detail how these algorithms work and which one to use when. Can any of you help me by giving references to some documents describing the same? regards, Mahesh ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[ofa-general] ofa_1_3_kernel 20070927-0200 daily build status
This email was generated automatically, please do not reply.

git_url:    git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
i686:    2.6.15-23-server, linux-2.6.12, linux-2.6.13, linux-2.6.14, linux-2.6.15, linux-2.6.16, linux-2.6.17, linux-2.6.18, linux-2.6.19, linux-2.6.21.1, linux-2.6.22
x86_64:  linux-2.6.12, linux-2.6.13, linux-2.6.14, linux-2.6.15, linux-2.6.16, linux-2.6.17, linux-2.6.18, linux-2.6.19, linux-2.6.20, linux-2.6.21.1, linux-2.6.22, linux-2.6.16.21-0.8-smp, linux-2.6.16.43-0.3-smp, linux-2.6.9-22.ELsmp, linux-2.6.9-34.ELsmp, linux-2.6.9-42.ELsmp, linux-2.6.9-55.ELsmp, linux-2.6.18-1.2798.fc6, linux-2.6.18-8.el5
ia64:    linux-2.6.12, linux-2.6.13, linux-2.6.14, linux-2.6.15, linux-2.6.16, linux-2.6.17, linux-2.6.18, linux-2.6.19, linux-2.6.21.1, linux-2.6.22, linux-2.6.16.21-0.8-default
ppc64:   linux-2.6.12, linux-2.6.13, linux-2.6.14, linux-2.6.15, linux-2.6.16, linux-2.6.17, linux-2.6.18, linux-2.6.19, linux-2.6.18-8.el5
powerpc: linux-2.6.12, linux-2.6.13, linux-2.6.14, linux-2.6.15

Failed:
Build failed on powerpc with linux-2.6.19
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936: error: invalid type argument of '->'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939: error: invalid type argument of '->'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:940: error: invalid type argument of '->'
make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.19'
make: *** [kernel] Error 2
--
Build failed on powerpc with linux-2.6.17
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936: error: invalid type argument of '->'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939: error: invalid type argument of '->'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:940: error: invalid type argument of '->'
make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.17'
make: *** [kernel] Error 2
--
Build failed on powerpc with linux-2.6.16
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.16_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936: error: invalid type argument of '->'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.16_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939: error: invalid
[ofa-general] ***SPAM*** uDAPL thread safety
Hi, is the uDAPL provider in OFED 1.2 thread safe? The dat.conf by default has an entry "nonthreadsafe", and the spec says that for some of the routines thread safety depends on the provider. cheers /Dev
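For reference, the thread-safety declaration the question refers to lives in the per-provider line of dat.conf. A sketch of what an OFED 1.2-era entry looks like (the IA name and library path here are illustrative; check your own dat.conf):

```text
# <ia_name> <api_ver> <threadsafety> <default?> <lib_path> <provider_ver> <ia_params> <platform_params>
OpenIB-cma u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib0 0" ""
```

The third field is exactly the threadsafe/nonthreadsafe flag that the DAT spec leaves provider-dependent.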
Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP for userspace multicast IB apps
On 9/26/07, Roland Dreier [EMAIL PROTECTED] wrote:
> > To support this inter-op for the case where the receiving party resides at the IB side, there is a need to handle IGMP (reports/queries), else the local IP router would not forward multicast traffic towards the IB network.
> >
> > This patch does a lookup on the database used for multicast reference counting and enhances IPoIB to ignore a multicast group which is already handled by user space, all this under a per-device policy flag. That is, when the policy flag allows it, IPoIB will not join and attach its QP to a multicast group which has an entry in the database.
>
> I don't really follow this explanation. OK, I see in the first paragraph that you want to handle IGMP. How does the second paragraph follow? Why does IGMP mean the kernel IPoIB interface should avoid joining certain multicast groups? (Which groups?)

The user space app first joins the multicast group through the rdma-cm (by calling rdma_join_multicast etc.) and then lets the kernel IGMP state machine know that it has to join / respond to queries for this group. This is achieved when, second, the app issues a SOL_IP / IP_ADD_MEMBERSHIP setsockopt call. Since this setsockopt has two impacts, A) the IGMP handling and B) a call into IPoIB's set_multicast_list, the patch is there to keep IPoIB from joining / attaching to this group, since the app actually attaches its own UD QP to the group. So my change log comment wasn't detailed enough to make it clear this is the design, sorry.

> > +	/* ignore group which is directly joined by user space */
> > +	if (test_bit(IPOIB_FLAG_ADMIN_UMCAST_ALLOWED, &priv->flags) &&
> > +	    !ib_sa_get_mcmember_rec(priv->ca, priv->port, &mgid, &rec))
>
> I don't follow this. Why does ib_sa_get_mcmember_rec() returning 0 imply that userspace has already joined the multicast group?
Since both the rdma-cm and ipoib are consumers of the core multicast management code (core/multicast.c, which is linked into ib_sa.ko), and the app (through the rdma-cm) --first-- inserts a record into the database and only then issues the setsockopt call, if ipoib has a hit on a group it was told to join, this group must be offloaded by the rdma-cm consumer.

> > +module_param_named(umcast_allowed, ipoib_umcast_allowed, int, 0444);
>
> Not sure I understand why you added the module parameter...

The per-device flag is initialized from the module param value at ipoib_dev_init().

> > +static DEVICE_ATTR(umcast, S_IWUSR | S_IRUGO, show_umcast, set_umcast);
>
> The set_umcast attribute is writable by root anyway, so why are there two ways of setting this?

I am not sure I fully follow your comment. I just wanted to make the sysfs /sys/class/net/$dev/umcast entry writable, and I actually did copy-paste from the set_mode code...

> > +	if (!strcmp(buf, "1\n")) {
>
> I don't think this is the most robust way of parsing things. For example, it will break in a very confusing way if someone uses echo -n. Could you use simple_strtoul() or something like that to handle leading/trailing whitespace etc.?

Sure, I will fix it.

Or.
Re: [ofa-general] [query] openSM routing algorithms
On Thu, 2007-09-27 at 12:36 +0530, Keshetti Mahesh wrote:
> In the latest openSM release, I could see it supports four different algorithms (Min-hop being the default). I want to know in detail how these algorithms work and which one to use when. Can any of you help me by giving references to some documents describing the same?
>
> regards, Mahesh

The descriptions of and references to (papers on) the routing algorithms are in the OpenSM man page.

-- Hal
[ofa-general] Re: [ewg] Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.
Michael, have you pulled this in yet? I want to close out the bug I have open... Thanks, Steve.

Steve Wise wrote:
> Michael S. Tsirkin wrote:
>> Yes, please push this into your git tree (and please verify that cross-build to all OS-es passes).
>
> done! git://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2_c
>
>> Further, please do it this way: add the patch in ofed-1.2.5 and then merge 1.2.5 into 1.3.
>
> done! git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel
>
> Steve.

___ ewg mailing list [EMAIL PROTECTED] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ofa-general] SDP memory allocation policy problem?
Thanks for your help. We'll set up to get this tested under pressure. We'll keep you posted. Regards, KP

On 9/26/07, Jim Mott [EMAIL PROTECTED] wrote:

I have reworked your patch slightly and run my simple unit tests on it. No correctness problems detected in latency or bandwidth paths. No performance regressions either. If your proposed patch worked for you, then this one ought to work too. Could you please give it a go and let me know?

Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c
===
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-09-26 13:27:43.0 -0500
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-09-26 17:52:12.0 -0500
@@ -221,16 +221,26 @@ static void sdp_post_recv(struct sdp_soc
 	skb_frag_t *frag;
 	struct sdp_bsdh *h;
 	int id = ssk->rx_head;
+	unsigned int gfp_page;

 	/* Now, allocate and repost recv */
 	/* TODO: allocate from cache */
-	skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
-				  GFP_KERNEL);
+
+	if (unlikely(ssk->isk.sk.sk_allocation)) {
+		skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
+					  ssk->isk.sk.sk_allocation);
+		gfp_page = ssk->isk.sk.sk_allocation | __GFP_HIGHMEM;
+	} else {
+		skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
+					  GFP_KERNEL);
+		gfp_page = GFP_HIGHUSER;
+	}
+
 	/* FIXME */
 	BUG_ON(!skb);
 	h = (struct sdp_bsdh *)skb->head;
 	for (i = 0; i < ssk->recv_frags; ++i) {
-		page = alloc_pages(GFP_HIGHUSER, 0);
+		page = alloc_pages(gfp_page, 0);
 		BUG_ON(!page);
 		frag = &skb_shinfo(skb)->frags[i];
 		frag->page = page;
@@ -404,6 +414,7 @@ void sdp_post_sends(struct sdp_sock *ssk
 	/* TODO: nonagle? */
 	struct sk_buff *skb;
 	int c;
+	int gfp_page;

 	if (unlikely(!ssk->id)) {
 		if (ssk->isk.sk.sk_send_head) {
@@ -415,6 +426,11 @@ void sdp_post_sends(struct sdp_sock *ssk
 		return;
 	}

+	if (unlikely(ssk->isk.sk.sk_allocation))
+		gfp_page = ssk->isk.sk.sk_allocation;
+	else
+		gfp_page = GFP_KERNEL;
+
 	if (ssk->recv_request &&
 	    ssk->rx_tail >= ssk->recv_request_head &&
 	    ssk->bufs >= SDP_MIN_BUFS
@@ -424,7 +440,7 @@ void sdp_post_sends(struct sdp_sock *ssk
 		skb = sk_stream_alloc_skb(&ssk->isk.sk,
 					  sizeof(struct sdp_bsdh) +
 					  sizeof(*resp_size),
-					  GFP_KERNEL);
+					  gfp_page);
 		/* FIXME */
 		BUG_ON(!skb);
 		resp_size = (struct sdp_chrecvbuf *)skb_put(skb, sizeof *resp_size);
@@ -449,7 +465,7 @@ void sdp_post_sends(struct sdp_sock *ssk
 		skb = sk_stream_alloc_skb(&ssk->isk.sk,
 					  sizeof(struct sdp_bsdh) +
 					  sizeof(*req_size),
-					  GFP_KERNEL);
+					  gfp_page);
 		/* FIXME */
 		BUG_ON(!skb);
 		ssk->sent_request = SDP_MAX_SEND_SKB_FRAGS * PAGE_SIZE;
@@ -480,7 +496,7 @@ void sdp_post_sends(struct sdp_sock *ssk
 	    ssk->bufs) {
 		skb = sk_stream_alloc_skb(&ssk->isk.sk,
 					  sizeof(struct sdp_bsdh),
-					  GFP_KERNEL);
+					  gfp_page);
 		/* FIXME */
 		BUG_ON(!skb);
 		sdp_post_send(ssk, skb, SDP_MID_DISCONN);

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Dauchy
Sent: Tuesday, September 25, 2007 5:50 PM
To: general@lists.openfabrics.org
Subject: Re: [ofa-general] SDP memory allocation policy problem?

Is there anyone among the OFED development team that is looking into this issue? I believe that it is causing nodes to hang at our site. We are running ofed-1.2 and the 2.6.9-55.ELsmp kernel. Workarounds or even untested patches would be appreciated. Thanks! -Nathan

Ken Phillips wrote:
> Greetings, teammates here report the following:
>
> Problem: The method SDP uses to allocate socket buffers may cause the node to hang under memory pressure.
>
> Details: Each kernel level socket has an allocation flag to specify the memory allocation policy for socket buffers; the default is GFP_ATOMIC (or GFP_KERNEL for SDP). If the caller
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.
Sean,

What is the model for how a client connects, say for iSCSI, when client and server both support iWARP and 10GbE or 1GbE, and would like to set up the most performant connection for the ULP?

Thanks,
Arkady

Arkady Kanevsky email: [EMAIL PROTECTED]
Network Appliance Inc. phone: 781-768-5395
1601 Trapelo Rd. - Suite 16. Fax: 781-895-1195
Waltham, MA 02451 central phone: 781-768-5300

-----Original Message-----
From: Sean Hefty [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 27, 2007 2:39 PM
To: Steve Wise
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; general@lists.openfabrics.org; [EMAIL PROTECTED]
Subject: Re: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

> The sysadmin creates for iwarp use only alias interfaces of the form devname:iw* where devname is the native interface name (eg eth0) for the iwarp netdev device. The alias label can be anything starting with iw. The iw immediately after the ':' is the key used by the iw_cxgb3 driver.

I'm still not sure about this, but haven't come up with anything better myself. And if there's a good chance of other rnics needing the same support, I'd rather see the common code separated out, even if just encapsulated within this module for easy re-use.

As for the code, I have a couple of questions about whether deadlock and a race condition are possible, plus a few minor comments.
Re: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.
> The sysadmin creates for iwarp use only alias interfaces of the form devname:iw* where devname is the native interface name (eg eth0) for the iwarp netdev device. The alias label can be anything starting with iw. The iw immediately after the ':' is the key used by the iw_cxgb3 driver.

I'm still not sure about this, but haven't come up with anything better myself. And if there's a good chance of other rnics needing the same support, I'd rather see the common code separated out, even if just encapsulated within this module for easy re-use.

As for the code, I have a couple of questions about whether deadlock and a race condition are possible, plus a few minor comments.

> +static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
> +{
> +	struct iwch_addrlist *addr;
> +
> +	addr = kmalloc(sizeof *addr, GFP_KERNEL);
> +	if (!addr) {
> +		printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
> +		       __FUNCTION__);
> +		return;
> +	}
> +	addr->ifa = ifa;
> +	mutex_lock(&rnicp->mutex);
> +	list_add_tail(&addr->entry, &rnicp->addrlist);
> +	mutex_unlock(&rnicp->mutex);
> +}

Should this return success/failure?

> +static int nb_callback(struct notifier_block *self, unsigned long event,
> +		       void *ctx)
> +{
> +	struct in_ifaddr *ifa = ctx;
> +	struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
> +
> +	PDBG("%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
> +
> +	switch (event) {
> +	case NETDEV_UP:
> +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> +		    is_iwarp_label(ifa->ifa_label)) {
> +			PDBG("%s label %s addr 0x%x added\n",
> +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> +			insert_ifa(rnicp, ifa);
> +			iwch_listeners_add_addr(rnicp, ifa->ifa_address);

If insert_ifa() fails, what will iwch_listeners_add_addr() do? (I'm not easily seeing the relationship between the address list and the listen list at this point.)

> +		}
> +		break;
> +	case NETDEV_DOWN:
> +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> +		    is_iwarp_label(ifa->ifa_label)) {
> +			PDBG("%s label %s addr 0x%x deleted\n",
> +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> +			iwch_listeners_del_addr(rnicp, ifa->ifa_address);
> +			remove_ifa(rnicp, ifa);
> +		}
> +		break;
> +	default:
> +		break;
> +	}
> +	return 0;
> +}
> +
> +static void delete_addrlist(struct iwch_dev *rnicp)
> +{
> +	struct iwch_addrlist *addr, *tmp;
> +
> +	mutex_lock(&rnicp->mutex);
> +	list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
> +		list_del(&addr->entry);
> +		kfree(addr);
> +	}
> +	mutex_unlock(&rnicp->mutex);
> +}
> +
> +static void populate_addrlist(struct iwch_dev *rnicp)
> +{
> +	int i;
> +	struct in_device *indev;
> +
> +	for (i = 0; i < rnicp->rdev.port_info.nports; i++) {
> +		indev = in_dev_get(rnicp->rdev.port_info.lldevs[i]);
> +		if (!indev)
> +			continue;
> +		for_ifa(indev)
> +			if (is_iwarp_label(ifa->ifa_label)) {
> +				PDBG("%s label %s addr 0x%x added\n",
> +				     __FUNCTION__, ifa->ifa_label,
> +				     ifa->ifa_address);
> +				insert_ifa(rnicp, ifa);
> +			}
> +		endfor_ifa(indev);
> +	}
> +}
> +
>  static void rnic_init(struct iwch_dev *rnicp)
>  {
>  	PDBG("%s iwch_dev %p\n", __FUNCTION__, rnicp);
> @@ -70,6 +187,12 @@ static void rnic_init(struct iwch_dev *r
>  	idr_init(&rnicp->qpidr);
>  	idr_init(&rnicp->mmidr);
>  	spin_lock_init(&rnicp->lock);
> +	INIT_LIST_HEAD(&rnicp->addrlist);
> +	INIT_LIST_HEAD(&rnicp->listen_eps);
> +	mutex_init(&rnicp->mutex);
> +	rnicp->nb.notifier_call = nb_callback;
> +	populate_addrlist(rnicp);
> +	register_inetaddr_notifier(&rnicp->nb);
>  	rnicp->attr.vendor_id = 0x168;
>  	rnicp->attr.vendor_part_id = 7;
> @@ -148,6 +271,8 @@ static void close_rnic_dev(struct t3cdev
>  	mutex_lock(&dev_mutex);
>  	list_for_each_entry_safe(dev, tmp, &dev_list, entry) {
>  		if (dev->rdev.t3cdev_p == tdev) {
> +			unregister_inetaddr_notifier(&dev->nb);
> +			delete_addrlist(dev);
>  			list_del(&dev->entry);
>  			iwch_unregister_device(dev);
>  			cxio_rdev_close(&dev->rdev);
> diff --git a/drivers/infiniband/hw/cxgb3/iwch.h b/drivers/infiniband/hw/cxgb3/iwch.h
> index caf4e60..7fa0a47 100644
> --- a/drivers/infiniband/hw/cxgb3/iwch.h
> +++ b/drivers/infiniband/hw/cxgb3/iwch.h
> @@ -36,6 +36,8 @@
>  #include <linux/mutex.h>
>  #include <linux/list.h>
>  #include <linux/spinlock.h>
Re: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.
Sean Hefty wrote:
> > The sysadmin creates for iwarp use only alias interfaces of the form devname:iw* where devname is the native interface name (eg eth0) for the iwarp netdev device. The alias label can be anything starting with iw. The iw immediately after the ':' is the key used by the iw_cxgb3 driver.
>
> I'm still not sure about this, but haven't come up with anything better myself. And if there's a good chance of other rnics needing the same support, I'd rather see the common code separated out, even if just encapsulated within this module for easy re-use.
>
> As for the code, I have a couple of questions about whether deadlock and a race condition are possible, plus a few minor comments.

Thanks for reviewing! Responses are in-line below.

> +static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
> +{
> +	struct iwch_addrlist *addr;
> +
> +	addr = kmalloc(sizeof *addr, GFP_KERNEL);
> +	if (!addr) {
> +		printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
> +		       __FUNCTION__);
> +		return;
> +	}
> +	addr->ifa = ifa;
> +	mutex_lock(&rnicp->mutex);
> +	list_add_tail(&addr->entry, &rnicp->addrlist);
> +	mutex_unlock(&rnicp->mutex);
> +}
>
> Should this return success/failure?

I think so. See below...

> +static int nb_callback(struct notifier_block *self, unsigned long event,
> +		       void *ctx)
> +{
> +	struct in_ifaddr *ifa = ctx;
> +	struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
> +
> +	PDBG("%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
> +
> +	switch (event) {
> +	case NETDEV_UP:
> +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> +		    is_iwarp_label(ifa->ifa_label)) {
> +			PDBG("%s label %s addr 0x%x added\n",
> +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> +			insert_ifa(rnicp, ifa);
> +			iwch_listeners_add_addr(rnicp, ifa->ifa_address);
>
> If insert_ifa() fails, what will iwch_listeners_add_addr() do? (I'm not easily seeing the relationship between the address list and the listen list at this point.)

I guess insert_ifa() needs to return success/failure. Then if we failed to add the ifa to the list we won't update the listeners.

The relationship is this:
- when a listen is done on addr 0.0.0.0, the code walks the list of addresses to do specific listens on each address.
- when an address is added or deleted, the list of current listeners is walked and updated accordingly.

> +		}
> +		break;
> +	case NETDEV_DOWN:
> +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> +		    is_iwarp_label(ifa->ifa_label)) {
> +			PDBG("%s label %s addr 0x%x deleted\n",
> +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> +			iwch_listeners_del_addr(rnicp, ifa->ifa_address);
> +			remove_ifa(rnicp, ifa);
> +		}
> +		break;
> +	default:
> +		break;
> +	}
> +	return 0;
> +}
> +
> +static void delete_addrlist(struct iwch_dev *rnicp)
> +{
> +	struct iwch_addrlist *addr, *tmp;
> +
> +	mutex_lock(&rnicp->mutex);
> +	list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
> +		list_del(&addr->entry);
> +		kfree(addr);
> +	}
> +	mutex_unlock(&rnicp->mutex);
> +}
> +
> +static void populate_addrlist(struct iwch_dev *rnicp)
> +{
> +	int i;
> +	struct in_device *indev;
> +
> +	for (i = 0; i < rnicp->rdev.port_info.nports; i++) {
> +		indev = in_dev_get(rnicp->rdev.port_info.lldevs[i]);
> +		if (!indev)
> +			continue;
> +		for_ifa(indev)
> +			if (is_iwarp_label(ifa->ifa_label)) {
> +				PDBG("%s label %s addr 0x%x added\n",
> +				     __FUNCTION__, ifa->ifa_label,
> +				     ifa->ifa_address);
> +				insert_ifa(rnicp, ifa);
> +			}
> +		endfor_ifa(indev);
> +	}
> +}
> +
>  static void rnic_init(struct iwch_dev *rnicp)
>  {
>  	PDBG("%s iwch_dev %p\n", __FUNCTION__, rnicp);
> @@ -70,6 +187,12 @@ static void rnic_init(struct iwch_dev *r
>  	idr_init(&rnicp->qpidr);
>  	idr_init(&rnicp->mmidr);
>  	spin_lock_init(&rnicp->lock);
> +	INIT_LIST_HEAD(&rnicp->addrlist);
> +	INIT_LIST_HEAD(&rnicp->listen_eps);
> +	mutex_init(&rnicp->mutex);
> +	rnicp->nb.notifier_call = nb_callback;
> +	populate_addrlist(rnicp);
> +	register_inetaddr_notifier(&rnicp->nb);
>  	rnicp->attr.vendor_id = 0x168;
>  	rnicp->attr.vendor_part_id = 7;
> @@ -148,6 +271,8 @@ static void close_rnic_dev(struct t3cdev
>  	mutex_lock(&dev_mutex);
>  	list_for_each_entry_safe(dev, tmp, &dev_list, entry) {
>  		if (dev->rdev.t3cdev_p == tdev) {
> +			unregister_inetaddr_notifier(&dev->nb);
> +			delete_addrlist(dev);
>  			list_del(&dev->entry);
>  			iwch_unregister_device(dev);
>  			cxio_rdev_close(&dev->rdev);
> diff --git a/drivers/infiniband/hw/cxgb3/iwch.h b/drivers/infiniband/hw/cxgb3/iwch.h
> index caf4e60..7fa0a47 100644
> --- a/drivers/infiniband/hw/cxgb3/iwch.h
> +++ b/drivers/infiniband/hw/cxgb3/iwch.h
> @@ -36,6 +36,8 @@
>  #include <linux/mutex.h>
>  #include <linux/list.h>
>  #include
[ofa-general] Problem running SDP apps using OFED 1.2
Hi, I installed the OFED 1.2 stack and am trying to run a simple socket server and client over SDP. The InfiniBand hardware is QLogic.

First I set the env vars:
export LD_PRELOAD=/root/zulfi/iband/INSTALL/lib64/libsdp.so
export LIBSDP_CONFIG_FILE=/home/zulfi/libsdp.conf

The SDP config file has:
use sdp server * *:*
use sdp client * *:*

Then I started the socket server and did a 'sdpnetstat -San', and found that it listed the SDP port on which the server was listening. On the client machine I did the same: exported the variables and set up the SDP config file, but on running the client './client port# server_machine' it gave me a "network not reachable" error. I tried to find some information about the error on the net but could not. I then checked the /proc/pid/maps file and found that libsdp.so was being loaded. Also:

/root lsmod | grep sdp
ib_sdp 120224 3

Does QLogic support SDP applications? Or am I missing something in the SDP config file, or do I need to make changes to my code? Any information on this will be a big help. Thanks, Zulfi
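When debugging a setup like the one above, it can help to narrow libsdp.conf to just the test program and turn on libsdp's logging, along these lines (program name and port are illustrative, and the exact log directive syntax may vary between libsdp versions; compare with the libsdp.conf sample shipped with OFED):

```text
# redirect only the test client/server; everything else stays on TCP
use sdp server client *:5000
use sdp client client *:5000

# log libsdp's match/fallback decisions while debugging
log min-level 7 destination stderr
```

With logging on, libsdp reports whether it attempted an SDP connection and fell back to TCP, which helps distinguish a config problem from a fabric/addressing problem like the "network not reachable" above.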
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.
> It is ok to block while holding a mutex, yes?

It's okay; I just didn't try to trace through the code to see if it ever tries to acquire the same mutex in the thread that needs to signal the event.

- Sean
Re: [ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
On Wed, 2007-09-26 at 14:06 -0500, Jim Mott wrote: This is a two part bug report. One is a conceptual problem that may just be a problem of understanding on my part. The other is what I believe to be a bug in the mlx4 driver. mthca has the same issue.

1) ib_create_qp() fails with max_sge

If you use ib_query_device() to return the device-specific attribute max_sge, it seems reasonable to expect you can create a QP with max_send_sge = max_sge. The problem is that this often fails. The reason is that depending on the QP type (RC, UD, etc.) and how the QP will be used (send, RDMA, atomic, etc.), there can be extra segments required in the WQE that eat up SGE entries. So while some send WQEs might have max_sge available SGEs, many will not. Normally the difference between max_sge and the actual maximum value allowed (and checked) for max_send_sge is 1 or 2. This issue may need API extensions to definitively resolve. In the short term, it would be very nice if the max_sge reported by ib_query_device() could always return a value that ib_create_qp() could use. Think of it as the minimum max_send_sge value that will work for all QP types.

2) mlx4 setting of max send SGEs

The recent patch to support shrinking WQEs introduces a behavior that creates a big difference between the mlx4-supported send SGEs (checked against 61, should be 59 or 60, and reported by ib_query_device as 32 to equal the receive-side max_rq_sg value). The patch that follows will allow an mlx4 device to support the number of send SGEs returned by ib_query_device, and in fact quite a few more. It probably breaks shrinking WQEs and thus should not be applied directly. Note that if ib_query_device() returned max_sge adjusted for the raddr and atomic segments, this fix would not be needed. mlx4 would still support more SGEs in hardware than can be used through the API, but that is a different problem.
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c	2007-09-26 13:27:47.0 -0500
+++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c	2007-09-26 13:36:40.0 -0500
@@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
 	qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));

 	for (;;) {
-		if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
+		if (s > dev->dev->caps.max_sq_desc_sz)
 			return -EINVAL;

 		qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);

___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[ofa-general] Re: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
Quoting Jim Mott [EMAIL PROTECTED]: Subject: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

[...]
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:27:47.0 -0500
+++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:36:40.0 -0500
@@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
 	qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
 
 	for (;;) {
-		if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
+		if (s > dev->dev->caps.max_sq_desc_sz)
 			return -EINVAL;
 
 		qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);

Good idea, but that patch needs more work: the max_send_sge returned to the user should be made smaller to avoid corrupting the WQE.

-- MST
[ofa-general] Re: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
BTW, I hate the "shrinking WQE" terminology for this (although obviously you weren't the one to introduce it). We are making WQEs smaller, so "shrinking", and that's how the hardware guys seem to call the feature. But it doesn't really matter: the only place the word is used is in the commit log.

-- MST
[ofa-general] Re: [PATCH 11/11]: mlx4_core use fixed CQ moderation parameters
Quoting Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH 11/11]: mlx4_core use fixed CQ moderation parameters

+static int cq_max_count = 16;
+static int cq_period = 10;
+
+module_param(cq_max_count, int, 0444);
+MODULE_PARM_DESC(cq_max_count, "number of CQEs to generate event");
+module_param(cq_period, int, 0444);
+MODULE_PARM_DESC(cq_period, "time in usec for CQ event generation");

I assume this is just a leftover from some earlier approach? These module parameters are just ignored now, so the patch seems kind of pointless.

These should go into the create CQ inbox. I'll recheck.

Anyway I think the approach of having one global setting for all CQs is not a good one -- it seems likely that, for example, IPoIB and SDP would want different settings, not to mention userspace applications.

I agree. But what should be the default setting? Consider also that there's currently no userspace API to control event coalescing. So a global setting to control the defaults might still make sense. No?

-- MST
Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support
Quoting Roland Dreier [EMAIL PROTECTED]: Subject: Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support

+	struct page *page = alloc_page(GFP_ATOMIC | GFP_HIGHUSER);

actually:

+	struct page *page = alloc_page(GFP_ATOMIC | __GFP_HIGHMEM);

Isn't this likely to hurt performance on 32 bit systems?

-- MST
Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP for userspace multicast IB apps
Since both the rdma-cm and ipoib are consumers of the core multicast management code (core/multicast.c, which is linked into ib_sa.ko), and the app (through the rdma-cm) --first-- inserts a record into the database and only then issues the setsockopt call, if ipoib has a hit on a group it was told to join, this group must be offloaded by the rdma-cm consumer.

I'm not sure I understand why that follows. Couldn't there be some other kernel or userspace entity that caused the record to be added?

The per-device flag is initialized by the module param value at ipoib_dev_init().

I still don't really get why there's a module parameter to set the initial value of a flag that only root can change anyway. Why not just set the flag through sysfs after loading ipoib, rather than having a module parameter to do the same thing?

 - R.
Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support
+	struct page *page = alloc_page(GFP_ATOMIC | __GFP_HIGHMEM);

Isn't this likely to hurt performance on 32 bit systems?

Yeah, I guess the kernel would need to kmap the data in most cases anyway. So there's not much point in trying to use high memory.

 - R.
RE: [ofa-general] Problem running SDP apps using OFED 1.2
Were you able to connect IPoIB between the nodes? Are you sure opensm was running? I am ashamed to admit that occasionally I forget to start opensm and wonder why SDP does not connect.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Zulfi Imani
Sent: Thursday, September 27, 2007 3:22 PM
To: general@lists.openfabrics.org
Subject: [ofa-general] Problem running SDP apps using OFED 1.2

Hi, I installed the OFED 1.2 stack and am trying to run a simple socket server and client over the SDP stack. The Infiniband hardware is QLogic. First I set the environment variables:

export LD_PRELOAD=/root/zulfi/iband/INSTALL/lib64/libsdp.so
export LIBSDP_CONFIG_FILE=/home/zulfi/libsdp.conf

The SDP config file has:

use sdp server * *:*
use sdp client * *:*

Then I started the socket server and did an 'sdpnetstat -San' and found that it listed the SDP port on which the server was listening. On the client machine too I did the same: exported the variables, set up the SDP config file, and on running the client './client port# server_machine' it gave me a "network not reachable" error. I tried to get some information about the error on the net but could not find any. I then checked the /proc/pid/maps file and found that libsdp.so was being loaded. Also:

/root lsmod | grep sdp
ib_sdp 120224 3

Does QLogic support SDP applications? Or am I missing something in the SDP config file, or do I need to make changes to my code? Any information on this will be a big help.

Thanks, Zulfi
[ofa-general] Re: send max_sge lower than reported by ib_query_device
The same bug exists with mthca. I saw it originally in the kernel doing RDS work, but I just put together a short user space test.

Thanks. The patch below seems to fix this for me. I guess I'll queue this for 2.6.24. I'm also including the test program I wrote to verify this; mlx4 and mthca seem OK on my system now.

diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 60de6f9..0c22cf0 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -45,6 +45,7 @@
 #include "mthca_cmd.h"
 #include "mthca_profile.h"
 #include "mthca_memfree.h"
+#include "mthca_wqe.h"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("Mellanox InfiniBand HCA low-level driver");
@@ -205,7 +206,20 @@ static int mthca_dev_lim(struct mthca_dev *mdev, struct mthca_dev_lim *dev_lim)
 	mdev->limits.gid_table_len      = dev_lim->max_gids;
 	mdev->limits.pkey_table_len     = dev_lim->max_pkeys;
 	mdev->limits.local_ca_ack_delay = dev_lim->local_ca_ack_delay;
-	mdev->limits.max_sg             = dev_lim->max_sg;
+	/*
+	 * Reduce max_sg to a value so that all possible send requests
+	 * will fit into max_desc_sz; send requests will need a next
+	 * segment plus possibly another extra segment, and the UD
+	 * segment is the biggest extra segment.
+	 */
+	mdev->limits.max_sg =
+		min_t(int, dev_lim->max_sg,
+		      (dev_lim->max_desc_sz -
+		       (sizeof (struct mthca_next_seg) +
+			(mthca_is_memfree(mdev) ?
+			 sizeof (struct mthca_arbel_ud_seg) :
+			 sizeof (struct mthca_tavor_ud_seg)))) /
+		      sizeof (struct mthca_data_seg));
 	mdev->limits.max_wqes           = dev_lim->max_qp_sz;
 	mdev->limits.max_qp_init_rdma   = dev_lim->max_requester_per_qp;
 	mdev->limits.reserved_qps       = dev_lim->reserved_qps;

Here's the test program:

#include <stdio.h>
#include <string.h>

#include <infiniband/verbs.h>

int main(int argc, char *argv[])
{
	struct ibv_device      **dev_list;
	struct ibv_device_attr   dev_attr;
	struct ibv_context      *context;
	struct ibv_pd           *pd;
	struct ibv_cq           *cq;
	struct ibv_qp_init_attr  qp_attr;
	int t;

	static const struct {
		enum ibv_qp_type  type;
		char		 *name;
	} type_tab[] = {
		{ IBV_QPT_RC, "RC" },
		{ IBV_QPT_UC, "UC" },
		{ IBV_QPT_UD, "UD" },
	};

	dev_list = ibv_get_device_list(NULL);
	if (!dev_list) {
		printf("No IB devices found\n");
		return 1;
	}

	for (; *dev_list; ++dev_list) {
		printf("%s:\n", ibv_get_device_name(*dev_list));

		context = ibv_open_device(*dev_list);
		if (!context) {
			printf("  ibv_open_device failed\n");
			continue;
		}

		if (ibv_query_device(context, &dev_attr)) {
			printf("  ibv_query_device failed\n");
			continue;
		}

		cq = ibv_create_cq(context, 1, NULL, NULL, 0);
		if (!cq) {
			printf("  ibv_create_cq failed\n");
			continue;
		}

		pd = ibv_alloc_pd(context);
		if (!pd) {
			printf("  ibv_alloc_pd failed\n");
			continue;
		}

		for (t = 0; t < sizeof type_tab / sizeof type_tab[0]; ++t) {
			memset(&qp_attr, 0, sizeof qp_attr);

			qp_attr.send_cq		 = cq;
			qp_attr.recv_cq		 = cq;
			qp_attr.cap.max_send_wr	 = 1;
			qp_attr.cap.max_recv_wr	 = 1;
			qp_attr.cap.max_send_sge = dev_attr.max_sge;
			qp_attr.cap.max_recv_sge = dev_attr.max_sge;
			qp_attr.qp_type		 = type_tab[t].type;

			printf("  %s: SGE %d ", type_tab[t].name, dev_attr.max_sge);
			if (ibv_create_qp(pd, &qp_attr))
				printf("ok\n");
			else
				printf("FAILED\n");
		}
	}

	return 0;
}
[ofa-general] Save the date: OFA Developer's Summit: November 15-16 in Nevada
We hope you will plan on attending the OpenFabrics Developer's Summit being held November 15-16, 2007 at the Boomtown Hotel in Verdi, Nevada. It will begin at 1pm on Thursday, November 15th and run until the early evening. Friday's session will begin at 8am and end at noon. Last year, this turned out to be a good forum to work through issues that required collaboration. If you have items that ought to be on the agenda, please email them to me. We will have a proposed agenda shortly. This event takes place at the tail end of SC07. The Boomtown hotel is about a twenty minute drive from the Reno-Sparks convention center where SC07 is being held. Rooms are available if needed at the Boomtown hotel starting at $70/night. Thanks for your participation. Johann
[ofa-general] Re: send max_sge lower than reported by ib_query_device
Quoting Roland Dreier [EMAIL PROTECTED]: Subject: Re: send max_sge lower than reported by ib_query_device

The same bug exists with mthca. I saw it originally in the kernel doing RDS work, but I just put together a short user space test.

Thanks. The patch below seems to fix this for me. I guess I'll queue this for 2.6.24.

I'm not sure this is a good approach: the fact that the user attempts to use the max value from query device indicates that he really wants to get as large a value as possible. So lowering this value in query means we are wasting performance for such an app.

-- MST
[ofa-general] Re: send max_sge lower than reported by ib_query_device
Michael> I'm not sure this is a good approach: the fact that user
Michael> attempts to use the max value from query device indicates
Michael> that he really wants to get as large a value as
Michael> possible. So lowering this value in query means we are
Michael> wasting performance for such an app.

Right now we report a value of 30 and then give an error if the consumer tries to use that value to actually create a QP. That's a clear bug to me. How do you suggest we resolve this bug?

 - R.
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.
I'm sure I had seen a previous email in this thread that suggested using a userspace library to open a socket in the shared port space. It seems that suggestion was dropped without reason. Does anyone know why?

Yes, because it doesn't handle in-kernel uses (e.g. NFS/RDMA, iSER, etc).

The kernel apps could open a Linux TCP socket and create an RDMA socket connection. Both calls are standard Linux kernel architected routines.

This approach was NAK'd by David Miller and others...

Doesn't NFSoRDMA already open a TCP socket and another for RDMA traffic (ports 2049 and 2050, if I remember correctly)?

The NFS RDMA transport driver does not open a socket for the RDMA connection. It uses a different port in order to allow both TCP and RDMA mounts to the same filer.

I currently don't know if iSER, RDS, etc. already do the same thing, but if they don't, they probably could very easily.

Woe be to those who do so... Does the NetEffect NIC have the same issue as cxgb3 here? What are your thoughts on how to handle this?

Yes, the NetEffect RNIC will have the same issue as Chelsio, and all future RNICs which support a unified TCP address with Linux will as well. Steve has put a lot of thought and energy into the problem, but I don't think users/admins will be very happy with us in the long run.

Agreed. In summary, short of having the rdma_cm share kernel port space, I'd like to see the equivalent in userspace and have the kernel apps handle the issue in a similar way as described above. There are a few technical issues to work through (like passing the userspace IP address to the kernel),

This just moves the socket creation to code that is outside the purview of the kernel maintainers. The exchanging of the 4-tuple created with the kernel module, however, is back in the kernel and in the maintainer's control and responsibility. In my view, anything like this will be viewed as an attempt to sneak code into the kernel that the maintainer has already vehemently rejected. This will make people angry and damage the cooperative working relationship that we are trying to build.

but I think we can solve that just like other information that gets passed from user space into the IB/RDMA kernel modules.

Sharing the IP 4-tuple space cooperatively with the core in any fashion has been NAK'd. Without this cooperation, the options we've been able to come up with are administrative/policy-based approaches. Any ideas you have along these lines are welcome.

I am aware of the pending NAKs and certainly don't want to sneak anything by anyone. Since we all agree that users/admins won't like the current approach, I'm trying to come up with alternatives. Arkady has raised some good points regarding iSCSI, and I would hope a similar solution could be used for iWARP.

Glenn.

Tom

Glenn.

- R.
Re: [ofa-general] Re: send max_sge lower than reported by ib_query_device
I like the idea of this call returning a value that's usable for any QP, with Jim's idea of providing a new call returning maximum attributes based on QP attributes.

OK, so fixing ib_query_device() for mthca to report a value usable for all QPs (as my patch does) is a step in this direction.

 - R.
RE: [ofa-general] Re: send max_sge lower than reported by ib_query_device
Right now we report a value of 30 and then give an error if the consumer tries to use that value to actually create a QP. That's a clear bug to me. How do you suggest we resolve this bug?

I like the idea of this call returning a value that's usable for any QP, with Jim's idea of providing a new call returning maximum attributes based on QP attributes.

- Sean
[ofa-general] nightly osm_sim report 2007-09-28: normal completion
OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2007-09-27 OpenSM git rev = Tue_Sep_25_00:30:00_2007 [2c547953885809a8026e20af7809be08b42c3865] ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb] Total=520 Pass=520 Fail=0 Pass: 39 Stability IS1-16.topo 39 Pkey IS1-16.topo 39 OsmTest IS1-16.topo 39 OsmStress IS1-16.topo 39 Multicast IS1-16.topo 39 LidMgr IS1-16.topo 13 Stability IS3-loop.topo 13 Stability IS3-128.topo 13 Pkey IS3-128.topo 13 OsmTest IS3-loop.topo 13 OsmTest IS3-128.topo 13 OsmStress IS3-128.topo 13 Multicast IS3-loop.topo 13 Multicast IS3-128.topo 13 LidMgr IS3-128.topo 13 FatTree merge-roots-4-ary-2-tree.topo 13 FatTree merge-root-4-ary-3-tree.topo 13 FatTree gnu-stallion-64.topo 13 FatTree blend-4-ary-2-tree.topo 13 FatTree RhinoDDR.topo 13 FatTree FullGnu.topo 13 FatTree 4-ary-2-tree.topo 13 FatTree 2-ary-4-tree.topo 13 FatTree 12-node-spaced.topo 13 FTreeFail 4-ary-2-tree-missing-sw-link.topo 13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures: