[ofa-general] [query] openSM routing algorithms

2007-09-27 Thread Keshetti Mahesh
In the latest openSM release, I could see it supports four different
algorithms (Min-hop algorithm being the default). I want to know in detail
how these algorithms work and which one to use when. Can any of
you help me by giving references to some documents describing the same?

regards,
Mahesh
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[ofa-general] ofa_1_3_kernel 20070927-0200 daily build status

2007-09-27 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod 
--with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod 
--with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod 
--with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.16
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-22.ELsmp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.9-34.ELsmp

Failed:
Build failed on powerpc with linux-2.6.19
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936:
 error: invalid type argument of '-'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939:
 error: invalid type argument of '-'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:940:
 error: invalid type argument of '-'
make[4]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.o]
 Error 1
make[3]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband/hw/ehca]
 Error 2
make[2]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check/drivers/infiniband]
 Error 2
make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.19_powerpc_check]
 Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.19'
make: *** [kernel] Error 2
--
Build failed on powerpc with linux-2.6.17
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936:
 error: invalid type argument of '-'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939:
 error: invalid type argument of '-'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:940:
 error: invalid type argument of '-'
make[4]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.o]
 Error 1
make[3]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband/hw/ehca]
 Error 2
make[2]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check/drivers/infiniband]
 Error 2
make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.17_powerpc_check]
 Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.17'
make: *** [kernel] Error 2
--
Build failed on powerpc with linux-2.6.16
Log:
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.16_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:936:
 error: invalid type argument of '-'
/home/vlad/tmp/ofa_1_3_kernel-20070927-0200_linux-2.6.16_powerpc_check/drivers/infiniband/hw/ehca/ehca_main.c:939:
 error: invalid

[ofa-general] ***SPAM*** uDAPL thread safety

2007-09-27 Thread Dev
Hi,
Is the uDAPL provider in OFED 1.2 thread safe? The dat.conf by default has an
entry nonthreadsafe, and the spec says that for some of the routines thread
safety depends on the provider.

cheers

/Dev


   

Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP for userpsace multicast IB apps

2007-09-27 Thread Or Gerlitz
On 9/26/07, Roland Dreier [EMAIL PROTECTED] wrote:

  To support this inter-op for the case where the receiving party resides at
  the IB side, there is a need to handle IGMP (reports/queries) else the local
  IP router would not forward multicast traffic towards the IB network.

  This patch does a lookup on the database used for multicast reference counting and
  enhances IPoIB to ignore multicast group which is already handled by user space, all
  this under a per device policy flag. That is when the policy flag allows it, IPoIB
  will not join and attach its QP to a multicast group which has an entry on the database.

 I don't really follow this explanation.  OK, I see in the first
 paragraph that you want to handle IGMP.  How does the second paragraph
 follow?  Why does IGMP mean the kernel IPoIB interface should avoid
 joining certain multicast groups?  (Which groups?)


The user space app first joins the multicast group through the rdma-cm
(by calling rdma_join_multicast etc.) and then lets the kernel IGMP state
machine know that it has to join / respond to queries for this group.

This can be achieved if, second, the app issues a SOL_IP /
IP_ADD_MEMBERSHIP setsockopt call. Since this setsockopt has two impacts, A)
IGMP etc. and B) IPoIB set_multicast_list is called, the patch prevents
IPoIB from joining / attaching to this group, since the app actually
attaches its own UD QP to the group.
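
As a rough illustration of the second step described above (the socket-level membership that makes the kernel run IGMP for the group), here is a minimal user-space sketch. It assumes a plain UDP socket; the helper names build_mreq and announce_membership are made up for this example, and the rdma_join_multicast call that has to come first is not shown. On Linux, SOL_IP and IPPROTO_IP have the same value.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill an ip_mreq for `group` (dotted quad).
 * `ifaddr` is the local interface address, or NULL to let the kernel choose.
 * Returns 0 on success, -1 if an address is malformed or `group` is not
 * an IPv4 multicast address.
 */
static int build_mreq(const char *group, const char *ifaddr,
                      struct ip_mreq *mreq)
{
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
        return -1;
    if (ifaddr == NULL) {
        mreq->imr_interface.s_addr = htonl(INADDR_ANY);
        return 0;
    }
    return inet_pton(AF_INET, ifaddr, &mreq->imr_interface) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: register the membership with the IP stack so that
 * IGMP reports/queries are handled for the group. Returns the setsockopt
 * result (0 on success, -1 with errno set on failure).
 */
static int announce_membership(int fd, const struct ip_mreq *mreq)
{
    return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                      mreq, sizeof(*mreq));
}
```

An application following the flow in this mail would call something like build_mreq() and announce_membership() on a UDP socket only after its rdma-cm join has completed.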

So my change log comment wasn't detailed enough to make it clear this is the
design, sorry.


  +/* ignore group which is directly joined by user space */
  +if (test_bit(IPOIB_FLAG_ADMIN_UMCAST_ALLOWED, &priv->flags) &&
  +    !ib_sa_get_mcmember_rec(priv->ca, priv->port, &mgid, &rec))

 I don't follow this.  Why does ib_sa_get_mcmember_rec() returning 0
 imply that userspace has already joined the multicast group?


Since both the rdma-cm and ipoib are consumers of the core multicast
management code (core/multicast.c which is linked into ib_sa.ko), and the
app (through the rdma-cm) --first--  inserts a record into the database and
only then issues the setsockopt call, if ipoib has a hit on a group it was
told to join, this group must be offloaded by the rdma-cm consumer.


  +module_param_named(umcast_allowed, ipoib_umcast_allowed, int, 0444);

 Not sure I understand why you added the module parameter...


The per device flag is initialized by the module param value at
ipoib_dev_init()

 +static DEVICE_ATTR(umcast, S_IWUSR | S_IRUGO, show_umcast, set_umcast);

 The set_umcast attribute is writable by root anyway so why are there
 two ways of setting this?


I am not sure I fully follow your comment. I just wanted to make the sysfs
/sys/class/net/$dev/umcast entry writable, and I actually did copy-paste from
the set_mode code...

 +if (!strcmp(buf, "1\n")) {

 I don't think this is the most robust way of parsing things.  For
 example it will break in a very confusing way if someone uses echo -n.
 Could you use simple_strtoul() or something like that to handle
 leading/trailing whitespace etc?


sure, I will fix it.
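
To make the parsing concern concrete, here is a small user-space sketch of a more tolerant parser. It uses the standard C strtoul() as a stand-in for the kernel's simple_strtoul(); the function name parse_flag and its return convention are assumptions for this example, not code from the patch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a sysfs-style boolean write.
 * Accepts "0" or "1" with optional surrounding whitespace, so both
 * `echo 1` ("1\n") and `echo -n 1` ("1") work.
 * Returns 0 or 1, or -1 on malformed input.
 */
static int parse_flag(const char *buf)
{
    char *end;
    unsigned long val;

    errno = 0;
    val = strtoul(buf, &end, 0);    /* skips leading whitespace itself */
    if (end == buf || errno != 0)
        return -1;                  /* no digits, or out of range */
    while (isspace((unsigned char)*end))
        end++;                      /* tolerate a trailing newline/blanks */
    if (*end != '\0' || val > 1)
        return -1;                  /* trailing junk or not a boolean */
    return (int)val;
}
```

With this shape, a write of "1" without a newline is treated the same as "1\n", which is the case the strcmp() version mishandles.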

Or.

Re: [ofa-general] [query] openSM routing algorithms

2007-09-27 Thread Hal Rosenstock
On Thu, 2007-09-27 at 12:36 +0530, Keshetti Mahesh wrote:
 In the latest openSM release, I could see it supports four different
 algorithms (Min-hop algorithm being the default). I want to know in detail
 how these algorithms work and which one to use when. Can any of
 you help me by giving references to some documents describing the same?

The descriptions of and references to (papers on) the routing algorithms
are in the OpenSM man page.

-- Hal

 
 regards,
 Mahesh


[ofa-general] Re: [ewg] Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.

2007-09-27 Thread Steve Wise

Michael,

Have you pulled this in yet?  I want to close out the bug I have open...

Thanks,

Steve.


Steve Wise wrote:



Michael S. Tsirkin wrote:

Yes, please push this into your git tree (and please verify that
cross-build to all OS-es passes).



done!

git://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2_c


Further, please do it this way: add the patch in ofed-1.2.5
and then merge 1.2.5 into 1.3.



done!

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel


Steve.
___
ewg mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg




Re: [ofa-general] SDP memory allocation policy problem?

2007-09-27 Thread Ken Phillips
Thanks for your help.

We'll setup to get this tested under pressure. We'll keep you posted.

Regards
KP


On 9/26/07, Jim Mott [EMAIL PROTECTED] wrote:
 I have reworked your patch slightly and run my simple unit tests on it.  No 
 correctness problems detected in latency or bandwidth
 paths.  No performance regressions either.

 If your proposed patch worked for you, then this one ought to work too.  
 Could you please give it a go and let me know?

 Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c
 ===
 --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_bcopy.c  2007-09-26 13:27:43.0 -0500
 +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c   2007-09-26 17:52:12.0 -0500
 @@ -221,16 +221,26 @@ static void sdp_post_recv(struct sdp_soc
      skb_frag_t *frag;
      struct sdp_bsdh *h;
      int id = ssk->rx_head;
 +    unsigned int gfp_page;

      /* Now, allocate and repost recv */
      /* TODO: allocate from cache */
 -    skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
 -                              GFP_KERNEL);
 +
 +    if (unlikely(ssk->isk.sk.sk_allocation)) {
 +        skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
 +                                  ssk->isk.sk.sk_allocation);
 +        gfp_page = ssk->isk.sk.sk_allocation | __GFP_HIGHMEM;
 +    } else {
 +        skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
 +                                  GFP_KERNEL);
 +        gfp_page = GFP_HIGHUSER;
 +    }
 +
      /* FIXME */
      BUG_ON(!skb);
      h = (struct sdp_bsdh *)skb->head;
      for (i = 0; i < ssk->recv_frags; ++i) {
 -        page = alloc_pages(GFP_HIGHUSER, 0);
 +        page = alloc_pages(gfp_page, 0);
          BUG_ON(!page);
          frag = &skb_shinfo(skb)->frags[i];
          frag->page = page;
 @@ -404,6 +414,7 @@ void sdp_post_sends(struct sdp_sock *ssk
      /* TODO: nonagle? */
      struct sk_buff *skb;
      int c;
 +    int gfp_page;

      if (unlikely(!ssk->id)) {
          if (ssk->isk.sk.sk_send_head) {
 @@ -415,6 +426,11 @@ void sdp_post_sends(struct sdp_sock *ssk
          return;
      }

 +    if (unlikely(ssk->isk.sk.sk_allocation))
 +        gfp_page = ssk->isk.sk.sk_allocation;
 +    else
 +        gfp_page = GFP_KERNEL;
 +
      if (ssk->recv_request &&
          ssk->rx_tail >= ssk->recv_request_head &&
          ssk->bufs >= SDP_MIN_BUFS &&
 @@ -424,7 +440,7 @@ void sdp_post_sends(struct sdp_sock *ssk
          skb = sk_stream_alloc_skb(&ssk->isk.sk,
                                    sizeof(struct sdp_bsdh) +
                                    sizeof(*resp_size),
 -                                  GFP_KERNEL);
 +                                  gfp_page);
          /* FIXME */
          BUG_ON(!skb);
          resp_size = (struct sdp_chrecvbuf *)skb_put(skb, sizeof *resp_size);
 @@ -449,7 +465,7 @@ void sdp_post_sends(struct sdp_sock *ssk
          skb = sk_stream_alloc_skb(&ssk->isk.sk,
                                    sizeof(struct sdp_bsdh) +
                                    sizeof(*req_size),
 -                                  GFP_KERNEL);
 +                                  gfp_page);
          /* FIXME */
          BUG_ON(!skb);
          ssk->sent_request = SDP_MAX_SEND_SKB_FRAGS * PAGE_SIZE;
 @@ -480,7 +496,7 @@ void sdp_post_sends(struct sdp_sock *ssk
          ssk->bufs) {
          skb = sk_stream_alloc_skb(&ssk->isk.sk,
                                    sizeof(struct sdp_bsdh),
 -                                  GFP_KERNEL);
 +                                  gfp_page);
          /* FIXME */
          BUG_ON(!skb);
          sdp_post_send(ssk, skb, SDP_MID_DISCONN);

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nathan Dauchy
 Sent: Tuesday, September 25, 2007 5:50 PM
 To: general@lists.openfabrics.org
 Subject: Re: [ofa-general] SDP memory allocation policy problem?

 Is there anyone among the OFED development team that is looking into
 this issue?  I believe that it is causing nodes to hang at our site.  We
 are running ofed-1.2 and the 2.6.9-55.ELsmp kernel.

 Workarounds or even untested patches would be appreciated.

 Thanks!

 -Nathan


 Ken Phillips wrote:
  Greetings,
 
  Teammates here report the following:
 
  Problem
 
  The method SDP uses to allocate socket buffers may cause the
  node to hang under memory pressure.
 
  Details
 
  Each kernel level socket has an allocation flag to specify the
  memory allocation policy for socket buffers, the default is GFP_ATOMIC
  (or GFP_KERNEL for SDP).  If the caller 

RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

2007-09-27 Thread Kanevsky, Arkady
Sean,
What is the model for how a client connects, say for iSCSI,
when client and server both support iWARP and 10GbE or 1GbE,
and would like to set up the most performant connection for the ULP?
Thanks,

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 


Re: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

2007-09-27 Thread Sean Hefty

The sysadmin creates for iwarp use only alias interfaces of the form
devname:iw* where devname is the native interface name (eg eth0) for the
iwarp netdev device.  The alias label can be anything starting with iw.
The iw immediately after the ':' is the key used by the iw_cxgb3 driver.
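
The labeling rule above can be sketched in a few lines of user-space C. The driver's real is_iwarp_label() is not shown in this thread, so the version below is only a hypothetical stand-in that follows the description: find the ':' in the interface label and check that the alias part begins with "iw".

```c
#include <string.h>

/*
 * Hypothetical stand-in for the driver's is_iwarp_label():
 * returns 1 when the alias part of the label (the text after the
 * first ':') starts with "iw", and 0 otherwise.
 */
static int is_iwarp_label(const char *label)
{
    const char *colon = strchr(label, ':');

    if (colon == NULL)
        return 0;               /* native interface, no alias part */
    return strncmp(colon + 1, "iw", 2) == 0;
}
```

Under this reading, eth0:iw0 and eth0:iwarp would be treated as iwarp-only aliases, while eth0 and eth0:0 would not.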


I'm still not sure about this, but haven't come up with anything better 
myself.  And if there's a good chance of other rnic's needing the same 
support, I'd rather see the common code separated out, even if just 
encapsulated within this module for easy re-use.


As for the code, I have a couple of questions about whether deadlock and 
a race condition are possible, plus a few minor comments.



+static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
+{
+    struct iwch_addrlist *addr;
+
+    addr = kmalloc(sizeof *addr, GFP_KERNEL);
+    if (!addr) {
+        printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
+               __FUNCTION__);
+        return;
+    }
+    addr->ifa = ifa;
+    mutex_lock(&rnicp->mutex);
+    list_add_tail(&addr->entry, &rnicp->addrlist);
+    mutex_unlock(&rnicp->mutex);
+}


Should this return success/failure?


+static int nb_callback(struct notifier_block *self, unsigned long event,
+                       void *ctx)
+{
+    struct in_ifaddr *ifa = ctx;
+    struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
+
+    PDBG("%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
+
+    switch (event) {
+    case NETDEV_UP:
+        if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
+            is_iwarp_label(ifa->ifa_label)) {
+            PDBG("%s label %s addr 0x%x added\n",
+                 __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
+            insert_ifa(rnicp, ifa);
+            iwch_listeners_add_addr(rnicp, ifa->ifa_address);


If insert_ifa() fails, what will iwch_listeners_add_addr() do?  (I'm not 
easily seeing the relationship between the address list and the listen 
list at this point.)



+        }
+        break;
+    case NETDEV_DOWN:
+        if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
+            is_iwarp_label(ifa->ifa_label)) {
+            PDBG("%s label %s addr 0x%x deleted\n",
+                 __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
+            iwch_listeners_del_addr(rnicp, ifa->ifa_address);
+            remove_ifa(rnicp, ifa);
+        }
+        break;
+    default:
+        break;
+    }
+    return 0;
+}
+
+static void delete_addrlist(struct iwch_dev *rnicp)
+{
+    struct iwch_addrlist *addr, *tmp;
+
+    mutex_lock(&rnicp->mutex);
+    list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
+        list_del(&addr->entry);
+        kfree(addr);
+    }
+    mutex_unlock(&rnicp->mutex);
+}
+
+static void populate_addrlist(struct iwch_dev *rnicp)
+{
+    int i;
+    struct in_device *indev;
+
+    for (i = 0; i < rnicp->rdev.port_info.nports; i++) {
+        indev = in_dev_get(rnicp->rdev.port_info.lldevs[i]);
+        if (!indev)
+            continue;
+        for_ifa(indev)
+            if (is_iwarp_label(ifa->ifa_label)) {
+                PDBG("%s label %s addr 0x%x added\n",
+                     __FUNCTION__, ifa->ifa_label,
+                     ifa->ifa_address);
+                insert_ifa(rnicp, ifa);
+            }
+        endfor_ifa(indev);
+    }
+}
+
 static void rnic_init(struct iwch_dev *rnicp)
 {
     PDBG("%s iwch_dev %p\n", __FUNCTION__,  rnicp);
@@ -70,6 +187,12 @@ static void rnic_init(struct iwch_dev *r
     idr_init(&rnicp->qpidr);
     idr_init(&rnicp->mmidr);
     spin_lock_init(&rnicp->lock);
+    INIT_LIST_HEAD(&rnicp->addrlist);
+    INIT_LIST_HEAD(&rnicp->listen_eps);
+    mutex_init(&rnicp->mutex);
+    rnicp->nb.notifier_call = nb_callback;
+    populate_addrlist(rnicp);
+    register_inetaddr_notifier(&rnicp->nb);

     rnicp->attr.vendor_id = 0x168;
     rnicp->attr.vendor_part_id = 7;
@@ -148,6 +271,8 @@ static void close_rnic_dev(struct t3cdev
     mutex_lock(&dev_mutex);
     list_for_each_entry_safe(dev, tmp, &dev_list, entry) {
         if (dev->rdev.t3cdev_p == tdev) {
+            unregister_inetaddr_notifier(&dev->nb);
+            delete_addrlist(dev);
             list_del(&dev->entry);
             iwch_unregister_device(dev);
             cxio_rdev_close(&dev->rdev);
diff --git a/drivers/infiniband/hw/cxgb3/iwch.h b/drivers/infiniband/hw/cxgb3/iwch.h
index caf4e60..7fa0a47 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.h
+++ b/drivers/infiniband/hw/cxgb3/iwch.h
@@ -36,6 +36,8 @@ #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>

Re: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

2007-09-27 Thread Steve Wise



Sean Hefty wrote:

The sysadmin creates for iwarp use only alias interfaces of the form
devname:iw* where devname is the native interface name (eg eth0) for the
iwarp netdev device.  The alias label can be anything starting with iw.
The iw immediately after the ':' is the key used by the iw_cxgb3 driver.


I'm still not sure about this, but haven't come up with anything better 
myself.  And if there's a good chance of other rnic's needing the same 
support, I'd rather see the common code separated out, even if just 
encapsulated within this module for easy re-use.


As for the code, I have a couple of questions about whether deadlock and 
a race condition are possible, plus a few minor comments.




Thanks for reviewing!  Responses are in-line below.



+static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
+{
+    struct iwch_addrlist *addr;
+
+    addr = kmalloc(sizeof *addr, GFP_KERNEL);
+    if (!addr) {
+        printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
+               __FUNCTION__);
+        return;
+    }
+    addr->ifa = ifa;
+    mutex_lock(&rnicp->mutex);
+    list_add_tail(&addr->entry, &rnicp->addrlist);
+    mutex_unlock(&rnicp->mutex);
+}


Should this return success/failure?



I think so.  See below...


+static int nb_callback(struct notifier_block *self, unsigned long event,
+                       void *ctx)
+{
+    struct in_ifaddr *ifa = ctx;
+    struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
+
+    PDBG("%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
+
+    switch (event) {
+    case NETDEV_UP:
+        if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
+            is_iwarp_label(ifa->ifa_label)) {
+            PDBG("%s label %s addr 0x%x added\n",
+                 __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
+            insert_ifa(rnicp, ifa);
+            iwch_listeners_add_addr(rnicp, ifa->ifa_address);


If insert_ifa() fails, what will iwch_listeners_add_addr() do?  (I'm not 
easily seeing the relationship between the address list and the listen 
list at this point.)




I guess insert_ifa() needs to return success/failure.  Then if we failed 
to add the ifa to the list we won't update the listeners.


The relationship is this:

- when a listen is done on addr 0.0.0.0, the code walks the list of 
addresses to do specific listens on each address.


- when an address is added or deleted, then the list of current 
listeners is walked and updated accordingly.
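
The change agreed on above (insert_ifa() reporting failure, and the listener update being skipped when it fails) can be modeled in user space roughly as follows. This is a sketch under stated assumptions, not the driver code: struct dev_model, insert_addr(), on_addr_up() and the injectable allocator are all invented for the example, a counter stands in for iwch_listeners_add_addr(), and no locking is shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of one tracked address. */
struct addr_node {
    uint32_t addr;
    struct addr_node *next;
};

/* Hypothetical model of the per-device state. */
struct dev_model {
    struct addr_node *addrs;    /* stands in for rnicp->addrlist */
    int listener_updates;       /* counts simulated listener updates */
    void *(*alloc)(size_t);     /* injectable so failure can be exercised */
};

/* Returns 0 on success, -1 if memory could not be allocated. */
static int insert_addr(struct dev_model *d, uint32_t addr)
{
    struct addr_node *n = d->alloc(sizeof *n);

    if (n == NULL)
        return -1;              /* the kernel version would use -ENOMEM */
    n->addr = addr;
    n->next = d->addrs;         /* list order does not matter in this sketch */
    d->addrs = n;
    return 0;
}

/* Address-up handling: only touch the listeners if the insert worked. */
static int on_addr_up(struct dev_model *d, uint32_t addr)
{
    int ret = insert_addr(d, addr);

    if (ret != 0)
        return ret;             /* address list and listeners stay in sync */
    d->listener_updates++;      /* stands in for iwch_listeners_add_addr() */
    return 0;
}

/* Allocator that always fails, used to exercise the error path. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

The point of the design is that the address list and the set of per-address listens never disagree: either both are updated or neither is.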



+        }
+        break;
+    case NETDEV_DOWN:
+        if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
+            is_iwarp_label(ifa->ifa_label)) {
+            PDBG("%s label %s addr 0x%x deleted\n",
+                 __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
+            iwch_listeners_del_addr(rnicp, ifa->ifa_address);
+            remove_ifa(rnicp, ifa);
+        }
+        break;
+    default:
+        break;
+    }
+    return 0;
+}
+
+static void delete_addrlist(struct iwch_dev *rnicp)
+{
+    struct iwch_addrlist *addr, *tmp;
+
+    mutex_lock(&rnicp->mutex);
+    list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
+        list_del(&addr->entry);
+        kfree(addr);
+    }
+    mutex_unlock(&rnicp->mutex);
+}
+
+static void populate_addrlist(struct iwch_dev *rnicp)
+{
+    int i;
+    struct in_device *indev;
+
+    for (i = 0; i < rnicp->rdev.port_info.nports; i++) {
+        indev = in_dev_get(rnicp->rdev.port_info.lldevs[i]);
+        if (!indev)
+            continue;
+        for_ifa(indev)
+            if (is_iwarp_label(ifa->ifa_label)) {
+                PDBG("%s label %s addr 0x%x added\n",
+                     __FUNCTION__, ifa->ifa_label,
+                     ifa->ifa_address);
+                insert_ifa(rnicp, ifa);
+            }
+        endfor_ifa(indev);
+    }
+}
+
 static void rnic_init(struct iwch_dev *rnicp)
 {
     PDBG("%s iwch_dev %p\n", __FUNCTION__,  rnicp);
@@ -70,6 +187,12 @@ static void rnic_init(struct iwch_dev *r
     idr_init(&rnicp->qpidr);
     idr_init(&rnicp->mmidr);
     spin_lock_init(&rnicp->lock);
+    INIT_LIST_HEAD(&rnicp->addrlist);
+    INIT_LIST_HEAD(&rnicp->listen_eps);
+    mutex_init(&rnicp->mutex);
+    rnicp->nb.notifier_call = nb_callback;
+    populate_addrlist(rnicp);
+    register_inetaddr_notifier(&rnicp->nb);

     rnicp->attr.vendor_id = 0x168;
     rnicp->attr.vendor_part_id = 7;
@@ -148,6 +271,8 @@ static void close_rnic_dev(struct t3cdev
     mutex_lock(&dev_mutex);
     list_for_each_entry_safe(dev, tmp, &dev_list, entry) {
         if (dev->rdev.t3cdev_p == tdev) {
+            unregister_inetaddr_notifier(&dev->nb);
+            delete_addrlist(dev);
             list_del(&dev->entry);
             iwch_unregister_device(dev);
             cxio_rdev_close(&dev->rdev);
diff --git a/drivers/infiniband/hw/cxgb3/iwch.h b/drivers/infiniband/hw/cxgb3/iwch.h
index caf4e60..7fa0a47 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.h
+++ b/drivers/infiniband/hw/cxgb3/iwch.h
@@ -36,6 +36,8 @@ #include <linux/mutex.h>
 #include <linux/list.h>
 #include 

[ofa-general] Problem running SDP apps using OFED 1.2

2007-09-27 Thread Zulfi Imani
Hi,

I installed the OFED1.2 stack and am trying to run a simple socket server
and client over the SDP stack. The Infiniband hardware is QLogic.

First I set the ENV vars
export LD_PRELOAD=/root/zulfi/iband/INSTALL/lib64/libsdp.so
export LIBSDP_CONFIG_FILE=/home/zulfi/libsdp.conf

The SDP config file has:
use sdp server * *:*
use sdp client * *:*

Then started the socket server and did a 'sdpnetstat -San' and found that it
listed the SDP port on which the server was listening.

On the client machine too I did the same; exported the variables, setup the
SDP config file and on running the client './client port# server_machine' it
gave me a network not reachable error.

I tried to get some information about the error on the net but could not
find any.

I then checked the /proc/pid/maps file and found that libsdp.so was being
loaded.
also:
/root  lsmod | grep sdp
ib_sdp                120224  3

Does QLogic support SDP applications ? Or am I missing something in the SDP
config file or do I need to make changes to my code ?

Any information on this will be a big help.

Thanks,
Zulfi

RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

2007-09-27 Thread Sean Hefty
It is ok to block while holding a mutex, yes?

It's okay, I just didn't try to trace through the code to see if it ever tries
to acquire the same mutex in the thread that needs to signal the event.

- Sean


Re: [ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

2007-09-27 Thread Tom Tucker
On Wed, 2007-09-26 at 14:06 -0500, Jim Mott wrote:
   This is a two part bug report.  One is a conceptual problem that may just 
 be a problem of understanding on my part.  The other is
 what I believe to be a bug in the mlx4 driver.

mthca has the same issue.

 
 1) ib_create_qp() fails with max_sge 
   If you use ib_query_device() to return the device specific 
 attribute max_sge, it seems reasonable to expect you can create
 a QP with max_send_sge=max_sge.  The problem is that this often
 fails.
 
   The reason is that depending on the QP type (RC, UD, etc.) and
 how the QP will be used (send, RDMA, atomic, etc.), there can be
 extra segments required in the WQE that eat up SGE entries.  So
 while some send WQE might have max_sge available SGEs, many will
 not.
 
   Normally the difference between max_sge and the actual maximum
 value allowed (and checked) for max_send_sge is 1 or 2.
 
   This issue may need API extensions to definitively resolve.  In
 the short term, it would be very nice if max_sge reported by 
 ib_query_device() could always return a value that ib_create_qp()
 could use.  Think of it as the minimum max_send_sge value that
 will work for all QP types.
 
 
 2) mlx4 setting of max send SGEs
   The recent patch to support shrinking WQEs introduces a 
 behavior that creates a big difference between the mlx4 
 supported send SGEs (checked against 61, should be 59 or 60,
 and reported in ib_query_device as 32 to equal receive side
 max_rq_sg value).  
 
   The patch that follows will allow an MLX4 to support the
 number of send SGEs returned by ib_query_device, and in fact
 quite a few more.  It probably breaks shrinking WQEs and thus
 should not be applied directly.
 
   Note that if ib_query_device() returned max_sge adjusted
 for the raddr and atomic segments, this fix would not be
 needed.  MLX4 would still support more SGEs in hardware than
 can be used through the API, but that is a different problem.  
 
 --- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:27:47.0 -0500
 +++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c  2007-09-26 13:36:40.0 -0500
 @@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
         qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
  
         for (;;) {
 -               if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
 +               if (s > dev->dev->caps.max_sq_desc_sz)
                         return -EINVAL;
  
                 qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
 



[ofa-general] Re: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

2007-09-27 Thread Michael S. Tsirkin
 Quoting Jim Mott [EMAIL PROTECTED]:
 Subject: [Bug report / partial patch] OFED 1.3 send max_sge lower than 
 reported by ib_query_device
 
   This is a two part bug report.  One is a conceptual problem that may just 
 be a problem of understanding on my part.  The other is
 what I believe to be a bug in the mlx4 driver.
 
 1) ib_create_qp() fails with max_sge 
   If you use ib_query_device() to return the device specific 
 attribute max_sge, it seems reasonable to expect you can create
 a QP with max_send_sge=max_sge.  The problem is that this often
 fails.
 
   The reason is that depending on the QP type (RC, UD, etc.) and
 how the QP will be used (send, RDMA, atomic, etc.), there can be
 extra segments required in the WQE that eat up SGE entries.  So
 while some send WQE might have max_sge available SGEs, many will
 not.
 
   Normally the difference between max_sge and the actual maximum
 value allowed (and checked) for max_send_sge is 1 or 2.
 
   This issue may need API extensions to definitively resolve.  In
 the short term, it would be very nice if max_sge reported by 
 ib_query_device() could always return a value that ib_create_qp()
 could use.  Think of it as the minimum max_send_sge value that
 will work for all QP types.
 
 
 2) mlx4 setting of max send SGEs
   The recent patch to support shrinking WQEs introduces a 
 behavior that creates a big difference between the mlx4 
 supported send SGEs (checked against 61, should be 59 or 60,
 and reported in ib_query_device as 32 to equal receive side
 max_rq_sg value).  
 
   The patch that follows will allow an MLX4 to support the
 number of send SGEs returned by ib_query_device, and in fact
 quite a few more.  It probably breaks shrinking WQEs and thus
 should not be applied directly.
 
   Note that if ib_query_device() returned max_sge adjusted
 for the raddr and atomic segments, this fix would not be
 needed.  MLX4 would still support more SGEs in hardware than
 can be used through the API, but that is a different problem.  
 
 --- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:27:47.0 -0500
 +++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c  2007-09-26 13:36:40.0 -0500
 @@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
         qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
  
         for (;;) {
 -               if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
 +               if (s > dev->dev->caps.max_sq_desc_sz)
                         return -EINVAL;
  
                 qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);

Good idea, but that patch needs more work: max_send_sge returned
to user should be made smaller to avoid corrupting the WQE.

-- 
MST


[ofa-general] Re: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

2007-09-27 Thread Michael S. Tsirkin
 (BTW I hate the shrinking WQE terminology for this, although
 obviously you weren't the one to introduce it)

We are making WQEs smaller so shrinking, and that's how hardware guys seem to
call the feature. But it doesn't really matter: the only place the word is used
is in the commit log.

-- 
MST


[ofa-general] Re: [PATCH 11/11]: mlx4_core use fixed CQ moderation paramters

2007-09-27 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH 11/11]: mlx4_core use fixed CQ moderation paramters
 
   +static int cq_max_count = 16;
   +static int cq_period = 10;
   +
   +module_param(cq_max_count, int, 0444);
   +MODULE_PARM_DESC(cq_max_count, "number of CQEs to generate event");
   +module_param(cq_period, int, 0444);
   +MODULE_PARM_DESC(cq_period, "time in usec for CQ event generation");
 
 I assume this is just a leftover from some earlier approach?  These
 module parameters are just ignored now, so the patch seems kind of
 pointless.

These should go into create CQ inbox. I'll recheck.

 Anyway I think the approach of having one global setting for all CQs
 is not a good one -- it seems likely that for example IPoIB and SDP
 would want different settings, not to mention userspace applications.

I agree. But what should be the default setting?
Consider also that there's currently no userspace API to control
event coalescing.

So global setting to control the defaults might still make sense. No?

-- 
MST


Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support

2007-09-27 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support
 
   +  struct page *page = alloc_page(GFP_ATOMIC | GFP_HIGHUSER);
 
 actually:
 
 + struct page *page = alloc_page(GFP_ATOMIC | __GFP_HIGHMEM);

Isn't this likely to hurt performance on 32 bit systems?

-- 
MST


Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP for userpsace multicast IB apps

2007-09-27 Thread Roland Dreier
  Since both the rdma-cm and ipoib are consumers of the core multicast
  management code (core/multicast.c which is linked into ib_sa.ko), and the
  app (through the rdma-cm) --first--  inserts a record into the database and
  only then issues the setsockopt call, if ipoib has a hit on a group it was
  told to join, this group must be offloaded by the rdma-cm consumer.

I'm not sure I understand why that follows.  Couldn't there be some
other kernel or userspace entity that caused the record to be added?

  The per device flag is initialized by the module param value at
  ipoib_dev_init()

I still don't really get why there's a module parameter to set the
initial value of a flag that only root can change anyway.  Why not
just set the flag through sysfs after loading ipoib rather than having a
module parameter to do the same thing?

 - R.


Re: [ofa-general] Re: [PATCH 1/11] IB/ipoib: high dma support

2007-09-27 Thread Roland Dreier
   +  struct page *page = alloc_page(GFP_ATOMIC | __GFP_HIGHMEM);
  
  Isn't this likely to hurt performance on 32 bit systems?

Yeah, I guess the kernel would need to kmap the data in most cases
anyway.  So there's not much point in trying to use high memory.

 - R.


RE: [ofa-general] Problem running SDP apps using OFED 1.2

2007-09-27 Thread Jim Mott
Were you able to connect IPoIB between the nodes?  Are you sure opensm was 
running?  I am ashamed to admit that occasionally I
forget to start opensm and wonder why SDP does not connect. 

 

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Zulfi Imani
Sent: Thursday, September 27, 2007 3:22 PM
To: general@lists.openfabrics.org
Subject: [ofa-general] Problem running SDP apps using OFED 1.2

 

Hi,

I installed the OFED1.2 stack and am trying to run a simple socket server and 
client over the SDP stack. The Infiniband hardware is
QLogic.

First I set the ENV vars
export LD_PRELOAD=/root/zulfi/iband/INSTALL/lib64/libsdp.so 

export LIBSDP_CONFIG_FILE=/home/zulfi/libsdp.conf


The SDP config file has:
use sdp server * *:* 
use sdp client * *:*

Then started the socket server and did a 'sdpnetstat -San' and found that it 
listed the SDP port on which the server was listening. 

On the client machine too I did the same; exported the variables, setup the SDP 
config file and on running the client './client
port# server_machine' it gave me a network not reachable error. 

I tried to get some information about the error on the net but could not find 
any.

I then checked the /proc/pid/maps file and found that libsdp.so was being 
loaded.
also:
/root  lsmod | grep sdp 
ib_sdp                120224  3

Does QLogic support SDP applications ? Or am I missing something in the SDP 
config file or do I need to make changes to my code ?

Any information on this will be a big help. 

Thanks,
Zulfi

 


[ofa-general] Re: send max_sge lower than reported by ib_query_device

2007-09-27 Thread Roland Dreier
  The same bug exists with mthca.  I saw it originally in the kernel doing RDS 
  work, but I just put together a short user space test.

Thanks.  The patch below seems to fix this for me.  I guess I'll queue
this for 2.6.24.

I'm also including the test program I wrote to verify this; mlx4 and
mthca seem OK on my system now.

diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 60de6f9..0c22cf0 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -45,6 +45,7 @@
 #include "mthca_cmd.h"
 #include "mthca_profile.h"
 #include "mthca_memfree.h"
+#include "mthca_wqe.h"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("Mellanox InfiniBand HCA low-level driver");
@@ -205,7 +206,20 @@ static int mthca_dev_lim(struct mthca_dev *mdev, struct mthca_dev_lim *dev_lim)
         mdev->limits.gid_table_len      = dev_lim->max_gids;
         mdev->limits.pkey_table_len     = dev_lim->max_pkeys;
         mdev->limits.local_ca_ack_delay = dev_lim->local_ca_ack_delay;
-        mdev->limits.max_sg             = dev_lim->max_sg;
+        /*
+         * Reduce max_sg to a value so that all possible send requests
+         * will fit into max_desc_sz; send requests will need a next
+         * segment plus possibly another extra segment, and the UD
+         * segment is the biggest extra segment.
+         */
+        mdev->limits.max_sg =
+                min_t(int, dev_lim->max_sg,
+                      (dev_lim->max_desc_sz -
+                       (sizeof (struct mthca_next_seg) +
+                        (mthca_is_memfree(mdev) ?
+                         sizeof (struct mthca_arbel_ud_seg) :
+                         sizeof (struct mthca_tavor_ud_seg)))) /
+                      sizeof (struct mthca_data_seg));
         mdev->limits.max_wqes           = dev_lim->max_qp_sz;
         mdev->limits.max_qp_init_rdma   = dev_lim->max_requester_per_qp;
         mdev->limits.reserved_qps       = dev_lim->reserved_qps;


---

Here's the test program:

#include <stdio.h>
#include <string.h>

#include <infiniband/verbs.h>

int main(int argc, char *argv[])
{
        struct ibv_device      **dev_list;
        struct ibv_device_attr   dev_attr;
        struct ibv_context      *context;
        struct ibv_pd           *pd;
        struct ibv_cq           *cq;
        struct ibv_qp_init_attr  qp_attr;
        int                      t;
        static const struct {
                enum ibv_qp_type type;
                char            *name;
        }                        type_tab[] = {
                { IBV_QPT_RC, "RC" },
                { IBV_QPT_UC, "UC" },
                { IBV_QPT_UD, "UD" },
        };

        dev_list = ibv_get_device_list(NULL);
        if (!dev_list) {
                printf("No IB devices found\n");
                return 1;
        }

        for (; *dev_list; ++dev_list) {
                printf("%s:\n", ibv_get_device_name(*dev_list));

                context = ibv_open_device(*dev_list);
                if (!context) {
                        printf("  ibv_open_device failed\n");
                        continue;
                }

                if (ibv_query_device(context, &dev_attr)) {
                        printf("  ibv_query_device failed\n");
                        continue;
                }

                cq = ibv_create_cq(context, 1, NULL, NULL, 0);
                if (!cq) {
                        printf("  ibv_create_cq failed\n");
                        continue;
                }

                pd = ibv_alloc_pd(context);
                if (!pd) {
                        printf("  ibv_alloc_pd failed\n");
                        continue;
                }

                /* Try each QP type with the SGE count the device reports. */
                for (t = 0; t < sizeof type_tab / sizeof type_tab[0]; ++t) {
                        memset(&qp_attr, 0, sizeof qp_attr);

                        qp_attr.send_cq = cq;
                        qp_attr.recv_cq = cq;
                        qp_attr.cap.max_send_wr = 1;
                        qp_attr.cap.max_recv_wr = 1;
                        qp_attr.cap.max_send_sge = dev_attr.max_sge;
                        qp_attr.cap.max_recv_sge = dev_attr.max_sge;
                        qp_attr.qp_type = type_tab[t].type;

                        printf("  %s: SGE %d ", type_tab[t].name,
                               dev_attr.max_sge);

                        if (ibv_create_qp(pd, &qp_attr))
                                printf("ok\n");
                        else
                                printf("FAILED\n");
                }
        }

        return 0;
}


[ofa-general] Save the date: OFA Developer's Summit: November 15-16 in Nevada

2007-09-27 Thread Johann George
We hope you will plan on attending the OpenFabrics Developer's
Summit being held November 15-16, 2007 at the Boomtown Hotel in
Verdi, Nevada.  It will begin at 1pm on Thursday, November 15th
and run until the early evening.  Friday's session will begin at
8am and end at noon.

Last year, this turned out to be a good forum to work through issues
that required collaboration.  If you have items that ought to be on
the agenda, please email them to me.  We will have a proposed agenda
shortly.

This event takes place at the tail end of SC07.  The Boomtown hotel is
about a twenty minute drive from the Reno-Sparks convention center
where SC07 is being held.  Rooms are available if needed at the
Boomtown hotel starting at $70/night.

Thanks for your participation.

Johann


[ofa-general] Re: send max_sge lower than reported by ib_query_device

2007-09-27 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: send max_sge lower than reported by ib_query_device
 
   The same bug exists with mthca.  I saw it originally in the kernel doing 
 RDS work, but I just put together a short user space test.
 
 Thanks.  The patch below seems to fix this for me.  I guess I'll queue
 this for 2.6.24.

I'm not sure this is a good approach: the fact that
user attempts to use the max value from query device
indicates that he really wants to get as large a value
as possible. So lowering this value in query means
we are wasting performance for such an app.

-- 
MST


[ofa-general] Re: send max_sge lower than reported by ib_query_device

2007-09-27 Thread Roland Dreier
    Michael> I'm not sure this is a good approach: the fact that user
    Michael> attempts to use the max value from query device indicates
    Michael> that he really wants to get as large a value as
    Michael> possible. So lowering this value in query means we are
    Michael> wasting performance for such an app.

Right now we report a value of 30 and then give an error if the
consumer tries to use that value to actually create a QP.  That's a
clear bug to me.  How do you suggest we resolve this bug?

 - R.


RE: [ofa-general] [PATCH v3] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts.

2007-09-27 Thread Glenn Grundstrom
 I'm sure I had seen a previous email in this thread that 
   suggested using
 a userspace library to open a socket
 in the shared port space.  It seems that suggestion was 
   dropped without
 reason.  Does anyone know why?
   
   Yes, because it doesn't handle in-kernel uses (eg NFS/RDMA, 
   iSER, etc).
  
  The kernel apps could open a Linux tcp socket and create an RDMA
  socket connection.  Both calls are standard Linux kernel architected
  routines. 
 
 This approach was NAK'd by David Miller and others...
 
   Doesn't NFSoRDMA already open a TCP socket and another for 
  RDMA traffic (ports 2049 & 2050 if I remember correctly)?  
 
 The NFS RDMA transport driver does not open a socket for the RDMA
 connection. It uses a different port in order to allow both 
 TCP and RDMA
 mounts to the same filer.
 
  I currently
  don't know if iSER, RDS, etc. already do the same thing, but if they
  don't, they probably could very easily.
  
 
 Woe be to those who do so...
 
   
   Does the neteffect NIC have the same issue as cxgb3 here? 
  What are
   your thoughts on how to handle this?
  
  Yes, the NetEffect RNIC will have the same issue as 
 Chelsio.  And all
  Future RNIC's which support a unified tcp address with Linux will as
  well.
  
  Steve has put a lot of thought and energy into the problem, but
  I don't think users & admins will be very happy with us in 
 the long run.
  
 
 Agreed.
 
  In summary, short of having the rdma_cm share kernel port space, I'd
  like to see the equivalent in userspace and have the kernel 
 apps handle
  the issue in a similar way as described above.  There are a few
  technical
  issues to work through (like passing the userspace IP address to the
  kernel),
 
 This just moves the socket creation to code that is outside the purview
 of the kernel maintainers. The exchanging of the 4-tuple created with the
 kernel module, however, is back in the kernel and in the maintainer's
 control and responsibility. In my view anything like this 
 will be viewed
 as an attempt to sneak code into the kernel that the maintainer has
 already vehemently rejected. This will make people angry and 
 damage the
 cooperative working relationship that we are trying to build.
 
   but I think we can solve that just like other information that
  gets passed from user into the IB/RDMA kernel modules.
  
 
 
 Sharing the IP 4-tuple space cooperatively with the core in 
 any fashion
 has been nak'd. Without this cooperation, the options we've 
 been able to
 come up with are administrative/policy based approaches. 
 
 Any ideas you have along these lines are welcome.


I am aware of the pending nak's and certainly don't want to sneak
anything by anyone.  Since we all agree that users & admins won't
like the current approach I'm trying to come up with alternatives.
Arkady has raised some good points regarding iSCSI and I would hope
a similar solution could be used for iWARP.

Glenn.

 
 Tom
 
  Glenn.
  
   
- R.
   
 
 


Re: [ofa-general] Re: send max_sge lower than reported by ib_query_device

2007-09-27 Thread Roland Dreier
  I like the idea of this call returning a value that's usable for any QP, with
  Jim's idea of providing a new call of returning maximum attributes based on 
  QP
  attributes.

OK, so fixing ib_query_device() for mthca to report a value usable for
all QPs (as my patch does) is a step in this direction.

 - R.


RE: [ofa-general] Re: send max_sge lower than reported by ib_query_device

2007-09-27 Thread Sean Hefty
Right now we report a value of 30 and then give an error if the
consumer tries to use that value to actually create a QP.  That's a
clear bug to me.  How do you suggest we resolve this bug?

I like the idea of this call returning a value that's usable for any QP, with
Jim's idea of providing a new call of returning maximum attributes based on QP
attributes.

- Sean


[ofa-general] nightly osm_sim report 2007-09-28:normal completion

2007-09-27 Thread kliteyn
OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-09-27
OpenSM git rev = Tue_Sep_25_00:30:00_2007 
[2c547953885809a8026e20af7809be08b42c3865]
ibutils git rev = Tue_Sep_4_17:57:34_2007 
[4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures: