Re: [openib-general] [PATCH v2 1/2] iWARP Connection Manager.

2006-06-13 Thread Sean Hefty
Er...no. It will lose this event. Depending on the event...the carnage varies. We'll take a look at this. This behavior is consistent with the Infiniband CM (see drivers/infiniband/core/cm.c function cm_recv_handler()). But I think we should at least log an error because a lost event will

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-06-13 Thread Sean Hefty
Assuming minimal hard-coding of which methods are requests, a client would drop only about 1 MAD per method during start-up. Is this only the new methods which are not hard coded ? Would this invoke a timeout (and hopefully retry) ? We can hard-code existing methods to avoid this problem. So

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-06-13 Thread Sean Hefty
Is the only downside of a larger timeout that potentially more memory accumulates (until the timeout occurs) before it is freed ? This is the only one that I can think of. Can anyone think of others? - Sean ___ openib-general mailing list

Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-12 Thread Sean Hefty
To clarify the motivation more, a question to answer is if we ignore iWarp completely, does it make sense to provide a higher level communication manager for IB. I believe that it does, especially for userspace applications. This lets us leverage existing name services, ipoib, and provides an

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-06-12 Thread Sean Hefty
Sean Hefty wrote: I'd like to propose that the MAD layer detect duplicate requests. After a request MAD has been handed to a client, its context would be maintained until the user calls ib_free_recv_mad(), allowing duplicate requests to be discarded. {snip} Finally, a way would need

Re: [openib-general] [RFC] [PATCH] IB/uverbs: Don't serialize with ib_uverbs_idr_mutex

2006-06-12 Thread Sean Hefty
I started thinking about the kill ib_uverbs_idr_mutex problem, and I realized that there are actually some interesting issues there (as described in the comment at the top of uverbs_cmd.c). In fact I ended up coding the solution below. This passes some basic tests but it could probably use

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-06-12 Thread Sean Hefty
Sean Hefty wrote: 4. Modify umad to learn which requests generate responses, by examining response MADs. When a response is sent, umad would mark which method the response is for by flipping the R-bit. Based on the algorithm, this could result in losing responses the first time

Re: [openib-general] [openfabrics-ewg] OFED 1.0-rc6 tarball available with working ipath driver

2006-06-12 Thread Sean Hefty
Tziporet - Bryan has confirmed that with the patches you've copied, things should work correctly. We've been testing with our version, but I really want to test on the OFED-1.0 version that you've built. Can you send us a pointer to it? How can you go from an RC6 that doesn't build to a 1.0

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-06-12 Thread Sean Hefty
Hal Rosenstock wrote: This brings up a concern. There doesn't seem to be a limit to the number of received MADs that can be queued for a user. Should we have such a limit? How are MADs counted ? Is a multisegment MAD 1 MAD or multiple MADs ? If the latter, it seems problematic to limit

Re: [openib-general] [PATCH 5/5] ucma: export multicast suport to userspace

2006-06-11 Thread Sean Hefty
@@ -58,6 +58,8 @@ enum { RDMA_USER_CM_CMD_GET_EVENT, RDMA_USER_CM_CMD_GET_OPTION, RDMA_USER_CM_CMD_SET_OPTION, + RDMA_USER_CM_CMD_JOIN_MCAST, + RDMA_USER_CM_CMD_LEAVE_MCAST, RDMA_USER_CM_CMD_GET_DST_ATTR }; I think this changes the exported ABI by changing the
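
The ABI concern raised in the reply above can be illustrated with a small sketch. The enum names below are hypothetical and shortened, and the numbering starts at zero for simplicity (the real header has earlier entries), so only the relative shift matters: inserting entries in the middle of an enum renumbers everything after them, while appending leaves existing values alone.

```c
#include <assert.h>

/* Hypothetical command numbering before the patch. */
enum cmd_before {
	BEFORE_GET_EVENT,
	BEFORE_GET_OPTION,
	BEFORE_SET_OPTION,
	BEFORE_GET_DST_ATTR
};

/* New commands inserted in the middle: GET_DST_ATTR is renumbered. */
enum cmd_inserted {
	INSERTED_GET_EVENT,
	INSERTED_GET_OPTION,
	INSERTED_SET_OPTION,
	INSERTED_JOIN_MCAST,
	INSERTED_LEAVE_MCAST,
	INSERTED_GET_DST_ATTR
};

/* New commands appended at the end: every existing value is preserved. */
enum cmd_appended {
	APPENDED_GET_EVENT,
	APPENDED_GET_OPTION,
	APPENDED_SET_OPTION,
	APPENDED_GET_DST_ATTR,
	APPENDED_JOIN_MCAST,
	APPENDED_LEAVE_MCAST
};

/* Appending keeps the command number an old binary was built against. */
static_assert(APPENDED_GET_DST_ATTR == BEFORE_GET_DST_ATTR,
	      "appending must not renumber existing commands");

/* Returns 1 when an old binary's command number still means the same thing. */
int abi_compatible(int old_value, int new_value)
{
	return old_value == new_value;
}
```

With these definitions, abi_compatible(BEFORE_GET_DST_ATTR, INSERTED_GET_DST_ATTR) is 0, which is the kind of breakage the reply appears to be pointing at.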

Re: [openib-general] [PATCH] mad: prevent duplicate RMPP sessions on responder side

2006-06-11 Thread Sean Hefty
Sean, is anyone looking at this? If not, given that Jack's approach does not touch ABI or API, might it make sense to merge Jack's patch after all and use that as a starting point? With current code in 2.6.17 large RMPPs often get aborted because of the problem of the duplicates. On the other

Re: [openib-general] [PATCH 0/5] multicast abstraction

2006-06-11 Thread Sean Hefty
I am planning to use RDMA CM for multicast functionality. It would be great if you can point me to a simple multicast test program using RDMA CM? There is a userspace test program (mckey) that will be available, but has not been posted yet. (A kernel test program would look fairly similar.) I

Re: [openib-general] [PATCH 1/5] ib_addr: retrieve MGID from device address

2006-06-11 Thread Sean Hefty
dev_addr->broadcast + 4 / dev_addr->src_dev_addr + 4 may not be naturally aligned, so casting this pointer to structure type may cause compiler to generate incorrect code. Thanks - I'll update this. - Sean
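
A minimal sketch of the usual fix for the alignment problem described in the review comment above: copy the bytes out with memcpy rather than casting an offset pointer to a structure type. The struct and function names are hypothetical stand-ins, not the code from the patch. The kernel's union ib_gid contains 64-bit members, which is why a pointer at offset 4 into a byte buffer may be misaligned for it; the stand-in here holds only bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN    16	/* an IB GID is 128 bits */
#define GID_OFFSET 4	/* the "+ 4" offset from the review comment */

/* Hypothetical stand-in for a GID structure. */
struct gid {
	uint8_t raw[GID_LEN];
};

/*
 * Copy the GID out of a hardware address buffer. memcpy places no
 * alignment requirement on the source, unlike dereferencing a cast
 * pointer such as (struct some_gid_type *)(hw_addr + 4).
 */
void gid_from_hw_addr(const uint8_t *hw_addr, size_t hw_len, struct gid *out)
{
	assert(hw_len >= GID_OFFSET + GID_LEN);
	memcpy(out->raw, hw_addr + GID_OFFSET, GID_LEN);
}
```

A 20-byte IPoIB hardware address (4 bytes of flags and QPN followed by the 16-byte GID) would be passed as hw_addr with hw_len set to 20.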

Re: [openib-general] bug report: mad.c: ib_req_notify_cq called without polling cq

2006-06-11 Thread Sean Hefty
mad.c calls ib_req_notify_cq on hotplug event in ib_mad_port_start, after QPs are attached to a CQ. Since this function does not poll the CQ, if sufficient number of MADs arrive at the QP before ib_req_notify_cq is called, RQ might get empty and no completion events will ever be generated. This

[openib-general] Re: Failed multicast join with new multicast module

2006-06-09 Thread Sean Hefty
Hal Rosenstock wrote: Note the MGRPs are MGIDs and switches are programmed with MLIDs and these can be 1:1 or many:1 depending on the implementation. Most do not do the many:1 but this is allowed by the spec. Also, note that switches know nothing about the groups themselves (only MLIDs and which

Re: [openib-general] Re: [PATCH 1/2] multicast: notify users on membership errors

2006-06-09 Thread Sean Hefty
Michael S. Tsirkin wrote: These should eliminate any races with ipoib leaving, then quickly re-joining a group as a result of an event. Is there a chance this will fix the crashes me and Or were seeing? It shouldn't. The race that I was referring to only involved whether or not a MAD is

Re: [openib-general] [PATCH 2/2] ipoib: handle multicast group reset notification

2006-06-09 Thread Sean Hefty
Sean Hefty wrote: Ipoib already checks for events that require rejoining multicast groups. We just need to add code to handle (i.e. ignore) multicast group reset notifications. Roland, Any issue committing this? - Sean

[openib-general] Re: Failed multicast join with new multicast module

2006-06-09 Thread Sean Hefty
Hal Rosenstock wrote: The other issue is whether you trust the state of the network or not when the SM comes up. That's sometimes a dangerous proposition. I considered this, but I think there's a difference between trusting one of the systems on the network, versus the network as a whole.

[openib-general] [PATCH 0/5] multicast abstraction

2006-06-09 Thread Sean Hefty
groups. Signed-off-by: Sean Hefty [EMAIL PROTECTED] ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] [PATCH 1/5] ib_addr: retrieve MGID from device address

2006-06-09 Thread Sean Hefty
Extract the MGID used by ipoib for broadcast traffic from the device address. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- This will be used to get the MCMemberRecord for the ipoib broadcast group. --- svn3/gen2/trunk/src/linux-kernel/infiniband/include/rdma/ib_addr.h 2006-05-25 11:18

[openib-general] [PATCH 2/5] multicast: allow retrieving an MCMemberRecord based on MGID

2006-06-09 Thread Sean Hefty
Add an API to allow retrieving an MCMemberRecord from the local cache based on an MGID. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- This allows an existing MCMemberRecord to be used as a template for creating other multicast groups. --- svn3/gen2/trunk/src/linux-kernel/infiniband/include

[openib-general] [PATCH 3/5] sa_query: add call to initialize ah_attr from an mcmember record

2006-06-09 Thread Sean Hefty
Export a call to initialize an ib_ah_attr structure based on an MCMemberRecord returned from a multicast join request. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- --- svn3/gen2/trunk/src/linux-kernel/infiniband/include/rdma/ib_sa.h 2006-06-06 15:21:05.0 -0700 +++ svn/gen2/trunk

[openib-general] [PATCH 5/5] ucma: export multicast suport to userspace

2006-06-09 Thread Sean Hefty
Expose multicast abstraction through the CMA to userspace. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- --- svn3/gen2/trunk/src/linux-kernel/infiniband/include/rdma/rdma_user_cm.h 2006-06-06 16:53:46.0 -0700 +++ svn/gen2/trunk/src/linux-kernel/infiniband/include/rdma

[openib-general] [PATCH 4/5] rdma cm: add support to join / leave multicast groups

2006-06-09 Thread Sean Hefty
Add IB multicast abstraction to the CMA. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- --- svn3/gen2/trunk/src/linux-kernel/infiniband/include/rdma/rdma_cm.h 2006-06-06 16:53:56.0 -0700 +++ svn/gen2/trunk/src/linux-kernel/infiniband/include/rdma/rdma_cm.h 2006-06-02 10:22

[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
If this comment is directed at client reregister mechanism, you should note that when this was brought up there was resistance to it based on the recommendation (probably not a strong enough word for this) that SMs be redundant in the subnet. There was a fair bit of anecdotal evidence that this

Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-08 Thread Sean Hefty
The following patch series adds support for UD QPs to userspace through the RDMA CM. UD QPs are referenced by an IP address, UDP port number. The RDMA CM abstracts SIDR for Infiniband clients. Roland, Do you see any issues with this patch series or the related userspace changes? There's a

Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-08 Thread Sean Hefty
Roland Dreier wrote: I haven't looked too carefully yet. What's the motivation? It seems strange to put an IB-only transport into the RDMA CM -- iWARP can't handle datagrams, can it? This allows using the address translation to locate the remote service. The RDMA CM also provides an IP

Re: [openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
Greg Lindahl wrote: Isn't this a quality of implementation issue? It's hard to imagine a SM author not realizing this is a good thing to do. I don't know if any SM implementation actually does this today. I think that all break all multicast groups. If it was in the standard, how would

[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
An SM, upon becoming the master, shall respect all existing communication in the fabric, where possible. To me, where possible doesn't sound like an appropriate language for a compliance statement. Is there precedent for this in IB spec? I was trying to express a concept, not formulate exact

[openib-general] Re: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
Hal Rosenstock wrote: 2. There is lazy deletion of MC groups allowed so the reclamation may be difficult. I'm not familiar with the switch programming. Does the SM set the entire MulticastForwardingTable for a switch every time a new group is created, or a new member joins? If the SM loses

[openib-general] [PATCH 1/2] multicast: notify users on membership errors

2006-06-08 Thread Sean Hefty
that requires clients to rejoin a multicast group, the active members are moved into an error state, and the clients are notified of a network reset error. The group is then reset to force additional join requests to generate requests to the SA. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Hal, can

[openib-general] [PATCH 2/2] ipoib: handle multicast group reset notification

2006-06-08 Thread Sean Hefty
Ipoib already checks for events that require rejoining multicast groups. We just need to add code to handle (i.e. ignore) multicast group reset notifications. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Ignoring the callback is a simple fix. I didn't try to see what it would take to have

Re: [openib-general] crash in ib_sa_mcmember_rec_callback while probing out ib_sa

2006-06-07 Thread Sean Hefty
Roland Dreier wrote: Looks like the same crash mst saw related to the multicast module being unloaded and then having sa call back into it. One small clue: esi: f38a5bec edi: f38a5bf4 ebp: fffc esp: f599be60 ebp is -4, which is -EINTR. So this may be a callback from sa_query's

[openib-general] Re: Failed multicast join with new multicast module

2006-06-07 Thread Sean Hefty
Hal Rosenstock wrote: This leads to a race where NonMembers and SendOnlyNonMembers will fail to re-join until one of the FullMembers joins. Might also be true with joins (not creates) from FullMembers too. I would presume in such cases, the join would be retried. SendOnlyMembers (at least

[openib-general] RE: Failed multicast join with new multicast module

2006-06-07 Thread Sean Hefty
I might be missing your point but UD is unreliable so the sends can be dropped. The delay/retry is to make sure the join does occur, This is different than a dropped request or reply. In this case, the receiver gets a reply, but it will be a failure from the SA to join the group. For example, a

[openib-general] multicast questions

2006-06-06 Thread Sean Hefty
Does anyone know if the following multicast configurations have been tested? 1. Receiving messages on the same port that they were sent, but on a different QP. 2. Receiving messages on multiple QPs on the same port. - Sean

RE: [openib-general] [PATCH 1/3] verbs: add call to initialize ib_ah_attr from a work completion

2006-06-06 Thread Sean Hefty
I think it's fine. Should I queue it for 2.6.18 too? That probably makes sense. I'll send a couple of svn revs that should be safe to pull into 2.6.18 after committing this. - Sean

[openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-06 Thread Sean Hefty
The following patch series adds support for UD QPs to userspace through the RDMA CM. UD QPs are referenced by an IP address, UDP port number. The RDMA CM abstracts SIDR for Infiniband clients. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- A subsequent patch series will add multicast handling

[openib-general] [PATCH 1/4] IB CM: Save and report remote UD QP attributes after SIDR

2006-06-06 Thread Sean Hefty
Record remote QP information returned from SIDR. Expose attributes through a new API. This functionality is similar to the ib_cm_init_qp_attr() routine that exists for RC QPs. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: core/cm.c

[openib-general] [PATCH 2/4] Add support for UD QPs in RDMA CM

2006-06-06 Thread Sean Hefty
messages to IB CM SIDR REQ messages. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: core/cma.c === --- core/cma.c (revision 7758) +++ core/cma.c (working copy) @@ -66,6 +66,7 @@ static DEFINE_MUTEX(lock); static struct

[openib-general] [PATCH 3/4] uverbs: export ib_copy_ah_attr_to_user

2006-06-06 Thread Sean Hefty
Export the ib_copy_ah_attr_to_user() routine to allow copying ib_ah_attr to userspace to support UD QPs. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: core/uverbs_marshall.c === --- core/uverbs_marshall.c (revision 7758

[openib-general] [PATCH 4/4] uCMA: export UD QP support to userspace

2006-06-06 Thread Sean Hefty
Export the RDMA CM's support of UD QPs to the userspace library. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- My intent is to bump the ABI version only once. The multicast patches will not increment the ABI. Index: core/ucma.c

[openib-general] [PATCH 1/2] libibverbs: add helper functions for UD QP support

2006-06-06 Thread Sean Hefty
Adds some helper functions to simplify using UD QPs. Add new routines: ibv_init_ah_from_wc() and ibv_create_ah_from_wc() to simplify UD QP communication. Expose ibv_copy_ah_attr_from_kern to retrieve ibv_ah_attr from kernel for a UD QP. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index

[openib-general] [PATCH 2/2] librdmacm: add UD QP support for userspace clients

2006-06-06 Thread Sean Hefty
Add support for UD QPs to the RDMA CM library, along with a goofy test program. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/rdma/rdma_cma_ib.h === --- include/rdma/rdma_cma_ib.h (revision 7743) +++ include/rdma

[openib-general] libmthca build issue

2006-06-02 Thread Sean Hefty
I'm running into an issue trying to build libmthca. During the ./configure step, I get: checking size of long... configure: error: cannot compute sizeof (long), 77 Has anyone else run into this? - Sean

Re: [openib-general] libmthca build issue

2006-06-02 Thread Sean Hefty
Sean Hefty wrote: I'm running into an issue trying to build libmthca. During the ./configure step, I get: checking size of long... configure: error: cannot compute sizeof (long), 77 Has anyone else run into this? Rebooting my system and rebuilding made this error go away. - Sean

RE: [openib-general] libmthca build issue

2006-06-02 Thread Sean Hefty
I just hit this too today. Inspecting the config log file revealed that it could find libibverbs.so. I ran ldconfig, then reran autogen and configure and it worked. Try that... Thanks - I'll try that next time.

[openib-general] Re: [PATCH] librdmacm: ucma_init reads past end of device_list

2006-06-01 Thread Sean Hefty
Boyd R. Faulkner wrote: The code currently in place seems to expect there to be a null element at the end of the dev_list to trigger the end of the loop. ibv_get_device_list does not provide such an entry, but the number of entries is available. This patch retrieves that number and loops

Re: [openib-general] [PATCH] CM: store and return attributes needed to send to a UD QP after SIDR

2006-05-31 Thread Sean Hefty
Sean Hefty wrote: Modify the CM to maintain the necessary information needed to send to a UD QP after a user has performed SIDR. Expose the remote QPN, remote QKey, and address handle attributes through the ib_cm_init_qp_attr() routine, so that the information is available from userspace

Re: [openib-general][PATCH 1 of 3] repost: Client Reregister support for kernel space

2006-05-31 Thread Sean Hefty
Eitan Zahavi wrote: Well, this is a very old topic well discussed years ago. All credits to Ashok Raj which you know better than me. The argument against the idea for the SM to be the keeper of these registrations goes as follows: Yes - and I still don't understand why this isn't a personal

[openib-general] Re: Failed multicast join with new multicast module

2006-05-31 Thread Sean Hefty
Hal Rosenstock wrote: I believe that this still works. Ipoib should leave all multicast groups, then rejoin when an event occurs. As long as no other clients join the ipoib groups, this should work. Are you saying that IPoIB handles the event ? Does the multicast module cooperate (in terms

Re: [openib-general][PATCH 1 of 3] repost: Client Reregister support for kernel space

2006-05-31 Thread Sean Hefty
Eitan Zahavi wrote: Leonid just sent an example for a race that might happen if the SM is to be the maintainer of the data. The race Leonid mentioned is a client sending a request when the SM is down. That request will fail, so there's no data for the SM to maintain for that node. That's a

Re: [openib-general][PATCH 1 of 3] repost: Client Reregister support for kernel space

2006-05-31 Thread Sean Hefty
Eitan Zahavi wrote: [EZ] The race is happening when the SM received the request and responded but the other SMs or the file system did not fully store that registration and the SM crashed. If the client received a response that the join was successful, then I consider that an SM issue. The

[openib-general] Re: [PATCH 2/2] iWARP Core Changes.

2006-05-31 Thread Sean Hefty
Mainly nits... Steve Wise wrote: -static int copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev, +int copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev, unsigned char *dst_dev_addr) Might want to rename this to something like rdma_copy_addr if

Re: [openib-general] Failed multicast join with new multicast module

2006-05-30 Thread Sean Hefty
Hal Rosenstock wrote: Send-only joins is another case. These are full member joins (JoinState 1) to groups which are not yet created so they fail. I see the problem, and checked in a fix. I forgot to record the last join operation that was initiated, so that it could be failed on an error.

[openib-general] RE: CMA backlog

2006-05-30 Thread Sean Hefty
I think that there are some issues that would need to be worked out, but in general I'm in favor of trying to do something here. Currently, this is not something that can be implemented by ULP on top of CMA, because returning error from REQ will result in reject rather than REQ drop. A generic

[openib-general] RE: CMA backlog

2006-05-30 Thread Sean Hefty
This approach would affect all ULPs, however. For example, no SDP implementation that I know of retries after a REJ - so this approach won't be interoperable. And AFAIK SDP spec already interprets reject as connection refused. There's no provision I can see in SDP spec for retries on specific reject

[openib-general] Re: ipoib use of multicast module on trunk causes kernel oops on 2.6.16

2006-05-30 Thread Sean Hefty
Michael S. Tsirkin wrote: I'm still looking at isolating this failure. I'd like to understand the new code better, however. What prevents ipoib_mcast_leave and later ipoib_mcast_free from being called on an mcast that has an outstanding query? We used to have a completion to signal that but

Re: [openib-general] Failed multicast join with new multicast module

2006-05-30 Thread Sean Hefty
Hal Rosenstock wrote: Is client reregister handled properly by the multicast module ? Can you clarify what you mean by this? Are you asking about re-sending join requests based on some event? - Sean

RE: [openib-general] Failed multicast join with new multicast module

2006-05-30 Thread Sean Hefty
On Tue, 2006-05-30 at 17:33, Sean Hefty wrote: Hal Rosenstock wrote: Is client reregister handled properly by the multicast module ? Can you clarify what you mean by this? Are you asking about re-sending join requests based on some event? Yes; when the SM sends a Set PortInfo

[openib-general] [PATCH] git ucm for 2.6.18: convert semaphore to mutex

2006-05-25 Thread Sean Hefty
branch, which will bring the ucm up to date. --- Convert semaphore in ib_ucm_file to a real mutex. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c index fefc9b6..67caf36 100644 --- a/drivers/infiniband/core/ucm.c +++ b

[openib-general] Re: [PATCH] CMA: fix port 2 loopback problems

2006-05-25 Thread Sean Hefty
Michael S. Tsirkin wrote: Fix CMA for loopback configurations: in cma_bind_loopback, make sure sa query is performed from an active port. Thanks! - committed in 7502. - Sean

[openib-general] Re: [PATCH] CMA: fix port 2 loopback problems

2006-05-25 Thread Sean Hefty
. Signed-off-by: Ali Ayoub [EMAIL PROTECTED] Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED] Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: openib_gen2/drivers/infiniband/core/cma.c === --- openib_gen2.orig/drivers/infiniband

[openib-general] [PATCH] git for-2.6.18 cm: remove unneeded flush workqueue

2006-05-25 Thread Sean Hefty
Destroy_workqueue already does flush_workqueue. Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED] Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index 490fd03..1c7463b 100644 --- a/drivers/infiniband/core/cm.c +++ b

Re: [openib-general] Re: ipoib use of multicast module on trunk causes kernel oops on 2.6.16

2006-05-24 Thread Sean Hefty
Michael S. Tsirkin wrote: What modules were unloaded and being unloaded when this occurred? I think this can be seen in oops. I'm guessing that ipoib and ib_multicast were unloaded, and the crash occurred unloading ib_sa. Is this correct? - Sean

Re: [openib-general] Re: ipoib use of multicast module on trunk causes kernel oops on 2.6.16

2006-05-24 Thread Sean Hefty
Michael S. Tsirkin wrote: Ali here says he's just bringing the ib0 up, and then unloads the module. Linux is sending broadcasts out all the time, so ... Were ipoib and ib_multicast already loaded, and he was just doing an ifconfig? Did he unload both ipoib and ib_multicast, or just ipoib? -

Re: [openib-general] ipoib use of multicast module on trunk causes kernel oops on 2.6.16

2006-05-24 Thread Sean Hefty
Michael S. Tsirkin wrote: The last trunk build causes kernel oops on 2.6.16 while restarting the driver. (the previous build -rev 7422- works fine) Note that ipoib moved to ib_multicast in rev 7401. May 24 16:00:40 sw037 kernel: Modules linked in: ib_sa ib_uverbs ib_umad ib_mthca ib_mad

Re: [openib-general] ipoib use of multicast module on trunk causes kernel oops on 2.6.16

2006-05-24 Thread Sean Hefty
Sean Hefty wrote: Reviewing the code, the multicast module should cancel all SA queries and wait for them to complete before unloading. (Even if it didn't perform the cancel, it should still wait for any outstanding SA query to complete.) As a thought, is there any chance this crash

Re: [openib-general] RE: [PATCH] multiple RDMA_CM_EVENT_DISCONNECTED callbacks

2006-05-23 Thread Sean Hefty
Eric Barton wrote: I just tested your patch and checked that it prevents the double DISCONNECT event callback (it does :). Thanks for testing this. I've committed this patch to svn. - Sean

[openib-general] Re: [PATCH] mad: prevent duplicate RMPP sessions on responder side

2006-05-23 Thread Sean Hefty
Jack Morgenstein wrote: Prevent opening multiple RMPP MAD transaction sessions at responder side with the same TID, GID/LID, class. Could happen if RMPP requests are retried while response is in progress. My preference for handling this is to detect and discard duplicate requests, and verify
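
The "detect and discard duplicate requests" preference stated above could look roughly like the following. This is a sketch under stated assumptions, not the MAD layer's code: the table, its size, and all names are hypothetical, and the key uses only the TID, LID, and management class (a real implementation would also need the GID for global routes, as the quoted patch description notes).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ACTIVE 8	/* hypothetical table size */

/* Hypothetical key identifying an in-progress request. */
struct req_key {
	uint64_t tid;
	uint16_t lid;
	uint8_t  mgmt_class;
};

struct req_table {
	struct req_key keys[MAX_ACTIVE];
	int in_use[MAX_ACTIVE];
};

static int key_equal(const struct req_key *a, const struct req_key *b)
{
	return a->tid == b->tid && a->lid == b->lid &&
	       a->mgmt_class == b->mgmt_class;
}

/* Returns 1 if the request is new, 0 if it is a duplicate, -1 if full. */
int req_table_accept(struct req_table *t, const struct req_key *k)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_ACTIVE; i++) {
		if (t->in_use[i]) {
			if (key_equal(&t->keys[i], k))
				return 0;	/* retry of a request in progress */
		} else if (free_slot < 0) {
			free_slot = i;
		}
	}
	if (free_slot < 0)
		return -1;
	t->keys[free_slot] = *k;
	t->in_use[free_slot] = 1;
	return 1;
}

/* Called once the response is finished; the key may then be reused. */
void req_table_release(struct req_table *t, const struct req_key *k)
{
	for (int i = 0; i < MAX_ACTIVE; i++) {
		if (t->in_use[i] && key_equal(&t->keys[i], k))
			t->in_use[i] = 0;
	}
}
```

A retried request that arrives while the response is still being sent gets 0 from req_table_accept() and can be dropped; after req_table_release() the same key is accepted again.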

Re: [openib-general] RE: [PATCH] multiple RDMA_CM_EVENT_DISCONNECTED callbacks

2006-05-23 Thread Sean Hefty
Roland Dreier wrote: Sean Thanks for testing this. I've committed this patch to svn. Should this be merged into what I have queued for 2.6.18? I think so. I was going to send another update later today that included the patches that Michael wanted for SDP support as well. (I didn't

Re: [openib-general] RDMA kernel utilities-newbie

2006-05-22 Thread Sean Hefty
keshetti mahesh wrote: i need to develop a kernel utility capable of RDMA read/write operations i have seen example utilities under svn/gen2/utils/src/linux-kernel/infiniband/util/ tree. where can i find the documentation related to them? There's no documentation for those utilities - just

RE: [openib-general] Re: vapi versus openib imm_data

2006-05-22 Thread Sean Hefty
Thanks both. I'll solve this by adding htonl/ntohl, only on the VAPI side Since VAPI is wanting the data in host order, while openib uses network order, it makes more sense to me to do the swapping on the openib side. - Sean

Re: [openib-general] Re: vapi versus openib imm_data

2006-05-22 Thread Sean Hefty
Fabian Tillier wrote: It doesn't matter what VAPI wants - it's the application that matters. If the application is using the immediate data for flags, you don't need any swapping on the OpenIB side of things, and you can avoid the swap altogether. While this makes the VAPI implementation less

Re: [openib-general] Re: vapi versus openib imm_data

2006-05-22 Thread Sean Hefty
Fabian Tillier wrote: If you swap the flag constants in little endian systems so they're always in network order - something the compiler can do for you - then you're checking for a bit, and it is safe to treat the value in network order always. The flag constant will be different in little
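
The idea in the quoted message, swapping the flag constants rather than the received data, might be sketched as follows. The flag names and values are hypothetical; htonl() and ntohl() are the standard POSIX byte-order functions. The immediate data stays in network order throughout, and only the constant is converted, which a compiler can typically fold for a constant argument.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical application flags, written in host order for readability. */
#define FLAG_LAST_SEGMENT 0x00000001u
#define FLAG_URGENT       0x80000000u

/*
 * Test a flag directly on network-order immediate data. The constant is
 * converted to network order; the received value is never swapped.
 */
int imm_has_flag(uint32_t imm_data_be, uint32_t flag_host)
{
	return (imm_data_be & htonl(flag_host)) != 0;
}

/* Set a flag on network-order immediate data, again swapping only the constant. */
uint32_t imm_set_flag(uint32_t imm_data_be, uint32_t flag_host)
{
	return imm_data_be | htonl(flag_host);
}
```

On a big-endian host htonl() is a no-op, so the same source works on both kinds of system, matching the point that the flag constant differs between them while the check stays the same.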

Re: [openib-general] connection management

2006-05-22 Thread Sean Hefty
amit byron wrote: o to make multiple connections using ib_send_cm_req() i would have make connection requests using ib_send_cm_req() with different service id. is this sufficient, or i missed something? i can use the same port number, correct? You need one service ID per

Re: [openib-general] [PATCH v3] ipoib: convert to use new multicast interface

2006-05-22 Thread Sean Hefty
Sean Hefty wrote: Convert ipoib to make use of the new multicast module interface. I've committed this patch to svn. - Sean

RE: [openib-general] [PATCH] CM: remove pkey from SIDR REQ - use pkey from path rec instead

2006-05-21 Thread Sean Hefty
while reviewing this i noticed that this documentation error: ib_send_cm_sidr_rep - Sends a service ID resolution ***request*** to the remote node. which you can fix when committing this patch. Thanks - I'll update the documentation as part of this patch. - Sean

Re: [openib-general] infiniband

2006-05-19 Thread Sean Hefty
amit byron wrote: o where can i find sample code to do rdma write? how to setup rdma write read between two infiniband nodes? There are some kernel tests in svn/gen2/utils/src/linux-kernel/infiniband/util. See krping for RDMA read/write. There are usermode tests in

Re: [openib-general] multiple RDMA_CM_EVENT_DISCONNECTED callbacks

2006-05-19 Thread Sean Hefty
Eric Barton wrote: I'm using the rdma_cm API. I've seen it call my CM callback with RDMA_CM_EVENT_DISCONNECTED twice. Is this a bug? I would consider this a bug. The problem is that the underlying IB CM is reporting two events: DREQ_RECEIVED, followed by DREP_RECEIVED. The RDMA CM reports

RE: [openib-general] infiniband

2006-05-19 Thread Sean Hefty
the krping test module make use of rdma_cm* apis. will the module work with ib_cm* api? You would have to adapt the module to use the ib_cm APIs. The posting of the work requests is the same, however. should i using rdma_cm module or ib_cm module, which is the standard? Both are

[openib-general] [PATCH] multiple RDMA_CM_EVENT_DISCONNECTED callbacks

2006-05-19 Thread Sean Hefty
Eric Can you try this patch and let me know if it fixes your problem? - Sean --- Prevent generating duplicated DISCONNECT events. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: cma.c === --- cma.c (revision 7362

Re: [openib-general] [PATCH] cma: fix bind to ip

2006-05-18 Thread Sean Hefty
Thanks! Committed with only minor adjustment to spacing. - Sean

RE: [openib-general] [PATCH] cma: fix bind to ip

2006-05-18 Thread Sean Hefty
Sean Thanks! Committed with only minor adjustment to spacing. - Should I add that commit to what I have queued for 2.6.18? It shouldn't hurt, but isn't strictly needed. The changes are for SDP, and IPv6 support still requires more work. My personal vote would be yes, with the hope that

[openib-general] [PATCH] CM: remove pkey from SIDR REQ - use pkey from path rec instead

2006-05-18 Thread Sean Hefty
The pkey is provided into a SIDR REQ in two places, once as a parameter, and again in the path record. Remove the pkey as a parameter and always use that given in the path record. This change has no practical effect on ABI functionality. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index

[openib-general] [PATCH 1/3] verbs: add call to initialize ib_ah_attr from a work completion

2006-05-18 Thread Sean Hefty
Expose a new call to initialize address handle attributes from a work completion. This functionality is duplicated by both verbs and the CM. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/rdma/ib_verbs.h

[openib-general] [RFC] [PATCH] RDMA CM: add UD QP support

2006-05-18 Thread Sean Hefty
calls: resolve_addr, resolve_route, and connect. A server calls: listen and accept. Connect and accept correspond to SIDR REQ / SIDR REP, respectively. This patch introduces a new protocol for SIDR that is the same as that used by the CMA for connection REQs. Signed-off-by: Sean Hefty [EMAIL

[openib-general] [PATCH] libibcm: remove pkey from SIDR REQ

2006-05-18 Thread Sean Hefty
Remove the pkey from the API for SIDR REQ. The pkey is provided in the path record. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/infiniband/cm.h === --- include/infiniband/cm.h (revision 7020) +++ include

[openib-general] [PATCH] librdmacm: add UD QP support

2006-05-18 Thread Sean Hefty
And a patch to support UD QPs from userspace through the librdmacm. Included in the patch is a test program. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/rdma/rdma_cma_ib.h === --- include/rdma/rdma_cma_ib.h

Re: [openib-general] [PATCH 2/3] SA: add call to initialize ib_ah_attr from a path record

2006-05-18 Thread Sean Hefty
Sean Hefty wrote: +int ib_init_ah_from_path(struct ib_device *device, u8 port_num, +struct ib_sa_path_rec *rec, struct ib_ah_attr *ah_attr) +{ + int ret; + u16 gid_index; + + memset(ah_attr, 0, sizeof *ah_attr); + ah_attr->dlid = be16_to_cpu(rec

RE: [openib-general] [PATCH] IB: Make needlessly global ib_mad_cache static

2006-05-17 Thread Sean Hefty
Any reason not to apply this? Looks fine to apply by me. - Sean

Re: [openib-general] [librdmacm] changes to cmatose to return a value different than 0 when there is a failure

2006-05-17 Thread Sean Hefty
Dotan Barak wrote: Added checks to the return values of all of the functions that may fail (in order to add this test to the regression system). Thanks - applied with one minor change. + int rc; Changed 'rc' to 'ret' to match the rest of the code. - Sean

Re: [openib-general] Re: [PATCH] RE: compliancy issue?

2006-05-16 Thread Sean Hefty
OK, I just tested and this works for me. Here's the SDP patch to do what you described. The code actually got cleaner now: it's convenient to get different events on the active versus passive side - previously I had to check a flag to figure out what ESTABLISHED means. I committed the CMA patch.

[openib-general] RE: need help regarding IB core software

2006-05-16 Thread Sean Hefty
Please post generic questions to the openib mailing list. I have started working with InfiniBand recently. I want to develop a sample utility that would perform simple RDMA (read/write) operations. There are some test applications that can be used as a base. Are you wanting a

Re: [openib-general] CMA IPv6 support

2006-05-15 Thread Sean Hefty
Michael S. Tsirkin wrote: Sean, CMA currently does not support IPv6 addresses at all. Is that right? This is correct. At best, there's some code in places to handle it. However, while I don't have an immediate need to make real IPv6 addressing work, some applications (notably Java) always

Re: [openib-general] RDMA enabled NICs- newbie

2006-05-15 Thread Sean Hefty
Ian Brown wrote: Are there Ethernet NICs on the market (or will there be soon) that are RDMA-enabled NICs? And if so, does the Linux kernel have support for such NICs? The Linux kernel does not have support for RNICs at this time, but

[openib-general] RE: [PATCH] cm: dont flush wqq before destroy

2006-05-15 Thread Sean Hefty
Thanks! - applied. - Sean

RE: [openib-general] CMA IPv6 support

2006-05-15 Thread Sean Hefty
OK. May we discuss the design/API for now? Sounds good. My understanding is an IPv4 socket should only listen on IPv4 requests, while IPv6 socket should listen on both IPv4 and IPv6, unless IPV6_V6ONLY is set. Is that right? What will the API be for this? Maybe create_cm_id should get an
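The listening rule described in this message can be written down as a small predicate. This is only a sketch of the understanding stated above, not CMA code: `listener_accepts()` and its parameters are hypothetical names, and the `v6only` flag is assumed to mirror the IPV6_V6ONLY socket option.

```c
#include <stdbool.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/*
 * Hypothetical helper (not part of the CMA): decide whether a listener
 * created for listen_family should see a request that arrived with
 * address family req_family.
 *
 * v6only mirrors the IPV6_V6ONLY socket option and only matters for
 * AF_INET6 listeners.
 */
static bool listener_accepts(int listen_family, bool v6only, int req_family)
{
	/* An IPv4 listener only ever sees IPv4 requests. */
	if (listen_family == AF_INET)
		return req_family == AF_INET;

	/* An IPv6 listener sees IPv6, and IPv4 too unless v6only is set. */
	if (listen_family == AF_INET6) {
		if (req_family == AF_INET6)
			return true;
		return req_family == AF_INET && !v6only;
	}

	/* Any other address family is outside this sketch. */
	return false;
}
```

Whether such a flag would be passed at id creation time or set later is exactly the open API question in the thread; the sketch takes no position on that.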

RE: [openib-general] CMA IPv6 support

2006-05-15 Thread Sean Hefty
rdma_create_id() already takes a struct sockaddr *, which has an address family selector (sa_family) to define the contained address format. Why is that one not sufficient? rdma_bind() and rdma_resolve_addr() take struct sockaddr *. rdma_create_id() only has an event handler, context, and port
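For calls that do receive a struct sockaddr *, the address family selector can be read as in the sketch below. It uses only the standard socket headers; `addr_size()` is a hypothetical helper introduced for illustration and does not appear in the CMA.

```c
#include <stddef.h>
#include <sys/socket.h>   /* struct sockaddr, AF_INET, AF_INET6 */
#include <netinet/in.h>   /* struct sockaddr_in, struct sockaddr_in6 */

/*
 * Hypothetical helper: use sa_family to find out which concrete address
 * structure a generic struct sockaddr * really points at, and return
 * the size of that structure. Returns 0 for families this sketch does
 * not handle.
 */
static size_t addr_size(const struct sockaddr *sa)
{
	switch (sa->sa_family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return 0;
	}
}
```

A caller would typically fill in a struct sockaddr_in or struct sockaddr_in6, cast its address to struct sockaddr *, and let the callee dispatch on sa_family this way.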

RE: [openib-general] CMA IPv6 support

2006-05-15 Thread Sean Hefty
Looking at rdma_listen(), the code I see checks for bound state before proceeding to listen: int rdma_listen(struct rdma_cm_id *id, int backlog) { struct rdma_id_private *id_priv; int ret; id_priv = container_of(id, struct rdma_id_private, id); if
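The idea of checking for a bound state before listening can be illustrated with a tiny stand-alone state machine. This is not the rdma_listen() source, which is cut off above; every name here (`demo_state`, `demo_id`, `demo_listen()`) is hypothetical, and returning -EINVAL for the wrong state is an assumption.

```c
#include <errno.h>

/* Hypothetical connection-id states for this sketch only. */
enum demo_state { DEMO_IDLE, DEMO_ADDR_BOUND, DEMO_LISTEN };

struct demo_id {
	enum demo_state state;
	int backlog;
};

/*
 * Move an id from "address bound" to "listening". Any other starting
 * state is rejected and the id is left untouched.
 */
static int demo_listen(struct demo_id *id, int backlog)
{
	if (id->state != DEMO_ADDR_BOUND)
		return -EINVAL;

	id->state = DEMO_LISTEN;
	id->backlog = backlog;
	return 0;
}
```

In this sketch an id that was never bound, or one that is already listening, fails the check, so a caller has to bind first.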
