[openib-general] [PATCH] RDMA CM: updates to 2.6.18 branch

2006-05-15 Thread Sean Hefty
I'm assuming that since the CMA isn't upstream yet, a single patch will work. The patch below should contain everything that makes sense to merge upstream for the CMA. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index

Re: [openib-general] 2.6.17 and 2.6.18 merge plans

2006-05-12 Thread Sean Hefty
I consistently get a URL error: 504... error: Unable to find 2ac5. under http://git... every time I try to clone your git tree. Is there a mirror that I can try cloning from, or do you know of an alternative way of getting the tree?

Re: [openib-general] question regarding GRH flag in ib_ah_attr

2006-05-12 Thread Sean Hefty
Jason Gunthorpe wrote: How about this, how do you see this scenario: 1) Client gets a DGID from 'someplace' 2) Client sends a SA query to resolve the DGID to a Path Record 3) Client configures a QP based on the Path Record Now, the question I'm interested in is this: During step #3 what test

Re: [openib-general] 2.6.17 and 2.6.18 merge plans

2006-05-12 Thread Sean Hefty
Thanks for the info. I just updated my git version, and it worked fine. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit

Re: [openib-general] Re: [PATCH] cm refcount race fix

2006-05-11 Thread Sean Hefty
Michael S. Tsirkin wrote: This has run for a couple of nights here without issues. Please commit. I also think we shall push this patch for 2.6.17 - it is clean and simple enough. Agree? I've committed these changes to svn. Roland, can you queue this patch for 2.6.17, or at least 2.6.18? -

Re: [openib-general] [PATCH v3] ipoib: convert to use new multicast interface

2006-05-11 Thread Sean Hefty
Roland, I just wanted to make sure that this patch wasn't dropped. Can we queue the multicast module for 2.6.18? - Sean

Re: [openib-general] [PATCH v3] ipoib: convert to use new multicast interface

2006-05-11 Thread Sean Hefty
Roland Dreier wrote: Sean> Can we queue the multicast module for 2.6.18? I guess so. What's the motivation? Do we have any users of it other than IPoIB? I'm working on adding multicast support for userspace (MPI), which will also need this. - Sean

[openib-general] [PATCH] refcount race fixes

2006-05-11 Thread Sean Hefty
Roland, this is the patch that I was referring to. --- Fix race condition during destruction calls to avoid possibility of accessing object after it has been freed. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: mad_rmpp.c

Re: [openib-general] [PATCH v3] ipoib: convert to use new multicast interface

2006-05-11 Thread Sean Hefty
Roland Dreier wrote: Sean> I'm working on adding multicast support for userspace (MPI), Sean> which will also need this. Right, but that's not going to be ready for 2.6.18, is it? It won't be ready for merging, no. I was hoping to limit the number of out of tree modules that it

Re: [openib-general] question regarding GRH flag in ib_ah_attr

2006-05-11 Thread Sean Hefty
Hal Rosenstock wrote: Anytime the send is off the local subnet (as well as multicast), a GRH is required. Also, there is a management response rule that requires a GRH in the response when the request contained a GRH (13.5.4.4 p. 769). Reading through the responses, I think my problems are

Re: [openib-general] [PATCH] RE: compliancy issue?

2006-05-11 Thread Sean Hefty
Sean Hefty wrote: Can you try this simple patch and see if it fixes your problem? You will need to call rdma_accept() or rdma_reject() after receiving a CONNECT_RESPONSE event. The conn_param to rdma_accept() should be NULL. Michael, Did you ever get a chance to try this patch? If so, I

[openib-general] Re: rdma_cm.h: comment nits.

2006-05-11 Thread Sean Hefty
Michael S. Tsirkin wrote: In that case, how about we change rdma_disconnect for IB to do reject if connection isn't established yet? Unless I missed something, the only problem that we're trying to solve is that SDP needs to be able to reject a connection based on private data in the REP,

Re: [openib-general] [PATCH v3] ipoib: convert to use new multicast interface

2006-05-11 Thread Sean Hefty
Michael S. Tsirkin wrote: I'm nervous about doing big changes in the multicast code in ipoib - it had more than a fair share of subtle races. The patch simplifies the multicast code in ipoib, and the serialization in the multicast module is simple enough that we should be able to have a fair

Re: [openib-general] Comms Errors

2006-05-11 Thread Sean Hefty
Eric Barton wrote: int kiblnd_post_rx (kib_rx_t *rx, int credit) { kib_conn_t *conn = rx->rx_conn; struct ib_recv_wr *bad_wrq; int rc; LASSERT (!in_interrupt()); LASSERT (credit == IBLND_POSTRX_NO_CREDIT || credit

Re: [openib-general] rdma_cm.h: comment nits.

2006-05-10 Thread Sean Hefty
Tom Tucker wrote: It's OK to call rdma_reject on active side as well, isn't it? You'll get -EINVAL on iWARP if you do this. For IB, rdma_reject can be called on the active side if the user is managing their own QP states, or is SDP. How does iWarp support userspace QPs? - Sean

Re: [openib-general] Re: [PATCH] RE: compliancy issue?

2006-05-10 Thread Sean Hefty
Michael S. Tsirkin wrote: No, looking in the code shows that qp will be changed to rtr and then rts ***before*** sending the RTU since you will call rdma_accept which in turn will call cma_rep_recv Right, missed that, thanks! I was wondering why it was behaving not the way I expected it to

Re: [openib-general] Re: [PATCH 3/3] librdmacm: add ability to get/set transport specific options

2006-05-10 Thread Sean Hefty
Jack Morgenstein wrote: Userspace rdma_get_option() will then also get -ENODATA. OK. We can, therefore, do the following: the dummy procedures in the dummy ib_local_sa.h file will return -ENODATA for all get operations and for ib_create_path_cursor(), and -ENOSYS for

Re: [openib-general] there is a compilation warning in librdmacm

2006-05-10 Thread Sean Hefty
Dotan Barak wrote: There is a compilation warning in the file: src/userspace/librdmacm/src/cma.c. Thanks - I committed a fix for this. It should have been a void. - Sean

Re: [openib-general] Re: [PATCH 3/3] librdmacm: add ability to get/set transport specific options

2006-05-10 Thread Sean Hefty
Jack Morgenstein wrote: I assume you mean disabled. Uhm.. yes - that's what I meant. Looks like setting cache_timeout to zero as the default is good enough: in sa_db_init(), cache_timeout remains 0 when translated to jiffies, resulting in paths_per_dest being set to zero as well.

Re: [openib-general] [PATCH] librdmacm abi version

2006-05-10 Thread Sean Hefty
Thanks - committed. - Sean

Re: [openib-general] rdma_cm.h: comment nits.

2006-05-10 Thread Sean Hefty
Michael S. Tsirkin wrote: BTW, Sean, could you please explain why the RESPONSE event is IB-specific? Doesn't it match Syn/Ack in the TCP 3-way handshake? I didn't think that even iWarp exposed the TCP connection messages to the users. Plus iWarp connections can be formed over an existing TCP

[openib-general] question regarding GRH flag in ib_ah_attr

2006-05-10 Thread Sean Hefty
For context, I'm trying to work backwards from sending a message on a UD QP to determine what information is needed and how it is obtained. Does anyone know how the user determines if the grh flag should be set in the ib_ah_attr when allocating an ib_ah? Do they do this by examining the GIDs in a

RE: [openib-general] 2.6.17 and 2.6.18 merge plans

2006-05-10 Thread Sean Hefty
- CMA. Sean, I think there have been a lot of updates to the cma since the last time you updated me. Can you send me a patch to resync my for-2.6.18 branch with the latest code for upstream? I will start on this by the end of the week. - Sean

Re: [openib-general] [PATCH 3/3] librdmacm: add ability to get/set transport specific options

2006-05-09 Thread Sean Hefty
Jack Morgenstein wrote: Use of local_sa in the rdma_cm kernel module is already patched out for OFED. local_sa is used ONLY in kernel cma.c (static function cma_resolve_ib_route(), which calls ib_get_path_rec()). The call to ib_get_path_rec() is eliminated, and we call cma_query_ib_route()

[openib-general] Re: CMA: compliancy issue?

2006-05-09 Thread Sean Hefty
Michael S. Tsirkin wrote: From iSER point of view, this approach is fine, and it would allow for some future flexibility to reject the REP. We prefer to implement it only for 2.6.19, that is when 2.6.18-rc1 is out. Let us start by implementing this in SVN trunk. Sean, if you agree too, can

[openib-general] RE: [PATCH] cm refcount race fix

2006-05-09 Thread Sean Hefty
Here's a patch that should fix both the IB CM and RDMA CM using completions rather than spinlock / wait objects. Michael, can you test that this version works for you? Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: cm.c

RE: [openib-general] Re: [PATCH] update uDAPL openib_cma provider to work with new uCMA event channels

2006-05-09 Thread Sean Hefty
Sync up with Sean on commits. I'm watching for Sean's commit. Did I miss it? It looks like my commit failed for some reason, and I missed it. I've just re-committed the changes, which should be in revision 7019. - Sean

[openib-general] RE: [PATCH] cm refcount race fix

2006-05-09 Thread Sean Hefty
Here's a patch for all of the files that you listed. I did do some basic testing and didn't see any issues. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: mad_rmpp.c === --- mad_rmpp.c (revision 6884) +++ mad_rmpp.c

Re: [openib-general] CMA: port 2 loopback problems

2006-05-09 Thread Sean Hefty
Michael S. Tsirkin wrote: I thought about this too. People actually do expect loopback to work when link is down. I guess we could create loopback path record, with parameters such as SL editable from sysfs. Until the underlying IB stack supports loopback connections on a non-active port, my

Re: [openib-general] CMA: port 2 loopback problems

2006-05-09 Thread Sean Hefty
Michael S. Tsirkin wrote: Until the underlying IB stack supports loopback connections on a non-active port How do you mean? You can already create loopback connections as per IB spec - it works already. It works if the port is active. I don't believe that there's any code to support

[openib-general] [PATCH] RE: compliancy issue?

2006-05-09 Thread Sean Hefty
Can you try this simple patch and see if it fixes your problem? You will need to call rdma_accept() or rdma_reject() after receiving a CONNECT_RESPONSE event. The conn_param to rdma_accept() should be NULL. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: cma.c

Re: [openib-general] svn 6829 version issue with rdma_ucm and userspace?

2006-05-09 Thread Sean Hefty
Ira Weiny wrote: I have been struggling with getting svn6829 to work and this is one of the latest issues. # odev1 /root simple_rdma -S librdmacm: kernel ABI version 0 doesn't match library version 1. Failed to create rdma_cm_id This is a little rdma app I wrote which uses the rdma_cm

RE: [openib-general] Re: [PATCH] update uDAPL openib_cma provider to work with new uCMA event channels

2006-05-08 Thread Sean Hefty
Sync up with Sean on commits. I'm watching for Sean's commit. Did I miss it? No - I will commit by noon PST today. - Sean

Re: [openib-general] rdma_cm.h: comment nits.

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: Two nits wrt rdma_cm.h: /** * * rdma_reject - Called on the passive side to reject a connection request. */ It's OK to call rdma_reject on active side as well, isn't it? Yes - but only for users that are managing the QP states themselves. /** *

[openib-general] Re: CMA disconnect

2006-05-08 Thread Sean Hefty
Or Gerlitz wrote: Looking in the code I have realized that the CMA consumer must call rdma_disconnect to have the QP state moved into ERROR. Maybe it would make sense for the CMA to transition the QP to the error state before destroying it? Am I correct? with this

[openib-general] Re: question on rdma_disconnect

2006-05-08 Thread Sean Hefty
Or Gerlitz wrote: + /* change the ib conn state only if the conn is UP, however always call + * rdma_disconnect since this is the only way to cause the CMA to change + * the QP state to ERROR + */ I updated the comments in the header file to state that rdma_disconnect() transitions that

Re: [openib-general] Re: CMA: compliancy issue?

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: I was originally thinking along the lines of still using ESTABLISHED, and simply delaying RTU till after the handler is called. We would then need to teach CMA to perform reject instead of RTU if handler returns an error code. We even can have a flag to select the

Re: [openib-general] Re: [PATCH] cm refcount race fix

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: void mthca_cq_event(struct mthca_dev *dev, u32 cqn, enum ib_event_type event_type) { struct mthca_cq *cq; struct ib_event event; spin_lock(&dev->cq_table.lock); cq = mthca_array_get(&dev->cq_table.cq, cqn

Re: [openib-general] CMA: port 2 loopback problems

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: Sean, I am seeing the following problem: I have a dual-port HCA with IPoIB interfaces ib0 on port 1 and ib1 on port 2. port 1 is down and port 2 is up, and I try creating a connection to the loopback address 127.0.0.1. The problem I am seeing is that I am getting

Re: [openib-general] [PATCH] cm refcount race fix

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: static inline void cm_deref_id(struct cm_id_private *cm_id_priv) { + unsigned long flags; + + spin_lock_irqsave(&cm_id_priv->lock, flags); if (atomic_dec_and_test(&cm_id_priv->refcount)) wake_up(&cm_id_priv->wait); +

Re: [openib-general] Re: CMA: compliancy issue?

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: Fine, but rdma_reject will have to alter the state then, to avoid RTU after REJ. It should be able to work just as it does for userspace. The user either calls accept or reject. The IB CM will not send an RTU if a REJ has been sent. I think that the real issue

Re: [openib-general] CMA: port 2 loopback problems

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: Is it possible to communicate between QPs on the same device if that device is disconnected from the fabric? Yes. What attributes do you use for the pkey index and address vector when connecting the QPs? I'm wondering if the correct solution to this issue isn't

Re: [openib-general] Re: CMA: compliancy issue?

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: I actually think all we have to do is to change CMA behaviour on REP: send RTU after, and not before, calling user handler. Since other ULPs don't seem to care when RTU is sent, they will continue working. I think that it makes more sense to give the user the

Re: [openib-general] Re: [PATCH] cm refcount race fix

2006-05-08 Thread Sean Hefty
Michael S. Tsirkin wrote: static inline void cm_deref_id(struct cm_id_private *cm_id_priv) { + unsigned long flags; + + spin_lock_irqsave(&cm_id_priv->lock, flags); if (atomic_dec_and_test(&cm_id_priv->refcount)) wake_up(&cm_id_priv->wait); +

Re: [openib-general] Re: [PATCH] cm refcount race fix

2006-05-08 Thread Sean Hefty
Roland Dreier wrote: Sean> Could we use atomic_dec_and_lock() instead? This would keep Sean> refcount atomic, but use a spinlock to synchronize with Sean> destruction. Hmm, how does that help? Just going to a plain integer with a spinlock to protect it seems simple and clear.

Re: [openib-general] [PATCH 3/3] librdmacm: add ability to get/set transport specific options

2006-05-08 Thread Sean Hefty
Jack Morgenstein wrote: Should we use this revision (6949/6950 of the openib trunk) of the rdma_cm in the upcoming OFED (branch) release? The end-users should probably decide that. These changes use the local_sa, which has not been queued to be merged upstream yet. (since the branch

RE: [openib-general] Re: [PATCH] cm refcount race fix

2006-05-08 Thread Sean Hefty
If you wanted to implement this, you would have to use a completion. A mutex can't be used because it must be released in process context with interrupts enabled. And a semaphore can't be used because there's an implicit use-after-free with semaphores (basically up() touches the semaphore memory

[openib-general] RE: cm crash

2006-05-07 Thread Sean Hefty
cm_process_work does: cm_deref_id(cm_id_priv); if (ret) ib_destroy_cm_id(&cm_id_priv->id); assume that another thread calls ib_destroy_cm_id. Now wait_event(cm_id_priv->wait, !atomic_read(&cm_id_priv->refcount)); while ((work =

[openib-general] RE: cm crash

2006-05-07 Thread Sean Hefty
Another possible issue: static inline void cm_deref_id(struct cm_id_private *cm_id_priv) { if (atomic_dec_and_test(&cm_id_priv->refcount)) wake_up(&cm_id_priv->wait); } A thread could test the refcount after atomic_dec_and_test but before wake_up(&cm_id_priv->wait), and remove

Re: [openib-general] [PATCH 1/3] rdma cm: allow user to specify path record for connections

2006-05-05 Thread Sean Hefty
This patch series has been committed. - Sean

[openib-general] [PATCH 1/2] librdmacm: add event channels

2006-05-05 Thread Sean Hefty
different threads to process events for different rdma_cm_id's. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: src/cma.c === --- src/cma.c (revision 6950) +++ src/cma.c (working copy) @@ -120,7 +120,6 @@ static struct dlist

[openib-general] [PATCH 2/2] librdmacm: update test programs to use event channels

2006-05-05 Thread Sean Hefty
Update test programs to use the event channel interfaces. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: examples/rping.c === --- examples/rping.c (revision 5693) +++ examples/rping.c (working copy) @@ -141,6 +141,7

Re: [openib-general] sdp code in trunk

2006-05-04 Thread Sean Hefty
Hal Rosenstock wrote: It could have been done the other way 'round as well with the new SDP on a new branch as other ULPs have done prior to being ready for the trunk and all the same goals that you mention accomplished. I agree with Hal. If there was a good chance that the code was more

[openib-general] Re: [PATCH] change iser ib connection state management

2006-05-04 Thread Sean Hefty
Or Gerlitz wrote: changed iser ib conn state management to be done with an int variable keeping the state and a lock. When a related race is possible the lock is used to check (comp) or change (comp_exch) the state. When no race can happen the state is just examined or changed. These look

[openib-general] SA MultiPathRecord v.PathRecord

2006-05-04 Thread Sean Hefty
Moving discussion to list. Hal Rosenstock wrote: MPR does allow for better selection of paths for APM though. Beyond adding the Independence Selector, does MPR add anything else? I assume that the Independence Selector works when paths between multiple GIDs are requested. Have you thought

[openib-general] Re: SA MultiPathRecord v.PathRecord

2006-05-04 Thread Sean Hefty
Hal Rosenstock wrote: To me, this feature seems most useful for all-to-all type connections, but would require some sort of coordination between connecting end-points in order to have fault independent connections between different nodes. E.g. the connection from A to B is independent from

[openib-general] Re: SA MultiPathRecord v.PathRecord

2006-05-04 Thread Sean Hefty
Hal Rosenstock wrote: I'm just trying to determine how an implementation can make use of MPR to understand the best way to expose it. The use of MPR over just path records seems complex. In terms of what ? To be clear, that wasn't a criticism, just a statement that MPR provides additional

[openib-general] Re: SA MultiPathRecord v.PathRecord

2006-05-04 Thread Sean Hefty
Hal Rosenstock wrote: that I'm not sure we can provide value beyond basic MAD support. By that do you mean expose the PRs returned ? That along with exposing the details of the query use to obtain the path records. I'm wondering if we can come up with a sensible abstraction for path

[openib-general] [PATCH 2/3] ucma: add kernel support for get/set options

2006-05-04 Thread Sean Hefty
being established. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: core/ucma.c === --- core/ucma.c (revision 6884) +++ core/ucma.c (working copy) @@ -41,6 +41,8 @@ #include <rdma/ib_marshall.h> #include <rdma/rdma_cm.h> +#include

[openib-general] [PATCH 3/3] librdmacm: add ability to get/set transport specific options

2006-05-04 Thread Sean Hefty
Add routines to the userspace RDMA CM library to get/set transport specific options. Add an option to retrieve possible path records for a connection, and set which path a connection will be established on. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/rdma/rdma_cma_abi.h

Re: [openib-general] Re: [PATCH] SRP: Avoid a potential deadlock

2006-05-03 Thread Sean Hefty
Roland Dreier wrote: I thought that after the DREP is received, the CM will go through timewait and we will eventually get a TIMEWAIT_EXIT event (with a completion). Am I wrong? Have you actually seen this deadlock happen in practice? This should be the case. TIMEWAIT_EXIT should follow

Re: [openib-general] [PATCH 1 of 2] add lmc cache

2006-05-03 Thread Sean Hefty
Michael S. Tsirkin wrote: Add LMC cache. Committed - thanks!

Re: [openib-general] [PATCH 2 of 2] mad: check GID/LID when searching for request

2006-05-03 Thread Sean Hefty
Applied - thanks. - Sean

RE: [openib-general] [PATCH] RDMA CM: assign port numbers when binding a cm_id to an address

2006-05-02 Thread Sean Hefty
Rds uses RDMA_PS_UDP. Here is a patch to add that. I thought that RDS established a connection. (Maybe it should be called a channel multiplexing service?) I don't think that we want to use the RDMA UDP port space for connected QPs. That should be reserved for UD QPs. Can't RDS sit over the

RE: [openib-general] re cma upcalls serialization / disconnected event question

2006-05-02 Thread Sean Hefty
I see, so just to make sure: following rdma_connect I will always see one of {ESTABLISHED, REJECTED, CONNECT_ERROR}? Or DEVICE_REMOVAL, but those are the typical callbacks. - Sean

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-02 Thread Sean Hefty
And I don't believe that case 3 exists either, but would end up being treated as DS RMPP by the implementation. Why ? Just wondering... If case 3 doesn't exist, then I think we can come up with a generic way to identify DS RMPP that doesn't require checking class or methods. How ? The MAD

[openib-general] RE: Re: [PATCH v2] mad: use GID/LID on requester side when matching responses to requests

2006-05-02 Thread Sean Hefty
It probably would be better to commit it as a separate patch -- one idea per patch. so I understand he's fine with it, and the comment was with regard to how to commit this - first core files, then MAD files. Anyway, it's trivial to split the patch, if you want help with that let me know. I'm a

RE: [openib-general] [RFC] [PATCH 1/3] RDMA CM: add rdma_get/set_option calls to get/set path records

2006-05-02 Thread Sean Hefty
Agreed... as an interface to userspace, get/set opt makes sense, but inside the kernel you just end up with a dispatch function that demultiplexes things to the real work. So I think the real work functions should be the kernel API. The dispatch function would still be there for userspace, but

RE: [openib-general] re cma upcalls serialization / disconnected event question

2006-05-01 Thread Sean Hefty
Can a ULP assume that cma callbacks for the same CMA ID are serialized? Yes. (This is required to avoid reporting events out of order to the user.) Also and related to this, is it correct that ***always*** before DISCONNECTED event there will be one of {ESTABLISHED, REJECTED, CONNECT_ERROR}?

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-01 Thread Sean Hefty
There is a real issue that is seen when a duplicate request (same TID, SGID, mgmt class) is received at the client, resulting in a duplicate response. You had mentioned in the previous email on this that this was the case of a slow responder. Is the responder slow but playing by the IB timeouts

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-01 Thread Sean Hefty
There's still a window here depending on when free MAD is called versus when the response gets back to the original requester. There are no issues in this case. We just need to avoid having two responses being sent at the same time. but I'm not sure if this would happen in practice. A

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-01 Thread Sean Hefty
Aren't there 3 cases possible here: (1) non RMPP request/RMPP response (e.g. SA GetTable for one), (2) RMPP request/RMPP response (e.g. SA GetMulti), and (3) RMPP request/non RMPP response (I don't think this currently exists but may be mistaken). Are all handled on the initiator/requester side ?

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-01 Thread Sean Hefty
Why is the requester resending ? He's simply timed out waiting for a response. For instance, if this is an SA query, maybe the SA is swamped with requests. I don't think that there are any timeout restrictions for this. - Sean

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-05-01 Thread Sean Hefty
It needs to fail in a way so that it is not retried, right ? The ib_post_send_mad() call will fail. Since the first response removed the request from the list to check, subsequent retries will also fail. Basically, this prevents a user from sending a response MAD unless it had previously

RE: [openib-general] RFC: detecting duplicate MAD requests

2006-04-29 Thread Sean Hefty
You can't add this kind of thing piecemeal to a protocol and have it work. If the sender doesn't see a response (perhaps the response was lost, or was slow coming), and sends another MAD, this 2nd MAD will have a different sequence number. How does the recipient know it's the If a MAD is sent

RE: [openib-general] Re: [PATCH v2] mad: use GID/LID on requester side when matching responses to requests

2006-04-29 Thread Sean Hefty
Check GID/LID on the requester side when searching for the request which matches a received response. This guarantees uniqueness if the same TID is used when requesting via multiple source LIDs (when LMC is not zero). To perform the check, add LMC to the cache. Further, do not perform LID check for

[openib-general] RE: RFC: detecting duplicate MAD requests

2006-04-29 Thread Sean Hefty
I understand that this is along the lines of the approach proposed by Jack a while ago: https://openib.org/svn/trunk/contrib/mellanox/gen2/patches/mad_rmpp_requester_retry.patch with the difference that duplicate request detection will be handled by full GID/TID/class/request/response matching,

Re: [openib-general] [PATCH] [RFC] dapltest change for iwarp

2006-04-28 Thread Sean Hefty
Steve Wise wrote: This patch changes the dapltest transaction test to force the client side (the side that calls dat_ep_connect()) to send the first RDMA message. This ensures that the IWARP MPA protocol requirements are met. I'm presenting this for discussion and possible inclusion in the trunk.

Re: [openib-general] [PATCH] [RFC] dapltest change for iwarp

2006-04-28 Thread Sean Hefty
Steve Wise wrote: The Chelsio RNIC has this issue. If the server sends the first FPDU _before_ the client driver has moved the connection/qp into RDMA mode, then the data is placed as streaming data and the connection must be terminated (dapltest 6 exposes this intermittently). Ammasso doesn't

[openib-general] [PATCH] rdma cm: add sdp version checks

2006-04-28 Thread Sean Hefty
is received, the connection will be aborted. This will result in sending a consumer REJ message when the cm_id is destroyed. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Michael, can you test this against SDP? This patch should address CA4-15, 17, and 22. CA4-43 is handled automatically by the IB CM

[openib-general] RFC: detecting duplicate MAD requests

2006-04-28 Thread Sean Hefty
Today, a request MAD received by the MAD layer is handed to a client. The client processes the MAD, and generates a response. If the client is slow to process the MAD, the request may have been resent. The duplicate request is also handed to the client. The result is that clients perform

RE: [openib-general] [PATCH 5/6] iser RDMA CM (CMA) and IB verbs interaction

2006-04-28 Thread Sean Hefty
+static int iser_free_device_ib_res(struct iser_device *device) +{ + BUG_ON(device->mr == NULL); + + tasklet_kill(&device->cq_tasklet); + + (void)ib_dereg_mr(device->mr); + (void)ib_destroy_cq(device->cq); + (void)ib_dealloc_pd(device->pd); + + device->mr = NULL; +

[openib-general] RE: SDP hello ack header

2006-04-27 Thread Sean Hefty
Sean, CMA does not seem to set MajV/MinV in SDP hello ack header (REP). It does do this for hello header (REQ). Should SDP do this then? I don't think that the CMA cares about the hello ack, but I think it makes more sense for it to set it, since it does for the hh. Which is your preference? -

[openib-general] RE: SDP hello ack header

2006-04-27 Thread Sean Hefty
I don't care much. It seems to make sense for CMA to set it. I created a patch for this. The easiest fix was to set the version in the private_data passed to the CMA by SDP; however, the private_data is declared as const void *. This made me stop and think about the problem more. Both the CMA

[openib-general] [PATCH] rdma_cm: let SDP control the SDP version in the hello header

2006-04-27 Thread Sean Hefty
These are the changes to the CMA that I was considering. This patch lets SDP determine which version of the SDP headers to use. The CMA will check that it can support that version, and set the address fields appropriately. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: cma.c

RE: [openib-general] [PATCH] rdma_cm: let SDP control the SDP version in the hello header

2006-04-27 Thread Sean Hefty
I think if you want to do it this way you should check only the MajV. SDP spec says: A4.3.2.1.1 MAJOR PROTOCOL VERSION NUMBER (MAJV) - 4 BITS The current specification requires MajV to be set to 2. See section A4.5.1 Connection Setup on page 1218 for additional information. CA4-15: The accepting

[openib-general] RE: SDP hello ack header

2006-04-27 Thread Sean Hefty
BTW, does CMA check MajV in incoming messages? It does not seem to. If not this needs to be corrected: It checks both the major and minor version on an incoming REQ. See cma_get_net_info() in the cma. A failure will result in sending a REJ, but probably not with the right reject reason / data. But

RE: [openib-general] [PATCH] rdma_cm: let SDP control the SDP version in the hello header

2006-04-27 Thread Sean Hefty
Go ahead. BTW, I'm reasonably sure CMA does not check MajV at least in incoming HelloAck. I think since you must check it in incoming Hello in CMA, it's best to check it in incoming HelloAck in CMA as well. Another validation check needed in CMA: CA4-17: The accepting peer shall reject the

RE: [openib-general] [PATCH] - kernel cmatose fix

2006-04-27 Thread Sean Hefty
Thanks - applied.

RE: [openib-general] [PATCH] rdma_cm: let SDP control the SDP version in the hello header

2006-04-27 Thread Sean Hefty
Could the CMA handoff parsing/evaluation of the Hello/HelloAck exchange much the way it hands off private data to the ULP? It's slightly different with SDP. With other ULPs, the CMA inserts its own header at the start of the private data, and then strips it off on the remote side. However, with
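
The header handling described for the other ULPs (a header inserted ahead of the private data, then stripped on the remote side) can be sketched roughly as follows. The `demo_hdr` layout and both function names are hypothetical illustrations, not the CMA's actual wire format or API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header; not the real CMA header layout. */
struct demo_hdr {
    uint8_t  version;
    uint8_t  reserved;
    uint16_t port;
};

/* Sender side: write the header, then the ULP's private data after it.
 * Returns the total number of bytes written, or -1 if it does not fit. */
static int demo_wrap(uint8_t *out, size_t out_cap, const struct demo_hdr *hdr,
                     const void *payload, size_t payload_len)
{
    if (out_cap < sizeof(*hdr) || payload_len > out_cap - sizeof(*hdr))
        return -1;
    memcpy(out, hdr, sizeof(*hdr));
    if (payload_len)
        memcpy(out + sizeof(*hdr), payload, payload_len);
    return (int)(sizeof(*hdr) + payload_len);
}

/* Receiver side: copy the header out and expose only the bytes after it,
 * so the ULP never sees the inserted header. Returns 0 or -1. */
static int demo_unwrap(const uint8_t *in, size_t in_len, struct demo_hdr *hdr,
                       const uint8_t **payload, size_t *payload_len)
{
    if (in_len < sizeof(*hdr))
        return -1;
    memcpy(hdr, in, sizeof(*hdr));
    *payload = in + sizeof(*hdr);
    *payload_len = in_len - sizeof(*hdr);
    return 0;
}
```

One consequence of this scheme is that the ULP has less private-data space available than the underlying message provides, since the inserted header consumes part of it.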

[openib-general] Re: cma unitests / cma local connections

2006-04-26 Thread Sean Hefty
Or Gerlitz wrote: +1 what is the recommended cma kernel unitest, i recall there was cmatose and krping (i might be wrong re the testname) also can you please point me to the program SVN location and if there are such to minimal running instructions... Both of those test programs are in the

Re: [openib-general] [RFC] [PATCH 1/3] RDMA CM: addrdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
Michael S. Tsirkin wrote: Sean, what's up with patch numbering? The second patch labeled 1/3 is really 2/3. I resent 2/3 with the correct subject heading. +static ssize_t ucma_set_option(struct ucma_file *file, const char __user *inbuf, + int in_len, int

Re: [openib-general] [RFC] [PATCH 1/3] RDMA CM: add rdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
Michael S. Tsirkin wrote: +int rdma_set_option(struct rdma_cm_id *id, int level, int optname, + void *optval, size_t optlen); + It seems optval is a user pointer. Should it be marked as such: void __user *? The getsockopt / setsockopt calls both use char *optval in their

Re: [openib-general] [RFC] [PATCH 1/3] RDMA CM: add rdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
Michael S. Tsirkin wrote: The getsockopt / setsockopt calls both use char *optval in their interfaces. Internally, they do get_user(), put_user(), copy_to_user(), etc. It's my understanding, which could be way off, that both getsockopt and setsockopt are also callable from kernel modules. I

RE: [openib-general] [RFC] [PATCH 1/3] RDMA CM: addrdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
What about for IB HCAs? Are there a large number of options that have not yet been exposed but which are device independent and *might* be desirable to control? If not, then why introduce a catchall interface as opposed to specific interfaces that have to be justified on a per method basis? Sockets

[openib-general] Re: [RFC] [PATCH 1/3] RDMA CM: addrdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
Michael S. Tsirkin wrote: Sockets provides setsockopt/getsockopt calls, and there is an attempt here to emulate sockets. Maybe we should create a socket instead of a char device in ucma? Then we get bind/listen/connect/accept for free. This sounds worth considering. I'm not sure what it

Re: [openib-general] [RFC] [PATCH 1/3] RDMA CM: addrdma_get/set_optioncalls to get/set path records

2006-04-26 Thread Sean Hefty
Caitlin Bestler wrote: * Bind to a device based on IB specific addresses (e.g. GIDs). * Getting usable path records between two nodes. * Setting primary and alternate paths for a connection. * Modify an alternate path for a connection. * Joining a multicast group identified by an IP address. *

Re: [openib-general] slab error while removing ib_mad

2006-04-26 Thread Sean Hefty
Or Gerlitz wrote: I am getting the below trace on 2.6.17-rc2 / AMD x86_64 / PCIX HCA with both the IB sources that come with the kernel and svn trunk 6520. This happens if i just modprobe -r ib_mthca after fresh reboot, can anyone reproduce it on her/his system as well? The module does get

[openib-general] Re: [PATCH] RDMA CM: allow listen without prior binding to listen on any address

2006-04-25 Thread Sean Hefty
Sean Hefty wrote: Allow calling rdma_listen() without calling rdma_bind_addr() beforehand. This will result in binding to any address / any port before listening. I've committed this change. - Sean

[openib-general] [RFC] [PATCH 3/3] RDMA CM: add rdma_get/set_option calls to userspace library

2006-04-25 Thread Sean Hefty
Support rdma_get_option / rdma_set_option through the userspace RDMA CM library. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- Index: include/rdma/rdma_cma_abi.h === --- include/rdma/rdma_cma_abi.h (revision 6335) +++ include/rdma
