[openib-general] [PATCH] reference counting added to ib_mad_agent

2004-09-24 Thread Sean Hefty
This patch adds reference counting for MAD agents to protect against deregistration while a callback is being invoked. As part of the structure changes to support reference counting, deregistration code has been simplified, and a bug has been fixed where multiple port structures were being
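
A minimal user-space sketch of the idea in the snippet above: hold a reference across each callback so that deregistration cannot finish while a callback is still running. The struct and every demo_* name are hypothetical and do not reflect the actual ib_mad_agent layout or the patch itself; C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a MAD agent; not the real ib_mad_agent. */
struct demo_agent {
    atomic_int refcount;   /* starts at 1 for the registration itself */
    bool registered;
    int callbacks_run;
};

static void demo_agent_init(struct demo_agent *a)
{
    atomic_init(&a->refcount, 1);
    a->registered = true;
    a->callbacks_run = 0;
}

/* Take a reference, e.g. before invoking a callback on the agent. */
static void demo_agent_get(struct demo_agent *a)
{
    atomic_fetch_add(&a->refcount, 1);
}

/* Drop a reference; returns true when the last reference is gone. */
static bool demo_agent_put(struct demo_agent *a)
{
    int prev = atomic_fetch_sub(&a->refcount, 1);
    assert(prev > 0);
    return prev == 1;
}

/* Run one callback while holding a reference for its duration. */
static bool demo_agent_invoke(struct demo_agent *a)
{
    demo_agent_get(a);
    a->callbacks_run++;          /* the "callback" body */
    return demo_agent_put(a);
}

/*
 * Drop the registration reference. A real implementation would wait
 * here until the count reaches zero; this sketch only reports whether
 * the agent may be freed immediately.
 */
static bool demo_agent_deregister(struct demo_agent *a)
{
    a->registered = false;
    return demo_agent_put(a);
}
```

In this sketch, deregistering while another context still holds a reference returns false, and the final put by that context returns true, which is the point at which freeing would be safe.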

[openib-general] Re: [PATCH] fix list_entry usage

2004-10-14 Thread Sean Hefty
On Thu, 14 Oct 2004 08:30:45 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Wed, 2004-10-13 at 19:52, Sean Hefty wrote: Patch fixes casting to incorrect structures when calling list_entry(). Have you tried this ? It doesn't work (at least for me). (I had also tried the previous similar

Re: [openib-general] [PATCH] ib_mad: Fix send only registrations

2004-10-14 Thread Sean Hefty
On Thu, 14 Oct 2004 13:33:08 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: /* Validate MAD registration request if supplied */ if (mad_reg_req) { - if (!recv_handler || - mad_reg_req->mgmt_class_version >= MAX_MGMT_VERSION) { + if

Re: [openib-general] Re: updated TODO list

2004-09-28 Thread Sean Hefty
On Mon, 27 Sep 2004 18:25:41 -0700 Roland Dreier [EMAIL PROTECTED] wrote: - Implement API for SA path record and MC group queries I spent a couple of days trying to define a basic query API for inclusion in the access layer, but eventually stopped. With the current MAD API, the benefits

Re: [openib-general] [PATCH] cancel outstanding MADs when deregistering

2004-09-29 Thread Sean Hefty
On Tue, 28 Sep 2004 12:50:52 -0700 Roland Dreier [EMAIL PROTECTED] wrote: It looks OK for current functionality but I think it will have to change to support cancelling sends. (Cancelling sends is required for consumers that start a query with a long timeout and then want to unload or

Re: [openib-general] [PATCH] ib_cancel_mad API

2004-09-29 Thread Sean Hefty
On Wed, 29 Sep 2004 11:17:05 -0700 Sean Hefty [EMAIL PROTECTED] wrote: + if (mad_send_wr->refcount == 0) { + list_del(&mad_send_wr->agent_send_list); + spin_unlock_irqrestore(&mad_agent_priv->send_list_lock, flags); + + mad_send_wc.status

Re: [openib-general] [PATCH] ib_cancel_mad API

2004-09-29 Thread Sean Hefty
On Wed, 29 Sep 2004 15:20:20 -0700 Roland Dreier [EMAIL PROTECTED] wrote: Sean Currently, the consumer _only_ has to free their send Sean context in their send MAD completion handler. No reference Sean counting by the consumer is needed. And it doesn't matter Sean if a send

[openib-general] MAD request/response completion order

2004-09-29 Thread Sean Hefty
Does anyone have a preference which order request/response MADs complete? Sends first always? Receives first always? Whatever is convenient? - Sean -- ___ openib-general mailing list [EMAIL PROTECTED]

Re: [openib-general] MAD request/response completion order

2004-09-30 Thread Sean Hefty
On Wed, 29 Sep 2004 16:18:47 -0700 Fab Tillier [EMAIL PROTECTED] wrote: From: Sean Hefty [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 29, 2004 4:09 PM Does anyone have a preference which order request/response MADs complete? Sends first always? Receives first always? Whatever

Re: [openib-general] mthca and DDR not hidden

2004-09-30 Thread Sean Hefty
On Thu, 30 Sep 2004 16:40:05 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Thu, 2004-09-30 at 16:32, Roland Dreier wrote: Also, the consumer must either create its own MR using that PD, or have access to at least the L_Key for the MAD layer's MR. That would be an acceptable solution

[openib-general] Re: [PATCH] request/response matching in MAD code

2004-10-01 Thread Sean Hefty
On Fri, 01 Oct 2004 12:33:44 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: mad_send_wr->tid = ((struct ib_mad_hdr*) bus_to_virt(cur_send_wr->sg_list->addr))->tid; Thanks - good catch. 2. Added the following to reassemble_recv (it was eliminated

[openib-general] Re: ib_mad: Scenarios for returning posted send MADs

2004-10-04 Thread Sean Hefty
On Mon, 04 Oct 2004 13:26:42 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: There are two lists of posted send MADs: (1) a list of posted sends for the port, and (2) another list per MAD agent. When a send is first posted, it is placed on both lists until the send completion occurs and then is

Re: [openib-general] mthca and DDR not hidden

2004-10-04 Thread Sean Hefty
On Mon, 04 Oct 2004 13:36:26 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Sat, 2004-10-02 at 15:26, Michael S. Tsirkin wrote: I'd like to suggest that the mad layer could expose an allocator function that the user will call to grab the memory for the mad. This function would return the

[openib-general] Re: ib_mad: Scenarios for returning posted send MADs

2004-10-04 Thread Sean Hefty
On Mon, 04 Oct 2004 15:34:51 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: I am pretty sure there is a window here as follows: First, deregistration cancels the MAD removing it from the agent send list. ib_mad_complete_send_wr is invoked some time later and never checks for the send WR still

Re: [openib-general] mthca and DDR not hidden

2004-10-05 Thread Sean Hefty
On Tue, 5 Oct 2004 17:34:32 +0200 Michael S. Tsirkin [EMAIL PROTECTED] wrote: Hello! Quoting r. Sean Hefty ([EMAIL PROTECTED]) Re: [openib-general] mthca and DDR not hidden: Having an allocator routine might force users to perform data copies when sending data. Well, no one is running

[openib-general] Re: ib_mad: Scenarios for returning posted send MADs

2004-10-05 Thread Sean Hefty
On Tue, 05 Oct 2004 09:31:43 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: Do you mean that you should never get a callback for a mad_send_wr if (rather than unless) its reference count is at least one? There will never be a completion callback associated with a mad_send_wr unless its

RE: [openib-general] [PATCH] ib_mad.c: Fix request/response matching

2004-10-05 Thread Sean Hefty
Hal The tid of a request is needed so responses can be matched. Hal One way around this would be to pass the TID as a separate Hal parameter in the ib_post_send_mad call. Maybe there are other Hal less brute force ways. I don't see a way around adding a TID parameter to

RE: [openib-general] [PATCH] ib_mad.c: Fix request/response matching

2004-10-05 Thread Sean Hefty
Fix endian of high tid so responses are properly matched to requests. The TID is in the MAD and goes on the wire. Please, do not use CPU endian! mad_send_wr->tid = ((struct ib_mad_hdr*) - bus_to_virt(cur_send_wr->sg_list->addr))->tid; +
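
A small sketch of why byte order matters for request/response matching. It assumes (the snippet does not confirm this) that the transaction ID is a 64-bit big-endian field on the wire and that its upper 32 bits identify the client. The demo_* helpers are hypothetical and work on raw bytes, so the result is the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Store a TID in wire (big-endian) order, independent of CPU endianness. */
static void demo_store_tid(unsigned char wire_tid[8], uint32_t hi, uint32_t lo)
{
    for (int i = 0; i < 4; i++) {
        wire_tid[i]     = (unsigned char)((hi >> (24 - 8 * i)) & 0xff);
        wire_tid[4 + i] = (unsigned char)((lo >> (24 - 8 * i)) & 0xff);
    }
}

/* Recover the high 32 bits from the wire bytes for client lookup. */
static uint32_t demo_hi_tid(const unsigned char wire_tid[8])
{
    return ((uint32_t)wire_tid[0] << 24) |
           ((uint32_t)wire_tid[1] << 16) |
           ((uint32_t)wire_tid[2] << 8)  |
            (uint32_t)wire_tid[3];
}
```

Reading the field as a native integer on a little-endian host would instead yield a byte-swapped value, so a lookup keyed on it would miss the request.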

Re: [openib-general] [PATCH] ib_mad: Fix return posted receive MAD routine

2004-10-05 Thread Sean Hefty
On Tue, 5 Oct 2004 13:42:54 -0700 Sean Hefty [EMAIL PROTECTED] wrote: On Tue, 05 Oct 2004 15:03:12 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: + rbuf = list_entry(port_priv->recv_posted_mad_list[i], + struct ib_mad_recv_buf, list

Re: [openib-general] [PATCH] ib_mad.c: Fix request/response matching

2004-10-05 Thread Sean Hefty
On Tue, 05 Oct 2004 16:57:58 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: mad_send_wr->tid = ((struct ib_mad_hdr*) - bus_to_virt(cur_send_wr->sg_list->addr))->tid; + bus_to_virt(cur_send_wr->sg_list->addr))->tid.id; A response MAD should have

Re: [openib-general] [PATCH] ib_mad.c: Fix request/response matching

2004-10-05 Thread Sean Hefty
On Tue, 05 Oct 2004 13:59:37 -0700 Roland Dreier [EMAIL PROTECTED] wrote: Hal Good point. We will need more than access to the TID for Hal RMPP. We need a replacement for bus_to_virt. Is there an Hal approved way to get from DMA address to VA ? No, you just need to save off the

Re: [openib-general] [PATCH] ib_mad.c: Fix request/response matching

2004-10-05 Thread Sean Hefty
On Tue, 05 Oct 2004 14:06:13 -0700 Roland Dreier [EMAIL PROTECTED] wrote: Sean Would we need multiple VAs if scatter-gather is used by the client? Yep. Or we could just say that all the fields the access layer needs to look at must be in the first s/g entry. You're right, and thinking

RE: [openib-general] [PATCH] ib_mad.h: Remove network endian conversion of QP1 QKey

2004-10-06 Thread Sean Hefty
On Wed, 2004-10-06 at 12:46, Fab Tillier wrote: From: Hal Rosenstock [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 06, 2004 9:42 AM ib_mad.h: Remove network endian conversion of QP1 QKey I don't get the point. Is IB_QP1_QKEY going to be treated in host order, and then swapped by

[openib-general] [PATCH] for review to timeout send MADs

2004-10-06 Thread Sean Hefty
Here are some modifications to support timing out send MADs in the access layer. I haven't tested this code beyond building it, but wanted to make it available for review. There are a few race conditions that need to be avoided when handling timeouts, so if it looks like something was missed,

RE: [openib-general] [PATCH] for review to timeout send MADs

2004-10-06 Thread Sean Hefty
It seems you are using the system-wide keventd queue. This isn't necessarily a problem per se, but it would probably be better to use a MAD-layer private workqueue (I suggested a single-threaded workqueue per MAD port earlier). This avoids two problems. First, keventd is subject to arbitrary

RE: [openib-general] Re: [PATCH] for review to timeout send MADs

2004-10-07 Thread Sean Hefty
On Wed, 2004-10-06 at 19:01, Sean Hefty wrote: but wanted to make it available for review. There are a few race conditions that need to be avoided when handling timeouts, so if it looks like something was missed, let me know. If it is not too much work, I would prefer this broken into 2

[openib-general] [PATCH] reformat code to within 80 columns

2004-10-07 Thread Sean Hefty
Only purpose of patch is to reformat code to keep it within 80 columns. The resulting code highlights some areas where we may want to look at restructuring it. - Sean Index: access/ib_mad.c === --- access/ib_mad.c (revision

Re: [openib-general] [PATCH] reformat code to within 80 columns

2004-10-07 Thread Sean Hefty
On Thu, 7 Oct 2004 11:49:43 -0700 Sean Hefty [EMAIL PROTECTED] wrote: Only purpose of patch is to reformat code to keep it within 80 columns. The resulting code highlights some areas where we may want to look at restructuring it. - Sean Same purpose - different file... SMI code, which

[openib-general] [PATCH] rename structure members

2004-10-07 Thread Sean Hefty
Here's a patch that just renames a few structure members related to MADs. The renamed variables will be used when handling MAD timeouts. - Sean -- Index: access/ib_mad_priv.h === --- access/ib_mad_priv.h(revision 955) +++

Re: [openib-general] L_Key/MR for sending MADs?

2004-10-21 Thread Sean Hefty
On Wed, 20 Oct 2004 22:28:45 -0700 Roland Dreier [EMAIL PROTECTED] wrote: A little while ago, we had a brief discussion about what MR consumers should use for MADs they want to send. It seems the two possibilities were for the MAD layer to expose its MR for consumer use, or for consumers to

[openib-general] [PATCH] timeout wq code

2004-10-21 Thread Sean Hefty
Code to use a work queue to time out MADs. There is one work queue per port. (Completion handling code was not changed.) I'm working on creating a few simple test cases to verify MAD functionality (registration, timeouts, sends, receives, and RMPP in the future), but these are not yet done.

Re: [openib-general] Handling SM class (SMInfo vs. other queries)

2004-10-25 Thread Sean Hefty
On Mon, 25 Oct 2004 12:39:36 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: OK. It's pretty straightforward to change the MAD layer to use PLM rather than snoop MAD (and remove snoop_mad (undo that patch)). Should I post the changes ? I think that this makes sense. - Sean

Re: [openib-general] [PATCH] ib_mad: In ib_mad_complete_recv, decrement agent refcount when not fully reassembled and when no request found

2004-10-25 Thread Sean Hefty
On Sun, 24 Oct 2004 13:38:01 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: ib_mad: In ib_mad_complete_recv, decrement agent reference count when receive is not fully reassembled, and also when solicited and no matching request is found. This allows deregistration to complete rather than

Re: [openib-general] Handling SM class (SMInfo vs. other queries)

2004-10-25 Thread Sean Hefty
On Mon, 25 Oct 2004 10:08:51 -0700 Roland Dreier [EMAIL PROTECTED] wrote: Hal OK. It's pretty straightforward to change the MAD layer to Hal use PLM rather than snoop MAD (and remove snoop_mad (undo Hal that patch)). Should I post the changes ? It's my idea so I certainly like

Re: [openib-general] Handling SM class (SMInfo vs. other queries)

2004-10-25 Thread Sean Hefty
On Mon, 25 Oct 2004 10:34:09 -0700 Roland Dreier [EMAIL PROTECTED] wrote: Sean If the MAD is not consumed by the driver, the MAD Sean layer may update the MAD and call process_local_mad a second Sean time, correct? Sure, I guess so -- nothing should break if the MAD layer does

Re: [openib-general] ib_free_recv_mad and references

2004-10-27 Thread Sean Hefty
On Wed, 27 Oct 2004 10:08:45 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 12:40, Sean Hefty wrote: Currently, a call to ib_free_recv_mad does not dereference the mad_agent that the mad was given to. The call itself does not access the mad_agent, but should

Re: [openib-general] agent_mad_send

2004-10-27 Thread Sean Hefty
On Wed, 27 Oct 2004 09:47:25 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 18:29, Sean Hefty wrote: In agent_mad_send, a call is made to create an address handle. Immediately after calling ib_post_send_mad, the address handle is destroyed. I think that we want

[openib-general] [PATCH] change MAD completion processing to use workqueue

2004-10-27 Thread Sean Hefty
Index: access/ib_mad_priv.h === --- access/ib_mad_priv.h(revision 1078) +++ access/ib_mad_priv.h(working copy) @@ -153,6 +153,7 @@ struct ib_mad_mgmt_class_table *version[MAX_MGMT_VERSION]; struct

Re: [openib-general] 2 questions on physical code layout

2004-10-27 Thread Sean Hefty
On Wed, 27 Oct 2004 13:35:22 -0700 Roland Dreier [EMAIL PROTECTED] wrote: OK, I'm going to go ahead and rename ib_mad.c -> mad.c, ib_agent.c -> agent.c etc. (This also makes it possible to build a module named ib_mad.o, which I think makes more sense than ib_al.o, from multiple sources). I

Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler

2004-10-27 Thread Sean Hefty
On Tue, 26 Oct 2004 13:14:00 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 13:10, Roland Dreier wrote: Sean As a suggestion, we can allocate 2 CQs per QP, one for Sean receives, and one for sends. This would let us separate Sean send from receive

[openib-general] ib_mad_port_start allows receive processing before sends can be posted

2004-10-27 Thread Sean Hefty
There appears to be a minor race in ib_mad_port_start where the MAD layer could begin accepting and processing receives before the QP allows sends, or even before we know if the QP will finish initializing properly. This makes it difficult to handle traffic that comes in before the QP is

[openib-general] ib_mad_recv_wrid index field

2004-10-27 Thread Sean Hefty
What's the purpose behind the index field in the receive wr_id? - Sean ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [PATCH]spinlock shouldn't be held while calling ib_post_send()

2004-10-29 Thread Sean Hefty
On Fri, 29 Oct 2004 18:06:47 -0600 Shirley Ma [EMAIL PROTECTED] wrote: Here is the patch. Note that my patch removes the lock when calling ib_post_send. But, holding the lock when calling ib_post_send() should be fine. Also, the current completion code assumes that the work requests are

Re: [openib-general] [PATCH]code optimization in ib_register_mad_agent()

2004-10-29 Thread Sean Hefty
On Fri, 29 Oct 2004 17:35:40 -0600 Shirley Ma [EMAIL PROTECTED] wrote: I am starting to look at the access layer code. Here is a code optimization patch in ib_register_mad_agent(). ib_mad_client_id must be incremented while holding the spinlock (or converted into an atomic). The rest of the
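
The atomic alternative mentioned in the reply above can be sketched in user-space C as follows. The demo_* names are hypothetical, and C11 atomic_fetch_add stands in for whatever kernel primitive the real code would use; this is an illustration, not the change that was applied.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter; static storage, so it starts at zero. */
static atomic_uint demo_client_id;

/*
 * Return a fresh client ID. The increment and the read happen as one
 * atomic step, so two concurrent registrations cannot observe the same
 * value, which is the hazard of incrementing outside the spinlock.
 */
static uint32_t demo_next_client_id(void)
{
    return (uint32_t)(atomic_fetch_add(&demo_client_id, 1) + 1);
}
```

A plain `++counter` outside the lock is a separate load, add, and store, so two callers can both read the old value and hand out duplicate IDs.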

Re: [openib-general] [RFC] [PATCH] Remove redundant ib_qp_cap from 2 verb routines.

2004-11-01 Thread Sean Hefty
On Mon, 01 Nov 2004 11:24:35 -0500 Hal Rosenstock [EMAIL PROTECTED] wrote: On Fri, 2004-10-29 at 16:14, Sean Hefty wrote: On Fri, 29 Oct 2004 13:01:03 -0700 (PDT) Krishna Kumar [EMAIL PROTECTED] wrote: Hi, I know this changes the verbs interface a bit, but ... I don't see

Re: [openib-general] [PATCH]remove redundant assignment in ib_post_send_mad()

2004-11-01 Thread Sean Hefty
On Mon, 01 Nov 2004 18:39:59 -0500 Hal Rosenstock [EMAIL PROTECTED] wrote: I don't think this is an or. A check for *bad_send_wr should be added(which might be changed based on the below question). I will post a patch for this. IMO these should be BUG_ON but just errors as these are localized

Re: [openib-general] [PATCH] Missing check for atomic_dec in ib_post_send_mad

2004-11-02 Thread Sean Hefty
On Tue, 2 Nov 2004 09:59:14 -0800 (PST) Krishna Kumar [EMAIL PROTECTED] wrote: Hi Sean, I think that is the best approach. And using this method, we can also avoid holding the lock if solicited is set. I will send a patch in a few minutes if this approach looks good. Sounds good. I

Re: [openib-general] [PATCH] for review -- fix MAD completion handling

2004-11-02 Thread Sean Hefty
On Thu, 28 Oct 2004 23:30:00 -0700 Sean Hefty [EMAIL PROTECTED] wrote: Here's what I have to handle MAD completion handling. This patch tries to fix the issue of matching a completion (successful or error) with the corresponding work request. Some notes: Please use this patch instead. I

RE: [openib-general] [PATCH] Missing check for atomic_dec in ib_post_send_mad

2004-11-03 Thread Sean Hefty
Couple of issues with the new code (same as old code, though): 1. printk(KERN_ERR PFX "No client 0x%x for received MAD on port %d\n", hi_tid, port_priv->port_num); and printk(KERN_NOTICE PFX "No matching mad agent found for

RE: [openib-general] [PATCH] Initial checkin of userspace MAD access

2004-11-03 Thread Sean Hefty
I've just checked in an initial version of userspace MAD access (including documentation in docs/user_mad.txt). Unfortunately this is not quite ready for use underneath OpenSM, since it is not possible to register an agent for the SM classes (since they are currently grabbed by the kernel SMA

RE: [openib-general] [PATCH] Initial checkin of userspace MAD access

2004-11-03 Thread Sean Hefty
Another option is to revise the kernel MAD code so that it does not need to register an agent for the SM classes (ie pass all MADs to low-level driver first). I thought that we had decided to go this route, and replace snoop_mad with calls to process_mad. If we're in agreement on this, I can do

RE: [openib-general] [PATCH] Initial checkin of userspace MAD access

2004-11-03 Thread Sean Hefty
On Wed, 2004-11-03 at 12:00, Sean Hefty wrote: Is anyone willing to work on porting opensm to this? If not, I can start on this. Otherwise, I will continue working on adding MAD error/overrun handling. Shahar from Voltaire will be doing this. I am working now on modifying the MAD layer

Re: [openib-general] [PATCH] Initial checkin of userspace MAD access

2004-11-03 Thread Sean Hefty
Roland Dreier wrote: Do the names /dev/infiniband/mthca0/umad1 and so on make sense to people? I thought that userspace verbs support would probably use a file like /dev/infiniband/mthca0/verbs, etc. I think that this approach is good. - Sean

Re: [openib-general] [PATCH 1/2] [RFC] Implement resize of CQ

2004-11-03 Thread Sean Hefty
Krishna Kumar wrote: qp_info->qp = ib_create_qp(port_priv->pd, qp_init_attr); - if (IS_ERR(qp_info->qp)) { - printk(KERN_ERR PFX "Couldn't create ib_mad QP%d\n", - get_spl_qp_index(qp_type)); + if (!IS_ERR(qp_info->qp)) { + struct

[openib-general] Reusing receive MADs

2004-11-04 Thread Sean Hefty
Is there any interest among people to reuse receive MADs? I.e. once allocated and mapped, the receive MAD and work request would be re-posted to the QP when freed. I ask because if people are interested in such an optimization at some point in the future, it will affect how I structure send

Re: [openib-general] [PATCH 1/2][RFC] Implement resize of CQ

2004-11-05 Thread Sean Hefty
Hal Rosenstock wrote: However I'm not sure I understand why the MAD layer wants to resize these objects -- given that the number of QPs is known in advance and that the MAD layer can choose how many work requests to post per QP, I'm not sure what is gained by trying to resize things dynamically.

[openib-general] ib_mad_recv_done_handler questions

2004-11-08 Thread Sean Hefty
Looking at the latest changes to ib_mad_recv_done_handler, I have a couple of questions: * If the underlying driver provides a process_mad routine, a response MAD is allocated every time a MAD is received on QP 0 or 1. Can we either push this allocation down into the HCA driver, or find an

[openib-general] MAD agent code comments

2004-11-08 Thread Sean Hefty
A couple of comments (so far) while tracing through the MAD agent code. * There are a couple of places where ib_get_agent_mad() will be called multiple times in the same execution path. For example agent_send calls it, as does agent_mad_send. I didn't check to see if the calls would return

Re: [openib-general] Re: IPoIB Completion Handling

2004-11-09 Thread Sean Hefty
Hal Rosenstock wrote: On Tue, 2004-11-09 at 10:37, Roland Dreier wrote: By the way, reposting the receives is not the right thing to do on error -- the QP will be in the error state, so any new work requests will just complete with a flush status. We need to reset the QP and start over to recover

[openib-general] error trying to bring up node

2004-11-09 Thread Sean Hefty
I have two nodes directly connected. When trying to bring up the openib node, I receive a local length error on the CQ after trying to perform a send. I'm continuing to debug... - Sean

Re: [openib-general] error trying to bring up node

2004-11-09 Thread Sean Hefty
Sean Hefty wrote: I have two nodes directly connected. When trying to bring up the openib node, I receive a local length error on the CQ after trying to perform a send. I'm continuing to debug... static int agent_mad_send(struct ib_mad_agent *mad_agent, struct

Re: [openib-general] error trying to bring up node

2004-11-09 Thread Sean Hefty
Hal Rosenstock wrote: On Tue, 2004-11-09 at 14:56, Sean Hefty wrote: Sean Hefty wrote: I have two nodes directly connected. When trying to bring up the openib node, I receive a local length error on the CQ after trying to perform a send. I'm continuing to debug... static int agent_mad_send

[openib-general] [PATCH] handle QP0/1 send queue overrun

2004-11-09 Thread Sean Hefty
The following patch adds support for handling QP0/1 send queue overrun, along with a couple of related fixes: * The patch includes that provided by Roland in order to configure the fabric. * The code no longer modifies the user's send_wr structures when sending a MAD. * Sent MADs work requests

Re: [openib-general] [PATCH] agent: Fix agent_mad_send PCI mapping and gather address and length

2004-11-10 Thread Sean Hefty
Hal Rosenstock wrote: This is a separate issue from the ports not becoming active (DR handling issue). I broke this part yesterday (not a good day at all :-( at either r1184 and/or r1181 when I added what I thought was correct based on Sean's emails (not dispatching additional error cases in

Re: [openib-general] [PATCH] agent: Handle out of order send completions

2004-11-10 Thread Sean Hefty
Hal Rosenstock wrote: - send_wr.wr_id = ++port_priv->wr_id; + send_wr.wr_id = (unsigned long)&agent_send_wr->send_list; {snip} + send_wr = (struct list_head *)(unsigned long)mad_send_wc->wr_id; + agent_send_wr = container_of(send_wr, struct ib_agent_send_wr,
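
The quoted hunk appears to store a pointer to the list entry in wr_id and to recover the enclosing structure when the completion arrives, so that completions can be matched even when they arrive out of order. A minimal sketch of that round trip, assuming hypothetical demo_* types rather than the real ib_agent_send_wr, and a local macro in place of the kernel's container_of:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's list_head or ib_agent_send_wr. */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

struct demo_send_wr {
    int payload;
    struct demo_list_head send_list;   /* embedded list entry */
};

/* Step back from a member pointer to the structure that contains it. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Encode the address of the embedded list entry as the 64-bit wr_id. */
static uint64_t demo_encode_wr_id(struct demo_send_wr *wr)
{
    return (uint64_t)(uintptr_t)&wr->send_list;
}

/* On completion, turn the wr_id back into the owning work request. */
static struct demo_send_wr *demo_decode_wr_id(uint64_t wr_id)
{
    struct demo_list_head *entry = (struct demo_list_head *)(uintptr_t)wr_id;
    return demo_container_of(entry, struct demo_send_wr, send_list);
}
```

Compared with an incrementing counter, a pointer-valued wr_id needs no search to find the work request, at the cost of requiring the structure to stay alive until its completion is processed.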

Re: [openib-general] Re: [PATCH] handle QP0/1 send queue overrun

2004-11-10 Thread Sean Hefty
Hal Rosenstock wrote: 1. Why was BUG_ON removed from dequeue_mad ? That can be put back. I removed queue_mad, and was going to remove dequeue_mad, but decided to leave it. 2. A couple of questions related to send_wr-num_sge checking. a. Should this be pushed down to mthca and detected there

[openib-general] [PATCH] adjust error checking in ib_post_send_mad

2004-11-10 Thread Sean Hefty
Removes unneeded check and relocates other to while loop. - Sean Index: core/mad.c === --- core/mad.c (revision 1197) +++ core/mad.c (working copy) @@ -518,14 +518,10 @@ if (!bad_send_wr) goto error1; -

[openib-general] QP error handling

2004-11-11 Thread Sean Hefty
I'm trying to force errors on QP0/1 to see if my changes can recover from them. I force the errors by sending with an invalid lkey. Based on the implementation of mthca, what can be expected? I'm not seeing the QP event handler get invoked. I do receive a completion error, followed by

[openib-general] Re: QP error handling

2004-11-11 Thread Sean Hefty
Roland Dreier wrote: mthca currently doesn't handle these 'asynchronous' state transitions (ie transition to error). It continues to think the QP is in the RTS state. Proper handling needs to be implemented. Ok - thanks for the info. However should there be a QP event for a send with invalid

[openib-general] [PATCH] [2/2] change QP state to SQE

2004-11-11 Thread Sean Hefty
This should transition the QP state to SQE when encountering a send error on the CQ. There may be a better way of doing this; I didn't spend a lot of time studying the code. - Sean Index: mthca_dev.h === --- mthca_dev.h (revision

[openib-general] Re: [PATCH] [1/2] SQE handling on MAD QPs

2004-11-12 Thread Sean Hefty
Hal Rosenstock wrote: On Thu, 2004-11-11 at 20:41, Sean Hefty wrote: This patch recovers from send queue errors on QP 0/1. (It should also work in the case of fatal errors, but does not try to recover.) Code was tested by forcing send errors and checking that the port could still go to active

Re: [openib-general] [PATCH] agent: Fix agent_mad_send PCI mapping and gather address and length

2004-11-12 Thread Sean Hefty
Hal Rosenstock wrote: On Wed, 2004-11-10 at 11:59, Roland Dreier wrote: Sean What exactly does it mean then when process_mad returns Sean success? Do any of the return bits from process_mad Sean indicate that the MAD was for the HCA driver? SUCCESS means that process_mad didn't encounter

[openib-general] Re: [PATCH] [2/2] change QP state to SQE

2004-11-12 Thread Sean Hefty
Roland Dreier wrote: I thought about this a little, and it seems that having the CQ poll operation update the QP state is not the right solution. It seems it would be better to add support for the Current QP state modifier for the modify QP operation and expect the consumer to use that to

[openib-general] [PATCH] Remove unneeded call in MAD code

2004-11-12 Thread Sean Hefty
This patch removes ib_mad_return_posted_send_mads, which isn't needed when shutting down. There cannot be any sends outstanding at this point, or clients still exist. - Sean Index: core/mad.c === --- core/mad.c (revision 1222) +++

[openib-general] [PATCH] collapse MAD function calls

2004-11-12 Thread Sean Hefty
This patch collapses several function calls into one when activating the MAD QPs. This avoids repeated allocation/freeing of memory. I have plans to examine the QP transitions to the reset state to see if these are necessary and if a race condition exists between shutting down a port and

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Sean Hefty
On Fri, 12 Nov 2004 22:08:14 -0500 Hal Rosenstock [EMAIL PROTECTED] wrote: This patch looks like it includes the previous patch and due to this 2 large hunks are rejected. Can you regenerate this ? Updated patch. - Sean Index: core/mad.c

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Sean Hefty
On Mon, 15 Nov 2004 14:50:06 -0500 Hal Rosenstock [EMAIL PROTECTED] wrote: On Mon, 2004-11-15 at 13:29, Sean Hefty wrote: On Fri, 12 Nov 2004 22:08:14 -0500 Hal Rosenstock [EMAIL PROTECTED] wrote: This patch looks like it includes the previous patch and due to this 2 large hunks

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: After Roland's query this AM, I am looking at this some more: On Wed, 2004-11-10 at 13:43, Sean Hefty wrote: The second case where I can see this happening is if the client canceled the send, and I'm not sure that we'd want to give the client an unmatched response

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: My personal take would be to avoid adding that complexity. E.g. a client sends a MAD with TID 5, cancels 5, sends 5, cancels 5, sends 5. A response is now received. What should the MAD layer do? I don't see issues with silently dropping any MAD that we're not ready to

Re: [openib-general] [patch] mad.c, agent.c spinlocking on UP

2004-11-16 Thread Sean Hefty
Roland Dreier wrote: Bernhard Hi, from linux/spinlock.h: spin_is_locked on UP always Bernhard says FALSE Good catch. Bernhard please consider applying, Can we try and think of a fix that doesn't involve adding #ifdefs to the source file? Do we really need the BUG_ONs at all? I'd vote

Re: [openib-general] Re: Setting of MAD TID for user mode clients

2004-11-16 Thread Sean Hefty
Roland Dreier wrote: Hal Hi, Should it be the responsibility of user_mad or the client Hal itself to set the hi_tid ? Right now, it's in Hal user_mad::ib_umad_write. I think it has to be in the kernel (ie in user_mad.c) because we can't trust anything userspace gives us. agreed

[openib-general] RMPP implementation

2004-11-16 Thread Sean Hefty
I'm starting work on the RMPP implementation in the MAD code. If anyone has any ideas/preferences on the implementation, please let me know. For the send side, there are a couple of ways to perform the segmentation: 1. Issue one send at a time. Additional sends are not transferred until the

Re: [openib-general] RMPP implementation

2004-11-17 Thread Sean Hefty
Fab Tillier wrote: 1. Issue one send at a time. Additional sends are not transferred until the first send completes. Isn't #1 the simplest to implement? Turnaround on the send queue should be pretty quick, so send performance should be fine. I say do whatever is simplest, and then optimize

Re: [openib-general] RMPP implementation

2004-11-17 Thread Sean Hefty
Roland Dreier wrote: Would it make sense to figure out what the expected consumers of this RMPP support will be and what they will need before designing the RMPP implementation? Absolutely. Right now, I'm assuming opensm and SA query as the primary users. - Sean

Re: [openib-general] [PATCH][RFC/v1][6/12] Add IPoIB (IP-over-InfiniBand) driver

2004-11-18 Thread Sean Hefty
Hal Rosenstock wrote: Anyhow, we are within days of starting on this. There are 2 main portions of this: 1. Port to gen2 API 2. Fix build The other aspects can wait if necessary. How long before we need the first part ? Is there any expectation on how long code review would last ? Or would they

Re: [openib-general] Re: OpenIB Thread Usage

2004-11-19 Thread Sean Hefty
Roland Dreier wrote: I think the CM ends up needing its own set of workqueues so that it can queue MAD processing along with time wait events etc. Also we don't want the CM to block general MAD processing while it waits for things like QP modify. I thought about this approach, but wasn't sure

Re: [openib-general] [RFC] [PATCH] mad: Change mad thread model to be 1 thread/port rather than 1 thread/port/CPU

2004-11-19 Thread Sean Hefty
Hal Rosenstock wrote: Change mad thread model to be 1 thread/port rather than 1 thread/port/CPU (Note that I have not applied this but am requesting comments). Index: mad.c === --- mad.c (revision 1269) +++ mad.c (working copy) @@

[openib-general] [PATCH] cleanup/fixes for handle_outgoing_smp

2004-11-24 Thread Sean Hefty
This patch restructures handle_outgoing_smp to improve its readability and fixes the following issues: removes unneeded memory allocation for received SMP, properly sends a SMP if the underlying HCA driver does not provide a process_mad routine, and deallocates the allocated received SMP in all

Re: [openib-general] MAD registration for newer vendor classes

2004-11-29 Thread Sean Hefty
[EMAIL PROTECTED] wrote: Hi, For the newer vendor classes (0x30-0x4f), should we add OUI to the registration and put the demux into the MAD layer for these classes by OUI ? If so, I will work up a patch for this. I guess I need to re-examine the MAD dispatching, but I can't think of a reason why

[openib-general] Re: [PATCH] cleanup/fixes for handle_outgoing_smp

2004-11-29 Thread Sean Hefty
Hal Rosenstock wrote: This patch restructures handle_outgoing_smp to improve its readability I can't see for sure for your patch. The main changes are that the code is outdented and moved from nested if's to a switch statement. and fixes the following issues: removes unneeded memory allocation

[openib-general] [PATCH] [re-send] cleanup/fixes for handle_outgoing_smp

2004-11-29 Thread Sean Hefty
Index: core/mad.c
===
--- core/mad.c (revision 1291)
+++ core/mad.c (working copy)
@@ -366,108 +366,93 @@ struct ib_send_wr *send_wr)
{
int ret;
+ struct ib_mad_private *mad_priv;
+

Re: [openib-general] MAD registration for newer vendor classes

2004-11-29 Thread Sean Hefty
Hal Rosenstock wrote: Also, based on this, do you think it makes sense for an OpenIB OUI (if we are to utilize these classes) ? I think that it makes sense, but I'd wait until we actually have code that utilizes it. - Sean

[openib-general] Re: [PATCH] [re-send] cleanup/fixes for handle_outgoing_smp

2004-11-29 Thread Sean Hefty
Hal Rosenstock wrote: On Mon, 2004-11-29 at 14:40, Sean Hefty wrote:
- if (mad_agent_priv->agent.send_handler) {
- /* Now, complete send */
- mad_send_wc.status = IB_WC_SUCCESS;
- mad_send_wc.vendor_err = 0

[openib-general] [PATCH] added documentation for exported functions

2004-11-30 Thread Sean Hefty
Patch adds documentation for exported functions that did not have it in ib_verbs.h and device.c. Fixes slight formatting issue in ib_mad.h documentation. Patch will be committed shortly after sending this. - Sean Index: include/ib_verbs.h

Re: [openib-general] smpdump and current MAD layer

2004-11-30 Thread Sean Hefty
Hal Rosenstock wrote: Each received MAD can only have 1 client which owns it. That client is either determined via solicited routing or version/class/method (and soon OUI) routing. This is correct. This was done to avoid having to copy received MADs. So solicited MAD responses cannot currently be
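The single-owner routing described above can be illustrated with a small C sketch: a response is routed back to the agent that sent the request, and anything else is matched on version/class/method. Everything here is an assumption made for illustration: the `sketch_` structures, `find_owner`, keeping the sender's identity in the upper 32 bits of the transaction ID, and treating bit 0x80 of the method as the response bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_METHOD_RESP 0x80 /* assumed: response bit in the method field */

/* Hypothetical registered agent. */
struct sketch_agent {
    uint32_t hi_tid;        /* assumed: agent id carried in upper TID bits */
    uint8_t  mgmt_class;    /* class handled for unsolicited MADs */
    uint8_t  class_version; /* class version handled */
    uint64_t methods[2];    /* bitmap of unsolicited methods 0..127 */
};

/* Hypothetical view of the fields of a received MAD used for routing. */
struct sketch_mad {
    uint8_t  mgmt_class;
    uint8_t  class_version;
    uint8_t  method;
    uint64_t tid;
};

/* Return the one agent that owns the MAD, or NULL if nobody claims it. */
static const struct sketch_agent *
find_owner(const struct sketch_agent *agents, size_t n,
           const struct sketch_mad *mad)
{
    size_t i;

    assert(mad != NULL);

    if (mad->method & SKETCH_METHOD_RESP) {
        /* Solicited routing: match the requester recorded in the TID. */
        uint32_t hi = (uint32_t)(mad->tid >> 32);

        for (i = 0; i < n; i++)
            if (agents[i].hi_tid == hi)
                return &agents[i];
        return NULL;
    }

    /* Unsolicited routing: match class, version, and method bitmap. */
    for (i = 0; i < n; i++) {
        const struct sketch_agent *a = &agents[i];

        if (a->mgmt_class == mad->mgmt_class &&
            a->class_version == mad->class_version &&
            ((a->methods[mad->method >> 6] >> (mad->method & 63)) & 1))
            return a;
    }
    return NULL;
}
```

Because the lookup returns at most one agent, the received buffer can be handed over without copying, which is the motivation given in the message above.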

Re: [openib-general] smpdump and current MAD layer

2004-11-30 Thread Sean Hefty
Hal Rosenstock wrote: This is something that was briefly discussed before. I think that I would support snooping by extending the ib_mad_reg_req structure to indicate a registration type, possibly along with some additional filtering parameters. (We could also create a new snoop routine.)
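One way to picture the idea of a registration type plus filtering parameters is the sketch below. It is only a guess at the shape such an extension might take: `enum sketch_reg_type`, `struct sketch_reg_req`, its fields, and `snoop_wants` are hypothetical and are not taken from the MAD API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registration types: a normal owner or a passive snooper. */
enum sketch_reg_type {
    SKETCH_REG_OWNER,
    SKETCH_REG_SNOOP
};

/* Hypothetical registration request carrying a type and a simple filter. */
struct sketch_reg_req {
    enum sketch_reg_type type;
    int     any_class;  /* snoopers only: nonzero to see every class */
    uint8_t mgmt_class; /* snoopers only: class to watch otherwise */
};

/* Return 1 if this registration should be shown a MAD of the given
 * management class as a snooper, 0 otherwise. */
static int snoop_wants(const struct sketch_reg_req *req, uint8_t mgmt_class)
{
    assert(req != NULL);
    if (req->type != SKETCH_REG_SNOOP)
        return 0;
    return req->any_class || req->mgmt_class == mgmt_class;
}
```

Keeping the snoop decision separate from owner lookup means a MAD can still have exactly one owner while any number of snoopers observe it.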

Re: [openib-general] smpdump and current MAD layer

2004-12-02 Thread Sean Hefty
Hal Rosenstock wrote: So solicited MAD responses cannot currently be snooped nor can unsolicited ones for which an agent is registered (Since SMA and PMA are currently firmware based, the latter is not an issue for the current implementation). I've gotten a start on adding in the snooping support.

Re: [openib-general] smpdump and current MAD layer

2004-12-02 Thread Sean Hefty
Hal Rosenstock wrote: I'd like to place the snooping code in as few places as possible, but still be able to snoop locally processed MADs. Ideally a MAD should be snooped exactly once, which requires some extra care when handling QP errors. Snooping in the completion handling allows the MAD

Re: [openib-general] IPoIB still not working

2004-12-08 Thread Sean Hefty
Roland Dreier wrote: Eitan wrote: Results with - ERR 1B10: Provided Join State != FullMember - required for create. You can not create a group if you are not a full member. Right. However, ScopeState is dumped as 0x1, which means bit 0 of JoinState (the FullMember bit) is in

[openib-general] crash in mthca soon after loading drivers

2004-12-08 Thread Sean Hefty
I'm getting the following bug in mthca when loading the drivers (core, mad, and mthca). The system is attached to a fabric with opensm running on top of the Mellanox gold software stack. I hit this when running with the tip of openib. Any help would be, well, helpful. - Sean Dec 8 14:53:47
