This patch adds reference counting for MAD agents to protect against deregistration
while a callback is being invoked. As part of the structure changes to support
reference counting, deregistration code has been simplified, and a bug has been fixed
where multiple port structures were being
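The reference-counting idea described above can be sketched in userspace C as follows. This is only an illustration under assumptions: the struct and function names (mad_agent_sketch, agent_get, agent_put) are hypothetical and are not taken from the patch, and C11 atomics stand in for whatever kernel primitives the real code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical agent record; names are illustrative, not from the patch. */
struct mad_agent_sketch {
    atomic_int refcount; /* starts at 1: the reference held by registration */
};

static void agent_init(struct mad_agent_sketch *a)
{
    atomic_init(&a->refcount, 1);
}

/* Take a reference before invoking a callback.
 * Fails once the count has already dropped to zero. */
static bool agent_get(struct mad_agent_sketch *a)
{
    int old = atomic_load(&a->refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&a->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference. Returns true when the caller released the last one
 * and may therefore free the agent. Deregistration calls this once to
 * drop the registration reference. */
static bool agent_put(struct mad_agent_sketch *a)
{
    int prev = atomic_fetch_sub(&a->refcount, 1);

    assert(prev > 0);
    return prev == 1;
}
```

With this scheme, a deregistration that races with a callback does not free the agent; the callback's final agent_put does. The real code may instead block deregistration until the count drains.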
On Thu, 14 Oct 2004 08:30:45 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Wed, 2004-10-13 at 19:52, Sean Hefty wrote:
Patch fixes casting to incorrect structures when calling list_entry().
Have you tried this ? It doesn't work (at least for me). (I had also
tried the previous similar
On Thu, 14 Oct 2004 13:33:08 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
/* Validate MAD registration request if supplied */
if (mad_reg_req) {
- if (!recv_handler ||
- mad_reg_req->mgmt_class_version >= MAX_MGMT_VERSION) {
+ if
On Mon, 27 Sep 2004 18:25:41 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
- Implement API for SA path record and MC group queries
I spent a couple of days trying to define a basic query API for inclusion in the
access layer, but eventually stopped. With the current MAD API, the benefits
On Tue, 28 Sep 2004 12:50:52 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
It looks OK for current functionality but I think it will have to
change to support cancelling sends. (Cancelling sends is required
for consumers that start a query with a long timeout and then want to
unload or
On Wed, 29 Sep 2004 11:17:05 -0700
Sean Hefty [EMAIL PROTECTED] wrote:
+ if (mad_send_wr->refcount == 0) {
+ list_del(&mad_send_wr->agent_send_list);
+ spin_unlock_irqrestore(&mad_agent_priv->send_list_lock, flags);
+
+ mad_send_wc.status
On Wed, 29 Sep 2004 15:20:20 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
Sean Currently, the consumer _only_ has to free their send
Sean context in their send MAD completion handler. No reference
Sean counting by the consumer is needed. And it doesn't matter
Sean if a send
Does anyone have a preference which order request/response MADs complete? Sends first
always? Receives first always? Whatever is convenient?
- Sean
--
___
openib-general mailing list
[EMAIL PROTECTED]
On Wed, 29 Sep 2004 16:18:47 -0700
Fab Tillier [EMAIL PROTECTED] wrote:
From: Sean Hefty [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 29, 2004 4:09 PM
Does anyone have a preference which order request/response MADs complete?
Sends first always? Receives first always? Whatever
On Thu, 30 Sep 2004 16:40:05 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Thu, 2004-09-30 at 16:32, Roland Dreier wrote:
Also, the
consumer must either create its own MR using that PD, or have access
to at least the L_Key for the MAD layer's MR.
That would be an acceptable solution
On Fri, 01 Oct 2004 12:33:44 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
mad_send_wr->tid = ((struct ib_mad_hdr*)
bus_to_virt(cur_send_wr->sg_list->addr))->tid;
Thanks - good catch.
2. Added the following to reassemble_recv (it was eliminated
On Mon, 04 Oct 2004 13:26:42 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
There are two lists of posted send MADs: (1) a list of posted sends for
the port, and (2) another list per MAD agent. When a send is first
posted, it is placed on both lists until the send completion occurs and
then is
On Mon, 04 Oct 2004 13:36:26 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Sat, 2004-10-02 at 15:26, Michael S. Tsirkin wrote:
I'd like to suggest that the mad layer could expose an allocator
function that the user will call to grab the memory for the mad.
This function would return the
On Mon, 04 Oct 2004 15:34:51 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
I am pretty sure there is a window here as follows:
First, deregistration cancels the MAD removing it from the agent send
list.
ib_mad_complete_send_wr is invoked some time later and never checks for
the send WR still
On Tue, 5 Oct 2004 17:34:32 +0200
Michael S. Tsirkin [EMAIL PROTECTED] wrote:
Hello!
Quoting r. Sean Hefty ([EMAIL PROTECTED]) Re: [openib-general] mthca and DDR not
hidden:
Having an allocator routine might force users to perform data copies when sending
data.
Well, no one is running
On Tue, 05 Oct 2004 09:31:43 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
Do you mean that you should never get a callback for a mad_send_wr if
(rather than unless) its reference count is at least one?
There will never be a completion callback associated with a mad_send_wr unless its
Hal The tid of a requests is needed so responses can be matched.
Hal One way around this would be to pass the TID as a separate
Hal parameter in the ib_post_send_mad call. Maybe there are other
Hal less brute force ways.
I don't see a way around adding a TID parameter to
Fix endian of high tid so responses are properly matched to requests
The TID is in the MAD and goes on the wire.
Please, do not use CPU endian!
mad_send_wr->tid = ((struct ib_mad_hdr*)
- bus_to_virt(cur_send_wr->sg_list->addr))->tid;
+
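The byte-order point above can be illustrated with a small userspace sketch. It assumes that the 64-bit TID is carried big-endian on the wire and that the upper 32 bits are the part used to match a response to its client; the helper names (tid_from_wire, hi_tid_from_wire) are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 64-bit TID from its on-the-wire (big-endian) byte layout.
 * Works identically on little- and big-endian CPUs. */
static uint64_t tid_from_wire(const uint8_t wire[8])
{
    uint64_t tid = 0;

    assert(wire != NULL);
    for (int i = 0; i < 8; i++)
        tid = (tid << 8) | wire[i];
    return tid;
}

/* Assumed convention: the upper 32 bits identify the sending client. */
static uint32_t hi_tid_from_wire(const uint8_t wire[8])
{
    return (uint32_t)(tid_from_wire(wire) >> 32);
}
```

Decoding from bytes rather than casting the buffer to a native 64-bit integer is what keeps the result independent of CPU endianness.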
On Tue, 5 Oct 2004 13:42:54 -0700
Sean Hefty [EMAIL PROTECTED] wrote:
On Tue, 05 Oct 2004 15:03:12 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
+ rbuf = list_entry(port_priv->recv_posted_mad_list[i],
+ struct ib_mad_recv_buf, list
On Tue, 05 Oct 2004 16:57:58 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
mad_send_wr->tid = ((struct ib_mad_hdr*)
- bus_to_virt(cur_send_wr->sg_list->addr))->tid;
+ bus_to_virt(cur_send_wr->sg_list->addr))->tid.id;
A response MAD should have
On Tue, 05 Oct 2004 13:59:37 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
Hal Good point. We will need more than access to the TID for
Hal RMPP. We need a replacement for bus_to_virt. Is there an
Hal approved way to get from DMA address to VA ?
No, you just need to save off the
On Tue, 05 Oct 2004 14:06:13 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
Sean Would we need multiple VAs if scatter-gather is used by the client?
Yep. Or we could just say that all the fields the access layer needs
to look at must be in the first s/g entry.
You're right, and thinking
On Wed, 2004-10-06 at 12:46, Fab Tillier wrote:
From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 06, 2004 9:42 AM
ib_mad.h: Remove network endian conversion of QP1 QKey
I don't get the point. Is IB_QP1_QKEY going to be treated in host order,
and then swapped by
Here are some modifications to support timing out send MADs in the access layer. I
haven't tested this code beyond building it, but wanted to make it available for
review. There are a few race conditions that need to be avoided when handling
timeouts, so if it looks like something was missed,
It seems you are using the system-wide keventd queue. This isn't
necessarily a problem per se, but it would probably be better to use a
MAD-layer private workqueue (I suggested a single-threaded workqueue
per MAD port earlier). This avoids two problems. First, keventd is
subject to arbitrary
On Wed, 2004-10-06 at 19:01, Sean Hefty wrote:
but wanted to make it available for review. There are a few race
conditions that need to be avoided when handling timeouts, so if
it looks like something was missed, let me know.
If it is not too much work, I would prefer this broken into 2
Only purpose of patch is to reformat code to keep it within 80 columns. The resulting
code highlights some areas where we may want to look at restructuring it.
- Sean
Index: access/ib_mad.c
===
--- access/ib_mad.c (revision
On Thu, 7 Oct 2004 11:49:43 -0700
Sean Hefty [EMAIL PROTECTED] wrote:
Only purpose of patch is to reformat code to keep it within 80 columns. The
resulting code highlights some areas where we may want to look at restructuring it.
- Sean
Same purpose - different file... SMI code, which
Here's a patch that just renames a few structure members related to MADs. The renamed
variables will be used when handling MAD timeouts.
- Sean
-- Index: access/ib_mad_priv.h
===
--- access/ib_mad_priv.h(revision 955)
+++
On Wed, 20 Oct 2004 22:28:45 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
A little while ago, we had a brief discussion about what MR consumers
should use for MADs they want to send. It seems the two possibilities
were for the MAD layer to expose its MR for consumer use, or for
consumers to
Code to use a work queue to time out MADs. There is one work queue per port.
(Completion handling code was not changed.)
I'm working on creating a few simple test cases to verify MAD functionality
(registration, timeouts, sends, receives, and RMPP in the future), but these are not
yet done.
On Mon, 25 Oct 2004 12:39:36 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
OK. It's pretty straightforward to change the MAD layer to use PLM
rather than snoop MAD (and remove snoop_mad (undo that patch)). Should I
post the changes ?
I think that this makes sense.
- Sean
On Sun, 24 Oct 2004 13:38:01 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
ib_mad: In ib_mad_complete_recv, decrement agent reference count when
receive is not fully reassembled, and also when solicited and no
matching request is found. This allows deregistration to complete rather
than
On Mon, 25 Oct 2004 10:08:51 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
Hal OK. It's pretty straightforward to change the MAD layer to
Hal use PLM rather than snoop MAD (and remove snoop_mad (undo
Hal that patch)). Should I post the changes ?
It's my idea so I certainly like
On Mon, 25 Oct 2004 10:34:09 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
Sean If the MAD is not consumed by the driver, the MAD
Sean layer may update the MAD and call process_local_mad a second
Sean time, correct?
Sure, I guess so -- nothing should break if the MAD layer does
On Wed, 27 Oct 2004 10:08:45 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Tue, 2004-10-26 at 12:40, Sean Hefty wrote:
Currently, a call to ib_free_recv_mad does not dereference the mad_agent that
the mad was given to. The call itself does not access the mad_agent,
but should
On Wed, 27 Oct 2004 09:47:25 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Tue, 2004-10-26 at 18:29, Sean Hefty wrote:
In agent_mad_send, a call is made to create an address handle.
Immediately after calling ib_post_send_mad, the address handle is destroyed.
I think that we want
Index: access/ib_mad_priv.h
===
--- access/ib_mad_priv.h(revision 1078)
+++ access/ib_mad_priv.h(working copy)
@@ -153,6 +153,7 @@
struct ib_mad_mgmt_class_table *version[MAX_MGMT_VERSION];
struct
On Wed, 27 Oct 2004 13:35:22 -0700
Roland Dreier [EMAIL PROTECTED] wrote:
OK, I'm going to go ahead and rename ib_mad.c -> mad.c, ib_agent.c ->
agent.c etc. (This also makes it possible to build a module named
ib_mad.o, which I think makes more sense than ib_al.o, from multiple
sources).
I
On Tue, 26 Oct 2004 13:14:00 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Tue, 2004-10-26 at 13:10, Roland Dreier wrote:
Sean As a suggestion, we can allocate 2 CQs per QP, one for
Sean receives, and one for sends. This would let us separate
Sean send from receive
There appears to be a minor race in ib_mad_port_start where the MAD
layer could begin accepting and processing receives before the QP allows
sends, or even before we know if the QP will finish initializing
properly. This makes it difficult to handle traffic that comes in
before the QP is
What's the purpose behind the index field in the receive wr_id?
- Sean
___
openib-general mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
On Fri, 29 Oct 2004 18:06:47 -0600
Shirley Ma [EMAIL PROTECTED] wrote:
Here is the patch.
Note that my patch removes the lock when calling ib_post_send. But,
holding the lock when calling ib_post_send() should be fine. Also, the
current completion code assumes that the work requests are
On Fri, 29 Oct 2004 17:35:40 -0600
Shirley Ma [EMAIL PROTECTED] wrote:
I am starting to look at the access layer code. Here is a code
optimization patch in ib_register_mad_agent().
ib_mad_client_id must be incremented while holding the spinlock (or
converted into an atomic). The rest of the
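The "converted into an atomic" alternative mentioned above might look like the following userspace sketch. It is an assumption-laden illustration: client_id_counter is a hypothetical stand-in for ib_mad_client_id, next_client_id is an invented helper, and C11 atomics replace the kernel's own atomic primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for ib_mad_client_id; zero-initialized. */
static atomic_uint client_id_counter;

/* Hand out a unique client ID without taking a spinlock.
 * atomic_fetch_add returns the previous value, so IDs start at 1. */
static uint32_t next_client_id(void)
{
    uint32_t id = (uint32_t)atomic_fetch_add(&client_id_counter, 1) + 1;

    /* 0 is treated here as "no ID assigned"; wraparound is not handled. */
    assert(id != 0);
    return id;
}
```

Two callers racing through next_client_id each receive a distinct value, which is the property the spinlock was protecting.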
On Mon, 01 Nov 2004 11:24:35 -0500
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Fri, 2004-10-29 at 16:14, Sean Hefty wrote:
On Fri, 29 Oct 2004 13:01:03 -0700 (PDT)
Krishna Kumar [EMAIL PROTECTED] wrote:
Hi,
I know this changes the verbs interface a bit, but ...
I don't see
On Mon, 01 Nov 2004 18:39:59 -0500
Hal Rosenstock [EMAIL PROTECTED] wrote:
I don't think this is an or. A check for *bad_send_wr should be
added (which might be changed based on the below question). I will post
a patch for this. IMO these should be BUG_ON but just errors as these
are localized
On Tue, 2 Nov 2004 09:59:14 -0800 (PST)
Krishna Kumar [EMAIL PROTECTED] wrote:
Hi Sean,
I think that is the best approach. And using this method, we can also
avoid holding the lock if solicited is set. I will send a patch in a
few minutes if this approach looks good.
Sounds good.
I
On Thu, 28 Oct 2004 23:30:00 -0700
Sean Hefty [EMAIL PROTECTED] wrote:
Here's what I have to handle MAD completion handling. This patch
tries to fix the issue of matching a completion (successful or error)
with the corresponding work request. Some notes:
Please use this patch instead. I
Couple of issues with the new code (same as old code, though) :
1. printk(KERN_ERR PFX "No client 0x%x for received MAD on port %d\n",
hi_tid, port_priv->port_num);
and printk(KERN_NOTICE PFX "No matching mad agent found for
I've just checked in an initial version of userspace MAD access
(including documentation in docs/user_mad.txt).
Unfortunately this is not quite ready for use underneath OpenSM, since
it is not possible to register an agent for the SM classes (since they
are currently grabbed by the kernel SMA
Another option is to revise the kernel MAD code so that it does not
need to register an agent for the SM classes (ie pass all MADs to
low-level driver first).
I thought that we had decided to go this route, and replace snoop_mad with
calls to process_mad. If we're in agreement on this, I can do
On Wed, 2004-11-03 at 12:00, Sean Hefty wrote:
Is anyone willing to work on porting opensm to this? If not,
I can start on this. Otherwise, I will continue working on
adding MAD error/overrun handling.
Shahar from Voltaire will be doing this. I am working now on modifying
the MAD layer
Roland Dreier wrote:
Do the names /dev/infiniband/mthca0/umad1 and so on make sense to
people? I thought that userspace verbs support would probably use a
file like /dev/infiniband/mthca0/verbs, etc.
I think that this approach is good.
- Sean
Krishna Kumar wrote:
qp_info->qp = ib_create_qp(port_priv->pd, qp_init_attr);
- if (IS_ERR(qp_info->qp)) {
- printk(KERN_ERR PFX "Couldn't create ib_mad QP%d\n",
- get_spl_qp_index(qp_type));
+ if (!IS_ERR(qp_info->qp)) {
+ struct
Is there any interest among people to reuse receive MADs? I.e. once
allocated and mapped, the receive MAD and work request would be
re-posted to the QP when freed.
I ask because if people are interested in such an optimization at some
point in the future, it will affect how I structure send
Hal Rosenstock wrote:
However I'm not sure I understand why the MAD layer wants to resize
these objects -- given that the number of QPs is known in advance and
that the MAD layer can choose how many work requests to post per QP,
I'm not sure what is gained by trying to resize things dynamically.
Looking at the latest changes to ib_mad_recv_done_handler, I have a
couple of questions:
* If the underlying driver provides a process_mad routine, a response
MAD is allocated every time a MAD is received on QP 0 or 1. Can we
either push this allocation down into the HCA driver, or find an
A couple of comments (so far) while tracing through the MAD agent code.
* There are a couple of places where ib_get_agent_mad() will be called
multiple times in the same execution path. For example agent_send calls
it, as does agent_mad_send. I didn't check to see if the calls would
return
Hal Rosenstock wrote:
On Tue, 2004-11-09 at 10:37, Roland Dreier wrote:
By the way, reposting the receives is not the right thing to do on
error -- the QP will be in the error state, so any new work requests
will just complete with a flush status. We need to reset the QP and
start over to recover
I have two nodes directly connected. When trying to bring up the openib
node, I receive a local length error on the CQ after trying to perform a
send.
I'm continuing to debug...
- Sean
Sean Hefty wrote:
I have two nodes directly connected. When trying to bring up the openib
node, I receive a local length error on the CQ after trying to perform a
send.
I'm continuing to debug...
static int agent_mad_send(struct ib_mad_agent *mad_agent,
struct
Hal Rosenstock wrote:
On Tue, 2004-11-09 at 14:56, Sean Hefty wrote:
Sean Hefty wrote:
I have two nodes directly connected. When trying to bring up the openib
node, I receive a local length error on the CQ after trying to perform a
send.
I'm continuing to debug...
static int agent_mad_send
The following patch adds support for handling QP0/1 send queue overrun,
along with a couple of related fixes:
* The patch includes that provided by Roland in order to configure the
fabric.
* The code no longer modifies the user's send_wr structures when sending
a MAD.
* Sent MADs work requests
Hal Rosenstock wrote:
This is a separate issue from the ports not becoming active (DR handling
issue). I broke this part yesterday (not a good day at all :-( at either
r1184 and/or r1181 when I added what I thought was correct based on
Sean's emails (not dispatching additional error cases in
Hal Rosenstock wrote:
- send_wr.wr_id = ++port_priv->wr_id;
+ send_wr.wr_id = (unsigned long)&agent_send_wr->send_list;
{snip}
+ send_wr = (struct list_head *)(unsigned long)mad_send_wc->wr_id;
+ agent_send_wr = container_of(send_wr, struct ib_agent_send_wr,
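The wr_id technique in the quoted hunk (store the address of an embedded list node in the work request ID, then recover the enclosing structure at completion time) can be sketched in userspace C. The types and the container_of_sketch macro below are hypothetical stand-ins for the kernel's struct list_head, struct ib_agent_send_wr, and container_of.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* Stand-in for struct ib_agent_send_wr; 'payload' is illustrative. */
struct agent_send_wr_sketch {
    int payload;
    struct list_head_sketch send_list;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Encode: the work request ID is the address of the embedded list node. */
static uint64_t wr_id_from_entry(struct agent_send_wr_sketch *wr)
{
    return (uint64_t)(uintptr_t)&wr->send_list;
}

/* Decode: turn the completion's wr_id back into the enclosing structure. */
static struct agent_send_wr_sketch *entry_from_wr_id(uint64_t wr_id)
{
    struct list_head_sketch *node =
        (struct list_head_sketch *)(uintptr_t)wr_id;

    assert(node != NULL);
    return container_of_sketch(node, struct agent_send_wr_sketch, send_list);
}
```

Compared with a running counter, this lets the completion handler find its work request without searching a list, at the cost of requiring the structure to stay allocated until the completion arrives.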
Hal Rosenstock wrote:
1. Why was BUG_ON removed from dequeue_mad ?
That can be put back. I removed queue_mad, and was going to remove
dequeue_mad, but decided to leave it.
2. A couple of questions related to send_wr->num_sge checking.
a. Should this be pushed down to mthca and detected there
Removes unneeded check and relocates other to while loop.
- Sean
Index: core/mad.c
===
--- core/mad.c (revision 1197)
+++ core/mad.c (working copy)
@@ -518,14 +518,10 @@
if (!bad_send_wr)
goto error1;
-
I'm trying to force errors on QP0/1 to see if my changes can recover
from them. I force the errors by sending with an invalid lkey. Based
on the implementation of mthca, what can be expected?
I'm not seeing the QP event handler get invoked. I do receive a
completion error, followed by
Roland Dreier wrote:
mthca currently doesn't handle these 'asynchronous' state transitions
(ie transition to error). It continues to think the QP is in the RTS
state. Proper handling needs to be implemented.
Ok - thanks for the info.
However should there be a QP event for a send with invalid
This should transition the QP state to SQE when encountering a
send error on the CQ. There may be a better way of doing this;
I didn't spend a lot of time studying the code.
- Sean
Index: mthca_dev.h
===
--- mthca_dev.h (revision
Hal Rosenstock wrote:
On Thu, 2004-11-11 at 20:41, Sean Hefty wrote:
This patch recovers from send queue errors on QP 0/1. (It should also work in the case
of fatal errors, but does not try to recover.) Code was tested by forcing send errors and
checking that the port could still go to active
Hal Rosenstock wrote:
On Wed, 2004-11-10 at 11:59, Roland Dreier wrote:
Sean What exactly does it mean then when process_mad returns
Sean success? Do any of the return bits from process_mad
Sean indicate that the MAD was for the HCA driver?
SUCCESS means that process_mad didn't encounter
Roland Dreier wrote:
I thought about this a little, and it seems that having the CQ poll
operation update the QP state is not the right solution. It seems it
would be better to add support for the Current QP state modifier for
the modify QP operation and expect the consumer to use that to
This patch removes ib_mad_return_posted_send_mads, which isn't needed when
shutting down. There cannot be any sends outstanding at this point, or
clients still exist.
- Sean
Index: core/mad.c
===
--- core/mad.c (revision 1222)
+++
This patch collapses several function calls into one when activating
the MAD QPs. This avoids repeated allocation/freeing of memory.
I have plans to examine the QP transitions to the reset
state to see if these are necessary and if a race condition exists
between shutting down a port and
On Fri, 12 Nov 2004 22:08:14 -0500
Hal Rosenstock [EMAIL PROTECTED] wrote:
This patch looks like it includes the previous patch and due to this 2
large hunks are rejected. Can you regenerate this ?
Updated patch.
- Sean
Index: core/mad.c
On Mon, 15 Nov 2004 14:50:06 -0500
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Mon, 2004-11-15 at 13:29, Sean Hefty wrote:
On Fri, 12 Nov 2004 22:08:14 -0500
Hal Rosenstock [EMAIL PROTECTED] wrote:
This patch looks like it includes the previous patch and due to this 2
large hunks
Hal Rosenstock wrote:
After Roland's query this AM, I am looking at this some more:
On Wed, 2004-11-10 at 13:43, Sean Hefty wrote:
The second case where I can see this happening is if the client canceled
the send, and I'm not sure that we'd want to give the client an
unmatched response
Hal Rosenstock wrote:
My personal take would be to avoid adding that complexity. E.g. a
client sends a MAD with TID 5, cancels 5, sends 5, cancels 5, sends 5.
A response is now received. What should the MAD layer do?
I don't see issues with silently dropping any MAD that we're not ready
to
Roland Dreier wrote:
Bernhard Hi, from linux/spinlock.h: spin_is_locked on UP always
Bernhard says FALSE
Good catch.
Bernhard please consider applying,
Can we try and think of a fix that doesn't involve adding #ifdefs to
the source file? Do we really need the BUG_ONs at all?
I'd vote
Roland Dreier wrote:
Hal Hi, Should it be the responsibility of user_mad or the client
Hal itself to set the hi_tid ? Right now, it's in
Hal user_mad::ib_umad_write.
I think it has to be in the kernel (ie in user_mad.c) because we can't
trust anything userspace gives us.
agreed
I'm starting work on the RMPP implementation in the MAD code. If anyone
has any ideas/preferences on the implementation, please let me know.
For the send side, there are a couple of ways to perform the segmentation:
1. Issue one send at a time. Additional sends are not transferred until
the
Fab Tillier wrote:
1. Issue one send at a time. Additional sends are not transferred until
the first send completes.
Isn't #1 the simplest to implement? Turnaround on the send queue should be
pretty quick, so send performance should be fine. I say do whatever is
simplest, and then optimize
Roland Dreier wrote:
Would it make sense to figure out what the expected consumers of this
RMPP support will be and what they will need before designing the RMPP
implementation?
Absolutely. Right now, I'm assuming opensm and SA query as the primary
users.
- Sean
Hal Rosenstock wrote:
Anyhow, we are within days of starting on this.
There are 2 main portions of this:
1. Port to gen2 API
2. Fix build
The other aspects can wait if necessary.
How long before we need the first part ? Is there any expectation on how
long code review would last ? Or would they
Roland Dreier wrote:
I think the CM ends up needing its own set of workqueues so that it
can queue MAD processing along with time wait events etc. Also we
don't want the CM to block general MAD processing while it waits for
things like QP modify.
I thought about this approach, but wasn't sure
Hal Rosenstock wrote:
Change mad thread model to be 1 thread/port rather than 1 thread/port/CPU
(Note that I have not applied this but am requesting comments).
Index: mad.c
===
--- mad.c (revision 1269)
+++ mad.c (working copy)
@@
This patch restructures handle_outgoing_smp to improve its readability
and fixes the following issues: removes unneeded memory allocation for
received SMP, properly sends a SMP if the underlying HCA driver does not
provide a process_mad routine, and deallocates the allocated received
SMP in all
[EMAIL PROTECTED] wrote:
Hi,
For the newer vendor classes (0x30-0x4f), should we add OUI to the
registration and put the demux into the MAD layer for these classes by OUI ?
If so, I will work up a patch for this.
I guess I need to re-examine the MAD dispatching, but I can't think of
a reason why
Hal Rosenstock wrote:
This patch restructures handle_outgoing_smp to improve its readability
I can't see for sure for your patch.
The main changes are that the code is outdented and moved from nested
if's to a switch statement.
and fixes the following issues: removes unneeded memory allocation
Index: core/mad.c
===
--- core/mad.c (revision 1291)
+++ core/mad.c (working copy)
@@ -366,108 +366,93 @@
struct ib_send_wr *send_wr)
{
int ret;
+ struct ib_mad_private *mad_priv;
+
Hal Rosenstock wrote:
Also, based on this, do you think it makes sense for an OpenIB OUI (if
we are to utilize these classes) ?
I think that it makes sense, but I'd wait until we actually have code
that utilizes it.
- Sean
Hal Rosenstock wrote:
On Mon, 2004-11-29 at 14:40, Sean Hefty wrote:
- if (mad_agent_priv->agent.send_handler) {
- /* Now, complete send */
- mad_send_wc.status = IB_WC_SUCCESS;
- mad_send_wc.vendor_err = 0
Patch adds documentation for exported functions that did not have it in
ib_verbs.h
and device.c. Fixes slight formatting issue in ib_mad.h documentation.
Patch will be committed shortly after sending this.
- Sean
Index: include/ib_verbs.h
Hal Rosenstock wrote:
Each received MAD can only have 1 client which owns it. That client is
either determined via solicited routing or version/class/method (and
soon OUI) routing.
This is correct. This was done to avoid having to copy received MADs.
So solicited MAD responses cannot currently be
Hal Rosenstock wrote:
This is something that was briefly discussed before. I think that I
would support snooping by extending the ib_mad_reg_req structure to
indicate a registration type, possibly along with some additional
filtering parameters. (We could also create a new snoop routine.)
Hal Rosenstock wrote:
So solicited MAD responses cannot currently be snooped nor can
unsolicited ones for which an agent is registered (Since SMA and PMA are
currently firmware based, the latter is not an issue for the current
implementation).
I've gotten a start on adding in the snooping support.
Hal Rosenstock wrote:
I'd like to place the snooping code in as few places as possible, but
still be able to snoop locally processed MADs. Ideally a MAD should be
snooped exactly once, which requires some extra care when handling QP
errors. Snooping in the completion handling allows the MAD
Roland Dreier wrote:
Eitan Results with - ERR 1B10: Provided Join State != FullMember
Eitan - required for create. You can not create a group if you
Eitan are not a full member.
Right. However, ScopeState is dumped as 0x1, which means bit 0 of
JoinState (the FullMember bit) is in
I'm getting the following bug in mthca when loading the drivers (core,
mad, and mthca). The system is attached to a fabric with opensm
running on top of the Mellanox gold software stack. I hit this when
running with the tip of openib. Any help would be, well, helpful.
- Sean
Dec 8 14:53:47