Re: [openib-general] ib_free_recv_mad and references
On Tue, 2004-10-26 at 12:40, Sean Hefty wrote: Currently, a call to ib_free_recv_mad does not dereference the mad_agent that the mad was given to. The call itself does not access the mad_agent, but should a reference be held on the mad_agent while it has a received MAD? Looking at the implementation, it appears that a mad_agent could deregister with the access layer, then call ib_free_recv_mad, which accesses the ib_mad_cache. ib_mad_cache is in existence from the time of module insertion to removal. Deregistering the mad_agent has no effect on its presence so I don't think the order of ib_free_mad_recv and ib_unregister_mad_agent matters. -- Hal ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 2004-10-26 at 12:50, Sean Hefty wrote: I think we have other issues with the completion handling as well. Since we use a single CQ for both QPs, I think that we need to search the send_posted_mad_list to find the corresponding completion. We cannot assume that the completion matches with the request at the head of the list. This appears to be broken in the non-error case as well. I will happily create a patch to fix these issues. Is it worth fixing this for the current approach or should I just wait for this patch ? Thanks. -- Hal ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] ib_free_recv_mad and references
On Wed, 27 Oct 2004 10:08:45 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 12:40, Sean Hefty wrote: Currently, a call to ib_free_recv_mad does not dereference the mad_agent that the mad was given to. The call itself does not access the mad_agent, but should a reference be held on the mad_agent while it has a received MAD? Looking at the implementation, it appears that a mad_agent could deregister with the access layer, then call ib_free_recv_mad, which accesses the ib_mad_cache. ib_mad_cache is in existence from the time of module insertion to removal. Deregistering the mad_agent has no effect on its presence so I don't think the order of ib_free_mad_recv and ib_unregister_mad_agent matters. I'm just wondering more about whether we should permit an agent to unregister while it has received MADs outstanding. Or, if it makes more sense to block unregister until all MADs have been returned. - Sean ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] agent_mad_send
On Wed, 27 Oct 2004 09:47:25 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 18:29, Sean Hefty wrote: In agent_mad_send, a call is made to create an address handle. Immediately after calling ib_post_send_mad, the address handle is destroyed. I think that we want to wait until the send is completed before destroying the address handle, and require this of all callers of ib_post_send_mad. I can post a patch for this but this depends on whether the agent or MAD layer should destroy the AH. I think that the MAD agent should, since it allocated the AH. Also, I don't think that we want to have this code access the port_priv structure, such as the send_list_lock (which ends up being acquired twice). The agent is using a different port_priv structure and send_list_lock than the one the MAD layer uses. Where is it acquired twice ? My bad. I was working off of the variable names, and didn't check that they had different types. - Sean -- ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Handling SM class (SMInfo vs. other queries)
Hal If I understand correctly, this obviates the need for what is Hal now ib_agent. All that might remain is SMI handling for DR Hal SMPs. Is that right ? I think the receive path looks something like if (DR SMP) SMI checks (discard on failure) rc = process_mad() if (rc CONSUMED) if (rc REPONSE) if (DR SMP) outgoing SMI updates send response free MAD else agent dispatch so ib_agent still needs a QP0 agent and a QP1 agent per port for handling sends, but it won't receive any MADs. - Roland ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH] change MAD completion processing to use workqueue
Index: access/ib_mad_priv.h === --- access/ib_mad_priv.h(revision 1078) +++ access/ib_mad_priv.h(working copy) @@ -153,6 +153,7 @@ struct ib_mad_mgmt_class_table *version[MAX_MGMT_VERSION]; struct list_head agent_list; struct workqueue_struct *wq; + struct work_struct work; spinlock_t send_list_lock; struct list_head send_posted_mad_list; @@ -162,9 +163,6 @@ struct list_head recv_posted_mad_list[IB_MAD_QPS_CORE]; int recv_posted_mad_count[IB_MAD_QPS_CORE]; u32 recv_wr_index[IB_MAD_QPS_CORE]; - - struct task_struct *mad_thread; - int thread_wake; }; #endif /* __IB_MAD_PRIV_H__ */ Index: access/ib_mad.c === --- access/ib_mad.c (revision 1078) +++ access/ib_mad.c (working copy) @@ -1158,10 +1158,12 @@ /* * IB MAD completion callback */ -static void ib_mad_completion_handler(struct ib_mad_port_private *port_priv) +static void ib_mad_completion_handler(void *data) { + struct ib_mad_port_private *port_priv; struct ib_wc wc; + port_priv = (struct ib_mad_port_private*)data; ib_req_notify_cq(port_priv-cq, IB_CQ_NEXT_COMP); while (ib_poll_cq(port_priv-cq, 1, wc) == 1) { @@ -1333,57 +1335,10 @@ spin_unlock_irqrestore(mad_agent_priv-lock, flags); } -/* - * IB MAD thread - */ -static int ib_mad_thread(void *param) -{ - struct ib_mad_port_private *port_priv = param; - - __set_current_state(TASK_RUNNING); - - do { - port_priv-thread_wake = 0; - wmb(); - - ib_mad_completion_handler(port_priv); - - set_current_state(TASK_INTERRUPTIBLE); - if (!port_priv-thread_wake) - schedule(); - __set_current_state(TASK_RUNNING); - } while (!kthread_should_stop()); - - return 0; -} - -/* - * Initialize the IB MAD thread - */ -static int ib_mad_thread_init(struct ib_mad_port_private *port_priv) -{ - port_priv-thread_wake = 0; - - port_priv-mad_thread = kthread_create(ib_mad_thread, - port_priv, - ib_mad(%6s-%-2d), - port_priv-device-name, - port_priv-port_num); - if (IS_ERR(port_priv-mad_thread)) { - printk(KERN_ERR PFX Couldn't start ib_mad thread for %s port %d\n, - port_priv-device-name, port_priv-port_num); - return PTR_ERR(port_priv-mad_thread); - } - return 0; -} - static void ib_mad_thread_completion_handler(struct ib_cq *cq) { struct ib_mad_port_private *port_priv = cq-cq_context; - - port_priv-thread_wake = 1; - wmb(); - wake_up_process(port_priv-mad_thread); + queue_work(port_priv-wq, port_priv-work); } static int ib_mad_post_receive_mad(struct ib_mad_port_private *port_priv, @@ -1845,15 +1800,12 @@ ret = -ENOMEM; goto error8; } - - ret = ib_mad_thread_init(port_priv); - if (ret) - goto error9; + INIT_WORK(port_priv-work, ib_mad_completion_handler, port_priv); ret = ib_mad_port_start(port_priv); if (ret) { printk(KERN_ERR PFX Couldn't start port\n); - goto error10; + goto error9; } spin_lock_irqsave(ib_mad_port_list_lock, flags); @@ -1862,8 +1814,6 @@ return 0; -error10: - kthread_stop(port_priv-mad_thread); error9: destroy_workqueue(port_priv-wq); error8: @@ -1903,7 +1853,7 @@ spin_unlock_irqrestore(ib_mad_port_list_lock, flags); ib_mad_port_stop(port_priv); - kthread_stop(port_priv-mad_thread); + flush_workqueue(port_priv-wq); destroy_workqueue(port_priv-wq); ib_destroy_qp(port_priv-qp[1]); ib_destroy_qp(port_priv-qp[0]); -- ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] how we DON'T want to make openib
On Wed, 27 Oct 2004, Ronald G. Minnich wrote: I just noticed this tree from a VAPI make :-) sshdbashmakemakeshcat 2*[grep] well that didn't translate make make sh make make make (cat|grep) was the tree. ron ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] Re: [PATCH] change MAD completion processing to use workqueue
On Wed, 2004-10-27 at 14:59, Sean Hefty wrote: Thanks. Applied. -- Hal ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] 2 questions on physical code layout
OK, I'm going to go ahead and rename ib_mad.c - mad.c, ib_agent.c - agent.c etc. (This also makes it possible to build a module named ib_mad.o, which I think makes more sense than ib_al.o, from multiple sources). I can continue to merge by hand but it might make sense to make the same change on Hal's branch. - R. ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] 2 questions on physical code layout
On Wed, 27 Oct 2004 13:35:22 -0700 Roland Dreier [EMAIL PROTECTED] wrote: OK, I'm going to go ahead and rename ib_mad.c - mad.c, ib_agent.c - agent.c etc. (This also makes it possible to build a module named ib_mad.o, which I think makes more sense than ib_al.o, from multiple sources). I can continue to merge by hand but it might make sense to make the same change on Hal's branch. The name changes sound good to me. I didn't realize that you had taken a copy of the current mad code. Is there anything in the openib-candidate branch that isn't in your branch? Does it make sense to just update the code in the roland-merge branch? ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 26 Oct 2004 13:14:00 -0400 Hal Rosenstock [EMAIL PROTECTED] wrote: On Tue, 2004-10-26 at 13:10, Roland Dreier wrote: Sean As a suggestion, we can allocate 2 CQs per QP, one for Sean receives, and one for sends. This would let us separate Sean send from receive completions based on the callback. That's one solution, and another way to handle it is to have a way of distinguishing sends from receives based on wr_id (that's what the Topspin stack does). That's where I was heading with this. It implies a stolen bit in the WRID. Not sure which is better really. Me neither but Sean seems to feel strongly about the CQ separation. Just to make sure that we don't have duplicate efforts, I've been working on the patch to fix handling of send completions. My plan is to use one send_mad_posted_list per QP, to make it faster/easier to find the correct send completion, plus allow for easier error handling when one of the special QPs goes into the error state. The code currently maintains a single CQ per port. - Sean ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] ib_mad_port_start allows receive processing before sends can be posted
There appears to be a minor race in ib_mad_port_start where the MAD layer could begin accepting and processing receives before the QP allows sends, or even before we know if the QP will finish initializing properly. This makes it difficult to handle traffic that comes in before the QP is transitioned to the RTS state, to recover from errors if the RTS transition fails, or to recover from errors if we fail to initialize QP1 after QP0 is active. Longer term, we may want to consider separating the QP0 and QP1 initialization. Short term, I think that if we just move the code around in ib_mad_port_start, we should be able to ensure that both QPs are ready to send and receive before handling any receives. (I don't think that we care if the QPs go to the RTS state without any receives being posted on them. We'll lose all MADs received before the QP goes into the RTR state anyway, so this adds a small delay onto the time that we need to begin handling receives.) Unless there's a reason to keep the code as is, I'll generate a patch for this. - Sean -- ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] ib_mad_recv_wrid index field
What's the purpose behind the index field in the receive wr_id? - Sean ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH] Add new SA module
Here's the new SA module (with support only for PathRecord GETs and MCMemberRecord SETs) that I just checked in. All comments and criticism welcome... (It may be easier to review the code just by looking at include/ib_sa.h and core/sa_query.c in the repo rather than a diff that is just added lines...) - R. Index: include/ib_sa.h === --- include/ib_sa.h (revision 0) +++ include/ib_sa.h (revision 0) @@ -0,0 +1,178 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * http://www.fsf.org/copyleft/gpl.html, or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * http://openib.org/license.html. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id$ + */ + +#ifndef IB_SA_H +#define IB_SA_H + +#include linux/compiler.h + +#include ib_verbs.h + +enum { + IB_SA_CLASS_VERSION = 2 /* IB spec version 1.1 */ +}; + +enum ib_sa_selector { + IB_SA_GTE = 0, + IB_SA_LTE = 1, + IB_SA_EQ = 2, + /* +* The meaning of best depends on the attribute: for +* example, for MTU best will return the largest available +* MTU, while for packet life time, best will return the +* smallest available life time. +*/ + IB_SA_BEST = 3 +}; + +typedef u64 __bitwise ib_sa_comp_mask; + +#define IB_SA_COMP_MASK(n) ((__force ib_sa_comp_mask) cpu_to_be64(1ull n)) + +/* + * Structures for SA records are named struct ib_sa_xxx_rec. No + * attempt is made to pack structures to match the physical layout of + * SA records in SA MADs; all packing and unpacking is handled by the + * SA query code. + * + * For a record with structure ib_sa_xxx_rec, the naming convention + * for the component mask value for field yyy is IB_SA_XXX_REC_YYY (we + * never use different abbreviations or otherwise change the spelling + * of xxx/yyy between ib_sa_xxx_rec.yyy and IB_SA_XXX_REC_YYY). + * + * Reserved rows are indicated with comments to help maintainability. + */ + +/* reserved:0 */ +/* reserved:1 */ +#define IB_SA_PATH_REC_DGIDIB_SA_COMP_MASK( 2) +#define IB_SA_PATH_REC_SGIDIB_SA_COMP_MASK( 3) +#define IB_SA_PATH_REC_DLIDIB_SA_COMP_MASK( 4) +#define IB_SA_PATH_REC_SLIDIB_SA_COMP_MASK( 5) +#define IB_SA_PATH_REC_RAW_TRAFFIC IB_SA_COMP_MASK( 6) +/* reserved:7 */ +#define IB_SA_PATH_REC_FLOW_LABEL IB_SA_COMP_MASK( 8) +#define IB_SA_PATH_REC_HOP_LIMIT IB_SA_COMP_MASK( 9) +#define IB_SA_PATH_REC_TRAFFIC_CLASS IB_SA_COMP_MASK(10) +#define IB_SA_PATH_REC_REVERSIBLE IB_SA_COMP_MASK(11) +#define IB_SA_PATH_REC_NUMB_PATH IB_SA_COMP_MASK(12) +#define IB_SA_PATH_REC_PKEYIB_SA_COMP_MASK(13) +/* reserved: 14 */ +#define IB_SA_PATH_REC_SL IB_SA_COMP_MASK(15) +#define IB_SA_PATH_REC_MTU_SELECTORIB_SA_COMP_MASK(16) +#define IB_SA_PATH_REC_MTU IB_SA_COMP_MASK(17) +#define IB_SA_PATH_REC_RATE_SELECTOR IB_SA_COMP_MASK(18) +#define IB_SA_PATH_REC_RATEIB_SA_COMP_MASK(19) +#define IB_SA_PATH_REC_PACKET_LIFE_TIME_SELECTOR IB_SA_COMP_MASK(20) +#define IB_SA_PATH_REC_PACKET_LIFE_TIMEIB_SA_COMP_MASK(21) +#define IB_SA_PATH_REC_PREFERENCE IB_SA_COMP_MASK(22) + +struct ib_sa_path_rec { + /* reserved */ + /* reserved */ + union ib_gid dgid; + union ib_gid sgid; + u16 dlid; + u16 slid; + int raw_traffic; + /* reserved */ + u32 flow_label; + u8 hop_limit; + u8 traffic_class; + int reversible; + u8 numb_path; + u16 pkey; + /* reserved */ + u8
[openib-general] [PATCH] Convert IPoIB to use new SA module
This converts IPoIB to use the new SA API for PathRecord and MCMemberRecord transactions. Correcting the component mask used for multicast joins after the initial broadcast group still needs to be done... - R. Index: ulp/ipoib/ipoib_main.c === --- ulp/ipoib/ipoib_main.c (revision 1085) +++ ulp/ipoib/ipoib_main.c (working copy) @@ -232,22 +232,24 @@ return 0; } -static int path_rec_completion(tTS_IB_CLIENT_QUERY_TID tid, - int status, - struct ib_path_record *pathrec, - int remaining, void *path_ptr) +static void path_rec_completion(int status, + struct ib_sa_path_rec *pathrec, + void *path_ptr) { struct ipoib_path *path = path_ptr; struct ipoib_dev_priv *priv = netdev_priv(path-dev); struct sk_buff *skb; struct ib_ah *ah; - if (status) + ipoib_dbg(priv, status %d, LID 0x%04x for GID IPOIB_GID_FMT \n, + status, be16_to_cpu(pathrec-dlid), IPOIB_GID_ARG(pathrec-dgid)); + + if (status != IB_WC_SUCCESS) goto err; { struct ib_ah_attr av = { - .dlid = pathrec-dlid, + .dlid = be16_to_cpu(pathrec-dlid), .sl= pathrec-sl, .src_path_bits = 0, .static_rate = 0, @@ -273,7 +275,7 @@ to requeue packet\n); } - return 1; + return; err: while ((skb = __skb_dequeue(path-queue))) @@ -283,15 +285,16 @@ IPOIB_PATH(path-neighbour) = NULL; kfree(path); - - return 1; } static int path_rec_start(struct sk_buff *skb, struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_path *path = kmalloc(sizeof *path, GFP_ATOMIC); - tTS_IB_CLIENT_QUERY_TID tid; + struct ib_sa_path_rec rec = { + .numb_path = 1 + }; + struct ib_sa_query *query; if (!path) goto err; @@ -303,17 +306,23 @@ __skb_queue_tail(path-queue, skb); path-neighbour = NULL; + rec.sgid = priv-local_gid; + memcpy(rec.dgid.raw, skb-dst-neighbour-ha + 4, 16); + rec.pkey = cpu_to_be16(priv-pkey); + /* * XXX there's a race here if path record completion runs * before we get to finish up. Add a lock to path struct? */ - if (tsIbPathRecordRequest(priv-ca, priv-port, - priv-local_gid.raw, - skb-dst-neighbour-ha + 4, - priv-pkey, 0, HZ, 0, - path_rec_completion, - path, tid)) { - ipoib_warn(priv, tsIbPathRecordRequest failed\n); + if (ib_sa_path_rec_get(priv-ca, priv-port, rec, + IB_SA_PATH_REC_DGID | + IB_SA_PATH_REC_SGID | + IB_SA_PATH_REC_NUMB_PATH | + IB_SA_PATH_REC_PKEY, + 1000, GFP_ATOMIC, + path_rec_completion, + path, query) 0) { + ipoib_warn(priv, ib_sa_path_rec_get failed\n); goto err; } @@ -329,21 +338,23 @@ return 0; } -static int unicast_arp_completion(tTS_IB_CLIENT_QUERY_TID tid, - int status, - struct ib_path_record *pathrec, - int remaining, void *skb_ptr) +static void unicast_arp_completion(int status, + struct ib_sa_path_rec *pathrec, + void *skb_ptr) { struct sk_buff *skb = skb_ptr; struct ipoib_dev_priv *priv = netdev_priv(skb-dev); struct ib_ah *ah; + ipoib_dbg(priv, status %d, LID 0x%04x for GID IPOIB_GID_FMT \n, + status, be16_to_cpu(pathrec-dlid), IPOIB_GID_ARG(pathrec-dgid)); + if (status) goto err; { struct ib_ah_attr av = { - .dlid = pathrec-dlid, + .dlid = be16_to_cpu(pathrec-dlid), .sl= pathrec-sl, .src_path_bits = 0, .static_rate = 0, @@ -363,12 +374,10 @@ ipoib_warn(priv, dev_queue_xmit failed to requeue ARP packet\n); - return 1; + return; err: dev_kfree_skb(skb); - - return 1; } static void unicast_arp_finish(struct sk_buff *skb) @@ -394,7 +403,10 @@ {
Re: [openib-general] 2 questions on physical code layout
Sean I didn't realize that you had taken a copy of the current Sean mad code. Is there anything in the openib-candidate branch Sean that isn't in your branch? Does it make sense to just Sean update the code in the roland-merge branch? I've got everything up to r1080 in my branch (which I think is the latest). I would be fine with consolidating work onto the roland-merge branch and pushing core/mad changes through you and Hal, if that's what the consensus is. Or we could copy roland-merge to a new branch with a different name and work there. - R. ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general