I'm trying to resolve a bug in the directed route handling code. I thought I would alert the general list in case someone has a solution. I'm still looking for a clean fix, but it's difficult.

The basic problem is that smi_handle_dr_smp_send() and smi_handle_dr_smp_recv() modify the directed route part of the SMP (incrementing or decrementing hop_ptr) even when the packet is in the LID routed part of the path.

Here is an example. Receive SubnGet(NodeInfo) with:

  LRH:DLID=0x0009, LRH:SLID=0x000A, hop_ptr=2, hop_cnt=1, DrSLID=0xFFFF, DrDLID=0x0009

  1. It is processed OK through ib_mad_recv_done_handler().
  2. port_priv->device->process_mad() generates an OK response.
  3. agent_send_response() calls ib_create_ah_from_wc(), which creates the correct AH (to 0x000A).
  4. ib_post_send_mad() calls handle_outgoing_dr_smp(), which calls smi_handle_dr_smp_send(), which INCORRECTLY decrements hop_ptr, since this is a reply to 0x0009 not 0xFFFF.

The difficulty is that at this point the AH is opaque, so you can't easily tell that the DLID isn't the permissive LID. You can see that DrDLID isn't 0xFFFF, but you can't just return 1 in smi_handle_dr_smp_send(): if OpenIB received this same reply (i.e., on the node with LID=0x000A), it would still think it's at the beginning of the destination LID routed part and would not decrement hop_ptr.

I think there is a similar problem when sending requests where the initial part of the path is LID routed.

--
Ralph Campbell <[EMAIL PROTECTED]>
_______________________________________________
openib-general mailing list
http://openib.org/mailman/listinfo/openib-general
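
For readers who want to step through the example, here is a minimal, self-contained sketch of the failure. It is a toy model, not the OpenIB code: struct toy_dr_smp, toy_send_response(), toy_send_response_with_dlid() and toy_demo() are hypothetical names, and the only rule modeled is the first return-path step (decrement hop_ptr when hop_cnt is nonzero and hop_ptr == hop_cnt + 1). The second function assumes the caller could pass in the link-level DLID, which is exactly the information the opaque AH hides; it is shown only to illustrate what is missing, not as a proposed fix.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_LID_PERMISSIVE 0xFFFF

/* Toy stand-in for the directed route fields of an SMP (hypothetical type). */
struct toy_dr_smp {
	uint8_t  hop_ptr;
	uint8_t  hop_cnt;
	uint16_t dr_slid;
	uint16_t dr_dlid;
};

/*
 * Models only the first return-path step for an outgoing response.
 * It looks at hop_ptr/hop_cnt alone, so it cannot tell whether the
 * packet is currently in a LID routed part of the path.
 */
static void toy_send_response(struct toy_dr_smp *smp)
{
	if (smp->hop_cnt && smp->hop_ptr == smp->hop_cnt + 1)
		smp->hop_ptr--;
}

/*
 * Hypothetical variant: the caller supplies the link-level DLID.
 * A non-permissive DLID means the reply is being LID routed, so the
 * directed route fields are left untouched. Assumption for illustration.
 */
static void toy_send_response_with_dlid(struct toy_dr_smp *smp,
					uint16_t lrh_dlid)
{
	if (lrh_dlid != TOY_LID_PERMISSIVE)
		return;
	toy_send_response(smp);
}

/* Runs both variants on the values from the example in this message. */
static void toy_demo(void)
{
	struct toy_dr_smp a = { 2, 1, 0xFFFF, 0x0009 };
	struct toy_dr_smp b = a;

	toy_send_response(&a);                   /* decrements: 2 -> 1 */
	toy_send_response_with_dlid(&b, 0x000A); /* reply goes to LID 0x000A */

	assert(a.hop_ptr == 1);
	assert(b.hop_ptr == 2);
	printf("without DLID: hop_ptr=%u, with DLID: hop_ptr=%u\n",
	       (unsigned)a.hop_ptr, (unsigned)b.hop_ptr);
}
```

Calling toy_demo() from a main() runs both variants on the example's values. Note that the variant does not address the second half of the problem described above: the node with LID=0x000A still has to recognize that it is the one that must decrement hop_ptr when the reply arrives.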
