I haven't cleared the other issues before getting back to this but wanted to respond to some of the points below:
On Tue, 2004-11-09 at 23:55, Roland Dreier wrote:
>     Roland> OK, I think I understand the problem, but I'm not sure
>     Roland> what the correct solution is.  When a DR SMP arrives at a
>     Roland> CA from the SM, hop_cnt == hop_ptr == number of hops in
>     Roland> the directed route,
>
>     Hal> What was the number ?
>
> For one port it was 4 and for another it was 6.  It could really be
> anything (it's just how many hops away the SM is).

I think I understand how DR is supposed to work :-) I was just looking
for the actual values in the failed case to try to understand what the
code was doing, as I don't have a configuration to recreate this (at
least not yet).

From what you indicated, it looks like it would be the following case,
so no response would be sent:

	/* C14-13:2 */
	if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
		if (node_type != IB_NODE_SWITCH)
			return 0;

but I'm not sure whether those were the values on entry to the
smi_handle_dr_smp_recv routine that was excised from the code.

>     Hal> I integrated this patch and checked it back in. I don't think
>     Hal> this is the solution for all cases (and something else is
>     Hal> broken).
>
> Could be.  I had a hard time checking the code in smi.c (which is
> split between smi_handle_dr_smp_recv() and smi_handle_dr_smp_send() as
> well as smi_check_forward_dr_smp(), but which has outgoing and
> returning DR handling mixed together) against the IB spec (which
> splits outgoing and returning DR handling).

I had to squint hard the first time I went through this too (and
probably will again). I can explain how this works in more detail if
that is of interest.

>     Hal> The second call to smi_handle_dr_smp_recv was to validate the
>     Hal> DR in the response packet before sending it. The response
>     Hal> would be a returning DR packet (D bit 1). If hop_cnt ==
>     Hal> hop_ptr,
>
> I guess the problem with calling smi_handle_dr_smp_recv() twice on the
> same packet is that the function may alter the packet.

No, the second call to smi_handle_dr_smp_recv() was on the outgoing
response and not on the incoming request. The thought was that a packet
coming from process_mad is much like an incoming received packet, hence
the call to smi_handle_dr_smp_recv(). The routine validates the packet
but can also do some fixups, depending on which case the packet falls
into. I guess it's only dangerous to validate this and wrong to fix it
up.

The key question for me is the following: the split of responsibility
for DR header formation is a little unclear to me. In the case of the
SM, are the DR headers fully formed before the packet is handed to the
MAD layer, or is some DR fixup needed ?

-- Hal
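
For illustration only, here is a minimal sketch of just the C14-13:2
check quoted above, assuming a returning DR SMP (D bit 1). The enum and
the helper dr_c14_13_2_drops() with its signature are hypothetical names
made up for this example and are not taken from smi.c; the other C14-13
cases and any packet fixups are deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the node type constants used in smi.c. */
enum dr_node_type { DR_NODE_CA, DR_NODE_SWITCH };

/*
 * Returns 1 if a returning DR SMP with these hop values falls into the
 * C14-13:2 range and is dropped there because the receiving node is not
 * a switch. Returns 0 otherwise, which only means "not dropped by this
 * particular check"; it says nothing about the remaining cases.
 */
int dr_c14_13_2_drops(uint8_t hop_ptr, uint8_t hop_cnt,
		      enum dr_node_type node_type)
{
	/* Sanity check on the illustrative enum, nothing more. */
	assert(node_type == DR_NODE_CA || node_type == DR_NODE_SWITCH);

	/* C14-13:2 range test, as in the excerpt quoted in this thread */
	if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
		if (node_type != DR_NODE_SWITCH)
			return 1;	/* intermediate hop on a non-switch */
	}
	return 0;
}
```

With the values reported in this thread (hop_ptr == hop_cnt == 4, or 6)
on a CA, dr_c14_13_2_drops() returns 1, which matches "no response would
be sent", provided those really were the values on entry.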