Sasha,

Sasha Khapyorsky wrote:
Hi Yevgeny,

On 13:58 Thu 20 Nov     , Yevgeny Kliteynik wrote:
Function osm_switch_get_port_by_lid() was using the switch's
LFT, which might not yet be updated to the most recent routing.

I guess this could happen only with the 'subnet_initialization_error'
flag up (a failed LinFwdTbl set will trigger this flag).
I think that this was also relevant before the LFT simplification.

Yes, logically it should be so, but...

One immediate outcome of this bug is opensm.fdbs file - when it
is dumped from the switch LFT (and not from lft_buf),

Why is this bug triggered only now?

I sometimes had errors in simulations, and after some analysis
I decided that they were timing problems with the tests.
Now that I did some stress testing of the ucast cache, I started
to see more of these errors.

it sometimes
doesn't match the lst file.

What does this "sometimes" mean? I think the case should be investigated
more deeply. With such a patch we are just trying to hide a possible issue.

As far as I understand, opensm.fdbs (and the other routing dumps) are
generated only after all LinFwdTbl responses have arrived; when some of
them fail, the 'subnet_initialization_error' flag is up and OpenSM will
resweep. If so, why is 'opensm.fdbs' broken? It is not immediately
clear to me.

I didn't see 'subnet_initialization_error' in such cases.
Anyway, here's what I can do: at the end of each ucast_mgr_process
I'll compare lft and lft_buf (something that the other patch is
doing, the one that frees lft_buf), and if there is a difference,
then we have a problem. If not - then I'll look for the cause
elsewhere.
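
The comparison described above could look roughly like the sketch below.
It is only an illustration: lft_first_mismatch() is a hypothetical helper,
not an existing OpenSM function, and it assumes that both tables are plain
byte arrays indexed by LID and that the caller knows the highest LID to
check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of OpenSM): compare the table that was
 * acknowledged by the switch (lft) with the table computed by the routing
 * engine (lft_buf) over LIDs 1..max_lid.
 *
 * Returns the first LID at which the two tables disagree, or 0 if they
 * match. LID 0 is not a valid unicast LID, so it doubles as "no mismatch".
 */
static uint16_t lft_first_mismatch(const uint8_t *lft, const uint8_t *lft_buf,
				   uint16_t max_lid)
{
	uint32_t lid;	/* wider than uint16_t so the loop ends at 0xFFFF */

	assert(lft != NULL && lft_buf != NULL);

	for (lid = 1; lid <= max_lid; lid++)
		if (lft[lid] != lft_buf[lid])
			return (uint16_t)lid;

	return 0;
}
```

A non-zero result at the end of a routing cycle would mean that the switch
table and the computed table really do diverge; a zero result would point
to a cause elsewhere.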

-- Yevgeny

Sasha

Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>
---
 opensm/include/opensm/osm_switch.h |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h
index caa0bc5..f06931c 100644
--- a/opensm/include/opensm/osm_switch.h
+++ b/opensm/include/opensm/osm_switch.h
@@ -411,7 +411,11 @@ osm_switch_get_port_by_lid(IN const osm_switch_t * const p_sw,
 {
        if (lid_ho == 0 || lid_ho > IB_LID_UCAST_END_HO)
                return OSM_NO_PATH;
-       return p_sw->lft[lid_ho];
+
+       if (p_sw->lft_buf)
+               return p_sw->lft_buf[lid_ho];
+       else
+               return p_sw->lft[lid_ho];
 }
 /*
 * PARAMETERS
--
1.5.1.4



