Hi Yevgeny,

On 14:41 Tue 17 Feb     , Yevgeny Kliteynik wrote:
> 
> This patch fixes bugzilla issue #1515:
> 
> Topology:
>                  |---------------|
>                  |      SW2      |
>                  |---------------|
>                    |x |y    |z |v
>               |----|  |     |  |----|
>               |       |     |       |
>               |  |----|     |----|  |
>               |  |               |  |
>              a| b|              c| d|
>       |---------------|     |---------------|
>       |       SW1     |     |     SW3       |
>       |---------------|     |---------------|
>           |                             |
>           |                             |
>        HCA with SM                      HCA
> 
> During the discovery:
> 
> SM sends NodeInfo request to SW1
> SM sends NodeInfo request to SW2 through link a->x
> SM discovers new node SW2:
>   - updates DR to SW2 to go through link a->x
>   - creates physp x

And requests SwitchInfo from SW2, and on response sends PortInfo to all
switch ports. PortInfo receiver will initialize all switch ports. Isn't
it?

Sasha

> SM sends NodeInfo request to SW2 through link b->y
> SM discovers a known node SW2
>   - DOES NOT create physp y
>   - updates DR to SW2 to go through link b->y
> 
> From now on, the DR to SW2 is going through port y, so OpenSM won't deal with
> port y any more, leaving it uninitialized (no physp object for this port).
> 
> The fix is to create physp for the newly discovered port of the known
> switch node, same way as it is done for HCAs.
> I also added one log message for the case that showed the problem - when
> one of the link sides is uninitialized (no valid ports check). Perhaps
> this log message should be an error message instead?
> 
> Signed-off-by: Yevgeny Kliteynik <[email protected]>
> ---
>  opensm/opensm/osm_node_info_rcv.c |   24 +++++++++++++++++++++++-
>  1 files changed, 23 insertions(+), 1 deletions(-)
> 
> diff --git a/opensm/opensm/osm_node_info_rcv.c 
> b/opensm/opensm/osm_node_info_rcv.c
> index c52c0d5..7da3103 100644
> --- a/opensm/opensm/osm_node_info_rcv.c
> +++ b/opensm/opensm/osm_node_info_rcv.c
> @@ -164,8 +164,12 @@ __osm_ni_rcv_set_links(IN osm_sm_t * sm,
>        */
>       if (!osm_node_link_has_valid_ports(p_node, port_num,
>                                          p_neighbor_node,
> -                                        p_ni_context->port_num))
> +                                        p_ni_context->port_num)) {
> +             OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> +                     "Link at node 0x%" PRIx64 ", port %u - no valid 
> ports\n",
> +                     cl_ntoh64(osm_node_get_node_guid(p_node)), port_num);
>               goto _exit;
> +     }
> 
>       if (osm_node_link_exists(p_node, port_num,
>                                p_neighbor_node, p_ni_context->port_num)) {
> @@ -537,8 +541,26 @@ __osm_ni_rcv_process_existing_switch(IN osm_sm_t * sm,
>                                    IN osm_node_t * const p_node,
>                                    IN const osm_madw_t * const p_madw)
>  {
> +
> +     ib_smp_t *p_smp;
> +     ib_node_info_t *p_ni;
> +     uint8_t port_num;
> +
>       OSM_LOG_ENTER(sm->p_log);
> 
> +     p_smp = osm_madw_get_smp_ptr(p_madw);
> +     p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp);
> +     port_num = ib_node_info_get_local_port_num(p_ni);
> +
> +     if (!osm_node_get_physp_ptr(p_node, port_num)) {
> +             OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> +                     "Creating physp for node GUID:0x%"
> +                     PRIx64 ", port %u\n",
> +                     cl_ntoh64(osm_node_get_node_guid(p_node)),
> +                     port_num);
> +             osm_node_init_physp(p_node, p_madw);
> +     }
> +
>       /*
>          If this switch has already been probed during this sweep,
>          then don't bother reprobing it.
> -- 
> 1.5.1.4
> 
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to