Hi Ira,
On 1/12/08, Ira Weiny <[EMAIL PROTECTED]> wrote:
> And to further answer my question...[*]
>
> This seems to fix the problem for us; however, I know it could be better.
> For example, it only takes care of partition 0xFFFF, and I think Jason's
> idea of having, say, 16 mcast groups and hashing these MGIDs into them
> would be nice. But is this on the right track? Am I missing some other
> place in the code?
This is a start.
Some initial comments on a quick scan of the approach used:
This assumes a homogeneous subnet (in terms of rates and MTUs). I
think that only groups which share the same rate and MTU can share the
same MLID.
Also, MLIDs will now need to be use-counted, and an MLID freed only
when all of the groups sharing it have been removed.
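A minimal sketch of what that use counting might look like (the struct and
helper names here are illustrative only, not OpenSM's actual data structures):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical use-counted MLID entry: the MLID is returned to the
 * free pool only when the last multicast group sharing it goes away. */
typedef struct mlid_entry {
	uint16_t mlid;       /* e.g. 0xC000..0xFFFE */
	unsigned use_count;  /* number of mcast groups mapped to this MLID */
} mlid_entry_t;

/* A group starts sharing this MLID. */
static void mlid_get(mlid_entry_t *e)
{
	e->use_count++;
}

/* A group stops sharing this MLID; returns 1 when the MLID may
 * actually be freed (i.e. no group references it anymore). */
static int mlid_put(mlid_entry_t *e)
{
	assert(e->use_count > 0);
	return --e->use_count == 0;
}
```

With this shape, the delete path calls mlid_put() and only releases the MLID
when it returns nonzero, instead of freeing unconditionally.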
I think this is policy; rather than always behaving this way, there
should be a policy parameter added to OpenSM to control it. IMO the
default should be to not do this.
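For reference, the "16 shared groups" idea Ira mentions above could be
sketched roughly as follows. The constants mirror the solicited-node pattern
in the patch; the bucket function itself is purely hypothetical and not taken
from any OpenSM code:

```c
#include <stdint.h>

/* Pattern from the patch: 0xff12601bffff0000 : 0x00000001ffXXXXXX */
#define SOL_NODE_PREFIX 0xff12601bffff0000ULL
#define SOL_NODE_MASK   0x00000001ff000000ULL
#define N_SHARED_MLIDS  16

/* Does this MGID (host byte order) look like a solicited-node group? */
static int is_solicited_node(uint64_t prefix, uint64_t interface_id)
{
	return prefix == SOL_NODE_PREFIX &&
	       (interface_id & SOL_NODE_MASK) == SOL_NODE_MASK;
}

/* Fold the 24-bit suffix of the interface ID onto one of
 * N_SHARED_MLIDS buckets, each backed by one shared MLID. */
static unsigned sol_node_bucket(uint64_t interface_id)
{
	uint32_t low24 = (uint32_t)(interface_id & 0xffffff);
	return (low24 ^ (low24 >> 8) ^ (low24 >> 16)) % N_SHARED_MLIDS;
}
```

All groups hashing to the same bucket would share one MLID, capping the MLID
cost of solicited-node groups at 16 regardless of cluster size (at the price
of some extra multicast traffic delivered to non-members).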
Maybe more later...
-- Hal
> Thanks,
> Ira
>
> [*] Again I apologize for the spam but we were in a bit of a panic as we only
> have the big system for the weekend and IB was not part of the test... ;-)
>
> From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny <[EMAIL PROTECTED]>
> Date: Fri, 11 Jan 2008 22:58:19 -0800
> Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address
> to use a single Mcast Group.
>
> Signed-off-by: root <[EMAIL PROTECTED]>
> ---
>  opensm/opensm/osm_sa_mcmember_record.c |   30 +++++++++++++++++++++++++++++-
>  opensm/opensm/osm_sa_path_record.c     |   31 ++++++++++++++++++++++++++++++-
> 2 files changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c
> index 8eb97ad..6bcc124 100644
> --- a/opensm/opensm/osm_sa_mcmember_record.c
> +++ b/opensm/opensm/osm_sa_mcmember_record.c
> @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
>  	/* compare entire MGID so different scope will not sneak in for
>  	   the same MGID */
>  	if (memcmp(&p_mgrp->mcmember_rec.mgid,
> -		   &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t)))
> +		   &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) {
> +
> +		/* Special case IPv6 Solicited Node Multicast addresses */
> +		/* 0xff12601bffff0000 : 0x00000001ffXXXXXX */
> +#define SPEC_PREFIX (0xff12601bffff0000ULL)
> +#define INT_ID_MASK (0x00000001ff000000ULL)
> +		uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix);
> +		uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id);
> +		uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix);
> +		uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id);
> +
> +		if (rcv_prefix == SPEC_PREFIX &&
> +		    (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) {
> +			if (g_prefix == rcv_prefix &&
> +			    (g_interface_id & INT_ID_MASK) ==
> +			    (rcv_interface_id & INT_ID_MASK)) {
> +				osm_log(sa->p_log, OSM_LOG_INFO,
> +					"Special Case Mcast Join for MGID 0x%016" PRIx64
> +					" : 0x%016" PRIx64 "\n",
> +					rcv_prefix, rcv_interface_id);
> +				goto match;
> +			}
> +		}
>  		return;
> +	}
>
> +match:
>  	if (p_ctxt->p_mgrp) {
>  		osm_log(sa->p_log, OSM_LOG_ERROR,
>  			"__search_mgrp_by_mgid: ERR 1B03: "
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index 749a936..469773a 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
>
>  	/* compare entire MGID so different scope will not sneak in for
>  	   the same MGID */
> -	if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t)))
> +	if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) {
> +
> +		/* Special case IPv6 Solicited Node Multicast addresses */
> +		/* 0xff12601bffff0000 : 0x00000001ffXXXXXX */
> +#define SPEC_PREFIX (0xff12601bffff0000ULL)
> +#define INT_ID_MASK (0x00000001ff000000ULL)
> +		uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix);
> +		uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id);
> +		uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix);
> +		uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id);
> +
> +		if (rcv_prefix == SPEC_PREFIX &&
> +		    (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) {
> +			if (g_prefix == rcv_prefix &&
> +			    (g_interface_id & INT_ID_MASK) ==
> +			    (rcv_interface_id & INT_ID_MASK)) {
> +				osm_log(sa->p_log, OSM_LOG_INFO,
> +					"Special Case Mcast Join for MGID 0x%016" PRIx64
> +					" : 0x%016" PRIx64 "\n",
> +					rcv_prefix, rcv_interface_id);
> +				goto match;
> +			}
> +		}
>  		return;
> +	}
> +
> +match:
>
>  #if 0
>  	for (i = 0;
>
> #if 0
> for (i = 0;
> --
> 1.5.1
>
>
>
> On Fri, 11 Jan 2008 22:04:56 -0800
> Ira Weiny <[EMAIL PROTECTED]> wrote:
>
> > Ok,
> >
> > I found my own answer. Sorry for the spam.
> >
> > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html
> >
> > Sorry,
> > Ira
> >
> >
> > On Fri, 11 Jan 2008 19:36:57 -0800
> > Ira Weiny <[EMAIL PROTECTED]> wrote:
> >
> > > I don't really understand the inner workings of IPoIB, so forgive me
> > > if this is a really stupid question, but:
> > >
> > > Is it a bug that a multicast group is created for every node in our
> > > clusters?
> > >
> > > If it is not a bug, why is this done? We just tried to boot a
> > > 1151-node cluster, and opensm is complaining that there are not
> > > enough multicast groups:
> > >
> > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All
> > > available:1024 mlids are taken
> > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR
> > > 1B19: __get_new_mlid failed
> > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All
> > > available:1024 mlids are taken
> > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR
> > > 1B19: __get_new_mlid failed
> > >
> > >
> > > Here is the output from my small test cluster (ibnodesinmcast uses
> > > saquery a couple of times to print this report):
> > >
> > >
> > > 19:17:24 > whatsup
> > > up: 9: wopr[0-7],wopri
> > > down: 0:
> > > [EMAIL PROTECTED]:/tftpboot/images
> > > 19:25:03 > ibnodesinmcast -g
> > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff)
> > > In 9: wopr[0-7],wopri
> > > Out 0: 0
> > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001)
> > > In 9: wopr[0-7],wopri
> > > Out 0: 0
> > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed)
> > > In 1: wopr3
> > > Out 8: wopr[0-2,4-7],wopri
> > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001)
> > > In 9: wopr[0-7],wopri
> > > Out 0: 0
> > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729)
> > > In 1: wopr4
> > > Out 8: wopr[0-3,5-7],wopri
> > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65)
> > > In 1: wopri
> > > Out 8: wopr[0-7]
> > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d)
> > > In 1: wopr6
> > > Out 8: wopr[0-5,7],wopri
> > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325)
> > > In 1: wopr7
> > > Out 8: wopr[0-6],wopri
> > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35)
> > > In 1: wopr1
> > > Out 8: wopr[0,2-7],wopri
> > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1)
> > > In 1: wopr2
> > > Out 8: wopr[0-1,3-7],wopri
> > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1)
> > > In 1: wopr0
> > > Out 8: wopr[1-7],wopri
> > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9)
> > > In 1: wopr5
> > > Out 8: wopr[0-4,6-7],wopri
> > >
> > >
> > > Each of these MGIDs with the prefix 0xff12601bffff0000 has just one
> > > node in it and represents an IPv6 address. Can you turn off IPv6
> > > with the latest IPoIB?
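For context on why each of those groups holds exactly one node: these are
IPv6 solicited-node multicast addresses, which RFC 4291 derives from the low
24 bits of the node's unicast address, so distinct nodes almost always get
distinct MGIDs. A rough sketch (the helper name is illustrative only):

```c
#include <stdint.h>

/* Build the solicited-node interface ID 0x00000001ffXXXXXX from the
 * low 24 bits of a node's IPv6 address (per RFC 4291). */
static uint64_t sol_node_interface_id(uint32_t addr_low24)
{
	return 0x00000001ff000000ULL | (addr_low24 & 0xffffffULL);
}
```

This is why a 1151-node cluster ends up wanting on the order of 1151 of
these groups, exhausting the 1024 available MLIDs.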
> > >
> > > In a bind,
> > > Ira
> > > _______________________________________________
> > > general mailing list
> > > [email protected]
> > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > >
> > > To unsubscribe, please visit
> > > http://openib.org/mailman/listinfo/openib-general
>