On Sun, 2008-01-13 at 10:05 +0200, Eli Cohen wrote:
> IPoIB does not initiate a join to a multicast group (except for the
> broadcast group).

IPv6 does indeed do this on an IPoIB interface for solicited-node multicast.

> This comes from routing protocols or user space
> sockets. Do you run processes that use many different multicast groups?
>
>
> On Fri, 2008-01-11 at 19:36 -0800, Ira Weiny wrote:
> > I don't really understand the inner workings of IPoIB so forgive me if
> > this is a really stupid question but:
> >
> > Is it a bug that there is a Multicast group created for every node in our
> > clusters?
> >
> > If not a bug why is this done? We just tried to boot on a 1151 node cluster
> > and opensm is complaining there are not enough multicast groups.
> >
> > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> >
> >
> > Here is the output from my small test cluster: (ibnodesinmcast uses
> > saquery a couple of times to print this nice report.)
> >
> >
> > 19:17:24 > whatsup
> > up:   9: wopr[0-7],wopri
> > down: 0:
> > [EMAIL PROTECTED]:/tftpboot/images
> > 19:25:03 > ibnodesinmcast -g
> > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff)
> >    In  9: wopr[0-7],wopri
> >    Out 0: 0
> > 0xC001 (0xff12401bffff0000 : 0x0000000000000001)
> >    In  9: wopr[0-7],wopri
> >    Out 0: 0
> > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed)
> >    In  1: wopr3
> >    Out 8: wopr[0-2,4-7],wopri
> > 0xC003 (0xff12601bffff0000 : 0x0000000000000001)
> >    In  9: wopr[0-7],wopri
> >    Out 0: 0
> > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729)
> >    In  1: wopr4
> >    Out 8: wopr[0-3,5-7],wopri
> > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65)
> >    In  1: wopri
> >    Out 8: wopr[0-7]
> > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d)
> >    In  1: wopr6
> >    Out 8: wopr[0-5,7],wopri
> > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325)
> >    In  1: wopr7
> >    Out 8: wopr[0-6],wopri
> > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35)
> >    In  1: wopr1
> >    Out 8: wopr[0,2-7],wopri
> > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1)
> >    In  1: wopr2
> >    Out 8: wopr[0-1,3-7],wopri
> > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1)
> >    In  1: wopr0
> >    Out 8: wopr[1-7],wopri
> > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9)
> >    In  1: wopr5
> >    Out 8: wopr[0-4,6-7],wopri
> >
> >
> > Each of these MGIDs with the prefix (0xff12601bffff0000) has just one node
> > in it and represents an IPv6 address. Could you turn off IPv6 with the
> > latest IPoIB?
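
For what it's worth, the single-member MGIDs in the report can be reproduced by hand. The sketch below assumes the RFC 4391 layout (0xff, flags/scope, the 0x601b IPv6 signature, the 16-bit P_Key, then the low 80 bits of the IPv6 multicast address) and the P_Key 0xffff that shows up in the report. The function names and the fe80:: address are made up for illustration; only the resulting MGID is taken from the output above.

```python
import ipaddress

# Assumption: RFC 4391 signature value for IPv6 over InfiniBand.
IPOIB_IPV6_SIGNATURE = 0x601B


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast address: ff02::1:ff00:0 plus the low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def ipoib_mgid(mcast: ipaddress.IPv6Address, pkey: int = 0xFFFF):
    """Return the MGID as (upper 64 bits, lower 64 bits) for an IPv6 group."""
    value = int(mcast)
    scope = (value >> 112) & 0xF            # scope nibble of the IPv6 address
    group_id = value & ((1 << 80) - 1)      # low 80 bits carried into the MGID
    mgid = (
        (0xFF << 120)                       # multicast GID prefix byte
        | (0x1 << 116)                      # flags nibble as seen in the report (ff12...)
        | (scope << 112)
        | (IPOIB_IPV6_SIGNATURE << 96)
        | (pkey << 80)
        | group_id
    )
    return mgid >> 64, mgid & ((1 << 64) - 1)


if __name__ == "__main__":
    # Hypothetical link-local address whose low 24 bits match group 0xC002.
    group = solicited_node("fe80::202:c903:22:65ed")
    hi, lo = ipoib_mgid(group)
    print(group, hex(hi), hex(lo))
```

If that layout is right, every node with distinct low 24 bits joins its own group, so a 1151 node cluster would by itself ask for more groups than the 1024 mlids opensm says are available.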
> >
> > In a bind,
> > Ira
> > _______________________________________________
> > general mailing list
> > [email protected]
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
