On Tue, 2005-04-12 at 12:46, Roland Fehrenbacher wrote:
> Hal> SM election occurs per high priority low GUID. So if you
> Hal> don't care which SM is the master then you don't need to do
> Hal> anything. If you want a specific order (and it is not in GUID
> Hal> order) then you need to specify priority.
>
> Ok. I tried this, specifying priority 0 on one server, and priority 15
> on another one. I assume priority 15 will be the master.
> If I first start the priority 0 opensm, and then the priority 15 one,
> things look normal: Log excerpts
>
> priority 0 server
>
> Apr 12 18:41:06 [4000] -> OpenSM Rev:openib-1.0.0
> Apr 12 18:41:06 [4000] -> osm_opensm_init: Forcing single threaded dispatcher.
> Apr 12 18:41:06 [4000] -> osm_report_notice: Reporting Generic Notice type:3
> num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
> Apr 12 18:41:06 [4000] -> osm_report_notice: Reporting Generic Notice type:3
> num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
> Apr 12 18:41:06 [4000] -> osm_vendor_bind: Binding to port 0x2c902004013c2.
> Apr 12 18:41:06 [4000] -> osm_vendor_bind: Binding to port 0x2c902004013c2.
> Apr 12 18:41:06 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:41:06 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:41:06 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:41:06 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:41:06 [18007] -> __osm_trap_rcv_process_request: Received Generic
> Notice type:0x04 num:144 Producer:1 from LID:0x0001 TID:0x0000000000000011
> Apr 12 18:41:06 [18007] -> osm_report_notice: Reporting Generic Notice type:4
> num:144 from LID:0x0001 GID:0xfe80000000000000,0x0002c902004013c2
> Apr 12 18:41:06 [18007] -> __osm_trap_rcv_process_request: Received Generic
> Notice type:0x04 num:144 Producer:1 from LID:0x0002 TID:0x000000000000000d
> Apr 12 18:41:06 [18007] -> osm_report_notice: Reporting Generic Notice type:4
> num:144 from LID:0x0002 GID:0xfe80000000000000,0x0002c9020040133a
> Apr 12 18:42:25 [18007] -> __osm_trap_rcv_process_request: Received Generic
> Notice type:0x04 num:144 Producer:1 from LID:0x0002 TID:0x000000000000000e
> Apr 12 18:42:25 [18007] -> osm_report_notice: Reporting Generic Notice type:4
> num:144 from LID:0x0002 GID:0xfe80000000000000,0x0002c9020040133a
>
> priority 15 server
>
> Apr 12 18:42:25 [4000] -> OpenSM Rev:openib-1.0.0
> Apr 12 18:42:25 [4000] -> osm_opensm_init: Forcing single threaded dispatcher.
> Apr 12 18:42:25 [4000] -> osm_report_notice: Reporting Generic Notice type:3
> num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
> Apr 12 18:42:25 [4000] -> osm_report_notice: Reporting Generic Notice type:3
> num:66 from LID:0x0000 GID:0xfe80000000000000,0x0000000000000000
> Apr 12 18:42:25 [4000] -> osm_vendor_bind: Binding to port 0x2c9020040133a.
> Apr 12 18:42:25 [4000] -> osm_vendor_bind: Binding to port 0x2c9020040133a.
> Apr 12 18:42:25 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:42:25 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:42:25 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
> Apr 12 18:42:25 [18007] -> osm_mcmr_rcv_leave_mgrp: ERR 1B25:Received an
> Invalid Delete Request.
>
> When I kill the priority 15 server however, the priority 0 server runs
> amok with continuous log messages like:
>
> Apr 12 18:44:28 [2400A] -> umad_receiver: send completed with error(method=1
> attr=20) -- dropping.
> Apr 12 18:44:28 [2400A] -> umad_receiver: send completed with error(method=1
> attr=20) -- dropping.

Attribute 0x20 is SMInfo. This is just the SubnGet(SMInfo) from the
priority 0 server failing (no matching SubnGetResp received), which is
"normal" if you killed the priority 15 server. Do the messages ever
subside?

> I assume that the handover to the priority 0 opensm hasn't worked
> then.

This isn't really handover, but that is another matter. You should be
able to use the sminfo diag to see whether this SM has assumed the
MASTER role.

-- Hal
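
For reference, the election rule quoted at the top of this message (highest priority wins, lowest GUID breaks ties) can be sketched in a few lines of Python. This is only an illustration of the comparison, not OpenSM code: the names SmCandidate and elect_master are made up for the example, the GUIDs are the two port GUIDs from the log excerpts, and the sketch deliberately ignores SM states and whatever happens when a master is already running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmCandidate:
    """Hypothetical record for one SM taking part in the election."""
    guid: int      # port GUID the SM is bound to
    priority: int  # SM priority, 0..15; a higher value is preferred


def elect_master(candidates):
    """Pick the master: highest priority first, lowest GUID as tie-breaker."""
    # Negating the priority lets a single min() express both criteria.
    return min(candidates, key=lambda c: (-c.priority, c.guid))


# The two servers from the log excerpts above.
sms = [
    SmCandidate(guid=0x0002C902004013C2, priority=0),
    SmCandidate(guid=0x0002C9020040133A, priority=15),
]

master = elect_master(sms)
print(hex(master.guid), master.priority)  # prints: 0x2c9020040133a 15
```

With both SMs up, the priority 15 one is selected. If the priorities were equal, the comparison would fall through to the GUID, and 0x2c9020040133a would still be chosen because it is the lower of the two.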
