On Thu, Apr 03, 2008 at 07:05:28AM -0600, Ralph H Castain wrote:
> Hmmmm...since I have no control nor involvement in what gets sent, perhaps I
> can be a disinterested third party. ;-)
>
> Could you perhaps explain this comment:
>
> > BTW I looked at how we do modex now on the trunk. For OOB case more
> > than half the data we send for each proc is garbage.
>
>
> What "garbage" are you referring to? I am working to remove the stuff
> inserted by proc.c - mostly hostname, hopefully arch, etc. If you are
> running a "debug" version, there will also be type descriptors for each
> entry, but those are eliminated for optimized builds.
>
> So are you referring to other things?

I am talking about the openib part of the modex. The "garbage" I am
referring to is this:

This is the structure that is sent by modex for each openib BTL. We send
the entire structure by copying it into a message.

struct mca_btl_openib_port_info {
    uint32_t mtu;
#if OMPI_ENABLE_HETEROGENEOUS_SUPPORT
    uint8_t padding[4];
#endif
    uint64_t subnet_id;
    uint16_t lid; /* used only in xrc */
    uint16_t apm_lid; /* the lid is used for APM to different port */
    char *cpclist;
};

The sizeof() of the struct is 32 bytes, but how much useful info does it
actually contain?

mtu       - should really be uint8 since this is an encoded value (1,2,3,4)
padding   - is garbage.
subnet_id - is ok
lid       - should be sent only for the XRC case
apm_lid   - should be sent only if APM is enabled
cpclist   - is pure garbage and should not be in this struct at all.

So we send 32 bytes with only 9 bytes of useful info (for the non-XRC case
without APM enabled).

--
Gleb.
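
The size arithmetic above can be sketched in C. This is only an illustration, not Open MPI code: it assumes a typical 64-bit (LP64) platform, where the struct comes to 32 bytes because of alignment padding before the pointer, and it assumes heterogeneous support is compiled in. The helper pack_port_info(), the constant PORT_INFO_PACKED_SIZE, and the choice of big-endian byte order for subnet_id are all hypothetical. It packs only the two fields that are useful in the non-XRC, non-APM case: a one-byte MTU code and the 8-byte subnet ID.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout quoted in the message, with the heterogeneous padding compiled in.
 * On a typical LP64 platform sizeof() of this struct is 32. */
struct mca_btl_openib_port_info {
    uint32_t mtu;
    uint8_t  padding[4];
    uint64_t subnet_id;
    uint16_t lid;      /* used only in xrc */
    uint16_t apm_lid;  /* the lid is used for APM to different port */
    char    *cpclist;
};

/* Hypothetical compact wire size: 1-byte MTU code + 8-byte subnet ID. */
enum { PORT_INFO_PACKED_SIZE = 9 };

/* Hypothetical helper (not part of Open MPI): writes only the useful
 * fields into out[] and returns the number of bytes written. */
static size_t pack_port_info(const struct mca_btl_openib_port_info *info,
                             uint8_t out[PORT_INFO_PACKED_SIZE])
{
    size_t n = 0;

    /* The MTU is an encoded value (1,2,3,4), so one byte is enough. */
    out[n++] = (uint8_t)info->mtu;

    /* Subnet ID, most significant byte first (assumed byte order). */
    for (int shift = 56; shift >= 0; shift -= 8) {
        out[n++] = (uint8_t)(info->subnet_id >> shift);
    }

    return n;
}
```

Packing field by field like this also keeps the compiler's padding bytes and the local cpclist pointer out of the message, since neither means anything to the receiving process.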