On Wed, Mar 16, 2016 at 4:49 PM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
> I didn't go into the code to see who is actually calling this error message,
> but I suspect this may be a generic error for "out of memory" kind of thing
> and not specific to the queue pair. To confirm, please add -mca
> pml_base_verbose 100 and -mca mtl_base_verbose 100 to see what is being
> selected.

This didn't produce anything particularly useful, just many lines like these:

[node001:00909] mca: base: components_register: registering pml components
[node001:00909] mca: base: components_register: found loaded component v
[node001:00909] mca: base: components_register: component v register function successful
[node001:00909] mca: base: components_register: found loaded component bfo
[node001:00909] mca: base: components_register: component bfo register function successful
[node001:00909] mca: base: components_register: found loaded component cm
[node001:00909] mca: base: components_register: component cm register function successful
[node001:00909] mca: base: components_register: found loaded component ob1
[node001:00909] mca: base: components_register: component ob1 register function successful
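
For anyone following along, here is a rough sketch of how the two verbosity flags suggested above could be attached to a run. The process count, the benchmark path (`./IMB-MPI1`), the `Alltoallv` argument, and the log file name are placeholders I am assuming for illustration; this is not the exact command used for the output above.

```shell
# Placeholders (assumptions): adjust to the real job.
NP=2
BENCH=./IMB-MPI1

# Verbosity flags as suggested earlier in this thread.
VERBOSE_FLAGS="-mca pml_base_verbose 100 -mca mtl_base_verbose 100"

# Assemble the command line and show it before running anything.
CMD="mpirun -np $NP $VERBOSE_FLAGS $BENCH Alltoallv"
echo "$CMD"

# Run only if mpirun and the benchmark binary are actually present,
# keeping a copy of the verbose output for later inspection.
if command -v mpirun >/dev/null 2>&1 && [ -x "$BENCH" ]; then
    $CMD 2>&1 | tee verbose.log
fi
```

Saving the output to a file makes it easier to search for the component that ends up being selected rather than reading through the registration lines.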

> I'm trying to remember some details of IMB and alltoallv to see if it is
> indeed requiring more resources than the other micro benchmarks.

I'm using IMB for my tests, but this issue came up because a
researcher isn't able to run large alltoall codes, so I don't believe
it's specific to IMB.

> BTW, did you confirm the limits setup? Also do the nodes have all the same 
> amount of mem?

Yes, all nodes have the limits set to unlimited, and each node has
256GB of memory.
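
As a minimal sketch of how the limits can be checked on a node, assuming a POSIX-like shell whose `ulimit` builtin accepts `-l` (locked memory) and `-v` (virtual memory), something like the following prints what the current shell sees. The function name `report_limits` is just an illustrative helper, not part of any tool mentioned in this thread.

```shell
# Print the locked-memory and virtual-memory limits seen by this shell,
# prefixed with the node name. "unknown" is printed if the shell's
# ulimit builtin does not support a given option.
report_limits() {
    memlock=$(ulimit -l 2>/dev/null || echo unknown)
    vmem=$(ulimit -v 2>/dev/null || echo unknown)
    printf '%s: memlock=%s vmem=%s\n' "$(uname -n)" "$memlock" "$vmem"
}

report_limits
```

The values reported in an interactive login are not necessarily the ones inherited by processes started through the MPI launcher, so running the same check from inside a launched job is a useful cross-check.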
