As I said, the degree of impact depends on the messaging pattern. If rank A
typically sends/recvs with rank A+1, then you won't see much difference.
However, if rank A typically sends/recvs with rank N-A, where N=#ranks in job,
then you'll see a very large difference.
You might try simply changing the mapping pattern - e.g., add -bynode to your
cmd line. That would make the app run faster if it follows the latter pattern.
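To make the difference concrete, here is a small sketch (assuming a 16-rank job on two 8-core nodes, matching the figures quoted below; the helper names are mine, not Open MPI's) that counts how many send/recv pairs cross the inter-node link under each mapping:

```python
N = 16            # total ranks in the job (assumed)
PER_NODE = 8      # cores per node (assumed)
NODES = N // PER_NODE

def node_by_slot(rank):
    # default by-slot mapping: fill node 0's cores first, then node 1
    return rank // PER_NODE

def node_by_node(rank):
    # -bynode mapping: round-robin ranks across the nodes
    return rank % NODES

def off_node_pairs(pattern, node_of):
    # count communicating pairs that land on different nodes
    return sum(1 for a, b in pattern if node_of(a) != node_of(b))

# "rank A talks to rank A+1" pattern
neighbor = [(a, a + 1) for a in range(N - 1)]
# "rank A talks to rank N-A" pattern (rank N/2 pairs with itself)
mirror = [(a, N - a) for a in range(1, N // 2)]

print(off_node_pairs(neighbor, node_by_slot))  # 1: only the 7<->8 hop crosses
print(off_node_pairs(mirror, node_by_slot))    # 7: every mirror pair crosses
print(off_node_pairs(neighbor, node_by_node))  # 15: every neighbor hop crosses
print(off_node_pairs(mirror, node_by_node))    # 0: mirror pairs stay on-node
```

So with the default mapping the mirror pattern pushes every pair over the interconnect, while -bynode moves all of those pairs onto shared memory - and does the opposite for the neighbor pattern, which is why the right choice depends on the app.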
On Nov 2, 2013, at 12:40 AM, San B wrote:
> Yes MM... But here a single node has 16 cores, not 64 cores.
> The 1st two jobs were with OMPI-1.4.5.
> 16 cores of single node - 3692.403
> 16 cores on two nodes (8 cores per node) - 12338.809
>
> The last two jobs were with OMPI-1.6.5.
> 16 cores of single node - 3547.879
> 16 cores on two nodes (8 cores per node) - 5527.320
>
> As others said, the single-node job runs faster due to shared-memory
> communication, but I was expecting only a slight difference between 1 and 2
> nodes - here the two-node run takes ~60% more time.
>
>
>
> On Thu, Oct 31, 2013 at 8:19 PM, Ralph Castain wrote:
> Yes, though the degree of impact obviously depends on the messaging pattern
> of the app.
>
> On Oct 31, 2013, at 2:50 AM, MM wrote:
>
>> Of course, by this you mean with the same total number of processes, e.g.
>> 64 processes on 1 node using shared mem vs 64 processes spread over 2 nodes
>> (32 each)?
>>
>>
>> On 29 October 2013 14:37, Ralph Castain wrote:
>> As someone previously noted, apps will always run slower on multiple nodes
>> vs everything on a single node due to the shared memory vs IB differences.
>> Nothing you can do about that one.
>> ___
>>
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>