Hi,
When you test the cluster via the OSCAR wizard, do all of the tests succeed?
Did you test the other MPIs too?

On Thu, Nov 20, 2008 at 1:41 PM, Michael Oevermann <
[EMAIL PROTECTED]> wrote:

> Hi all,
> I have "inherited" a small cluster with a head node and four compute nodes,
> which I have to administer. The nodes are connected via InfiniBand (OFED).
> When I do a
>
> cexec :1-4 ibstatus
>
> I get some information indicating that the InfiniBand is sort of available:
>
> ************************* oscar_cluster *************************
> --------- n01---------
> Infiniband device 'mthca0' port 1 status:
>        default gid:     fe80:0000:0000:0000:0002:c902:0025:930d
>        base lid:        0x1
>        sm lid:          0x1
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            10 Gb/sec (4X)
>
> --------- n02---------
> Infiniband device 'mthca0' port 1 status:
>        default gid:     fe80:0000:0000:0000:0002:c902:0025:931d
>        base lid:        0x3
>        sm lid:          0x1
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            10 Gb/sec (4X)
>
> --------- n03---------
>        default gid:     fe80:0000:0000:0000:0002:c902:0025:9321
>        base lid:        0x5
>        sm lid:          0x1
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            10 Gb/sec (4X)
>
> --------- n04---------
> Infiniband device 'mthca0' port 1 status:
>        default gid:     fe80:0000:0000:0000:0002:c902:0025:9201
>        base lid:        0x2
>        sm lid:          0x1
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            10 Gb/sec (4X)
>
>
>
>
> However, when I start running an MPI job I get the following message
> indicating that the InfiniBand is not working (I am definitely using the
> MPI libs compiled with InfiniBand support):
>
> [0,1,0]: uDAPL on host n01 was unable to find any NICs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> [0,1,2]: uDAPL on host n01 was unable to find any NICs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> [0,1,3]: uDAPL on host n02 was unable to find any NICs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> [0,1,1]: uDAPL on host n02 was unable to find any NICs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
>
> I am a complete novice in the InfiniBand area, so can anybody give me some
> advice on what's going wrong here and how to get the jobs running with
> InfiniBand?
>
>
> Thanks for any help
>
> Michael
>
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's
> challenge
> Build the coolest Linux based applications with Moblin SDK & win great
> prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Oscar-users mailing list
> Oscar-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oscar-users
>



-- 
A.Nazemian