Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread Terry Dontje

On 04/05/2011 07:37 AM, SLIM H.A. wrote:


Hi Terry

I now think the problem may have been caused by our Lustre file system 
being sick, so I'll wait until that is fixed.


It worked outside gridengine but I think I did not include --mca btl 
self,sm,ib or the corresponding environment variables with gridengine, 
although it usually finds the fastest interconnect.


> I've seen this when either OPAL_PREFIX or LD_LIBRARY_PATH is not 
> set up correctly.


LD_LIBRARY_PATH is set correctly but where is OPAL_PREFIX set?

OPAL_PREFIX should be set to the base directory where OMPI is 
installed.  In theory it does not need to be set if configure's prefix 
option points to the same place you installed OMPI.  I think it is only 
needed when you've moved the OMPI installation bits somewhere that 
doesn't correspond to the configure prefix option.
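
As a rough illustration of the point about OPAL_PREFIX, the following Python sketch checks whether the directory named by OPAL_PREFIX contains the two locations that appear in the error output in this thread (share/openmpi/help-odls-default.txt and lib/openmpi). The function name check_prefix and the mca_*.so glob pattern are assumptions made for this example; they are not part of Open MPI.

```python
import os
from pathlib import Path


def check_prefix(prefix):
    """Return a list of problems found under an Open MPI install prefix.

    Hypothetical helper: the relative paths are the ones shown in the
    error messages in this thread.
    """
    root = Path(prefix)
    problems = []
    help_file = root / "share" / "openmpi" / "help-odls-default.txt"
    component_dir = root / "lib" / "openmpi"

    if not help_file.is_file():
        problems.append(f"missing help file: {help_file}")
    if not component_dir.is_dir():
        problems.append(f"missing component directory: {component_dir}")
    elif not any(component_dir.glob("mca_*.so")):
        # Assumption: components are shared objects named mca_<name>.so
        problems.append(f"no mca_*.so components in {component_dir}")
    return problems


if __name__ == "__main__":
    prefix = os.environ.get("OPAL_PREFIX")
    if prefix is None:
        print("OPAL_PREFIX is not set; the configure-time prefix applies")
    else:
        for line in check_prefix(prefix) or ["layout under OPAL_PREFIX looks plausible"]:
            print(line)
```

Running it on a compute node, in the same environment the job sees, is more informative than running it on the login node.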


Of course the same is true of LD_LIBRARY_PATH: you really shouldn't 
need to set it in your scripts/shell if you've compiled the programs 
such that the rpath is correctly passed to the linker.
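
One way to see whether an rpath actually ended up in a binary is to look at its dynamic section. The sketch below assumes GNU binutils' `readelf -d` is available and that its output contains lines of the form `(RPATH) Library rpath: [...]` or `(RUNPATH) Library runpath: [...]`. The helper names parse_rpaths and rpaths_of are made up for this example.

```python
import re
import subprocess

# Matches the RPATH/RUNPATH entries printed by `readelf -d` (GNU binutils).
_RPATH_RE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def parse_rpaths(readelf_output):
    """Extract the rpath/runpath directories from `readelf -d` output text."""
    dirs = []
    for line in readelf_output.splitlines():
        match = _RPATH_RE.search(line)
        if match:
            # Entries are colon-separated, like LD_LIBRARY_PATH.
            dirs.extend(p for p in match.group(2).split(":") if p)
    return dirs


def rpaths_of(binary):
    """Run `readelf -d` on a binary and return its rpath directories."""
    result = subprocess.run(
        ["readelf", "-d", binary], check=True, capture_output=True, text=True
    )
    return parse_rpaths(result.stdout)
```

If the returned list is empty, or does not include the Open MPI lib directory, the program depends on LD_LIBRARY_PATH (or the system linker configuration) to find the libraries at run time.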


--td


Thanks

Henk

*From:*users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] 
*On Behalf Of *Terry Dontje

*Sent:* 05 April 2011 11:21
*To:* us...@open-mpi.org
*Subject:* Re: [OMPI users] orte-odls-default:execv-error

On 04/05/2011 05:11 AM, SLIM H.A. wrote:

After an upgrade of our system I receive the following error message
(openmpi 1.4.2 with gridengine):
  


quote


--
Sorry!  You were supposed to get help about:
 orte-odls-default:execv-error
But I couldn't open the help file:
 ...path/1.4.2/share/openmpi/help-odls-default.txt: Cannot send after
transport endpoint shutdown.  Sorry!

end quote

  
and this is the section in the text file
...path/1.4.2/share/openmpi/help-odls-default.txt that refers to
orte-odls-default:execv-error
  
  




quote

[orte-odls-default:execv-error]
Could not execute the executable "%s": %s
  
This could mean that your PATH or executable name is wrong, or that you
do not have the necessary permissions.  Please ensure that the executable
is able to be found and executed."

end quote

  
Does the execv-error mean that the file
...path/1.4.2/share/openmpi/help-odls-default.txt was not accessible or
is there a different reason?
  

No, it thinks it cannot find some executable that was requested to 
run.  Do you have the exact mpirun command line that was being run?  
Can you first try to run without gridengine?
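
A quick way to test the "cannot find some executable" explanation is to resolve the program name the same way a PATH lookup would. This is only a sketch using Python's standard `shutil.which`; the wrapper name find_executable is hypothetical, and it should be run on the compute node with the environment the job actually gets.

```python
import shutil


def find_executable(name, path=None):
    """Return the resolved location of `name`, or None.

    None means the file was not found on the search path or lacks
    execute permission, the two causes named in the help text.
    `path` defaults to the PATH of the current environment.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    import sys

    for program in sys.argv[1:]:
        print(program, "->", find_executable(program) or "NOT FOUND / NOT EXECUTABLE")
```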


The error message continues with
  


quote


--
[cn004:00591] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_grpcomm_basic: file not found (ignored)
[cn004:00586] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)
[cn004:00585] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)

--
Sorry!  You were supposed to get help about:
 find-available:none-found
But I couldn't open the help file:
 ...path/1.4.2/share/openmpi/help-mca-base.txt: Cannot send after
transport endpoint shutdown.  Sorry!

--
[cn004:00586] PML ob1 cannot be selected

end quote

  
but there are .so and .la libraries in the directory
...path/1.4.2/lib/openmpi
Are those the ones not found?

I've seen this when either OPAL_PREFIX or LD_LIBRARY_PATH is not 
set up correctly.


  
Thanks
  
Henk
  
___

users mailing list
us...@open-mpi.org  <mailto:us...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>







Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread SLIM H.A.
Hi Reuti

1.4.2 is still in the same location and I also built 1.4.3 anew. It
appears that Lustre and IB were not playing along, and it is working
now.

Thanks

henk






Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread SLIM H.A.
 

Hi Terry

 

I now think the problem may have been caused by our Lustre file system
being sick, so I'll wait until that is fixed.

It worked outside gridengine but I think I did not include --mca btl
self,sm,ib or the corresponding environment variables with gridengine,
although it usually finds the fastest interconnect.

 

> I've seen this when either OPAL_PREFIX or LD_LIBRARY_PATH is not set
> up correctly.

 

LD_LIBRARY_PATH is set correctly but where is OPAL_PREFIX set?

 

Thanks

 

Henk

 

 

 


 

 



Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread Reuti
Am 05.04.2011 um 11:11 schrieb SLIM H.A.:

> After an upgrade of our system I receive the following error message
> (openmpi 1.4.2 with gridengine):

Did you move openmpi 1.4.2 to a new (i.e. different) location?

-- Reuti


>> quote
> 
> --
> Sorry!  You were supposed to get help about:
>  orte-odls-default:execv-error
> But I couldn't open the help file:
>  ...path/1.4.2/share/openmpi/help-odls-default.txt: Cannot send after
> transport endpoint shutdown.  Sorry!
>> end quote
> 
> and this is the section in the text file
> ...path/1.4.2/share/openmpi/help-odls-default.txt that refers to
> orte-odls-default:execv-error
> 
> 
>> quote
> [orte-odls-default:execv-error]
> Could not execute the executable "%s": %s
> 
> This could mean that your PATH or executable name is wrong, or that you
> do not have the necessary permissions.  Please ensure that the executable
> is able to be found and executed."
>> end quote
> 
> Does the execv-error mean that the file
> ...path/1.4.2/share/openmpi/help-odls-default.txt was not accessible or
> is there a different reason?
> 
> The error message continues with
> 
>> quote
> 
> --
> [cn004:00591] mca: base: component_find: unable to open
> ...path/1.4.2/lib/openmpi/mca_grpcomm_basic: file not found (ignored)
> [cn004:00586] mca: base: component_find: unable to open 
> ...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)
> [cn004:00585] mca: base: component_find: unable to open 
> ...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)
> 
> --
> Sorry!  You were supposed to get help about:
>  find-available:none-found
> But I couldn't open the help file:
>  ...path/1.4.2/share/openmpi/help-mca-base.txt: Cannot send after
> transport endpoint shutdown.  Sorry!
> 
> --
> [cn004:00586] PML ob1 cannot be selected
>> end quote
> 
> but there are .so and .la libraries in the directory
> ...path/1.4.2/lib/openmpi
> Are those the ones not found?
> 
> Thanks
> 
> Henk
> 



Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread Terry Dontje

On 04/05/2011 05:11 AM, SLIM H.A. wrote:

After an upgrade of our system I receive the following error message
(openmpi 1.4.2 with gridengine):


quote


--
Sorry!  You were supposed to get help about:
 orte-odls-default:execv-error
But I couldn't open the help file:
 ...path/1.4.2/share/openmpi/help-odls-default.txt: Cannot send after
transport endpoint shutdown.  Sorry!

end quote

and this is the section in the text file
...path/1.4.2/share/openmpi/help-odls-default.txt that refers to
orte-odls-default:execv-error





quote

[orte-odls-default:execv-error]
Could not execute the executable "%s": %s

This could mean that your PATH or executable name is wrong, or that you
do not have the necessary permissions.  Please ensure that the executable
is able to be found and executed."

end quote

Does the execv-error mean that the file
...path/1.4.2/share/openmpi/help-odls-default.txt was not accessible or
is there a different reason?

No, it thinks it cannot find some executable that was requested to run.  
Do you have the exact mpirun command line that was being run?  
Can you first try to run without gridengine?

The error message continues with


quote


--
[cn004:00591] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_grpcomm_basic: file not found (ignored)
[cn004:00586] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)
[cn004:00585] mca: base: component_find: unable to open
...path/1.4.2/lib/openmpi/mca_notifier_syslog: file not found (ignored)

--
Sorry!  You were supposed to get help about:
 find-available:none-found
But I couldn't open the help file:
 ...path/1.4.2/share/openmpi/help-mca-base.txt: Cannot send after
transport endpoint shutdown.  Sorry!

--
[cn004:00586] PML ob1 cannot be selected

end quote

but there are .so and .la libraries in the directory
...path/1.4.2/lib/openmpi
Are those the ones not found?

I've seen this when either OPAL_PREFIX or LD_LIBRARY_PATH is not set 
up correctly.

Thanks

Henk
