Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread Jeff Squyres
On Jul 30, 2012, at 12:48 PM, Paweł Jaromin wrote:

> make all
> Building file: ../src/snd_0.1.c
> Invoking: GCC C Compiler
> mpicc -I/usr/include/mpi -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP
> -MF"src/snd_0.1.d" -MT"src/snd_0.1.d" -o "src/snd_0.1.o"
> "../src/snd_0.1.c"
> ../src/snd_0.1.c:24: warning: return type defaults to 'int'
> ../src/snd_0.1.c: In function 'main':
> ../src/snd_0.1.c:45: warning: unused variable 'outfile'
> ../src/snd_0.1.c:42: warning: unused variable 'FILE_OUT'
> ../src/snd_0.1.c:41: warning: unused variable 'FILE_NAME'
> ../src/snd_0.1.c:40: warning: unused variable 'AF_setup'
> ../src/snd_0.1.c:38: warning: unused variable 'snd_buffor'
> ../src/snd_0.1.c:37: warning: unused variable 'i'
> ../src/snd_0.1.c: In function 'print_usage':
> ../src/snd_0.1.c:29: warning: control reaches end of non-void function
> Finished building: ../src/snd_0.1.c

You might want to fix these warnings.  The first one and the last one seem like 
they could cause nondeterminism.
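
As a rough sketch of what fixing those two warnings could look like: both point at print_usage (lines 24 and 29), which appears to be declared without a return type and never returns a value. The bodies below are assumptions, since snd_0.1.c itself is not shown in this thread; only the name print_usage comes from the compiler output, and check_args is a made-up helper.

```c
#include <stdio.h>

/* Hypothetical reconstruction: the original print_usage body is not shown
 * in the thread. Declaring the return type as void removes both the
 * "return type defaults to 'int'" and the "control reaches end of
 * non-void function" warnings. */
void print_usage(FILE *out, const char *prog)
{
    fprintf(out, "Usage: %s <input.wav>\n", prog);
}

/* If the caller needs a status instead, say so in the signature and
 * return a value on every path. check_args is an illustrative name. */
int check_args(int argc, char **argv)
{
    if (argc < 2) {
        print_usage(stderr, argc > 0 ? argv[0] : "snd_0.1");
        return 1;   /* failure: no input file given */
    }
    return 0;       /* success */
}
```

The unused-variable warnings are harmless by comparison, but removing the variables keeps real problems from getting lost in the noise.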

Also, you shouldn't be adding -I/usr/include/mpi.  mpicc will add the right -I 
option for you (e.g., do you know for sure that your MPI header files are in 
/usr/include/mpi?).  It's useless at best, and harmful at worst (E.g., if some 
other MPI implementation is installed into /usr/include/mpi).

> Non-MPI program on which it was based:
> 
>  Build of configuration Debug for project snd_test 
> 
> make all
> Building file: ../main.c
> Invoking: GCC C Compiler
> gcc -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"main.d"
> -MT"main.d" -o "main.o" "../main.c"
> Finished building: ../main.c
> 
> Building target: snd_test
> Invoking: GCC C Linker
> gcc  -o "snd_test"  ./main.o   -lsndfile
> Finished building target: snd_test

I notice that you're not including -Wall and a bunch of other compiler flags in 
the non-MPI install.

I also notice that you're not compiling the same .c files at all.

So if I'm understanding this thread right -- and I may well not be -- it seems 
like you're saying:

- when I use gcc to compile main.c, the program runs fine
- when I use mpicc to compile ../src/snd_0.1.c, the program fails

If that's the case, you're comparing apples to oranges here.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread Paweł Jaromin
normal MPI compiling:

 Build of configuration Debug for project snd_0.1 

make all
Building file: ../src/snd_0.1.c
Invoking: GCC C Compiler
mpicc -I/usr/include/mpi -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP
-MF"src/snd_0.1.d" -MT"src/snd_0.1.d" -o "src/snd_0.1.o"
"../src/snd_0.1.c"
../src/snd_0.1.c:24: warning: return type defaults to 'int'
../src/snd_0.1.c: In function 'main':
../src/snd_0.1.c:45: warning: unused variable 'outfile'
../src/snd_0.1.c:42: warning: unused variable 'FILE_OUT'
../src/snd_0.1.c:41: warning: unused variable 'FILE_NAME'
../src/snd_0.1.c:40: warning: unused variable 'AF_setup'
../src/snd_0.1.c:38: warning: unused variable 'snd_buffor'
../src/snd_0.1.c:37: warning: unused variable 'i'
../src/snd_0.1.c: In function 'print_usage':
../src/snd_0.1.c:29: warning: control reaches end of non-void function
Finished building: ../src/snd_0.1.c

Building target: snd_0.1
Invoking: GCC C Linker
mpicc  -o "snd_0.1"  ./src/snd_0.1.o   -lsndfile -laudiofile
Finished building target: snd_0.1


 Build Finished 


MPI with option --showme:

 Build of configuration Debug for project snd_0.1 

make all
Building file: ../src/snd_0.1.c
Invoking: GCC C Compiler
mpicc --showme -I/usr/include/mpi -O0 -g3 -Wall -c -fmessage-length=0
-MMD -MP -MF"src/snd_0.1.d" -MT"src/snd_0.1.d" -o "src/snd_0.1.o"
"../src/snd_0.1.c"
gcc -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
-pthread -I/usr/include/mpi -O0 -g3 -Wall -c -fmessage-length=0 -MMD
-MP -MFsrc/snd_0.1.d -MTsrc/snd_0.1.d -o src/snd_0.1.o
../src/snd_0.1.c
gcc: ./src/snd_0.1.o: No file or directory
make: *** [libsnd_0.1] Error 1
Finished building: ../src/snd_0.1.c

Building target: libsnd_0.1
Invoking: GCC C Linker
mpicc -shared -o "libsnd_0.1"  ./src/snd_0.1.o   -lsndfile

 Build Finished 



Non-MPI program on which it was based:

 Build of configuration Debug for project snd_test 

make all
Building file: ../main.c
Invoking: GCC C Compiler
gcc -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"main.d"
-MT"main.d" -o "main.o" "../main.c"
Finished building: ../main.c

Building target: snd_test
Invoking: GCC C Linker
gcc  -o "snd_test"  ./main.o   -lsndfile
Finished building target: snd_test


 Build Finished 


2012/7/30 TERRY DONTJE :
> Please show me how you are compiling the program under gcc and mpicc.  Plus
> do a "mpicc --showme".
>
> --td
>
>
> On 7/30/2012 8:33 AM, Paweł Jaromin wrote:
>
> This situation is also strange to me; I spent 2 days trying to find the bug :(.
>
> Unfortunately I am not a professional C/C++ programmer, but I have
> to write this program. Please have a look at the picture at the link below;
> maybe it will make things clearer.
>
> http://vipjg.nazwa.pl/sndfile_error.png
>
> 2012/7/30 TERRY DONTJE :
>
> On 7/30/2012 6:11 AM, Paweł Jaromin wrote:
>
> Hello
>
> Thanks for fast answer, but the problem looks a little different.
>
> Of course, I use this code only for master node (rank 0), because only
> this node has an access to file.
>
> As you can see, I use an "if" clause to check sndFile for NULL:
>
> if (sndFile == NULL)
>
> and the value is not NULL, so the code continues.
> I found the problem while checking the array:
>
>
>  long numFrames = sf_readf_float(sndFile, snd_buffor, sfinfo.frames);
>
>  // Check correct number of samples loaded
>  if (numFrames != sfinfo.frames) {
>      fprintf(stderr, "Did not read enough frames for source\n");
>      sf_close(sndFile);
>      free(snd_buffor);
>      MPI_Finalize();
>      return 1;
>  }
>
> So, after that I went to the debugger to check the variables (I use Eclipse PTP
> and the sdm environment). After initialization, the variable "sndFile" has
> "no value", not "NULL". Unfortunately, sndFile keeps the same
> value until the end of the program :(.
>
> What do you mean by sndFile has "no value"?  There isn't a special "no
> value" value to a variable unless you are debugging a code that somehow had
> some variable optimized out at the particular line you are interested in.
>
> Declarations:
>   FILE*outfile = NULL ;
>   SF_INFO sfinfo ;
>   SNDFILE *sndFile= NULL;
>
> Interestingly, "sfinfo" from the same library works perfectly.
> At the end of this story, I modified the program to remove MPI, then
> compiled it with gcc (not mpicc), and it works fine (in the debugger sndFile
> has a proper value).
>
> So it seems you believe mpicc is doing something wrong when all mpicc is is
> a wrapper to a compiler.  Maybe doing a "mpicc --showme" will give you an
> idea what compiler and options mpicc is passing to the compiler.  This
> should give you an idea of the difference between your gcc and mpicc
> 

Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Daniel Junglas
I built from a tarball, not svn. In the VERSION file I have
  svn_r=r26429
Is that the information you asked for?

Daniel

users-boun...@open-mpi.org wrote on 07/30/2012 04:15:45 PM:
> 
> Do you know what r# of 1.6 you were trying to compile?  Is this via 
> the tarball or svn?
> 
> thanks,
> 
> --td
> 
> On 7/30/2012 9:41 AM, Daniel Junglas wrote: 
> Hi,
> 
> I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
> Compilation and installation worked without a problem. However,
> when trying to run an application with mpirun I always faced
> this error:
> 
> [hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
> MULTICAST_IF
> for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
> Error: Invalid argument (22)
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line 56
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
> 
--
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_rmcast_base_select failed
>   --> Returned value Error (-1) instead of ORTE_SUCCESS
> 
> 
> After some digging I found that the following patch seems to fix the
> problem (at least the application seems to run correct now):
> --- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
> +++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
> @@ -936,9 +936,16 @@
>  }
>  } else {
>  /* on the xmit side, need to set the interface */
> +void const *addrptr;
>  memset(&inaddr, 0, sizeof(inaddr));
>  inaddr.sin_addr.s_addr = htonl(chan->interface);
> +#ifdef __sun
> +addrlen = sizeof(inaddr.sin_addr);
> +addrptr = (void *)&inaddr.sin_addr;
> +#else
>  addrlen = sizeof(struct sockaddr_in);
> +addrptr = (void *)&inaddr;
> +#endif
> 
>  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
>   "setup:socket:xmit interface 
> %03d.%03d.%03d.%03d",
> @@ -945,7 +952,7 @@
>   OPAL_IF_FORMAT_ADDR(chan->interface)));
> 
>  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
> -(void *)&inaddr, addrlen)) < 0) {
> +addrptr, addrlen)) < 0) {
>  opal_output(0, "%s rmcast:init: setsockopt() failed on 
> MULTICAST_IF\n"
>  "\tfor multicast network %03d.%03d.%03d.%03d 
> interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
>  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
> Can anybody confirm that the patch is good/correct? In particular
> that the '__sun' part is the right thing to do?
> 
> Thanks,
> 
> Daniel

> 

> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com





Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Ralph Castain
FWIW: the rmcast framework shouldn't be in 1.6. Jeff and I are testing removal 
and should have it out of there soon.

In the meantime, the best solution is to configure with "--enable-mca-no-build=rmcast".

On Jul 30, 2012, at 7:15 AM, TERRY DONTJE wrote:

> Do you know what r# of 1.6 you were trying to compile?  Is this via the 
> tarball or svn?
> 
> thanks,
> 
> --td
> 
> On 7/30/2012 9:41 AM, Daniel Junglas wrote:
>> 
>> Hi,
>> 
>> I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
>> Compilation and installation worked without a problem. However,
>> when trying to run an application with mpirun I always faced
>> this error:
>> 
>> [hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
>> MULTICAST_IF
>> for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
>> Error: Invalid argument (22)
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line 
>> 56
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>   orte_rmcast_base_select failed
>>   --> Returned value Error (-1) instead of ORTE_SUCCESS
>> 
>> 
>> After some digging I found that the following patch seems to fix the
>> problem (at least the application seems to run correct now):
>> --- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
>> +++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
>> @@ -936,9 +936,16 @@
>>  }
>>  } else {
>>  /* on the xmit side, need to set the interface */
>> +void const *addrptr;
>>  memset(&inaddr, 0, sizeof(inaddr));
>>  inaddr.sin_addr.s_addr = htonl(chan->interface);
>> +#ifdef __sun
>> +addrlen = sizeof(inaddr.sin_addr);
>> +addrptr = (void *)&inaddr.sin_addr;
>> +#else
>>  addrlen = sizeof(struct sockaddr_in);
>> +addrptr = (void *)&inaddr;
>> +#endif
>>  
>>  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
>>   "setup:socket:xmit interface 
>> %03d.%03d.%03d.%03d",
>> @@ -945,7 +952,7 @@
>>   OPAL_IF_FORMAT_ADDR(chan->interface)));
>>  
>>  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
>> -(void *)&inaddr, addrlen)) < 0) {
>> +addrptr, addrlen)) < 0) {
>>  opal_output(0, "%s rmcast:init: setsockopt() failed on 
>> MULTICAST_IF\n"
>>  "\tfor multicast network %03d.%03d.%03d.%03d 
>> interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
>>  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>> Can anybody confirm that the patch is good/correct? In particular
>> that the '__sun' part is the right thing to do?
>> 
>> Thanks,
>> 
>> Daniel
>> 
>> 
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
> 
> 
> 



Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Jeff Squyres
Ralph actually suggests that we just remove rmcast from 1.6.1.


On Jul 30, 2012, at 10:15 AM, TERRY DONTJE wrote:

> Do you know what r# of 1.6 you were trying to compile?  Is this via the 
> tarball or svn?
> 
> thanks,
> 
> --td
> 
> On 7/30/2012 9:41 AM, Daniel Junglas wrote:
>> Hi,
>> 
>> I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
>> Compilation and installation worked without a problem. However,
>> when trying to run an application with mpirun I always faced
>> this error:
>> 
>> [hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
>> MULTICAST_IF
>> for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
>> Error: Invalid argument (22)
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line 
>> 56
>> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
>> ../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>   orte_rmcast_base_select failed
>>   --> Returned value Error (-1) instead of ORTE_SUCCESS
>> 
>> 
>> After some digging I found that the following patch seems to fix the
>> problem (at least the application seems to run correct now):
>> --- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
>> +++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
>> @@ -936,9 +936,16 @@
>>  }
>>  } else {
>>  /* on the xmit side, need to set the interface */
>> +void const *addrptr;
>>  memset(&inaddr, 0, sizeof(inaddr));
>>  inaddr.sin_addr.s_addr = htonl(chan->interface);
>> +#ifdef __sun
>> +addrlen = sizeof(inaddr.sin_addr);
>> +addrptr = (void *)&inaddr.sin_addr;
>> +#else
>>  addrlen = sizeof(struct sockaddr_in);
>> +addrptr = (void *)&inaddr;
>> +#endif
>>  
>>  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
>>   "setup:socket:xmit interface 
>> %03d.%03d.%03d.%03d",
>> @@ -945,7 +952,7 @@
>>   OPAL_IF_FORMAT_ADDR(chan->interface)));
>>  
>>  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
>> -(void *)&inaddr, addrlen)) < 0) {
>> +addrptr, addrlen)) < 0) {
>>  opal_output(0, "%s rmcast:init: setsockopt() failed on 
>> MULTICAST_IF\n"
>>  "\tfor multicast network %03d.%03d.%03d.%03d 
>> interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
>>  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>> Can anybody confirm that the patch is good/correct? In particular
>> that the '__sun' part is the right thing to do?
>> 
>> Thanks,
>> 
>> Daniel
>> 
>> 
>> 
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread TERRY DONTJE
Do you know what r# of 1.6 you were trying to compile?  Is this via the 
tarball or svn?


thanks,

--td

On 7/30/2012 9:41 AM, Daniel Junglas wrote:

Hi,

I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
Compilation and installation worked without a problem. However,
when trying to run an application with mpirun I always faced
this error:

[hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on
MULTICAST_IF
 for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
 Error: Invalid argument (22)
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file
../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line
56
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file
../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_rmcast_base_select failed
   -->  Returned value Error (-1) instead of ORTE_SUCCESS


After some digging I found that the following patch seems to fix the
problem (at least the application seems to run correct now):
--- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
+++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
@@ -936,9 +936,16 @@
  }
  } else {
  /* on the xmit side, need to set the interface */
+void const *addrptr;
  memset(&inaddr, 0, sizeof(inaddr));
  inaddr.sin_addr.s_addr = htonl(chan->interface);
+#ifdef __sun
+addrlen = sizeof(inaddr.sin_addr);
+addrptr = (void *)&inaddr.sin_addr;
+#else
  addrlen = sizeof(struct sockaddr_in);
+addrptr = (void *)&inaddr;
+#endif

  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
   "setup:socket:xmit interface
%03d.%03d.%03d.%03d",
@@ -945,7 +952,7 @@
   OPAL_IF_FORMAT_ADDR(chan->interface)));

  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF,
-(void *)&inaddr, addrlen))<  0) {
+addrptr, addrlen))<  0) {
  opal_output(0, "%s rmcast:init: setsockopt() failed on
MULTICAST_IF\n"
  "\tfor multicast network %03d.%03d.%03d.%03d
interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
Can anybody confirm that the patch is good/correct? In particular
that the '__sun' part is the right thing to do?

Thanks,

Daniel




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 





[OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Daniel Junglas
Hi,

I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
Compilation and installation worked without a problem. However,
when trying to run an application with mpirun I always faced
this error:

[hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
MULTICAST_IF
for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
Error: Invalid argument (22)
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line 
56
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rmcast_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS


After some digging I found that the following patch seems to fix the
problem (at least the application seems to run correctly now):
--- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
+++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
@@ -936,9 +936,16 @@
 }
 } else {
 /* on the xmit side, need to set the interface */
+void const *addrptr;
 memset(&inaddr, 0, sizeof(inaddr));
 inaddr.sin_addr.s_addr = htonl(chan->interface);
+#ifdef __sun
+addrlen = sizeof(inaddr.sin_addr);
+addrptr = (void *)&inaddr.sin_addr;
+#else
 addrlen = sizeof(struct sockaddr_in);
+addrptr = (void *)&inaddr;
+#endif

 OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
  "setup:socket:xmit interface 
%03d.%03d.%03d.%03d",
@@ -945,7 +952,7 @@
  OPAL_IF_FORMAT_ADDR(chan->interface)));

 if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
-(void *)&inaddr, addrlen)) < 0) {
+addrptr, addrlen)) < 0) {
 opal_output(0, "%s rmcast:init: setsockopt() failed on 
MULTICAST_IF\n"
 "\tfor multicast network %03d.%03d.%03d.%03d 
interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
 ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
Can anybody confirm that the patch is good/correct? In particular
that the '__sun' part is the right thing to do?
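
For comparison, here is a minimal standalone sketch of the same call made with a struct in_addr, which, as far as the sockets documentation goes, is the argument type IP_MULTICAST_IF expects, rather than a struct sockaddr_in. If that holds on the other supported platforms, the #ifdef might not be needed at all, but that is an assumption and has not been checked against the Open MPI tree. set_multicast_interface is a made-up helper name, not something from rmcast_udp.c.

```c
#include <string.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* set_multicast_interface is a hypothetical helper, not Open MPI code.
 * It selects the outgoing multicast interface for socket sd by passing a
 * struct in_addr (not a struct sockaddr_in) to setsockopt().
 * iface is the interface address in host byte order; INADDR_ANY asks the
 * system to fall back to its default interface.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int set_multicast_interface(int sd, uint32_t iface)
{
    struct in_addr addr;

    memset(&addr, 0, sizeof(addr));
    addr.s_addr = htonl(iface);

    return setsockopt(sd, IPPROTO_IP, IP_MULTICAST_IF,
                      &addr, sizeof(addr));
}
```

The point of the sketch is only that the option value and its length describe the same 4-byte address, which is what the '__sun' branch of the patch does.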

Thanks,

Daniel




Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread TERRY DONTJE
Please show me how you are compiling the program under gcc and mpicc.  
Plus do a "mpicc --showme".


--td

On 7/30/2012 8:33 AM, Paweł Jaromin wrote:

This situation is also strange to me; I spent 2 days trying to find the bug :(.

Unfortunately I am not a professional C/C++ programmer, but I have
to write this program. Please have a look at the picture at the link below;
maybe it will make things clearer.

http://vipjg.nazwa.pl/sndfile_error.png

2012/7/30 TERRY DONTJE:

On 7/30/2012 6:11 AM, Paweł Jaromin wrote:

Hello

Thanks for fast answer, but the problem looks a little different.

Of course, I use this code only for master node (rank 0), because only
this node has an access to file.

As you can see, I use an "if" clause to check sndFile for NULL:

if (sndFile == NULL)

and the value is not NULL, so the code continues.
I found the problem while checking the array:


   long numFrames = sf_readf_float(sndFile, snd_buffor, sfinfo.frames);

   // Check correct number of samples loaded
   if (numFrames != sfinfo.frames) {
       fprintf(stderr, "Did not read enough frames for source\n");
       sf_close(sndFile);
       free(snd_buffor);
       MPI_Finalize();
       return 1;
   }

So, after that I went to the debugger to check the variables (I use Eclipse PTP
and the sdm environment). After initialization, the variable "sndFile" has
"no value", not "NULL". Unfortunately, sndFile keeps the same
value until the end of the program :(.

What do you mean by sndFile has "no value"?  There isn't a special "no
value" value to a variable unless you are debugging a code that somehow had
some variable optimized out at the particular line you are interested in.

Declarations:
FILE*outfile = NULL ;
SF_INFO sfinfo ;
SNDFILE *sndFile= NULL;

Interestingly, "sfinfo" from the same library works perfectly.
At the end of this story, I modified the program to remove MPI, then
compiled it with gcc (not mpicc), and it works fine (in the debugger sndFile
has a proper value).

So it seems you believe mpicc is doing something wrong when all mpicc is is
a wrapper to a compiler.  Maybe doing a "mpicc --showme" will give you an
idea what compiler and options mpicc is passing to the compiler.  This
should give you an idea of the difference between your gcc and mpicc
compilation.  I would suspect either mpicc is using a compiler significantly
different than gcc or that mpicc might be passing some optimization
parameter that is messing the code execution (just a guess).


I hope it is clear now.

Not really.

--td



2012/7/30 TERRY DONTJE:

I am not sure I am understanding the problem correctly so let me describe it
back to you with a couple clarifications.

So your program using sf_open compiles successfully when using gcc and
mpicc.  However, when you run the executable compiled using mpicc sndFile is
null?

If the above is right can you tell us how you ran the code?
Will the code run ok if ran with "mpirun -np 1" on the same machine you run
the gcc code normally?
When the mpicc compiled code sf_open call returns NULL what does the
successive sf_strerror report?
My wild guess is when you run the mpicc compiled code one of the processes
is on a node that doesn't have access to the file passed to sf_open.

--td

On 7/28/2012 1:08 PM, Paweł Jaromin wrote:

Hello all

Because I am trying to write a program for parallel processing of sound files, I use
the libsndfile library to load and write wav files. The situation is strange:
when I compile the program with gcc (no parallelism) it works, but
if I compile it with mpicc there is a problem with the sndFile variable.

// Open sound file
SF_INFO sndInfo;
SNDFILE *sndFile = sf_open(argv[1], SFM_READ, &sndInfo);
if (sndFile == NULL) {
   fprintf(stderr, "Error reading source file '%s': %s\n", argv[1],
sf_strerror(sndFile));
   return 1;
}

This code runs without an error, but the variable shows "No value".

Maybe someone can help me?


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com







--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com









--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631

Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread TERRY DONTJE

On 7/30/2012 6:11 AM, Paweł Jaromin wrote:

Hello

Thanks for fast answer, but the problem looks a little different.

Of course, I use this code only for master node (rank 0), because only
this node has an access to file.

As you can see, I use an "if" clause to check sndFile for NULL:

if (sndFile == NULL)

and the value is not NULL, so the code continues.
I found the problem while checking the array:


   long numFrames = sf_readf_float(sndFile, snd_buffor, sfinfo.frames);

   // Check correct number of samples loaded
   if (numFrames != sfinfo.frames) {
       fprintf(stderr, "Did not read enough frames for source\n");
       sf_close(sndFile);
       free(snd_buffor);
       MPI_Finalize();
       return 1;
   }
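
The short-read check above follows a general pattern: ask for N items, compare against what actually arrived, and release everything on the failure path. A minimal sketch of that pattern, using fread from the C standard library as a stand-in for sf_readf_float so that it builds without libsndfile (read_exact is a hypothetical helper name, not part of either library):

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads exactly `expected` floats from `fp` into a freshly allocated buffer.
 * Returns the buffer on success, or NULL after releasing everything it
 * allocated when the read comes up short. fread stands in for
 * sf_readf_float here; read_exact is a hypothetical helper name. */
float *read_exact(FILE *fp, size_t expected)
{
    float *buf = malloc(expected * sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }

    size_t got = fread(buf, sizeof *buf, expected, fp);
    if (got != expected) {
        fprintf(stderr, "Did not read enough frames: wanted %zu, got %zu\n",
                expected, got);
        free(buf);
        return NULL;
    }
    return buf;
}
```

With libsndfile itself, note that sf_readf_float returns an sf_count_t and sfinfo.frames is also an sf_count_t, so storing the result in a long may lose range on platforms where long is narrower; that is a side observation and not necessarily related to the problem described here.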

So, after that I went to the debugger to check the variables (I use Eclipse PTP
and the sdm environment). After initialization, the variable "sndFile" has
"no value", not "NULL". Unfortunately, sndFile keeps the same
value until the end of the program :(.
What do you mean by sndFile has "no value"?  There isn't a special "no 
value" value to a variable unless you are debugging a code that somehow 
had some variable optimized out at the particular line you are 
interested in.
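
One way to take the debugger's display out of the question is to print the pointer from the program itself. A small sketch, assuming nothing beyond the C standard library (describe_pointer is a made-up helper name):

```c
#include <stdio.h>

/* describe_pointer is a hypothetical helper: it writes either "NULL" or the
 * pointer's value into buf so it can be logged. Returns buf. */
const char *describe_pointer(const void *p, char *buf, size_t len)
{
    if (p == NULL) {
        snprintf(buf, len, "NULL");
    } else {
        /* %p expects a void *, hence the cast away from const. */
        snprintf(buf, len, "%p", (void *)p);
    }
    return buf;
}
```

Usage would be something like fprintf(stderr, "sndFile = %s\n", describe_pointer(sndFile, buf, sizeof buf)) right after the sf_open call. Also worth keeping in mind: SNDFILE is declared in sndfile.h as an opaque type, so a debugger may have nothing to show for its contents even when the pointer itself is perfectly valid. Whether that is what Eclipse reports as "no value" is a guess.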

Declarations:
FILE*outfile = NULL ;
SF_INFO sfinfo ;
SNDFILE *sndFile= NULL;

Interestingly, "sfinfo" from the same library works perfectly.
At the end of this story, I modified the program to remove MPI, then
compiled it with gcc (not mpicc), and it works fine (in the debugger sndFile
has a proper value).
So it seems you believe mpicc is doing something wrong when all mpicc is 
is a wrapper to a compiler.  Maybe doing a "mpicc --showme" will give 
you an idea what compiler and options mpicc is passing to the compiler.  
This should give you an idea of the difference between your gcc and mpicc 
compilation.  I would suspect either mpicc is using a compiler 
significantly different than gcc or that mpicc might be passing some 
optimization parameter that is messing the code execution (just a guess).


I hope it is clear now.

Not really.

--td



2012/7/30 TERRY DONTJE:

I am not sure I am understanding the problem correctly so let me describe it
back to you with a couple clarifications.

So your program using sf_open compiles successfully when using gcc and
mpicc.  However, when you run the executable compiled using mpicc sndFile is
null?

If the above is right can you tell us how you ran the code?
Will the code run ok if run with "mpirun -np 1" on the same machine where you
normally run the gcc code?
When the sf_open call in the mpicc-compiled code returns NULL, what does the
subsequent sf_strerror report?
My wild guess is that when you run the mpicc-compiled code, one of the processes
is on a node that doesn't have access to the file passed to sf_open.

--td

On 7/28/2012 1:08 PM, Paweł Jaromin wrote:

Hello all

Because I am trying to write a program for parallel processing of sound files,
I use the libsndfile library to load and write wav files. The situation is
strange: when I compile the program with gcc it works (no parallelism), but
if I compile it with mpicc there is a problem with the sndFile variable.

// Open sound file
SF_INFO sndInfo;
SNDFILE *sndFile = sf_open(argv[1], SFM_READ, &sndInfo);
if (sndFile == NULL) {
   fprintf(stderr, "Error reading source file '%s': %s\n", argv[1],
           sf_strerror(sndFile));
   return 1;
}

This code runs without an error, but the variable is "No value".

Maybe someone can help me?


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users










[OMPI users] Regarding the programming

2012-07-30 Thread seshendra seshu
Hi,
I am new to MPI programming. I have a task in which I have around 3 nodes, and
those 3 nodes consist of 50 processes. I need to write a parallel quick
sort considering 2, 4, 8, 16, 32, 42, 50. So I need some ideas and materials to
develop this code; kindly help me by providing your valuable suggestions
and materials with some examples.


Thanking you,


-- 
 WITH REGARDS
M.L.N.Seshendra


Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread Paweł Jaromin
Hello

Thanks for the fast answer, but the problem looks a little different.

Of course, I use this code only on the master node (rank 0), because only
this node has access to the file.

As you can see, I use an "if" clause to check sndFile for NULL:

if (sndFile == NULL)

and sf_open returns a non-NULL value, so the code can run forward.
I found the problem while checking the array:


   long numFrames = sf_readf_float(sndFile, snd_buffor, sfinfo.frames);

   // Check correct number of samples loaded
   if (numFrames != sfinfo.frames) {
      fprintf(stderr, "Did not read enough frames for source\n");
      sf_close(sndFile);
      free(snd_buffor);
      MPI_Finalize();
      return 1;
   }

So, after that I went to the debugger to check the variables (I use Eclipse PTP
and the sdm environment). After initialization, the variable "sndFile" has
"no value", not "NULL". Unfortunately, sndFile keeps the same value
until the end of the program :(.

Declarations:
FILE *outfile = NULL;
SF_INFO sfinfo;
SNDFILE *sndFile = NULL;

Very interesting is that "sfinfo", from the same library, works perfectly.
At the end of this story, I modified the program to remove MPI, then
compiled it with gcc (not mpicc), and it works fine (in the debugger sndFile
has a proper value).

I hope it is clear now.






-- 
--
pozdrawiam

Paweł Jaromin



Re: [OMPI users] sndlib problem by mpicc compiler

2012-07-30 Thread TERRY DONTJE
I am not sure I am understanding the problem correctly so let me 
describe it back to you with a couple clarifications.


So your program using sf_open compiles successfully when using gcc and 
mpicc.  However, when you run the executable compiled using mpicc 
sndFile is null?


If the above is right can you tell us how you ran the code?
Will the code run ok if run with "mpirun -np 1" on the same machine where 
you normally run the gcc code?
When the sf_open call in the mpicc-compiled code returns NULL, what does the 
subsequent sf_strerror report?
My wild guess is that when you run the mpicc-compiled code, one of the 
processes is on a node that doesn't have access to the file passed to 
sf_open.


--td


