Re: [OMPI users] MPI_Send, MPI_Recv problem on Mac and Linux

2012-04-18 Thread Peter Sels
ok, I see. Interesting. Thanks.
Peter

On 18 April 2012 14:17, Jeffrey Squyres <jsquy...@cisco.com> wrote:
> On Apr 18, 2012, at 3:15 AM, Peter Sels wrote:
>
>> I suppose with C++ MPI it's possible to enforce more strict type
>> checking using templates or so.)
>
> Not really, unfortunately.  :-(
[snip]

-- 
Peter Sels
www.LogicallyYours.com



Re: [OMPI users] MPI_Send, MPI_Recv problem on Mac and Linux

2012-04-18 Thread Peter Sels
Hi Jeffrey,

Thanks a lot for answering my message.
The '&work' instead of 'work' thing escaped my attention.
(Usually the compiler catches these things but of course here any
pointer type should work.
I suppose with C++ MPI it's possible to enforce more strict type
checking using templates or so.)

Anyway, in the meantime I had rewritten everything without this mistake,
even though I still wasn't aware of it.
Everything works as expected now.

thanks,

Peter

On 17 April 2012 22:15, Jeffrey Squyres <jsquy...@cisco.com> wrote:
> Sorry for the delay in replying; I was out last week.
>
> MPI_SEND and MPI_RECV take pointers to the buffer to send and receive, 
> respectively.
>
> When you send a scalar variable, like an int, you get the address of the
> buffer via the & operator (e.g., MPI_Send(&i, ...) for an int i).  When you send a
> new'ed/malloc'ed array, you only need to send the pointer value -- i.e., the
> address pointing to the buffer.  Don't send the address of the pointer,
> because then you're telling MPI to overwrite the pointer itself.  I.e.:
>
>  work = new char[...];
>  MPI_Send(work, ...)
>
> not
>
>  work = new char[...];
>  MPI_Send(&work, ...);
>
> More below.
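
The same rule holds on the receive side. A minimal sketch (made-up names;
rank 0 as sender, tag 2):

  int len = 40;
  char *work = new char[len];
  // correct: pass the pointer value so MPI fills the buffer it points to
  MPI_Recv(work, len, MPI_BYTE, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  // wrong: MPI_Recv(&work, ...) would make MPI overwrite the pointer itself
  delete [] work;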
>
> On Apr 11, 2012, at 3:03 PM, Peter Sels wrote:
>
>> Probably a buffer overrun or so, but I cannot see where.
>
> The buffer overrun is where you specify &work in your MPI_SEND/MPI_RECV
> calls.
>
>> On linux I get:  Segmentation fault (11)
>>
>> Increasing the length gives more problems...
>>
>> How can I get this code stable?
>> What am I doing wrong?
>> Is there a maximum length to MPI messages?
>
> No.
>
>> For sending a string, do I use MPI_CHARACTER or MPI_BYTE or ...?
>
> MPI_BYTE.  MPI_CHARACTER is for Fortran CHARACTERs.
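
So a C++ string can go out as raw bytes, terminating '\0' included. A
minimal sketch (assuming destination rank 1 and tag 2):

  std::string msg = "hello";
  // size()+1 so the receiver also gets the terminating '\0'
  MPI_Send(const_cast<char *>(msg.c_str()), (int)msg.size() + 1, MPI_BYTE,
           1, 2, MPI_COMM_WORLD);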
>
>> How come I cannot assert that my messages end in '\0' when received?
>> And how come, when I print them, I also get a segmentation fault?
>
> I think these two issues are symptoms of (work) vs. (&work), from above.
>
>> Can I send two subsequent messages using MPI_Send, or do I have to do
>
> Sure.
>
>> the first as MPI_Isend and then do a MPI_Wait before the next
>> MPI_Send?...
>
> You can do multiple Isend's and then a Waitall, if you want.
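
For example, the length and the payload could go out as two Isend's
followed by one Waitall. A sketch, assuming destination rank 1 and the
LENGTH_TAG/WORK_TAG values from the attached code:

  char buf[] = "WWWW";
  unsigned long len = sizeof(buf);            // includes the trailing '\0'
  MPI_Request reqs[2];
  MPI_Isend(&len, 1, MPI_UNSIGNED_LONG, 1, LENGTH_TAG, MPI_COMM_WORLD, &reqs[0]); // &len: scalar
  MPI_Isend(buf, (int)len, MPI_BYTE, 1, WORK_TAG, MPI_COMM_WORLD, &reqs[1]);      // buf is already a pointer
  MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);  // buf and len must stay valid until this returns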
>
>> Why do I not find code online for receiving the length first and then
>> allocating a buffer of this size and then receiving the next message?
>
> I don't know.  Perhaps you didn't google enough?  :-)
>
> FWIW, the new MPI-3 functions MPI_MPROBE and MPI_IMPROBE will help with 
> unknown-length messages, too.  We have that implemented on the Open MPI SVN 
> trunk, but they are not yet available in a stable release.  They'll debut in 
> OMPI v1.7.
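
Until then, the usual pattern is MPI_Probe plus MPI_Get_count, which avoids
sending the length as a separate message. A sketch, assuming rank 0 as the
sender and the WORK_TAG from the attached code:

  MPI_Status st;
  int nbytes;
  MPI_Probe(0, WORK_TAG, MPI_COMM_WORLD, &st);   // block until a matching message is pending
  MPI_Get_count(&st, MPI_BYTE, &nbytes);         // how many bytes it carries
  char *buf = new char[nbytes];
  MPI_Recv(buf, nbytes, MPI_BYTE, 0, WORK_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  // ... use buf ...
  delete [] buf;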
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Peter Sels
www.LogicallyYours.com



[OMPI users] MPI_Send, MPI_Recv problem on Mac and Linux

2012-04-11 Thread Peter Sels
Dear openMPI users,

I think this should be an easy question to anyone with more experience
than an openMPI-hello-world-program...

I wrote some openMPI code, where the master sends a length and then a
buffer with that length as 2 subsequent MPI messages.
The slave receives these messages and answers back in a similar manner.

Sometimes this goes ok, sometimes not.
Messages of 28 chars or shorter do fine.
Messages of 29 or longer are usually problematic.
This length can be controlled with
#define DUMMY_MSG_LENGTH (40)


On Mac I sometimes get a mention of "slave 32767", where there
should only be a slave 1.
Probably a buffer overrun or so, but I cannot see where.

On linux I get:  Segmentation fault (11)

Increasing the length gives more problems...

How can I get this code stable?
What am I doing wrong?
Is there a maximum length to MPI messages?
For sending a string, do I use MPI_CHARACTER or MPI_BYTE or ...?

How come I cannot assert that my messages end in '\0' when received?
And how come, when I print them, I also get a segmentation fault?

Can I send two subsequent messages using MPI_Send, or do I have to do
the first as MPI_Isend and then do a MPI_Wait before the next
MPI_Send?...

Why do I not find code online for receiving the length first and then
allocating a buffer of this size and then receiving the next message?

All code, build, run scripts and logs are attached.

It would help me big time if you could answer my questions or debug the code.

thanks a lot!

Pete
#include <mpi.h>

#include <iostream>
#include <sstream>
#include <string>
#include <cassert>
#include <cstdio>

#define DUMMY_MSG_LENGTH (40)
// >28 almost never works, 
// <=28 mostly works, sometimes not either

#define LENGTH_TAG 1
#define WORK_TAG 2
#define RESULT_TAG 3
#define DIE_TAG 4

using namespace std;

// From: http://beige.ucs.indiana.edu/I590/node85.html
void mpiErrorLog(int rank, int error_code) {
  if (error_code != MPI_SUCCESS) {

    char error_string[BUFSIZ];
    int length_of_error_string;

    MPI_Error_string(error_code, error_string, &length_of_error_string);
    cerr << "MPI: rank=" << rank << ", errorStr=" << error_string << endl;
    //send_error = TRUE;
  }
}

int main(int argc, char* argv[]) {
  typedef unsigned long int unit_of_length_t;
  typedef unsigned char unit_of_work_t;
  typedef unsigned char unit_of_result_t;
  
  int numprocs, rank, namelen;
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  
  MPI_Init(&argc, &argv);

  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  cerr << "MPI: numprocs = " << numprocs << endl;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  cerr << "MPI: rank = " << rank << endl;
  MPI_Get_processor_name(processor_name, &namelen);
  cerr << "MPI: processor_name = " << processor_name << endl;
  
  MPI_Status status;
  
  // Send work to the Slaves //
  
  //MPI_Errhandler_set(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
  int errorCode;
  
  stringstream ss;
  string s0(DUMMY_MSG_LENGTH, 'W');
  cerr << "work msg = '" << s0 << "'" << endl; 
  ss << s0;
  string s = ss.str();
  
  if (rank!=0) {

    MPI_Status status;
    int errorCode;

    while (true) {

      // Receive work from the master //
      unit_of_length_t workLength;
      cerr << "MPI: slave " << rank << " ready to receive workLength from master"
           << endl;
      errorCode = MPI_Recv(&workLength, 1, MPI_UNSIGNED_LONG, 0, MPI_ANY_TAG,
                           MPI_COMM_WORLD, &status);
      mpiErrorLog(rank, errorCode);

      assert((status.MPI_TAG == LENGTH_TAG) || (status.MPI_TAG == DIE_TAG));
      if (status.MPI_TAG == DIE_TAG) {
        cerr << "MPI: slave " << rank << " received dieTag from master, "
             << "errorCode = " << errorCode << endl;
        MPI_Finalize();
        return 0; // ok
      }
      assert(status.MPI_TAG == LENGTH_TAG);
      cerr << "MPI: slave " << rank << " received workLength = "
           << workLength << " from master, errorCode = " << errorCode << endl;

      unit_of_work_t * work = new unit_of_work_t[workLength+1];
      cerr << "work = " << (void*)work << endl;
      assert(work != 0);

      cerr << "MPI: slave " << rank << " ready to receive work from master"
           << endl;
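      // NOTE: the &work in the next call is the bug discussed in the replies
      // above: it passes the address of the pointer itself rather than the
      // buffer it points to, so MPI_Recv overwrites the pointer.  It should
      // read MPI_Recv(work, ...).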
      MPI_Recv(&work, workLength+1, MPI_BYTE, 0, WORK_TAG,
               MPI_COMM_WORLD, &status);
  cerr << "MPI: slave " << rank << " received work from master, "
  << "errorCode = " << errorCode << endl;
  mpiErrorLog(rank, errorCode);
  //**//assert(work[workLength] == '\0');
  //**//cerr << ">>>MPI: work = " << work << endl;
  //**//printf("MPI: work = %s", work);


  assert(status.MPI_TAG == WORK_TAG);
  
  
  stringstream ss1;
  string s0(DUMMY_MSG_LENGTH, 'R');
  cerr << "result msg = '" << s0 << "'" << endl; 
  ss1 << s0;
  
  // Send result to the master //
  
  unit_of_length_t resultLength = ss1.str().length();
  
  unit_of_result_t * result = new unit_of_result_t[resultLength+1];
  result[resultLength] = '\0';
  cerr << "result = " << (void*)result << endl;