Re: [OMPI devel] RML Send

2008-06-19 Thread Ralph H Castain
Okay, I've traced this down. The problem is that a DSS-internal function has
been exposed via the API, so now people can mistakenly call the wrong one.
You should -never- be using opal_dss.pack_buffer or opal_dss.unpack_buffer.
Those were supposed to be internal to the DSS only, and will definitely mess
you up if called directly.
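
To illustrate why mixing the two call paths goes wrong, here is a small stand-alone sketch. It does not use the real DSS: `toy_buf_t`, `toy_pack`, `toy_pack_raw` and `toy_unpack` are hypothetical stand-ins, and the idea that the public pack writes a count header in front of the payload while the internal routine writes only the payload is a simplifying assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a DSS buffer; not the real opal_buffer_t. */
typedef struct {
    uint8_t bytes[256];
    size_t  used;   /* bytes written so far */
    size_t  read;   /* bytes consumed so far */
} toy_buf_t;

/* "Internal" routine (assumed behaviour): appends the payload only. */
static void toy_pack_raw(toy_buf_t *b, const uint8_t *src, int32_t n)
{
    assert(n >= 0 && b->used + (size_t)n <= sizeof(b->bytes));
    memcpy(b->bytes + b->used, src, (size_t)n);
    b->used += (size_t)n;
}

/* "Public" routine (assumed behaviour): count header, then payload. */
static void toy_pack(toy_buf_t *b, const uint8_t *src, int32_t n)
{
    toy_pack_raw(b, (const uint8_t *)&n, (int32_t)sizeof(n));
    toy_pack_raw(b, src, n);
}

/* "Public" unpack: expects the count header written by toy_pack.
 * On input *n is the capacity of dst, on output the count found.
 * Returns 0 on success, -1 if the header does not fit the data. */
static int toy_unpack(toy_buf_t *b, uint8_t *dst, int32_t *n)
{
    int32_t stored;
    if (b->used - b->read < sizeof(stored)) return -1;
    memcpy(&stored, b->bytes + b->read, sizeof(stored));
    if (stored < 0 || stored > *n ||
        (size_t)stored > b->used - b->read - sizeof(stored)) return -1;
    b->read += sizeof(stored);
    memcpy(dst, b->bytes + b->read, (size_t)stored);
    b->read += (size_t)stored;
    *n = stored;
    return 0;
}

/* Matched pair: returns 0 when the round trip succeeds. */
static int demo_matched(void)
{
    toy_buf_t b = {0};
    uint8_t in[5] = {1, 2, 3, 4, 5}, out[5];
    int32_t n = 5;
    toy_pack(&b, in, 5);
    return (toy_unpack(&b, out, &n) == 0 && n == 5 &&
            memcmp(in, out, 5) == 0) ? 0 : -1;
}

/* Mismatched pair: raw pack, then header-expecting unpack. The first
 * payload bytes are misread as a count, so the call is rejected. */
static int demo_mismatched(void)
{
    toy_buf_t b = {0};
    uint8_t in[5] = {1, 2, 3, 4, 5}, out[5];
    int32_t n = 5;
    toy_pack_raw(&b, in, 5);
    return toy_unpack(&b, out, &n);
}
```

In this toy model the matched pair round-trips cleanly, while the mismatched pair interprets payload bytes as a count. The real DSS may fail differently, but the sketch shows why the two routines should never be combined on one buffer.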

I'll fix this problem to avoid future issues. There is a comment in dss.h
that warns you never to call those functions, but who would remember?

I sure wouldn't. I've only avoided the problem out of ignorance - I
didn't know those APIs existed!

Should have a fix in later today.
Ralph



On 6/19/08 8:43 AM, "Ralph H Castain"  wrote:

> WOW! Somebody really screwed up the DSS by adding some new APIs I'd never
> heard of before, and they really can cause the system to break!
> 
> I'm going to have to straighten this mess out - it is a total disaster.
> There needs to be just ONE way of packing and unpacking, not two totally
> incompatible methods.
> 
> Will let you know when it is fixed - probably early next week.
> Ralph
>  
> 
> 

Re: [OMPI devel] RML Send

2008-06-19 Thread Leonardo Fialho

Hi Ralph,

My mistake, I'm actually using ORTE_PROC_MY_DAEMON->jobid.

It works when I use pack_buffer()/unpack_buffer() with the OPAL_BYTE type,
but something strange happened when I was using pack()/unpack(): the value
of num_bytes increased. For example, I tried to read num_bytes=5, and after
an unpack this variable held 33! I don't understand it...


Thanks,
Leonardo Fialho

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
  
  




  



--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478



Re: [OMPI devel] RML Send

2008-06-17 Thread Ralph Castain



On 6/17/08 3:35 PM, "Leonardo Fialho"  wrote:

> Hi Ralph,
> 
> 1) Yes, I'm using ORTE_RML_TAG_DAEMON with a new "command" that I
> defined in "odls_types.h".
> 2) I'm packing and unpacking variables like OPAL_INT, OPAL_SIZE, ...
> 3) I'm not blocking the "process_commands" function with long code.
> 4) To get the daemon's vpid and jobid, I used the same jobid as the
> app (in this solution; it can be changed), and the vpids are assigned
> sequentially (0 for mpirun and 1 to N for the orteds).

The jobid of the daemons is different from the jobid of the apps. So at the
moment, you are actually sending the message to another app!

You can find the jobid of the daemons by extracting it as
ORTE_PROC_MY_DAEMON->jobid. Please note, though, that the app has no
knowledge of the contact info for that daemon, so this message will have to
route through the local daemon. Happens transparently, but just wanted to be
clear as to how this is working.
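
As a rough illustration of the naming point above: a process name is a (jobid, vpid) pair, and the target daemon's name takes its jobid from the local daemon rather than from the app. The sketch below uses a hypothetical `toy_name_t` and `daemon_name()` helper instead of the real ORTE types, so treat it as a picture of the idea only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a process name: a (jobid, vpid) pair. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} toy_name_t;

/* Build the name of a remote daemon from the local daemon's name
 * (what ORTE_PROC_MY_DAEMON points at in the real code) and the
 * vpid of the daemon we want to reach. */
static toy_name_t daemon_name(const toy_name_t *my_daemon, uint32_t target_vpid)
{
    assert(my_daemon != NULL);
    toy_name_t target;
    target.jobid = my_daemon->jobid;  /* daemon jobid, not the app's jobid */
    target.vpid  = target_vpid;
    return target;
}
```

Using the app's own jobid in place of `my_daemon->jobid` would produce the name of another app process, which is the mistake described above.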

> 
> The problem is: I need to send buffered data, and I don't know the
> type of this data. I'm trying to use OPAL_NULL and OPAL_DATA_VALUE to
> send it, but without success :(

If I recall correctly, you were trying to archive messages that flowed
through the PML - correct? I would suggest just treating them as bytes and
packing them as an opal_byte_object_t, something like this:

opal_byte_object_t bo;

bo.size = sizeof(my_data);
bo.data = *my_data;

opal_dss.pack(*buffer, &bo, 1, OPAL_BYTE_OBJECT);

Then on the other end:

opal_byte_object_t *bo;
int32_t n = 1;

opal_dss.unpack(*buffer, &bo, &n, OPAL_BYTE_OBJECT);

You can then transfer the data into whatever storage you like. All this does
is pass the #bytes and the bytes as a collected unit - you could, of course,
simply pass the #bytes and bytes with independent packs if you wanted:

int32_t num_bytes;
int32_t n = 1;
uint8_t *my_data;

opal_dss.pack(*buffer, &num_bytes, 1, OPAL_INT32);
opal_dss.pack(*buffer, my_data, num_bytes, OPAL_BYTE);

...

opal_dss.unpack(*buffer, &num_bytes, &n, OPAL_INT32);
my_data = (uint8_t*)malloc(num_bytes);
opal_dss.unpack(*buffer, my_data, &num_bytes, OPAL_BYTE);
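
For readers who want to try the count-then-bytes scheme outside of Open MPI, here is a stand-alone sketch of the same pattern. The `wire_t` buffer and the `put`, `get`, `send_blob` and `recv_blob` helpers are hypothetical and only mimic the ordering (count first, then the bytes); they are not the DSS API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat buffer standing in for the DSS buffer. */
typedef struct {
    uint8_t data[512];
    size_t  wr;   /* write offset */
    size_t  rd;   /* read offset */
} wire_t;

/* Append len raw bytes; returns 0 on success, -1 if they do not fit. */
static int put(wire_t *w, const void *src, size_t len)
{
    if (len > sizeof(w->data) - w->wr) return -1;
    memcpy(w->data + w->wr, src, len);
    w->wr += len;
    return 0;
}

/* Consume len raw bytes; returns 0 on success, -1 on underrun. */
static int get(wire_t *w, void *dst, size_t len)
{
    if (len > w->wr - w->rd) return -1;
    memcpy(dst, w->data + w->rd, len);
    w->rd += len;
    return 0;
}

/* Sender: byte count first, then the bytes themselves. */
static int send_blob(wire_t *w, const uint8_t *blob, int32_t num_bytes)
{
    assert(w != NULL && blob != NULL);
    if (num_bytes < 0) return -1;
    if (put(w, &num_bytes, sizeof(num_bytes)) != 0) return -1;
    return put(w, blob, (size_t)num_bytes);
}

/* Receiver: read the count, allocate, then read that many bytes.
 * On success the caller owns *blob and must free() it. */
static int recv_blob(wire_t *w, uint8_t **blob, int32_t *num_bytes)
{
    if (get(w, num_bytes, sizeof(*num_bytes)) != 0 || *num_bytes < 0) return -1;
    *blob = malloc(*num_bytes > 0 ? (size_t)*num_bytes : 1);
    if (*blob == NULL) return -1;
    if (get(w, *blob, (size_t)*num_bytes) != 0) {
        free(*blob);
        *blob = NULL;
        return -1;
    }
    return 0;
}

/* Round trip of a 5-byte payload; returns 0 when the bytes match. */
static int demo_blob(void)
{
    wire_t w = {0};
    const uint8_t msg[5] = {'h', 'e', 'l', 'l', 'o'};
    uint8_t *out = NULL;
    int32_t n = 0;
    if (send_blob(&w, msg, 5) != 0 || recv_blob(&w, &out, &n) != 0) return -1;
    int ok = (n == 5 && memcmp(out, msg, 5) == 0);
    free(out);
    return ok ? 0 : -1;
}
```

The receiver can only size its allocation because the count arrives before the payload, which is why the order of the two packs matters.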


Up to you.

Hope that helps
Ralph

> 
> Thanks in advance,
> Leonardo Fialho
> 
> 





Re: [OMPI devel] RML Send

2008-06-17 Thread Leonardo Fialho

Hi Ralph,

1) Yes, I'm using ORTE_RML_TAG_DAEMON with a new "command" that I
defined in "odls_types.h".
2) I'm packing and unpacking variables like OPAL_INT, OPAL_SIZE, ...
3) I'm not blocking the "process_commands" function with long code.
4) To get the daemon's vpid and jobid, I used the same jobid as the
app (in this solution; it can be changed), and the vpids are assigned
sequentially (0 for mpirun and 1 to N for the orteds).


The problem is: I need to send buffered data, and I don't know the
type of this data. I'm trying to use OPAL_NULL and OPAL_DATA_VALUE to
send it, but without success :(


Thanks in advance,
Leonardo Fialho





--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478



Re: [OMPI devel] RML Send

2008-06-17 Thread Ralph H Castain
I'm not sure exactly how you are trying to do this, but the usual procedure
would be:

1. call opal_dss.pack(*buffer, *data, #data, data_type) for each thing you
want to put in the buffer. So you might call this to pack a string:

opal_dss.pack(*buffer, , 1, OPAL_STRING);

2. once you have everything packed into the buffer, you send the buffer with

orte_rml.send_buffer(*dest, *buffer, dest_tag, 0);

What you will need is a tag that the daemon is listening on that won't
interfere with its normal operations - i.e., what you send won't get held
forever waiting to get serviced, and your servicing won't block us from
responding to a ctrl-c. You can probably use ORTE_RML_TAG_DAEMON, but you
need to ensure you don't block anything.
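
One common way to avoid blocking in a receive path, offered here only as a sketch and not as something prescribed in this thread, is to have the handler copy the message into a bounded queue and return immediately, leaving the slow work for later. The names `msg_queue_t`, `on_message` and `drain_one` are hypothetical, and the sketch assumes single-threaded use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_SLOTS 8
#define MSG_MAX     64

/* One queued message: a copy of the payload and its length. */
typedef struct {
    uint8_t payload[MSG_MAX];
    size_t  len;
} queued_msg_t;

/* Bounded FIFO; head is the next slot to drain, count the number queued. */
typedef struct {
    queued_msg_t slots[QUEUE_SLOTS];
    size_t head;
    size_t count;
} msg_queue_t;

/* Receive handler: copy and return at once, never do slow work here.
 * Returns 0 if queued, -1 if dropped (empty, too big, or queue full). */
static int on_message(msg_queue_t *q, const uint8_t *data, size_t len)
{
    assert(q != NULL && data != NULL);
    if (len == 0 || len > MSG_MAX || q->count == QUEUE_SLOTS) return -1;
    queued_msg_t *slot = &q->slots[(q->head + q->count) % QUEUE_SLOTS];
    memcpy(slot->payload, data, len);
    slot->len = len;
    q->count++;
    return 0;
}

/* Called later, outside the handler: take at most one queued message.
 * out must hold MSG_MAX bytes. Returns the message length, or 0 if
 * the queue was empty. */
static size_t drain_one(msg_queue_t *q, uint8_t *out)
{
    if (q->count == 0) return 0;
    queued_msg_t *slot = &q->slots[q->head];
    memcpy(out, slot->payload, slot->len);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return slot->len;
}
```

Dropping messages when the queue is full is a deliberate choice in this sketch: for log traffic it keeps the handler bounded in time, at the cost of losing data under load.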

BTW: how is the app figuring out the name of the remote daemon? The proc
will have access to the daemon's vpid (assuming it knows the nodename where
the daemon is running) in the ESS, but not the jobid - I assume you are
using some method to compute the daemon jobid from the apps?


On 6/17/08 12:08 PM, "Leonardo Fialho"  wrote:

> Hi All,
> 
> I'm using RML to send log messages from a PML to an ORTE daemon (located
> on another node). I succeeded in sending the message header, but now I
> need to send the message data (buffer). How can I do it? The problem is:
> what data type do I need to use for packing/unpacking? I tried
> OPAL_DATA_VALUE, but without success...
> 
> Thanks,





[OMPI devel] RML Send

2008-06-17 Thread Leonardo Fialho

Hi All,

I'm using RML to send log messages from a PML to an ORTE daemon (located
on another node). I succeeded in sending the message header, but now I
need to send the message data (buffer). How can I do it? The problem is:
what data type do I need to use for packing/unpacking? I tried
OPAL_DATA_VALUE, but without success...


Thanks,

--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478