Hi Shmuel,

Am 07.12.2017 2:08 nachm. schrieb "Shmuel Levine" <[email protected]>:

Hi All,

I’m looking for some help and guidance regarding passing along data between
localities.  I’ve got some ideas which I think would work, but I’d really
like some feedback, guidance and/or suggestions on this.

<snip>

In general, all of the localities run the following conceptual loop:

a.                   Test for convergence

b.                   Run current iteration

c.                   Migrate data / integrate data, if required



There are 3 communications scenarios I need to account for:

   1. Broadcasting some information to all other nodes.  For example, if
   the convergence tests succeed on one node, notify other nodes that they can
   stop processing.
   2. Send interim results to primary node for aggregation and reporting
   3. Migrate data conditionally



I am going to assume – for now, at least – that the **best** approach for
each scenario might be different, so I’ve separated the details / my
thoughts into separate sections below.



## Broadcasting information to other nodes



It seems to me that this could handled fairly simply if each component had:

   - Boolean flag
   - Action to set flag



This way, each node would test for convergence and if the test succeeds, I
can use broadcast to call this action on all of the other nodes, something
like: 
hpx::lcos::broadcast_apply<set_converged_flag_action>(hpx::find_remote_localities()
);


That sounds like the best shot, yes. Some more suggestions here:
You can use your component ids directly here. One good approach would be to
register components with a basename and retrieve the vector using that same
basename. More information about that can be found here:
https://stellar-group.github.io/hpx/docs/html/header/hpx/runtime/basename_registration_hpp.html
The other suggestion is to use a direct action to set the flag in your
component, they don't spawn a separate thread when being invoked on the
remote end.



## Send interim results to primary node for aggregation and reporting



I spent a while typing up a rough proposal for an idea, and ended up with a
clumsy analog of receive_buffer ( I think -- I’m pretty sure that this is
what receive_buffer is intended to do, although that class isn’t documented
other than a regression test.)

Instead of using receive_buffer, you should use hpx::lcos::channel. It uses
receive_buffer internally and offers the abstractions you need.



## Migrate data conditionally between nodes



My expectations are that

   - I would like to send data based on some to-be-defined criteria which
   will likely be ‘data-centric’, not based on a fixed number of iterations.
   It is most likely that the receiving node will initiate the transfer (i.e.
   request data from one or more adjacent nodes).
   - I do not want the program to block waiting for data.  I am envisioning
   a test in the main control loop, which, if deemed necessary, would send a
   request to one or more adjacent nodes for data, and continue looping.  Once
   all of the new data has been received, then it can be incorporated at the
   beginning of the following iteration.  To be 100% clear, as an example: at
   iteration x, node N_d decides it needs new data from nodes N_c and N_e and
   executes an action on those 2 localities, then immediately starts iteration
   x+1.  Some time during x+1, the data from N_e arrives.  Then, the data from
   N_c arrives in the middle of iteration X+2.  The new data would be
   integrated into node N_d at the very beginning of iteration x+3.
   - Integrating received data cannot happen at an arbitrary time
   determined by the thread scheduler – this would cause major problems with
   the code execution.  In my mind, this means that the continuation for this
   future (or multiple futures)  causes the data to be staged into a temporary
   structure (e.g. a queue).



After mulling over this problem in my mind for hours, I think there are a
couple possibilities:



   1. Have the receiving node broadcast an action to the relevant nodes.
   For example:



Future<vector<data_type> > data_received = hpx::lcos::broadcast(adjacent_nodes,
migrate_data_action);

data_received.then([=](vector<data_type> incoming_data){
incoming_data_stage.push_back(std::move(incoming_data)); }


This sounds good.



   1. I could probably also use receive_buffer: since the data transfer is
   initiated by the receiving node, I know which was the last index used for
   receiving data.  This time, I would not use broadcast, but rather iterate
   over the vector of localities, calling a remote action on each node and
   providing the index to use. Then, I could create a vector of futures from
   calling receive on those same indices, and use when_all to set up the
   continuation, as before.

I think that's overkill if you pull the data (instead of pushing).


   1.



   1. My first thought was to use channels; however, I don’t know how this
   could possibly work.  Among other things, I don’t know whether it is
   possible to check if the channel is empty (testing for c.start() == c.end()
   definitely doesn’t work).





I apologize if this message is overly verbose – I was aiming for clarity,
but I tend to write too much in general…  I’d greatly appreciate any advice
on how to best handle these cases.  Also, any feedback on my rough concepts
would be sincerely appreciated.

I hope that I provided enough feedback to get you going. I'm not entirely
sure what to answer to your last question. I think what you envisioned in
the first point is a good start!
If the data you send around are PODs, then it might make sense to use
serialize_buffer to exchange data. It's optimized for zero copy transfers.



Thanks and best regards,

Shmuel Levine

_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users

Reply via email to