Hi Shmuel, Am 07.12.2017 2:08 nachm. schrieb "Shmuel Levine" <[email protected]>:
Hi All, I’m looking for some help and guidance regarding passing along data between localities. I’ve got some ideas which I think would work, but I’d really like some feedback, guidance and/or suggestions on this. <snip> In general, all of the localities run the following conceptual loop: a. Test for convergence b. Run current iteration c. Migrate data / integrate data, if required There are 3 communications scenarios I need to account for: 1. Broadcasting some information to all other nodes. For example, if the convergence tests succeed on one node, notify other nodes that they can stop processing. 2. Send interim results to primary node for aggregation and reporting 3. Migrate data conditionally I am going to assume – for now, at least – that the **best** approach for each scenario might be different, so I’ve separated the details / my thoughts into separate sections below. ## Broadcasting information to other nodes It seems to me that this could handled fairly simply if each component had: - Boolean flag - Action to set flag This way, each node would test for convergence and if the test succeeds, I can use broadcast to call this action on all of the other nodes, something like: hpx::lcos::broadcast_apply<set_converged_flag_action>(hpx::find_remote_localities() ); That sounds like the best shot, yes. Some more suggestions here: You can use your component ids directly here. One good approach would be to register components with a basename and retrieve the vector using that same basename. More information about that can be found here: https://stellar-group.github.io/hpx/docs/html/header/hpx/runtime/basename_registration_hpp.html The other suggestion is to use a direct action to set the flag in your component, they don't spawn a separate thread when being invoked on the remote end. ## Send interim results to primary node for aggregation and reporting I spent a while typing up a rough proposal for an idea, and ended up with a clumsy analog of receive_buffer ( I think -- I’m pretty sure that this is what receive_buffer is intended to do, although that class isn’t documented other than a regression test.) Instead of using receive_buffer, you should use hpx::lcos::channel. It uses receive_buffer internally and offers the abstractions you need. ## Migrate data conditionally between nodes My expectations are that - I would like to send data based on some to-be-defined criteria which will likely be ‘data-centric’, not based on a fixed number of iterations. It is most likely that the receiving node will initiate the transfer (i.e. request data from one or more adjacent nodes). - I do not want the program to block waiting for data. I am envisioning a test in the main control loop, which, if deemed necessary, would send a request to one or more adjacent nodes for data, and continue looping. Once all of the new data has been received, then it can be incorporated at the beginning of the following iteration. To be 100% clear, as an example: at iteration x, node N_d decides it needs new data from nodes N_c and N_e and executes an action on those 2 localities, then immediately starts iteration x+1. Some time during x+1, the data from N_e arrives. Then, the data from N_c arrives in the middle of iteration X+2. The new data would be integrated into node N_d at the very beginning of iteration x+3. - Integrating received data cannot happen at an arbitrary time determined by the thread scheduler – this would cause major problems with the code execution. In my mind, this means that the continuation for this future (or multiple futures) causes the data to be staged into a temporary structure (e.g. a queue). After mulling over this problem in my mind for hours, I think there are a couple possibilities: 1. Have the receiving node broadcast an action to the relevant nodes. For example: Future<vector<data_type> > data_received = hpx::lcos::broadcast(adjacent_nodes, migrate_data_action); data_received.then([=](vector<data_type> incoming_data){ incoming_data_stage.push_back(std::move(incoming_data)); } This sounds good. 1. I could probably also use receive_buffer: since the data transfer is initiated by the receiving node, I know which was the last index used for receiving data. This time, I would not use broadcast, but rather iterate over the vector of localities, calling a remote action on each node and providing the index to use. Then, I could create a vector of futures from calling receive on those same indices, and use when_all to set up the continuation, as before. I think that's overkill if you pull the data (instead of pushing). 1. 1. My first thought was to use channels; however, I don’t know how this could possibly work. Among other things, I don’t know whether it is possible to check if the channel is empty (testing for c.start() == c.end() definitely doesn’t work). I apologize if this message is overly verbose – I was aiming for clarity, but I tend to write too much in general… I’d greatly appreciate any advice on how to best handle these cases. Also, any feedback on my rough concepts would be sincerely appreciated. I hope that I provided enough feedback to get you going. I'm not entirely sure what to answer to your last question. I think what you envisioned in the first point is a good start! If the data you send around are PODs, then it might make sense to use serialize_buffer to exchange data. It's optimized for zero copy transfers. Thanks and best regards, Shmuel Levine _______________________________________________ hpx-users mailing list [email protected] https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
_______________________________________________ hpx-users mailing list [email protected] https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
