Yes, I got the things to work distributing the channels from the main task, but I guess simply using hpx:async to spawn the tasks in an SPMD fashion is not the best to scale, but it’s the only way I have now to pass all these channels to each locality.
Using the other executors that you mentioned how can I pass different arguments to different localities (e.g., the channels)? Is there any example of using tree-like spawning or bulk_async_execute other then the tests? Thanks, Steve > On 12 Oct 2017, at 09:53, Hartmut Kaiser <[email protected]> wrote: > > Steve, > >> I see your point, but the problem is that I cannot know when the receiver >> is going to be executes and so when it will receive the data. >> >> Is there a way to wait on this channel until some task connects to and >> make a get? This way I can keep the channel (and the task) alive until the >> data has been consumed. I think there should be some mechanism to say >> “wait for N tasks to connect”. > > That's part of the deeper problem I hinted at. Let me think about this some > more before I respond. > >> I am reporting my other question: >> More in general, when you have many channels at scale, do you >> think is better to use a register_as/connect_to mechanism or to pass alle >> the necessary channels to each locality at the beginning when I do the >> initial SPMD spawn? > > We usually use the register_as/connect_to mechanism as it separates concerns > nicely. But YMMV, so please try it out for yourself what works best in your > case. Sending a channel over the wire might overcome the lifetime issues > we're discussing as this would keep the channel alive no matter what. > > Regards Hartmut > --------------- > http://boost-spirit.com > http://stellar.cct.lsu.edu > > >> >> Thank you, >> Steve >> >> >> >> >> >> On 12 Oct 2017, at 09:28, Hartmut Kaiser <[email protected]> wrote: >> >> Steve, >> >> >> Going back to the channels discussion, how do you know if a channel has >> been already registered or not? >> >> Look at the following code below (running with 2 localities): >> - they create or connect to the same channel name >> - if I introduce a delay in the receiver locality I get the following >> error: >> >> Assertion failed: (buffer_map_.empty()), function ~receive_buffer, >> file hpx_install/include/hpx/lcos/local/receive_buffer.hpp, line 101. >> >> I think I understand what is going on. This assert says that you're trying >> to destroy a channel with data sitting in the pipeline. This is caused by >> your code: >> >> { >> hpx::lcos::channel<int> c(hpx::find_here()); >> c.register_as(“name"); >> c.set(3); >> } >> >> Which makes the channel go out of scope before the other locality had a >> chance to connect and to extract the value. >> >> There is also a deeper problem lurking behind the scenes, but that's >> something I need to think about more before doing anything about it. >> >> Regards Hartmut >> --------------- >> http://boost-spirit.com >> http://stellar.cct.lsu.edu >> >> >> >> Here is the sample code: >> ————————— >> >> >> void spmd_int(DataFlow::ShardId cid){ >> if(cid%2==0){ >> hpx::lcos::channel<int> c(hpx::find_here()); >> c.register_as(“name"); >> c.set(3); >> } >> else{ >> usleep(1000000); // delay >> hpx::lcos::channel<int> c; >> c.connect_to("name"); >> hpx::future<int> p = c.get(); >> p.get(); >> } >> } >> >> >> int hpx_main(boost::program_options::variables_map& vm){ >> std::vector<hpx::naming::id_type> localities = >> hpx::find_all_localities(); >> >> int loc_count = 0; >> for(auto loc: localities){ >> >> spmd_int_action act; >> hpx::async(act, loc, loc_count); >> >> loc_count++; >> } >> } >> >> ————————— >> >> >> What is happening here? >> If I add a >> c.connect_to("name"); >> >> to the the same task that does the registration (after >> c.register_as(“name");) it works, but I don’t like it (and in my actual >> application I still get this error). >> >> More in general, when you have many of this channels at scale, do you >> think is better to use a register_as/connect_to mechanism or to pass alle >> the necessary channels to each locality at the beginning when I do the >> initial SPMD spawn? >> >> Thanks, >> Steve >> >> >> On 11 Oct 2017, at 14:53, Steve Petruzza <[email protected]> wrote: >> >> Thanks Hartmut, it all makes sense now. >> >> >> >> On 11 Oct 2017, at 14:51, Hartmut Kaiser <[email protected]> wrote: >> >> >> >> I think I’ve found a workaround. >> >> If I use a typedef as following: >> >> typedef std::vector<char> vec_char; >> >> HPX_REGISTER_CHANNEL(vec_char); >> >> It works, but if I try to use directly: >> HPX_REGISTER_CHANNEL(std::vector<char>) >> >> this gives me the error I reported before. >> The issue might be in the expansion of the macro HPX_REGISTER_CHANNEL. >> >> Yes, that confirms my suspicion. I will have a looks what's going on. >> >> Doh! The problem is that the (literal) parameter you give to the macro has >> to conform to the rules of a valid symbol name, i.e. no special characters >> are allowed (no '<', '>', etc.). Sorry, this has to be documented properly >> somewhere, and I forgot to mention it in the first place. >> >> The 'workaround' you propose is the only way to circumvent problems. There >> is nothing we can do about it. >> >> Also, wrt your comment that everything is working if you use >> hpx::lcos::local::channel instead - this is not surprising. The local >> channel type is representing a channel which can be used inside a given >> locality only (no remote operation, just inter-thread/inter-task >> communication), hence its name. Those channels don't require the use of >> the ugly macros, thus there is no problem. >> >> HTH >> Regards Hartmut >> --------------- >> http://boost-spirit.com >> http://stellar.cct.lsu.edu >> >> >> >> >> Thanks! >> Regards Hartmut >> --------------- >> http://boost-spirit.com >> http://stellar.cct.lsu.edu >> >> >> >> >> Steve >> >> >> >> >> On 10 Oct 2017, at 18:38, Steve Petruzza <[email protected]> wrote: >> >> Sorry, regarding the version that I am using it is the commit of your >> split_future for vector: >> >> Adding split_future for std::vector >> >> - this fixes #2940 >> >> commit 8ecf8197f9fc9d1cd45a7f9ee61a7be07ba26f46 >> >> Steve >> >> >> >> On 10 Oct 2017, at 18:33, Steve Petruzza <[email protected]> wrote: >> >> hpx::find_here() >> >> > >
_______________________________________________ hpx-users mailing list [email protected] https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
