If this is still about SSSP: I took that into account. That is the reason why there is a master task. Every task runs a while(updated) loop, and updated only becomes false when no updates were made globally, so no task leaves the loop before the others. PageRank uses the same logic. This is totally failsafe :p
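To make the termination logic concrete, here is a minimal sketch in plain Java. It does not use the Hama API: the class name, the `run` method and the per-task "pending updates" counters are hypothetical and only simulate the idea. Each task reports whether it changed anything in the current superstep, the master combines the reports at the barrier, and every task keeps looping until the combined flag is false.

```java
public class GlobalTerminationSketch {

    /**
     * Simulates supersteps for a set of tasks (hypothetical model, not Hama code).
     * pendingUpdatesPerTask[t] is how many supersteps task t still has local work for.
     * Returns the number of supersteps that every task takes part in.
     */
    public static int run(int[] pendingUpdatesPerTask) {
        int[] pending = pendingUpdatesPerTask.clone();
        boolean updated = true; // the flag every task loops on
        int supersteps = 0;

        while (updated) {
            // Local phase: each task does its work and records whether it updated anything.
            boolean[] localUpdate = new boolean[pending.length];
            for (int t = 0; t < pending.length; t++) {
                if (pending[t] > 0) {
                    pending[t]--;
                    localUpdate[t] = true;
                }
            }

            // Barrier: the master task combines the local flags into one global flag.
            boolean anyUpdate = false;
            for (boolean flag : localUpdate) {
                anyUpdate |= flag;
            }

            // The master's decision is what every task uses, so a task with no
            // local work still enters the next superstep while others are busy.
            updated = anyUpdate;
            supersteps++;
        }
        return supersteps;
    }

    public static void main(String[] args) {
        // Task 0 has work for 3 supersteps, task 1 for 1, task 2 for none.
        System.out.println(run(new int[] {3, 1, 0}));
    }
}
```

In this model the task with no work at all still participates in all four supersteps, because it only leaves the loop once the global flag is false.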
2011/9/23 Edward J. Yoon <[email protected]>

> In other words, all tasks should be entered into next step until whole
> job is completed successfully.
>
> On Sat, Sep 24, 2011 at 12:37 AM, Edward J. Yoon <[email protected]> wrote:
> > According to BSPMaster log messages, a few tasks of all are finished
> > with SUCCEEDED status during the iterations. If I remember correctly,
> > child processes calls bspPeer.close() finally.
> >
> > Then yes, others will be hanged at the step of comparing the size of
> > znode and initial task size.
> >
> > I wonder what happens if some task no longer need to communicate with others?
> >
> > On Fri, Sep 23, 2011 at 11:59 PM, Thomas Jungblut <[email protected]> wrote:
> >> Well, for SSSP example it might be correct.
> >> But you faced the hanging problems in randbench, too.
> >>
> >>> Moreover, we have to implement our own mechanisms for high availability if
> >>> we have own sync master server.
> >>
> >> +1
> >>
> >> 2011/9/23 Edward J. Yoon <[email protected]>
> >>
> >>> As I mentioned before, it's not a ZK problem.
> >>>
> >>> Moreover, we have to implement our own mechanisms for high availability if
> >>> we have own sync master server.
> >>>
> >>> On Sep 23, 2011, at 11:01 PM, Thomas Jungblut <[email protected]> wrote:
> >>>
> >>> > I have made a github for that:
> >>> > https://github.com/thomasjungblut/barriersync
> >>> >
> >>> > Check it out into your eclipse (the root directory failed for whatever
> >>> > reason).
> >>> > Start the server and then the clientemulator.
> >>> > Works like a real charm.
> >>> >
> >>> > Please consider this as an alternative. We should not roll out a 4.0 release
> >>> > with a not working barrier sync.
> >>> >
> >>> > 2011/9/23 Thomas Jungblut <[email protected]>
> >>> >
> >>> >>> Won't much different.
> >>> >>
> >>> >> Let's see.
> >>> >>
> >>> >> 2011/9/23 Edward J. Yoon <[email protected]>
> >>> >>
> >>> >>> What happens if some task no longer need to communicate with others?
> >>> >>>
> >>> >>> I didn't look at the code recently but I guess that the problem is
> >>> >>> related with comparison of znode size and task size.
> >>> >>>
> >>> >>>> I am going to write a RPC barrier sync. Zookeeper sucks in this case.
> >>> >>>
> >>> >>> Won't much different. Let's focusing on NG integration and In/Output
> >>> >>> system.
> >>> >>>
> >>> >>> On Fri, Sep 23, 2011 at 8:21 PM, Thomas Jungblut <[email protected]> wrote:
> >>> >>>> I am going to write a RPC barrier sync. Zookeeper sucks in this case.
> >>> >>>>
> >>> >>>> 2011/9/23 Edward J. Yoon <[email protected]>
> >>> >>>>
> >>> >>>>> P.S., Tested on 16 nodes using 10 tasks per node.
> >>> >>>>>
> >>> >>>>> On Fri, Sep 23, 2011 at 7:19 PM, Edward J. Yoon <[email protected]> wrote:
> >>> >>>>>> Hi,
> >>> >>>>>>
> >>> >>>>>> Today I ran the sssp example with 4GB sample file.
> >>> >>>>>>
> >>> >>>>>> At 32th step, some tasks are finished and others hang forever.
> >>> >>>>>>
> >>> >>>>>> Could anyone figure out this problem?
> >>> >>>>>>
> >>> >>>>>> Plus, there're too many INFO-level logs. Let's reduce them.
> >>> >>>>>>
> >>> >>>>>> Thanks.
> >>> >>>>>>
> >>> >>>>>> --
> >>> >>>>>> Best Regards, Edward J. Yoon
> >>> >>>>>> @eddieyoon

--
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: [email protected]
private: [email protected]
