"Executing thread hangs on the Executor. This means that the executing thread is not responding but the heartbeat thread keeps working. Currently the Executor remains in the hung state until it is restarted. This is not an acceptable solution. We have some ideas if you are interested in exploring this area."
"In case of a Manager failure there is no backup at this point. An immediate problem here is that once a Manager goes offline all Executors that are running threads will probably fail as well. It would be nice to have the Executor store the threads results and wait until the Manager comes back online. Other ideas are welcome."
Both of the areas you mentioned above look like good areas of interest. I want to explore both of them in more detail and then try to fix the problems described. I need your help with this: which parts of the code do I need to understand before I start working on them? Understanding the whole codebase is not easy. I want to fix these problems as soon as possible, and I am devoting all my time to Alchemi.
Hope to get a reply from you soon.
With regards,
Inderpreet Chopra
