> > There are a couple of situations where we need to know if an Executor is
> > alive.
> > 1) Before we send a new thread to it
> > 2) While an Executor has been sent a thread, and is processing it, but
> > before it has finished
> >
> > As for 1, the approach of determining if an Executor is alive via
> > failed or successful communication per your suggestion makes sense.
> > This would also allow Executors to leave and return to the grid
> > transparently to the Manager.
> >
> > However, for 2, it's important for the Manager to know if an Executor
> > is actually alive and just busy processing a task, or whether it has
> > dropped off the grid.  The Manager can then react appropriately and
> > resend the thread as necessary.
> >
> > To do this, we could have a heartbeat mechanism that is activated upon
> > receipt of a thread, and deactivated upon completion of it.  If the
> > heartbeat stops, we lost the Executor.  Otherwise, it's just busy.
> >
> > Any other solutions?
> >
>
> This reminds me of the other problem with the heartbeat thread: you really
> do not get any information from the actual running thread, which could be
> hanging. The heartbeat is only useful to determine if the computer is
> reachable.
>
> You could also take the following approach:
> 1. Set a time-out value for the thread. This could be Application, Executor
> or Manager configurable.
> 2. Require longer running threads to signal back progress information. This
> could be done through events.
> 3. If the thread is not telling us what is going on in the allotted time
> then it gets the axe.
> 4. We might want to build in some event (ThreadTimeoutEvent for example)
> that the thread could subscribe to through which the Manager could request
> the progress information. This would give the application the option to
> either subscribe to this event or to tell us how things are going at certain
> milestones. I imagine that complex threads might want to do both.
>
> We could put in the client side framework the basic plumbing for this setup
> so there would be minimal extra coding required in the application.
>
> Tibor
>

What worries me about that approach is the burden it puts on the
application developer.  When I create a grid-enabled application, how
can I determine how long my thread will run, so that I know whether I
should code it to raise or respond to an occasional progress event?

For instance, let's say I have a PocketPC with the .NET Compact
Framework, a 200MHz Windows 98 machine and a 3.2GHz Windows XP
machine.  The thread doesn't take long to execute on the XP machine,
but takes much longer on the Win98 machine, and even longer on the
PocketPC.  If we're looking purely at a timeout, then in most real cases
the latter two machines will time out on almost every thread.

Second example: let's say each thread connects to a database server to
perform some operation, and the database server gets overloaded to the
point that it's rejecting connections until the current processes have
completed.  Your threads are still alive, but waiting on something
else... if it exceeds some timeout, the thread is killed, and probably
re-sent to another executor which will encounter the same problem.
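
To make the concern concrete, here is a rough sketch of the
timeout-plus-progress scheme as I understand it.  Everything in it
(ThreadWatchdog, report_progress, the 30-second figure) is made up for
illustration and is not existing Alchemi code.  Timestamps are passed
in explicitly so the behaviour is easy to follow.

```python
class ThreadWatchdog:
    """Tracks one grid thread's deadline (hypothetical helper, not Alchemi API)."""

    def __init__(self, timeout_seconds, started_at):
        self.timeout_seconds = timeout_seconds
        self.last_signal_at = started_at

    def report_progress(self, now):
        # Any progress signal from the thread pushes the deadline out again.
        self.last_signal_at = now

    def is_expired(self, now):
        # Expired means: no signal for longer than the timeout.
        return now - self.last_signal_at > self.timeout_seconds


# A slow executor needs 90 s for work that a fast one finishes in 10 s.
silent = ThreadWatchdog(timeout_seconds=30, started_at=0)
chatty = ThreadWatchdog(timeout_seconds=30, started_at=0)

# The "chatty" thread reports progress every 20 s; the silent one never does.
for t in (20, 40, 60, 80):
    chatty.report_progress(now=t)

silent_killed = silent.is_expired(now=90)
chatty_killed = chatty.is_expired(now=90)
```

The silent thread gets the axe on the slow machine even though it is
healthy, and the chatty one survives only because its developer guessed
a reporting interval shorter than the timeout.  That guess is exactly
the part I don't know how to make up front.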

....

However, I think I may have missed your point.

If we require every thread to respond to some AreYouStillRunning
method invocation, and we kill the thread based upon its failure to
respond (which I think is what you meant), that makes sense.  The
return value from that method can be completely scrapped; we're merely
interested in _if_ it responded, not necessarily what the response
was.  I think that calling a method makes more sense than coding to
respond to a ThreadTimeoutEvent, because we would need to be able to
determine if it responded to the event or not.
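
Roughly what I have in mind, as a sketch only: the probe is invoked
with a deadline, and the caller looks at whether it came back, not at
what it returned.  The names is_responsive, healthy_probe and hung_probe
are invented for this example, and a real Executor would be calling
into the thread's own object rather than a plain function.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as ProbeTimeout


def is_responsive(probe, timeout_seconds):
    """Return True if probe() returns within the timeout, whatever it returns."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(probe)
    try:
        future.result(timeout=timeout_seconds)  # return value is discarded
        return True
    except ProbeTimeout:
        return False
    finally:
        # Do not block here waiting for a probe that may never return.
        pool.shutdown(wait=False)


release = threading.Event()


def healthy_probe():
    return "anything at all"


def hung_probe():
    release.wait()  # simulates a thread stuck until we release it


healthy = is_responsive(healthy_probe, timeout_seconds=1.0)
hung = is_responsive(hung_probe, timeout_seconds=0.2)
release.set()  # let the simulated hung probe finish so the demo can exit
```

Note that this only tells us the thread can still answer a call; it
says nothing about whether it is making progress.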

And, I think it's a completely separate issue from determining if a
thread is still alive, but I do think there should be some facility
for the thread to send arbitrary progress messages back to the Manager
which are ultimately rerouted back to the GApplication.  That way an
application could be developed where threads are created and their
processing can be monitored by the application during execution, as
opposed to just knowing that the thread was sent out and that it
completed.
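
Something along these lines, purely as an illustration of the routing.
ManagerSketch, submit and relay_progress are hypothetical names, the
callback stands in for whatever the GApplication would expose, and the
message payload is deliberately left arbitrary.

```python
class ManagerSketch:
    """Hypothetical stand-in for the Manager's progress routing."""

    def __init__(self):
        self._owner_of = {}  # thread id -> owning application's callback

    def submit(self, thread_id, on_progress):
        # Remember which application to notify about this thread.
        self._owner_of[thread_id] = on_progress

    def relay_progress(self, thread_id, message):
        # Called when an Executor forwards a message from a running thread.
        handler = self._owner_of.get(thread_id)
        if handler is not None:
            handler(thread_id, message)


received = []  # what the application side ends up seeing
manager = ManagerSketch()
manager.submit(thread_id=7,
               on_progress=lambda tid, msg: received.append((tid, msg)))

manager.relay_progress(7, "parsed 10 of 40 records")
manager.relay_progress(7, {"percent": 50})
manager.relay_progress(99, "no such thread, so this is dropped")
```

The Manager never interprets the message; it only needs to know which
application owns the thread.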

Sorry if that didn't make any sense... operating on about 5 hours less
sleep than normal.

Jonathan


_______________________________________________
Alchemi-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/alchemi-developers
