On Thursday 20 October 2005 17:19, Chris Marble wrote:
> Robin Stevens wrote:
> > I am running mprime (v24.14.2 on my P4 & Athlon systems) on several
> > Debian GNU/Linux systems.  I keep finding that mprime on one or more
> > systems has stalled while trying to contact the server.
> >
> > When I finally notice and go to sort it out, the mprime process
> > seemingly responds to nothing but a 'kill -9'.  Sometimes this can be
> > many days later, when I notice a result is overdue, though I've set
> > everything to check in daily to make it easier to spot the hosts which
> > are not behaving.
>
> It's a real pain for me too.  If the server doesn't respond in the 1st 2
> contact tries then my mprime just sits.  I can manually do a "mprime -c"
> but I have to do the same kill -9 and restart the process to get more
> crunching.  I may add an hourly "mprime -c" command to cron to see if that
> fixes things.

Same here.

I think the best thing to do with a Linux implementation is to set the 
program to contact the server "hardly ever" but run mprime -c in a daily cron 
job.
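
A rough sketch of what such a cron job could call, so that a stalled contact 
cannot hang around for days. The only thing taken from this thread is the 
"mprime -c" command itself; the function name, the 10 minute limit and the 
idea that mprime returns 0 on success are my assumptions, not documented 
mprime behaviour.

```python
import subprocess


def contact_server(cmd=("mprime", "-c"), timeout_s=600):
    """Run one server contact; return True on success, False on failure or hang.

    Assumption: the command exits with status 0 when the contact worked.
    """
    try:
        # subprocess.run() kills the child itself if the timeout expires,
        # then raises TimeoutExpired.
        result = subprocess.run(list(cmd), timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False  # contact stalled and was killed
    except OSError:
        return False  # e.g. the binary is not on PATH
    return result.returncode == 0
```

A small script run daily from cron could call contact_server() and exit 
non-zero when it returns False, so that cron mails the failure. Note that 
this only bounds the separate contact run; it does nothing for a main mprime 
process that has already stalled.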

However, surely the proper fix is to have the server comms done in a separate 
thread - so the compute bit can just carry on - or, possibly better still, 
fork off a separate process to do the server comms, which terminates itself 
when the job is done.

The same thing seems to happen with the Windows client too. Is there an 
underlying problem with the server which is causing the lockup? For example, 
the server sets up a TCP session but the data doesn't transfer for some 
reason, and the session then persists, causing a problem on the retry after 
the timeout?
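
If that guess is right (and it is only a guess), the client side of it looks 
like the sketch below: the connection is established, but the reply never 
arrives. With a timeout on the socket the read gives up instead of blocking 
forever. fetch_with_timeout and its parameters are illustrative names, and 
this says nothing about how the real client talks to the server.

```python
import socket


def fetch_with_timeout(host, port, request, timeout_s=30.0):
    """Send request and return the first chunk of the reply, or None on a stall.

    The timeout passed to create_connection() also stays set on the returned
    socket, so both the connect and the later recv() are bounded.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(request)
            return sock.recv(4096)
    except OSError:
        # Covers timeouts as well as refused or reset connections.
        return None
```

The "with" block closes the socket on every path, so a retry starts from a 
fresh connection rather than reusing a session that has gone quiet.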

Could this behaviour even be DoSing the server?

Regards
Brian Beesley