Hi,

I submitted 100 boinc jobs to the batch system, where over 100 slots were free. This meant they started as soon as the SGE scheduler assigned worker nodes (WNs), all within 2 seconds of each other. Of the 100, 15 failed with the "Another scheduler instance is running" message, so I think this is consistent with the RPC-in-progress hypothesis. Contact with the project server seems to take around 3 seconds:

11-Jun-2010 16:05:59 [http://www.worldcommunitygrid.org/] Sending scheduler request: Project initialization.
11-Jun-2010 16:05:59 [http://www.worldcommunitygrid.org/] Requesting new tasks
11-Jun-2010 16:06:02 [World Community Grid] Scheduler request completed: got 0 new tasks
11-Jun-2010 16:06:02 [World Community Grid] Message from project server: Another scheduler instance is running for this host

I think I can reduce the 15% failure rate to something negligible by adding a random sleep before each client starts. Trying this, I see 33 out of 100 fail with a new error:

[World Community Grid] Message from project server: Not sending work - last request too recent: 5 sec

Looking back at my first test of 100, there were 5 of these too. I don't suppose anyone knows what the minimum time between requests is? I'll add a sleep and run boinc again if the first attempt fails quickly (i.e. with one of these messages).

Cheers,
Rod.

On 06/13/2010 06:29 AM, David Anderson wrote:
> There's no problem with multiple client instances having the same host
> name or IP address, as long as they run in separate data directories
> (and hence have different host IDs)
>
> There must be something else going on here;
> the "Another scheduler instance is running for this host" messages
> mean that a scheduler RPC is in progress for this host ID at this very
> moment.
> Since scheduler RPCs take about 0.1 second,
> this shouldn't always be the case.
>
> Does this problem happen with other project, or just WCG?
>
> -- David
>
> Rod Walker wrote:
>> Hi,
>> I make sure they have different data directories, but I do not know
>> how to change the device name which the project sees. It seems to me
>> like something that would be discouraged/prevented.
>> If I attach the boincs, running on the same host, to different
>> projects, then it would not complain right?
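
For illustration only, here is a rough Python sketch of the random-sleep-and-retry idea described earlier in this message. It is not the wrapper actually used in these tests. The `run_client` callable is hypothetical (it stands for whatever starts boinc on the worker node and returns its log text), and the 30 and 60 second delays are guesses, since the minimum interval between requests is not known here. Only the two error strings are taken from the logs above.

```python
import random
import time

# Server messages quoted in the logs above; both indicate a quick failure
# that is worth retrying after a pause.
RETRYABLE = (
    "Another scheduler instance is running for this host",
    "Not sending work - last request too recent",
)


def is_retryable(log_text):
    """Return True if the log contains one of the quick-failure messages."""
    return any(msg in log_text for msg in RETRYABLE)


def run_with_jitter(run_client, max_attempts=3, max_initial_delay=30.0,
                    retry_delay=60.0, sleep=time.sleep, rng=random.uniform):
    """Start the client after a random delay and retry on quick failures.

    run_client is a hypothetical callable that runs the client once and
    returns its log output as a string. The delay values are assumptions,
    not documented limits. Returns (ok, attempts, last_log).
    """
    # Spread out jobs that the batch system starts within the same second or two.
    sleep(rng(0.0, max_initial_delay))
    log_text = ""
    for attempt in range(1, max_attempts + 1):
        log_text = run_client()
        if not is_retryable(log_text):
            return True, attempt, log_text
        if attempt < max_attempts:
            # Wait a fixed interval plus jitter so retries do not collide again.
            sleep(retry_delay + rng(0.0, retry_delay))
    return False, max_attempts, log_text
```

The `sleep` and `rng` parameters exist only so the delays can be replaced in a dry run; in a real job they would be left at their defaults.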
