On Fri, Mar 24, 2006 at 09:23:19AM +0100, Phil Carns wrote:
> We have recently found some test scenarios where 30 seconds isn't really 
> long enough.  In particular, if you have the following combination:
> 
> - fast server with a lot of RAM
> - relatively high latency storage (old SAN hardware)
> - very heavy write workload

Pete and I went back and forth on this a while back when his 'perf'
benchmark would write out several hundred megs of data in a single
MPI_File_write, so there's another workload that triggered the
timeouts, and we don't even have all that much RAM on our test
machines.

> I think we are going to run with the two ServerJob timeouts set to 300 
> seconds (as is already done for the client), but I just wanted to pass 
> along the information in case there is interest in changing the stock 
> default values.

I was a little worried about cranking these up from a failover
perspective, but my gut says people write large I/O (checkpointing,
for one) a little more often than they set up failover.  Longer
ServerJob timeouts sound good to me, and we'll document somewhere how
to tune for failover.
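
For reference, a rough sketch of what the change might look like in the
server config file.  I'm assuming here that the "two ServerJob
timeouts" are the ServerJobBMITimeoutSecs and ServerJobFlowTimeoutSecs
options in the Defaults section; check the names against your own
config before copying this.

```
<Defaults>
    # assumed option names; 300 matches the client-side value Phil mentions
    ServerJobBMITimeoutSecs 300
    ServerJobFlowTimeoutSecs 300
</Defaults>
```

Sites that care more about quick failover than about very large writes
would presumably want to keep these values lower.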

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
