> This looks good to me.  Does it work as intended?  Or is that something for
> Stan to try at scale?

All I can say is that I applied the patch to opensm, and it ran successfully on 
my two-node cluster.  Amazing, I know.  I need Stan to test across the larger
cluster.

> > Moved the calculation of the timeout time to inside the critical
> > section to improve its accuracy in case an attempt to acquire the
> > critical section blocks.
> 
> How does this improve accuracy?  I suppose it depends on whether the timeout
> time is relative to the client making the call, or the call returning.
> Having the timeout calculated before the critsec improves the former,
> calculating it under lock improves the latter.

I was worried about the time between setting timeout and using it:

+       timeout = cl_get_time_stamp() + (((uint64_t)time_ms) * 1000);
+       if ( !p_timer->timeout_time || timeout < p_timer->timeout_time )

If timeout is set before the critical section and the thread then blocks, the
value is stale by the time it is compared, so the if check is more likely to
succeed than if timeout is set after the lock is acquired.  The result is that
the timer may be adjusted, which would push its expiration out _further_ than
it currently is.
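
To make the concern concrete, here is a small standalone simulation.  It is
only a sketch, not the opensm/complib code: sim_timer_t, sim_timer_trim and the
explicit clock arguments are hypothetical names invented for illustration, and
it assumes that re-arming the timer is relative to the moment the re-arm
happens.  Only the timeout calculation and the if check mirror the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the timer; only timeout_time mirrors the patch. */
typedef struct {
    uint64_t timeout_time;  /* absolute expiry in microseconds, 0 = idle */
} sim_timer_t;

/*
 * Simulate one trim request.
 *   call_us           clock reading when the caller entered the function
 *   locked_us         clock reading once the critical section was acquired
 *   stamp_inside_lock true: take the timestamp after acquiring the lock
 *                     false: take it before (the pre-patch ordering)
 * Returns true if the timer was re-armed.
 */
static bool sim_timer_trim(sim_timer_t *t, uint64_t call_us,
                           uint64_t locked_us, uint32_t time_ms,
                           bool stamp_inside_lock)
{
    uint64_t now = stamp_inside_lock ? locked_us : call_us;
    uint64_t timeout = now + (((uint64_t)time_ms) * 1000);

    if (!t->timeout_time || timeout < t->timeout_time) {
        /* Assumption: the re-arm is relative to the current time. */
        t->timeout_time = locked_us + (((uint64_t)time_ms) * 1000);
        return true;
    }
    return false;
}
```

With a timer due at 1,000,000 us, a 900 ms trim requested at time 0, and a
200,000 us wait for the lock, the stale timestamp passes the check and the
timer ends up at 1,100,000 us, later than before.  Taking the timestamp under
the lock fails the check and leaves the timer alone.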

- Sean
_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
