Sounds like you were the lucky ones...

I do not think your definition of "drift" is very different from my understanding.

That said, the patch is meant to remove IOR's reliance on good time synchronization, such as ntp. It has no direct link to Lustre; it just so happens that I noticed this list has had some reports using IOR.

In addition, the granularity of ntp is mostly at the millisecond level, assuming your nodes sync frequently enough and are always able to do so. The patch brings the clock skew down to the best that a given MPI implementation can achieve, i.e. the granularity of a barrier (inherent in Init()).
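
To illustrate the idea of calibrating at a barrier, here is a minimal sketch in Python. It is not the patch itself and not IOR code. It assumes each "rank" is a thread, uses threading.Barrier as a stand-in for an MPI barrier, and models each node's unsynchronized clock as a fixed, made-up skew added to a common clock. The names run_rank, calibrate and skews are hypothetical.

```python
import threading
import time


def run_rank(rank, skew, barrier, base, elapsed):
    # Hypothetical per-node clock: a shared monotonic clock plus a fixed skew,
    # standing in for a node whose wall clock is not synchronized.
    def local_clock():
        return time.monotonic() + skew

    # Calibration: every rank leaves the barrier at roughly the same instant,
    # so each rank records its own local reading of that common instant.
    barrier.wait()
    base[rank] = local_clock()

    time.sleep(0.01)  # stand-in for the timed I/O phase

    # A later synchronized event, reported relative to the calibrated base.
    barrier.wait()
    elapsed[rank] = local_clock() - base[rank]


def calibrate(skews):
    # Run one simulated rank per skew value and return the calibrated
    # timestamps each rank reports for the same synchronized event.
    n = len(skews)
    barrier = threading.Barrier(n)
    base = [0.0] * n
    elapsed = [0.0] * n
    threads = [
        threading.Thread(target=run_rank, args=(r, skews[r], barrier, base, elapsed))
        for r in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return elapsed


if __name__ == "__main__":
    stamps = calibrate([0.0, 12.5, -40.0, 3.0])
    print("spread across ranks (s):", max(stamps) - min(stamps))
```

In this sketch the raw clocks disagree by tens of seconds, yet the calibrated timestamps differ only by how unevenly the ranks leave the barrier, which is the granularity referred to above.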

Weikuan


Brian J. Murrell wrote:
On Fri, 2007-02-09 at 08:59 -0500, Weikuan Yu wrote:

Hi.

I'd like to briefly explore this second problem...

2) Inaccurate reports due to any drift in time

-- Time drift is an annoying problem of IOR.

What do you mean exactly by time drift?  "Drift" to me means that the
difference in the clocks of machines actually grows, not that they are
simply just not synchronized yet constant.  If you have drift in terms
of clocks actually growing apart, do you know why this is?  Are you
using ntp to (try to) keep the clocks in sync?

What if you synchronize the clocks (i.e. with ntpdate or rdate, etc.) of
all of the nodes right before the IOR run?  Are they still all in sync
after the run ends?

IOR checks on the skew
of timestamps from each process. But it does not calibrate the timer at
the beginning. So it spews numerous warnings on systems with big drift
across nodes.

What warnings were you seeing?

What version of Lustre are you using?

b.


_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss


