Hi Weikuan,

I handle most of the IOR maintenance, and I agree that FILE_DELIMITER needs to 
be changed from ':' to something else.  '@' seems a good fit, since IOR users 
have had problems with ':' in the past, so I'll update the source to make '@' 
the new default.
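
To illustrate why '@' avoids the conflict, here is a minimal sketch of splitting 
a file-name list at the delimiter.  The helper split_file_names() is 
hypothetical and is not IOR's actual parsing code; the "ufs:" prefix in the 
usage note is one of the ROMIO file system prefixes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Proposed default from this thread: '@' in place of ':'. */
#define FILE_DELIMITER '@'

/*
 * Split `names` in place at each FILE_DELIMITER and store pointers to the
 * pieces in `out` (at most `max_out` of them).  Returns the number of
 * pieces stored.  Hypothetical helper for illustration only.
 */
static size_t split_file_names(char *names, char **out, size_t max_out)
{
    size_t count = 0;
    char *cursor = names;

    assert(names != NULL && out != NULL);
    while (cursor != NULL && count < max_out) {
        char *delim = strchr(cursor, FILE_DELIMITER);
        if (delim != NULL)
            *delim = '\0';              /* terminate the current piece */
        out[count++] = cursor;
        cursor = (delim != NULL) ? delim + 1 : NULL;
    }
    return count;
}
```

With this split, a list such as "ufs:/tmp/a@ufs:/tmp/b" yields two names, each 
keeping its ROMIO prefix intact.  With ':' as the delimiter, the same prefixes 
would have been cut off as separate (bogus) file names.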

As for the time drift, I agree that the clocks should be calibrated so that all 
nodes see the same starting time.  Currently, IOR skips recalibration if the 
skew between the earliest and latest times is less than 5 seconds.  If the skew 
is too egregious (> 5 seconds), then all tasks use the root task's time.  A 
better approach would be to have IOR adjust for time drift regardless of the 
wallclock outlier threshold.  I'll make this change.
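
As a rough sketch of the arithmetic involved (not the actual IOR change), the 
calibration amounts to computing a per-task offset against the root task's 
clock and applying it to every later timestamp.  The function names below are 
hypothetical, and the sketch assumes the timestamps were sampled at the same 
instant on every task; in an MPI program they would typically be taken right 
after a barrier, with the root's value broadcast to the other tasks.

```c
#include <assert.h>
#include <stddef.h>

/* Offset to add to a task's local clock so it agrees with the root's clock. */
static double clock_offset(double root_time, double local_time)
{
    return root_time - local_time;
}

/* Skew across tasks: latest sampled time minus earliest sampled time. */
static double wallclock_skew(const double *times, size_t n)
{
    double earliest, latest;
    size_t i;

    assert(times != NULL && n > 0);
    earliest = latest = times[0];
    for (i = 1; i < n; i++) {
        if (times[i] < earliest)
            earliest = times[i];
        if (times[i] > latest)
            latest = times[i];
    }
    return latest - earliest;
}

/* Apply the offset to a later local timestamp, with no threshold check. */
static double calibrated_time(double local_now, double offset)
{
    return local_now + offset;
}
```

For example, if the root samples 100.0 and another task samples 102.5 at the 
same instant, that task's offset is -2.5, so a later local reading of 110.0 is 
reported as 107.5.  The skew is still available for the deviation message, but 
it no longer decides whether the adjustment is applied.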

Also, I am going to remove the "WARNING: Time deviation . . ." message.  (I 
agree it can be annoying.)  I will leave the "Wall clock deviation: X.Y sec", 
however.  There are cases where we need to see how badly the nodes are out of 
synchronization timewise.  I don't think this is too intrusive to the output.

If you have any follow-up comments to the lustre-discuss list on these changes 
or on IOR in general, please cc me as well.

Thanks,

  --  Bill.

> Date: Fri, 09 Feb 2007 08:59:08 -0500
> From: Weikuan Yu <[EMAIL PROTECTED]>
> To: lustre <[email protected]>
> Subject: [Lustre-discuss] IOR patch
>
> Hi,
>
> I could not find an IOR list. I figured that a lot of folks on this
> list use IOR and might be interested. Here is a patch to eliminate
> two oversights in IOR.
>
> 1) Conflict in the usage of FILE_DELIMITER ':'
> 2) Inaccurate reports due to any drift in time
>
> Note:
> -- ':' has long been used by ROMIO to specify the file system
> type. Replacing it with '@' solves the problem, and file system
> specification is possible again.
> -- Time drift is an annoying problem in IOR. IOR checks the skew
> of timestamps from each process, but it does not calibrate the timer at
> the beginning. So it spews numerous warnings on systems with big drift
> across nodes. In addition, it reports wrong numbers for the IO rate. The
> added recalibration makes your numbers more accurate, even if you had
> not noticed a problem before.
>
> Let me know if you have any comments.
>
> Thanks,
> Weikuan
> ++++++++++++++++++++++++++++
> Weikuan Yu <+> 1-865-574-7990
> http://www.csm.ornl.gov/~wyu/

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
