This is a bit off-topic for this list, but I was wondering if anyone has experience working with ptpd (the Precision Time Protocol daemon, which follows the IEEE 1588 spec). The point of ptpd is to give better time accuracy than NTP: NTP gives accuracy on the order of milliseconds, whereas ptpd/IEEE 1588 gives accuracy on the order of microseconds.

    http://ptpd.sourceforge.net/
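
For reference, here is my rough understanding of how the basic IEEE 1588 delay request-response exchange estimates the clock offset from four timestamps. This is only a sketch of the arithmetic, assuming a symmetric network path; the function and variable names are my own, and I have not checked it against what ptpd actually does internally.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay.

    t1: master sends Sync            (read from the master clock)
    t2: slave receives Sync          (read from the slave clock)
    t3: slave sends Delay_Req        (read from the slave clock)
    t4: master receives Delay_Req    (read from the master clock)

    Assumes the path delay is the same in both directions.
    All timestamps must use the same unit (e.g. microseconds).
    """
    master_to_slave = t2 - t1   # one-way delay plus offset
    slave_to_master = t4 - t3   # one-way delay minus offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay


# Made-up example: slave clock runs 150 us ahead, one-way delay is 40 us.
offset, delay = ptp_offset_and_delay(t1=1000, t2=1190, t3=1300, t4=1190)
print(offset, delay)
```

If the two directions of the path have different delays, the difference shows up as an error in the estimated offset, which is presumably one reason network load matters.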

Having clocks coordinated to within microseconds could be quite helpful to MPI in multiple ways: more accurate MPI tracing output, the possibility of scheduling communication (particularly for MPI collectives) in oversubscribed networks, etc.

I'd like to give ptpd a whirl, but there's very little documentation, and I can't find any mailing lists or other points of contact where I could ask a few questions.

In particular, I would like to run ptpd in a way that I'm guessing would be fairly common in HPC environments: use NTP to get the time to my cluster's head node, and then use ptpd to synchronize the rest of the cluster to the NTP'ed head node. However, it's not clear to me how ptpd works -- how do I designate the head node as the "master"? What, exactly, do all the command line options to ptpd mean? (There's only a limited "--help" kind of message to explain them.) And so on.

I have a busy/active cluster, so I don't want to muck up the clock (and therefore potentially muck up NFS file timestamps). Some level of experimentation is ok, but I want to avoid unintentionally causing a large/bad effect, particularly in terms of NFS. I'm also curious as to how much network overhead ptpd incurs, both at startup and in steady-state operation.

If anyone has any insight or experience with ptpd, I'd love to hear it. Thanks!

--
Jeff Squyres
Cisco Systems
