Hi Volodymyr,

On Wed, Dec 02, 2020 at 05:31:37PM +0100, Volodymyr Bendiuga wrote:
> Hi Vladimir, and thank you for extra-well formulated question.
>
> Actually, when I was drafting an answer to your question, I realized I
> had missed one very important point in the patch, namely, that SYNC
> packets in P2P mode should not simply be forwarded, but port’s peer
> delay should be added to the correction field prior to forwarding. I
> apologize for the confusion I may have unintentionally created.
>
> This is what I have missed in the patch in tc_fwd_event() function code-wise:
> if ((q->timestamping >= TS_ONESTEP) && (msg_type(msg) == SYNC))
> {
>         corr = tmv_to_TimeInterval(q->peer_delay);
>         corr += q->asymmetry;
>         msg->header.correction += host2net64(corr);
> }
> cnt = transport_send(p->trp, &p->fda, TRANS_DEFER_EVENT, msg);
>
> I will update the patch and send it anew.

Thanks for the question, or further code, are not really what I was
expecting. I was expecting a walk through the life of the Sync packet
as it traverses Node A, the linuxptp one-step TC, and then finally
Node B. I am not at the stage where I could look at code. I don't yet
understand what the hardware is supposed to do in your proposal, and
how that is going to do the job.

> Taking into account your diagram, P2P-TC calculates uplink delay
> between its port 1 and Node A, by means of PD_REQ/PD_RESP messages.
> This delay is then stored in linuxptp (q->peer_delay). In your case we
> suppose it is 800 ns. Downstream link delay (port 2 -> Node B) is
> disregarded by P2P TC, since Node B will use it on its side (over
> there it will be called uplink delay), upon reception of SYNC packet.
>
> When SYNC packet arrives to TC, we add uplink ports peer delay
> to correction field and forward it. TC’s port 2 will calculate
> residence time and add it to correction field. Here, the travel
> path of SYNC packet, from its ingress at t1, up to linuxptp and
> down back to driver, will be resolved in port 2 upon egress when
> t2 is taken ((t2 - t1) -> add to correction field.

How? The egress port has no idea what t1 was. Where is t1 recorded?

> Upon reception of the SYNC packet, Node B will have everything
> it needs to calculate the offset: origin time and correction
> value from SYNC packet, and its own uplink delay (700 ns).
> Nothing really special going on here, just regular workflow.

Yeah, ok, forget about the link delay, that is only 1500 ns in total in
my example, or 2.3% of the total latency that the TC must account for.
It is not that part that I'm interested in. I want to understand how
your proposal makes (t2 - t1) magically land in the correctionField of
the Sync message. What does the hardware need to do to make that happen?

> As for store-and-forward: unless you can switch it off, like
> placing your hw in cut-through mode, you may have to compensate
> for it in link asymmetry, but that depends on you HW design.

This really doesn't matter. The store-and-forward latency is included
in the (t2 - t1) measurement, so as long as t1 and t2 are taken as
timestamps at the MAC layer, the entire residence time should be
accounted for. But how, that is the question. I've read your emails a
few times and I do not understand this.

> I guess my explanation is totally redundant, since I’ve revealed the
> missing code … . Sorry once again.

No it isn't. I find it is still rather lacking. You mentioned the
Marvell 88E1548 but didn't fully explain what it does that is special
in order to comply with your rules.

As for phrases like "no hardware with support in mainstream Linux
fulfils 1-step TC directives imposed by IEEE 1588 V2 standard as of
today, and that’s tragic", let's tone it down a little. IEEE 1588 just
tells you behaviorally what needs to happen, not how to do it. You
would need to come up with a more convincing argument that there is at
least one piece of hardware out there marketed as one-step TC that acts
against the standard, let alone that no hardware follows the spec.

I asked my first question hoping that you could clarify what the
hardware needs to do, something which you either didn't do, or did too
subtly for me to understand. In situations like this, when you think
you have a solution, it should really be spelled out in bold and all
capitals, as something for everybody to take away.

So now I will speculate a bit, and assume that the hardware behavior
you need is for port 1 to subtract the t1 RX timestamp from the Sync
packet's correctionField, then for port 2 to add the t2 TX timestamp to
the correctionField, and for the hardware to send the Sync message to
the CPU and only to the CPU.

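To make that speculation concrete, here is a minimal sketch of the
subtract-at-ingress, add-at-egress idea. It is only a model under my
own assumptions, not code from any driver or from the patch: the struct
and function names are hypothetical, the correctionField is kept in host
byte order for simplicity, and t1 and t2 are assumed to come from the
same (or synchronized) clock.

```c
#include <stdint.h>

/* correctionField unit per IEEE 1588: nanoseconds scaled by 2^16. */
#define NS_TO_CORR(ns) ((int64_t)(ns) * 65536)

/* Hypothetical, simplified Sync message; correction in host byte order. */
struct sync_model {
	int64_t correction;
};

/* Assumed port 1 (ingress) behavior: subtract the RX timestamp t1. */
static void ingress_subtract_t1(struct sync_model *m, int64_t t1_ns)
{
	m->correction -= NS_TO_CORR(t1_ns);
}

/* Assumed port 2 (egress) behavior: add the TX timestamp t2. */
static void egress_add_t2(struct sync_model *m, int64_t t2_ns)
{
	m->correction += NS_TO_CORR(t2_ns);
}

/*
 * Walk one Sync message through the model TC. Software forwarding
 * happens between the two hardware steps and never needs to know t1.
 */
static int64_t forward_via_cpu(int64_t initial_corr, int64_t t1_ns,
			       int64_t t2_ns)
{
	struct sync_model m = { .correction = initial_corr };

	ingress_subtract_t1(&m, t1_ns);
	/* ... packet is delivered to the CPU, software picks port 2 ... */
	egress_add_t2(&m, t2_ns);

	return m.correction;
}
```

For example, forward_via_cpu(0, 1000000, 1063000) returns 63000 ns
scaled by 2^16, no matter how long the CPU held the packet between the
two timestamps.
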
This is the only situation I can see where the residence time would end
up contained in the Sync message while the message is forwarded by
software.

Let me ask you: based on what criterion do you call this generic?

For example, let's take the sja1105 switch. I am perfectly happy with
two-step timestamping on it, but it also has some one-step TC
abilities, and even though I don't care about them, it may be a good
subject to talk about. The manual for it is here:
https://www.nxp.com/docs/en/user-guide/UM10944.pdf

Basically, for two-step timestamping, you can install some packet traps
to the host port via MAC_FLTRES, and this will cause PTP packets to be
delivered only to the CPU. RX timestamps and two-step TX timestamps
will be collected for the PTP packets and those will be sent to the
application stack, in the way that is commonly understood by everybody.

Although the sja1105 can operate in a one-step TC mode, it is actually
incapable of taking a one-step timestamp of a packet. This means that
if you try to send a Sync message from the CPU, the switch will not
ensure that {originTimestamp + correctionField} contains the precise TX
timestamp. Not at all.

But actually, it doesn't even need to. In fact, the host port is not
special at all when operating in the one-step TC mode. The switch
doesn't record precise timestamps, it just increments the
correctionField with its residence time (i.e. it takes an internal
timestamp at the ingress of a PTP event message, another internal
timestamp at its egress, and rewrites the correctionField, adding this
delta to it).

To do its job properly, the sja1105 one-step TC must be left alone to
forward the packet autonomously and update the residence time in the
process.

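The autonomous residence-time update described above can be modelled
roughly as follows. This is a sketch of the behavior only, not of how
the sja1105 implements it: the function names are hypothetical, and the
only facts assumed are that the correctionField is an 8-byte big-endian
signed integer on the wire, in units of nanoseconds scaled by 2^16.

```c
#include <stdint.h>

/* Read a big-endian 64-bit value from the wire. */
static int64_t get_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];

	return (int64_t)v;
}

/* Write a 64-bit value to the wire in big-endian order. */
static void put_be64(uint8_t *p, int64_t val)
{
	uint64_t v = (uint64_t)val;

	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/*
 * Hypothetical model of a one-step TC forwarding a PTP event message
 * on its own: two internal timestamps, one correctionField rewrite.
 * Nothing outside the switch ever sees ingress_ns or egress_ns.
 */
static void tc_add_residence(uint8_t corr_field[8], uint64_t ingress_ns,
			     uint64_t egress_ns)
{
	int64_t corr = get_be64(corr_field);

	corr += (int64_t)(egress_ns - ingress_ns) * 65536;
	put_be64(corr_field, corr);
}
```

Note that the delta only covers the interval between the two internal
timestamps, so whatever happens to the packet outside that interval is
invisible to the switch.
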
What it must not be made to do is send the Sync message to the CPU just
for the CPU to send it back, because then the switch adds two small
residence times, and the big one (the software latency) is completely
missing from the calculation.

So this is kind of the background that I had in mind when I asked the
question. It is clear to me that the sja1105 could not be supported by
the "generic" model that you propose, which requires the Sync messages
to be sent to the CPU, and you blame that on the sja1105's lack of
compliance with the spec.

In fact the sja1105 does a fairly okay job as a one-step E2E TC for
something that requires zero configuration and zero intervention from
software. The only things that are missing in the zero-config solution
are:
- peer delay measurements, in order to update the TPDELIN and TPDELOUT
  registers.
- syntonization, to keep the local clock running at the same rate as
  the GM. Not sure how you would even calculate that, though, in the
  absence of any timestamping information.

To me, a generic solution is one that would work when ports 1 and 2
belong to the same hardware switch, but also when the
boundary_clock_jbod option is set to 1 (and of course, with the PHCs on
ports 1 and 2 synchronized appropriately). I fail to see a generic
solution in this sense, however.

_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel