Hi Volodymyr,

On Wed, Dec 02, 2020 at 05:31:37PM +0100, Volodymyr Bendiuga wrote:
> Hi Vladimir, and thank you for extra-well formulated question.
>
> Actually, when I was drafting an answer to your question, I realized I
> had missed one very important point in the patch, namely, that SYNC
> packets in P2P mode should not simply be forwarded, but port’s peer
> delay should be added to the correction field prior to forwarding. I
> apologize for the confusion I may have unintentionally created.
>
> This is what I have missed in the patch in tc_fwd_event() function code-wise:
>               if ((q->timestamping >= TS_ONESTEP) && (msg_type(msg) == SYNC)) {
>                       corr = tmv_to_TimeInterval(q->peer_delay);
>                       corr += q->asymmetry;
>                       msg->header.correction += host2net64(corr);
>               }
>               cnt = transport_send(p->trp, &p->fda, TRANS_DEFER_EVENT, msg);
>
> I will update the patch and send it anew.

Thanks for the question, or further code, is not really what I was
expecting. I was hoping for a walk through the life of the Sync packet
as it traverses Node A, the linuxptp one-step TC, and finally Node B.
I am not at the stage where I could look at code. I don't yet understand
what the hardware is supposed to do in your proposal, or how that is
going to do the job.

>
> Taking into account your diagram, P2P-TC calculates uplink delay
> between its port 1 and Node A, by means of PD_REQ/PD_RESP messages.
> This delay is then stored in linuxptp (q->peer_delay). In your case we
> suppose it is 800 ns. Downstream link delay (port 2 -> Node B) is
> disregarded by P2P TC, since Node B will use it on its side (over
> there it will be called uplink delay), upon reception of SYNC packet.
>
>       When SYNC packet arrives to TC, we add uplink ports peer delay
>       to correction field and forward it. TC’s port 2 will calculate
>       residence time and add it to correction field. Here, the travel
>       path of SYNC packet, from its ingress at t1, up to linuxptp and
>       down back to driver, will be resolved in port 2 upon egress when
>       t2 is taken: (t2 - t1) -> add to correction field.

How? The egress port has no idea what t1 was. Where is t1 recorded?

>
>       Upon reception of the SYNC packet, Node B will have everything
>       it needs to calculate the offset: origin time and correction
>       value from SYNC packet, and its own uplink delay (700 ns).
>       Nothing really special going on here, just regular workflow.

Yeah, ok, forget about the link delay, that is only 1500 ns in total in
my example, or 2.3% of the total latency that the TC must account for.
It is not that part that I'm interested in. I want to understand how
your proposal makes (t2 - t1) magically land into the correctionField of
the Sync message. What does the hardware need to do to make that happen?

>       As for store-and-forward: unless you can switch it off, like
>       placing your hw in cut-through mode, you may have to compensate
>       for it in link asymmetry, but that depends on your HW design.

This really doesn't matter. The store-and-forward latency is included in
the (t2 - t1) measurement, so as long as t1 and t2 are taken as timestamps
at the MAC layer, the entire residence time should be accounted for. But
how, that is the question. I've read your emails a few times and I do
not understand this.

> I guess my explanation is totally redundant, since I’ve revealed the
> missing code … . Sorry once again.

No it isn't. I find it is still rather lacking. You mentioned the
Marvell 88E1548 but didn't fully explain what that does special in order
to comply with your rules. As for phrases like "no hardware with support
in mainstream Linux fulfils 1-step TC directives imposed by IEEE 1588 V2
standard as of today, and that’s tragic", let's tone it down a little.
IEEE 1588 just tells you behaviorally what needs to happen, not how to
do it. You would need to come up with a more convincing argument that
there is at least one piece of hardware out there marketed as one-step
TC that acts against the standard, let alone that no hardware follows
the spec.


I asked my first question hoping that you would clarify what the
hardware needs to do, something which you either didn't do, or did too
subtly for me to understand. In situations like this, when you think
you have a solution, it should really be spelled out in bold and all
capitals, as something for everybody to take away.
So now I will speculate a bit, and assume that the hardware behavior you
need is for port 1 to subtract the t1 RX timestamp from the Sync
packet's correctionField, then for port 2 to add the t2 TX timestamp to
the correctionField, and for the hardware to send the Sync message to
the CPU and only to the CPU. This is the only situation I can see where
the residence time would end up in the Sync message, with the message
being forwarded by software.
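
To make that speculation concrete, here is a minimal sketch in C of the
arithmetic I have in mind. The function names and the plain-nanosecond,
host-order representation are made up for illustration (the real
correctionField is scaled by 2^16 and travels in network byte order),
and none of this is taken from linuxptp or from any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: correction in plain nanoseconds, host byte order.
 * The real correctionField is scaled by 2^16 and sent in network order. */

/* Port 1 (ingress): hardware subtracts the RX timestamp t1. */
static int64_t ingress_subtract_t1(int64_t corr_ns, int64_t t1_ns)
{
	return corr_ns - t1_ns;
}

/* Port 2 (egress): hardware adds the TX timestamp t2. */
static int64_t egress_add_t2(int64_t corr_ns, int64_t t2_ns)
{
	return corr_ns + t2_ns;
}

/* Whole path: ingress hardware, software (adds peer delay), egress hardware. */
static int64_t tc_path_correction(int64_t corr_ns, int64_t t1_ns,
				  int64_t t2_ns, int64_t peer_delay_ns)
{
	/* t1 and t2 must come from the same, or synchronized, clocks. */
	assert(t2_ns >= t1_ns);

	corr_ns = ingress_subtract_t1(corr_ns, t1_ns);
	corr_ns += peer_delay_ns; /* added by software while it holds the packet */
	return egress_add_t2(corr_ns, t2_ns);
}
```

With invented timestamps t1 = 1000000 ns and t2 = 1065000 ns, plus the
800 ns peer delay from the example, tc_path_correction(0, 1000000,
1065000, 800) gives 65800 ns, i.e. (t2 - t1) + 800, regardless of how
long software held the packet in between.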

Let me ask you: based on what criterion do you call this generic?
For example, let's take the sja1105 switch. I am perfectly happy with
two-step timestamping on it, but it also has some one-step TC abilities,
and even though I don't care about them, it may be a good subject to
talk about. The manual for it is here:
https://www.nxp.com/docs/en/user-guide/UM10944.pdf
Basically, for two-step timestamping, you can install some packet traps
to the host port via MAC_FLTRES, and this will cause PTP packets to be
delivered only to the CPU. RX timestamps and two-step TX timestamps will
be collected for the PTP packets and those will be sent to the application
stack, in the way that is commonly understood by everybody.
Well, although the sja1105 can act as a one-step TC, it is
actually incapable of taking a one-step timestamp of a packet. Meaning
that if you try to send a Sync message from the CPU, the switch will not
ensure that the {originTimestamp + correctionField} contains the precise
TX timestamp. Not at all. But actually, it doesn't even need to. In
fact, the host port is not special at all when operating in the one-step
TC mode. The switch doesn't record precise timestamps, it just increments
the correctionField with its residence time (i.e. it takes an internal
timestamp at the ingress of a PTP event message, another internal timestamp
at its egress, and rewrites the correctionField adding this delta to it).
In order to do its job properly, the sja1105 one-step TC must be left
alone to forward the packet autonomously and update the residence time
in the process. It must not be forced to send the Sync message to the
CPU just for the CPU to send it back, because that would mean that the
switch adds two small residence times, and the big one (the software
latency) is completely missing from the calculation.
So this is kind of the background that I had in mind when I asked the
question. It is clear to me that the sja1105 could not be supported by
the "generic" model that you propose, which requires the Sync messages
to be sent to the CPU, and you blame that on sja1105's lack of
compliance to the spec.
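
A rough sketch of the "two small residence times" problem, in C. This
is only a model of a switch that adds (egress - ingress) per hop to the
correction, as described above; the function names and all timestamps
are hypothetical and are not taken from the sja1105 manual or driver.

```c
#include <stdint.h>

/* Hypothetical model: the switch adds its own per-hop residence time. */
static int64_t add_residence(int64_t corr_ns, int64_t t_in_ns, int64_t t_out_ns)
{
	return corr_ns + (t_out_ns - t_in_ns);
}

/* The switch forwards port 1 -> port 2 on its own: one hop, fully measured. */
static int64_t forward_autonomous(int64_t corr_ns,
				  int64_t t_in_p1_ns, int64_t t_out_p2_ns)
{
	return add_residence(corr_ns, t_in_p1_ns, t_out_p2_ns);
}

/* The Sync is trapped to the CPU and sent back: two hops through the
 * switch, with the software latency in between never measured. */
static int64_t forward_via_cpu(int64_t corr_ns,
			       int64_t t_in_p1_ns, int64_t t_out_cpu_ns,
			       int64_t t_in_cpu_ns, int64_t t_out_p2_ns)
{
	corr_ns = add_residence(corr_ns, t_in_p1_ns, t_out_cpu_ns);
	/* Nothing accounts for t_in_cpu_ns - t_out_cpu_ns (software latency). */
	return add_residence(corr_ns, t_in_cpu_ns, t_out_p2_ns);
}
```

With invented numbers, a packet entering port 1 at 0 ns, leaving
towards the CPU at 2000 ns, coming back from the CPU at 62000 ns and
leaving port 2 at 64000 ns gets only 4000 ns added by forward_via_cpu(),
while the true residence time is 64000 ns.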

In fact the sja1105 does a fairly okay job as a one-step E2E TC for a
thing that requires zero configuration and intervention from software
for that. The only things that are missing in the zero-config solution
are:
- peer delay measurements, in order to update the TPDELIN and TPDELOUT
  registers.
- syntonization, to keep the local clock running at the same rate as the
  GM. Not sure how you would even calculate that, though, in the absence
  of any time stamping information.

To me, a generic solution is one that would work when ports 1 and 2
belong to the same hardware switch, but also when the boundary_clock_jbod
option is turned to 1 (and of course, with the PHCs on port 1 and 2
synchronized appropriately). I fail to see a generic solution in this
sense, however.


_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
