On Thu, Apr 27, 2017 at 03:41:03PM +1000, David Mirabito wrote:
> * "Fixing" (if this is indeed a bug) was reasonably straight forward - more
> or less reordering steps 4,5,6 so that we wake the app only *after* we've
> unlocked the bit.

If your analysis was correct, then yes, indeed this is a driver bug.
Please submit a patch on netdev.

> Q0: Did I make any immediately bad assumptions in my quest to try stop
> these tx_timeouts?

Hard to tell, but your explanation seems reasonable to me.

> Q1: Is there any way for drivers to pass up a 'non-timestamp' to indicate
> to applications that it's never going to come?

No.

> I know there are cases where the packet may be dropped beyond driver's
> control so the timestamp won't arrive, but does it make sense for drivers
> to indicate to applications that it knows the timestamp will never come,
> particularly if the packet was sent?

I can't imagine that a driver would know this.

> Q2: Could it be within ptp4l's capabilities to detect such a 'no-timestamp
> possible' message on the errqueue and do something not quite so drastic as
> a full reset, especially if it's a transient -EAGAIN type response?

If you miss a Tx time stamp, then something is wrong.  Probably the
link is down, but it is hard to reliably know the cause.  I am
skeptical that this can really be improved in a practical way.
Really, we should fix the drivers, as you have done, or choose
non-broken HW.

[ BTW, if you don't like the long fault interval, just set
  fault_reset_interval to ASAP. ]

> This is not there today, but would it be sensible/allowable to try again a
> few times, with different sequence numbers, etc? Even if not the PTP
> protocol should survive a the occasional missing packets, without a full
> reset, just maybe the delay value gets a little out of date or whatnot, no?

I could imagine an option allowing the program to ignore a certain
number of missed Tx time stamps before throwing the fault.

> Q3: This all assumes "well behaved" apps that send one packet, receiving
> one timestamp before attempting to send another timestamped packet. Is this
> mandated, or could an app reasonably expect to send a few packets in a row?

Many current HW designs do not support this.

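To make the "ignore a certain number of missed Tx time stamps" idea
above concrete, here is a rough sketch of the bookkeeping such an option
might need.  This is only an illustration: the struct, the function, and
the max_missed threshold are all hypothetical names, and nothing like
this exists in linuxptp today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for tolerating missed Tx time stamps. */
struct tx_ts_tolerance {
	int max_missed; /* consecutive misses tolerated before faulting */
	int missed;     /* consecutive misses seen so far */
};

/*
 * Record the outcome of one time stamped transmission.
 * Returns true when the caller should raise a fault, that is, when
 * more than max_missed time stamps in a row have gone missing.
 * Any successful time stamp resets the count.
 */
static bool tx_ts_update(struct tx_ts_tolerance *t, bool got_timestamp)
{
	assert(t != NULL);
	if (got_timestamp) {
		t->missed = 0;
		return false;
	}
	t->missed++;
	return t->missed > t->max_missed;
}
```

With max_missed set to 0 this degenerates to the present behavior,
where the very first missing time stamp throws the fault.
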
> Sending a second packet will compete against the driver retrieving the
> timestamp of the first, with no feedback to the app whether it won or not
> and whether a timestamp may be expected. Does the API allow for more fancy
> HW with deeper tx-timestamp queues to be fully utilised?

The API allows fully asynchronous Tx time stamping.  In theory, you
could send a packet, remember that it deserves a time stamp, then go
on to other things.  Polling on the error queue would allow you to
later match CMSGs with the remembered transmitted packets.

We don't do it that way because 1) this complicates the code for
dubious benefit* and 2) that would limit the HW you could use.

* The only benefit I can see would be when sending messages at a very
  high rate.  So far, I have yet to hear that anyone has run into this
  limitation.

> Q4: Is there some conceptual difference between "Packet was dropped
> therefore no timestamp" and "Packet [maybe?] sent; wasn't able to get a TX
> timestamp for it"?

Well, there is a difference, but the poor application will never know
about it.

Thanks,
Richard

_______________________________________________
Linuxptp-devel mailing list
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
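For reference, the asynchronous scheme mentioned in the reply about
deeper tx-timestamp queues (send, remember the packet, match it later
from the error queue) could be sketched roughly as follows.  This shows
only the "remember and match" bookkeeping.  The table, its depth, and
the use of a 16 bit key (for instance a sequence number parsed from the
packet returned on the error queue) are assumptions made for
illustration, and the actual recvmsg() call with MSG_ERRQUEUE is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8 /* hypothetical number of outstanding packets */

/* One transmitted packet that is still waiting for its time stamp. */
struct pending_tx {
	bool in_use;
	uint16_t key; /* hypothetical match key, e.g. a sequence number */
};

struct pending_table {
	struct pending_tx slot[MAX_PENDING];
};

/* Remember a packet just sent.  Returns false if the table is full. */
static bool pending_add(struct pending_table *tab, uint16_t key)
{
	assert(tab != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!tab->slot[i].in_use) {
			tab->slot[i].in_use = true;
			tab->slot[i].key = key;
			return true;
		}
	}
	return false;
}

/*
 * Called with the key recovered from an error queue message.
 * Returns true and frees the slot if a remembered packet matches,
 * false if the time stamp belongs to nothing we are waiting for.
 */
static bool pending_match(struct pending_table *tab, uint16_t key)
{
	assert(tab != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (tab->slot[i].in_use && tab->slot[i].key == key) {
			tab->slot[i].in_use = false;
			return true;
		}
	}
	return false;
}

/* Number of packets still waiting for a time stamp. */
static size_t pending_count(const struct pending_table *tab)
{
	size_t n = 0;

	assert(tab != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (tab->slot[i].in_use)
			n++;
	}
	return n;
}
```

Entries that never match would also need some kind of timeout, which is
part of the extra complexity referred to in the reply.
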
