On 04/25 22:47:11, Maxim Uvarov wrote:
> On 04/25/17 21:51, Brian Brooks wrote:
> > On 04/25 12:25:03, Brian Brooks wrote:
> >> On 04/24 08:07:58, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> >>>>> diff --git a/platform/linux-generic/arch/x86/odp_cpu_arch.c b/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> index c8cf27b6..9ba601a3 100644
> >>>>> --- a/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> +++ b/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> @@ -3,7 +3,14 @@
> >>>>> *
> >>>>> * SPDX-License-Identifier: BSD-3-Clause
> >>>>> */
> >>>>> +
> >>>>> +#include <odp_posix_extensions.h>
> >>>>> +
> >>>>> #include <odp/api/cpu.h>
> >>>>> +#include <odp_time_internal.h>
> >>>>> +#include <odp_debug_internal.h>
> >>>>> +
> >>>>> +#include <time.h>
> >>>>>
> >>>>> uint64_t odp_cpu_cycles(void)
> >>>>> {
> >>>>> @@ -31,3 +38,55 @@ uint64_t odp_cpu_cycles_resolution(void)
> >>>>> {
> >>>>> return 1;
> >>>>> }
> >>>>> +
> >>>>> +uint64_t cpu_global_time(void)
> >>>>> +{
> >>>>> + return odp_cpu_cycles();
> >>>>
> >>>> A cycle counter cannot always be used to measure time. Even on x86,
> >>>> odp_cpu_cycles() will return the value of RDTSC which is not actually
> >>>> representative of the cycle count. Even if the x86 processor is set
> >>>> to a fixed frequency, the Invariant TSC may run at a different fixed
> >>>> frequency. Please take a look at the odp_tick_t proposal here:
> >>>>
> >>>> https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing
> >>>>
> >>>
> >>> From the cover letter:
> >>> "This patch set modifies the time implementation to use TSC when
> >>> running on an x86 CPU that has the invariant TSC CPU flag set.
> >>> Otherwise, the same Linux system time is used as before. TSC is much
> >>> more efficient than a Linux system call, both in performance and in
> >>> latency/jitter. This can also be seen with the scheduler latency test,
> >>> which time stamps events with this API. All latency measurements
> >>> (min, ave, max) improved significantly."
> >>>
> >>> This function (cpu_global_time()) is called only after we have
> >>> checked that the TSC is invariant. In that case we also measure the
> >>> TSC frequency. The function is defined in the same file as
> >>> odp_cpu_cycles(), and the file is x86 specific. So we know what we
> >>> are doing, and are just reusing the code to read the TSC.
> >
> > What sort of timing accuracy is expected from the app?
> >
> > From benchmarking the maximum single-threaded rate of these reads:
> >
> > x86_64:
> >
> >   read        7 ns/op
> >   read_sync  22 ns/op
> >
> > A57:
> >
> >   read        4 ns/op
> >   read_sync  26 ns/op
> >
> > read_sync issues a synchronizing instruction for greater timing accuracy
> > but clearly takes more time to return the time value read from the core.
> >
>
>
> It has to depend on the CPU frequency.
>
> Maxim.

We are showing the difference between 'read' and 'read_sync' on the
same machine here.
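
For reference, here is a minimal sketch of the kind of same-machine
comparison meant here. It is not the benchmark that produced the numbers
quoted above. The names counter_read(), counter_read_sync(), now_ns() and
ns_per_op() are hypothetical, and issuing LFENCE before RDTSC as the
synchronizing instruction on x86_64 is an assumption about one common way
to do it. The non-x86 branch is only a placeholder so that the sketch
builds elsewhere.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#if defined(__x86_64__)
#include <x86intrin.h>
#define HAVE_TSC 1
#else
#define HAVE_TSC 0
#endif

/* Wall-clock reference in nanoseconds, used only to time the loops. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical 'read': plain counter read with no ordering guarantee. */
static uint64_t counter_read(void)
{
#if HAVE_TSC
    return __rdtsc();
#else
    return now_ns(); /* placeholder on non-x86 targets */
#endif
}

/* Hypothetical 'read_sync': wait for earlier instructions to complete
 * before reading the counter (assumed here: LFENCE, then RDTSC). */
static uint64_t counter_read_sync(void)
{
#if HAVE_TSC
    _mm_lfence();
    return __rdtsc();
#else
    __sync_synchronize(); /* placeholder full barrier, not equivalent */
    return now_ns();
#endif
}

/* Average cost of one call to fn, in ns. The figure includes the
 * indirect call and loop overhead, which is the same for both variants. */
static double ns_per_op(uint64_t (*fn)(void), uint64_t iters)
{
    volatile uint64_t sink = 0;
    uint64_t start, end, i;

    assert(iters > 0);

    start = now_ns();
    for (i = 0; i < iters; i++)
        sink = fn();
    end = now_ns();

    (void)sink;
    return (double)(end - start) / (double)iters;
}

/* Print both figures so they can be compared on the same machine. */
static void report_read_costs(uint64_t iters)
{
    printf("read      %.1f ns/op\n", ns_per_op(counter_read, iters));
    printf("read_sync %.1f ns/op\n", ns_per_op(counter_read_sync, iters));
}
```

Calling report_read_costs(10000000) from main() prints one line per
variant. Both loops run on the same core at the same frequency, so the
difference between the two figures is what matters.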