On 04/26 07:11:57, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> > > From coverletter:
> > > "This patch set modifies time implementation to use TSC when running on a x86
> > > CPU that has invariant TSC CPU flag set. Otherwise, the same Linux system time
> > > is used as before. TSC is much more efficient both in performance and
> > > latency/jitter wise than Linux system call. This can be seen also with
> > > scheduler latency test which time stamps events with this API. All latency
> > > measurements (min, ave, max) improved significantly."
> >
> > odp_sched_latency currently uses clock_gettime. It is my understanding
> > that clock_gettime does not have the overhead of the system call. Can
> > you elaborate more on the 'improved significantly' part?
> >
>
> clock_gettime() uses the same TSC, but when you profile it with perf you can
> see tens of kernel functions including system call entry, RCU maintenance,
> etc.
>
> E.g. in sched_latency test kernel consumed about 10% of all the cycles. Also
> latency measurement results improved like this:
> * min >3x lower
> * avg 2x lower
> * max more stable and 50% lower

You might want to share more information on the environment where you're
seeing such significant improvements because the results on Broadwell do
not match the above interpretation.

PS - This patch series breaks the build on ARM.

before / after numbers on 2650v4 (HT disabled, Linux 4.4.6, GCC 5.3.1):

before:

HIGH priority
Thread    Avg[ns]    Min[ns]    Max[ns]    Samples      Total
-------------------------------------------------------------
1             619        351       2463       2103       2103
2             652        382       1509       2019       2019
3             637        360       1950       1867       1867
4             606        373       2328       2073       2073
5             611        371       2677       2096       2096
6             643        378       3045       2106       2106
7             631        354       1677       1923       1923
8             603        367       4721       2054       2054
9             617        373       1524       2111       2111
10            641        369       1808       2024       2024
-------------------------------------------------------------
Total         626        351       4721      20376      20376

LOW priority
Thread    Avg[ns]    Min[ns]    Max[ns]    Samples      Total
-------------------------------------------------------------
1           30498        480     914522       2097    4192201
2           44302        491     584995       1980    4192285
3           84258        680     515286       2001    4192437
4          127714        746     473280       2011    4192231
5           24568        455     724637       2109    4192208
6           42436        473     523936       2041    4192198
7           85438        554     486851       2017    4192381
8          126164        841     203464       2058    4192250
9           23085        492     671478       2091    4192192
10          41748        499     515091       1970    4192280
-------------------------------------------------------------
Total       62725        455     914522      20375   41922663

after:

HIGH priority
Thread    Avg[ns]    Min[ns]    Max[ns]    Samples      Total
-------------------------------------------------------------
1             523        202       4671       2152       2152
2             551        276       1540       2058       2058
3             492        257       1274       1928       1928
4             496        269       1201       2035       2035
5             520        252       1506       2165       2165
6             548        291       1540       2002       2002
7             491        251       1274       1969       1969
8             486        259       3007       1951       1951
9             528        276       1601       2091       2091
10            555        264       1611       2001       2001
-------------------------------------------------------------
Total         519        202       4671      20352      20352

LOW priority
Thread    Avg[ns]    Min[ns]    Max[ns]    Samples      Total
-------------------------------------------------------------
1           28303        432     828632       2141    4192152
2           43635        373     662999       2005    4192246
3           85032        579     664442       1991    4192376
4          128053        471     306203       2066    4192269
5           25289        431     591153       2139    4192139
6           41118        362     693192       2013    4192302
7           87817        523     696300       2090    4192335
8          128484        398     232439       2008    4192353
9           23565        507     716741       1969    4192212
10          41952        338     614098       1929    4192303
-------------------------------------------------------------
Total       63273        338     828632      20351   41922687

> -Petri
>
