On 13/05/20 17:03 +0300, Paul Irofti wrote:
> Hi,
>
> By far one of the most popular and frequently used system calls is
> clock_gettime(2). As a result the cost of kernel-userland transitions
> outweighs the actual work, thus I am proposing we make the data
> available directly from userland without passing through a system call.
>
> This has been a subject of discussion multiple times across the years
> and last I heard of it was at the p2k19 hackathon that I hosted in
> Bucharest where espie@ sent me a diff from one of his students(?). Being
> busy with organization I have not had the time to look at it and
> I am thus getting back to it just now due to robert@ prodding me again
> on the subject. The proposed diff is mine, not the student's.
>
>
> The technical bits.
>
> Please keep in mind that this is only a proof of concept. I am looking
> for ways to improve the current diff. As it is, it requires a flag day
> because it makes use of ELF aux vectors to export the data from the
> kernel.
>
> I have also played with exposing the data via separate ELF sections and
> with kbind-mmap alternatives. The first also involves a flag day and is
> more intrusive in my opinion, and the second I could not get to work. I
> think that would be the less intrusive way of doing it, possibly without
> a flag day, so if anyone knows how, please let me know.
>
> The supported clocks are just those that do not require process-specific
> data. The others can also be handled later if this diff is decided to
> be a good thing.
>
> Clock update inside the kernel is done at the end of tc_windup(). There
> might be better places to do it. Let me know where.
>
> The update currently does the work of clock_gettime(), but it can
> probably be changed to only update the timehands and move the logic
> elsewhere. Note that if we expose only the timehands to userland, most
> of the bintime functionality has to also be made available there. Or so
> I think.
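
The quoted text does not say how a userland reader would avoid seeing a
half-written timespec while the kernel updates it. One common technique,
and purely an assumption here rather than what the diff does, is a
sequence counter around the shared data. A minimal sketch follows; the
struct timekeep_sketch layout and the tk_update()/tk_read() names are
hypothetical.

```c
#include <stdatomic.h>
#include <time.h>

/* Hypothetical shared layout; not taken from the diff. */
struct timekeep_sketch {
	atomic_uint seq;		/* odd while an update is in progress */
	_Atomic long long sec;
	_Atomic long nsec;
};

/* Writer side: what an update at the end of tc_windup() could look like. */
void
tk_update(struct timekeep_sketch *tk, const struct timespec *ts)
{
	unsigned int seq = atomic_load(&tk->seq);

	atomic_store(&tk->seq, seq + 1);	/* mark update in progress */
	atomic_store(&tk->sec, (long long)ts->tv_sec);
	atomic_store(&tk->nsec, ts->tv_nsec);
	atomic_store(&tk->seq, seq + 2);	/* publish: even again */
}

/* Reader side: retry until a stable, even sequence number is seen. */
void
tk_read(struct timekeep_sketch *tk, struct timespec *ts)
{
	unsigned int before, after;

	do {
		before = atomic_load(&tk->seq);
		ts->tv_sec = (time_t)atomic_load(&tk->sec);
		ts->tv_nsec = atomic_load(&tk->nsec);
		after = atomic_load(&tk->seq);
	} while ((before & 1) != 0 || before != after);
}
```

The sketch uses the default sequentially consistent atomics for
simplicity; a real implementation would likely use weaker orderings or
explicit barriers.
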
>
> In userland, I wrapped the clock_gettime(2) syscall in libc. There, I
> search for the auxiliary vector and fetch the timespec data from it.
> As you can see in the diff, parts from the elf_exec header will have to
> be exposed to userland if we do it this way.
>
>
> Results.
>
> To test this diff you need to do a full release(8). I have tested this
> with multiple programs: test programs, base programs and packages.
> Nonetheless, this diff touches many important areas of our tree and is
> very fragile. I also probably missed changing some parts that required
> change due to libc or elf changes.
>
> If you see regressions, which you probably will, please let me know.
>
> Here is a stress test from robert@:
>
> robert@x202:/home/robert> time ./t && time ./t2
>     0m00.11s real     0m00.12s user     0m00.00s system
>     0m09.99s real     0m02.64s user     0m03.36s system
> t is clock_gettime() and t2 is SYS_clock_gettime()

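
For readers without the diff at hand, the userland side described above
(walk the auxiliary vector, use the exported timespec if present,
otherwise fall back to the real call) can be sketched roughly as below.
This is only an illustration: struct aux_entry, the AUX_SKETCH_* tag
values and both function names are hypothetical, and restricting the
fast path to CLOCK_REALTIME is an assumption, since the mail does not
list the supported clocks.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical mirror of an aux vector entry: a type tag and a value. */
struct aux_entry {
	long type;
	uintptr_t value;
};

#define AUX_SKETCH_NULL		0	/* terminator entry */
#define AUX_SKETCH_TIMEKEEP	2000	/* hypothetical tag, not from the diff */

/* Walk the vector until the terminator, looking for the timekeep tag. */
const struct timespec *
find_timekeep(const struct aux_entry *auxv)
{
	for (; auxv->type != AUX_SKETCH_NULL; auxv++) {
		if (auxv->type == AUX_SKETCH_TIMEKEEP)
			return (const struct timespec *)auxv->value;
	}
	return NULL;
}

/* Wrapper: use the exported data when available, else the normal path. */
int
sketch_clock_gettime(const struct aux_entry *auxv, clockid_t clock,
    struct timespec *ts)
{
	const struct timespec *shared = find_timekeep(auxv);

	/* Assumption: only CLOCK_REALTIME is served from the shared data. */
	if (shared == NULL || clock != CLOCK_REALTIME)
		return clock_gettime(clock, ts);

	*ts = *shared;
	return 0;
}
```

In a real libc the vector would come from the process startup code
rather than being passed in as an argument; it is a parameter here only
to keep the sketch self-contained.
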
I am currently rebuilding the packages that should see a significant
speedup. That small test makes 5 million calls to clock_gettime(), so it
is a bit exaggerated, but it still shows the difference.
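
The sources of t and t2 are not included in this thread. A comparable
pair of loops might look like the sketch below, assuming the system
still provides syscall(2) and SYS_clock_gettime; the function names and
the choice of CLOCK_MONOTONIC are mine, not taken from the test.

```c
#define _DEFAULT_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Like t: go through the libc wrapper. Returns the number of successes. */
long
bench_libc(long n)
{
	struct timespec ts;
	long i, ok = 0;

	for (i = 0; i < n; i++) {
		if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
			ok++;
	}
	return ok;
}

/* Like t2: always enter the kernel via the raw system call. */
long
bench_syscall(long n)
{
	struct timespec ts;
	long i, ok = 0;

	for (i = 0; i < n; i++) {
		if (syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts) == 0)
			ok++;
	}
	return ok;
}
```

Calling each function with 5000000 from its own small main() and running
the two binaries under time(1) would give a comparison along the lines
of the numbers quoted above.
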