I took another look at the root cause, and the excessive slowness in the emulator seems to stem specifically from writes to framebuffer memory: when I modify the code to write to normal memory instead, it does not seem to exhibit the slowdown. My hypothesis is that QEMU updates the display window on every write to that memory, which would explain the performance hit. On real hardware, those accesses would still be slower than accesses to normal memory, since they would be backed by video RAM, but probably nowhere near as slow as in QEMU.

Sadly, I can't think of a way to make this faster without adding a proper driver for the emulated video card to the kernel, which is probably not worth it for what is basically just a papercut. A compromise fix could be an option to turn scrolling into "rolling" instead, where the newest line wraps around to the top of the screen instead of shifting the entire screen contents.
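To make the "rolling" idea concrete, here is a minimal sketch in C. It is not HelenOS code: `console_t`, `scroll_line`, `roll_line` and the 25x80 character grid are hypothetical stand-ins for the real pixel framebuffer, and the byte counter only approximates the cost of framebuffer writes. Blanking the row after the newest line is my own assumption about how the wrap point could be made visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical dimensions for illustration; not HelenOS constants. */
#define ROWS 25
#define COLS 80

/* Stand-in for the framebuffer: a character grid instead of pixels. */
typedef struct {
    char cell[ROWS][COLS];
    size_t next_row;      /* row that receives the next line */
    size_t bytes_written; /* bytes stored into "framebuffer" memory */
} console_t;

static void console_init(console_t *con)
{
    memset(con->cell, ' ', sizeof(con->cell));
    con->next_row = 0;
    con->bytes_written = 0;
}

/* Store one line into a row, truncating or padding with spaces. */
static void put_row(console_t *con, size_t row, const char *text)
{
    assert(row < ROWS);
    size_t len = strlen(text);
    if (len > COLS)
        len = COLS;
    memcpy(con->cell[row], text, len);
    memset(con->cell[row] + len, ' ', COLS - len);
    con->bytes_written += COLS;
}

/* Classic scrolling: once the screen is full, shift every row up by one. */
static void scroll_line(console_t *con, const char *text)
{
    if (con->next_row == ROWS) {
        memmove(&con->cell[0], &con->cell[1], (ROWS - 1) * COLS);
        con->bytes_written += (ROWS - 1) * COLS;
        con->next_row = ROWS - 1;
    }
    put_row(con, con->next_row++, text);
}

/* "Rolling": overwrite rows in place and wrap around to the top. */
static void roll_line(console_t *con, const char *text)
{
    put_row(con, con->next_row, text);
    con->next_row = (con->next_row + 1) % ROWS;
    /* Assumption: blank the following row to mark where output wraps. */
    put_row(con, con->next_row, "");
}
```

Under these assumptions, printing 100 lines costs 152000 bytes of writes with scrolling (2000 for the first 25 lines, then 2000 per line for the remaining 75) versus 16000 bytes with rolling (160 per line). With a real pixel framebuffer the absolute numbers would be much larger, but the ratio depends mainly on the number of rows.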
On Fri, Nov 10, 2023, 12:57 AM Martin Decky <mar...@decky.cz> wrote:

> Dear all,
>
> let me just add to the discussion from today's online meeting that I
> certainly don't want to generally dismiss Jiri's claim that the
> redrawing of the framebuffer might cost a non-trivial fraction of the
> CPU time given a lot of output. The raw amount of data transferred is
> certainly at least one to two orders of magnitude larger compared to
> writing to a character output. This is also one of the reasons why the
> kernel benchmarks do not generate any output (contrary to the kernel
> tests).
>
> However, after we optimize the drawing routines, make sure that the
> memory attributes are configured correctly and try to avoid other
> possible pitfalls, I still think that the drawing mode of the kernel
> framebuffer should stay event-driven. It is a debugging tool and the
> overhead should be reasonable for the common use case of the few logging
> entries. Switching to a timer-driven mode would IMHO add a lot of
> unnecessary complexity to the microkernel.
>
> I can imagine that some code (e.g. tests that generate a lot of output)
> might even opt-in for some extra (but still conservative) optimizations
> such as half-page scrolling instead of line scrolling, to further lower
> the overhead.
>
> In the most desperate times, we can always resort to the old tricks,
> such as configuring an 8-bpp framebuffer :)
>
>
> Just as a single anecdotal point of reference, my non-accelerated
> event-driven Linux kernel console is approximately 25 times slower than
> the accelerated Gnome Terminal in Wayland (measured on printing random
> 80-character lines at the resolution of 2560x1440 and 32 bpp, same font
> dimensions).
>
> The overhead of the accelerated terminal emulator is still about 70 %
> (compared to printing to /dev/null), despite my Wayland compositor
> running in a timer-driven mode at 60 Hz. Surprisingly, xterm in XWayland
> is about 15 % faster.
>
> However, even the relatively slow Linux kernel console still manages to
> print 8000 lines per second. With, say, a 10-fold slowdown due to
> emulation, that is still 800 lines per second.
>
> If the HelenOS kernel console could achieve similar throughput (and
> there is no reason why it couldn't), I believe it should pose absolutely
> no issues for the roughly 300 lines of output during the typical boot-up.
>
>
> Best regards
>
> Martin Decky
>
> _______________________________________________
> HelenOS-devel mailing list
> HelenOS-devel@lists.modry.cz
> http://lists.modry.cz/listinfo/helenos-devel
>