Hi Eero,

On Wed, 28 May 2025 at 00:47, Eero Tamminen <o...@helsinkinet.fi> wrote:
> On 25.5.2025 15.05, Geert Uytterhoeven wrote:
> > On Thu, 22 May 2025 at 00:56, Eero Tamminen <o...@helsinkinet.fi> wrote:
> >> On 21.5.2025 10.06, Geert Uytterhoeven wrote:
> >>> I do keep it up-to-date locally, so I could provide these changes,
> >>> if you are interested.
> >>
> >> Yes, please!   (see below)
> >
> > Sorry for taking so long:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k.git/log/?h=atari-drm-wip-rebasing
>
> Thanks!
>
> I did boot testing on Hatari emulator with a minimal kernel config
> having atari_drm enabled, atafb disabled, FB & boot logo enabled.
>
> Under Falcon emulation:
> - RGB/VGA => works fine
> - Mono monitor => panic
>    "Kernel panic - not syncing: can't set default video mode"
> Under TT emulation:
> - RGB/VGA => boots, but console is black[1] (palette issue?)
> - Mono monitor => looks OKish[2], but has constant warnings:
> -----------------------------------
> WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_atomic_helper.c:1720
> drm_atomic_helper_wait_for_vblanks+0x1a0/0x1ee
> [CRTC:35:crtc-0] vblank wait timed out

I am not sure whether this is a bug in atari-drm, or just an issue when
using DRM on slow machines.

> -----------------------------------
>
> Under 030 ST/STe emulation:
> - RGB/VGA => boots, but console is black (palette issue?)
> - Mono monitor => looks OK, but has constant slowpath warnings with:
>    "[CRTC:35:crtc-0] vblank wait timed out"
>
> => Any advice on the issues?

Are these regressions in atari-drm, or do they happen with atafb, too?

> PS. I also profiled where most of time goes from "atari-drm" probing,
> until boot reaches user space.  On a minimal -Os built kernel, running
> on (emulated) 32Mhz 030 Falcon, in the default 640x480@4 resolution:
> ----------------------------------------------------------------
> Time spent in profile = 15.29712s.
> ...
> Used cycles:
>    22.37%  22.42%  25.35%   _transp
>    19.15%  19.19%  46.82%   atari_drm_fb_blit_rect.isra.0
>     8.09%   8.09%  13.80%   sys_copyarea
>     3.94%   3.95%   6.23%   sys_imageblit
>     3.69%   3.69%   3.69%   fb_copy_offset.isra.0
>     2.12%   2.13%   2.41%   atari_scsi_falcon_reg_read
>     2.03%   2.03%   2.03%   fb_address_forward
>     1.85%   1.85%  17.98%   fbcon_redraw_blit.constprop.0
>     1.81%   1.81%   2.04%   atari_keyb_init
>     1.78%   1.78%   1.98%   fb_reverse_long
>     1.58%   1.58%   1.90%   arch_cpu_idle
>     1.05%                   memcpy
>     0.95%                   memset
> ...
> ----------------------------------------------------------------
>
> => atari-drm blitting takes half the time during boot.

Yeah, conversion from chunky to planar is expensive.
Would be great to have a text console that operates directly on the
buffer used by the hardware...
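
To illustrate why this is costly, here is a naive sketch of a
chunky-to-planar conversion for one group of 16 pixels at 4 bits per
pixel. It is not the code in atari_drm_fb_blit_rect(); the function name
c2p_16px(), the one-byte-per-pixel input, and the plane-word output layout
(one 16-bit word per plane, leftmost pixel in the most significant bit)
are assumptions made for the example.

```c
#include <stdint.h>

#define NPLANES 4	/* assumed: 4 bits per pixel, as in 640x480@4 */

/*
 * Hypothetical helper: convert 16 chunky pixels (one byte each, only the
 * low NPLANES bits are used) into NPLANES 16-bit plane words.
 * Bit p of pixel x ends up in bit (15 - x) of planar[p].
 */
static void c2p_16px(const uint8_t chunky[16], uint16_t planar[NPLANES])
{
	for (int p = 0; p < NPLANES; p++) {
		uint16_t w = 0;

		for (int x = 0; x < 16; x++)
			w |= (uint16_t)((chunky[x] >> p) & 1) << (15 - x);

		planar[p] = w;
	}
}
```

Every output word depends on one bit from each of 16 input pixels, so
this naive form does 64 shift/mask/or steps per 16 pixels. Real
implementations usually use smarter bit-transposition tricks, but the
work per pixel remains much higher than for a plain copy.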

> Building kernel with -O2, changes above rather radically, both
> time-wise, and where that time goes:
> ----------------------------------------------------------------
> Time spent in profile = 6.54049s.
> ...
> Used cycles:
>    17.61%  17.61%  17.61%   sys_copyarea
>    11.18%  11.18%  13.11%   arch_cpu_idle
>     7.53%   7.55%   8.45%   atari_drm_fb_blit_rect.isra.0
>     4.26%   4.27%   4.76%   atari_keyb_init
>     2.70%   2.70%   2.93%   atari_scsi_falcon_reg_read
>     2.45%   2.45%  23.81%   fbcon_redraw_blit.constprop.0
>     2.35%   2.35%   2.48%   sys_imageblit
>     2.12%   2.12%   5.89%   atari_floppy_init
>     1.97%                   memset
>     1.31%                   memcpy
> ...
> Instruction cache misses:
>    27.14%  27.14%  27.14%   sys_copyarea
>     3.77%   3.77%   4.05%   atari_scsi_falcon_reg_read
> ...
> Data cache hits:
>    63.55%  63.55%  63.67%   atari_keyb_init
>     7.61%   7.62%   7.84%   atari_drm_fb_blit_rect.isra.0
>     3.86%   3.86%   3.86%   sys_copyarea <= not much hits for copying
> ...
> ----------------------------------------------------------------

So it would be worthwhile to factor out the code that is most
performance-critical into its own file, and use CFLAGS_foo.o += -O2
(or even -O3? or other options?) in the Makefile to build it with a
better optimization level.

> However, -O2 build has the downside that the resulting kernel Oopses
> once it reaches user-space, if 030 data cache emulation is enabled:
> ----------------------------------------------------------------
> Run /init as init process
> ...
> Instruction fault at 0x0041a256
> BAD KERNEL BUSERR

Interesting...

Thanks a lot for testing, and for your analysis!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
