> Date: Thu, 7 Jan 2021 22:03:15 +1000
> From: Jonathan Matthew <jonat...@d14n.org>
> 
> On Wed, Jan 06, 2021 at 12:53:45PM +0100, Mark Kettenis wrote:
> > > Date: Wed, 6 Jan 2021 21:29:52 +1000
> > > From: Jonathan Matthew <jonat...@d14n.org>
> > > 
> > > On Wed, Jan 06, 2021 at 10:52:48AM +0100, Mark Kettenis wrote:
> > > > > Date: Wed, 6 Jan 2021 20:29:09 +1100
> > > > > From: Jonathan Gray <j...@jsg.id.au>
> > > > > 
> > > > > On Tue, Jan 05, 2021 at 10:28:20PM -1000, st...@wdwd.me wrote:
> > > > > > > I tested with a Protectli FW1 router (dmesg below) forwarding packets
> > > > > > > between two test machines. The latency spikes occur when running headless
> > > > > > > beginning with this commit:
> > > > > 
> > > > > As the interrupt is handled via msi it wouldn't be a shared interrupt
> > > > > related problem.
> > > > > 
> > > > > > Perhaps some drm kernel thread, but I can't think of anything that would
> > > > > > be doing work with no display connected.
> > > > 
> > > > Could be the kernel periodically polling whether a monitor is
> > > > attached.  Some generations of the Intel graphics hardware have broken
> > > > hardware hotplug detection.  And some rely on polling i2c code to
> > > > detect a VGA monitor.
> > > > 
> > > > Don't know this hardware.  If it has a VGA port that's left
> > > > unconnected it might help to actually connect it.  Maybe one of those
> > > > dongles that fake a VGA monitor would do.
> > > > 
> > > > Disabling inteldrm(4) would also help.
> > > 
> > > > On my home router, which is a similar kind of machine, various drm work queue
> > > > threads use a fair bit of cpu time.  I normally have inteldrm disabled just
> > > > for that - I hadn't noticed it causing latency problems, it just seemed wrong
> > > > that my router spent more cpu time on drm stuff than on forwarding packets.
> > > 
> > > inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics" rev 0x35
> > > drm0 at inteldrm0
> > > inteldrm0: msi, CHERRYVIEW, gen 8
> > > inteldrm0: 1024x768, 32bpp
> > > 
> > > It has one displayport and two hdmi, no vga, so hopefully analog hotplug
> > > isn't involved.
> > 
> > HDMI may be in the same boat.  And I think there are cases where the
> > VBIOS still advertises an (unconnected) VGA port even if there is no
> > physical output.
> > 
> > > $ vmstat -zi 
> > > interrupt                       total     rate
> > > irq0/clock                     241657      396
> > > irq0/ipi                        17830       29
> > > irq96/acpi0                         0        0
> > > irq144/inteldrm0                  469        0
> > > irq97/ahci0                     50442       82
> > > irq98/xhci0                        25        0
> > > irq176/azalia0                      1        0
> > > irq99/ppb0                          0        0
> > > irq114/re0                      12914       21
> > > irq100/ppb1                         0        0
> > > irq115/re1                      13261       21
> > > irq101/ppb2                         0        0
> > > irq116/athn0                        0        0
> > > irq102/ichiic0                      0        0
> > > irq145/com0                       118        0
> > > irq146/pckbc0                       0        0
> > > irq147/pckbc0                       0        0
> > > Total                          336717      551
> > > 
> > > This is what the drm workqueue threads have done in 15 minutes uptime:
> > > 
> > > > root     85080  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmlwq)
> > > > root     65272  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmtskl)
> > > > root     39235  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmlwq)
> > > > root     65215  0.0  0.0     0     0 ??  DK      9:06PM    0:01.00 (drmlwq)
> > > > root     62266  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmlwq)
> > > > root     20339  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmubwq)
> > > > root     62920  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmubwq)
> > > > root     58454  0.0  0.0     0     0 ??  DK      9:06PM    0:01.00 (drmubwq)
> > > > root     75983  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmubwq)
> > > > root     31352  0.0  0.0     0     0 ??  DK      9:06PM    0:00.01 (drmhpwq)
> > > > root     23634  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmhpwq)
> > > > root     95926  0.0  0.0     0     0 ??  DK      9:06PM    0:00.00 (drmhpwq)
> > > root     38038  0.0  0.0     0     0 ??  DK      9:06PM    0:07.10 (drmwq)
> > > root     10622  0.0  0.0     0     0 ??  DK      9:06PM    0:06.50 (drmwq)
> > > > root     68591  0.0  0.0     0     0 ??  DK      9:06PM    0:01.00 (drmhpwq)
> > > root     97843  0.0  0.0     0     0 ??  DK      9:06PM    0:11.41 (drmwq)
> > > root     36773  0.5  0.0     0     0 ??  DK      9:06PM    0:10.98 (drmwq)
> > > 
> > > > I can work on gathering some profiling data with dt etc. if that would help.
> > 
> > Yes, that may be helpful.
> > 
> > If you're not running X, there really shouldn't be much activity.
> > Really just the software hotplug detection and maybe some power
> > management.  But the monitor detection code uses delay(9) in some of
> > its code paths, so that may result in more accumulated CPU cycles than
> > one would naively expect.
> 
> With no X running and not much else happening on the system, running 
> btrace -e 'profile:hz:97 { printf("%s\n", kstack) }', virtually all the
> traces that aren't in acpicpu_idle look like this:
> 
> tsc_delay+0x68
> gmbus_wait+0x3c4
> gmbus_xfer_read+0x170
> do_gmbus_xfer+0x2ae
> gmbus_xfer+0x7b
> i2c_transfer+0x5a
> drm_do_probe_ddc_edid+0x24d
> drm_get_edid+0x4d
> intel_hdmi_set_edid+0x69
> intel_hdmi_detect+0xb1
> drm_helper_probe_detect+0xed
> output_poll_execute+0x14b
> taskq_thread+0x81
> proc_trampoline+0x1c
> 
> which looks like hdmi detection as you guessed.

So we may be able to improve this by improving the usleep_range()
emulation to actually sleep if the interval is long enough.  But
ultimately we need the high resolution timers that cheloha@ is working
on to fix this properly.
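
To illustrate the idea, here is a rough userspace sketch, not the actual
drm compat code.  It assumes a made-up cutoff (SLEEP_THRESHOLD_US) and
uses nanosleep(2) as a stand-in for whatever kernel sleep primitive would
really be used; the names choose_wait() and usleep_range_sketch() are
hypothetical.  Short waits still spin, longer ones give up the CPU:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical cutoff: waits at least this long sleep instead of spinning. */
#define SLEEP_THRESHOLD_US 1000UL

enum wait_kind { WAIT_SPIN, WAIT_SLEEP };

/* Monotonic clock reading in nanoseconds. */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Pick the wait strategy from the minimum requested interval. */
static enum wait_kind
choose_wait(unsigned long min_us)
{
	return (min_us >= SLEEP_THRESHOLD_US) ? WAIT_SLEEP : WAIT_SPIN;
}

/*
 * Wait for at least min_us microseconds.  max_us is only sanity-checked;
 * this sketch makes no attempt to honour the upper bound.
 */
static enum wait_kind
usleep_range_sketch(unsigned long min_us, unsigned long max_us)
{
	assert(min_us <= max_us);

	if (choose_wait(min_us) == WAIT_SLEEP) {
		struct timespec ts = {
			.tv_sec = (time_t)(min_us / 1000000UL),
			.tv_nsec = (long)(min_us % 1000000UL) * 1000L
		};

		/* Resume with the remaining time if a signal interrupts us. */
		while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
			continue;
		return WAIT_SLEEP;
	}

	/* Short interval: busy-wait, which is what burns the CPU cycles. */
	uint64_t deadline = now_ns() + (uint64_t)min_us * 1000ULL;
	while (now_ns() < deadline)
		continue;
	return WAIT_SPIN;
}
```

In a kernel whose sleeps are driven by the clock tick, the sleeping
branch can only be taken for intervals of about a tick or more, and
that is the part that needs the high resolution timers.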
