Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log

2019-09-23 Thread Pavel Tatashin
On Tue, Jul 23, 2019 at 3:17 AM Marc Zyngier  wrote:
>
> On Mon, 22 Jul 2019 21:52:42 +0100,
> Pavel Tatashin  wrote:
> >
> > On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier  wrote:
> > >
> > > So far, we've let the arm64 kernel start its meaningful time stamping
> > > of the kernel log pretty late, which is caused by sched_clock() being
> > > initialised rather late compared to other architectures.
> > >
> > > Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> > > much earlier, which I had objections to. The reason for initialising
> > > sched_clock late is that a number of systems have broken counters, and
> > > we need to apply all kind of terrifying workarounds to avoid time
> > > going backward on the affected platforms. Being able to identify the
> > > right workaround comes pretty late in the kernel boot, and providing
> > > an unreliable sched_clock, even for a short period of time, isn't an
> > > appealing prospect.
> > >
> > > To address this, I'm proposing that we allow an architecture to chose
> > > to (1) divorce time stamping and sched_clock during the early phase of
> > > booting, and (2) inherit the time stamping clock as the new epoch the
> > > first time a sched_sched clock gets registered.

Hi Marc,

I know we briefly discussed this at plumbers, but I want to bring it
up again, because I am still puzzled why it is not possible to
stabilize unstable clock early in boot.

Here is an example where clock is stabilized:
https://soleen.com/source/xref/linux/kernel/sched/clock.c?r=457c8996#265

It uses a value that is read at last ticks to normalize clock, and
because ticks are not available early in boot instead we can make sure
that early in boot sched_clock() never returns value smaller than
previously returned value, and if we want to be extra careful, we can
also make sure that sched_clock() early in boot does not jump ahead by
more than some fixed amount of time i.e. more than one hour.

If sched_clock() is available early we will get the benefit of having
other tracers that use it to debug early boot information.

Pasha


Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log

2019-07-23 Thread Marc Zyngier
On Mon, 22 Jul 2019 21:52:42 +0100,
Pavel Tatashin  wrote:
> 
> On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier  wrote:
> >
> > So far, we've let the arm64 kernel start its meaningful time stamping
> > of the kernel log pretty late, which is caused by sched_clock() being
> > initialised rather late compared to other architectures.
> >
> > Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> > much earlier, which I had objections to. The reason for initialising
> > sched_clock late is that a number of systems have broken counters, and
> > we need to apply all kind of terrifying workarounds to avoid time
> > going backward on the affected platforms. Being able to identify the
> > right workaround comes pretty late in the kernel boot, and providing
> > an unreliable sched_clock, even for a short period of time, isn't an
> > appealing prospect.
> >
> > To address this, I'm proposing that we allow an architecture to chose
> > to (1) divorce time stamping and sched_clock during the early phase of
> > booting, and (2) inherit the time stamping clock as the new epoch the
> > first time a sched_sched clock gets registered.
> 
> Could we have a stable clock config for arm64: if it is known that
> this Linux build is going to run on non-broken firmware, and with a
> known stable cntvct_el0 register it can be optionally configured to
> use that stable sched_clock instead of generic clock that arm64 is
> using?

Hmmm. Then I guess a prerequisite is this patch:

https://lwn.net/Articles/490040/

as the number of systems this will reliably run on is a close
approximation to zero. Yes, counting is hard...

More seriously, we've been there before (cue the 32bit ARM port 8
years ago), and really don't want to go back to a time where we had
multiple config options for everything. There is one kernel, which
should run *reliably* on everything.

> This way, the early printk are going to be available on those
> systems without adding a different method for printk's only. It would
> also mean that other users of early sched_clock() such as ftrace could
> benefit from it.

See above. But maybe the real thing to do is to allow local_clock() to
be overridden. Same effect, same complexity, just hidden away from any
given subsystem.

Thanks,

M.

-- 
Jazz is not dead, it just smells funny.


Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log

2019-07-22 Thread Hanjun Guo
On 2019/7/22 18:33, Marc Zyngier wrote:
> So far, we've let the arm64 kernel start its meaningful time stamping
> of the kernel log pretty late, which is caused by sched_clock() being
> initialised rather late compared to other architectures.
> 
> Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> much earlier, which I had objections to. The reason for initialising
> sched_clock late is that a number of systems have broken counters, and
> we need to apply all kind of terrifying workarounds to avoid time
> going backward on the affected platforms. Being able to identify the
> right workaround comes pretty late in the kernel boot, and providing
> an unreliable sched_clock, even for a short period of time, isn't an
> appealing prospect.
> 
> To address this, I'm proposing that we allow an architecture to chose
> to (1) divorce time stamping and sched_clock during the early phase of
> booting, and (2) inherit the time stamping clock as the new epoch the
> first time a sched_sched clock gets registered.
> 
> (1) would allow arm64 to provide a time stamping clock, however
> unreliable it might be, while (2) would allow sched_clock to provide
> time stamps that are continuous with the time-stamping clock.
> 
> The last patch in the series adds the necessary logic to arm64,
> allowing the (potentially unreliable) time stamping of early kernel
> messages.
> 
> Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
> tested on a x86 guest.

This makes the boot log more useful and I can debug some time consuming
issue easier before the arch timer initialization, tested on my ARM64
server, I can see the timestamping from the start [1],

Tested-by: Hanjun Guo 

Thanks
Hanjun

[1]:
[0.00] Booting Linux on physical CPU 0x08 [0x481fd010]
[0.00] Linux version 5.2.0+ (root@localhost.localdomain) (gcc version 
9.0.1 20190312 (Red Hat 9.0.1-0.10) (GCC)) #45 SMP Tue Jul 23 09:17:48 CST 2019
[0.00] Using timestamp clock @100MHz
[0.74] efi: Getting EFI parameters from FDT:
[0.82] efi: EFI v2.70 by EDK II
[0.83] efi:  ACPI 2.0=0x3a30  SMBIOS 3.0=0x39f8  
MEMATTR=0x30996018  MEMRESERVE=0x30997e18
[0.000122] crashkernel reserved: 0x0ba0 - 0x2ba0 
(512 MB)
[0.000126] cma: Failed to reserve 512 MiB
[0.185111] ACPI: Early table checksum verification disabled
[0.185115] ACPI: RSDP 0x3A30 24 (v02 HISI  )
[0.185120] ACPI: XSDT 0x3A27 9C (v01 HISI   HIP08
  0113)
[0.185127] ACPI: FACP 0x39B1 000114 (v06 HISI   HIP08
 HISI 20151124)
[0.185134] ACPI: DSDT 0x39AB 0084E4 (v02 HISI   HIP08
 INTL 20181213)
[0.185139] ACPI: PCCT 0x39FB 8A (v01 HISI   HIP08
 HISI 20151124)
[0.185143] ACPI: SSDT 0x39F9 01021A (v02 HISI   HIP07
 INTL 20181213)
[0.185147] ACPI: BERT 0x39F5 30 (v01 HISI   HIP08
 HISI 20151124)
[0.185150] ACPI: HEST 0x39F3 000308 (v01 HISI   HIP08
 HISI 20151124)
[0.185154] ACPI: ERST 0x39EF 000230 (v01 HISI   HIP08
 HISI 20151124)
[0.185158] ACPI: EINJ 0x39EE 000170 (v01 HISI   HIP08
 HISI 20151124)
[0.185162] ACPI: SLIT 0x39B3 3C (v01 HISI   HIP08
 HISI 20151124)
[0.185166] ACPI: GTDT 0x39B0 7C (v02 HISI   HIP08
 HISI 20151124)
[0.185169] ACPI: MCFG 0x39AF 3C (v01 HISI   HIP08
 HISI 20151124)
[0.185173] ACPI: SPCR 0x39AE 50 (v02 HISI   HIP08
 HISI 20151124)
[0.185177] ACPI: SRAT 0x39AD 0007D0 (v03 HISI   HIP08
 HISI 20151124)
[0.185181] ACPI: APIC 0x39AC 001E6C (v04 HISI   HIP08
 HISI 20151124)
[0.185185] ACPI: IORT 0x39AA 001310 (v00 HISI   HIP08
 INTL 20181213)
[0.185189] ACPI: PPTT 0x3097 0031B0 (v01 HISI   HIP08
 HISI 20151124)
[0.185196] ACPI: SPCR: console: pl011,mmio32,0x9408,115200
[0.185208] ACPI: SRAT: Node 0 PXM 0 [mem 0x208000-0x2f]
[0.185210] ACPI: SRAT: Node 1 PXM 1 [mem 0x30-0x3f]
[0.185212] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x7fff]
[0.185213] ACPI: SRAT: Node 2 PXM 2 [mem 0x2020-0x202f]
[0.185215] ACPI: SRAT: Node 3 PXM 3 [mem 0x2030-0x203f]
[0.185221] NUMA: NODE_DATA [mem 0x2fe3c0-0x2f]
[0.185224] NUMA: NODE_DATA [mem 0x3fe3c0-0x3f]
[0.185226] NUMA: NODE_DATA [mem 0x202fe3c0-0x202f]
[0.185229] NUMA: NODE_DATA [mem 0x203ffdfde3c0-0x203ffdfd]




Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log

2019-07-22 Thread Pavel Tatashin
On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier  wrote:
>
> So far, we've let the arm64 kernel start its meaningful time stamping
> of the kernel log pretty late, which is caused by sched_clock() being
> initialised rather late compared to other architectures.
>
> Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> much earlier, which I had objections to. The reason for initialising
> sched_clock late is that a number of systems have broken counters, and
> we need to apply all kind of terrifying workarounds to avoid time
> going backward on the affected platforms. Being able to identify the
> right workaround comes pretty late in the kernel boot, and providing
> an unreliable sched_clock, even for a short period of time, isn't an
> appealing prospect.
>
> To address this, I'm proposing that we allow an architecture to chose
> to (1) divorce time stamping and sched_clock during the early phase of
> booting, and (2) inherit the time stamping clock as the new epoch the
> first time a sched_sched clock gets registered.

Could we have a stable clock config for arm64: if it is known that
this Linux build is going to run on non-broken firmware, and with a
known stable cntvct_el0 register it can be optionally configured to
use that stable sched_clock instead of generic clock that arm64 is
using? This way, the early printk are going to be available on those
systems without adding a different method for printk's only. It would
also mean that other users of early sched_clock() such as ftrace could
benefit from it.

>
> (1) would allow arm64 to provide a time stamping clock, however
> unreliable it might be, while (2) would allow sched_clock to provide
> time stamps that are continuous with the time-stamping clock.
>
> The last patch in the series adds the necessary logic to arm64,
> allowing the (potentially unreliable) time stamping of early kernel
> messages.
>
> Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
> tested on a x86 guest.
>
> [1] https://lore.kernel.org/patchwork/patch/1015110/
>
> Marc Zyngier (3):
>   printk: Allow architecture-specific timestamping function
>   sched/clock: Allow sched_clock to inherit timestamp_clock epoch
>   arm64: Allow early time stamping
>
>  arch/arm64/Kconfig  |  3 +++
>  arch/arm64/kernel/setup.c   | 44 +
>  include/linux/sched/clock.h | 13 +++
>  kernel/printk/printk.c  |  4 ++--
>  kernel/time/sched_clock.c   | 10 +
>  5 files changed, 72 insertions(+), 2 deletions(-)
>
> --
> 2.20.1
>


[PATCH 0/3] arm64: Allow early timestamping of kernel log

2019-07-22 Thread Marc Zyngier
So far, we've let the arm64 kernel start its meaningful time stamping
of the kernel log pretty late, which is caused by sched_clock() being
initialised rather late compared to other architectures.

Pavel Tatashin proposed[1] to move the initialisation of sched_clock
much earlier, which I had objections to. The reason for initialising
sched_clock late is that a number of systems have broken counters, and
we need to apply all kind of terrifying workarounds to avoid time
going backward on the affected platforms. Being able to identify the
right workaround comes pretty late in the kernel boot, and providing
an unreliable sched_clock, even for a short period of time, isn't an
appealing prospect.

To address this, I'm proposing that we allow an architecture to chose
to (1) divorce time stamping and sched_clock during the early phase of
booting, and (2) inherit the time stamping clock as the new epoch the
first time a sched_sched clock gets registered.

(1) would allow arm64 to provide a time stamping clock, however
unreliable it might be, while (2) would allow sched_clock to provide
time stamps that are continuous with the time-stamping clock.

The last patch in the series adds the necessary logic to arm64,
allowing the (potentially unreliable) time stamping of early kernel
messages.

Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
tested on a x86 guest.

[1] https://lore.kernel.org/patchwork/patch/1015110/

Marc Zyngier (3):
  printk: Allow architecture-specific timestamping function
  sched/clock: Allow sched_clock to inherit timestamp_clock epoch
  arm64: Allow early time stamping

 arch/arm64/Kconfig  |  3 +++
 arch/arm64/kernel/setup.c   | 44 +
 include/linux/sched/clock.h | 13 +++
 kernel/printk/printk.c  |  4 ++--
 kernel/time/sched_clock.c   | 10 +
 5 files changed, 72 insertions(+), 2 deletions(-)

-- 
2.20.1