Re: [PATCH 0/3] fs: reduce export usage of kernel_read*() calls

2020-05-15 Thread Luis Chamberlain
On Wed, May 13, 2020 at 11:17:36AM -0700, Christoph Hellwig wrote:
> Can you also move kernel_read_* out of fs.h?  That header gets pulled
> in just about everywhere and doesn't really need functions not related
> to the general fs interface.

Sure, where should I dump these?

  Luis

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: kexec: arm: possible overwrite of initrd

2020-05-15 Thread Russell King - ARM Linux admin
On Fri, May 15, 2020 at 03:57:12PM +0200, Corentin Labbe wrote:
> Hello
> 
> Following https://lkml.org/lkml/2020/4/6/96 I was able to boot my Cubieboard4
> via kexec reliably.

You can try increasing the kernel size that kexec thinks the kernel
needs, but it should be extremely accurate with modern kexec.

--image-size $((0x01dc8154 + 0x10000))

will add 64k on top of what you currently have.  Note where the first
figure comes from (you'll find it in the debug output, see
"Resulting kernel space").
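The arithmetic behind that option can be checked outside the shell. The following is only a sketch: it assumes the intended increment is 64 KiB (0x10000), as the prose states, and takes the first figure from the "Resulting kernel space" line of the reporter's debug output. The helper name is made up for illustration and is not part of kexec-tools.

```python
# Sketch only: reproduce the shell arithmetic behind --image-size.
RESULTING_KERNEL_SPACE = 0x01dc8154  # "Resulting kernel space" from the kexec --debug output
EXTRA_HEADROOM = 64 * 1024           # assumed 64 KiB of extra room, as described in the text


def padded_image_size(kernel_space: int, headroom: int = EXTRA_HEADROOM) -> int:
    """Return the value to pass to --image-size (hypothetical helper)."""
    return kernel_space + headroom


image_size = padded_image_size(RESULTING_KERNEL_SPACE)
# Print in decimal, which is what the shell's $(( )) expansion would produce.
print(f"--image-size {image_size}")
```

A larger headroom can be tried the same way by passing a different second argument.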

The best I can say is try playing around with that - but, kexec's
calculations should be spot on to stop the booting kernel from
overwriting the initrd.

The only way to debug that is to get the booted kernel to hexdump the
initrd so it's possible to see what happened to it.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up



Re: [PATCH v5] kernel: add panic_on_taint

2020-05-15 Thread Luis Chamberlain
On Fri, May 15, 2020 at 01:55:02PM -0400, Rafael Aquini wrote:
> Analogously to the introduction of panic_on_warn, this patch introduces a
> kernel option named panic_on_taint in order to provide a simple and generic
> way to stop execution and catch a coredump when the kernel gets tainted by
> any given flag.
>
> This is useful for debugging sessions as it avoids having to rebuild the
> kernel to explicitly add calls to panic() into the code sites that introduce
> the taint flags of interest. For instance, if one is interested in proceeding
> with a post-mortem analysis at the point a given code path is hitting a bad
> page (e.g. unaccount_page_cache_page() or slab_bug()), a coredump can be
> collected by rebooting the kernel with 'panic_on_taint=0x20' appended to the
> command line.
>
> Another, perhaps less frequent, use for this option would be as a means of
> assuring a security policy case where only a subset of taints, or no single
> taint (in paranoid mode), is allowed for the running system.
> The optional switch 'nousertaint' is handy in this particular scenario,
> as it will avoid userspace induced crashes by writes to sysctl interface
> /proc/sys/kernel/tainted causing false positive hits for such policies.
> 
> Suggested-by: Qian Cai 
> Signed-off-by: Rafael Aquini 

Reviewed-by: Luis Chamberlain 

  Luis



[PATCH v5] kernel: add panic_on_taint

2020-05-15 Thread Rafael Aquini
Analogously to the introduction of panic_on_warn, this patch introduces a kernel
option named panic_on_taint in order to provide a simple and generic way to stop
execution and catch a coredump when the kernel gets tainted by any given flag.

This is useful for debugging sessions as it avoids having to rebuild the kernel
to explicitly add calls to panic() into the code sites that introduce the taint
flags of interest. For instance, if one is interested in proceeding with a
post-mortem analysis at the point a given code path is hitting a bad page
(e.g. unaccount_page_cache_page() or slab_bug()), a coredump can be collected
by rebooting the kernel with 'panic_on_taint=0x20' appended to the command line.

Another, perhaps less frequent, use for this option would be as a means of
assuring a security policy case where only a subset of taints, or no single
taint (in paranoid mode), is allowed for the running system.
The optional switch 'nousertaint' is handy in this particular scenario,
as it will avoid userspace induced crashes by writes to sysctl interface
/proc/sys/kernel/tainted causing false positive hits for such policies.
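The behaviour described above can be modelled in a few lines. This is an illustrative sketch, not the kernel code: the function names are hypothetical, and the only facts taken from the description are that the parameter is a hexadecimal bitmask optionally followed by ",nousertaint", that 0x20 selects the bad-page taint (bit 5), and that with nousertaint a write to /proc/sys/kernel/tainted matching the mask is rejected instead of crashing the system.

```python
# Illustrative model only; the real logic lives in kernel/panic.c and kernel/sysctl.c.
TAINT_BAD_PAGE = 5  # bit index of the "bad page" taint flag, i.e. mask 0x20


def parse_panic_on_taint(arg: str):
    """Parse '<hex>[,nousertaint]' into (bitmask, nousertaint). Hypothetical helper."""
    mask_text, _, switch = arg.partition(",")
    return int(mask_text, 16), switch == "nousertaint"


def should_panic(panic_mask: int, taint_flag: int) -> bool:
    """True if setting taint bit `taint_flag` matches the configured bitmask."""
    return bool(panic_mask & (1 << taint_flag))


def tainted_write_rejected(panic_mask: int, nousertaint: bool, value: int) -> bool:
    """True if a userspace write of `value` to the tainted sysctl would be refused."""
    return nousertaint and bool(value & panic_mask)


mask, nousertaint = parse_panic_on_taint("0x20,nousertaint")
```

With this mask, should_panic(mask, TAINT_BAD_PAGE) is true, while a taint on any other bit leaves the system running.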

Suggested-by: Qian Cai 
Signed-off-by: Rafael Aquini 
---
Changelog:
* v2: get rid of unnecessary/misguided compiler hints (Luis)
      enhance documentation text for the new kernel parameter (Randy)
* v3: drop sysctl interface, keep it only as a kernel parameter (Luis)
* v4: change panic_on_taint input from alphabetical taint flags
      to hexadecimal bitmasks, for clarity and extendability (Luis)
* v5: add doc note on the potential effects of panic_on_taint
      with nousertaint on writes to kernel.tainted sysctl knob (Luis)

 Documentation/admin-guide/kdump/kdump.rst       |  8 ++++++++
 .../admin-guide/kernel-parameters.txt           | 13 +++++++++++++
 Documentation/admin-guide/sysctl/kernel.rst     |  7 +++++++
 include/linux/kernel.h                          |  3 +++
 kernel/panic.c                                  | 34 ++++++++++++++++++++++++++++++++++
 kernel/sysctl.c                                 | 11 ++++++++++-
 6 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index ac7e131d2935..2da65fef2a1c 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -521,6 +521,14 @@ will cause a kdump to occur at the panic() call.  In cases where a user wants
 to specify this during runtime, /proc/sys/kernel/panic_on_warn can be set to 1
 to achieve the same behaviour.
 
+Trigger Kdump on add_taint()
+============================
+
+The kernel parameter panic_on_taint facilitates a conditional call to panic()
+from within add_taint() whenever the value set in this bitmask matches with the
+bit flag being set by add_taint().
+This will cause a kdump to occur at the add_taint()->panic() call.
+
 Contact
 =======
 
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d9197499aad1..27b988acb4db 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3422,6 +3422,19 @@
 			bit 4: print ftrace buffer
 			bit 5: print all printk messages in buffer
 
+	panic_on_taint=	Bitmask for conditionally call panic() in add_taint()
+			Format: <hex>[,nousertaint]
+			Hexadecimal bitmask representing the set of TAINT flags
+			that will cause the kernel to panic when add_taint() is
+			called with any of the flags in this set.
+			The optional switch "nousertaint" can be utilized to
+			prevent userspace forced crashes by writing to sysctl
+			/proc/sys/kernel/tainted any flagset matching with the
+			bitmask set on panic_on_taint.
+			See Documentation/admin-guide/tainted-kernels.rst for
+			extra details on the taint flags that users can pick
+			to compose the bitmask to assign to panic_on_taint.
+
 	panic_on_warn	panic() instead of WARN().  Useful to cause kdump
 			on a WARN().
 
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index c6c27db68d4c..427ce0a86b36 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1241,6 +1241,13 @@ ORed together. The letters are seen in "Tainted" line of Oops reports.
 
 See :doc:`/admin-guide/tainted-kernels` for more information.
 
+Note:
+  writes to this sysctl interface will fail with ``EINVAL`` if the kernel is
+  booted with the command line option ``panic_on_taint=<bitmask>,nousertaint``
+  and any of the ORed together values being written to ``tainted`` match with
+  the bitmask declared on panic_on_taint.
+  See 

kexec: arm: possible overwrite of initrd

2020-05-15 Thread Corentin Labbe
Hello

Following https://lkml.org/lkml/2020/4/6/96 I was able to boot my Cubieboard4
via kexec reliably.

But now that I have started to use kernelCI builds, I have problems.
All sunxi_defconfig kernels work, but multi_v7_defconfig kernels do not; they fail with:
[1.896540] Trying to unpack rootfs image as initramfs...
[1.896947] rootfs image is not initramfs (invalid magic at start of compressed archive); looks like an initrd
Then:
[3.927732] RAMDISK: Couldn't find valid RAM disk image starting at 0.
[3.934489] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6

I have tried disabling all related RD/RAMFS/compression CONFIGs, without any change.
Only the size of the kernel seems to matter, which makes me think that the initrd
is overwritten by the kernel.
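A quick way to test that suspicion against kexec's own numbers is to check how the segments are laid out. The sketch below uses the addresses and sizes from the "Kernel:", "Initrd:" and "DT:" lines of the kexec --debug output in this message; the 4 KiB page size and the helper name are assumptions made for illustration.

```python
# Sketch: check the layout kexec computed, using figures from the --debug output.
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def page_align_up(addr: int) -> int:
    """Round an address up to the next page boundary (hypothetical helper)."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


kernel_start, kernel_space = 0x20008000, 0x01dc8154  # "Kernel:" line
initrd_start, initrd_size = 0x21dd1000, 0x01c64369   # "Initrd:" line
dtb_start = 0x23a36000                               # "DT:" line

kernel_end = page_align_up(kernel_start + kernel_space)
initrd_end = page_align_up(initrd_start + initrd_size)

# Gap (in bytes) between the end of the reserved kernel space and the initrd.
gap_before_initrd = initrd_start - kernel_end
print(f"kernel space ends at {kernel_end:#x}, initrd starts at {initrd_start:#x}")
```

By these figures the initrd begins on the first page after the computed kernel space and the DT follows directly after the initrd, so the segments do not overlap as loaded, but there is no slack either: if the booting kernel touched more memory than the computed "Resulting kernel space", the initrd would be the first thing hit.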

I use kexec-tools master.
This is the output of my kexec run:
run kexec with --debug --kexec-syscall --force --initrd /tmp/ramdisk --dtb /tmp/dtb --command-line='console=ttyS0,115200n8 root=/dev/ram0 earlycon=uart,mmio32,0x700 ip=dhcp'
Try gzip decompression.
kernel: 0xb65c0008 kernel_size: 0x853200
MEMORY RANGES
2000-9fff (0)
zImage header: 0x016f2818 0x 0x00853200
zImage size 0x853200, file size 0x853200
zImage requires 0x00864200 bytes
  offset 0xb810 tag 0x5a534c4b size 8
Decompressed kernel sizes:
 text+data 0x01563f54 bss 0x0005ca84 total 0x015c09d8
Resulting kernel space: 0x01dc8154
Kernel: address=0x20008000 size=0x01dc8154
Initrd: address=0x21dd1000 size=0x01c64369
DT: address=0x23a36000 size=0x60bb
kexec_load: entry = 0x20008000 flags = 0x28
nr_segments = 3
segment[0].buf   = 0xb65c0008
segment[0].bufsz = 0x853204
segment[0].mem   = 0x20008000
segment[0].memsz = 0x854000
segment[1].buf   = 0xb495b000
segment[1].bufsz = 0x1c64369
segment[1].mem   = 0x21dd1000
segment[1].memsz = 0x1c65000
segment[2].buf   = 0x4f030
segment[2].bufsz = 0x60bb
[   39.693411] sun7i-dwmac 83.ethernet eth0: Link is Down
segment[2].mem   = 0x23a36000
segment[2].memsz = 0x7000
[   39.709586] kexec_core: Starting new kernel
[   40.120408] Bye!
[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 5.6.11-dirty (compile@Red) (gcc version 9.2.0 (Gentoo 9.2.0-r2 p3)) #43 SMP Fri May 15 15:31:20 CEST 2020
[0.00] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[0.00] CPU: div instructions available: patching division code
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[0.00] OF: fdt: Machine model: Cubietech Cubieboard4
[0.00] Memory policy: Data cache writealloc
[0.00] efi: Getting EFI parameters from FDT:
[0.00] efi: UEFI not found.
[0.00] Ignoring RAM at 0x5000-0xa000
[0.00] Consider using a HIGHMEM enabled kernel.
[0.00] cma: Reserved 64 MiB at 0x4c00
[0.00] percpu: Embedded 20 pages/cpu s49228 r8192 d24500 u81920
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 195072
[0.00] Kernel command line: 'console=ttyS0,115200n8
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] Memory: 662712K/786432K available (12288K kernel code, 1455K rwdata, 4788K rodata, 2048K init, 370K bss, 58184K reserved, 65536K cma-reserved)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
[0.00] rcu: Hierarchical RCU implementation.
[0.00] rcu: RCU event tracing is enabled.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=8.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[0.00] random: get_random_bytes called from start_kernel+0x2fc/0x494 with crng_init=0
[0.00] arch_timer: cp15 timer(s) running at 24.00MHz (phys).
[0.00] clocksource: arch_sys_counter: mask: 0xff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[0.06] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns
[0.18] Switching to timer-based delay loop, resolution 41ns
[0.001476] clocksource: timer: mask: 0x max_cycles: 0x, max_idle_ns: 79635851949 ns
[0.002630] Console: colour dummy device 80x30
[0.002953] printk: console [tty0] enabled
[0.002997] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=24)
[0.003026] pid_max: default: 32768 minimum: 301
[0.003201] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.003237] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.003933] CPU: Testing write 

kdump: Getting "warn_alloc" warning during boot of kdump kernel

2020-05-15 Thread Prabhakar Kushwaha
Hi All,

We are getting "warn_alloc" warning during boot of kdump kernel. This
warning is observed with latest upstream tag (v5.7-rc5).

Primary/1st Kernel

# dmesg | grep crash
[0.00] crashkernel reserved: 0xd600 - 0xf600 (512 MB)
[0.00] Kernel command line: BOOT_IMAGE=(hd8,gpt2)/vmlinuz-5.7.0-rc5 root=UUID=c4050f17-526f-48a8-9804-c6b35cbb584c ro crashkernel=512M earlycon console=ttyAMA0

# cat /proc/iomem | grep -i crash
  d600 - f600 : Crash kernel

Logs from Kdump/crash kernel with warnings & dump_stack


[0.239360] swapper/0: page allocation failure: order:2, mode:0x1(GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
[0.249917] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc5 #44
[0.256246] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL027 07/01/2019
[0.264333] Call trace:
[0.266797]  dump_backtrace+0x0/0x1f8
[0.270490]  show_stack+0x20/0x30
[0.273833]  dump_stack+0xc0/0x10c
[0.277263]  warn_alloc+0x10c/0x178
[0.280781]  __alloc_pages_slowpath.constprop.112+0xaec/0xb28
[0.286584]  __alloc_pages_nodemask+0x2b4/0x300
[0.291156]  alloc_page_interleave+0x24/0xa0
[0.295464]  alloc_pages_current+0xe4/0x108
[0.299686]  dma_atomic_pool_init+0x44/0x1a4
[0.303995]  do_one_initcall+0x54/0x228
[0.307864]  kernel_init_freeable+0x228/0x2cc
[0.312263]  kernel_init+0x1c/0x110
[0.315781]  ret_from_fork+0x10/0x18

We did some debugging.
Commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") redefined
the DMA zones. Here, ZONE_DMA has a fixed range of 0x802f - 0xbfff and
ZONE_DMA32 has a range of 0xc000-0xf.

When the bootargs of the 1st/primary kernel contain "crashkernel=X", X amount
of memory is reserved in the first kernel. This reserved memory is used to
boot the kdump/crash kernel and is shown as "Crash kernel" in /proc/iomem.

If some region of the reserved memory (Crash kernel) **does not** fall in the
ZONE_DMA region, i.e. 0x802f - 0xbfff, this warning is observed.
Other drivers, like scsi_register_driver [1], also fail. We also see
other kinds of errors [2].
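The condition described above amounts to an interval-overlap check. The sketch below is illustrative only, and every address in it is an ASSUMPTION: the ranges in this archived message appear to have lost digits, so the values were reconstructed on the guess that runs of zeros and f's were dropped (the crash kernel range is at least consistent with the reported 512 MB). Substitute the real ranges from /proc/iomem and the zone setup of the affected machine.

```python
# Sketch with ASSUMED addresses; the archived message shows truncated ranges.
def overlap_bytes(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Bytes shared by two inclusive ranges [a_start, a_end] and [b_start, b_end]."""
    lo, hi = max(a_start, b_start), min(a_end, b_end)
    return max(0, hi - lo + 1)


ZONE_DMA = (0x802f0000, 0xbfffffff)      # ASSUMED reconstruction of the ZONE_DMA range
CRASH_KERNEL = (0xd6000000, 0xf5ffffff)  # ASSUMED; spans 512 MiB like the reservation

crash_size = CRASH_KERNEL[1] - CRASH_KERNEL[0] + 1
dma_bytes_in_crash_kernel = overlap_bytes(*CRASH_KERNEL, *ZONE_DMA)
print(f"crash kernel bytes inside ZONE_DMA: {dma_bytes_in_crash_kernel:#x}")
```

Under these assumed values the reservation lies entirely above ZONE_DMA, which would leave the kdump kernel with no memory to satisfy GFP_DMA allocations.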

Considering that ZONE_DMA requires the range 0x802f - 0xbfff,
can we force the "Crash kernel" region to always be reserved between 0x_
and 0xc000_ in reserve_crashkernel() --> memblock_find_in_range(),
or what would be the best possible solution?

--pk

[1]

[   21.509239]  dump_backtrace+0x0/0x1f8
[   21.516592]  show_stack+0x20/0x30
[   21.523248]  dump_stack+0xc0/0x10c
[   21.530087]  warn_alloc+0x10c/0x178
[   21.537090]  __alloc_pages_slowpath.constprop.112+0xaec/0xb28
[   21.548644]  __alloc_pages_nodemask+0x2b4/0x300
[   21.557750]  alloc_pages_current+0x90/0x108
[   21.566155]  alloc_slab_page+0x184/0x340
[   21.574030]  new_slab+0x420/0x4c8
[   21.580681]  ___slab_alloc+0x354/0x4e8
[   21.588207]  __slab_alloc+0x28/0x58
[   21.595210]  kmem_cache_alloc_trace+0x230/0x250
[   21.604316]  sr_probe+0x250/0x618 [sr_mod]
[   21.612555]  really_probe+0xe4/0x448
[   21.619733]  driver_probe_device+0xe8/0x140
[   21.628136]  device_driver_attach+0x7c/0x88
[   21.636536]  __driver_attach+0xac/0x178
[   21.644239]  bus_for_each_dev+0x7c/0xd0
[   21.651943]  driver_attach+0x2c/0x38
[   21.659119]  bus_add_driver+0x1a8/0x240
[   21.666823]  driver_register+0x6c/0x128
[   21.674533]  scsi_register_driver+0x28/0x38
[   21.682939]  init_sr+0x40/0x1 [sr_mod]

[2]
---
[   21.450571] systemd-udevd: page allocation failure: order:0, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0



Re: [PATCH v2 0/3] printk: replace ringbuffer

2020-05-15 Thread Sergey Senozhatsky
On (20/05/01 11:46), John Ogness wrote:
> Hello,
> 
> Here is a v2 for the first series to rework the printk subsystem. The
> v1 and history are here [0]. This first series only replaces the
> existing ringbuffer implementation. No locking is removed. No
> semantics/behavior of printk are changed.
> 
> The VMCOREINFO is updated. RFC patches for the external tools
> crash(8) [1] and makedumpfile(8) [2] have been submitted that allow
> the new ringbuffer to be correctly read.
> 
> This series is in line with the agreements [3] made at the meeting
> during LPC2019 in Lisbon, with 1 exception: support for dictionaries
> will not be discontinued [4]. Dictionaries are stored in a separate
> buffer so that they cannot interfere with the human-readable buffer.

I'm willing to bless this. The code looks good to me, nice job guys.

Acked-by: Sergey Senozhatsky 

-ss



Re: [PATCH] arm64/defconfig: Enable CONFIG_KEXEC_FILE

2020-05-15 Thread Bhupesh Sharma
Hi Arnd,

On Thu, Apr 30, 2020 at 10:05 AM Bhupesh Sharma  wrote:
>
> On Tue, Apr 28, 2020 at 3:37 PM Catalin Marinas wrote:
> >
> > On Tue, Apr 28, 2020 at 01:55:58PM +0530, Bhupesh Sharma wrote:
> > > On Wed, Apr 8, 2020 at 4:17 PM Mark Rutland  wrote:
> > > > On Tue, Apr 07, 2020 at 04:01:40AM +0530, Bhupesh Sharma wrote:
> > > > >  arch/arm64/configs/defconfig | 1 +
> > > > >  1 file changed, 1 insertion(+)
> > > > >
> > > > > > diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> > > > > index 24e534d85045..fa122f4341a2 100644
> > > > > --- a/arch/arm64/configs/defconfig
> > > > > +++ b/arch/arm64/configs/defconfig
> > > > > @@ -66,6 +66,7 @@ CONFIG_SCHED_SMT=y
> > > > >  CONFIG_NUMA=y
> > > > >  CONFIG_SECCOMP=y
> > > > >  CONFIG_KEXEC=y
> > > > > +CONFIG_KEXEC_FILE=y
> > > > >  CONFIG_CRASH_DUMP=y
> > > > >  CONFIG_XEN=y
> > > > >  CONFIG_COMPAT=y
> > > > > --
> > > > > 2.7.4
> > >
> > > Thanks a lot, Mark.
> > >
> > > Hi Catalin, Will,
> > >
> > > Can you please help pick this patch into the arm tree? We have an
> > > increasing number of use-cases from distro users
> > > who want to use kexec_file_load() as the default interface for
> > > kexec/kdump on arm64.
> >
> > We could pick it up if it doesn't conflict with the arm-soc tree. They
> > tend to pick most of the defconfig changes these days (and could as well
> > pick this one).
>
> Thanks Catalin.
> (+Cc Arnd)
>
> Hi Arnd,
>
> Can you please help pick this change via the arm-soc tree?

Ping. Any updates on this defconfig patch?

Thanks,
Bhupesh

