There may be more people interested in this issue, so I added them to the
cc list.

Thanks.
Lianbo

On 2020-05-12 17:47, lijiang wrote:
> Also added Dave Young to the cc list. Thanks.
> 
> On 2020-05-12 10:52, lijiang wrote:
>> On 2020-05-11 23:01, Philipp Rudo wrote:
>>> Hi Lianbo,
>>>
>> Thank you for this reply, Philipp.
>>
>>> one more question. Does the same problem occur with the kexec_load syscall,
>>> i.e. option '-c' instead of '-s'?
>>>
>> No, the kdump kernel can boot with the kexec_load syscall (option '-c').
>>
>> So far, I have only found that the kdump kernel cannot boot with the
>> kexec_file_load syscall (option '-s').
>>
>>> Thanks
>>> Philipp
>>>
>>> On Mon, 11 May 2020 11:15:58 +0200
>>> Philipp Rudo <pr...@linux.ibm.com> wrote:
>>>
>>>> Hi Lianbo,
>>>>
>>>> I believe that your crashkernel memory is simply too small. Pretty much at
>>>> the beginning of the kernel log you have
>>>>
>>>>> [    0.070468] setup: The initial RAM disk does not fit into the memory
>>>>
>>>> Although I must say 256M should be enough for most purposes...
>>>>
>>>> Could you please retry with a bigger crashkernel memory?
>>>>
>>
>> I increased the crashkernel memory to 512M (crashkernel=512M), but the kdump
>> kernel still cannot boot; the same issue occurs.
>>
>> I added some debug information in arch/s390/kernel/setup.c, and got the
>> following logs:
>>
>> [    0.070885] Linux version 5.7.0-rc5+ (r...@ibm-z-124.rhts.eng.bos.redhat.com) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC), GNU ld version 2.30-73.el8) #3 SMP Mon May 11 10:28:57 EDT 2020
>> [    0.070888] setup: Linux is running as a z/VM guest operating system in 64-bit mode
>> [    0.071125] lijiang-debug initrd_start:4aeeb000 size:17180900    <----------------------
>> [    0.071128] setup: The maximum memory size is 2048MB
>> [    0.071130] cma: Reserved 4 MiB at 0x000000001fc00000
>> [    0.071131] setup: The initial RAM disk does not fit into the memory
>> [    0.071132] lijiang-debug: check_initrd 810 start:4aeeb000, size:17180900  <----------------------
>> [    0.099765] cpu: 2 configured CPUs, 0 standby CPUs
>>
>> The size of the initrd is 17M, so 512M of memory should be enough. I suspect
>> that the kdump kernel doesn't find an appropriate memory block, which causes
>> the failure.
>>
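A quick way to sanity-check the "does not fit into the memory" message in the
debug output above is to test whether the reported initrd range lies inside the
memory the kdump kernel can see. The sketch below is only a simplified model,
not the kernel's actual check: region_fits() is a hypothetical helper, the size
17180900 is assumed to be decimal bytes (consistent with the "17M" mentioned
above), and the visible range [0, 512 MiB) is an assumption based on
crashkernel=512M and the CMA reservation at 0x1fc00000.

```python
def region_fits(start, size, mem_start, mem_end):
    """Return True if [start, start + size) lies entirely in [mem_start, mem_end)."""
    return size > 0 and mem_start <= start and start + size <= mem_end


# Values taken from the debug lines above.
INITRD_START = 0x4AEEB000          # printed in hex by the debug message
INITRD_SIZE = 17180900             # ASSUMPTION: decimal bytes (about 16.4 MiB)

# ASSUMPTION: with crashkernel=512M the kdump kernel sees memory [0, 512 MiB).
KDUMP_MEM_END = 512 * 1024 * 1024

fits = region_fits(INITRD_START, INITRD_SIZE, 0, KDUMP_MEM_END)
print(f"initrd start : {INITRD_START / 2**20:.1f} MiB")
print(f"memory limit : {KDUMP_MEM_END / 2**20:.1f} MiB")
print(f"initrd fits  : {fits}")
```

Under these assumptions the start address alone (about 1199 MiB) is already
beyond the 512 MiB limit, so the initrd size is not what makes the check fail.
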
>> The compressed initrd is actually decompressed in unpack_to_rootfs().
>>
>> I have an s390 machine with 2 CPUs and 2G of memory, which is very slow. :-)
>>
>>
>> Thanks.
>> Lianbo
>>
>>
>>>> Thanks
>>>> Philipp
>>>>
>>>>
>>>> On Fri, 8 May 2020 18:45:56 +0800
>>>> lijiang <liji...@redhat.com> wrote:
>>>>
>>>>> Hi, Philipp Rudo
>>>>>
>>>>> Sorry to disturb you. I ran into a problem on an s390 machine. Could you
>>>>> please have a look?
>>>>>
>>>>> The kdump kernel cannot boot on s390x machines if I load the kernel and
>>>>> initrd images with the kexec_file_load() syscall as below:
>>>>>
>>>>> #kexec -s -p /boot//boot/vmlinuz-5.7.0-rc4+ 
>>>>> --initrd=/boot/initramfs-5.7.0-rc4+kdump.img 
>>>>> --command-line="rd.dasd=0.0.0120 rd.dasd=0.0.0121 rd.dasd=0.0.0122 
>>>>> rd.dasd=0.0.0123 
>>>>> rd.znet=qeth,0.0.8000,0.0.8001,0.0.8002,layer2=1,portname=z-126,portno=0 
>>>>> $tuned_params BOOT_IMAGE=0 nr_cpus=1 cgroup_disable=memory numa=off 
>>>>> udev.children-max=2 panic=10 rootflags=nofail transparent_hugepage=never 
>>>>> novmcoredd nokaslr"
>>>>>
>>>>> But the kexec reboot works well if I use the kexec_file_load() syscall
>>>>> as follows:
>>>>>
>>>>> #kexec -s -l  /boot//boot/vmlinuz-5.7.0-rc4+ 
>>>>> --initrd=/boot/initramfs-5.7.0-rc4+kdump.img 
>>>>> --command-line="root=/dev/mapper/rhel_ibm--z--126-root crashkernel=256M 
>>>>> rd.dasd=0.0.0120 rd.dasd=0.0.0121 rd.dasd=0.0.0122 rd.dasd=0.0.0123 
>>>>> rd.lvm.lv=rhel_ibm-z-126/root rd.lvm.lv=rhel_ibm-z-126/swap 
>>>>> rd.znet=qeth,0.0.8000,0.0.8001,0.0.8002,layer2=1,portname=z-126,portno=0 
>>>>> $tuned_params BOOT_IMAGE=0"
>>>>>
>>>>> I added debug output to populate_rootfs() (init/initramfs.c) and found that
>>>>> initrd_start is null. I also checked the kexec file load process and didn't
>>>>> see any errors. It's strange. Any suggestions would be appreciated.
>>>>>  
>>>>> BTW: I put the kernel log at the end.
>>>>>
>>>>> Thanks.
>>>>> Lianbo
>>>>>
>>>>>
>>>>> kdump kernel log:
>>>>>
>>>>> 01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 01.
>>>>> 01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 00.
>>>>> [    0.070339] Linux version 5.7.0-rc4+ (r...@ibm-z-126.rhts.eng.bos.redhat.com) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC), GNU ld version 2.30-73.el8) #2 SMP Thu May 7 22:29:25 EDT 2020
>>>>> [    0.070344] setup: Linux is running as a z/VM guest operating system in 64-bit mode
>>>>> [    0.070464] setup: The maximum memory size is 2048MB
>>>>> [    0.070468] cma: Reserved 4 MiB at 0x000000000fc00000
>>>>> [    0.070468] setup: The initial RAM disk does not fit into the memory
>>>>> [    0.112609] cpu: 2 configured CPUs, 0 standby CPUs
>>>>> [    0.112731] Write protected kernel read-only data: 10116k
>>>>>
>>>>> [    0.112747] Zone ranges:
>>>>> [    0.112748]   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>>>>> [    0.112750]   Normal   empty
>>>>> [    0.112751] Movable zone start for each node
>>>>> [    0.112752] Early memory node ranges
>>>>> [    0.112753]   node   0: [mem 0x0000000000000000-0x000000000fffffff]
>>>>> [    0.112772] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
>>>>> [    0.115953] percpu: Embedded 33 pages/cpu s96256 r8192 d30720 u135168
>>>>> [    0.115976] Built 1 zonelists, mobility grouping on.  Total pages: 64512
>>>>> [    0.115977] Policy zone: DMA
>>>>> [    0.115979] Kernel command line: rd.dasd=0.0.0120 rd.dasd=0.0.0121 rd.dasd=0.0.0122 rd.dasd=0.0.0123 rd.znet=qeth,0.0.8000,0.0.8001,0.0.8002,layer2=1,portname=z-126,portno=0 $tuned_params BOOT_IMAGE=0 nr_cpus=1 cgroup_disable=memory numa=off udev.children-max=2 panic=10 rootflags=nofail transparent_hugepage=never novmcoredd nokaslr
>>>>> [    0.117247] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
>>>>> [    0.117271] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
>>>>> [    0.117297] mem auto-init: stack:off, heap alloc:off, heap free:off
>>>>> [    0.121169] Memory: 237484K/262144K available (7652K kernel code, 1384K rwdata, 2464K rodata, 3324K init, 816K bss, 20564K reserved, 4096K cma-reserved)
>>>>> [    0.121220] random: get_random_u64 called from cache_random_seq_create+0x6a/0x160 with crng_init=0
>>>>> [    0.121310] SLUB: HWalign=256, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
>>>>> [    0.121321] ftrace: allocating 25822 entries in 101 pages
>>>>> [    0.137295] ftrace: allocated 101 pages with 4 groups
>>>>> [    0.137389] rcu: Hierarchical RCU implementation.
>>>>> [    0.137390] rcu:     RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=1.
>>>>> [    0.137392] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
>>>>> [    0.137393] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
>>>>> [    0.140929] NR_IRQS: 3, nr_irqs: 3, preallocated irqs: 3
>>>>> [    0.140977] clocksource: tod: mask: 0xffffffffffffffff max_cycles: 0x3b0a9be803b0a9, max_idle_ns: 1805497147909793 ns
>>>>> [    0.141001] Console: colour dummy device 80x25
>>>>> [    0.142713] random: fast init done
>>>>> [    0.144414] printk: console [ttyS0] enabled
>>>>> [    0.144563] pid_max: default: 32768 minimum: 301
>>>>> [    0.144598] LSM: Security Framework initializing
>>>>> [    0.144614] Yama: becoming mindful.
>>>>> [    0.144621] SELinux:  Initializing.
>>>>> [    0.144655] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
>>>>> [    0.144657] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
>>>>> [    0.144877] Disabling memory control group subsystem
>>>>> [    0.145087] rcu: Hierarchical SRCU implementation.
>>>>> [    0.145249] smp: Bringing up secondary CPUs ...
>>>>> [    0.145251] smp: Brought up 1 node, 1 CPU
>>>>> [    0.145365] devtmpfs: initialized
>>>>> [    0.145529] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
>>>>> [    0.145532] futex hash table entries: 256 (order: 4, 65536 bytes, linear)
>>>>> [    0.145776] NET: Registered protocol family 16
>>>>> [    0.145835] audit: initializing netlink subsys (disabled)
>>>>> [    0.145934] Spectre V2 mitigation: execute trampolines
>>>>> [    0.146759] audit: type=2000 audit(1588911734.995:1): state=initialized audit_enabled=0 res=1
>>>>> [    0.146830] HugeTLB registered 1.00 MiB page size, pre-allocated 0 pages
>>>>> [    0.190956] cryptd: max_cpu_qlen set to 1000
>>>>> [    0.194413] iommu: Default domain type: Translated
>>>>> [    0.194510] SCSI subsystem initialized
>>>>> [    0.194516] pps_core: LinuxPPS API ver. 1 registered
>>>>> [    0.194518] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giome...@linux.it>
>>>>> [    0.194520] PTP clock support registered
>>>>> [    0.199801] NetLabel: Initializing
>>>>> [    0.199803] NetLabel:  domain hash size = 128
>>>>> [    0.199804] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
>>>>> [    0.199818] NetLabel:  unlabeled traffic allowed by default
>>>>> [    0.219300] VFS: Disk quotas dquot_6.6.0
>>>>> [    0.219322] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
>>>>> [    0.219350] os_info: entry 0: not available (addr=0x0 size=0)
>>>>> [    0.219352] os_info: entry 1: copied (addr=0x67a37000 size=200)
>>>>> [    0.219353] os_info: crashkernel: addr=0x6fc00000 size=268435456
>>>>> [    0.220405] NET: Registered protocol family 2
>>>>> [    0.220552] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes, linear)
>>>>> [    0.220557] TCP established hash table entries: 2048 (order: 2, 16384 bytes, linear)
>>>>> [    0.220570] TCP bind hash table entries: 2048 (order: 3, 32768 bytes, linear)
>>>>> [    0.220587] TCP: Hash tables configured (established 2048 bind 2048)
>>>>> [    0.220607] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
>>>>> [    0.220614] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
>>>>> [    0.220649] NET: Registered protocol family 1
>>>>> [    0.220657] NET: Registered protocol family 44
>>>>> [    0.220695] jlb-debug: populate_rootfs 661 initrd:0
>>>>> [    0.220697] jlb-debug: populate_rootfs 663
>>>>> [    0.221396] alg: No test for crc32be (crc32be-vx)
>>>>> [    0.221802] Initialise system trusted keyrings
>>>>> [    0.221809] Key type blacklist registered
>>>>> [    0.221826] workingset: timestamp_bits=45 max_order=16 bucket_order=0
>>>>> [    0.223690] integrity: Platform Keyring initialized
>>>>> [    0.227865] NET: Registered protocol family 38
>>>>> [    0.227869] Key type asymmetric registered
>>>>> [    0.227870] Asymmetric key parser 'x509' registered
>>>>> [    0.227877] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
>>>>> [    0.227894] io scheduler mq-deadline registered
>>>>> [    0.227896] io scheduler kyber registered
>>>>> [    0.227924] io scheduler bfq registered
>>>>> [    0.227972] atomic64_test: passed
>>>>> [    0.228308] rdac: device handler registered
>>>>> [    0.228326] hp_sw: device handler registered
>>>>> [    0.228327] emc: device handler registered
>>>>> [    0.228343] alua: device handler registered
>>>>> [    0.228385] cio: Channel measurement facility initialized using format extended (mode autodetected)
>>>>> [    0.228596] drop_monitor: Initializing network drop monitor service
>>>>> [    0.228659] Initializing XFRM netlink socket
>>>>> [    0.228760] NET: Registered protocol family 10
>>>>> [    0.229046] Segment Routing with IPv6
>>>>> [    0.229063] NET: Registered protocol family 17
>>>>> [    0.229085] mpls_gso: MPLS GSO support
>>>>> [    0.229136] registered taskstats version 1
>>>>> [    0.229145] Loading compiled-in X.509 certificates
>>>>> [    0.272961] Loaded X.509 cert 'Build time autogenerated kernel key: 6de832de35ed366a6c3c2d0e99b0d84ae243cb28'
>>>>> [    0.273793] Key type big_key registered
>>>>> [    0.273802] ima: No TPM chip found, activating TPM-bypass!
>>>>> [    0.273805] ima: Allocated hash algorithm: sha1
>>>>> [    0.273813] ima: No architecture policies found
>>>>> [    0.273933] md: Waiting for all devices to be available before autodetect
>>>>> [    0.273934] md: If you don't use raid, use raid=noautodetect
>>>>> [    0.274074] md: Autodetecting RAID arrays.
>>>>> [    0.274075] md: autorun ...
>>>>> [    0.274076] md: ... autorun DONE.
>>>>> [    0.274092] List of all partitions:
>>>>> [    0.274093] No filesystem could mount root, tried:
>>>>> [    0.274094]
>>>>> [    0.274096] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)
>>>>> [    0.274098] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #2
>>>>> [    0.274099] Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)
>>>>> [    0.274100] Call Trace:
>>>>> [    0.274109]  [<0000000000114302>] show_stack+0x8a/0xd0
>>>>> [    0.274113]  [<000000000057a1d2>] dump_stack+0x8a/0xb8
>>>>> [    0.274116]  [<0000000000147828>] panic+0x110/0x308
>>>>> [    0.274121]  [<0000000000c3d616>] mount_block_root+0x35e/0x360
>>>>> [    0.274122]  [<0000000000c3d824>] prepare_namespace+0x174/0x1b0
>>>>> [    0.274124]  [<0000000000c3d054>] kernel_init_freeable+0x2bc/0x2d0
>>>>> [    0.274130]  [<000000000086b5ea>] kernel_init+0x22/0x150
>>>>> [    0.274133]  [<00000000008759b0>] ret_from_fork+0x2c/0x30
>>>>> 00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0010F444
>>>>> 00:
>>>>>
>>>
