Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-17, Petr Mladek wrote: > Alternative solution would be to get rid of record_print_text() > and use record_print_text_inline() everywhere. It will have some > advantages: > > + _inline() variant will get real testing > + no code duplication > + saving the extra buffer also in console, sysfs, and devkmsg > interface. In preparation for my v2, I implemented this alternate approach. Rather than introducing record_print_text_inline(), I changed record_print_text() to work inline and also it will no longer handle the counting case. The callers of record_print_text() for counting will now call the new counting functions. IMHO it is a nice cleanup and also removes the static printk_record structs for console and syslog. Thanks. John Ogness ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
Hi, John Ogness Thank you for improving the patch series and making great efforts. I'm not sure if I missed anything else. Or are there any other related patches to be applied? After applying this patch series, NMI watchdog detected a hard lockup, which caused that kernel can not boot, please refer to the following call trace. And I put the complete kernel log in the attachment. Test machine: Intel Platform: Grantley-R Wildcat Pass CPU: Broadwell-EP, B0 Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 65536 MB memory, 800 GB disk space kernel: v5.5-rc7 commit: def9d2780727 ("Linux 5.5-rc7") .. [ OK ] Started udev Coldplug all Devices. [ 42.110978] NMI watchdog: Watchdog detected hard LOCKUP on cpu 15 [ 42.110978] Modules linked in: ip_tables xfs libcrc32c sr_mod cdrom sd_mod sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_vram_helper drm_ttm_helper ttm ahci libahci ixgbe drm crc32c_intel libata mdio dca i2c_algo_bit wmi dm_mirror dm_region_hash dm_log dm_mod [ 42.110986] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted 5.5.0-rc7+ #4 [ 42.110986] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.6024.071720181717 07/17/2018 [ 42.110987] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1c0 [ 42.110988] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00 00 75 [ 42.110988] RSP: 0018:bbe207a7bc48 EFLAGS: 0002 [ 42.110989] RAX: 00f80101 RBX: a1576e80 RCX: [ 42.110990] RDX: RSI: RDI: a1e95660 [ 42.110990] RBP: R08: R09: 000b [ 42.110991] R10: a075df5dcf80 R11: a0ebfda0 R12: a1e95660 [ 42.110991] R13: a1e97680 R14: a17197a0 R15: 0047 [ 42.110991] FS: 7f7c5642a980() GS:a075df5c() knlGS: [ 42.110992] CS: 0010 DS: ES: CR0: 80050033 [ 42.110992] CR2: 7ffe95f4c4c0 CR3: 00084fbfc004 CR4: 003606e0 [ 42.110993] DR0: DR1: DR2: [ 42.110993] DR3: DR6: fffe0ff0 DR7: 0400 [ 42.110993] Call Trace: [ 42.110993] _raw_spin_lock+0x1a/0x20 [ 42.110994] console_unlock+0x9e/0x450 [ 42.110994] bust_spinlocks+0x16/0x30 [ 42.110994] oops_end+0x33/0xc0 [ 42.110995] general_protection+0x32/0x40 [ 42.110995] RIP: 0010:copy_data+0xf2/0x1e0 [ 42.110995] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c 89 cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 <41> c7 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00 [ 42.110996] RSP: 0018:bbe207a7bd80 EFLAGS: 00010002 [ 42.110996] RAX: a075d44ca000 RBX: 00a8 RCX: fff000b0 [ 42.110997] RDX: 00a8 RSI: 0f01 RDI: a1456e00 [ 42.110997] RBP: 0801364600307073 R08: 2000 R09: 0801364600307073 [ 42.110997] R10: fff0 R11: 00a8 R12: a1e98330 [ 42.110998] R13: d7efbe00 R14: 00a8 R15: c000 [ 42.110998] _prb_read_valid+0xd8/0x190 [ 42.110998] prb_read_valid+0x15/0x20 [ 42.110999] devkmsg_read+0x9d/0x2a0 [ 42.110999] vfs_read+0x91/0x140 [ 42.110999] ksys_read+0x59/0xd0 [ 42.111000] do_syscall_64+0x55/0x1b0 [ 42.111000] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 42.111000] RIP: 0033:0x7f7c55740b62 [ 42.111001] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 [ 42.111001] RSP: 002b:7ffe95f4c4a8 EFLAGS: 0246 ORIG_RAX: [ 42.111002] RAX: ffda RBX: 7ffe95f4e500 RCX: 7f7c55740b62 [ 42.111002] RDX: 2000 RSI: 7ffe95f4c4b0 RDI: 0008 [ 42.111002] RBP: R08: 0100 R09: 0003 [ 42.111003] R10: 0100 R11: 0246 R12: 7ffe95f4c4b0 [ 42.111003] R13: 7ffe95f4e910 R14: R15: [ 42.111003] Kernel panic - not syncing: Hard LOCKUP [ 42.111004] Shutting down cpus with NMI [ 42.111004] Kernel Offset: 0x1f00 from 0x8100 (relocation range: 0x8000-0xbfff) [ 42.111005] general protection fault: [#1] SMP PTI [ 42.111005] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted 5.5.0-rc7+ #4 [ 42.111005] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.6024.071720181717 07/17/2018 [ 42.111006] RIP: 0010:copy_data+0xf2/0x1e0 [ 42.111006] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c 89 cd 41 89 d6 44 89 44
Re: [PATCH 0/2] printk: replace ringbuffer
On Mon 2020-02-17 12:13:25, John Ogness wrote: > On 2020-02-14, Petr Mladek wrote: > >> I oversaw that devkmsg_open() setup a printk_record and so I did not > >> see to add the extra NULL initialization of text_line_count. There > >> should be be an initializer function/macro to avoid this danger. > >> > >> John Ogness > >> > >> The quick fixup: > >> > >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > >> index d0d24ee1d1f4..5ad67ff60cd9 100644 > >> --- a/kernel/printk/printk.c > >> +++ b/kernel/printk/printk.c > >> @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct > >> file *file) > >>user->record.text_buf_size = sizeof(user->text_buf); > >>user->record.dict_buf = >dict_buf[0]; > >>user->record.dict_buf_size = sizeof(user->dict_buf); > >> + user->record.text_line_count = NULL; > > > > The NULL pointer hidden in the structure also complicates the code > > reading. It is less obvious when the same function is called > > only to get the size/count and when real data. > > OK. > > > I played with it and created extra function to get this information. > > > > In addition, I had problems to follow the code in > > record_print_text_inline(). So I tried to reuse the new function > > and the existing record_printk_text() there. > > > > Please, find below a patch that I ended with. I booted a system > > with this patch. But I guess that I actually did not use the > > record_print_text_inline(). So, it might be buggy. > > Yes, there are several bugs. But I see where you want to go with this: > > - introduce prb_count_lines() to handle line counting > > - introduce prb_read_valid_info() for only reading meta-data and getting > the line count > > - also use prb_count_lines() internally In addition, I would like share the code between record_print_text_inline() and record_print_text(). They both do very similar thing and the logic in far from trivial. Alternative solution would be to get rid of record_print_text() and use record_print_text_inline() everywhere. It will have some advantages: + _inline() variant will get real testing + no code duplication + saving the extra buffer also in console, sysfs, and devkmsg interface. > I will include these changes in v2. I will still introduce the static > inlines to initialize records because readers and writers do it > differently. Sounds good. Best Regards, Petr ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-14, Petr Mladek wrote: >> I oversaw that devkmsg_open() setup a printk_record and so I did not >> see to add the extra NULL initialization of text_line_count. There >> should be be an initializer function/macro to avoid this danger. >> >> John Ogness >> >> The quick fixup: >> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c >> index d0d24ee1d1f4..5ad67ff60cd9 100644 >> --- a/kernel/printk/printk.c >> +++ b/kernel/printk/printk.c >> @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file >> *file) >> user->record.text_buf_size = sizeof(user->text_buf); >> user->record.dict_buf = >dict_buf[0]; >> user->record.dict_buf_size = sizeof(user->dict_buf); >> +user->record.text_line_count = NULL; > > The NULL pointer hidden in the structure also complicates the code > reading. It is less obvious when the same function is called > only to get the size/count and when real data. OK. > I played with it and created extra function to get this information. > > In addition, I had problems to follow the code in > record_print_text_inline(). So I tried to reuse the new function > and the existing record_printk_text() there. > > Please, find below a patch that I ended with. I booted a system > with this patch. But I guess that I actually did not use the > record_print_text_inline(). So, it might be buggy. Yes, there are several bugs. But I see where you want to go with this: - introduce prb_count_lines() to handle line counting - introduce prb_read_valid_info() for only reading meta-data and getting the line count - also use prb_count_lines() internally I will include these changes in v2. I will still introduce the static inlines to initialize records because readers and writers do it differently. Thanks. John Ogness ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On Wed 2020-02-05 16:48:32, John Ogness wrote: > On 2020-02-05, Sergey Senozhatsky wrote: > > 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> > > 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> > > The problem was due to an uninitialized pointer. > > Very recently the ringbuffer API was expanded so that it could > optionally count lines in a record. This made it possible for me to > implement record_print_text_inline(), which can do all the kmsg_dump > multi-line madness without requiring a temporary buffer. Rather than > passing an extra argument around for the optional line count, I added > the text_line_count pointer to the printk_record struct. And since line > counting is rarely needed, it is only performed if text_line_count is > non-NULL. > > I oversaw that devkmsg_open() setup a printk_record and so I did not see > to add the extra NULL initialization of text_line_count. There should be > be an initializer function/macro to avoid this danger. > > John Ogness > > The quick fixup: > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index d0d24ee1d1f4..5ad67ff60cd9 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file > *file) > user->record.text_buf_size = sizeof(user->text_buf); > user->record.dict_buf = >dict_buf[0]; > user->record.dict_buf_size = sizeof(user->dict_buf); > + user->record.text_line_count = NULL; The NULL pointer hidden in the structure also complicates the code reading. It is less obvious when the same function is called only to get the size/count and when real data. I played with it and created extra function to get this information. In addition, I had problems to follow the code in record_print_text_inline(). So I tried to reuse the new function and the existing record_printk_text() there. Please, find below a patch that I ended with. I booted a system with this patch. But I guess that I actually did not use the record_print_text_inline(). So, it might be buggy. Anyway, I wonder what you think about it: >From 383e608f41a2f44898e4cd0751c5ccc18c82f71e Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Fri, 14 Feb 2020 16:14:18 +0100 Subject: [PATCH] printk: Alternative approach for inline dumping line_count in struct printk_record looks a bit error prone. It causes a system crash when people forget to initialize it. It seems better to read this information via a separate API, for example, prg_read_valid_info(). record_print_text_inline() is really complicated[*]. It is yet another variant of the tricky logic used in record_print_text(). It would be great to actually reuse the existing function. [*] I know that you created it on my request. Signed-off-by: Petr Mladek --- kernel/printk/printk.c| 134 +- kernel/printk/printk_ringbuffer.c | 55 +--- kernel/printk/printk_ringbuffer.h | 7 +- 3 files changed, 84 insertions(+), 112 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 5ad67ff60cd9..6b7d6716b178 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -883,7 +883,6 @@ static int devkmsg_open(struct inode *inode, struct file *file) user->record.text_buf_size = sizeof(user->text_buf); user->record.dict_buf = >dict_buf[0]; user->record.dict_buf_size = sizeof(user->dict_buf); - user->record.text_line_count = NULL; logbuf_lock_irq(); user->seq = prb_first_seq(prb); @@ -1283,87 +1282,50 @@ static size_t record_print_text(const struct printk_record *r, bool syslog, return len; } -static size_t record_print_text_inline(struct printk_record *r, bool syslog, - bool time) +static size_t +get_record_text_size(struct printk_info *info, unsigned int line_count, + bool syslog, bool time) { - size_t text_len = r->info->text_len; - size_t buf_size = r->text_buf_size; - char *text = r->text_buf; - char prefix[PREFIX_MAX]; - bool truncated = false; size_t prefix_len; - size_t len = 0; - prefix_len = info_print_prefix(r->info, syslog, time, prefix); - - if (!text) { - /* SYSLOG_ACTION_* buffer size only calculation */ - unsigned int line_count = 1; - - if (r->text_line_count) - line_count = *(r->text_line_count); - /* -* Each line will be preceded with a prefix. The intermediate -* newlines are already within the text, but a final trailing -* newline will be added. -*/ - return ((prefix_len * line_count) + r->info->text_len + 1); - } + prefix_len = info_print_prefix(info, syslog, time, NULL); /* -* Add the prefix for each line by shifting the rest of the text to -
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/13 14:07), Petr Mladek wrote: > On Wed 2020-02-05 17:12:12, John Ogness wrote: > > On 2020-02-05, lijiang wrote: > > > Do you have any suggestions about the size of CONFIG_LOG_* and > > > CONFIG_PRINTK_* options by default? > > > > The new printk implementation consumes more than double the memory that > > the current printk implementation requires. This is because dictionaries > > and meta-data are now stored separately. > > > > If the old defaults (LOG_BUF_SHIFT=17 LOG_CPU_MAX_BUF_SHIFT=12) were > > chosen because they are maximally acceptable defaults, then the defaults > > should be reduced by 1 so that the final size is "similar" to the > > current implementation. > > > > If instead the defaults are left as-is, a machine with less than 64 CPUs > > will reserve 336KiB for printk information (128KiB text, 128KiB > > dictionary, 80KiB meta-data). > > > > It might also be desirable to reduce the dictionary size (maybe 1/4 the > > size of text?). > > Good questions. It would be great to check the usage on some real > systems. [..] > I wish the dictionaries were never added ;-) They complicate the code > and nobody knows how many people actually use the information. Maybe we can have CONFIG_PRINTK_EXTRA_PAYLOAD [for dicts] so people can compile it out if it's not needed. This can save several bytes here and there. -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On Wed 2020-02-05 17:12:12, John Ogness wrote: > On 2020-02-05, lijiang wrote: > > Do you have any suggestions about the size of CONFIG_LOG_* and > > CONFIG_PRINTK_* options by default? > > The new printk implementation consumes more than double the memory that > the current printk implementation requires. This is because dictionaries > and meta-data are now stored separately. > > If the old defaults (LOG_BUF_SHIFT=17 LOG_CPU_MAX_BUF_SHIFT=12) were > chosen because they are maximally acceptable defaults, then the defaults > should be reduced by 1 so that the final size is "similar" to the > current implementation. > > If instead the defaults are left as-is, a machine with less than 64 CPUs > will reserve 336KiB for printk information (128KiB text, 128KiB > dictionary, 80KiB meta-data). > > It might also be desirable to reduce the dictionary size (maybe 1/4 the > size of text?). Good questions. It would be great to check the usage on some real systems. In each case, we should inform users when messages and/or dictionaries were lost. Also it would be great to have a way (function) that would show how big parts of the two ring buffers are occupied by valid data. It might be useful also to detect problems with the ring buffer: + too many space reserved but not commited + too many records invalidated because of different ordering in desc ring and data ring. > However, since the new printk implementation allows for > non-intrusive dictionaries, we might see their usage increase and start > to be as large as the messages themselves. I wish the dictionaries were never added ;-) They complicate the code and nobody knows how many people actually use the information. Best Regards, Petr ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-07, Steven Rostedt wrote: >> The quick fixup: >> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c >> index d0d24ee1d1f4..5ad67ff60cd9 100644 >> --- a/kernel/printk/printk.c >> +++ b/kernel/printk/printk.c >> @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file >> *file) >> user->record.text_buf_size = sizeof(user->text_buf); >> user->record.dict_buf = >dict_buf[0]; >> user->record.dict_buf_size = sizeof(user->dict_buf); >> +user->record.text_line_count = NULL; >> >> logbuf_lock_irq(); >> user->seq = prb_first_seq(prb); > > FYI, I used your patch set to test out Konstantin's new get-lore-mbox > script, and then applied them. It locked up on boot up as well, and > applying this appears to fix it. Yes, this is a horrible bug. In preparation for my v2 I implemented: prb_rec_init_rd() prb_rec_init_wr() as static inline functions to initialize the records. There is a reader and writer variant because they initialize the records differently: readers provide buffers, writers request buffers. This eliminates the manual twiddling with the record struct and ensures that the struct is always properly initialized. John Ogness ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On Wed, 05 Feb 2020 16:48:32 +0100 John Ogness wrote: > The quick fixup: > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index d0d24ee1d1f4..5ad67ff60cd9 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file > *file) > user->record.text_buf_size = sizeof(user->text_buf); > user->record.dict_buf = >dict_buf[0]; > user->record.dict_buf_size = sizeof(user->dict_buf); > + user->record.text_line_count = NULL; > > logbuf_lock_irq(); > user->seq = prb_first_seq(prb); FYI, I used your patch set to test out Konstantin's new get-lore-mbox script, and then applied them. It locked up on boot up as well, and applying this appears to fix it. -- Steve ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
在 2020年01月29日 00:19, John Ogness 写道: > Hello, > > After several RFC series [0][1][2][3][4], here is the first set of > patches to rework the printk subsystem. This first set of patches > only replace the existing ringbuffer implementation. No locking is > removed. No semantics/behavior of printk are changed. > > The VMCOREINFO is updated, which will require changes to the > external crash [5] tool. I will be preparing a patch to add support > for the new VMCOREINFO. > In addition to changing the crash utility, I would think that the kexec-tools(such as the vmcore-dmesg and makedumpfile) also need to be modified accordingly. Thanks Lianbo > This series is in line with the agreements [6] made at the meeting > during LPC2019 in Lisbon, with 1 exception: support for dictionaries > will _not_ be discontinued [7]. Dictionaries are stored in a separate > buffer so that they cannot interfere with the human-readable buffer. > > John Ogness > > [0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogn...@linutronix.de > [1] https://lkml.kernel.org/r/20190607162349.18199-1-john.ogn...@linutronix.de > [2] https://lkml.kernel.org/r/2019072701.11260-1-john.ogn...@linutronix.de > [3] https://lkml.kernel.org/r/20190807222634.1723-1-john.ogn...@linutronix.de > [4] https://lkml.kernel.org/r/20191128015235.12940-1-john.ogn...@linutronix.de > [5] https://github.com/crash-utility/crash > [6] https://lkml.kernel.org/r/87k1acz5rx@linutronix.de > [7] https://lkml.kernel.org/r/20191007120134.ciywr3wale4gx...@pathway.suse.cz > > John Ogness (2): > printk: add lockless buffer > printk: use the lockless ringbuffer > > include/linux/kmsg_dump.h |2 - > kernel/printk/Makefile|1 + > kernel/printk/printk.c| 836 +- > kernel/printk/printk_ringbuffer.c | 1370 + > kernel/printk/printk_ringbuffer.h | 328 +++ > 5 files changed, 2114 insertions(+), 423 deletions(-) > create mode 100644 kernel/printk/printk_ringbuffer.c > create mode 100644 kernel/printk/printk_ringbuffer.h > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
> On 2020-02-05, lijiang wrote: >> Do you have any suggestions about the size of CONFIG_LOG_* and >> CONFIG_PRINTK_* options by default? > > The new printk implementation consumes more than double the memory that > the current printk implementation requires. This is because dictionaries > and meta-data are now stored separately. > > If the old defaults (LOG_BUF_SHIFT=17 LOG_CPU_MAX_BUF_SHIFT=12) were > chosen because they are maximally acceptable defaults, then the defaults > should be reduced by 1 so that the final size is "similar" to the > current implementation. > > If instead the defaults are left as-is, a machine with less than 64 CPUs > will reserve 336KiB for printk information (128KiB text, 128KiB > dictionary, 80KiB meta-data). > > It might also be desirable to reduce the dictionary size (maybe 1/4 the > size of text?). However, since the new printk implementation allows for > non-intrusive dictionaries, we might see their usage increase and start > to be as large as the messages themselves. > > John Ogness > Thanks for the explanation in detail. Lianbo ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
在 2020年02月05日 23:48, John Ogness 写道: > On 2020-02-05, Sergey Senozhatsky wrote: >> 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> >> 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> > > The problem was due to an uninitialized pointer. > > Very recently the ringbuffer API was expanded so that it could > optionally count lines in a record. This made it possible for me to > implement record_print_text_inline(), which can do all the kmsg_dump > multi-line madness without requiring a temporary buffer. Rather than > passing an extra argument around for the optional line count, I added > the text_line_count pointer to the printk_record struct. And since line > counting is rarely needed, it is only performed if text_line_count is > non-NULL. > > I oversaw that devkmsg_open() setup a printk_record and so I did not see > to add the extra NULL initialization of text_line_count. There should be > be an initializer function/macro to avoid this danger. > Good findings. Thanks for the quick fixup, it works well. Lianbo > John Ogness > > The quick fixup: > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index d0d24ee1d1f4..5ad67ff60cd9 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file > *file) > user->record.text_buf_size = sizeof(user->text_buf); > user->record.dict_buf = >dict_buf[0]; > user->record.dict_buf_size = sizeof(user->dict_buf); > + user->record.text_line_count = NULL; > > logbuf_lock_irq(); > user->seq = prb_first_seq(prb); > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 16:48), John Ogness wrote: > On 2020-02-05, Sergey Senozhatsky wrote: > > 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> > > 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> > > The problem was due to an uninitialized pointer. > > Very recently the ringbuffer API was expanded so that it could > optionally count lines in a record. This made it possible for me to > implement record_print_text_inline(), which can do all the kmsg_dump > multi-line madness without requiring a temporary buffer. Rather than > passing an extra argument around for the optional line count, I added > the text_line_count pointer to the printk_record struct. And since line > counting is rarely needed, it is only performed if text_line_count is > non-NULL. > > I oversaw that devkmsg_open() setup a printk_record and so I did not see > to add the extra NULL initialization of text_line_count. There should be > be an initializer function/macro to avoid this danger. > > John Ogness > > The quick fixup: > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index d0d24ee1d1f4..5ad67ff60cd9 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file > *file) > user->record.text_buf_size = sizeof(user->text_buf); > user->record.dict_buf = >dict_buf[0]; > user->record.dict_buf_size = sizeof(user->dict_buf); > + user->record.text_line_count = NULL; > > logbuf_lock_irq(); > user->seq = prb_first_seq(prb); Yes. That should do. It seems that /dev/kmsg reads/writes happen very early in my system and all the backtraces I saw were from completely unrelated paths - either a NULL deref at sys_clone()->do_fork()->copy_creds()->prepare_creads(), or general protection fault in sys_keyctl()->join_session_keyring()->prepare_creds(), or some weird crashes in ext4. And so on. I see some more unexplainable lockups on one on my test boards, but I can't provide more details at this time. Might not be related to the patch set. Need to investigate further. -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On Wed, 2020-02-05 at 16:48 +0100, John Ogness wrote: > On 2020-02-05, Sergey Senozhatsky wrote: > > 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> > > 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> > > The problem was due to an uninitialized pointer. > > Very recently the ringbuffer API was expanded so that it could > optionally count lines in a record. This made it possible for me to > implement record_print_text_inline(), which can do all the kmsg_dump > multi-line madness without requiring a temporary buffer. Rather than > passing an extra argument around for the optional line count, I added > the text_line_count pointer to the printk_record struct. And since line > counting is rarely needed, it is only performed if text_line_count is > non-NULL. > > I oversaw that devkmsg_open() setup a printk_record and so I did not see > to add the extra NULL initialization of text_line_count. There should be > be an initializer function/macro to avoid this danger. > > John Ogness > > The quick fixup: > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c [] > @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file > *file) > user->record.text_buf_size = sizeof(user->text_buf); > user->record.dict_buf = >dict_buf[0]; > user->record.dict_buf_size = sizeof(user->dict_buf); > + user->record.text_line_count = NULL; Probably better to change the kmalloc to kzalloc. user = kzalloc(sizeof(struct devkmsg_user), GFP_KERNEL); ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-05, lijiang wrote: > Do you have any suggestions about the size of CONFIG_LOG_* and > CONFIG_PRINTK_* options by default? The new printk implementation consumes more than double the memory that the current printk implementation requires. This is because dictionaries and meta-data are now stored separately. If the old defaults (LOG_BUF_SHIFT=17 LOG_CPU_MAX_BUF_SHIFT=12) were chosen because they are maximally acceptable defaults, then the defaults should be reduced by 1 so that the final size is "similar" to the current implementation. If instead the defaults are left as-is, a machine with less than 64 CPUs will reserve 336KiB for printk information (128KiB text, 128KiB dictionary, 80KiB meta-data). It might also be desirable to reduce the dictionary size (maybe 1/4 the size of text?). However, since the new printk implementation allows for non-intrusive dictionaries, we might see their usage increase and start to be as large as the messages themselves. John Ogness ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-05, Sergey Senozhatsky wrote: > 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> > 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> The problem was due to an uninitialized pointer. Very recently the ringbuffer API was expanded so that it could optionally count lines in a record. This made it possible for me to implement record_print_text_inline(), which can do all the kmsg_dump multi-line madness without requiring a temporary buffer. Rather than passing an extra argument around for the optional line count, I added the text_line_count pointer to the printk_record struct. And since line counting is rarely needed, it is only performed if text_line_count is non-NULL. I oversaw that devkmsg_open() setup a printk_record and so I did not see to add the extra NULL initialization of text_line_count. There should be be an initializer function/macro to avoid this danger. John Ogness The quick fixup: diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index d0d24ee1d1f4..5ad67ff60cd9 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -883,6 +883,7 @@ static int devkmsg_open(struct inode *inode, struct file *file) user->record.text_buf_size = sizeof(user->text_buf); user->record.dict_buf = >dict_buf[0]; user->record.dict_buf_size = sizeof(user->dict_buf); + user->record.text_line_count = NULL; logbuf_lock_irq(); user->seq = prb_first_seq(prb); ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 10:00), John Ogness wrote: > On 2020-02-05, Sergey Senozhatsky wrote: > So there is a General protection fault. That's the type of a > problem that kills the boot for me as well (different backtrace, > tho). > >>> > >>> Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) > >>> enabled? > >> > >> Yes. These two options are enabled. > >> > >> CONFIG_RELOCATABLE=y > >> CONFIG_RANDOMIZE_BASE=y > > > > So KASLR kills the boot for me. So does KASAN. > > Sergey, thanks for looking into this already! > So I hacked the system a bit. 3BUG: KASAN: wild-memory-access in copy_data+0x129/0x220> 3Write of size 4 at addr 5a5a5a5a5a5a5a5a by task cat/474> Call Trace:> dump_stack+0x76/0xa0> ? copy_data+0x129/0x220> __kasan_report.cold+0x5/0x3b> ? get_page_from_freelist+0x1224/0x1490> ? copy_data+0x129/0x220> copy_data+0x129/0x220> _prb_read_valid+0x1a0/0x330> ? prb_first_seq+0xe0/0xe0> ? __might_sleep+0x2f/0xd0> ? __zone_watermark_ok+0x180/0x180> ? ___might_sleep+0xbe/0xe0> prb_read_valid+0x4f/0x60> ? _prb_read_valid+0x330/0x330> devkmsg_read+0x12e/0x3d0> ? __mod_node_page_state+0x1a/0xa0> ? info_print_ext_header.constprop.0+0x120/0x120> ? __lru_cache_add+0x16c/0x190> ? __handle_mm_fault+0x1097/0x1f60> vfs_read+0xdc/0x200> ksys_read+0xa0/0x130> ? kernel_write+0xb0/0xb0> ? up_read+0x56/0x130> do_syscall_64+0xa0/0x520> ? syscall_return_slowpath+0x210/0x210> ? do_page_fault+0x399/0x4fa> entry_SYSCALL_64_after_hwframe+0x44/0xa9> RIP: 0033:0x7ff5f39813f2> Code: c0 e9 c2 fe ff ff 50 48 8d 3d 9a 0d 0a 00 e8 95 ed 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24> RSP: 002b:7ffc47b3ee58 EFLAGS: 024> c ORIG_RAX: > RAX: ffda RBX: 0002 RCX: 7ff5f39813f2> RDX: 0002 RSI: 7ff5f3588000 RDI: 0003> RBP: 7ff5f3588000 R08: 7ff5f3587010 R09: > R10: 0022 R11: 0246 R12: 55f9c8a81c00> R13: 0003 R14: 0002 R15: 0002> -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
> On 2020-02-05, Sergey Senozhatsky wrote: > So there is a General protection fault. That's the type of a > problem that kills the boot for me as well (different backtrace, > tho). Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) enabled? >>> >>> Yes. These two options are enabled. >>> >>> CONFIG_RELOCATABLE=y >>> CONFIG_RANDOMIZE_BASE=y >> >> So KASLR kills the boot for me. So does KASAN. > > Sergey, thanks for looking into this already! > >> John, do you see any of these problems on your test machine? > > For x86 I have only been using qemu. (For hardware tests I use arm64-smp > in order to verify memory barriers.) With qemu-x86_64 I am unable to > reproduce the problem. > > Lianbo, thanks for the report. Can you share your boot args? Anything > special in there (like log_buf_len=, earlyprintk, etc)? > Thanks for your response. Here is my kernel command line: Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.5.0-rc7+ root=/dev/mapper/intel--wildcatpass--07-root ro crashkernel=512M resume=/dev/mapper/intel--wildcatpass--07-swap rd.lvm.lv=intel-wildcatpass-07/root rd.lvm.lv=intel-wildcatpass-07/swap console=ttyS0,115200n81 BTW: Actually, I put the complete kernel log in my last email reply, you could check the attachment if needed. > Also, could you share your CONFIG_LOG_* and CONFIG_PRINTK_* options? > Sure. Please refer to it. [root@intel-wildcatpass-07 linux]# grep -nr "CONFIG_LOG_" .config 134:CONFIG_LOG_BUF_SHIFT=20 135:CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 [root@intel-wildcatpass-07 linux]# grep -nr "CONFIG_PRINTK_" .config 136:CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 207:CONFIG_PRINTK_NMI=y 7758:CONFIG_PRINTK_TIME=y 7759:# CONFIG_PRINTK_CALLER is not set Do you have any suggestions about the size of CONFIG_LOG_* and CONFIG_PRINTK_* options by default? Thanks. Lianbo > I will move to bare metal x86_64 and hopefully see it as well. > > John > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
> On (20/02/05 13:38), lijiang wrote: >>> On (20/02/05 13:48), Sergey Senozhatsky wrote: On (20/02/05 12:25), lijiang wrote: > > [..] > So there is a General protection fault. That's the type of a problem that kills the boot for me as well (different backtrace, tho). >>> >>> Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) enabled? >>> >> >> Yes. These two options are enabled. >> >> CONFIG_RELOCATABLE=y >> CONFIG_RANDOMIZE_BASE=y > > So KASLR kills the boot for me. So does KASAN. > For my side, after adding the option 'nokaslr' to kernel command line, I still have the previously mentioned problem, finally, kernel failed to boot. Thanks. > John, do you see any of these problems on your test machine? > > -ss > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 10:00), John Ogness wrote: > On 2020-02-05, Sergey Senozhatsky wrote: > So there is a General protection fault. That's the type of a > problem that kills the boot for me as well (different backtrace, > tho). > >>> > >>> Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) > >>> enabled? > >> > >> Yes. These two options are enabled. > >> > >> CONFIG_RELOCATABLE=y > >> CONFIG_RANDOMIZE_BASE=y > > > > So KASLR kills the boot for me. So does KASAN. > > Sergey, thanks for looking into this already! Hey, no prob! I can't see how and why that would be KASLR related, and most likely it's not. Probably we just hit some fault sooner with it enabled. So far it seems that reads from /dev/kmsg are causing problems on my laptop, but it's a bit hard to debug. Nothing printk-related in my boot params. -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On 2020-02-05, Sergey Senozhatsky wrote: So there is a General protection fault. That's the type of a problem that kills the boot for me as well (different backtrace, tho). >>> >>> Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) >>> enabled? >> >> Yes. These two options are enabled. >> >> CONFIG_RELOCATABLE=y >> CONFIG_RANDOMIZE_BASE=y > > So KASLR kills the boot for me. So does KASAN. Sergey, thanks for looking into this already! > John, do you see any of these problems on your test machine? For x86 I have only been using qemu. (For hardware tests I use arm64-smp in order to verify memory barriers.) With qemu-x86_64 I am unable to reproduce the problem. Lianbo, thanks for the report. Can you share your boot args? Anything special in there (like log_buf_len=, earlyprintk, etc)? Also, could you share your CONFIG_LOG_* and CONFIG_PRINTK_* options? I will move to bare metal x86_64 and hopefully see it as well. John ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 13:38), lijiang wrote: > > On (20/02/05 13:48), Sergey Senozhatsky wrote: > >> On (20/02/05 12:25), lijiang wrote: [..] > >> > >> So there is a General protection fault. That's the type of a problem that > >> kills the boot for me as well (different backtrace, tho). > > > > Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) enabled? > > > > Yes. These two options are enabled. > > CONFIG_RELOCATABLE=y > CONFIG_RANDOMIZE_BASE=y So KASLR kills the boot for me. So does KASAN. John, do you see any of these problems on your test machine? -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
> On (20/02/05 13:48), Sergey Senozhatsky wrote: >> On (20/02/05 12:25), lijiang wrote: >> [..] >>> [ 42.111004] Kernel Offset: 0x1f00 from 0x8100 >>> (relocation range: 0x8000-0xbfff) >>> [ 42.111005] general protection fault: [#1] SMP PTI >>> [ 42.111005] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted >>> 5.5.0-rc7+ #4 >>> [ 42.111005] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS >>> SE5C610.86B.01.01.6024.071720181717 07/17/2018 >>> [ 42.111006] RIP: 0010:copy_data+0xf2/0x1e0 >>> [ 42.111006] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c >>> 89 cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 >>> <41> c7 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00 >>> [ 42.111007] RSP: 0018:bbe207a7bd80 EFLAGS: 00010002 >>> [ 42.111007] RAX: a075d44ca000 RBX: 00a8 RCX: >>> fff000b0 >>> [ 42.111008] RDX: 00a8 RSI: 0f01 RDI: >>> a1456e00 >>> [ 42.111008] RBP: 0801364600307073 R08: 2000 R09: >>> 0801364600307073 >>> [ 42.111008] R10: fff0 R11: 00a8 R12: >>> a1e98330 >>> [ 42.111009] R13: d7efbe00 R14: 00a8 R15: >>> c000 >>> [ 42.111009] FS: 7f7c5642a980() GS:a075df5c() >>> knlGS: >>> [ 42.111010] CS: 0010 DS: ES: CR0: 80050033 >>> [ 42.111010] CR2: 7ffe95f4c4c0 CR3: 00084fbfc004 CR4: >>> 003606e0 >>> [ 42.111011] DR0: DR1: DR2: >>> >>> [ 42.111011] DR3: DR6: fffe0ff0 DR7: >>> 0400 >>> [ 42.111012] Call Trace: >>> [ 42.111012] _prb_read_valid+0xd8/0x190 >>> [ 42.111012] prb_read_valid+0x15/0x20 >>> [ 42.111013] devkmsg_read+0x9d/0x2a0 >>> [ 42.111013] vfs_read+0x91/0x140 >>> [ 42.111013] ksys_read+0x59/0xd0 >>> [ 42.111014] do_syscall_64+0x55/0x1b0 >>> [ 42.111014] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>> [ 42.111014] RIP: 0033:0x7f7c55740b62 >>> [ 42.111015] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f >>> 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 >>> <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 >>> [ 42.111015] RSP: 002b:7ffe95f4c4a8 EFLAGS: 0246 ORIG_RAX: >>> >>> [ 42.111016] RAX: ffda RBX: 7ffe95f4e500 RCX: >>> 7f7c55740b62 >>> [ 42.111016] RDX: 2000 RSI: 7ffe95f4c4b0 RDI: >>> 0008 >>> [ 42.111017] RBP: R08: 0100 R09: >>> 0003 >>> [ 42.111017] R10: 0100 R11: 0246 R12: >>> 7ffe95f4c4b0 >> >> So there is a General protection fault. That's the type of a problem that >> kills the boot for me as well (different backtrace, tho). > > Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) enabled? > Yes. These two options are enabled. CONFIG_RELOCATABLE=y CONFIG_RANDOMIZE_BASE=y Thanks. > -ss > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 13:48), Sergey Senozhatsky wrote: > On (20/02/05 12:25), lijiang wrote: > [..] > > [ 42.111004] Kernel Offset: 0x1f00 from 0x8100 > > (relocation range: 0x8000-0xbfff) > > [ 42.111005] general protection fault: [#1] SMP PTI > > [ 42.111005] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted > > 5.5.0-rc7+ #4 > > [ 42.111005] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS > > SE5C610.86B.01.01.6024.071720181717 07/17/2018 > > [ 42.111006] RIP: 0010:copy_data+0xf2/0x1e0 > > [ 42.111006] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c > > 89 cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 > > <41> c7 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00 > > [ 42.111007] RSP: 0018:bbe207a7bd80 EFLAGS: 00010002 > > [ 42.111007] RAX: a075d44ca000 RBX: 00a8 RCX: > > fff000b0 > > [ 42.111008] RDX: 00a8 RSI: 0f01 RDI: > > a1456e00 > > [ 42.111008] RBP: 0801364600307073 R08: 2000 R09: > > 0801364600307073 > > [ 42.111008] R10: fff0 R11: 00a8 R12: > > a1e98330 > > [ 42.111009] R13: d7efbe00 R14: 00a8 R15: > > c000 > > [ 42.111009] FS: 7f7c5642a980() GS:a075df5c() > > knlGS: > > [ 42.111010] CS: 0010 DS: ES: CR0: 80050033 > > [ 42.111010] CR2: 7ffe95f4c4c0 CR3: 00084fbfc004 CR4: > > 003606e0 > > [ 42.111011] DR0: DR1: DR2: > > > > [ 42.111011] DR3: DR6: fffe0ff0 DR7: > > 0400 > > [ 42.111012] Call Trace: > > [ 42.111012] _prb_read_valid+0xd8/0x190 > > [ 42.111012] prb_read_valid+0x15/0x20 > > [ 42.111013] devkmsg_read+0x9d/0x2a0 > > [ 42.111013] vfs_read+0x91/0x140 > > [ 42.111013] ksys_read+0x59/0xd0 > > [ 42.111014] do_syscall_64+0x55/0x1b0 > > [ 42.111014] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 42.111014] RIP: 0033:0x7f7c55740b62 > > [ 42.111015] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f > > 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 > > <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 > > [ 42.111015] RSP: 002b:7ffe95f4c4a8 EFLAGS: 0246 ORIG_RAX: > > > > [ 42.111016] RAX: ffda RBX: 7ffe95f4e500 RCX: > > 7f7c55740b62 > > [ 42.111016] RDX: 2000 RSI: 7ffe95f4c4b0 RDI: > > 0008 > > [ 42.111017] RBP: R08: 0100 R09: > > 0003 > > [ 42.111017] R10: 0100 R11: 0246 R12: > > 7ffe95f4c4b0 > > So there is a General protection fault. That's the type of a problem that > kills the boot for me as well (different backtrace, tho). Do you have CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE (KASLR) enabled? -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 12:25), lijiang wrote: [..] > [ 42.111004] Kernel Offset: 0x1f00 from 0x8100 (relocation > range: 0x8000-0xbfff) > [ 42.111005] general protection fault: [#1] SMP PTI > [ 42.111005] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted 5.5.0-rc7+ > #4 > [ 42.111005] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS > SE5C610.86B.01.01.6024.071720181717 07/17/2018 > [ 42.111006] RIP: 0010:copy_data+0xf2/0x1e0 > [ 42.111006] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c 89 > cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 <41> c7 > 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00 > [ 42.111007] RSP: 0018:bbe207a7bd80 EFLAGS: 00010002 > [ 42.111007] RAX: a075d44ca000 RBX: 00a8 RCX: > fff000b0 > [ 42.111008] RDX: 00a8 RSI: 0f01 RDI: > a1456e00 > [ 42.111008] RBP: 0801364600307073 R08: 2000 R09: > 0801364600307073 > [ 42.111008] R10: fff0 R11: 00a8 R12: > a1e98330 > [ 42.111009] R13: d7efbe00 R14: 00a8 R15: > c000 > [ 42.111009] FS: 7f7c5642a980() GS:a075df5c() > knlGS: > [ 42.111010] CS: 0010 DS: ES: CR0: 80050033 > [ 42.111010] CR2: 7ffe95f4c4c0 CR3: 00084fbfc004 CR4: > 003606e0 > [ 42.111011] DR0: DR1: DR2: > > [ 42.111011] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 42.111012] Call Trace: > [ 42.111012] _prb_read_valid+0xd8/0x190 > [ 42.111012] prb_read_valid+0x15/0x20 > [ 42.111013] devkmsg_read+0x9d/0x2a0 > [ 42.111013] vfs_read+0x91/0x140 > [ 42.111013] ksys_read+0x59/0xd0 > [ 42.111014] do_syscall_64+0x55/0x1b0 > [ 42.111014] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 42.111014] RIP: 0033:0x7f7c55740b62 > [ 42.111015] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f > 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 <48> 3d > 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 > [ 42.111015] RSP: 002b:7ffe95f4c4a8 EFLAGS: 0246 ORIG_RAX: > > [ 42.111016] RAX: ffda RBX: 7ffe95f4e500 RCX: > 7f7c55740b62 > [ 42.111016] RDX: 2000 RSI: 7ffe95f4c4b0 RDI: > 0008 > [ 42.111017] RBP: R08: 0100 R09: > 0003 > [ 42.111017] R10: 0100 R11: 0246 R12: > 7ffe95f4c4b0 So there is a General protection fault. That's the type of a problem that kills the boot for me as well (different backtrace, tho). -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 0/2] printk: replace ringbuffer
On (20/02/05 12:25), lijiang wrote: > Hi, John Ogness > > Thank you for improving the patch series and making great efforts. > > I'm not sure if I missed anything else. Or are there any other related > patches to be applied? > > After applying this patch series, NMI watchdog detected a hard lockup, which > caused that kernel can not boot, please refer to > the following call trace. And I put the complete kernel log in the attachment. I'm also having some problems running the code on my laptop. But may be I did something wrong while applying patch 0002 (which didn't apply cleanly). Will look more. -ss ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH 0/2] printk: replace ringbuffer
Hello, After several RFC series [0][1][2][3][4], here is the first set of patches to rework the printk subsystem. This first set of patches only replace the existing ringbuffer implementation. No locking is removed. No semantics/behavior of printk are changed. The VMCOREINFO is updated, which will require changes to the external crash [5] tool. I will be preparing a patch to add support for the new VMCOREINFO. This series is in line with the agreements [6] made at the meeting during LPC2019 in Lisbon, with 1 exception: support for dictionaries will _not_ be discontinued [7]. Dictionaries are stored in a separate buffer so that they cannot interfere with the human-readable buffer. John Ogness [0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogn...@linutronix.de [1] https://lkml.kernel.org/r/20190607162349.18199-1-john.ogn...@linutronix.de [2] https://lkml.kernel.org/r/2019072701.11260-1-john.ogn...@linutronix.de [3] https://lkml.kernel.org/r/20190807222634.1723-1-john.ogn...@linutronix.de [4] https://lkml.kernel.org/r/20191128015235.12940-1-john.ogn...@linutronix.de [5] https://github.com/crash-utility/crash [6] https://lkml.kernel.org/r/87k1acz5rx@linutronix.de [7] https://lkml.kernel.org/r/20191007120134.ciywr3wale4gx...@pathway.suse.cz John Ogness (2): printk: add lockless buffer printk: use the lockless ringbuffer include/linux/kmsg_dump.h |2 - kernel/printk/Makefile|1 + kernel/printk/printk.c| 836 +- kernel/printk/printk_ringbuffer.c | 1370 + kernel/printk/printk_ringbuffer.h | 328 +++ 5 files changed, 2114 insertions(+), 423 deletions(-) create mode 100644 kernel/printk/printk_ringbuffer.c create mode 100644 kernel/printk/printk_ringbuffer.h -- 2.20.1 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec