Re: [Crash-utility] [PATCH] printk: add support for lockless ringbuffer

2020-11-24 Thread John Ogness
On 2020-11-20, HAGIO KAZUHITO(萩尾 一仁) wrote:
> I've updated John's RFC crash patch to match 5.10-rc4 kernel.
> Changes from the RFC patch:
> - followed the following kernel commits
> cfe2790b163a ("printk: move printk_info into separate array")
> 74caba7f2a06 ("printk: move dictionary keys to dev_printk_info")
> f35efc78add6 ("printk: remove dict ring")
> - moved the added members in offset_table and size_table to the end
>   of them
> - print offsets and sizes with "help -o" option
> - support "log -T" option

Thank you for this work. I've attached a followup "crash" patch to
correctly interpret the state value. With this followup patch applied,
this passes all my tests.

Note that support for "./crash --log ./vmcore" is still not
implemented. That can be done in a separate patch.

John Ogness

>From a825be51da728f668bfeb770f2a8cf1bc3ef6a23 Mon Sep 17 00:00:00 2001
From: John Ogness 
Date: Wed, 25 Nov 2020 05:27:53 +0106
Subject: [PATCH] printk: use committed/finalized state values

The ringbuffer entries use 2 state values (committed and finalized)
rather than a single flag to represent being available for reading.
Copy the definitions and state lookup function directly from the
kernel source and use the new states.

Signed-off-by: John Ogness 
Signed-off-by: Nikolay Borisov 
---
 printk.c | 48 +---
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/printk.c b/printk.c
index 7be7218..ac135fc 100644
--- a/printk.c
+++ b/printk.c
@@ -1,12 +1,6 @@
 #include "defs.h"
 #include 
 
-#define DESC_SV_BITS		(sizeof(unsigned long) * 8)
-#define DESC_COMMITTED_MASK	(1UL << (DESC_SV_BITS - 1))
-#define DESC_REUSE_MASK		(1UL << (DESC_SV_BITS - 2))
-#define DESC_FLAGS_MASK		(DESC_COMMITTED_MASK | DESC_REUSE_MASK)
-#define DESC_ID_MASK		(~DESC_FLAGS_MASK)
-
 /* convenience struct for passing many values to helper functions */
 struct prb_map {
 	char *prb;
@@ -21,6 +15,44 @@ struct prb_map {
 	char *text_data;
 };
 
+/*
+ * desc_state and DESC_* definitions taken from kernel source:
+ *
+ * kernel/printk/printk_ringbuffer.h
+ */
+
+/* The possible responses of a descriptor state-query. */
+enum desc_state {
+	desc_miss	=  -1,	/* ID mismatch (pseudo state) */
+	desc_reserved	= 0x0,	/* reserved, in use by writer */
+	desc_committed	= 0x1,	/* committed by writer, could get reopened */
+	desc_finalized	= 0x2,	/* committed, no further modification allowed */
+	desc_reusable	= 0x3,	/* free, not yet used by any writer */
+};
+
+#define DESC_SV_BITS		(sizeof(unsigned long) * 8)
+#define DESC_FLAGS_SHIFT	(DESC_SV_BITS - 2)
+#define DESC_FLAGS_MASK		(3UL << DESC_FLAGS_SHIFT)
+#define DESC_STATE(sv)		(3UL & (sv >> DESC_FLAGS_SHIFT))
+#define DESC_ID_MASK		(~DESC_FLAGS_MASK)
+#define DESC_ID(sv)		((sv) & DESC_ID_MASK)
+
+/*
+ * get_desc_state() taken from kernel source:
+ *
+ * kernel/printk/printk_ringbuffer.c
+ */
+
+/* Query the state of a descriptor. */
+static enum desc_state get_desc_state(unsigned long id,
+  unsigned long state_val)
+{
+	if (id != DESC_ID(state_val))
+		return desc_miss;
+
+	return DESC_STATE(state_val);
+}
+
 static void
 init_offsets(void)
 {
@@ -74,6 +106,7 @@ dump_record(struct prb_map *m, unsigned long id, int msg_flags)
 	unsigned short text_len;
 	unsigned long state_var;
 	unsigned int caller_id;
+	enum desc_state state;
 	unsigned char level;
 	unsigned long begin;
 	unsigned long next;
@@ -90,7 +123,8 @@ dump_record(struct prb_map *m, unsigned long id, int msg_flags)
 	/* skip non-committed record */
 	state_var = ULONG(desc + OFFSET(prb_desc_state_var) +
 			OFFSET(atomic_long_t_counter));
-	if ((state_var & DESC_FLAGS_MASK) != DESC_COMMITTED_MASK)
+	state = get_desc_state(id, state_var);
+	if (state != desc_committed && state != desc_finalized)
 		return;
 
 	info = m->infos + ((id % m->desc_ring_count) * SIZE(printk_info));
-- 
2.20.1

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility

[Crash-utility] [RFC PATCH 0/1] support lockless printk ringbuffer

2020-04-29 Thread John Ogness
Hi Kazu,

Here is a patch adding full support for the new lockless printk
ringbuffer as it is currently being proposed. Note that the latest
version has not yet been submitted to LKML. I was waiting until I
finished tests with crash(8) and makedumpfile(8).

The new ringbuffer will export all the necessary symbols, sizes,
and offsets in VMCOREINFO.

Note that I created a separate printk.c for the iteration logic.

Also note that I modified dwarf_info.c to support resolving
typedefs of typedefs. This was necessary in order to support
atomic_long_t and its "counter" member.

I don't expect you to take the patch as-is, but I hope it can
provide some positive ground work for moving forward.

John Ogness (1):
  printk: add support for lockless ringbuffer

 Makefile   |   2 +-
 dwarf_info.c   |  36 +-
 makedumpfile.c | 101 ++--
 makedumpfile.h |  25 +++
 printk.c   | 177 +
 5 files changed, 333 insertions(+), 8 deletions(-)
 create mode 100644 printk.c

-- 
2.20.1


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility



[Crash-utility] [RFC PATCH 1/1] printk: add support for lockless ringbuffer

2020-04-29 Thread John Ogness
Linux is moving to a new lockless ringbuffer. The new ringbuffer
is structured completely different to the previous iterations.
Add support for retrieving the ringbuffer from debug information
and/or using vmcoreinfo. The new ringbuffer is detected based on
the availability of the "prb" symbol.

Signed-off-by: John Ogness 
---
 Makefile   |   2 +-
 dwarf_info.c   |  36 +-
 makedumpfile.c | 101 ++--
 makedumpfile.h |  25 +++
 printk.c   | 177 +
 5 files changed, 333 insertions(+), 8 deletions(-)
 create mode 100644 printk.c

diff --git a/Makefile b/Makefile
index ef20672..dc4ae3e 100644
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
 endif
 
 SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h 
sadump_info.h
-SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c 
cache.c tools.c
+SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c 
cache.c tools.c printk.c
 OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
 SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c 
arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c
 OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
diff --git a/dwarf_info.c b/dwarf_info.c
index e42a9f5..543588b 100644
--- a/dwarf_info.c
+++ b/dwarf_info.c
@@ -614,6 +614,7 @@ search_structure(Dwarf_Die *die, int *found)
 {
int tag;
const char *name;
+   Dwarf_Die die_type;
 
/*
 * If we get to here then we don't have any more
@@ -622,9 +623,31 @@ search_structure(Dwarf_Die *die, int *found)
do {
tag  = dwarf_tag(die);
name = dwarf_diename(die);
-   if ((tag != DW_TAG_structure_type) || (!name)
-   || strcmp(name, dwarf_info.struct_name))
+   if ((!name) || strcmp(name, dwarf_info.struct_name))
+   continue;
+
+   if (tag == DW_TAG_typedef) {
+   if (!get_die_type(die, _type)) {
+   ERRMSG("Can't get CU die of DW_AT_type.\n");
+   break;
+   }
+
+   /* Resolve typedefs of typedefs. */
+   while ((tag = dwarf_tag(_type)) == DW_TAG_typedef) {
+   if (!get_die_type(_type, _type)) {
+   ERRMSG("Can't get CU die of 
DW_AT_type.\n");
+   return;
+   }
+   }
+
+   if (tag != DW_TAG_structure_type)
+   continue;
+   die = _type;
+
+   } else if (tag != DW_TAG_structure_type) {
continue;
+   }
+
/*
 * Skip if DW_AT_byte_size is not included.
 */
@@ -740,6 +763,15 @@ search_typedef(Dwarf_Die *die, int *found)
ERRMSG("Can't get CU die of DW_AT_type.\n");
break;
}
+
+   /* Resolve typedefs of typedefs. */
+   while ((tag = dwarf_tag(_type)) == DW_TAG_typedef) {
+   if (!get_die_type(_type, _type)) {
+   ERRMSG("Can't get CU die of 
DW_AT_type.\n");
+   return;
+   }
+   }
+
dwarf_info.struct_size = dwarf_bytesize(_type);
if (dwarf_info.struct_size <= 0)
continue;
diff --git a/makedumpfile.c b/makedumpfile.c
index f5860a1..6192ee7 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1555,6 +1555,7 @@ get_symbol_info(void)
SYMBOL_INIT(node_data, "node_data");
SYMBOL_INIT(pgdat_list, "pgdat_list");
SYMBOL_INIT(contig_page_data, "contig_page_data");
+   SYMBOL_INIT(prb, "prb");
SYMBOL_INIT(log_buf, "log_buf");
SYMBOL_INIT(log_buf_len, "log_buf_len");
SYMBOL_INIT(log_end, "log_end");
@@ -1971,16 +1972,46 @@ get_structure_info(void)
OFFSET_INIT(elf64_phdr.p_memsz, "elf64_phdr", "p_memsz");
 
SIZE_INIT(printk_log, "printk_log");
-   if (SIZE(printk_log) != NOT_FOUND_STRUCTURE) {
+   SIZE_INIT(printk_ringbuffer, "printk_ringbuffer");
+   if ((SIZE(printk_ringbuffer) != NOT_FOUND_STRUCTURE)) {
+   info->flag_use_printk_ringbuffer = TRUE;
+   info->flag_use_printk_log = FALSE;
+
+   OFFSET_INIT(printk_log.desc_ring, "printk_ringbuffer", 
"desc_ring");
+   OFFSET_INIT(printk_log.text_data_ring,

Re: [Crash-utility] new printk ringbuffer interface

2020-04-24 Thread John Ogness
On 2020-04-23, HAGIO KAZUHITO(萩尾 一仁)  wrote:
>> Should all struct sizes and field offsets be exported? It
>> would look something like this:
>>
>> VMCOREINFO_SYMBOL(prb);
>>
>> VMCOREINFO_STRUCT_SIZE(printk_ringbuffer);
>> VMCOREINFO_OFFSET(printk_ringbuffer, desc_ring);
>> VMCOREINFO_OFFSET(printk_ringbuffer, text_data_ring);
>> VMCOREINFO_OFFSET(printk_ringbuffer, dict_data_ring);
>> VMCOREINFO_OFFSET(printk_ringbuffer, fail);
>>
>> VMCOREINFO_STRUCT_SIZE(prb_desc_ring);
>> VMCOREINFO_OFFSET(prb_desc_ring, count_bits);
>> VMCOREINFO_OFFSET(prb_desc_ring, descs);
>> VMCOREINFO_OFFSET(prb_desc_ring, head_id);
>> VMCOREINFO_OFFSET(prb_desc_ring, tail_id);
>>
>> VMCOREINFO_STRUCT_SIZE(prb_desc);
>> VMCOREINFO_OFFSET(prb_desc, info);
>> VMCOREINFO_OFFSET(prb_desc, state_var);
>> VMCOREINFO_OFFSET(prb_desc, text_blk_lpos);
>> VMCOREINFO_OFFSET(prb_desc, dict_blk_lpos);
>>
>> VMCOREINFO_STRUCT_SIZE(prb_data_blk_lpos);
>> VMCOREINFO_OFFSET(prb_data_blk_lpos, begin);
>> VMCOREINFO_OFFSET(prb_data_blk_lpos, next);
>>
>> VMCOREINFO_STRUCT_SIZE(printk_info);
>> VMCOREINFO_OFFSET(printk_info, seq);
>> VMCOREINFO_OFFSET(printk_info, ts_nsec);
>> VMCOREINFO_OFFSET(printk_info, text_len);
>> VMCOREINFO_OFFSET(printk_info, dict_len);
>> VMCOREINFO_OFFSET(printk_info, caller_id);
>>
>> VMCOREINFO_STRUCT_SIZE(prb_data_ring);
>> VMCOREINFO_OFFSET(prb_data_ring, size_bits);
>> VMCOREINFO_OFFSET(prb_data_ring, data);
>> VMCOREINFO_OFFSET(prb_data_ring, head_id);
>> VMCOREINFO_OFFSET(prb_data_ring, tail_id);
>
> If there is no efficient way, we will need all of the entries in
> VMCOREINFO.

It seems like a lot to export everything, but I don't have a problem
with it. If we decide to export everything (which I expect we will need
to do), then I would change my crash(8) implementation to also rely only
on the VMCOREINFO. I see no point in having some implementations using
debug data and other implementations using VMCOREINFO data, if
VMCOREINFO has everything that is needed.

John Ogness


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility

[Crash-utility] [RFC PATCH 1/1] crash: printk: add support for lockless ringbuffer

2020-04-21 Thread John Ogness
Linux is moving to a new lockless ringbuffer. The new ringbuffer
is structured completely different to the previous iterations.
Add support for dumping the ringbuffer with the "log" command.
The new ringbuffer is detected based on the availability of
the "prb" symbol.

Signed-off-by: John Ogness 
---
 Makefile |   5 +
 defs.h   |  24 +
 kernel.c |   8 +-
 printk.c | 298 +++
 4 files changed, 334 insertions(+), 1 deletion(-)
 create mode 100644 printk.c

diff --git a/Makefile b/Makefile
index 7455410..cec8ede 100644
--- a/Makefile
+++ b/Makefile
@@ -61,6 +61,7 @@ VMWARE_HFILES=vmware_vmss.h
 
 CFILES=main.c tools.c global_data.c memory.c filesys.c help.c task.c \
kernel.c test.c gdb_interface.c configure.c net.c dev.c bpf.c \
+   printk.c \
alpha.c x86.c ppc.c ia64.c s390.c s390x.c s390dbf.c ppc64.c x86_64.c \
arm.c arm64.c mips.c sparc64.c \
extensions.c remote.c va_server.c va_server_v1.c symbols.c cmdline.c \
@@ -80,6 +81,7 @@ SOURCE_FILES=${CFILES} ${GENERIC_HFILES} ${MCORE_HFILES} \
 
 OBJECT_FILES=main.o tools.o global_data.o memory.o filesys.o help.o task.o \
build_data.o kernel.o test.o gdb_interface.o net.o dev.o bpf.o \
+   printk.o \
alpha.o x86.o ppc.o ia64.o s390.o s390x.o s390dbf.o ppc64.o x86_64.o \
arm.o arm64.o mips.o sparc64.o \
extensions.o remote.o va_server.o va_server_v1.o symbols.o cmdline.o \
@@ -363,6 +365,9 @@ task.o: ${GENERIC_HFILES} task.c
 kernel.o: ${GENERIC_HFILES} kernel.c
${CC} -c ${CRASH_CFLAGS} kernel.c ${WARNING_OPTIONS} ${WARNING_ERROR}
 
+printk.o: ${GENERIC_HFILES} printk.c
+   ${CC} -c ${CRASH_CFLAGS} printk.c ${WARNING_OPTIONS} ${WARNING_ERROR}
+
 gdb_interface.o: ${GENERIC_HFILES} gdb_interface.c
${CC} -c ${CRASH_CFLAGS} gdb_interface.c ${WARNING_OPTIONS} 
${WARNING_ERROR}
 
diff --git a/defs.h b/defs.h
index d8eda5e..253fabb 100644
--- a/defs.h
+++ b/defs.h
@@ -1906,12 +1906,30 @@ struct offset_table {/* stash of 
commonly-used offsets */
long rt_prio_array_queue;
long task_struct_rt;
long sched_rt_entity_run_list;
+   long log_seq;
long log_ts_nsec;
long log_len;
long log_text_len;
long log_dict_len;
long log_level;
long log_flags_level;
+   long log_caller_id;
+   long prb_desc_ring;
+   long prb_text_data_ring;
+   long prb_dict_data_ring;
+   long prb_fail;
+   long prb_desc_ring_count_bits;
+   long prb_desc_ring_descs;
+   long prb_desc_ring_head_id;
+   long prb_desc_ring_tail_id;
+   long prb_desc_info;
+   long prb_desc_state_var;
+   long prb_desc_text_blk_lpos;
+   long prb_desc_dict_blk_lpos;
+   long prb_data_blk_lpos_begin;
+   long prb_data_blk_lpos_next;
+   long prb_data_ring_size_bits;
+   long prb_data_ring_data;
long timekeeper_xtime_sec;
long neigh_table_hash_mask;
long sched_rt_entity_my_q;
@@ -1983,6 +2001,7 @@ struct offset_table {/* stash of 
commonly-used offsets */
long kmem_cache_cpu_cache;
long nsproxy_net_ns;
long atomic_t_counter;
+   long atomic_long_t_counter;
long percpu_counter_count;
long mm_struct_mm_count;
long task_struct_thread_reg29;
@@ -,6 +2241,11 @@ struct size_table { /* stash of commonly-used 
sizes */
long msg_queue;
long log;
long log_level;
+   long printk_ringbuffer;
+   long prb_desc_ring;
+   long prb_desc;
+   long prb_data_blk_lpos;
+   long prb_data_ring;
long rt_rq;
long task_group;
long vmap_area;
diff --git a/kernel.c b/kernel.c
index 7604fac..a9889ea 100644
--- a/kernel.c
+++ b/kernel.c
@@ -95,6 +95,7 @@ static ulong __dump_audit(char *);
 static void dump_audit(void);
 static char *vmcoreinfo_read_string(const char *);
 static void check_vmcoreinfo(void);
+void dump_lockless_record_log(int);
 
 
 /*
@@ -4956,6 +4957,11 @@ dump_log(int msg_flags)
struct syment *nsp;
int log_wrap, loglevel, log_buf_len;
 
+   if (kernel_symbol_exists("prb")) {
+   dump_lockless_record_log(msg_flags);
+   return;
+   }
+
if (kernel_symbol_exists("log_first_idx") && 
kernel_symbol_exists("log_next_idx")) {
dump_variable_length_record_log(msg_flags);
@@ -5196,7 +5202,7 @@ dump_log_entry(char *logptr, int msg_flags)
 }
 
 /* 
- *  Handle the new variable-length-record log_buf.
+ *  Handle the variable-length-record log_buf.
  */
 static void
 dump_variable_length_record_log(int msg_flags)
diff --git a/printk.c b/printk.c
new file mode 100644
index 000..e8aff4f
--- /dev/null
+++ b/printk.c
@@ -0,0 +1,298 @@
+#include "defs.h"
+#include 
+
+#define DESC_SV_BITS   (sizeof(unsigned 

[Crash-utility] [RFC PATCH 0/1] support lockless printk ringbuffer

2020-04-21 Thread John Ogness
Hi Dave,

I created a proof-of-concept patch to work with the new printk
ringbuffer (as it is currently being proposed). I create a separate
source file (printk.c) because of all the helper functions.

The code doesn't do much error checking if symbols were missing,
and it probably doesn't work unless the machine running crash(8)
has the same endian and pointer-size as the crashed machine. But
otherwise, it does work correctly.

The most important part I wanted to have implemented was the new
logic for record traversal and printing. Being one of the authors
for the new printk ringbuffer, implementing this was far easier
for me than for someone unfamiliar with the ringbuffer internals.

It is using the new "prb" symbol. I did not add VMCOREINFO
support.

Note that this is based on the PATCHv2 that I have queued for
posting to LKML, but as of right now have not yet posted.
Basically I am waiting for feedback from Kazuhito regarding my
VMCOREINFO query. (It will not work with previous iterations
of the new ringbuffer because the struct names have changed.)

I don't expect you to take the patch as-is, but I hope it can
provide some positive ground work for moving forward.

John Ogness (1):
  crash: printk: add support for lockless ringbuffer

 Makefile |   5 +
 defs.h   |  24 +
 kernel.c |   8 +-
 printk.c | 298 +++
 4 files changed, 334 insertions(+), 1 deletion(-)
 create mode 100644 printk.c

-- 
2.20.1


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility



[Crash-utility] new printk ringbuffer interface

2020-04-20 Thread John Ogness
Hello Dave,

You may or may not be aware that we are working on replacing [0] the
Linux printk ringbuffer. Rather than a single buffer containing a single
struct type, the new ringbuffer makes use of several different structs.

I am writing to ask your advice about how this should be exported for
the crash utility. Should all struct sizes and field offsets be
exported? It would look something like this:

VMCOREINFO_SYMBOL(prb);

VMCOREINFO_STRUCT_SIZE(printk_ringbuffer);
VMCOREINFO_OFFSET(printk_ringbuffer, desc_ring);
VMCOREINFO_OFFSET(printk_ringbuffer, text_data_ring);
VMCOREINFO_OFFSET(printk_ringbuffer, dict_data_ring);
VMCOREINFO_OFFSET(printk_ringbuffer, fail);

VMCOREINFO_STRUCT_SIZE(prb_desc_ring);
VMCOREINFO_OFFSET(prb_desc_ring, count_bits);
VMCOREINFO_OFFSET(prb_desc_ring, descs);
VMCOREINFO_OFFSET(prb_desc_ring, head_id);
VMCOREINFO_OFFSET(prb_desc_ring, tail_id);

VMCOREINFO_STRUCT_SIZE(prb_desc);
VMCOREINFO_OFFSET(prb_desc, info);
VMCOREINFO_OFFSET(prb_desc, state_var);
VMCOREINFO_OFFSET(prb_desc, text_blk_lpos);
VMCOREINFO_OFFSET(prb_desc, dict_blk_lpos);

VMCOREINFO_STRUCT_SIZE(prb_data_blk_lpos);
VMCOREINFO_OFFSET(prb_data_blk_lpos, begin);
VMCOREINFO_OFFSET(prb_data_blk_lpos, next);

VMCOREINFO_STRUCT_SIZE(printk_info);
VMCOREINFO_OFFSET(printk_info, seq);
VMCOREINFO_OFFSET(printk_info, ts_nsec);
VMCOREINFO_OFFSET(printk_info, text_len);
VMCOREINFO_OFFSET(printk_info, dict_len);
VMCOREINFO_OFFSET(printk_info, caller_id);

VMCOREINFO_STRUCT_SIZE(prb_data_ring);
VMCOREINFO_OFFSET(prb_data_ring, size_bits);
VMCOREINFO_OFFSET(prb_data_ring, data);
VMCOREINFO_OFFSET(prb_data_ring, head_id);
VMCOREINFO_OFFSET(prb_data_ring, tail_id);

Or would it be enough to just recognize the new "prb" symbol and have
all the structures defined in the crash utility? If the latter is
preferred, should some sort of version number be exported? Or is the
kernel version number enough?

I appreciate your feedback.

John Ogness

[0] lkml.kernel.org/r/20200128161948.8524-1-john.ogn...@linutronix.de


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility