Re: [PATCH 0/1] PCI/MSI: add NULL check before use of msi_desc
Hello Lorenzo, Thank you for telling me. I missed the patch. Best regards, Hiraku Toyooka 2018-01-16 18:30 GMT+09:00 Lorenzo Pieralisi : > On Tue, Jan 16, 2018 at 06:15:46PM +0900, Hiraku Toyooka wrote: >> Hello, >> >> I found a NULL pointer dereference in PCI/MSI when I tried to run kdump >> kernel on i.MX6(MCIMX6Q-SDB). This error occurs when masking MSI irq >> which does not have msi_desc. >> I added NULL check to avoid the error, and kdump worked fine. But I'm >> not sure this is correct way. What do you think about this fix? > > It has been reported and it is being handled: > > https://marc.info/?l=linux-kernel&m=151321815226439&w=2 > >> My environment: >> - Board: MCIMX6Q-SDB >> - Kernel: 4.15.0-rc5 (commit: 464e1d5f23) >>- used also as kdump kernel >>- CONFIG_CRASH_DUMP and CONFIG_DEBUG_INFO are enabled based on >> imx_v6_v7_defconfig >> - U-Boot: u-boot-fslc (2017.11+fslc branch) >>- built with meta-freescale (commit: bf7fd9cfe0) >> >> >> Console log in failure case (patch not applied): >> >> root@imx6qdlsabresd:~# cat /proc/cmdline >> console=ttymxc0,115200 root=PARTUUID=6c7357c5-02 rootwait rw quiet >> crashkernel=96M >> root@imx6qdlsabresd:~# kexec --type zImage -p /boot/zImage >> --dtb=/boot/imx6q-sabresd.dtb --append="console=ttymxc0,115200 >> root=/dev/mmcblk1p2 rootwait rw 3 maxcpus=1 reset_devices earlycon" >> root@imx6qdlsabresd:~# echo c > /proc/sysrq-trigger >> [ 27.590895] sysrq: SysRq : Trigger a crash >> [ 27.595250] Unable to handle kernel NULL pointer dereference at virtual >> address >> ...(snip)... >> [ 27.808001] Backtrace: >> [ 27.810502] [] (sysrq_handle_crash) from [] >> (__handle_sysrq+0xd8/0x258) >> [ 27.818877] r5:0063 r4:c101a6b0 >> [ 27.822489] [] (__handle_sysrq) from [] >> (write_sysrq_trigger+0x78/0x90) >> [ 27.830871] r10: r9:0002 r8: r7:e6490c00 r6: >> r5:01ced738 >> [ 27.838719] r4:0002 >> [ 27.841290] [] (write_sysrq_trigger) from [] >> (proc_reg_write+0x68/0x90) >> [ 27.849664] r5: r4:c04d2610 >> [ 27.853277] [] (proc_reg_write) from [] >> (__vfs_write+0x34/0x134) >> [ 27.861050] r9:0002 r8:01ced738 r7:0002 r6:e71a9f78 r5:c0297954 >> r4:e6a49cc0 >> [ 27.868824] [] (__vfs_write) from [] >> (vfs_write+0xa8/0x170) >> [ 27.876162] r9:0002 r8:01ced738 r7:e71a9f78 r6:01ced738 r5:e6a49cc0 >> r4:0002 >> [ 27.883936] [] (vfs_write) from [] >> (SyS_write+0x44/0x98) >> [ 27.891014] r9:0002 r8:01ced738 r7: r6: r5:e6a49cc0 >> r4:e6a49cc0 >> [ 27.898797] [] (SyS_write) from [] >> (ret_fast_syscall+0x0/0x28) >> [ 27.906395] r9:e71a8000 r8:c01081a4 r7:0004 r6:b6f7eda8 r5:01ced738 >> r4:0002 >> [ 27.914169] Code: e3a04000 e5835000 ee074f9a ebf11af2 (e5c45000) >> [ 27.920332] CPU 1 will stop doing anything useful since another CPU has >> crashed >> [ 27.920342] CPU 0 will stop doing anything useful since another CPU has >> crashed >> [ 27.920351] CPU 2 will stop doing anything useful since another CPU has >> crashed >> [ 27.949670] Unable to handle kernel NULL pointer dereference at virtual >> address 0028 >> [ 27.957798] pgd = c30fc51b >> [ 27.960529] [0028] *pgd=4a140831 >> [ 27.964144] Internal error: Oops: 17 [#2] SMP ARM >> [ 27.968869] Modules linked in: >> [ 27.971962] CPU: 3 PID: 399 Comm: sh Not tainted 4.15.0-rc5-g3630470 #15 >> [ 27.978685] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) >> [ 27.985248] PC is at msi_set_mask_bit+0x18/0x6c >> [ 27.989805] LR is at pci_msi_mask_irq+0x14/0x18 >> [ 27.994358] pc : []lr : []psr: a193 >> [ 28.000647] sp : e71a9bb0 ip : e71a9bc8 fp : e71a9bc4 >> [ 28.005892] r10: e000 r9 : e682e400 r8 : c101a72c >> [ 28.011140] r7 : e71a9c00 r6 : c102a504 r5 : 012f r4 : >> [ 28.017690] r3 : e642b400 r2 : 0001 r1 : 0001 r0 : e642b414 >> [ 28.024241] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment >> none >> [ 28.031486] Control: 10c5387d Table: 36ef804a DAC: 0051 >> [ 28.037255] Process sh (pid: 399, stack limit = 0x61f128fb) >> [ 28.042849] Stack: (0xe71a9bb0 to 0xe71aa000) >> ...(snip)... >> [ 28.334445] Backtrace: >> [ 28.336937] [] (msi_set_mask_bit) from [] >> (pci_msi_mask_irq+0x14/0x18) >> [ 28.345224] r5:012f r4:e642b400 >> [ 28.348844] [
[PATCH 0/1] PCI/MSI: add NULL check before use of msi_desc
[ 28.466423] 9e00: 0001 c10359a0 0004 0002 e71a9e54 [ 28.474627] 9e20: e71a9e30 e71a9e40 c0118694 c04d1aa8 6013 [ 28.481268] r8:0004 r7:e71a9e24 r6: r5:6013 r4:c04d1aa8 [ 28.488012] [] (sysrq_handle_crash) from [] (__handle_sysrq+0xd8/0x258) [ 28.496385] r5:0063 r4:c101a6b0 [ 28.45] [] (__handle_sysrq) from [] (write_sysrq_trigger+0x78/0x90) [ 28.508375] r10: r9:0002 r8: r7:e6490c00 r6: r5:01ced738 [ 28.516223] r4:0002 [ 28.518791] [] (write_sysrq_trigger) from [] (proc_reg_write+0x68/0x90) [ 28.527164] r5: r4:c04d2610 [ 28.530774] [] (proc_reg_write) from [] (__vfs_write+0x34/0x134) [ 28.538547] r9:0002 r8:01ced738 r7:0002 r6:e71a9f78 r5:c0297954 r4:e6a49cc0 [ 28.546320] [] (__vfs_write) from [] (vfs_write+0xa8/0x170) [ 28.553658] r9:0002 r8:01ced738 r7:e71a9f78 r6:01ced738 r5:e6a49cc0 r4:0002 [ 28.561433] [] (vfs_write) from [] (SyS_write+0x44/0x98) [ 28.568511] r9:0002 r8:01ced738 r7: r6: r5:e6a49cc0 r4:e6a49cc0 [ 28.576291] [] (SyS_write) from [] (ret_fast_syscall+0x0/0x28) [ 28.583890] r9:e71a8000 r8:c01081a4 r7:0004 r6:b6f7eda8 r5:01ced738 r4:0002 [ 28.591662] Code: e24cb004 e590300c e1a02001 e5934008 (e5d43028) [ 28.597788] ---[ end trace b7f10c526986d6ea ]--- [ 28.602430] Kernel panic - not syncing: Fatal exception [ 28.607716] ---[ end Kernel panic - not syncing: Fatal exception Console log in success case (patch applied): root@imx6qdlsabresd:~# cat /proc/cmdline console=ttymxc0,115200 root=PARTUUID=6c7357c5-02 rootwait rw quiet crashkernel=96M root@imx6qdlsabresd:~# kexec --type zImage -p /boot/zImage --dtb=/boot/imx6q-sabresd.dtb --append="console=ttymxc0,115200 root=/dev/mmcblk1p2 rootwait rw 3 maxcpus=1 reset_devices earlycon" root@imx6qdlsabresd:~# echo c > /proc/sysrq-trigger [ 42.951366] sysrq: SysRq : Trigger a crash [ 42.955711] Unable to handle kernel NULL pointer dereference at virtual address ...(snip)... [ 43.167849] Backtrace: [ 43.170314] [] (sysrq_handle_crash) from [] (__handle_sysrq+0xd8/0x258) [ 43.178671] r5:0063 r4:c101a6b0 [ 43.182258] [] (__handle_sysrq) from [] (write_sysrq_trigger+0x78/0x90) [ 43.190617] r10: r9:0002 r8: r7:e6490c00 r6: r5:003b2738 [ 43.198450] r4:0002 [ 43.200995] [] (write_sysrq_trigger) from [] (proc_reg_write+0x68/0x90) [ 43.209350] r5: r4:c04d2614 [ 43.212937] [] (proc_reg_write) from [] (__vfs_write+0x34/0x134) [ 43.220688] r9:0002 r8:003b2738 r7:0002 r6:e6f33f78 r5:c0297954 r4:e6b47680 [ 43.228439] [] (__vfs_write) from [] (vfs_write+0xa8/0x170) [ 43.235756] r9:0002 r8:003b2738 r7:e6f33f78 r6:003b2738 r5:e6b47680 r4:0002 [ 43.243508] [] (vfs_write) from [] (SyS_write+0x44/0x98) [ 43.250564] r9:0002 r8:003b2738 r7: r6: r5:e6b47680 r4:e6b47680 [ 43.258320] [] (SyS_write) from [] (ret_fast_syscall+0x0/0x28) [ 43.265896] r9:e6f32000 r8:c01081a4 r7:0004 r6:b6f0eda8 r5:003b2738 r4:0002 [ 43.273647] Code: e3a04000 e5835000 ee074f9a ebf11af1 (e5c45000) [ 43.279767] CPU 3 will stop doing anything useful since another CPU has crashed [ 43.279771] CPU 2 will stop doing anything useful since another CPU has crashed [ 43.279775] CPU 0 will stop doing anything useful since another CPU has crashed [ 43.301962] Loading crashdump kernel... [ 43.305886] Bye! [0.00] Booting Linux on physical CPU 0x1 [0.00] Linux version 4.15.0-rc5-g13f566e (miracle@ar) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4)) #14 SMP Tue Jan 16 06:33:27 UTC 2018 Hiraku Toyooka (1): PCI/MSI: add NULL check before use of msi_desc drivers/pci/msi.c | 3 +++ 1 file changed, 3 insertions(+) -- 2.7.4
[PATCH 1/1] PCI/MSI: add NULL check before use of msi_desc
On starting kdump kernel in arm, arm64 and powerpc, chip->irq_mask() is called for each irq_data to mask interrupts. For MSI irqs, pci_msi_mask_irq() is called even when the irq_data doesn't have a link to msi_desc. This results in a NULL pointer dereference. This patch fixes the above error by avoiding NULL msi_desc use. Signed-off-by: Hiraku Toyooka Cc: Bjorn Helgaas Cc: Richard Zhu Cc: Lucas Stach Cc: Jingoo Han Cc: Joao Pinto Cc: Lorenzo Pieralisi Cc: Russell King Cc: Will Deacon --- drivers/pci/msi.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index e066071..a6ee274 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag) { struct msi_desc *desc = irq_data_get_msi_desc(data); + if (!desc) + return; + if (desc->msi_attrib.is_msix) { msix_mask_irq(desc, flag); readl(desc->mask_base); /* Flush write to device */ -- 2.7.4
[PATCH 1/5] ramoops: use persistent_ram_free() instead of kfree() for freeing prz
persistent_ram_zone(=prz) structures are allocated by persistent_ram_new(), which includes vmap() or ioremap(). But they are currently freed by kfree(). We should use persistent_ram_free() to correct this asymmetry usage. Signed-off-by: Hiraku Toyooka Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: Tony Luck Cc: linux-kernel@vger.kernel.org --- fs/pstore/ram.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index 6c26c4d..6363768 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -567,11 +567,11 @@ fail_buf: kfree(cxt->pstore.buf); fail_clear: cxt->pstore.bufsize = 0; - kfree(cxt->mprz); + persistent_ram_free(cxt->mprz); fail_init_mprz: - kfree(cxt->fprz); + persistent_ram_free(cxt->fprz); fail_init_fprz: - kfree(cxt->cprz); + persistent_ram_free(cxt->cprz); fail_init_cprz: ramoops_free_przs(cxt); fail_out: -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/5] pstore: support multiple pmsg instances
This patch enables pmsg to deal with multiple instances. One possible use is content priority control on limited persistent store space. By using different buffers, we can prevent important messages from being overwritten by less important messages. When a pstore backend module specifies the number of instances (num_pmsg) in pstore_info, multiple /dev/pmsg[ID] appear. (ID is an integer starting at 0. It corresponds to minor number of the each char device.) Writes to each /dev/pmsg[ID] are isolated each other. After reboot, the contents are available in /sys/fs/pstore/pmsg-[backend]-[ID]. Signed-off-by: Hiraku Toyooka Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: Tony Luck Cc: linux-kernel@vger.kernel.org --- fs/pstore/pmsg.c | 20 ++-- include/linux/pstore.h |1 + 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/fs/pstore/pmsg.c b/fs/pstore/pmsg.c index feb5dd2..3467d63 100644 --- a/fs/pstore/pmsg.c +++ b/fs/pstore/pmsg.c @@ -50,8 +50,8 @@ static ssize_t write_pmsg(struct file *file, const char __user *buf, vfree(buffer); return -EFAULT; } - psinfo->write_buf(PSTORE_TYPE_PMSG, 0, &id, 0, buffer, 0, c, - psinfo); + psinfo->write_buf(PSTORE_TYPE_PMSG, 0, &id, + iminor(file->f_inode), buffer, 0, c, psinfo); i += c; } @@ -83,6 +83,7 @@ static char *pmsg_devnode(struct device *dev, umode_t *mode) void pstore_register_pmsg(void) { struct device *pmsg_device; + int i = 0; pmsg_major = register_chrdev(0, PMSG_NAME, &pmsg_fops); if (pmsg_major < 0) { @@ -103,9 +104,24 @@ void pstore_register_pmsg(void) pr_err("failed to create device\n"); goto err_device; } + + /* allocate additional /dev/pmsg[ID] */ + for (i = 1; i < psinfo->num_pmsg; i++) { + struct device *pmsg_device; + + pmsg_device = device_create(pmsg_class, NULL, + MKDEV(pmsg_major, i), NULL, "%s%d", + PMSG_NAME, i); + if (IS_ERR(pmsg_device)) { + pr_err("failed to create device\n"); + goto err_device; + } + } return; err_device: + for (i--; i >= 0; i--) + device_destroy(pmsg_class, MKDEV(pmsg_major, i)); class_destroy(pmsg_class); err_class: unregister_chrdev(pmsg_major, PMSG_NAME); diff --git a/include/linux/pstore.h b/include/linux/pstore.h index 8e7a25b..7cffc9d 100644 --- a/include/linux/pstore.h +++ b/include/linux/pstore.h @@ -54,6 +54,7 @@ struct pstore_info { size_t bufsize; struct mutexread_mutex; /* serialize open/read/close */ int flags; + unsigned intnum_pmsg; int (*open)(struct pstore_info *psi); int (*close)(struct pstore_info *psi); ssize_t (*read)(u64 *id, enum pstore_type_id *type, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/5] ramoops: introduce generic init/free functions for prz
We modifies initialization and freeing code for prz for generic usage. This change * add generic function __ramoops_init_prz() to reduce redundancy between ramoops_init_prz() and ramoops_init_przs(). * rename 'przs' member in struct ramoops_context to 'dprzs' so that it stands for 'dump przs'. * rename ramoops_init_prz() to ramoops_init_dprzs(). * change parameter of ramoops_free_przs() from struct ramoops_context * into struct persistent_ram_zone * in order to make it available for all prz array. Signed-off-by: Hiraku Toyooka Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: Tony Luck Cc: linux-kernel@vger.kernel.org --- fs/pstore/ram.c | 63 +++ 1 file changed, 35 insertions(+), 28 deletions(-) diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index 6363768..89cc90e 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -83,7 +83,7 @@ MODULE_PARM_DESC(ramoops_ecc, "bytes ECC)"); struct ramoops_context { - struct persistent_ram_zone **przs; + struct persistent_ram_zone **dprzs; struct persistent_ram_zone *cprz; struct persistent_ram_zone *fprz; struct persistent_ram_zone *mprz; @@ -199,7 +199,7 @@ static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type, /* Find the next valid persistent_ram_zone for DMESG */ while (cxt->dump_read_cnt < cxt->max_dump_cnt && !prz) { - prz = ramoops_get_next_prz(cxt->przs, &cxt->dump_read_cnt, + prz = ramoops_get_next_prz(cxt->dprzs, &cxt->dump_read_cnt, cxt->max_dump_cnt, id, type, PSTORE_TYPE_DMESG, 1); if (!prz_ok(prz)) @@ -314,10 +314,10 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type, if (part != 1) return -ENOSPC; - if (!cxt->przs) + if (!cxt->dprzs) return -ENOSPC; - prz = cxt->przs[cxt->dump_write_cnt]; + prz = cxt->dprzs[cxt->dump_write_cnt]; hlen = ramoops_write_kmsg_hdr(prz, compressed); if (size + hlen > prz->buffer_size) @@ -339,7 +339,7 @@ static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count, case PSTORE_TYPE_DMESG: if (id >= cxt->max_dump_cnt) return -EINVAL; - prz = cxt->przs[id]; + prz = cxt->dprzs[id]; break; case PSTORE_TYPE_CONSOLE: prz = cxt->cprz; @@ -371,21 +371,24 @@ static struct ramoops_context oops_cxt = { }, }; -static void ramoops_free_przs(struct ramoops_context *cxt) +static void ramoops_free_przs(struct persistent_ram_zone **przs) { int i; - cxt->max_dump_cnt = 0; - if (!cxt->przs) + if (!przs) return; - for (i = 0; !IS_ERR_OR_NULL(cxt->przs[i]); i++) - persistent_ram_free(cxt->przs[i]); - kfree(cxt->przs); + for (i = 0; i < !IS_ERR_OR_NULL(przs[i]); i++) + persistent_ram_free(przs[i]); + kfree(przs); } -static int ramoops_init_przs(struct device *dev, struct ramoops_context *cxt, -phys_addr_t *paddr, size_t dump_mem_sz) +static int __ramoops_init_prz(struct device *dev, struct ramoops_context *cxt, + struct persistent_ram_zone **prz, + phys_addr_t *paddr, size_t sz, u32 sig, bool zap); + +static int ramoops_init_dprzs(struct device *dev, struct ramoops_context *cxt, + phys_addr_t *paddr, size_t dump_mem_sz) { int err = -ENOMEM; int i; @@ -402,29 +405,24 @@ static int ramoops_init_przs(struct device *dev, struct ramoops_context *cxt, if (!cxt->max_dump_cnt) return -ENOMEM; - cxt->przs = kzalloc(sizeof(*cxt->przs) * cxt->max_dump_cnt, + cxt->dprzs = kcalloc(cxt->max_dump_cnt, sizeof(*cxt->dprzs), GFP_KERNEL); - if (!cxt->przs) { + if (!cxt->dprzs) { dev_err(dev, "failed to initialize a prz array for dumps\n"); goto fail_prz; } for (i = 0; i < cxt->max_dump_cnt; i++) { - cxt->przs[i] = persistent_ram_new(*paddr, cxt->record_size, 0, - &cxt->ecc_info, - cxt->memtype); - if (IS_ERR(cxt->przs[i])) { - err = PTR_ERR(cxt->przs[i]); - dev_err(dev, "failed to request mem region (0x%zx@0x%llx): %d\n", - cxt->
[PATCH 5/5] selftests/pstore: add testcases for multiple pmsg instances
To test multiple pmsg, we should check that /dev/pmsg[N] (N > 0) is available. After reboot, we should check that pmsg-[backend]-[N] which keeps content is detected even if pmsg-[backend]-[N-M] (0 < M <= N) doesn't exist due to lack of contents. So we add the following testcases. - pstore_tests - Write unique string to the last /dev/pmsgN - pstore_post_reboot_tests - Check the last pmsg area is detected Signed-off-by: Hiraku Toyooka Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: Shuah Khan Cc: Tony Luck Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- tools/testing/selftests/pstore/common_tests| 21 ++-- .../selftests/pstore/pstore_post_reboot_tests | 27 +++- tools/testing/selftests/pstore/pstore_tests| 16 ++-- 3 files changed, 47 insertions(+), 17 deletions(-) diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests index 3ea64d7..566a25d 100755 --- a/tools/testing/selftests/pstore/common_tests +++ b/tools/testing/selftests/pstore/common_tests @@ -27,9 +27,9 @@ show_result() { # result_value } check_files_exist() { # type of pstorefs file -if [ -e ${1}-${backend}-0 ]; then +if [ -e ${1}0 ]; then prlog "ok" - for f in `ls ${1}-${backend}-*`; do + for f in `ls ${1}*`; do prlog -e "\t${f}" done else @@ -53,6 +53,23 @@ operate_files() { # tested value, files, operation fi } +check_pmsg_content() { # pmsg filename +prev_uuid=`cat $TOP_DIR/prev_uuid` +if [ $? -eq 0 ]; then + nr_matched=`grep -c "$TEST_STRING_PATTERN" $1` + if [ "$nr_matched" = "1" ]; then + grep -q "$TEST_STRING_PATTERN"$prev_uuid $1 + show_result $? + else + prlog "FAIL" + rc=1 + fi +else + prlog "FAIL" + rc=1 +fi +} + # Parameters TEST_STRING_PATTERN="Testing pstore: uuid=" UUID=`cat /proc/sys/kernel/random/uuid` diff --git a/tools/testing/selftests/pstore/pstore_post_reboot_tests b/tools/testing/selftests/pstore/pstore_post_reboot_tests index 6ccb154..272498f 100755 --- a/tools/testing/selftests/pstore/pstore_post_reboot_tests +++ b/tools/testing/selftests/pstore/pstore_post_reboot_tests @@ -35,13 +35,13 @@ fi cd ${mount_point} prlog -n "Checking dmesg files exist in pstore filesystem ... " -check_files_exist dmesg +check_files_exist dmesg-${backend}- prlog -n "Checking console files exist in pstore filesystem ... " -check_files_exist console +check_files_exist console-${backend}- prlog -n "Checking pmsg files exist in pstore filesystem ... " -check_files_exist pmsg +check_files_exist pmsg-${backend}- prlog -n "Checking dmesg files contain oops end marker" grep_end_trace() { @@ -54,16 +54,19 @@ prlog -n "Checking console file contains oops end marker ... " grep -q "\---\[ end trace" console-${backend}-0 show_result $? -prlog -n "Checking pmsg file properly keeps the content written before crash ... " -prev_uuid=`cat $TOP_DIR/prev_uuid` -if [ $? -eq 0 ]; then -nr_matched=`grep -c "$TEST_STRING_PATTERN" pmsg-${backend}-0` -if [ $nr_matched -eq 1 ]; then - grep -q "$TEST_STRING_PATTERN"$prev_uuid pmsg-${backend}-0 - show_result $? +prlog -n "Checking pmsg-"${backend}"-0 properly keeps the content written before crash ... " +check_pmsg_content pmsg-${backend}-0 + +prlog -n "Checking the last pmsg area is detected " +last_num_pmsg=`ls -v pmsg-* | tail -n1 | sed -e "s/^pmsg-${backend}-\([0-9]*\).*$/\1/"` +last_num_devpmsg=`ls -v /dev/pmsg* | tail -n1 | sed -e "s/^\/dev\/pmsg\([0-9]*\).*$/\1/"` +#checks the last number of pmsg-*-* and /dev/pmsg* correspond. +if [ "$last_num_pmsg" -eq "$last_num_devpmsg" ]; then +if [ "$last_num_pmsg" = "0" ]; then + prlog "... No multiple pmsg-*-* exists. We skip this testcase." else - prlog "FAIL" - rc=1 + prlog -n "(pmsg-${backend}-${last_num_pmsg}) ... " + check_pmsg_content pmsg-${backend}-${last_num_pmsg} fi else prlog "FAIL" diff --git a/tools/testing/selftests/pstore/pstore_tests b/tools/testing/selftests/pstore/pstore_tests index f25d2a3..5abf0e5 100755 --- a/tools/testing/selftests/pstore/pstore_tests +++ b/tools/testing/selftests/pstore/pstore_tests @@ -13,9 +13,8 @@ prlog -n "Checking pstore console is registered ... " dmesg | grep -q "console \[pstore" show_result $? -prlog -n "Checking /dev/pmsg0 exists ... " -test -e /dev/pmsg0 -show_result $? +prlog -n "Checking /dev/pmsg files exist ... " +chec
[PATCH 0/5] pstore: ramoops: support multiple pmsg instances
The following series implements multiple pmsg. This feature allows userspace program to control individual content aging or priority. If a pstore backend module(e.g. ramoops) requires the multiple pmsg instances when registering itself to pstore, multiple /dev/pmsg[ID] are created. Writes to each /dev/pmsg[ID] are isolated each other. After reboot, the contents are available in /sys/fs/pstore/pmsg-[backend]-[ID]. In addition, we add multiple pmsg support for ramoops. We can specify multiple pmsg area size by its module parameter as follows. pmsg_size=0x1000,0x2000,... --- Hiraku Toyooka (5): ramoops: use persistent_ram_free() instead of kfree() for freeing prz ramoops: introduce generic init/free functions for prz pstore: support multiple pmsg instances ramoops: support multiple pmsg instances selftests/pstore: add testcases for multiple pmsg instances Documentation/ramoops.txt | 22 ++ fs/pstore/pmsg.c | 20 ++ fs/pstore/ram.c| 213 +++- include/linux/pstore.h |1 include/linux/pstore_ram.h |8 + tools/testing/selftests/pstore/common_tests| 21 ++ .../selftests/pstore/pstore_post_reboot_tests | 27 +-- tools/testing/selftests/pstore/pstore_tests| 16 +- 8 files changed, 257 insertions(+), 71 deletions(-) -- Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/5] ramoops: support multiple pmsg instances
This patch enables ramoops to deal with multiple pmsg instances. We can configure the size of each buffers by its module parameter as follows. pmsg_size=0x1000,0x2000,... Then, the ramoops allocates multiple buffers and tells the number of buffers to pstore to create multiple /dev/pmsg[ID]. Signed-off-by: Hiraku Toyooka Cc: Jonathan Corbet Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: Tony Luck Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Documentation/ramoops.txt | 22 +++ fs/pstore/ram.c| 146 ++-- include/linux/pstore_ram.h |8 ++ 3 files changed, 154 insertions(+), 22 deletions(-) diff --git a/Documentation/ramoops.txt b/Documentation/ramoops.txt index 5d86756..cff6ac7 100644 --- a/Documentation/ramoops.txt +++ b/Documentation/ramoops.txt @@ -126,3 +126,25 @@ file. Here is an example of usage: 0 811d9c54 8101a7a0 __const_udelay <- native_machine_emergency_restart+0x110/0x1e0 0 811d9c34 811d9c80 __delay <- __const_udelay+0x30/0x40 0 811d9d14 811d9c3f delay_tsc <- __delay+0xf/0x20 + +6. Pmsg support + +Ramoops supports pmsg - logging userspace mesages in persistent store. By +default, one pmsg area becomes available in ramoops. You can write data into +/dev/pmsg0, e.g.: + + # echo foo > /dev/pmsg0 + +After reboot, the stored data can be read from pmsg-ramoops-0 on the pstore +filesystem. + +You can specify size of the pmsg area by additional kernel command line, e.g.: + + "ramoops.pmsg_size=0x1000" + +You can also use multiple pmsg areas, e.g.: + + "ramoops.pmsg_size=0x1000,0x2000,..." + +Then, pmsg0, pmsg1, ... will appear on /dev. This is useful to control +individual content aging or priority. diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index 89cc90e..1d1378c 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -51,8 +51,8 @@ static ulong ramoops_ftrace_size = MIN_MEM_SIZE; module_param_named(ftrace_size, ramoops_ftrace_size, ulong, 0400); MODULE_PARM_DESC(ftrace_size, "size of ftrace log"); -static ulong ramoops_pmsg_size = MIN_MEM_SIZE; -module_param_named(pmsg_size, ramoops_pmsg_size, ulong, 0400); +static char *ramoops_pmsg_size_str; +module_param_named(pmsg_size, ramoops_pmsg_size_str, charp, 0400); MODULE_PARM_DESC(pmsg_size, "size of user space message log"); static ulong mem_address; @@ -86,14 +86,14 @@ struct ramoops_context { struct persistent_ram_zone **dprzs; struct persistent_ram_zone *cprz; struct persistent_ram_zone *fprz; - struct persistent_ram_zone *mprz; + struct persistent_ram_zone **mprzs; phys_addr_t phys_addr; unsigned long size; unsigned int memtype; size_t record_size; size_t console_size; size_t ftrace_size; - size_t pmsg_size; + size_t *pmsg_size; int dump_oops; struct persistent_ram_ecc_info ecc_info; unsigned int max_dump_cnt; @@ -103,6 +103,7 @@ struct ramoops_context { unsigned int console_read_cnt; unsigned int ftrace_read_cnt; unsigned int pmsg_read_cnt; + unsigned int num_pmsg; struct pstore_info pstore; }; @@ -220,9 +221,10 @@ static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type, if (!prz_ok(prz)) prz = ramoops_get_next_prz(&cxt->fprz, &cxt->ftrace_read_cnt, 1, id, type, PSTORE_TYPE_FTRACE, 0); - if (!prz_ok(prz)) - prz = ramoops_get_next_prz(&cxt->mprz, &cxt->pmsg_read_cnt, - 1, id, type, PSTORE_TYPE_PMSG, 0); + while (cxt->pmsg_read_cnt < cxt->num_pmsg && !prz) + prz = ramoops_get_next_prz(cxt->mprzs, &cxt->pmsg_read_cnt, + cxt->num_pmsg, id, type, + PSTORE_TYPE_PMSG, 0); if (!prz_ok(prz)) return 0; @@ -286,9 +288,11 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type, persistent_ram_write(cxt->fprz, buf, size); return 0; } else if (type == PSTORE_TYPE_PMSG) { - if (!cxt->mprz) + if (part >= cxt->num_pmsg) + return -EINVAL; + if (!cxt->mprzs || !cxt->mprzs[part]) return -ENOMEM; - persistent_ram_write(cxt->mprz, buf, size); + persistent_ram_write(cxt->mprzs[part], buf, size); return 0; } @@ -348,7 +352,9 @@ static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count, prz = cxt->fprz; break; case PSTORE_TYPE_PMSG: -
Re: [RFD] pstore: pmsg: ramoops: add multiple pmsg instances
Hello Mark, (I'm sorry for my late reply.) Mark Salyzyn wrote: >> I'm trying to make the feature of multiple pmsg instances for ramoops. > I like it, in fact I encourage it. Thank you. > Multiple instances does allow one to control individual content aging or > priority, I agree with your assessment, but make sure that you have a > bona fide user for it (even if it is just for yourself)? Yes. There are embedded devices which have limited NVRAM and no HDD/SSD. Especially in our usage, the devices are industrial machinery. When we catch a system failure, for fixing the root cause of the failure, we often have to distinguish if it happend suddenly or if there were signs long before the failure. Multiple pmsg can help this kind of cause analysis with small NVRAM because higher priority messages are preserved for a long time. > NB: ToDo: device tree support for pstore configuration Would you mean pstore backend's configuration (e.g.'mem_address' in ramoops)? Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFD] pstore: pmsg: ramoops: add multiple pmsg instances
Hello, I'm trying to make the feature of multiple pmsg instances for ramoops. I would like to discuss a possible way of getting the feature added to mainline. The reason to add the multiple pmsg is to preserve particular messages over a long time, which are useful for failure analysis. Currently, we can use /dev/pmsg0 to log userspace messages into pstore. It is useful but we can use only one pstore buffer even if pstore backend module (e.g. ramoops) can create multiple buffers. When one messages happen to be generated repeatedly, old messages are overwritten by them regardless of the importance of the messages. Multiple pmsg will solves the problem. If we have two pmsg files: /dev/pmsg{0,1}, we can write all messages into /dev/pmsg0, while writing important ones into /dev/pmsg1. It can also be used for preserving important kernel messages selected by an userspace daemon. We have two types of files for pmsg. /dev/pmsg0 is for logging and /sys/fs/pstore/pmsg-$backend-0 is for read. We can extend these files like pmsg{0,1,2,...}, pmsg-$backend-{0,1,2,...}. To realize this /dev/pmsgN interface, pstore backend modules which manage multiple buffers for pmsg should tell the necessary numbers of buffer to pstore_register_pmsg() via pstore_info. In addition, for ramoops, I think the following works for ram.c are necessary at least. * Parse multiple pmsg size in module parameter, for example, pmsg_size=0x2000,0x1000,... * Change pmsg_size in ramoops_platform_data to a pointer of size array. struct ramoops_platform_data { ... - unsigned long pmsg_size; + unsigned long *pmsg_size; ... }; When a ramoops's backend driver registers the platform_data and want pmsg buffers, it should allocate an array of size for pmsg_size. (Although there is no ramoops's backend which uses pmsg currently) * Give an ID for each pmsg buffer, and modify some functions related to buffer allocation. What do you think about this idea? Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/2] selftests/pstore: add pstore test scripts going with reboot
To test pstore in earnest, we have to cause kernel crash and check pstore filesystem after reboot. We add two scripts: - pstore_crash_test This script causes kernel crash and reboot. It is executed by 'make run_pstore_crash' in selftests. It can also be used with kdump. - pstore_post_reboot_tests This script includes test cases which check pstore's behavior after crash and reboot. It is executed together with pstore_tests by 'make run_tests [-C pstore]' in selftests. The test cases in pstore_post_reboot_tests are currently following. - Check pstore backend is registered - Mount pstore filesystem - Check dmesg/console/pmsg files exist in pstore filesystem - Check dmesg/console files contain oops end marker - Check pmsg file properly keeps the content written before crash - Remove all files in pstore filesystem Example usage is following. (before reboot) # cd /path/to/selftests # make run_tests -C pstore === Pstore unit tests (pstore_tests) === UUID=b49b02cf-b0c2-4309-be43-b08c3971e37f ... selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests) === UUID=953eb1bc-8e03-48d7-b27a-6552b24c5b7e Checking pstore backend is registered ... ok backend=ramoops cmdline=console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait mem=768M ramoops.mem_address=0x3000 ramoops.mem_size=0x1 pstore_crash_test has not been executed yet. we skip further tests. selftests: pstore_post_reboot_tests [PASS] # make run_pstore_crash === Pstore unit tests (pstore_crash_test) === UUID=93c8972d-1466-430b-8c4a-28d8681e74c6 Checking pstore backend is registered ... ok backend=ramoops cmdline=console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait mem=768M ramoops.mem_address=0x3000 ramoops.mem_size=0x1 Causing kernel crash ... (kernel crash and reboot) ... (after reboot) # make run_tests -C pstore === Pstore unit tests (pstore_tests) === UUID=8e511e77-2285-499f-8bc0-900d9af1fbcc ... selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests) === UUID=2dcc2132-4f3c-45aa-a38f-3b54bff8cef1 Checking pstore backend is registered ... ok backend=ramoops cmdline=console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait mem=768M ramoops.mem_address=0x3000 ramoops.mem_size=0x1 Mounting pstore filesystem ... ok Checking dmesg files exist in pstore filesystem ... ok dmesg-ramoops-0 dmesg-ramoops-1 Checking console files exist in pstore filesystem ... ok console-ramoops-0 Checking pmsg files exist in pstore filesystem ... ok pmsg-ramoops-0 Checking dmesg files contain oops end marker dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok Checking console file contains oops end marker ... ok Checking pmsg file properly keeps the content written before crash ... ok Removing all files in pstore filesystem console-ramoops-0 ... ok dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok pmsg-ramoops-0 ... ok selftests: pstore_post_reboot_tests [PASS] Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/Makefile |3 + tools/testing/selftests/pstore/Makefile|7 +- tools/testing/selftests/pstore/common_tests| 28 +++ tools/testing/selftests/pstore/pstore_crash_test | 30 .../selftests/pstore/pstore_post_reboot_tests | 77 5 files changed, 143 insertions(+), 2 deletions(-) create mode 100755 tools/testing/selftests/pstore/pstore_crash_test create mode 100755 tools/testing/selftests/pstore/pstore_post_reboot_tests diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 1a8fb99..2458288 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -66,6 +66,9 @@ clean_hotplug: make -C $$TARGET clean; \ done; +run_pstore_crash: + make -C pstore run_crash + INSTALL_PATH ?= install INSTALL_PATH := $(abspath $(INSTALL_PATH)) ALL_SCRIPT := $(INSTALL_PATH)/run_kselftest.sh diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile index 48623f7..bd7abe2 100644 --- a/tools/testing/selftests/pstore/Makefile +++ b/tools/testing/selftests/pstore/Makefile @@ -3,10 +3,13 @@ all: -TEST_PROGS := pstore_tests -TEST_FILES := common_tests +TEST_PROGS := pstore_tests pstore_post_reboot_tests +TEST_FILES := common_tests pstore_crash_test include ../lib.mk +run_crash: + @sh pstore_crash_test || { echo "pstore_crash_test: [FAIL]"; exit 1; } + clean: rm -rf logs/* *uuid diff --git a/tools/testing/selftests/pstore/comm
[PATCH v2 0/2] selftests/pstore: add pstore test script
These scripts include test cases which check pstore behavior. This is useful to avoid regressions of pstore. Changes since v1: * Exit with 1 in pstore/Makefile when pstore_crash_test failed. * Create helper functions to make the tests much more readable. * Use /sys/module/pstore/parameters/backend to check pstore backend is registered. * Show content of /sys/module/.../backend and /proc/cmdline for debug * Give UUID to each execution of test script. * Write unique test string with UUID into pmsg. * Check only one test string pattern appears in pmsg after crash. * Check UUID written to pmsg matches across crash. * Don't touch panic_on_oops because it is not necessary for reboot on crash (v1: http://www.kernelhub.org/?msg=831044&p=2) I also confirmed that these scripts work fine with kdump reboot with kernel boot parameter 'crash_kexec_post_notifiers'. --- Hiraku Toyooka (2): selftests/pstore: add pstore test script for pre-reboot selftests/pstore: add pstore test scripts going with reboot tools/testing/selftests/Makefile |4 + tools/testing/selftests/pstore/Makefile| 15 tools/testing/selftests/pstore/common_tests| 83 tools/testing/selftests/pstore/pstore_crash_test | 30 +++ .../selftests/pstore/pstore_post_reboot_tests | 77 +++ tools/testing/selftests/pstore/pstore_tests| 30 +++ 6 files changed, 239 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_crash_test create mode 100755 tools/testing/selftests/pstore/pstore_post_reboot_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests -- Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/2] selftests/pstore: add pstore test script for pre-reboot
The pstore_tests script includes test cases which check pstore's behavior before crash (and reboot). The test cases are currently following. - Check pstore backend is registered - Check pstore console is registered - Check /dev/pmsg0 exists - Write unique string to /dev/pmsg0 The unique string written to /dev/pmsg includes UUID. The UUID is also left in 'uuid' file in order to enable us to check if the pmsg keeps the string correctly after reboot. Example usage is following. # cd /path/to/selftests # make run_tests -C pstore (or just .pstore/pstore_tests) make: Entering directory '/path/to/selftests/pstore' === Pstore unit tests (pstore_tests) === UUID=b49b02cf-b0c2-4309-be43-b08c3971e37f Checking pstore backend is registered ... ok backend=ramoops cmdline=console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait mem=768M ramoops.mem_address=0x3000 ramoops.mem_size=0x1 Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing unique string to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] make: Leaving directory '/path/to/selftests/pstore' We can also see test logs later. # cat pstore/logs/20151001-072718_b49b02cf-b0c2-4309-be43-b08c3971e37f/pstore_tests.log Thu Oct 1 07:27:18 UTC 2015 === Pstore unit tests (pstore_tests) === UUID=b49b02cf-b0c2-4309-be43-b08c3971e37f Checking pstore backend is registered ... ok backend=ramoops cmdline=console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait mem=768M ramoops.mem_address=0x3000 ramoops.mem_size=0x1 Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing unique string to /dev/pmsg0 ... ok Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/Makefile|1 tools/testing/selftests/pstore/Makefile | 12 ++ tools/testing/selftests/pstore/common_tests | 55 +++ tools/testing/selftests/pstore/pstore_tests | 30 +++ 4 files changed, 98 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index cfe1213..1a8fb99 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -13,6 +13,7 @@ TARGETS += mount TARGETS += mqueue TARGETS += net TARGETS += powerpc +TARGETS += pstore TARGETS += ptrace TARGETS += seccomp TARGETS += size diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile new file mode 100644 index 000..48623f7 --- /dev/null +++ b/tools/testing/selftests/pstore/Makefile @@ -0,0 +1,12 @@ +# Makefile for pstore selftests. +# Expects pstore backend is registered. + +all: + +TEST_PROGS := pstore_tests +TEST_FILES := common_tests + +include ../lib.mk + +clean: + rm -rf logs/* *uuid diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests new file mode 100755 index 000..b1c3757 --- /dev/null +++ b/tools/testing/selftests/pstore/common_tests @@ -0,0 +1,55 @@ +#!/bin/sh + +# common_tests - Shell script commonly used by pstore test scripts +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +# Utilities +errexit() { # message +echo "Error: $1" 1>&2 +exit 1 +} + +absdir() { # file_path +(cd `dirname $1`; pwd) +} + +show_result() { # result_value +if [ $1 -eq 0 ]; then + prlog "ok" +else + prlog "FAIL" + rc=1 +fi +} + +# Parameters +TEST_STRING_PATTERN="Testing pstore: uuid=" +UUID=`cat /proc/sys/kernel/random/uuid` +TOP_DIR=`absdir $0` +LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`_${UUID}/ + +# Preparing logs +LOG_FILE=$LOG_DIR/`basename $0`.log +mkdir -p $LOG_DIR || errexit "Failed to make a log directory: $LOG_DIR" +date > $LOG_FILE +prlog() { # messages +/bin/echo "$@" | tee -a $LOG_FILE +} + +# Starting tests +rc=0 +prlog "=== Pstore unit tests (`basename $0`) ===" +prlog "UUID="$UUID + +prlog -n "Checking pstore backend is registered ... " +backend=`cat /sys/module/pstore/parameters/backend` +show_result $? +prlog -e "\tbackend=${backend}" +prlog -e "\tcmdline=`cat /proc/cmdline`" +if [ $rc -ne 0 ]; then +exit 1 +fi diff --git a/tools/testing/selftests/pstore/pstore_tests b/tools/testing/selftests/pstore/pstore_tests new file mode 100755 index 000..f25d2a3 --- /dev/null +++ b/tools/testing/selftests/pstore/pstore_tests @@ -0
Re: [PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot
Hello, I'm sorry for my late reply. Mark Salyzyn wrote: Perfectly match is an issue, since something else might be using pmsg. For instance, one of the applications that uses this interface packetizes the messages so they can be picked out from other sources that do not comply with the header (count, magic number etc). In this case, should that daemon be active, your content would be ignores, but your content would also be buried, but can be needled out with grep. What you should do is grep for your string pattern within some acceptable regex, and one should be found and no other, and it should match perfectly. This would prevent another daemon's content from disrupting your test and causing a false negative. OK. I think that the following method suit your intention. By splitting unique test string into TEST_STRING_PATTERN part and UUID part, we can check both the non-existence of previous content and the unique match on reboot-comparison run. I'll include this in v2. # before crash TEST_STRING_PATTERN="Testing pstore: uuid=" UUID=`cat /proc/sys/kernel/random/uuid` echo "$TEST_STRING_PATTERN""$UUID" > /dev/pmsg0 echo "$UUID" > uuid # after crash prlog -n "Checking pmsg file properly keeps the content written before crash ... " nr_matched=`grep -c "$TEST_STRING_PATTERN" pmsg-${backend}-0` if [ $nr_matched -eq 1 ]; then grep -q "$TEST_STRING_PATTERN"`cat uuid` pmsg-${backend}-0 if [ $? -eq 0 ]; then prlog "ok" else prlog "FAIL" else prlog "FAIL" fi Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] selftests/pstore: add pstore test scripts going with reboot
Hello, >> +prlog "Causing kernel crash ..." >> + >> +# enable all functions triggered by sysrq >> +echo 1 > /proc/sys/kernel/sysrq >> +# setting to reboot in 3 seconds after panic >> +echo 3 > /proc/sys/kernel/panic >> +# setting to cause panic when oops occurs >> +echo 1 > /proc/sys/kernel/panic_on_oops >> + >> +# create a file as reboot flag >> +touch $REBOOT_FILE >> +sync >> + >> +# cause crash >> +echo c > /proc/sysrq-trigger > > Do you need to stop kdump service before the sysrq? Yes, I should check /sys/kernel/kexec_crash_loaded. If the value is 1, this script should try to unload kexec kernel. > Or, does it cover oops and kdump case? No, not yet. I think we should support oops case at first. Best regards, Hiraku Toyooka 阿口誠司 / AGUCHI,SEIJI wrote: +prlog "Causing kernel crash ..." + +# enable all functions triggered by sysrq +echo 1 > /proc/sys/kernel/sysrq +# setting to reboot in 3 seconds after panic +echo 3 > /proc/sys/kernel/panic +# setting to cause panic when oops occurs +echo 1 > /proc/sys/kernel/panic_on_oops + +# create a file as reboot flag +touch $REBOOT_FILE +sync + +# cause crash +echo c > /proc/sysrq-trigger Do you need to stop kdump service before the sysrq? Or, does it cover oops and kdump case? Seiji -Original Message- From: 豊岡拓 / Toyooka,Hiraku Sent: Tuesday, September 15, 2015 11:42 AM To: Kees Cook Cc: LKML; Tony Luck; Linux API; Anton Vorontsov; Shuah Khan; Mark Salyzyn; Colin Cross; 阿口誠司 / AGUCHI,SEIJI Subject: Re: [PATCH 2/2] selftests/pstore: add pstore test scripts going with reboot Hello Kees, >> +run_crash: >> + @sh pstore_crash_test || echo "pstore_crash_test: [FAIL]" > > This is probably better written to exit 1 on failure, otherwise it > just _says_ it fails. (Though lots of selftests in the tree already > have this problem, it's best to avoid the pattern for new stuff.) > Maybe something like: > > @sh pstore_crash_test || { echo "pstore_crash_test: [FAIL]"; exit 1; } OK. I'll add the "exit 1". >> +prlog -n "Checking dmesg files exist in pstore filesystem ... " >> +if [ -e dmesg-${backend}-0 ]; then >> +prlog "ok" >> +for f in `ls dmesg-${backend}-*`; do >> + prlog -e "\t${f}" >> + done >> +else >> +prlog "FAIL" >> +rc=1 >> +fi > > This test pattern is repeated a lot. Maybe better to create a helper > function instead? It could make the tests much more readable. Yes, I should make a helper function in v2. Best regards, Hiraku Toyooka -- Hiraku Toyooka Systems Productivity Research Dept. / Linux Technology Center Center for Technology Innovation - Systems Engineering, Hitachi Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot
Hello, > It may be good if you can log "/sys/module/pstore/parameters/backend/" > or /proc/cmdline in failure case. > > It makes debug easy. OK, I'll have the script log the information in v2. Best regards, Hiraku Toyooka 阿口誠司 / AGUCHI,SEIJI wrote: Hi, +prlog -n "Checking pstore backend is registered ... " +be_msg=`dmesg | grep "pstore: Registered [a-zA-Z0-9]\+ as persistent store backend$"` +if [ $? -eq 0 ]; then +backend=`echo ${be_msg} | sed -e 's/^.*Registered\ \([a-zA-z0-9-]\+\)\ as.*$/\1/g'` +prlog "ok" +else +prlog "FAIL" +exit 1 It may be good if you can log "/sys/module/pstore/parameters/backend/" or /proc/cmdline in failure case. It makes debug easy. Seiji +fi -Original Message- From: 豊岡拓 / Toyooka,Hiraku Sent: Tuesday, September 15, 2015 11:31 AM To: Kees Cook Cc: LKML; Tony Luck; Linux API; Anton Vorontsov; Shuah Khan; Mark Salyzyn; Colin Cross; 阿口誠司 / AGUCHI,SEIJI Subject: Re: [PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot Hello, Kees, Thank you for your advise. >> +be_msg=`dmesg | grep "pstore: Registered [a-zA-Z0-9]\+ as persistent store backend$"` ... > This seems unstable if the system hasn't booted recently or if stuff > is spamming dmesg. What about examining /sys/module/pstore instead? OK, I'll update in that way. Best regards, Hiraku Toyooka Kees Cook wrote: On Tue, Sep 8, 2015 at 4:06 AM, Hiraku Toyooka wrote: The pstore_tests script includes test cases which check pstore's behavior before crash (and reboot). The test cases are currently following. - Check pstore backend is registered - Check pstore console is registered - Check /dev/pmsg0 exists - Write string to /dev/pmsg0 Example usage is following. make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok pstore_crash_test has not been executed yet. we skip further tests. selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' We can also see test logs later. Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/Makefile|1 + tools/testing/selftests/pstore/Makefile | 12 +++ tools/testing/selftests/pstore/common_tests | 45 +++ tools/testing/selftests/pstore/pstore_tests | 42 + 4 files changed, 100 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 24ae9e8..b58c72e 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -12,6 +12,7 @@ TARGETS += mount TARGETS += mqueue TARGETS += net TARGETS += powerpc +TARGETS += pstore TARGETS += ptrace TARGETS += seccomp TARGETS += size diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile new file mode 100644 index 000..40b887d --- /dev/null +++ b/tools/testing/selftests/pstore/Makefile @@ -0,0 +1,12 @@ +# Makefile for pstore selftests. +# Expects pstore backend is registered. + +all: + +TEST_PROGS := pstore_tests +TEST_FILES := common_tests + +include ../lib.mk + +clean: + rm -rf logs/* diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests new file mode 100755 index 000..98611c5 --- /dev/null +++ b/tools/testing/selftests/pstore/common_tests @@ -0,0 +1,45 @@ +#!/bin/sh + +# common_tests - Shell script commonly used by pstore test scripts +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +# Utilities +errexit() { # message + echo "Error: $1" 1>&2 + exit 1 +} + +absdir() { # file_path + (cd `dirname $1`; pwd) +} + +# Parameters +TOP_DIR=`absdir $0` +LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/ +TEST_STRING="Testing pstore" + +# Preparing logs +LOG_FILE=$LOG_DIR/`basename $0`.log +mkdir -p $LOG_DIR || errexit "Failed to make a log directory: $LOG_DIR" +date > $LOG_FILE +prlog() { # messages + /bin/echo "$@" | tee -a $LOG_FILE +} +prlog "=== Pstore unit tests (`basename $0`)===" + +# Starting tests +rc=0 +
Re: [PATCH 2/2] selftests/pstore: add pstore test scripts going with reboot
Hello Kees, >> +run_crash: >> + @sh pstore_crash_test || echo "pstore_crash_test: [FAIL]" > > This is probably better written to exit 1 on failure, otherwise it > just _says_ it fails. (Though lots of selftests in the tree already > have this problem, it's best to avoid the pattern for new stuff.) > Maybe something like: > > @sh pstore_crash_test || { echo "pstore_crash_test: [FAIL]"; exit 1; } OK. I'll add the "exit 1". >> +prlog -n "Checking dmesg files exist in pstore filesystem ... " >> +if [ -e dmesg-${backend}-0 ]; then >> +prlog "ok" >> +for f in `ls dmesg-${backend}-*`; do >> + prlog -e "\t${f}" >> +done >> +else >> +prlog "FAIL" >> +rc=1 >> +fi > > This test pattern is repeated a lot. Maybe better to create a helper > function instead? It could make the tests much more readable. Yes, I should make a helper function in v2. Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot
Hello, Kees, Thank you for your advise. >> +be_msg=`dmesg | grep "pstore: Registered [a-zA-Z0-9]\+ as persistent store backend$"` ... > This seems unstable if the system hasn't booted recently or if stuff > is spamming dmesg. What about examining /sys/module/pstore instead? OK, I'll update in that way. Best regards, Hiraku Toyooka Kees Cook wrote: On Tue, Sep 8, 2015 at 4:06 AM, Hiraku Toyooka wrote: The pstore_tests script includes test cases which check pstore's behavior before crash (and reboot). The test cases are currently following. - Check pstore backend is registered - Check pstore console is registered - Check /dev/pmsg0 exists - Write string to /dev/pmsg0 Example usage is following. make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok pstore_crash_test has not been executed yet. we skip further tests. selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' We can also see test logs later. Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/Makefile|1 + tools/testing/selftests/pstore/Makefile | 12 +++ tools/testing/selftests/pstore/common_tests | 45 +++ tools/testing/selftests/pstore/pstore_tests | 42 + 4 files changed, 100 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 24ae9e8..b58c72e 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -12,6 +12,7 @@ TARGETS += mount TARGETS += mqueue TARGETS += net TARGETS += powerpc +TARGETS += pstore TARGETS += ptrace TARGETS += seccomp TARGETS += size diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile new file mode 100644 index 000..40b887d --- /dev/null +++ b/tools/testing/selftests/pstore/Makefile @@ -0,0 +1,12 @@ +# Makefile for pstore selftests. +# Expects pstore backend is registered. + +all: + +TEST_PROGS := pstore_tests +TEST_FILES := common_tests + +include ../lib.mk + +clean: + rm -rf logs/* diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests new file mode 100755 index 000..98611c5 --- /dev/null +++ b/tools/testing/selftests/pstore/common_tests @@ -0,0 +1,45 @@ +#!/bin/sh + +# common_tests - Shell script commonly used by pstore test scripts +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +# Utilities +errexit() { # message + echo "Error: $1" 1>&2 + exit 1 +} + +absdir() { # file_path + (cd `dirname $1`; pwd) +} + +# Parameters +TOP_DIR=`absdir $0` +LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/ +TEST_STRING="Testing pstore" + +# Preparing logs +LOG_FILE=$LOG_DIR/`basename $0`.log +mkdir -p $LOG_DIR || errexit "Failed to make a log directory: $LOG_DIR" +date > $LOG_FILE +prlog() { # messages + /bin/echo "$@" | tee -a $LOG_FILE +} +prlog "=== Pstore unit tests (`basename $0`)===" + +# Starting tests +rc=0 + +prlog -n "Checking pstore backend is registered ... " +be_msg=`dmesg | grep "pstore: Registered [a-zA-Z0-9]\+ as persistent store backend$"` +if [ $? -eq 0 ]; then +backend=`echo ${be_msg} | sed -e 's/^.*Registered\ \([a-zA-z0-9-]\+\)\ as.*$/\1/g'` +prlog "ok" +else +prlog "FAIL" +exit 1 +fi This seems unstable if the system hasn't booted recently or if stuff is spamming dmesg. What about examining /sys/module/pstore instead? diff --git a/tools/testing/selftests/pstore/pstore_tests b/tools/testing/selftests/pstore/pstore_tests new file mode 100755 index 000..cbf613c --- /dev/null +++ b/tools/testing/selftests/pstore/pstore_tests @@ -0,0 +1,42 @@ +#!/bin/sh + +# pstore_tests - Check pstore's behavior before crash/reboot +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +. ./common_tests + +prlog -n "Checking pstore console is registered ... " +dmesg | grep -q "console \[pstore" +if [ $? -eq 0 ]; then +prlog "ok" +else +
Re: [PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot
Hello Mark, Thank you for your advise. >> +prlog -n "Checking pmsg file contains TEST_STRING ... " > Mark this as 'wish to have' OK. I'll change it to "Checking pmsg file wishes to have TEST_STRING ... ". Should I change other messages in the same way? > Can TEST_STRING be given an unique value each run, so that on the the > reboot-comparison run it can be found to be an unique match? Yes. I'll append /proc/sys/kernel/random/uuid content to TEST_STRING. I'll also change log directory name from date to the uuid. > Also confirm that any previous content (which may be binary) is not > present after reboot, and that totally new content is present. OK. As for pmsg, they are possible by checking if the /sys/fs/pstore/pmsg content perfectly matches the TEST_STRING which was written to /dev/pmsg before reboot. (The TEST_STRING can be left to a regular file before reboot as well as reboot_flag.) Is it OK? Best regards, Hiraku Toyooka -- Hiraku Toyooka Systems Productivity Research Dept. / Linux Technology Center Center for Technology Innovation - Systems Engineering, Hitachi Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] selftests/pstore: add pstore test script for pre-reboot
The pstore_tests script includes test cases which check pstore's behavior before crash (and reboot). The test cases are currently following. - Check pstore backend is registered - Check pstore console is registered - Check /dev/pmsg0 exists - Write string to /dev/pmsg0 Example usage is following. make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok pstore_crash_test has not been executed yet. we skip further tests. selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' We can also see test logs later. Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/Makefile|1 + tools/testing/selftests/pstore/Makefile | 12 +++ tools/testing/selftests/pstore/common_tests | 45 +++ tools/testing/selftests/pstore/pstore_tests | 42 + 4 files changed, 100 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 24ae9e8..b58c72e 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -12,6 +12,7 @@ TARGETS += mount TARGETS += mqueue TARGETS += net TARGETS += powerpc +TARGETS += pstore TARGETS += ptrace TARGETS += seccomp TARGETS += size diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile new file mode 100644 index 000..40b887d --- /dev/null +++ b/tools/testing/selftests/pstore/Makefile @@ -0,0 +1,12 @@ +# Makefile for pstore selftests. +# Expects pstore backend is registered. + +all: + +TEST_PROGS := pstore_tests +TEST_FILES := common_tests + +include ../lib.mk + +clean: + rm -rf logs/* diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests new file mode 100755 index 000..98611c5 --- /dev/null +++ b/tools/testing/selftests/pstore/common_tests @@ -0,0 +1,45 @@ +#!/bin/sh + +# common_tests - Shell script commonly used by pstore test scripts +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +# Utilities +errexit() { # message + echo "Error: $1" 1>&2 + exit 1 +} + +absdir() { # file_path + (cd `dirname $1`; pwd) +} + +# Parameters +TOP_DIR=`absdir $0` +LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/ +TEST_STRING="Testing pstore" + +# Preparing logs +LOG_FILE=$LOG_DIR/`basename $0`.log +mkdir -p $LOG_DIR || errexit "Failed to make a log directory: $LOG_DIR" +date > $LOG_FILE +prlog() { # messages + /bin/echo "$@" | tee -a $LOG_FILE +} +prlog "=== Pstore unit tests (`basename $0`)===" + +# Starting tests +rc=0 + +prlog -n "Checking pstore backend is registered ... " +be_msg=`dmesg | grep "pstore: Registered [a-zA-Z0-9]\+ as persistent store backend$"` +if [ $? -eq 0 ]; then +backend=`echo ${be_msg} | sed -e 's/^.*Registered\ \([a-zA-z0-9-]\+\)\ as.*$/\1/g'` +prlog "ok" +else +prlog "FAIL" +exit 1 +fi diff --git a/tools/testing/selftests/pstore/pstore_tests b/tools/testing/selftests/pstore/pstore_tests new file mode 100755 index 000..cbf613c --- /dev/null +++ b/tools/testing/selftests/pstore/pstore_tests @@ -0,0 +1,42 @@ +#!/bin/sh + +# pstore_tests - Check pstore's behavior before crash/reboot +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +. ./common_tests + +prlog -n "Checking pstore console is registered ... " +dmesg | grep -q "console \[pstore" +if [ $? -eq 0 ]; then +prlog "ok" +else +prlog "FAIL" +fi + +prlog -n "Checking /dev/pmsg0 exists ... " +if [ -e "/dev/pmsg0" ]; then +prlog "ok" +else +prlog "FAIL" +rc=1 +fi + +prlog -n "Writing TEST_STRING to /dev/pmsg0 ... " +if [ -e "/dev/pmsg0" ]; then +echo "${TEST_STRING}" > /dev/pmsg0 +if [ $? -eq 0 ]; then + prlog "ok" +else + prlog "FAIL" + rc=1 +fi +else +prlog "FAIL" +rc=1 +fi + +exit $rc -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] selftests/pstore: add pstore test scripts going with reboot
To test pstore in earnest, we have to cause kernel crash and check pstore filesystem mouted after reboot. We add two scripts: - pstore_crash_test This script to cause crash and reboot easily. It is executed by 'make run_pstore_crash' in selftests. - pstore_post_reboot_tests This script includes test cases which check pstore's behavior after crash and reboot. It is executed together with pstore_tests by 'make run_tests [-C pstore]' in selftests. The test cases in pstore_post_reboot_tests are currently following. - Check pstore backend is registered - Mount pstore filesystem - Check dmesg files exist in pstore filesystem - Check console file exist in pstore filesystem - Check pmsg file exist in pstore filesystem - Check dmesg files contain oops end marker - Check console file contain oops end marker - Check pmsg file contain the string written before crash - Remove all files in pstore filesystem Example usage is following. ... (kernel crash and reboot) ... make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok Mounting pstore filesystem ... ok Checking dmesg files exist in pstore filesystem ... ok dmesg-ramoops-0 dmesg-ramoops-1 Checking console files exist in pstore filesystem ... ok console-ramoops-0 Checking pmsg files exist in pstore filesystem ... ok pmsg-ramoops-0 Checking dmesg files contains oops end marker dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok Checking console file contains oops end marker ... ok Checking pmsg file contains TEST_STRING ... ok Removing all files in pstore filesystem console-ramoops-0 ... ok dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok pmsg-ramoops-0 ... ok selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' Signed-off-by: Hiraku Toyooka Cc: Shuah Khan Cc: Tony Luck Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Mark Salyzyn Cc: Seiji Aguchi Cc: linux-kernel@vger.kernel.org Cc: linux-...@vger.kernel.org --- tools/testing/selftests/pstore/Makefile|7 + tools/testing/selftests/pstore/common_tests|1 tools/testing/selftests/pstore/pstore_crash_test | 27 .../selftests/pstore/pstore_post_reboot_tests | 126 4 files changed, 159 insertions(+), 2 deletions(-) create mode 100755 tools/testing/selftests/pstore/pstore_crash_test create mode 100755 tools/testing/selftests/pstore/pstore_post_reboot_tests diff --git a/tools/testing/selftests/pstore/Makefile b/tools/testing/selftests/pstore/Makefile index 40b887d..32c408c 100644 --- a/tools/testing/selftests/pstore/Makefile +++ b/tools/testing/selftests/pstore/Makefile @@ -3,10 +3,13 @@ all: -TEST_PROGS := pstore_tests -TEST_FILES := common_tests +TEST_PROGS := pstore_tests pstore_post_reboot_tests +TEST_FILES := common_tests pstore_crash_test include ../lib.mk +run_crash: + @sh pstore_crash_test || echo "pstore_crash_test: [FAIL]" + clean: rm -rf logs/* diff --git a/tools/testing/selftests/pstore/common_tests b/tools/testing/selftests/pstore/common_tests index 98611c5..8003760 100755 --- a/tools/testing/selftests/pstore/common_tests +++ b/tools/testing/selftests/pstore/common_tests @@ -20,6 +20,7 @@ absdir() { # file_path # Parameters TOP_DIR=`absdir $0` LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/ +REBOOT_FILE=$TOP_DIR/reboot_flag TEST_STRING="Testing pstore" # Preparing logs diff --git a/tools/testing/selftests/pstore/pstore_crash_test b/tools/testing/selftests/pstore/pstore_crash_test new file mode 100755 index 000..6d0c422 --- /dev/null +++ b/tools/testing/selftests/pstore/pstore_crash_test @@ -0,0 +1,27 @@ +#!/bin/sh + +# pstore_crash_test - Pstore test shell script which causes crash and reboot +# +# Copyright (C) Hitachi Ltd., 2015 +# Written by Hiraku Toyooka +# +# Released under the terms of the GPL v2. + +# exit if pstore backend is not registered +. ./common_tests + +prlog "Causing kernel crash ..." + +# enable all functions triggered by sysrq +echo 1 > /proc/sys/kernel/sysrq +# setting to reboot in 3 seconds after panic +echo 3 > /proc/sys/kernel/panic +# setting to cause panic when oops occurs +echo 1 > /proc/sys/kernel/panic_on_oops + +# create a file as reboot flag +touch $REBOOT_FILE +sync + +# cause crash +echo c > /proc/sysrq-trigger diff --git a/tools/testing/selftests/pstore/pstore_post_reboot_tests b/tools/testing/selftests/pstore/pstore_post_reboot_tests new file mode 100755 index 000..0e33366 --- /dev/null +++
[PATCH 0/2] selftests/pstore: add pstore test script
These scripts include test cases which check pstore behavior. This is useful to avoid regressions of pstore. Pstore is used across kernel crash, so these test cases are split into three parts. - pstore_tests: check pstore behavior before crash - pstore_post_reboot_tests: check pstore behavior after crash and reboot - pstore_crash_test: cause kernel crash and reboot The pstore_test and the pstore_post_reboot_tests are the actual scripts for testing pstore and are executed in usual selftest's "run_test" target. On the other hand, the pstore_crash_test is to cause kernel panic and reboot, so it is executed in new "run_pstore_crash" target which is specified ad-hoc by users. In addition, there is a "common_tests" script which includes utilities and test cases used commonly in these scripts. When the pstore_crash_test is executed, it creates a file as a reboot flag. The pstore_post_reboot_tests detects whether the file exists or not. If the file doesn't exists, the test cases are skipped. These scripts expect that one pstore backend is registered before the scripts are executed. Assumed use case is following. # cd linux/tools/testing/selftests # make run_tests -C pstore make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok pstore_crash_test has not been executed yet. we skip further tests. selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' # make run_pstore_crash ... (kernel crash and reboot) ... # make run_tests -C pstore make: Entering directory '/home/root/selftests/pstore' === Pstore unit tests (pstore_tests)=== Checking pstore backend is registered ... ok Checking pstore console is registered ... ok Checking /dev/pmsg0 exists ... ok Writing TEST_STRING to /dev/pmsg0 ... ok selftests: pstore_tests [PASS] === Pstore unit tests (pstore_post_reboot_tests)=== Checking pstore backend is registered ... ok Mounting pstore filesystem ... ok Checking dmesg files exist in pstore filesystem ... ok dmesg-ramoops-0 dmesg-ramoops-1 Checking console files exist in pstore filesystem ... ok console-ramoops-0 Checking pmsg files exist in pstore filesystem ... ok pmsg-ramoops-0 Checking dmesg files contains oops end marker dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok Checking console file contains oops end marker ... ok Checking pmsg file contains TEST_STRING ... ok Removing all files in pstore filesystem console-ramoops-0 ... ok dmesg-ramoops-0 ... ok dmesg-ramoops-1 ... ok pmsg-ramoops-0 ... ok selftests: pstore_post_reboot_tests [PASS] make: Leaving directory '/home/root/selftests/pstore' We can also see test logs later. # cat pstore/logs/20150903-58/pstore_tests.log ... --- Hiraku Toyooka (2): selftests/pstore: add pstore test script for pre-reboot selftests/pstore: add pstore test scripts going with reboot tools/testing/selftests/Makefile |1 tools/testing/selftests/pstore/Makefile| 15 ++ tools/testing/selftests/pstore/common_tests| 46 +++ tools/testing/selftests/pstore/pstore_crash_test | 27 .../selftests/pstore/pstore_post_reboot_tests | 126 tools/testing/selftests/pstore/pstore_tests| 42 +++ 6 files changed, 257 insertions(+) create mode 100644 tools/testing/selftests/pstore/Makefile create mode 100755 tools/testing/selftests/pstore/common_tests create mode 100755 tools/testing/selftests/pstore/pstore_crash_test create mode 100755 tools/testing/selftests/pstore/pstore_post_reboot_tests create mode 100755 tools/testing/selftests/pstore/pstore_tests -- Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] ARM: socfpga: add smp_ops.cpu_kill to make kexec/kdump available
Kexec_load syscall in ARM requires that machine-specific code has the smp_ops.cpu_kill() before loading kernel image. This patch adds the cpu_kill(), as a result, kexec reboot and kernel crash dump become available in mach-socfpga. Signed-off-by: Hiraku Toyooka Cc: Dinh Nguyen Cc: Russell King Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org --- arch/arm/mach-socfpga/platsmp.c | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c index c64d89b..3caf7c3 100644 --- a/arch/arm/mach-socfpga/platsmp.c +++ b/arch/arm/mach-socfpga/platsmp.c @@ -95,11 +95,23 @@ static void socfpga_cpu_die(unsigned int cpu) cpu_do_idle(); } +/* + * We need a dummy function so that platform_can_cpu_hotplug() knows + * we support CPU hotplug. However, the function does not need to do + * anything, because CPUs going offline just do WFI. We could reset + * the CPUs but it would increase power consumption. + */ +static int socfpga_cpu_kill(unsigned int cpu) +{ + return 1; +} + struct smp_operations socfpga_smp_ops __initdata = { .smp_init_cpus = socfpga_smp_init_cpus, .smp_prepare_cpus = socfpga_smp_prepare_cpus, .smp_boot_secondary = socfpga_boot_secondary, #ifdef CONFIG_HOTPLUG_CPU .cpu_die= socfpga_cpu_die, + .cpu_kill = socfpga_cpu_kill, #endif }; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: socfpga: add smp_ops.cpu_kill to make kexec/kdump available
Hello, Russell, On 2015/06/02 17:38, Russell King - ARM Linux wrote: > I wasn't thinking of SPIs and PPIs, but SGIs - the IPI interrupts coming > from the boot CPU. All tasks in a CPU going offline are migrated to other CPUs(finally CPU0) and the rq is marked with offline before the CPU entering WFI loop. So IPI is not sent to the secondary CPU. > If you have a way to reset CPU0, why are you not using this for hotplug > when a CPU is hot-unplugged? I think you mean "way to reset CPU1". If so, the reason is an increase in power consumption in usual hotplug. This is discussed before between Alan Tull and you. (https://lkml.org/lkml/2014/9/25/37) Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] ARM: socfpga: add smp_ops.cpu_kill to make kexec/kdump available
Kexec_load syscall in ARM checks that machine-specific code has the smp_ops.cpu_kill() before loading kernel image. This patch adds the cpu_kill(), as a result, kexec reboot and kernel crash dump become available in mach-socfpga. Signed-off-by: Hiraku Toyooka Cc: Dinh Nguyen Cc: Russell King Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org --- arch/arm/mach-socfpga/platsmp.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c index c64d89b..a8f0f84 100644 --- a/arch/arm/mach-socfpga/platsmp.c +++ b/arch/arm/mach-socfpga/platsmp.c @@ -95,11 +95,22 @@ static void socfpga_cpu_die(unsigned int cpu) cpu_do_idle(); } +/* + * We need a dummy function so that platform_can_cpu_hotplug() knows + * we support CPU hotplug. However, the function does not need to do + * anything, because CPUs going offline just do WFI. + */ +static int socfpga_cpu_kill(unsigned int cpu) +{ + return 1; +} + struct smp_operations socfpga_smp_ops __initdata = { .smp_init_cpus = socfpga_smp_init_cpus, .smp_prepare_cpus = socfpga_smp_prepare_cpus, .smp_boot_secondary = socfpga_boot_secondary, #ifdef CONFIG_HOTPLUG_CPU .cpu_die= socfpga_cpu_die, + .cpu_kill = socfpga_cpu_kill, #endif }; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:perf/urgent] tracing: update documentation of snapshot utility
Commit-ID: 1abccd7419de9829bcdf9ab1f81d5f6cf74d55d3 Gitweb: http://git.kernel.org/tip/1abccd7419de9829bcdf9ab1f81d5f6cf74d55d3 Author: Hiraku Toyooka AuthorDate: Fri, 8 Mar 2013 16:32:25 +0900 Committer: Steven Rostedt CommitDate: Fri, 8 Mar 2013 06:27:11 -0500 tracing: update documentation of snapshot utility Now, "snapshot" file returns success on a reset of snapshot buffer even if the buffer wasn't allocated, instead of returning EINVAL. This patch updates snapshot desctiption according to the change. Link: http://lkml.kernel.org/r/51399409.4090...@hitachi.com Signed-off-by: Hiraku Toyooka Signed-off-by: Steven Rostedt --- Documentation/trace/ftrace.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 53d6a3c..a372304 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1873,7 +1873,7 @@ feature: status\input | 0 | 1 |else| --++++ - not allocated |(do nothing)| alloc+swap | EINVAL | + not allocated |(do nothing)| alloc+swap |(do nothing)| --++++ allocated |free|swap| clear| --++++ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] [GIT PULL][3.9] tracing: Fix in snapshot API
Steven, (03/08/2013 12:34 AM), Steven Rostedt wrote: The second patch, returns success on a reset of the buffer even if the buffer wasn't allocated. Returning -EINVAL is just confusing. I realized we should update the snapshot documentation together with this change. I attached a patch to update the documentation. Could you include this patch? Thanks, Hiraku Toyooka From: Hiraku Toyooka Subject: [PATCH] tracing: update documentation of snapshot utility Now, "snapshot" file returns success on a reset of snapshot buffer even if the buffer wasn't allocated, instead of returning EINVAL. This patch updates snapshot desctiption according to the change. Signed-off-by: Hiraku Toyooka --- Documentation/trace/ftrace.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 53d6a3c..a372304 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1873,7 +1873,7 @@ feature: status\input | 0 | 1 |else| --++++ - not allocated |(do nothing)| alloc+swap | EINVAL | + not allocated |(do nothing)| alloc+swap |(do nothing)| --++++ allocated |free|swap| clear| --++++ -- 1.7.11.7 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 2/2] tracing: Do not return EINVAL in snapshot when not allocated
(03/06/2013 10:49 PM), Steven Rostedt wrote: > From: "Steven Rostedt (Red Hat)" > > To use the tracing snapshot feature, writing a '1' into the snapshot > file causes the snapshot buffer to be allocated if it has not already > been allocated and dose a 'swap' with the main buffer, so that the > snapshot now contains what was in the main buffer, and the main buffer > now writes to what was the snapshot buffer. > > To free the snapshot buffer, a '0' is written into the snapshot file. > > To clear the snapshot buffer, any number but a '0' or '1' is written > into the snapshot file. But if the file is not allocated it returns > -EINVAL error code. This is rather pointless. It is better just to > do nothing and return success. > > Cc: Hiraku Toyooka > Signed-off-by: Steven Rostedt > --- Acked-by: Hiraku Toyooka > kernel/trace/trace.c |2 -- > 1 file changed, 2 deletions(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index 9e3120b..1f835a8 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -4167,8 +4167,6 @@ tracing_snapshot_write(struct file *filp, const char > __user *ubuf, size_t cnt, > default: > if (current_trace->allocated_snapshot) > tracing_reset_online_cpus(&max_tr); > - else > - ret = -EINVAL; > break; > } > > -- 1.7.10.4 . > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/2] tracing: Add help of snapshot feature when snapshot is empty
(03/06/2013 10:49 PM), Steven Rostedt wrote: > From: "Steven Rostedt (Red Hat)" > > When cat'ing the snapshot file, instead of showing an empty trace > header like the trace file does, show how to use the snapshot > feature. > > Also, this is a good place to show if the snapshot has been allocated > or not. Users may want to "pre allocate" the snapshot to have a fast > "swap" of the current buffer. Otherwise, a swap would be slow and might > fail as it would need to allocate the snapshot buffer, and that might > fail under tight memory constraints. > > Here's what it looked like before: > > # tracer: nop > # > # entries-in-buffer/entries-written: 0/0 #P:4 > # > # _-=> irqs-off > # / _=> need-resched > #| / _---=> hardirq/softirq > #|| / _--=> preempt-depth > #||| / delay > # TASK-PID CPU# TIMESTAMP FUNCTION > # | | | | | > > Here's what it looks like now: > > # tracer: nop > # > # > # * Snapshot is freed * > # > # Snapshot commands: > # echo 0 > snapshot : Clears and frees snapshot buffer > # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. > # Takes a snapshot of the main buffer. > # echo 2 > snapshot : Clears snapshot buffer (but does not allocate) > # (Doesn't have to be '2' works with any number that > # is not a '0' or '1') > > Cc: Hiraku Toyooka > Signed-off-by: Steven Rostedt > --- Acked-by: Hiraku Toyooka > kernel/trace/trace.c | 25 - > 1 file changed, 24 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index c2e2c23..9e3120b 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -2400,6 +2400,27 @@ static void test_ftrace_alive(struct seq_file *m) > seq_printf(m, "# MAY BE MISSING FUNCTION EVENTS\n"); > } > > +#ifdef CONFIG_TRACER_MAX_TRACE > +static void print_snapshot_help(struct seq_file *m, struct trace_iterator > *iter) > +{ > + if (iter->trace->allocated_snapshot) > + seq_printf(m, "#\n# * Snapshot is allocated *\n#\n"); > + else > + seq_printf(m, "#\n# * Snapshot is freed *\n#\n"); > + > + seq_printf(m, "# Snapshot commands:\n"); > + seq_printf(m, "# echo 0 > snapshot : Clears and frees snapshot > buffer\n"); > + seq_printf(m, "# echo 1 > snapshot : Allocates snapshot buffer, if not > already allocated.\n"); > + seq_printf(m, "# Takes a snapshot of the main > buffer.\n"); > + seq_printf(m, "# echo 2 > snapshot : Clears snapshot buffer (but does > not allocate)\n"); > + seq_printf(m, "# (Doesn't have to be '2' works > with any number that\n"); > + seq_printf(m, "# is not a '0' or '1')\n"); > +} > +#else > +/* Should never be called */ > +static inline void print_snapshot_help(struct seq_file *m, struct > trace_iterator *iter) { } > +#endif > + > static int s_show(struct seq_file *m, void *v) > { > struct trace_iterator *iter = v; > @@ -2411,7 +2432,9 @@ static int s_show(struct seq_file *m, void *v) > seq_puts(m, "#\n"); > test_ftrace_alive(m); > } > - if (iter->trace && iter->trace->print_header) > + if (iter->snapshot && trace_empty(iter)) > + print_snapshot_help(m, iter); > + else if (iter->trace && iter->trace->print_header) > iter->trace->print_header(m); > else > trace_default_header(m); > -- 1.7.10.4 . > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: snapshot error on non allocated buffer?
Hi Steven, (03/06/2013 12:50 AM), Steven Rostedt wrote: Hi Hiraku, I'm doing a lot of reconstruction of ftrace's buffering, and I'm also modifying a lot of the snapshot feature to work with the new stuff that's coming. Many thanks. I'm trying your multi-buffer patches. I'm looking at the -EINVAL when you write something other than '0' or '1' into the snapshot file when the snapshot is not allocated. I'm thinking that it should just return as if it succeeded. I don't understand why it should return -EINVAL? I thought that it might be a little strange if the clear operation succeeded in spite of the non-allocated buffer. (Actually, I simply implemented as you said, though.) But I don't have trouble even if it succeeds, so I'll modify the I/F to make it return successfully. Now if you want to know if the snapshot is allocated or not, I have a patch that shows how to use the snapshot feature when the snapshot is empty, and also give the status of the snapshot itself: [root] # cat /debug/tracing/snapshot # tracer: nop # # # * Snapshot is freed * # # Snapshot commands: # echo 0 > snapshot : Clears and frees snapshot buffer # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. # Takes a snapshot of the main buffer. # echo 2 > snapshot : Clears snapshot buffer (but does not allocate) # (Doesn't have to be '2' works with any number that # is not a '0' or '1') [root] # echo 1 > /debug/tracing/snapshot [root] # echo 2 > /debug/tracing/snapshot [root] # cat /debug/tracing/snapshot # tracer: nop # # # * Snapshot is allocated * # # Snapshot commands: # echo 0 > snapshot : Clears and frees snapshot buffer # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. # Takes a snapshot of the main buffer. # echo 2 > snapshot : Clears snapshot buffer (but does not allocate) # (Doesn't have to be '2' works with any number that # is not a '0' or '1') This seems good for me and also users. As this is a new feature for 3.9, and we are still in -rc1, I think this might be a good thing to add now. As well as not returning -EINVAL on writing to the file when the snapshot buffer isn't allocated. What do you think? I think it's OK. I'll send a patch to make the file not return -EINVAL. Does it need to be based on 3.9-rc1 or tip tree? Thanks, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:perf/core] tracing: Add documentation of snapshot utility
Commit-ID: c1043fcda1b9e8e5144cfdaee7be262c50dbdead Gitweb: http://git.kernel.org/tip/c1043fcda1b9e8e5144cfdaee7be262c50dbdead Author: Hiraku Toyooka AuthorDate: Wed, 26 Dec 2012 11:53:09 +0900 Committer: Steven Rostedt CommitDate: Wed, 30 Jan 2013 11:02:07 -0500 tracing: Add documentation of snapshot utility This patch adds snapshot description in ftrace documentation. This description includes what the snapshot is and how to use it. Link: http://lkml.kernel.org/r/20121226025309.3252.150.stgit@liselsia Cc: Rob Landley Signed-off-by: Hiraku Toyooka Signed-off-by: Steven Rostedt --- Documentation/trace/ftrace.txt | 83 ++ 1 file changed, 83 insertions(+) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 6f51fed..53d6a3c 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1842,6 +1842,89 @@ an error. # cat buffer_size_kb 85 +Snapshot + +CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature +available to all non latency tracers. (Latency tracers which +record max latency, such as "irqsoff" or "wakeup", can't use +this feature, since those are already using the snapshot +mechanism internally.) + +Snapshot preserves a current trace buffer at a particular point +in time without stopping tracing. Ftrace swaps the current +buffer with a spare buffer, and tracing continues in the new +current (=previous spare) buffer. + +The following debugfs files in "tracing" are related to this +feature: + + snapshot: + + This is used to take a snapshot and to read the output + of the snapshot. Echo 1 into this file to allocate a + spare buffer and to take a snapshot (swap), then read + the snapshot from this file in the same format as + "trace" (described above in the section "The File + System"). Both reads snapshot and tracing are executable + in parallel. When the spare buffer is allocated, echoing + 0 frees it, and echoing else (positive) values clear the + snapshot contents. + More details are shown in the table below. + + status\input | 0 | 1 |else| + --++++ + not allocated |(do nothing)| alloc+swap | EINVAL | + --++++ + allocated |free|swap| clear| + --++++ + +Here is an example of using the snapshot feature. + + # echo 1 > events/sched/enable + # echo 1 > snapshot + # cat snapshot +# tracer: nop +# +# entries-in-buffer/entries-written: 71/71 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 + sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 +[...] + -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 + + # cat trace +# tracer: nop +# +# entries-in-buffer/entries-written: 77/77 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 + snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 +[...] + + +If you try to use this snapshot feature when current tracer is +one of the latency tracers, you will get the following results. + + # echo wakeup > current_tracer + # echo 1 > snapshot +bash: echo: write error: Device or resource busy + # cat snapshot +cat: snapshot: Device or resource busy + --- More details can be found in the source code, in the -- To unsubscribe from this list: send the line &qu
[tip:perf/core] tracing: Make a snapshot feature available from userspace
Commit-ID: debdd57f5145f3c6a4b3f8d0126abd1a2def7fc6 Gitweb: http://git.kernel.org/tip/debdd57f5145f3c6a4b3f8d0126abd1a2def7fc6 Author: Hiraku Toyooka AuthorDate: Wed, 26 Dec 2012 11:53:00 +0900 Committer: Steven Rostedt CommitDate: Wed, 30 Jan 2013 11:02:06 -0500 tracing: Make a snapshot feature available from userspace Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add "snapshot" debugfs file in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot This will allocate the spare buffer for snapshot (if it is not allocated), and take a snapshot. # cat snapshot This will show contents of the snapshot. # echo 0 > snapshot This will free the snapshot if it is allocated. Any other positive values will clear the snapshot contents if the snapshot is allocated, or return EINVAL if it is not allocated. Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia Cc: Jiri Olsa Cc: David Sharp Signed-off-by: Hiraku Toyooka [ Fixed irqsoff selftest and also a conflict with a change that fixes the update_max_tr. ] Signed-off-by: Steven Rostedt --- include/linux/ftrace_event.h | 3 + kernel/trace/Kconfig | 10 +++ kernel/trace/trace.c | 166 --- kernel/trace/trace.h | 1 + 4 files changed, 154 insertions(+), 26 deletions(-) diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index 6f8d0b7..13a54d0 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -83,6 +83,9 @@ struct trace_iterator { longidx; cpumask_var_t started; + + /* it's true when current open file is snapshot */ + boolsnapshot; }; enum trace_iter_flags { diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index cdc9d28..3656756 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -253,6 +253,16 @@ config FTRACE_SYSCALLS help Basic tracer to catch the syscall entry and exit events. +config TRACER_SNAPSHOT + bool "Create a snapshot trace buffer" + select TRACER_MAX_TRACE + help + Allow tracing users to take snapshot of the current buffer using the + ftrace interface, e.g.: + + echo 1 > /sys/kernel/debug/tracing/snapshot + cat snapshot + config TRACE_BRANCH_PROFILING bool select GENERIC_TRACER diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 2c72466..70dce64 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -710,12 +710,11 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) WARN_ON_ONCE(!irqs_disabled()); - /* If we disabled the tracer, stop now */ - if (current_trace == &nop_trace) - return; - - if (WARN_ON_ONCE(!current_trace->use_max_tr)) + if (!current_trace->allocated_snapshot) { + /* Only the nop tracer should hit this when disabling */ + WARN_ON_ONCE(current_trace != &nop_trace); return; + } arch_spin_lock(&ftrace_max_lock); @@ -743,10 +742,8 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { - WARN_ON_ONCE(1); + if (WARN_ON_ONCE(!current_trace->allocated_snapshot)) return; - } arch_spin_lock(&ftrace_max_lock); @@ -866,10 +863,13 @@ int register_tracer(struct tracer *type) current_trace = type; - /* If we expanded the buffers, make sure the max is expanded too */ - if (ring_buffer_expanded && type->use_max_tr) - ring_buffer_resize(max_tr.buffer, trace_buf_size, - RING_BUFFER_ALL_CPUS); + if (type->use_max_tr) { + /* If we expanded the buffers, make sure the max is expanded too */ + if (ring_buffer_expanded) + ring_buffer_resize(max_tr.buffer, trace_buf_size, + RING_BUFFER_ALL_CPUS); + type->allocated_snapshot = true; + } /* the test is responsible for initializing and enabling */ pr_info("Testing tracer %s: ", type->name); @@ -885,10 +885,14 @@ int register_tracer(struct tracer *type) /* Only reset on passing, to avoid touching corrupted buffers */
[tip:perf/core] tracing: Replace static old_tracer check of tracer name
Commit-ID: 2fd196ec1eab2623096e7fc7e6f3976160392bce Gitweb: http://git.kernel.org/tip/2fd196ec1eab2623096e7fc7e6f3976160392bce Author: Hiraku Toyooka AuthorDate: Wed, 26 Dec 2012 11:52:52 +0900 Committer: Steven Rostedt CommitDate: Wed, 30 Jan 2013 11:02:05 -0500 tracing: Replace static old_tracer check of tracer name Currently the trace buffer read functions use a static variable "old_tracer" for detecting if the current tracer changes. This was suitable for a single trace file ("trace"), but to add a snapshot feature that will use the same function for its file, a check against a static variable is not sufficient. To use the output functions for two different files, instead of storing the current tracer in a static variable, as the trace iterator descriptor contains a pointer to the original current tracer's name, that pointer can now be used to check if the current tracer has changed between different reads of the trace file. Link: http://lkml.kernel.org/r/20121226025252.3252.9276.stgit@liselsia Signed-off-by: Hiraku Toyooka Signed-off-by: Steven Rostedt --- kernel/trace/trace.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 90a1c71..2c72466 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1948,18 +1948,20 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu) static void *s_start(struct seq_file *m, loff_t *pos) { struct trace_iterator *iter = m->private; - static struct tracer *old_tracer; int cpu_file = iter->cpu_file; void *p = NULL; loff_t l = 0; int cpu; - /* copy the tracer to avoid using a global lock all around */ + /* +* copy the tracer to avoid using a global lock all around. +* iter->trace is a copy of current_trace, the pointer to the +* name may be used instead of a strcmp(), as iter->trace->name +* will point to the same string as current_trace->name. +*/ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); atomic_inc(&trace_record_cmdline_disabled); @@ -3494,7 +3496,6 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { struct trace_iterator *iter = filp->private_data; - static struct tracer *old_tracer; ssize_t sret; /* return any leftover data */ @@ -3506,10 +3507,8 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); /* @@ -3665,7 +3664,6 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, .ops= &tracing_pipe_buf_ops, .spd_release= tracing_spd_release_pipe, }; - static struct tracer *old_tracer; ssize_t ret; size_t rem; unsigned int i; @@ -3675,10 +3673,8 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); mutex_lock(&iter->mutex); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:perf/core] tracing: Add checks if tr-> buffer is NULL in tracing_reset{_online_cpus}
Commit-ID: a54164114b96b4693b42cdb553260eec41ea4393 Gitweb: http://git.kernel.org/tip/a54164114b96b4693b42cdb553260eec41ea4393 Author: Hiraku Toyooka AuthorDate: Wed, 19 Dec 2012 16:02:34 +0900 Committer: Steven Rostedt CommitDate: Mon, 21 Jan 2013 13:22:32 -0500 tracing: Add checks if tr->buffer is NULL in tracing_reset{_online_cpus} max_tr->buffer could be NULL in the tracing_reset{_online_cpus}. In this case, a NULL pointer dereference happens, so we should return immediately from these functions. Note, the current code does not call tracing_reset*() with max_tr when its buffer is NULL, but future code will. This patch is needed to prevent the future code from crashing. Link: http://lkml.kernel.org/r/20121219070234.31200.93863.stgit@liselsia Signed-off-by: Hiraku Toyooka Signed-off-by: Steven Rostedt --- kernel/trace/trace.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index f8b7c62..72b171b 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -922,6 +922,9 @@ void tracing_reset(struct trace_array *tr, int cpu) { struct ring_buffer *buffer = tr->buffer; + if (!buffer) + return; + ring_buffer_record_disable(buffer); /* Make sure all commits have finished */ @@ -936,6 +939,9 @@ void tracing_reset_online_cpus(struct trace_array *tr) struct ring_buffer *buffer = tr->buffer; int cpu; + if (!buffer) + return; + ring_buffer_record_disable(buffer); /* Make sure all commits have finished */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 -tip 3/3] tracing: add description of snapshot to Documentation/trace/ftrace.txt
This patch adds snapshot description in ftrace documentation. This description includes what the snapshot is and how to use it. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Rob Landley Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Documentation/trace/ftrace.txt | 83 1 file changed, 83 insertions(+) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 6f51fed..53d6a3c 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1842,6 +1842,89 @@ an error. # cat buffer_size_kb 85 +Snapshot + +CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature +available to all non latency tracers. (Latency tracers which +record max latency, such as "irqsoff" or "wakeup", can't use +this feature, since those are already using the snapshot +mechanism internally.) + +Snapshot preserves a current trace buffer at a particular point +in time without stopping tracing. Ftrace swaps the current +buffer with a spare buffer, and tracing continues in the new +current (=previous spare) buffer. + +The following debugfs files in "tracing" are related to this +feature: + + snapshot: + + This is used to take a snapshot and to read the output + of the snapshot. Echo 1 into this file to allocate a + spare buffer and to take a snapshot (swap), then read + the snapshot from this file in the same format as + "trace" (described above in the section "The File + System"). Both reads snapshot and tracing are executable + in parallel. When the spare buffer is allocated, echoing + 0 frees it, and echoing else (positive) values clear the + snapshot contents. + More details are shown in the table below. + + status\input | 0 | 1 |else| + --++++ + not allocated |(do nothing)| alloc+swap | EINVAL | + --++++ + allocated |free|swap| clear| + --++++ + +Here is an example of using the snapshot feature. + + # echo 1 > events/sched/enable + # echo 1 > snapshot + # cat snapshot +# tracer: nop +# +# entries-in-buffer/entries-written: 71/71 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 + sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 +[...] + -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 + + # cat trace +# tracer: nop +# +# entries-in-buffer/entries-written: 77/77 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 + snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 +[...] + + +If you try to use this snapshot feature when current tracer is +one of the latency tracers, you will get the following results. + + # echo wakeup > current_tracer + # echo 1 > snapshot +bash: echo: write error: Device or resource busy + # cat snapshot +cat: snapshot: Device or resource busy + --- More details can be found in the source code, in the -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 -tip 2/3] tracing: make a snapshot feature available from userspace
Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add "snapshot" debugfs file in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot This will allocate the spare buffer for snapshot (if it is not allocated), and take a snapshot. # cat snapshot This will show contents of the snapshot. # echo 0 > snapshot This will free the snapshot if it is allocated. Any other positive values will clear the snapshot contents if the snapshot is allocated, or return EINVAL if it is not allocated. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jiri Olsa Cc: David Sharp Cc: linux-kernel@vger.kernel.org --- include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 10 +++ kernel/trace/trace.c | 134 ++ kernel/trace/trace.h |1 4 files changed, 136 insertions(+), 12 deletions(-) diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index a3d4895..9bebadd 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -84,6 +84,9 @@ struct trace_iterator { longidx; cpumask_var_t started; + + /* it's true when current open file is snapshot */ + boolsnapshot; }; enum trace_iter_flags { diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 5d89335..82a8ff5 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -250,6 +250,16 @@ config FTRACE_SYSCALLS help Basic tracer to catch the syscall entry and exit events. +config TRACER_SNAPSHOT + bool "Create a snapshot trace buffer" + select TRACER_MAX_TRACE + help + Allow tracing users to take snapshot of the current buffer using the + ftrace interface, e.g.: + + echo 1 > /sys/kernel/debug/tracing/snapshot + cat snapshot + config TRACE_BRANCH_PROFILING bool select GENERIC_TRACER diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 1787304..9522af0 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -709,7 +709,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -739,7 +739,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -1964,7 +1964,11 @@ static void *s_start(struct seq_file *m, loff_t *pos) *iter->trace = *current_trace; mutex_unlock(&trace_types_lock); - atomic_inc(&trace_record_cmdline_disabled); + if (iter->snapshot && iter->trace->use_max_tr) + return ERR_PTR(-EBUSY); + + if (!iter->snapshot) + atomic_inc(&trace_record_cmdline_disabled); if (*pos != iter->pos) { iter->ent = NULL; @@ -2003,7 +2007,11 @@ static void s_stop(struct seq_file *m, void *p) { struct trace_iterator *iter = m->private; - atomic_dec(&trace_record_cmdline_disabled); + if (iter->snapshot && iter->trace->use_max_tr) + return; + + if (!iter->snapshot) + atomic_dec(&trace_record_cmdline_disabled); trace_access_unlock(iter->cpu_file); trace_event_read_unlock(); } @@ -2438,7 +2446,7 @@ static const struct seq_operations tracer_seq_ops = { }; static struct trace_iterator * -__tracing_open(struct inode *inode, struct file *file) +__tracing_open(struct inode *inode, struct file *file, bool snapshot) { long cpu_file = (long) inode->i_private; struct trace_iterator *iter; @@ -2471,10 +2479,11 @@ __tracing_open(struct inode *inode, struct file *file) if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL)) goto fail; - if (current_trace && current_trace->print_max) + if ((current_trace && current_trace->print_max) || snapshot) iter->tr = &max_tr; else iter->tr = &global_trace; + iter->snapshot = snapshot; iter->pos = -1; mutex_init(&iter->mutex); iter->cpu_file = cpu_file; @@ -2491,8 +2500,9 @@ __tracing_
[PATCH v4 -tip 1/3] tracing: replace static old_tracer with trace iterator's pointer to the original tracer's name
Currently the trace buffer read functions use a static variable "old_tracer" for detecting if the current tracer changes. This was suitable for a single trace file ("trace"), but to add a snapshot feature that will use the same function for its file, a check against a static variable is not sufficient. To use the output functions for two different files, instead of storing the current tracer in a static variable, as the trace iterator descriptor contains a pointer to the original current tracer's name, that pointer can now be used to check if the current tracer has changed between different reads of the trace file. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index a8ce008..1787304 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1948,18 +1948,20 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu) static void *s_start(struct seq_file *m, loff_t *pos) { struct trace_iterator *iter = m->private; - static struct tracer *old_tracer; int cpu_file = iter->cpu_file; void *p = NULL; loff_t l = 0; int cpu; - /* copy the tracer to avoid using a global lock all around */ + /* +* copy the tracer to avoid using a global lock all around. +* iter->trace is a copy of current_trace, the pointer to the +* name may be used instead of a strcmp(), as iter->trace->name +* will point to the same string as current_trace->name. +*/ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); atomic_inc(&trace_record_cmdline_disabled); @@ -3481,7 +3483,6 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { struct trace_iterator *iter = filp->private_data; - static struct tracer *old_tracer; ssize_t sret; /* return any leftover data */ @@ -3493,10 +3494,8 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); /* @@ -3652,7 +3651,6 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, .ops= &tracing_pipe_buf_ops, .spd_release= tracing_spd_release_pipe, }; - static struct tracer *old_tracer; ssize_t ret; size_t rem; unsigned int i; @@ -3662,10 +3660,8 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && iter->trace->name != current_trace->name)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); mutex_lock(&iter->mutex); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 -tip 0/3] tracing: make a snapshot feature available from userspace
Hi, Steven, I updated the patch which replaces static variable 'old_tracer'. v3->v4: [1/3] tracing: replace static old_tracer with trace iterator's pointer to the original tracer's name - changed to comparison between pointers instead of strcmp (v3: https://lkml.org/lkml/2012/12/19/35) --- Hiraku Toyooka (3): tracing: replace static old_tracer with trace iterator's pointer to the original tracer's name tracing: make a snapshot feature available from userspace tracing: add description of snapshot to Documentation/trace/ftrace.txt Documentation/trace/ftrace.txt | 83 + include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 10 +++ kernel/trace/trace.c | 156 ++-- kernel/trace/trace.h |1 5 files changed, 228 insertions(+), 25 deletions(-) -- Hiraku TOYOOKA Linux Technology Center Yokohama Research Laboratory Hitachi Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 -tip 2/4] tracing: replace static old_tracer with strcmp
Hi, (12/21/2012 12:04 PM), Steven Rostedt wrote: On Wed, 2012-12-19 at 16:02 +0900, Hiraku Toyooka wrote: Currently, read functions for trace buffer use static "old_tracer" for detecting changes of current tracer. This is because we can assume that these functions are used from only one file ("trace"). But we are adding snapshot feature for ftrace, then those functions are called from two files. So we remove all static "old_tracer", and replace those with string comparison between current and previous tracers. I reworded what you wrote: Thank you very much. --- Currently the trace buffer read functions use a static variable "old_tracer" for detecting if the current tracer changes. This was suitable for a single trace file ("trace"), but to add a snapshot feature that will use the same function for its file, a check against a static variable is not sufficient. To use the output functions for two different files, instead of storing the current tracer in a static variable, as the trace iterator descriptor contains a pointer to the original current tracer's name, that pointer can now be used to check if the current tracer has changed between different reads of the trace file. --- And I also realized a more efficient approach. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 18 ++ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index a8ce008..8d05a44 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1948,7 +1948,6 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu) static void *s_start(struct seq_file *m, loff_t *pos) { struct trace_iterator *iter = m->private; - static struct tracer *old_tracer; int cpu_file = iter->cpu_file; void *p = NULL; loff_t l = 0; @@ -1956,10 +1955,9 @@ static void *s_start(struct seq_file *m, loff_t *pos) /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && +strcmp(iter->trace->name, current_trace->name) != 0)) *iter->trace = *current_trace; As iter->trace is a copy of current_trace, it points to everything that the current_trace pointed to. As the current_trace->name is a pointer to a const char string that never changes, we don't need to do the strcmp() you can simply do a direct comparison: if (unlikely(current_trace && iter->trace->name != current_trace->name)) { Yes. I'll update my patch to use this efficient way. I would add a comment about that too: /* * iter->trace is a copy of current_trace, the pointer to the * name may be used instead of a strcmp(), as iter->trace->name * will point to the same string as current_trace->name. */ OK. I'll include this comment in my patch. Thanks, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 -tip 4/4] tracing: add description of snapshot to Documentation/trace/ftrace.txt
This patch adds snapshot description in ftrace documentation. This description includes what the snapshot is and how to use it. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Rob Landley Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Documentation/trace/ftrace.txt | 83 1 file changed, 83 insertions(+) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 6f51fed..53d6a3c 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1842,6 +1842,89 @@ an error. # cat buffer_size_kb 85 +Snapshot + +CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature +available to all non latency tracers. (Latency tracers which +record max latency, such as "irqsoff" or "wakeup", can't use +this feature, since those are already using the snapshot +mechanism internally.) + +Snapshot preserves a current trace buffer at a particular point +in time without stopping tracing. Ftrace swaps the current +buffer with a spare buffer, and tracing continues in the new +current (=previous spare) buffer. + +The following debugfs files in "tracing" are related to this +feature: + + snapshot: + + This is used to take a snapshot and to read the output + of the snapshot. Echo 1 into this file to allocate a + spare buffer and to take a snapshot (swap), then read + the snapshot from this file in the same format as + "trace" (described above in the section "The File + System"). Both reads snapshot and tracing are executable + in parallel. When the spare buffer is allocated, echoing + 0 frees it, and echoing else (positive) values clear the + snapshot contents. + More details are shown in the table below. + + status\input | 0 | 1 |else| + --++++ + not allocated |(do nothing)| alloc+swap | EINVAL | + --++++ + allocated |free|swap| clear| + --++++ + +Here is an example of using the snapshot feature. + + # echo 1 > events/sched/enable + # echo 1 > snapshot + # cat snapshot +# tracer: nop +# +# entries-in-buffer/entries-written: 71/71 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 + sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 +[...] + -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 + + # cat trace +# tracer: nop +# +# entries-in-buffer/entries-written: 77/77 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 + snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 +[...] + + +If you try to use this snapshot feature when current tracer is +one of the latency tracers, you will get the following results. + + # echo wakeup > current_tracer + # echo 1 > snapshot +bash: echo: write error: Device or resource busy + # cat snapshot +cat: snapshot: Device or resource busy + --- More details can be found in the source code, in the -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 -tip 2/4] tracing: replace static old_tracer with strcmp
Currently, read functions for trace buffer use static "old_tracer" for detecting changes of current tracer. This is because we can assume that these functions are used from only one file ("trace"). But we are adding snapshot feature for ftrace, then those functions are called from two files. So we remove all static "old_tracer", and replace those with string comparison between current and previous tracers. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 18 ++ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index a8ce008..8d05a44 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1948,7 +1948,6 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu) static void *s_start(struct seq_file *m, loff_t *pos) { struct trace_iterator *iter = m->private; - static struct tracer *old_tracer; int cpu_file = iter->cpu_file; void *p = NULL; loff_t l = 0; @@ -1956,10 +1955,9 @@ static void *s_start(struct seq_file *m, loff_t *pos) /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && +strcmp(iter->trace->name, current_trace->name) != 0)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); atomic_inc(&trace_record_cmdline_disabled); @@ -3481,7 +3479,6 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { struct trace_iterator *iter = filp->private_data; - static struct tracer *old_tracer; ssize_t sret; /* return any leftover data */ @@ -3493,10 +3490,9 @@ tracing_read_pipe(struct file *filp, char __user *ubuf, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && +strcmp(iter->trace->name, current_trace->name) != 0)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); /* @@ -3652,7 +3648,6 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, .ops= &tracing_pipe_buf_ops, .spd_release= tracing_spd_release_pipe, }; - static struct tracer *old_tracer; ssize_t ret; size_t rem; unsigned int i; @@ -3662,10 +3657,9 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); - if (unlikely(old_tracer != current_trace && current_trace)) { - old_tracer = current_trace; + if (unlikely(current_trace && +strcmp(iter->trace->name, current_trace->name) != 0)) *iter->trace = *current_trace; - } mutex_unlock(&trace_types_lock); mutex_lock(&iter->mutex); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 -tip 3/4] tracing: make a snapshot feature available from userspace
Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add "snapshot" debugfs file in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot This will allocate the spare buffer for snapshot (if it is not allocated), and take a snapshot. # cat snapshot This will show contents of the snapshot. # echo 0 > snapshot This will free the snapshot if it is allocated. Any other positive values will clear the snapshot contents if the snapshot is allocated, or return EINVAL if it is not allocated. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jiri Olsa Cc: David Sharp Cc: linux-kernel@vger.kernel.org --- include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 10 +++ kernel/trace/trace.c | 134 ++ kernel/trace/trace.h |1 4 files changed, 136 insertions(+), 12 deletions(-) diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index a3d4895..9bebadd 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -84,6 +84,9 @@ struct trace_iterator { longidx; cpumask_var_t started; + + /* it's true when current open file is snapshot */ + boolsnapshot; }; enum trace_iter_flags { diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 5d89335..82a8ff5 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -250,6 +250,16 @@ config FTRACE_SYSCALLS help Basic tracer to catch the syscall entry and exit events. +config TRACER_SNAPSHOT + bool "Create a snapshot trace buffer" + select TRACER_MAX_TRACE + help + Allow tracing users to take snapshot of the current buffer using the + ftrace interface, e.g.: + + echo 1 > /sys/kernel/debug/tracing/snapshot + cat snapshot + config TRACE_BRANCH_PROFILING bool select GENERIC_TRACER diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 8d05a44..0e5abce 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -709,7 +709,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -739,7 +739,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -1960,7 +1960,11 @@ static void *s_start(struct seq_file *m, loff_t *pos) *iter->trace = *current_trace; mutex_unlock(&trace_types_lock); - atomic_inc(&trace_record_cmdline_disabled); + if (iter->snapshot && iter->trace->use_max_tr) + return ERR_PTR(-EBUSY); + + if (!iter->snapshot) + atomic_inc(&trace_record_cmdline_disabled); if (*pos != iter->pos) { iter->ent = NULL; @@ -1999,7 +2003,11 @@ static void s_stop(struct seq_file *m, void *p) { struct trace_iterator *iter = m->private; - atomic_dec(&trace_record_cmdline_disabled); + if (iter->snapshot && iter->trace->use_max_tr) + return; + + if (!iter->snapshot) + atomic_dec(&trace_record_cmdline_disabled); trace_access_unlock(iter->cpu_file); trace_event_read_unlock(); } @@ -2434,7 +2442,7 @@ static const struct seq_operations tracer_seq_ops = { }; static struct trace_iterator * -__tracing_open(struct inode *inode, struct file *file) +__tracing_open(struct inode *inode, struct file *file, bool snapshot) { long cpu_file = (long) inode->i_private; struct trace_iterator *iter; @@ -2467,10 +2475,11 @@ __tracing_open(struct inode *inode, struct file *file) if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL)) goto fail; - if (current_trace && current_trace->print_max) + if ((current_trace && current_trace->print_max) || snapshot) iter->tr = &max_tr; else iter->tr = &global_trace; + iter->snapshot = snapshot; iter->pos = -1; mutex_init(&iter->mutex); iter->cpu_file = cpu_file; @@ -2487,8 +2496,9 @@ __tracing_
[PATCH v3 -tip 1/4] tracing: add checks if tr->buffer is NULL in tracing_reset{_online_cpus}
max_tr->buffer could be NULL in the tracing_reset{_online_cpus}. In this case, a NULL pointer dereference happens, so we should return immediately from these functions. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index c705c7a..a8ce008 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -922,6 +922,9 @@ void tracing_reset(struct trace_array *tr, int cpu) { struct ring_buffer *buffer = tr->buffer; + if (!buffer) + return; + ring_buffer_record_disable(buffer); /* Make sure all commits have finished */ @@ -936,6 +939,9 @@ void tracing_reset_online_cpus(struct trace_array *tr) struct ring_buffer *buffer = tr->buffer; int cpu; + if (!buffer) + return; + ring_buffer_record_disable(buffer); /* Make sure all commits have finished */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 -tip 0/4] tracing: make a snapshot feature available from userspace
Hi, Steven, Thank you for your review. I applied your review comments. These patches depend on the next patch. tracing: Add a resize function to make one buffer equivalent to another buffer http://lkml.kernel.org/r/20121017025616.2627.91226.stgit@falsita v2->v3: [1/4] tracing: add checks if tr->buffer is NULL in tracing_reset{_online_cpus} - (new patch) [2/4] tracing: replace static old_tracer with strcmp - (new patch) [3/4] tracing: make a snapshot feature available from userspace - changed snapshot file I/F (removed "snapshot_allocate") - changed CONFIG_TRACER_SNAPSHOT's location and description - changed an integer flag of __tracing_open() to bool - switched to use seq_read() directly for reading snapshot [4/4] tracing: add description of snapshot to Documentation/trace/ftrace.txt - updated documentation of the snapshot ToDo: - adding "trace_snapshot" kernel parameter to allocate spare buffer on boot. (v1: https://lkml.org/lkml/2012/10/2/67) (v2: https://lkml.org/lkml/2012/10/16/585) --- Hiraku Toyooka (4): tracing: add checks if tr->buffer is NULL in tracing_reset{_online_cpus} tracing: replace static old_tracer with strcmp tracing: make a snapshot feature available from userspace tracing: add description of snapshot to Documentation/trace/ftrace.txt Documentation/trace/ftrace.txt | 83 + include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 10 +++ kernel/trace/trace.c | 158 ++-- kernel/trace/trace.h |1 5 files changed, 231 insertions(+), 24 deletions(-) -- Hiraku TOYOOKA Linux Technology Center Yokohama Research Laboratory Hitachi Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
(12/15/2012 02:08 AM), Steven Rostedt wrote: On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote: Hi, Steven, (2012/11/30 23:17), Steven Rostedt wrote: [snip] > > Actually, I would have: > > status\input | 0 | 1 |else| > --++++ > not allocated |(do nothing)| alloc+swap | EINVAL | > --++++ >allocated |free| swap | clear| > --++++ > > Perhaps we don't need to do the clear on swap, just let the trace > continue where it left off? But in case we should swap... > I think we don't need the clear on swap too. I'll update my patches like this table. > There's a fast way to clear the tracer. Look at what the wakeup tracer > does. We can make that generic. If you want, I can write that code up > too. Hmm, maybe I'll do that, as it will speed things up for > everyone :-) > BTW, any update on this? I really like to get this into 3.9. I'm applying your review comment and rebasing now. So, I'll post the updated patch series in a few days. Thanks, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
(unlikely(iter->old_tracer != current_trace && current_trace)) { >> +iter->old_tracer = current_trace; >> *iter->trace = *current_trace; >> } >> mutex_unlock(&trace_types_lock); >> >> + if (iter->snapshot && iter->trace->use_max_tr) >> +return ERR_PTR(-EBUSY); >> + >> ... >> } >> >> static void s_stop(struct seq_file *m, void *p) >> { >> struct trace_iterator *iter = m->private; >> >> +if (iter->snapshot && iter->trace->use_max_tr) >> +return; > > This part shouldn't be needed, as if s_start fails it wont call > s_stop(). But if you are paranoid (like I am ;-) then we can do: > > if (WARN_ON_ONCE(iter->snapshot && iter->trace->use_max_tr) > return; I think that seq_read() calls s_stop() even if s_start() failed. seq_read()@fs/seq_file.c: p = m->op->start(m, &pos); while (1) { err = PTR_ERR(p); if (!p || IS_ERR(p)) break; ... } m->op->stop(m, p); So, I think we need the check in s_stop(), don't we? Thanks, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
er) > return; > I see. I'll add the test to tracing_reset_online_cpus(). Should I make a separated patch? [snip] >> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf, >> + size_t cnt, loff_t *ppos) >> +{ >> +ssize_t ret = 0; >> + >> +mutex_lock(&trace_types_lock); >> +if (current_trace && current_trace->use_max_tr) >> +ret = -EBUSY; >> +mutex_unlock(&trace_types_lock); > > I don't like this, as it is racy. The current tracer could change after > the unlock, and your back to the problem. > You're right... This is racy. > Now what we may be able to do, but it would take a little checking for > lock ordering with trace_access_lock() and trace_event_read_lock(), but > we could add the mutex to trace_types_lock to both s_start() and > s_stop() and do the check in s_start() if iter->snapshot is true. > If I catch your meaning, do s_start() and s_stop() become like following code? (Now, s_start() is used from two files - "trace" and "snapshot", so we should change static "old_tracer" to per open-file.) static void *s_start(struct seq_file *m, loff_t *pos) { struct trace_iterator *iter = m->private; -static struct tracer *old_tracer; ... /* copy the tracer to avoid using a global lock all around */ mutex_lock(&trace_types_lock); -if (unlikely(old_tracer != current_trace && current_trace)) { -old_tracer = current_trace; +if (unlikely(iter->old_tracer != current_trace && current_trace)) { +iter->old_tracer = current_trace; *iter->trace = *current_trace; } mutex_unlock(&trace_types_lock); +if (iter->snapshot && iter->trace->use_max_tr) +return ERR_PTR(-EBUSY); + ... } static void s_stop(struct seq_file *m, void *p) { struct trace_iterator *iter = m->private; +if (iter->snapshot && iter->trace->use_max_tr) +return; ... } > This means we can nuke the tracing_snapshot_read() and just use > seq_read() directly. > Yes, I see. Thanks, Hiraku Toyooka >> +if (ret < 0) >> +return ret; >> + >> +ret = seq_read(filp, ubuf, cnt, ppos); >> + >> +return ret; >> +} >> + >> +static ssize_t >> +tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt, >> + loff_t *ppos) >> +{ >> +unsigned long val = 0; >> +int ret; >> + >> +ret = tracing_update_buffers(); >> +if (ret < 0) >> +return ret; >> + >> +ret = kstrtoul_from_user(ubuf, cnt, 10, &val); >> +if (ret) >> +return ret; >> + >> +mutex_lock(&trace_types_lock); >> + >> +if (current_trace->use_max_tr) { >> +ret = -EBUSY; >> +goto out; >> +} > > Again, I would check the value of 'val' and if it is '1' then allocate > the buffers, '0' free them, and anything else just clear them or return > EINVAL if the buffers are not allocated. > > If it is '1' and the buffers are already allocated, then clear them too. > > Then we can just remove the 'snapshot_allocate' file. > > Thanks! > > -- Steve > >> + >> +if (!current_trace->allocated_snapshot) { >> +/* allocate spare buffer for snapshot */ >> +ret = resize_buffer_duplicate_size(&max_tr, &global_trace, >> + RING_BUFFER_ALL_CPUS); >> +if (ret < 0) >> +goto out; >> +current_trace->allocated_snapshot = true; >> +} >> + >> +if (val) { >> +local_irq_disable(); >> +update_max_tr(&global_trace, current, smp_processor_id()); >> +local_irq_enable(); >> +} else >> +tracing_reset_online_cpus(&max_tr); >> + >> +*ppos += cnt; >> +ret = cnt; >> +out: >> +mutex_unlock(&trace_types_lock); >> +return ret; >> +} >> + >> +static ssize_t >> +tracing_snapshot_allocate_read(struct file *filp, char __user *ubuf, >> + size_t cnt, loff_t *ppos) >> +{ >> +char buf[64]; >> +int r; >> + >> +r = sprintf(buf, "%d\n", current_trace->allocated_snapshot); >> +return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); >> +} >> + >> +static ssize_t >> +tracing_snapshot_allocate_write(struct file *filp, const char __user *ubuf, >> +s
[tip:perf/core] tracing: Change tracer's integer flags to bool
Commit-ID: f43c738bfa8608424610e4fc1aef4d4644e2ce11 Gitweb: http://git.kernel.org/tip/f43c738bfa8608424610e4fc1aef4d4644e2ce11 Author: Hiraku Toyooka AuthorDate: Tue, 2 Oct 2012 17:27:10 +0900 Committer: Steven Rostedt CommitDate: Wed, 31 Oct 2012 16:45:25 -0400 tracing: Change tracer's integer flags to bool print_max and use_max_tr in struct tracer are "int" variables and used like flags. This is wasteful, so change the type to "bool". Link: http://lkml.kernel.org/r/20121002082710.9807.86393.stgit@falsita Signed-off-by: Hiraku Toyooka Signed-off-by: Steven Rostedt --- kernel/trace/trace.h |4 ++-- kernel/trace/trace_irqsoff.c | 12 ++-- kernel/trace/trace_sched_wakeup.c |8 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index c15f528..c56a233 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -285,8 +285,8 @@ struct tracer { int (*set_flag)(u32 old_flags, u32 bit, int set); struct tracer *next; struct tracer_flags *flags; - int print_max; - int use_max_tr; + boolprint_max; + booluse_max_tr; }; diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index 11edebd..5ffce7b 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -604,7 +604,7 @@ static struct tracer irqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -614,7 +614,7 @@ static struct tracer irqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_irqsoff(trace) register_tracer(&trace) #else @@ -637,7 +637,7 @@ static struct tracer preemptoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -647,7 +647,7 @@ static struct tracer preemptoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptoff(trace) register_tracer(&trace) #else @@ -672,7 +672,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -682,7 +682,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptirqsoff(trace) register_tracer(&trace) diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c index 2f6af78..bc64fc1 100644 --- a/kernel/trace/trace_sched_wakeup.c +++ b/kernel/trace/trace_sched_wakeup.c @@ -589,7 +589,7 @@ static struct tracer wakeup_tracer __read_mostly = .reset = wakeup_tracer_reset, .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &tracer_flags, @@ -599,7 +599,7 @@ static struct tracer wakeup_tracer __read_mostly = #endif .open = wakeup_trace_open, .close = wakeup_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; static struct tracer wakeup_rt_tracer __read_mostly = @@ -610,7 +610,7 @@ static struct tracer wakeup_rt_tracer __read_mostly = .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, .wait_pipe = poll_wait_pipe, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &a
[PATCH v2 -tip 2/4] tracing: add a resize function for making one buffer equivalent to the other buffer
Trace buffer size is now per-cpu, so that there are following two patterns in resize of the buffers. (1) resize per-cpu buffers to same given size (2) resize per-cpu buffers to the other trace_array's buffer size for each CPU (such as preparing the max_tr which is equivalent to the global_trace's size) __tracing_resize_ring_buffer() can be used for (1), and had implemented (2) inside it for resetting the global_trace to the original size. (2) was also implemented in other place. So this patch assembles them in a new function - resize_buffer_duplicate_size(). Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 58 +++--- 1 files changed, 31 insertions(+), 27 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 08acf42..d71eee1 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3017,6 +3017,31 @@ static void set_buffer_entries(struct trace_array *tr, unsigned long val) tr->data[cpu]->entries = val; } +/* resize @tr's buffer to the size of @size_tr's entries */ +static int resize_buffer_duplicate_size(struct trace_array *tr, + struct trace_array *size_tr, int cpu_id) +{ + int cpu, ret = 0; + + if (cpu_id == RING_BUFFER_ALL_CPUS) { + for_each_tracing_cpu(cpu) { + ret = ring_buffer_resize(tr->buffer, + size_tr->data[cpu]->entries, cpu); + if (ret < 0) + break; + tr->data[cpu]->entries = size_tr->data[cpu]->entries; + } + } else { + ret = ring_buffer_resize(tr->buffer, + size_tr->data[cpu_id]->entries, cpu_id); + if (ret == 0) + tr->data[cpu_id]->entries = + size_tr->data[cpu_id]->entries; + } + + return ret; +} + static int __tracing_resize_ring_buffer(unsigned long size, int cpu) { int ret; @@ -3037,23 +3062,8 @@ static int __tracing_resize_ring_buffer(unsigned long size, int cpu) ret = ring_buffer_resize(max_tr.buffer, size, cpu); if (ret < 0) { - int r = 0; - - if (cpu == RING_BUFFER_ALL_CPUS) { - int i; - for_each_tracing_cpu(i) { - r = ring_buffer_resize(global_trace.buffer, - global_trace.data[i]->entries, - i); - if (r < 0) - break; - } - } else { - r = ring_buffer_resize(global_trace.buffer, - global_trace.data[cpu]->entries, - cpu); - } - + int r = resize_buffer_duplicate_size(&global_trace, +&global_trace, cpu); if (r < 0) { /* * AARGH! We are left with different @@ -3191,17 +3201,11 @@ static int tracing_set_tracer(const char *buf) topts = create_trace_option_files(t); if (t->use_max_tr) { - int cpu; /* we need to make per cpu buffer sizes equivalent */ - for_each_tracing_cpu(cpu) { - ret = ring_buffer_resize(max_tr.buffer, - global_trace.data[cpu]->entries, - cpu); - if (ret < 0) - goto out; - max_tr.data[cpu]->entries = - global_trace.data[cpu]->entries; - } + ret = resize_buffer_duplicate_size(&max_tr, &global_trace, + RING_BUFFER_ALL_CPUS); + if (ret < 0) + goto out; } if (t->init) { -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 -tip 4/4] tracing: add description of snapshot to Documentation/trace/ftrace.txt
This patch adds snapshot description in ftrace documentation. This description includes what the snapshot is and how to use it. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Rob Landley Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Documentation/trace/ftrace.txt | 97 1 files changed, 97 insertions(+), 0 deletions(-) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 6f51fed..68ac294 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1842,6 +1842,103 @@ an error. # cat buffer_size_kb 85 +Snapshot + +CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature +available to all non latency tracers. (Latency tracers which +record max latency, such as "irqsoff" or "wakeup", can't use +this feature, since those are already using the snapshot +mechanism internally.) + +Snapshot preserves a trace buffer at a particular point in time +without stopping tracing. Ftrace swaps the current buffer with a +spare buffer, and tracing continues in the (previous) spare +buffer. + +The following debugfs files in "tracing" are related to this +feature: + + snapshot: + + This is used to take a snapshot and to read the output + of the snapshot. Echo 1 into this file to allocate a + spare buffer and to take a snapshot, then read the + snapshot from the file in the same format as "trace" + (described above in the section "The File System"). Both + reads snapshot and tracing are executable in parallel. + Echoing 0 erases the snapshot contents. + + snapshot_allocate: + + This is used to pre-allocate or free a spare buffer. + Echo 1 into this file to pre-allocate a spare buffer if + you don't want to fail in the next snapshot due to + memory allocation failure, or if you don't want to lose + older trace data while allocating buffer. Echo 0 to free + the spare buffer when the snapshot becomes unnecessary. + If you take the next snapshot again, you can reuse the + buffer, then just erase the snapshot contents by echoing + 1 into the "snapshot" file, instead of freeing the + buffer. + + Reads from this file display whether the spare buffer is + allocated. When current_tracer is changed, the allocated + spare buffer is freed. If the next tracer is one of the + latency tracers, this value turns into 1 and can't be + changed, or else the value starts with 0. + + +Here is an example of using the snapshot feature. + + # echo 1 > snapshot_allocate (if you want to pre-allocate the spare buffer) + # echo 1 > events/sched/enable + # echo 1 > snapshot + # cat snapshot +# tracer: nop +# +# entries-in-buffer/entries-written: 71/71 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 + sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 +[...] + -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 + # cat trace +# tracer: nop +# +# entries-in-buffer/entries-written: 77/77 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 + snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 +[...] + + +If you try to use this snapshot feature when current tracer is +one of the latency tracers, you will get the following results. + + # echo wakeup > current_tracer + # cat snapshot_allocate +1 + # echo 1 > snapshot_allocate +bash: echo: write error: D
[PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add following two debugfs files in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot # cat snapshot snapshot_allocate: Echoing 1 to the "snapshot" allocates the max_tr buffer if it is not allocated. So taking a snapshot may delay or fail beacuse of memory allocation. To avoid that, this file provides a mean to pre-allocate (or free) the max_tr buffer. # echo 1 > snapshot_allocate [...] # echo 1 > snapshot Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jiri Olsa Cc: Li Zefan Cc: linux-kernel@vger.kernel.org --- include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 11 ++ kernel/trace/trace.c | 190 +++--- kernel/trace/trace.h |1 4 files changed, 193 insertions(+), 12 deletions(-) diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index 642928c..8c32c6e 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -84,6 +84,9 @@ struct trace_iterator { longidx; cpumask_var_t started; + + /* it's true when current open file is snapshot */ + boolsnapshot; }; diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 4cea4f4..73d56d5 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP Allow the use of ring_buffer_swap_cpu. Adds a very slight overhead to tracing when enabled. +config TRACER_SNAPSHOT + bool + default y + select TRACER_MAX_TRACE + help + Allow tracing users to take snapshot of the current buffer using the + ftrace interface, e.g.: + + echo 1 > /sys/kernel/debug/tracing/snapshot + cat snapshot + # All tracer options should select GENERIC_TRACER. For those options that are # enabled by all tracers (context switch and event tracer) they select TRACING. # This allows those options to appear when no other tracer is selected. But the diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index d71eee1..b0d097e 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos) } mutex_unlock(&trace_types_lock); - atomic_inc(&trace_record_cmdline_disabled); + if (!iter->snapshot) + atomic_inc(&trace_record_cmdline_disabled); if (*pos != iter->pos) { iter->ent = NULL; @@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p) { struct trace_iterator *iter = m->private; - atomic_dec(&trace_record_cmdline_disabled); + if (!iter->snapshot) + atomic_dec(&trace_record_cmdline_disabled); trace_access_unlock(iter->cpu_file); trace_event_read_unlock(); } @@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = { }; static struct trace_iterator * -__tracing_open(struct inode *inode, struct file *file) +__tracing_open(struct inode *inode, struct file *file, int snapshot) { long cpu_file = (long) inode->i_private; struct trace_iterator *iter; @@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file) if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL)) goto fail; - if (current_trace && current_trace->print_max) + if ((current_trace && current_trace->print_max) || snapshot) iter->tr = &max_tr; else iter->tr = &global_trace; + iter->snapshot = !!snapshot; iter->pos = -1; mutex_init(&iter->mutex); iter->cpu_file = cpu_file; @@ -2424,8 +2427,9 @@ __tracing_open(struct inode *inode, struct file *file) if (ring_buffer_overruns(iter->tr->buffer)) iter
[PATCH v2 -tip 1/4] tracing: change tracer's integer flags to bool
print_max and use_max_tr in struct tracer are "int" variables and used like flags. This is wasteful, so change the type to "bool". Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.h |4 ++-- kernel/trace/trace_irqsoff.c | 12 ++-- kernel/trace/trace_sched_wakeup.c |8 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 593debe..0eb6a1a 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -285,8 +285,8 @@ struct tracer { int (*set_flag)(u32 old_flags, u32 bit, int set); struct tracer *next; struct tracer_flags *flags; - int print_max; - int use_max_tr; + boolprint_max; + booluse_max_tr; }; diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index d98ee82..5c823c1 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -604,7 +604,7 @@ static struct tracer irqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -614,7 +614,7 @@ static struct tracer irqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_irqsoff(trace) register_tracer(&trace) #else @@ -637,7 +637,7 @@ static struct tracer preemptoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -647,7 +647,7 @@ static struct tracer preemptoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptoff(trace) register_tracer(&trace) #else @@ -672,7 +672,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -682,7 +682,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptirqsoff(trace) register_tracer(&trace) diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c index 02170c0..43f8abc 100644 --- a/kernel/trace/trace_sched_wakeup.c +++ b/kernel/trace/trace_sched_wakeup.c @@ -589,7 +589,7 @@ static struct tracer wakeup_tracer __read_mostly = .reset = wakeup_tracer_reset, .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &tracer_flags, @@ -599,7 +599,7 @@ static struct tracer wakeup_tracer __read_mostly = #endif .open = wakeup_trace_open, .close = wakeup_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; static struct tracer wakeup_rt_tracer __read_mostly = @@ -610,7 +610,7 @@ static struct tracer wakeup_rt_tracer __read_mostly = .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, .wait_pipe = poll_wait_pipe, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &tracer_flags, @@ -620,7 +620,7 @@ static struct tracer wakeup_rt_tracer __read_mostly = #endif .open = wakeup_trace_open, .close = wakeup_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; __init static int init_wakeup_tracer(void) -- To unsubscribe from this list: send
[PATCH v2 -tip 0/4] tracing: make a snapshot feature available from userspace
Hi, Steven, Thank you for your review. I changed the function name "resize_buffer_even" to "resize_buffer_duplicate_size". (v1: https://lkml.org/lkml/2012/10/2/67) --- Hiraku Toyooka (4): tracing: add description of snapshot to Documentation/trace/ftrace.txt tracing: make a snapshot feature available from userspace tracing: add a resize function for making one buffer equivalent to the other buffer tracing: change tracer's integer flags to bool Documentation/trace/ftrace.txt| 97 ++ include/linux/ftrace_event.h |3 kernel/trace/Kconfig | 11 ++ kernel/trace/trace.c | 248 +++-- kernel/trace/trace.h |5 - kernel/trace/trace_irqsoff.c | 12 +- kernel/trace/trace_sched_wakeup.c |8 + 7 files changed, 333 insertions(+), 51 deletions(-) -- Hiraku TOYOOKA Linux Technology Center Yokohama Research Laboratory Hitachi Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -tip 2/4] tracing: add a resize function for making one buffer equivalent to the other buffer
Steven, I'm sorry for my late reply. (I was outside the office for business trip.) (2012/10/06 1:59), Steven Rostedt wrote: >> +/* resize @tr's buffer to the size of @size_tr's entries */ >> +static int resize_buffer_even(struct trace_array *tr, > > I don't mind this patch, but I just hate the name "resize_buffer_even". > What about "resize_buffer_duplicate_size"? > O.K. I'll send modified version of this patch later. Regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -tip 1/4] tracing: change tracer's integer flags to bool
print_max and use_max_tr in struct tracer are "int" variables and used like flags. This is wasteful, so change the type to "bool". Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.h |4 ++-- kernel/trace/trace_irqsoff.c | 12 ++-- kernel/trace/trace_sched_wakeup.c |8 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 593debe..0eb6a1a 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -285,8 +285,8 @@ struct tracer { int (*set_flag)(u32 old_flags, u32 bit, int set); struct tracer *next; struct tracer_flags *flags; - int print_max; - int use_max_tr; + boolprint_max; + booluse_max_tr; }; diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index d98ee82..5c823c1 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -604,7 +604,7 @@ static struct tracer irqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -614,7 +614,7 @@ static struct tracer irqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_irqsoff(trace) register_tracer(&trace) #else @@ -637,7 +637,7 @@ static struct tracer preemptoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -647,7 +647,7 @@ static struct tracer preemptoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptoff(trace) register_tracer(&trace) #else @@ -672,7 +672,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = .reset = irqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = irqsoff_print_header, .print_line = irqsoff_print_line, .flags = &tracer_flags, @@ -682,7 +682,7 @@ static struct tracer preemptirqsoff_tracer __read_mostly = #endif .open = irqsoff_trace_open, .close = irqsoff_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; # define register_preemptirqsoff(trace) register_tracer(&trace) diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c index 02170c0..43f8abc 100644 --- a/kernel/trace/trace_sched_wakeup.c +++ b/kernel/trace/trace_sched_wakeup.c @@ -589,7 +589,7 @@ static struct tracer wakeup_tracer __read_mostly = .reset = wakeup_tracer_reset, .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &tracer_flags, @@ -599,7 +599,7 @@ static struct tracer wakeup_tracer __read_mostly = #endif .open = wakeup_trace_open, .close = wakeup_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; static struct tracer wakeup_rt_tracer __read_mostly = @@ -610,7 +610,7 @@ static struct tracer wakeup_rt_tracer __read_mostly = .start = wakeup_tracer_start, .stop = wakeup_tracer_stop, .wait_pipe = poll_wait_pipe, - .print_max = 1, + .print_max = true, .print_header = wakeup_print_header, .print_line = wakeup_print_line, .flags = &tracer_flags, @@ -620,7 +620,7 @@ static struct tracer wakeup_rt_tracer __read_mostly = #endif .open = wakeup_trace_open, .close = wakeup_trace_close, - .use_max_tr = 1, + .use_max_tr = true, }; __init static int init_wakeup_tracer(void) -- To unsubscribe from this list: send
[PATCH -tip 4/4] tracing: add description of snapshot to Documentation/trace/ftrace.txt
This patch adds snapshot description in ftrace documentation. This description includes what the snapshot is and how to use it. Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Rob Landley Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Documentation/trace/ftrace.txt | 97 1 files changed, 97 insertions(+), 0 deletions(-) diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 6f51fed..68ac294 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1842,6 +1842,103 @@ an error. # cat buffer_size_kb 85 +Snapshot + +CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature +available to all non latency tracers. (Latency tracers which +record max latency, such as "irqsoff" or "wakeup", can't use +this feature, since those are already using the snapshot +mechanism internally.) + +Snapshot preserves a trace buffer at a particular point in time +without stopping tracing. Ftrace swaps the current buffer with a +spare buffer, and tracing continues in the (previous) spare +buffer. + +The following debugfs files in "tracing" are related to this +feature: + + snapshot: + + This is used to take a snapshot and to read the output + of the snapshot. Echo 1 into this file to allocate a + spare buffer and to take a snapshot, then read the + snapshot from the file in the same format as "trace" + (described above in the section "The File System"). Both + reads snapshot and tracing are executable in parallel. + Echoing 0 erases the snapshot contents. + + snapshot_allocate: + + This is used to pre-allocate or free a spare buffer. + Echo 1 into this file to pre-allocate a spare buffer if + you don't want to fail in the next snapshot due to + memory allocation failure, or if you don't want to lose + older trace data while allocating buffer. Echo 0 to free + the spare buffer when the snapshot becomes unnecessary. + If you take the next snapshot again, you can reuse the + buffer, then just erase the snapshot contents by echoing + 1 into the "snapshot" file, instead of freeing the + buffer. + + Reads from this file display whether the spare buffer is + allocated. When current_tracer is changed, the allocated + spare buffer is freed. If the next tracer is one of the + latency tracers, this value turns into 1 and can't be + changed, or else the value starts with 0. + + +Here is an example of using the snapshot feature. + + # echo 1 > snapshot_allocate (if you want to pre-allocate the spare buffer) + # echo 1 > events/sched/enable + # echo 1 > snapshot + # cat snapshot +# tracer: nop +# +# entries-in-buffer/entries-written: 71/71 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 + sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 +[...] + -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 + # cat trace +# tracer: nop +# +# entries-in-buffer/entries-written: 77/77 #P:8 +# +# _-=> irqs-off +# / _=> need-resched +#| / _---=> hardirq/softirq +#|| / _--=> preempt-depth +#||| / delay +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 + snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 +[...] + + +If you try to use this snapshot feature when current tracer is +one of the latency tracers, you will get the following results. + + # echo wakeup > current_tracer + # cat snapshot_allocate +1 + # echo 1 > snapshot_allocate +bash: echo: write error: D
[PATCH -tip 3/4] tracing: make a snapshot feature available from userspace
Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add following two debugfs files in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot # cat snapshot snapshot_allocate: Echoing 1 to the "snapshot" allocates the max_tr buffer if it is not allocated. So taking a snapshot may delay or fail beacuse of memory allocation. To avoid that, this file provides a mean to pre-allocate (or free) the max_tr buffer. # echo 1 > snapshot_allocate [...] # echo 1 > snapshot Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jiri Olsa Cc: Li Zefan Cc: linux-kernel@vger.kernel.org --- include/linux/ftrace_event.h |3 + kernel/trace/Kconfig | 11 ++ kernel/trace/trace.c | 190 +++--- kernel/trace/trace.h |1 4 files changed, 193 insertions(+), 12 deletions(-) diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index 642928c..8c32c6e 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -84,6 +84,9 @@ struct trace_iterator { longidx; cpumask_var_t started; + + /* it's true when current open file is snapshot */ + boolsnapshot; }; diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 4cea4f4..73d56d5 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP Allow the use of ring_buffer_swap_cpu. Adds a very slight overhead to tracing when enabled. +config TRACER_SNAPSHOT + bool + default y + select TRACER_MAX_TRACE + help + Allow tracing users to take snapshot of the current buffer using the + ftrace interface, e.g.: + + echo 1 > /sys/kernel/debug/tracing/snapshot + cat snapshot + # All tracer options should select GENERIC_TRACER. For those options that are # enabled by all tracers (context switch and event tracer) they select TRACING. # This allows those options to appear when no other tracer is selected. But the diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 1e599e6..dac5733 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) return; WARN_ON_ONCE(!irqs_disabled()); - if (!current_trace->use_max_tr) { + if (!current_trace->allocated_snapshot) { WARN_ON_ONCE(1); return; } @@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos) } mutex_unlock(&trace_types_lock); - atomic_inc(&trace_record_cmdline_disabled); + if (!iter->snapshot) + atomic_inc(&trace_record_cmdline_disabled); if (*pos != iter->pos) { iter->ent = NULL; @@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p) { struct trace_iterator *iter = m->private; - atomic_dec(&trace_record_cmdline_disabled); + if (!iter->snapshot) + atomic_dec(&trace_record_cmdline_disabled); trace_access_unlock(iter->cpu_file); trace_event_read_unlock(); } @@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = { }; static struct trace_iterator * -__tracing_open(struct inode *inode, struct file *file) +__tracing_open(struct inode *inode, struct file *file, int snapshot) { long cpu_file = (long) inode->i_private; struct trace_iterator *iter; @@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file) if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL)) goto fail; - if (current_trace && current_trace->print_max) + if ((current_trace && current_trace->print_max) || snapshot) iter->tr = &max_tr; else iter->tr = &global_trace; + iter->snapshot = !!snapshot; iter->pos = -1; mutex_init(&iter->mutex); iter->cpu_file = cpu_file; @@ -2424,8 +2427,9 @@ __tracing_open(struct inode *inode, struct file *file) if (ring_buffer_overruns(iter->tr->buffer)) iter
[PATCH -tip 2/4] tracing: add a resize function for making one buffer equivalent to the other buffer
Trace buffer size is now per-cpu, so that there are following two patterns in resize of the buffers. (1) resize per-cpu buffers to same given size (2) resize per-cpu buffers to the other trace_array's buffer size for each CPU (such as preparing the max_tr which is equivalent to the global_trace's size) __tracing_resize_ring_buffer() can be used for (1), and had implemented (2) inside it for resetting the global_trace to the original size. (2) was also implemented in other place. So this patch assembles them in a new function - resize_buffer_even(). Signed-off-by: Hiraku Toyooka Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 57 ++ 1 files changed, 30 insertions(+), 27 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 08acf42..1e599e6 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3017,6 +3017,31 @@ static void set_buffer_entries(struct trace_array *tr, unsigned long val) tr->data[cpu]->entries = val; } +/* resize @tr's buffer to the size of @size_tr's entries */ +static int resize_buffer_even(struct trace_array *tr, + struct trace_array *size_tr, int cpu_id) +{ + int cpu, ret = 0; + + if (cpu_id == RING_BUFFER_ALL_CPUS) { + for_each_tracing_cpu(cpu) { + ret = ring_buffer_resize(tr->buffer, + size_tr->data[cpu]->entries, cpu); + if (ret < 0) + break; + tr->data[cpu]->entries = size_tr->data[cpu]->entries; + } + } else { + ret = ring_buffer_resize(tr->buffer, + size_tr->data[cpu_id]->entries, cpu_id); + if (ret == 0) + tr->data[cpu_id]->entries = + size_tr->data[cpu_id]->entries; + } + + return ret; +} + static int __tracing_resize_ring_buffer(unsigned long size, int cpu) { int ret; @@ -3037,23 +3062,7 @@ static int __tracing_resize_ring_buffer(unsigned long size, int cpu) ret = ring_buffer_resize(max_tr.buffer, size, cpu); if (ret < 0) { - int r = 0; - - if (cpu == RING_BUFFER_ALL_CPUS) { - int i; - for_each_tracing_cpu(i) { - r = ring_buffer_resize(global_trace.buffer, - global_trace.data[i]->entries, - i); - if (r < 0) - break; - } - } else { - r = ring_buffer_resize(global_trace.buffer, - global_trace.data[cpu]->entries, - cpu); - } - + int r = resize_buffer_even(&global_trace, &global_trace, cpu); if (r < 0) { /* * AARGH! We are left with different @@ -3191,17 +3200,11 @@ static int tracing_set_tracer(const char *buf) topts = create_trace_option_files(t); if (t->use_max_tr) { - int cpu; /* we need to make per cpu buffer sizes equivalent */ - for_each_tracing_cpu(cpu) { - ret = ring_buffer_resize(max_tr.buffer, - global_trace.data[cpu]->entries, - cpu); - if (ret < 0) - goto out; - max_tr.data[cpu]->entries = - global_trace.data[cpu]->entries; - } + ret = resize_buffer_even(&max_tr, &global_trace, +RING_BUFFER_ALL_CPUS); + if (ret < 0) + goto out; } if (t->init) { -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -tip 0/4] tracing: make a snapshot feature available from userspace
Hi, Steven, This patch series make a snapshot feature available from userspace via debugfs. (But I know that you are working for multi-buffer. If these patches collide with your work much, I will resubmit my patches after that. What would you think?) If we set CONFIG_TRACER_SNAPSHOT, this snapshot feature becomes available to all non latency tracers. (Latency tracers which record max latency, such as "irqsoff" or "wakeup", can't use this feature, since those are already using the snapshot mechanism internally.) Snapshot preserves a trace buffer at a particular point in time without stopping tracing. Ftrace swaps the current buffer with a spare buffer, and tracing continues in the (previous) spare buffer. The following debugfs files in "tracing" are related to this feature: snapshot: This is used to take a snapshot and to read the output of the snapshot. Echo 1 into this file to allocate a spare buffer and to take a snapshot, then read the snapshot from the file in the same format as "trace". Both reads snapshot and tracing are executable in parallel. Echoing 0 erases the snapshot contents. snapshot_allocate: This is used to pre-allocate or free a spare buffer. Echo 1 into this file to pre-allocate a spare buffer if you don't want to fail in the next snapshot due to memory allocation failure, or if you don't want to lose older trace data while allocating buffer. Echo 0 to free the spare buffer when the snapshot becomes unnecessary. If you take the next snapshot again, you can reuse the buffer, then just erase the snapshot contents by echoing 1 into the "snapshot" file, instead of freeing the buffer. Reads from this file display whether the spare buffer is allocated. When current_tracer is changed, the allocated spare buffer is freed. If the next tracer is one of the latency tracers, this value turns into 1 and can't be changed, or else the value starts with 0. Here is an example of using the snapshot feature. # echo 1 > snapshot_allocate (if you want to pre-allocate the spare buffer) # echo 1 > events/sched/enable # echo 1 > snapshot # cat snapshot # tracer: nop # # entries-in-buffer/entries-written: 71/71 #P:8 # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | -0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120 sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120 [...] -0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120 # cat trace # tracer: nop # # entries-in-buffer/entries-written: 77/77 #P:8 # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | -0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120 snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 [...] --- Hiraku Toyooka (4): tracing: add description of snapshot to Documentation/trace/ftrace.txt tracing: make a snapshot feature available from userspace tracing: add a resize function for making one buffer equivalent to the other buffer tracing: change tracer's integer flags to bool Documentation/trace/ftrace.txt| 97 +++ include/linux/ftrace_event.h |3 kernel/trace/Kconfig | 11 ++ kernel/trace/trace.c | 247 +++-- kernel/trace/trace.h |5 - kernel/trace/trace_irqsoff.c | 12 +- kernel/trace/trace_sched_wakeup.c |8 + 7 files changed, 332 insertions(+), 51 deletions(-) -- Hiraku TOYOOKA Linux Technology Center Yokohama Research Laboratory Hitachi Ltd. -- To unsu
Re: [RFC PATCH -tip ] tracing: make a snapshot feature available from userspace.
n open and read, you can do that from the command line with something like: (read -n 1 < /dev/tty; cat > ~/walrus.txt) < /path/to/thingy I dunno if that's a good idea, I'm just trying to understand the design here... Your method may be possible, but I think that is somewhat complex... Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH -tip ] tracing: make a snapshot feature available from userspace.
); + + /* When always_use_max_tr == 1, we can't toggle use_max_tr. */ + if (current_trace->always_use_max_tr) { I'll state my issue here. Don't rename use_max_tr to always_use_max_tr, keep it as is and its use as is. Your other value should be "allocated_snapshot", which can be set even for the use_max_tr user. Yes. I'll put use_max_tr back to its original. + ret = -EBUSY; + goto out; + } + + if (!(current_trace->use_max_tr ^ val)) + goto out; + + if (val) { + int cpu; + for_each_tracing_cpu(cpu) { + ret = ring_buffer_resize(max_tr.buffer, + global_trace.data[cpu]->entries, + cpu); + if (ret < 0) + break; + max_tr.data[cpu]->entries = + global_trace.data[cpu]->entries; + } + if (ret < 0) { + ring_buffer_resize(max_tr.buffer, 1, + RING_BUFFER_ALL_CPUS); + set_buffer_entries(&max_tr, 1); + ret = -ENOMEM; + goto out; + } The above code is basically duplicated by the __tracing_resize_ring_buffer(). As this code is not that trivial, lets make use of a helper function and keep the bugs in one location. Have both this function and the resize function use the same code. In fact, the __tracing_resize_ring_buffer() could be modified to do all the work. It will either shrink or expand as necessary. This isn't a critical section so calling ring_buffer_resize() even when there's nothing to do should not be an issue. OK. I think I can make the common helper function. In fact, I think there's a small bug in the code that you just duplicated. Not your bug, but you copied it. Oh, I didn't notice that... Is it related to return value? struct tracer { const char *name; @@ -286,7 +288,13 @@ struct tracer { struct tracer *next; struct tracer_flags *flags; int print_max; + /* Dynamically toggled via "snapshot_enabled" debugfs file */ int use_max_tr; + /* +* If this value is 1, this tracer always uses max_tr and "use_max_tr" +* can't be toggled. +*/ + int always_use_max_tr; I already said how I dislike this. Leave use_max_tr alone, but add a allocated_snapshot. Also, I hate the wasting of 4 bytes just to act like a flag. We should probably make print_max, use_max_tr and always_use_max_tr into 'bool's. The print_max change should be a separate patch. I see. By the way, I noticed that struct tracer's values become invisible when the current_tracer is changed. This may be somewhat problematic. I'm now considering we should put the allocated_snapshot value into global trace_flags as a flag and access this value by options/snapshot_allocate. What do you think of this? }; diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index 99d20e9..37cdb75 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -614,6 +614,7 @@ static struct tracer irqsoff_tracer __read_mostly = .open = irqsoff_trace_open, .close = irqsoff_trace_close, .use_max_tr = 1, + .always_use_max_tr = 1, Remove all these. Have the 'allocated_snapshot' get set when the tracer is added, not here. OK, I'll remove them. But on the whole, I like the idea of a snapshot (and this has actually been on my todo list for some time, thanks for doing it for me ;-) Thank you for your review! Best regards, Hiraku Toyooka -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH -tip ] tracing: make a snapshot feature available from userspace.
Hello, Rob, Thank you very much for your advice. (2012/07/05 10:01), Rob Landley wrote: On 07/04/2012 05:47 AM, Hiraku Toyooka wrote: Hello, Steven, I've sent below RFC patch, but still have no responses. This patch can be applied to current tip tree. If you have time, could you give any comment about this patch? My familiarity with ftrace is "I saw a presentation on it at a conference a couple years ago", so I'm not the guy to comment on the advisability of the patch itself. As for the documentation: seems reasonable, could use some english polishing. +Snapshot + +If CONFIG_TRACER_MAX_TRACE is set, the (generic) snapshot +feature is available in all tracers except for the special +tracers which use a snapshot inside themselves(such as "irqsoff" +or "wakeup"). This is confusing, I'm guessing irqsoff and wakeup already have persistent buffers without CONFIG_TRACER_MAX_TRACE? (So there is a generic snapshot feature, but some tracers have their own internal buffer and don't use the generic one?) No, CONFIG_TRACER_MAX_TRACE is originally for making the spare buffer available. Both some special tracers (such as irqsoff) and generic snapshot uses the common spare buffer. But purpose of each snapshot is different. In the special tracers, snapshot is used for recording information related to max latency of something (such as longest interrupt-disabled area). This is automatically updated only when the max latency is detected. In other words, those special tracers use the same spare buffer in the different way. Thus, we can not enable generic snapshot for those tracers. Is the fact that some tracers don't use this feature an important part of the description of the feature? Is making them use common code a todo item, or just a comment? Anyway, the fact is not important here. How about: CONFIG_TRACER_MAX_TRACE makes a generic snapshot feature available to all tracers. (Some tracers, such as "irqsoff" or "wakeup", use their own internal snapshot implementation.) Thanks, but I think the following one is more suitable. (Some tracers, such as "irqsoff" or "wakeup", already use the snapshot implementation internally) +This enables to preserve trace buffer at a particular point in +time without stopping tracing. When a snapshot is taken, ftrace +swaps the current buffer with a spare buffer which is prepared +in advance. This means that the tracing itself continues on the +spare buffer. Snapshots preserve a trace buffer at a particular point in time without stopping tracing; ftrace swaps the current buffer with a spare buffer, and tracking continues in the spare buffer. +Following debugfs files in "tracing" directory are related with +this feature. The following debugfs files in "tracing" are related to this feature: + snapshot_enabled: + +This is used to set or display whether the snapshot is +enabled. Echo 1 into this file to prepare a spare buffer +or 0 to shrink it. So, the memory for the spare buffer +will be consumed only when this knob is set. Write 1 to this file to allocate a snapshot buffer, 0 to free it. I'll fix them. (Query: do you have to free the buffer after taking a snapshot below?) No, we don't always need to free the buffer, although we can free it when the snapshot becomes unnecessary. We can also reuse the buffer if we'd like to take the next snapshot. (I'll add this description.) + snapshot_pipe: + +This is used to take a snapshot and to read the output +of the snapshot. Echo 1 into this file to take a +snapshot. Reads from this file is the same as the +"trace_pipe" file (described above "The File System" +section), so that both reads from the snapshot and +tracing are executable in parallel. Echo 1 into this file to take a snapshot, then read the snapshot from the file in the same format as "trace_pipe" (described above in the section "The File System"). I'll fix that. Design questions left for the reader: why are allocating a snapshot buffer and taking a snapshot separate actions? I'll add following description: Allocating a spare buffer and taking a snapshot are separated because latter one should be done successfully and immediately. If they are not separated, we may fail to take a snapshot because of memory allocation failure or we may lose older trace data while allocating memory. > Why is there only _one_ snapshot buffer? > Because of easy reviewing, I've decided to implement step by step. So this patch just provide "only one" spare buffer. However, I got some feedback for multiple snapshot at LinuxCon Japan 2012. So I may extend this feature. +Here is an example of using the snapshot feature. + + # echo nop > current_tracer + # echo 1 > snapshot_enabled + # echo 1