> ----- Original Message -----
> > From: "Li Wang" <liw...@redhat.com>
> > To: "Jan Stancek" <jstan...@redhat.com>
> > Cc: ltp-list@lists.sourceforge.net
> > Sent: Tuesday, 6 January, 2015 10:51:52 AM
> > Subject: Re: [LTP] [PATCH] mem/hugeshmat: new case for hugepage leak inspection
> >
> >
> > > ----- Original Message -----
> > > > From: "Li Wang" <liw...@redhat.com>
> > > > To: "Jan Stancek" <jstan...@redhat.com>
> > > > Cc: ltp-list@lists.sourceforge.net
> > > > Sent: Sunday, 4 January, 2015 4:10:38 AM
> > > > Subject: Re: [LTP] [PATCH] mem/hugeshmat: new case for hugepage leak inspection
> > > >
> > > > The regression has a long history. Patch in upstream 2.6.20-rc1
> > > > GIT: 39dde65c9940c97fcd178a3d2b1c57ed8b7b68aa.
> > > > https://lkml.org/lkml/2012/4/16/129
> > >
> > > And the fix is presumably this commit:
> > >
> > > commit c5c99429fa57dcf6e05203ebe3676db1ec646793
> > > Author: Larry Woodman <lwood...@redhat.com>
> > > Date: Thu Jan 24 05:49:25 2008 -0800
> > > fix hugepages leak due to pagetable page sharing
> > >
> > > It'd be nice to include this information somewhere in the testcase or
> > > the commit message.
> > > More comments below:
> > ...
> > > > > > +int shared_hugepage(void)
> > > > > > +{
> > > > > > +	pid_t pid;
> > > > > > +	int status, shmid;
> > > > > > +	size_t size = (size_t)SIZE;
> > > > > > +	void *buf;
> > > > > > +
> > > > > > +	shmid = shmget(IPC_PRIVATE, size, SHM_HUGETLB | IPC_CREAT | 0777);
> > > > > > +	if (shmid < 0)
> > > > > > +		tst_brkm(TBROK | TERRNO, cleanup, "shmget");
> > > > > > +
> > > > > > +	buf = shmat(shmid, (void *)BOUNDARY, SHM_RND | 0777);
> > >
> > > Does it make a difference where you attach the shared segment? I'm
> > > slightly worried that the absolute address you picked may already be
> > > in use on some distros/arches.
> >
> > Hi Jan,
> >
> > Thank you for reviewing my stupid patch. :)
> >
> > Actually, I also don't know why an absolute address is used here. I
> > suspect it is meant to resemble the environment in which the BZ
> > occurred. The same goes for the sleep(3) below.
> >
> > The original case is too old to find out who the author is. I hope
> > someone can provide useful input here; if not, I still feel we can use
> > this code to cover the regression. What do you think?
>
> I'm assuming that neither the absolute address nor the sleep is necessary.
> I think I'll take some recent kernel, revert that patch and try to reproduce
> the failure.
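For reference, the variant discussed above (no fixed address, no sleep) could look roughly like the sketch below. It is only an illustration and not code from the patch: the helper names attach_anywhere() and release_segment() and the extra_flags parameter are hypothetical. It relies on the documented behaviour that shmat() with a NULL address lets the kernel choose where to map the segment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper (not part of the patch): create a private SysV
 * segment and attach it wherever the kernel finds room.
 * Pass SHM_HUGETLB in extra_flags for a huge page backed segment, or 0
 * for normal pages. Returns the mapping, or (void *)-1 with errno set.
 */
static void *attach_anywhere(size_t size, int extra_flags, int *shmid_out)
{
	void *buf;
	int shmid, saved_errno;

	assert(shmid_out != NULL);

	shmid = shmget(IPC_PRIVATE, size, extra_flags | IPC_CREAT | 0600);
	if (shmid < 0)
		return (void *)-1;

	/* shmaddr == NULL: no absolute address, the kernel picks one */
	buf = shmat(shmid, NULL, 0);
	if (buf == (void *)-1) {
		saved_errno = errno;
		shmctl(shmid, IPC_RMID, NULL);
		errno = saved_errno;
		return (void *)-1;
	}

	*shmid_out = shmid;
	return buf;
}

/* Detach and remove the segment; returns 0 on success, -1 on error. */
static int release_segment(void *buf, int shmid)
{
	int rc = shmdt(buf);

	if (shmctl(shmid, IPC_RMID, NULL) < 0)
		rc = -1;
	return rc;
}
```

In the testcase this would replace the shmget()/shmat() pair, with SHM_HUGETLB passed as extra_flags; that flag is Linux-specific and only works when enough huge pages are reserved.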
Hi Jan,

I just tried kernel-3.10.0+ without the fix patch, and I hit a soft lockup
(CPU stuck) when using both the absolute address and sleep(3).

Then I tried the testcase with the absolute address but without sleep(3),
and got the same result, shown below.

-------------------------------------
[ 1105.650122] BUG: soft lockup - CPU#3 stuck for 22s! [hugeshmat04:2374]
[ 1105.656649] Modules linked in: intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper pcspkr cryptd serio_raw iTCO_wdt iTCO_vendor_support ppdev lpc_ich mfd_core i2c_i801 ipmi_si parport_pc parport ipmi_msghandler shpchp acpi_cpufreq ioatdma i7core_edac edac_core dca xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common ast syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm ata_generic pata_acpi drm e1000e ata_piix libata ptp i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod
[ 1105.710690] CPU: 3 PID: 2374 Comm: hugeshmat04 Not tainted 3.10.0+ #1
[ 1105.717124] Hardware name: Lenovo 401118U/Tyan Tank GT20-B7002LNV, BIOS 'V2.7 04.07.2011 ' 04/07/2011
[ 1105.726420] task: ffff88022cc8cfa0 ti: ffff880034e5c000 task.ti: ffff880034e5c000
[ 1105.733894] RIP: 0010:[<ffffffff8160c6b2>] [<ffffffff8160c6b2>] _raw_spin_lock+0x32/0x50
[ 1105.742081] RSP: 0018:ffff880034e5fc88 EFLAGS: 00000202
[ 1105.747388] RAX: 0000000000005897 RBX: ffff8800342392c0 RCX: 0000000000000402
[ 1105.754515] RDX: 0000000000000404 RSI: 0000000000000404 RDI: ffffea0000d24170
[ 1105.761643] RBP: ffff880034e5fc88 R08: ffff880000000000 R09: ffff880035978fc0
[ 1105.768771] R10: ffff88023ffd7800 R11: 0000000000000000 R12: ffff880034b66030
[ 1105.775898] R13: ffff880034dd3030 R14: 000000002cc8cfa0 R15: 0000000000000000
[ 1105.783026] FS: 00007f3291241740(0000) GS:ffff880237260000(0000) knlGS:0000000000000000
[ 1105.791110] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1105.796848] CR2: 00007f3290d294c0 CR3: 00000000350a6000 CR4: 00000000000007e0
[ 1105.803977] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1105.811105] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1105.818233] Stack:
[ 1105.820244] ffff880034e5fd18 ffffffff8119cf10 0000000080000000 0000000040000000
[ 1105.827702] ffff880034dd3030 0000000034d25bc8 ffff8800342386b4 ffff880034239334
[ 1105.835160] ffff8800bac6eaf8 ffff8800342392c0 ffff880034905000 ffff880034238640
[ 1105.842616] Call Trace:
[ 1105.845065] [<ffffffff8119cf10>] copy_hugetlb_page_range+0xc0/0x300
[ 1105.851421] [<ffffffff81181d39>] copy_page_range+0x3d9/0x480
[ 1105.857161] [<ffffffff81186da8>] ? __vma_link_rb+0xb8/0xe0
[ 1105.862728] [<ffffffff8106ba77>] dup_mm+0x357/0x660
[ 1105.867689] [<ffffffff8106c7f9>] copy_process.part.25+0xa49/0x14d0
[ 1105.873948] [<ffffffff8106d43c>] do_fork+0xbc/0x350
[ 1105.878910] [<ffffffff8106d756>] SyS_clone+0x16/0x20
[ 1105.883964] [<ffffffff81615679>] stub_clone+0x69/0x90
[ 1105.889098] [<ffffffff81615329>] ? system_call_fastpath+0x16/0x1b

Finally, I tried the testcase without the absolute address and without
sleep(3), and the test always PASSes.

These results don't tell the whole story; perhaps we had better test on a
kernel older than 2.6.24.

Thanks,
Li Wang

>
> Regards,
> Jan
>
> >
> >
> > A new patch:
> >
> > Subject: [PATCH] mem/hugeshmat: new case for hugepage leak inspection
> >
> > Description of Problem:
> > When over 1GB shared memory was allocated in hugepage, the hugepage
> > is not released though process finished.
> >
> > The fix is this commit:
> > commit c5c99429fa57dcf6e05203ebe3676db1ec646793
> > Author: Larry Woodman <lwood...@redhat.com>
> > Date: Thu Jan 24 05:49:25 2008 -0800
> > fix hugepages leak due to pagetable page sharing
> >
> > Signed-off-by: Li Wang <liw...@redhat.com>
> > Signed-off-by: Jan Stancek <jstan...@redhat.com>
> > ---
> >  runtest/hugetlb                                    |   1 +
> >  testcases/kernel/mem/.gitignore                    |   1 +
> >  .../kernel/mem/hugetlb/hugeshmat/hugeshmat04.c     | 166 +++++++++++++++++++++
> >  3 files changed, 168 insertions(+)
> >  create mode 100644 testcases/kernel/mem/hugetlb/hugeshmat/hugeshmat04.c
> >
> > diff --git a/runtest/hugetlb b/runtest/hugetlb
> > index 3eaf14c..805141d 100644
> > --- a/runtest/hugetlb
> > +++ b/runtest/hugetlb
> > @@ -10,6 +10,7 @@ hugemmap05_3 hugemmap05 -s -m
> >  hugeshmat01 hugeshmat01 -i 5
> >  hugeshmat02 hugeshmat02 -i 5
> >  hugeshmat03 hugeshmat03 -i 5
> > +hugeshmat04 hugeshmat04 -i 5
> >
> >  hugeshmctl01 hugeshmctl01 -i 5
> >  hugeshmctl02 hugeshmctl02 -i 5
> > diff --git a/testcases/kernel/mem/.gitignore b/testcases/kernel/mem/.gitignore
> > index f96964c..c531563 100644
> > --- a/testcases/kernel/mem/.gitignore
> > +++ b/testcases/kernel/mem/.gitignore
> > @@ -6,6 +6,7 @@
> >  /hugetlb/hugeshmat/hugeshmat01
> >  /hugetlb/hugeshmat/hugeshmat02
> >  /hugetlb/hugeshmat/hugeshmat03
> > +/hugetlb/hugeshmat/hugeshmat04
> >  /hugetlb/hugeshmctl/hugeshmctl01
> >  /hugetlb/hugeshmctl/hugeshmctl02
> >  /hugetlb/hugeshmctl/hugeshmctl03
> > diff --git a/testcases/kernel/mem/hugetlb/hugeshmat/hugeshmat04.c b/testcases/kernel/mem/hugetlb/hugeshmat/hugeshmat04.c
> > new file mode 100644
> > index 0000000..43e9f3f
> > --- /dev/null
> > +++ b/testcases/kernel/mem/hugetlb/hugeshmat/hugeshmat04.c
> > @@ -0,0 +1,166 @@
> > +/*
> > + * Copyright (c) Linux Test Project, 2014
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
> > + * the GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> > + * 02110-1301 USA
> > + */
> > +
> > +/*
> > + * NAME
> > + *	hugeshmat04.c
> > + *
> > + * DESCRIPTION
> > + *	hugeshmat04 - test for hugepage leak inspection.
> > + *
> > + *	Description of Problem:
> > + *	  When over 1GB shared memory was allocated in hugepage, the hugepage
> > + *	  is not released though process finished.
> > + *
> > + *	You need more than 2GB memory in test job
> > + *
> > + *	Test results:
> > + *	  Succeeded: No regression found.
> > + *	  Failed: Regression detected.
> > + *
> > + * HISTORY
> > + *	05/2014 - Written by Fujitsu Corp.
> > + *	12/2014 - Port to LTP by Li Wang.
> > + *
> > + * RESTRICTIONS
> > + *	test must be run as root
> > + */
> > +
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <fcntl.h>
> > +#include <string.h>
> > +#include <sys/mman.h>
> > +#include <sys/types.h>
> > +#include <sys/shm.h>
> > +#include <sys/wait.h>
> > +
> > +#include "test.h"
> > +#include "mem.h"
> > +
> > +#define SIZE	(1024 * 1024 * 1024)
> > +#define BOUNDARY (1024 * 1024 * 1024)
> > +
> > +char *TCID = "hugeshmat04";
> > +int TST_TOTAL = 3;
> > +
> > +static long huge_total;
> > +static long huge_free;
> > +static long huge_free2;
> > +static long hugepages;
> > +static long orig_hugepages;
> > +
> > +static void shared_hugepage(void);
> > +
> > +int main(int ac, char **av)
> > +{
> > +	int lc, i;
> > +	const char *msg;
> > +
> > +	msg = parse_opts(ac, av, NULL, NULL);
> > +	if (msg != NULL)
> > +		tst_brkm(TBROK, NULL, "OPTION PARSING ERROR - %s", msg);
> > +
> > +	setup();
> > +
> > +	for (lc = 0; TEST_LOOPING(lc); lc++) {
> > +		tst_count = 0;
> > +
> > +		for (i = 0; i < TST_TOTAL; i++) {
> > +
> > +			shared_hugepage();
> > +			huge_free2 = read_meminfo("HugePages_Free:");
> > +
> > +			if (huge_free2 != huge_free)
> > +				tst_brkm(TFAIL, cleanup,
> > +					"Test failed. Hugepage leak inspection.");
> > +			else
> > +				tst_resm(TPASS, "No regression found.");
> > +		}
> > +	}
> > +
> > +	cleanup();
> > +	tst_exit();
> > +}
> > +
> > +void shared_hugepage(void)
> > +{
> > +	pid_t pid;
> > +	int status, shmid;
> > +	size_t size = (size_t)SIZE;
> > +	void *buf;
> > +
> > +	shmid = shmget(IPC_PRIVATE, size, SHM_HUGETLB | IPC_CREAT | 0777);
> > +	if (shmid < 0)
> > +		tst_brkm(TBROK | TERRNO, cleanup, "shmget");
> > +
> > +	buf = shmat(shmid, (void *)BOUNDARY, SHM_RND | 0777);
> > +	if (buf == (void *)-1) {
> > +		shmctl(shmid, IPC_RMID, NULL);
> > +		tst_brkm(TBROK | TERRNO, cleanup, "shmat");
> > +	}
> > +
> > +	memset(buf, 2, size);
> > +	sleep(3);
> > +	pid = fork();
> > +
> > +	if (pid == 0)
> > +		exit(1);
> > +	else if (pid < 0)
> > +		tst_brkm(TBROK | TERRNO, cleanup, "fork");
> > +
> > +	wait(&status);
> > +	shmdt(buf);
> > +	shmctl(shmid, IPC_RMID, NULL);
> > +}
> > +
> > +void setup(void)
> > +{
> > +	long mem_total, hpage_size;
> > +
> > +	tst_require_root(NULL);
> > +
> > +	mem_total = read_meminfo("MemTotal:") * 1024;
> > +	if (mem_total < 2L*SIZE) {
> > +		tst_resm(TINFO, "Total memory should greater than 2G.");
> > +		tst_exit();
> > +	}
> > +
> > +	orig_hugepages = get_sys_tune("nr_hugepages");
> > +	hpage_size = read_meminfo("Hugepagesize:") * 1024;
> > +
> > +	hugepages = (orig_hugepages * hpage_size + SIZE) / hpage_size;
> > +	set_sys_tune("nr_hugepages", hugepages, 1);
> > +
> > +	huge_total = read_meminfo("HugePages_Total:");
> > +	huge_free = read_meminfo("HugePages_Free:");
> > +
> > +	if (huge_total != hugepages || huge_free != hugepages)
> > +		tst_brkm(TCONF, cleanup,
> > +			"Maybe huge pages not enough for test.");
> > +
> > +	TEST_PAUSE;
> > +}
> > +
> > +void cleanup(void)
> > +{
> > +	TEST_CLEANUP;
> > +	set_sys_tune("nr_hugepages", orig_hugepages, 0);
> > +}
> > --
> > 1.8.3.1
> >
>
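The leak check in the testcase comes down to comparing HugePages_Free before and after the attach/fork/detach cycle. The sketch below shows that idea in isolation. It is a standalone illustration, not LTP code: parse_meminfo_field() is a hypothetical stand-in for the read_meminfo() helper used in the patch, and hugepages_leaked() is likewise made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for read_meminfo(): scan a /proc/meminfo-style
 * stream for "key" (for example "HugePages_Free:") and return the number
 * that follows it, or -1 if the key is not found.
 */
static long parse_meminfo_field(FILE *fp, const char *key)
{
	char line[256];
	size_t keylen;
	long val;

	assert(fp != NULL && key != NULL);
	keylen = strlen(key);

	while (fgets(line, sizeof(line), fp) != NULL) {
		if (strncmp(line, key, keylen) == 0 &&
		    sscanf(line + keylen, "%ld", &val) == 1)
			return val;
	}
	return -1;
}

/* Number of huge pages that did not come back after the test body ran. */
static long hugepages_leaked(long free_before, long free_after)
{
	return free_before - free_after;
}
```

A caller would open /proc/meminfo with fopen(), read HugePages_Free once in setup and once after each iteration, and report a failure when hugepages_leaked() returns a non-zero value.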