On Fri, Apr 08, 2011 at 01:31:59PM +0800, Bian Naimeng wrote:
> 
> 
> Han Pingtian wrote:
> > On Thu, Apr 07, 2011 at 01:52:40PM +0800, Bian Naimeng wrote:
> >>
> >> CAI Qian wrote:
> >>> ----- Original Message -----
> >>>> Qian Cai wrote:
> >>>>> On 2011-4-4, at 0:47, Bian Naimeng <[email protected]> wrote:
> >>>>>
> >>>>>> There are some problem in ksm tests.
> >>>>>>
> >>>>>> 1. We should break the test when checking is failure.
> >>>>> No, this is not the intention. The design here is to run all tests
> >>>>> to check for all stats to give a full picture even if the a single
> >>>>> failure has been observed. This type of the failures do not prevent
> >>>>> the rest of the tests from running, so there is no need to stop the
> >>>>> tests now which also give more insight to track down root causes.
> >>>> Various reason can make checking failure, someone can make the test
> >>>> hangup.
> >>>> I did this test on RHEL5, i found ksmd stopped before doing "echo 2 >
> >>>> /sys/kernel/mm/ksm/run",
> >>>> so group_check will be hanged on "new_num < old_num * 3".
> >>>>
> >>>> So, i think we should break the test if "run" is not expecting.
> >>> What happened if you run the tests for a recent upstream kernel? There
> >>> are some patches for ksm recently merged upstream. If the problem still
> >>> persistent, please paste the EXACT OUTPUT from the ksm01 test. If it is
> >>> hung, please upload sysrq-t output somewhere.
> >>>
> >> Maybe there are some bugs in the RHEL6's kernel, but the purpose of this 
> >> patch
> >> is not to workround these bugs, i want to fix the test's bug.
> >>
> >> Would you explain to me why we do this loop "while (new_num < old_num * 
> >> 3)" in
> >> group_check, i think "while (new_num < old_num + 3)" is better.
> >>
> >> Some time ago, the following patch insert this loop.
> >> http://sourceforge.net/mailarchive/forum.php?thread_name=AANLkTi%3Dg%3DojJu0m%2B556rnekYenRTXtX%2BVBOj%3DrPmnjSw%40mail.gmail.com&forum_name=ltp-list
> >>
> >> The changelog of this patch said "we should wait 3~5 increments of the
> >> /sys/kernel/mm/ksm/full_scans before checking ksm* testcases's results", 
> >> but
> >> it do "while (new_num < old_num * 3)" actually.
> > I made a mistake. The code is what I wanted to do, but the changelog was
> > wrong. When testing the new ksm patch, the developer told us we must
> > wait 3~5 times increments of the number before checking testing
> > results. So I coded to wait til new_num >= old_num * 3 before checking
> > the results. 
> > 
> 
> The bigger old_new is, the longer time test takes, it's strange.
I agree that it's strange. But it seems that we have to do this way.
> 
> > About 'echo 2 > /sys/kernel/mm/ksm/run' problem, I have tested it with
> > ksm01.
> > If I run the 'echo 2 > /sys/kernel/mm/ksm/run' before issue ksm test,
> > the content of /sys/kernel/mm/ksm/run will be changed to 1 and the test
> > can finished successfully. 
> 
> I think we shoud not care this.
Yep, I think so.
> 
> > Only if I echo the 2 between the testing process,
> > ksm01 will hang up. On that time, new_num will be zero, so your plus 3
> > method won't work either. So what should we do in this circumstance?
> > 
> 
> Please look at my patch, after stopping ksmd in the testing(echo 2 > 
> /sys/kernel/mm/ksm/run or
> echo 0 > /sys/kernel/mm/ksm/run), group_check will skip waiting at the loop 
> "new_num >= old_num * 3".
> 
Run 'echo 2 > /sys/kernel/mm/ksm/run' before calling group_check() won't
cause the while loops infinitely, because old_num and new_num would be all
zero before the while, so new_num == old_num * 3, the while won't be
ran.
> Regards
>  Bian
> 
> > Thanks.
> > 
> > Han Pingtian
> >> Regards
> >>  Bian
> >>
> >>> CAI Qian
> >>>  
> >>>>>> 2. The condition "new_num < old_num * 3" seems uncomfortable, i
> >>>>>> think
> >>>>>>   it should be "new_num < old_num + 3"
> >>>>> I don't understand. What error did you see from the testing output?
> >>>> Sometimes, the old_num is a big number, so it takes long time in this
> >>>> loop,
> >>>> i don't understand the purpose.
> >>>> Would you explain to me why we expect this condition "new_num <
> >>>> old_num * 3".
> >>>>
> >>>>>> 3. After stopping ksm(echo 2 > /sys/kernel/mm/ksm/run), the ksmd
> >>>>>>   will stop scaning pages, so looping in "new_num < old_num * 3"
> >>>>>>   is wrong.
> >>>>> Ditto.
> >>>>>
> >>>> After stopping ksm, looping in "new_num < old_num * 3" will make the
> >>>> process hang up,
> >>>> because new_num does not be increased.
> >>>>
> >>>> Regards
> >>>> Bian
> >>>>
> >>>>
> >>>>> CAI Qian
> >>>>>> Signed-off-by: Bian Naimeng <[email protected]>
> >>>>>>
> >>>>>> ---
> >>>>>> testcases/kernel/mem/include/mem.h | 2 +-
> >>>>>> testcases/kernel/mem/lib/mem.c | 19 ++++++++++---------
> >>>>>> 2 files changed, 11 insertions(+), 10 deletions(-)
> >>>>>>
> >>>>>> diff --git a/testcases/kernel/mem/include/mem.h
> >>>>>> b/testcases/kernel/mem/include/mem.h
> >>>>>> index 778d403..b640a63 100644
> >>>>>> --- a/testcases/kernel/mem/include/mem.h
> >>>>>> +++ b/testcases/kernel/mem/include/mem.h
> >>>>>> @@ -42,7 +42,7 @@ void check(char *path, long int value);
> >>>>>> void verify(char value, int proc, int start, int end, int start2,
> >>>>>> int end2);
> >>>>>> void group_check(int run, int pages_shared, int pages_sharing,
> >>>>>>        int pages_volatile, int pages_unshared, int sleep_millisecs,
> >>>>>> - int pages_to_scan);
> >>>>>> + int pages_to_scan, int scans);
> >>>>>> void create_same_memory(int size, int num, int unit);
> >>>>>> void check_ksm_options(int *size, int *num, int *unit);
> >>>>>> void write_cpusets(void);
> >>>>>> diff --git a/testcases/kernel/mem/lib/mem.c
> >>>>>> b/testcases/kernel/mem/lib/mem.c
> >>>>>> index 12e61e9..db1a7dd 100644
> >>>>>> --- a/testcases/kernel/mem/lib/mem.c
> >>>>>> +++ b/testcases/kernel/mem/lib/mem.c
> >>>>>> @@ -284,7 +284,7 @@ void check(char *path, long int value)
> >>>>>>
> >>>>>>    tst_resm(TINFO, "%s is %ld.", path, atol(buf));
> >>>>>>    if (atol(buf) != value)
> >>>>>> - tst_resm(TFAIL, "%s is not %ld.", path, value);
> >>>>>> + tst_brkm(TFAIL, tst_exit, "%s is not %ld.", path, value);
> >>>>>> }
> >>>>>>
> >>>>>> void verify(char value, int proc, int start, int end, int start2,
> >>>>>> int end2)
> >>>>>> @@ -312,7 +312,8 @@ void verify(char value, int proc, int start,
> >>>>>> int end, int start2, int end2)
> >>>>>>
> >>>>>> void group_check(int run, int pages_shared, int pages_sharing,
> >>>>>>        int pages_volatile, int pages_unshared,
> >>>>>> - int sleep_millisecs, int pages_to_scan)
> >>>>>> + int sleep_millisecs, int pages_to_scan,
> >>>>>> + int scans)
> >>>>>> {
> >>>>>>    int fd;
> >>>>>>    char buf[BUFSIZ];
> >>>>>> @@ -332,7 +333,7 @@ void group_check(int run, int pages_shared, int
> >>>>>> pages_sharing,
> >>>>>>    old_num = new_num = atoi(buf);
> >>>>>>    if (lseek(fd, 0, SEEK_SET) == -1)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "lseek");
> >>>>>> - while (new_num < old_num * 3) {
> >>>>>> + while (new_num < old_num + scans) {
> >>>>>>        sleep(1);
> >>>>>>        if (read(fd, buf, BUFSIZ) < 0)
> >>>>>>            tst_brkm(TBROK|TERRNO, cleanup, "read");
> >>>>>> @@ -587,7 +588,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>        if (kill(child[k], SIGCONT) == -1)
> >>>>>>            tst_brkm(TBROK|TERRNO, cleanup, "kill child[%d]", k);
> >>>>>>    }
> >>>>>> - group_check(1, 2, size * num * 256 - 2, 0, 0, 0, size * 256 *
> >>>>>> num);
> >>>>>> + group_check(1, 2, size * num * 256 - 2, 0, 0, 0, size * 256 *
> >>>>>> num, 3);
> >>>>>>
> >>>>>>    tst_resm(TINFO, "wait for child 1 to stop.");
> >>>>>>    if (waitpid(child[1], &status, WUNTRACED) == -1)
> >>>>>> @@ -599,7 +600,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>    tst_resm(TINFO, "resume child 1.");
> >>>>>>    if (kill(child[1], SIGCONT) == -1)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "kill");
> >>>>>> - group_check(1, 3, size * num * 256 - 3, 0, 0, 0, size * 256 *
> >>>>>> num);
> >>>>>> + group_check(1, 3, size * num * 256 - 3, 0, 0, 0, size * 256 *
> >>>>>> num, 3);
> >>>>>>
> >>>>>>    tst_resm(TINFO, "wait for child 1 to stop.");
> >>>>>>    if (waitpid(child[1], &status, WUNTRACED) == -1)
> >>>>>> @@ -613,7 +614,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>        if (kill(child[k], SIGCONT) == -1)
> >>>>>>            tst_brkm(TBROK|TERRNO, cleanup, "kill child[%d]", k);
> >>>>>>    }
> >>>>>> - group_check(1, 1, size * num * 256 - 1, 0, 0, 0, size * 256 *
> >>>>>> num);
> >>>>>> + group_check(1, 1, size * num * 256 - 1, 0, 0, 0, size * 256 *
> >>>>>> num, 3);
> >>>>>>
> >>>>>>    tst_resm(TINFO, "wait for all children to stop.");
> >>>>>>    for (k = 0; k < num; k++) {
> >>>>>> @@ -627,7 +628,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>    tst_resm(TINFO, "resume child 1.");
> >>>>>>    if (kill(child[1], SIGCONT) == -1)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "kill");
> >>>>>> - group_check(1, 1, size * num * 256 - 2, 0, 1, 0, size * 256 *
> >>>>>> num);
> >>>>>> + group_check(1, 1, size * num * 256 - 2, 0, 1, 0, size * 256 *
> >>>>>> num, 3);
> >>>>>>
> >>>>>>    tst_resm(TINFO, "wait for child 1 to stop.");
> >>>>>>    if (waitpid(child[1], &status, WUNTRACED) == -1)
> >>>>>> @@ -647,7 +648,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "open");
> >>>>>>    if (write(fd, "2", 1) != 1)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "write");
> >>>>>> - group_check(2, 0, 0, 0, 0, 0, size * 256 * num);
> >>>>>> + group_check(2, 0, 0, 0, 0, 0, size * 256 * num, 0);
> >>>>>>
> >>>>>>    tst_resm(TINFO, "wait for all children to stop.");
> >>>>>>    for (k = 0; k < num; k++) {
> >>>>>> @@ -668,7 +669,7 @@ void create_same_memory(int size, int num, int
> >>>>>> unit)
> >>>>>>    if (write(fd, "0", 1) != 1)
> >>>>>>        tst_brkm(TBROK|TERRNO, cleanup, "write");
> >>>>>>    close(fd);
> >>>>>> - group_check(0, 0, 0, 0, 0, 0, size * 256 * num);
> >>>>>> + group_check(0, 0, 0, 0, 0, 0, size * 256 * num, 0);
> >>>>>>    while (waitpid(-1, &status, WUNTRACED | WCONTINUED) > 0)
> >>>>>>        if (WEXITSTATUS(status) != 0)
> >>>>>>            tst_resm(TFAIL, "child exit status is %d",
> >>>>>> --
> >>>>>> 1.7.1
> >>>>>>
> >>>>>>
> >>>> ------------------------------------------------------------------------------
> >>>> Xperia(TM) PLAY
> >>>> It's a major breakthrough. An authentic gaming
> >>>> smartphone on the nation's most reliable network.
> >>>> And it wants your games.
> >>>> http://p.sf.net/sfu/verizon-sfdev
> >>>> _______________________________________________
> >>>> Ltp-list mailing list
> >>>> [email protected]
> >>>> https://lists.sourceforge.net/lists/listinfo/ltp-list
> >>>
> >>
> >> ------------------------------------------------------------------------------
> >> Xperia(TM) PLAY
> >> It's a major breakthrough. An authentic gaming
> >> smartphone on the nation's most reliable network.
> >> And it wants your games.
> >> http://p.sf.net/sfu/verizon-sfdev
> >> _______________________________________________
> >> Ltp-list mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/ltp-list
> > 
> 
> 
> ------------------------------------------------------------------------------
> Xperia(TM) PLAY
> It's a major breakthrough. An authentic gaming
> smartphone on the nation's most reliable network.
> And it wants your games.
> http://p.sf.net/sfu/verizon-sfdev
> _______________________________________________
> Ltp-list mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ltp-list

-- 
Han Pingtian
Quality Engineer
hpt @ #kernel-qe
Red Hat, Inc
Freedom ... courage ... Commitment ... ACCOUNTABILITY

------------------------------------------------------------------------------
Xperia(TM) PLAY
It's a major breakthrough. An authentic gaming
smartphone on the nation's most reliable network.
And it wants your games.
http://p.sf.net/sfu/verizon-sfdev
_______________________________________________
Ltp-list mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ltp-list

Reply via email to