On 30.11.2006 [12:43:09 +1100], David Gibson wrote: > On Tue, Nov 28, 2006 at 09:56:51PM -0800, Nishanth Aravamudan wrote: > [snip] > > > I don't really see that failing after the first child comes back with > > > a signal is a problem. I don't see any situations where the return > > > code from the second child will be vital for debugging. > > > > Ok, I can change it easily enough. > > > > > In any case you mustn't call FAIL() directly from the child process. > > > > While I agree this is messy in principle, I don't see any reason that it > > can't be called. It's simply a wrapper for exit()? I see that it calls > > cleanup() and there is a claim that cleanup() must be defined in the > > tests but this is untrue AFAICT. > > If you FAIL() from the child you'll get two separate fail messages > which makes a mess of the output. Easiest way to make sure there's a > single, comprehensible fail message is to only ever FAIL() (or PASS() > or CONFIG()) from the top level process.
Fair enough. I've made this change. > > > > In any case, I've spent a good amount of time cleaning and fixing the > > > > linkshare testcase yesterday and today. Here is what I have so far. We > > > > are still getting segfaults on xBDT.linkshare 64-bit with > > > > HUGETLB_SHARE=2, but I went and checked and it's not a testcase issue, I > > > > don't think. We also will segfault, for instance, if xBDT.linkhuge > > > > 64-bit is run manually with HUGETLB_SHARE=2 two times in a row. I am now > > > > looking into the root cause of this failure. > > > > > > What sort of system is this on? I've certainly had 64-bit > > > xBDT.linkshare work on some setups. > > > > This is with the *reworked* linkshare. The version currently in the > > library is broken in multiple ways. > > I thought you said you got failures with both the old and new versions.. Of the sharing code, not of the sharing testcase, sorry for the confusion. > > > > The patch is pretty much ready for inclusion, I think, but I'd like one > > > > more round of review. > > > > > > Looks like a nice cleanup, but some further comments below. > > > > <snip> > > > > > > +static int child_process(char *self, int num_sharings, int index) > > > > { > > > > int i; > > > > - int shmid; > > > > - ino_t *shm; > > > > + ino_t ino; > > > > + > > > > + get_link_string(self); > > > > + > > > > + shmid = shmget(SHM_KEY, num_sharings * NUM_TESTS * > > > > + sizeof(ino_t), > > > > 0666); > > > > + if (shmid < 0) > > > > + FAIL("Child's shmget failed: %s", strerror(errno)); > > > > > > As mentioned above, a direct FAIL() from the child will always cause > > > trouble - what will the parent do. Instead just do an exit() with > > > non-zero code, and let the parent do the FAIL() when it gets the bad > > > code back from wait(). > > > > The parent will see a non-zero return status and FAIL itself. FAIL() is > > a wrapper for exit with non-zero code, so I'm not sure what the > > difference is. > > Confusing multiple messages. Got it now. <snip> > > > > - children = (pid_t *)malloc(num_sharings * > > > > sizeof(pid_t)); > > > > - if (!children) > > > > + children_pids = (pid_t *)malloc(num_sharings * > > > > sizeof(pid_t)); > > > > + if (!children_pids) > > > > FAIL("malloc failed: %s", strerror(errno)); > > > > > > Is there really any advantage to having a variable number of children. > > > Things could be simplified considerably if we just cut this down to > > > the minimal testcase of 2 children, hardcoded. > > > > I don't know about "considerably", but fair enough. If no one else has > > any objections, I can make this change. Note, that it may also be > > interesting to run x{B,BDT}.linkhuge with HUGETLB_SHARE set > > appropriately, consecutively, as it results in the same sort of > > thing > > That sounds like a good idea. Makes sure HUGETLB_SHARE doesn't mess > anything up, even in a case which doesn't use sharing. Yep, exactly. > > (but would only be a "Did it PASS()?" kind of test). > > Um.. there's another kind? I mean, we'd only be able to say whether the same binary run twice with sharing on runs both times, not be able to say whether (technically) they were actually sharing the same file (since the testcase does no such checking in the linkhuge case). So it's purely a "does it run" testcase, not necessarily a "does it share" testcase. Thanks, Nish -- Nishanth Aravamudan <[EMAIL PROTECTED]> IBM Linux Technology Center ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys - and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ Libhugetlbfs-devel mailing list Libhugetlbfs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel