Grep the *.py files for test_with_dsym. A random example pulled from the
search results is
lldb\test\expression_command\call-function\TestCallStdStringFunction.py
In it you'll see this:

    @unittest2.skipUnless(sys.platform.startswith("darwin"), "requires Darwin")
    @dsym_test
    @expectedFailureDarwin(16361880) # <rdar://problem/16361880>, we get the result correctly, but fail to invoke the Summary formatter.
    def test_with_dsym(self):
        """Test calling std::String member function."""
        self.buildDsym()
        self.call_function()

    @dwarf_test
    @expectedFailureFreeBSD('llvm.org/pr17807') # Fails on FreeBSD buildbot
    @expectedFailureGcc # llvm.org/pr14437, fails with GCC 4.6.3 and 4.7.2
    @expectedFailureIcc # llvm.org/pr14437, fails with ICC 13.1
    @expectedFailureDarwin(16361880) # <rdar://problem/16361880>, we get the result correctly, but fail to invoke the Summary formatter.
    def test_with_dwarf(self):
        """Test calling std::String member function."""
        self.buildDwarf()
        self.call_function()

The LLDB test runner considers any class which derives from TestBase to
be a "test case" (so ExprCommandCallFunctionTestCase from this file is a
test case), and, within each test case, any member function whose name
starts with "test" to be a single test. So in this case we've got
ExprCommandCallFunctionTestCase.test_with_dsym and
ExprCommandCallFunctionTestCase.test_with_dwarf. The first only runs on
Darwin; the second runs on all platforms, but is xfail'ed on FreeBSD,
GCC, ICC, and Darwin. (I'm not sure what the @dsym_test and @dwarf_test
annotations are for.)
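Roughly, that discovery rule amounts to the following sketch (this is
not the actual dotest.py implementation, just an illustration of the
naming convention):

    def collect_tests(test_case_class):
        # A "test case" is any class deriving from TestBase; every
        # member whose name starts with "test" is an individual test.
        return [name for name in dir(test_case_class)
                if name.startswith("test")
                and callable(getattr(test_case_class, name))]

    # collect_tests(ExprCommandCallFunctionTestCase) would yield
    # ['test_with_dsym', 'test_with_dwarf'], each run separately.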
On Sat, Mar 14, 2015 at 10:05 AM Jonathan Roelofs
<jonat...@codesourcery.com> wrote:
>
> On 3/13/15 9:10 PM, Zachary Turner wrote:
> >
> > On Fri, Mar 13, 2015 at 4:01 PM Jonathan Roelofs
> > <jonat...@codesourcery.com> wrote:
> >
> > > +ddunbar
> > >
> > > On 3/13/15 9:53 AM, jing...@apple.com wrote:
> > > >
> > > > Depending on how different the different things are. Compiler
> > > > tests tend to have input, output and some machine that converts
> > > > the input to the output. That is one very particular model of
> > > > testing. Debugger tests need to do: get to stage 1, if that
> > > > succeeded, get to stage 2, if that succeeded, etc. Plus there's
> > > > generally substantial setup code to get somewhere interesting,
> > > > so while you are there you generally try to test a bunch of
> > > > similar things. Plus, the tests often have points where there
> > > > are several success cases, but each one requires a different
> > > > "next action", stepping being the prime example of this. These
> > > > are very different models, and I don't see that trying to smush
> > > > the two together would be a fruitful exercise.
> > >
> > > I think LIT does make the assumption that one "test file" has one
> > > "test result". But this is a place where we could extend LIT a
> > > bit. I don't think it would be very painful.
> > >
> > > For me, this would be very useful for a few of the big libc++abi
> > > tests, like the demangler one, as currently I have to #ifdef out
> > > a couple of the cases that can't possibly work on my platform. It
> > > would be much nicer if that particular test file outputted
> > > multiple test results, of which I could XFAIL the ones I know
> > > won't ever work. (For anyone who is curious, the one that comes
> > > to mind needs the C99 %a printf format, which my libc doesn't
> > > have. It's a baremetal target, and binary size is really
> > > important.)
> > >
> > > How much actual benefit is there in having lots of results per
> > > test case, rather than having them all &&'d together to one
> > > result?
> > >
> > > Out of curiosity, does lldb's existing testsuite allow you to run
> > > individual test results in test cases where there are more than
> > > one test result?
> >
> > I think I'm not following this line of discussion. So it's possible
> > you and Jim are talking about different things here.
>
> I think that's the case... I was imagining the "logic of the test"
> something like this:
>
> 1) Set 5 breakpoints
> 2) Continue
> 3) Assert that the debugger stopped at the first breakpoint
> 4) Continue
> 5) Assert that the debugger stopped at the second breakpoint
> 6) etc.
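> In rough lldb Python API terms, that straightline model might look
> like the sketch below (the binary name, source file, and breakpoint
> lines are placeholders):
>
>     import lldb
>
>     debugger = lldb.SBDebugger.Create()
>     debugger.SetAsync(False)  # each Continue blocks until the next stop
>     target = debugger.CreateTarget("a.out")
>     breakpoints = [target.BreakpointCreateByLocation("main.c", line)
>                    for line in (10, 20, 30, 40, 50)]
>
>     # In synchronous mode, LaunchSimple returns at the first stop.
>     process = target.LaunchSimple(None, None, ".")
>     for expected in breakpoints:
>         # Assert that we stopped at the breakpoint we expected...
>         thread = process.GetSelectedThread()
>         assert thread.GetStopReason() == lldb.eStopReasonBreakpoint
>         assert thread.GetStopReasonDataAtIndex(0) == expected.GetID()
>         # ...then continue on to the next one.
>         process.Continue()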
> Reading Jim's description again, with the help of your speculative
> example, it sounds like the test logic itself isn't straightline
> code.... that's okay too. What I was speaking to is a perceived
> difference in what the "results" of running such a test are.
>
> In llvm, the assertions are CHECK lines. In libc++, the assertions
> are calls to `assert` from assert.h, as well as `static_assert`s. In
> both cases, failing any one of those checks in a test makes the whole
> test fail. For some reason I had the impression that in lldb there
> wasn't a single test result per *.py test. Perhaps that's not the
> case? Either way, what I want to emphasize is that LIT doesn't care
> about the "logic of the test", as long as there is one test result
> per test (and even that condition could be amended, if it would be
> useful for lldb).
>
> > If I understand correctly (and maybe I don't), what Jim is saying
> > is that a debugger test might need to do something like:
> >
> > 1) Set 5 breakpoints
> > 2) Continue
> > 3) Depending on which breakpoint gets hit, take one of 5 possible
> >    "next" actions.
> >
> > But I'm having trouble coming up with an example of why this might
> > be useful. Jim, can you make this a little more concrete with a
> > specific example of a test that does this, how the test works, and
> > what the different success / failure cases are, so we can be sure
> > everyone is on the same page?
> >
> > In the case of the libc++abi tests, I'm not sure what is meant by
> > "multiple results per test case". Do you mean (for example) you'd
> > like to be able to XFAIL individual run lines based on some
> > condition? If
>
> I think this means I should make the libc++abi example even more
> concrete.... In libc++/libc++abi tests, the "RUN" line is implicit
> (well, aside from the few ShTest tests ericwf has added recently).
> Every *.pass.cpp test is a file that the test harness knows it has to
> compile, run, and check its exit status. That being said,
> libcxxabi/test/test_demangle.pass.cpp has a huge array like this:
>
> 20 const char* cases[][2] =
> 21 {
> 22     {"_Z1A", "A"},
> 23     {"_Z1Av", "A()"},
> 24     {"_Z1A1B1C", "A(B, C)"},
> 25     {"_Z4testI1A1BE1Cv", "C test<A, B>()"},
>
> snip
>
> 29594     {"_Zli2_xy", "operator\"\" _x(unsigned long long)"},
> 29595     {"_Z1fIiEDcT_", "decltype(auto) f<int>(int)"},
> 29596 };
>
> Then there's some logic in `main()` that runs __cxa_demangle on
> `cases[i][0]` and asserts that it's the same as `cases[i][1]`. If any
> of those assertions fail, the entire test is marked as failing, and
> no further lines in that array are verified. For the sake of
> discussion, let's call each of the entries in `cases` a "subtest",
> and the entirety of test_demangle.pass.cpp a "test".
>
> The sticky issue is that there are a few subtests in this test that
> don't make sense on various platforms, so currently they are #ifdef'd
> out. If the LIT TestFormat and the tests themselves had a way to
> communicate that a subtest failed, but to continue running other
> subtests after that, then we could XFAIL these weird subtests
> individually.
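> To make "communicate that a subtest failed" concrete, here's a purely
> hypothetical sketch (lit has no such subtest protocol today, and
> run_subtests is a made-up stand-in for however the format would
> collect per-subtest outcomes):
>
>     import lit.Test
>     import lit.formats
>
>     def run_subtests(test):
>         # Stand-in: a real format would run the test binary and
>         # collect one (name, passed) outcome per subtest.
>         return [("_Z1A", True), ("printf_percent_a", False)]
>
>     class SubtestAwareFormat(lit.formats.FileBasedTest):
>         def execute(self, test, litConfig):
>             outcomes = run_subtests(test)
>             failed = [name for name, ok in outcomes if not ok]
>             if not failed:
>                 return lit.Test.Result(lit.Test.PASS, '')
>             # A subtest-aware harness could consult an XFAIL list
>             # here instead of failing the whole file.
>             return lit.Test.Result(lit.Test.FAIL,
>                                    'failing subtests: ' + ', '.join(failed))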
> Keep in mind though that I'm not really advocating we go and change
> test_demangle.pass.cpp to suit that model, because #ifdefs work
> reasonably well there, and there are relatively few subtests that
> have these platform differences... That's just the first example of
> the test/subtest relationship that I could think of.
>
> > so, LLDB definitely needs that. One example which LLDB uses almost
> > everywhere is that of running the same test with dSYM or DWARF
> > debug info. On Apple platforms, tests generally need to run with
> > both dSYM and DWARF debug info (literally just repeat the same test
> > twice), and on non-Apple platforms, only DWARF tests ever need to
> > be run. So there would need to be a way to express this.
>
> Can you point me to an example of this?
>
> > There are plenty of other one-off examples. Debuggers have a lot of
> > platform-specific code, and the different platforms support
> > different amounts of functionality (especially for things like
> > Android / Windows that are works in progress). So we frequently
> > have the need to have a single test file which has, say, 10 tests
> > in it. And specific tests can be XFAILed or even disabled
> > individually based on conditions (usually which platform is running
> > the test suite, but not always).
>
> --
> Jon Roelofs
> jonat...@codesourcery.com
> CodeSourcery / Mentor Embedded

_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev