We'll probably rewrite tests that we find are failing specifically because of issues like this, but I agree it's not worth rewriting everything else except on an as-needed basis.
To make the distinction explicit and enforce it at an organizational level, would it be worth creating folders under lldb/test, such as lldb/test/commands, and recommending that all HandleCommand tests go there?

A possibly unrelated question: with regard to this API-test-vs-HandleCommand-test situation, is that the purpose of the @python_api_test decorator (and is that decorator even still useful)?

On Fri, Sep 11, 2015 at 11:42 AM Jim Ingham <jing...@apple.com> wrote:

> I have held from the beginning that the only tests that should be written
> using HandleCommand are those that explicitly test command behavior, and if
> it is possible to write a test using the SB API you should always do it
> that way, for the very reasons you cite. Not everybody agreed with me at
> first, so we ended up with a bunch of tests that do complex things using
> HandleCommand where they really ought not to. I'm not sure it is worth the
> time to go rewrite all those tests, but we shouldn't write any new tests
> that way.
>
> Jim
>
> > On Sep 11, 2015, at 11:33 AM, Zachary Turner via lldb-dev <
> > lldb-dev@lists.llvm.org> wrote:
> >
> > The past few weeks I've spent a lot of time xfailing the rest of the
> > failing tests on Windows so we can enable tests to run on the bots.
> >
> > One thing I ran into more frequently than I would have liked is that the
> > tests were failing not because the functionality was broken, but because
> > the substrings being grepped for in the output had a slightly different
> > format on Windows. The pattern for tests is frequently something like:
> >
> >     result = runCommand(<some command>)
> >     self.expect(<result matches some regex>)
> >
> > A good example of this is that when you do a backtrace, on Windows you
> > might see a fully demangled function name such as a.out`void foo(int x),
> > whereas on other platforms you might just see a.out`foo.
> > I saw the reverse situation as well, where a test was passing but
> > shouldn't have been, because the functionality was actually broken but,
> > due to the imprecision of grepping output, the grep was succeeding.
> > Specifically, this was happening on a test that verified function
> > parameters. It launched a program with 3 arguments and then looked for
> > "(int)argc=3" in the frame info. It was broken on Windows because argc
> > was pointing to junk memory, so it was actually printing
> > "(int)argc=3248902" in the output. The test was passing even though it
> > was broken.
> >
> > Rather than make the regexes more complicated, I think the right fix
> > here is to stop using the command system and grepping to write tests.
> > Just go through the API for everything, including verifying the result.
> > In the second case, for example, you launch the process, set the
> > breakpoint, wait for it to stop, find the argument named argc, and
> > verify that its value is 3.
> >
> > I don't want to propose going back and rewriting every single test to
> > do this, but I want to see how people feel about moving toward this
> > model going forward as the default method of writing tests.
> >
> > I do still think we need some tests that verify commands run, but I
> > think those tests should focus not on complicated interactions with the
> > debugger, but just on verifying that things parse correctly and the
> > command is configured correctly, with the underlying functionality
> > being tested by the API tests.
> >
> > Thoughts?
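Both failure modes quoted above can be made concrete with a small self-contained sketch. The strings below are hypothetical stand-ins modeled on the examples in the thread, not captured lldb output:

```python
import re

# Platform-dependent formatting: the same frame rendered two ways.
windows_bt = "a.out`void foo(int x)"   # fully demangled name on Windows
posix_bt = "a.out`foo"                 # bare name elsewhere
# A regex that tolerates both forms has to anticipate every variation:
assert re.search(r"a\.out`(?:void )?foo", windows_bt)
assert re.search(r"a\.out`(?:void )?foo", posix_bt)

# The false positive: argc pointed at junk memory, yet the substring
# check still passed because "3" is a prefix of "3248902".
frame_info = "(int)argc=3248902"
assert "(int)argc=3" in frame_info     # passes despite the bug
# Parsing out the value and comparing it exactly catches the bug:
value = int(re.search(r"\(int\)argc=(\d+)", frame_info).group(1))
assert value != 3                      # the broken test is now visible
```

This is exactly the argument for checking values through an API rather than grepping rendered text: the substring check encodes neither the full output format nor the exact expected value.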
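For comparison, a rough sketch of the SB-API style test the thread advocates might look like the following. This is illustrative, not copied from any real lldb test: it assumes an a.out whose main() receives argc/argv, and that the lldb Python bindings are importable from an LLDB installation.

```python
# Hypothetical sketch of an SB-API test; the helper name and exe path
# are made up, but the SB API calls are real.
def check_argc_via_sb_api(exe_path="a.out"):
    import lldb  # provided by an LLDB installation, not the stdlib

    debugger = lldb.SBDebugger.Create()
    debugger.SetAsync(False)               # block until the process stops
    target = debugger.CreateTarget(exe_path)
    target.BreakpointCreateByName("main")  # stop where argc is in scope
    # Launch with two arguments, so argc (program name + 2) should be 3.
    process = target.LaunchSimple(["one", "two"], None, None)
    frame = process.GetSelectedThread().GetFrameAtIndex(0)
    argc = frame.FindVariable("argc")
    # Compare the actual integer instead of grepping rendered output.
    return argc.GetValueAsSigned() == 3
```

With this shape, the Windows test above would have failed loudly (argc would read back as 3248902, not 3), and no regex ever has to anticipate platform-specific formatting.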
_______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev