> On Sun, 20 Mar 2011 07:50:10 -0000, Jonathan M Davis <[email protected]> wrote:
>
> >> Jonathan M Davis wrote:
> >> > On Saturday 19 March 2011 18:04:57 Don wrote:
> >> >> Jonathan M Davis wrote:
> >> >>> On Saturday 19 March 2011 17:11:56 Don wrote:
> >> >>>> Here's the task:
> >> >>>> Given a .d source file, strip out all of the unittest {} blocks,
> >> >>>> including everything inside them.
> >> >>>> Strip out all comments as well.
> >> >>>> Print out the resulting file.
> >> >>>>
> >> >>>> Motivation: Bug reports frequently come with very large test cases.
> >> >>>> Even ones which look small often import from Phobos.
> >> >>>> Reducing the test case is the first step in fixing the bug, and it's
> >> >>>> frequently ~30% of the total time required. Stripping out the unit
> >> >>>> tests is the most time-consuming and error-prone part of reducing the
> >> >>>> test case.
> >> >>>>
> >> >>>> This should be a good task if you're relatively new to D but would
> >> >>>> like to do something really useful.
> >> >>>
> >> >>> Unfortunately, to do that 100% correctly, you need to actually have a
> >> >>> working D lexer (and possibly parser). You might be able to get
> >> >>> something close enough to work in most cases, but it doesn't take all
> >> >>> that much to throw off a basic implementation of this sort of thing if
> >> >>> you don't lex/parse it with something which properly understands D.
> >> >>>
> >> >>> - Jonathan M Davis
> >> >>
> >> >> I didn't say it needs 100% accuracy. You can assume, for example, that
> >> >> "unittest" always occurs at the start of a line. The only other things
> >> >> you need to lex are {}, string literals, and comments.
> >> >>
> >> >> BTW, the immediate motivation for this is std.datetime in Phobos. The
> >> >> sheer number of unittests in there is an absolute catastrophe for
> >> >> tracking down bugs. It makes a tool like this MANDATORY.
> >> >
> >> > I tried to create a similar tool before and gave up because I couldn't
> >> > make it 100% accurate and was running into problems with it. If someone
> >> > wants to take a shot at it though, that's fine.
> >> >
> >> > As for the unit tests in std.datetime making it hard to track down bugs,
> >> > that only makes sense to me if you're trying to look at the whole thing
> >> > at once and track down a compiler bug which happens _somewhere_ in the
> >> > code, but you don't know where. Other than a problem like that, I don't
> >> > really see how the unit tests get in the way of tracking down bugs. Is
> >> > it that you need to compile in a version of std.datetime which doesn't
> >> > have any unit tests compiled in but you still need to compile with
> >> > -unittest for other stuff?
> >>
> >> No. All you know there's a bug that's being triggered somewhere in
> >> Phobos (with -unittest). It's probably not in std.datetime.
> >> But Phobos is a horrible ball of mud where everything imports everything
> >> else, and std.datetime is near the centre of that ball. What you have to
> >> do is reduce the amount of code, and especially the number of modules,
> >> as rapidly as possible; this means getting rid of imports.
> >>
> >> To do this, you need to remove large chunks of code from the files. This
> >> is pretty simple; comment out half of the file, if it still works, then
> >> delete it. Normally this works well because typically only about a dozen
> >> lines are actually being used. After doing this about three or four
> >> times it's small enough that you can usually get rid of most of the
> >> imports. Unittests foul this up because they use functions/classes from
> >> inside the file.
> >>
> >> In the case of std.datetime it's even worse because the signal-to-noise
> >> ratio is so incredibly poor; it's really difficult to find the few lines
> >> of code that are actually being used by other Phobos modules.
> >>
> >> My experience (obviously only over the last month or so) has been that
> >> if the reduction of a bug is non-obvious, more than 10% of the total
> >> time taken to fix that bug is the time taken to cut down std.datetime.
> >
> > Hmmm. I really don't know what could be done to fix that (other than
> > making it easier to rip out the unittest blocks). And enough of
> > std.datetime depends on other parts of std.datetime that trimming it
> > down isn't (and can't be) exactly easy. In general, SysTime is the most
> > likely type to be used, and it depends on Date, TimeOfDay, and DateTime,
> > and all 4 of those depend on most of the free functions in the module.
> > It's not exactly designed in a manner which allows you to cut out large
> > chunks and still have it compile. And I don't think that it _could_ be
> > designed that way and still have the functionality that it has.
> >
> > I guess that this sort of problem is one that would pop up mainly when
> > dealing with compiler bugs. I have a hard time seeing it popping up with
> > your typical bug in Phobos itself. So, I guess that this is the sort of
> > thing that you'd run into and I likely wouldn't.
> >
> > I really don't know how the situation could be improved though other than
> > making it easier to cut out the unit tests.
>
> I was just thinking .. if we get a list of the symbols the linker is
> including, then write an app to take that list, and strip everything else
> out of the source .. would that work. The Q's are how hard is it to get
> the symbols from the linker and then how hard is it to match those to
> source. IIRC there are functions in phobos to convert to/from symbol
> names, so if the app had sufficient lexing and parsing capability it could
> match on those.
That would require a full-blown D lexer and parser.

- Jonathan M Davis
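
For reference, here is a rough sketch of the simplified approach Don describes earlier in the thread: assume "unittest" only occurs at the start of a line, and lex nothing beyond {}, string literals, and comments. It is written in Python rather than D purely for illustration, and the function names are made up for this sketch. It is not 100% accurate: it assumes the opening brace directly follows "unittest" (whitespace allowed), and it does not treat r"...", x"...", q"..." or q{...} literals specially.

```python
import re

# Assumption from the thread: "unittest" starts a line (leading whitespace allowed).
_UNITTEST = re.compile(r"unittest\s*\{")


def _skip_comment(src: str, i: int) -> int:
    """Return the index just past a comment starting at src[i], or i if none starts there."""
    two = src[i:i + 2]
    if two == "//":
        j = src.find("\n", i)
        return len(src) if j < 0 else j  # keep the newline itself
    if two == "/*":
        j = src.find("*/", i + 2)
        return len(src) if j < 0 else j + 2
    if two == "/+":  # D's nesting block comment
        depth, j = 1, i + 2
        while j < len(src) and depth:
            if src.startswith("/+", j):
                depth, j = depth + 1, j + 2
            elif src.startswith("+/", j):
                depth, j = depth - 1, j + 2
            else:
                j += 1
        return j
    return i


def _skip_literal(src: str, i: int) -> int:
    """Return the index just past the "...", '...' or `...` literal starting at src[i]."""
    quote = src[i]
    j = i + 1
    while j < len(src):
        if src[j] == "\\" and quote != "`":  # backticks are WYSIWYG: no escapes
            j += 2
            continue
        if src[j] == quote:
            return j + 1
        j += 1
    return len(src)


def _skip_block(src: str, i: int) -> int:
    """Given src[i] == '{', return the index just past the matching '}'."""
    depth = 0
    while i < len(src):
        j = _skip_comment(src, i)
        if j != i:
            i = j
            continue
        c = src[i]
        if c in "\"'`":
            i = _skip_literal(src, i)
            continue
        if c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    return len(src)


def strip_unittests_and_comments(src: str) -> str:
    """Drop comments and start-of-line unittest blocks; copy everything else through."""
    out = []
    i = 0
    line_start = True  # only whitespace (or removed text) seen since the last newline
    while i < len(src):
        j = _skip_comment(src, i)
        if j != i:
            i = j
            continue
        c = src[i]
        if c in "\"'`":
            j = _skip_literal(src, i)
            out.append(src[i:j])  # literals are copied verbatim
            i, line_start = j, False
            continue
        if line_start:
            m = _UNITTEST.match(src, i)
            if m:
                i = _skip_block(src, m.end() - 1)  # m.end() - 1 is the '{'
                continue
        out.append(c)
        if c == "\n":
            line_start = True
        elif not c.isspace():
            line_start = False
        i += 1
    return "".join(out)
```

To print the stripped file, as the original task asks, something like print(strip_unittests_and_comments(open(path).read())) would do, with path being whatever .d file is being reduced. Whitespace left behind by the removed blocks is not cleaned up.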
