On Mar 14, 2012, at 8:35 AM, Manuel Klimek wrote: > Hi Doug, > > first, a big thank you for finding the time to review. I completely > agree with everything you write. Especially getting the tooling > infrastructure into a more non-cmake centric shape is something that > was already high on the TODO list, but I didn't want to make further > changes before getting first feedback, because I figured reviewing > something that still constantly changes is no fun for you ... Which > brings me to a central question - which of the changes would you want > to see finished before getting a first version in? Especially making > the command line handling more orthogonal is something I'd like to do > in lock-step with our google-internal build system handling (which is > not cmake), and thus I think would be a great use case to make sure we > don't miss anything. Of course I totally understand if you're not > comfortable checking in the current design, and I guess that assuring > you that getting the tooling infrastructure into a shape that the > community is happy with is my top priority might not help here ;) On > the other hand, for various reasons iterating inside the repository > would make me quite a bit more productive, so I'll still ask :)
I'd like to see ClangTool split into CompilationDatabase and Tool (or whatever they're called) before it goes in, because I don't like committing something with an architecture that we know will change. Anything else (-fcompilation-database, simple compilation database, convenience functions, change from FrontendActionFactory to function object) can go in later. > Some more comments inline... > > On Wed, Mar 14, 2012 at 1:35 PM, Douglas Gregor <[email protected]> wrote: >> Hi Manuel, >> >> I finally found some free cycles to dig into this. >> >> I love the functionality provided by the tooling infrastructure, and it's >> really nice to be able to easily create a single process that churns through >> all of the translation units in a program to perform some >> analysis/refactoring/whatever. There are also some little gems hidden in >> here (such as being able to easily syntax-check code supplied by a string). >> >> My major concern with the tooling patch is that it's tailored for the case >> of building tools that run over the JSON database emitted by CMake. It's >> great to make that case work smoothly, but the tooling infrastructure as >> presented does not adequately support users who aren't already using CMake. >> That immediately eliminates the vast majority of potential users of the >> tooling infrastructure, because while CMake is popular, it's nowhere near >> dominant, and the barrier to entry for switching one's project over to CMake >> is quite possibly higher than the barrier to entry for building a tool w/o >> the tooling infrastructure that will work on non-CMake projects. >> >> I suggest a more general architecture that separates out the "program build" >> database from the code that parses code and runs the action on that code. To >> be a bit more concrete, I'm suggesting splitting ClangTool into two >> subsystems: >> >> - An abstract class CompilationDatabase that allows one to iterate >> over the CompileCommands that went into a particular build (or apply some >> function object to each CompileCommand), along with: >> - A subclass JSONCompilationDatabase that retrieves >> CompileCommands from the JSON output produced by CMake >> - A subclass SimpleCompilationDatabase where one >> programmatically adds CompileCommands >> - Note: it should be possible for us to create additional >> compilation database kinds, e.g., by parsing makefiles directly >> - A function that makes it easy to take argc/argv an >> interpret it either as a use of JSONCompilationDatabase or >> SimpleCompilationDatabase. This should make it easy to create a unified >> command-line interface to Tooling-based Clang tools, so it's easy for end >> users to run different tools on their projects >> - A function that writes a CompilationDatabase in JSON >> format, so that it can be loaded again. (Not immediately critical, but >> useful nonetheless) >> >> - A class ("Tooling"? "Tool"?) that encapsulates the environment in >> which the tool will be executed along with the ability to actually do the >> execution, e.g., >> - mapVirtualFiles, as in ClangTool >> - a run function that runs a single CompileCommand with a >> frontend action factory >> - a run function that runs every CompileCommand in the given >> program database with a frontend action factory >> - a run function that compiles a given code string (+ >> command-line options, presumably) with a frontend action factory, like >> runSyntaxOnlyToolOnCode does > > I'm not sure I understand this correctly - compiling a given code > string is done by mapping it in as a virtual file with a specific name > and adding some special command default command line arguments. > Perhaps you are imagining a different use case, but for me > runSyntaxOnlyToolOnCode is first and foremost a unit testing help. It's fairly common for someone to ask, "how do I compile some code that I have in a std::string?" runSyntaxOnlyToolOnCode starts to solve that problem. I mentioned command-line arguments simply because one will need to be able to specify (at least) language dialect options for this parse, since we have no file name to go on. >> Now, there's no reason why we can't still have two-line syntax checkers with >> this architecture. It might look like this: >> >> int main(int argc, char **argv) { >> llvm::OwningPtr<CompilationDatabase> >> Database(loadCompilationDatabase(argc, argv)); >> return Tool().run(newFrontendActionFactory<clang::SyntaxOnlyAction>(), >> *Database); >> } >> >> Now, to actually make tooling useful to non-CMake users, we'd want to make >> it possible to create a JSON compilation database without involving CMake at >> all. I suggest that we add a Clang command-line flag >> "-fcompilation-database=<filename>" and have the driver update <filename> >> with each compilation command it is invoked with. That way, one could do a >> normal project build with, e.g., >> >> make CXX=clang++ >> CXXFLAGS="-fcompilation-database=compile_commands.json" > > I actually already have written something similar (well, it's > currently a python wrapper which might be a bad idea, but it was > really simple). The design I have is that we output a little JSON > snippet next to each .o file. The problem with handing the compilation > database to each compile step is concurrent runs - how would you > propose to sync on the database? I can't imagine implementing the sync > would be worth the cost, but I'm happy to be convinced otherwise… POSIX file locks are fine; as I noted in the conversion with Chandler, most developers have projects small enough that the overhead won't matter to them, but simplicity and extremely important. >> + /// \brief Returns the file manager used in the tool. >> + /// >> + /// The file manager is shared between all translation units. >> + FileManager &getFiles() { return Files; } >> >> Given that the FileManager is exposed, should mapVirtualFile deal in >> FileEntry*'s? (Or, at least, have an overload that uses a FileEntry*?) > > /me wonders why he exposed the FileManager at all - one interesting > part that's missing from the current CL is one of the next steps that > allows us to do refactorings (if you're interested, the current > incarnation is at: > http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/include/clang/Tooling/Refactoring.h?view=markup > with an example tool at > http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/tools/remove-cstr-calls/RemoveCStrCalls.cpp?view=markup) > > I'll investigate. I don't want users to have to deal with > FileEntry*'s, and rather keep the interface as simple as possible. > Unless you have a specific use case? No use case in mind. I'd be just as happy (heck, probably happier) if the FileManager wasn't exposed at all. >> + // Create the compilers actual diagnostics engine. >> + Compiler.createDiagnostics(CC1Args.size(), >> + const_cast<char**>(CC1Args.data())); >> >> Should the tooling infrastructure provide a way to deal with diagnostics? > > I have not really thought of that. We probably want to be able to > inject diagnostics at some point (we have some diagnostic mapreduces > internally when we iterate on implementing new diagnostics). Okay. >> +using clang::tooling::ClangTool; >> +using clang::tooling::newFrontendActionFactory; >> + >> +int main(int argc, char **argv) { >> + ClangTool Tool(argc, argv); >> + return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>()); >> +} >> >> I feel like the tooling infrastructure would benefit from a slightly more >> involved example, especially one that benefits from running over all files >> in a project within the same process. Perhaps that isn't easy until we have >> an AST matching framework? > > The AST matching framework will actually not help much here - this > becomes very interesting when you do real refactorings (see the > example I cited above), because then we need to deduplicate edits at > the end of running over all translation units. I was thinking of something a bit more simple. For example, finding all references to a given global variable within a project. - Doug _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
