On Mar 14, 2012, at 11:14 AM, Manuel Klimek wrote:
> On Wed, Mar 14, 2012 at 7:05 PM, Douglas Gregor <[email protected]> wrote:
>>
>> On Mar 14, 2012, at 8:35 AM, Manuel Klimek wrote:
>>
>>> Hi Doug,
>>>
>>> first, a big thank you for finding the time to review. I completely
>>> agree with everything you write. Especially getting the tooling
>>> infrastructure into a more non-cmake centric shape is something that
>>> was already high on the TODO list, but I didn't want to make further
>>> changes before getting first feedback, because I figured reviewing
>>> something that still constantly changes is no fun for you ... Which
>>> brings me to a central question - which of the changes would you want
>>> to see finished before getting a first version in? Especially making
>>> the command line handling more orthogonal is something I'd like to do
>>> in lock-step with our google-internal build system handling (which is
>>> not cmake), and thus I think would be a great use case to make sure we
>>> don't miss anything. Of course I totally understand if you're not
>>> comfortable checking in the current design, and I guess that assuring
>>> you that getting the tooling infrastructure into a shape that the
>>> community is happy with is my top priority might not help here ;) On
>>> the other hand, for various reasons iterating inside the repository
>>> would make me quite a bit more productive, so I'll still ask :)
>>
>> I'd like to see ClangTool split into CompilationDatabase and Tool (or
>> whatever they're called) before it goes in, because I don't like committing
>> something with an architecture that we know will change. Anything else
>> (-fcompilation-database, simple compilation database, convenience functions,
>> change from FrontendActionFactory to function object) can go in later.
>>
>>> Some more comments inline...
>>>
>>> On Wed, Mar 14, 2012 at 1:35 PM, Douglas Gregor <[email protected]> wrote:
>>>> Hi Manuel,
>>>>
>>>> I finally found some free cycles to dig into this.
>>>>
>>>> I love the functionality provided by the tooling infrastructure, and it's
>>>> really nice to be able to easily create a single process that churns
>>>> through all of the translation units in a program to perform some
>>>> analysis/refactoring/whatever. There are also some little gems hidden in
>>>> here (such as being able to easily syntax-check code supplied by a string).
>>>>
>>>> My major concern with the tooling patch is that it's tailored for the case
>>>> of building tools that run over the JSON database emitted by CMake. It's
>>>> great to make that case work smoothly, but the tooling infrastructure as
>>>> presented does not adequately support users who aren't already using
>>>> CMake. That immediately eliminates the vast majority of potential users of
>>>> the tooling infrastructure, because while CMake is popular, it's nowhere
>>>> near dominant, and the barrier to entry for switching one's project over
>>>> to CMake is quite possibly higher than the barrier to entry for building a
>>>> tool w/o the tooling infrastructure that will work on non-CMake projects.
>>>>
>>>> I suggest a more general architecture that separates out the "program
>>>> build" database from the code that parses code and runs the action on that
>>>> code. To be a bit more concrete, I'm suggesting splitting ClangTool into
>>>> two subsystems:
>>>>
>>>> - An abstract class CompilationDatabase that allows one to iterate
>>>> over the CompileCommands that went into a particular build (or apply some
>>>> function object to each CompileCommand), along with:
>>>> - A subclass JSONCompilationDatabase that retrieves
>>>> CompileCommands from the JSON output produced by CMake
>>>> - A subclass SimpleCompilationDatabase where one
>>>> programmatically adds CompileCommands
>>>> - Note: it should be possible for us to create additional
>>>> compilation database kinds, e.g., by parsing makefiles directly
>>>> - A function that makes it easy to take argc/argv an
>>>> interpret it either as a use of JSONCompilationDatabase or
>>>> SimpleCompilationDatabase. This should make it easy to create a unified
>>>> command-line interface to Tooling-based Clang tools, so it's easy for end
>>>> users to run different tools on their projects
>>>> - A function that writes a CompilationDatabase in JSON
>>>> format, so that it can be loaded again. (Not immediately critical, but
>>>> useful nonetheless)
>>>>
>>>> - A class ("Tooling"? "Tool"?) that encapsulates the environment in
>>>> which the tool will be executed along with the ability to actually do the
>>>> execution, e.g.,
>>>> - mapVirtualFiles, as in ClangTool
>>>> - a run function that runs a single CompileCommand with a
>>>> frontend action factory
>>>> - a run function that runs every CompileCommand in the
>>>> given program database with a frontend action factory
>>>> - a run function that compiles a given code string (+
>>>> command-line options, presumably) with a frontend action factory, like
>>>> runSyntaxOnlyToolOnCode does
>>>
>>> I'm not sure I understand this correctly - compiling a given code
>>> string is done by mapping it in as a virtual file with a specific name
>>> and adding some special command default command line arguments.
>>> Perhaps you are imagining a different use case, but for me
>>> runSyntaxOnlyToolOnCode is first and foremost a unit testing help.
>>
>> It's fairly common for someone to ask, "how do I compile some code that I
>> have in a std::string?" runSyntaxOnlyToolOnCode starts to solve that
>> problem. I mentioned command-line arguments simply because one will need to
>> be able to specify (at least) language dialect options for this parse, since
>> we have no file name to go on.
>>
>>>> Now, there's no reason why we can't still have two-line syntax checkers
>>>> with this architecture. It might look like this:
>>>>
>>>> int main(int argc, char **argv) {
>>>> llvm::OwningPtr<CompilationDatabase>
>>>> Database(loadCompilationDatabase(argc, argv));
>>>> return Tool().run(newFrontendActionFactory<clang::SyntaxOnlyAction>(),
>>>> *Database);
>>>> }
>>>>
>>>> Now, to actually make tooling useful to non-CMake users, we'd want to make
>>>> it possible to create a JSON compilation database without involving CMake
>>>> at all. I suggest that we add a Clang command-line flag
>>>> "-fcompilation-database=<filename>" and have the driver update <filename>
>>>> with each compilation command it is invoked with. That way, one could do a
>>>> normal project build with, e.g.,
>>>>
>>>> make CXX=clang++
>>>> CXXFLAGS="-fcompilation-database=compile_commands.json"
>>>
>>> I actually already have written something similar (well, it's
>>> currently a python wrapper which might be a bad idea, but it was
>>> really simple). The design I have is that we output a little JSON
>>> snippet next to each .o file. The problem with handing the compilation
>>> database to each compile step is concurrent runs - how would you
>>> propose to sync on the database? I can't imagine implementing the sync
>>> would be worth the cost, but I'm happy to be convinced otherwise…
>>
>> POSIX file locks are fine; as I noted in the conversion with Chandler, most
>> developers have projects small enough that the overhead won't matter to
>> them, but simplicity and extremely important.
>
> Do they work on Windows? (I did a quick search and it seems like file
> locking on windows is a pain, but I might have missed something).
I could live with Windows not supporting this flag until someone interested in
that platform comes along and fixes it.
>>>> + /// \brief Returns the file manager used in the tool.
>>>> + ///
>>>> + /// The file manager is shared between all translation units.
>>>> + FileManager &getFiles() { return Files; }
>>>>
>>>> Given that the FileManager is exposed, should mapVirtualFile deal in
>>>> FileEntry*'s? (Or, at least, have an overload that uses a FileEntry*?)
>>>
>>> /me wonders why he exposed the FileManager at all - one interesting
>>> part that's missing from the current CL is one of the next steps that
>>> allows us to do refactorings (if you're interested, the current
>>> incarnation is at:
>>> http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/include/clang/Tooling/Refactoring.h?view=markup
>>> with an example tool at
>>> http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/tools/remove-cstr-calls/RemoveCStrCalls.cpp?view=markup)
>>>
>>> I'll investigate. I don't want users to have to deal with
>>> FileEntry*'s, and rather keep the interface as simple as possible.
>>> Unless you have a specific use case?
>>
>> No use case in mind. I'd be just as happy (heck, probably happier) if the
>> FileManager wasn't exposed at all.
>>
>>>> + // Create the compilers actual diagnostics engine.
>>>> + Compiler.createDiagnostics(CC1Args.size(),
>>>> + const_cast<char**>(CC1Args.data()));
>>>>
>>>> Should the tooling infrastructure provide a way to deal with diagnostics?
>>>
>>> I have not really thought of that. We probably want to be able to
>>> inject diagnostics at some point (we have some diagnostic mapreduces
>>> internally when we iterate on implementing new diagnostics).
>>
>> Okay.
>>
>>>> +using clang::tooling::ClangTool;
>>>> +using clang::tooling::newFrontendActionFactory;
>>>> +
>>>> +int main(int argc, char **argv) {
>>>> + ClangTool Tool(argc, argv);
>>>> + return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
>>>> +}
>>>>
>>>> I feel like the tooling infrastructure would benefit from a slightly more
>>>> involved example, especially one that benefits from running over all files
>>>> in a project within the same process. Perhaps that isn't easy until we
>>>> have an AST matching framework?
>>>
>>> The AST matching framework will actually not help much here - this
>>> becomes very interesting when you do real refactorings (see the
>>> example I cited above), because then we need to deduplicate edits at
>>> the end of running over all translation units.
>>
>> I was thinking of something a bit more simple. For example, finding all
>> references to a given global variable within a project.
>
> How does that benefit from running everything in-process?
Easy to eliminate duplicates from references within headers, perhaps?
/me is grasping at straws.
- Doug
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits