Re: [cfe-commits] [PATCH] Implements support to run standalone tools

Manuel Klimek Wed, 14 Mar 2012 11:17:10 -0700

On Wed, Mar 14, 2012 at 7:05 PM, Douglas Gregor <[email protected]> wrote:
>
> On Mar 14, 2012, at 8:35 AM, Manuel Klimek wrote:
>
>> Hi Doug,
>>
>> first, a big thank you for finding the time to review. I completely
>> agree with everything you write. Especially getting the tooling
>> infrastructure into a more non-cmake centric shape is something that
>> was already high on the TODO list, but I didn't want to make further
>> changes before getting first feedback, because I figured reviewing
>> something that still constantly changes is no fun for you ... Which
>> brings me to a central question - which of the changes would you want
>> to see finished before getting a first version in? Especially making
>> the command line handling more orthogonal is something I'd like to do
>> in lock-step with our google-internal build system handling (which is
>> not cmake), and thus I think would be a great use case to make sure we
>> don't miss anything. Of course I totally understand if you're not
>> comfortable checking in the current design, and I guess that assuring
>> you that getting the tooling infrastructure into a shape that the
>> community is happy with is my top priority might not help here ;) On
>> the other hand, for various reasons iterating inside the repository
>> would make me quite a bit more productive, so I'll still ask :)
>
> I'd like to see ClangTool split into CompilationDatabase and Tool (or 
> whatever they're called) before it goes in, because I don't like committing 
> something with an architecture that we know will change. Anything else 
> (-fcompilation-database, simple compilation database, convenience functions, 
> change from FrontendActionFactory to function object) can go in later.
>
>> Some more comments inline...
>>
>> On Wed, Mar 14, 2012 at 1:35 PM, Douglas Gregor <[email protected]> wrote:
>>> Hi Manuel,
>>>
>>> I finally found some free cycles to dig into this.
>>>
>>> I love the functionality provided by the tooling infrastructure, and it's 
>>> really nice to be able to easily create a single process that churns 
>>> through all of the translation units in a program to perform some 
>>> analysis/refactoring/whatever. There are also some little gems hidden in 
>>> here (such as being able to easily syntax-check code supplied by a string).
>>>
>>> My major concern with the tooling patch is that it's tailored for the case 
>>> of building tools that run over the JSON database emitted by CMake. It's 
>>> great to make that case work smoothly, but the tooling infrastructure as 
>>> presented does not adequately support users who aren't already using CMake. 
>>> That immediately eliminates the vast majority of potential users of the 
>>> tooling infrastructure, because while CMake is popular, it's nowhere near 
>>> dominant, and the barrier to entry for switching one's project over to 
>>> CMake is quite possibly higher than the barrier to entry for building a 
>>> tool w/o the tooling infrastructure that will work on non-CMake projects.
>>>
>>> I suggest a more general architecture that separates out the "program 
>>> build" database from the code that parses code and runs the action on that 
>>> code. To be a bit more concrete, I'm suggesting splitting ClangTool into 
>>> two subsystems:
>>>
>>>        - An abstract class CompilationDatabase that allows one to iterate 
>>> over the CompileCommands that went into a particular build (or apply some 
>>> function object to each CompileCommand), along with:
>>>                - A subclass JSONCompilationDatabase that retrieves 
>>> CompileCommands from the JSON output produced by CMake
>>>                - A subclass SimpleCompilationDatabase where one 
>>> programmatically adds CompileCommands
>>>                - Note: it should be possible for us to create additional 
>>> compilation database kinds, e.g., by parsing makefiles directly
>>>                - A function that makes it easy to take argc/argv an 
>>> interpret it either as a use of JSONCompilationDatabase or 
>>> SimpleCompilationDatabase. This should make it easy to create a unified 
>>> command-line interface to Tooling-based Clang tools, so it's easy for end 
>>> users to run different tools on their projects
>>>                - A function that writes a CompilationDatabase in JSON 
>>> format, so that it can be loaded again. (Not immediately critical, but 
>>> useful nonetheless)
>>>
>>>        - A class ("Tooling"? "Tool"?) that encapsulates the environment in 
>>> which the tool will be executed along with the ability to actually do the 
>>> execution, e.g.,
>>>                - mapVirtualFiles, as in ClangTool
>>>                - a run function that runs a single CompileCommand with a 
>>> frontend action factory
>>>                - a run function that runs every CompileCommand in the given 
>>> program database with a frontend action factory
>>>                - a run function that compiles a given code string (+ 
>>> command-line options, presumably) with a frontend action factory, like 
>>> runSyntaxOnlyToolOnCode does
>>
>> I'm not sure I understand this correctly - compiling a given code
>> string is done by mapping it in as a virtual file with a specific name
>> and adding some special command default command line arguments.
>> Perhaps you are imagining a different use case, but for me
>> runSyntaxOnlyToolOnCode is first and foremost a unit testing help.
>
> It's fairly common for someone to ask, "how do I compile some code that I 
> have in a std::string?" runSyntaxOnlyToolOnCode starts to solve that problem. 
> I mentioned command-line arguments simply because one will need to be able to 
> specify (at least) language dialect options for this parse, since we have no 
> file name to go on.
>
>>> Now, there's no reason why we can't still have two-line syntax checkers 
>>> with this architecture. It might look like this:
>>>
>>>  int main(int argc, char **argv) {
>>>    llvm::OwningPtr<CompilationDatabase> 
>>> Database(loadCompilationDatabase(argc, argv));
>>>    return Tool().run(newFrontendActionFactory<clang::SyntaxOnlyAction>(), 
>>> *Database);
>>>  }
>>>
>>> Now, to actually make tooling useful to non-CMake users, we'd want to make 
>>> it possible to create a JSON compilation database without involving CMake 
>>> at all. I suggest that we add a Clang command-line flag 
>>> "-fcompilation-database=<filename>" and have the driver update <filename> 
>>> with each compilation command it is invoked with. That way, one could do a 
>>> normal project build with, e.g.,
>>>
>>>        make CXX=clang++ 
>>> CXXFLAGS="-fcompilation-database=compile_commands.json"
>>
>> I actually already have written something similar (well, it's
>> currently a python wrapper which might be a bad idea, but it was
>> really simple). The design I have is that we output a little JSON
>> snippet next to each .o file. The problem with handing the compilation
>> database to each compile step is concurrent runs - how would you
>> propose to sync on the database? I can't imagine implementing the sync
>> would be worth the cost, but I'm happy to be convinced otherwise…
>
> POSIX file locks are fine; as I noted in the conversion with Chandler, most 
> developers have projects small enough that the overhead won't matter to them, 
> but simplicity and extremely important.


Do they work on Windows? (I did a quick search and it seems like file
locking on windows is a pain, but I might have missed something).

>>> +  /// \brief Returns the file manager used in the tool.
>>> +  ///
>>> +  /// The file manager is shared between all translation units.
>>> +  FileManager &getFiles() { return Files; }
>>>
>>> Given that the FileManager is exposed, should mapVirtualFile deal in 
>>> FileEntry*'s? (Or, at least, have an overload that uses a FileEntry*?)
>>
>> /me wonders why he exposed the FileManager at all - one interesting
>> part that's missing from the current CL is one of the next steps that
>> allows us to do refactorings (if you're interested, the current
>> incarnation is at:
>> http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/include/clang/Tooling/Refactoring.h?view=markup
>> with an example tool at
>> http://llvm.org/viewvc/llvm-project/cfe/branches/tooling/tools/remove-cstr-calls/RemoveCStrCalls.cpp?view=markup)
>>
>> I'll investigate. I don't want users to have to deal with
>> FileEntry*'s, and rather keep the interface as simple as possible.
>> Unless you have a specific use case?
>
> No use case in mind. I'd be just as happy (heck, probably happier) if the 
> FileManager wasn't exposed at all.
>
>>> +  // Create the compilers actual diagnostics engine.
>>> +  Compiler.createDiagnostics(CC1Args.size(),
>>> +                             const_cast<char**>(CC1Args.data()));
>>>
>>> Should the tooling infrastructure provide a way to deal with diagnostics?
>>
>> I have not really thought of that. We probably want to be able to
>> inject diagnostics at some point (we have some diagnostic mapreduces
>> internally when we iterate on implementing new diagnostics).
>
> Okay.
>
>>> +using clang::tooling::ClangTool;
>>> +using clang::tooling::newFrontendActionFactory;
>>> +
>>> +int main(int argc, char **argv) {
>>> +  ClangTool Tool(argc, argv);
>>> +  return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
>>> +}
>>>
>>> I feel like the tooling infrastructure would benefit from a slightly more 
>>> involved example, especially one that benefits from running over all files 
>>> in a project within the same process. Perhaps that isn't easy until we have 
>>> an AST matching framework?
>>
>> The AST matching framework will actually not help much here - this
>> becomes very interesting when you do real refactorings (see the
>> example I cited above), because then we need to deduplicate edits at
>> the end of running over all translation units.
>
> I was thinking of something a bit more simple. For example, finding all 
> references to a given global variable within a project.

How does that benefit from running everything in-process?

Cheers,
/Manuel

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [cfe-commits] [PATCH] Implements support to run standalone tools

Reply via email to