On 6/22/2012 1:17 PM, Jordan Rose wrote: > Hi, Joshua. First off, I really like that you submitted a /patch/ for a > better plugin API, including all the documentation and example changes that > would have otherwise been necessary to put into an e-mail. It shows that your > proposal is feasible and well-defined already, and gives us something to work > with. > >> Why would you need two copies of Basic/Lex/etc.? Clang is already built with >> all of this stuff, so the plugin would be able to use clang's version of >> everything here in normal codegen anyways, unless I'm misunderstanding your >> statement. > > What I meant was that if we wanted to run some action instead of CodeGen, we > would need to use libTooling instead...but that means building a separate > executable with everything in it again. But I forgot about -fsyntax-only, and > plugins could require this in their clang_plugin_begin_file check. > > (It might be nice if they could check this sooner, though...maybe they can > get the CompilerInvocation in clang_plugin_init, possibly even as non-const! > -fsyntax-only isn't the only interesting setting.) > > On the flip side, it might be desirable for plugins to be able to produce > errors and a nonzero result code. Whether or not these plugin diagnostics can > halt CodeGen is something to consider as well.
I expose CompilerInstance, from which you can get the DiagnosticsEngine [1] and emit diagnostics. The diagnostics engine even makes it easy to create custom diagnostics quickly, unlike, say, attributes. > Hooray for using a designated entry point instead of a static constructor! > This is something that was a problem on Windows, IIRC. > > That said, I'd go even further with the version checks, and /require/ that > the major and minor versions match, at the very least. I don't think we can > trust objects to be layout-compatible or functions to have all the same > arguments even between SVN revisions or point fixes, let alone major/minor > versions -- only libclang is stable enough for that. I implemented something > like this for static analyzer plugins by requiring plugins to export a symbol > that just contains the version string they were compiled against. You can see > how it works in CheckerRegistry.h and CheckerRegistration.cpp. The intent is that all plugins do the check, but this allows the really clever people who absolutely need multiple version support to be able to do so. Somewhere, I have documentation that emphasizes that the check out to be done. > Also, you might want to mention here that these entry points are extern "C", > in case someone decides to ignore the advice to include Plugin.h. I can do that. I expect it ought to be obvious to most people since: 1. Plugins will fail to load if it doesn't work very quickly. 2. C++ name mangling is known to be problematically different. 3. You need to include Plugin.h to be able to use plugin::PluginArg and plugin::Version in the first place. >> +<p>To run a plugin, you merely need to specify the plugin to be loaded >> using the >> +-fplugin command line options. Additional parameters for the plugins can be >> passed with -fplugin-arg-<plugin-name>-<arg>=<value> or >> +-fplugin-arg<plugin-name>-<arg>. These options work the same when >> both run >> +with the normal driver and the cc1 process.</p> > The second one is missing a dash between the literal 'arg' and the plugin > name. Also, personally I would switch the order of these two to put the > flag-only version first, since it looks simpler. > > This is STILL rather unwieldy, but I guess the idea was that if you really > want to write your own command line, you're supposed to use libTooling? Or a > wrapper script like the one that appeared awhile back. The argument syntax I derived from gcc (see <http://gcc.gnu.org/onlinedocs/gccint/Plugins.html>). Given clang's argument parsing structure, I figured that an argument syntax similar to this was the best way to pass through arguments to plugins easily. >> +namespace plugin { >> + >> +/** >> + * An individual argument for a plugin. >> + * >> + * Plugin arguments are specified as -fplugin-arg-<plugin>-<name>=<value> or >> + * -fplugin-arg-<plugin>-<name> attributes. In the latter case, the value >> + * attribute would be an empty string. >> + */ >> +struct PluginArg { >> + const char *name; >> + const char *value; >> +}; > Any reason why 'const char *' and not StringRef? Since clang_plugin_init is the function which decides whether or not the plugin is usable, I wanted to make sure that it would be as ABI-stable as possible. Using const char * here means that this works even if LLVM decides to later modify the StringRef structure layout. In all other functions, I'm much less concerned with ABI and API stability. It also allows us to reuse this for a C plugin API if we later decide to make such an option available. >> +// The following functions are not implemented within clang itself, but are >> +// prototypes of the various functions that may be implemented by plugins. >> +// Including it in a header file allows these hooks to be documented with >> +// doxygen, and the use of extern "C" on declarations allows people who >> forget >> +// to define these functions with extern "C" to still have them work. >> +extern "C" { >> +/** >> + * Callback to initialize the plugin. >> + * >> + * \note This is the only function whose ABI compatibility is guaranteed to >> + * remain stable. All plugins should verify that the version passed >> into >> + * this function is the same version that they were compiled with and >> + * abort if this is not the case. To detect the version compiled with, >> + * include clang/Basic/Version.h and check CLANG_VERSION_MAJOR and >> + * CLANG_VERSION_MINOR. >> + * >> + * \arg argc The number of arguments in the argv array >> + * \arg argv An array of the arguments passed to this plugin from the >> + * command line. >> + * \arg version The version of the compiler being called. >> + * \return Whether or not the plugin initialization succeeded. If >> initialization >> + * fails, the compilation process is terminated. >> + */ >> +extern bool clang_plugin_init(int argc, clang::plugin::PluginArg *argv, >> + clang::plugin::Version *version); > Suggestion: replace argc/argv with ArrayRef. > > Very nice doc comment. Some coding standards issues here, to match up with > the rest of LLVM: > > - We prefer C++-style comments even for Doxygen blocks, even for extern "C" > blocks (unless the entire file is C-compatible): /// > > - Anything that's a noun (types, variables, and parameters) should start with > an uppercase letter. The exceptions are types that look like standard library > types. Functions and methods should start with lowercase letters. This > convention is fairly new and so there are large chunks of the codebase that > don't look like this, but we're trying to be consistent going forwards. > > - Parameter lists that are too long should line up under the open paren. > > Most of these are already in http://llvm.org/docs/CodingStandards.html; we > should probably update the indentation bit as well. > > I'm on the fence about whether or not the namespaces should be included. On > the one hand, we don't explicitly write "clang" /anywhere/ in our header > files (exaggerating but only slightly). On the other, this is going to be > referenced by plugin writers who might not know to write "using namespace > clang". And probably won't be "using namespace clang::plugin". In order to make the 'work even if you forget extern "C"' function, I need to put this is in the global namespace. I originally had these in the clang namespace, at which point i made plugin::PluginArg and plugin::Version. > >> +/** >> + * Callback that is called before a file is compiled. >> + * >> + * Before this function is called, the callbacks object will have all of its >> + * pointers set to null. All non-null callbacks will be deleted by the >> compiler >> + * internally, so one should always return a new pointer for every file. >> + * >> + * \arg fileName The name of the function being called. >> + * \arg CI The compiler instance, which contains information about >> how >> + * the file is being compiled and other utility classes. >> + * \arg callbacks The pointer to the callbacks that this plugin will define >> for >> + * this file. >> + */ >> +extern void clang_plugin_begin_file(llvm::StringRef fileName, >> + const clang::CompilerInstance &CI, >> + clang::plugin::PluginFileCallbacks *callbacks); > Since the 'callbacks' argument is assured to be present, it should be a > reference. (This actually applies to all plugin entry points.) > > I know I for one feel a little squeamish about putting reference types in > extern "C" functions, but there's nothing in the standard that prohibits it, > and you already have "const CompilerInstance &CI". > > Finally, I would suggest changing "file" to "compilation" or "translation > unit", since this isn't called for includes and such. clang_plugin_begin_file is already getting a bit unwieldy for a function name, so maybe clang_plugin_begin_tu would work better? >> +/** >> + * Callback that is called after a file has been compiled. >> + * >> + * \arg fileName The name of the file that was just compiled. >> + */ >> +extern void clang_plugin_end_file(llvm::StringRef fileName); > Does this need the file name? It should always be the same as the matching > clang_plugin_begin_file. I honestly can't remember why I added the fileName argument here... > >> diff --git a/include/clang/Driver/Options.td >> b/include/clang/Driver/Options.td >> --- a/include/clang/Driver/Options.td >> +++ b/include/clang/Driver/Options.td >> @@ -586,16 +586,18 @@ def fpack_struct_EQ : Joined<"-fpack-str >> HelpText<"Specify the default maximum struct packing alignment">; >> def fpascal_strings : Flag<"-fpascal-strings">, Group<f_Group>, >> Flags<[CC1Option]>, >> HelpText<"Recognize and construct Pascal-style string literals">; >> def fpch_preprocess : Flag<"-fpch-preprocess">, Group<f_Group>; >> def fpic : Flag<"-fpic">, Group<f_Group>; >> def fno_pic : Flag<"-fno-pic">, Group<f_Group>; >> def fpie : Flag<"-fpie">, Group<f_Group>; >> def fno_pie : Flag<"-fno-pie">, Group<f_Group>; >> +def fplugin : Joined<"-fplugin=">, Group<f_Group>, Flags<[CC1Option]>; >> +def fplugin_arg : Joined<"-fplugin-arg-">, Group<f_Group>, >> Flags<[CC1Option]>; > Take a look at the old plugin and plugin_arg flags in CC1Options.td for > examples on how to make these look nicer. (In particular, fplugin should > probably be fplugin_EQ, and fplugin_arg can be more descriptive.) Also, > you'll have to decide whether we want to accept a split "-fplugin Plugin". The argument handling here is pretty much cargo-culted from plugin/plugin_arg stuff, although I originally wrote this patch a fair amount of time ago, so some things may have changed without me realizing it. >> --- a/include/clang/Frontend/FrontendOptions.h >> +++ b/include/clang/Frontend/FrontendOptions.h >> @@ -160,16 +160,23 @@ public: >> std::vector<std::string> AddPluginActions; >> >> /// Args to pass to the additional plugins >> std::vector<std::vector<std::string> > AddPluginArgs; > Presumably these are deprecated? And would go away eventually? My intent is to kill the old plugin architecture ASAP, and it sounds like I have some agreement with dgregor in this regard, so yes. >> /// The list of plugins to load. >> std::vector<std::string> Plugins; > But this probably has to stay, since it includes LLVM plugins. I haven't tested this (yet), but it should still be possible to use LLVM plugins with the -fplugin= structure, since static initializers will still run. >> + llvm::sys::DynamicLibrary PluginLibrary; >> + clang::plugin::PluginFileCallbacks callbacks; > Here you definitely should not include the 'clang' namespace. > plugin::PluginFileCallbacks is fine. I'd even be okay with "using > llvm::sys::DynamicLibrary". I wasn't sure if the style desired was to always do the full namespace or do partial namespaces. >> + intptr_t getFunctionCall(const char *name) { >> + return (intptr_t)PluginLibrary.getAddressOfSymbol(name); >> + } > Why is this an intptr_t? That's not really any better than void * for casting > to and from function pointers. If we're on a platform where function pointers > are bigger than object pointers, getAddressOfSymbol is already going to break. IIRC, going from void* to void(*)() causes some clang warning. In interest of shutting up warnings, a round-trip through intptr_t stops it from warning. > > >> +/// Helper function to call a given function in a plugin. >> +#define CALL_PLUGIN_FUNCTION(plugin, fname, args) \ >> + do { \ >> + Plugin::fname func_ = >> (Plugin::fname)(plugin)->getFunctionCall(#fname); \ >> + if (func_) \ >> + func_ args; \ >> + } while (0) > Technically not a function. :-) More importantly, please show example usage; > this macro is not at all straightfoward. > > Also, I would like if the macro took a reference rather than a pointer, but > that's just me. (That way it could in theory be called for plugin objects on > the stack...but that probably never happens, huh.) I expect that anyone who would use this would call it from an iterator, where "acting" like a pointer is better. >> + // These typedefs must have the same signatures as the ones in >> + // clang/Basic/Plugin.h, including the same name (it makes the helper >> macro >> + // above work properly). >> + typedef void (*clang_plugin_begin_file)(StringRef, const >> CompilerInstance &, >> + clang::plugin::PluginFileCallbacks*); >> + typedef void (*clang_plugin_end_file)(StringRef); >> + typedef void (*clang_plugin_destroy)(); >> + }; > Why no init? Ah, the boolean flag. init gets called before this object is created. >> + public: >> + PluginWrapperAction(FrontendAction *WrappedAction, >> + std::vector<Plugin> &plugins) > ArrayRef, since it's free here. I probably wrote this before llvm got ArrayRef. > > >> + for (std::vector<Plugin>::iterator it = plugins.begin(); it != >> plugins.end(); >> + ++it) { >> + // Retrieve the callbacks, and clear all of the values to NULL >> + clang::plugin::PluginFileCallbacks *callbacks = it->getCallbacks(); >> + memset(callbacks, 0, sizeof(clang::plugin::PluginFileCallbacks)); > This is silly. Plugin should just default-initialize the struct when it's > created. (Empty parens will do the right thing.) Some notes: 1. This works even if a plugin doesn't define the method (e.g., a plugin just wants to do some LLVM hooking) 2. This has some footgun-prevention effects for plugin writers > > >> + // Get the callbacks from the plugin >> + CALL_PLUGIN_FUNCTION(it, clang_plugin_begin_file, (Filename, CI, >> callbacks)); >> + >> + // ASTConsumer is handled by GetASTConsumer; others are handled now >> + if (callbacks->ppCallback) >> + CI.getPreprocessor().addPPCallbacks(callbacks->ppCallback); >> + if (callbacks->diagnostics) >> + { >> + // The front end has already run this, so let's do this now. >> + callbacks->diagnostics->BeginSourceFile(CI.getLangOpts(), >> + &CI.getPreprocessor()); >> + CI.getDiagnostics().setClient(new ChainedDiagnosticConsumer( >> + &CI.getDiagnosticClient(), callbacks->diagnostics)); >> + } >> + } > If we initialize plugins before calling the underlying action, is this > necessary? This is necessary if we do plugins via an ASTAction-wrapper approach. If plugins get attached to a CompilerInvocation or CompilerInstance instead, it wouldn't be necessary. >> + return true; > No way for plugins to signal an error? This doesn't really signal an error, it just tells the compiler to not process the file. Real errors probably need to go through the DiagnosticsEngine instead. > > >> + >> +ASTConsumer *PluginWrapperAction::CreateASTConsumer(CompilerInstance &CI, >> + StringRef InFile) { >> + std::vector<ASTConsumer*> Consumers(1, >> + WrapperFrontendAction::CreateASTConsumer(CI, InFile)); >> + for (std::vector<Plugin>::iterator it = plugins.begin(); it != >> plugins.end(); >> + ++it) { >> + if (ASTConsumer *consumer = it->getCallbacks()->astConsumer) { >> + Consumers.push_back(consumer); >> + } >> + } >> + return new MultiplexConsumer(Consumers); >> +} > TinyPtrVector or SmallVector? Most of the time we will not have plugins, and > even when we do there are probably very few. > > (This is not a performance-sensitive piece of code, I know, but we might as > well get it right.) > > >> + // Construct the arguments >> + std::vector<std::pair<std::string, std::string> > &pluginArgs = >> + Clang->getFrontendOpts().CompilerPluginArgs[i]; > ArrayRef...? > >> + plugin::PluginArg *args = new plugin::PluginArg[pluginArgs.size()]; > SmallVector<PluginArg, 4> makes sense here -- it's a vector that has a > certain number of slots allocated on the stack, but switches to acting like > std::vector if it needs to grow. Also, no forgetting to delete[] if this gets > refactored some day. > > And finally, if you do decide to allow plugins to mess with the command line > options, this will have to happen sooner. :-) To me, the basic concept of plugins is that they are things which attach to the normal compilation process, so that saying CXX="clang++ -fplugin=/path/to/plugin.so" wouldn't cause any build to fail (modulo the plugin adding in strict warnings/errors, of course) that would have succeeded with clang in the first place. In that vein, letting plugins mess with command line options is not the right thing to do. Now granted, there are probably specific things normally associated with the command line that are likely desirable (adding libraries to link and messing with LLVM parameters both come to mind), but I suspect that most of those desires could be satisfied with more explicit and less powerful hooks. [1] ARRGGH. Diagnostics has got to be the word I have the least chance of spelling correctly on my first try :-( -- Joshua Cranmer News submodule owner DXR coauthor _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
