vedgy marked 2 inline comments as done. vedgy added inline comments.
================ Comment at: clang/include/clang-c/Index.h:329 + * CXIndexOptions Opts = { sizeof(CXIndexOptions), + * clang_getDefaultGlobalOptions() }; + * \endcode ---------------- When I almost finished the requested changes, I remembered that the return value of `clang_getDefaultGlobalOptions()` depends on environment variables, and thus `0` is not necessarily the default. Adjusted the changes and updated this revision. Does the extra requirement to non-zero initialize this second member sway your opinion on the usefulness of the helper function `inline CXIndexOptions clang_getDefaultIndexOptions()`? Note that there may be same (environment) or other important reasons why future new options couldn't be zeroes by default. ================ Comment at: clang/tools/c-index-test/c-index-test.c:79 + Opts.PreambleStoragePath = NULL; + Opts.InvocationEmissionPath = getenv("CINDEXTEST_INVOCATION_EMISSION_PATH"); + ---------------- aaron.ballman wrote: > vedgy wrote: > > aaron.ballman wrote: > > > vedgy wrote: > > > > aaron.ballman wrote: > > > > > vedgy wrote: > > > > > > When a libclang user needs to override a single option in > > > > > > `CXIndexOptions`, [s]he has to set every member of the struct > > > > > > explicitly. When new options are added, each libclang user needs to > > > > > > update the code that sets the options under `CINDEX_VERSION_MINOR` > > > > > > `#if`s. Accidentally omitting even one member assignment risks > > > > > > undefined or wrong behavior. How about adding an `inline` helper > > > > > > function `CXIndexOptions clang_getDefaultIndexOptions()`, which > > > > > > assigns default values to all struct members? Libclang users can > > > > > > then call this function to obtain the default configuration, then > > > > > > tweak only the members they want to override. > > > > > > > > > > > > If this suggestion is to be implemented, how to deal with > > > > > > [[https://stackoverflow.com/questions/68004269/differences-of-the-inline-keyword-in-c-and-c|the > > > > > > differences of the inline keyword in C and C++]]? > > > > > By default, `0` should give you the most reasonable default behavior > > > > > for most of the existing options (and new options should follow the > > > > > same pattern). Ideally, users should be able to do: > > > > > ``` > > > > > CXIndexOptions Opts; > > > > > memset(&Opts, 0, sizeof(Opts)); > > > > > Opts.Size = sizeof(Opts); > > > > > Opts.Whatever = 12; > > > > > CXIndex Idx = clang_createIndexWithOptions(&Opts); > > > > > ``` > > > > > Global options defaulting to 0 is fine (uses regular thread > > > > > priorities), we don't think want to default to excluding declarations > > > > > from PCH, and we want to use the default preamble and invocation > > > > > emission paths (if any). The only option that nonzero as a default > > > > > *might* make sense for is displaying diagnostics, but even that seems > > > > > reasonable to expect the developer to manually enable. > > > > > > > > > > So I don't know that we need a function to get us default indexing > > > > > options as `0` should be a reasonable default for all of the options. > > > > > As we add new options, we need to be careful to add them in backwards > > > > > compatible ways where `0` means "do the most likely thing". > > > > > > > > > > WDYT? > > > > The disadvantages of committing to defaulting to `0`: > > > > 1. The usage you propose is still more verbose and error-prone than > > > > ``` > > > > CXIndexOptions Opts = clang_getDefaultIndexOptions(); > > > > Opts.Whatever = 12; > > > > CXIndex Idx = clang_createIndexWithOptions(&Opts); > > > > ``` > > > > 2. The `memset` would look very unclean in modern C++ code. > > > > 3. The `0` commitment may force unnatural naming of a future option to > > > > invert its meaning. > > > > > > > > The advantages: > > > > 1. No need to implement the inline function now. > > > > 2. Faster, but I am sure this performance difference doesn't matter. > > > > Even a non-inline function call itself (even if it did nothing) to > > > > `clang_createIndexWithOptions()` should take longer than assigning a > > > > few values to built-in-typed members. > > > > > > > > Another advantage of not having to remember to update the inline > > > > function's implementation when new options are added is counterbalanced > > > > by the need to be careful to add new options in backwards compatible > > > > way where `0` is the default. > > > > > > > > Any other advantages of the `0` that I miss? Maybe there are some > > > > advantages for C users, but I suspect most libclang users are C++. > > > >The usage you propose is still more verbose and error-prone than > > > > ``` > > > > CXIndexOptions Opts = clang_getDefaultIndexOptions(); > > > > Opts.Whatever = 12; > > > > CXIndex Idx = clang_createIndexWithOptions(&Opts); > > > > ``` > > > > > > I see it as being roughly the same amount of verbosity and proneness to > > > error, but maybe that's just me. > > > > > > > The memset would look very unclean in modern C++ code. > > > > > > I don't see this as a problem. 1) `std::memset` gets plenty of usage in > > > modern C++ still, 2) you can also initialize with ` = { > > > sizeof(CXIndexOptions) };` and rely on the zero init for subsequent > > > members after the first (personally, I think that's too clever, but > > > reasonable people may disagree), 3) this is a C API, folks using C++ may > > > wish to wrap it in a better interface for C++ anyway. > > > > > > > The 0 commitment may force unnatural naming of a future option to > > > > invert its meaning. > > > > > > I do worry about this one, especially given there's no way to predict > > > what future options we'll need. But it also forces us to think about what > > > the correct default behavior should be, whereas if we push it off to a > > > helper function, we make it harder for everyone who didn't know to use > > > that helper function for whatever reason. > > > > > > > Any other advantages of the 0 that I miss? Maybe there are some > > > > advantages for C users, but I suspect most libclang users are C++. > > > > > > Familiarity for those who are used to dealing with existing C APIs that > > > behave this way (like Win32 APIs), but that can probably be argued either > > > way. My guess is that all libclang users are having to work through the C > > > interface, and some decently large amount of them do so from languages > > > other than C. But whether that means most are C++ users or not is > > > questionable -- I know I've seen uses from Python and Java as well as C++. > > > > > > I'm not strongly opposed to having an API to get the default values, but > > > I don't see a lot of value in it either. It's something we could always > > > add later if we find there's a need for it in practice. > > > I see it as being roughly the same amount of verbosity and proneness to > > > error, but maybe that's just me. > > Calling the default-value function is one requirement. Assigning `sizeof` > > and calling `memset` - 2. Plus the user has to pass the correct arguments > > to `memset`. > > > > > 1) `std::memset` gets plenty of usage in modern C++ still > > Yes, there is usage, but modern C++ guidelines recommend replacing such > > usages with standard algorithms whenever possible. `memset`-ting an object > > generally risks undefined behavior. > > > > > whereas if we push it off to a helper function, we make it harder for > > > everyone who didn't know to use that helper function for whatever reason. > > If a user doesn't read the documentation, [s]he would likely not guess that > > `memset` needs to be called either. So I don't see a disadvantage of the > > helper function here. > > > > > It's something we could always add later if we find there's a need for it > > > in practice. > > We could add it later. But if we advise using `memset` in the first > > version, introducing a struct member with non-zero default value would risk > > breaking backward compatibility for users that don't track libclang API > > changes closely. It would also force all users that **do** notice the > > change to update their code with `#if CINDEX_VERSION_MINOR >= XY` to use > > either `memset` when compiling against libclang versions that don't yet > > have the helper function, or the helper function. > > > > Introducing the helper function from the beginning affords more flexibility > > and space for future libclang API changes with less risk of breaking > > backward compatibility. If someone decides to ignore the helper function > > and `memset` the struct instead, breaking such usage should not be an > > obstacle/concern to future libclang changes. > >> I see it as being roughly the same amount of verbosity and proneness to > >> error, but maybe that's just me. > > Calling the default-value function is one requirement. Assigning sizeof and > > calling memset - 2. Plus the user has to pass the correct arguments to > > memset. > > Both of which can be done in a single one-liner if brevity is that important > for your coding style. > > > Yes, there is usage, but modern C++ guidelines recommend replacing such > > usages with standard algorithms whenever possible. memset-ting an object > > generally risks undefined behavior. > > I don't see how that can be much of a real risk in a C interface. > > > If a user doesn't read the documentation, [s]he would likely not guess that > > memset needs to be called either. So I don't see a disadvantage of the > > helper function here. > > Agreed. > > > We could add it later. But if we advise using memset in the first version, > > introducing a struct member with non-zero default value would risk breaking > > backward compatibility ... > > My recommendation is that we don't add the API now and we leave a comment in > the options struct reminding folks about the importance of new options being > expressed such that `0` gives the correct default behavior. As you say, it is > a risk that we break the model in the future, but it seems like a somewhat > low risk. > > Added and modified documentation, adapted usage in tests. ================ Comment at: clang/tools/libclang/CIndex.cpp:3701-3702 +CXIndex clang_createIndexWithOptions(const CXIndexOptions *options) { + if (options->Size != sizeof(CXIndexOptions)) + return NULL; + CIndexer *CIdxr = clang_createIndex_Impl(options->ExcludeDeclarationsFromPCH, ---------------- vedgy wrote: > aaron.ballman wrote: > > I think we want this to be `>` and not `!=`, maybe. > > > > If the sizes are equal, we're on the happy path. > > > > If the options from the caller are smaller than the options we know about, > > that should be okay because we won't attempt read the options not provided > > and instead rely on the default behavior being reasonable. > > > > If the options from the caller are larger than the options we know about, > > we have to assume the user set an option we can't handle, and thus failing > > the request is reasonable. > > > > So the way I'm envisioning us reading the options is: > > ``` > > if (options->Size >= offsetof(CXIndexOptions, FieldWeCareAbout)) > > do_something(options->FieldWeCareAbout); > > ``` > You are probably right looking forward. > > I have opted for `!=` because this is the initial struct version. So right > now a smaller struct size means something very unexpected or a failure to > assign correct size on the user side. The advantage of `!=` is then more > reliable catching of bugs in freshly written user code that uses the struct > and an added flexibility of being able to //reduce// the struct's size in the > next revision. Added your recommended way to add new options in comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D143418/new/ https://reviews.llvm.org/D143418 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits