vedgy marked 2 inline comments as done.
vedgy added inline comments.

================
Comment at: clang/include/clang-c/Index.h:329
+ * CXIndexOptions Opts = { sizeof(CXIndexOptions),
+ *                         clang_getDefaultGlobalOptions() };
+ * \endcode
----------------
When I almost finished the requested changes, I remembered that the return 
value of `clang_getDefaultGlobalOptions()` depends on environment variables, 
and thus `0` is not necessarily the default. Adjusted the changes and updated 
this revision.

Does the extra requirement to non-zero initialize this second member sway your 
opinion on the usefulness of the helper function `inline CXIndexOptions 
clang_getDefaultIndexOptions()`? Note that there may be same (environment) or 
other important reasons why future new options couldn't be zeroes by default.


================
Comment at: clang/tools/c-index-test/c-index-test.c:79
+    Opts.PreambleStoragePath = NULL;
+    Opts.InvocationEmissionPath = 
getenv("CINDEXTEST_INVOCATION_EMISSION_PATH");
+
----------------
aaron.ballman wrote:
> vedgy wrote:
> > aaron.ballman wrote:
> > > vedgy wrote:
> > > > aaron.ballman wrote:
> > > > > vedgy wrote:
> > > > > > When a libclang user needs to override a single option in 
> > > > > > `CXIndexOptions`, [s]he has to set every member of the struct 
> > > > > > explicitly. When new options are added, each libclang user needs to 
> > > > > > update the code that sets the options under `CINDEX_VERSION_MINOR` 
> > > > > > `#if`s. Accidentally omitting even one member assignment risks 
> > > > > > undefined or wrong behavior. How about adding an `inline` helper 
> > > > > > function `CXIndexOptions clang_getDefaultIndexOptions()`, which 
> > > > > > assigns default values to all struct members? Libclang users can 
> > > > > > then call this function to obtain the default configuration, then 
> > > > > > tweak only the members they want to override.
> > > > > > 
> > > > > > If this suggestion is to be implemented, how to deal with 
> > > > > > [[https://stackoverflow.com/questions/68004269/differences-of-the-inline-keyword-in-c-and-c|the
> > > > > >  differences of the inline keyword in C and C++]]?
> > > > > By default, `0` should give you the most reasonable default behavior 
> > > > > for most of the existing options (and new options should follow the 
> > > > > same pattern). Ideally, users should be able to do:
> > > > > ```
> > > > > CXIndexOptions Opts;
> > > > > memset(&Opts, 0, sizeof(Opts));
> > > > > Opts.Size = sizeof(Opts);
> > > > > Opts.Whatever = 12;
> > > > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > > > ```
> > > > > Global options defaulting to 0 is fine (uses regular thread 
> > > > > priorities), we don't think want to default to excluding declarations 
> > > > > from PCH, and we want to use the default preamble and invocation 
> > > > > emission paths (if any). The only option that nonzero as a default 
> > > > > *might* make sense for is displaying diagnostics, but even that seems 
> > > > > reasonable to expect the developer to manually enable.
> > > > > 
> > > > > So I don't know that we need a function to get us default indexing 
> > > > > options as `0` should be a reasonable default for all of the options. 
> > > > > As we add new options, we need to be careful to add them in backwards 
> > > > > compatible ways where `0` means "do the most likely thing".
> > > > > 
> > > > > WDYT?
> > > > The disadvantages of committing to defaulting to `0`:
> > > > 1. The usage you propose is still more verbose and error-prone than
> > > > ```
> > > > CXIndexOptions Opts = clang_getDefaultIndexOptions();
> > > > Opts.Whatever = 12;
> > > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > > ```
> > > > 2. The `memset` would look very unclean in modern C++ code.
> > > > 3. The `0` commitment may force unnatural naming of a future option to 
> > > > invert its meaning.
> > > > 
> > > > The advantages:
> > > > 1. No need to implement the inline function now.
> > > > 2. Faster, but I am sure this performance difference doesn't matter. 
> > > > Even a non-inline function call itself (even if it did nothing) to 
> > > > `clang_createIndexWithOptions()` should take longer than assigning a 
> > > > few values to built-in-typed members.
> > > > 
> > > > Another advantage of not having to remember to update the inline 
> > > > function's implementation when new options are added is counterbalanced 
> > > > by the need to be careful to add new options in backwards compatible 
> > > > way where `0` is the default.
> > > > 
> > > > Any other advantages of the `0` that I miss? Maybe there are some 
> > > > advantages for C users, but I suspect most libclang users are C++.
> > > >The usage you propose is still more verbose and error-prone than
> > > > ```
> > > > CXIndexOptions Opts = clang_getDefaultIndexOptions();
> > > > Opts.Whatever = 12;
> > > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > > ```
> > > 
> > > I see it as being roughly the same amount of verbosity and proneness to 
> > > error, but maybe that's just me.
> > > 
> > > > The memset would look very unclean in modern C++ code.
> > > 
> > > I don't see this as a problem. 1) `std::memset` gets plenty of usage in 
> > > modern C++ still, 2) you can also initialize with ` = { 
> > > sizeof(CXIndexOptions) };` and rely on the zero init for subsequent 
> > > members after the first (personally, I think that's too clever, but 
> > > reasonable people may disagree), 3) this is a C API, folks using C++ may 
> > > wish to wrap it in a better interface for C++ anyway.
> > >  
> > > > The 0 commitment may force unnatural naming of a future option to 
> > > > invert its meaning.
> > > 
> > > I do worry about this one, especially given there's no way to predict 
> > > what future options we'll need. But it also forces us to think about what 
> > > the correct default behavior should be, whereas if we push it off to a 
> > > helper function, we make it harder for everyone who didn't know to use 
> > > that helper function for whatever reason.
> > > 
> > > > Any other advantages of the 0 that I miss? Maybe there are some 
> > > > advantages for C users, but I suspect most libclang users are C++.
> > > 
> > > Familiarity for those who are used to dealing with existing C APIs that 
> > > behave this way (like Win32 APIs), but that can probably be argued either 
> > > way. My guess is that all libclang users are having to work through the C 
> > > interface, and some decently large amount of them do so from languages 
> > > other than C. But whether that means most are C++ users or not is 
> > > questionable -- I know I've seen uses from Python and Java as well as C++.
> > > 
> > > I'm not strongly opposed to having an API to get the default values, but 
> > > I don't see a lot of value in it either. It's something we could always 
> > > add later if we find there's a need for it in practice.
> > > I see it as being roughly the same amount of verbosity and proneness to 
> > > error, but maybe that's just me.
> > Calling the default-value function is one requirement. Assigning `sizeof` 
> > and calling `memset` - 2. Plus the user has to pass the correct arguments 
> > to `memset`.
> > 
> > > 1) `std::memset` gets plenty of usage in modern C++ still
> > Yes, there is usage, but modern C++ guidelines recommend replacing such 
> > usages with standard algorithms whenever possible. `memset`-ting an object 
> > generally risks undefined behavior.
> > 
> > > whereas if we push it off to a helper function, we make it harder for 
> > > everyone who didn't know to use that helper function for whatever reason.
> > If a user doesn't read the documentation, [s]he would likely not guess that 
> > `memset` needs to be called either. So I don't see a disadvantage of the 
> > helper function here.
> > 
> > > It's something we could always add later if we find there's a need for it 
> > > in practice.
> > We could add it later. But if we advise using `memset` in the first 
> > version, introducing a struct member with non-zero default value would risk 
> > breaking backward compatibility for users that don't track libclang API 
> > changes closely. It would also force all users that **do** notice the 
> > change to update their code with `#if CINDEX_VERSION_MINOR >= XY` to use 
> > either `memset` when compiling against libclang versions that don't yet 
> > have the helper function, or the helper function.
> > 
> > Introducing the helper function from the beginning affords more flexibility 
> > and space for future libclang API changes with less risk of breaking 
> > backward compatibility. If someone decides to ignore the helper function 
> > and `memset` the struct instead, breaking such usage should not be an 
> > obstacle/concern to future libclang changes.
> >> I see it as being roughly the same amount of verbosity and proneness to 
> >> error, but maybe that's just me.
> > Calling the default-value function is one requirement. Assigning sizeof and 
> > calling memset - 2. Plus the user has to pass the correct arguments to 
> > memset.
> 
> Both of which can be done in a single one-liner if brevity is that important 
> for your coding style.
> 
> > Yes, there is usage, but modern C++ guidelines recommend replacing such 
> > usages with standard algorithms whenever possible. memset-ting an object 
> > generally risks undefined behavior.
> 
> I don't see how that can be much of a real risk in a C interface.
> 
> > If a user doesn't read the documentation, [s]he would likely not guess that 
> > memset needs to be called either. So I don't see a disadvantage of the 
> > helper function here.
> 
> Agreed.
> 
> > We could add it later. But if we advise using memset in the first version, 
> > introducing a struct member with non-zero default value would risk breaking 
> > backward compatibility ...
> 
> My recommendation is that we don't add the API now and we leave a comment in 
> the options struct reminding folks about the importance of new options being 
> expressed such that `0` gives the correct default behavior. As you say, it is 
> a risk that we break the model in the future, but it seems like a somewhat 
> low risk.
> 
> 
Added and modified documentation, adapted usage in tests.


================
Comment at: clang/tools/libclang/CIndex.cpp:3701-3702
+CXIndex clang_createIndexWithOptions(const CXIndexOptions *options) {
+  if (options->Size != sizeof(CXIndexOptions))
+    return NULL;
+  CIndexer *CIdxr = clang_createIndex_Impl(options->ExcludeDeclarationsFromPCH,
----------------
vedgy wrote:
> aaron.ballman wrote:
> > I think we want this to be `>` and not `!=`, maybe.
> > 
> > If the sizes are equal, we're on the happy path.
> > 
> > If the options from the caller are smaller than the options we know about, 
> > that should be okay because we won't attempt read the options not provided 
> > and instead rely on the default behavior being reasonable.
> > 
> > If the options from the caller are larger than the options we know about, 
> > we have to assume the user set an option we can't handle, and thus failing 
> > the request is reasonable.
> > 
> > So the way I'm envisioning us reading the options is:
> > ```
> > if (options->Size >= offsetof(CXIndexOptions, FieldWeCareAbout))
> >   do_something(options->FieldWeCareAbout);
> > ```
> You are probably right looking forward.
> 
> I have opted for `!=` because this is the initial struct version. So right 
> now a smaller struct size means something very unexpected or a failure to 
> assign correct size on the user side. The advantage of `!=` is then more 
> reliable catching of bugs in freshly written user code that uses the struct 
> and an added flexibility of being able to //reduce// the struct's size in the 
> next revision.
Added your recommended way to add new options in comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143418/new/

https://reviews.llvm.org/D143418

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to