aaron.ballman added inline comments.

================
Comment at: clang/tools/c-index-test/c-index-test.c:79
+    Opts.PreambleStoragePath = NULL;
+    Opts.InvocationEmissionPath = 
getenv("CINDEXTEST_INVOCATION_EMISSION_PATH");
+
----------------
vedgy wrote:
> aaron.ballman wrote:
> > vedgy wrote:
> > > aaron.ballman wrote:
> > > > vedgy wrote:
> > > > > When a libclang user needs to override a single option in 
> > > > > `CXIndexOptions`, [s]he has to set every member of the struct 
> > > > > explicitly. When new options are added, each libclang user needs to 
> > > > > update the code that sets the options under `CINDEX_VERSION_MINOR` 
> > > > > `#if`s. Accidentally omitting even one member assignment risks 
> > > > > undefined or wrong behavior. How about adding an `inline` helper 
> > > > > function `CXIndexOptions clang_getDefaultIndexOptions()`, which 
> > > > > assigns default values to all struct members? Libclang users can then 
> > > > > call this function to obtain the default configuration, then tweak 
> > > > > only the members they want to override.
> > > > > 
> > > > > If this suggestion is to be implemented, how to deal with 
> > > > > [[https://stackoverflow.com/questions/68004269/differences-of-the-inline-keyword-in-c-and-c|the
> > > > >  differences of the inline keyword in C and C++]]?
> > > > By default, `0` should give you the most reasonable default behavior 
> > > > for most of the existing options (and new options should follow the 
> > > > same pattern). Ideally, users should be able to do:
> > > > ```
> > > > CXIndexOptions Opts;
> > > > memset(&Opts, 0, sizeof(Opts));
> > > > Opts.Size = sizeof(Opts);
> > > > Opts.Whatever = 12;
> > > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > > ```
> > > > Global options defaulting to 0 is fine (uses regular thread 
> > > > priorities), we don't think want to default to excluding declarations 
> > > > from PCH, and we want to use the default preamble and invocation 
> > > > emission paths (if any). The only option that nonzero as a default 
> > > > *might* make sense for is displaying diagnostics, but even that seems 
> > > > reasonable to expect the developer to manually enable.
> > > > 
> > > > So I don't know that we need a function to get us default indexing 
> > > > options as `0` should be a reasonable default for all of the options. 
> > > > As we add new options, we need to be careful to add them in backwards 
> > > > compatible ways where `0` means "do the most likely thing".
> > > > 
> > > > WDYT?
> > > The disadvantages of committing to defaulting to `0`:
> > > 1. The usage you propose is still more verbose and error-prone than
> > > ```
> > > CXIndexOptions Opts = clang_getDefaultIndexOptions();
> > > Opts.Whatever = 12;
> > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > ```
> > > 2. The `memset` would look very unclean in modern C++ code.
> > > 3. The `0` commitment may force unnatural naming of a future option to 
> > > invert its meaning.
> > > 
> > > The advantages:
> > > 1. No need to implement the inline function now.
> > > 2. Faster, but I am sure this performance difference doesn't matter. Even 
> > > a non-inline function call itself (even if it did nothing) to 
> > > `clang_createIndexWithOptions()` should take longer than assigning a few 
> > > values to built-in-typed members.
> > > 
> > > Another advantage of not having to remember to update the inline 
> > > function's implementation when new options are added is counterbalanced 
> > > by the need to be careful to add new options in backwards compatible way 
> > > where `0` is the default.
> > > 
> > > Any other advantages of the `0` that I miss? Maybe there are some 
> > > advantages for C users, but I suspect most libclang users are C++.
> > >The usage you propose is still more verbose and error-prone than
> > > ```
> > > CXIndexOptions Opts = clang_getDefaultIndexOptions();
> > > Opts.Whatever = 12;
> > > CXIndex Idx = clang_createIndexWithOptions(&Opts);
> > > ```
> > 
> > I see it as being roughly the same amount of verbosity and proneness to 
> > error, but maybe that's just me.
> > 
> > > The memset would look very unclean in modern C++ code.
> > 
> > I don't see this as a problem. 1) `std::memset` gets plenty of usage in 
> > modern C++ still, 2) you can also initialize with ` = { 
> > sizeof(CXIndexOptions) };` and rely on the zero init for subsequent members 
> > after the first (personally, I think that's too clever, but reasonable 
> > people may disagree), 3) this is a C API, folks using C++ may wish to wrap 
> > it in a better interface for C++ anyway.
> >  
> > > The 0 commitment may force unnatural naming of a future option to invert 
> > > its meaning.
> > 
> > I do worry about this one, especially given there's no way to predict what 
> > future options we'll need. But it also forces us to think about what the 
> > correct default behavior should be, whereas if we push it off to a helper 
> > function, we make it harder for everyone who didn't know to use that helper 
> > function for whatever reason.
> > 
> > > Any other advantages of the 0 that I miss? Maybe there are some 
> > > advantages for C users, but I suspect most libclang users are C++.
> > 
> > Familiarity for those who are used to dealing with existing C APIs that 
> > behave this way (like Win32 APIs), but that can probably be argued either 
> > way. My guess is that all libclang users are having to work through the C 
> > interface, and some decently large amount of them do so from languages 
> > other than C. But whether that means most are C++ users or not is 
> > questionable -- I know I've seen uses from Python and Java as well as C++.
> > 
> > I'm not strongly opposed to having an API to get the default values, but I 
> > don't see a lot of value in it either. It's something we could always add 
> > later if we find there's a need for it in practice.
> > I see it as being roughly the same amount of verbosity and proneness to 
> > error, but maybe that's just me.
> Calling the default-value function is one requirement. Assigning `sizeof` and 
> calling `memset` - 2. Plus the user has to pass the correct arguments to 
> `memset`.
> 
> > 1) `std::memset` gets plenty of usage in modern C++ still
> Yes, there is usage, but modern C++ guidelines recommend replacing such 
> usages with standard algorithms whenever possible. `memset`-ting an object 
> generally risks undefined behavior.
> 
> > whereas if we push it off to a helper function, we make it harder for 
> > everyone who didn't know to use that helper function for whatever reason.
> If a user doesn't read the documentation, [s]he would likely not guess that 
> `memset` needs to be called either. So I don't see a disadvantage of the 
> helper function here.
> 
> > It's something we could always add later if we find there's a need for it 
> > in practice.
> We could add it later. But if we advise using `memset` in the first version, 
> introducing a struct member with non-zero default value would risk breaking 
> backward compatibility for users that don't track libclang API changes 
> closely. It would also force all users that **do** notice the change to 
> update their code with `#if CINDEX_VERSION_MINOR >= XY` to use either 
> `memset` when compiling against libclang versions that don't yet have the 
> helper function, or the helper function.
> 
> Introducing the helper function from the beginning affords more flexibility 
> and space for future libclang API changes with less risk of breaking backward 
> compatibility. If someone decides to ignore the helper function and `memset` 
> the struct instead, breaking such usage should not be an obstacle/concern to 
> future libclang changes.
>> I see it as being roughly the same amount of verbosity and proneness to 
>> error, but maybe that's just me.
> Calling the default-value function is one requirement. Assigning sizeof and 
> calling memset - 2. Plus the user has to pass the correct arguments to memset.

Both of which can be done in a single one-liner if brevity is that important 
for your coding style.

> Yes, there is usage, but modern C++ guidelines recommend replacing such 
> usages with standard algorithms whenever possible. memset-ting an object 
> generally risks undefined behavior.

I don't see how that can be much of a real risk in a C interface.

> If a user doesn't read the documentation, [s]he would likely not guess that 
> memset needs to be called either. So I don't see a disadvantage of the helper 
> function here.

Agreed.

> We could add it later. But if we advise using memset in the first version, 
> introducing a struct member with non-zero default value would risk breaking 
> backward compatibility ...

My recommendation is that we don't add the API now and we leave a comment in 
the options struct reminding folks about the importance of new options being 
expressed such that `0` gives the correct default behavior. As you say, it is a 
risk that we break the model in the future, but it seems like a somewhat low 
risk.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143418/new/

https://reviews.llvm.org/D143418

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to