dang added inline comments.

================
Comment at: clang/include/clang/Serialization/ASTBitCodes.h:396-400
       /// Record code for the signature that identifiers this AST file.
       SIGNATURE = 1,
 
+      /// Record code for the signature of the AST block.
+      AST_SIGNATURE,
----------------
dexonsmith wrote:
> dang wrote:
> > dang wrote:
> > > dexonsmith wrote:
> > > > These names and descriptions hard hard to differentiate. Is there 
> > > > another way of naming these that will be more clear?
> > > > 
> > > > (One idea I had is to create `CONTROL_BLOCK_HASH` and `AST_BLOCK_HASH` 
> > > > and then `SIGNATURE` could just be their hash-combine, but maybe you 
> > > > have another idea.)
> > > I kept the same hasher when computing both of these which mitigates the 
> > > cost. I don't see the need for also emitting a hash for the control 
> > > block, there are some optional records that are not in both the AST block 
> > > and the control block anyway.
> > I also think that the `AST_BLOCK_HASH` and the `SIGNATURE` are enough 
> > information already. In most cases you can deduce if the control block was 
> > different by just checking if the signatures were different and the ASTs 
> > the same.
> Thanks for updating to `AST_BLOCK_HASH`, I think that addressed the concern I 
> had.
> 
> However, having thought further about it, I don't think it makes sense to 
> create a new record type. I suggest just having a convention of writing the 
> hashes out in a particular order.
> - First record is the overall signature.
> - Second record is the AST block hash.
> - In the future, we can add a third record for the control block hash.
> 
> In which case, the (single) record name should be something generic like 
> `SIGNATURE` or `SHA1_HASH`.
> 
> Note: I doubt there would be additional cost to computing the AST and control 
> block hashes separately, but I agree there isn't a compelling use for the 
> control block hash currently (although it would be nice to confirm properties 
> like if the `CONTROL_BLOCK_HASH` were typically stable when sources changed).
> 
> Note: I don't think the content of the "hashed" control block vs. unhashed 
> control block have been carefully considered from first principles. At some 
> point we might do that. I expect there are currently things that change the 
> control block that would not invalidate it for importers.
I agree with the fact that it might be worth maintaining a control block hash, 
but I do think this is not the purpose of this change. I can do it as a follow 
up when I have some time. I am somewhat opposed to keeping them all in the same 
record for now, because it makes it harder to grep for one of the signatures in 
the output of `llvm-bcanalyzer` which is currently the only available tooling 
for looking inside the unhashed control block.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80383/new/

https://reviews.llvm.org/D80383



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to