dang added inline comments.
================ Comment at: clang/include/clang/Serialization/ASTBitCodes.h:396-400 /// Record code for the signature that identifiers this AST file. SIGNATURE = 1, + /// Record code for the signature of the AST block. + AST_SIGNATURE, ---------------- dexonsmith wrote: > dang wrote: > > dang wrote: > > > dexonsmith wrote: > > > > These names and descriptions hard hard to differentiate. Is there > > > > another way of naming these that will be more clear? > > > > > > > > (One idea I had is to create `CONTROL_BLOCK_HASH` and `AST_BLOCK_HASH` > > > > and then `SIGNATURE` could just be their hash-combine, but maybe you > > > > have another idea.) > > > I kept the same hasher when computing both of these which mitigates the > > > cost. I don't see the need for also emitting a hash for the control > > > block, there are some optional records that are not in both the AST block > > > and the control block anyway. > > I also think that the `AST_BLOCK_HASH` and the `SIGNATURE` are enough > > information already. In most cases you can deduce if the control block was > > different by just checking if the signatures were different and the ASTs > > the same. > Thanks for updating to `AST_BLOCK_HASH`, I think that addressed the concern I > had. > > However, having thought further about it, I don't think it makes sense to > create a new record type. I suggest just having a convention of writing the > hashes out in a particular order. > - First record is the overall signature. > - Second record is the AST block hash. > - In the future, we can add a third record for the control block hash. > > In which case, the (single) record name should be something generic like > `SIGNATURE` or `SHA1_HASH`. > > Note: I doubt there would be additional cost to computing the AST and control > block hashes separately, but I agree there isn't a compelling use for the > control block hash currently (although it would be nice to confirm properties > like if the `CONTROL_BLOCK_HASH` were typically stable when sources changed). > > Note: I don't think the content of the "hashed" control block vs. unhashed > control block have been carefully considered from first principles. At some > point we might do that. I expect there are currently things that change the > control block that would not invalidate it for importers. I agree with the fact that it might be worth maintaining a control block hash, but I do think this is not the purpose of this change. I can do it as a follow up when I have some time. I am somewhat opposed to keeping them all in the same record for now, because it makes it harder to grep for one of the signatures in the output of `llvm-bcanalyzer` which is currently the only available tooling for looking inside the unhashed control block. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80383/new/ https://reviews.llvm.org/D80383 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits