aaron.ballman added inline comments.
================ Comment at: clang/include/clang/AST/Stmt.h:596-597 - // These don't need to be particularly wide, because they're - // strictly limited by the forms of expressions we permit. - unsigned NumSubExprs : 8; - unsigned ResultIndex : 32 - 8 - NumExprBits; + unsigned NumSubExprs : 16; + unsigned ResultIndex : 16; }; ---------------- yronglin wrote: > aaron.ballman wrote: > > dblaikie wrote: > > > aaron.ballman wrote: > > > > yronglin wrote: > > > > > yronglin wrote: > > > > > > dblaikie wrote: > > > > > > > yronglin wrote: > > > > > > > > dblaikie wrote: > > > > > > > > > aaron.ballman wrote: > > > > > > > > > > yronglin wrote: > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > aaron.ballman wrote: > > > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > > > Could/should we add some error checking in the > > > > > > > > > > > > > > > ctor to assert that we don't overflow these > > > > > > > > > > > > > > > longer values/just hit the bug later on? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (& could we use `unsigned short` here rather than > > > > > > > > > > > > > > > bitfields?) > > > > > > > > > > > > > > We've already got them packed in with other > > > > > > > > > > > > > > bit-fields from the expression bits, so I think > > > > > > > > > > > > > > it's reasonable to continue the pattern of using > > > > > > > > > > > > > > bit-fields (that way we don't accidentally end up > > > > > > > > > > > > > > with padding between the unnamed bits at the start > > > > > > > > > > > > > > and the named bits in this object). > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think adding some assertions would not be a bad > > > > > > > > > > > > > > idea as a follow-up. > > > > > > > > > > > > > Maybe some unconditional (rather than only in asserts > > > > > > > > > > > > > builds) error handling? (report_fatal_error, if this > > > > > > > > > > > > > is low priority enough to not have an elegant failure > > > > > > > > > > > > > mode, but something where we don't just overflow and > > > > > > > > > > > > > carry on would be good... ) > > > > > > > > > > > > Ping on this? I worry this code has just punted the > > > > > > > > > > > > same bug further down, but not plugged the hole/ensured > > > > > > > > > > > > we don't overflow on novel/larger inputs. > > > > > > > > > > > Sorry for the late reply, I was looking through the > > > > > > > > > > > emails and found this. I agree add some assertions to > > > > > > > > > > > check the value is a good idea, It's easy to help people > > > > > > > > > > > catch bugs, at least with when > > > > > > > > > > > `-DLLVM_ENABLE_ASSERTIONS=ON`, and I'm glad to work on > > > > > > > > > > > it, but one thing that worries me is that, in ASTReader, > > > > > > > > > > > we access this field directly, not through the > > > > > > > > > > > constructor or accessor, and we have to add assertions > > > > > > > > > > > everywhere. > > > > > > > > > > > https://github.com/llvm/llvm-project/blob/05b4310c8aec7a050574277ced08a0ab86b27681/clang/lib/Serialization/ASTReaderStmt.cpp#L1382 > > > > > > > > > > I don't think we have to add too many assertions. As best I > > > > > > > > > > can tell, we'll need one in each of the `PseudoObjectExpr` > > > > > > > > > > constructors and one in > > > > > > > > > > `ASTStmtReader::VisitPseudoObjectExpr()`, but those are the > > > > > > > > > > only places we assign a value into the bit-field. Three > > > > > > > > > > assertions isn't a lot, but if we're worried, we could add > > > > > > > > > > a setter method that does the assertion and use the setter > > > > > > > > > > in all three places. > > > > > > > > > My concern wasn't (well, wasn't entirely) about adding more > > > > > > > > > assertions - but about having a reliable error here. The > > > > > > > > > patch only makes the sizes larger, but doesn't have a > > > > > > > > > hard-stop in case those sizes are exceeded again (which, > > > > > > > > > admittedly, is much harder to do - maybe it's totally > > > > > > > > > unreachable now, for all practical purposes?) > > > > > > > > > > > > > > > > > > I suspect with more carefully constructed recursive inputs > > > > > > > > > could still reach the higher limit & I think it'd be good to > > > > > > > > > fail hard in that case in some way? (it's probably rare > > > > > > > > > enough that a report_fatal_error would be > > > > > > > > > not-the-worst-thing-ever) > > > > > > > > > > > > > > > > > > But good assertions would be nice too (the old code only > > > > > > > > > failed when you hit /exactly/ on just the overflow value, and > > > > > > > > > any more than that the wraparound would not crash/fail, but > > > > > > > > > misbehave) - I did add the necessary assertion to ArrayRef > > > > > > > > > (begin <= end) which would've helped detect this more > > > > > > > > > reliably, but some assert checking for overflow in the ctor > > > > > > > > > would be good too (with all the usual nuance/care in checking > > > > > > > > > for overflow) - unless we're going to make that into a fatal > > > > > > > > > or other real error. > > > > > > > > Sorry for the very late reply. I have no preference between > > > > > > > > assertion and `llvm_unreachable`, if error then fail fast is > > > > > > > > looks good. I have a patch D158296 to add assertion. > > > > > > > Thanks for the assertions - though they still haven't met my main > > > > > > > concern that this should have a hard failure even in a > > > > > > > non-assertions build. > > > > > > > > > > > > > > I know we don't have a perfect plan/policy for these sort of "run > > > > > > > out of resources/hit a representational limit" issues (at least I > > > > > > > don't think we do... do we, @aaron.ballman ? I know we have some > > > > > > > limits (recursion, template expansion, etc) but they're fairly > > > > > > > specific/aren't about every possible case of integer overflow in > > > > > > > some representational element, etc) but we've seen this one is > > > > > > > pretty reachable. > > > > > > > > > > > > > > Here's a test case that would still trigger the assertion, and > > > > > > > trigger UB in a non-assertions build: > > > > > > > ``` > > > > > > > truct t1 { }; > > > > > > > template<typename T1> > > > > > > > struct templ { > > > > > > > T1 v1; > > > > > > > T1 v2; > > > > > > > T1 v3; > > > > > > > T1 v4; > > > > > > > }; > > > > > > > > > > > > > > struct t2 { > > > > > > > templ<templ<templ<templ<templ<templ<t1>>>>>> c0; > > > > > > > templ<templ<templ<templ<templ<templ<t1>>>>>> c1; > > > > > > > templ<templ<templ<templ<templ<templ<t1>>>>>> c2; > > > > > > > }; > > > > > > > > > > > > > > void aj(...); > > > > > > > void f1(t2 w) { __builtin_dump_struct(&w, aj); } > > > > > > > ``` > > > > > > > (used templates to pack this a bit more densely than the original > > > > > > > test case) - the `sizeof` the struct is certainly a bit > > > > > > > outlandish (~12kbytes) bit not, I think, totally unreasonable? > > > > > > Thanks for your example. I have three ways: > > > > > > 1. use `llvm_unreachable` to emit a hard failure but not an > > > > > > assertion. > > > > > > 2. extend these two field to 32-bit unsigned, it's may big enough. > > > > > > 3. limit the functionality of `__builtin_dump_struct`, if there are > > > > > > too many fields in a struct, the part exceeding the limit will not > > > > > > be output, and replaced with `...`(maybe). > > > > > > > > > > > > WDYT? You guys are expert in clang, and I would like to wait for > > > > > > your guidance :) > > > > > We have ever limit the length of c-string value in > > > > > `__builtin_dump_struct`. > > > > > https://github.com/llvm/llvm-project/blob/5675f44cebf602931e884595c1d488bcedf6b8f2/clang/lib/Sema/SemaChecking.cpp#L524-L527 > > > > > I know we don't have a perfect plan/policy for these sort of "run out > > > > > of resources/hit a representational limit" issues (at least I don't > > > > > think we do... do we, @aaron.ballman ? I know we have some limits > > > > > (recursion, template expansion, etc) but they're fairly > > > > > specific/aren't about every possible case of integer overflow in some > > > > > representational element, etc) but we've seen this one is pretty > > > > > reachable. > > > > > > > > Correct, we don't have a general mechanism for handling resource > > > > limits, we mostly play whack-a-mole when we run into them. So having a > > > > general utility that can work for other bit-fields would be really > > > > nice. However, `__builtin_dump_struct` is a debugging interface and not > > > > something we expect users to call particularly often, so some sharp > > > > edges with the interface are not the end of the world IMO. > > > > > > > > > limit the functionality of __builtin_dump_struct, if there are too > > > > > many fields in a struct, the part exceeding the limit will not be > > > > > output, and replaced with ...(maybe). > > > > > > > > I think this is basically what @dblaikie was asking for -- if there are > > > > too many fields in the structure, either give a diagnostic that the > > > > structure is too complex for us to dump, or cut off the output > > > > somewhere sensible, etc. @dblaikie, do you have a preference for > > > > diagnostic vs prettier output? Ideally, I lean towards prettier output, > > > > but at the same time, that might be a bigger ask than what you were > > > > after. > > > > > > > Yeah, I don't have major/heroic requirements here - just that we don't > > > crash on these inputs. An error, wider integers, both, etc, it's all good > > > to me. > > > > > > just an error seems fine to me - as you say, it's basically a debugging > > > interface, people shouldn't be making this some loadbearing/totally > > > generic part of an API or somesuch. > > Thank you! I'm also fine with going with just an error. > I'm fine too! But I have some concerns, the BuiltinDumpStructGenerator create > call of print functions. but the call of print functions not only for field, > but also for `{` and '}', so I have no idea to describe the problem in the > diagnostic message, 'too many fields in struct' doesn't feel right to me. Do > you have any other good idea? I'd be happy to continue cook D158296 Hmm, I think I'd be fine with something along the lines of: `class|struct|union 'blah' is too complex to dump` and leave "too complex" vague (because this is a debugging interface and it involves some quite questionably-sized objects, I think it's fine for the QoI bar to be lower for this diagnostic). WDYT? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154784/new/ https://reviews.llvm.org/D154784 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits