On Tue, 12 Mar 2024, Jakub Jelinek wrote: > On Tue, Mar 12, 2024 at 01:47:53PM +0100, Richard Biener wrote: > > > Admittedly the above is the ugliest part of the patch IMHO. > > > It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case > > > we have before ubsan pass > > > # _7(ab) = PHI <_20(9), _8(ab)(11)> > > > _22 = bar (*_7(ab)); > > > and want to instrument the *_7(ab) load among the parameters for object > > > size. > > > So, it wants to add > > > _2 = __builtin_dynamic_object_size (_7(ab), 0); > > > sequence and later: > > > .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0); > > > before the call insn. If it isn't a noreturn call, it would just > > > add those statements into the same basic block using gsi_insert*before. > > > But because it is a returns_twice call, it needs to be done in a different > > > block. And having all of ubsan/asan/bitintlower find that out first and > > > figure out that it should use _20 instead of _7(ab) seems hard, especially > > > that for cases like: > > > > I would have expected the result as > > > > # _7 = PHI <_20(9)> > > _2 = __builtin_dynamic_object_size (_7, 0); > > .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0); > > // fallthru > > > > # _99(ab) = PHI <_7, _8(ab)(11)> > > _22 = bar (*_99(ab)); > > > > where all uses of _7 in the IL before splitting get replaced via > > their immediate uses but the to-be inserted IL can continue using > > the defs (that requires the not inserted IL to not have update_stmt > > called on them, of course). > > > > I think split_block would have the PHIs organized that way already, > > you are adding the _99 defs anyway, right? That is, > > > > + tree new_lhs = copy_ssa_name (lhs); > > + gimple_phi_set_result (phi, new_lhs); > > + gphi *new_phi = create_phi_node (lhs, other_edge->dest); > > > > here you swap the defs, I'd omit that. And instead essentially do > > replace_uses_by (lhs, new_lhs) here (without the folding and stuff). > > You have to clear SSA_NAME_OCCURS_IN_ABNORMAL_PHI on the old > > and set it on the new LHS if it was set. > > > > I think that should contain the "SSA update" to > > edge_before_returns_twice_call for the case it splits the block? > > I have considered that, but there are problems with that. > > The most important is that in the most common case, there is no > block splitting nor any edge splitting, all we have is that > # _99(ab) = PHI <_7(6), _8(ab)(11)> > _22 = bar (*_99(ab)); > or similar. And we can do several insertions of statements or sequences > before the bar call. If it is not returns_twice or even not a call, > we want to use _99(ab) in there and insert just before the call/statement. > If it is returns_twice call, we want to replace the _99(ab) uses with > something that is actually usable on the 6->10 edge. > So, if we wanted to keep using the _99(ab) (wouldn't it be even wrong > that it is (ab)?), we'd need to essentially on every addition > add statements like > _99(ab) = _7; > (for each PHI node except the virtual one) before the added statement or > statements (or add it only if used in the statement/sequence?) and > change the PHI to be > # _100(ab) = PHI <_99(ab)(6), _8(ab)(11)> > and replace all uses: > _22 = bar (*_100(ab)); > Right now ubsan can insert something like 3 statements/sequences per > argument, so there can be thousands of them and the block could have > thousands of PHIs, so that is millions of useless SSA_NAME copies.
Ah, yeah, I see :/ > So, the intention of edge_before_returns_twice_call is just that > it in the common case just finds the non-EDGE_ABNORMAL edge if there is one, > if there isn't just one, it adjusts the IL such that there is just one. > And then the next step is to handle that case. So I guess the updated patch is OK then. Thanks, Richard.