On Tue, 12 Mar 2024, Jakub Jelinek wrote:

> On Tue, Mar 12, 2024 at 01:47:53PM +0100, Richard Biener wrote:
> > > Admittedly the above is the ugliest part of the patch IMHO.
> > > It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case
> > > we have before ubsan pass
> > >   # _7(ab) = PHI <_20(9), _8(ab)(11)>
> > >   _22 = bar (*_7(ab));
> > > and want to instrument the *_7(ab) load among the parameters for object
> > > size.
> > > So, it wants to add
> > >   _2 = __builtin_dynamic_object_size (_7(ab), 0);
> > > sequence and later:
> > >   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> > > before the call insn.  If it isn't a noreturn call, it would just
> > > add those statements into the same basic block using gsi_insert*before.
> > > But because it is a returns_twice call, it needs to be done in a different
> > > block.  And having all of ubsan/asan/bitintlower find that out first and
> > > figure out that it should use _20 instead of _7(ab) seems hard, especially
> > > that for cases like:
> > 
> > I would have expected the result as
> > 
> >   # _7 = PHI <_20(9)>
> >   _2 = __builtin_dynamic_object_size (_7, 0);
> >   .UBSAN_OBJECT_SIZE (_7(ab), 1024, _2, 0);
> >   // fallthru
> > 
> >   # _99(ab) = PHI <_7, _8(ab)(11)>
> >   _22 = bar (*_99(ab));
> > 
> > where all uses of _7 in the IL before splitting get replaced via
> > their immediate uses but the to-be inserted IL can continue using
> > the defs (that requires the not inserted IL to not have update_stmt
> > called on them, of course).
> > 
> > I think split_block would have the PHIs organized that way already,
> > you are adding the _99 defs anyway, right?  That is,
> > 
> > +         tree new_lhs = copy_ssa_name (lhs);
> > +         gimple_phi_set_result (phi, new_lhs);
> > +         gphi *new_phi = create_phi_node (lhs, other_edge->dest);
> > 
> > here you swap the defs, I'd omit that.  And instead essentially do
> > replace_uses_by (lhs, new_lhs) here (without the folding and stuff).
> > You have to clear SSA_NAME_OCCURS_IN_ABNORMAL_PHI on the old
> > and set it on the new LHS if it was set.
> > 
> > I think that should contain the "SSA update" to
> > edge_before_returns_twice_call for the case it splits the block?
> 
> I have considered that, but there are problems with that.
> 
> The most important is that in the most common case, there is no
> block splitting nor any edge splitting, all we have is that
>   # _99(ab) = PHI <_7(6), _8(ab)(11)>
>   _22 = bar (*_99(ab));
> or similar.  And we can do several insertions of statements or sequences
> before the bar call.  If it is not returns_twice or even not a call,
> we want to use _99(ab) in there and insert just before the call/statement.
> If it is returns_twice call, we want to replace the _99(ab) uses with
> something that is actually usable on the 6->10 edge.
> So, if we wanted to keep using the _99(ab) (wouldn't it be even wrong
> that it is (ab)?), we'd need to essentially on every addition
> add statements like
>   _99(ab) = _7;
> (for each PHI node except the virtual one) before the added statement or
> statements (or add it only if used in the statement/sequence?) and
> change the PHI to be
>   # _100(ab) = PHI <_99(ab)(6), _8(ab)(11)>
> and replace all uses:
>   _22 = bar (*_100(ab));
> Right now ubsan can insert something like 3 statements/sequences per
> argument, so there can be thousands of them and the block could have
> thousands of PHIs, so that is millions of useless SSA_NAME copies.

Ah, yeah, I see :/

> So, the intention of edge_before_returns_twice_call is just that
> it in the common case just finds the non-EDGE_ABNORMAL edge if there is one,
> if there isn't just one, it adjusts the IL such that there is just one.
> And then the next step is to handle that case.

So I guess the updated patch is OK then.

Thanks,
Richard.

Reply via email to