On 8/16/19 4:06 PM, Marc Glisse wrote: > On Fri, 16 Aug 2019, Jeff Law wrote: > >> On 8/16/19 12:09 PM, Marc Glisse wrote: >>> On Fri, 16 Aug 2019, Jeff Law wrote: >>> >>>> This patch improves our ability to detect dead stores by handling cases >>>> where the size memcpy, memset, strncpy, etc call is not constant. This >>>> addresses some, but not all, of the issues in 80576. >>>> >>>> The key here is when the size is not constant we can make conservative >>>> decisions that still give us a chance to analyze the code for dead >>>> stores. >>>> >>>> Remember that for dead store elimination, we're trying to prove that >>>> given two stores, the second store overwrites (partially or fully) the >>>> same memory locations as the first store. That makes the first store >>>> either partially or fully dead. >>>> >>>> When we encounter the first store, we set up a bitmap of bytes written >>>> by that store (live_bytes). We then look at subsequent stores and >>>> clear >>>> the appropriate entries in the bitmap. >>>> >>>> If the first store has a nonconstant length argument we can use the >>>> range of the length argument (max) and the size of the destination >>>> object to make a conservative estimation of how many bytes are written. >>>> >>>> For the second store the conservative thing to do for a non-constant >>>> length is to use the minimum of the range of the length argument. >>> >>> So I guess it won't handle things like >>> >>> void f(char*p,int n){ >>> __builtin_memset(p,3,n); >>> __builtin_memset(p,7,n); >>> } >>> >>> where we know nothing about the length, except that it is the same? Or >>> do you look at symbolic ranges? >> >> So handling this was slightly uglier than I'd hoped. I mis-remembered >> what we had in an ao_ref. We have a way to describe constant sizes and >> to mark that something was non-constant (size of -1), but not what the >> non-constant value was. >> >> I looked at two approaches. One created a dse_ref structure that had an >> embedded ao_ref. That creates a fair amount of churn. But it certainly >> looks do-able. >> >> The second derived a dse_ref from an ao_ref which allows the vast >> majority of the code in tree-ssa-dse.c to just work as-is. With that in >> place it was fairly simply to initialize the new field and check it in a >> couple places. Resulting in: >> >>> ; Function f (f, funcdef_no=0, decl_uid=1909, cgraph_uid=1, >>> symbol_order=0) >>> >>> Deleted dead call: __builtin_memset (p_5(D), 3, _1); >>> >>> f (char * p, int n) >>> { >>> long unsigned int _1; >>> >>> ;; basic block 2, loop depth 0, maybe hot >>> ;; prev block 0, next block 1, flags: (NEW, VISITED) >>> ;; pred: ENTRY (FALLTHRU,EXECUTABLE) >>> _1 = (long unsigned int) n_3(D); >>> __builtin_memset (p_5(D), 7, _1); >>> return; >>> ;; succ: EXIT (EXECUTABLE) >>> >>> } >>> >> >> Not sure how useful this will be in practice though. > > Oh, I wasn't expecting you to handle it right now... Might as well since I'm already in the code :-)
> > I've seen this kind of thing (not necessarily with memset, memcpy was > probably involved as well) a few times, enough that it immediately came > to mind when I saw the title of your message. There are probably traces > in bugzilla, although I don't think there is a convenient search query > to find them. I found PR 79716 that seems related at least, but not what > I was looking for. I didn't realize this was 79716. I'd just glossed over that today looking for something else in this space :-) With the way DSE works it really shouldn't matter if it's memset, or memcpy. 79716 also raises the issue of turning a calloc back into a simple malloc when the zero initialization was overwritten. We're not handling that either. But that seems even less likely to be seen with any regularity. jeff