ilovepi wrote:

First, thanks for the context. I don't see anything like this written down, so 
I plan to find some place in our docs to put those details. I'll be sure to CC 
you and other folks I think will have thoughts on the precise verbiage. The 
compiler's contract with libc is, from what I can tell, complicated, 
underspecified, and mostly undocumented. Having spoken w/ some libc folks about libc 
semantics in the past, I don't think it will be easy to pin down all the 
details to the extent we want. I think writing down what you put above is just 
the first step.

Maybe part of the issue is that I don't see a fundamental reason why libc is 
special beyond a few key things:
  - some APIs will need a no-builtin-foo, to prevent their implementations from 
calling themselves.
  - some apis have well understood usage that the compiler can leverage (I'd 
put the memcmp->bcmp optimization in this list, but memcpy/memset are what I 
think of first)
  - malloc, because of aliasing
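
To make the first bullet concrete, here is a rough sketch of the kind of definition I have in mind. `my_memcpy` is a hypothetical stand-in name so the snippet builds next to a real libc; in an actual libc it would be `memcpy` itself. The concern is that an optimizer which recognizes the byte-copy loop idiom may replace the loop with a call to `memcpy`, which for the real definition would mean the function calls itself. Building that file with something like `-fno-builtin-memcpy` is meant to prevent that.

```c
#include <stddef.h>

/* Hypothetical stand-in for a libc memcpy definition (the name my_memcpy is
   an assumption for illustration). If this were the real memcpy and the
   compiler rewrote the loop below into a memcpy call, the function would
   recurse into itself, hence the need for a no-builtin option on this file. */
void *my_memcpy(void *restrict dst, const void *restrict src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    /* Plain byte-by-byte copy: the loop shape an idiom recognizer may match. */
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i];
    return dst;
}
```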

I'm probably neglecting something obvious in that short list, but for most 
things, I don't think anything special needs to happen. What shouldn't happen 
though, is that the compiler deletes a function definition, and then 
reintroduces a call to that function ... maybe that's what you mean by "staying 
an abstraction past codegen"? I didn't initially read it that way, but I guess 
in that light I see where you're coming from.

Put another way, I think it's strictly a bug in our phase ordering to allow 
functions to be deleted if they may have calls introduced again. Since 
memcmp/bcmp are special this way (as are the existing libcalls), I guess maybe 
that's part of the problem. I was kind of under the impression that 
RuntimeLibcalls was our mechanism for handling that, though.
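
As a small illustration of how a call can be introduced after the fact, consider the sketch below. `same_bytes` is a hypothetical helper, not something from the PR. Because only equality with zero is observed, the compiler may rewrite the `memcmp` call into a `bcmp` call on targets that provide `bcmp`. If the `bcmp` definition had already been deleted as unreferenced at that point, the new call would have nothing to resolve to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the source only mentions memcmp, but since the result
   is only compared against zero, an optimizer may emit a call to bcmp here
   instead (where the target has one). */
bool same_bytes(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}
```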

As for making a libc cooperate w/ the compiler, perhaps there is a set of 
attributes we could use (or introduce?). We already have a few of these 
(attribute `malloc` comes to mind). Maybe for things marked as being part of 
libc, we only mark them as dead, but don't collect until the end. Any new calls 
emitted would make them alive again. I haven't thought this bit through much, 
yet.
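
For reference, this is roughly what the existing `malloc` attribute looks like on a declaration, as a minimal sketch. `my_alloc` and `sum_after_store` are hypothetical names, and `__attribute__((malloc))` is the GCC/Clang extension spelling. The attribute tells the compiler that the returned pointer does not alias any other live pointer, so the two stores below cannot interfere with each other and the sum can be computed without reloading.

```c
#include <stdlib.h>

/* Hypothetical allocator wrapper. The malloc attribute (a GCC/Clang
   extension) promises the returned pointer aliases no other live object. */
__attribute__((malloc)) void *my_alloc(size_t n) {
    return malloc(n);
}

/* Hypothetical caller: returns -1 on allocation failure, otherwise 3. */
int sum_after_store(int *existing) {
    int *fresh = my_alloc(sizeof *fresh);
    if (!fresh)
        return -1;
    *existing = 1;
    *fresh = 2;                  /* cannot modify *existing: fresh is unaliased */
    int r = *existing + *fresh;  /* so the compiler is free to fold this */
    free(fresh);
    return r;
}
```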


So, I guess let me try to explain my expectations for how we'd like the 
compiler to behave when LTOing a program along w/ libc. Mostly, we don't want 
the compiler to change its default behavior. So when it sees a call to 
`malloc`, the returned pointer is marked `noalias`, even if the call were 
inlined. For other memory routines, the compiler can either use its own 
specialized implementations (like it normally does) or it can inline the call. 
That assumes the definitions were compiled w/ something like 
`-fno-builtin-memcpy` for the memcpy implementation (you know, so it's 
functional). For anything that may have a call emitted via compiler 
transformation, it cannot be DCE'd until we're certain no new calls will be 
created. In the worst case that means we have to rely on linker GC, but maybe 
that's acceptable for something as limited as libc. Does that make sense? I 
have a feeling I'm oversimplifying something in my mental model, but I hope 
that's at least a reasonable set of goals as a first approximation.


https://github.com/llvm/llvm-project/pull/135706
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
