kadircet added a comment.

In D81719#2092589 <https://reviews.llvm.org/D81719#2092589>, @sammccall wrote:

> Thanks for all this investigation!
>
> >   80.71    0.002330           5       394       374 openat
>
> I'm curious what the 400 attempts and 20 successes are (I've seen this before 
> but don't remember now). Probably not worth digging into though unless you 
> happen to have the strace logs.


This is mostly gcc installation scanning for libc and such. The biggest call
site is in
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Gnu.cpp#L2408,
which is called multiple times from
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Gnu.cpp#L1907.

>> buildCompilerInvocation usage inside scanPreamble doesn't need any access to 
>> any files, so I suggest we just pass empty FS
> 
> I guess this makes sense, My only worry is the driver getting into a 
> different state if probing or cwd or something fails. But this really 
> shouldn't affect preamble scanning. If it's safe, this seems worth doing just 
> to have more isolation.

Changing this patch to do that instead.

>> we need a different cache for buildCompilerInvocation, one that caches 
>> dir_begin() failures
> 
> Yeah this is complicated - worthwhile if the IO is actually adding ~20ms. 
> Easiest way to tell if tracing tools aren't helping might be to use an empty 
> FS and ignore all the resulting problems - timing for buildCompilerInvocation 
> should be correct.
>  If needed, maybe the record/replay FSes used for lldb reproducers are 
> usable? Nice to avoid that complexity if possible though.

Benchmarked with an empty in-memory FS and the real filesystem using fallback
commands (`clang a.cc`). There is about a 6x speedup when
buildCompilerInvocation is run without IO: the empty FS takes about 0.17 ms on
average, whereas the real filesystem takes 1.01 ms on average.

Changed the compile command to something google3 sized (~400 args):
the empty FS takes about 1.4 ms, whereas real IO takes about 1.8 ms.

So shaving off some IO might help a lot for trivial command lines, but for 
complicated commands we need to improve command line parsing or start caching 
the result.
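
For the IO side, a cache that also remembers failed lookups could look roughly like the sketch below. This is only an illustration of the idea, not code from clang or clangd: `ProbeCache` and its members are hypothetical, and a real version would have to wrap a VFS and deal with invalidation.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: memoizes the outcome of a path probe, including
// failures, so repeated misses never reach the underlying filesystem.
class ProbeCache {
public:
  using ProbeFn = std::function<bool(const std::string &)>;

  explicit ProbeCache(ProbeFn P) : Probe(std::move(P)) {}

  // Returns whether Path exists, calling the underlying probe at most once
  // per distinct path. Negative results are cached as well.
  bool exists(const std::string &Path) {
    auto It = Results.find(Path);
    if (It != Results.end())
      return It->second;
    ++UnderlyingCalls;
    bool Found = Probe(Path);
    Results.emplace(Path, Found);
    return Found;
  }

  // Number of lookups that fell through to the underlying probe.
  unsigned underlyingCalls() const { return UnderlyingCalls; }

private:
  ProbeFn Probe;
  std::unordered_map<std::string, bool> Results;
  unsigned UnderlyingCalls = 0;
};
```

Since most of the probes during gcc installation scanning fail, caching only successful lookups would leave most of that IO in place.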

>> 48.73    2.244680          56     39747 tolower
> 
> How many per call to buildCompilerInvocation? Maybe arg parsing is doing 
> something dumb...

This is only a single call to buildCompilerInvocation :(

Just for fun, top 5 library calls in a complicated command line case (~400 
args):

  % time     seconds  usecs/call     calls      function
  ------ ----------- ----------- --------- --------------------
   50.61    3.815022          54     69820 tolower
   25.35    1.911093          55     34416 strlen
    9.35    0.704789          53     13067 bcmp
    5.09    0.383536          56      6773 _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_appendEPKcm
    3.69    0.278408          56      4935 _ZdlPv

So calls to tolower/strlen seem to scale sub-linearly: the previous command
line had only 2 args, so that's about a 200x increase in arguments, whereas
call counts seem to have only doubled.
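
As a rough check of that claim for tolower, the sketch below compares the growth in call count with the growth in argument count. The function name is made up for illustration, and the argument counts (2 and ~400) are approximate.

```cpp
// Hypothetical helper: growth in call count divided by growth in argument
// count. 1.0 would mean calls grow in proportion to arguments; values well
// below 1.0 mean sub-linear scaling.
double scalingRatio(double CallsBefore, double CallsAfter, double ArgsBefore,
                    double ArgsAfter) {
  return (CallsAfter / CallsBefore) / (ArgsAfter / ArgsBefore);
}

// tolower call counts from the two profiles in this thread:
// 39747 calls with 2 args, 69820 calls with ~400 args.
const double TolowerRatio = scalingRatio(39747, 69820, 2, 400);
```

The call count grows by about 1.76x while the argument count grows by about 200x, so the ratio comes out below 0.01. That suggests most of the tolower calls are a fixed cost of a single buildCompilerInvocation call rather than a per-argument one.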


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81719/new/

https://reviews.llvm.org/D81719


