Re: Unused GCC builtins

2018-01-27 Thread Martin Sebor

On 01/24/2018 07:09 AM, Jakub Jelinek wrote:
> On Wed, Jan 24, 2018 at 03:04:55PM +0100, Manuel Rigger wrote:
>> In a second step, we also considered internal builtins and found that the
>> vararg handling builtins (__builtin_va_start, __builtin_va_end,
>> __builtin_va_arg, and __builtin_va_copy) are relied upon by many projects,
>> even though they are undocumented in GCC's builtins API. Could they be
>> added to the documentation?
>
> Why?  What is documented is va_start/va_end/va_arg/va_copy, that is
> what people should use, the builtins are just internal implementation of
> those macros.


There are a number of reasons why documenting visible APIs is
helpful whether or not they are meant to be used by end users.

Features that are not meant to be used should be documented
as such.  Mentioning that they are meant only for internal use
makes their purpose clear and sets the right expectation about
the level of support and portability between GCC versions.  It
also makes it clear that we didn't forget to document them by
accident.

The manual isn't just a reference for GCC users.  It's also
a helpful reference for developers of GCC-compatible compilers
who are not allowed to read GCC source code due to copyright or
licensing constraints, or for people maintaining or supporting
their own GCC-based operating environments.  Finally, it is also
a reference for GCC developers.

For all these reasons I think every built-in that can be used
(intentionally or otherwise) deserves to be documented in
the manual.

Martin



Re: Unused GCC builtins

2018-01-24 Thread Florian Weimer
* Jakub Jelinek:

> On Wed, Jan 24, 2018 at 03:04:55PM +0100, Manuel Rigger wrote:
>> In a second step, we also considered internal builtins and found that the
>> vararg handling builtins (__builtin_va_start, __builtin_va_end,
>> __builtin_va_arg, and __builtin_va_copy) are relied upon by many projects,
>> even though they are undocumented in GCC's builtins API. Could they be
>> added to the documentation?
>
> Why?  What is documented is va_start/va_end/va_arg/va_copy, that is
> what people should use, the builtins are just internal implementation of
> those macros.

And these builtins differ from the math builtins because <stdarg.h> is
provided by GCC, but <math.h> is not, and there are many different
implementations.


Re: Unused GCC builtins

2018-01-24 Thread Jakub Jelinek
On Wed, Jan 24, 2018 at 03:04:55PM +0100, Manuel Rigger wrote:
> In a second step, we also considered internal builtins and found that the
> vararg handling builtins (__builtin_va_start, __builtin_va_end,
> __builtin_va_arg, and __builtin_va_copy) are relied upon by many projects,
> even though they are undocumented in GCC's builtins API. Could they be
> added to the documentation?

Why?  What is documented is va_start/va_end/va_arg/va_copy, that is
what people should use, the builtins are just internal implementation of
those macros.

Jakub
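
For readers following the thread, here is a minimal sketch of the usage Jakub
recommends: code written only against the documented <stdarg.h> macros. The
function name sum_ints is hypothetical and chosen for illustration. With GCC's
own <stdarg.h>, these macros are expected to expand to the __builtin_va_*
forms discussed above, so the builtins are still exercised even though their
names never appear in the source.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sum `count` int arguments using only the
   documented <stdarg.h> macros, never the __builtin_va_* spellings. */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);          /* begin walking the variable arguments */

    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(args, int); /* fetch the next int argument */

    va_end(args);                   /* required cleanup before returning */
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) adds up the three trailing arguments.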


Re: Unused GCC builtins

2018-01-24 Thread Manuel Rigger
Thank you for all the answers, which are very useful to us!

As you pointed out, we only considered GitHub projects. If I understood
correctly, builtins would still not be deprecated even if we considered all
other open-source hosting sites because closed-source projects could still
rely on them, right? Additionally, target-specific builtins could not be
deprecated or removed because of vendor ABIs.

Several of you noted that we did not consider internal builtins that are
used in the implementation of GCC headers or directly by the compiler. Also
the documentation mentions that GCC provides "a large number of built-in
functions other than the ones mentioned" for "internal use" which "are not
documented here because they may change from time to time" (see
https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Other-Builtins.html#Other-Builtins).
We deliberately looked only at public builtins (and not internal
ones), as we are mainly interested in the effort needed to support GCC
builtins in other tools that process C code (e.g., other compilers or
analysis tools). We want to spare such tool developers from having to
implement internal or unused builtins. So even if we cannot remove the
implementation of a builtin, removing it from the documentation could
already be a win.

In a second step, we also considered internal builtins and found that the
vararg handling builtins (__builtin_va_start, __builtin_va_end,
__builtin_va_arg, and __builtin_va_copy) are relied upon by many projects,
even though they are undocumented in GCC's builtins API. Could they be
added to the documentation?

Thanks,
Manuel

2018-01-22 19:29 GMT+01:00 Florian Weimer:

> * Manuel Rigger:
>
> > Details: We downloaded all C projects from GitHub that had more than 80
> > GitHub stars, which yielded almost 5,000 projects with a total of more
> > than one billion lines of C code. We filtered GCC, forks of GCC, and
> > other compilers as we did not want to incorporate internal usages of GCC
> > builtins or test cases. We extracted all builtin names from the GCC
> > docs, and also tried to find such names in the source code, which we
> > considered as builtin usages.
>
> You actually need to compile the sources with an instrumented compiler
> to discover uses of built-ins.  Not all references will have verbatim,
> textual references in source code, but their names are constructed
> using preprocessor macros.  This happens for the majority of the
> floating-point-related built-ins you listed, I think.
>


Re: Unused GCC builtins

2018-01-22 Thread Florian Weimer
* Manuel Rigger:

> Details: We downloaded all C projects from GitHub that had more than 80
> GitHub stars, which yielded almost 5,000 projects with a total of more
> than one billion lines of C code. We filtered GCC, forks of GCC, and
> other compilers as we did not want to incorporate internal usages of GCC
> builtins or test cases. We extracted all builtin names from the GCC
> docs, and also tried to find such names in the source code, which we
> considered as builtin usages.

You actually need to compile the sources with an instrumented compiler
to discover uses of built-ins.  Not all uses appear as verbatim,
textual references in the source code; in some cases the names are
constructed using preprocessor macros.  This happens for the majority
of the floating-point-related built-ins you listed, I think.
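
To make the point about macro-constructed names concrete, here is a minimal
sketch, assuming a GCC-compatible compiler. The macros BUILTIN and DEFINE_ABS
and the generated function names are hypothetical, invented for this
illustration. The source text never contains the strings "__builtin_fabsf" or
"__builtin_fabs", so a textual search for those names finds nothing, yet both
builtins are called after preprocessing.

```c
#include <assert.h>

/* Hypothetical helper: pastes a suffix onto "__builtin_". The full
   builtin name exists only after preprocessing, not in the source text. */
#define BUILTIN(name) __builtin_##name

/* Hypothetical generator: defines my_<fn>() as a thin wrapper around
   the builtin whose name is built by token pasting. */
#define DEFINE_ABS(fn, type) \
    static type my_##fn(type x) { return BUILTIN(fn)(x); }

DEFINE_ABS(fabsf, float)   /* defines my_fabsf() */
DEFINE_ABS(fabs, double)   /* defines my_fabs()  */
```

Running the preprocessor alone (for example with the -E option) shows the
pasted builtin names, which is why a survey based on compiling or preprocessing
the code can see uses that a plain text search misses.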


Re: Unused GCC builtins

2018-01-22 Thread Andrew Pinski
On Mon, Jan 22, 2018 at 7:55 AM, David Brown wrote:
> On 22/01/18 16:46, Manuel Rigger wrote:
>> Hi everyone,
>>
>> As part of my research, we have been analyzing the usage of GCC builtins
>> in 5,000 C GitHub projects. One of our findings is that many of these
>> builtins are unused, even though they are described in the documentation
>> (see https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html#C-Extensions)
>> and obviously took time to develop and maintain. I’ve uploaded a CSV
>> file with the unused builtins to
>> http://ssw.jku.at/General/Staff/ManuelRigger/unused-builtins.csv.
>>
>> Details: We downloaded all C projects from GitHub that had more than 80
>> GitHub stars, which yielded almost 5,000 projects with a total of more
>> than one billion lines of C code. We filtered GCC, forks of GCC, and
>> other compilers as we did not want to incorporate internal usages of GCC
>> builtins or test cases. We extracted all builtin names from the GCC
>> docs, and also tried to find such names in the source code, which we
>> considered as builtin usages. We excluded subdirectories with GCC or
>> Clang, and removed other false positives. In total, we found 320k
>> builtin usages in these projects, and 3030 unused builtins out of a
>> total of 6039 builtins.
>>
>> What is your take on this? Do you believe that some of these unused
>> builtins could be removed from the GCC docs or deprecated? Or are they
>> used in special "niche" domains that we did not consider? If yes, do you
>> think it is worth to maintain them? Are some of them only used in C++
>> projects? Might it be possible to remove their implementations (which
>> has already happened for the Cilk Plus builtins)?
>>
>> We would be glad for any feedback.
>>
>> - Manuel
>>
>
> Many of these are going to be used automatically by the compiler.  You
> write "strdup" in your code, and the compiler treats it as
> "__builtin_strdup".  I don't know that such functions need to be
> documented as extensions, but they are certainly in use.
>
> You will also find that a large number of the builtins are for specific
> target processors, and projects using them are not going to turn up on
> GitHub.  They will be used in embedded software that is not open source.

And many of the target ones are used indirectly via another
function/macro (e.g. __builtin_ia32_ptestc256).  The function/macro is
defined in a header that GCC provides too.

Thanks,
Andrew

>
> I am sure there are builtins that are rarely or never used - but I doubt
> if it is anything like as many as you have identified from this survey.
>
>
>


Re: Unused GCC builtins

2018-01-22 Thread Jakub Jelinek
On Mon, Jan 22, 2018 at 04:55:42PM +0100, David Brown wrote:
> Many of these are going to be used automatically by the compiler.  You
> write "strdup" in your code, and the compiler treats it as
> "__builtin_strdup".  I don't know that such functions need to be
> documented as extensions, but they are certainly in use.
> 
> You will also find that a large number of the builtins are for specific
> target processors, and projects using them are not going to turn up on
> GitHub.  They will be used in embedded software that is not open source.

Not just that.  If the statistics ignored GCC headers, for example, then
they will obviously miss most of the target builtins, because the normal and
only supported way to use the target builtins is through the intrinsic
inline functions or macros provided by those headers.
So take those out (usually a vendor ABI says which intrinsics are provided,
so even if you made statistics on which intrinsics are used in the 5000 most
popular projects, we still couldn't remove them), and also take out the above
category, where the builtins are just an alternative for a standard function
and, depending on the prototype and chosen standard, some functions are
treated like builtins.  Then pretty much nothing remains in your survey.

Jakub
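
As an illustration of the intrinsics route described above, here is a small
sketch. The function name high_bit_mask16 is hypothetical. On x86 targets
with SSE2 it calls the documented intrinsic _mm_movemask_epi8 from
<emmintrin.h>; in the GCC headers this intrinsic is, as far as I know,
implemented on top of a target builtin (__builtin_ia32_pmovmskb128), but that
detail is an assumption about the header and not something user code should
rely on. A plain C fallback keeps the sketch portable to other targets.

```c
#include <assert.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* intrinsic header shipped with the compiler */
#endif

/* Hypothetical example: return a 16-bit mask in which bit i is set when
   byte i of buf has its most significant bit set. */
unsigned high_bit_mask16(const unsigned char buf[16])
{
    assert(buf != 0);
#if defined(__SSE2__)
    /* Documented intrinsics; the target builtin stays hidden in the header. */
    __m128i v = _mm_loadu_si128((const __m128i *)buf);
    return (unsigned)_mm_movemask_epi8(v);
#else
    /* Portable fallback for targets without SSE2. */
    unsigned mask = 0;
    for (int i = 0; i < 16; i++)
        mask |= (unsigned)(buf[i] >> 7) << i;
    return mask;
#endif
}
```

A source-level search of this code finds only the intrinsic name, which is why
a survey that skips the compiler's headers would count the underlying target
builtin as unused.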


Re: Unused GCC builtins

2018-01-22 Thread Joel Sherrill



On 1/22/2018 9:55 AM, David Brown wrote:
> On 22/01/18 16:46, Manuel Rigger wrote:
>> Hi everyone,
>>
>> As part of my research, we have been analyzing the usage of GCC builtins
>> in 5,000 C GitHub projects. One of our findings is that many of these
>> builtins are unused, even though they are described in the documentation
>> (see https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html#C-Extensions)
>> and obviously took time to develop and maintain. I’ve uploaded a CSV
>> file with the unused builtins to
>> http://ssw.jku.at/General/Staff/ManuelRigger/unused-builtins.csv.
>>
>> Details: We downloaded all C projects from GitHub that had more than 80
>> GitHub stars, which yielded almost 5,000 projects with a total of more
>> than one billion lines of C code. We filtered GCC, forks of GCC, and
>> other compilers as we did not want to incorporate internal usages of GCC
>> builtins or test cases. We extracted all builtin names from the GCC
>> docs, and also tried to find such names in the source code, which we
>> considered as builtin usages. We excluded subdirectories with GCC or
>> Clang, and removed other false positives. In total, we found 320k
>> builtin usages in these projects, and 3030 unused builtins out of a
>> total of 6039 builtins.
>>
>> What is your take on this? Do you believe that some of these unused
>> builtins could be removed from the GCC docs or deprecated? Or are they
>> used in special "niche" domains that we did not consider? If yes, do you
>> think it is worth to maintain them? Are some of them only used in C++
>> projects? Might it be possible to remove their implementations (which
>> has already happened for the Cilk Plus builtins)?
>>
>> We would be glad for any feedback.
>>
>> - Manuel
>
> Many of these are going to be used automatically by the compiler.  You
> write "strdup" in your code, and the compiler treats it as
> "__builtin_strdup".  I don't know that such functions need to be
> documented as extensions, but they are certainly in use.
>
> You will also find that a large number of the builtins are for specific
> target processors, and projects using them are not going to turn up on
> GitHub.  They will be used in embedded software that is not open source.
>
> I am sure there are builtins that are rarely or never used - but I doubt
> if it is anything like as many as you have identified from this survey.



My first thought was that there is a lot of free and open source
software that is not hosted at GitHub. Larger projects are often
self-hosted. Does this list cover all GNU, Savannah, sourceware.org,
Apache, KDE, *BSD, Mozilla, etc. projects?


You might get lucky, since some projects like RTEMS and FreeBSD (I think)
have a GitHub mirror. But GitHub is not the entire universe of free and
open source software.

--joel sherrill
RTEMS


Re: Unused GCC builtins

2018-01-22 Thread David Brown
On 22/01/18 16:46, Manuel Rigger wrote:
> Hi everyone,
> 
> As part of my research, we have been analyzing the usage of GCC builtins
> in 5,000 C GitHub projects. One of our findings is that many of these
> builtins are unused, even though they are described in the documentation
> (see https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html#C-Extensions)
> and obviously took time to develop and maintain. I’ve uploaded a CSV
> file with the unused builtins to
> http://ssw.jku.at/General/Staff/ManuelRigger/unused-builtins.csv.
> 
> Details: We downloaded all C projects from GitHub that had more than 80
> GitHub stars, which yielded almost 5,000 projects with a total of more
> than one billion lines of C code. We filtered GCC, forks of GCC, and
> other compilers as we did not want to incorporate internal usages of GCC
> builtins or test cases. We extracted all builtin names from the GCC
> docs, and also tried to find such names in the source code, which we
> considered as builtin usages. We excluded subdirectories with GCC or
> Clang, and removed other false positives. In total, we found 320k
> builtin usages in these projects, and 3030 unused builtins out of a
> total of 6039 builtins.
> 
> What is your take on this? Do you believe that some of these unused
> builtins could be removed from the GCC docs or deprecated? Or are they
> used in special "niche" domains that we did not consider? If yes, do you
> think it is worth to maintain them? Are some of them only used in C++
> projects? Might it be possible to remove their implementations (which
> has already happened for the Cilk Plus builtins)?
> 
> We would be glad for any feedback.
> 
> - Manuel
> 

Many of these are going to be used automatically by the compiler.  You
write "strdup" in your code, and the compiler treats it as
"__builtin_strdup".  I don't know that such functions need to be
documented as extensions, but they are certainly in use.

You will also find that a large number of the builtins are for specific
target processors, and projects using them are not going to turn up on
GitHub.  They will be used in embedded software that is not open source.

I am sure there are builtins that are rarely or never used - but I doubt
if it is anything like as many as you have identified from this survey.
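
The implicit use of builtins that David describes can be sketched as follows.
The example uses strlen in place of strdup only because it keeps the sketch
free of memory management; the function names len_plain and len_builtin are
hypothetical. The assumption, based on GCC's documented behavior, is that a
call to a recognized standard function is treated like the corresponding
__builtin_ form unless that is disabled (for instance with -fno-builtin).

```c
#include <assert.h>
#include <string.h>

/* Plain spelling: no "__builtin_" text appears here, yet a
   GCC-compatible compiler may handle the call as a builtin. */
size_t len_plain(const char *s)
{
    assert(s != 0);
    return strlen(s);
}

/* Explicit spelling of the same operation. */
size_t len_builtin(const char *s)
{
    assert(s != 0);
    return __builtin_strlen(s);
}
```

Both functions return the same result for any valid string, and only the
second one would be counted by a textual search for builtin names.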