Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-24 Thread Giancarlo Razzolini via arch-general

On January 24, 2020 16:41, Eli Schwartz via arch-general wrote:

> Oooh, that's actually a really interesting idea. I bet we could make
> this just consume a foo-debug package, then we could just modify
> devtools to add OPTIONS+=('debug') in makepkg.conf, and have commitpkg
> upload the debug package separately to the symbol server.



I was thinking more of a "build and they shall come" approach, where we put
this server out there and create a todo for people to upload symbols. Of
course, -any packages don't need any changes. Also, I don't think all
packages would require symbols to be uploaded.

Of course, if this is, relatively speaking, low-hanging fruit to implement
in our tooling, why not?

I'm just not sure yet whether tecken is usable outside Mozilla projects, but
let's see. I have also been discussing this with a KDE dev who approached me
about it and was unaware of this thread (and of other discussions we had in
the past). He will also propose that KDE implement a symbol server,
regardless of what we do.

But, let me say again, I'm completely against having actual debug packages.

Regards,
Giancarlo Razzolini



Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-24 Thread Eli Schwartz via arch-general
On 1/24/20 2:29 PM, Giancarlo Razzolini wrote:
> I wouldn't be opposed to having something like tecken [0] or some other
> software for this (not sure if there is one) where we would upload all the
> symbol artifacts for Arch-built packages and that users could use when
> needed.
>
> This wouldn't require changing either dbscripts or devtools, but it would
> be helpful if devtools had some function to facilitate this. This solution
> wouldn't bloat either our packages or our mirrors and would be useful to
> all Arch users. Of course, we could keep only the last X versions on this
> service; I don't see the point in having something like the Arch Linux
> Archive, where we try to preserve everything.
> 
> Regards,
> Giancarlo Razzolini
> 
> [0] https://github.com/mozilla-services/tecken

Oooh, that's actually a really interesting idea. I bet we could make
this just consume a foo-debug package, then we could just modify
devtools to add OPTIONS+=('debug') in makepkg.conf, and have commitpkg
upload the debug package separately to the symbol server.
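
As a rough illustration of the makepkg.conf side of this idea, the sketch below flips `!debug` to `debug` on an OPTIONS line. The helper name `enable_debug_option` and the sample OPTIONS line are assumptions made for illustration. This is not devtools code, and it only filters text from stdin to stdout rather than editing a real makepkg.conf.

```sh
#!/bin/sh
# Hypothetical helper (not part of devtools): read makepkg.conf-style text on
# stdin and print it with the debug option enabled on the OPTIONS line.
# Lines that do not start with OPTIONS= are passed through unchanged.
enable_debug_option() {
    sed '/^OPTIONS=/s/!debug/debug/'
}

# Sample line resembling a makepkg.conf default; treat it as illustrative.
sample='OPTIONS=(strip docs !libtool !staticlibs emptydirs zipman purge !debug)'
printf '%s\n' "$sample" | enable_debug_option
```

Per the makepkg.conf documentation, when debug is combined with strip, makepkg puts the debug symbols into a separate package, which is the foo-debug package such a symbol server could consume.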

-- 
Eli Schwartz
Bug Wrangler and Trusted User





Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-24 Thread Giancarlo Razzolini via arch-general

On January 21, 2020 20:05, Eli Schwartz via arch-general wrote:

> I'm personally not a fan of bloating packages even by 10% or whatever
> for debug symbols that many users don't need.



Me neither.


> As I said above, split debug packages need "dbscripts" support to make
> sure they are correctly handled by our repository-building scripts.
>
> If dbscripts supported it, we could enable the debug option in devtools'
> makepkg.conf and start building all packages with debug info.
>
> (Patches welcome!)



I wouldn't be opposed to having something like tecken [0] or some other
software for this (not sure if there is one) where we would upload all the
symbol artifacts for Arch-built packages and that users could use when
needed.

This wouldn't require changing either dbscripts or devtools, but it would
be helpful if devtools had some function to facilitate this. This solution
wouldn't bloat either our packages or our mirrors and would be useful to
all Arch users. Of course, we could keep only the last X versions on this
service; I don't see the point in having something like the Arch Linux
Archive, where we try to preserve everything.

Regards,
Giancarlo Razzolini

[0] https://github.com/mozilla-services/tecken




Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-22 Thread Łukasz Michalski
On 22/01/2020 14.36, Justin Capella via arch-general wrote:
> point. Maybe one day users could submit coredumps / backtraces to a
> webservice that would reference the symbols, and "bucket" the traces to
> help triage/identify unique crashes
> 

For me it would be better if I could just download debugging symbols using 
pacman and analyze coredumps locally.

Regards,
Łukasz


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-22 Thread Justin Capella via arch-general
The reason I'd really like native packages to be built with split symbols,
even if the symbols aren't included in the package but are available through
some other means, is so that bug wranglers can more easily make sense of
traces and "coredumpctl info" output. Rebuilding the package would just be
a hassle and could result in different symbols, which defeats the point.
Maybe one day users could submit coredumps/backtraces to a web service that
would reference the symbols and "bucket" the traces to help triage and
identify unique crashes.
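
To make the "bucketing" idea concrete, here is a minimal sketch that derives a bucket key from the top few frames of a symbolized backtrace, so that crashes sharing the same innermost frames land in the same bucket. The function name `bucket_key`, the input format (one function name per line, innermost frame first), and the default of three frames are all assumptions; no existing service is being described.

```sh
#!/bin/sh
# Hypothetical sketch: build a crash "bucket" key from a symbolized backtrace.
# stdin: one function name per line, innermost frame first.
# $1:    how many top frames to use for the key (default: 3).
bucket_key() {
    head -n "${1:-3}" | paste -sd '|' -
}

# Two traces that differ only in their outer frames get the same key.
printf 'g_free\nwidget_destroy\non_close\nmain\n' | bucket_key
printf 'g_free\nwidget_destroy\non_close\nidle_cb\nmain\n' | bucket_key
```

This only works on traces that already have symbols, which is why reproducible access to the exact debug info for a shipped build matters.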


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Tobias Hunger via arch-general
Hi Neven,

On Tue, 21 Jan 2020, 23:58 Neven Sajko via arch-general, <
arch-general@archlinux.org> wrote:

> One thing that I should have said right away is that one can not know
> in advance when and which executable he will need to debug.
>

Clear Linux uses a daemon installed in the client to make debug symbols
automatically available on access.

The details can be found here:
https://docs.01.org/clearlinux/latest/guides/clear/debug.html

Best Regards,
Tobias


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Eli Schwartz via arch-general
On 1/21/20 6:00 PM, Neven Sajko wrote:
> Regarding the firefox example, are the split debugging symbols files
> publicly available?

Mozilla's symbol server is described here:
https://developer.mozilla.org/en-US/docs/Mozilla/Using_the_Mozilla_symbol_server#Downloading_symbols_on_Linux_Mac_OS_X

-- 
Eli Schwartz
Bug Wrangler and Trusted User





Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Eli Schwartz via arch-general
On 1/21/20 5:44 PM, Neven Sajko wrote:
>> There is no "even", here. The golang programming language is not
>> *atypical*, it should not receive abnormal treatment.
>>
>> I'm not sure what you mean by "design makes use of debugging symbols at
>> runtime". They're debug symbols, not runtime logic symbols.
> 
> Golang (and libbacktrace) use DWARF for backtraces at runtime.

Ah, should've guessed. :D

That's a nice extra, but I'd still suspect that opening a coredump in
gdb or similar is even better than getting a pretty backtrace on exit.

>> It is very nice indeed! Splitdebug symbols work fine in gdb, and I
>> believe in radare2 as well: https://github.com/radareorg/radare2/issues/5758
>>
>> Of course, archlinux doesn't really provide splitdebug packages by
>> default, so you cannot generally use them unless you're using your own
>> packages...
> 
> I would of course prefer split debugging symbols to no symbols at all.
> 
>> Debug symbols, on the other hand, are *always* unnecessary unless you
>> are debugging. Moreover, they tend to result in dramatically increased
>> package size. Headers are tiny, and docs often are (but we have lint
>> checkers to warn us if abnormal packages contain mostly docs, and there
>> are several packages that do indeed split out *-docs, so this is not an
>> absolute!)
>>
>> Have you tried building, say, a web browser with debugging symbols?
> 
> Sorry, I did not mean to argue that absolutely all executables must be
> installed with debugging symbols. The ideal situation I am imagining
> is that if a packager thinks the debugging symbols would be too much
> for some executable in the package, she simply disables them and
> enables stripping for the whole package. But most executables are
> small and stripping their debugging symbols does not gain much.
> 
>> No it does not, makepkg handles this transparently with absolutely no
>> effort on the part of the maintainer.
> 
> I was actually referring to the fact that this feature was not
> available before because of libalpm limitations (I think that required
> hooks or something, and was only added recently?). Anyway, I am not
> saying this is some great issue, but it certainly somewhat increases
> complexity of some Arch projects. But maybe that complexity is good if
> it is not exclusively needed for this usecase, thus on further thought
> I probably should have done more research before raising this
> particular point.

It's not a libalpm limitation. :) pacman doesn't know or care about
this, it just appears as a package. You can see how this works for the
glib2/gtk3 packages using my personal repo:
https://wiki.archlinux.org/index.php/Debug_-_Getting_Traces#Gtk3/glib2

The changes needed to handle debug packages would be all in the
dbscripts project, and would amount to tracking the packages when they
are added, and dispatching them to their own repository e.g. in
[community-debug]
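
A minimal sketch of the dispatch step described above, assuming the usual pkgname-pkgver-pkgrel-arch.pkg.tar.* file naming. The function `target_repo` is hypothetical and is not dbscripts code; it only shows how a -debug package could be routed to a sibling repository such as [community-debug].

```sh
#!/bin/sh
# Hypothetical sketch (not dbscripts code): decide which repository a package
# file should go to. $1 = base repo name, $2 = package file name.
target_repo() {
    repo=$1
    file=$2
    # Strip the trailing "-pkgver-pkgrel-arch.pkg.tar.*" to get the pkgname.
    name=${file%-*-*-*}
    case $name in
        *-debug) printf '%s-debug\n' "$repo" ;;
        *)       printf '%s\n' "$repo" ;;
    esac
}

target_repo community glib2-debug-2.62.4-1-x86_64.pkg.tar.xz
target_repo community glib2-2.62.4-1-x86_64.pkg.tar.xz
```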

>> Perhaps libbacktrace should work more like gdb then? It works fine with
>> gdb, and the ELF metadata has .gnu_debuglink for this exact purpose --
>> it's fundamental to binutils, see the objcopy manpage for example.
> 
> I assumed libbacktrace could not do that because of constraints on
> memory allocation (whether on stack or on heap) or reentrancy, but
> apparently it has that functionality since 2017. Oops.

Cool. ;)

Has golang also grown that feature?

>> You're saying it's "harder and more complicated" to use detached debug
>> symbols, but I'm really not seeing it.
> 
> Depending on an arbitrary file determined by a path is complicated,
> there are all kinds of concerns, like async-signal-safety (one has to
> use open instead of fopen), getting the file before somebody
> overwrites it or moves it (or just changes a symlink) ...

Eh, I don't really think you need to worry about people overwriting or
moving it, we're dealing with package manager managed files. You'd need
to have the same worries about plugins which are loaded via dlopen(), or
programming languages that use script interpreters rather than ld.so --
you can just assume there is consistency managed at the OS layer.

>> They're *huge*, and the standard gdb, when used to execute a program or
>> to inspect a coredump file, can seamlessly merge the detached debug data
>> and display enhanced debug info. This works even when you only install
>> the split -debug package using pacman, *after* the program crashes. The
>> coredump contains all the info you need.
>>
>> Programs like firefox have extensive upstream tooling for telemetry,
>> whereby heavily stripped programs are distributed to end users, and if
>> the program crashes it can send the backtrace to Mozilla.org; this
>> backtrace is then merged with the debug info which is on Mozilla's
>> servers, to produce meaningful output. Users don't have to suffer huge
>> downloads.
>>
>> (Mozilla's symbol server can also be used with a trivial gdb script to
>> let gdb download the debug info on-demand, if you're debugging firefox.)
>>
>> 

Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Neven Sajko via arch-general
Regarding the firefox example, are the split debugging symbols files
publicly available?


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Neven Sajko via arch-general
One thing that I should have said right away is that one can not know
in advance when and which executable he will need to debug.


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Neven Sajko via arch-general
> There is no "even", here. The golang programming language is not
> *atypical*, it should not receive abnormal treatment.
>
> I'm not sure what you mean by "design makes use of debugging symbols at
> runtime". They're debug symbols, not runtime logic symbols.

Golang (and libbacktrace) use DWARF for backtraces at runtime.

> It is very nice indeed! Splitdebug symbols work fine in gdb, and I
> believe in radare2 as well: https://github.com/radareorg/radare2/issues/5758
>
> Of course, archlinux doesn't really provide splitdebug packages by
> default, so you cannot generally use them unless you're using your own
> packages...

I would of course prefer split debugging symbols to no symbols at all.

> Debug symbols, on the other hand, are *always* unnecessary unless you
> are debugging. Moreover, they tend to result in dramatically increased
> package size. Headers are tiny, and docs often are (but we have lint
> checkers to warn us if abnormal packages contain mostly docs, and there
> are several packages that do indeed split out *-docs, so this is not an
> absolute!)
>
> Have you tried building, say, a web browser with debugging symbols?

Sorry, I did not mean to argue that absolutely all executables must be
installed with debugging symbols. The ideal situation I am imagining
is that if a packager thinks the debugging symbols would be too much
for some executable in the package, she simply disables them and
enables stripping for the whole package. But most executables are
small and stripping their debugging symbols does not gain much.

> No it does not, makepkg handles this transparently with absolutely no
> effort on the part of the maintainer.

I was actually referring to the fact that this feature was not
available before because of libalpm limitations (I think that required
hooks or something, and was only added recently?). Anyway, I am not
saying this is some great issue, but it certainly somewhat increases
complexity of some Arch projects. But maybe that complexity is good if
it is not exclusively needed for this usecase, thus on further thought
I probably should have done more research before raising this
particular point.

> Perhaps libbacktrace should work more like gdb then? It works fine with
> gdb, and the ELF metadata has .gnu_debuglink for this exact purpose --
> it's fundamental to binutils, see the objcopy manpage for example.

I assumed libbacktrace could not do that because of constraints on
memory allocation (whether on stack or on heap) or reentrancy, but
apparently it has that functionality since 2017. Oops.

> You're saying it's "harder and more complicated" to use detached debug
> symbols, but I'm really not seeing it.

Depending on an arbitrary file determined by a path is complicated,
there are all kinds of concerns, like async-signal-safety (one has to
use open instead of fopen), getting the file before somebody
overwrites it or moves it (or just changes a symlink) ...

> They're *huge*, and the standard gdb, when used to execute a program or
> to inspect a coredump file, can seamlessly merge the detached debug data
> and display enhanced debug info. This works even when you only install
> the split -debug package using pacman, *after* the program crashes. The
> coredump contains all the info you need.
>
> Programs like firefox have extensive upstream tooling for telemetry,
> whereby heavily stripped programs are distributed to end users, and if
> the program crashes it can send the backtrace to Mozilla.org; this
> backtrace is then merged with the debug info which is on Mozilla's
> servers, to produce meaningful output. Users don't have to suffer huge
> downloads.
>
> (Mozilla's symbol server can also be used with a trivial gdb script to
> let gdb download the debug info on-demand, if you're debugging firefox.)
>
> The Arch maintainer for firefox actually does exactly this -- our
> firefox package is stripped, but the symbols are uploaded to Mozilla
> right after makepkg completes.

Well, this is certainly *complicated*. But it is warranted because of
the great size difference. Most packages don't need this and could
include debugging symbols, I think.

To reiterate, I certainly think that split debugging symbols in split
packages in official repos would be an improvement; but I would like
to know why more packages are not built with debugging symbols included.
Do you think that, eg., all packages in "core" being built with
debugging symbols would be OK? Maybe it would be OK if just function
names were included, without source file line info?

Sidenote: Do you know why split debug packages are not yet available?

Regards,
Neven Sajko


Re: [arch-general] Why are Archlinux packages stripped of (debugging) symbols?

2020-01-21 Thread Eli Schwartz via arch-general
On 1/21/20 3:21 PM, Neven Sajko via arch-general wrote:
> Hello,
> 
> Why is it that makepkg strips symbols by default, 

Because Arch Linux's default vendor options for makepkg.conf include the
optional strip option.

> and many packagers
> even make extra effort to get packages stripped; instead of building
> with "-g"? 

Packagers do not go to extra effort for this. makepkg provides this as a
tuneable, and any PKGBUILD is supposed to build with debug symbols when
the "debug" makepkg.conf configuration option is set; if it does not,
then the PKGBUILD has a bug that should be fixed.

> Even Go software, which by Go's design makes use of
> debugging symbols at run time had been stripped as far as I remember
> (although it seems that has changed, thankfully).

There is no "even", here. The golang programming language is not
*atypical*, it should not receive abnormal treatment.

I'm not sure what you mean by "design makes use of debugging symbols at
runtime". They're debug symbols, not runtime logic symbols.

> It is quite nice to have debugging symbols in executables for learning
> and entertainment purposes (seriously, try Ghidra or radare2 once),
> and they are, of course, indispensable when bad luck strikes and one
> actually has to debug.

It is very nice indeed! Splitdebug symbols work fine in gdb, and I
believe in radare2 as well: https://github.com/radareorg/radare2/issues/5758

Of course, archlinux doesn't really provide splitdebug packages by
default, so you cannot generally use them unless you're using your own
packages...

> And there do not seem to be any significant downsides to extra
> symbols, it just means more permanent storage and bandwidth used.
> Especially in view of Arch's existing packaging practice patterns,
> like no "-dev" or "-doc" split packages.

Headers and such are distributed along with the main package because by
definition, they are needed as a core part of the project. Anyone who
wants to build reverse dependencies needs them, the *only* people who
don't need development headers are the people who never build packages
themselves. There's a simple solution for such people: pacman.conf
supports "NoExtract = usr/include/"

Also, "NoExtract = usr/share/doc/" if you do not want the help
documentation which many end users do in fact need.
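
For reference, the two NoExtract settings could be written as the fragment printed by the sketch below. The glob form (`usr/include/*`) is an assumption on my part: pacman.conf accepts shell-style glob patterns for NoExtract, and the function name `noextract_fragment` is purely illustrative. The script only prints the fragment for review and does not touch /etc/pacman.conf.

```sh
#!/bin/sh
# Print a pacman.conf fragment for review; nothing is written to /etc.
# Assumption: glob patterns are used, since NoExtract accepts shell-style
# globs. Paths are relative to the package root, so no leading slash.
noextract_fragment() {
    cat <<'EOF'
[options]
NoExtract = usr/include/*
NoExtract = usr/share/doc/*
EOF
}

noextract_fragment
```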

Debug symbols, on the other hand, are *always* unnecessary unless you
are debugging. Moreover, they tend to result in dramatically increased
package size. Headers are tiny, and docs often are (but we have lint
checkers to warn us if abnormal packages contain mostly docs, and there
are several packages that do indeed split out *-docs, so this is not an
absolute!)

Have you tried building, say, a web browser with debugging symbols?

> I know some developers have some degree of desire for split packages
> with stripped symbols in separate files, but that would indeed be
> inconsistent with the lack of "-dev" or "-doc" packages. More
> importantly, splitting symbols from executable files is most of the
> time a harmful complication: it makes packaging more complicated,

No it does not, makepkg handles this transparently with absolutely no
effort on the part of the maintainer.

In fact, makepkg can programmatically split out debug packages using
trivial logic when it *cannot* do so for development files (which
include more than headers) or documentation (which is sort of kind of
standard except not really), which may well be a contributing factor to
why makepkg supports it at all. ;)

> it
> makes using the separated symbols by humans more complicated, and it
> makes using the debugging symbols from the program they belong to
> harder (ref. Ian Lance Taylor's libbacktrace, which does not work with
> symbols in a separate file, very possibly for reasons fundamental to
> libbacktrace's purpose).

Perhaps libbacktrace should work more like gdb then? It works fine with
gdb, and the ELF metadata has .gnu_debuglink for this exact purpose --
it's fundamental to binutils, see the objcopy manpage for example.

You're saying it's "harder and more complicated" to use detached debug
symbols, but I'm really not seeing it.
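
For readers unfamiliar with the .gnu_debuglink mechanism, this is roughly the sequence the objcopy manpage describes, wrapped in a small function. The name `split_debug` and the FILE.debug layout are assumptions for illustration; makepkg's own implementation may differ in detail.

```sh
#!/bin/sh
# Hypothetical helper: split the debug info of an ELF file into FILE.debug
# and record a .gnu_debuglink section pointing at it. The objcopy and strip
# options used here are documented binutils options.
split_debug() {
    bin=$1
    # 1. Copy only the debug sections into a separate file.
    objcopy --only-keep-debug "$bin" "$bin.debug" &&
    # 2. Remove the debug sections from the original file.
    strip --strip-debug "$bin" &&
    # 3. Store the debug file's name and checksum in the stripped binary.
    objcopy --add-gnu-debuglink="$bin.debug" "$bin"
}
```

After running `split_debug ./myprog` (a placeholder name), `readelf --string-dump=.gnu_debuglink ./myprog` should show the recorded file name, and gdb can pick up the .debug file when it sits next to the binary or under its debug directory.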

> To conclude: besides arguing for debugging symbols to be installed as
> part of executable files, I am honestly asking what are the reasons
> for the apparent aversion towards them in Arch's (and wider) culture
> (because I am curious about that).

They're *huge*, and the standard gdb, when used to execute a program or
to inspect a coredump file, can seamlessly merge the detached debug data
and display enhanced debug info. This works even when you only install
the split -debug package using pacman, *after* the program crashes. The
coredump contains all the info you need.

Programs like firefox have extensive upstream tooling for telemetry,
whereby heavily stripped programs are distributed to end users, and if
the program crashes it can send the backtrace to Mozilla.org; this
backtrace is then merged with the debug info which is on Mozilla's