[jira] [Commented] (IMPALA-11511) Provide an option to build with compressed debug info

2024-04-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842168#comment-17842168
 ] 

ASF subversion and git services commented on IMPALA-11511:
--

Commit 56f35ad40a7ded4a700222eef8dc99cf7ea44625 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=56f35ad40 ]

IMPALA-12684: Enable IMPALA_COMPRESSED_DEBUG_INFO by default

IMPALA_COMPRESSED_DEBUG_INFO was introduced in IMPALA-11511
and reduces Impala binary sizes by >50%. Debug tools like
gdb and our minidump processing scripts handle compressed
debug information properly. There are slightly higher link
times and additional overhead when doing debugging.

Overall, the reduction in binary sizes seems worth it given
the modest overhead. Compressing the debug information also
avoids concerns that adding debug information to toolchain
components would increase binary sizes.

This changes the default for IMPALA_COMPRESSED_DEBUG_INFO to
true.

Testing:
 - Ran pstack on a Centos 7 machine running tests with
   IMPALA_COMPRESSED_DEBUG_INFO=true and verified that
   the symbols work properly
 - Forced the production of minidumps for a job using
   IMPALA_COMPRESSED_DEBUG_INFO=true and verified it is
   processed properly.
 - Used this locally for development for several months

Change-Id: I31640f1453d351b11644bb46af3d2158b22af5b3
Reviewed-on: http://gerrit.cloudera.org:8080/20871
Reviewed-by: Quanlong Huang 
Tested-by: Impala Public Jenkins 


> Provide an option to build with compressed debug info
> -
>
> Key: IMPALA-11511
> URL: https://issues.apache.org/jira/browse/IMPALA-11511
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> For builds with debug information, the debug information is often a large 
> portion of the binary size. There is a feature that compresses the debug info 
> using ZLIB via the "-gz" compilation flag. It makes a very large difference 
> to the size of our binaries:
> {noformat}
> GCC 7.5:
> debug: 726767520
> debug with -gz: 325970776
> release: 707911496
> with with -gz: 301671026
> GCC 10.4:
> debug: 870378832
> debug with -gz: 351442253
> release: 974600536
> release with -gz: 367938487{noformat}
> The size reduction would be useful for developers, but support in other tools 
> is mixed. gdb has support and seems to work fine. breakpad does not have 
> support. However, it is easy to convert a binary with compressed debug 
> symbols to one with normal debug symbols using objcopy:
> {noformat}
> objcopy --decompress-debug-sections [in_binary] [out_binary]{noformat}
> Given a minidump, it is possible to run objcopy to decompress the debug 
> symbols for the original binary, dump the breakpad symbols, and then process 
> the minidump successfully. So, it should be possible to modify 
> bin/dump_breakpad_symbols.py to do this automatically.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-11511) Provide an option to build with compressed debug info

2024-01-29 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812090#comment-17812090
 ] 

Joe McDonnell commented on IMPALA-11511:


Ignore the previous update, IMPALA-11511 did not cause IMPALA-12745.

> Provide an option to build with compressed debug info
> -
>
> Key: IMPALA-11511
> URL: https://issues.apache.org/jira/browse/IMPALA-11511
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> For builds with debug information, the debug information is often a large 
> portion of the binary size. There is a feature that compresses the debug info 
> using ZLIB via the "-gz" compilation flag. It makes a very large difference 
> to the size of our binaries:
> {noformat}
> GCC 7.5:
> debug: 726767520
> debug with -gz: 325970776
> release: 707911496
> with with -gz: 301671026
> GCC 10.4:
> debug: 870378832
> debug with -gz: 351442253
> release: 974600536
> release with -gz: 367938487{noformat}
> The size reduction would be useful for developers, but support in other tools 
> is mixed. gdb has support and seems to work fine. breakpad does not have 
> support. However, it is easy to convert a binary with compressed debug 
> symbols to one with normal debug symbols using objcopy:
> {noformat}
> objcopy --decompress-debug-sections [in_binary] [out_binary]{noformat}
> Given a minidump, it is possible to run objcopy to decompress the debug 
> symbols for the original binary, dump the breakpad symbols, and then process 
> the minidump successfully. So, it should be possible to modify 
> bin/dump_breakpad_symbols.py to do this automatically.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-11511) Provide an option to build with compressed debug info

2024-01-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812086#comment-17812086
 ] 

ASF subversion and git services commented on IMPALA-11511:
--

Commit 41a3f4d4ca43092d0ef48eeaa765626b720e986c in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=41a3f4d4c ]

IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages

When using bin/dump_breakpad_symbols.py to dump symbols for RPM/DEB
packages, the script extracts the packages to a temporary directory
and relies on keeping that directory around until the processing
is finished. The parallel processing added in IMPALA-11511 breaks
the logic that keeps the temporary directory around, so the script
generates errors like:

Found debugging info in 
/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug
Failed to open ELF file 
'/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug': No 
such file or directory
Failed to write symbol file.

This turns off parallelism for bin/dump_breakpad_symbols.py when
processing RPM/DEB packages (i.e. -r/--pkg). This also avoids using
a ThreadPool when num_processes <= 1.

Testing:
 - Hand tested with Redhat 7 RPMs

Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3
Reviewed-on: http://gerrit.cloudera.org:8080/20943
Reviewed-by: Impala Public Jenkins 
Tested-by: Joe McDonnell 


> Provide an option to build with compressed debug info
> -
>
> Key: IMPALA-11511
> URL: https://issues.apache.org/jira/browse/IMPALA-11511
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> For builds with debug information, the debug information is often a large 
> portion of the binary size. There is a feature that compresses the debug info 
> using ZLIB via the "-gz" compilation flag. It makes a very large difference 
> to the size of our binaries:
> {noformat}
> GCC 7.5:
> debug: 726767520
> debug with -gz: 325970776
> release: 707911496
> with with -gz: 301671026
> GCC 10.4:
> debug: 870378832
> debug with -gz: 351442253
> release: 974600536
> release with -gz: 367938487{noformat}
> The size reduction would be useful for developers, but support in other tools 
> is mixed. gdb has support and seems to work fine. breakpad does not have 
> support. However, it is easy to convert a binary with compressed debug 
> symbols to one with normal debug symbols using objcopy:
> {noformat}
> objcopy --decompress-debug-sections [in_binary] [out_binary]{noformat}
> Given a minidump, it is possible to run objcopy to decompress the debug 
> symbols for the original binary, dump the breakpad symbols, and then process 
> the minidump successfully. So, it should be possible to modify 
> bin/dump_breakpad_symbols.py to do this automatically.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-11511) Provide an option to build with compressed debug info

2022-09-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609617#comment-17609617
 ] 

ASF subversion and git services commented on IMPALA-11511:
--

Commit 10c19b1a5730a898e17cc653be6bd19f0dc3340e in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=10c19b1a5 ]

IMPALA-11511: Add build options for reducing binary sizes

Impala's build produces dozens of C++ binaries
that link in all Impala libraries. Each binary is
hundreds of megabytes, leading to 10s of gigabytes
of disk space. A large proportion of this (~80%) is debug
information. The debug information increases in newer
versions of GCC such as GCC 10.

This introduces two options for reducing the size
of debug information:
 - IMPALA_MINIMAL_DEBUG_INFO=true builds Impala with
   minimal debug information (-g1). This contains line tables
   and can resolve backtraces, but it does not contain
   variable information and restricts further debugging.
 - IMPALA_COMPRESSED_DEBUG_INFO=true builds Impala with
   compressed debug information (-gz). This does not change
   the debug information included, but the compression saves
   significant disk space. gdb is known to work with
   compressed debug information, but other tools may not
   support it. The dump_breakpad_symbols.py script has been
   adjusted to handle these binaries.
These are disabled by default.

Release impalad binary sizes:
Configuration  | Size (bytes) | % reduction over base
Base   | 707834808| N/A
Stripped   |  83351664| 88%
Minimal debuginfo  | 215924096| 69%
Compressed debuginfo   | 301619286| 57%
Minimal + compressed debuginfo | 120886705| 83%

Testing:
 - Generated minidumps and resolved them
 - Verified this is disabled by default

Change-Id: I04a20258a86053d8f3972b9c7c81cd5bec1bbb66
Reviewed-on: http://gerrit.cloudera.org:8080/18962
Reviewed-by: Michael Smith 
Reviewed-by: Wenzhe Zhou 
Tested-by: Impala Public Jenkins 


> Provide an option to build with compressed debug info
> -
>
> Key: IMPALA-11511
> URL: https://issues.apache.org/jira/browse/IMPALA-11511
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Priority: Major
>
> For builds with debug information, the debug information is often a large 
> portion of the binary size. There is a feature that compresses the debug info 
> using ZLIB via the "-gz" compilation flag. It makes a very large difference 
> to the size of our binaries:
> {noformat}
> GCC 7.5:
> debug: 726767520
> debug with -gz: 325970776
> release: 707911496
> with with -gz: 301671026
> GCC 10.4:
> debug: 870378832
> debug with -gz: 351442253
> release: 974600536
> release with -gz: 367938487{noformat}
> The size reduction would be useful for developers, but support in other tools 
> is mixed. gdb has support and seems to work fine. breakpad does not have 
> support. However, it is easy to convert a binary with compressed debug 
> symbols to one with normal debug symbols using objcopy:
> {noformat}
> objcopy --decompress-debug-sections [in_binary] [out_binary]{noformat}
> Given a minidump, it is possible to run objcopy to decompress the debug 
> symbols for the original binary, dump the breakpad symbols, and then process 
> the minidump successfully. So, it should be possible to modify 
> bin/dump_breakpad_symbols.py to do this automatically.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org