[
https://issues.apache.org/jira/browse/IMPALA-12689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838398#comment-17838398
]
Joe McDonnell commented on IMPALA-12689:
----------------------------------------
Fixed by:
{noformat}
commit cd9260e5276d0e342b21869c51e71aea9643504c
Author: Joe McDonnell <[email protected]>
Date: Thu Feb 15 18:22:15 2024 -0800

IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

The TPC-H and TPC-DS builds currently do not respect the
CFLAGS environment variable, so they don't incorporate the
values that we set in init-compiler.sh.

This modifies the build scripts for TPC-H and TPC-DS to
patch their makefiles to add our CFLAGS. This has the
side effect of turning on -O3 optimization, resulting
in faster binaries used to generate the TPC-H and
TPC-DS datasets:

TPC-H's dbgen at scale 42:
  Unoptimized: 4m46.269s
  Optimized: 3m46.379s
TPC-DS's dsdgen at scale 20:
  Unoptimized: 9m41.441s
  Optimized: 7m25.017s

Testing:
- Ran a build and verified that the flags include our
  CFLAGS value
Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Reviewed-on: http://gerrit.cloudera.org:8080/21111
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Joe McDonnell <[email protected]>
{noformat}
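The patch itself is not reproduced in this message. For readers unfamiliar with the underlying problem: in GNU make, a variable assigned inside a makefile takes precedence over an environment variable of the same name (unless make is run with -e), so a makefile that sets CFLAGS itself silently ignores the exported value. The following is only a minimal sketch of one way to work around that, not the change made in the commit. The helper name append_env_cflags and the choice to append a "CFLAGS +=" line (rather than edit the existing assignment) are assumptions made for illustration.

```shell
# Hypothetical helper (not taken from the commit): make a makefile honor the
# CFLAGS value exported in the environment by appending it to the file.
append_env_cflags() {
  local makefile="$1"
  local extra="${CFLAGS:-}"

  # Nothing to add if the environment does not define CFLAGS.
  [ -n "$extra" ] || return 0

  # A trailing "CFLAGS +=" line extends whatever the makefile assigned
  # earlier, so the original flags are kept and ours are added after them.
  printf '\nCFLAGS += %s\n' "$extra" >> "$makefile"
}

# Example call (the path is a placeholder):
#   CFLAGS="-O3" append_env_cflags path/to/makefile
```

Note that this sketch is not idempotent: calling it twice on the same file appends the flags twice, so a real build script would want to patch a fresh copy of the sources each time.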
> Toolchain TPC-H and TPC-DS binaries are not built with optimizations
> --------------------------------------------------------------------
>
> Key: IMPALA-12689
> URL: https://issues.apache.org/jira/browse/IMPALA-12689
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 4.4.0
> Reporter: Joe McDonnell
> Priority: Major
>
> The tpc-h and tpc-ds components of the toolchain do not enable any kind of
> compiler optimization flags. This is irrelevant to Impala's shipped binary,
> but it does impact the performance of the data generators for TPC-H and
> TPC-DS. Turning on -O3 seems to improve the data generation time by ~25%.
> {noformat}
> ##### TPC-H ########
> # Unoptimized
> $ time ./dbgen -f -s 42
> TPC-H Population Generator (Version 2.17.0)
> Copyright Transaction Processing Performance Council 1994 - 2010
> real 4m46.269s
> user 4m20.982s
> sys 0m19.390s
> # -O3
> $ time ./dbgen -f -s 42
> TPC-H Population Generator (Version 2.17.0)
> Copyright Transaction Processing Performance Council 1994 - 2010
> real 3m46.379s
> user 3m23.721s
> sys 0m18.436s
> ##### TPC-DS #######
> # Unoptimized
> $ time ./dsdgen -force -scale 20
> DBGEN2 Population Generator (Version 2.0.0)
> Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
> Warning: Selected scale factor is NOT valid for result publication
> real 9m41.441s
> user 8m3.447s
> sys 1m37.944s
> # -O3
> $ time ./dsdgen -force -scale 20
> DBGEN2 Population Generator (Version 2.0.0)
> Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
> Warning: Selected scale factor is NOT valid for result publication
> real 7m25.017s
> user 5m48.487s
> sys 1m36.265s
> {noformat}
> We should modify the toolchain to add -O3 to these builds.
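As a quick check of the improvement figure, the "real" timings quoted above can be turned into a percentage of wall-clock time saved. This is a small sketch using awk; the function names to_seconds and pct_saved are made up for this example. By this measure the savings come out a little under the ~25% estimate, at roughly 21% for dbgen and 23.5% for dsdgen.

```shell
# Convert a `time`-style duration such as 4m46.269s into seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN {
    split(t, parts, "m")       # parts[1] = minutes, parts[2] = seconds + "s"
    sub(/s$/, "", parts[2])    # drop the trailing unit
    printf "%.3f\n", parts[1] * 60 + parts[2]
  }'
}

# Percentage of wall-clock time saved going from duration $1 to duration $2.
pct_saved() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

pct_saved 4m46.269s 3m46.379s   # TPC-H dbgen, scale 42   -> 20.9
pct_saved 9m41.441s 7m25.017s   # TPC-DS dsdgen, scale 20 -> 23.5
```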
--
This message was sent by Atlassian Jira
(v8.20.10#820010)