[
https://issues.apache.org/jira/browse/ARROW-13606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neal Richardson updated ARROW-13606:
------------------------------------
Description:
Background: ARROW-12853 reported that if the C++ code in the R package is built
with {{-flto}}, it compiles and links successfully, but then the package
segfaults on load. ARROW-13199 attempted to fix this by adding {{-fno-lto}} to
the build, and it did work for the ubuntu 21.04 CI job. However, the LTO check
on CRAN failed.
The reason for the difference between the passing ubuntu 21.04 CI job and the
CRAN check turns out to be that ARROW-13199 only actually added {{-fno-lto}} to
the link step. It intended to add the flag to the CXXFLAGS for compiling too, but
it set the wrong variable, so the flag was not passed in. Yet, the build was still
successful because the CXXFLAGS used on the ubuntu 21.04 build included
{{-ffat-lto-objects}} in addition to {{-flto}}. fat-lto-objects can be linked
[with or without
LTO|https://stackoverflow.com/questions/13799452/what-is-the-difference-in-gcc-between-lto-and-fat-lto-objects],
so the {{-fno-lto}} at link time was able to work. The CRAN machine
configuration does not have {{-ffat-lto-objects}} as a compile flag, so the
{{-fno-lto}} in the linker meant that linking would fail.
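To make the difference between the two machines concrete, here is a minimal sketch in Python that models the behaviour described above. It is not GCC's actual logic. It assumes that the last {{-flto}}/{{-fno-lto}} flag on a command line wins, and that objects are "slim" (LTO bytecode only) unless {{-ffat-lto-objects}} is given. The function names are hypothetical.

```python
def lto_requested(flags):
    """Return True if the last -flto / -fno-lto flag in the list enables LTO."""
    enabled = False
    for flag in flags:
        if flag == "-fno-lto":
            enabled = False
        elif flag == "-flto" or flag.startswith("-flto="):
            enabled = True
    return enabled


def object_kind(compile_flags):
    """Classify the object file a compile step would produce (simplified model)."""
    if not lto_requested(compile_flags):
        return "regular"
    fat = False  # assumption: slim objects are the default
    for flag in compile_flags:
        if flag == "-ffat-lto-objects":
            fat = True
        elif flag == "-fno-fat-lto-objects":
            fat = False
    return "fat" if fat else "slim"


def link_outcome(compile_flags, link_flags):
    """Predict the link result for objects built with compile_flags."""
    kind = object_kind(compile_flags)
    if kind == "regular":
        return "non-LTO link"
    if lto_requested(link_flags):
        return "LTO link"
    if kind == "slim":
        # Slim objects carry no machine code, so a non-LTO link cannot use them.
        return "error: plugin needed to handle lto object"
    return "non-LTO link"


# LTO-related flags taken from the two build logs in this description.
ubuntu = link_outcome(["-flto=auto", "-ffat-lto-objects"], ["-flto=auto", "-fno-lto"])
cran = link_outcome(["-flto=10"], ["-fno-lto"])
print("ubuntu 21.04:", ubuntu)
print("CRAN:", cran)
```

Under these assumptions the ubuntu 21.04 flags give a non-LTO link and the CRAN flags give the "plugin needed" error, which matches the two logs.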
There are two solutions: fix the segfault-on-load and allow LTO, or properly
disable LTO. This issue is to do the latter; we are continuing to work on the
former in a separate issue. ARROW-13507 tried to fix this by setting {{UseLTO:
false}} in {{DESCRIPTION}}, but it turns out that the CRAN check overrides that
flag and forces LTO anyway, so that isn't a viable option. We also have to add
{{-ffat-lto-objects}} instead of {{-fno-lto}}, because {{R CMD INSTALL
--use-LTO}} appends {{-flto}} at the end, which seems to override a {{-fno-lto}}
set in the package Makevars. {{-ffat-lto-objects}}, by contrast, complements
{{-flto}} and lets the package link with or without LTO.
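The ordering argument can be sketched in a few lines of Python. This is an illustration only: it assumes a "last flag wins" rule for {{-flto}}/{{-fno-lto}} (the behaviour observed above, not a checked guarantee) and represents {{R CMD INSTALL --use-LTO}} as appending a bare {{-flto}}. The helper name {{summarize}} is hypothetical.

```python
def summarize(flags):
    """Return (lto_enabled, fat_objects) for a flag list, assuming last flag wins."""
    lto = False
    fat = False
    for flag in flags:
        if flag == "-fno-lto":
            lto = False
        elif flag == "-flto" or flag.startswith("-flto="):
            lto = True
        elif flag == "-ffat-lto-objects":
            fat = True
        elif flag == "-fno-fat-lto-objects":
            fat = False
    return lto, fat


# Assumption: --use-LTO adds -flto after whatever the package Makevars sets.
appended_by_r = ["-flto"]

# Attempt 1: -fno-lto in Makevars is cancelled by the later -flto.
print(summarize(["-fno-lto"] + appended_by_r))

# Attempt 2: -ffat-lto-objects is a separate switch, so it survives the append.
print(summarize(["-ffat-lto-objects"] + appended_by_r))
```

In this model the first case ends with LTO enabled and slim objects, while the second ends with LTO enabled and fat objects, which can then be linked either way.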
Example output from the ubuntu-21.04 build (compiles with {{-flto=auto
-ffat-lto-objects}}, links with {{-flto=auto -fno-lto}}, result is a successful
non-LTO library):
{code}
...
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG
-I/arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-5.0.0.9000/include
-Werror -DARROW_R_WITH_ARROW -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET
-I'/usr/local/lib/R/site-library/cpp11/include' -fpic -g -O2
-ffile-prefix-map=/build/r-base-aXXzqd/r-base-4.1.0=. -flto=auto
-ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security
-Wdate-time -D_FORTIFY_SOURCE=2 -g -c type_infer.cpp -o type_infer.o
g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto
-Wl,-z,relro -o arrow.so RTasks.o altrep.o array.o array_to_vector.o
arraydata.o arrowExports.o buffer.o chunkedarray.o compression.o compute-exec.o
compute.o config.o csv.o dataset.o datatype.o expression.o feather.o field.o
filesystem.o imports.o io.o json.o memorypool.o message.o parquet.o py-to-r.o
r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o scalar.o
schema.o symbols.o table.o threadpool.o type_infer.o
-L/arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-5.0.0.9000/lib
-larrow_dataset -lparquet -larrow -larrow -larrow_bundled_dependencies
-larrow_dataset -lparquet -fno-lto -L/usr/lib/R/lib -lR
...
{code}
Example output from the CRAN check (compiles with {{-flto=10}}, links with
{{-fno-lto}}, so it fails):
{code}
...
g++ -std=gnu++11 -I"/data/gannet/ripley/R/R-devel/include" -DNDEBUG
-I/data/gannet/ripley/R/packages/tests-LTO/arrow/libarrow/arrow-5.0.0/include
-DARROW_R_WITH_ARROW -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET
-I'/data/gannet/ripley/R/test-4.2/cpp11/include' -I/usr/local/include -fpic
-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-parentheses
-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector-strong -fstack-clash-protection -fcf-protection -flto=10 -c
type_infer.cpp -o type_infer.o
g++ -std=gnu++11 -shared -L/usr/local/lib64 -o arrow.so RTasks.o altrep.o
array.o array_to_vector.o arraydata.o arrowExports.o buffer.o chunkedarray.o
compression.o compute.o config.o csv.o dataset.o datatype.o expression.o
feather.o field.o filesystem.o imports.o io.o json.o memorypool.o message.o
parquet.o py-to-r.o r_to_arrow.o recordbatch.o recordbatchreader.o
recordbatchwriter.o scalar.o schema.o symbols.o table.o threadpool.o
type_infer.o
-L/data/gannet/ripley/R/packages/tests-LTO/arrow/libarrow/arrow-5.0.0/lib
-larrow_dataset -lparquet -larrow -larrow -larrow_bundled_dependencies
-larrow_dataset -lparquet -fno-lto
/usr/bin/ld: RTasks.o: plugin needed to handle lto object
/usr/bin/ld: RTasks.o: plugin needed to handle lto object
/usr/bin/ld: altrep.o: plugin needed to handle lto object
...
{code}
was:
Background: ARROW-12853 reported that if the C++ code in the R package is built
with {{-flto}}, it compiles and links successfully, but then the package
segfaults on load. ARROW-13199 attempted to fix this by adding {{-fno-lto}} to
the build, and it did work for the ubuntu 21.04 CI job. However, the LTO check
on CRAN failed.
The reason for the difference between the passing ubuntu 21.04 CI job and the
CRAN check turns out to be that ARROW-13199 only actually added {{-fno-lto}} to
the link step. It intended to add them to the CXXFLAGS for compiling too, but
it set the wrong variable, so they were not passed in. Yet, the build was still
successful because the CXXFLAGS used on the ubuntu 21.04 build included
{{-ffat-lto-objects}} in addition to {{-flto}}. fat-lto-objects can be linked
[with or without
LTO|https://stackoverflow.com/questions/13799452/what-is-the-difference-in-gcc-between-lto-and-fat-lto-objects],
so the {{-fno-lto}} at link time was able to work. The CRAN machine
configuration does not have {{-ffat-lto-objects}} as a compile flag, so the
{{-fno-lto}} in the linker meant that linking would fail.
There are two solutions: fix the segfault-on-load and allow LTO or properly
disable LTO. This issue is to do the latter; we are continuing to work on the
former in a separate issue. ARROW-13507 tried to fix this by setting {{UseLTO:
false}} in {{DESCRIPTION}}, but it turns out that the CRAN check overrides that
flag and forces LTO anyway, so that isn't a viable option.
Example output from the ubuntu-21.04 build (compiles with {{-flto=auto
-ffat-lto-objects}}, links with {{-flto=10 -fno-lto}}, result is successful
non-LTO library
{code}
...
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG
-I/arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-5.0.0.9000/include
-Werror -DARROW_R_WITH_ARROW -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET
-I'/usr/local/lib/R/site-library/cpp11/include' -fpic -g -O2
-ffile-prefix-map=/build/r-base-aXXzqd/r-base-4.1.0=. -flto=auto
-ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security
-Wdate-time -D_FORTIFY_SOURCE=2 -g -c type_infer.cpp -o type_infer.o
g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto
-Wl,-z,relro -o arrow.so RTasks.o altrep.o array.o array_to_vector.o
arraydata.o arrowExports.o buffer.o chunkedarray.o compression.o compute-exec.o
compute.o config.o csv.o dataset.o datatype.o expression.o feather.o field.o
filesystem.o imports.o io.o json.o memorypool.o message.o parquet.o py-to-r.o
r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o scalar.o
schema.o symbols.o table.o threadpool.o type_infer.o
-L/arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-5.0.0.9000/lib
-larrow_dataset -lparquet -larrow -larrow -larrow_bundled_dependencies
-larrow_dataset -lparquet -fno-lto -L/usr/lib/R/lib -lR
...
{code}
Example output from the CRAN check (compiles with {{-flto=10}}, links with
{{-fno-lto}}, so it fails:
{code}
...
g++ -std=gnu++11 -I"/data/gannet/ripley/R/R-devel/include" -DNDEBUG
-I/data/gannet/ripley/R/packages/tests-LTO/arrow/libarrow/arrow-5.0.0/include
-DARROW_R_WITH_ARROW -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET
-I'/data/gannet/ripley/R/test-4.2/cpp11/include' -I/usr/local/include -fpic
-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-parentheses
-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector-strong -fstack-clash-protection -fcf-protection -flto=10 -c
type_infer.cpp -o type_infer.o
g++ -std=gnu++11 -shared -L/usr/local/lib64 -o arrow.so RTasks.o altrep.o
array.o array_to_vector.o arraydata.o arrowExports.o buffer.o chunkedarray.o
compression.o compute.o config.o csv.o dataset.o datatype.o expression.o
feather.o field.o filesystem.o imports.o io.o json.o memorypool.o message.o
parquet.o py-to-r.o r_to_arrow.o recordbatch.o recordbatchreader.o
recordbatchwriter.o scalar.o schema.o symbols.o table.o threadpool.o
type_infer.o
-L/data/gannet/ripley/R/packages/tests-LTO/arrow/libarrow/arrow-5.0.0/lib
-larrow_dataset -lparquet -larrow -larrow -larrow_bundled_dependencies
-larrow_dataset -lparquet -fno-lto
/usr/bin/ld: RTasks.o: plugin needed to handle lto object
/usr/bin/ld: RTasks.o: plugin needed to handle lto object
/usr/bin/ld: altrep.o: plugin needed to handle lto object
...
{code}
> [R] Actually disable LTO
> ------------------------
>
> Key: ARROW-13606
> URL: https://issues.apache.org/jira/browse/ARROW-13606
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Reporter: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 6.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)