That makes sense to me. 500MB is a bit excessive (well, 94MB also sounds 
excessive but I suppose that's LLVM).

Maybe we can host the separated debug symbols as part of the GitHub release 
instead?
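
For reference, a minimal sketch of how that split could look as a CI step,
following the objcopy(1) procedure Kou quoted below. The library name, the
".debug" extension, and the OBJCOPY / DRY_RUN variables are assumptions made
up for this illustration; they are not existing Arrow build options.

```shell
#!/bin/sh
# Sketch only: split the debug info out of a fully linked shared library.
# OBJCOPY and DRY_RUN are hypothetical knobs for this example.

run() {
  # Print the command; execute it only when DRY_RUN is not set to 1.
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

split_debug() {
  lib="$1"            # hypothetical input, e.g. build/libexample_jni.so
  dbg="${lib}.debug"  # the extension is arbitrary, per objcopy(1)
  objcopy="${OBJCOPY:-objcopy}"

  # 1. Copy only the debugging sections into a separate file.
  run "$objcopy" --only-keep-debug "$lib" "$dbg" &&
  # 2. Strip the debugging sections from the library that gets shipped.
  run "$objcopy" --strip-debug "$lib" &&
  # 3. Record a link to the debug file inside the stripped library.
  run "$objcopy" --add-gnu-debuglink="$dbg" "$lib"
}

# Usage (dry run, prints the three commands without touching the file):
#   DRY_RUN=1 split_debug build/libexample_jni.so
```

The stripped library would stay in the regular artifact, and the ".debug"
file is what could be uploaded to the GitHub release.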

re: runner space, something like this should work [1].

[1]: https://github.com/apache/arrow-adbc/blob/4823ca2fe68e6b113414cf5b6787ce662b52e0b6/.github/workflows/nightly-verify.yml#L161-L163

On Thu, Mar 13, 2025, at 02:05, Jean-Baptiste Onofré wrote:
> Hi
>
> Yes, agree with Kou and Jacob: we should probably strip and "square"
> to only keep debug.
>
> Thoughts ?
>
> Regards
> JB
>
> On Wed, Mar 12, 2025 at 12:41 PM Jacob Wujciak <assignu...@apache.org> wrote:
>>
>> Thanks for testing this, Logan! We probably have to clear some more
>> disk space on the runner. This is a problem we have had before: they
>> keep adding things to the image.
>>
>> A 5x increase seems a bit much to integrate into the normal builds, so
>> Kou's approach is probably best?
>>
>> On Wed, Mar 12, 2025 at 05:48, Logan Riggs
>> <logan.ri...@dremio.com.invalid> wrote:
>> >
>> > I did a quick test building with RelWithDebInfo. Unfortunately the
>> > arrow-java GitHub runners ran out of space for the Unix builds.
>> >
>> > Instead I built locally, and for Arm64 Linux the gandiva .so file is 573 MB
>> > vs 94 MB for a release build.
>> >
>> > On Tue, Mar 11, 2025 at 6:19 PM Sutou Kouhei <k...@clear-code.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > It seems that deb (dh_strip) uses objcopy for this.
>> > >
>> > > See also the "--only-keep-debug" option section in
>> > > objcopy(1):
>> > >
>> > > https://man7.org/linux/man-pages/man1/objcopy.1.html
>> > >
>> > > > --only-keep-debug
>> > > >     Strip a file, removing contents of any sections that would not
>> > > >     be stripped by --strip-debug and leaving the debugging
>> > > >     sections intact.  In ELF files, this preserves all note
>> > > >     sections in the output.
>> > > >
>> > > >     Note - the section headers of the stripped sections are
>> > > >     preserved, including their sizes, but the contents of the
>> > > >     section are discarded.  The section headers are preserved so
>> > > >     that other tools can match up the debuginfo file with the real
>> > > >     executable, even if that executable has been relocated to a
>> > > >     different address space.
>> > > >
>> > > >     The intention is that this option will be used in conjunction
>> > > >     with --add-gnu-debuglink to create a two part executable.  One
>> > > >     a stripped binary which will occupy less space in RAM and in a
>> > > >     distribution and the second a debugging information file which
>> > > >     is only needed if debugging abilities are required.  The
>> > > >     suggested procedure to create these files is as follows:
>> > > >
>> > > >     1.  Link the executable as normal.  Assuming that it is called
>> > > >         "foo" then...
>> > > >
>> > > >     2.  Run "objcopy --only-keep-debug foo foo.dbg" to
>> > > >         create a file containing the debugging info.
>> > > >
>> > > >     3.  Run "objcopy --strip-debug foo" to create a
>> > > >         stripped executable.
>> > > >
>> > > >     4.  Run "objcopy --add-gnu-debuglink=foo.dbg foo"
>> > > >         to add a link to the debugging info into the stripped
>> > > >         executable.
>> > > >
>> > > >     Note - the choice of ".dbg" as an extension for the debug info
>> > > >     file is arbitrary.  Also the "--only-keep-debug" step is
>> > > >     optional.  You could instead do this:
>> > > >
>> > > >     1.  Link the executable as normal.
>> > > >     2.  Copy "foo" to "foo.full"
>> > > >     3.  Run "objcopy --strip-debug foo"
>> > > >     4.  Run "objcopy --add-gnu-debuglink=foo.full foo"
>> > > >
>> > > >     i.e., the file pointed to by the --add-gnu-debuglink can be
>> > > >     the full executable.  It does not have to be a file created by
>> > > >     the --only-keep-debug switch.
>> > > >
>> > > >     Note - this switch is only intended for use on fully linked
>> > > >     files.  It does not make sense to use it on object files where
>> > > >     the debugging information may be incomplete.  Besides the
>> > > >     gnu_debuglink feature currently only supports the presence of
>> > > >     one filename containing debugging information, not multiple
>> > > >     filenames on a one-per-object-file basis.
>> > >
>> > > I'm not sure whether there is a similar tool for macOS or not...
>> > >
>> > > We can use .pdb on Windows. We can install the .pdb files with
>> > > something like the following:
>> > >
>> > > ----
>> > > diff --git a/cpp/cmake_modules/BuildUtils.cmake b/cpp/cmake_modules/BuildUtils.cmake
>> > > index 90839cb446..0757e3cf81 100644
>> > > --- a/cpp/cmake_modules/BuildUtils.cmake
>> > > +++ b/cpp/cmake_modules/BuildUtils.cmake
>> > > @@ -424,6 +424,11 @@ function(ADD_ARROW_LIB LIB_NAME)
>> > >              RUNTIME DESTINATION ${INSTALL_RUNTIME_DIR}
>> > >              INCLUDES
>> > >              DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
>> > > +    if(MSVC)
>> > > +      install(FILES $<TARGET_PDB_FILE:${LIB_NAME}_shared>
>> > > +              DESTINATION "${INSTALL_RUNTIME_DIR}"
>> > > +              OPTIONAL)
>> > > +    endif()
>> > >    endif()
>> > >
>> > >    if(BUILD_STATIC)
>> > > ----
>> > >
>> > >
>> > > Thanks,
>> > > --
>> > > kou
>> > >
>> > > In <CAB8EV3TYN=d-+guwjddnmjaqkokrtm6aa5mtpmv8zqqclrk...@mail.gmail.com>
>> > >   "[DISCUSS] Arrow Java: add RelWithDebInfo builds support" on Tue, 11 Mar 2025 09:47:07 +0100,
>> > >   Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>> > >
>> > > > Hi folks,
>> > > >
>> > > > I discussed this on Zulip with some of you, but I would like to
>> > > > bring the discussion to the dev mailing list to get the whole
>> > > > community involved.
>> > > >
>> > > > I propose to add DEBUG symbols to Arrow Java JNI and release/provide
>> > > > the corresponding artifacts (with specific Maven classifier).
>> > > >
>> > > > Today, we don't provide JNI artifacts with DEBUG symbols, meaning
>> > > > that, if a crash occurs (for any reason), the user has very little
>> > > > to go on when looking for the cause.
>> > > >
>> > > > The proposal is to provide new artifacts built with DEBUG symbols,
>> > > > for instance under a debug Maven classifier (the existing/current
>> > > > artifacts are still there, nothing changes here).
>> > > >
>> > > > We can ship RelWithDebInfo builds instead of Release builds (deb/RPM
>> > > > split debug symbols into separate files and package them as separate
>> > > > packages, for instance).
>> > > > We can mimic this for JNI (adding a new CI/tool).
>> > > >
>> > > > Thoughts ?
>> > > >
>> > > > Thanks !
>> > > > Regards
>> > > > JB
>> > >
