This is an automated email from the ASF dual-hosted git repository.
wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 45dc301 ARROW-2256: [C++] Fix libfuzzer builds for clang-7
45dc301 is described below
commit 45dc3013c7a9c8d9a3f9ce5701c6e2b40725c4a1
Author: Marco Neumann <[email protected]>
AuthorDate: Fri Jun 7 13:11:30 2019 -0500
ARROW-2256: [C++] Fix libfuzzer builds for clang-7
Author: Marco Neumann <[email protected]>
Author: Wes McKinney <[email protected]>
Closes #4496 from crepererum/ARROW-2256 and squashes the following commits:
75f7ef3ba <Wes McKinney> Add option to set prefixes on fuzzing executables
bb82b0857 <Marco Neumann> improve fuzzer docs
4d98ce625 <Marco Neumann> extend coverage infos to be more useful for fuzzing
e91fcbe8c <Marco Neumann> fix coverage information to work with fuzzing
---
cpp/cmake_modules/BuildUtils.cmake | 29 ++++++++++++----
cpp/cmake_modules/san-config.cmake | 9 +++--
cpp/src/arrow/ipc/CMakeLists.txt | 2 +-
.../ipc/{ipc-fuzzing-test.cc => fuzzing-test.cc} | 0
docs/source/developers/cpp.rst | 39 ++++++++++++++++++----
5 files changed, 64 insertions(+), 15 deletions(-)
diff --git a/cpp/cmake_modules/BuildUtils.cmake b/cpp/cmake_modules/BuildUtils.cmake
index 5f04254..781cedc 100644
--- a/cpp/cmake_modules/BuildUtils.cmake
+++ b/cpp/cmake_modules/BuildUtils.cmake
@@ -668,24 +668,41 @@ endfunction()
# No main function must be present within the source file!
#
function(ADD_ARROW_FUZZING REL_FUZZING_NAME)
+ set(options)
+ set(one_value_args)
+ set(multi_value_args PREFIX)
+ cmake_parse_arguments(ARG
+ "${options}"
+ "${one_value_args}"
+ "${multi_value_args}"
+ ${ARGN})
+ if(ARG_UNPARSED_ARGUMENTS)
+ message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
+ endif()
+
if(NO_FUZZING)
return()
endif()
+ get_filename_component(FUZZING_NAME ${REL_FUZZING_NAME} NAME_WE)
+
+ if(ARG_PREFIX)
+ set(FUZZING_NAME "${ARG_PREFIX}-${FUZZING_NAME}")
+ endif()
+
if(ARROW_BUILD_STATIC)
set(FUZZ_LINK_LIBS arrow_static)
else()
set(FUZZ_LINK_LIBS arrow_shared)
endif()
- add_executable(${REL_FUZZING_NAME} "${REL_FUZZING_NAME}.cc")
- target_link_libraries(${REL_FUZZING_NAME} ${FUZZ_LINK_LIBS})
- target_compile_options(${REL_FUZZING_NAME} PRIVATE "-fsanitize=fuzzer")
- set_target_properties(${REL_FUZZING_NAME} PROPERTIES LINK_FLAGS "-fsanitize=fuzzer")
+ add_executable(${FUZZING_NAME} "${REL_FUZZING_NAME}.cc")
+ target_link_libraries(${FUZZING_NAME} ${FUZZ_LINK_LIBS})
+ target_compile_options(${FUZZING_NAME} PRIVATE "-fsanitize=fuzzer")
+ set_target_properties(${FUZZING_NAME}
+ PROPERTIES LINK_FLAGS "-fsanitize=fuzzer" LABELS "fuzzing")
endfunction()
-#
-
function(ARROW_INSTALL_ALL_HEADERS PATH)
set(options)
set(one_value_args)
diff --git a/cpp/cmake_modules/san-config.cmake b/cpp/cmake_modules/san-config.cmake
index 95ef553..2facc39 100644
--- a/cpp/cmake_modules/san-config.cmake
+++ b/cpp/cmake_modules/san-config.cmake
@@ -92,9 +92,14 @@ if(${ARROW_USE_COVERAGE})
if(NOT ("${COMPILER_FAMILY}" STREQUAL "clang"))
message(SEND_ERROR "You can only enable coverage with clang")
endif()
- add_definitions("-fsanitize-coverage=trace-pc-guard")
+ add_definitions(
+ "-fsanitize-coverage=pc-table,inline-8bit-counters,edge,no-prune,trace-cmp,trace-div,trace-gep"
+ )
- set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize-coverage=trace-pc-guard")
+ set(
+ CMAKE_CXX_FLAGS
+ "${CMAKE_CXX_FLAGS} -fsanitize-coverage=pc-table,inline-8bit-counters,edge,no-prune,trace-cmp,trace-div,trace-gep"
+ )
endif()
if("${ARROW_USE_UBSAN}" OR "${ARROW_USE_ASAN}" OR "${ARROW_USE_TSAN}")
diff --git a/cpp/src/arrow/ipc/CMakeLists.txt b/cpp/src/arrow/ipc/CMakeLists.txt
index 001604e..e79c1ff 100644
--- a/cpp/src/arrow/ipc/CMakeLists.txt
+++ b/cpp/src/arrow/ipc/CMakeLists.txt
@@ -103,4 +103,4 @@ if(ARROW_BUILD_UTILITIES)
endif()
add_arrow_benchmark(read-write-benchmark PREFIX "arrow-ipc")
-add_arrow_fuzzing(ipc-fuzzing-test)
+add_arrow_fuzzing(fuzzing-test PREFIX "arrow-ipc")
diff --git a/cpp/src/arrow/ipc/ipc-fuzzing-test.cc b/cpp/src/arrow/ipc/fuzzing-test.cc
similarity index 100%
rename from cpp/src/arrow/ipc/ipc-fuzzing-test.cc
rename to cpp/src/arrow/ipc/fuzzing-test.cc
diff --git a/docs/source/developers/cpp.rst b/docs/source/developers/cpp.rst
index f95981a..525d7d9 100644
--- a/docs/source/developers/cpp.rst
+++ b/docs/source/developers/cpp.rst
@@ -489,7 +489,10 @@ work). You can build them using the following code:
.. code-block:: shell
- cmake -DARROW_FUZZING=ON -DARROW_USE_ASAN=ON ..
+ export CC=clang
+ export CXX=clang++
+ cmake -DARROW_FUZZING=ON -DARROW_USE_ASAN=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
+ make
``ARROW_FUZZING`` will enable building of fuzzer executables as well as enable the
addition of coverage helpers via ``ARROW_USE_COVERAGE``, so that the fuzzer can observe
@@ -501,16 +504,40 @@ provoked by the fuzzer will be found early. You may also enable other sanitizers
well. Just keep in mind that some of them do not work together and some may result
in very long execution times, which will slow down the fuzzing procedure.
+We use the ``RelWithDebInfo`` build type which is optimized ``Release`` but contains
+debug information. Just using ``Debug`` would be too slow to get proper fuzzing
+results and ``Release`` would make it impossible to get proper tracebacks. Also, some
+bugs might (but hopefully are not) be specific to the release build due to
+misoptimization.
+
Now you can start one of the fuzzer, e.g.:
.. code-block:: shell
- ./debug/debug/ipc-fuzzing-test
+ mkdir -p corpus
+ ./relwithdebinfo/arrow-ipc-fuzzing-test corpus
+
+This will try to find a malformed input that crashes the payload. A corpus of
+interesting inputs will be stored into the ``corpus`` directory. You can save and
+share this with others if you want, or even pre-fill it with files to provide the
+fuzzer with a warm-start. If a crash was found, the program will show the stack trace
+as well as the input data. The input data will also be written to a file named
+``crash-<some id>``. After a problem was found this way, it should be reported and
+fixed. Usually, the fuzzing process cannot be continued until the fix is applied, since
+the fuzzer usually converts to the problem again. To debug the underlying issue, you
+can use GDB:
+
+.. code-block:: shell
+
+ env ASAN_OPTIONS=abort_on_error=1 gdb -ex r --args ./relwithdebinfo/arrow-ipc-fuzzing-test crash-<some id>
+
+For more options, use:
+
+.. code-block:: shell
+
+ ./relwithdebinfo/arrow-ipc-fuzzing-test -help=1
-This will try to find a malformed input that crashes the payload and will show the
-stack trace as well as the input data. After a problem was found this way, it should
-be reported and fixed. Usually, the fuzzing process cannot be continued until the
-fix is applied, since the fuzzer usually converts to the problem again.
+or visit the `libFuzzer documentation <https://llvm.org/docs/LibFuzzer.html>`_.
If you build fuzzers with ASAN, you need to set the ``ASAN_SYMBOLIZER_PATH``
environment variable to the absolute path of ``llvm-symbolizer``, which is a tool