On 5/27/21 9:48 AM, Alexander Grund wrote:

The EB log file reports an error:

//tensorflow/core/common_runtime:graph_constructor_test FAILED TO BUILD

and the log file ends with:

Executed 137 out of 814 tests: 137 tests pass, 1 fails to build and 676 were skipped.
FAILED: Build did NOT complete successfully
This is a build failure, so it is something we should fix, or at least find the cause of. Please check the log; there should be something about why/how it failed to compile. Just search for the test name and look around that spot. If you attach the log, I can also take a look.

The EB log file is 205 MB, so it's hard to share :-(
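
Since a 205 MB log is awkward to attach, one option is to cut out only the lines around the failure. Below is a minimal sketch, assuming Python 3; the function name `extract_context` and its parameters are hypothetical and not part of EasyBuild. It streams the file, so the log never has to fit in memory, and it truncates very long lines such as the bazel command lines.

```python
import re
from collections import deque


def extract_context(log_path, pattern, before=5, after=20, max_width=300):
    """Return numbered lines around each match of `pattern` in a large log.

    Lines are truncated to `max_width` characters so that very long
    command lines do not swamp the excerpt.
    """
    regex = re.compile(pattern)
    recent = deque(maxlen=before)  # rolling window of lines preceding a match
    remaining = 0                  # lines still to emit after the latest match
    excerpt = []
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            entry = f"{number}: {line.rstrip()[:max_width]}"
            if regex.search(line):
                excerpt.extend(recent)  # context that was not emitted yet
                recent.clear()
                excerpt.append(entry)
                remaining = after
            elif remaining > 0:
                excerpt.append(entry)
                remaining -= 1
            else:
                recent.append(entry)
    return excerpt
```

For example, `print("\n".join(extract_context("easybuild.log", r"graph_constructor_test")))` would print a short excerpt that is small enough to paste into a mail; the file name `easybuild.log` is a placeholder for the real log path.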

I have this environment:

export EASYBUILD_BUILDPATH=/run/user/$UID/eb_build
ulimit -s 2000240
export EASYBUILD_TMPDIR=/scratch/$USER

and there is quite a bit of space available:

$ df -h /run/user/$UID/eb_build /scratch
Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                               19G   19G   30M 100% /run/user/983
/dev/mapper/VolGroup00-lv_scratch  850G  675M  849G   1% /scratch

Searching for FAIL in the log file, I noticed this section:

== 2021-05-26 15:20:28,456 tensorflow.py:899 INFO Starting cpu test
== 2021-05-26 15:20:28,457 run.py:225 INFO running cmd: bazel --output_user_root=/run/user/983/eb_build/TensorFlow/2.4.1/fosscuda-2020b/tmpkYJDaH-bazel-tf --host_jvm_args=-Xms512m --host_jvm_args=-Xmx4096m test --config=noaws --config=nogcp --config=nohdfs --compilation_mode=opt --config=opt --subcommands --verbose_failures --jobs=64 --copt="-fPIC" --action_env=CPATH='/home/modules/software/cURL/7.72.0-GCCcore-10.2.0/include:/home/modules/software/double-conversion/3.1.5-GCCcore-10.2.0/include:/home/modules/software/flatbuffers/1.12.0-GCCcore-10.2.0/include:/home/modules/software/giflib/5.2.1-GCCcore-10.2.0/include:/home/modules/software/hwloc/2.2.0-GCCcore-10.2.0/include:/home/modules/software/ICU/67.1-GCCcore-10.2.0/include:/home/modules/software/JsonCpp/1.9.4-GCCcore-10.2.0/include:/home/modules/software/libjpeg-turbo/2.0.5-GCCcore-10.2.0/include:/home/modules/software/libpng/1.6.37-GCCcore-10.2.0/include:/home/modules/software/LMDB/0.9.24-GCCcore-10.2.0/include:/home/modules/software/nsync/1.24.0-GCCcore-10.2.0/include:/home/modules/software/PCRE/8.44-GCCcore-10.2.0/include:/home/modules/software/protobuf/3.14.0-GCCcore-10.2.0/include:/home/modules/software/pybind11/2.6.0-GCCcore-10.2.0/include:/home/modules/software/snappy/1.1.8-GCCcore-10.2.0/include:/home/modules/software/SQLite/3.33.0-GCCcore-10.2.0/include:/home/modules/software/zlib/1.2.11-GCCcore-10.2.0/include' --action_env=LIBRARY_PATH='/home/modules/software/cURL/7.72.0-GCCcore-10.2.0/lib:/home/modules/software/double-conversion/3.1.5-GCCcore-10.2.0/lib:/home/modules/software/flatbuffers/1.12.0-GCCcore-10.2.0/lib:/home/modules/software/giflib/5.2.1-GCCcore-10.2.0/lib:/home/modules/software/hwloc/2.2.0-GCCcore-10.2.0/lib:/home/modules/software/ICU/67.1-GCCcore-10.2.0/lib:/home/modules/software/JsonCpp/1.9.4-GCCcore-10.2.0/lib:/home/modules/software/libjpeg-turbo/2.0.5-GCCcore-10.2.0/lib64:/home/modules/software/libpng/1.6.37-GCCcore-10.2.0/lib:/home/modules/software/LMDB/0.9.24-GCCcore-10.2.0/lib:/home/modules/software/nsync/1.24.0-GCCcore-10.2.0/lib:/home/modules/software/PCRE/8.44-GCCcore-10.2.0/lib:/home/modules/software/protobuf/3.14.0-GCCcore-10.2.0/lib:/home/modules/software/pybind11/2.6.0-GCCcore-10.2.0/lib:/home/modules/software/snappy/1.1.8-GCCcore-10.2.0/lib:/home/modules/software/SQLite/3.33.0-GCCcore-10.2.0/lib:/home/modules/software/zlib/1.2.11-GCCcore-10.2.0/lib' --action_env=PYTHONPATH --action_env=PYTHONNOUSERSITE=1 --distinct_host_configuration=false --config=mkl --test_output=errors --build_tests_only --local_test_jobs=64 --test_tag_filters='-gpu,-tpu,-no_cuda_on_cpu_tap,-no_pip,-no_oss,-oss_serial,-benchmark-test,-v1only' --build_tag_filters='-gpu,-tpu,-no_cuda_on_cpu_tap,-no_pip,-no_oss,-oss_serial,-benchmark-test,-v1only' --test_env=CUDA_VISIBLE_DEVICES='-1' --test_timeout=3600 --test_size_filters=small -- //tensorflow/core/... -//tensorflow/core:example_java_proto -//tensorflow/core/example:example_protos_closure //tensorflow/cc/... //tensorflow/c/... //tensorflow/python/... -//tensorflow/core/profiler/internal/gpu:device_tracer_test -//tensorflow/c/eager:c_api_test_gpu -//tensorflow/c/eager:c_api_distributed_test -//tensorflow/c/eager:c_api_distributed_test_gpu -//tensorflow/c/eager:c_api_cluster_test_gpu -//tensorflow/c/eager:c_api_remote_function_test_gpu -//tensorflow/c/eager:c_api_remote_test_gpu -//tensorflow/core/kernels:sparse_matmul_op_test -//tensorflow/core/kernels:sparse_matmul_op_test_gpu -//tensorflow/core/common_runtime:collective_param_resolver_local_test -//tensorflow/core/common_runtime:mkl_layout_pass_test -//tensorflow/core/kernels/mkl:mkl_fused_ops_test
== 2021-05-26 15:30:49,144 run.py:595 INFO parse_log_for_error msg: Command used: bazel --output_user_root=/run/user/983/eb_build/TensorFlow/2.4.1/fosscuda-2020b/tmpkYJDaH-bazel-tf --host_jvm_args=-Xms512m --host_jvm_args=-Xmx4096m test --config=noaws --config=nogcp --config=nohdfs --compilation_mode=opt --config=opt --subcommands --verbose_failures --jobs=64 --copt="-fPIC" --action_env=CPATH='/home/modules/software/cURL/7.72.0-GCCcore-10.2.0/include:/home/modules/software/double-conversion/3.1.5-GCCcore-10.2.0/include:/home/modules/software/flatbuffers/1.12.0-GCCcore-10.2.0/include:/home/modules/software/giflib/5.2.1-GCCcore-10.2.0/include:/home/modules/software/hwloc/2.2.0-GCCcore-10.2.0/include:/home/modules/software/ICU/67.1-GCCcore-10.2.0/include:/home/modules/software/JsonCpp/1.9.4-GCCcore-10.2.0/include:/home/modules/software/libjpeg-turbo/2.0.5-GCCcore-10.2.0/include:/home/modules/software/libpng/1.6.37-GCCcore-10.2.0/include:/home/modules/software/LMDB/0.9.24-GCCcore-10.2.0/include:/home/modules/software/nsync/1.24.0-GCCcore-10.2.0/include:/home/modules/software/PCRE/8.44-GCCcore-10.2.0/include:/home/modules/software/protobuf/3.14.0-GCCcore-10.2.0/include:/home/modules/software/pybind11/2.6.0-GCCcore-10.2.0/include:/home/modules/software/snappy/1.1.8-GCCcore-10.2.0/include:/home/modules/software/SQLite/3.33.0-GCCcore-10.2.0/include:/home/modules/software/zlib/1.2.11-GCCcore-10.2.0/include' --action_env=LIBRARY_PATH='/home/modules/software/cURL/7.72.0-GCCcore-10.2.0/lib:/home/modules/software/double-conversion/3.1.5-GCCcore-10.2.0/lib:/home/modules/software/flatbuffers/1.12.0-GCCcore-10.2.0/lib:/home/modules/software/giflib/5.2.1-GCCcore-10.2.0/lib:/home/modules/software/hwloc/2.2.0-GCCcore-10.2.0/lib:/home/modules/software/ICU/67.1-GCCcore-10.2.0/lib:/home/modules/software/JsonCpp/1.9.4-GCCcore-10.2.0/lib:/home/modules/software/libjpeg-turbo/2.0.5-GCCcore-10.2.0/lib64:/home/modules/software/libpng/1.6.37-GCCcore-10.2.0/lib:/home/modules/software/LMDB/0.9.24-GCCcore-10.2.0/lib:/home/modules/software/nsync/1.24.0-GCCcore-10.2.0/lib:/home/modules/software/PCRE/8.44-GCCcore-10.2.0/lib:/home/modules/software/protobuf/3.14.0-GCCcore-10.2.0/lib:/home/modules/software/pybind11/2.6.0-GCCcore-10.2.0/lib:/home/modules/software/snappy/1.1.8-GCCcore-10.2.0/lib:/home/modules/software/SQLite/3.33.0-GCCcore-10.2.0/lib:/home/modules/software/zlib/1.2.11-GCCcore-10.2.0/lib' --action_env=PYTHONPATH --action_env=PYTHONNOUSERSITE=1 --distinct_host_configuration=false --config=mkl --test_output=errors --build_tests_only --local_test_jobs=64 --test_tag_filters='-gpu,-tpu,-no_cuda_on_cpu_tap,-no_pip,-no_oss,-oss_serial,-benchmark-test,-v1only' --build_tag_filters='-gpu,-tpu,-no_cuda_on_cpu_tap,-no_pip,-no_oss,-oss_serial,-benchmark-test,-v1only' --test_env=CUDA_VISIBLE_DEVICES='-1' --test_timeout=3600 --test_size_filters=small -- //tensorflow/core/... -//tensorflow/core:example_java_proto -//tensorflow/core/example:example_protos_closure //tensorflow/cc/... //tensorflow/c/... //tensorflow/python/... -//tensorflow/core/profiler/internal/gpu:device_tracer_test -//tensorflow/c/eager:c_api_test_gpu -//tensorflow/c/eager:c_api_distributed_test -//tensorflow/c/eager:c_api_distributed_test_gpu -//tensorflow/c/eager:c_api_cluster_test_gpu -//tensorflow/c/eager:c_api_remote_function_test_gpu -//tensorflow/c/eager:c_api_remote_test_gpu -//tensorflow/core/kernels:sparse_matmul_op_test -//tensorflow/core/kernels:sparse_matmul_op_test_gpu -//tensorflow/core/common_runtime:collective_param_resolver_local_test -//tensorflow/core/common_runtime:mkl_layout_pass_test -//tensorflow/core/kernels/mkl:mkl_fused_ops_test
== 2021-05-26 15:30:49,145 run.py:597 INFO parse_log_for_error (some may be harmless) regExp (?<![(,-]|\w)(?:error|segmentation fault|failed)(?![(,-]|\.?\w) found: WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/f402e682d0ef5598eeffc9a21a691b03e602ff58.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
SUBCOMMAND: # //tensorflow/core/platform:error [action 'Linking tensorflow/core/platform/liberror.so', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
SUBCOMMAND: # //tensorflow/core/platform:error [action 'Compiling tensorflow/core/platform/error.cc', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.d '-frandom-seed=bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.o' -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' -D__CLANG_SUPPORT_DYN_ANNOTATION__ -iquote . -iquote bazel-out/k8-opt/bin -iquote external/eigen_archive -iquote bazel-out/k8-opt/bin/external/eigen_archive -iquote external/com_google_absl -iquote bazel-out/k8-opt/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/k8-opt/bin/external/nsync -iquote external/double_conversion -iquote bazel-out/k8-opt/bin/external/double_conversion -iquote external/com_google_protobuf -iquote bazel-out/k8-opt/bin/external/com_google_protobuf -isystem third_party/eigen3/mkl_include -isystem bazel-out/k8-opt/bin/third_party/eigen3/mkl_include -isystem external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIE -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -w -DAUTOLOAD_DYNAMIC_KERNELS -O2 -ftree-vectorize '-march=native' -fno-math-errno -fPIC -fPIC '-std=c++14' -c tensorflow/core/platform/error.cc -o bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.o)
SUBCOMMAND: # //tensorflow/core/platform:error [action 'Linking tensorflow/core/platform/liberror.a', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
ERROR: /run/user/983/eb_build/TensorFlow/2.4.1/fosscuda-2020b/TensorFlow/tensorflow-2.4.1/tensorflow/core/common_runtime/BUILD:2700:11: Linking of rule '//tensorflow/core/common_runtime:graph_constructor_test' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
/home/modules/software/binutils/2.35-GCCcore-10.2.0/bin/ld.gold: fatal error: bazel-out/k8-opt/bin/tensorflow/core/common_runtime/graph_constructor_test: No space left on device
collect2: error: ld returned 1 exit status
FAILED: Build did NOT complete successfully
//tensorflow/core/common_runtime:graph_constructor_test FAILED TO BUILD
FAILED: Build did NOT complete successfully
== 2021-05-26 15:30:49,145 run.py:554 WARNING Found 11 errors in command output (output: WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/f402e682d0ef5598eeffc9a21a691b03e602ff58.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
        SUBCOMMAND: # //tensorflow/core/platform:error [action 'Linking tensorflow/core/platform/liberror.so', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
        SUBCOMMAND: # //tensorflow/core/platform:error [action 'Compiling tensorflow/core/platform/error.cc', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
        external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.d '-frandom-seed=bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.o' -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' -D__CLANG_SUPPORT_DYN_ANNOTATION__ -iquote . -iquote bazel-out/k8-opt/bin -iquote external/eigen_archive -iquote bazel-out/k8-opt/bin/external/eigen_archive -iquote external/com_google_absl -iquote bazel-out/k8-opt/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/k8-opt/bin/external/nsync -iquote external/double_conversion -iquote bazel-out/k8-opt/bin/external/double_conversion -iquote external/com_google_protobuf -iquote bazel-out/k8-opt/bin/external/com_google_protobuf -isystem third_party/eigen3/mkl_include -isystem bazel-out/k8-opt/bin/third_party/eigen3/mkl_include -isystem external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIE -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -w -DAUTOLOAD_DYNAMIC_KERNELS -O2 -ftree-vectorize '-march=native' -fno-math-errno -fPIC -fPIC '-std=c++14' -c tensorflow/core/platform/error.cc -o bazel-out/k8-opt/bin/tensorflow/core/platform/_objs/error/error.o)
        SUBCOMMAND: # //tensorflow/core/platform:error [action 'Linking tensorflow/core/platform/liberror.a', configuration: f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6, execution platform: @local_execution_config_platform//:platform]
        ERROR: /run/user/983/eb_build/TensorFlow/2.4.1/fosscuda-2020b/TensorFlow/tensorflow-2.4.1/tensorflow/core/common_runtime/BUILD:2700:11: Linking of rule '//tensorflow/core/common_runtime:graph_constructor_test' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
        /home/modules/software/binutils/2.35-GCCcore-10.2.0/bin/ld.gold: fatal error: bazel-out/k8-opt/bin/tensorflow/core/common_runtime/graph_constructor_test: No space left on device
        collect2: error: ld returned 1 exit status
        FAILED: Build did NOT complete successfully
        //tensorflow/core/common_runtime:graph_constructor_test FAILED TO BUILD
        FAILED: Build did NOT complete successfully)


Please note these two errors:

WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/f402e682d0ef5598eeffc9a21a691b03e602ff58.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found

Is the URL outdated?

/home/modules/software/binutils/2.35-GCCcore-10.2.0/bin/ld.gold: fatal error: bazel-out/k8-opt/bin/tensorflow/core/common_runtime/graph_constructor_test: No space left on device

What device might that be? As shown above, I have quite a bit of disk space. Is /tmp being used and getting full?
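
One hedged way to narrow this down is to ask the operating system how full each candidate filesystem is. The sketch below assumes Python 3 and uses `shutil.disk_usage`; the helper names are illustrative, and the candidate list (the two EasyBuild variables set above plus /tmp) is only an assumption about where the build might write. Note that the df output earlier in this message lists the tmpfs mounted on /run/user/983 at 100% use, and the bazel --output_user_root in the logged command lies under /run/user/983/eb_build, so that filesystem is a likely candidate.

```python
import os
import shutil


def report_free_space(paths):
    """Return (path, total, used, free) in bytes for each path that exists."""
    rows = []
    for path in paths:
        if not os.path.exists(path):
            continue  # skip candidates that are not present on this host
        usage = shutil.disk_usage(path)
        rows.append((path, usage.total, usage.used, usage.free))
    return rows


def format_row(row):
    """Render one row from report_free_space as a readable line."""
    path, total, used, free = row
    gib = 1024 ** 3
    percent = 100 * used / total if total else 0.0
    return f"{path}: {free / gib:.1f} GiB free of {total / gib:.1f} GiB ({percent:.0f}% used)"


def candidate_paths():
    """Assumed candidates: the EasyBuild build and tmp directories, plus /tmp."""
    names = ("EASYBUILD_BUILDPATH", "EASYBUILD_TMPDIR")
    found = [os.environ[name] for name in names if name in os.environ]
    return found + ["/tmp"]
```

Running `for row in report_free_space(candidate_paths()): print(format_row(row))` while the build is in progress would show which of these filesystems is filling up.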

I'd also suggest joining Slack, as discussions there are potentially faster.

I'll take a look - are there instructions for Slack?

Thanks,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,
Fysikvej Building 309, DK-2800 Kongens Lyngby, Denmark
E-mail: [email protected]
Homepage: http://dcwww.fysik.dtu.dk/~ohnielse/
Mobile: (+45) 5180 1620
