Dear Mikael, dear Kenneth,

I tried to compile with only with_jemalloc = False and it failed.

Using Mikael's patch, it's working fine! And yes, it's CentOS 6.9, good
guess.

Many thanks, I'm very happy that I won't have to fight to install Bazel and
Tensorflow manually anymore!

I'm also trying to compile TF with CUDA support, but I'm not sure I'm
proceeding correctly.

I have added in the eb file:

under dependencies: ('cuDNN', '6.0-CUDA-8.0.61', '', True)

and cuda_compute_capabilities = ['6.0']
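
Put together, the changes amount to roughly the following easyconfig fragment. This is only a sketch of the parameters mentioned in this thread: the rest of the TensorFlow easyconfig is assumed unchanged, and the existing dependencies are left out.

```python
# Sketch of the parameters discussed in this thread; the other
# easyconfig parameters (name, version, toolchain, ...) are omitted.

dependencies = [
    # ... existing dependencies of the TensorFlow easyconfig stay here ...
    ('cuDNN', '6.0-CUDA-8.0.61', '', True),
]

# GPU architecture(s) to build the CUDA kernels for
cuda_compute_capabilities = ['6.0']

# disabled because of the jemalloc build error on CentOS 6 quoted below
with_jemalloc = False
```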

Do I have to enable cuda support in Bazel as well?

Here is the build log when trying to build for CUDA:
https://gist.github.com/ysagon/a205242bb91becaccf4b684d5285514d

2018-04-25 21:45 GMT+02:00 Kenneth Hoste <[email protected]>:

> Hi all,
>
> On 25/04/2018 21:21, Mikael Öhman wrote:
> > Hi Yann,
> >
> > A bit of a shot in the dark here, but does this happen to be a CentOS6
> > machine?
> > If so, I had to set "with_jemalloc = False" and apply the lrt-flag patch
> > https://github.com/easybuilders/easybuild-easyconfigs/pull/6089 (which I
> > hope will make it into the next release? still waiting approval though)
>
> Hmm, I lost track of that one, I'll see what I can do to squeeze it into
> EasyBuild v3.6.0...
>
> >
> > (if I recall correctly, jemalloc was due to the old kernel missing some
> > feature (and disabling jemalloc was the easiest fix), and the -lrt flag
> > was related to some behavior in the linker in combination with Bazel not
> > passing on necessary link flags)
> >
> > Though, I couldn't spot any actual error message in the entire
> > log you pasted.
>
> The error is pretty clear at the end of the log:
>
> external/jemalloc/src/pages.c: In function 'je_pages_huge':
> external/jemalloc/src/pages.c:203:30: error: 'MADV_HUGEPAGE' undeclared (first use in this function)
>    return (madvise(addr, size, MADV_HUGEPAGE) != 0);
>                                ^~~~~~~~~~~~~
> external/jemalloc/src/pages.c:203:30: note: each undeclared identifier is reported only once for each function it appears in
> external/jemalloc/src/pages.c: In function 'je_pages_nohuge':
> external/jemalloc/src/pages.c:217:30: error: 'MADV_NOHUGEPAGE' undeclared (first use in this function)
>    return (madvise(addr, size, MADV_NOHUGEPAGE) != 0);
>                                ^~~~~~~~~~~~~~~
>
> This does indeed look like you'll need to disable jemalloc support, by
> including this in the easyconfig file:
>
>         with_jemalloc = False
>
> Please let us know whether that helps, and whether you need to apply
> Mikael's patch as well or not.
>
>
> regards,
>
> Kenneth
>
> >
> > Best regards, Mikael
> >
> >
> >
> > On Wed, Apr 25, 2018 at 6:01 PM, Yann Sagon <[email protected]> wrote:
> >
> >     Dear Kenneth,
> >
> >     Thanks for the links. I tried with the TF easyconfig in the develop
> >     branch, and it still doesn't compile successfully.
> >
> >     Here is a full log with debug:
> >
> >     https://gist.github.com/ysagon/30ecfee7789d6cf8f9304103fcdb539f
> >
> >     Best
> >
> >
> >     2018-04-25 15:17 GMT+02:00 Kenneth Hoste <[email protected]>:
> >
> >         Dear Yann,
> >
> >         This does not show the actual error that occurred, can you
> >         provide a full (debug) log?
> >
> >         Note that we have an easyconfig for TensorFlow 1.7.0 with
> >         foss/2018a as well in the develop branch of the repository [1],
> >         and there have been some small fixes to the TensorFlow
> >         easyblock as well [2].
> >
> >         All this will be included in the upcoming EasyBuild release,
> >         due for later this week.
> >
> >
> >         regards,
> >
> >         Kenneth
> >
> >         [1]
> >         https://github.com/easybuilders/easybuild-easyconfigs/tree/develop/easybuild/easyconfigs/t/TensorFlow
> >         [2]
> >         https://github.com/easybuilders/easybuild-easyblocks/blob/develop/easybuild/easyblocks/t/tensorflow.py
> >
> >         On 25/04/2018 15:12, Yann Sagon wrote:
> >          > Dear list,
> >          >
> >          > I'm very happy to see that TF is now available in eb as
> >          > compiled from source, not only the whl.
> >          >
> >          > Unfortunately, I have an error when trying to build:
> >          >
> >          >
> >          > /opt/ebsofts/Core/GCCcore/6.4.0/bin/gcc -U_FORTIFY_SOURCE
> >          > -fstack-protector -Wall -B/opt/ebsofts/Core/GCCcore/6.4.0/bin
> >          > -B/opt/ebsofts/Compiler/GCCcore/6.4.0/binutils/2.28/bin
> >          > -Wunused-but-set-parameter -Wno-free-nonheap-object
> >          > -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG
> >          > -ffunction-sections -fdata-sections
> >          > -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK -O2 '-march=core2' -MD -MF
> >          > bazel-out/k8-py3-opt/bin/external/jpeg/_objs/jpeg/external/jpeg/jcphuff.d
> >          > -iquote external/jpeg
> >          > -iquote bazel-out/k8-py3-opt/genfiles/external/jpeg
> >          > -iquote external/bazel_tools
> >          > -iquote bazel-out/k8-py3-opt/genfiles/external/bazel_tools
> >          > -isystem external/bazel_tools/tools/cpp/gcc3 -O3 -w
> >          > -fno-canonical-system-headers
> >          > -Wno-builtin-macro-redefined '-D__DATE__="redacted"'
> >          > '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c
> >          > external/jpeg/jcphuff.c -o
> >          > bazel-out/k8-py3-opt/bin/external/jpeg/_objs/jpeg/external/jpeg/jcphuff.o)
> >          > Target //tensorflow/tools/pip_package:build_pip_package failed to build
> >          > INFO: Elapsed time: 101.289s, Critical Path: 61.90s
> >          > FAILED: Build did NOT complete successfully
> >          >   (at easybuild/tools/run.py:481 in parse_cmd_output)
> >          >
> >          > I'm not able to say why it's not working. Any clue?
> >          >
> >          > Best
> >          >
> >
> >
> >
>
