> On 5 Jan 2018, at 15:18, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:
> 
> On 05/01/2018 14:13, Jakob Schiøtz wrote:
>> Hi again,
>> 
>> Yes, I have overlooked that - I just switched my repo to your branch and 
>> tried to build :-)
>> 
>> Now I get an error when building TensorFlow.  It is a 502 Bad Gateway, 
>> indicating that some server is down somewhere.  But is it not a problem that 
>> the build process itself tried to download extra stuff in addition to the 
>> source files listed in the .eb file?  At least it makes the checksum 
>> checking moot.
> 
> That's indeed a problem, but one that is hard to avoid with TensorFlow, at 
> least in a first iteration...
> 
> Once we're happy with the current approach, a new target could be to get 
> TensorFlow to build "offline".
> 
> One step at a time though... ;-)

It could be a showstopper for me, though.  On our cluster, only two nodes have 
GPUs.  With the binary build, I could only install TensorFlow on those, since 
although CUDA and friends are available on all the nodes, you can only load the 
resulting TensorFlow module on a machine with a GPU.  Unfortunately, these two 
nodes are officially compute-nodes, not login-nodes, and that means that they 
are cut off from the Internet.  So no downloading is possible on these. :-(
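
One way to see what would have to be pre-fetched for such an offline node is to scan the build output for the URLs Bazel tried to download. The following is only a minimal sketch: the function name `urls_fetched` and the regular expression are illustrative assumptions, and it expects the raw (unwrapped, unquoted) build log, with lines of the form shown in the log quoted further down.

```python
import re

# ASSUMPTION: download attempts appear either as "____Downloading <url> ..."
# or as "Error downloading [<url>] to ...", as in the log quoted in this thread.
DOWNLOAD_RE = re.compile(r"downloading\s+\[?(https?://[^\s\]]+)", re.IGNORECASE)


def urls_fetched(log_text):
    """Return the unique URLs the build tried to download, in order of appearance."""
    urls = []
    for match in DOWNLOAD_RE.finditer(log_text):
        url = match.group(1)
        if url not in urls:
            urls.append(url)
    return urls


# Two lines taken from the quoted build output (unwrapped).
sample = (
    "____Downloading https://github.com/bazelbuild/rules_closure/archive/"
    "4af89ef1db659eb41f110df189b67d4cf14073e1.tar.gz via codeload.github.com: 40,240 bytes\n"
    "java.io.IOException: Error downloading [http://mirror.bazel.build/github.com/google/"
    "protobuf/archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz] to "
    "/tmp/eb-GpWEyg/tmpfJrPWS-bazel-build/external/protobuf_archive/"
    "b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz: GET returned 502 Bad Gateway\n"
)

for url in urls_fetched(sample):
    print(url)
```

This only lists what the build asked for; it does not by itself make the build work offline.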

So I have two questions:

1. What do we expect to gain by building from source instead of installing from 
the wheel? 

2. Would it be OK to have a “-bin” variant installing from the binary 
distribution until we get these issues ironed out?
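
To make question 2 concrete, here is a rough sketch of what such a "-bin" easyconfig might look like. It is not a tested easyconfig: the wheel file name, the description text and the choice to leave out source_urls (so the wheel has to be placed in the EasyBuild source path by hand, which would also suit nodes without Internet access) are all assumptions for illustration.

```python
# Hypothetical easyconfig sketch: TensorFlow installed from the prebuilt wheel.
easyblock = 'PythonPackage'

name = 'TensorFlow'
version = '1.4.0'
# The "-bin" suffix marks this as the binary (wheel) variant.
versionsuffix = '-Python-%(pyver)s-bin'

homepage = 'https://www.tensorflow.org/'
description = "TensorFlow, installed from the prebuilt binary wheel"

toolchain = {'name': 'foss', 'version': '2017b'}

# ASSUMPTION: wheel file name for Python 3.6 on Linux x86_64; adjust to the
# file actually downloaded. No source_urls are given, so the wheel is expected
# to be present in the EasyBuild source path already.
sources = ['tensorflow-%(version)s-cp36-cp36m-manylinux1_x86_64.whl']
# Checksums are deliberately left out of this sketch; add the checksum of the
# wheel you actually downloaded.

dependencies = [('Python', '3.6.3')]

# Install the wheel with pip, without trying to unpack it first.
use_pip = True
unpack_sources = False

sanity_check_paths = {
    'files': [],
    'dirs': ['lib/python%(pyshortver)s/site-packages'],
}

moduleclass = 'lib'
```

A GPU variant would need the corresponding GPU wheel plus CUDA and cuDNN as dependencies, and its sanity check would still only pass on a node with a GPU.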

In my second attempt, I managed to build with foss/2017b (evidently the server 
was up again).  I have not really tested it yet (I am only just dabbling in 
TensorFlow, and my main application is crashing due to another problem).  Do you 
want me to submit the new .eb file as a PR to your PR?  Or should I just wait 
till your stuff has converged?

/Jakob


> 
> 
> regards,
> 
> Kenneth
>> 
>> Best regards
>> 
>> Jakob
>> 
>> 
>> ............
>> WARNING: The lower priority option '-c opt' does not override the previous 
>> value '--compilation_mode=opt'.
>> WARNING: The lower priority option '-c opt' does not override the previous 
>> value '--compilation_mode=opt'.
>> ____Downloading 
>> https://github.com/bazelbuild/rules_closure/archive/4af89ef1db659eb41f110df189b67d4cf14073e1.tar.gz
>>  via codeload.github.com: 40,240 bytes
>> ____Downloading 
>> https://github.com/bazelbuild/rules_closure/archive/4af89ef1db659eb41f110df189b67d4cf14073e1.tar.gz
>>  via codeload.github.com: 205,436 bytes
>> ____Loading package: tensorflow/tools/pip_package
>> ____Loading package: @bazel_tools//tools/cpp
>> ____Loading package: @local_jdk//
>> ____Loading package: @local_config_cc//
>> ____Loading complete.  Analyzing...
>> ERROR: 
>> /home/niflheim/schiotz/easybuild_experimental/sandybridge/build/TensorFlow/1.4.0/foss-2017b-Python-3.6.3/tensorflow-1.4.0/tensorflow/tools/pip_package/BUILD:139:1:
>>  error loading package 'tensorflow': Encountered error while reading 
>> extension file 'protobuf.bzl': no such package '@protobuf_archive//': 
>> java.io.IOException: Error downloading 
>> [http://mirror.bazel.build/github.com/google/protobuf/archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz]
>>  to 
>> /tmp/eb-GpWEyg/tmpfJrPWS-bazel-build/external/protobuf_archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz:
>>  GET returned 502 Bad Gateway and referenced by 
>> '//tensorflow/tools/pip_package:build_pip_package'.
>> ERROR: 
>> /home/niflheim/schiotz/easybuild_experimental/sandybridge/build/TensorFlow/1.4.0/foss-2017b-Python-3.6.3/tensorflow-1.4.0/tensorflow/tools/pip_package/BUILD:139:1:
>>  error loading package 'tensorflow': Encountered error while reading 
>> extension file 'protobuf.bzl': no such package '@protobuf_archive//': 
>> java.io.IOException: Error downloading 
>> [http://mirror.bazel.build/github.com/google/protobuf/archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz]
>>  to 
>> /tmp/eb-GpWEyg/tmpfJrPWS-bazel-build/external/protobuf_archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz:
>>  GET returned 502 Bad Gateway and referenced by 
>> '//tensorflow/tools/pip_package:build_pip_package'.
>> ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' 
>> failed; build aborted: error loading package 'tensorflow': Encountered error 
>> while reading extension file 'protobuf.bzl': no such package 
>> '@protobuf_archive//': java.io.IOException: Error downloading 
>> [http://mirror.bazel.build/github.com/google/protobuf/archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz]
>>  to 
>> /tmp/eb-GpWEyg/tmpfJrPWS-bazel-build/external/protobuf_archive/b04e5cba356212e4e8c66c61bbe0c3a20537c5b9.tar.gz:
>>  GET returned 502 Bad Gateway.
>> ____Elapsed time: 6.561s
>>  (at easybuild/tools/run.py:481 in parse_cmd_output)
>> == 2018-01-05 14:07:30,582 easyblock.py:2685 WARNING build failed (first 300 
>> chars): cmd "bazel --output_base=/tmp/eb-GpWEyg/tmpfJrPWS-bazel-build build 
>> --compilation_mode=opt --config=opt --subcommands --verbose_failures  
>> --config=mkl //tensorflow/tools/pip_package:build_pip_package" exited with 
>> exit code 1 and output:
>> ............
>> 
>> 
>>> On 5 Jan 2018, at 13:50, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:
>>> 
>>> Hi Jakob,
>>> 
>>> On 05/01/2018 13:19, Jakob Schiøtz wrote:
>>>> Hi Kenneth,
>>>> 
>>>> Is it possible that you forgot to check in the patches 
>>>> TensorFlow-1.4.0_swig-env.patch and TensorFlow-1.4.0_no-enum34.patch in 
>>>> your PR?  Attempting to build TensorFlow fails because it cannot find 
>>>> these.
>>> The patch files are available from 
>>> https://github.com/easybuilders/easybuild-easyconfigs/pull/5318 (as 
>>> mentioned in the description of the PR).
>>> 
>>> 
>>> regards,
>>> 
>>> Kenneth
>>>> Best regards
>>>> 
>>>> Jakob
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 4 Jan 2018, at 16:37, Jakob Schiøtz <schi...@fysik.dtu.dk> wrote:
>>>>> 
>>>>> Dear Kenneth, Pablo and Maxime,
>>>>> 
>>>>> Thanks for your feedback.  Yes, I will try to see if I can build from 
>>>>> source, but I will focus on the foss toolchain since we use that one for 
>>>>> our Python here (we do not have the Intel MPI license, and the iomkl 
>>>>> toolchain could not build Python last time I tried).
>>>>> 
>>>>> I assume the reason for building from source is to ensure consistent 
>>>>> library versions etc.  If that proves very difficult, could we perhaps in 
>>>>> the interim have builds (with a -bin suffix?) using the prebuilt wheels?
>>>>> 
>>>>> Best regards
>>>>> 
>>>>> Jakob
>>>>> 
>>>>> 
>>>>>> On 4 Jan 2018, at 15:29, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:
>>>>>> 
>>>>>> Dear Jakob,
>>>>>> 
>>>>>> On 04/01/2018 10:23, Jakob Schiøtz wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I made a TensorFlow easyconfig a while ago depending on Python with the 
>>>>>>> foss toolchain; and including a variant with GPU support (PR 4904).  
>>>>>>> The latter has not yet been merged, probably because it is annoying to 
>>>>>>> have something that can only build on a machine with a GPU (it fails 
>>>>>>> the sanity check otherwise, as TensorFlow with GPU support cannot load 
>>>>>>> on a machine without it).
>>>>>> Not being able to test this on a non-GPU system is a bit unfortunate, 
>>>>>> but that's not the reason it hasn't been merged yet; that's mostly 
>>>>>> due to a lack of time on my side to get back to it...
>>>>>> 
>>>>>>> Since I made that PR, two newer releases of TensorFlow have appeared 
>>>>>>> (1.3 and 1.4).  There are easyconfigs for 1.3 with the Intel 
>>>>>>> toolchain.  I am considering making easyconfigs for TensorFlow 1.4 with 
>>>>>>> Python-3.6.3-foss-2017b (both with and without GPU support), but first 
>>>>>>> I would like to know if anybody else is doing this - it is my 
>>>>>>> impression that somebody who actually knows what they are doing may be 
>>>>>>> working on TensorFlow. :-)
>>>>>> I have spent quite a bit of time puzzling together an easyblock that 
>>>>>> supports building TensorFlow from source, see [1].
>>>>>> 
>>>>>> It already works for non-GPU installations (see [2] for example), but 
>>>>>> it's not entirely finished yet because:
>>>>>> 
>>>>>> * building from source with CUDA support does not work yet, the build 
>>>>>> fails with strange Bazel errors...
>>>>>> 
>>>>>> * there are some issues when the TensorFlow easyblock is used together 
>>>>>> with --use-ccache and the Intel compilers;
>>>>>>  because two compiler wrappers are used, they end up calling each other 
>>>>>> resulting in a "fork bomb" style situation...
>>>>>> 
>>>>>> I would really like to get it finished and have easyconfigs available 
>>>>>> for TensorFlow 1.4 and newer where we properly build TensorFlow from 
>>>>>> source rather than using the binary wheels...
>>>>>> 
>>>>>> Are you up for giving it a try, and maybe helping out with the problems 
>>>>>> mentioned above?
>>>>>> 
>>>>>> 
>>>>>> regards,
>>>>>> 
>>>>>> Kenneth
>>>>>> 
>>>>>> 
>>>>>> [1] https://github.com/easybuilders/easybuild-easyblocks/pull/1287
>>>>>> [2] https://github.com/easybuilders/easybuild-easyconfigs/pull/5499
>>>>>> 
>>>>>>> Best regards
>>>>>>> 
>>>>>>> Jakob
>>>>>>> 
>>>>>>> --
>>>>>>> Jakob Schiøtz, professor, Ph.D.
>>>>>>> Department of Physics
>>>>>>> Technical University of Denmark
>>>>>>> DK-2800 Kongens Lyngby, Denmark
>>>>>>> http://www.fysik.dtu.dk/~schiotz/
>>>>>>> 
>>>>>>> 
>>>>>>> 
> 

--
Jakob Schiøtz, professor, Ph.D.
Department of Physics
Technical University of Denmark
DK-2800 Kongens Lyngby, Denmark
http://www.fysik.dtu.dk/~schiotz/


