Hi Kenneth,

I have now tested your TensorFlow 1.4.0 eb on our machines with a real-world script. It works, but it runs three times slower than with the prebuilt TensorFlow 1.2.1 :-(

The prebuilt version complains that it was built without AVX2 etc., so I do not really understand why the version compiled from source is so much slower, assuming of course that there is not a factor-of-three performance loss between 1.2.1 and 1.4.0, which seems unlikely.

Best regards

Jakob

> On 5 Jan 2018, at 13:57, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:
>
> On 04/01/2018 16:37, Jakob Schiøtz wrote:
>> Dear Kenneth, Pablo and Maxime,
>>
>> Thanks for your feedback. Yes, I will try to see if I can build from
>> source, but I will focus on the foss toolchain since we use that one for our
>> Python here (we do not have the Intel MPI license, and the iomkl toolchain
>> could not build Python last time I tried).
>>
>> I assume the reason for building from source is to ensure consistent library
>> versions etc. If that proves very difficult, could we perhaps in the
>> interim have builds (with a -bin suffix?) using the prebuilt wheels?
>
> The main reason for building from source is performance and compatibility
> with the OS.
>
> The binary wheels that are available for TensorFlow are not compatible with
> older OS versions like CentOS 6, as I experienced first-hand when trying to
> get it to work on an older (GPU) system.
> Since the compilation from source with CUDA support didn't work yet, I had to
> resort to injecting a newer glibc version in the 'python' binary, which was
> not fun (well...).
>
> For CPU-only installations, you really have no other option than building
> from source, since the binary wheels were not built with AVX2 instructions
> for example, which leads to large performance losses (some quick benchmarking
> showed a 7x increase in performance for TF 1.4 built with foss/2017b over
> using the binary wheel).
>
> For GPU installations, a similar concern arises, although it may be less
> severe there, depending on what CUDA compute capabilities the binary wheels
> were built with (I only tested the wheels on old systems with NVIDIA K20x/K40
> GPUs, so there I doubt you'll get much performance increase when building
> from source).
>
> If it turns out to be too difficult or time-consuming to get the build from
> source with CUDA support to work, then we can of course progress with
> sticking to the binary wheel releases for now; I'm not going to oppose that.
>
>
> regards,
>
> Kenneth
>
>>
>> Best regards
>>
>> Jakob
>>
>>
>>> On 4 Jan 2018, at 15:29, Kenneth Hoste <kenneth.ho...@ugent.be> wrote:
>>>
>>> Dear Jakob,
>>>
>>> On 04/01/2018 10:23, Jakob Schiøtz wrote:
>>>> Hi,
>>>>
>>>> I made a TensorFlow easyconfig a while ago depending on Python with the
>>>> foss toolchain, and including a variant with GPU support (PR 4904). The
>>>> latter has not yet been merged, probably because it is annoying to have
>>>> something that can only build on a machine with a GPU (it fails the sanity
>>>> check otherwise, as TensorFlow with GPU support cannot load on a machine
>>>> without it).
>>> Not being able to test this on a non-GPU system is a bit unfortunate, but
>>> that's not the reason it hasn't been merged yet; that's mostly due to a
>>> lack of time on my side to get back to it...
>>>
>>>> Since I made that PR, two newer releases of TensorFlow have appeared (1.3
>>>> and 1.4). There are easyconfigs for 1.3 with the Intel toolchain. I am
>>>> considering making easyconfigs for TensorFlow 1.4 with
>>>> Python-3.6.3-foss-2017b (both with and without GPU support), but first I
>>>> would like to know if anybody else is doing this - it is my impression
>>>> that somebody who actually knows what they are doing may be working on
>>>> TensorFlow. :-)
>>> I have spent quite a bit of time puzzling together an easyblock that
>>> supports building TensorFlow from source, see [1].
>>>
>>> It already works for non-GPU installations (see [2] for example), but it's
>>> not entirely finished yet because:
>>>
>>> * building from source with CUDA support does not work yet, the build fails
>>>   with strange Bazel errors...
>>>
>>> * there are some issues when the TensorFlow easyblock is used together with
>>>   --use-ccache and the Intel compilers;
>>>   because two compiler wrappers are used, they end up calling each other
>>>   resulting in a "fork bomb" style situation...
>>>
>>> I would really like to get it finished and have easyconfigs available for
>>> TensorFlow 1.4 and newer where we properly build TensorFlow from source
>>> rather than using the binary wheels...
>>>
>>> Are you up for giving it a try, and maybe helping out with the problems
>>> mentioned above?
>>>
>>>
>>> regards,
>>>
>>> Kenneth
>>>
>>>
>>> [1] https://github.com/easybuilders/easybuild-easyblocks/pull/1287
>>> [2] https://github.com/easybuilders/easybuild-easyconfigs/pull/5499
>>>
>>>> Best regards
>>>>
>>>> Jakob
>>>>
>>>> --
>>>> Jakob Schiøtz, professor, Ph.D.
>>>> Department of Physics
>>>> Technical University of Denmark
>>>> DK-2800 Kongens Lyngby, Denmark
>>>> http://www.fysik.dtu.dk/~schiotz/

--
Jakob Schiøtz, professor, Ph.D.
Department of Physics
Technical University of Denmark
DK-2800 Kongens Lyngby, Denmark
http://www.fysik.dtu.dk/~schiotz/
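
P.S. When comparing the prebuilt wheel with the build from source, it may help to first confirm which SIMD extensions the benchmark host actually advertises, since the AVX2 argument above only applies if the CPU supports it. The following is a minimal sketch, not part of EasyBuild or TensorFlow: the function name `simd_flags` and the default list of flags are my own assumptions, and it only understands the Linux x86 `/proc/cpuinfo` layout.

```python
def simd_flags(cpuinfo_text, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Report which of the wanted CPU flags appear in /proc/cpuinfo-style text.

    Hypothetical helper for illustration only. It looks at the first
    'flags' line, which on Linux x86 lists the features of one core.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in wanted}
    # No 'flags' line found (e.g. non-x86 or empty input): report nothing present.
    return {flag: False for flag in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            report = simd_flags(handle.read())
    except OSError:
        # Not on Linux, or the file is unreadable: nothing to report.
        report = {}
    for flag, found in report.items():
        print(f"{flag}: {'yes' if found else 'no'}")
```

If `avx2` and `fma` show up as present on the node where the timing was done, then a missing instruction set on the host is probably not the explanation for the slowdown, and the build options would be the next thing to look at.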