We build TensorFlow on our CentOS 6 cluster. I think disabling jemalloc should be enough to avoid the MADV_HUGEPAGE problem.
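For a manual build outside EasyBuild, a minimal sketch of switching jemalloc off might look like the following. It assumes that the `./configure` script of TensorFlow 1.x reads the `TF_NEED_JEMALLOC` environment variable; the helper name `env_without_jemalloc` is hypothetical, and how an EasyBuild easyblock would pass this setting is not covered here.

```python
import os


def env_without_jemalloc(base_env=None):
    """Return a copy of the environment with jemalloc support switched off.

    Assumption: TensorFlow 1.x's ./configure honours TF_NEED_JEMALLOC.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TF_NEED_JEMALLOC"] = "0"  # "0" answers "no" to the jemalloc question
    return env


# Usage (run from the TensorFlow source tree):
#   import subprocess
#   subprocess.run(["./configure"], env=env_without_jemalloc(), check=True)
```

The helper copies the environment instead of modifying `os.environ`, so other builds started from the same process are not affected.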
Greetings
André

----- On 14 Mar 2018 at 12:03, Jack Perdue j-per...@tamu.edu wrote:

> p.s. so far, for non-GPU, I have a 6x speed up over the Anaconda3/5.1.0
> version on an AVX2-based cluster (1 node, 28 CPUs). (stick with -O2 [EB
> default]... -O3 ["opt": True] doesn't help).
>
> Preparing to run the same benchmark[*] with the GPU(s) (2x Tesla K80).
>
> [*] https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/README.md
>
> Some other notes:
>
> Don't even try to build TensorFlow on RHEL/CentOS 6... its older
> kernel doesn't have MADV_HUGEPAGE support.
>
> For TensorFlow, I customized for our site with:
>
> cuda_compute_capabilities = ['3.5', '3.7'] # for Tesla K20 (ada) and K80 (terra)
>
> I could have left out 3.5 since ada is RHEL6.
>
> jack
>
>
> On 03/14/2018 05:48 AM, Jack Perdue wrote:
>> +1 !!!!!
>>
>> I struggled with the same issue (I have no idea where Stephane got
>> his/her copy).
>>
>> FWIW, here's (attached) what I came up with which includes
>> that fix and a cleanup of the duplicate libs.
>>
>> Jack Perdue
>> Lead Systems Administrator
>> High Performance Research Computing
>> TAMU Division of Research
>> j-per...@tamu.edu http://hprc.tamu.edu
>> HPRC Helpdesk: h...@hprc.tamu.edu
>>
>> On 03/14/2018 05:35 AM, Joachim Hein wrote:
>>> Hi,
>>>
>>> I am trying TensorFlow-1.5.0-goolfc-2017b-Python-3.6.3.eb. It is
>>> looking for a file cudnn-9.0-linux-x64-v7.0.5.15.tgz; however, I am
>>> currently getting cudnn-9.0-linux-x64-v7.tgz from the NVIDIA download
>>> site. The sha256 sum of the file I just downloaded agrees with the
>>> one in the EB config. After renaming my download to the name
>>> expected by EB, cuDNN builds.
>>>
>>> Can the config be upgraded to handle both the old and the new name?
>>> Is that something EB supports? Otherwise we should leave a comment
>>> inside the config that renaming is a workaround (one needs a manual
>>> download of the sources anyway).
>>>
>>> Any comments?
>>>
>>> Best wishes
>>> Joachim
>>>
>>>
>>>

--
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend
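
The renaming workaround from Joachim's message (check the sha256 sum of the manually downloaded cuDNN tarball, then make it available under the file name the easyconfig expects) can be sketched roughly as below. This is only an illustration under assumptions: the function names `sha256_of` and `stage_under_expected_name` are hypothetical, the expected checksum has to be copied from the easyconfig by hand, and the sketch copies the file rather than renaming it so that the original download stays in place.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_under_expected_name(downloaded, expected_name, expected_sha256):
    """Copy `downloaded` next to itself as `expected_name` if the checksum matches.

    Raises ValueError on a checksum mismatch, so a different cuDNN release
    is never silently used under the old file name.
    """
    downloaded = Path(downloaded)
    actual = sha256_of(downloaded)
    if actual != expected_sha256.lower():
        raise ValueError(
            "checksum mismatch for %s: got %s" % (downloaded.name, actual)
        )
    target = downloaded.with_name(expected_name)
    if not target.exists():
        shutil.copy2(downloaded, target)
    return target


# Usage with the file names from the thread (checksum taken from the easyconfig):
#   stage_under_expected_name(
#       "cudnn-9.0-linux-x64-v7.tgz",
#       "cudnn-9.0-linux-x64-v7.0.5.15.tgz",
#       "<sha256 from the easyconfig>",
#   )
```

Checking the checksum before copying is the part that matters: the short file name on the download site may later point to a newer cuDNN build, and the mismatch would then be caught here instead of during the build.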