When I did use preallocation, I used lib.cnmem=1 for theano 0.8.2 and gpuarray.preallocate=1 for theano 0.9.0 and 0.10.dev. For most experiments (including those in the log files) I did not use preallocation, because the only way I could see the difference in memory usage was through nvidia-smi, which only shows the static preallocation when it is enabled. I believe the problem does not disappear with preallocation, since even then my training crashes for much smaller models with the new backend. However, with preallocation on I cannot measure the effect of switching backends on GPU memory usage.
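For reference, this is roughly how I enabled preallocation for each backend (a minimal sketch; in my actual runs the flags are passed through the THEANO_FLAGS environment variable, so the os.environ lines below are just for illustration):

```python
# Sketch of the preallocation settings used per backend.
# THEANO_FLAGS must be set before theano is imported.
import os

# Old backend (theano 0.8.2, device=gpu): CNMeM memory pool.
# os.environ["THEANO_FLAGS"] = "device=gpu,lib.cnmem=1"

# New backend (theano 0.9.0 / 0.10.dev, device=cuda): gpuarray preallocation.
os.environ["THEANO_FLAGS"] = "device=cuda,gpuarray.preallocate=1"

import theano  # imported only after the flags are set

print(theano.config.device)
```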
On Thursday, June 22, 2017 at 3:23:15 PM UTC+2, nouiz wrote:
>
> Do you use the Theano flag gpuarray.preallocate=1? When you tried the
> preallocation, how did you use it?
>
> It is mostly equivalent to lib.cnmem, but our default is different and by
> default gives more speed up, though it can sometimes cause memory
> fragmentation. The flag above fixes the fragmentation that can happen by
> default.
>
> On Thu, Jun 22, 2017 at 5:33 AM Fabian Stemmer <[email protected]> wrote:
>
>> One addition:
>> The theano 0.9.0 setup used libgpuarray v0.6.2.
>> The theano 0.10.dev setup used libgpuarray v0.6.5 - I just updated to
>> v0.6.7 and tested again, but I still get ~2GB memory usage.
>>
>> On Thursday, June 22, 2017 at 8:38:26 AM UTC+2, Fabian Stemmer wrote:
>>>
>>> Hi,
>>>
>>> I recently tried to switch my CNN implementation to the new theano GPU
>>> backend. To do so, I switched from "device=gpu" to "device=cuda" with
>>> theano9 and libgpuarray installed. My theano code then works with the new
>>> backend without any further changes.
>>>
>>> However, when I do this, my GPU memory consumption increases
>>> drastically. Theano memory profiling shows the same memory consumption
>>> for both backends, but when I monitor memory usage with nvidia-smi while
>>> the job is running, the old backend hovers somewhere around 400MB, while
>>> the new backend uses 2GB for the same model size and data. When I try to
>>> train larger models, the new GPU backend fails with memory errors for
>>> much smaller models than the old backend. This is also true when I
>>> activate memory pre-allocation.
>>>
>>> I tried to remove parts of my model or exclude certain theano
>>> optimizations (e.g. exclude conv_dnn to force theano to use a different
>>> convolution algorithm), but nothing I changed in the model structure had
>>> an impact on the discrepancy I see in memory usage.
>>>
>>> I use CUDA 8.0 and cuDNN 5105 for these experiments. For the old backend
>>> I see very similar behavior for both the 0.8.2 and 0.9.0 releases. For
>>> the new backend I tested the 0.9.0 release as well as a recent github
>>> checkout (commit c5cd87fa7895dc44c7acd54cb85e6d232b33bd3a) - both showed
>>> the same memory increase.
>>>
>>> I attached log files including my model's computational graph and
>>> information on libraries, environment variables, etc. Please let me know
>>> if I can supply any additional information to make it easier to look
>>> into this. I tried to prepare a simple sample script to reproduce the
>>> behavior, but have so far been unable to do so.
>>>
>>> Thanks
>>> Fabian
