[theano-users] Re: theano_alexnet "train.py"

Goffredo Giordano Sun, 26 Mar 2017 09:58:28 -0700

Hello Arnold,
I'm trying to test the theano_alexnet training code from 
https://github.com/uoguelph-mlrg/theano_alexnet. My computer is a native 
64-bit Windows 10 machine with an Intel Core i7. I use 
WinPython-64bit-3.4.4.4QT5 from WinPython 3.4.4.3, Visual Studio 2015 
Community Edition Update 3, CUDA 8.0.44 (64-bit), cuDNN v5.1 (August 10, 
2016) for CUDA 8.0, Git source control (based on the MinGW compiler), and 
OpenBLAS 0.2.14. The main Python libraries are Theano 0.9.0beta1, SciPy 
0.19.0, Keras 1.2.2, Lasagne 0.2.dev1, NumPy 1.11.1, hickle 2.0.4, h5py 
2.6.0, pycuda, pylearn2, and zeromq. I got some help from this group in my 
earlier thread, 
https://groups.google.com/forum/#!topic/theano-users/nyXwoO7A_rU. As you 
can see in the last message there, I ran into a problem in 
make_train_val_txt.py, which is called by generate_toy_data.sh. As far as I 
understand, the problem is related to the 
ILSVRC2012_validation_ground_truth.txt file. Could you help me understand 
this error? I downloaded the training and validation images first from 
http://www.image-net.org/challenges/LSVRC/2012/nonpub-downloads and then 
also from http://image-net.org/challenges/LSVRC/2014/download-images-5jj5.php


$ sh generate_toy_data.sh
generating toy dataset ...

Traceback (most recent call last):
  File "make_train_val_txt.py", line 61, in <module>
    str(dict_orig_id_to_sorted_id[int(val_labels[ind])]) + '\n'
KeyError: 490
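
One way to narrow this down might be to list which validation labels have 
no entry in dict_orig_id_to_sorted_id before line 61 runs. The following is 
only a minimal sketch: find_unmapped_labels and the toy values are 
hypothetical, and only the names val_labels and dict_orig_id_to_sorted_id 
come from the traceback above. It assumes val_labels is a list of text 
lines, each holding one integer label.

```python
def find_unmapped_labels(val_labels, dict_orig_id_to_sorted_id):
    """Return (line_index, label) pairs whose label is missing from the mapping.

    Hypothetical helper, not part of theano_alexnet.
    """
    missing = []
    for ind, raw in enumerate(val_labels):
        label = int(raw)  # int() tolerates surrounding whitespace and newlines
        if label not in dict_orig_id_to_sorted_id:
            missing.append((ind, label))
    return missing


# Toy stand-ins for the real data (made-up values, for illustration only).
toy_mapping = {1: 0, 2: 1, 3: 2}
toy_labels = ["1\n", "490\n", "3\n"]
print(find_unmapped_labels(toy_labels, toy_mapping))  # prints [(1, 490)]
```

If the real data gave a long list of missing labels, that would suggest the 
mapping was built from fewer classes than the ground-truth file refers to, 
but this is a guess and not a confirmed diagnosis.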




On Wednesday, 6 April 2016 at 00:10:06 UTC+2, Arnold Tunick wrote:
>
> Hello Petar,
>
>
> 1.  I received help from you on or about 15-17 March 2016 through Google 
> groups theano-users (topic: theano_alexnet "train.py").
> 2.  I have made great progress to install and test the prerequisite 
> software to implement Theano-AlexNet on a Windows 10 notebook computer. 
> 3.  I have re-installed and tested the newer version of Theano (v0.8.0) 
> with CUDA 7.5, MS Visual Studio 12.0, python 2.7.9.4, pycuda 2015.1.3 , 
> boost 1.5.9, TDM-GCC (64-bit), numpy, zeromq, hickle and pylearn2.
> 4.  I have successfully pre-processed a subset of the ImageNet data using 
> the script generate_toy_data.sh, which generated all of the expected 
> folders and files.
> 5.  After fixing some problems related to TypeErrors, per your 
> instruction, I then went ahead and ran theano-alexnet train.py as 
> C:\SciSoft\Git\theano_alexnet>python train.py THEANO_FLAGS=mode=FAST_RUN, 
> floatX=float32. 
> 6.  Now the program initializes fine, but when it starts the training, it 
> crashes with an error message that indicates something about the 
> operating system (OS) [see messages below].
> 7.  I have contacted Weiguang Ding, who co-authored a 06 April 2015 arXiv 
> paper on theano-alexnet entitled "Theano-based large-scale visual 
> recognition with multiple GPUs".
> 8.  Yet, he recommended that I continue to explore the Google groups 
> theano-users for help.
> 9.  Interestingly, both Fred Bastien and Pascal Lamblin advised running 
> the code on Linux because they think that the theano-alexnet code may use 
> features from CUDA that are only available on Linux.
> 10. Nevertheless, I would like to continue to work towards a viable 
> solution using the setup that I have already established, so that I can use 
> Theano-AlexNet to explore feature recognition from various new images.
> 11. Any suggestions or recommendations that you may offer would be greatly 
> appreciated.
> .
> Thanks in advance for your time and expert help.
> .
> Best,
> Arnold Tunick
>
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > C:\SciSoft\Git\theano_alexnet>python train.py THEANO_FLAGS=mode=FAST_RUN, floatX=float32
> >
> > Using gpu device 0: Quadro K4000M (CNMeM is disabled, CuDNN 3007)
> .
> > ... building the model
> .
> > conv (cudnn) layer with shape_in: (3, 227, 227, 256)
> > conv (cudnn) layer with shape_in: (96, 27, 27, 256)
> > conv (cudnn) layer with shape_in: (256, 13, 13, 256)
> > conv (cudnn) layer with shape_in: (384, 13, 13, 256)
> > conv (cudnn) layer with shape_in: (384, 13, 13, 256)
> > fc layer with num_in: 9216 num_out: 4096
> > dropout layer with P_drop: 0.5
> > fc layer with num_in: 4096 num_out: 4096
> > dropout layer with P_drop: 0.5
> > softmax layer with num_in: 4096 num_out: 1000
> .
> > ... training
> .
> > Process Process-1:
> > Traceback (most recent call last):
> >   File "C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\multiprocessing\process.py", line 266, in _bootstrap
> >     self.run()
> >   File "C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\multiprocessing\process.py", line 120, in run
> >     self._target(*self._args, **self._kwargs)
> >   File "C:\SciSoft\Git\theano_alexnet\train.py", line 69, in train_net
> >     h = drv.mem_get_ipc_handle(gpuarray_batch.ptr)
> .
> > LogicError: cuIpcGetMemHandle failed: OS call failed or operation not supported on this OS
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
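
The error quoted above, together with the advice in point 9 to run on 
Linux, suggests that the GPU memory sharing call is not available on this 
Windows setup. A minimal sketch of a guard that could run before attempting 
it is shown here. The function cuda_ipc_likely_supported is hypothetical, 
and treating Linux as the only supported platform is an assumption taken 
from this thread, not something queried from the CUDA driver.

```python
import platform


def cuda_ipc_likely_supported(system_name=None):
    """Heuristic guess at CUDA IPC availability based on the OS name only.

    Assumption from this thread: the call works on Linux and fails on
    Windows. This does not ask the driver what it actually supports.
    """
    if system_name is None:
        system_name = platform.system()  # e.g. "Linux", "Windows", "Darwin"
    return system_name == "Linux"


if cuda_ipc_likely_supported():
    print("CUDA IPC is probably usable on this OS")
else:
    print("CUDA IPC is probably not usable on this OS; "
          "the mem_get_ipc_handle call may fail as in the traceback")
```

A check like this would only make the failure easier to read. It would not 
make the multi-process data loading in train.py work on Windows.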
