Hello Arnold,
I'm trying to test the theano_alexnet training code from
https://github.com/uoguelph-mlrg/theano_alexnet. My computer is a native
64-bit Windows 10 machine with an Intel Core i7. I use
WinPython-64bit-3.4.4.4QT5 from WinPython 3.4.4.3, Visual Studio 2015
Community Edition Update 3, CUDA 8.0.44 (64-bit), cuDNN v5.1 (August 10,
2016) for CUDA 8.0, Git source control (MinGW-based), and OpenBLAS 0.2.14.
The main Python libraries are Theano 0.9.0beta1, SciPy 0.19.0, Keras 1.2.2,
Lasagne 0.2.dev1, NumPy 1.11.1, hickle 2.0.4, h5py 2.6.0, pycuda, pylearn2,
and zeromq. I got some help from this group in my earlier thread,
https://groups.google.com/forum/#!topic/theano-users/nyXwoO7A_rU. As you
can see in the last message there, running generate_toy_data.sh fails
inside make_train_val_txt.py. As far as I understand, the problem is
related to the ILSVRC2012_validation_ground_truth.txt file. Could you help
me understand this error? I downloaded the training and validation images
first from http://www.image-net.org/challenges/LSVRC/2012/nonpub-downloads
and then also from
http://image-net.org/challenges/LSVRC/2014/download-images-5jj5.php
$ sh generate_toy_data.sh
generating toy dataset ...
Traceback (most recent call last):
  File "make_train_val_txt.py", line 61, in <module>
    str(dict_orig_id_to_sorted_id[int(val_labels[ind])]) + '\n'
KeyError: 490
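
The KeyError means that label 490 from the validation ground-truth file has no entry in dict_orig_id_to_sorted_id. To see whether 490 is the only such label, a small check like the one below might help. This is only a sketch: find_unmapped_labels is a hypothetical helper, not part of theano_alexnet, and toy_map / toy_labels are made-up stand-ins for the real dictionary and for the lines of ILSVRC2012_validation_ground_truth.txt.

```python
def find_unmapped_labels(label_lines, id_map):
    """Return the sorted integer labels that are not keys of id_map.

    label_lines: iterable of strings, one label per line (blank lines skipped).
    id_map: dict keyed by the original integer class ID.
    """
    labels = [int(line) for line in label_lines if line.strip()]
    return sorted(set(labels) - set(id_map))


# Made-up stand-ins (assumptions, not the real data):
toy_map = {1: 0, 2: 1, 3: 2}
toy_labels = ["1\n", "490\n", "3\n", "490\n"]

print(find_unmapped_labels(toy_labels, toy_map))
```

With the real script, the same function could be called just before line 61 with val_labels and dict_orig_id_to_sorted_id. A long list of missing labels would suggest that the dictionary and the ground-truth file do not describe the same set of classes.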
On Wednesday, 6 April 2016 at 00:10:06 UTC+2, Arnold Tunick wrote:
>
> Hello Petar,
>
>
> 1. I received help from you on or about 15-17 March 2016 through Google
> Groups theano-users (topic: theano_alexnet "train.py").
> 2. I have made great progress to install and test the prerequisite
> software to implement Theano-AlexNet on a Windows 10 notebook computer.
> 3. I have re-installed and tested the newer version of Theano (v0.8.0)
> with CUDA 7.5, MS Visual Studio 12.0, python 2.7.9.4, pycuda 2015.1.3,
> boost 1.5.9, TDM-GCC (64-bit), numpy, zeromq, hickle and pylearn2.
> 4. I have successfully pre-processed a subset of the ImageNet data using
> the script generate_toy_data.sh, which generated all of the expected
> folders and files.
> 5. After fixing some problems related to TypeErrors, per your
> instruction, I then went ahead and ran theano-alexnet train.py as
> C:\SciSoft\Git\theano_alexnet>python train.py THEANO_FLAGS=mode=FAST_RUN,
> floatX=float32.
> 6. Now the program initializes fine, but when it starts training, it
> crashes with an error message saying that the operation is not supported
> on this operating system (see messages below).
> 7. I have contacted Weiguang Ding, who co-authored a 06 April 2015 arXiv
> paper on theano-alexnet entitled "Theano-based large-scale visual
> recognition with multiple GPUs."
> 8. Yet, he recommended that I continue to explore the Google groups
> theano-users for help.
> 9. Interestingly, both Fred Bastien and Pascal Lamblin advised running
> the code on Linux because they think that the theano-alexnet code may use
> features from CUDA that are only available on Linux.
> 10. Nevertheless, I would like to continue to work towards a viable
> solution using the setup that I have already established, so that I can use
> Theano-AlexNet to explore feature recognition from various new images.
> 11. Any suggestions or recommendations that you may offer would be greatly
> appreciated.
> .
> Thanks in advance for your time and expert help.
> .
> Best,
> Arnold Tunick
>
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > C:\SciSoft\Git\theano_alexnet>python train.py THEANO_FLAGS=mode=FAST_RUN, floatX=float32
> >
> > Using gpu device 0: Quadro K4000M (CNMeM is disabled, CuDNN 3007)
> .
> > ... building the model
> .
> > conv (cudnn) layer with shape_in: (3, 227, 227, 256)
> > conv (cudnn) layer with shape_in: (96, 27, 27, 256)
> > conv (cudnn) layer with shape_in: (256, 13, 13, 256)
> > conv (cudnn) layer with shape_in: (384, 13, 13, 256)
> > conv (cudnn) layer with shape_in: (384, 13, 13, 256)
> > fc layer with num_in: 9216 num_out: 4096
> > dropout layer with P_drop: 0.5
> > fc layer with num_in: 4096 num_out: 4096
> > dropout layer with P_drop: 0.5
> > softmax layer with num_in: 4096 num_out: 1000
> .
> > ... training
> .
> > Process Process-1:
> > Traceback (most recent call last):
> >   File "C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\multiprocessing\process.py", line 266, in _bootstrap
> >     self.run()
> >   File "C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\multiprocessing\process.py", line 120, in run
> >     self._target(*self._args, **self._kwargs)
> >   File "C:\SciSoft\Git\theano_alexnet\train.py", line 69, in train_net
> >     h = drv.mem_get_ipc_handle(gpuarray_batch.ptr)
> > LogicError: cuIpcGetMemHandle failed: OS call failed or operation not supported on this OS
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>