Do you have a script we could run to reproduce this? That would help us investigate. Please also send the Theano flags you use when reproducing it.
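If the full model is hard to share, even a tiny self-contained case in the spirit of the multiple-GPU example from the documentation would do. A rough sketch follows; run it with your exact flags. The matrix products are only placeholders for your networks, and repro.py is just an assumed file name:

    # Run as: THEANO_FLAGS="contexts=dev1->cuda1;dev2->cuda2" python repro.py
    import numpy
    import theano
    import theano.tensor as T

    # One weight matrix per context, standing in for the two networks.
    w1 = theano.shared(numpy.random.random((1024, 1024)).astype('float32'),
                       target='dev1')
    w2 = theano.shared(numpy.random.random((1024, 1024)).astype('float32'),
                       target='dev2')

    x1 = T.matrix('x1')
    x2 = T.matrix('x2')

    # A single function whose outputs are computed on both contexts,
    # like the combined train_r0 case below.
    f = theano.function([x1, x2], [T.dot(x1, w1), T.dot(x2, w2)])

    x = numpy.random.random((128, 1024)).astype('float32')
    print(f(x, x))

If something that small triggers the same GpuArrayException on your machine, that plus the flags is exactly what we need.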
thanks,

Fred

On Wed, Apr 5, 2017 at 12:54 PM Sergey Ovcharenko <[email protected]> wrote:

> Hi,
>
> I'm struggling to get a Theano graph spread over two GPUs working, but I
> keep encountering the GpuArrayException: b'an illegal memory access was
> encountered' error (the full traceback is at the end of this email).
> The basic idea is to do a forward pass through two neural networks, each
> located on a separate device, and combine the outputs.
>
> I'm using the latest Theano, libgpuarray and Lasagne to build the
> networks, and have hacked Lasagne a bit to be able to pass target='device'
> to the shared variable constructor during weights initialization.
>
> I have THEANO_FLAGS="contexts=dev1->cuda1;dev2->cuda2" and the output
> after the Theano import is:
>
>     Using cuDNN version 5005 on context None
>     Mapped name None to device cuda: GeForce GTX 980 (0000:0A:00.0)
>     Using cuDNN version 5005 on context dev1
>     Mapped name dev1 to device cuda1: GeForce GTX 980 (0000:09:00.0)
>     Using cuDNN version 5005 on context dev2
>     Mapped name dev2 to device cuda2: GeForce GTX 980 (0000:06:00.0)
>
> The network definitions are quite lengthy (and the problem doesn't always
> reproduce on toy graphs), so I'm providing a simplified example of what
> I'm doing:
>
>     inp_0 = T.tensor4('inp0')
>     r0 = build_model('dev1', input_var=inp_0)
>     inp_1 = T.tensor4('inp1')
>     r1 = build_model('dev2', input_var=inp_1)
>
>     r0_out = lasagne.layers.get_output(r0['fc6'], deterministic=False)
>     r1_out = lasagne.layers.get_output(r1['fc6'], deterministic=False)
>
>     train_r0 = theano.function(
>         [inp_0, inp_1],
>         [r0_out, r1_out]
>     )
>
>     result0 = train_r0(x, x2)
>
> This code fails with the aforementioned error.
>
> I've also tried to compile a separate function for each of the networks,
> like:
>
>     train_r0 = theano.function(
>         [inp_0],
>         [r0_out]
>     )
>
>     train_r1 = theano.function(
>         [inp_1],
>         [r1_out]
>     )
>
> Running either train_r0 or train_r1 then fails. But compiling and running
> just a single function (whether train_r0 or train_r1) works fine.
>
> Could someone help me debug this? Please let me know if I should provide
> additional code/info.
>
> Thanks,
> Sergey.
>
> The full traceback:
>
>     RuntimeError                              Traceback (most recent call last)
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
>         883             outputs =\
>     --> 884                 self.fn() if output_subset is None else\
>         885                 self.fn(output_subset=output_subset)
>
>     RuntimeError: Error in the elemwise call
>
>     During handling of the above exception, another exception occurred:
>
>     GpuArrayException                         Traceback (most recent call last)
>     <ipython-input-11-902c3b4617f7> in <module>()
>     ----> 1 result0 = train_r0(x, x2)
>           2 #result1 = train_r1(x2)
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
>         896                     node=self.fn.nodes[self.fn.position_of_error],
>         897                     thunk=thunk,
>     --> 898                     storage_map=getattr(self.fn, 'storage_map', None))
>         899             else:
>         900                 # old-style linkers raise their own exceptions
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/link.py in raise_with_op(node, thunk, exc_info, storage_map)
>         139
>         140     hints = []
>     --> 141     detailed_err_msg = "\nApply node that caused the error: " + str(node)
>         142     if exc_value.__applynode_index__ is not None:
>         143         detailed_err_msg += "\nToposort index: %d" % node_index
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in __str__(self)
>         178
>         179     def __str__(self):
>     --> 180         return op_as_string(self.inputs, self)
>         181
>         182     def __repr__(self):
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in op_as_string(i, op, leaf_formatter, node_formatter)
>        1256     between i and o
>        1257     """
>     -> 1258     strs = as_string(i, op.inputs, leaf_formatter, node_formatter)
>        1259     return node_formatter(op, strs)
>        1260
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in as_string(i, o, leaf_formatter, node_formatter)
>        1336         return leaf_formatter(r)
>        1337
>     -> 1338     return [describe(output) for output in o]
>        1339
>        1340
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in <listcomp>(.0)
>        1336         return leaf_formatter(r)
>        1337
>     -> 1338     return [describe(output) for output in o]
>        1339
>        1340
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in describe(r)
>        1334             return s
>        1335         else:
>     -> 1336             return leaf_formatter(r)
>        1337
>        1338     return [describe(output) for output in o]
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gpuarray/type.py in __str__(self)
>         604         except gpuarray.GpuArrayException:
>         605             np_data = self.data
>     --> 606         return "GpuArrayConstant{%s}" % np_data
>         607
>         608
>
>     pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__str__ (pygpu/gpuarray.c:28703)()
>
>     /home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
>         529
>         530     """
>     --> 531     return array(a, dtype, copy=False, order=order)
>         532
>         533
>
>     pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__array__ (pygpu/gpuarray.c:21616)()
>
>     pygpu/gpuarray.pyx in pygpu.gpuarray._pygpu_as_ndarray (pygpu/gpuarray.c:18322)()
>
>     pygpu/gpuarray.pyx in pygpu.gpuarray.array_read (pygpu/gpuarray.c:6923)()
>
>     GpuArrayException: b'an illegal memory access was encountered'
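For reference, the Lasagne hack mentioned above boils down to forwarding a target keyword to theano.shared, which names the context from the contexts=... flag. A minimal sketch, where make_weights is a hypothetical stand-in for the patched Lasagne parameter constructor:

    import numpy
    import theano

    # Sketch of per-context weight placement with the gpuarray back-end.
    # theano.shared accepts a `target` keyword naming the context
    # ('dev1', 'dev2' from contexts=dev1->cuda1;dev2->cuda2); make_weights
    # is a hypothetical stand-in for Lasagne's parameter creation.
    def make_weights(shape, context_name):
        values = numpy.random.normal(0, 0.01, size=shape).astype('float32')
        return theano.shared(values, target=context_name)

    W1 = make_weights((4096, 1000), 'dev1')  # stored on cuda1
    W2 = make_weights((4096, 1000), 'dev2')  # stored on cuda2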
