Hi,
I'm struggling to get a Theano graph spread across two GPUs working: I
keep running into GpuArrayException: b'an illegal memory access was
encountered' (the full traceback is at the end of this email).
The basic idea is to do a forward pass through two neural networks, each
located on a separate device, and combine the outputs.
I'm using the latest Theano, libgpuarray and Lasagne to build the networks,
and have hacked Lasagne a bit to be able to pass target='device' to the
shared-variable constructor during weight initialization.
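In case it matters, the hack itself is trivial: I just thread a target keyword from Lasagne's parameter creation down to theano.shared. Here is a plain-Python stand-in sketch of the pattern (with a dummy in place of theano.shared, since the real patch touches Lasagne internals):

```python
# Hypothetical stand-in sketch (not my actual patch): the hack just threads a
# `target` keyword down to the shared-variable constructor, the way
# theano.shared accepts target= in the new gpuarray backend.

def make_shared(value, name=None, target=None):
    # stand-in for theano.shared; records which context the variable
    # would be allocated on
    return {"value": value, "name": name, "target": target}

def build_params(shapes, target):
    # mimics Lasagne's add_param path, forwarding `target` down to the
    # shared-variable constructor
    return [make_shared([0.0] * n, name="W%d" % i, target=target)
            for i, n in enumerate(shapes)]

params = build_params([3, 5], target='dev1')
```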
I have THEANO_FLAGS="contexts=dev1->cuda1;dev2->cuda2" and the output after
theano import is:
Using cuDNN version 5005 on context None
Mapped name None to device cuda: GeForce GTX 980 (0000:0A:00.0)
Using cuDNN version 5005 on context dev1
Mapped name dev1 to device cuda1: GeForce GTX 980 (0000:09:00.0)
Using cuDNN version 5005 on context dev2
Mapped name dev2 to device cuda2: GeForce GTX 980 (0000:06:00.0)
The network definitions are quite lengthy (and the problem doesn't always
reproduce on toy graphs), so I'm providing a simplified example of what I'm
doing:
inp_0 = T.tensor4('inp0')
r0 = build_model('dev1', input_var=inp_0)

inp_1 = T.tensor4('inp1')
r1 = build_model('dev2', input_var=inp_1)

r0_out = lasagne.layers.get_output(r0['fc6'], deterministic=False)
r1_out = lasagne.layers.get_output(r1['fc6'], deterministic=False)

train_r0 = theano.function(
    [inp_0, inp_1],
    [r0_out, r1_out]
)

result0 = train_r0(x, x2)
This code fails with the aforementioned error.
I've also tried compiling a separate function for each of the networks:
train_r0 = theano.function(
    [inp_0],
    [r0_out]
)
train_r1 = theano.function(
    [inp_1],
    [r1_out]
)
When both are compiled, running either train_r0 or train_r1 fails. But
compiling and running just a single function (either one of them) works
just fine.
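One workaround I'm considering (but haven't verified) is explicitly transferring both outputs onto a single context before combining them; I believe variables in the new gpuarray backend have a transfer(target) method. A plain-Python stand-in of the pattern, with a dummy class in place of real Theano variables:

```python
# Plain-Python stand-in illustrating the pattern I'd like to try: move both
# outputs onto one context before combining them. The Var class and its
# transfer() are dummies standing in for Theano variables and (what I believe
# is) the new backend's Variable.transfer(target) method.

class Var:
    def __init__(self, data, context):
        self.data = data
        self.context = context

    def transfer(self, target):
        # stand-in for Theano's transfer: return a copy of the variable
        # that lives on the requested context
        return Var(self.data, target)

r0_out = Var([1.0, 2.0], 'dev1')
r1_out = Var([3.0, 4.0], 'dev2')

# bring both results onto a single context before combining
combined = [v.transfer('dev1') for v in (r0_out, r1_out)]
```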
Could someone help me debug this? Please let me know if I should provide
additional code/info.
Thanks,
Sergey.
The full traceback:
RuntimeError Traceback (most recent call last)
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py
in __call__(self, *args, **kwargs)
883 outputs =\
--> 884 self.fn() if output_subset is None else\
885 self.fn(output_subset=output_subset)
RuntimeError: Error in the elemwise call
During handling of the above exception, another exception occurred:
GpuArrayException Traceback (most recent call last)
<ipython-input-11-902c3b4617f7> in <module>()
----> 1 result0 = train_r0(x, x2)
2 #result1 = train_r1(x2)
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py
in __call__(self, *args, **kwargs)
896 node=self.fn.nodes[self.fn.position_of_error],
897 thunk=thunk,
--> 898 storage_map=getattr(self.fn, 'storage_map', None))
899 else:
900 # old-style linkers raise their own exceptions
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/link.py
in raise_with_op(node, thunk, exc_info, storage_map)
139
140 hints = []
--> 141 detailed_err_msg = "\nApply node that caused the error: " +
str(node)
142 if exc_value.__applynode_index__ is not None:
143 detailed_err_msg += "\nToposort index: %d" % node_index
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py
in __str__(self)
178
179 def __str__(self):
--> 180 return op_as_string(self.inputs, self)
181
182 def __repr__(self):
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py
in op_as_string(i, op, leaf_formatter, node_formatter)
1256 between i and o
1257 """
-> 1258 strs = as_string(i, op.inputs, leaf_formatter, node_formatter)
1259 return node_formatter(op, strs)
1260
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py
in as_string(i, o, leaf_formatter, node_formatter)
1336 return leaf_formatter(r)
1337
-> 1338 return [describe(output) for output in o]
1339
1340
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py
in <listcomp>(.0)
1336 return leaf_formatter(r)
1337
-> 1338 return [describe(output) for output in o]
1339
1340
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py
in describe(r)
1334 return s
1335 else:
-> 1336 return leaf_formatter(r)
1337
1338 return [describe(output) for output in o]
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gpuarray/type.py
in __str__(self)
604 except gpuarray.GpuArrayException:
605 np_data = self.data
--> 606 return "GpuArrayConstant{%s}" % np_data
607
608
pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__str__ (pygpu/gpuarray.c:28703)()
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/numpy/core/numeric.py
in asarray(a, dtype, order)
529
530 """
--> 531 return array(a, dtype, copy=False, order=order)
532
533
pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__array__
(pygpu/gpuarray.c:21616)()
pygpu/gpuarray.pyx in pygpu.gpuarray._pygpu_as_ndarray
(pygpu/gpuarray.c:18322)()
pygpu/gpuarray.pyx in pygpu.gpuarray.array_read (pygpu/gpuarray.c:6923)()
GpuArrayException: b'an illegal memory access was encountered'
--
You received this message because you are subscribed to the Google Groups
"theano-users" group.