Hi,

I'm struggling to get a Theano graph spread over two GPUs working: I keep 
encountering the error GpuArrayException: b'an illegal memory access was 
encountered' (the full traceback is at the end of this email).
The basic idea is to do a forward pass through two neural networks, each 
located on a separate device, and combine the outputs.

I'm using the latest Theano, libgpuarray and Lasagne to build the networks, 
and have hacked Lasagne a bit to be able to pass target='device' to the 
shared variable constructor during weight initialization.
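
The hack essentially boils down to forwarding a target keyword to 
theano.shared, which the gpuarray backend accepts for device placement. 
A minimal sketch of the idea (the wrapper name and shape are made up for 
illustration):

import numpy as np
import theano

def shared_on(value, name, target):
    # theano.shared forwards extra kwargs to the shared-variable
    # constructor; with the gpuarray backend, target=<context name>
    # places the array on that device.
    return theano.shared(value, name=name, target=target)

W = shared_on(np.zeros((512, 512), dtype='float32'), 'W', 'dev1')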

I have THEANO_FLAGS="contexts=dev1->cuda1;dev2->cuda2" and the output after 
theano import is:
Using cuDNN version 5005 on context None 
Mapped name None to device cuda: GeForce GTX 980 (0000:0A:00.0) 
Using cuDNN version 5005 on context dev1 
Mapped name dev1 to device cuda1: GeForce GTX 980 (0000:09:00.0) 
Using cuDNN version 5005 on context dev2 
Mapped name dev2 to device cuda2: GeForce GTX 980 (0000:06:00.0)
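
For what it's worth, the basic two-context pattern I'm trying to follow is 
the one from the Theano multi-GPU docs, roughly (shapes arbitrary):

import numpy as np
import theano
import theano.tensor as T

# One shared matrix per context; target= picks the device.
v1 = theano.shared(np.random.rand(16, 16).astype('float32'), target='dev1')
v2 = theano.shared(np.random.rand(16, 16).astype('float32'), target='dev2')

# Each dot runs on its own GPU; transfer() keeps each result
# on its own context instead of copying it back to the host.
f = theano.function([], [T.dot(v1, v1).transfer('dev1'),
                         T.dot(v2, v2).transfer('dev2')])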


The network definitions are quite lengthy (and the problem doesn't always 
reproduce on toy graphs), so here is a simplified example of what I'm doing:

import theano
import theano.tensor as T
import lasagne

# Two identical networks, one per GPU context.
inp_0 = T.tensor4('inp0')
r0 = build_model('dev1', input_var=inp_0)
inp_1 = T.tensor4('inp1')
r1 = build_model('dev2', input_var=inp_1)

r0_out = lasagne.layers.get_output(r0['fc6'], deterministic=False)
r1_out = lasagne.layers.get_output(r1['fc6'], deterministic=False)

# One function that evaluates both networks.
train_r0 = theano.function(
    [inp_0, inp_1],
    [r0_out, r1_out]
)

# x and x2 are input batches (defined elsewhere).
result0 = train_r0(x, x2)
This code fails with the aforementioned error.
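
The eventual goal, once this works, is to combine the two outputs on one 
context, along these lines (the sum is just a placeholder for the real 
combination op):

# Move r1's output to dev1's context before combining.
combined = r0_out.transfer('dev1') + r1_out.transfer('dev1')
train_combined = theano.function([inp_0, inp_1], combined)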

I've also tried compiling a separate function for each of the networks, 
like this:
train_r0 = theano.function(
    [inp_0],
    [r0_out]
)

train_r1 = theano.function(
    [inp_1],
    [r1_out]
)

With both functions compiled, running either train_r0 or train_r1 fails. 
But compiling and running just one of them (whether train_r0 or train_r1) 
works fine.
Could someone help me debug this? Please let me know if I should provide 
additional code/info.
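
If more verbose output would help, I can re-run with, e.g. (standard flags 
appended to my contexts setting; the script name is just a placeholder):

THEANO_FLAGS="contexts=dev1->cuda1;dev2->cuda2,exception_verbosity=high,optimizer=None" python repro.py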

Thanks,
Sergey.

The full traceback:

RuntimeError                              Traceback (most recent call last)
/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    883             outputs =\
--> 884                 self.fn() if output_subset is None else\
    885                 self.fn(output_subset=output_subset)

RuntimeError: Error in the elemwise call

During handling of the above exception, another exception occurred:

GpuArrayException                         Traceback (most recent call last)
<ipython-input-11-902c3b4617f7> in <module>()
----> 1 result0 = train_r0(x, x2)
      2 #result1 = train_r1(x2)

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    896                     node=self.fn.nodes[self.fn.position_of_error],
    897                     thunk=thunk,
--> 898                     storage_map=getattr(self.fn, 'storage_map', None))
    899             else:
    900                 # old-style linkers raise their own exceptions

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/link.py in raise_with_op(node, thunk, exc_info, storage_map)
    139 
    140     hints = []
--> 141     detailed_err_msg = "\nApply node that caused the error: " + str(node)
    142     if exc_value.__applynode_index__ is not None:
    143         detailed_err_msg += "\nToposort index: %d" % node_index

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in __str__(self)
    178 
    179     def __str__(self):
--> 180         return op_as_string(self.inputs, self)
    181 
    182     def __repr__(self):

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in op_as_string(i, op, leaf_formatter, node_formatter)
   1256     between i and o
   1257     """
-> 1258     strs = as_string(i, op.inputs, leaf_formatter, node_formatter)
   1259     return node_formatter(op, strs)
   1260 

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in as_string(i, o, leaf_formatter, node_formatter)
   1336             return leaf_formatter(r)
   1337 
-> 1338     return [describe(output) for output in o]
   1339 
   1340 

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in <listcomp>(.0)
   1336             return leaf_formatter(r)
   1337 
-> 1338     return [describe(output) for output in o]
   1339 
   1340 

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gof/graph.py in describe(r)
   1334                     return s
   1335         else:
-> 1336             return leaf_formatter(r)
   1337 
   1338     return [describe(output) for output in o]

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/theano/gpuarray/type.py in __str__(self)
    604         except gpuarray.GpuArrayException:
    605             np_data = self.data
--> 606         return "GpuArrayConstant{%s}" % np_data
    607 
    608 

pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__str__ (pygpu/gpuarray.c:28703)()

/home/facenx/.virtualenvs/multitheano/lib/python3.5/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    529 
    530     """
--> 531     return array(a, dtype, copy=False, order=order)
    532 
    533 

pygpu/gpuarray.pyx in pygpu.gpuarray.GpuArray.__array__ (pygpu/gpuarray.c:21616)()

pygpu/gpuarray.pyx in pygpu.gpuarray._pygpu_as_ndarray (pygpu/gpuarray.c:18322)()

pygpu/gpuarray.pyx in pygpu.gpuarray.array_read (pygpu/gpuarray.c:6923)()

GpuArrayException: b'an illegal memory access was encountered'


