How did you solve the problem? What caused it?

Thanks!

On Friday, March 11, 2016 at 6:24:45 (UTC+1), Sonam Singh wrote:
>
> I guess it was a driver issue. I tweaked a few things here and there, and 
> after a few days I was no longer able to reproduce it.
>
> Thanks
> -Sonam
> On Wed, Mar 9, 2016 at 4:50 AM, Frédéric Bastien <frederic...@gmail.com> 
> wrote:
>
>> This is hard to debug. Do you still get this error with the same model but 
>> smaller layer sizes? Can you try with the environment variable 
>> CUDA_LAUNCH_BLOCKING=1? This can give a better error message.
>>
>> Fred
>> On March 1, 2016 at 04:38, "Sonam Singh" <sonams...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have an RNN encoder-decoder model which runs fine on the CPU but 
>>> throws this error on the GPU. 
>>> Note: I can run a similar model (an RNN encoder-decoder with slight cost 
>>> function changes, on different data) without any errors, so the drivers 
>>> and everything else seem fine.
>>>
>>> Let me know if an exception_verbosity=high trace is needed.
>>>
>>>
>>> TRACE:
>>>
>>> Error when tring to find the memory information on the GPU: unspecified 
>>> launch failure
>>> Error freeing device pointer 0x130040c600 (unspecified launch failure). 
>>> Driver report 0 bytes free and 0 bytes total 
>>> device_free: cudaFree() returned an error, but there is already an 
>>> Python error set. This happen during the clean up when there is 
>>> a first error and the CUDA driver is in a so bad state that it don't 
>>> work anymore. We keep the previous error set to help debugging it.
>>> CudaNdarray_uninit: error freeing dev_structure memory 0x130040c600 
>>> (self=0x7f3433c3bab0)
>>>
>>>
>>>
>>>
>>>
>>>  reraise(exc_type, exc_value, exc_trace)
>>>   File "/home/ms/ssingh/anaconda2/lib/python2.7/site-packages/Theano-0.8.0.dev0-py2.7.egg/theano/compile/function_module.py", line 859, in __call__
>>>     outputs = self.fn()
>>>   File "/home/ms/ssingh/anaconda2/lib/python2.7/site-packages/Theano-0.8.0.dev0-py2.7.egg/theano/scan_module/scan_op.py", line 963, in rval
>>>     r = p(n, [x[0] for x in i], o)
>>>   File "/home/ms/ssingh/anaconda2/lib/python2.7/site-packages/Theano-0.8.0.dev0-py2.7.egg/theano/scan_module/scan_op.py", line 952, in <lambda>
>>>     self, node)
>>>   File "theano/scan_module/scan_perform.pyx", line 505, in theano.scan_module.scan_perform.perform (/home/ms/ssingh/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.5-Santiago-x86_64-2.7.10-64/scan_perform/mod.cpp:5551)
>>> RuntimeError: CudaNdarray.__setitem__: syncing structure to device failed
>>> Apply node that caused the error: forall_inplace,gpu,grad_of_scan_fn}(TensorConstant{10}, GpuDimShuffle{0,2,1}.0, GpuDimShuffle{0,2,1}.0, GpuElemwise{Composite{(i0 - sqr(i1))},no_inplace}.0, GpuSubtensor{::int64}.0, GpuAlloc{memset_0=True}.0, GpuAlloc{memset_0=True}.0, GpuAlloc{memset_0=True}.0, GpuDimShuffle{1,0}.0)
>>> Toposort index: 247
>>> Inputs types: [TensorType(int64, scalar), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, 3D), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
>>> Inputs shapes: [(), (10, 2048, 4), (10, 10000, 4), (10, 4, 2048), (11, 4, 2048), (2, 10000, 2048), (2, 2048, 2048), (2, 2048), (2048, 2048)]
>>> Inputs strides: [(), (-8192, 1, 2048), (-10000, 1, 100000), (8192, 2048, 1), (-8192, 2048, 1), (20480000, 2048, 1), (4194304, 2048, 1), (2048, 1), (1, 2048)]
>>> Inputs values: [array(10), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
>>> Outputs clients: [[], [GpuSubtensor{int64}(forall_inplace,gpu,grad_of_scan_fn}.1, Constant{1})], [GpuSubtensor{int64}(forall_inplace,gpu,grad_of_scan_fn}.2, Constant{1})], [GpuSubtensor{int64}(forall_inplace,gpu,grad_of_scan_fn}.3, Constant{1})]]
>>>
>>>
>>>
>>> Thanks,
>>> Sonam
>>>
>>> -- 
>>>
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "theano-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to theano-users...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>

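Fred's suggestion in the quoted message can be tried without editing the model code, by launching the training script from a small wrapper that sets the environment first. The following is only a sketch under some assumptions: debug_env and run_with_debug_env are helper names made up for this example, and train.py in the usage note is a placeholder for the real script. CUDA_LAUNCH_BLOCKING=1 makes CUDA kernel launches synchronous, so the failure should be reported closer to the call that caused it; exception_verbosity=high is the Theano flag mentioned in the original report, passed here through THEANO_FLAGS.

```python
import os
import subprocess
import sys


def debug_env(base_env=None, verbose_exceptions=True):
    """Return a copy of the environment with GPU debugging switches set."""
    env = dict(os.environ if base_env is None else base_env)
    # Synchronous kernel launches: slower, but errors surface at the failing call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    if verbose_exceptions:
        # Keep any flags already set, but replace an existing exception_verbosity.
        flags = [f for f in env.get("THEANO_FLAGS", "").split(",") if f]
        flags = [f for f in flags if not f.startswith("exception_verbosity=")]
        flags.append("exception_verbosity=high")
        env["THEANO_FLAGS"] = ",".join(flags)
    return env


def run_with_debug_env(argv):
    """Run a command in a child process with the debug environment; return its exit code."""
    return subprocess.call(argv, env=debug_env())
```

For example, run_with_debug_env([sys.executable, "train.py"]) would run the placeholder script train.py with both settings applied. The variables have to be set before the child process starts, since they are read when Theano and the CUDA runtime initialize.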
