Frederic, this is the feedback after the upgrade about float16.

Python 2.7.11 |Anaconda custom (64-bit)| (default, Dec  6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import run_multi_conv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "run_multi_conv.py", line 1, in <module>
    import mpr_convnet_class as conv
  File "mpr_convnet_class.py", line 2, in <module>
    from convnet3d import ConvLayer, PoolLayer
  File "convnet3d.py", line 3, in <module>
    from theano.tensor.nnet.conv3d2d import conv3d
  File "/home/luca/data/Theano-master/theano/__init__.py", line 125, in <module>
    import theano.gpuarray
  File "/home/luca/data/Theano-master/theano/gpuarray/__init__.py", line 31, in <module>
    from . import fft, dnn, opt, nerv, extra_ops
  File "/home/luca/data/Theano-master/theano/gpuarray/dnn.py", line 17, in <module>
    from theano.compile.ops import shape_i, shape_i_op
ImportError: cannot import name shape_i_op
>>>
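For context, an ImportError of this shape means the `theano.compile.ops` module that Python actually loaded does not define `shape_i_op` — often a sign that an older copy of the sources is being picked up instead of the upgraded ones. A minimal stand-in (not using Theano itself; the `fake_ops` module below is hypothetical) that reproduces the failure mode:

```python
import sys
import types

# Hypothetical stand-in for a stale module: like the old
# theano.compile.ops here, it defines shape_i but not shape_i_op.
fake_ops = types.ModuleType("fake_ops")
fake_ops.shape_i = lambda var, i: None
sys.modules["fake_ops"] = fake_ops

try:
    # Mirrors the failing line: from theano.compile.ops import shape_i, shape_i_op
    from fake_ops import shape_i, shape_i_op
except ImportError as exc:
    # The loaded module simply lacks the requested name.
    msg = str(exc)

print(msg)
```

If the upgraded sources do define the name, the usual fix is making sure no older install (site-packages copy, stale `.pyc`, or a different entry on `sys.path`) shadows them.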
On Thursday, July 21, 2016 at 11:15:06 AM UTC+2, [email protected] wrote:
>
> Frederic,
> I'll do it and give you feedback,
> many thanks
> Luca
>
> On Tuesday, July 19, 2016 at 10:09:21 PM UTC+2, nouiz wrote:
>>
>> We have a PR that upgrades some stuff about float16:
>>
>> https://github.com/Theano/Theano/pull/4764/files
>>
>> It probably fixes your problem. Can you try it to confirm that you don't
>> have a different problem?
>>
>> thanks
>>
>> Frédéric
>>
>> On Fri, Jul 15, 2016 at 4:55 AM, <[email protected]> wrote:
>>
>>> ok, I'll try.
>>> thanks
>>>
>>> On Thursday, July 14, 2016 at 11:44:41 PM UTC+2, Arnaud Bergeron wrote:
>>>>
>>>> I can't reproduce your problem using a simple convolution in float16.
>>>>
>>>> Either this is because your code is doing something unexpected or
>>>> because the problem has been fixed in the development version.
>>>>
>>>> In any case the development version is a much better option for the new
>>>> backend and float16, so I encourage you to upgrade and try again:
>>>> http://deeplearning.net/software/theano/install.html#bleeding-edge-install-instructions
>>>>
>>>> 2016-07-14 4:22 GMT-04:00 <[email protected]>:
>>>>
>>>>> Here is .theanorc:
>>>>>
>>>>> [global]
>>>>> floatX = float16
>>>>> device = cuda
>>>>>
>>>>> [cuda]
>>>>> root = /usr/local/cuda-7.5
>>>>>
>>>>> [nvcc]
>>>>> fastmath = True
>>>>>
>>>>> optimizer = fast_compile
>>>>>
>>>>> On Thursday, July 14, 2016 at 10:19:56 AM UTC+2, [email protected] wrote:
>>>>>>
>>>>>> Hi Arnaud,
>>>>>> I put _f16_ok = True in dnn.py (attached).
>>>>>>
>>>>>> This is the error I received:
>>>>>>
>>>>>> Python 2.7.11 |Anaconda custom (64-bit)| (default, Dec  6 2015, 18:08:32)
>>>>>> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> Anaconda is brought to you by Continuum Analytics.
>>>>>> Please check out: http://continuum.io/thanks and https://anaconda.org
>>>>>> >>> import run_multi_conv
>>>>>>
>>>>>> Mapped name None to device cuda: GeForce 840M
>>>>>> WARNING (theano.gof.compilelock): Overriding existing lock by dead
>>>>>> process '3202' (I am process '3351')
>>>>>> Using cuDNN version 5005 on context None
>>>>>> /home/luca/data/Theano-master/theano/tensor/signal/downsample.py:6:
>>>>>> UserWarning: downsample module has been moved to the
>>>>>> theano.tensor.signal.pool module.
>>>>>>   "downsample module has been moved to the theano.tensor.signal.pool
>>>>>>   module.")
>>>>>> >>> run_multi_conv.run_experiments()
>>>>>> Disabling C code for Elemwise{mul,no_inplace} due to unsupported float16
>>>>>> Disabling C code for Elemwise{Cast{float32}} due to unsupported float16
>>>>>> Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
>>>>>> Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
>>>>>> Disabling C code for Alloc due to unsupported float16
>>>>>> Disabling C code for Cast{float16} due to unsupported float16
>>>>>> Disabling C code for Cast{float16} due to unsupported float16
>>>>>> Disabling C code for Cast{float16} due to unsupported float16
>>>>>> Disabling C code for Cast{float16} due to unsupported float16
>>>>>> Disabling C code for RandomFunction{binomial} due to unsupported float16
>>>>>> Disabling C code for RandomFunction{binomial} due to unsupported float16
>>>>>> ===============================
>>>>>> #include <Python.h>
>>>>>> #include <iostream>
>>>>>> #include "theano_mod_helper.h"
>>>>>> #include <gpuarray/array.h>
>>>>>> #include <gpuarray/kernel.h>
>>>>>> #include <gpuarray/error.h>
>>>>>> #include <gpuarray/buffer.h>
>>>>>> #include <gpuarray/buffer_blas.h>
>>>>>> #include <numpy/arrayobject.h>
>>>>>> #include <gpuarray_api.h>
>>>>>> #include <math.h>
>>>>>> #include <numpy/arrayscalars.h>
>>>>>> #include "cudnn.h"
>>>>>> #include "cudnn_helper.h"
>>>>>> #include "gpuarray_helper.h"
>>>>>> #include "gpuarray/types.h"
>>>>>> #include "gpuarray/array.h"
>>>>>> #include "gpuarray/util.h"
>>>>>> #include "gpuarray/ext_cuda.h"
>>>>>> #include "gpuarray_api.h"
>>>>>> #include "numpy_compat.h"
>>>>>> //////////////////////
>>>>>> ////  Support Code
>>>>>> //////////////////////
>>>>>>
>>>>>> static int
>>>>>> c_set_tensorNd(PyGpuArrayObject *var, cudnnTensorDescriptor_t desc) {
>>>>>>   cudnnDataType_t dt;
>>>>>>   size_t ds;
>>>>>>   switch (var->ga.typecode) {
>>>>>>   case GA_FLOAT:
>>>>>>     dt = CUDNN_DATA_FLOAT;
>>>>>>     break;
>>>>>>   case GA_DOUBLE:
>>>>>>     dt = CUDNN_DATA_DOUBLE;
>>>>>>     break;
>>>>>> #if CUDNN_VERSION > 3000
>>>>>>   case GA_HALF:
>>>>>>     dt = CUDNN_DATA_HALF;
>>>>>>     break;
>>>>>> #endif
>>>>>>   default:
>>>>>>     PyErr_SetString(PyExc_TypeError, "Non-float datatype in c_set_tensorNd");
>>>>>>     return -1;
>>>>>>   }
>>>>>>   ds = gpuarray_get_elsize(var->ga.typecode);
>>>>>>
>>>>>>   int strs[5], dims[5], default_stride = 1;
>>>>>>   unsigned int nd = PyGpuArray_NDIM(var);
>>>>>>
>>>>>>   if (nd > 5) {
>>>>>>     PyErr_SetString(PyExc_TypeError, "Tensor of more than 5d");
>>>>>>     return -1;
>>>>>>   }
>>>>>>
>>>>>>   for (unsigned int _i = nd; _i > 0; _i--) {
>>>>>>     unsigned int i = _i - 1;
>>>>>>     strs[i] = PyGpuArray_STRIDE(var, i) ?
>>>>>>       PyGpuArray_STRIDE(var, i)/ds : default_stride;
>>>>>>     default_stride *= PyGpuArray_DIM(var, i);
>>>>>>     dims[i] = PyGpuArray_DIM(var, i);
>>>>>>   }
>>>>>>
>>>>>>   cudnnStatus_t err = cudnnSetTensorNdDescriptor(desc, dt, nd, dims, strs);
>>>>>>   if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>     PyErr_Format(PyExc_RuntimeError,
>>>>>>                  "Could not set tensorNd descriptor: %s",
>>>>>>                  cudnnGetErrorString(err));
>>>>>>     return -1;
>>>>>>   }
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> static int
>>>>>> c_set_filter(PyGpuArrayObject *var, cudnnFilterDescriptor_t desc) {
>>>>>>   cudnnDataType_t dt;
>>>>>>   cudnnStatus_t err;
>>>>>>
>>>>>>   if (!GpuArray_IS_C_CONTIGUOUS(&var->ga)) {
>>>>>>     PyErr_SetString(PyExc_ValueError,
>>>>>>                     "Only contiguous filters (kernels) are supported.");
>>>>>>     return -1;
>>>>>>   }
>>>>>>   switch (var->ga.typecode) {
>>>>>>   case GA_FLOAT:
>>>>>>     dt = CUDNN_DATA_FLOAT;
>>>>>>     break;
>>>>>>   case GA_DOUBLE:
>>>>>>     dt = CUDNN_DATA_DOUBLE;
>>>>>>     break;
>>>>>> #if CUDNN_VERSION > 3000
>>>>>>   case GA_HALF:
>>>>>>     dt = CUDNN_DATA_HALF;
>>>>>>     break;
>>>>>> #endif
>>>>>>   default:
>>>>>>     PyErr_SetString(PyExc_TypeError, "Non-float datatype in c_set_filter");
>>>>>>     return -1;
>>>>>>   }
>>>>>>
>>>>>>   int dims[5];
>>>>>>   unsigned int nd = PyGpuArray_NDIM(var);
>>>>>>
>>>>>>   if (nd > 5) {
>>>>>>     PyErr_SetString(PyExc_TypeError, "Tensor of more than 5d");
>>>>>>     return -1;
>>>>>>   }
>>>>>>
>>>>>>   for (unsigned int _i = nd; _i > 0; _i--) {
>>>>>>     unsigned int i = _i - 1;
>>>>>>     dims[i] = PyGpuArray_DIM(var, i);
>>>>>>   }
>>>>>>
>>>>>> #if CUDNN_VERSION >= 5000
>>>>>>   err = cudnnSetFilterNdDescriptor(desc, dt, CUDNN_TENSOR_NCHW, nd, dims);
>>>>>> #else
>>>>>>   err = cudnnSetFilterNdDescriptor(desc, dt, nd, dims);
>>>>>> #endif
>>>>>>
>>>>>>   if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>     PyErr_Format(PyExc_RuntimeError,
>>>>>>                  "Could not set filter descriptor: %s.",
>>>>>>                  cudnnGetErrorString(err));
>>>>>>     return -1;
>>>>>>   }
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> namespace {
>>>>>> struct __struct_compiled_op_86feacd077d8749f42b5d82709a80ba3 {
>>>>>>   PyObject* __ERROR;
>>>>>>
>>>>>>   PyObject* storage_V3;
>>>>>>   PyObject* storage_V5;
>>>>>>   PyObject* storage_V7;
>>>>>>   PyObject* storage_V9;
>>>>>>   PyObject* storage_V11;
>>>>>>   PyObject* storage_V13;
>>>>>>   PyObject* storage_V1;
>>>>>>   PyObject* storage_V15;
>>>>>>
>>>>>>   PyObject* py_V15;
>>>>>>   PyGpuContextObject *V15;
>>>>>> #define DTYPE_INPUT_0 npy_float16
>>>>>> #define TYPENUM_INPUT_0 23
>>>>>> #define ITEMSIZE_INPUT_0 2
>>>>>> #define DTYPE_INPUT_1 npy_float16
>>>>>> #define TYPENUM_INPUT_1 23
>>>>>> #define ITEMSIZE_INPUT_1 2
>>>>>> #define DTYPE_INPUT_2 npy_float16
>>>>>> #define TYPENUM_INPUT_2 23
>>>>>> #define ITEMSIZE_INPUT_2 2
>>>>>> #define DTYPE_INPUT_4 npy_float16
>>>>>> #define TYPENUM_INPUT_4 23
>>>>>> #define ITEMSIZE_INPUT_4 2
>>>>>> #define DTYPE_INPUT_5 npy_float16
>>>>>> #define TYPENUM_INPUT_5 23
>>>>>> #define ITEMSIZE_INPUT_5 2
>>>>>> #define DTYPE_OUTPUT_0 npy_float16
>>>>>> #define TYPENUM_OUTPUT_0 23
>>>>>> #define ITEMSIZE_OUTPUT_0 2
>>>>>> #define APPLY_SPECIFIC(str) str##_node_86feacd077d8749f42b5d82709a80ba3_0
>>>>>> #define CONV_INPLACE 1
>>>>>> #define CONV_ALGO CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
>>>>>>
>>>>>> PyGpuContextObject *ctx;
>>>>>> cudnnHandle_t APPLY_SPECIFIC(_handle);
>>>>>>
>>>>>> cudnnTensorDescriptor_t APPLY_SPECIFIC(input);
>>>>>> cudnnTensorDescriptor_t APPLY_SPECIFIC(output);
>>>>>> cudnnFilterDescriptor_t APPLY_SPECIFIC(kerns);
>>>>>>
>>>>>> #ifdef CHOOSE_ALGO
>>>>>> int reuse_algo;
>>>>>> cudnnConvolutionFwdAlgo_t prev_algo;
>>>>>> #ifndef CHOOSE_ONCE
>>>>>> size_t prev_img_dims[5];
>>>>>> size_t prev_kern_dims[5];
>>>>>> #endif
>>>>>> #endif
>>>>>>
>>>>>> int
>>>>>> APPLY_SPECIFIC(conv_fwd)(PyGpuArrayObject *input, PyGpuArrayObject *kerns,
>>>>>>                          PyGpuArrayObject *om,
>>>>>>                          cudnnConvolutionDescriptor_t desc,
>>>>>>                          double alpha, double beta,
>>>>>>                          PyGpuArrayObject **output,
>>>>>>                          PyGpuContextObject *c) {
>>>>>>   cudnnStatus_t err = CUDNN_STATUS_SUCCESS;
>>>>>>   float af = alpha, bf = beta;
>>>>>>   void *alpha_p;
>>>>>>   void *beta_p;
>>>>>>
>>>>>>   if (PyGpuArray_DIMS(input)[1] != PyGpuArray_DIMS(kerns)[1]) {
>>>>>>     PyErr_SetString(PyExc_ValueError,
>>>>>>                     "images and kernel must have the same stack size");
>>>>>>     return 1;
>>>>>>   }
>>>>>>
>>>>>>   if (c_set_tensorNd(input, APPLY_SPECIFIC(input)) == -1)
>>>>>>     return 1;
>>>>>>   if (c_set_filter(kerns, APPLY_SPECIFIC(kerns)) == -1)
>>>>>>     return 1;
>>>>>>
>>>>>>   switch (input->ga.typecode) {
>>>>>>   case GA_DOUBLE:
>>>>>>     alpha_p = (void *)&alpha;
>>>>>>     beta_p = (void *)&beta;
>>>>>>     break;
>>>>>>   case GA_FLOAT:
>>>>>>   case GA_HALF:
>>>>>>     alpha_p = (void *)&af;
>>>>>>     beta_p = (void *)&bf;
>>>>>>     break;
>>>>>>   default:
>>>>>>     PyErr_SetString(PyExc_TypeError, "Unsupported type in convolution");
>>>>>>     return 1;
>>>>>>   }
>>>>>>
>>>>>> #ifdef CONV_INPLACE
>>>>>>   Py_XDECREF(*output);
>>>>>>   *output = om;
>>>>>>   Py_INCREF(*output);
>>>>>> #else
>>>>>>   if (theano_prep_output(output, PyGpuArray_NDIM(om), PyGpuArray_DIMS(om),
>>>>>>                          om->ga.typecode, GA_C_ORDER, c) != 0)
>>>>>>     return 1;
>>>>>>   if (beta != 0.0 && pygpu_move(*output, om))
>>>>>>     return 1;
>>>>>> #endif
>>>>>>
>>>>>>   if (c_set_tensorNd(*output, APPLY_SPECIFIC(output)) == -1)
>>>>>>     return 1;
>>>>>>
>>>>>>   cudnnConvolutionFwdAlgo_t algo = CONV_ALGO;
>>>>>>
>>>>>>   cuda_enter(c->ctx);
>>>>>> #ifdef CHOOSE_ALGO
>>>>>> #ifndef CHOOSE_ONCE
>>>>>>   reuse_algo = 1;
>>>>>>   for (unsigned int i = 0; i < PyGpuArray_NDIM(input); i++) {
>>>>>>     reuse_algo = (reuse_algo &&
>>>>>>                   PyGpuArray_DIM(input, i) == prev_img_dims[i]);
>>>>>>     reuse_algo = (reuse_algo &&
>>>>>>                   PyGpuArray_DIM(kerns, i) == prev_kern_dims[i]);
>>>>>>   }
>>>>>> #endif
>>>>>>
>>>>>>   if (!reuse_algo) {
>>>>>> #ifdef CHOOSE_TIME
>>>>>>     int count;
>>>>>>     cudnnConvolutionFwdAlgoPerf_t choice;
>>>>>>     err = cudnnFindConvolutionForwardAlgorithm(
>>>>>>       APPLY_SPECIFIC(_handle), APPLY_SPECIFIC(input), APPLY_SPECIFIC(kerns),
>>>>>>       desc, APPLY_SPECIFIC(output), 1, &count, &choice);
>>>>>>
>>>>>>     if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>       PyErr_Format(PyExc_RuntimeError,
>>>>>>                    "error selecting convolution algo: %s",
>>>>>>                    cudnnGetErrorString(err));
>>>>>>       cuda_exit(c->ctx);
>>>>>>       return 1;
>>>>>>     }
>>>>>>     algo = choice.algo;
>>>>>> #else
>>>>>>     size_t free;
>>>>>>     int err2 = gpucontext_property(c->ctx, GA_CTX_PROP_FREE_GMEM, &free);
>>>>>>
>>>>>>     if (err2 != GA_NO_ERROR) {
>>>>>>       PyErr_Format(PyExc_RuntimeError, "Error when trying to find the "
>>>>>>                    "memory information on the GPU");
>>>>>>       cuda_exit(c->ctx);
>>>>>>       return 1;
>>>>>>     }
>>>>>>
>>>>>>     err = cudnnGetConvolutionForwardAlgorithm(
>>>>>>       APPLY_SPECIFIC(_handle), APPLY_SPECIFIC(input), APPLY_SPECIFIC(kerns),
>>>>>>       desc, APPLY_SPECIFIC(output),
>>>>>>       CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT, free, &algo);
>>>>>>     if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>       PyErr_Format(PyExc_RuntimeError,
>>>>>>                    "error selecting convolution algo: %s",
>>>>>>                    cudnnGetErrorString(err));
>>>>>>       cuda_exit(c->ctx);
>>>>>>       return 1;
>>>>>>     }
>>>>>> #endif
>>>>>>     prev_algo = algo;
>>>>>>   } else {
>>>>>>     algo = prev_algo;
>>>>>>   }
>>>>>>
>>>>>> #ifdef CHOOSE_ONCE
>>>>>>   reuse_algo = 1;
>>>>>> #else
>>>>>>   for (unsigned int i = 0; i < PyGpuArray_NDIM(input); i++) {
>>>>>>     prev_img_dims[i] = PyGpuArray_DIM(input, i);
>>>>>>     prev_kern_dims[i] = PyGpuArray_DIM(kerns, i);
>>>>>>   }
>>>>>> #endif
>>>>>>
>>>>>> #endif
>>>>>>
>>>>>>   /* These two algos are not supported for 3d conv */
>>>>>>   if (PyGpuArray_NDIM(input) == 5 &&
>>>>>>       (algo == CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM ||
>>>>>>        algo == CUDNN_CONVOLUTION_FWD_ALGO_GEMM))
>>>>>>     algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
>>>>>>
>>>>>>   // The FFT implementation does not support strides, 1x1 filters or inputs
>>>>>>   // with a spatial dimension larger than 1024. The tiled-FFT implementation
>>>>>>   // does not support strides.
>>>>>>   // If the chosen implementation is FFT or tiled-FFT, validate that it can
>>>>>>   // be used on the current data and default to a safe implementation if it
>>>>>>   // can't.
>>>>>>   // The following code is 2d-specific but it is fine as FFT and tiled-FFT
>>>>>>   // are defined only for 2d filters
>>>>>>   if ((algo == CUDNN_CONVOLUTION_FWD_ALGO_FFT ||
>>>>>>        algo == CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING) &&
>>>>>>       PyGpuArray_NDIM(input) == 4) {
>>>>>>
>>>>>>     // Extract the properties of the convolution descriptor
>>>>>>     int nd;
>>>>>>     int pad[2];
>>>>>>     int stride[2];
>>>>>>     int upscale[2];
>>>>>>     cudnnConvolutionMode_t mode;
>>>>>>     cudnnDataType_t data_type;
>>>>>>     err = cudnnGetConvolutionNdDescriptor(desc, 2, &nd, pad, stride,
>>>>>>                                           upscale, &mode, &data_type);
>>>>>>     if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>       PyErr_Format(PyExc_RuntimeError,
>>>>>>                    "error getting convolution properties: %s",
>>>>>>                    cudnnGetErrorString(err));
>>>>>>       cuda_exit(c->ctx);
>>>>>>       return 1;
>>>>>>     }
>>>>>>
>>>>>>     if (algo == CUDNN_CONVOLUTION_FWD_ALGO_FFT)
>>>>>>     {
>>>>>>       if (stride[0] != 1 || stride[1] != 1 ||
>>>>>>           PyGpuArray_DIM(input, 2) > 1024 || PyGpuArray_DIM(input, 3) > 1024 ||
>>>>>>           (PyGpuArray_DIM(kerns, 2) == 1 && PyGpuArray_DIM(kerns, 3) == 1))
>>>>>>       {
>>>>>>         algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
>>>>>>       }
>>>>>>     }
>>>>>>     else
>>>>>>     {
>>>>>>       // algo == CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING
>>>>>>       if (stride[0] != 1 || stride[1] != 1)
>>>>>>       {
>>>>>>         algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
>>>>>>       }
>>>>>>     }
>>>>>>   }
>>>>>>
>>>>>>   {
>>>>>>     size_t worksize;
>>>>>>     gpudata *workspace;
>>>>>>     err = cudnnGetConvolutionForwardWorkspaceSize(APPLY_SPECIFIC(_handle),
>>>>>>                                                   APPLY_SPECIFIC(input),
>>>>>>                                                   APPLY_SPECIFIC(kerns),
>>>>>>                                                   desc,
>>>>>>                                                   APPLY_SPECIFIC(output),
>>>>>>                                                   algo,
>>>>>>                                                   &worksize);
>>>>>>
>>>>>>     if (err == CUDNN_STATUS_NOT_SUPPORTED) {
>>>>>>       // Fallback to none algo if not supported
>>>>>>       // TODO: Print a warning
>>>>>>       algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
>>>>>>
>>>>>>       err = cudnnGetConvolutionForwardWorkspaceSize(APPLY_SPECIFIC(_handle),
>>>>>>                                                     APPLY_SPECIFIC(input),
>>>>>>                                                     APPLY_SPECIFIC(kerns),
>>>>>>                                                     desc,
>>>>>>                                                     APPLY_SPECIFIC(output),
>>>>>>                                                     algo,
>>>>>>                                                     &worksize);
>>>>>>     }
>>>>>>
>>>>>>     if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>       PyErr_Format(PyExc_RuntimeError,
>>>>>>                    "error getting worksize: %s",
>>>>>>                    cudnnGetErrorString(err));
>>>>>>       cuda_exit(c->ctx);
>>>>>>       return 1;
>>>>>>     }
>>>>>>
>>>>>>     /*
>>>>>>      * This is less than ideal since we need to free it after (which
>>>>>>      * introduces a synchronization point. But we don't have a module
>>>>>>      * to place a nice get_work_mem() function in.
>>>>>>      */
>>>>>>     if (worksize != 0) {
>>>>>>       workspace = gpudata_alloc(c->ctx, worksize, NULL, 0, NULL);
>>>>>>       if (workspace == NULL) {
>>>>>>         PyErr_SetString(PyExc_RuntimeError,
>>>>>>                         "Could not allocate working memory");
>>>>>>         cuda_exit(c->ctx);
>>>>>>         return 1;
>>>>>>       }
>>>>>>     }
>>>>>>
>>>>>>     cuda_wait(input->ga.data, GPUARRAY_CUDA_WAIT_READ);
>>>>>>     cuda_wait(kerns->ga.data, GPUARRAY_CUDA_WAIT_READ);
>>>>>>     cuda_wait((*output)->ga.data, GPUARRAY_CUDA_WAIT_WRITE);
>>>>>>
>>>>>>     err = cudnnConvolutionForward(
>>>>>>       APPLY_SPECIFIC(_handle),
>>>>>>       alpha_p,
>>>>>>       APPLY_SPECIFIC(input), PyGpuArray_DEV_DATA(input),
>>>>>>       APPLY_SPECIFIC(kerns), PyGpuArray_DEV_DATA(kerns),
>>>>>>       desc, algo,
>>>>>>       worksize == 0 ? NULL : *(void **)workspace, worksize,
>>>>>>       beta_p,
>>>>>>       APPLY_SPECIFIC(output), PyGpuArray_DEV_DATA(*output));
>>>>>>
>>>>>>     if (worksize != 0)
>>>>>>       gpudata_release(workspace);
>>>>>>
>>>>>>     cuda_record(input->ga.data, GPUARRAY_CUDA_WAIT_READ);
>>>>>>     cuda_record(kerns->ga.data, GPUARRAY_CUDA_WAIT_READ);
>>>>>>     cuda_record((*output)->ga.data, GPUARRAY_CUDA_WAIT_WRITE);
>>>>>>   }
>>>>>>   cuda_exit(c->ctx);
>>>>>>
>>>>>>   if (err != CUDNN_STATUS_SUCCESS) {
>>>>>>     PyErr_Format(PyExc_RuntimeError, "error doing operation: %s",
>>>>>>                  cudnnGetErrorString(err));
>>>>>>     return 1;
>>>>>>   }
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> #undef DTYPE_INPUT_0
>>>>>> #undef TYPENUM_INPUT_0
>>>>>> #undef ITEMSIZE_INPUT_0
>>>>>> #undef DTYPE_INPUT_1
>>>>>> #undef TYPENUM_INPUT_1
>>>>>> #undef ITEMSIZE_INPUT_1
>>>>>> #undef DTYPE_INPUT_2
>>>>>> #undef TYPENUM_INPUT_2
>>>>>> #undef ITEMSIZE_INPUT_2
>>>>>> #undef DTYPE_INPUT_4
>>>>>> #undef TYPENUM_INPUT_4
>>>>>> #undef ITEMSIZE_INPUT_4
>>>>>> #undef DTYPE_INPUT_5
>>>>>> #undef TYPENUM_INPUT_5
>>>>>> #undef ITEMSIZE_INPUT_5
>>>>>> #undef DTYPE_OUTPUT_0
>>>>>> #undef TYPENUM_OUTPUT_0
>>>>>> #undef ITEMSIZE_OUTPUT_0
>>>>>> #undef APPLY_SPECIFIC
>>>>>> #undef CONV_INPLACE
>>>>>> #undef CONV_ALGO
>>>>>>
>>>>>> __struct_compiled_op_86feacd077d8749f42b5d82709a80ba3() {
>>>>>>   // This is only somewhat safe because we:
>>>>>>   // 1) Are not a virtual class
>>>>>>   // 2) Do not use any virtual classes in the members
>>>>>>   // 3) Deal with mostly POD and pointers
>>>>>>
>>>>>>   // If this changes, we would have to revise this, but for
>>>>>>   // now I am tired of chasing segfaults because
>>>>>>   // initialization code had an error and some pointer has
>>>>>>   // a junk value.
>>>>>>   memset(this, 0, sizeof(*this));
>>>>>> }
>>>>>>
>>>>>> ~__struct_compiled_op_86feacd077d8749f42b5d82709a80ba3(void) {
>>>>>>   cleanup();
>>>>>> }
>>>>>>
>>>>>> int init(PyObject* __ERROR, PyObject* storage_V3, PyObject* storage_V5,
>>>>>>          PyObject* storage_V7, PyObject* storage_V9, PyObject* storage_V11,
>>>>>>          PyObject* storage_V13, PyObject* storage_V1, PyObject* storage_V15) {
>>>>>>   Py_XINCREF(storage_V3);
>>>>>>   Py_XINCREF(storage_V5);
>>>>>>   Py_XINCREF(storage_V7);
>>>>>>   Py_XINCREF(storage_V9);
>>>>>>   Py_XINCREF(storage_V11);
>>>>>>   Py_XINCREF(storage_V13);
>>>>>>   Py_XINCREF(storage_V1);
>>>>>>   Py_XINCREF(storage_V15);
>>>>>>   this->storage_V3 = storage_V3;
>>>>>>   this->storage_V5 = storage_V5;
>>>>>>   this->storage_V7 = storage_V7;
>>>>>>   this->storage_V9 = storage_V9;
>>>>>>   this->storage_V11 = storage_V11;
>>>>>>   this->storage_V13 = storage_V13;
>>>>>>   this->storage_V1 = storage_V1;
>>>>>>   this->storage_V15 = storage_V15;
>>>>>>
>>>>>>   py_V15 = PyList_GET_ITEM(storage_V15, 0);
>>>>>>   {Py_XINCREF(py_V15);}
>>>>>>
>>>>>>   if (!PyObject_TypeCheck(py_V15, &PyGpuContextType)) {
>>>>>>     PyErr_SetString(PyExc_TypeError, "expected a GpuContext");
>>>>>>     {
>>>>>>       if (!PyErr_Occurred()) {
>>>>>>         PyErr_SetString(PyExc_RuntimeError,
>>>>>>                         "Unexpected error in an Op's C code. "
>>>>>>                         "No Python exception was set.");
>>>>>>       }
>>>>>>       return 15;
>>>>>>     }
>>>>>>   }
>>>>>>
>>>>>>   V15 = (PyGpuContextObject *)py_V15;
>>>>>>   Py_INCREF(V15);
>>>>>>
>>>>>> #define DTYPE_INPUT_0 npy_float16
>>>>>> #define TYPENUM_INPUT_0 23
>>>>>> #define ITEMSIZE_INPUT_0 2
>>>>>> #define DTYPE_INPUT_1 npy_float16
>>>>>> #define TYPENUM_INPUT_1 23
>>>>>> #define ITEMSIZE_INPUT_1 2
>>>>>> #define DTYPE_INPUT_2 npy_float16
>>>>>> #define TYPENUM_INPUT_2 23
>>>>>> #define ITEMSIZE_INPUT_2 2
>>>>>> #define DTYPE_INPUT_4 npy_float16
>>>>>> #define TYPENUM_INPUT_4 23
>>>>>> #define ITEMSIZE_INPUT_4 2
>>>>>> #define DTYPE_INPUT_5 npy_float16
>>>>>> #define TYPENUM_INPUT_5 23
>>>>>> #define ITEMSIZE_INPUT_5 2
>>>>>> #define DTYPE_OUTPUT_0 npy_float16
>>>>>> #define TYPENUM_OUTPUT_0 23
>>>>>> #define ITEMSIZE_OUTPUT_0 2
>>>>>> #define APPLY_SPECIFIC(str) str##_node_86feacd077d8749f42b5d82709a80ba3_0
>>>>>> #define CONV_INPLACE 1
>>>>>> #define CONV_ALGO CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
>>>>>> #define FAIL { \
>>>>>>   if (!PyErr_Occurred()) { \
>>>>>>     PyErr_SetString(PyExc_RuntimeError, \
>>>>>>                     "Unexpected error in an Op's C code. " \
>>>>>>                     "No Python exception was set."); \
>>>>>>   } \
>>>>>>   return 17; \
>>>>>> }
>>>>>> #define PARAMS V15
>>>>>>
>>>>>> {
>>>>>>   // We need to keep a reference here to have it available in the destructor.
>>>>>>   ctx = PARAMS;
>>>>>>   Py_INCREF(ctx);
>>>>>>
>>>>>>   cuda_enter(PARAMS->ctx);
>>>>>>   cudnnStatus_t err;
>>>>>>   APPLY_SPECIFIC(_handle) = NULL;
>>>>>>   if ((err = cudnnCreate(&APPLY_SPECIFIC(_handle))) != CUDNN_STATUS_SUCCESS) {
>>>>>>     PyErr_Format(PyExc_RuntimeError, "could not create cuDNN handle: %s",
>>>>>>                  cudnnGetErrorString(err));
>>>>>>     cuda_exit(PARAMS->ctx);
>>>>>>     FAIL;
>>>>>>   }
>>>>>>   if ((err = cudnnSetStream(APPLY_SPECIFIC(_handle),
>>>>>>                             cuda_get_stream(PARAMS->ctx))) != CUDNN_STATUS_SUCCESS) {
>>>>>>     PyErr_Format(PyExc_RuntimeError, "Could not set cudnn stream: %s",
>>>>>>                  cudnnGetErrorString(err));
>>>>>>     cuda_exit(PARAMS->ctx);
>>>>>>     FAIL;
>>>>>>   }
>>>>>>   cuda_exit(PARAMS->ctx);
>>>>>> }
>>>>>>
>>>>>> cudnnStatus_t APPLY_SPECIFIC(err);
>>>>>> APPLY_SPECIFIC(input) = NULL;
>>>>>> APPLY_SPECIFIC(output) = NULL;
>>>>>> APPLY_SPECIFIC(kerns) = NULL;
>>>>>> if ((APPLY_SPECIFIC(err) =
>>>>>>      cudnnCreateTensorDescriptor(&APPLY_SPECIFIC(input))) != CUDNN_STATUS_SUCCESS) {
>>>>>>   PyErr_Format(PyExc_MemoryError, "could not allocate tensor descriptor "
>>>>>>                "(inp): %s", cudnnGetErrorString(APPLY_SPECIFIC(err)));
>>>>>>   FAIL;
>>>>>> }
>>>>>> if ((APPLY_SPECIFIC(err) =
>>>>>>      cudnnCreateTensorDescriptor(&APPLY_SPECIFIC(output))) != CUDNN_STATUS_SUCCESS) {
>>>>>>   PyErr_Format(PyExc_MemoryError, "could not allocate tensor descriptor "
>>>>>>                "(out): %s", cudnnGetErrorString(APPLY_SPECIFIC(err)));
>>>>>>   FAIL;
>>>>>> }
>>>>>> if ((APPLY_SPECIFIC(err) =
>>>>>>      cudnnCreateFilterDescriptor(&APPLY_SPECIFIC(kerns))) != CUDNN_STATUS_SUCCESS) {
>>>>>>   PyErr_Format(PyExc_MemoryError, "could not allocate filter descriptor: %s",
>>>>>>                cudnnGetErrorString(APPLY_SPECIFIC(err)));
>>>>>>   FAIL;
>>>>>> }
>>>>>>
>>>>>> #ifdef CHOOSE_ALGO
>>>>>> reuse_algo = 0;
>>>>>> prev_algo = CONV_ALGO;
>>>>>> #ifndef CHOOSE_ONCE
>>>>>> memset(prev_img_dims, 0, sizeof(prev_img_dims));
>>>>>> memset(prev_kern_dims, 0, sizeof(prev_kern_dims));
>>>>>> #endif
>>>>>> #endif
>>>>>>
>>>>>> #undef FAIL
>>>>>> #undef PARAMS
>>>>>> #undef DTYPE_INPUT_0
>>>>>> #undef TYPENUM_INPUT_0
>>>>>> #undef ITEMSIZE_INPUT_0
>>>>>> #undef DTYPE_INPUT_1
>>>>>> #undef TYPENUM_INPUT_1
>>>>>> #undef ITEMSIZE_INPUT_1
>>>>>> #undef DTYPE_INPUT_2
>>>>>> #undef TYPENUM_INPUT_2
>>>>>> #undef ITEMSIZE_INPUT_2
>>>>>> #undef DTYPE_INPUT_4
>>>>>> #undef TYPENUM_INPUT_4
>>>>>> #undef ITEMSIZE_INPUT_4
>>>>>> #undef DTYPE_INPUT_5
>>>>>> #undef TYPENUM_INPUT_5
>>>>>> #undef ITEMSIZE_INPUT_5
>>>>>> #undef DTYPE_OUTPUT_0
>>>>>> #undef TYPENUM_OUTPUT_0
>>>>>> #undef ITEMSIZE_OUTPUT_0
>>>>>> #undef APPLY_SPECIFIC
>>>>>> #undef CONV_INPLACE
>>>>>> #undef CONV_ALGO
>>>>>> this->__ERROR = __ERROR;
>>>>>> return 0;
>>>>>> }
>>>>>> void cleanup(void) {
>>>>>> __label_1:
>>>>>>
>>>>>> double __DUMMY_1;
>>>>>> __label_3:
>>>>>>
>>>>>> double __DUMMY_3;
>>>>>> __label_5:
>>>>>>
>>>>>> double __DUMMY_5;
>>>>>> __label_7:
>>>>>>
>>>>>> double __DUMMY_7;
>>>>>> __label_9:
>>>>>>
>>>>>> double __DUMMY_9;
>>>>>> __label_11:
>>>>>>
>>>>>> double __DUMMY_11;
>>>>>> __label_13:
>>>>>>
>>>>>> double __DUMMY_13;
>>>>>> __label_15:
>>>>>> Py_XDECREF(V15); V15 = NULL;
>>>>>> {Py_XDECREF(py_V15);}
>>>>>>
>>>>>> double __DUMMY_15;
>>>>>> __label_18:
>>>>>>
>>>>>> #define DTYPE_INPUT_0 npy_float16
>>>>>> #define TYPENUM_INPUT_0 23
>>>>>> #define ITEMSIZE_INPUT_0 2
>>>>>> #define DTYPE_INPUT_1 npy_float16
>>>>>> #define TYPENUM_INPUT_1 23
>>>>>> #define ITEMSIZE_INPUT_1 2
>>>>>> #define DTYPE_INPUT_2 npy_float16
>>>>>> #define TYPENUM_INPUT_2 23
>>>>>> #define ITEMSIZE_INPUT_2 2
>>>>>> #define DTYPE_INPUT_4 npy_float16
>>>>>> #define TYPENUM_INPUT_4 23
>>>>>> #define ITEMSIZE_INPUT_4 2
>>>>>> #define DTYPE_INPUT_5 npy_float16
>>>>>> #define TYPENUM_INPUT_5 23
>>>>>> #define ITEMSIZE_INPUT_5 2
>>>>>> #define DTYPE_OUTPUT_0 npy_float16
>>>>>> #define TYPENUM_OUTPUT_0 23
>>>>>> #define ITEMSIZE_OUTPUT_0 2
>>>>>> #define APPLY_SPECIFIC(str) str##_node_86feacd077d8749f42b5d82709a80ba3_0
>>>>>> #define CONV_INPLACE 1
>>>>>> #define CONV_ALGO CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
>>>>>>
>>>>>> cuda_enter(ctx->ctx);
>>>>>> cudnnDestroy(APPLY_SPECIFIC(_handle));
>>>>>> cuda_exit(ctx->ctx);
>>>>>> Py_DECREF((PyObject *)ctx);
>>>>>>
>>>>>> if (APPLY_SPECIFIC(input) != NULL)
>>>>>>   cudnnDestroyTensorDescriptor(APPLY_SPECIFIC(input));
>>>>>> if (APPLY_SPECIFIC(output) != NULL)
>>>>>>   cudnnDestroyTensorDescriptor(APPLY_SPECIFIC(output));
>>>>>> if (APPLY_SPECIFIC(kerns) != NULL)
>>>>>>   cudnnDestroyFilterDescriptor(APPLY_SPECIFIC(kerns));
>>>>>>
>>>>>> #undef DTYPE_INPUT_0
>>>>>> #undef TYPENUM_INPUT_0
>>>>>> #undef ITEMSIZE_INPUT_0
>>>>>> #undef DTYPE_INPUT_1
>>>>>> #undef TYPENUM_INPUT_1
>>>>>> #undef ITEMSIZE_INPUT_1
>>>>>> #undef DTYPE_INPUT_2
>>>>>> #undef TYPENUM_INPUT_2
>>>>>> #undef ITEMSIZE_INPUT_2
>>>>>> #undef DTYPE_INPUT_4
>>>>>> #undef TYPENUM_INPUT_4
>>>>>> #undef ITEMSIZE_INPUT_4
>>>>>> #undef DTYPE_INPUT_5
>>>>>> #undef TYPENUM_INPUT_5
>>>>>> #undef ITEMSIZE_INPUT_5
>>>>>> #undef DTYPE_OUTPUT_0
>>>>>> #undef TYPENUM_OUTPUT_0
>>>>>> #undef ITEMSIZE_OUTPUT_0
>>>>>> #undef APPLY_SPECIFIC
>>>>>> #undef CONV_INPLACE
>>>>>> #undef CONV_ALGO
>>>>>> double __DUMMY_18;
>>>>>>
>>>>>> Py_XDECREF(this->storage_V3);
>>>>>> Py_XDECREF(this->storage_V5);
>>>>>> Py_XDECREF(this->storage_V7);
>>>>>> Py_XDECREF(this->storage_V9);
>>>>>> Py_XDECREF(this->storage_V11);
>>>>>> Py_XDECREF(this->storage_V13);
>>>>>> Py_XDECREF(this->storage_V1);
>>>>>> Py_XDECREF(this->storage_V15);
>>>>>> }
>>>>>> int run(void) {
>>>>>> int __failure = 0;
>>>>>>
>>>>>> PyObject* py_V1;
>>>>>>
>>>>>> PyGpuArrayObject *V1;
>>>>>>
>>>>>> PyObject* py_V3;
>>>>>>
>>>>>> PyGpuArrayObject *V3;
>>>>>>
>>>>>> PyObject* py_V5;
>>>>>>
>>>>>> PyGpuArrayObject *V5;
>>>>>>
>>>>>> PyObject* py_V7;
>>>>>>
>>>>>> PyGpuArrayObject *V7;
>>>>>>
>>>>>> PyObject* py_V9;
>>>>>>
>>>>>> cudnnConvolutionDescriptor_t V9;
>>>>>>
>>>>>> PyObject* py_V11;
>>>>>>
>>>>>> typedef npy_float16 V11_dtype; // Deprecated use dtype_V11 instead.
>>>>>> typedef npy_float16 dtype_V11;
>>>>>>
>>>>>> npy_float16 V11;
>>>>>>
>>>>>> PyObject* py_V13;
>>>>>>
>>>>>> typedef npy_float16 V13_dtype; // Deprecated use dtype_V13 instead.
>>>>>> typedef npy_float16 dtype_V13;
>>>>>>
>>>>>> npy_float16 V13;
>>>>>>
>>>>>> {
>>>>>>
>>>>>> py_V1 = PyList_GET_ITEM(storage_V1, 0);
>>>>>> {Py_XINCREF(py_V1);}
>>>>>>
>>>>>> if (py_V1 == Py_None)
>>>>>> {
>>>>>>   V1 = NULL;
>>>>>> }
>>>>>> else
>>>>>> {
>>>>>>
>>>>>> V1 = NULL;
>>>>>> if (py_V1 == Py_None) {
>>>>>>   PyErr_SetString(PyExc_ValueError, "expected a GpuArray, not None");
>>>>>>   {
>>>>>>     __failure = 2;
>>>>>>     if (!PyErr_Occurred()) {
>>>>>>       PyErr_SetString(PyExc_RuntimeError,
>>>>>>                       "Unexpected error in an Op's C code. "
>>>>>>                       "No Python exception was set.");
>>>>>>     }
>>>>>>     goto __label_2;}
>>>>>> }
>>>>>> /* First check if we are the base type exactly (the most common case),
>>>>>>    then do the full subclass check if needed. */
>>>>>> if (py_V1->ob_type != &PyGpuArrayType &&
>>>>>>     !PyObject_TypeCheck(py_V1, &PyGpuArrayType)) {
>>>>>>   PyE
>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "theano-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>>
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
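A note on the generated `c_set_tensorNd` above: cuDNN expects strides in elements while gpuarray stores them in bytes, so the loop divides each byte stride by the element size and substitutes a dense (C-contiguous) default wherever the byte stride is 0, as for broadcast dimensions. A rough Python mirror of just that loop (the helper name is made up for illustration):

```python
def strides_for_cudnn(dims, byte_strides, elsize):
    """Mirror of the stride fix-up in the generated c_set_tensorNd:
    convert byte strides to element strides, substituting a dense
    (C-contiguous) default wherever the byte stride is 0."""
    nd = len(dims)
    strs = [0] * nd
    default_stride = 1
    for i in range(nd - 1, -1, -1):  # walk dimensions right to left
        strs[i] = byte_strides[i] // elsize if byte_strides[i] else default_stride
        default_stride *= dims[i]
    return strs

# A contiguous 2x3 float16 array (elsize=2) has byte strides (6, 2):
print(strides_for_cudnn([2, 3], [6, 2], 2))  # [3, 1]
```

This is why the float16 path above works with the same descriptor-setup code as float32: only the element size (2 instead of 4) changes in the stride computation.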
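The algorithm fall-back rules in the generated `conv_fwd` above are easy to state on their own: FFT requires unit strides, spatial dimensions of at most 1024 and a non-1x1 filter; tiled-FFT only requires unit strides; when a precondition fails, the code demotes the choice to IMPLICIT_GEMM. A sketch of those checks (constant names copied from the dump; the function itself is hypothetical):

```python
# Algorithm names as they appear in the generated code above.
FFT = "CUDNN_CONVOLUTION_FWD_ALGO_FFT"
FFT_TILING = "CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING"
IMPLICIT_GEMM = "CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM"

def validate_fwd_algo(algo, stride, input_spatial, kern_spatial):
    """Re-statement of the 2d validity checks in the generated conv_fwd:
    demote FFT / tiled-FFT to IMPLICIT_GEMM when their preconditions fail."""
    strided = stride[0] != 1 or stride[1] != 1
    if algo == FFT:
        too_big = input_spatial[0] > 1024 or input_spatial[1] > 1024
        one_by_one = kern_spatial[0] == 1 and kern_spatial[1] == 1
        if strided or too_big or one_by_one:
            return IMPLICIT_GEMM
    elif algo == FFT_TILING:
        if strided:
            return IMPLICIT_GEMM
    return algo

print(validate_fwd_algo(FFT, (1, 1), (512, 512), (3, 3)))  # FFT stays
print(validate_fwd_algo(FFT, (2, 2), (512, 512), (3, 3)))  # falls back
```

IMPLICIT_GEMM is the safe default because, per the comments in the dump, it has no stride, size, or filter-shape restrictions.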
