Hi Fred,

I installed the Theano version that contains "Adding an AbstractConv3d interface #4862" (https://github.com/Theano/Theano/pull/4862), but now my code doesn't work: in the previous Theano version, class Pool took the parameters ds, ignore_border, st, padding, mode, openmp; in the latest Theano version, class Pool no longer takes ds, only ignore_border, mode, openmp.
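To make the difference concrete, this is how I read the two versions (the keyword names on the new side are my assumption from skimming pool.py, so please correct me if I have them wrong):

import theano.tensor as T
from theano.tensor.signal import pool

x = T.tensor4('x')

# previous version: the window size (ds) was a constructor argument
op = pool.Pool((2, 2), ignore_border=True)  # ds, ignore_border, st, padding, mode, openmp
y = op(x)

# latest version: the constructor only takes ignore_border, mode, openmp,
# so I assume the window size now has to be supplied at call time instead
op = pool.Pool(ignore_border=True)
y = op(x, ws=(2, 2))  # 'ws' is my guess at the keyword name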
In maxpool3d.py (pasted below) I was calling op = DownsampleFactorMax((ds, ds), ignore_border), where DownsampleFactorMax = pool.Pool. I tried Pool(mode=..., ...)(input, ws=ws), but it doesn't work. How can I call Pool passing (ds, ds)?

Many thanks,
Luca
""" Max pooling spatio-temporal inputs for Theano """ from theano import tensor from theano.tensor.signal.downsample import DownsampleFactorMax #it was originally ignore_border=False and then corrected as suggested by Pascal '''Pascal update on ignore_border''' def max_pool_3d(input, ds, ignore_border=True): """ Takes as input a N-D tensor, where N >= 3. It downscales the input video by the specified factor, by keeping only the maximum value of non-overlapping patches of size (ds,ds,ds) (time, height, width) :type input: N-D theano tensor of input images. :param input: input images. Max pooling will be done over the 3 last dimensions. :type ds: tuple of length 3 :param ds: factor by which to downscale. (2,2,2) will halve the video in each dimension. :param ignore_border: boolean value. When True, (5,5,5) input with ds=(2,2,2) will generate a (2,2,2) output. (3,3,3) otherwise. """ if input.ndim < 3: raise NotImplementedError('max_pool_3d requires a dimension >= 3') # extract nr dimensions vid_dim = input.ndim # max pool in two different steps, so we can use the 2d implementation of # downsamplefactormax. First maxpool frames as usual. # Then maxpool the time dimension. Shift the time dimension to the third # position, so rows and cols are in the back # extract dimensions frame_shape = input.shape[-2:] # count the number of "leading" dimensions, store as dmatrix # tensor.prod: product of every term in x along axis batch_size = tensor.prod(input.shape[:-2]) # Reshape x by right padding the shape with n_ones 1s. batch_size = tensor.shape_padright(batch_size,1) # store as 4D tensor with shape: (batch_size,1,height,width) #tensor.cast # Cast any tensor x to a Tensor of the same shape, but with a different numerical type dtype. new_shape = tensor.cast(tensor.join(0, batch_size, tensor.as_tensor([1,]), frame_shape), 'int32') input_4D = tensor.reshape(input, new_shape, ndim=4) # downsample mini-batch of videos in rows and cols op = DownsampleFactorMax((ds,ds), ignore_border) output = op(input_4D) # restore to original shape outshape = tensor.join(0, input.shape[:-2], output.shape[-2:]) out = tensor.reshape(output, outshape, ndim=input.ndim) # now maxpool time # output (time, rows, cols), reshape so that time is in the back shufl = (list(range(vid_dim-3)) + [vid_dim-2]+[vid_dim-1]+[vid_dim-3]) input_time = out.dimshuffle(shufl) # reset dimensions vid_shape = input_time.shape[-2:] # count the number of "leading" dimensions, store as dmatrix batch_size = tensor.prod(input_time.shape[:-2]) batch_size = tensor.shape_padright(batch_size,1) # store as 4D tensor with shape: (batch_size,1,width,time) new_shape = tensor.cast(tensor.join(0, batch_size, tensor.as_tensor([1,]), vid_shape), 'int32') input_4D_time = tensor.reshape(input_time, new_shape, ndim=4) # downsample mini-batch of videos in time op = DownsampleFactorMax((1,ds), ignore_border) outtime = op(input_4D_time) # output # restore to original shape (xxx, rows, cols, time) outshape = tensor.join(0, input_time.shape[:-2], outtime.shape[-2:]) shufl = (list(range(vid_dim-3)) + [vid_dim-1]+[vid_dim-3]+[vid_dim-2]) return tensor.reshape(outtime, outshape, ndim=input.ndim).dimshuffle(shufl)