Hi Arash,

Make sure all tensors are copied to the same cuda device to avoid memory error.
Please reinstall singa using the latest wheel file
http://comp.nus.edu.sg/~dbsystem/singa/assets/file/pb2.6-cuda7.5-cudnn5/singa-1.0.0-cp27-none-linux_x86_64.whl
which resolved the problem of int32<->float32.

Thanks.

On Sun, Oct 9, 2016 at 4:14 PM, Wang Wei <[email protected]> wrote:

> The label tensor should be of shape (2947,) like this one
> https://github.com/apache/incubator-singa/blob/master/python/singa/metric.py#L31
> I will check the problem from int32 and float32.
>
> On Sun, Oct 9, 2016 at 3:27 PM Arash Shafiei <[email protected]> wrote:
>
>> I am facing a problem concerning creating tensor from numpy:
>>
>> Suppose that I have an array of these values:
>> b = np.asarray([ [1., 0., 0., 0., 0., 0.], [0., 1., 0., 0., 0., 0.]])
>>
>> I transform it to a tensor:
>> t = tensor.from_numpy(b.astype(np.int32))
>>
>> Now I again take the numpy back:
>> a = tensor.to_numpy(t)
>>
>> The values of 'a' change and I lose accuracy:
>> >>> a
>> array([[ 1.40129846e-45, 0.00000000e+00, 0.00000000e+00,
>>          0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
>>        [ 0.00000000e+00, 1.40129846e-45, 0.00000000e+00,
>>          0.00000000e+00, 0.00000000e+00, 0.00000000e+00]],
>>       dtype=float32)
>>
>> Now suppose that I create a tensor of float32:
>> t = tensor.from_numpy(b.astype(np.float32))
>>
>> Now I again take the numpy back:
>> a = tensor.to_numpy(t)
>>
>> Now the values of 'a' are correct:
>> array([[ 1., 0., 0., 0., 0., 0.],
>>        [ 0., 1., 0., 0., 0., 0.]], dtype=float32)
>>
>> But in the second case where I am having float32, while calculating the
>> L2 norm:
>> lvalue = lossfun.forward(model_pb2.kTrain, act, labels)
>> batch_loss += lvalue.l2()
>>
>> The error message is:
>> [F d1009 t15:26:37 p29915:584
>> /home/wuwf/work/incubator-singa/src/core/device/cuda_gpu.cc:112]
>> Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access
>> was encountered
>> Aborted (core dumped)
>>
>> Thanks for the support.
>>
>> On Sun, Oct 9, 2016 at 2:09 PM, Arash Shafiei <[email protected]> wrote:
>>
>> Thanks much.
>>
>> I am trying to use Evaluate() function. I pass activations of the dense
>> layers and all the labels:
>>
>> evaluator.Evaluate(act,labels)
>>
>> The dimensions seem to be correct:
>>
>> act.shape == labels.shape == (2947, 6)
>>
>> But I am getting the following error:
>> TypeError: in method 'Metric_Evaluate', argument 2 of type 'singa::Tensor const &'
>>
>> On Sun, Oct 9, 2016 at 1:17 PM, Wang Wei <[email protected]> wrote:
>>
>> No. Loss != Inaccuracy.
>> If you want to compute the accuracy, you need to create an
>> evaluator=singa.Accuracy(), and call evaluator.Evaluate(o, t), where o is
>> the output from the dense layer and t is the ground truth tensor. You can
>> follow the example here
>> https://github.com/apache/incubator-singa/blob/master/python/singa/metric.py#L67.
>>
>> Good Luck!
>>
>> On Sun, Oct 9, 2016 at 12:08 PM Arash Shafiei <[email protected]> wrote:
>>
>> Thanks for the hint.
>>
>> I was sending it to the device but the problem turned out to be that I
>> did not cast labels to int32.
>>
>> Now it is working and I am getting:
>>
>> [................... ] 96.4% training loss = 0.003444
>> Epoch 49, train loss is 0.003509
>> Epoch 49, evaluation loss is 0.003534
>>
>> Does this mean that after 50 epochs the evaluation has only 3.5%
>> inaccuracy?
>>
>> On Sun, Oct 9, 2016 at 11:29 AM, Wei Wang <[email protected]> wrote:
>>
>> Have you moved all tensors onto the same device? Including the tensor for
>> the labels.
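
On the loss-versus-accuracy exchange quoted above: a cross-entropy loss is an average of negative log-probabilities, not an error rate, so a value such as 0.0035 cannot be read as a percentage. The following is a minimal sketch in plain Python (no SINGA calls; the function names and the toy numbers are made up for illustration) in which two sets of predictions reach the same accuracy but different losses.

```python
import math


def softmax(logits):
    """Turn one row of raw scores into probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_cross_entropy(batch_logits, labels):
    """Average of -log(probability assigned to the true class)."""
    losses = [-math.log(softmax(row)[y]) for row, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


def accuracy(batch_logits, labels):
    """Fraction of rows whose highest score sits at the true class index."""
    hits = sum(1 for row, y in zip(batch_logits, labels) if row.index(max(row)) == y)
    return hits / float(len(labels))


# Toy data: 4 samples, 3 classes, integer class labels (hypothetical values).
labels = [0, 1, 2, 1]
confident = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0], [5.0, 0.0, 0.0]]
hesitant = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

for name, logits in (("confident", confident), ("hesitant", hesitant)):
    print("%s: accuracy=%.2f loss=%.4f"
          % (name, accuracy(logits, labels), mean_cross_entropy(logits, labels)))
```

Both prediction sets get 3 of 4 samples right, yet the set that is confidently wrong on the last sample has the larger loss. Accuracy therefore has to be measured separately, which is what the evaluator mentioned in the reply is for.
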
>>
>> On 9 Oct 2016, at 11:02 AM, Arash Shafiei <[email protected]> wrote:
>>
>> outputs = rnn.forward(model_pb2.kTrain, inputs)[0:-2]
>> grads = []
>> batch_loss = 0
>> g_dense_w.set_value(0.0)
>> g_dense_b.set_value(0.0)
>> print 'outputs len', len(outputs)  # 128
>> output = outputs[-1]
>> act = dense.forward(model_pb2.kTrain, output)
>> print 'output shape', output.shape  # (256, 28)
>> print 'activation shape', act.shape  # (256, 6)
>> print 'labels shape', labels.shape  # (256, 6)
>> lvalue = lossfun.forward(model_pb2.kTrain, act, labels)
>> batch_loss += lvalue.l1()  # aborts here with:
>> # [F d1009 t11:00:24 p23551:016
>> # /home/wuwf/work/incubator-singa/src/core/tensor/./tensor_math_cuda.h:344]
>> # Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
>> # CUBLAS_STATUS_MAPPING_ERROR
>> # Aborted (core dumped)
>>
>> On Sun, Oct 9, 2016 at 10:55 AM, Wei Wang <[email protected]> wrote:
>>
>> Could you please paste the relevant code leading to this error?
>>
>> On 9 Oct 2016, at 10:32 AM, Arash Shafiei <[email protected]> wrote:
>>
>> Thanks, it worked.
>>
>> So far, I managed to do rnn::forward(...) but now I am stuck somewhere
>> else.
>>
>> rnn::forward(...) returns a tensor (denoted as lvalue). I have to obtain
>> the L1 norm using lvalue.l1().
>>
>> But I get this error:
>> [F d1009 t10:30:14 p23056:-56
>> /home/wuwf/work/incubator-singa/src/core/tensor/./tensor_math_cuda.h:344]
>> Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
>> CUBLAS_STATUS_MAPPING_ERROR
>> Aborted (core dumped)
>>
>> On Sat, Oct 8, 2016 at 9:43 PM, Wang Wei <[email protected]> wrote:
>>
>> Actually, the char-rnn example is from type (4), where each rnn unit
>> would generate a prediction and has a ground truth label.
>>
>> For your model (type 2), you only need to use the y128 (of shape 256, 28)
>> from the rnn::forward() as the input to the dense layer. All other yi
>> should be ignored.
>> Consequently, you would have an output (denoted as o) of shape (256, 6)
>> from the dense layer, which is the prediction for the whole sequence (of
>> length 128).
>> By feeding the prediction o and the label into the loss layer, you can
>> compute the loss value and compute the gradient for o (denoted as o').
>> Backward propagating o' through the dense layer, you would get the
>> gradient for y128, denoted as y'128.
>>
>> *The input of the rnn::backward() would be <y'1, y'2, ...y'128, hy',
>> cy'>, where only y'128 is a valid tensor. y'1, y'2 ... should be tensors
>> with value 0.*
>>
>> Best,
>> Wei
>>
>> On Sat, Oct 8, 2016 at 9:33 PM Arash Shafiei <[email protected]> wrote:
>>
>> Thanks. It worked.
>>
>> I am now at the phase of evaluating the loss.
>>
>> singa.loss.SoftmaxCrossEntropy has a forward function where it takes
>> prediction tensors and ground truth.
>>
>> My problem now is that the prediction is a sequence and my label is not a
>> sequence.
>>
>> Your char-rnn example is an application of type (1) in the figure below,
>> but activity recognition is an application of type (2).
>>
>> <rnn-app.png>
>>
>> Therefore for each sequence in a batch I have only 1 label (although
>> this label can be of one dimension from the set of {1,2,3,4,5,6} or of 6
>> dimensions from the set of { [1,0,0,0,0,0], [0,1,0,0,0,0], etc. }).
>>
>> So now I need predictions and ground truth. The prediction for me is of
>> shape
>> (128, 256, 28)
>> where 128 is the length of the sequence, 256 is the batch size and 28 is
>> the hidden layer size.
>>
>> And my ground truth is of shape
>> (256, 1) or (256, 6) -- depending on how you model it..
>>
>> But as I understood from the example of char-rnn my ground truth must be
>> of shape:
>> (128, 256)
>>
>> Would you have any insight about this?
>> Thanks..
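
The many-to-one setup described above (one label per 128-step sequence) can be sketched without SINGA. The snippet below uses nested Python lists as stand-ins for tensors; `zeros_like`, `grads_for_backward` and the tiny sizes are hypothetical, and the trailing hy' and cy' entries are left out. It only illustrates that the last step carries a real gradient while every earlier step gets zeros of the same shape.

```python
def zeros_like(matrix):
    """Return a zero-filled nested list with the same shape as `matrix`."""
    return [[0.0 for _ in row] for row in matrix]


def grads_for_backward(seq_len, grad_last):
    """Build [y'1, ..., y'seq_len]: zeros everywhere except the last step."""
    grads = [zeros_like(grad_last) for _ in range(seq_len - 1)]
    grads.append(grad_last)
    return grads


# Tiny stand-in sizes; the thread uses seq_len=128, batch=256, hidden=28.
seq_len, batch, hidden = 4, 2, 3

# Placeholder for y'128, the gradient coming back from the dense layer.
grad_y_last = [[0.1 * (i + j) for j in range(hidden)] for i in range(batch)]

grads = grads_for_backward(seq_len, grad_y_last)
print("%d steps, each %d x %d" % (len(grads), len(grads[0]), len(grads[0][0])))
```

On the forward side the counterpart is simply taking the last element of the output sequence (the `outputs[-1]` used elsewhere in this thread) and feeding only that to the dense layer.
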
>>
>> On Sat, Oct 8, 2016 at 6:42 PM, Wang Wei <[email protected]> wrote:
>>
>> Currently, numpy arrays of dtype=np.float32 or np.int can be converted
>> into singa tensors.
>> Please convert the numpy array into np.float32 and then call
>> tensor.from_numpy(t) (without dtype=np.float32).
>>
>> On Sat, Oct 8, 2016 at 6:36 PM Arash Shafiei <[email protected]> wrote:
>>
>> The values that I have are floating-point values in [-1, 1].
>>
>> While using tensor.from_numpy(...), I was getting this error:
>>
>> Not implemented yet for float64
>>
>> I understood from the tutorial that we could pass the data type:
>>
>> y = tensor.from_numpy(..., dtype=np.float32)
>>
>> But using dtype, I am getting another error:
>>
>> TypeError: from_numpy() got an unexpected keyword argument 'dtype'
>>
>> On Sat, Oct 8, 2016 at 3:45 PM, Wang Wei <[email protected]> wrote:
>>
>> Hi
>>
>> According to the API of forward function:
>> http://singa.apache.org/en/docs/layer.html#singa.layer.RNN.forward
>> The input should be a vector of Tensors, <x1, x2, ... x128, hx, cx>, xi
>> is of shape (1500, 9), hx and cx are optional whose shape should be (1500,
>> 28).
>> The output would be a vector of Tensors, <y1, y2, ..., y128, hy, cy>, yi
>> is of shape (1500, 28), hy and cy are optional depending on the existence
>> of hx and cx.
>> If you want to put the dense layer on top of the last rnn unit (i.e. the
>> 128-th), then you feed y128 to the dense layer.
>>
>> function convert just reshapes the raw data into a sequence of tensors
>> <x1, x2, ..>.
>>
>> BTW, typically, people would use a smaller batchsize e.g. less than 256.
>>
>> May I forward our discussion to the incubator email list in case others
>> have similar problems?
>> Thanks.
>>
>> Best,
>> Wei
>>
>> So here is what I have:
>>
>> input batch of dimension (1500, 128, 9)
>> This means a batch of 1500 windows each having 128 vectors of 9 dimensions.
>>
>> input label of dimension (1500, 6)
>> This means a label batch of 1500 vectors of 6 dimensions. This is to label
>> if the person is sitting ([1,0,0,0,0,0]) or standing ([0,1,0,0,0,0]), etc.
>>
>> I am creating an lstm layer with hidden_size=28 and
>> input_sample_shape=(9,) and num_stacks=1
>>
>> Then I create a dense layer with num_output=6 and input_sample_shape=(28,)
>>
>> Now I would like to feed the data to the 'forward' function of lstm and
>> dense layer. But I could not make it work and I could not quite understand
>> from the example what 'convert' and 'numpy2tensors' are supposed to do...
>>
>> I would appreciate your comments..
>>
>> On Sun, Sep 25, 2016 at 12:23 PM, Arash Shafiei <[email protected]> wrote:
>>
>> Yes, I was thinking of batch size to be 32.
>>
>> Thanks. I am getting more how it works and I am thinking how RNN would be
>> helpful. Because we do not want to predict a sequence. We just have a
>> sequence (in raw data) and a set of features (in processed data) and we
>> want to know the classification.
>>
>> So I was thinking of using other approaches with SINGA. I understood that
>> there is also MLP. We could use MLP from SINGA to see the result first.
>>
>> In this case input would be a set of 561 values with a label.
>> Then the MLP, given a set of test data with 561 features would predict
>> the label.
>>
>> Thanks for the advice..
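
As for what a conversion step like 'convert' has to achieve: according to the reply above, it reshapes a batch of shape (batch, seq_len, features) into a sequence <x1, x2, ..> with one (batch, features) item per time step. A rough sketch follows, with nested lists standing in for arrays; `to_time_major` is a hypothetical helper written for this note, not the function from the SINGA example.

```python
def to_time_major(batch):
    """Regroup [sample][step][feature] into [step][sample][feature]."""
    seq_len = len(batch[0])
    return [[sample[t] for sample in batch] for t in range(seq_len)]


# Tiny stand-in sizes; the thread uses batch=1500, seq_len=128, features=9.
n_samples, seq_len, n_features = 3, 4, 2

# Each value encodes its own position so the regrouping is easy to check.
batch = [[[100 * s + 10 * t + f for f in range(n_features)]
          for t in range(seq_len)]
         for s in range(n_samples)]

steps = to_time_major(batch)
print("%d steps, each %d x %d" % (len(steps), len(steps[0]), len(steps[0][0])))
```

With a NumPy array the same regrouping is `batch.transpose(1, 0, 2)`. Each step would then still have to be turned into a float32 tensor and placed on the same device as the rest of the model, as discussed earlier in the thread.
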

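
A note on the 1.40129846e-45 values reported near the top of the thread: that number is what the bit pattern of the 32-bit integer 1 looks like when it is read as a float32, which suggests the int32 buffer was reinterpreted rather than converted. This is an inference from the value itself, not something confirmed in the thread. The check below uses only the standard `struct` module; `int32_bits_as_float32` is a name chosen for this sketch.

```python
import struct


def int32_bits_as_float32(value):
    """Reinterpret the 4 bytes of a 32-bit integer as an IEEE 754 float32."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


reinterpreted = int32_bits_as_float32(1)  # same bytes, read as a float
converted = float(1)                      # an actual int -> float conversion

print(reinterpreted)  # a tiny subnormal value, not 1.0
print(converted)
```

Casting to np.float32 before calling from_numpy, as suggested in the thread, avoids depending on how integer buffers are handled.
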