Currently, a numpy array of dtype np.float32 or np.int can be converted into a SINGA tensor. Please convert the numpy array to np.float32 first and then call tensor.from_numpy(t) (without dtype=np.float32).
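A minimal sketch of that conversion (pure numpy, with the SINGA call left as a comment so the snippet stands alone; the (1500, 9) shape is just the window shape from the thread, used for illustration):

```python
import numpy as np

# Raw sensor values typically load as float64; from_numpy() accepts only
# np.float32 (or integer) arrays, so cast before converting.
x = np.random.uniform(-1.0, 1.0, (1500, 9))   # float64 by default
x32 = x.astype(np.float32)

# With SINGA installed this would then be:
#   from singa import tensor
#   t = tensor.from_numpy(x32)   # note: no dtype= keyword
```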
On Sat, Oct 8, 2016 at 6:36 PM Arash Shafiei <[email protected]> wrote:

> The values that I have are floating points in [-1, 1].
>
> While using tensor.from_numpy(...), I was getting this error:
>
> Not implemented yet for float64
>
> I understood from the tutorial that we could pass the data type:
>
> y = tensor.from_numpy(..., dtype=np.float32)
>
> But using dtype, I am getting another error:
>
> TypeError: from_numpy() got an unexpected keyword argument 'dtype'

On Sat, Oct 8, 2016 at 3:45 PM, Wang Wei <[email protected]> wrote:

> Hi
>
> According to the API of the forward function:
> http://singa.apache.org/en/docs/layer.html#singa.layer.RNN.forward
> the input should be a vector of tensors, <x1, x2, ..., x128, hx, cx>, where each xi is of shape (1500, 9); hx and cx are optional, with shape (1500, 28).
> The output is a vector of tensors, <y1, y2, ..., y128, hy, cy>, where each yi is of shape (1500, 28); hy and cy are optional, depending on the existence of hx and cx.
> If you want to put the dense layer on top of the last RNN unit (i.e. the 128th), then you feed y128 to the dense layer.
>
> The convert function just reshapes the raw data into a sequence of tensors <x1, x2, ...>.
>
> BTW, typically people use a smaller batch size, e.g. less than 256.
>
> May I forward our discussion to the incubator email list in case others have similar problems?
> Thanks.
>
> Best,
> Wei

> So here is what I have:
>
> an input batch of dimension (1500, 128, 9)
> This means a batch of 1500 windows, each having 128 vectors of 9 dimensions.
>
> an input label of dimension (1500, 6)
> This means a label batch of 1500 vectors of 6 dimensions. This labels whether the person is sitting ([1,0,0,0,0,0]) or standing ([0,1,0,0,0,0]), etc.
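The reshaping described above, from a (1500, 128, 9) batch into the per-time-step arrays <x1, ..., x128> that forward expects, might be sketched as follows (numpy only; the tensor.from_numpy wrapping is left as a comment, and all shapes come from the thread):

```python
import numpy as np

# A batch of 1500 windows, each with 128 time steps of 9 features.
batch = np.random.uniform(-1, 1, (1500, 128, 9)).astype(np.float32)

# RNN.forward wants one (1500, 9) array per time step.
# With SINGA this would then become:
#   inputs = [tensor.from_numpy(s) for s in steps]
steps = [np.ascontiguousarray(batch[:, t, :]) for t in range(batch.shape[1])]
```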
> I am creating an lstm layer with hidden_size=28, input_sample_shape=(9,) and num_stacks=1.
>
> Then I create a dense layer with num_output=6 and input_sample_shape=(28,).
>
> Now I would like to feed the data to the 'forward' function of the lstm and dense layers. But I could not make it work, and I could not quite understand from the example what 'convert' and 'numpy2tensors' are supposed to do...
>
> I would appreciate your comments..

On Sun, Sep 25, 2016 at 12:23 PM, Arash Shafiei <[email protected]> wrote:

> Yes, I was thinking of a batch size of 32.
>
> Thanks. I am getting more of how it works, and I am thinking about how an RNN would be helpful. Because we do not want to predict a sequence: we just have a sequence (in the raw data) and a set of features (in the processed data), and we want to know the classification.
>
> So I was thinking of using other approaches with SINGA. I understood that there is also an MLP. We could use the MLP from SINGA to see the result first.
>
> In this case the input would be a set of 561 values with a label. Then the MLP, given a set of test data with 561 features, would predict the label.
>
> Thanks for the advice.

On Sun, Sep 25, 2016 at 12:03 PM, Wang Wei <[email protected]> wrote:

On Sun, Sep 25, 2016 at 9:37 AM, Arash Shafiei <[email protected]> wrote:

> Hi Wang Wei,
>
> I am trying to understand the char-nn example, but there is still something that I am missing and cannot figure out by myself.
>
> The convert function creates two numpy arrays, x and y. As I understood, the array x is the data and the array y holds the labels.
>
> I checked the dimensions of these arrays:
> x.shape is (32, 100, 101)
> y.shape is (32, 100)

32 is the batch size.
100 is the sequence size.
101 is the vocabulary size, i.e. there are 101 unique chars in linux_input.txt. Each input, from one sample at one time step, is a one-hot vector with all positions 0 except the position of the character (set to 1).
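The one-hot layout described above can be sketched like this (toy batch and sequence sizes; vocab=101 is the count reported for linux_input.txt, and the index values are made up for illustration):

```python
import numpy as np

# A (batch, seq) array of character indices becomes (batch, seq, vocab),
# with a single 1 at each character's vocabulary index.
vocab = 101
y_idx = np.array([[5, 17, 99],
                  [0, 3, 5]])                     # toy indices: batch=2, seq=3
x = np.zeros(y_idx.shape + (vocab,), dtype=np.float32)
b, s = np.indices(y_idx.shape)                    # row/column index grids
x[b, s, y_idx] = 1.0                              # set the one-hot positions
```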
Given a sequence of chars a, b, c, d, e, f:
if the input (x) is a, b, c, d, e,
then the label (y) is b, c, d, e, f.

> In my understanding you are taking a batch of 100 characters, and the next character must be the label. So according to my understanding:
> x.shape must be (32, 100)
> y.shape must be (32, 1)
>
> I mean that you have a batch of 32 samples to train, and each sample is a series of 100 characters. For each sample, there must be a label, which says what character must follow this series. And that character is only one.
>
> Is there anything that I do not quite understand?
>
> I would need this information in order to modify your sample program for the activity recognition. So ultimately, in my use case:
> x.shape probably is (32, 561)
> y.shape probably is (32, 1)

For your case, if you use the 561 features, then what about the sequence length? Is 32 the batch size?

> The 561 features are floating point values in [-1, 1].
> The label is one value in [1, 2, 3, 4, 5, 6].
>
> I would appreciate your help.
> Thanks.

On Sat, Sep 24, 2016 at 1:59 PM, Wang Wei <[email protected]> wrote:

> No. Don't average them.
> xij is a vector of 6 values. You can normalize them using standard normalization methods.

On Sat, Sep 24, 2016 at 1:54 PM, Arash Shafiei <[email protected]> wrote:

> Thanks for the analysis. I appreciate it.
>
> There is only one thing: the activities do not seem to be continuous for a person. It is as if people are told to walk for a fixed period and 128 samples in R^6 are collected. Then people are told to sit, etc.
>
> So the person is not the focus; the focus is one activity.
>
> We are currently working on the first approach you proposed and will see the result.
>
> Later, we would like to try the second approach. My only concern was that xi0, xi1, ... are in R^6 and you propose to concatenate them. Since they are floating points, I do not know how concatenation would work.
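One such standard normalization method, z-scoring each of the 6 channels over an activity window, might look like this (the 128x6 window shape follows the raw data described in the thread; the data here is random, for illustration only):

```python
import numpy as np

# One activity window: 128 time points, 6 channels (acc-x..gyro-z).
xij = np.random.uniform(-1, 1, (128, 6))

# Z-score per channel: subtract the column mean, divide by the column
# standard deviation (epsilon guards against a zero-variance channel).
mean, std = xij.mean(axis=0), xij.std(axis=0)
xij_norm = (xij - mean) / (std + 1e-8)
```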
> Even if we average, we would lose lots of information. We will think about it.
>
> Thanks for your advice.

On Sat, Sep 24, 2016 at 1:27 PM, Wang Wei <[email protected]> wrote:

> Let's denote xij \in R^6 for the j-th time point of the i-th activity of a person, and let yi \in R^561 for the i-th activity of a person.
>
> If the activities of a person are continuous, then you have two approaches:
> 1. Use y0, y1, y2, ... (all activities of a person) as input, and use the labels l0, l1, l2, ... as the corresponding output of the RNN. The RNN needs to output a label for each activity.
> 2. Use the raw data xi0, xi1, xi2, ... (all information from one activity) as the input, and use the label li as the output of the RNN. The RNN needs to output a label for all time points of one activity.

On Sat, Sep 24, 2016 at 12:33 PM, Arash Shafiei <[email protected]> wrote:

> Yes, in the raw data, for each labeled sample (activity) there are 128 time points, each with 6 channels of floating point data (acc-x, acc-y, acc-z, gyro-x, gyro-y, gyro-z).
>
> For each sample (activity) of 128 points of 6 channels, 561 features are generated.
>
> Each person performs almost 200 activities.

On Sat, Sep 24, 2016 at 12:20 PM, Wang Wei <[email protected]> wrote:

> Do you mean that in the dataset, each sample (person) has 128 time points, each one with 6 channels?
> If so, I think you can concatenate all 6 channels into a single channel.

On Sat, Sep 24, 2016 at 12:03 PM, Arash Shafiei <[email protected]> wrote:

> Hi Wang Wei,
>
> We were wondering if the input of an RNN can have multiple channels.
>
> In the example that you have for text prediction, the only channel is the characters entering the network.
>
> Now if there are multiple time series, then the network needs multiple channels.
>
> For example, the raw data coming from accelerometers and gyroscopes comprise 6 time series.
> It means that the data can have 6 dimensions, and therefore the input of the network can have 6 channels.
>
> I verified the data set, and it turns out that the 561 features are generated from the 128*6 raw data. So a sequence of samples has 128 values for each of acc-x, acc-y, acc-z, gyro-x, gyro-y, and gyro-z.
>
> As a result, the 561 features are not time series anymore.
>
> We are thinking of:
> 1) using a decision tree on the 561 processed features;
> 2) using an RNN on the raw data.
>
> To use an RNN on the raw data, we would need channels for the input. Would this be possible with SINGA?
>
> Thanks.
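The concatenation suggested in the thread can be sketched as follows: stacking the 6 raw channels gives one 6-dimensional input vector per time step, so an RNN with input_sample_shape=(6,) can consume it directly (array names here are illustrative, and the data is random):

```python
import numpy as np

# Six sensor channels over 128 time steps, split across two source arrays.
acc = np.random.randn(128, 3).astype(np.float32)   # acc-x, acc-y, acc-z
gyro = np.random.randn(128, 3).astype(np.float32)  # gyro-x, gyro-y, gyro-z

# "Concatenating the channels" just means stacking columns: each row is
# now one 6-dim input vector for one time step of the RNN.
window = np.concatenate([acc, gyro], axis=1)       # shape (128, 6)
```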
