jimmylao commented on issue #49: URL: https://github.com/apache/incubator-bluemarlin/issues/49#issuecomment-1048165884
@klonikar

> 1. What is context in the red cells in decoder? Is it used as input to next timestep?

As shown in the 1st figure, the context of the red cells comes from the attention layer. It is concatenated with the state output of the current cell and then used as input to the next cell (see the first sketch below).

> 2. Can you explain split of train_window data into train_skip_first, data used for training and final eval? How is the data in the final 10 days of 82 (12+60+10) days used in training? Is it used to update regularization params or weights in backprop?

As shown in the figure above, the data is typically split into 4 configurable parts: train_skip_first, train_window, validate, and back_offset. In your case:

- train_skip_first = 12
- train_window = 60
- predict (validate) = 10
- back_offset = 0

"Is it used to update regularization params or weights in backprop?" I don't understand the meaning of this question. The data is used either to train a model or to make predictions with a trained model. In the training stage, it is used to compute the gradients in each batch/epoch, which are then used in backprop (see the windowing sketch below).

> 3. What are the features used at time step of 82 days? How is ts_n in trainready table used as input? I thought that's the output.

As shown in the figure above, time_x is the main input needed to train the model, and it has multiple components. The trainready table is the output of preprocessing and the input for model training. A program called tfrecord_reader.py converts the trainready data format into the model's training data format (see the feature sketch below).

> 4. Attention: Explain attention heads = 10, and attention depth = 60

These parameters are used when the "attention" option is turned on. The purpose of attention is to find the correlation among different time steps. This model uses the attention concept rather than a typical attention implementation: it uses convolution to model the correlation among different time steps, which is faster than a typical attention implementation, but the approximation does not produce better performance. Domain-knowledge-based attention features, such as monthly or quarterly correlation, are added to the model to fill that gap and were claimed to work better than using the attention implementation in the model (see the lag-feature sketch below).

> 5. Can you give mathematical description of the model?

I'll add some math equations later; a provisional sketch of the decoder step is included below. I suggest you read through the code and run it on a couple of sample data sets first.
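Sketch for question 1 — a minimal NumPy illustration of the decoder step described above: the attention context is concatenated with the current cell's state output before feeding the next cell. All names here (`decoder_step`, the linear stand-in cell) are hypothetical and only illustrate the concatenation, not the project's actual code.

```python
import numpy as np

def decoder_step(prev_output, attention_context, cell):
    """One decoder step: concatenate the attention context (the red-cell
    'context' in the figure) with the current cell's state output, then
    feed the result to the next cell as its input."""
    next_input = np.concatenate([prev_output, attention_context], axis=-1)
    return cell(next_input)

# Toy usage: a linear layer stands in for the real RNN cell.
rng = np.random.default_rng(0)
hidden, context, batch = 16, 8, 4
W = rng.normal(size=(hidden + context, hidden))
cell = lambda x: np.tanh(x @ W)                        # stand-in for the RNN cell

out = decoder_step(rng.normal(size=(batch, hidden)),   # current cell state output
                   rng.normal(size=(batch, context)),  # attention context
                   cell)
print(out.shape)  # (4, 16)
```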
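Sketch for question 2 — how the four windows partition an 82-day series under the settings above. The index arithmetic is illustrative; the project's actual slicing code may differ.

```python
import numpy as np

series = np.arange(82)        # 82 days of one toy time series

train_skip_first = 12         # days dropped from the start
train_window     = 60         # days used to fit the model
predict_window   = 10         # days held out for validation
back_offset      = 0          # days cut from the very end

end   = len(series) - back_offset
train = series[train_skip_first : train_skip_first + train_window]  # days 12..71
valid = series[train_skip_first + train_window : end]               # days 72..81

assert len(train) == train_window and len(valid) == predict_window
```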
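Sketch for question 3 — ts_n from the trainready table is one component of time_x (the input) and, in its future slice, also the training target. The day-of-week covariate is only an illustrative extra component, not necessarily one the model actually uses.

```python
import numpy as np

days = 82
ts_n = np.random.rand(days)                    # normalized series from the trainready table

# time_x has multiple components: ts_n plus per-day covariates
# (day-of-week is shown purely as an illustrative example).
day_of_week = np.eye(7)[np.arange(days) % 7]   # (82, 7) one-hot covariate
time_x = np.column_stack([ts_n, day_of_week])  # (82, 8): features per time step

# The same ts_n also supplies the target: the decoder is trained
# to predict its future slice (the last predict_window days).
encoder_input  = time_x[12:72]                 # 60 training days as input
decoder_target = ts_n[72:82]                   # final 10 days as target
print(encoder_input.shape, decoder_target.shape)  # (60, 8) (10,)
```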
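Sketch for question 4 — domain-knowledge attention features built as lagged copies of the series at roughly monthly and quarterly offsets, so periodic correlation is directly visible to the model. The offsets (30 and 90 days) and the zero padding are illustrative assumptions.

```python
import numpy as np

def lag_feature(ts, lag):
    """The series value `lag` days earlier, zero-padded where no history exists."""
    out = np.zeros_like(ts)
    out[lag:] = ts[:-lag]
    return out

ts_n = np.random.rand(82)
monthly   = lag_feature(ts_n, 30)  # ~monthly correlation feature
quarterly = lag_feature(ts_n, 90)  # ~quarterly; all zeros on this 82-day toy series

features = np.column_stack([ts_n, monthly, quarterly])
print(features.shape)  # (82, 3)
```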
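Sketch for question 5 — until the real equations are added, here is a provisional LaTeX rendering of the decoder step implied by answer 1. The GRU cell and the linear output projection are assumptions; the actual model may differ.

```latex
% One decoder step, per answer 1: the attention context c_t is
% concatenated with the current state output h_t to form the input
% of the next cell (GRU assumed), followed by an output projection.
\begin{aligned}
  x_{t+1}       &= [\, h_t \,;\, c_t \,]        \\
  h_{t+1}       &= \mathrm{GRU}(x_{t+1},\, h_t) \\
  \hat{y}_{t+1} &= W_o\, h_{t+1} + b_o
\end{aligned}
```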