Thanks Matt! That’s very helpful.

> On Nov 7, 2014, at 8:10 PM, Matthew Taylor <[email protected]> wrote:
> 
> I'll answer what I can...
> 
> On Fri, Nov 7, 2014 at 9:56 AM, Nicholas Mitri <[email protected]> wrote:
>> 1. Why does prediction lag instead of lead until a large number of samples 
>> has been processed? I remember reading on the ML that it has something to 
>> do with HTM passing through observed values as-is when it can't predict 
>> well. Can someone elaborate on that, please, both in terms of how it's 
>> implemented and the rationale behind it? It tends to produce very misleading 
>> plots, especially when the anomaly score isn't usually high enough to 
>> indicate the pass-through events.
> 
> For the plot, the prediction results are shifted by 1 so that they
> align by timestamp. Before the model learns enough to make decent
> predictions, you are right that it usually just predicts the value it
> just saw. It looks like it is lagging because the plots are aligned,
> and the prediction line is just showing the last value it saw. Once
> predictions get better, the two lines get closer and closer to being
> completely aligned. Perfect predictions would show up on the plot as
> the two lines overlapping each other exactly.
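
The shift-by-1 alignment Matt describes can be sketched like this (the numbers are made up for illustration, not taken from the tutorial): each prediction made at step t is for step t+1, so moving the prediction series one row forward lines each prediction up with the actual value it was trying to predict.

```python
# Hypothetical actual values and one-step-ahead predictions.
actuals = [5.0, 5.3, 6.1, 7.2, 6.8]
predictions = [5.0, 5.1, 6.0, 7.0, 6.9]  # predictions[t] was made at step t, for step t+1

# Shift predictions right by one row so they align by timestamp:
# pad the front (no prediction exists for the first row) and drop the last.
aligned = [None] + predictions[:-1]

# Now actuals[t] and aligned[t] refer to the same timestamp, which is
# what the tutorial plot compares.
for t, (actual, pred) in enumerate(zip(actuals, aligned)):
    print(t, actual, pred)
```

With early, pass-through predictions (prediction ≈ last value seen), the aligned line trails the actual line by one step, which is exactly the "lag" that shows up in the plot.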
> 
> There are more sophisticated ways one could plot this, for example you
> could change the plot to show the most recent prediction out in front
> of the data. For simplicity's sake, I didn't do this for the tutorial.
> 
>> 2. Why is the timestamp included as an encoded field and passed to the 
>> network in the gym example? Is it processed in the same way as the 
>> consumption field or is it only used to align predictions with their 
>> corresponding inputs? For cases with uniform sampling (like the sine 
>> example), can we simply ignore that field and only encode the equivalent of 
>> the consumption field?
> 
> For the hotgym example, there are daily and weekly temporal patterns.
> The datetime must be encoded along with the other input to get
> characteristics of time like "time of day" and "day of week". If we
> didn't encode the time like this, the model would not recognize these
> patterns, because they would not be encoded along with the other input.
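
As an illustration of the idea (this is a plain-Python sketch, not NuPIC's actual `DateEncoder` implementation): the raw timestamp is turned into features such as "time of day" and "day of week", which are then encoded alongside the consumption value.

```python
from datetime import datetime

def time_features(ts):
    """Derive simple temporal features from a timestamp (illustrative only)."""
    return {
        "time_of_day": ts.hour + ts.minute / 60.0,  # 0.0 .. 24.0
        "day_of_week": ts.weekday(),                # 0 = Monday .. 6 = Sunday
    }

# A Friday evening reading from the hotgym-style data:
print(time_features(datetime(2014, 11, 7, 20, 30)))
```

In NuPIC these features are encoded as SDRs and concatenated with the scalar encoding of the consumption field, so the model can associate "8:30 PM on a Friday" with typical consumption at that time.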
> 
> For the sine example, there are no true time-based patterns (no
> hourly, daily, weekly patterns, etc.), so there is no need to encode
> time in the input. It is a sequential pattern, but adding an encoded
> timestamp to the input wouldn't help with predictions, because there
> are no time patterns. The only pattern is the sine cycle itself.
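
To make that concrete, a sine input stream carries only one field worth encoding, the value itself (a hypothetical generator, not the tutorial's data code):

```python
import math

def sine_rows(n, steps_per_cycle=100):
    """Generate n samples of a sine wave; the row index carries no
    calendar meaning, so no timestamp field is needed."""
    return [math.sin(2 * math.pi * i / steps_per_cycle) for i in range(n)]

rows = sine_rows(5)
print(rows)
```

The cycle repeats every `steps_per_cycle` rows regardless of wall-clock time, so a scalar encoder on the value alone captures everything the model can learn.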
> 
> ------
> Matt Taylor
> OS Community Flag-Bearer
> Numenta
> 

