Re: About tutorial experiments

Matthew Taylor Fri, 07 Nov 2014 10:11:32 -0800

I'll answer what I can...

On Fri, Nov 7, 2014 at 9:56 AM, Nicholas Mitri <[email protected]> wrote:
> 1. Why does prediction lag instead of lead until a large number of samples 
> has been processed? I remember reading about that in the ML and it having 
> something to do with HTM passing through observed values as-is when it can’t 
> predict well. Can someone elaborate on that please both in terms of how it's 
> implemented and the rationale behind it? It tends to produce very misleading 
> plots especially when the anomaly score isn’t usually high enough to indicate 
> the pass-through events.


For the plot, the prediction results are shifted by 1 so that they
align by timestamp. Before the model learns enough to make decent
predictions, you are right that it usually just predicts the value it
just saw. It looks like it is lagging because the plots are aligned,
and the prediction line is just showing the last value it saw. Once
predictions get better, the line gets closer and closer to being
completely aligned. Perfect predictions would show up on the plot as
both lines perfectly overlapping each other.

There are more sophisticated ways one could plot this. For example, you
could change the plot to show the most recent prediction out in front
of the data. For simplicity's sake, I didn't do this for the tutorial.
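
To make the shift concrete, here is a minimal sketch in plain Python. It
is not the tutorial's plotting code; the function name and the sample
values are made up for illustration. It assumes one-step-ahead
predictions, where the prediction made after seeing value i is a guess
at value i + 1.

```python
def align_predictions(predictions):
    """Shift one-step-ahead predictions right by one slot.

    predictions[i] was made after seeing input i and is a guess at
    input i + 1, so it is plotted at index i + 1. The first slot has no
    prediction, and the final prediction (for a value not yet seen) is
    dropped.
    """
    return [None] + list(predictions[:-1])


actuals = [10, 12, 15, 11, 9]

# An untrained model that simply repeats the value it just saw.
naive_predictions = list(actuals)
naive_aligned = align_predictions(naive_predictions)

# A hypothetical perfect model: each prediction equals the next actual value.
perfect_predictions = actuals[1:] + [None]
perfect_aligned = align_predictions(perfect_predictions)
```

Plotted against actuals, naive_aligned trails the data by one step,
while perfect_aligned overlaps it everywhere except the first slot.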

> 2. Why is the timestamp included as an encoded field and passed to the 
> network in the gym example? Is it processed in the same way as the 
> consumption field or is it only used to align predictions with their 
> corresponding inputs? For cases with uniform sampling (like the sine 
> example), can we simply ignore that field and only encode the equivalent of 
> the consumption field?

For the hotgym example, there are daily and weekly temporal patterns.
The datetime must be encoded along with other input to get
characteristics of time like "time of day" and "day of week". If we
didn't encode the time like this, the model would not recognize these
patterns because they were not encoded along with the other input.
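
As a rough illustration of what "characteristics of time" means here,
the sketch below pulls "time of day" and "day of week" out of a
timestamp using only the Python standard library. This is a simplified
stand-in, not NuPIC's encoder: the function name and the returned keys
are hypothetical, and a real encoder would go on to turn these values
into sparse bit arrays rather than leave them as plain numbers.

```python
from datetime import datetime


def time_features(timestamp):
    """Extract the time characteristics a model could learn patterns from."""
    return {
        # Fractional hour in [0, 24), e.g. 10:30 becomes 10.5.
        "time_of_day": timestamp.hour + timestamp.minute / 60.0,
        # Monday is 0, Sunday is 6 (datetime.weekday() convention).
        "day_of_week": timestamp.weekday(),
        # Saturday or Sunday.
        "is_weekend": timestamp.weekday() >= 5,
    }


features = time_features(datetime(2014, 11, 7, 10, 30))
```

Two readings taken at the same hour on different days share the same
time_of_day value, which is what lets a daily pattern show up in the
input at all.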

For the sine example, there are no true time-based patterns (no daily,
hourly, weekly patterns, etc.). So there is no need to encode time in
the input. It is a sequential pattern, but adding an encoded timestamp
to the input wouldn't help with predictions, because there are no time
patterns. The only pattern is the sine cycle itself.

------
Matt Taylor
OS Community Flag-Bearer
Numenta
