Re: [nupic-discuss] NuPIC Prediction Method

Matthew Taylor Thu, 26 Jun 2014 14:53:27 -0700

Lexing,

I hope to have answered your questions below.

On Wed, Jun 25, 2014 at 2:35 PM, Lexing.Tong <[email protected]>
wrote:

> Hello,
>
> I am not exactly sure whether I am using NuPIC correctly and whether my
> assumptions are correct, so I have a few questions about how the prediction
> system works.
>
> 1) I believe the predictions have been shifted to line up with the actual
> values such that on each line, the prediction given refers to the predicted
> value for that line, not the value that is predicted for the next line.

Predictions are only shifted if the "inference shifter" [1] is used.
Otherwise, the predictions received in the result are a step ahead.
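
As a rough illustration of the difference between shifted and unshifted
output, here is a plain-Python sketch. It does not use the NuPIC API; the
helper name "shift_predictions" and the sample numbers are made up for this
example, and it only assumes that each prediction is made some fixed number
of steps ahead.

```python
def shift_predictions(actuals, predictions, steps=1):
    """Line each prediction up with the row it was predicting.

    predictions[i] is the value predicted, at row i, for row i + steps.
    Returns (actual, prediction_for_this_row) pairs. The first `steps`
    rows have no prediction yet, and the trailing rows carry predictions
    for values that have not been seen.
    """
    rows = []
    for i in range(len(actuals) + steps):
        actual = actuals[i] if i < len(actuals) else None
        predicted = predictions[i - steps] if i >= steps else None
        rows.append((actual, predicted))
    return rows


# Hypothetical data: each prediction was made one step ahead.
actuals = [10.0, 12.0, 11.0]
predictions = [12.5, 10.5, 13.0]

for actual, predicted in shift_predictions(actuals, predictions):
    print(actual, predicted)
```

In the unshifted form, the prediction on the last row is already the
prediction for the next, unseen value.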

> If this is the case, is NuPIC only supposed to predict up to the last
> actual point?  Given 100 actual values as data, NuPIC will produce
> predictions up to that 100th point, but it will not go any further.  In
> this case, how would you get a prediction for the 101st point where the
> actual value is unknown, or is NuPIC not designed to work this way?
>

When setting up a TemporalMultiStep model, you can specify as many
steps-ahead as you want in the model parameters. For example, "1,2,101"
would return three predictions for each row of input (one step ahead, two
steps ahead, and 101 steps ahead). See the sample model params in the
simple hot gym example [2].
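
To make the "1,2,101" example concrete, here is a small plain-Python sketch
of which future rows such a setting refers to. The dictionary key "steps"
and both helper functions are assumptions made for illustration; check the
linked file [2] for the real parameter name and location.

```python
def parse_steps(steps):
    """Turn a steps string such as "1,2,101" into a sorted list of ints."""
    return sorted(int(s) for s in steps.split(",") if s.strip())


def target_rows(current_row, steps):
    """Map each steps-ahead value to the row number it predicts."""
    return {k: current_row + k for k in parse_steps(steps)}


# Hypothetical fragment; only the steps string matters for this sketch.
classifier_params = {"steps": "1,2,101"}

# After feeding in row 100, the three predictions refer to these rows.
print(target_rows(100, classifier_params["steps"]))
```

So with 100 rows of input, a one-step-ahead prediction on the last row is
the prediction for the 101st value.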

If you are setting up a swarm, use the "predictionSteps" swarm parameter
[3] to specify as many steps ahead as you like.
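
A swarm description fragment with several prediction steps might look
roughly like the sketch below. Only "predictionSteps" comes from the text
above; the surrounding "inferenceArgs" nesting, the "predictedField" key,
and the field name "consumption" are assumptions, so compare against the
linked search_def.json [3] before relying on them.

```python
import json

# Hypothetical fragment of a swarm description, not a complete one.
swarm_description = {
    "inferenceArgs": {
        "predictionSteps": [1, 2, 101],   # steps ahead to predict
        "predictedField": "consumption",  # assumed field name
    },
}

print(json.dumps(swarm_description, indent=2))
```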

>
> 2) Also, in the hotgym scenario, in some cases like for the first value
> and other beginning values, the "prediction" is exactly the actual value.
>  Clearly, no prediction has actually occurred and NuPIC has just placed the
> actual value as its prediction (maybe as a placeholder?) because it doesn't
> know what to do.  I think this is misleading, and that the prediction
> should be blank when NuPIC is not actually predicting (this is not
> referring to the case where NuPIC uses the previous value as a prediction;
> this case is a valid prediction since it is using the previous data).
>

I disagree with you there. I think the right thing to do is to return the
previous value. If someone shows you the number "342.53356" and asks you to
predict what comes next, a prediction of the same value is more useful than
nothing at all.

>
> 3) Is NuPIC supposed to use all the data for each prediction instead of
> just the previous data?

Yes.

> For example, I changed the last actual value given in the hotgym scenario,
> and after swarming and running, the new predictions were different from the
> old predictions, which implies that the last actual value has an effect on
> the predictions for the previous values.

The last value absolutely has an effect. At any point in time, the CLA
believes it is within some number of sequences. If the last value it
receives convinces it that it's no longer within a sequence, it might
predict drastically different values.

> If this is actually the case, doesn't that mean NuPIC has some hindsight
> bias because it is using "future" values to help it predict "past" values?

I don't understand what you mean. NuPIC doesn't predict past values. All
the past values have already been processed to establish the current state
of the CLA.

>  I think it may have to do with iterationCount in the swarm description;
> I'm not sure what "feeding all available aggregated records" means when you
> use -1 (from https://github.com/numenta/nupic/wiki/Running-Swarms).
>

The iteration count only applies to swarming. A value of -1 means that the
swarm will run against all the data given in a data file. If you set
iterationCount to 100, it will only swarm over the first 100 records in
the input. This value has nothing at all to do with running models after
the swarm is complete and the model params have been used to create a
running model.
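
The effect of the iteration count on the swarm's input can be sketched in
plain Python as follows. The function "records_for_swarm" is a made-up
helper for illustration, not part of NuPIC; it only mirrors the behavior
described above.

```python
def records_for_swarm(records, iteration_count=-1):
    """Return the records a swarm would see for a given iteration count.

    -1 means every record in the input; a positive N means only the
    first N records.
    """
    if iteration_count == -1:
        return list(records)
    return list(records[:iteration_count])


data = list(range(250))  # stand-in for 250 input records

print(len(records_for_swarm(data)))       # all records
print(len(records_for_swarm(data, 100)))  # first 100 only
```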

>
> 4) What is the relationship/correlation between Anomaly Score and Anomaly
> Likelihood for the anomaly section of the hotgym?  I thought that a point
> with a high anomaly score meant that it had a high chance of being an
> anomaly, but this doesn't seem to be the case.
>

See Subutai's presentation on this from our last hackathon [4].

>
> Thanks,
> Lexing
>

[1]
https://github.com/numenta/nupic/wiki/Online-Prediction-Framework#shifting-inferences
[2]
https://github.com/numenta/nupic/blob/master/examples/opf/clients/hotgym/simple/model_params.py#L236
[3]
https://github.com/numenta/nupic/blob/master/examples/swarm/simple/search_def.json#L53
[4]
http://numenta.org/blog/2014/05/09/2014-spring-hackathon-outcome.html#anomaly_detection_in_cla

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
