Tom,

One feature of NuPIC and HTM is online learning: as the data
patterns change over time, NuPIC will recognize the new patterns while
forgetting older ones. So we usually don't have to re-run swarms.
Swarming is not perfect, and sometimes we do some manual tuning of the
model parameters it returns. But generally, when patterns within the
data change over time, there is no need to re-swarm. You might,
however, need to update the encoder's min/max values if the data
starts jumping outside its normal range.

You would have to re-swarm if you wanted to add a new field to the
data input, re-categorize a field as a different data type, or
something like that.
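To make that min/max point concrete, here is a minimal sketch of widening a scalar encoder's range in the model params dict that swarming produces, before re-creating the model. The "consumption" field name and the widen_range helper are hypothetical; in practice the params come from your own swarm output.

```python
# Hypothetical model params fragment, shaped like swarm output.
model_params = {
    "modelParams": {
        "sensorParams": {
            "encoders": {
                "consumption": {
                    "fieldname": "consumption",
                    "name": "consumption",
                    "type": "ScalarEncoder",
                    "minval": 0.0,    # range the swarm saw
                    "maxval": 100.0,
                    "w": 21,
                    "n": 500,
                },
            },
        },
    },
}

def widen_range(params, field, new_min, new_max):
    """Expand a scalar encoder's min/max to cover newly observed values.

    Only widens, never narrows, so previously seen data stays in range.
    """
    enc = params["modelParams"]["sensorParams"]["encoders"][field]
    enc["minval"] = min(enc["minval"], new_min)
    enc["maxval"] = max(enc["maxval"], new_max)
    return enc

# If live data starts ranging from -10 to 250, widen before rebuilding:
widen_range(model_params, "consumption", -10.0, 250.0)
```

You would then pass the updated params to ModelFactory.create() as usual; no re-swarm is needed for a simple range change.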

---------
Matt Taylor
OS Community Flag-Bearer
Numenta


On Mon, Apr 27, 2015 at 6:51 PM, Tom Tan <[email protected]> wrote:
> Hi,
>
> A newbie question: the swarm runs over a pre-selected dataset. I suppose 
> the resulting model params will be optimal for that selected data, raising 
> the possibility of overfitting. The resulting model params could be an ill 
> fit for data never seen before.
>
> The “classical” ML approach is to compare different models using a new 
> “cross-validation” data set; the model that gives the smallest error is 
> chosen. Does NuPIC have “error” outputs?
>
> To further extend the question: when the underlying data behavior changes, 
> when/what signals the need to re-run the swarm? Swarming seems pretty 
> computationally expensive; is it practical to run a swarm over high-speed 
> online streaming data?
>
> Regards,
> Tom
>
>
