I’m actually experimenting with the CoordinateEncoder so it’s good to see this
example. I took the model parameters from the geospatial demo so I’ll compare
to this and see if anything should be changed.
A couple of questions:
- Any suggestions on the scaling of the radius parameter? It can be measured
relative to the previous sample or relative to a sample from a few seconds ago.
How will that change the outcome?
- In the example you are calculating the anomaly likelihood. What does the random
number mean?
anomalyLikelihood = anomalyLikelihoodHelper.anomalyProbability(
random.random(), anomalyScore
)
Thanks,
Zvika
On 19 Oct 2015, at 4:33 PM, Matthew Taylor <[email protected]> wrote:
Zvika,
Before you try a different encoder, you should attempt to use the
CoordinateEncoder directly. It can accept X,Y coordinates and a
"radius" which can represent speed. That is what I used to get NuPIC
to do anomaly detection on Minecraft XYZ coordinates:
https://github.com/nupic-community/mine-hack/blob/master/python/nupic_client.py#L71-L79
And for an anomaly detection model on coordinates, you won't need to
swarm because we already have model params that work well for detecting
these types of anomalies here:
https://github.com/nupic-community/mine-hack/blob/master/python/model_params/model_params.py.
You should be able to re-use those model params (maybe with a few
string replacements).
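
As a rough illustration of the input described above, here is a minimal sketch
of turning one track sample (x, y between 0 and 1, plus speed) into a
(coordinate, radius) pair. Everything in it is an assumption for illustration:
the helper names, the "vector" field name, the grid scale of 1000 cells per
unit, the 1.5 overlap factor and the minimum radius of 1 are not taken from the
linked files. The sketch is plain Python with no NuPIC dependency; the
CoordinateEncoder itself works on integer coordinates, typically passed as a
numpy array, so convert the tuple before feeding it to a model.

```python
def to_grid(x, y, scale=1000):
    """Map normalized (x, y) in [0, 1] to integer grid coordinates.

    `scale` (grid cells per unit of x or y) is an assumed value; pick it
    so that typical movements span several cells.
    """
    return (int(round(x * scale)), int(round(y * scale)))


def radius_for_speed(speed, timestep, scale=1000, overlap=1.5, min_radius=1):
    """Derive an integer radius from speed (hypothetical helper).

    speed:    distance per second, in the same normalized units as x and y
    timestep: seconds between the two samples being compared
    overlap:  assumed factor so that consecutive samples share some cells
    """
    cells_per_step = speed * timestep * scale
    radius = int(round(cells_per_step / 2.0 * overlap))
    # Never return a radius below the assumed minimum, even when stationary.
    return max(radius, min_radius)


def make_record(x, y, speed, timestep, scale=1000):
    """Build one input record; the "vector" field name is an assumption."""
    return {
        "vector": (to_grid(x, y, scale),
                   radius_for_speed(speed, timestep, scale)),
    }


record = make_record(x=0.5, y=0.25, speed=0.25, timestep=0.5)
```

In this sketch the radius grows with both speed and timestep, so measuring
against a sample from a few seconds ago gives a larger radius than measuring
against the previous sample.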
---------
Matt Taylor
OS Community Flag-Bearer
Numenta
On Mon, Oct 19, 2015 at 5:05 AM, Zvika Ashani <[email protected]> wrote:
Hi Nupic,
I am trying to see if I can do anomaly detection over a data set that
represents object tracks. Each object track is a list of data points that have
the following information:
- timestamp
- position (x,y between 0 and 1)
- speed
I want to learn a large number of such tracks and then look for anomalies in
new tracks.
This is kind of like the nupic.geospatial example but the position data is in a
different coordinate system.
I am looking at using a vector encoder with each sample being [x,y,speed] and
then feeding each track as a separate sequence into the model.
Questions:
- is this the correct approach or is there some better way of encoding the data?
- is it possible to swarm over this to find the best model?
Thanks,
Zvika