Chetan, wouldn't it be better to send a null value instead?
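
For comparison, here is a minimal sketch of the two options (a null value versus repeating the last value). It assumes the readings arrive as a plain list of counts and that 2046 is the only error code. The function name and the "strategy" argument are made up for illustration, and whether a model accepts None for a field should be checked against your NuPIC version.

```python
ERROR_READING = 2046  # sensor error code described in this thread


def clean_readings(counts, strategy="null"):
    """Replace error readings in a sequence of vehicle counts.

    strategy="null"   -> error readings become None
    strategy="repeat" -> error readings become the last valid count
                         (None if no valid count has been seen yet)
    """
    if strategy not in ("null", "repeat"):
        raise ValueError("strategy must be 'null' or 'repeat'")
    cleaned = []
    last_valid = None
    for count in counts:
        if count == ERROR_READING:
            cleaned.append(last_valid if strategy == "repeat" else None)
        else:
            cleaned.append(count)
            last_valid = count
    return cleaned
```

Because each sensor's readings are cleaned independently, an intersection with valid readings on some sensors and error readings on others loses only the bad values, not the whole time step.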

Jonathan, for #2: There is no swarm being run in the anomaly example.
I just used the same model params after changing the inference type to
TemporalAnomaly. Everything else is the same. Regarding creating many
swarm descriptions programmatically... that makes sense sometimes, but
I don't think it does in this context. The data coming out of each
traffic sensor is probably very similar to all the other sensors. I
would imagine that the same model params would be equally applicable
to each sensor when you create a model. As the models run and learn
their individual sensor patterns over time, that is what will make
them different. Take the NYC traffic example: I used the same model
params for every route because the data format is the same for every
route. You could probably do the same.
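
A rough sketch of that idea, one shared params dict and one model per sensor. The "create_model" argument is a hypothetical stand-in for whatever call builds a model from params in your setup; it is passed in so the sketch does not depend on any particular NuPIC API.

```python
import copy


def build_models(sensor_ids, base_params, create_model):
    """Create one model per sensor from a single shared params dict.

    create_model is a caller-supplied factory (hypothetical here) that
    takes a params dict and returns a model object.
    """
    models = {}
    for sensor_id in sensor_ids:
        # Deep-copy so no model can mutate another model's params.
        models[sensor_id] = create_model(copy.deepcopy(base_params))
    return models
```

The models start out identical and diverge only as each one learns from its own sensor's stream.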

#3: Yes, but I would like to see a detailed description of your data
or a sample data set.



---------
Matt Taylor
OS Community Flag-Bearer
Numenta


On Wed, Sep 23, 2015 at 3:20 PM, Chetan Surpur <[email protected]> wrote:
> Jonathan,
>
> On Sep 23, 2015, at 1:54 AM, Jonathan Mackenzie <[email protected]> wrote:
>
> Can a model be fed an instance with missing input fields? Sometimes my data
> has error readings (indicated by a count of 2046) from the sensor and this
> is not something that. is anomalous (2046 cars passing a single sensor in 5
> minutes is highly anomalous, nigh impossible to occur and should probably be
> ignored). How should I handle this? Should I just drop the instance
> entirely? Keeping in mind that for a particular time, an intersection can
> have valid readings on some sensors and error readings on others. Error
> readings are not very common, about 8 in a month.
>
>
> I would just repeat the last value whenever you detect an error reading.
>
> - Chetan
>
>
> On 15 September 2015 at 11:20, Matthew Taylor <[email protected]> wrote:
>>
>> Jonathan, my replies are below:
>>
>> On Sun, Sep 13, 2015 at 8:21 PM, Jonathan Mackenzie <[email protected]>
>> wrote:
>> > Following up on our discussions in gitter, basically, I want to perform
>> > automated incident detection (AID as it's called in the literature) on
>> > arterial roads (freeway roads are a different matter and transferability
>> > of algorithms between freeways and arterial roads is _difficult_).
>> >
>> > I have 3.5 TB of data from 2006-2013 on ~540 intersections ... can nupic
>> > handle this much data?
>>
>> Yes. NuPIC can handle as much data as you throw at it, because the
>> data is not stored. It will take you quite a while to process that much
>> data, however. I would suggest you process it in parallel across
>> multiple processes.
>>
>> Your data looks good to me, but at what interval do you get it? I
>> would suggest that you take high-speed data and aggregate it to 10-15
>> minute intervals. If you pass the data in at faster intervals, NuPIC
>> may not recognize larger temporal patterns, like weekly or seasonal
>> patterns. This might not work if you are trying to identify traffic
>> incidents within 10 minutes.
>>
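
A minimal sketch of the aggregation step suggested above, assuming each reading is a (timestamp, count) pair and that the bucket width divides 60 evenly. The function name is made up for illustration.

```python
from datetime import datetime


def aggregate_counts(readings, minutes=15):
    """Sum (timestamp, count) readings into fixed-width time buckets.

    Assumes `minutes` divides 60 evenly (e.g. 10, 15, 30).
    Returns a list of (bucket_start, total_count), sorted by time.
    """
    buckets = {}
    for ts, count in readings:
        # Floor the timestamp to the start of its bucket.
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        buckets[start] = buckets.get(start, 0) + count
    return sorted(buckets.items())
```

For example, three 5-minute counts at 10:00, 10:05 and 10:10 collapse into a single 10:00 record holding their sum.
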
>> > The system would be used to determine if an incident has occurred
>> > between two intersections based on an anomaly value threshold. My
>> > initial thought for using nupic was to create a model for each
>> > intersection where the inputs were each individual loop detector. But
>> > apparently this is not possible since htmengine performs anomaly
>> > detection on a single field only. I still want to perform anomaly
>> > detection, so from here, to use htmengine it looks like I have 2
>> > options:
>> >
>> >  * Encode the readings into a single value; would this work?
>>
>> Interesting idea, but the problem is how to encode data from multiple
>> sensors into one data point. I'm not sure how this would work.
>>
>> >  * Make a model for every single sensor. Would this be useful?
>>
>> Yes, I'm sure this would be useful, but there is a scaling problem.
>> How many individual sensors do you have? It will take one model per
>> sensor. If you have thousands of sensors, it is going to be hard to
>> scale that many NuPIC models.
>>
>> > It seems intuitive to think that incidents have an effect on the
>> > overall flow of an intersection. Would the models be related to each
>> > other?
>>
>> The models would not be related because they are only paying attention
>> to their own streams, but if you got high anomaly indications from
>> several models in the same intersection at once, it would be a huge
>> indicator that something just happened.
>>
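
One way to act on "several models in the same intersection at once", sketched under assumptions: each sensor model yields an anomaly score between 0 and 1 for the current time step, and the threshold and minimum sensor count below are arbitrary placeholders that would need tuning on real data.

```python
def intersection_alert(scores, threshold=0.9, min_sensors=2):
    """Flag an intersection when enough sensors are anomalous together.

    scores: dict mapping sensor id -> anomaly score for one time step.
    Returns (alert, high) where `high` is the sorted list of sensor ids
    whose score is at or above `threshold`.
    """
    high = sorted(s for s, score in scores.items() if score >= threshold)
    return len(high) >= min_sensors, high
```

The models stay independent; only their outputs are compared per time step.
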
>> > Could the sensor model anomaly outputs be fed into a model for their
>> > intersection?
>>
>> This has been brought up before, but we've never tried it so we don't
>> know what would happen.
>>
>> >
>> > What's the best way of solving my problem?
>>
>> Another idea is to focus on just a few intersections so you don't have
>> to deal with the scaling problem. You could create multi-variate
>> models (models that look at more than one field of data) for each
>> intersection. But you would need to build these models manually using
>> the OPF, so it would take more work than the HTM Engine. But you'd
>> have much more flexibility and control over your program. You can see
>> a decent OPF example of this (except the multi-variate part) with the
>> Hot Gym tutorials:
>>
>> - https://github.com/numenta/nupic/tree/master/examples/opf/clients/hotgym/prediction/one_gym
>> - https://github.com/numenta/nupic/tree/master/examples/opf/clients/hotgym/anomaly/one_gym
>>
>> > I've followed the htmengine tutorial, but got stuck at the part where I
>> > plug the readings into the models.
>>
>> I would like to help you if you are stuck. Not sure what you mean, but
>> if you can share your codebase, I (or someone else) can try to help.
>>
>> Regards,
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>>
>
>
>
> --
> Jonathan Mackenzie
> BEng (Software) Hons
> PhD Candidate, Flinders University
>
>
