Re: [nupic-discuss] Prediction of multiple time series for concrete bridge problem [NI]

Cavan Day-Lewis Tue, 19 Aug 2014 10:31:34 -0700

Classification: NPL Management Ltd - Internal

Hi all,


Following the advice in this discussion, I have been able to feed NuPIC
multiple fields (timestamp, temperature and tilt) and ask it to predict
tilt, which it did. To check that NuPIC was making use of the temperature
field, I also ran it with just timestamp and tilt and compared the two
sets of predictions (both for +1 step ahead). They were identical, whereas
I had hoped NuPIC would use the temperature field to improve its
predictions for tilt.

I am sure I have missed something when setting it up. Find attached the
code I used. Is there anything I am missing?

For your reference, AWT_01 is the temperature field.

Note: in swarm_description.py it says:

> 'sensorParams': { 'encoders': { u'AWT_01': None,

Is this correct?

Thanks in advance,
Cavan
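
A minimal sketch for checking which fields actually have an encoder in a
params dictionary. The `model_params` dict below is a hypothetical,
cut-down stand-in for the attached TimeTempTilt_model_params.py: only
`AWT_01` and its `None` value come from this message, and the `tilt` entry
is invented for illustration. The working assumption, which someone from
Numenta should confirm, is that a field whose encoder is `None` is not
encoded at all, so the model never sees it.

```python
# Hypothetical, cut-down params dict; only u'AWT_01': None is taken from the post.
model_params = {
    "modelParams": {
        "sensorParams": {
            "encoders": {
                u"AWT_01": None,
                u"tilt": {"fieldname": u"tilt", "type": "ScalarEncoder"},  # invented
            }
        }
    }
}


def split_encoded_fields(params):
    """Return (encoded, skipped) field-name lists from the 'encoders' section."""
    encoders = params["modelParams"]["sensorParams"]["encoders"]
    encoded = sorted(name for name, spec in encoders.items() if spec is not None)
    skipped = sorted(name for name, spec in encoders.items() if spec is None)
    return encoded, skipped


encoded, skipped = split_encoded_fields(model_params)
print("encoded fields:", encoded)
print("fields with no encoder:", skipped)
```

If the temperature field shows up in the second list, that alone would
explain why the two runs gave identical tilt predictions.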


-----Original Message-----
From: nupic [mailto:[email protected]] On Behalf Of
Matthew Taylor
Sent: 18 August 2014 17:55
To: NuPIC general mailing list.
Subject: Re: [nupic-discuss] Prediction of multiple time series for
concrete bridge problem

Hi John,

Interesting problem, and a good fit for the hot gym example. Is any of
your data public? If so, it would be cool to create an open source repo
where anyone in the community can work on this. I'd certainly be able to
help structure the experiments a bit.

> would be a concatenation of two such bitmaps, right? My question then 
> is how do I extract the temperature(t+1) and tilt(t+1) as separate 
> numbers? How can the classifier make sense of this?
>
> Where it says inferences above it lists the +5 predictions for only 
> one time series, namely the one specified as predictedFieldName. Can 
> this parameter become a list?

No, because each model can only produce inferences for one field, so
you'll need to either pick one field to predict or create two
near-identical models that focus on predictions for different fields.

> If I can only output one at a time that is ok too, as long as the 
> classifier can disambiguate the two streams...?

You'll need to create two models to do what you want to do here. Each
one will have its own classifier, so no worries about the
disambiguation.
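
A rough sketch of the two-model setup, assuming both models share one base
parameter dictionary. The field names (`temperature`, `tilt`), the contents
of `base_params` (including the `steps` key) and the helper `params_for`
are all hypothetical; in a real run each resulting dictionary would be
handed to the OPF to create its own model, which is not shown here.

```python
import copy

# Hypothetical shared settings; a real dict would come from a swarm or be hand-written.
base_params = {
    "inferenceType": "TemporalMultiStep",
    "steps": [1, 5],
    "inputFields": ["timestamp", "temperature", "tilt"],
}


def params_for(predicted_field, base=base_params):
    """Copy the shared settings and set the single field this model predicts."""
    if predicted_field not in base["inputFields"]:
        raise ValueError("unknown field: %s" % predicted_field)
    params = copy.deepcopy(base)  # keep the models independent of each other
    params["predictedField"] = predicted_field
    return params


# One near-identical configuration per predicted field.
configs = {name: params_for(name) for name in ("temperature", "tilt")}

for name in sorted(configs):
    print(name, "->", configs[name]["predictedField"])
```

Both configurations see the same input fields; only the predicted field
differs, so each model's classifier deals with exactly one output stream.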

> Another question. As an intermediate phase, does the classifier create
> a bit sequence similar to the input bit sequence shown above that I
> could look at? I.e. would it produce a slider control of 0s and 1s
> which can be interpreted as a number?

No, I don't think it does (other Numenta engrs should correct me here if
I am wrong). All you'll get from the classifier is a category string or
scalar value. Internally, it is dealing with bit sequences, but you
aren't exposed to that. What are you going to try to do with the bit
sequence you're looking for?

By the way, you might like this conversation:
http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2014-August/004457.html

---------
Matt Taylor
OS Community Flag-Bearer
Numenta


On Fri, Aug 15, 2014 at 8:21 AM, John Blackburn
<[email protected]> wrote:
> Thanks very much, Fergal,
>
> I think before using the anomaly detection facilities of NuPIC I would
> like to just consider inference and see how well NuPIC can do at
> predicting the bulk of the data where no anomalies are expected. I
> therefore intend to use 'inferenceType': 'TemporalMultiStep' at first
> and I would like to get predictions for several quantities. For
> instance, if I input both temperature and tilt (consider just one time
> sequence each), I would like temperature(t+1) AND tilt(t+1) to be
> output. To make things more concrete, here is the result I get when
> running nupic for 1 timestep using model.run(...). [here I am adapting
> the run.py file in the one_gym tutorial]
>
> result =  ModelResult(  predictionNumber=106
>         rawInput={'timestamp': datetime.datetime(2010, 7, 6, 10, 0), 'kw_energy_consumption': 44.2}
>         sensorInput=SensorInput(        dataRow=(44.2,)
>         dataDict={'timestamp': datetime.datetime(2010, 7, 6, 10, 0), 'kw_energy_consumption': 44.2}
>         dataEncodings=[array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
>         1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
>         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], dtype=float32)]
>         sequenceReset=0.0
>         category=0
> )
>         inferences={'multiStepPredictions': {5: {36.6: 0.32600618332119363,
>         40.7: 0.0194778517125032, 42.2: 0.058716832538386846,
>         39.6: 0.21727291982498376, 36.2: 0.017515548091618965,
>         44.099999999999994: 0.024999999999999991,
>         10.7: 0.081172572787322539, 5.420999999999999: 0.23697379692615872}},
>         'multiStepBestPredictions': {5: 36.6}}
>         metrics=None
>         predictedFieldIdx=0
>         predictedFieldName=kw_energy_consumption
> )
>
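The `inferences` entry in the result above is an ordinary dictionary, so
the prediction can be read straight out of it. A small sketch using the
values printed in that result; the variable names are illustrative and
nothing here depends on NuPIC itself.

```python
# Values copied from the ModelResult printed above (5-step-ahead predictions).
inferences = {
    "multiStepPredictions": {
        5: {
            36.6: 0.32600618332119363,
            40.7: 0.0194778517125032,
            42.2: 0.058716832538386846,
            39.6: 0.21727291982498376,
            36.2: 0.017515548091618965,
            44.099999999999994: 0.024999999999999991,
            10.7: 0.081172572787322539,
            5.420999999999999: 0.23697379692615872,
        }
    },
    "multiStepBestPredictions": {5: 36.6},
}

steps = 5
distribution = inferences["multiStepPredictions"][steps]

# The most likely value is the key with the largest probability.
most_likely = max(distribution, key=distribution.get)

# Candidates ranked from most to least likely.
ranked = sorted(distribution.items(), key=lambda kv: kv[1], reverse=True)

print("most likely value:", most_likely)
print("reported best prediction:", inferences["multiStepBestPredictions"][steps])
```

The keys are candidate values of the predicted field and the numbers are
their probabilities, so there is no bit pattern to decode at this stage.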
>
> Here I can see how the single stream of energy consumption data for
> hotgym has been converted to a binary sequence containing mostly 0s. I
> understand this is like a "slider control" so if a high number is
> input the 1s are mostly to the right. So to enter temperature and
> tilt, the bit sequence would be a concatenation of two such bitmaps,
> right? My question then is how do I extract the temperature(t+1) and
> tilt(t+1) as separate numbers? How can the classifier make sense of
> this?
>
> Where it says inferences above it lists the +5 predictions for only
> one time series, namely the one specified as predictedFieldName. Can
> this parameter become a list?
>
> If I can only output one at a time that is ok too, as long as the
> classifier can disambiguate the two streams...?
>
> Another question. As an intermediate phase, does the classifier create
> a bit sequence similar to the input bit sequence shown above that I
> could look at? I.e. would it produce a slider control of 0s and 1s
> which can be interpreted as a number?
>
> Many thanks for your help,
>
> John.
>
>
> On Fri, Aug 15, 2014 at 11:15 AM, Fergal Byrne 
> <[email protected]>
> wrote:
>>
>> Hi John,
>>
>> No, NuPIC is great at looking at multiple fields of data and
>> extracting both the per-field structure and the inter-field
>> structure, but in practice it makes sense to proceed step-by-step
>> from 18 single-metric models, up to pairs, and so on, and thus
>> discover as you go how best to feed data in. With 18 metrics, the
>> power set of combinations is very large (2^18 = 262,144 subsets), and
>> most of these will be useless (or at best marginal), so you add fields
>> one at a time to models which already lead the list of models,
>> neglecting the ones which are failing to match your events.
>>
>> It's almost impossible statistically that the structure is evenly 
>> distributed across all your metrics, and much more likely that the 
>> most interesting inputs will be single fields, pairs or triplets of 
>> fields. If you have a strong intuition (or some evidence) that one 
>> pair of fields - such as temperature and tilt - is correlated, then 
>> these should be at the top of your list when you get up to pairs of
>> metrics.
>>
>> Combining fields is simply a matter of concatenating the encodings 
>> for each field into a larger bit array (this happens internally in 
>> OPF). See the hotgym example for how to code it. Each column will 
>> sample from a subset of all bits, so NuPIC will identify correlative 
>> patterns automatically. The rate at which it does this will depend on
>> the "density" of structure in the entire input. Giving the system a
>> combination of all 18 metrics will work, but will do so very slowly,
>> because much of the input data (coming from irrelevant metrics) will
>> not contribute to the recognition or the anomaly detection. On the
>> other hand, treating each single metric as if it were the only input
>> will help identify (to first order) which metrics contribute the most
>> to solving the problem.
>>
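To make the concatenation concrete, here is a toy sketch. `encode_scalar`
is a simplified stand-in for a scalar encoder (a block of `w` ones that
slides with the value), not NuPIC's actual implementation, and the field
ranges, `n` and `w` are made-up numbers. It shows that each field keeps
its own slice of the combined bit array.

```python
def encode_scalar(value, min_val, max_val, n=20, w=5):
    """Toy 'slider' encoding: n bits with a run of w ones placed by value."""
    value = min(max(value, min_val), max_val)            # clip to the range
    fraction = (value - min_val) / float(max_val - min_val)
    start = int(round(fraction * (n - w)))               # leftmost active bit
    return [1 if start <= i < start + w else 0 for i in range(n)]


def encode_record(record, field_specs):
    """Concatenate per-field encodings; also return each field's bit slice."""
    bits, slices = [], {}
    for name, (min_val, max_val) in field_specs:
        encoded = encode_scalar(record[name], min_val, max_val)
        slices[name] = (len(bits), len(bits) + len(encoded))
        bits.extend(encoded)
    return bits, slices


# Hypothetical fields and ranges.
field_specs = [("temperature", (-10.0, 40.0)), ("tilt", (-1.0, 1.0))]
bits, slices = encode_record({"temperature": 40.0, "tilt": -1.0}, field_specs)

print("".join(str(b) for b in bits))
print(slices)
```

With the maximum temperature and the minimum tilt, the ones sit at the
right-hand end of the first slice and the left-hand end of the second.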
>> My recommendation is to follow the procedure for identifying "unusual
>> events" using the likelihood module to filter anomaly detection as
>> outlined by Subutai and Matt. You're looking for good matches between
>> known disturbances and the output of signals from the likelihood
>> module (in Matt's talk he identifies the correlation between changes
>> in the music and "red zones" in the likelihood plot). Go through this
>> process for each single metric, and choose the top several metrics to
>> "breed" your generation of paired metrics. If you get an improved
>> correlation, add the best to your gene pool and iterate. Terminate
>> when you stop improving the model, or when you get tired of seeking
>> the last 1%!
>>
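The "breed and iterate" procedure amounts to a greedy forward search over
field combinations. A sketch under stated assumptions: `score` is a
hypothetical callback standing in for "train a model on these fields and
measure how well its likelihood alarms line up with the known
interventions" (that part is the real work and is not shown), and the
sensor names and toy scores at the bottom are invented.

```python
def greedy_field_search(fields, score, keep=3, min_gain=1e-6):
    """Grow field combinations one field at a time, keeping the best few."""
    # Generation 1: every single-field model, best first.
    pool = sorted(((score((f,)), (f,)) for f in fields), reverse=True)[:keep]
    best_score, best_combo = pool[0]

    while True:
        # Breed: extend each surviving combination by one unused field.
        candidates = {}
        for _, combo in pool:
            for f in fields:
                if f not in combo:
                    new_combo = tuple(sorted(combo + (f,)))
                    if new_combo not in candidates:
                        candidates[new_combo] = score(new_combo)
        if not candidates:
            break
        ranked = sorted(((s, c) for c, s in candidates.items()), reverse=True)
        top_score, top_combo = ranked[0]
        if top_score - best_score < min_gain:
            break  # no real improvement: stop
        best_score, best_combo = top_score, top_combo
        pool = ranked[:keep]

    return best_combo, best_score


# Invented toy scoring: 'tilt1' and 'temp1' are informative together,
# and every extra field costs a little.
def toy_score(combo):
    value = 0.0
    if "tilt1" in combo:
        value += 0.5
    if "temp1" in combo:
        value += 0.2
    if "tilt1" in combo and "temp1" in combo:
        value += 0.2
    return value - 0.05 * len(combo)


best_combo, best_score = greedy_field_search(
    ["temp1", "temp2", "tilt1", "tilt2"], toy_score
)
print(best_combo, round(best_score, 2))
```

The `keep` argument plays the role of the "gene pool" size, and `min_gain`
is the point at which chasing the last 1% stops being worthwhile.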
>> Regards,
>>
>> Fergal Byrne
>>
>>
>>
>> On Fri, Aug 15, 2014 at 10:42 AM, John Blackburn 
>> <[email protected]> wrote:
>>>
>>> Dear Fergal and Ian,
>>>
>>> Thanks very much for your replies on this. Are you saying it is not 
>>> possible for NuPIC to take in multiple time series and predict 
>>> multiple time series? As I understand it, you are advising me to 
>>> input only one of the time series e.g. the first tilt sensor. 
>>> However, in my system there is a strong correlation between the 
>>> temperature and the tilt so it would be wrong for NuPIC to be 
>>> unaware of the temperature data while predicting tilt. Is it 
>>> possible for NuPIC to account for spatial correlations between data
>>> sets also?
>>>
>>> I could presumably give it all the data as a bitmap but then how
>>> would I extract one of the data streams (e.g. tilt 1) without getting
>>> it mixed up with the other data? It would be useful to have some more
>>> documentation on what the decoder does and how to use it. Is any
>>> available?
>>>
>>> John.
>>>
>>>
>>> On Thu, Aug 14, 2014 at 12:30 PM, Fergal Byrne 
>>> <[email protected]> wrote:
>>>>
>>>> Hi John,
>>>>
>>>> I agree with Ian: the first thing to do is to create a separate
>>>> model which learns the spatiotemporal characteristics of each input
>>>> metric. This will give you a picture of how well each metric
>>>> behaves as a measure of the anomalies in your bridge's lifecycle.
>>>> Experience with Grok (which does only this model-per-metric regime)
>>>> on numerous systems shows that this is often enough, in that a
>>>> single high anomaly likelihood score among all the metrics is
>>>> enough to identify an event worthy of attention, and a second or
>>>> third blip on other metrics will confirm it.
>>>>
>>>> It's important to use the likelihood score first, as it will filter
>>>> out many perfectly normal events which your system produces, and
>>>> which might frequently cause high anomaly scores from the raw
>>>> predictions. If you can confirm that you are getting good
>>>> correlations between your known events and likelihood alarms on one
>>>> or more metrics, this will allow you to identify which single
>>>> metrics and combinations are best at identifying your disturbances.
>>>>
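To illustrate why a likelihood score filters out routinely noisy
behaviour, here is a deliberately simplified sketch: it models recent raw
anomaly scores as a Gaussian and reports how unusual the newest score is
relative to that history. This is a stand-in written for illustration,
not the code in NuPIC's likelihood module; the class name, window length
and example scores are all made up.

```python
import math
from collections import deque


class ToyAnomalyLikelihood(object):
    """Rolling Gaussian model of raw anomaly scores (simplified illustration)."""

    def __init__(self, window=100, min_history=10):
        self.history = deque(maxlen=window)
        self.min_history = min_history

    def update(self, raw_score):
        """Return a value in [0, 1]; near 1 means unusually high for this metric."""
        if len(self.history) < self.min_history:
            self.history.append(raw_score)
            return 0.5  # not enough data yet to judge
        mean = sum(self.history) / len(self.history)
        var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
        std = max(math.sqrt(var), 1e-3)  # avoid dividing by ~0
        z = (raw_score - mean) / std
        tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(score >= raw_score)
        self.history.append(raw_score)
        return 1.0 - tail


# A noisy metric: raw scores of 0.4-0.6 are normal for it, so 0.55 is
# unremarkable, while 0.95 stands out.
detector = ToyAnomalyLikelihood(window=50, min_history=10)
for i in range(40):
    detector.update(0.4 if i % 2 == 0 else 0.6)

ordinary = detector.update(0.55)
unusual = detector.update(0.95)
print(round(ordinary, 3), round(unusual, 3))
```

A fixed threshold on the raw score would fire constantly on a metric like
this one; judging each score against the metric's own recent history does
not.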
>>>> Once you've identified the clearly best metrics (A, B and C, say),
>>>> you could start adding the others (d, e, f, etc.) one at a time,
>>>> creating a set of metrics which might give you even better
>>>> correlation (e.g. Ac or Ba might be better than A or B alone).
>>>>
>>>> As Ian says, this is how the swarming algorithm works, but in this 
>>>> case the space of combinations is too large for swarming to make 
>>>> any sense. Use a depth-first approach instead by using 
>>>> single-metric models to group your metrics in quality bands. (The 
>>>> other issue with swarming is that it uses anomaly scores rather 
>>>> than likelihood scores to rank candidate choices of input fields).
>>>>
>>>> Please keep us informed about how you get on.
>>>>
>>>> Regards,
>>>>
>>>> Fergal Byrne
>>>>
>>>>
>>>> On Wed, Aug 13, 2014 at 6:05 PM, Ian Danforth 
>>>> <[email protected]>
>>>> wrote:
>>>>>
>>>>> Use separate models for each sensor, giving each model the time and
>>>>> that sensor's values.
>>>>>
>>>>> Start with two sensors and run both through the swarming process 
>>>>> and let us know what difficulties you run into.
>>>>>
>>>>> Ian
>>>>>
>>>>> On 13 Aug 2014 03:37, "John Blackburn" 
>>>>> <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> I am a researcher at the National Physical Laboratory, London and
>>>>>> am attempting to use NuPIC to model the strain and temperature
>>>>>> variations of a concrete bridge for anomaly detection. The bridge
>>>>>> has 10 temperature sensors and 8 "tilt sensors" (basically
>>>>>> strain) arranged across it. I have hourly readings for all of
>>>>>> these sensors for a 3 year period. I would like NuPIC to predict
>>>>>> all of these quantities (and keep them separate). Compared to the
>>>>>> "hotgym" example, the difference here is that there are 18
>>>>>> separate streams of data which would need to be suitably encoded
>>>>>> and decoded to make predictions of each one. I suspect the
>>>>>> decoding stage would be most difficult: from the set of cell
>>>>>> activations we need to discover 18 numbers and keep them separate.
>>>>>> The HTM should account for cross correlations between time series
>>>>>> as well as auto-correlations. I would like to consider +1 and +5
>>>>>> predictions, for example.
>>>>>>
>>>>>> During the course of the experiment, various interventions were
>>>>>> carried out at known times. These include cutting support cables,
>>>>>> removing chunks of concrete and adding heavy weights. The NN
>>>>>> should show anomalous behaviour at the time these interventions
>>>>>> were done. The system has been modelled using an Echo State
>>>>>> Network so I want to compare performance of ESN to HTM.
>>>>>>
>>>>>> So, is this task possible with NuPIC and how might I adjust the 
>>>>>> encoder, decoder to deal with multiple streams?
>>>>>>
>>>>>> Many thanks for your help,
>>>>>>
>>>>>> John Blackburn.
>>>>>>
>>>>>> _______________________________________________
>>>>>> nupic mailing list
>>>>>> [email protected]
>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Fergal Byrne, Brenter IT
>>>>
>>>> Author, Real Machine Intelligence with Clortex and NuPIC 
>>>> https://leanpub.com/realsmartmachines
>>>>
>>>> Speaking on Clortex and HTM/CLA at euroClojure Krakow, June 2014:
>>>> http://euroclojure.com/2014/
>>>> and at LambdaJam Chicago, July 2014: http://www.lambdajam.com
>>>>
>>>> http://inbits.com - Better Living through Thoughtful Technology
>>>> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>>>>
>>>> e:[email protected] t:+353 83 4214179
>>>> Join the quest for Machine Intelligence at http://numenta.org
>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>>
>>>
>>>
>>
>>
>>
>
>





run.py
Description: run.py

swarm_description.py
Description: swarm_description.py

TimeTempTilt_model_params.py
Description: TimeTempTilt_model_params.py
