You are looking for large numbers, like > 30, to indicate a real
contribution. The smaller numbers mean there is basically no
correlation.
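Roughly, each number is a percentage: the relative change in the best model's prediction error when that field is included, compared against a model that uses only the predicted field. Here is a sketch with made-up error values (the real computation lives in NuPIC's swarming code; `field_contribution` is just illustrative):

```python
# Hedged sketch: as I understand it, the "field contributions" report
# expresses, for each input field, the percent change in prediction
# error when that field is included, relative to the error of a model
# that uses only the predicted field. All numbers below are made up.

def field_contribution(base_error, error_with_field):
    """Percent improvement over the predicted-field-only baseline.

    Positive  -> adding the field reduced prediction error.
    Negative  -> adding the field made predictions worse.
    """
    return (base_error - error_with_field) / base_error * 100.0

# Hypothetical swarm results: error of the VO-only model vs. models
# that also include one sensor field each.
base_error = 0.40                                 # VO alone
errors = {"o1N_A": 0.39, "o2N_A": 0.41, "o3N_B": 0.28}

for field, err in errors.items():
    print(field, round(field_contribution(base_error, err), 1))
```

On this reading, a field at +30 cut the error by nearly a third, while a small negative float just means the field added noise rather than signal.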
---------
Matt Taylor
OS Community Flag-Bearer
Numenta


On Sun, Dec 6, 2015 at 10:32 AM, Raf <[email protected]> wrote:
> Hi, Matthew.
>>
>> Raf, if you are trying to analyze an EEG signal with HTM, many have
>> attempted this before you.
>>
>> I suggest these resources:
>> - EEGs and NuPIC (2014 Fall NuPIC Hackathon) [1]
>> - EEG Data Classification [DEMO #16] (2014 Fall NuPIC Hackathon) [2]
>> - Brain Squared - W. Wnekowicz, M. Le Borgne, P. Karashchuk, A. Della
>> Motta, J. Naulty [3]
>> - ECG + HTM - Kentaro Iizuka [4]
>
> Thanks, I'm pretty excited to look into them.
>
>> Also, regarding your swarm, I would certainly use an inference type of
>> "TemporalMultiStep". And you can remove the "datetime" field entirely
>> as long as your samples are taken at the same interval (the datetime
>> will make the swarm look for "time of day" and "day of week" type of
>> patterns, which should not exist in this dataset). And your input data
>> has far too many dimensions, IMO.
>
> This explains a lot.
>
>> And yes, the swarm you ran did not find a correlation between any of
>> the other input fields at the "VO" field, which I assume is a
>> pre-processed classification of the current state? That could mean a
>> couple different things. (1) It has all the info it needs to make a
>> prediction given VO and the history of the VO field over time, or (2)
>> it has found VO to be unpredictable.
>
>
> Yep, that's exactly what VO is ("1" = spO2 <= 97.5, "2" > 97.5). I suppose
> it's the first case then, although I think the prediction could still be
> improved (maybe it should learn from more samples).
>>
>> At the end of the swarming
>> process, some text is dumped to the screen about "field
>> contributions". That tells you exactly how important NuPIC found each
>> field when evaluating its influence on future predictions.
>>
> Ok, that's another question I wanted to ask but then forgot to :)
> I don't have the exact numbers, but I remember them pretty well.
> Almost all of them were negative floats (all between -2.0 and -7.0),
> except for three sensors which were positive numbers (between +2.0
> and +3.0). What is their unit of measurement? Is it in terms of mean
> squared error? Or something else?
>
> Thanks again!
>
> Cheers.
> Raf
>
>
> On 06/12/2015 16:40, Matthew Taylor wrote:
>>
>> Raf, if you are trying to analyze an EEG signal with HTM, many have
>> attempted this before you.
>>
>> I suggest these resources:
>> - EEGs and NuPIC (2014 Fall NuPIC Hackathon) [1]
>> - EEG Data Classification [DEMO #16] (2014 Fall NuPIC Hackathon) [2]
>> - Brain Squared - W. Wnekowicz, M. Le Borgne, P. Karashchuk, A. Della
>> Motta, J. Naulty [3]
>> - ECG + HTM - Kentaro Iizuka [4]
>>
>> I would pay attention to Marion's work. Their team successfully
>> classified EEG signals using an Open BCI device at the HTM Challenge.
>> However, this is still an active area of research and experimentation.
>>
>> Also, regarding your swarm, I would certainly use an inference type of
>> "TemporalMultiStep". And you can remove the "datetime" field entirely
>> as long as your samples are taken at the same interval (the datetime
>> will make the swarm look for "time of day" and "day of week" type of
>> patterns, which should not exist in this dataset). And your input data
>> has far too many dimensions, IMO.
>>
>> And yes, the swarm you ran did not find a correlation between any of
>> the other input fields at the "VO" field, which I assume is a
>> pre-processed classification of the current state? That could mean a
>> couple different things. (1) It has all the info it needs to make a
>> prediction given VO and the history of the VO field over time, or (2)
>> it has found VO to be unpredictable. At the end of the swarming
>> process, some text is dumped to the screen about "field
>> contributions". That tells you exactly how important NuPIC found each
>> field when evaluating its influence on future predictions.
>>
>> [1] https://www.youtube.com/watch?v=TFbNbaBtxr4
>> [2] https://www.youtube.com/watch?v=UEh48KOmkIA
>> [3] https://www.youtube.com/watch?v=Ij4StdJBxEo
>> [4] https://www.youtube.com/watch?v=gzfTZhd6X9c
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>>
>>
>> On Sat, Dec 5, 2015 at 8:37 AM, Raf <[email protected]> wrote:
>>>
>>> Hey Matthew,
>>>
>>> Thanks for your reply.
>>>
>>> Basically I made several attempts: I first used "TemporalMultiStep",
>>> trying to predict the exact spO2 value, and in another one (the one you
>>> saw) I used "NontemporalClassification" (and then I also tried
>>> "TemporalClassification"), providing three possible categories: "0" (not
>>> breathing, hopefully it should never pop up), "1" (spO2 <= 97.5) and "2"
>>> (spO2 > 98.5). I tried these hoping it would make the task easier for
>>> the HTM. You can find the dataset I used in this case in the attachment.
>>> Note that for the spO2 I used a very sensitive pulse oximeter, whilst
>>> for the thoracic low-res electrodes I relied (unfortunately) on a
>>> self-made set of active electrodes linked to a Raspberry Pi; also, the
>>> patient is 54 years old and has pneumonia.
>>>
>>> (I tried to upload the dataset to Dropbox but according to
>>> https://status.dropbox.com/ it is under maintenance. Hopefully you can
>>> see the attachments.)
>>>
>>> Cheers,
>>> Raf
>>>
>>>
>>> On 05/12/2015 16:34, Matthew Taylor wrote:
>>>
>>> Raf,
>>>
>>> It looks like the model params your swarm returned were labeled as
>>> "NontemporalClassification", which makes me think something might be
>>> wrong.
>>> Can you share the input data (a sample) and the swarm configuration you
>>> used?
>>>
>>> Thanks,
>>>
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>> On Sat, Dec 5, 2015 at 4:59 AM, cogmission (David Ray)
>>> <[email protected]> wrote:
>>>>
>>>> Hi Raf,
>>>>
>>>> Oops.
>>>>
>>>> ... and yes, both "A" and "C" are modeled ("C" being inhibitory
>>>> chemical and cellular effects) - but again, not as multiple discrete
>>>> values; more like a summation of their effect (except for inhibitory
>>>> cells, whose effects are explicitly modeled).
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>> On Sat, Dec 5, 2015 at 6:53 AM, cogmission (David Ray)
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi Raf,
>>>>>
>>>>> Welcome!  Please do not respond to this; I'm just putting in a "time
>>>>> filler" until someone more neurobiologically knowledgeable answers
>>>>> later today. But "A" is in fact being modeled by NuPIC: it is often
>>>>> stated in these circles that the synapses on distal dendrites act as
>>>>> "proximity detectors", and we model this via a summation of synapse
>>>>> "permanence" values. In addition, some portion of organic necessity
>>>>> is not abstracted in the software, as some of these concepts were
>>>>> deemed not to have a significant enough impact on the overall process
>>>>> (as you also stated). Spikes (some, not all) are, to a certain extent
>>>>> (from explanations I have overheard on this mailing list), one of
>>>>> those concepts that the organic implementation requires yet which do
>>>>> not have significance with regard to maintaining the metaphor in
>>>>> software. Some spikes are in fact being modeled as a summation
>>>>> instead of a continuous rate (I believe).
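To make the "proximity detector" point concrete, here is a minimal sketch of the idea (not NuPIC's actual API; the names and threshold values are illustrative). A synapse's permanence only determines whether it counts as connected, so its effective weight is binary, and a distal segment fires on a count of coincident active, connected synapses:

```python
# Illustrative sketch (not NuPIC's real API): in HTM, a synapse carries
# a scalar "permanence"; it counts as connected only above a threshold,
# so its effective weight is binary. A distal dendrite segment acts as
# a coincidence ("proximity") detector: it becomes active when enough
# of its connected synapses see currently active presynaptic cells.

CONNECTED_PERM = 0.5        # permanence threshold for a connected synapse
ACTIVATION_THRESHOLD = 3    # active connected synapses needed to fire

def segment_active(synapses, active_cells):
    """synapses: list of (presynaptic_cell_id, permanence) pairs."""
    hits = sum(
        1
        for cell, perm in synapses
        if perm >= CONNECTED_PERM and cell in active_cells
    )
    return hits >= ACTIVATION_THRESHOLD

segment = [(0, 0.7), (1, 0.6), (2, 0.9), (3, 0.2), (4, 0.55)]
print(segment_active(segment, {0, 1, 2}))   # enough coincident input
print(segment_active(segment, {0, 3}))      # too few / too weak synapses
```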
>>>>>
>>>>> Much care has been taken (by leading-edge neuroscientific
>>>>> researchers at the Redwood Institute as well as Numenta researchers)
>>>>> to ascertain the portion of explicit biological translation which is
>>>>> necessary to implement the overall algorithms, but as always (as I
>>>>> have observed), the community is open to theoretical examination of
>>>>> its assumptions in this regard.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> On Sat, Dec 5, 2015 at 6:04 AM, Raf <[email protected]> wrote:
>>>>>>
>>>>>> Hi everybody.
>>>>>>
>>>>>> I've got a couple of questions for you.
>>>>>>
>>>>>> I'm a med student and I'm new to NuPIC.
>>>>>> I'm very impressed by what Numenta is achieving, and I do believe
>>>>>> that in the long run your work will be compared to the discovery of
>>>>>> penicillin :)
>>>>>>
>>>>>> My project - for now - is to produce a model able to detect
>>>>>> neurological/psychiatric issues through simple EEG wave recognition;
>>>>>> I'm an intern at the Neurosurgery dept., and my final goal would be
>>>>>> to use pattern recognition as an intraoperative tool to help
>>>>>> surgeons distinguish between healthy tissue and cancerous cells with
>>>>>> just a continuous EEG/EMG data feed.
>>>>>>
>>>>>> In order to get familiar with the machine learning world, and not
>>>>>> having good enough datasets at my disposal, for the past couple of
>>>>>> years I have been using financial data (notoriously difficult, if
>>>>>> not impossible, to predict) as a sandbox environment to experiment
>>>>>> with NNs. I had encouraging results.
>>>>>>
>>>>>> I'm still learning the Python code behind NuPIC, and I have two
>>>>>> questions for you.
>>>>>>
>>>>>> 1 - FIRST QUESTION
>>>>>> In the paper "Hierarchical Temporal Memory" (version 0.2.1, 2011),
>>>>>> I read that: "[...] The predominant view (especially in regards to
>>>>>> the neo-cortex) is that the rate of spikes is what matters.
>>>>>> Therefore the output of a cell can be viewed as a scalar value". I'm
>>>>>> aware that transforming a complex biological system such as the
>>>>>> neocortex into computer software necessarily leads to
>>>>>> simplifications. As we know, from a biological point of view the
>>>>>> transmission of the signal is subject to numerous variables, and I
>>>>>> wonder how implementing them could improve the software's
>>>>>> predictions.
>>>>>> The variables I'd like to focus on are:
>>>>>>      -A) The propagation of the action potential along the
>>>>>> membranes follows an exponential loss distribution due to the
>>>>>> resistances met along the axons. For the HTM model (where the
>>>>>> synapses express binary weights) this could mean that the more
>>>>>> distant two connected cells are, the weaker their shared signal
>>>>>> becomes (of course directly depending on "where" the dendrite
>>>>>> segments are relative to their starting point - this would probably
>>>>>> require the introduction of the physical concept of space into HTM).
>>>>>>      -B) The signal propagation speed is directly proportional to
>>>>>> the diameter of the axon carrying it; this appears to be valid both
>>>>>> in unmyelinated and myelinated axons (though it is a more obvious
>>>>>> phenomenon in the latter type). This could have a huge impact for
>>>>>> HTM: if bigger axons (= weight+) burst temporally before other
>>>>>> smaller ones towards the same target dendrite, they can also
>>>>>> temporarily inhibit that targeted cell (causing a later refractory
>>>>>> period), therefore filtering the signal.
>>>>>>      -C) Receptors, neurotransmitters, electrical and chemical
>>>>>> synapses, EPSP (excitatory postsynaptic potential) and IPSP
>>>>>> (inhibitory postsynaptic potential). This is an enormous chapter.
>>>>>> Current NN systems and, if my understanding is correct, also NuPIC
>>>>>> treat synapses as if they were all electrical synapses. In reality,
>>>>>> according to the current consensus, the mammalian brain uses
>>>>>> electrical synapses mostly to "synchronize" vast areas of the
>>>>>> neocortex (I'm deliberately omitting other findings because they are
>>>>>> not relevant to my point). Although electrical synapses demonstrate
>>>>>> various advantages when compared to their chemical equivalents
>>>>>> (speed, resistance, fatigue, etc.), it appears that the complexity
>>>>>> and the fine filtering/modulation of the signals inside the PFC is
>>>>>> due to the presence of numerous other elements present in chemical
>>>>>> synapses: neurotransmitters (such as acetylcholine, dopamine, GABA,
>>>>>> norepinephrine...); pre-synaptic, synaptic-gap and post-synaptic
>>>>>> features; different receptors; etc. Each of these elements can
>>>>>> strongly influence the signal and the overall "learning" process.
>>>>>> For example: even if an axon's "weight" is large and it is bursting
>>>>>> copiously, the above-mentioned elements can suppress its signal.
>>>>>>
>>>>>> My first question is: are points A and C implemented in NuPIC? Do
>>>>>> you reckon that it could be useful to increase the complexity of
>>>>>> NuPIC by also implementing a chemical synapse "class" with the
>>>>>> elements described in point C?
>>>>>>
>>>>>>
>>>>>> 2 - SECOND QUESTION
>>>>>> I'm trying to run a couple of models. This is an extract from a OPF I
>>>>>> created through swarming.
>>>>>>
>>>>>>   'model': 'CLA',
>>>>>>   'modelParams': {'anomalyParams': {u'anomalyCacheRecords': None,
>>>>>>                                     u'autoDetectThreshold': None,
>>>>>>                                     u'autoDetectWaitRecords': None},
>>>>>>                   'clParams': {'alpha': 0.06173462582232023,
>>>>>>                                'clVerbosity': 0,
>>>>>>                                'regionName': 'CLAClassifierRegion',
>>>>>>                                'steps': '0'},
>>>>>>                   'inferenceType': 'NontemporalClassification',
>>>>>>                   'sensorParams': {'encoders': {u'DATE_dayOfWeek': None,
>>>>>>                                                 u'DATE_timeOfDay': {'fieldname': 'DATE',
>>>>>>                                                                     'name': 'DATE',
>>>>>>                                                                     'timeOfDay': (21, 2.2537623685060675),
>>>>>>                                                                     'type': 'DateEncoder'},
>>>>>>                                                 u'DATE_weekend': None,
>>>>>>                                                 '_classifierInput': {'classifierOnly': True,
>>>>>>                                                                      'clipInput': True,
>>>>>>                                                                      'fieldname': 'VO',
>>>>>>                                                                      'maxval': 2.0,
>>>>>>                                                                      'minval': 0.0,
>>>>>>                                                                      'n': 449,
>>>>>>                                                                      'name': '_classifierInput',
>>>>>>                                                                      'type': 'ScalarEncoder',
>>>>>>                                                                      'w': 21},
>>>>>>                                                 u'o10N_A': None,
>>>>>>                                                 u'o11N_A': None,
>>>>>>                                                 u'o12N_A': None,
>>>>>>                                                 u'o13N_A': None,
>>>>>>                                                 u'o14N_A': None,
>>>>>>                                                 u'o15N_A': None,
>>>>>>                                                 u'o1N_A': None,
>>>>>>                                                 u'o1N_B': None,
>>>>>>                                                 u'o2N_A': None,
>>>>>>                                                 u'o2N_B': None,
>>>>>>                                                 u'o3N_A': None,
>>>>>>                                                 u'o3N_B': None,
>>>>>>                                                 u'o4N_A': None,
>>>>>>                                                 u'o4N_B': None,
>>>>>>                                                 u'o5N_A': None,
>>>>>>                                                 u'o5N_B': None,
>>>>>>                                                 u'o6N_A': None,
>>>>>>                                                 u'o6N_B': None,
>>>>>>                                                 u'o7N_A': None,
>>>>>>                                                 u'o7N_B': None,
>>>>>>                                                 u'o8N_A': None,
>>>>>>                                                 u'o8N_B': None,
>>>>>>                                                 u'o9N_A': None,
>>>>>>                                                 u'o9N_B': None},
>>>>>>                                    'sensorAutoReset': None,
>>>>>>                                    'verbosity': 0},
>>>>>>                   'spEnable': False,
>>>>>>                   'spParams': {'columnCount': 2048,
>>>>>>                                'globalInhibition': 1,
>>>>>>                                'inputWidth': 0,
>>>>>>                                'maxBoost': 2.0,
>>>>>>                                'numActiveColumnsPerInhArea': 40,
>>>>>>                                'potentialPct': 0.8,
>>>>>>                                'seed': 1956,
>>>>>>                                'spVerbosity': 0,
>>>>>>                                'spatialImp': 'cpp',
>>>>>>                                'synPermActiveInc': 0.05,
>>>>>>                                'synPermConnected': 0.1,
>>>>>>                                'synPermInactiveDec': 0.0005},
>>>>>>                   'tpEnable': False,
>>>>>>                   'tpParams': {'activationThreshold': 16,
>>>>>>                                'cellsPerColumn': 32,
>>>>>>                                'columnCount': 2048,
>>>>>>                                'globalDecay': 0.0,
>>>>>>                                'initialPerm': 0.21,
>>>>>>                                'inputWidth': 2048,
>>>>>>                                'maxAge': 0,
>>>>>>                                'maxSegmentsPerCell': 128,
>>>>>>                                'maxSynapsesPerSegment': 32,
>>>>>>                                'minThreshold': 12,
>>>>>>                                'newSynapseCount': 20,
>>>>>>                                'outputType': 'normal',
>>>>>>                                'pamLength': 1,
>>>>>>                                'permanenceDec': 0.1,
>>>>>>                                'permanenceInc': 0.1,
>>>>>>                                'seed': 1960,
>>>>>>                                'temporalImp': 'cpp',
>>>>>>                                'verbosity': 0},
>>>>>>                   'trainSPNetOnlyIfRequested': False},
>>>>>>
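For illustration, a swarm description following the advice in this thread (TemporalMultiStep inference, no datetime field, fewer input dimensions) might look like the sketch below. Only the field names ("VO", "o1N_A", ...) come from this thread; the file path, swarm size, and the choice of which sensors to keep are hypothetical, not a verified working configuration:

```python
# Sketch of a swarm description applying the advice in this thread:
# TemporalMultiStep inference, no datetime field, and a pared-down set
# of input fields. Field names come from Raf's dataset; everything else
# (file path, swarm size, sensor selection) is illustrative.
SWARM_DESCRIPTION = {
    "includedFields": [
        {"fieldName": "VO", "fieldType": "float",
         "minValue": 0.0, "maxValue": 2.0},
        # keep only the sensors you believe are relevant, e.g.:
        {"fieldName": "o1N_A", "fieldType": "float"},
        {"fieldName": "o2N_A", "fieldType": "float"},
    ],
    "streamDef": {
        "info": "spO2 prediction",
        "version": 1,
        "streams": [{
            "info": "thoracic electrodes + pulse oximeter",
            "source": "file://spo2_data.csv",    # hypothetical path
            "columns": ["*"],
        }],
    },
    "inferenceType": "TemporalMultiStep",        # not *Classification
    "inferenceArgs": {
        "predictionSteps": [1],
        "predictedField": "VO",
    },
    "swarmSize": "medium",
}
```

With no datetime column, the swarm cannot waste effort on nonexistent time-of-day patterns, and fewer included fields shrinks the search space it has to explore.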
>>>>>> If I understood correctly, all the inputs (from o1N_A to o15N_A)
>>>>>> were discarded by the swarming process. I've also run a larger
>>>>>> swarm, but they are still discarded. Unfortunately, I'm sure that at
>>>>>> least a good 60% of them are relevant sensors. How can I improve the
>>>>>> swarming? Am I doing something wrong? (The sensors are outputs from
>>>>>> thoracic low-res electrodes; the predicted field is "VO", which
>>>>>> represents the amount of spO2 present in the blood stream at the
>>>>>> moment - the idea is to predict the oxygen saturation from the
>>>>>> respiratory act.)
>>>>>>
>>>>>> Thanks for your replies. Sorry for my English (I'm Italian).
>>>>>>
>>>>>> Raf
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Raf
>>>>>>
>>>>>> www.madraf.com/algotrading
>>>>>> reply to: [email protected]
>>>>>> skype: algotrading_madraf
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> With kind regards,
>>>>>
>>>>> David Ray
>>>>> Java Solutions Architect
>>>>>
>>>>> Cortical.io
>>>>> Sponsor of:  HTM.java
>>>>>
>>>>> [email protected]
>>>>> http://cortical.io
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> With kind regards,
>>>>
>>>> David Ray
>>>> Java Solutions Architect
>>>>
>>>> Cortical.io
>>>> Sponsor of:  HTM.java
>>>>
>>>> [email protected]
>>>> http://cortical.io
>>>
>>>
>>>
>>> --
>>> Raf
>>>
>>> www.madraf.com/algotrading
>>> reply to: [email protected]
>>> skype: algotrading_madraf
>>
>>
>
> --
> Raf
>
> www.madraf.com/algotrading
> reply to: [email protected]
> skype: algotrading_madraf
>
>
