Hi, Matthew.
Raf, if you are trying to analyze an EEG signal with HTM, many have attempted this before you. I suggest these resources:
- EEGs and NuPIC (2014 Fall NuPIC Hackathon) [1]
- EEG Data Classification [DEMO #16] (2014 Fall NuPIC Hackathon) [2]
- Brain Squared - W. Wnekowicz, M. Le Borgne, P. Karashchuk, A. Della Motta, J. Naulty [3]
- ECG + HTM - Kentaro Iizuka [4]
Thanks, I'm pretty excited to look into these.
Also, regarding your swarm, I would certainly use an inference type of "TemporalMultiStep". And you can remove the "datetime" field entirely as long as your samples are taken at the same interval (the datetime will make the swarm look for "time of day" and "day of week" type of patterns, which should not exist in this dataset). And your input data has far too many dimensions, IMO.
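A minimal sketch of a swarm description along those lines - "VO" comes from the thread, but everything else here (field ranges, swarm size, prediction steps, the empty stream definition) is an assumed placeholder, not a recommendation from the thread:

```python
# Hypothetical swarm description: "TemporalMultiStep" inference,
# no datetime field (samples are evenly spaced), and only a few
# input fields instead of dozens.
SWARM_DESCRIPTION = {
    "includedFields": [
        # Only the predicted field is listed here; individual electrode
        # fields could be added back one at a time to test their value.
        {"fieldName": "VO", "fieldType": "float",
         "minValue": 0.0, "maxValue": 2.0},
    ],
    "inferenceType": "TemporalMultiStep",
    "inferenceArgs": {
        "predictedField": "VO",
        "predictionSteps": [1],   # predict one record ahead
    },
    "streamDef": {},              # stream/CSV source definition omitted
    "swarmSize": "medium",
}
```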
This explains a lot.
And yes, the swarm you ran did not find a correlation between any of the other input fields and the "VO" field, which I assume is a pre-processed classification of the current state? That could mean a couple of different things: (1) it has all the info it needs to make a prediction given VO and the history of the VO field over time, or (2) it has found VO to be unpredictable.
Yep, that's exactly what VO is ("1" = spO2 <= 97.5, "2" = spO2 > 97.5). I suppose it is then the first case, although I think the prediction could still be improved (maybe it should learn from more samples).
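That labeling rule can be sketched as follows (the function name is mine; the "0" / not-breathing category comes from the dataset description further down the thread):

```python
def vo_label(spo2, breathing=True):
    """Map an SpO2 reading to the dataset's "VO" categories.

    "0" = not breathing, "1" = spO2 <= 97.5, "2" = spO2 > 97.5.
    """
    if not breathing:
        return 0
    return 1 if spo2 <= 97.5 else 2

print(vo_label(96.0))  # -> 1
print(vo_label(99.0))  # -> 2
```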
At the end of the swarming process, some text is dumped to the screen about "field contributions". That tells you exactly how important NuPIC found each field when evaluating its influence on future predictions.
Ok, that's exactly another question I wanted to ask, but then I forgot to :) I don't have the exact numbers, but I remember them pretty well. Almost all of them were negative floats (all between -2.0 and -7.0), except for three sensors which were positive numbers (between +2.0 and +3.0). What is their measuring unit? Is it in terms of mean squared error? Or something else?
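One plausible reading (a sketch under that assumption, not NuPIC's confirmed formula): each score is the percent change in prediction error when the field is included, which would match the sign convention described above - positive values mean the field helped, negative values mean it hurt:

```python
def field_contribution(baseline_error, error_with_field):
    """Percent improvement in prediction error attributable to one field.

    Positive -> adding the field reduced the error (it helped);
    negative -> adding the field increased the error (it hurt).
    Illustration only; not necessarily NuPIC's exact computation.
    """
    return 100.0 * (baseline_error - error_with_field) / baseline_error

# A field that cuts the error from 0.40 to 0.39 contributes +2.5 percent.
print(round(field_contribution(0.40, 0.39), 2))  # -> 2.5
```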
Thanks again! Cheers. Raf On 06/12/2015 16:40, Matthew Taylor wrote:
I would pay attention to Marion's work. Their team successfully classified EEG signals using an OpenBCI device at the HTM Challenge. However, this is still an active area of research and experimentation.

[1] https://www.youtube.com/watch?v=TFbNbaBtxr4
[2] https://www.youtube.com/watch?v=UEh48KOmkIA
[3] https://www.youtube.com/watch?v=Ij4StdJBxEo
[4] https://www.youtube.com/watch?v=gzfTZhd6X9c

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Sat, Dec 5, 2015 at 8:37 AM, Raf <[email protected]> wrote:

Hey Matthew, Thanks for your reply.
Basically I made different attempts: I first used "TemporalMultiStep", trying to predict the exact spO2 value, and in another one (the one you saw) I used "NontemporalClassification" (and then I also tried "TemporalClassification"), providing three possible categories: "0" (not breathing, hopefully it should never pop up), "1" (spO2 <= 97.5) and "2" (spO2 > 98.5). I also tried these hoping it would make the task easier for the HTM. You can find the dataset I used in this case in the attachment. Notice that for the spO2 I used a very sensitive pulse oximeter, whilst for the thoracic low-res electrodes I relied (unfortunately) on a self-made set of active electrodes linked to a Raspberry Pi; also, the patient is 54 years old and has pneumonia. (I tried to upload the dataset to Dropbox, but according to https://status.dropbox.com/ it is under maintenance. Hopefully you can see the attachments.) Cheers, Raf

On 05/12/2015 16:34, Matthew Taylor wrote:

Raf, It looks like the model params your swarm returned were labeled as "NontemporalClassification", which makes me think something might be wrong. Can you share the input data (a sample) and the swarm configuration you used? Thanks,
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Sat, Dec 5, 2015 at 4:59 AM, cogmission (David Ray) <[email protected]> wrote:

Hi Raf, Oops. ... and yes, both "A" and "C" are modeled ("C" being inhibitory chemical and cellular effects) - but again not as the multiple discrete values - more like a summation of their effect (except for inhibitory cells, whose effects are explicitly modeled). Cheers, David

On Sat, Dec 5, 2015 at 6:53 AM, cogmission (David Ray) <[email protected]> wrote:

Hi Raf, Welcome!
Please do not respond to this, I'm just putting in a "time filler" until someone more neurobiologically knowledgeable answers this later today, but "A" is in fact being modeled by NuPIC: it is often stated in these circles that the synapses on distal dendrites act as "proximity detectors" - and we model this via a summation of synapse "permanence" values. In addition, there is some portion of the organic machinery not being abstracted in the software, as some of these concepts were deemed not to have a significant enough impact on the overall process (as you also stated). Spikes (some, not all), to a certain extent (from explanations I have overheard on this mailing list), are one of those concepts that the organic implementation requires yet which are not significant to maintaining the metaphor in software. Some spikes are in fact being modeled as a summation instead of a continuous rate (I believe). Much care (by leading-edge neuroscientific researchers at the Redwood Institute as well as Numenta researchers) has been taken to ascertain the portion of explicit biological translation which is necessary to implement the overall algorithms, but as always (as I have observed), the community is open to theoretical examination of their assumptions in this regard. Cheers, David

On Sat, Dec 5, 2015 at 6:04 AM, Raf <[email protected]> wrote:

Hi everybody. I've got a couple of questions for you. I'm a med student and I'm new to NuPIC. I'm very impressed by what Numenta is achieving and I do believe that in the long run your work will be compared to the discovery of penicillin :) My project - for now - is to produce a model able to detect neurological/psychiatric issues through simple EEG wave recognition; I'm an intern at the Neurosurgery dept. and my final goal would be to use pattern recognition as an intraoperative tool to help surgeons distinguish between healthy tissue and cancerous cells with just a continuous EEG/EMG data feed.
In order to get familiar with the machine learning world, and lacking good enough datasets, for a couple of years I have used financial data (notoriously difficult, if not impossible, to predict) as a sandbox environment to experiment with NNs. I had encouraging results. I'm still learning the Python code behind NuPIC and I have two questions for you.

1 - FIRST QUESTION

In the paper "Hierarchical Temporal Memory" (version 0.2.1, 2011), I read that: "[...] The predominant view (especially in regards to the neo-cortex) is that the rate of spikes is what matters. Therefore the output of a cell can be viewed as a scalar value". I'm aware that transforming a complex biological system such as the neocortex into computer software necessarily leads to simplifications. As we know, from a biological point of view the transmission of the signal is subject to numerous variables, and I wonder how their implementation could improve the software's predictions. The variables I'd like to focus on are:

-A) The propagation of the action potential along the membranes follows an exponential loss distribution due to the resistances met along the axons. For the HTM model (where the synapses express binary weights) this could mean that the more distant two connected cells are, the weaker their shared signal becomes (of course directly depending on "where" the dendrite segments are relative to their starting point - this would probably require introducing the physical concept of space into HTM).

-B) The signal propagation speed is directly proportional to the diameter of the axon carrying it; this appears to be valid both in unmyelinated and myelinated axons (though it is a more obvious phenomenon in the latter type). This could have a huge impact on HTM: if bigger axons (= weight+) burst temporally before other smaller ones aimed at the same target dendrite, they can also temporarily inhibit that targeted cell (causing a later refractory period), therefore filtering the signal.
-C) Receptors, neurotransmitters, electrical and chemical synapses, EPSP (excitatory postsynaptic potential) and IPSP (inhibitory postsynaptic potential). This is an enormous chapter. Current NN systems and, if my understanding is correct, also NuPIC treat synapses as if they were all electrical synapses. In reality, according to the current consensus, the mammalian brain uses electrical synapses mostly to "synchronize" vast areas of the neocortex (I'm deliberately omitting other findings because they are not relevant to my point). Although electrical synapses demonstrate various advantages when compared to their chemical equivalents (speed, resistance, fatigue, etc.), it appears that the complexity and the fine filtering/modulation of the signals inside the PFC is due to the presence of numerous other elements present in chemical synapses: neurotransmitters (such as acetylcholine, dopamine, GABA, norepinephrine...); pre-synaptic, synaptic-gap and post-synaptic features; different receptors; etc. Each of these elements can strongly influence the signal and the overall "learning" process. For example: although an axon's "weight" is big and it is bursting copiously, the above-mentioned elements can suppress its signal.

My first question is: are points A and C implemented in NuPIC? Do you reckon that it could be useful to increase the complexity of NuPIC by also implementing a chemical synapse "class" with the elements described in point C?

2 - SECOND QUESTION

I'm trying to run a couple of models. This is an extract from an OPF model I created through swarming:
    'model': 'CLA',
    'modelParams': {
        'anomalyParams': {u'anomalyCacheRecords': None,
                          u'autoDetectThreshold': None,
                          u'autoDetectWaitRecords': None},
        'clParams': {'alpha': 0.06173462582232023,
                     'clVerbosity': 0,
                     'regionName': 'CLAClassifierRegion',
                     'steps': '0'},
        'inferenceType': 'NontemporalClassification',
        'sensorParams': {
            'encoders': {
                u'DATE_dayOfWeek': None,
                u'DATE_timeOfDay': {'fieldname': 'DATE',
                                    'name': 'DATE',
                                    'timeOfDay': (21, 2.2537623685060675),
                                    'type': 'DateEncoder'},
                u'DATE_weekend': None,
                '_classifierInput': {'classifierOnly': True,
                                     'clipInput': True,
                                     'fieldname': 'VO',
                                     'maxval': 2.0,
                                     'minval': 0.0,
                                     'n': 449,
                                     'name': '_classifierInput',
                                     'type': 'ScalarEncoder',
                                     'w': 21},
                u'o10N_A': None, u'o11N_A': None, u'o12N_A': None,
                u'o13N_A': None, u'o14N_A': None, u'o15N_A': None,
                u'o1N_A': None, u'o1N_B': None,
                u'o2N_A': None, u'o2N_B': None,
                u'o3N_A': None, u'o3N_B': None,
                u'o4N_A': None, u'o4N_B': None,
                u'o5N_A': None, u'o5N_B': None,
                u'o6N_A': None, u'o6N_B': None,
                u'o7N_A': None, u'o7N_B': None,
                u'o8N_A': None, u'o8N_B': None,
                u'o9N_A': None, u'o9N_B': None},
            'sensorAutoReset': None,
            'verbosity': 0},
        'spEnable': False,
        'spParams': {'columnCount': 2048,
                     'globalInhibition': 1,
                     'inputWidth': 0,
                     'maxBoost': 2.0,
                     'numActiveColumnsPerInhArea': 40,
                     'potentialPct': 0.8,
                     'seed': 1956,
                     'spVerbosity': 0,
                     'spatialImp': 'cpp',
                     'synPermActiveInc': 0.05,
                     'synPermConnected': 0.1,
                     'synPermInactiveDec': 0.0005},
        'tpEnable': False,
        'tpParams': {'activationThreshold': 16,
                     'cellsPerColumn': 32,
                     'columnCount': 2048,
                     'globalDecay': 0.0,
                     'initialPerm': 0.21,
                     'inputWidth': 2048,
                     'maxAge': 0,
                     'maxSegmentsPerCell': 128,
                     'maxSynapsesPerSegment': 32,
                     'minThreshold': 12,
                     'newSynapseCount': 20,
                     'outputType': 'normal',
                     'pamLength': 1,
                     'permanenceDec': 0.1,
                     'permanenceInc': 0.1,
                     'seed': 1960,
                     'temporalImp': 'cpp',
                     'verbosity': 0},
        'trainSPNetOnlyIfRequested': False},

If I understood correctly, all the electrode inputs (o1N_A through o15N_A and o1N_B through o9N_B) were discarded by the swarming process.
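A quick way to see which fields a swarm kept is to look for encoder entries that are not None; the dict below is a cut-down, hypothetical stand-in for params like the ones above:

```python
# Hypothetical, cut-down stand-in for the swarm's returned model params.
model_params = {
    "modelParams": {
        "sensorParams": {
            "encoders": {
                "_classifierInput": {"fieldname": "VO", "type": "ScalarEncoder"},
                "DATE_timeOfDay": {"fieldname": "DATE", "type": "DateEncoder"},
                "o1N_A": None,   # None means the swarm discarded this field
                "o2N_A": None,
            }
        }
    }
}

def kept_fields(params):
    """Return the encoder names the swarm actually kept (non-None entries)."""
    encoders = params["modelParams"]["sensorParams"]["encoders"]
    return sorted(name for name, enc in encoders.items() if enc is not None)

print(kept_fields(model_params))  # -> ['DATE_timeOfDay', '_classifierInput']
```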
I've also run a larger swarm, but they are still discarded. Unfortunately I'm sure that at least a good 60% of them are relevant sensors. How can I improve the swarming? Am I doing something wrong? (The sensors are outputs from thoracic low-res electrodes; the predicted field is "VO", which represents the amount of spO2 present in the bloodstream at the moment - the idea is to predict the oxygen saturation from the respiratory act.) Thanks for your replies. Sorry for my English (I'm Italian). Raf

--
Raf
www.madraf.com/algotrading
reply to: [email protected]
skype: algotrading_madraf

--
With kind regards,
David Ray
Java Solutions Architect
Cortical.io
Sponsor of: HTM.java
[email protected]
http://cortical.io
