Hi, Matthew.
Raf, if you are trying to analyze an EEG signal with HTM, many have attempted this before you. I suggest these resources:
- EEGs and NuPIC (2014 Fall NuPIC Hackathon) [1]
- EEG Data Classification [DEMO #16] (2014 Fall NuPIC Hackathon) [2]
- Brain Squared - W. Wnekowicz, M. Le Borgne, P. Karashchuk, A. Della Motta, J. Naulty [3]
- ECG + HTM - Kentaro Iizuka [4]
Thanks, I'm pretty excited to look into these.
Also, regarding your swarm, I would certainly use an inference type of "TemporalMultiStep". And you can remove the "datetime" field entirely as long as your samples are taken at the same interval (the datetime will make the swarm look for "time of day" and "day of week" type of patterns, which should not exist in this dataset). And your input data has far too many dimensions, IMO.
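A minimal sketch of a swarm description along those lines - "VO" comes from the thread, but everything else here (field ranges, swarm size, prediction steps, the empty stream definition) is an assumed placeholder, not a recommendation from the thread:

```python
# Hypothetical swarm description: "TemporalMultiStep" inference,
# no datetime field (samples are evenly spaced), and only a few
# input fields instead of dozens.
SWARM_DESCRIPTION = {
    "includedFields": [
        # Only the predicted field is listed here; individual electrode
        # fields could be added back one at a time to test their value.
        {"fieldName": "VO", "fieldType": "float",
         "minValue": 0.0, "maxValue": 2.0},
    ],
    "inferenceType": "TemporalMultiStep",
    "inferenceArgs": {
        "predictedField": "VO",
        "predictionSteps": [1],   # predict one record ahead
    },
    "streamDef": {},              # stream/CSV source definition omitted
    "swarmSize": "medium",
}
```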
This explains a lot.
And yes, the swarm you ran did not find a correlation between any of the other input fields and the "VO" field, which I assume is a pre-processed classification of the current state? That could mean a couple of different things: (1) it has all the info it needs to make a prediction given VO and the history of the VO field over time, or (2) it has found VO to be unpredictable.
Yep, that's exactly what VO is ("1" = spO2 <= 97.5, "2" = spO2 > 97.5). I suppose it is then the first case, although I think the prediction could still be improved (maybe it should learn from more samples).
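That labeling rule can be sketched as follows (the function name is mine; the "0" / not-breathing category comes from the dataset description further down the thread):

```python
def vo_label(spo2, breathing=True):
    """Map an SpO2 reading to the dataset's "VO" categories.

    "0" = not breathing, "1" = spO2 <= 97.5, "2" = spO2 > 97.5.
    """
    if not breathing:
        return 0
    return 1 if spo2 <= 97.5 else 2

print(vo_label(96.0))  # -> 1
print(vo_label(99.0))  # -> 2
```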
At the end of the swarming process, some text is dumped to the screen about "field contributions". That tells you exactly how important NuPIC found each field when evaluating its influence on future predictions.
Ok, that's exactly another question I wanted to ask, but then I forgot to :) I don't have the exact numbers, but I remember them pretty well. Almost all of them were negative floats (all between -2.0 and -7.0), except for three sensors which were positive numbers (between +2.0 and +3.0). What is their measuring unit? Is it in terms of mean squared error? Or something else?
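One plausible reading (a sketch under that assumption, not NuPIC's confirmed formula): each score is the percent change in prediction error when the field is included, which would match the sign convention described above - positive values mean the field helped, negative values mean it hurt:

```python
def field_contribution(baseline_error, error_with_field):
    """Percent improvement in prediction error attributable to one field.

    Positive -> adding the field reduced the error (it helped);
    negative -> adding the field increased the error (it hurt).
    Illustration only; not necessarily NuPIC's exact computation.
    """
    return 100.0 * (baseline_error - error_with_field) / baseline_error

# A field that cuts the error from 0.40 to 0.39 contributes +2.5 percent.
print(round(field_contribution(0.40, 0.39), 2))  # -> 2.5
```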
Thanks again! Cheers. Raf On 06/12/2015 16:40, Matthew Taylor wrote:
I would pay attention to Marion's work. Their team successfully classified EEG signals using an OpenBCI device at the HTM Challenge. However, this is still an active area of research and experimentation.

[1] https://www.youtube.com/watch?v=TFbNbaBtxr4
[2] https://www.youtube.com/watch?v=UEh48KOmkIA
[3] https://www.youtube.com/watch?v=Ij4StdJBxEo
[4] https://www.youtube.com/watch?v=gzfTZhd6X9c

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Sat, Dec 5, 2015 at 8:37 AM, Raf <[email protected]> wrote:

Hey Matthew, Thanks for your reply.
Basically I made different attempts: I first used "TemporalMultiStep", trying to predict the exact spO2 value, and in another one (the one you saw) I used "NontemporalClassification" (and then I also tried "TemporalClassification"), providing three possible categories: "0" (not breathing, hopefully it should never pop up), "1" (spO2 <= 97.5) and "2" (spO2 > 98.5). I also tried these hoping it would make the task easier for the HTM. You can find the dataset I used in this case in the attachment. Notice that for the spO2 I used a very sensitive pulse oximeter, whilst for the thoracic low-res electrodes I relied (unfortunately) on a self-made set of active electrodes linked to a Raspberry Pi; also, the patient is 54 years old and has pneumonia. (I tried to upload the dataset to Dropbox, but according to https://status.dropbox.com/ it is under maintenance. Hopefully you can see the attachments.) Cheers, Raf

On 05/12/2015 16:34, Matthew Taylor wrote:

Raf, It looks like the model params your swarm returned were labeled as "NontemporalClassification", which makes me think something might be wrong. Can you share the input data (a sample) and the swarm configuration you used? Thanks,
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Sat, Dec 5, 2015 at 4:59 AM, cogmission (David Ray) <[email protected]> wrote:

Hi Raf, Oops. ... and yes, both "A" and "C" are modeled ("C" being inhibitory chemical and cellular effects) - but again not as the multiple discrete values - more like a summation of their effect (except for inhibitory cells, whose effects are explicitly modeled). Cheers, David

On Sat, Dec 5, 2015 at 6:53 AM, cogmission (David Ray) <[email protected]> wrote:

Hi Raf, Welcome!
Please do not respond to this, I'm just putting in a "time filler" until someone more neurobiologically knowledgeable answers this later today, but "A" is in fact being modeled by NuPIC: it is often stated in these circles that the synapses on distal dendrites act as "proximity detectors" - and we model this via a summation of synapse "permanence" values. In addition, there is some portion of the organic machinery not being abstracted in the software, as some of these concepts were deemed not to have a significant enough impact on the overall process (as you also stated). Spikes (some, not all), to a certain extent (from explanations I have overheard on this mailing list), are one of those concepts that the organic implementation requires yet which are not significant to maintaining the metaphor in software. Some spikes are in fact being modeled as a summation instead of a continuous rate (I believe). Much care (by leading-edge neuroscientific researchers at the Redwood Institute as well as Numenta researchers) has been taken to ascertain the portion of explicit biological translation which is necessary to implement the overall algorithms, but as always (as I have observed), the community is open to theoretical examination of their assumptions in this regard. Cheers, David

On Sat, Dec 5, 2015 at 6:04 AM, Raf <[email protected]> wrote:

Hi everybody. I've got a couple of questions for you. I'm a med student and I'm new to NuPIC. I'm very impressed by what Numenta is achieving and I do believe that in the long run your work will be compared to the discovery of penicillin :) My project - for now - is to produce a model able to detect neurological/psychiatric issues through simple EEG wave recognition; I'm an intern at the Neurosurgery dept. and my final goal would be to use pattern recognition as an intraoperative tool to help surgeons distinguish between healthy tissue and cancerous cells with just a continuous EEG/EMG data feed.
In order to get familiar with the machine learning world, and lacking good enough datasets, for a couple of years I have used financial data (notoriously difficult, if not impossible, to predict) as a sandbox environment to experiment with NNs. I had encouraging results. I'm still learning the Python code behind NuPIC and I have two questions for you.

1 - FIRST QUESTION

In the paper "Hierarchical Temporal Memory" (version 0.2.1, 2011), I read that: "[...] The predominant view (especially in regards to the neo-cortex) is that the rate of spikes is what matters. Therefore the output of a cell can be viewed as a scalar value". I'm aware that transforming a complex biological system such as the neocortex into computer software necessarily leads to simplifications. As we know, from a biological point of view the transmission of the signal is subject to numerous variables, and I wonder how their implementation could improve the software's predictions. The variables I'd like to focus on are:

-A) The propagation of the action potential along the membranes follows an exponential loss distribution due to the resistances met along the axons. For the HTM model (where the synapses express binary weights) this could mean that the more distant two connected cells are, the weaker their shared signal becomes (of course directly depending on "where" the dendrite segments are relative to their starting point - this would probably require introducing the physical concept of space into HTM).

-B) The signal propagation speed is directly proportional to the diameter of the axon carrying it; this appears to be valid both in unmyelinated and myelinated axons (though it is a more obvious phenomenon in the latter type). This could have a huge impact on HTM: if bigger axons (= weight+) burst temporally before other smaller ones aimed at the same target dendrite, they can also temporarily inhibit that targeted cell (causing a later refractory period), therefore filtering the signal.
-C) Receptors, neurotransmitters, electrical and chemical synapses, EPSP (excitatory postsynaptic potential) and IPSP (inhibitory postsynaptic potential). This is an enormous chapter. Current NN systems and, if my understanding is correct, also NuPIC treat synapses as if they were all electrical synapses. In reality, according to the current consensus, the mammalian brain uses electrical synapses mostly to "synchronize" vast areas of the neocortex (I'm deliberately omitting other findings because they are not relevant to my point). Although electrical synapses demonstrate various advantages when compared to their chemical equivalents (speed, resistance, fatigue, etc.), it appears that the complexity and the fine filtering/modulation of the signals inside the PFC is due to the presence of numerous other elements present in chemical synapses: neurotransmitters (such as acetylcholine, dopamine, GABA, norepinephrine...); pre-synaptic, synaptic-gap and post-synaptic features; different receptors; etc. Each of these elements can strongly influence the signal and the overall "learning" process. For example: although an axon's "weight" is big and it is bursting copiously, the above-mentioned elements can suppress its signal.

My first question is: are points A and C implemented in NuPIC? Do you reckon that it could be useful to increase the complexity of NuPIC by also implementing a chemical synapse "class" with the elements described in point C?

2 - SECOND QUESTION

I'm trying to run a couple of models. This is an extract from an OPF model I created through swarming:
    'model': 'CLA',
    'modelParams': {
        'anomalyParams': {u'anomalyCacheRecords': None,
                          u'autoDetectThreshold': None,
                          u'autoDetectWaitRecords': None},
        'clParams': {'alpha': 0.06173462582232023,
                     'clVerbosity': 0,
                     'regionName': 'CLAClassifierRegion',
                     'steps': '0'},
        'inferenceType': 'NontemporalClassification',
        'sensorParams': {
            'encoders': {
                u'DATE_dayOfWeek': None,
                u'DATE_timeOfDay': {'fieldname': 'DATE',
                                    'name': 'DATE',
                                    'timeOfDay': (21, 2.2537623685060675),
                                    'type': 'DateEncoder'},
                u'DATE_weekend': None,
                '_classifierInput': {'classifierOnly': True,
                                     'clipInput': True,
                                     'fieldname': 'VO',
                                     'maxval': 2.0,
                                     'minval': 0.0,
                                     'n': 449,
                                     'name': '_classifierInput',
                                     'type': 'ScalarEncoder',
                                     'w': 21},
                u'o10N_A': None, u'o11N_A': None, u'o12N_A': None,
                u'o13N_A': None, u'o14N_A': None, u'o15N_A': None,
                u'o1N_A': None, u'o1N_B': None,
                u'o2N_A': None, u'o2N_B': None,
                u'o3N_A': None, u'o3N_B': None,
                u'o4N_A': None, u'o4N_B': None,
                u'o5N_A': None, u'o5N_B': None,
                u'o6N_A': None, u'o6N_B': None,
                u'o7N_A': None, u'o7N_B': None,
                u'o8N_A': None, u'o8N_B': None,
                u'o9N_A': None, u'o9N_B': None},
            'sensorAutoReset': None,
            'verbosity': 0},
        'spEnable': False,
        'spParams': {'columnCount': 2048,
                     'globalInhibition': 1,
                     'inputWidth': 0,
                     'maxBoost': 2.0,
                     'numActiveColumnsPerInhArea': 40,
                     'potentialPct': 0.8,
                     'seed': 1956,
                     'spVerbosity': 0,
                     'spatialImp': 'cpp',
                     'synPermActiveInc': 0.05,
                     'synPermConnected': 0.1,
                     'synPermInactiveDec': 0.0005},
        'tpEnable': False,
        'tpParams': {'activationThreshold': 16,
                     'cellsPerColumn': 32,
                     'columnCount': 2048,
                     'globalDecay': 0.0,
                     'initialPerm': 0.21,
                     'inputWidth': 2048,
                     'maxAge': 0,
                     'maxSegmentsPerCell': 128,
                     'maxSynapsesPerSegment': 32,
                     'minThreshold': 12,
                     'newSynapseCount': 20,
                     'outputType': 'normal',
                     'pamLength': 1,
                     'permanenceDec': 0.1,
                     'permanenceInc': 0.1,
                     'seed': 1960,
                     'temporalImp': 'cpp',
                     'verbosity': 0},
        'trainSPNetOnlyIfRequested': False},

If I understood correctly, all the electrode inputs (o1N_A through o15N_A and o1N_B through o9N_B) were discarded by the swarming process.
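A quick way to see which fields a swarm kept is to look for encoder entries that are not None; the dict below is a cut-down, hypothetical stand-in for params like the ones above:

```python
# Hypothetical, cut-down stand-in for the swarm's returned model params.
model_params = {
    "modelParams": {
        "sensorParams": {
            "encoders": {
                "_classifierInput": {"fieldname": "VO", "type": "ScalarEncoder"},
                "DATE_timeOfDay": {"fieldname": "DATE", "type": "DateEncoder"},
                "o1N_A": None,   # None means the swarm discarded this field
                "o2N_A": None,
            }
        }
    }
}

def kept_fields(params):
    """Return the encoder names the swarm actually kept (non-None entries)."""
    encoders = params["modelParams"]["sensorParams"]["encoders"]
    return sorted(name for name, enc in encoders.items() if enc is not None)

print(kept_fields(model_params))  # -> ['DATE_timeOfDay', '_classifierInput']
```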
I've also run a larger swarm, but they are still discarded. Unfortunately I'm sure that at least a good 60% of them are relevant sensors. How can I improve the swarming? Am I doing something wrong? (The sensors are outputs from thoracic low-res electrodes; the predicted field is "VO", which represents the amount of spO2 present in the bloodstream at the moment - the idea is to predict the oxygen saturation from the respiratory act.) Thanks for your replies. Sorry for my English (I'm Italian). Raf

--
Raf
www.madraf.com/algotrading
reply to: [email protected]
skype: algotrading_madraf

--
With kind regards,
David Ray
Java Solutions Architect
Cortical.io
Sponsor of: HTM.java
[email protected]
http://cortical.io
