Re: Please help Ken with his HTM Challenge project

Pascal Weinberger Sun, 25 Oct 2015 07:45:06 -0700

From the nupic.audio discussion in PR #21, thanks to Richard's pointer:

Ok, thanks :) I hadn't noticed this discussion :o Really interesting problem... 
In deep learning, problems like this are usually solved by what is called 
data augmentation, where you add different levels of noise and different 
shifts to your data to a) get more training data and b) be more resilient to 
overfitting. I guess we could use a similar approach for this problem. On 
data augmentation: 
https://www.techopedia.com/definition/28033/data-augmentation 
https://wwwf.imperial.ac.uk/~dvandyk/Research/01-jcgs-art.pdf 
http://jmlr.org/proceedings/papers/v38/gan15.pdf Or just Google ;D
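
A minimal sketch of what noise-and-shift augmentation could look like for a 1-D signal such as an ECG trace, assuming NumPy is available. The function name augment_signal, the noise levels, and the shift sizes are made up for illustration; none of this comes from nupic.audio or NuPIC itself. Note that np.roll wraps samples around the ends, which is only a rough stand-in for a real time shift.

```python
import numpy as np


def augment_signal(signal, noise_levels=(0.01, 0.05), shifts=(-5, 5), seed=0):
    """Return noisy and time-shifted copies of a 1-D signal.

    noise_levels are fractions of the signal's standard deviation;
    shifts are sample offsets (hypothetical defaults, tune for your data).
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    scale = signal.std()

    augmented = []
    # One noisy copy per noise level (Gaussian noise scaled to the signal).
    for level in noise_levels:
        noise = rng.normal(0.0, level * scale, size=signal.shape)
        augmented.append(signal + noise)
    # One shifted copy per offset (circular shift, so samples wrap around).
    for shift in shifts:
        augmented.append(np.roll(signal, shift))
    return augmented


if __name__ == "__main__":
    # Toy stand-in for an ECG trace: two periods of a sine wave.
    trace = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
    copies = augment_signal(trace)
    print(len(copies), "augmented copies of length", copies[0].shape[0])
```

Each augmented copy could then be fed to the model as if it were an additional recording, alongside the original data.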



Best

Pascal 

____________________________

BE THE CHANGE YOU WANT TO SEE IN THE WORLD ...


> On 25 Oct 2015, at 16:17, Richard Crowder <[email protected]> wrote:
> 
> For reference; 
> http://www.intmath.com/blog/mathematics/math-of-ecgs-fourier-series-4281
> 
>> On Sun, Oct 25, 2015 at 2:16 PM, Richard Crowder <[email protected]> wrote:
>> Also, covering Sergey's first question, what anomalies are you looking for?
>> See this report that uses a Fourier Transform and its inverse for R-peak 
>> detection:
>> http://www.egr.msu.edu/classes/ece480/capstone/spring13/group03/documents/SignalProcessingofECGSignalsinMatlab.pdf
>> 
>> PS: I have Matlab if that is required or helps, although Python can be 
>> used instead; see nupic.critic for example.
>> 
>>> On Sat, Oct 24, 2015 at 6:13 PM, Richard Crowder <[email protected]> wrote:
>>> Kentaro, Sergey,
>>> 
>>> I've been trying to get my head around available data for training/testing.
>>> 
>>> For example, https://physionet.org/physiobank/ Graph viewing and download 
>>> via 
>>> https://physionet.org/cgi-bin/atm/ATM?database=mimic2wdb&tool=plot_waveforms
>>>  (River view applicable?)
>>> 
>>> Any idea what could be the best form of data, and which kind of data to 
>>> obtain (ECG only?, with arterial blood pressure, need for labeling?). See 
>>> graph here https://physionet.org/physiobank/database/mimic2wdb/
>>> 
>>> Best regards, Richard.
>>> 
>>> 
>>>> On Thu, Oct 22, 2015 at 1:12 PM, 飯塚健太郎 <[email protected]> wrote:
>>>> Richard, Sergey, 
>>>> Thank you for your replies.
>>>> 
>>>> I read them carefully and noticed a few things.
>>>> 
>>>> Currently, my code feeds raw ECG data to NuPIC’s Scalar Encoder, with
>>>> TemporalAnomaly as the inferenceType.
>>>> 
>>>> But there is another way:
>>>> use pre-encoded ECG data to learn and predict anomalies.
>>>> 
>>>> I found that an FFT is used in the Audio Stream example.
>>>> https://github.com/numenta/nupic/blob/master/examples/audiostream/audiostream_tp.py#L249
>>>> 
>>>> It might be better to use a wavelet transform or another encoding technique.
>>>> Such a technique makes the data more discrete and might be better suited to 
>>>> detecting anomalies.
>>>> 
>>>> I think I should learn more about encoding techniques.
>>>> I’ll read the paper Richard suggested, too.
>>>> 
>>>> Thanks!
>>>> 
>>>> 2015-10-22 19:36 GMT+09:00 Richard Crowder <[email protected]>:
>>>>> Hello Kentaro,
>>>>> 
>>>>> Sergey's questions, response, and paper link are important. The linked 
>>>>> paper is the first I've read on ECG signal analysis, but it has a lot of 
>>>>> cross-over with audio and speech signal analysis and recognition, plus 
>>>>> more recent research into steganalysis [1]. 
>>>>> 
>>>>> For example -
>>>>> The use of Wavelet transform, or Fourier Transform / DCT (both magnitude 
>>>>> AND phase), 
>>>>> Perceptual linear prediction, as opposed to Mel-Frequency Cepstral 
>>>>> analysis,
>>>>> Very importantly, statistical analysis of spectral features - Wavelet/DCT 
>>>>> with Hilbert transform, spectral envelope curve analysis and derivative 
>>>>> tracking (velocity and acceleration of curve changes, can limit up to 5th 
>>>>> order).
>>>>> 
>>>>> A lot of this occurs within animals' brains, with mammals adding additional 
>>>>> feedback and inference through the neocortex. As humans, we have 
>>>>> exploited the spectral analysis within our 'old brain' to listen for, 
>>>>> detect, and track spectral features, such as those in ECG signals and 
>>>>> sonar signals (hunting for shoals of fish and submarines). Cross-over and 
>>>>> similar analysis occurs in vision sensory analysis too (e.g. edge 
>>>>> detection).
>>>>> 
>>>>> Which points to the key questions: how are you encoding the ECG 
>>>>> signals, and which classification techniques are you using?
>>>>> 
>>>>> Best regards, Richard.
>>>>> 
>>>>> [1] http://www.shsu.edu/~qxl005/new/publications/tifs_audiosteg.pdf
>>>>> 
>>>>>> On Thu, Oct 22, 2015 at 10:18 AM, Sergey Alexashenko 
>>>>>> <[email protected]> wrote:
>>>>>> Actually, I can write out the scenarios here. 
>>>>>> 
>>>>>> NuPIC should definitely be able to learn different people's heartbeats 
>>>>>> in one model. You have to give it plenty of data to learn on. Also, make 
>>>>>> sure to resetSequenceStates every time you start feeding in data from a 
>>>>>> new person. Finally, you might want to shuffle the data so that you 
>>>>>> don't feed it person 1, then person 2, then person 3, but rather a 
>>>>>> mixture of all the data to reduce bias towards the latest people (but I 
>>>>>> don't think that this is necessary to be honest). 
>>>>>> 
>>>>>> There is, however, the issue of encoding. I'm assuming that you are 
>>>>>> using a scalar encoder produced by swarming. That's fine, that's a quick 
>>>>>> approach and it might work (in fact I would bet that it will produce 
>>>>>> usable results - be mindful of swarming on a data set including 
>>>>>> different people's data, though!). 
>>>>>> 
>>>>>> However, if you think about the data type - ECG data, unlike, say, EEG 
>>>>>> data, consists of almost perfectly discrete steps (heartbeats) which 
>>>>>> could be matched to NuPIC timesteps very well. If you go through the 
>>>>>> trouble of extracting features from your data (there is ample literature 
>>>>>> on how to do it - see [1] for example), and creating encoders for all 
>>>>>> the intervals/amplitudes, I think that NuPIC would do a marvelous job. 
>>>>>> Note that this approach condenses the time interval per step to one per 
>>>>>> heartbeat and, thus, is not going to work if you are trying to do 
>>>>>> super-rapid detection or prediction (on a time scale shorter than one 
>>>>>> heartbeat). It is also more time-consuming for you - once again, 
>>>>>> swarming could work well enough.
>>>>>> 
>>>>>> Hope this helps,
>>>>>> 
>>>>>> Sergey
>>>>>> 
>>>>>> [1] http://arxiv.org/pdf/1005.0957.pdf
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Thu, Oct 22, 2015 at 1:58 AM, Sergey Alexashenko 
>>>>>>> <[email protected]> wrote:
>>>>>>> Hello Kentaro,
>>>>>>> 
>>>>>>> I think that NuPIC can definitely work with ECG data, but I need a 
>>>>>>> little more information about your project to make any helpful 
>>>>>>> suggestions. Two questions:
>>>>>>> 
>>>>>>> 1) Are you trying to predict or detect anomalies? You use both terms, 
>>>>>>> but they involve somewhat different mechanisms.
>>>>>>> 
>>>>>>> 2) How are you encoding ECG data?
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Sergey
>>>>>>> 
>>>>>>> 
>>>>>>>> On Wed, Oct 21, 2015 at 10:07 PM, Kentaro Iizuka 
>>>>>>>> <[email protected]> wrote:
>>>>>>>> Hello NuPIC.
>>>>>>>> 
>>>>>>>> Thank you, Matt, for the post.
>>>>>>>> 
>>>>>>>> Here are the details of my question. (It is the same as my Gitter post.)
>>>>>>>> https://gist.github.com/iizukak/72526863d3f504f2ff5e
>>>>>>>> 
>>>>>>>> I hope somebody has a good idea for that.
>>>>>>>> 
>>>>>>>> Thank you!
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2015-10-22 13:29 GMT+09:00 Matthew Taylor <[email protected]>:
>>>>>>>> > Hello NuPIC,
>>>>>>>> >
>>>>>>>> > Check this out: 
>>>>>>>> > https://gitter.im/numenta/htm-challenge/archives/2015/10/21
>>>>>>>> >
>>>>>>>> > Watch the ECG anomaly in the video: 
>>>>>>>> > https://youtu.be/5KdwV-trMhE?t=1m41s
>>>>>>>> >
>>>>>>>> > He has an interesting question about how to train a model on a 
>>>>>>>> > healthy
>>>>>>>> > heartbeat, and it is expressed well with pictures in the link above. 
>>>>>>>> > He
>>>>>>>> > wants to train a model with the ECG history of more than one person 
>>>>>>>> > to get a
>>>>>>>> > representation of a "healthy heartbeat". The problem is that every 
>>>>>>>> > person's
>>>>>>>> > heartbeat is a little different. Is it feasible to train a model on 
>>>>>>>> > multiple
>>>>>>>> > heartbeats in sequence? I'm not sure if it will work, but maybe 
>>>>>>>> > someone has
>>>>>>>> > a better idea?
>>>>>>>> >
>>>>>>>> > Solving this problem would help in a lot of different signal analysis
>>>>>>>> > applications of HTM...
>>>>>>>> >
>>>>>>>> > ---------
>>>>>>>> > Matt Taylor
>>>>>>>> > OS Community Flag-Bearer
>>>>>>>> > Numenta
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Kentaro Iizuka<[email protected]>
>>>>>>>> 
>>>>>>>> Github
>>>>>>>> https://github.com/iizukak/
>>>>>>>> 
>>>>>>>> Facebook
>>>>>>>> https://www.facebook.com/kentaroiizuka
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> 飯塚健太郎([email protected])
>>>> 
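>>>> Saitama University, Graduate School of Science and Engineering
>>>> Cryptographic Infrastructure Laboratory
>>>> First-year master's student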
>
