I did some work to test NuPIC for a Morse decoding application. I created a simple Python script to calculate SDRs from a Morse codebook. To visualize the SDR for each letter and number I plotted the SDRs. However, my results do not match the description in the CLA whitepaper.
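For reference, here is a minimal sketch of how Morse symbols could be turned into fixed-width bit vectors and compared before they go into the spatial pooler. The encoding (dit = 1, dah = 111, a single 0 between elements, zero padding), the WIDTH value and the helper names are assumptions for illustration only, not necessarily what the actual script uses.

```python
# Small illustrative subset of the Morse codebook (not the full table).
MORSE = {"E": ".", "I": "..", "S": "...", "H": "....",
         "T": "-", "A": ".-", "N": "-."}

WIDTH = 24  # hypothetical fixed input width; must fit the longest symbol used


def morse_to_bits(code, width=WIDTH):
    """Assumed encoding: dit -> 1, dah -> 111, one 0 between elements,
    then zero-padded on the right to a fixed width."""
    elements = ["1" if ch == "." else "111" for ch in code]
    bits = "0".join(elements)
    if len(bits) > width:
        raise ValueError("symbol does not fit in the chosen width")
    return [int(b) for b in bits.ljust(width, "0")]


def overlap(a, b):
    """Number of positions where both vectors have a 1."""
    return sum(x & y for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions where the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))


vectors = {ch: morse_to_bits(code) for ch, code in MORSE.items()}

# Compare a few pairs of input vectors.
for first, second in [("E", "I"), ("S", "H"), ("A", "N")]:
    print(first, second,
          "overlap:", overlap(vectors[first], vectors[second]),
          "hamming:", hamming(vectors[first], vectors[second]))
```

Comparing these input overlaps with the overlaps of the resulting SDRs is one way to check whether the spatial pooler preserves similarity for a given set of parameters.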
I used /nupic/examples/sp/hello_sp.py as the base and modified the script to accommodate Morse code bit vectors. I expected that the SDRs of Morse symbols that differ by only one bit would resemble each other. This does not seem to be the case.

The results are here:
http://ag1le.blogspot.com/2014/08/cortical-learning-algorithm-for-morse.html

Am I doing something incorrectly here?

regards
Mauri

> On Fri, Aug 22, 2014 at 2:21 AM, Mauri Niininen <[email protected]> wrote:
>
>> Francisco (or anybody)
>>
>> Do you have any Python code examples of how to use the cortical.io encoder?
>>
>> I am trying to figure out how to build a Morse codebook using NuPIC, i.e. matching learned sequences to corresponding characters. I did a similar exercise earlier using Kohonen's Self Organizing Maps (SOM). I ran some 40,000 examples of noisy Morse code, from which the SOM learned and built an internal representation. The difficulty was building a classifier that converted the internal representation of the ".-" sequence to the letter "A".
>>
>> It appears that NuPIC indeed quickly learns sequences like ".-" ("dit" "dah" = letter "A"), but how do I convert the prediction to a label "A" when NuPIC sees this or a similar pattern?
>>
>> Mauri
>>
>> On Thu, Aug 21, 2014 at 8:05 AM, Francisco De Sousa Webber <[email protected]> wrote:
>>
>>> Chris,
>>> I am not knowledgeable in the area, but the limits you describe seem to be the same limits associated with (human) speech recognition. The way the brain tackles the signal-to-noise ratio is by actually understanding what a certain message is about. Based on this understanding the brain makes a couple of guesses at what a word could mean, coming up with some candidate words from an incomplete perception. Then the semantically most probable word is chosen.
>>> A possible approach could therefore be to convert the Morse stream into word-SDRs (using the cortical.io encoder) and to match the candidate words semantically.
>>>
>>> Francisco
>>>
>>> On 21.08.2014, at 03:52, Mauri Niininen <[email protected]> wrote:
>>>
>>> I agree with Chris that the target should be human level recognition. However, defining that level of performance in engineering terms is not straightforward. Humans also have some limitations in copying CW, and some of these limits have been well documented in the literature.
>>>
>>> 1) *Speed* - most humans cannot copy Morse code faster than 40 - 60 WPM (words per minute), roughly equal to 200 - 300 characters per minute. There are some competitions where the focus is on speed. A Google search found this Guinness world record: "On 6 May 2003 Andrei Bindasov (Belarus) successfully transmitted 216 Morse code marks of mixed text in one minute. The attempt was held as part of the International Amateur Radio Union's 5th World Championship in High Speed Telegraphy in Belarus." Also, check http://www.rufzxp.net/ - it states that Goran Hajoševic, YT7AW cracked the 1000 CPM (characters per minute) speed limit in copying call signs correctly. NuPIC with fast enough hardware might scale beyond human capabilities.
>>>
>>> 2) *Speed vs. Signal-to-Noise Ratio (SNR)* - as the SNR decreases it becomes increasingly hard to copy CW correctly. You can improve the situation by using audio bandpass filtering, but as the speed of Morse increases the signal bandwidth also increases. Very narrow audio filters tend to create "ringing" artifacts - a noise spike starts to sound like a "dit", creating copy errors. I have done some testing (http://ag1le.blogspot.com/2013/01/morse-decoder-snr-vs-cer-testing.html) to plot the character error rate (CER) vs. SNR limits using different techniques.
>>> The human auditory system has excellent adaptive filtering capabilities but still has some limits. Skilled CW operators can compensate by slowing down during poor conditions, asking other stations to repeat ("AGN" - again), turning the antenna or by other means. NuPIC with "sensor - motor" integration could potentially learn to do this?
>>>
>>> 3) *Speed variability* - very often, especially in CW contests, you can hear stations sending a general call "CQ DE <call sign>" at a certain speed. When responding to calling stations they send an acknowledgement and signal report ("5NN") at a much higher speed. Since Morse code does not have built-in speed synchronization, it is quite difficult to build an adaptive speed tracking algorithm that is able to decode correctly when the speed jumps from 20 WPM to 45 WPM between two characters. If NuPIC can learn to recognize this type of pattern rather than exact "dit"/"dah" timing, this may be an area to gain better decoding accuracy.
>>>
>>> 4) *Rhythm variability* - hand keyed Morse (HKM) code has a lot of timing variability in "dits", "dahs" and inter-element and inter-character pauses. I have done some testing and collected data - see http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html. Humans can handle this variability by "filling in" missed or incorrect characters from word or dialogue context. This part is very difficult to handle with algorithms, and perhaps this would be the area where NuPIC could bring the most benefit. The reason is that each HKM operator has a different "signature" that is easy for humans to recognize and learn after listening for a while, but it is very difficult to build a generalized algorithm that can handle all rhythm variants with high accuracy.
>>> This was one of the main reasons to create the Bayesian Morse decoder (http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html) - it works better for HKM cases but is still far from human performance level. NuPIC could potentially learn the rhythm and adapt like humans do?
>>>
>>> 5) *Fading, flutter, interference* - due to the properties of RF signal propagation in different frequency bands, received signal amplitude can change very rapidly. Signals can fade down below the noise level and come back up within a few tens of milliseconds, making it very difficult to accurately detect what is signal and what is noise. Sometimes you can also have "flutter" or echoes in signals, creating challenges for computer algorithms. Interference from stations on nearby or overlapping frequencies, especially in "pile-up" situations when you have 10 - 200 stations within 1-2 kHz bandwidth calling a rare DX station all at the same time, is very difficult for computer algorithms to deal with. It is also challenging for humans, but the best DX operators can "manage" the pile-up by sending hints like "5 UP" or "10 UP", meaning that they are listening 5 - 10 kHz above their own transmission frequency. This causes other operators to "spread out", which helps in copying Morse code. NuPIC would need to have "sensor - motor" integration to be able to do something like this.
>>>
>>> 6) *Doppler shift, poor transmitter or receiver frequency stability* - you can often hear stations that are drifting in frequency either due to Doppler effects (such as when working through a fast moving satellite, or having an Earth-Moon-Earth (EME) contact) or due to TX/RX frequency instability. Humans can compensate by turning the VFO - for computers this requires some algorithm that automatically tracks the wanted signal. Some of the most advanced Morse decoder software packages have this capability.
>>> NuPIC with proper audio encoders could potentially deal with these issues.
>>>
>>> 7) *Sending Errors* - operators frequently make errors when sending Morse code, especially when they try to send too fast. Humans can still understand the meaning as they "fill in" missed parts from the context of the dialogue - for computer algorithms this is not so easy. NuPIC with proper training on commonly used ham radio "jargon" or typical contest exchanges could make a difference and improve decoding accuracy.
>>>
>>> 8) *Ability to copy multiple conversations simultaneously* - for many people it is hard to follow multiple simultaneous conversations accurately. Skilled CW operators can pick up relevant details such as call signs from a "pile-up" with tens of stations in 1-2 kHz bandwidth. I believe this is an area where a well trained NuPIC could make a difference if we can create a proper sparse encoding scheme (see http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html as an example). You could also create a multi channel CW decoder, something similar to what I recently wrote for FLDIGI (see http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-5.html)
>>>
>>> For each of the cases above we could derive some proper metrics to set a "performance standard" and start building software that approaches human performance level. The ability to do all of the above simultaneously and perfectly requires thousands of hours of learning - the best human CW operators have spent a lot of time honing their skills in contests and working DX stations around the world. Listening to the world's best CW operators is like listening to skilled musicians - they have both excellent skills and passion for this art form.
>>>
>>> I am not sure NuPIC is able to learn all of the above in its present form. It would be very cool to get at least parts of the above working, though.
>>> Creating proper encoders/decoders, "sensor-motor" interfaces and natural language interfaces would certainly take machine learning a giant leap forward, and we could start applying NuPIC to other hard problems.
>>>
>>> 73
>>> Mauri AG1LE
>>>
>>> On Wed, Aug 20, 2014 at 8:19 PM, Chris Albertson <[email protected]> wrote:
>>>
>>>> On Wed, Aug 20, 2014 at 2:20 PM, Skeptical Engineer <[email protected]> wrote:
>>>> > I think that NuPIC is a perfect solution for CW interpretation. It would require some front-end work to be done. I’m the one who chatted the questions in to Office Hours about audio coding last week, and I’m looking to start coding some audio front-end stuff, building on the code that already exists for that. I was looking more at audio music and spoken language decoding, but CW could be a good intermediate step. I need to learn more about the types and scope of variation in the code transmissions.
>>>>
>>>> What do you need to know?
>>>>
>>>> First off, this can be very easy if the Morse code is machine generated and there is little noise. You don't need machine learning for the easy case. It can be hard coded in C. It is just a lookup table.
>>>>
>>>> Next up on the scale is real-world off the air decoding of strong signals. This has been done too, using the same technique as above but with much more signal processing up front. Still no "intelligence" involved.
>>>>
>>>> What is needed is human level recognition. The current state of the art is just short of human level performance. One thing that MUST be done to make the effort of practical use is to sort out WHO is sending WHAT. Listen and you hear tones being transmitted, with each sender using a slightly different tone. But it is very important to know that no station is actually sending AUDIO. The tone is an artifact of the RECEIVER and is not present on the airwaves.
>>>> What you hear is the "beat frequency" between the sending and receiving radios. If the tone sounds like it is 600Hz then the two radios are tuned 600Hz apart. The transmitter is sending only an unmodulated carrier. Why bring this up? Because a working CW decoder would likely be listening to a wider bandwidth than audio, and it is "listening" to a power spectrum, not an audio MP3 file.
>>>>
>>>> You will also need to understand some of the conversation, at least enough to pick out the call signs. But we can do this with a simple "regular expression". A simple left recursive grammar is almost overkill for "understanding" these conversations. So it should be easy for CLA; that part can be done.
>>>>
>>>> The final step would be to couple this with "operations". For example, the computer cannot hear a signal, so it turns some knobs (so to speak) on the radio to adjust a filter or whatever. These signals are coming in over the air and are not recorded, so we can "do stuff" like point our antenna at the transmitter or filter out noise. Why say this? Because I think THIS is where something like NuPIC can really shine, when it plays an ACTIVE ROLE. You need this for human level performance.
>>>>
>>>> Some very good real-world example MP3 recordings are at the link below.
>>>> http://www.dxuniversity.com/audio/
>>>> These require human level performance to make sense of. Notice that humans need to be able to keep overlapping conversations logically separate. With speech we can do this because everyone has a unique sounding voice. It is kind of this way with CW too.
>>>>
>>>> Summary.
>>>>
>>>> The easy case is very easy. A beginning first year university computer science student could decode Morse code under ideal conditions.
>>>> In the real-world "pile up" case CW is as hard as trying to understand a dozen overlapping conversations at a cocktail party where everyone is talking at once. It is well past the current state of the art in speech recognition. But I think CW is a good area for research just because of this range of difficulty.
>>>>
>>>> The trap is developing a technique that ONLY works on the easy cases and can't scale up.
>>>>
>>>> >
>>>> > NuPIC is the brains that recognizes patterns, we just need to figure out the right sensory arrangement to see the most useful patterns.
>>>> >
>>>> > rich
>>>> >
>>>> > On Aug 20, 2014, at 11:21, Matthew Taylor <[email protected]> wrote:
>>>> >
>>>> >> Chris,
>>>> >>
>>>> >> Please keep in mind that this is very early stage technology. We are working on the foundations of HTM with NuPIC, and we open-sourced the codebase to get community involvement as soon as it was feasible. True, NuPIC is not a turnkey solution for any problem at this point, but our goals are to share this tech with anyone who wants to work on it, and encourage motivated developers to craft solutions to interesting problems.
>>>> >>
>>>> >> In the future, I imagine a library of community-provided encoders that can be easily plugged into NuPIC. (For other musings about the future of NuPIC, see [1].) But in the meantime, we have a lot of work to get done. If you want to be a part of it, you could join our sprint planning meetings [2] and open office hours [3].
>>>> >>
>>>> >> [1] https://www.youtube.com/watch?v=QPkA6nJifOw
>>>> >> [2] https://www.youtube.com/watch?v=oB71cqyRi9s&list=PL3yXMgtrZmDrtAuw9jJCNbaJmW3nSD3hC
>>>> >> [3] https://www.youtube.com/watch?v=MWBFw4WoZxA&list=PL3yXMgtrZmDqsqo6hytKjhrkfFNEYDqfn
>>>> >> ---------
>>>> >> Matt Taylor
>>>> >> OS Community Flag-Bearer
>>>> >> Numenta
>>>> >>
>>>> >> On Wed, Aug 20, 2014 at 7:59 AM, Chris Albertson <[email protected]> wrote:
>>>> >>> I posted this same exact question here some weeks ago. I've not read your links yet but I will.
>>>> >>>
>>>> >>> My conclusion about NuPIC is about the same as your #1, #2 and #3. That is, you need a large "do it yourself" solution on top of NuPIC, so I wonder what's gained. If you need to write your own encoder, layering and feedback and then extract the results (inverse encoder?), what is gained by using NuPIC over some other NN library? Those "higher order sequences" would be handled in NuPIC by a hierarchy of CLAs that you would have to implement.
>>>> >>>
>>>> >>> I'm thinking now that recognizing CW is a lot like speech recognition. But the up front encoding needs to be some kind of phase locked loop on the "dit period".
>>>> >>>
>>>> >>> I also thought NuPIC would be great for this, just pass in the audio stream.... But I don't think it's up to the job.
>>>> >>>
>>>> >>> On Tue, Aug 19, 2014 at 9:39 PM, Mauri Niininen <[email protected]> wrote:
>>>> >>>> I am looking for some expert advice from NuPIC gurus here.
>>>> >>>>
>>>> >>>> I have been working on the problem of decoding Morse code from noisy, real life signals as received using HF radios.
>>>> >>>> I have implemented several types of signal processing and machine learning algorithms trying to improve accuracy and reduce the decoding character error rate (CER) caused by various factors, such as
>>>> >>>> - poor signal-to-noise ratio
>>>> >>>> - signal fading due to RF propagation
>>>> >>>> - poor rhythm & timing of hand keyed CW
>>>> >>>> - rapid speed changes
>>>> >>>> - signal interference from adjacent frequencies
>>>> >>>>
>>>> >>>> If you are interested in this subject, there are more detailed descriptions of the problems and solutions I have tested so far here:
>>>> >>>> http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html
>>>> >>>> http://ag1le.blogspot.com/2014/06/new-morse-decoder-part-4.html
>>>> >>>> http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-6.html
>>>> >>>> http://ag1le.blogspot.com/2013/01/towards-bayesian-morse-decoder.html
>>>> >>>> http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html
>>>> >>>> http://ag1le.blogspot.com/2012/05/morse-code-decoding-with-self.html
>>>> >>>>
>>>> >>>> My questions are related to NuPIC: how could I start testing whether the CLA algorithm would perform better than the currently used Bayesian algorithm?
>>>> >>>>
>>>> >>>> The challenges I see after studying the NuPIC documentation & example code:
>>>> >>>>
>>>> >>>> 1) How to create an encoder for building a sparse representation from audio signals? (some ideas here: http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html )
>>>> >>>>
>>>> >>>> 2) If you teach NuPIC CLA to recognize the Morse character set as sequences of "mark" / "space" bit patterns, how can you decode apparently random bit patterns from the spatial pooler back to the ASCII character set to be displayed to the user?
>>>> >>>> Do any of the existing classifiers allow users to create their own "codebook"? (See http://ag1le.blogspot.com/2012/05/fldigi-adding-matched-filter-feature-to.html for an example using Kohonen Self Organizing Maps to build a codebook.)
>>>> >>>>
>>>> >>>> 3) Does NuPIC CLA also recognize some common language patterns ("higher order sequences") that are typically used in normal ham radio contacts? Or is there a need to chain multiple CLAs in some sort of hierarchy?
>>>> >>>>
>>>> >>>> regards,
>>>> >>>> Mauri AG1LE
>>>> >>>>
>>>> >>>> _______________________________________________
>>>> >>>> nupic mailing list
>>>> >>>> [email protected]
>>>> >>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> Chris Albertson
>>>> >>> Redondo Beach, California
>>>>
>>>> --
>>>>
>>>> Chris Albertson
>>>> Redondo Beach, California
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
