I did some work to test NuPIC for a Morse decoding application.  I created a
simple Python script to calculate SDRs from a Morse codebook. To visualize
the SDR for each letter and number I plotted the SDRs; however, my results
do not match the CLA whitepaper description.

I used /nupic/examples/sp/hello_sp.py as the base and modified the script
to accommodate Morse code bit vectors. I expected that the SDRs of Morse
symbols that differ by only one bit would resemble each other. This does not
seem to be the case.  The results are here:
http://ag1le.blogspot.com/2014/08/cortical-learning-algorithm-for-morse.html


Am I doing something incorrectly here?
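
For reference, the comparison described above can be sketched without the
spatial pooler by measuring how similar the input bit vectors are in the
first place. This is only a minimal sketch: the encoding (dit = 1 0,
dah = 1 1 1 0, zero padded to a fixed width), the six-symbol excerpt and the
helper names are assumptions for illustration, not the encoding used in the
blog post.

```python
from itertools import combinations

# Small excerpt of the Morse codebook ("." = dit, "-" = dah).
MORSE = {"E": ".", "T": "-", "A": ".-", "N": "-.", "I": "..", "M": "--"}

WIDTH = 8  # assumed fixed vector width, enough for two elements


def to_bits(code, width=WIDTH):
    """Encode a Morse symbol as bits: dit -> 1 0, dah -> 1 1 1 0, zero padded."""
    bits = []
    for element in code:
        bits.extend([1, 0] if element == "." else [1, 1, 1, 0])
    return bits + [0] * (width - len(bits))


def hamming(a, b):
    """Number of positions where the two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def overlap(a, b):
    """Number of positions where both bit vectors are 1."""
    return sum(x & y for x, y in zip(a, b))


# List symbol pairs from most to least similar input vectors.
pairs = sorted(
    combinations(MORSE, 2),
    key=lambda p: hamming(to_bits(MORSE[p[0]]), to_bits(MORSE[p[1]])),
)
for first, second in pairs:
    a, b = to_bits(MORSE[first]), to_bits(MORSE[second])
    print(first, second, "hamming:", hamming(a, b), "overlap:", overlap(a, b))
```

The same overlap() function can be applied to the SDRs produced for each
symbol, so that input similarity and SDR similarity can be compared pair by
pair.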

regards
Mauri


>
> On Fri, Aug 22, 2014 at 2:21 AM, Mauri Niininen <[email protected]>
> wrote:
>
>> Francisco (or anybody)
>>
>> Do you have any Python code examples of how to use the cortical.io encoder?
>>
>> I am trying to figure out how to build a Morse codebook using NuPIC -
>> i.e. matching learned sequences to corresponding characters.  I did a similar
>> exercise earlier using Kohonen's Self Organizing Maps (SOM): I ran
>> some 40,000 examples of noisy Morse code through the SOM, which learned them
>> and built an internal representation. The difficulty was to build a
>> classifier that converted the internal representation of the ".-" sequence
>> to the letter "A".
>>
>> It appears that NuPIC indeed quickly learns sequences like ".-"   ("dit"
>> "dah"  =  letter "A") but how do I convert the prediction to a label "A"
>> when NuPIC sees this or a similar pattern?
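
One way to picture the missing "pattern to label" step is a codebook that
stores one reference SDR per character and picks the entry with the largest
overlap. This is a minimal sketch under stated assumptions: the SDRs are
made-up sets of active bit indices, and best_label() is a hypothetical helper,
not part of the NuPIC API.

```python
def best_label(active, codebook, min_overlap=1):
    """Return the label whose stored SDR shares the most active bits.

    `active` is a set of active bit indices; `codebook` maps a label to the
    set of indices stored for it. Returns None when nothing overlaps enough.
    """
    best, best_score = None, 0
    for label, stored in codebook.items():
        score = len(active & stored)
        if score > best_score:
            best, best_score = label, score
    return best if best_score >= min_overlap else None


# Made-up reference SDRs, five active bits each (illustration only).
codebook = {
    "A": frozenset({3, 17, 42, 88, 101}),
    "N": frozenset({5, 17, 60, 88, 120}),
    "E": frozenset({1, 9, 33, 70, 99}),
}

# A noisy version of the "A" pattern: one active bit has moved.
noisy_a = {3, 17, 42, 90, 101}
print(best_label(noisy_a, codebook))
```

Because the match is by overlap rather than equality, a pattern that is
"similar" to a stored one still maps to the same label.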
>>
>>
>> Mauri
>>
>>
>>
>>
>>
>>
>> On Thu, Aug 21, 2014 at 8:05 AM, Francisco De Sousa Webber <
>> [email protected]> wrote:
>>
>>> Chris,
>>> I am not knowledgeable in the area but the limits you describe seem to
>>> be the same limits associated with (human) speech recognition. The way the
>>> brain tackles the signal to noise ratio is by actually understanding what a
>>> certain message is about. Based on this understanding the brain makes a
>>> couple of guesses what a word could mean, coming up with some candidate
>>> words from an incomplete perception. Then the semantically most probable
>>> word is chosen. A possible approach could therefore be to convert the Morse
>>> stream into word-SDRs (using the cortical.io encoder) and to match the
>>> candidate words semantically.
>>>
>>> Francisco
>>>
>>>
>>>
>>> On 21.08.2014, at 03:52, Mauri Niininen <[email protected]>
>>> wrote:
>>>
>>> I agree with Chris that the target should be human level recognition.
>>> However, defining that level of performance in engineering terms is not
>>> straightforward.  Humans also have some limitations in copying CW - some of
>>> these limits have been well documented in the literature.
>>>
>>> 1) *Speed*  -  most humans cannot copy Morse code faster than 40 - 60
>>> WPM (words per minute) - roughly equal to 200 - 300 characters per minute.
>>> There are some competitions where the focus is on speed - a Google search
>>> found the Guinness world record: "On 6 May 2003 Andrei Bindasov (Belarus)
>>> successfully transmitted 216 Morse code marks of mixed text in one minute.
>>> The attempt was held as part of the International Amateur Radio Union's 5th
>>> World Championship in High Speed Telegraphy in Belarus."  Also,  check
>>> http://www.rufzxp.net/  - it states that Goran Hajoševic, YT7AW,
>>> cracked the 1000 CPM (characters per minute) speed limit in copying call signs
>>> correctly. NuPIC with fast enough hardware might scale beyond human
>>> capabilities.
>>>
>>> 2) *Speed vs. Signal-to-Noise Ratio (SNR)*  - as the SNR decreases it
>>> becomes increasingly harder to copy CW correctly. You can improve the
>>> situation by using audio bandpass filtering but as the speed of Morse
>>> increases the signal bandwidth also increases. Very narrow audio filters
>>> tend to create "ringing" artifacts - a noise spike starts to sound like
>>> "dit" creating copy errors.   I have done some testing (
>>> http://ag1le.blogspot.com/2013/01/morse-decoder-snr-vs-cer-testing.html)
>>> to plot the character error rate (CER)  vs.  SNR limits using different
>>> techniques.  The human auditory system has excellent adaptive filtering
>>> capabilities but still has some limits. Skilled CW operators can
>>> compensate by slowing down during poor conditions, asking other
>>> stations to repeat ("AGN" - again), turning the antenna or by other means.
>>> NuPIC with "sensor - motor" integration could potentially learn to do this?
>>>
>>> 3) *Speed variability* - very often, especially in CW contests, you can
>>> hear stations sending a general call "CQ DE <call sign>" at a certain speed.
>>> When responding to calling stations they send the acknowledgement and signal
>>> report ("5NN") at a much higher speed. Since Morse code does not have
>>> built-in speed synchronization it is quite difficult to build an adaptive
>>> speed tracking algorithm that is able to decode correctly when the speed jumps
>>> from 20 WPM to 45 WPM between two characters. If NuPIC can learn to
>>> recognize this type of pattern rather than exact "dit"/"dah" timing this
>>> may be an area to gain better decoding accuracy.
>>>
>>> 4) *Rhythm variability* -  hand keyed Morse (HKM) code has a lot of
>>> timing variability in "dits", "dahs" and inter-element and inter-character
>>> pauses. I have done some testing and collected data - see
>>> http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html.
>>>  Humans can handle this variability by "filling in" missed or incorrect
>>> characters from word or dialogue context. This part is very difficult to
>>> handle with algorithms and perhaps this would be the area where NuPIC could
>>> bring most benefits.  The reason is that each HKM operator has a different
>>> "signature" that is easy for humans to recognize and learn after listening
>>> for a while, but it is very difficult to build a generalized algorithm that
>>> can handle all rhythm variants with high accuracy. This was one of the main
>>> reasons to create the Bayesian Morse decoder (
>>> http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html)  - it
>>> works better for HKM cases but is still far from human performance level.
>>> NuPIC could potentially learn the rhythm and adapt like humans do?
>>>
>>> 5) *Fading, flutter, interference* - due to the properties of RF signal
>>> propagation in different frequency bands, received signal amplitude can
>>> change very rapidly. Signals can fade down below noise level and come back
>>> up within a few tens of milliseconds, making it very difficult to accurately
>>> detect what is signal and what is noise. Sometimes you can also have
>>> "flutter" or echoes in signals creating challenges for computer algorithms.
>>> Interference from stations on nearby or overlapping frequencies, especially in
>>> "pile-up" situations when you have 10 - 200 stations within 1-2 kHz
>>> bandwidth calling a rare DX station all at the same time, is very difficult
>>> for computer algorithms to deal with. It is also challenging for humans,
>>> but the best DX operators can "manage" the pile-up by sending hints like "5
>>> UP" or "10 UP", meaning that the operator is listening 5 - 10 kHz above their
>>> own transmission frequency. This causes other operators to "spread out" which
>>> helps in copying Morse code.  NuPIC would need to have "sensor - motor"
>>> integration to be able to do something like this.
>>>
>>> 6) *Doppler shift, poor transmitter or receiver frequency stability* -
>>> you can often hear stations that are drifting in frequency either due to
>>> Doppler effects (such as when working through a fast moving satellite, or
>>> having an Earth-Moon-Earth (EME) contact) or due to TX/RX frequency
>>> instability.  Humans can compensate by turning the VFO - for computers this
>>> requires some algorithm that automatically tracks the wanted signal. Some
>>> of the most advanced Morse decoder software packages have this capability.
>>> NuPIC with proper audio encoders could potentially deal with these issues.
>>>
>>> 7) *Sending errors* - operators frequently make errors when sending
>>> Morse code, especially when they try to send too fast. Humans can still
>>> understand the meaning as they "fill in" missed parts from the context of
>>> the dialogue  - for computer algorithms this is not so easy. NuPIC with
>>> proper training on commonly used ham radio "jargon" or typical contest
>>> exchanges could make a difference and improve decoding accuracy.
>>>
>>> 8) *Ability to copy multiple conversations simultaneously* - for many
>>> people it is hard to follow multiple simultaneous conversations accurately
>>> at the same time. Skilled CW operators can pick up relevant details such as
>>> call signs from a "pile-up" with tens of stations in 1-2 kHz bandwidth.  I
>>> believe this is an area where well trained NuPIC could make a difference if
>>> we can create proper sparse encoding scheme (see
>>> http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html
>>> as an example).  You could also create a multi channel CW decoder,
>>> something similar to what I recently wrote for FLDIGI (see
>>> http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-5.html)
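
The speed figures in item 1 and the timing problem in item 3 can be made
concrete with the common "PARIS" timing convention (one standard word = 50 dit
lengths, five characters per word). A minimal sketch, assuming that
convention; the function names and the fixed two-dit threshold are
illustrative and are not taken from any of the decoders mentioned above.

```python
DIT_UNITS_PER_WORD = 50  # the word "PARIS" including the trailing word space


def dit_seconds(wpm):
    """Length of one dit in seconds at the given speed in words per minute."""
    return 60.0 / (DIT_UNITS_PER_WORD * wpm)


def chars_per_minute(wpm, chars_per_word=5):
    """Convert WPM to characters per minute, assuming five characters per word."""
    return wpm * chars_per_word


def classify_mark(duration, wpm):
    """Call a mark a dit if it is shorter than two dit lengths, else a dah."""
    return "." if duration < 2 * dit_seconds(wpm) else "-"


for wpm in (20, 40, 45, 60):
    print(wpm, "WPM:", chars_per_minute(wpm), "CPM,",
          round(dit_seconds(wpm) * 1000, 1), "ms per dit")

# The same 60 ms mark read with two different speed estimates.
print(classify_mark(0.060, 20), classify_mark(0.060, 45))
```

With these assumptions a 60 ms mark is a dit at 20 WPM but lands on the dah
side of the threshold at 45 WPM, which is why a decoder that keeps a stale
speed estimate across such a jump miscopies the next characters.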
>>>
>>> For each of the cases above we could derive some proper metrics to set
>>> a "performance standard" and start building software that approaches
>>> human performance level. The ability to do all of the above simultaneously
>>> and perfectly requires thousands of hours of learning - the best human CW
>>> operators have spent a lot of time honing their skills in contests and
>>> working DX stations around the world. Listening to the world's best CW
>>> operators is like listening to skilled musicians - they have both excellent
>>> skills and passion for this art form.
>>>
>>> I am not sure NuPIC is able to learn all of the above in its present form. It
>>> would be very cool to get at least parts of it working, though. Creating
>>> proper encoders/decoders, "sensor-motor" interfaces and natural language
>>> interfaces would certainly take machine learning a giant leap forward and
>>> we could start applying NuPIC to other hard problems.
>>>
>>> 73
>>> Mauri AG1LE
>>>
>>>
>>>
>>>
>>> On Wed, Aug 20, 2014 at 8:19 PM, Chris Albertson <
>>> [email protected]> wrote:
>>>
>>>> On Wed, Aug 20, 2014 at 2:20 PM, Skeptical Engineer
>>>> <[email protected]> wrote:
>>>> > I think that NuPIC is a perfect solution for CW interpretation.  It
>>>> would require some front-end work to be done.  I’m the one who chatted the
>>>> questions in to Office Hours about audio coding last week, and I’m looking
>>>> to start coding some audio front-end stuff, building on the code that
>>>> already exists for that.  I was looking more at audio music and spoken
>>>> language decoding, but CW could be a good intermediate step.  I need to
>>>> learn more about the types and scope of variation in the code 
>>>> transmissions.
>>>>
>>>> What do you need to know?
>>>>
>>>> First off, this can be very easy if the Morse code is machine generated
>>>> and there is little noise.   You don't need machine learning for the
>>>> easy case.  It can be hard-coded in C.  It is just a lookup table.
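
The "easy case" lookup table can be sketched in a few lines. Python is used
here instead of the C mentioned above, and the input convention (elements
already detected as "." and "-", letters separated by spaces, words by " / ")
is an assumption made for illustration.

```python
# International Morse code for letters and digits.
MORSE_TO_CHAR = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z", ".----": "1", "..---": "2", "...--": "3",
    "....-": "4", ".....": "5", "-....": "6", "--...": "7", "---..": "8",
    "----.": "9", "-----": "0",
}


def decode(message):
    """Decode clean Morse text; unknown symbols are shown as '?'."""
    words = []
    for word in message.strip().split(" / "):
        words.append("".join(MORSE_TO_CHAR.get(sym, "?") for sym in word.split()))
    return " ".join(words)


print(decode("-.-. --.- / -.. . / .- --. .---- .-.. ."))
```

Everything that makes real decoding hard (noise, timing, fading) happens
before this table is consulted, which is the point being made above.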
>>>>
>>>> Next up on the scale is real-world off-the-air decoding of strong
>>>> signals.  This has been done too, using the same technique as above but
>>>> with much more signal processing up front.  Still no "intelligence"
>>>> involved.
>>>>
>>>> What is needed is human level recognition.  The current state of the
>>>> art is just short of human level performance.   One thing that MUST be
>>>> done to make the effort of practical use is to sort out WHO is sending
>>>> WHAT.   Listen and you hear tones being transmitted with each sender
>>>> using a slightly different tone.   But it is very important to know
>>>> that no station is actually sending AUDIO.  The tone is an artifact of
>>>> the RECEIVER and is not present on the airwaves.   What you hear is
>>>> the "beat frequency" between the sending and receiving radios.  If the
>>>> tone sounds like it is 600 Hz then the two radios are tuned 600 Hz
>>>> apart.  The transmitter is sending only an unmodulated carrier.
>>>> Why bring this up?  Because a working CW decoder would likely be
>>>> listening to a wider bandwidth than audio and it is "listening" to a
>>>> power spectrum, not an audio MP3 file.
>>>>
>>>> You also will need to understand some of the conversation.  At least
>>>> enough to pick out the call signs.  But we can do this with a simple
>>>> "regular expression".  A simple left-recursive grammar is almost
>>>> overkill for "understanding" these conversations.   So it should be easy
>>>> for CLA, that fast can be done
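
As an illustration of the regular expression idea, here is a minimal sketch
that pulls call-sign-like tokens out of already decoded text. The pattern is
a deliberate simplification and an assumption (1-2 letters, one digit, 1-3
letters); real call signs include shapes it does not cover, such as prefixes
that start with a digit or portable suffixes.

```python
import re

# Simplified call sign shape (an assumption, not the full allocation rules):
# 1-2 letters, one digit, 1-3 letters, e.g. AG1LE or YT7AW.
CALL_SIGN = re.compile(r"\b[A-Z]{1,2}[0-9][A-Z]{1,3}\b")


def find_call_signs(decoded_text):
    """Return call-sign-like tokens in order of appearance, without repeats."""
    seen = []
    for match in CALL_SIGN.findall(decoded_text.upper()):
        if match not in seen:
            seen.append(match)
    return seen


print(find_call_signs("CQ CQ DE AG1LE AG1LE K"))
print(find_call_signs("YT7AW DE AG1LE 5NN TU"))
```

Tokens such as "CQ", "DE", "K" and the "5NN" signal report do not match the
pattern, so only the two stations are reported.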
>>>>
>>>> The final step would be to couple this with "operations".   For
>>>> example, the computer can not hear a signal, so it turns some knobs (so
>>>> to speak) on the radio to adjust a filter or whatever.   These signals
>>>> are coming in over the air and are not recorded, so we can "do stuff"
>>>> like point our antenna at the transmitter or filter out noise.    Why
>>>> say this?   Because I think THIS is where something like NuPIC can
>>>> really shine, when it plays an ACTIVE ROLE.   You need this for human
>>>> level performance.
>>>>
>>>> Some very good real-world example MP3 recordings are on the link
>>>> below.
>>>> http://www.dxuniversity.com/audio/
>>>> These require human level performance to make sense of.
>>>> Notice that humans need to be able to keep overlapping conversations
>>>> logically separate.  With speech we can do this because everyone has a
>>>> unique sounding voice.  It is kind of this way with CW too.
>>>>
>>>> Summary.
>>>>
>>>> The easy case is very easy.  A beginning first year university
>>>> computer science student could decode Morse Code under ideal
>>>> conditions.  In the real-world "pile up" case CW is as hard as trying
>>>> to understand a dozen overlapping conversations at a cocktail party
>>>> where everyone is talking at once.   It is well past the current state
>>>> of the art in speech recognition.   But I think CW is a good area for
>>>> research just because of this range of difficulty.
>>>>
>>>> The trap is developing a technique that ONLY works on the easy cases
>>>> and can't scale up.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> >
>>>> > NuPIC is the brains that recognizes patterns, we just need to figure
>>>> out the right sensory arrangement to see the most useful patterns.
>>>> >
>>>> > rich
>>>> >
>>>> > On Aug 20, 2014, at 11:21, Matthew Taylor <[email protected]> wrote:
>>>> >
>>>> >> Chris,
>>>> >>
>>>> >> Please keep in mind that this is very early stage technology. We are
>>>> >> working on the foundations of HTM with NuPIC, and we open-sourced the
>>>> >> codebase to get community involvement as soon as it was feasible.
>>>> >> True, NuPIC is not a turnkey solution for any problem at this point,
>>>> >> but our goals are to share this tech with anyone who wants to work on
>>>> >> it, and encourage motivated developers to craft solutions to
>>>> >> interesting problems.
>>>> >>
>>>> >> In the future, I imagine a library of community-provided encoders that
>>>> >> can be easily plugged into NuPIC. (For other musings about the future
>>>> >> of NuPIC, see [1].) But in the meantime, we have a lot of work to get
>>>> >> done. If you want to be a part of it, you could join our sprint
>>>> >> planning meetings [2] and open office hours [3].
>>>> >>
>>>> >> [1] https://www.youtube.com/watch?v=QPkA6nJifOw
>>>> >> [2]
>>>> https://www.youtube.com/watch?v=oB71cqyRi9s&list=PL3yXMgtrZmDrtAuw9jJCNbaJmW3nSD3hC
>>>> >> [3]
>>>> https://www.youtube.com/watch?v=MWBFw4WoZxA&list=PL3yXMgtrZmDqsqo6hytKjhrkfFNEYDqfn
>>>> >> ---------
>>>> >> Matt Taylor
>>>> >> OS Community Flag-Bearer
>>>> >> Numenta
>>>> >>
>>>> >>
>>>> >> On Wed, Aug 20, 2014 at 7:59 AM, Chris Albertson
>>>> >> <[email protected]> wrote:
>>>> >>> I posted this same exact question here some weeks ago.    I've not
>>>> >>> read your links yet but I will.
>>>> >>>
>>>> >>> My conclusion about NuPIC is about the same as your #1, #2 and #3.
>>>> >>> That is, you need a large "do it yourself" solution on top of NuPIC,
>>>> >>> so I wonder what's gained.  If you need to write your own encoder,
>>>> >>> layering and feedback and then extract the results (inverse encoder?),
>>>> >>> what is gained by using NuPIC over some other NN library?   Those
>>>> >>> "higher order sequences" would be handled in NuPIC by a hierarchy of
>>>> >>> CLAs that you would have to implement.
>>>> >>>
>>>> >>> I'm thinking now that recognizing CW is a lot like speech recognition.
>>>> >>> But the up-front encoding needs to be some kind of phase-locked
>>>> >>> loop on the "dit period".
>>>> >>>
>>>> >>> I also thought NuPIC would be great for this, just pass in the audio
>>>> >>> stream....   But I don't think it's up to the job.
>>>> >>>
>>>> >>> On Tue, Aug 19, 2014 at 9:39 PM, Mauri Niininen
>>>> >>> <[email protected]> wrote:
>>>> >>>> I am looking for some expert advice from NuPIC gurus here.
>>>> >>>>
>>>> >>>> I have been working on the problem of decoding Morse code from noisy,
>>>> >>>> real life signals as received using HF radios. I have implemented
>>>> >>>> several types of signal processing and machine learning algorithms
>>>> >>>> trying to improve accuracy and reduce decoding character error rate
>>>> >>>> (CER) caused by various reasons, such as
>>>> >>>> - poor signal-to-noise ratio
>>>> >>>> - signal fading due to RF propagation
>>>> >>>> - poor rhythm & timing of hand keyed CW
>>>> >>>> - rapid speed changes
>>>> >>>> - signal interference from adjacent frequencies
>>>> >>>>
>>>> >>>> If you are interested in this subject there are more detailed
>>>> >>>> descriptions of the problems and solutions I have tested so far here:
>>>> >>>> http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html
>>>> >>>> http://ag1le.blogspot.com/2014/06/new-morse-decoder-part-4.html
>>>> >>>> http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-6.html
>>>> >>>> http://ag1le.blogspot.com/2013/01/towards-bayesian-morse-decoder.html
>>>> >>>> http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html
>>>> >>>> http://ag1le.blogspot.com/2012/05/morse-code-decoding-with-self.html
>>>> >>>>
>>>> >>>>
>>>> >>>> My questions are related to NuPIC and how I could start testing
>>>> >>>> whether the CLA algorithm would perform better than the currently used
>>>> >>>> Bayesian algorithm.
>>>> >>>>
>>>> >>>> The challenges I see after studying the NuPIC documentation and
>>>> >>>> example code:
>>>> >>>>
>>>> >>>> 1) How to create an encoder for building a sparse representation from
>>>> >>>> audio signals?   (some ideas here:
>>>> >>>> http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html
>>>> >>>> )
>>>> >>>>
>>>> >>>> 2) If you teach the NuPIC CLA to recognize the Morse character set as a
>>>> >>>> sequence of "mark" / "space" bit patterns, how can you decode apparently
>>>> >>>> random bit patterns from the spatial pooler back to the ASCII character
>>>> >>>> set to be displayed to the user?  Do any of the existing classifiers
>>>> >>>> allow users to create their own "codebook"? (See
>>>> >>>> http://ag1le.blogspot.com/2012/05/fldigi-adding-matched-filter-feature-to.html
>>>> >>>> for an example using Kohonen Self Organizing Maps to build a codebook.)
>>>> >>>>
>>>> >>>> 3) Does the NuPIC CLA also recognize some common language patterns
>>>> >>>> ("higher order sequences") that are typically used in normal ham radio
>>>> >>>> contacts?  Or is there a need to chain multiple CLAs in some sort of
>>>> >>>> hierarchy?
>>>> >>>>
>>>> >>>> regards,
>>>> >>>> Mauri  AG1LE
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> _______________________________________________
>>>> >>>> nupic mailing list
>>>> >>>> [email protected]
>>>> >>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>> >>>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> Chris Albertson
>>>> >>> Redondo Beach, California
>>>> >>>
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Chris Albertson
>>>> Redondo Beach, California
>>>>
>>>>
>>>
>>>
>>>
>>
>
