I found this diploma thesis: http://www.dfki.de/web/forschung/publikationen?pubid=5462 - Extending Hierarchical Temporal Memory for Sequence Classification by Klaus Greff. This approach seems to address the sequence classification problem. The source code is here: http://trac.assembla.com/qhtm
On Fri, Aug 22, 2014 at 2:21 AM, Mauri Niininen <[email protected]> wrote:

> Francisco (or anybody)
>
> Do you have any Python code examples showing how to use the cortical.io encoder?
>
> I am trying to figure out how to build a Morse codebook using NuPIC - i.e. matching learned sequences to corresponding characters. I did a similar exercise earlier using Kohonen's Self-Organizing Maps (SOM): I ran some 40,000 examples of noisy Morse code through a SOM, which learned and built an internal representation. The difficulty was building a classifier that converted the internal representation of the ".-" sequence to the letter "A".
>
> It appears that NuPIC indeed quickly learns sequences like ".-" ("dit" "dah" = letter "A"), but how do I convert the prediction to the label "A" when NuPIC sees this or a similar pattern?
>
> Mauri
>
> On Thu, Aug 21, 2014 at 8:05 AM, Francisco De Sousa Webber <[email protected]> wrote:
>
>> Chris,
>> I am not knowledgeable in the area, but the limits you describe seem to be the same limits associated with (human) speech recognition. The way the brain tackles the signal-to-noise ratio is by actually understanding what a certain message is about. Based on this understanding the brain makes a couple of guesses about what a word could be, coming up with some candidate words from an incomplete perception. Then the semantically most probable word is chosen. A possible approach could therefore be to convert the Morse stream into word-SDRs (using the cortical.io encoder) and to match the candidate words semantically.
>>
>> Francisco
>>
>> On 21.08.2014, at 03:52, Mauri Niininen <[email protected]> wrote:
>>
>> I agree with Chris that the target should be human-level recognition. However, defining that level of performance in engineering terms is not easy. Humans also have some limitations in copying CW - some of these limits are well documented in the literature.
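Mauri's question - how to turn the learned bit pattern for ".-" back into the label "A" - is essentially a codebook problem. One simple scheme that needs no NuPIC internals is overlap voting: each active bit votes for the labels it co-occurred with during training. A minimal pure-Python sketch (the bit indices are invented for illustration, not real spatial pooler output):

```python
# Sketch of a codebook classifier: map learned bit patterns back to labels
# by overlap voting. Each active bit accumulates votes for the labels it was
# seen with; classification picks the label with the most votes.
from collections import defaultdict

class OverlapCodebook:
    def __init__(self):
        # bit index -> {label: vote count}
        self.votes = defaultdict(lambda: defaultdict(int))

    def learn(self, active_bits, label):
        for bit in active_bits:
            self.votes[bit][label] += 1

    def classify(self, active_bits):
        totals = defaultdict(int)
        for bit in active_bits:
            for label, count in self.votes[bit].items():
                totals[label] += count
        return max(totals, key=totals.get) if totals else None

codebook = OverlapCodebook()
codebook.learn({3, 17, 42, 99}, "A")   # pattern seen while ".-" was taught
codebook.learn({5, 17, 63, 80}, "N")   # pattern seen while "-." was taught
print(codebook.classify({3, 42, 99, 12}))  # noisy version of the "A" pattern -> A
```

Because voting tolerates missing and spurious bits, a noisy variant of a stored pattern still maps to the right label - the same robustness property SDR overlap gives you.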
>> 1) *Speed* - most humans cannot copy Morse code faster than 40-60 WPM (words per minute), roughly equal to 200-300 characters per minute. There are some competitions where the focus is on speed - a Google search found the Guinness world record: "On 6 May 2003 Andrei Bindasov (Belarus) successfully transmitted 216 Morse code marks of mixed text in one minute. The attempt was held as part of the International Amateur Radio Union's 5th World Championship in High Speed Telegraphy in Belarus." Also, check http://www.rufzxp.net/ - it states that Goran Hajoševic, YT7AW, cracked the 1000 CPM (characters per minute) speed limit in copying call signs correctly. NuPIC with fast enough hardware might scale beyond human capabilities.
>>
>> 2) *Speed vs. Signal-to-Noise Ratio (SNR)* - as the SNR decreases it becomes increasingly harder to copy CW correctly. You can improve the situation with audio bandpass filtering, but as the speed of the Morse increases, the signal bandwidth also increases. Very narrow audio filters tend to create "ringing" artifacts - a noise spike starts to sound like a "dit", creating copy errors. I have done some testing (http://ag1le.blogspot.com/2013/01/morse-decoder-snr-vs-cer-testing.html) to plot the character error rate (CER) vs. SNR limits using different techniques. The human auditory system has excellent adaptive filtering capabilities, but it still has limits. Skilled CW operators can compensate by slowing down during poor conditions, asking other stations to repeat ("AGN" - again), turning the antenna, or by other means. NuPIC with "sensor-motor" integration could potentially learn to do this?
>>
>> 3) *Speed variability* - very often, especially in CW contests, you can hear stations sending the general call "CQ DE <call sign>" at a certain speed. When responding to calling stations they send the acknowledgement and signal report ("5NN") at a much higher speed.
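The speed figures in item 1 can be cross-checked with arithmetic, assuming the common PARIS timing convention (one "word" = 50 dit units, counted as 5 characters, dit length = 1200/WPM milliseconds):

```python
# Back-of-the-envelope check of the speed figures above, under the usual
# PARIS timing convention: dit length in ms is 1200 / WPM, and one word is
# counted as 5 characters, so CPM = 5 * WPM.
def dit_ms(wpm):
    return 1200.0 / wpm

def cpm(wpm):
    return 5 * wpm

for wpm in (20, 40, 60):
    print(f"{wpm} WPM: dit = {dit_ms(wpm):.0f} ms, about {cpm(wpm)} CPM")
# 40-60 WPM -> 200-300 CPM, matching the figures quoted above; at 60 WPM a
# dit is only 20 ms long, which shows why audio filtering gets hard at speed.
```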
>> Since Morse code has no built-in speed synchronization, it is quite difficult to build an adaptive speed-tracking algorithm that can decode correctly when the speed jumps from 20 WPM to 45 WPM between two characters. If NuPIC can learn to recognize this type of pattern rather than relying on exact "dit"/"dah" timing, this may be an area where it gains better decoding accuracy.
>>
>> 4) *Rhythm variability* - hand-keyed Morse (HKM) code has a lot of timing variability in the "dits", "dahs" and the inter-element and inter-character pauses. I have done some testing and collected data - see http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html. Humans handle this variability by "filling in" missed or incorrect characters from word or dialogue context. This part is very difficult to handle with algorithms, and perhaps this is the area where NuPIC could bring the most benefit. The reason is that each HKM operator has a different "signature" that is easy for humans to recognize and learn after listening for a while, but it is very difficult to build a generalized algorithm that handles all rhythm variants with high accuracy. This was one of the main reasons to create the Bayesian Morse decoder (http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html) - it works better for HKM cases but is still far from human performance. NuPIC could potentially learn the rhythm and adapt like humans do?
>>
>> 5) *Fading, flutter, interference* - due to the properties of RF signal propagation in different frequency bands, the received signal amplitude can change very rapidly. Signals can fade below the noise level and come back up within a few tens of milliseconds, making it very difficult to accurately distinguish signal from noise. Sometimes you also get "flutter" or echoes in signals, creating challenges for computer algorithms.
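One way to prototype the adaptive speed tracking from item 3 is to cluster recent mark ("key down") durations into two groups - dits and dahs - and re-estimate over a sliding window so the estimate follows speed jumps. A minimal sketch with a tiny 1-D 2-means on synthetic durations (this is just one possible approach, not the Bayesian decoder discussed above):

```python
# Cluster recent mark durations (milliseconds) into dit and dah centers with
# a 1-D two-means; a sliding window over the most recent marks lets the
# estimate adapt when the sender's speed jumps.
def dit_dah_centers(durations, iters=10):
    lo, hi = min(durations), max(durations)
    for _ in range(iters):
        dits = [d for d in durations if abs(d - lo) <= abs(d - hi)]
        dahs = [d for d in durations if abs(d - lo) > abs(d - hi)]
        if dits: lo = sum(dits) / len(dits)
        if dahs: hi = sum(dahs) / len(dahs)
    return lo, hi

# 20 WPM marks (dit ~60 ms, dah ~180 ms), then a jump to 45 WPM (~27/80 ms):
stream = [62, 178, 59, 181, 60, 177] * 3 + [26, 81, 28, 79, 27, 80] * 3
window = stream[-12:]                  # only the most recent marks
dit, dah = dit_dah_centers(window)
print(f"estimated dit ~{dit:.0f} ms, dah ~{dah:.0f} ms")  # tracks the new speed
```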
>> Interference from stations on nearby or overlapping frequencies - especially in "pile-up" situations where 10-200 stations within a 1-2 kHz bandwidth call a rare DX station all at the same time - is very difficult for computer algorithms to deal with. It is also challenging for humans, but the best DX operators can "manage" the pile-up by sending hints like "5 UP" or "10 UP", meaning that they are listening 5-10 kHz above their own transmission frequency. This causes the other operators to "spread out", which helps in copying the Morse code. NuPIC would need "sensor-motor" integration to be able to do something like this.
>>
>> 6) *Doppler shift, poor transmitter or receiver frequency stability* - you can often hear stations that are drifting in frequency, either due to Doppler effects (such as when working through a fast-moving satellite, or during an Earth-Moon-Earth (EME) contact) or due to TX/RX frequency instability. Humans compensate by turning the VFO - for computers this requires an algorithm that automatically tracks the wanted signal. Some of the most advanced Morse decoder software packages have this capability. NuPIC with proper audio encoders could potentially deal with these issues.
>>
>> 7) *Sending errors* - operators frequently make errors when sending Morse code, especially when they try to send too fast. Humans can still understand the meaning, as they "fill in" the missed parts from the context of the dialogue - for computer algorithms this is not so easy. NuPIC with proper training on commonly used ham radio "jargon" or typical contest exchanges could make a difference and improve decoding accuracy.
>>
>> 8) *Ability to copy multiple conversations simultaneously* - for many people it is hard to follow multiple simultaneous conversations accurately at the same time.
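The automatic frequency tracking mentioned in item 6 can be sketched with the Goertzel algorithm: measure power on a grid of candidate tone frequencies and follow the peak as the station drifts. A toy pure-Python illustration on a synthetic tone (not production DSP; real signals would need windowing and smoothing):

```python
# Crude tone tracker: Goertzel power on a grid of candidate frequencies,
# pick the strongest. Re-running this on successive audio blocks would
# follow a slowly drifting CW carrier.
import math

def goertzel_power(samples, sample_rate, freq):
    w = 2.0 * math.pi * freq / sample_rate
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s2 * s2 + s1 * s1 - coeff * s1 * s2

def track_tone(samples, sample_rate, lo=400, hi=900, step=10):
    cands = range(lo, hi + 1, step)
    return max(cands, key=lambda f: goertzel_power(samples, sample_rate, f))

fs = 8000
tone = [math.sin(2 * math.pi * 640 * n / fs) for n in range(800)]
print(track_tone(tone, fs))  # -> 640
```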
>> Skilled CW operators can pick out relevant details such as call signs from a "pile-up" with tens of stations in a 1-2 kHz bandwidth. I believe this is an area where a well-trained NuPIC could make a difference if we can create a proper sparse encoding scheme (see http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html for an example). You could also create a multi-channel CW decoder, similar to what I recently wrote for FLDIGI (see http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-5.html).
>>
>> For each of the cases above we could derive proper metrics to set a "performance standard" and start building software that approaches human performance. The ability to do all of the above simultaneously and perfectly requires thousands of hours of learning - the best human CW operators have spent a lot of time honing their skills in contests and working DX stations around the world. Listening to the world's best CW operators is like listening to skilled musicians - they have both excellent skill and passion for this art form.
>>
>> I am not sure NuPIC is able to learn all of the above in its present form. It would be very cool to get at least parts of it working, though. Creating proper encoders/decoders, "sensor-motor" interfaces and natural language interfaces would certainly take machine learning a giant leap forward, and we could start applying NuPIC to other hard problems.
>>
>> 73
>> Mauri AG1LE
>>
>> On Wed, Aug 20, 2014 at 8:19 PM, Chris Albertson <[email protected]> wrote:
>>
>>> On Wed, Aug 20, 2014 at 2:20 PM, Skeptical Engineer <[email protected]> wrote:
>>> > I think that NuPIC is a perfect solution for CW interpretation. It would require some front-end work to be done.
>>> I'm the one who chatted the questions into Office Hours about audio coding last week, and I'm looking to start coding some audio front-end stuff, building on the code that already exists for that. I was looking more at audio music and spoken language decoding, but CW could be a good intermediate step. I need to learn more about the types and scope of variation in the code transmissions.
>>>
>>> What do you need to know?
>>>
>>> First off, this can be very easy if the Morse code is machine-generated and there is little noise. You don't need machine learning for the easy case; it can be hard-coded in C. It is just a lookup table.
>>>
>>> Next up on the scale is real-world, off-the-air decoding of strong signals. This has been done too, using the same technique as above but with much more signal processing up front. Still no "intelligence" involved.
>>>
>>> What is needed is human-level recognition. The current state of the art is just short of human-level performance. One thing that MUST be done to make the effort of practical use is to sort out WHO is sending WHAT. Listen and you hear tones being transmitted, with each sender using a slightly different tone. But it is very important to know that no station is actually sending AUDIO. The tone is an artifact of the RECEIVER and is not present on the airwaves. What you hear is the "beat frequency" between the sending and receiving radios. If the tone sounds like it is 600 Hz, then the two radios are tuned 600 Hz apart. The transmitter is sending only an unmodulated carrier. Why bring this up? Because a working CW decoder would likely be listening to a wider bandwidth than audio, and it is "listening" to a power spectrum, not an audio MP3 file.
>>>
>>> You also will need to understand some of the conversation - at least enough to pick out the call signs.
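Chris's "easy case" really is just a lookup table; here is a minimal Python version of it (letters only - digits and prosigns omitted for brevity):

```python
# Morse lookup-table decoder for the ideal, noise-free case: characters are
# separated by spaces, words by " / ". Unknown codes decode to "?".
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}

def decode(message):
    words = message.split(" / ")
    return " ".join("".join(MORSE.get(c, "?") for c in w.split()) for w in words)

print(decode("-.-. --.- / -.. ."))  # -> CQ DE
```

Everything hard about the problem - segmenting noisy audio into those dots and dashes in the first place - happens before this table is ever consulted, which is exactly Chris's point.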
>>> But we can do this with a simple "regular expression". A simple left-recursive grammar is almost overkill for "understanding" these conversations, so it should be easy for the CLA; that part can be done fast.
>>>
>>> The final step would be to couple this with "operations". For example, the computer cannot hear a signal, so it turns some knobs (so to speak) on the radio to adjust a filter or whatever. These signals are coming in over the air and are not recorded, so we can "do stuff" like point our antenna at the transmitter or filter out noise. Why say this? Because I think THIS is where something like NuPIC can really shine: when it plays an ACTIVE ROLE. You need this for human-level performance.
>>>
>>> Some very good real-world example MP3 recordings are at the link below.
>>> http://www.dxuniversity.com/audio/
>>> These require human-level performance to make sense of. Notice that humans need to be able to keep overlapping conversations logically separate. With speech we can do this because everyone has a unique-sounding voice. It is kind of this way with CW too.
>>>
>>> Summary.
>>>
>>> The easy case is very easy. A beginning first-year university computer science student could decode Morse code under ideal conditions. In the real-world "pile-up" case, CW is as hard as trying to understand a dozen overlapping conversations at a cocktail party where everyone is talking at once. It is well past the current state of the art in speech recognition. But I think CW is a good area for research just because of this range of difficulty.
>>>
>>> The trap is developing a technique that ONLY works on the easy cases and can't scale up.
>>>
>>> > NuPIC is the brains that recognizes patterns; we just need to figure out the right sensory arrangement to see the most useful patterns.
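Chris's "simple regular expression" for picking out call signs might look like the following sketch. The pattern is a rough approximation of the common prefix-digit-suffix shape of amateur call signs (real allocations have many exceptions), but it works well enough on a typical exchange:

```python
# Rough call-sign extractor: optional leading digit, 1-2 letter prefix, a
# digit group, then a 1-4 letter suffix. An approximation, not a complete
# grammar of ITU call-sign allocations.
import re

CALLSIGN = re.compile(r"\b\d?[A-Z]{1,2}\d{1,4}[A-Z]{1,4}\b")

text = "CQ CQ DE AG1LE AG1LE K  W1AW DE YT7AW 5NN TU"
print(CALLSIGN.findall(text))  # -> ['AG1LE', 'AG1LE', 'W1AW', 'YT7AW']
```

Note that procedural words like "CQ", "DE", "K" and the "5NN" report are correctly rejected because they lack the letter-digit-letter structure.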
>>> >
>>> > rich
>>> >
>>> > On Aug 20, 2014, at 11:21, Matthew Taylor <[email protected]> wrote:
>>> >
>>> >> Chris,
>>> >>
>>> >> Please keep in mind that this is very early-stage technology. We are working on the foundations of HTM with NuPIC, and we open-sourced the codebase to get community involvement as soon as it was feasible. True, NuPIC is not a turnkey solution for any problem at this point, but our goals are to share this tech with anyone who wants to work on it, and to encourage motivated developers to craft solutions to interesting problems.
>>> >>
>>> >> In the future, I imagine a library of community-provided encoders that can be easily plugged into NuPIC. (For other musings about the future of NuPIC, see [1].) But in the meantime, we have a lot of work to get done. If you want to be a part of it, you could join our sprint planning meetings [2] and open office hours [3].
>>> >>
>>> >> [1] https://www.youtube.com/watch?v=QPkA6nJifOw
>>> >> [2] https://www.youtube.com/watch?v=oB71cqyRi9s&list=PL3yXMgtrZmDrtAuw9jJCNbaJmW3nSD3hC
>>> >> [3] https://www.youtube.com/watch?v=MWBFw4WoZxA&list=PL3yXMgtrZmDqsqo6hytKjhrkfFNEYDqfn
>>> >> ---------
>>> >> Matt Taylor
>>> >> OS Community Flag-Bearer
>>> >> Numenta
>>> >>
>>> >> On Wed, Aug 20, 2014 at 7:59 AM, Chris Albertson <[email protected]> wrote:
>>> >>> I posted this same exact question here some weeks ago. I've not read your links yet, but I will.
>>> >>>
>>> >>> My conclusion about NuPIC is about the same as your #1, #2 and #3: you need a large "do it yourself" solution on top of NuPIC, so I wonder what's gained. If you need to write your own encoder, layering and feedback, and then extract the results (inverse encoder?), what is gained by using NuPIC over some other NN library?
>>> >>> Those "higher-order sequences" would be handled in NuPIC by a hierarchy of CLAs that you would have to implement.
>>> >>>
>>> >>> I'm thinking now that recognizing CW is a lot like speech recognition, but the up-front encoding needs to be some kind of phase-locked loop on the "dit period".
>>> >>>
>>> >>> I also thought NuPIC would be great for this - just pass in the audio stream... But I don't think it's up to the job.
>>> >>>
>>> >>> On Tue, Aug 19, 2014 at 9:39 PM, Mauri Niininen <[email protected]> wrote:
>>> >>>> I am looking for some expert advice from NuPIC gurus here.
>>> >>>>
>>> >>>> I have been working on the problem of decoding Morse code from noisy, real-life signals as received on HF radios. I have implemented several types of signal processing and machine learning algorithms trying to improve accuracy and reduce the decoding character error rate (CER) caused by various factors, such as:
>>> >>>> - poor signal-to-noise ratio
>>> >>>> - signal fading due to RF propagation
>>> >>>> - poor rhythm & timing of hand-keyed CW
>>> >>>> - rapid speed changes
>>> >>>> - signal interference from adjacent frequencies
>>> >>>>
>>> >>>> If you are interested in this subject, there are more detailed descriptions of the problems and the solutions I have tested so far here:
>>> >>>> http://ag1le.blogspot.com/2013/09/new-morse-decoder-part-1.html
>>> >>>> http://ag1le.blogspot.com/2014/06/new-morse-decoder-part-4.html
>>> >>>> http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-6.html
>>> >>>> http://ag1le.blogspot.com/2013/01/towards-bayesian-morse-decoder.html
>>> >>>> http://ag1le.blogspot.com/2013/02/probabilistic-neural-network-classifier.html
>>> >>>> http://ag1le.blogspot.com/2012/05/morse-code-decoding-with-self.html
>>> >>>>
>>> >>>> My questions are related to NuPIC: how could I start testing whether the CLA
>>> >>>> algorithm would perform better than the currently used Bayesian algorithm?
>>> >>>>
>>> >>>> The challenges I see after studying the NuPIC documentation & example code:
>>> >>>>
>>> >>>> 1) How do I create an encoder that builds a sparse representation from audio signals? (Some ideas here: http://ag1le.blogspot.com/2014/05/sparse-representations-of-noisy-morse.html)
>>> >>>>
>>> >>>> 2) If you teach the NuPIC CLA to recognize the Morse character set as sequences of "mark"/"space" bit patterns, how can you decode the apparently random bit patterns from the spatial pooler back to the ASCII character set to be displayed to the user? Do any of the existing classifiers allow users to create their own "codebook"? (See http://ag1le.blogspot.com/2012/05/fldigi-adding-matched-filter-feature-to.html for an example using Kohonen Self-Organizing Maps to build a codebook.)
>>> >>>>
>>> >>>> 3) Does the NuPIC CLA also recognize some common language patterns ("higher-order sequences") that are typically used in normal ham radio contacts? Or is there a need to chain multiple CLAs in some sort of hierarchy?
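For question 1, one very simple starting point (a sketch, not an existing NuPIC encoder) is to expand each character's dit/dah sequence into the classic timing bit pattern - dit = "1", dah = "111", one "0" between elements - which is already a binary "mark/space" vector that a spatial pooler could consume:

```python
# Minimal mark/space encoder sketch: turn a dit/dah string into a fixed-width
# binary timing pattern. In a real pipeline these bits would come from a
# thresholded audio envelope rather than from known symbols.
def morse_bits(code, width=24):
    bits = "0".join("1" if sym == "." else "111" for sym in code)
    return bits.ljust(width, "0")  # pad each character slot to a fixed width

print(morse_bits(".-"))    # letter A -> 101110000000000000000000
print(morse_bits("-..."))  # letter B
```

The fixed width keeps every character's encoding the same length, which is convenient for feeding a spatial pooler; it does assume the speed-tracking problem from earlier in the thread has already been solved so that durations can be quantized into dits and dahs.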
>>> >>>> regards,
>>> >>>> Mauri AG1LE
>>> >>>
>>> >>> --
>>> >>> Chris Albertson
>>> >>> Redondo Beach, California
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
