Hi Steve,

What you are trying to do is interesting, even if it's a bit extreme; but I do agree that you may well be on to another application area.
I mentioned phonetics yesterday, but perhaps I should have said diphones instead -- a diphone being the transition from one sound to the next. I once read that they produce the most natural synthesised voice, but that may not be the most recent information. Still, it makes a lot of sense as a path for voice generation, and quite possibly also for voice recognition. I don't know if the latter has been researched.

I suppose that there would have to be a sort of lexeme-ish state machine that is in a non-deterministic state while a diphone forms, when it cannot yet decide which way the sound will go, but that at some point can decide (or gain certainty about) which phoneme the voice is transitioning to. The transitions between phonemes, so the set of possible diphones, are likely to be limited by physical constraints -- we can't twist our tongues into every knot that theory would allow. This would give rise to a highly compact representation, even beyond the mere phonemes or their Huffman-compressed forms.

Another angle could be an attempt to reproduce the shape of the mouth, the position of the tongue, nasality and so on, deriving that information from each of the Codec2 packets. This would mean that you are not limited to an alphabet of phonemes, nor would you need to see the context around any single frame; that is an advantage when packets are lost. Modelling only the vocal tract shape would lose the specific sound of the person speaking, but that is a sacrifice that you are making to get more compression.

Quite a lot of new work, I fear. Probably PhD-thesis size, or otherwise a project that needs more people to help out? It seems very useful, and I've been thinking along similar lines. It's the kind of thing to publish about early & often if you want to get others involved, and to avoid the technological progress-freeze and adoption-restraint that is usually the result of patents in this field -- just look at G.722, which is finally being adopted now that its patents have expired.
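
Coming back to the diphone state machine for a moment: here is a rough Python sketch of what I have in mind, purely to illustrate the idea. Everything in it is made up for the example: the tiny ALLOWED transition table, the DiphoneTracker class, the per-frame likelihood dictionaries and the decision margin are all assumptions of mine, not anything taken from Codec2 or from real phonetic data.

```python
import math

# Hypothetical table of physically plausible phoneme transitions.
# Real data would come from phonetics, not from this toy example.
ALLOWED = {
    "s": {"a", "t", "i"},
    "a": {"s", "t"},
    "t": {"a", "i"},
    "i": {"s", "t"},
}


class DiphoneTracker:
    """Stays undecided while a diphone forms, then commits to a target."""

    def __init__(self, start, margin=2.0):
        self.current = start
        self.margin = margin  # required log-likelihood lead (assumed value)
        self._reset()

    def _reset(self):
        # Only phonemes reachable from the current one are candidates.
        self.scores = {p: 0.0 for p in ALLOWED[self.current]}

    def feed(self, frame_likelihoods):
        """Add one frame of evidence; return (from, to) once decided, else None."""
        for p in self.scores:
            self.scores[p] += math.log(frame_likelihoods.get(p, 1e-6))
        ranked = sorted(self.scores.values(), reverse=True)
        if len(ranked) > 1 and ranked[0] - ranked[1] < self.margin:
            return None  # still non-deterministic
        target = max(self.scores, key=self.scores.get)
        diphone = (self.current, target)
        self.current = target
        self._reset()
        return diphone


def bits_for_next(current):
    """Bits needed to name the next phoneme when only allowed ones can follow."""
    return math.log2(len(ALLOWED[current]))
```

The tracker returns None for as long as the two best candidates are within the margin of each other, and commits to a diphone once one of them pulls ahead. bits_for_next() hints at where the extra compression would come from: the fewer transitions are physically possible from a given phoneme, the fewer bits are needed to say which one was taken.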
Cheers,
 -Rick