On 03/10/2017 09:51 PM, David Rowe wrote:
> Hi Jeroen,
>
> Wow that's great work - thank you so much. So wonderful for me to see
> someone diving into the code and making things happen at this level.
> Also the results are pretty good - that guy vk5dgr sounded the worst :-)
>
> Here is the same source processed by Codec 2 1300:
>
> http://rowetel.com/downloads/codec2/all.1300.wav
>
> What does everyone else on the list think of the two modes compared to
> each other? There is an obvious level difference - an artifact of the
> post filtering. When listening, compare one sentence at a time, e.g.
> the first 3 seconds of both, then the 2nd 3 seconds etc. Think about
> which sample you prefer.
>
> Couple of questions/comments:
>
> 1/ How did you train the VQ - using the Octave code?

No, it seems I am very good at completely messing up Octave code...

I used a small C program that loads a set of vectors (e.g. the error
output of the 2nd stage) and searches for the vector that gives the best
result when applied to all the other vectors (measured as the number of
vectors that improve significantly). The best one is added to the
codebook, and all vectors that were improved enough are removed from the
set. Then the next vector is found by searching the remaining set in the
same way. A few thresholds are taken into account to make sure the
improvement is big enough. (A very small vector would improve almost the
whole set, but would not add anything useful.)
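A rough sketch of that greedy search in C, for illustration only. It is
not the actual program: the function names, the dimension K = 4, the use
of squared error as the measure, and the two thresholds (min_gain and
min_count) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define K 4   /* vector dimension; assumed small here for illustration */

/* Squared norm of (a - b); pass b = NULL to get the squared norm of a. */
static float sq_err(const float *a, const float *b)
{
    float e = 0.0f;
    for (int k = 0; k < K; k++) {
        float d = b ? a[k] - b[k] : a[k];
        e += d * d;
    }
    return e;
}

/* Nonzero if subtracting candidate c from v lowers the squared error of v
 * by at least min_gain (hypothetical "significant improvement" test). */
static int improved(const float *v, const float *c, float min_gain)
{
    return sq_err(v, NULL) - sq_err(v, c) >= min_gain;
}

/* Greedy codebook search over n training vectors.
 * active: caller-provided scratch array of n flags.
 * Stops when the codebook is full or when no candidate improves at least
 * min_count remaining vectors (and at least one).
 * Returns the number of codebook entries written to cb. */
size_t greedy_train(const float train[][K], size_t n, unsigned char *active,
                    float cb[][K], size_t max_entries,
                    float min_gain, size_t min_count)
{
    assert(train && active && cb);
    size_t entries = 0;

    for (size_t i = 0; i < n; i++)
        active[i] = 1;

    while (entries < max_entries) {
        size_t best = n, best_count = 0;

        /* Try every remaining vector as a candidate entry and count how
         * many remaining vectors it improves significantly. */
        for (size_t c = 0; c < n; c++) {
            if (!active[c])
                continue;
            size_t count = 0;
            for (size_t v = 0; v < n; v++)
                if (active[v] && improved(train[v], train[c], min_gain))
                    count++;
            if (count > best_count) {
                best_count = count;
                best = c;
            }
        }
        if (best == n || best_count < min_count)
            break;   /* nothing left that helps enough */

        for (int k = 0; k < K; k++)
            cb[entries][k] = train[best][k];

        /* Remove every vector the new entry improved enough. */
        for (size_t v = 0; v < n; v++)
            if (active[v] && improved(train[v], cb[entries], min_gain))
                active[v] = 0;
        entries++;
    }
    return entries;
}
```

Because min_gain is an absolute reduction in squared error, a very small
candidate cannot pass the test for any vector, which is one way to get
the effect of the thresholds described above. Each new entry compares
every remaining candidate against every remaining vector, so the cost
grows roughly with the square of the training set size.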
It is not very quick, but it seems to find a nice set of vectors in the
end.

> 2/ The squared error can be directly related to spectral distortion, if
> the difference (in dB) between the original and quantised spectrum, so
> the extra VQ stages have reduced the RMS distortion by sqrt(147/20) =
> 2.7 dB to sqrt(44/20) = 1.5dB.
>
> 3/ Given more bits are available, might be interesting to expand the
> bandwidth the VQ works over. The 1300 samples do sound a bit more
> "wideband" to me, compared to 1300C.

Do you mean the upper limit of 3700 Hz? Or increasing the rate K from 20
to e.g. 40?
I tried increasing it to 39 (actually 40, but it was not really used) so
that the even factors would align with the 20 of the 700c mode (in order
to reuse the first two stages). But the results did not add as much as I
hoped...

> 4/ The post filter may not be so important now, and could be "relaxed".

I'll try some different settings. I experimented a bit with it switched
off completely, but it didn't sound as good as with it switched on. Some
different settings (like 10 dB/dec or a lower gain) might be worth
trying.

> 5/ Yes I'm sure the pitch estimator loses it occasionally. To test
> objectively pitch contours could be compared with different filtering.
>
> 6/ I have some other training material you could try, to compare
> results. But sounds like you are on the right track.
>

Sure, the more material the better. Do you have a download link or
another way to transfer it?

Regards,
Jeroen
_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2