The hard thing about live pitch-tracking is getting minimal latency while keeping it reliable. It's not that simple. You also want "voicedness", which is more challenging than pitch.
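To make the trade-off a bit more concrete, here is a minimal sketch of a plain normalized-autocorrelation estimator. It is not any particular tracker's method. The function name `estimate_pitch`, the 80-500 Hz search range and the tie-break tolerance are all my own assumptions. The score it returns is one rough stand-in for "voicedness", i.e. how periodic the frame is at all. The latency problem shows up directly: the frame has to be longer than the longest period you search for, and in practice a couple of periods long, so a low `fmin` forces a long frame.

```python
import math

def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=500.0):
    """Return (f0_hz, voicedness) for one frame of samples (a list of floats).

    voicedness is the normalized autocorrelation at the chosen lag:
    close to 1 for a strongly periodic frame, lower for noisy frames.
    Returns (0.0, 0.0) for silence or when no positive peak is found.
    """
    n = len(frame)
    if n < 2 or not any(frame):
        return 0.0, 0.0
    min_lag = max(1, int(sample_rate / fmax))      # shortest period searched
    max_lag = min(int(sample_rate / fmin), n - 1)  # longest period searched
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        head = frame[:n - lag]
        tail = frame[lag:]
        num = sum(a * b for a, b in zip(head, tail))
        denom = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        score = num / denom if denom > 0.0 else 0.0
        # Small tolerance so near-ties keep the shorter lag (avoids some octave errors).
        if score > best_score + 1e-6:
            best_lag, best_score = lag, score
    if best_lag == 0:
        return 0.0, 0.0
    return sample_rate / best_lag, best_score

# Usage: a 50 ms frame of a 200 Hz sine at an assumed 8 kHz sample rate.
sr = 8000
frame = [math.sin(2 * math.pi * 200 * i / sr) for i in range(400)]
f0, voicedness = estimate_pitch(frame, sr)
print(f0, voicedness)
```

A real tracker would add interpolation between lags, smoothing across frames and a threshold on the score before trusting the pitch, which is where most of the reliability work goes.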

I think they developed it for a specific work, but IIRC it was challenging to get it accurate.

I don't know much about current pitch trackers, but I think you can do a high-quality one for voice using filterbanks. Some people do resynthesis that way (and well, that is just an alternative to FFT after all). That's pretty much how the cochlea works, I think, by having overlapping frequency bands. But it is probably hard to get right. I assume you can make a better pitch tracker that is specialized for voice by thinking about FoF synthesis: the sound of the voice is really a sequence of bursts of roughly the same shape (like granular synthesis, in a way), and you should be able to figure out some statistical relationship between formants and how they change with pitch. I'm not saying it is easy. There is probably a lot published on this, though.
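As a rough illustration of the filterbank idea (only the first stage, not a pitch tracker), here is a sketch that runs a signal through a few overlapping band-pass filters and measures the energy in each band. The coefficients are the standard band-pass biquad from the RBJ Audio EQ Cookbook. The names `bandpass_coeffs` and `band_energies`, the band centers and `q=4.0` are assumptions I picked for the example.

```python
import math

def bandpass_coeffs(center_hz, sample_rate, q):
    """Normalized biquad band-pass coefficients (RBJ cookbook, 0 dB peak gain)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def band_energies(signal, sample_rate, centers_hz, q=4.0):
    """Mean-square output of one band-pass filter per center frequency."""
    if not signal:
        return [0.0] * len(centers_hz)
    energies = []
    for center in centers_hz:
        (b0, b1, b2), (a1, a2) = bandpass_coeffs(center, sample_rate, q)
        x1 = x2 = y1 = y2 = 0.0
        total = 0.0
        for x in signal:
            # Direct form I difference equation.
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            total += y * y
        energies.append(total / len(signal))
    return energies

# Usage: a 440 Hz sine at an assumed 8 kHz sample rate, four octave-spaced bands.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * i / sr) for i in range(2000)]
centers = [110.0, 220.0, 440.0, 880.0]
energies = band_energies(tone, sr, centers)
print(centers[energies.index(max(energies))])
```

The loudest band is only a crude hint. For voice, the hard part would be what comes after: looking at the periodicity inside each band and combining the bands, and that is where the formant statistics would have to come in.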

I don't know what "voicedness" is. Do you mean things like vibrato?

I've not tried the multiple-FFT approach; I was worried pitch would lag oddly when changing FFT size. Perhaps it could work.
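For a sense of scale, the window length and bin spacing for each FFT size can be worked out directly. This is just arithmetic, assuming a 44.1 kHz sample rate (my assumption). The helper name `fft_tradeoff` is made up for the example.

```python
def fft_tradeoff(fft_size, sample_rate=44100):
    """Return (window length in ms, bin spacing in Hz) for one FFT size."""
    window_ms = 1000.0 * fft_size / sample_rate
    bin_hz = sample_rate / fft_size
    return window_ms, bin_hz

# Usage: compare a few common sizes.
for size in (256, 1024, 4096):
    window_ms, bin_hz = fft_tradeoff(size)
    print(f"N={size}: window {window_ms:.1f} ms, bin spacing {bin_hz:.1f} Hz")
```

Each size looks at a different stretch of the past, so estimates taken from different sizes are not aligned in time, which may be where the odd lag would come from when switching.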

I think it should work in theory, but you'll probably get some complications due to the distortions that come with the windowing function etc. And making a real-time phase vocoder is more work than it looks like on paper... Obviously doable, but there are some "missing bits" in the theoretical descriptions. I guess that's why IRCAM can sell licenses to superVP. :)

Or maybe one can use wavelets, but I don't know much about wavelet transforms (they don't map to cosine, so I imagine it will be much harder to do well).

I have trouble imagining the reconstruction, so I don't use them (well, I did once, but didn't _get_ it).

Yeah, I don't know. Still, distorted and glitchy sounds have been popular in the past few years, so maybe one could do some cool distorted effects with it.

