David Kuehling wrote:
Hi Alan,
"Alan" == Alan W Black <[email protected]> writes:
Sorry I've not had time to update my ebook reader Bard (though I do
use it as my primary ebook reader all the time) and importantly (to
me) the flite synthesizer, but I just updated my nanonote to the
latest release and am hopefully re-enthused.
I note that the jz4720 has SIMD instructions and I wonder if I can use
this to make the better synthesis technique fast enough on the
Nanonote. In my real job we have been working on synthesis (and
display) support for 7 Indian languages -- and we do have a Mandarin
synthesizer too and I'd like that all to work on the Nanonote.
The jz4720 SIMD is not supported by binutils assembler (implying no GCC
support). Last time I checked vendor support I only found an
(unofficial?) awk-script that converted simd opcodes into hexcodes, to
be used as a binutils preprocessor. Yuck.
There is also AFAIK no Linux kernel support, so process switching won't
save/restore the simd register file. So no multi-tasking with
SIMD-using programs (at least not without weird side-effects).
Ok, this would make it very hard to use. I did find the mplayer code
and the Ingenic cross compiler does work for me (and does generate .o
files that I can link on the nanonote itself), but I've not yet got to
the stage of testing the SIMD instructions. But without kernel support
for saving the SIMD registers, that would not really work.
As these SIMD instructions are completely proprietary I'd rather not
invest time into optimizing your code around them. What's the problem
with synthesis that is so CPU-intensive? Maybe a bunch of hand-coded
MIPS assembler would already do the job? Or maybe some algorithmic
optimizations can solve the problem without resorting to a "brute force"
machine code optimization approach? Can you point us to the specific
C-code that needs tuning? Looks like a fun problem.
In statistical parametric synthesis, we use a computationally expensive
process called MLSA (cst_mlsa.c) which takes about 90% of the synthesis
time. It really needs about 800MHz (on an ARM) to be real-time.
By restricting the parameter order and applying a number of other
interesting hacks we can get something fast enough on a 400MHz ARM (an
HTC TytnII).
Although there probably is more optimization possible with that
algorithm, we've also been looking at different parameterizations of the
speech that still have the same predictive capabilities (e.g. LSPs) but
require less computation for resynthesis. Although *I*'d like the
Nanonote to work, 600MHz+ devices (Raspberry Pi, MK802 and various
Android phones) are probably our real target.
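
To make "real-time" concrete, here is a minimal sketch of how one might
compute a real-time factor from a timing measurement. The function names
are hypothetical and are not part of flite, and the clock estimate assumes
run time scales inversely with clock speed, which ignores memory and
cache effects.

```c
/* Real-time factor: seconds of CPU time spent per second of audio
 * produced. A value of 1.0 or less means synthesis keeps up with
 * playback. (Hypothetical helper, not a flite function.) */
static double real_time_factor(double synth_seconds, long num_samples,
                               int sample_rate)
{
    double audio_seconds = (double)num_samples / (double)sample_rate;
    return synth_seconds / audio_seconds;
}

/* Rough estimate of the clock speed needed to reach a real-time factor
 * of 1.0, ASSUMING run time scales inversely with clock frequency. */
static double required_mhz(double measured_mhz, double rtf)
{
    return measured_mhz * rtf;
}
```

For example, if a 400MHz device spent 2.0 seconds producing 16000 samples
at 16kHz (made-up numbers for illustration), real_time_factor() gives 2.0
and required_mhz(400.0, 2.0) gives 800.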
Alan
Back in 2009 on the list I see comments from Wolfgang about some
(enough?) of the SIMD information being released to allow some
hacking.
http://lists.en.qi-hardware.com/pipermail/discussion/2009-September/000471.html
He refers to a previous message but I can't seem to find that.
What is the status of SIMD support? Does the latest mplayer use the SIMD
instructions? Can I find some C code (or just assembler instruction
details) that might help me try some things out?
Nope, mplayer does not use them. Mplayer uses the hardware scaler,
though.
cheers,
David
------------------------------------------------------------------------
_______________________________________________
Qi Hardware Discussion List
Mail to list (members only): [email protected]
Subscribe or Unsubscribe:
http://lists.en.qi-hardware.com/mailman/listinfo/discussion