Re: [Qi Hardware Discuss] SIMD instructions in jz4720

Alan W Black Mon, 27 Aug 2012 05:01:43 -0700

David Kuehling wrote:

Hi Alan,

"Alan" == Alan W Black <[email protected]> writes:

Sorry I've not had time to update my ebook reader Bard (though I do
use it as my primary ebook reader all the time) and importantly (to
me) the flite synthesizer, but I just updated my nanonote to the
latest release and am hopefully re-enthused.

I note that the jz4720 has SIMD instructions and I wonder if I can use
this to make the better synthesis technique fast enough on the
Nanonote. In my real job we have been working on synthesis (and
display) support for 7 Indian languages -- and we do have a Mandarin
synthesizer too and I'd like that all to work on the Nanonote.


The jz4720 SIMD is not supported by binutils assembler (implying no GCC
support).  Last time I checked vendor support I only found an
(unofficial?) awk-script that converted simd opcodes into hexcodes, to
be used as a binutils preprocessor.  Yuck.

There is also AFAIK no Linux kernel support, so process switching won't
save/restore the simd register file.  So no multi-tasking with
SIMD-using programs (at least not without weird side-effects).

Ok, this would make it very hard to use. I did find the mplayer code
and the Ingenic cross compiler does work for me (and does generate .o
files that I can link on the nanonote itself), but I've not yet got to
the stage of testing the SIMD instructions. But without kernel register
support it would not really work.


As these SIMD instructions are completely proprietary I'd rather not
invest time into optimizing your code around them.  What's the problem
with synthesis that is so CPU-intensive?  Maybe some bunch of hand-coded
MIPS-assembler would already do the job?  Or maybe some algorithmic
optimizations can solve the problem without resorting to a "brute force"
machine code optimization approach?  Can you point us to the specific
C-code that needs tuning?  Looks like a fun problem.

In statistical parametric synthesis, we use a computationally expensive
process called MLSA (cst_mlsa.c) which takes about 90% of the time to
synthesize. It really needs about 800MHz (on an ARM) to be real-time.
By restricting the parameter order and with a number of other
interesting hacks we can get something fast enough on a 400MHz ARM (an
HTC TytnII). Although there probably is more optimization possible with
that algorithm, we've also been looking at a different parameterization
of the speech that still has the same predictive capabilities (e.g.
LSPs) but requires less computation for resynthesis. Although *I*'d
like that to work on the Nanonote, 600MHz+ devices (Raspberry Pi, MK802
and various Android phones) are probably our real target.
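
To illustrate why restricting the parameter order helps, here is a
minimal sketch in C. It is not the filter in cst_mlsa.c; it is a generic
all-pole recursive filter, and the names allpole_t, allpole_init and
allpole_step are hypothetical. The only point is that the work done per
output sample grows with the order, which the sketch makes visible by
counting multiply-accumulates.

```c
#include <string.h>

#define MAX_ORDER 64

/* Hypothetical illustration only: a generic all-pole filter,
 * NOT the MLSA filter implemented in cst_mlsa.c. */
typedef struct {
    double state[MAX_ORDER]; /* previous outputs, most recent first */
    int order;               /* number of feedback coefficients */
    long macs;               /* multiply-accumulates performed so far */
} allpole_t;

static void allpole_init(allpole_t *f, int order)
{
    memset(f, 0, sizeof *f);
    f->order = (order > MAX_ORDER) ? MAX_ORDER : order;
}

/* Computes y[n] = x[n] - sum_{i=1..order} a[i-1] * y[n-i].
 * Each call costs `order` multiply-accumulates. */
static double allpole_step(allpole_t *f, const double *a, double x)
{
    double y = x;
    int i;

    for (i = 0; i < f->order; i++) {
        y -= a[i] * f->state[i];
        f->macs++;
    }

    /* Shift the history so that state[0] holds the newest output. */
    for (i = f->order - 1; i > 0; i--)
        f->state[i] = f->state[i - 1];
    if (f->order > 0)
        f->state[0] = y;

    return y;
}
```

In this sketch, pushing the same number of samples through a filter of
order 12 instead of 24 halves the counted multiply-accumulates. The real
MLSA filter has a different structure, so the savings there need not be
proportional.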


Alan

Back in 2009 on the list I see comments from Wolfgang about some
(enough?) of the SIMD information being released to allow some
hacking.
http://lists.en.qi-hardware.com/pipermail/discussion/2009-September/000471.html
He refers to a previous message but I can't seem to find that.

What is the status of SIMD support, does the latest mplayer use them?
Can I find some C code (or just assembler instructions details) that
might help me try some things out?


Nope, mplayer does not use them.  Mplayer uses the hardware scaler,
though.

cheers,

David


------------------------------------------------------------------------

_______________________________________________
Qi Hardware Discussion List
Mail to list (members only): [email protected]
Subscribe or Unsubscribe: 
http://lists.en.qi-hardware.com/mailman/listinfo/discussion


