message from Colin Beckingham <[email protected]> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
On 04/19/2012 07:15 PM, Colin Beckingham wrote:
On 04/19/2012 05:25 PM, Alan W Black wrote:
Colin Beckingham wrote:
I'm playing with Spanish in Festival.
(voice_el_diphone )
sets up the voice fine, and
(lex.lookup 'mu~neca)
produces
("mu~neca" nil (((m u) 0) ((ny e1) 1) ((k a) 0)))
Good stuff. Now I try accented vowels and I am making no progress at
all. In
(lex.lookup 'María)
I'm trying to accent the i. But my console gives me
("María" nil (((e) 0) ((k i1 s) 1)))
The same thing happens for estaría.
Investigating festvox/spanlex.scm seems to indicate an apostrophe ("'")
notation for accents, but trying it just truncates the output. Any hints?
I believe this is actually due to the terminal you are using (and
Festival's readline command-line interpretation). Accented characters
(encoded in the upper part of an 8-bit encoding) aren't getting to the
system properly.
Try creating a file named fff.scm with
(format t "%l\n" (lex.lookup "María"))
in it, and then in Festival calling
(load "fff.scm")
Also you need to ensure that the encoding of the file is Latin-1 (rather
than Unicode, I believe).
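
A minimal Python 3 sketch of the suggestion above, assuming the goal is
simply to get fff.scm onto disk as Latin-1 regardless of what the editor
or terminal does. The helper names (latin1_bytes, write_latin1) are
illustrative and not part of Festival.

```python
# The Scheme line from the message above; "\\n" keeps a literal
# backslash-n in the file for Festival's reader to interpret.
SCHEME_SOURCE = '(format t "%l\\n" (lex.lookup "María"))\n'


def latin1_bytes(text):
    """Encode text as ISO-8859-1; raises UnicodeEncodeError if a
    character has no Latin-1 representation."""
    return text.encode("latin-1")


def write_latin1(path, text):
    """Write text to path as raw Latin-1 bytes."""
    with open(path, "wb") as handle:
        handle.write(latin1_bytes(text))


# In Latin-1 the accented i is the single byte 0xED;
# in UTF-8 the same character is the two bytes 0xC3 0xAD.
print(latin1_bytes("María"))
print("María".encode("utf-8"))
```

Calling write_latin1("fff.scm", SCHEME_SOURCE) would then produce the
file to load from Festival.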
Hmm, I did try using urxvt and obtained similar results.
Now, following your suggestion, with fff.scm created in kwrite
specifying ISO-8859-1, I get a result of
festival> (voice_el_diphone )
el_diphone
festival> (load "fff.scm")
("María" nil (((e) 0) ((k i1 s) 1)))
nil
and using Python with unicode I get
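#!/usr/bin/python
# -*- coding: utf-8 -*-
import subprocess as sp

# Scheme expression: select the Spanish voice, then print the lexicon entry
a = u'(begin(voice_el_diphone )(format t "%l\n" (lex.lookup "María")))'
b = 'downloads/festival/bin/festival'  # path to the festival binary
c = '-b'  # batch mode
d = (b, c, a)
# check_output returns raw bytes; decode them for display
print(sp.check_output(d).decode('utf-8', 'replace'))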
$ python fff.py
("María" nil (((e) 0) ((k i1 s) 1)))
This is not exactly impeding my progress; I'm just trying to cover the
bases and understand the process.
I'm wondering if there is a notation similar to the '~n' form for
n-tilde, which worked, that would apply to other accents? This would put
all characters back into the 7-bit ASCII range, which is much more
manageable in, for example, Python.
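
A hedged Python 3 sketch of such an ASCII notation. The mapping is an
assumption inferred only from the forms that appear in this thread ("~n"
for n-tilde and the "'" arrangement from festvox/spanlex.scm, placed
before the vowel); it has not been checked against the lexicon itself,
and the function name is illustrative.

```python
import unicodedata

# Assumed mapping: "~n" for n-tilde, apostrophe before an accented vowel.
# Lowercase only; anything not listed is passed through unchanged.
ASCII_FORMS = {
    "ñ": "~n",
    "á": "'a",
    "é": "'e",
    "í": "'i",
    "ó": "'o",
    "ú": "'u",
}


def to_ascii_notation(word):
    """Rewrite accented Spanish letters using the assumed ASCII forms."""
    # Normalise first so "i" + combining acute is treated like "í".
    composed = unicodedata.normalize("NFC", word)
    return "".join(ASCII_FORMS.get(ch, ch) for ch in composed)


print(to_ascii_notation("muñeca"))
print(to_ascii_notation("María"))
```

Passing the result to lex.lookup as a double-quoted string, as in the
fff.scm example, avoids any question of how the Scheme reader treats an
apostrophe inside a quoted symbol.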
I think I have it working now, using lxterm and rewriting my source file
fff.scm. It is easy to leave stray characters in the line (one half
of a deleted character) without being aware they are there.
Result now is
festival> (load "fff.scm")
("Mar'ia" nil (((m a) 0) ((r i1) 1) ((a) 0)))
which is more in line with expectations.
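
The stray characters mentioned above (half of a deleted multi-byte
character) are hard to see in an editor but easy to see at the byte
level. A small Python 3 sketch, assuming the file is meant to be
Latin-1; the function names are illustrative.

```python
def non_ascii_bytes(data):
    """Return (offset, byte value) pairs for every byte above 0x7F."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


def report(path):
    """Print each non-ASCII byte in the file with its offset."""
    with open(path, "rb") as handle:
        data = handle.read()
    for offset, value in non_ascii_bytes(data):
        print("offset %d: 0x%02X" % (offset, value))


# A clean Latin-1 "María" has exactly one high byte, 0xED.
print(non_ascii_bytes(b"Mar\xeda"))
# A leftover 0xC3 is the first half of a UTF-8 pair that was not
# fully deleted.
print(non_ascii_bytes(b"Mar\xc3\xeda"))
```

Running report("fff.scm") on the source file would list any such bytes
before it is loaded into Festival.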
Thanks for the help.
--
---
Colin Beckingham
613-454-5369
http://www.it4gh.com
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= University of Edinburgh's Festival Speech Synthesis System =
= http://festvox.org/festival Sent Via [email protected] =
= To unsubscribe mail [email protected] =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =