message from Colin Beckingham <[email protected]> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =


On 04/19/2012 07:15 PM, Colin Beckingham wrote:


On 04/19/2012 05:25 PM, Alan W Black wrote:
Colin Beckingham wrote:
I'm playing with Spanish in Festival.

(voice_el_diphone )

sets up the voice fine, and

(lex.lookup 'mu~neca)

produces

("mu~neca" nil (((m u) 0) ((ny e1) 1) ((k a) 0)))

good stuff. Now I try accented vowels and I am not progressing at
all. In

(lex.lookup 'María)

I'm trying to accent the i. But my console gives me

("María" nil (((e) 0) ((k i1 s) 1)))

same thing for estaría

Investigating festvox/spanlex.scm seems to indicate a "'" arrangement
but this just truncates the output. Any hints?


I believe this is actually due to the terminal you are using (and
Festival's readline command-line interpretation). Accented characters
(encoded in the upper half of an 8-bit encoding) aren't getting to the
system properly.

Try creating a file, fff.scm, with

(format t "%l\n" (lex.lookup "María"))

in it, and then in Festival calling

(load "fff.scm")

Also you need to ensure that the encoding of the file is Latin-1 rather
than Unicode (I believe).
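
One way to be sure the file really is Latin-1, independent of the editor
or terminal, is to write the bytes from a script. The following is only a
minimal sketch: the helper names encode_latin1 and write_script are
made up for illustration, and the file name fff.scm is the one suggested
above.

```python
# Sketch: produce the one-line Festival script as Latin-1 bytes.
# encode_latin1 and write_script are hypothetical helper names.

# The doubled backslash keeps a literal \n inside the Scheme string.
SCHEME_LINE = '(format t "%l\\n" (lex.lookup "María"))\n'


def encode_latin1(text):
    """Encode text as ISO-8859-1.

    Raises UnicodeEncodeError if a character has no Latin-1 code point.
    """
    return text.encode("latin-1")


def write_script(path, text=SCHEME_LINE):
    """Write the Latin-1 encoded script to path and return the bytes written."""
    data = encode_latin1(text)
    with open(path, "wb") as handle:  # binary mode: no implicit re-encoding
        handle.write(data)
    return data


if __name__ == "__main__":
    raw = write_script("fff.scm")
    # In Latin-1 the accented i is the single byte 0xED;
    # in UTF-8 it would be the two bytes 0xC3 0xAD.
    print(raw)
```

Writing in binary mode means the locale of the shell cannot change what
ends up in the file.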


Hmm, I did try using urxvt and obtained similar results.

Now, following your suggestion, with fff.scm created in kwrite
specifying ISO-8859-1, I get a result of

festival> (voice_el_diphone )
el_diphone
festival> (load "fff.scm")
("María" nil (((e) 0) ((k i1 s) 1)))
nil

and using Python with unicode I get

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
a = u'(begin(voice_el_diphone )(format t "%l\n" (lex.lookup "María")))'
b = 'downloads/festival/bin/festival'
c = '-b'
d = (b,c,a)
import subprocess as sp
print sp.check_output(d)

$ python fff.py
("María" nil (((e) 0) ((k i1 s) 1)))

This is not exactly impeding my progress, I'm just trying to cover the
bases and understand the process.

I'm wondering if there is a way similar to the n-tilde format of '~n'
which worked that would apply to other accents? This would put all
characters back into a lower ASCII range and therefore much more
manageable in, for example, Python.
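
As a rough sketch of what such a conversion could look like on the Python
side: the mapping below is an assumption, inferred only from the two forms
that appear in this thread ("mu~neca" and "Mar'ia"), and the function name
to_ascii_form is made up. Whether lex.lookup accepts the apostrophe form
directly as input has not been verified here, and only lowercase letters
are covered.

```python
import unicodedata

# Hypothetical mapping: marker first, then the base letter.
# Inferred from "mu~neca" and "Mar'ia"; check festvox/spanlex.scm
# before relying on it.
ASCII_FORMS = {
    "á": "'a",
    "é": "'e",
    "í": "'i",
    "ó": "'o",
    "ú": "'u",
    "ñ": "~n",
}


def to_ascii_form(word):
    """Rewrite accented Spanish letters into the assumed ASCII spelling."""
    # NFC turns "i" + combining acute into the single character "í",
    # so both spellings of the same word hit the table.
    composed = unicodedata.normalize("NFC", word)
    return "".join(ASCII_FORMS.get(ch, ch) for ch in composed)


if __name__ == "__main__":
    for word in ("muñeca", "María", "estaría"):
        print(word, "->", to_ascii_form(word))
```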


I think I have it working now, using lxterm and rewriting my source file
fff.scm. It is easy to leave stray characters in the line (one half of a
deleted multi-byte character) without being aware that they are there.

Result now is

festival> (load "fff.scm")
("Mar'ia" nil (((m a) 0) ((r i1) 1) ((a) 0)))

which is more in line with expectations.
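
For anyone chasing the same kind of stray bytes, a small check along the
following lines may help. It is only a sketch with made-up function
names: it lists every byte above 0x7F with its offset, and reports
whether the data decodes cleanly as UTF-8, which would suggest the file
was not saved as Latin-1 after all.

```python
# Sketch: inspect the raw bytes of a script such as fff.scm.
# non_ascii_bytes and looks_like_utf8 are hypothetical helper names.


def non_ascii_bytes(data):
    """Return (offset, byte value) pairs for every byte above 0x7F."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


def looks_like_utf8(data):
    """True if data has non-ASCII bytes and still decodes as valid UTF-8."""
    if not non_ascii_bytes(data):
        return False  # pure ASCII says nothing about the encoding
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    import sys

    # Usage: python check_bytes.py fff.scm
    with open(sys.argv[1], "rb") as handle:
        content = handle.read()
    for offset, value in non_ascii_bytes(content):
        print("offset %d: 0x%02X" % (offset, value))
    print("decodes as UTF-8:", looks_like_utf8(content))
```

A Latin-1 "María" shows a single 0xED byte; a lone 0xC3 or 0xAD is the
kind of leftover half character described above.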

Thanks for help.

--
---
Colin Beckingham
613-454-5369
http://www.it4gh.com