Hi Tim,

Sorry for the ambiguity. To be more specific, the file name is fine: in the shell script the file name $*.mp3 expands correctly to e.g. мазать.mp3. The audio within the file consists of the Google robot voice reading the string of percent-escaped characters literally, not reading the Russian word.
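
For reference, the escaping itself can be reproduced outside wget. This is only a minimal sketch, assuming a POSIX shell with od, tr and sed available and a UTF-8 environment; percent_encode is a name made up for illustration, and unlike wget it escapes every byte, including plain ASCII:

```shell
# Hypothetical helper: percent-encode every byte of the first argument.
# Assumes the argument reaches the shell as UTF-8 bytes.
percent_encode() {
  printf '%s' "$1" |
    od -An -v -tx1 |     # dump the bytes as two-digit lowercase hex
    tr -d ' \n' |        # drop od's spacing and line breaks
    sed 's/../%&/g' |    # put a % in front of each hex pair
    tr 'a-f' 'A-F'       # upper-case the hex digits
}

percent_encode 'мазать'
echo
```

For мазать this gives the same twelve escaped bytes that appear in the request wget sends, so the escaped form is simply the UTF-8 encoding of the word.
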
I will try Random Coder's suggestion of a more complete user agent string. Apparently http://whatsmyuseragent.com/ is a handy way to find out what your browser claims to be :)

On Tue, Mar 31, 2015 at 9:50 PM, Tim Rühsen <[email protected]> wrote:

> Hi Steven,
>
> On Tuesday, 31 March 2015, 18:11:58, Stephen Wells wrote:
> > Dear all - I am currently trying to use wget to obtain mp3 files from the
> > Google Translate TTS system. In principle this can be done using:
> >
> > wget -U Mozilla -O "${string}.mp3" "http://translate.google.com/translate_tts?tl=TL&q=${string}"
> >
> > where TL is a two-letter language code (en, fr, de and so on).
> >
> > However I am meeting a serious error when I try to send Russian strings
> > (tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
> > Cygwin) and the file system will display the Cyrillic strings no problem.
> > If I provide a command like this:
> >
> > http://translate.google.com/translate_tts?tl=ru&q=мазать
> >
> > wget incorrectly processes the Cyrillic characters _before_ sending the
> > http request, so what it actually requests is:
> >
> > http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C
>
> This seems to be the correct behavior of a web client.
> The URL in the GET request is transmitted UTF-8 encoded and percent escaping
> is performed for chars >127 (not mentioning control chars here).
>
> > This of course produces a string of gibberish in the resulting mp3 file!
>
> This is something different. If you are talking about the file name, well
> there is --restrict-file-names=nocontrol. Did you give it a try?
>
> > Is there any way to make wget actually send the string it is given, instead
> > of mangling it on the way out? This is really blocking me.
>
> From what you write, I am unsure if you are talking about the resulting file
> name or about HTTP URL encoding in a GET request.
>
> Regards, Tim
