On Wed, Feb 25, 2009 at 9:48 AM, Eric Scoles <[email protected]> wrote:
> On 2009-02-25, Dave Henn <[email protected]> wrote:
>> On Wed, Feb 25, 2009 at 8:59 AM, Eric Scoles <[email protected]> wrote:
>>>
>>> [snip]
>>>
>>> If I understand correctly what you're describing, this would not be
>>> that. Sounds like what you describe is a matter of piping text through
>>> MacInTalk, or something like that, and saving it as MP3. Plus, they
>>> wouldn't have any inflection, as we both noted, so it would be hard to
>>> listen to -- especially for something like that, where you need to
>>> understand all of it.
>>
> I don't think we're talking about the same "cues."
>
> Yes, there are periods and commas and paragraphs and quotation marks, and
> you can code a text-to-speech system to account for that. (MacInTalk
> does.) But that's a long way from Roy Blount, Jr. Or Tom Bodett. Or Peter
> Riegert. Imagine Sarah Vowell read by a text-to-speech system. OK, bad
> example: some people would prefer that, I know. How about David Sedaris?
>
> Consider Blount's point about the accent: IBM has coded that into their
> voice tree systems, possibly using his own southern accent as one model.
> I've listened to accented text-to-speech voices, and they're not
> terrible. But you'd have to know to use them, and there's no cue in
> plaintext for that. There's also no cue for gender, pitch, timbre, tone,
> or, really, cadence.
>
> [snip]

I'm only going to reply to this part because it's quick. There are LOTS of
other cues before a reader, far more than the individual words and the
punctuation. There is the context in which each word is employed, which
depends partly on the words surrounding it and partly on the words in
larger portions of the text being processed. There is the flow of the text
in a sentence, its rhythm, or, as you say, cadence, which should be
parsable using phoneme and syllable databases (I'm sure I'm mangling the
terminology, but you get the idea).
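To make that concrete, here's a rough sketch of the punctuation-cue idea. This is purely illustrative (not the API of MacInTalk or any real engine): it just maps punctuation marks to SSML-style pause hints, which is roughly the first thing a text-to-speech front end does with those cues. The pause durations are made-up assumptions.

```python
# Illustrative sketch: map plaintext punctuation cues to SSML-style
# prosody hints. Pause lengths below are invented for the example.
import re

PAUSES = {
    ",": "200ms",   # short pause for a comma
    ";": "300ms",   # slightly longer for a semicolon
    ".": "500ms",   # sentence-final pause
    "?": "500ms",
    "!": "500ms",
}

def add_prosody_marks(text):
    """Insert an SSML-style <break> tag after each pause-cueing mark."""
    def mark(match):
        ch = match.group(0)
        return f'{ch} <break time="{PAUSES[ch]}"/>'
    return re.sub(r"[,;.?!]", mark, text)

print(add_prosody_marks("Yes, there are periods and commas. That's all."))
```

Of course, this is exactly the point: punctuation gets you pauses, and nothing here knows anything about accent, gender, pitch, or tone.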
How often have you seen someone -- or yourself -- read a passage aloud a
second time because the first time didn't sound right? Something cued, or
failed to cue, the change in how you read the passage. All of these things
are goals for text-to-speech, and context is already being used in many
systems. I don't know about rhythm, but that shouldn't be far off if it's
not already there. As for gender, if a system has a sufficient database of
names, it should be able to take a good guess at that, and pitch and
timbre would at least partially follow from gender. Tone, I don't know,
but context would certainly help there.

--
Dave Henn
[email protected]

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"R-SPEC: The Rochester Speculative Literature Association" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/r-spec?hl=en
-~----------~----~----~----~------~----~------~--~---
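For what it's worth, the name-database guess Dave describes could be as simple as the following sketch. The tiny name lists and pitch ranges are stand-ins I made up for illustration; a real system would use something like a census-derived name frequency table.

```python
# Toy sketch of guessing a narrator voice from a first name.
# NAME_GENDER is a tiny stand-in for a real census-style name database,
# and the pitch ranges (in Hz) are rough, invented defaults.
NAME_GENDER = {
    "sarah": "female", "mary": "female", "emily": "female",
    "david": "male", "roy": "male", "tom": "male",
}

PITCH_HZ = {
    "female": (165, 255),
    "male": (85, 180),
    "unknown": (120, 220),  # fall back to a neutral range
}

def guess_voice(author_name):
    """Return (gender_guess, pitch_range_hz) based on the first name."""
    first = author_name.split()[0].lower()
    gender = NAME_GENDER.get(first, "unknown")
    return gender, PITCH_HZ[gender]

print(guess_voice("Sarah Vowell"))   # → ('female', (165, 255))
print(guess_voice("David Sedaris"))  # → ('male', (85, 180))
```

As Dave says, pitch and timbre would then partially follow from the gender guess -- here that's just the default pitch range the lookup hands back.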
