On 28 Jul 2008, at 12:23, Hans-Joerg Bibiko wrote:
How about this?

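unletter <- function(word) {
  # Map each letter to its two-digit alphabet position (a = 01, ..., z = 26).
  # A space (code point 32) becomes 32 - 96 = -64, which gsub() turns back into a space.
  gsub('-64', ' ', paste(sprintf("%02d", utf8ToInt(tolower(word)) - 96), collapse = ''))
}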

unletter("abc")
[1] "010203"

unletter("Aw")
[1] "0123"

unletter("I walk to school")
[1] "09 23011211 2015 190308151512"

I do not know precisely what you want to do.

With:
as.double(unlist(strsplit(unletter("I walk to school")," ")))

you will get a numeric vector out of the string.
But this leads to a problem with long words such as:

as.double(unlist(strsplit(unletter("schoolschool")," ")))
[1] 1.903082e+23
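
The underlying cause is that a double stores integers exactly only up to 2^53 (about 9.0e15). At two digits per letter, the code for any word longer than eight letters exceeds that limit and is in general no longer stored exactly. A minimal sketch of the effect, where the names code_l and code_m and the second word "schoolschoom" are made up for illustration:

```r
# Integers are stored exactly in a double only up to 2^53 (about 9.0e15).
2^53 + 1 == 2^53   # TRUE: adding 1 no longer changes the value

# Digit strings for "schoolschool" and the made-up word "schoolschoom"
# (they differ only in the last letter: l = 12 versus m = 13).
code_l <- "190308151512190308151512"
code_m <- "190308151512190308151513"

# Both strings collapse to the same double, so the two words
# can no longer be told apart after the conversion.
as.double(code_l) == as.double(code_m)   # TRUE
```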

Thus, if you need to represent words as numeric values and the values themselves carry no meaning, I would suggest parsing your text beforehand to build a hash (a list) of all distinct words in it and assigning a number to each word.
This would end up in a list à la:
words <- list("abc" = 1, "I" = 2, "go" = 3)  # etc.

After that you can access these numeric values via:
words['go']
$go
[1] 3
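
A minimal sketch of how such a lookup could be built from a text, assuming the words are separated by single spaces. The sample sentence and the names text, tokens, distinct and codes are made up for illustration:

```r
text <- "I walk to school and I walk home"

# Split on single spaces and keep each distinct word once,
# in order of first appearance.
tokens   <- unlist(strsplit(text, " ", fixed = TRUE))
distinct <- unique(tokens)

# Named list mapping each distinct word to a running number.
words <- as.list(seq_along(distinct))
names(words) <- distinct

words[["walk"]]   # 2

# Translate the whole text into its sequence of word numbers.
codes <- unlist(words[tokens], use.names = FALSE)
codes             # 1 2 3 4 5 1 2 6
```

If the named list itself is not needed, match(tokens, distinct) should give the same codes directly.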

--Hans