Gregor GORJANC wrote:
Henrik Bengtsson wrote:

Gregor GORJANC wrote:

...

What do you think about this sketch, which of course doesn't solve all
"special" characters:

fixURLchar <- function(URL,
                      from = c(" ", "\"", ",", "#"),
                      to = c("%20", "%22", "%2c", "%23"))


Just a comment. It is much safer/easier to use named vectors for
mapping, e.g.

map <- c(" "="%20", "\""="%22", ","="%2c", "#"="%23")


...

Henrik, thanks. So you suggest something like

for (i in seq(along=map)) {
    URL <- gsub(pattern=names(map)[i], replacement=map[i], x=URL)
}


Yes, something like that. To optimize, you might want to do

patterns <- names(map);
for (i in seq(along=map)) {
  URL <- gsub(pattern=patterns[i], replacement=map[i], x=URL)
}
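
For reference, the loop above can be wrapped into a complete function along the following lines. This is only a sketch: the name escapeURLchars is made up for illustration, and passing fixed=TRUE (so that each name in the map is matched as a literal string rather than as a regular expression) is an assumption that was not part of the code above.

```r
# Hypothetical helper; the function name and the use of fixed = TRUE
# are assumptions made for this illustration.
escapeURLchars <- function(URL,
                           map = c(" " = "%20", "\"" = "%22",
                                   "," = "%2c", "#" = "%23")) {
  patterns <- names(map)
  for (i in seq_along(map)) {
    # fixed = TRUE: treat the pattern as a literal string, not a regex
    URL <- gsub(pattern = patterns[i], replacement = map[[i]],
                x = URL, fixed = TRUE)
  }
  URL
}

escapeURLchars("a b,c#d")
# [1] "a%20b%2cc%23d"
```

With a named vector, adding another character is a one-entry change to the map, and a pattern can never get out of step with its replacement.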

More important is that you treat a literal "%" differently from a "%" that is already part of an encoding, e.g. how do you want to convert the string "100% %20"? You probably have to use somewhat "fancier" regular expressions to detect a literal "%". Maybe "%[^0-9a-fA-F]" will do. There should be many more details in the document Brian Ripley referred you to.

In other words, you have to be careful and try to think through all the cases in which your function may be called. A good test is to call it twice, once on your original string and then on the escaped one; you should get the same result. It depends on how complete you want your function to be.
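
A rough sketch of that "call it twice" test, under a few assumptions: the name escapeOnce is made up, and instead of "%[^0-9a-fA-F]" it uses a Perl-style lookahead, "%(?![0-9a-fA-F]{2})", so that the character following the "%" is only inspected and not replaced. A literal "%" that happens to be followed by two hex digits (as in "50%ab") cannot be told apart from an existing escape this way, so it is left untouched.

```r
# Hypothetical helper; the name and the lookahead pattern are assumptions.
escapeOnce <- function(URL,
                       map = c(" " = "%20", "\"" = "%22",
                               "," = "%2c", "#" = "%23")) {
  # Escape a "%" only if it is NOT already followed by two hex digits
  URL <- gsub("%(?![0-9a-fA-F]{2})", "%25", URL, perl = TRUE)
  patterns <- names(map)
  for (i in seq_along(map)) {
    # fixed = TRUE: match each mapped character literally
    URL <- gsub(patterns[i], map[[i]], URL, fixed = TRUE)
  }
  URL
}

once  <- escapeOnce("100% %20")
twice <- escapeOnce(once)
once
# [1] "100%25%20%20"
identical(once, twice)
# [1] TRUE
```

Here the second call changes nothing, because every "%" left in the string is followed by two hex digits and none of the mapped characters remain.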

Good luck

Henrik

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
