--- Begin Message ---
Uwe Ligges wrote:
I guess your problem has been solved by last night's discussion with
Gabor G.?
Uwe Ligges
Laurent Rhelp wrote:
Uwe Ligges wrote:
Laurent Rhelp wrote:
Uwe Ligges wrote:
Laurent Rhelp wrote:
Dear R-List,
I have a great many files in a directory, and in every file I would
like to replace the character " with the character '. At the same
time, I have to change ' to '' (i.e. the character ' twice, not the
single character ") when the character ' is embedded in "....."
So "....." becomes '.....' and ".....'......" becomes
'.....''......'
Regular expressions could certainly help me, but I do not know how to
use them.
How can I do that with R?
In fact, you do not need to know anything about regular
expressions in this case, since you are simply going to replace
certain characters by others without any fuzzy restrictions:
x <- "\".....'......\""
cat(x, "\n")
xn <- gsub('"', "'", gsub("'", "''", x))
cat(xn, "\n")
Uwe Ligges
Thank you very much
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Yes, you are right. So I wrote the code below (which I find a little
awkward, but it works).
##-----
dirdata <- getwd()
fichnames <- list.files(path=paste(dirdata,"\\initial\\",sep=""))
see ?file.path to improve the above.
for( i in 1:length(fichnames)){
see ?seq to improve the above: seq(along = fichnames)
Or even better, just work on the names (see below).
filein <- paste(dirdata,"\\initial\\",fichnames[i],sep="")
again, file.path() is your friend
conin <- file(filein)
open(conin)
nbrows <- length( readLines(conin,n=-1) )
close(conin)
You can simply call readLines() with the filename; it opens the
connection to the file itself. And I do not see why you want to read
the file here. Since your code is becoming really complicated now, let
me suggest the following procedure (untested!):
dirdata <- getwd()
fichnames <- list.files(file.path(dirdata, "initial"))
for(i in fichnames){
temp <- readLines(file.path(dirdata, "initial", i))
temp <- gsub('"', "'", gsub("'", "''", temp))
writeLines(temp, con = file.path(dirdata, "result", i))
}
Uwe Ligges
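Since the procedure above is marked as untested, here is a minimal sketch
of how it could be tried end to end without touching real data. The
directory under tempdir() and the names "quote_demo" and "demo.txt" are
hypothetical, chosen only for illustration. The sketch also assumes that
the "result" directory has to exist before writeLines() is called, so it
creates it first.

```r
# Hypothetical sandbox directory; not part of the original thread.
base_dir <- file.path(tempdir(), "quote_demo")
dir.create(file.path(base_dir, "initial"), recursive = TRUE,
           showWarnings = FALSE)
dir.create(file.path(base_dir, "result"), showWarnings = FALSE)

# One small input file: "abc" and "ab'cd", each in double quotes.
writeLines(c("\"abc\"", "\"ab'cd\""),
           file.path(base_dir, "initial", "demo.txt"))

# Same loop as suggested above, applied to the sandbox directory.
for (f in list.files(file.path(base_dir, "initial"))) {
  temp <- readLines(file.path(base_dir, "initial", f))
  temp <- gsub('"', "'", gsub("'", "''", temp))
  writeLines(temp, con = file.path(base_dir, "result", f))
}

result <- readLines(file.path(base_dir, "result", "demo.txt"))
print(result)
```

Note that this doubles every single quote in the file, whether or not it
sits inside a double-quoted string.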
fileout <- paste(dirdata,"\\result\\",fichnames[i],sep="")
conout <- file(fileout,"w")
conin <- file(filein)
open(conin)
for( l in 1:nbrows )
{
text <- gsub('"',"'",gsub("'","''",readLines(conin,n=1)))
writeLines(con=conout,text=text)
}
close(conin)
close(conout)
}
##------
Dear Uwe,
The code doesn't do what I want, because I want to replace ' by ''
only when the character ' is embedded in "......"
So:
1. " becomes '
2. ".....'......" becomes '.....''......'
3. but '.......' has to stay '.......' and not ''......''
Did I miss something?
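The three rules above could be sketched in base R as follows. This is
only an illustration, not the solution that was eventually adopted in
the thread: the function name convert_quotes is hypothetical, and the
sketch assumes that a double-quoted string never contains an escaped
double quote and does not span several lines.

```r
# Hypothetical helper: rewrite only the double-quoted substrings of x.
convert_quotes <- function(x) {
  # Locate every "...." substring in each element of x.
  m <- gregexpr('"[^"]*"', x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(s) {
    s <- gsub("'", "''", s, fixed = TRUE)  # rule 2: ' becomes '' inside "..."
    gsub('"', "'", s, fixed = TRUE)        # rule 1: " becomes '
  })
  x  # rule 3: text outside "..." is left untouched
}

x <- c("\"abc\"", "\"ab'cd\"", "'abc'")
xn <- convert_quotes(x)
cat(xn, sep = "\n")
```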
Yes, Gabor gave me the last part of the solution. Thank you, Uwe and
Gabor.
For those interested in the solution, I recapitulate the different
steps below.
The objective was, in a large number of files, to replace each
double-quoted string with a single-quoted string in which double
quotes take the place of any single quotes it contains. We also have
to allow for the fact that a quoted string can span several lines of
a file.
The steps are:
1. read the lines of each file into a character vector (readLines)
2. collapse this vector into one multi-line string
(paste(, collapse="\n"))
3. apply a regular expression to this string to do the replacement
4. split the string back into a character vector (strsplit), which
gives the altered contents of the initial file
5. write the new file
##--
library(gsubfn)
# This is a double quote, single quote, double quote
squote <- "'" # single quote
# This is a single quote, double quote, single quote
dquote <- '"' # double quote
# swap single and double quotes inside the matched string
f <- function(x) chartr(paste0(squote, dquote), paste0(dquote, squote), x)
dirdata <- getwd() # not necessary
# select only the files with the .PRC extension, for example
fichnames <- list.files(file.path(dirdata, "initial"), pattern = "\\.PRC$")
for(i in fichnames){
Lines <- readLines(file.path(dirdata, "initial", i))
# [^"]* also matches newlines, so a quoted string may span several lines
temp <- gsubfn('["][^"]*["]', f, paste(Lines, collapse = "\n"))
# strsplit returns a list, not a character vector
Lines <- unlist(strsplit(temp, "\n"))
writeLines(Lines, con = file.path(dirdata, "result", i))
}
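For anyone who wants to check the collapse, replace and split steps
without reading or writing files, here is a small sketch on an in-memory
vector. It uses base R's regmatches() in place of gsubfn, which is a
substitution made only for this illustration; the names swap_quotes,
lines_in and lines_out are hypothetical.

```r
# Swap single and double quotes within a string (same idea as f above).
swap_quotes <- function(s) chartr("'\"", "\"'", s)

# Two lines in which one double-quoted string spans the line break.
lines_in <- c("x = \"it's", "here\" and 'this' stays")

# Step 2: collapse the vector into a single multi-line string.
txt <- paste(lines_in, collapse = "\n")

# Step 3: rewrite every double-quoted substring, newlines included.
m <- gregexpr('"[^"]*"', txt)
regmatches(txt, m) <- lapply(regmatches(txt, m), swap_quotes)

# Step 4: split back into one element per line.
lines_out <- strsplit(txt, "\n", fixed = TRUE)[[1]]
cat(lines_out, sep = "\n")
```

The single-quoted 'this' on the second line lies outside any
double-quoted string, so it is left unchanged.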
Thank you very much again.
--- End Message ---