Here is how I would do it, to do just character substitution on the data:

> inFile <- textConnection(" V1 V2 V3 V4 V5
+ 1 1 b b a -0.4990719
+ 2 2 b a a 1.5134101
+ 3 3 a b b 1.9375467
+ 4 4 a a b 0.3310612
+ 5 5 a b a 0.2807520
+ 6 6 a a b 0.9646351
+ 7 7 b a b 0.6243979
+ 8 8 a b a -0.8076008
+ 9 9 a b b -1.7645273
+ 10 10 b b a 0.5460802
+ 11 11 c c b 12.3000000")
> output <- NULL  # initialize output file (just a vector in this case)
> while (length(input <- readLines(inFile, n=3)) > 0){
+     # replace 'b' with 'z'
+     for (i in seq_along(input)){
+         input[i] <- gsub('b', 'z', input[i])
+     }
+     output <- c(output, input)  # collect the output
+ }
> close(inFile)
> print(cbind(output))  # show converted data
      output
 [1,] " V1 V2 V3 V4 V5"
 [2,] "1 1 z z a -0.4990719"
 [3,] "2 2 z a a 1.5134101"
 [4,] "3 3 a z z 1.9375467"
 [5,] "4 4 a a z 0.3310612"
 [6,] "5 5 a z a 0.2807520"
 [7,] "6 6 a a z 0.9646351"
 [8,] "7 7 z a z 0.6243979"
 [9,] "8 8 a z a -0.8076008"
[10,] "9 9 a z z -1.7645273"
[11,] "10 10 z z a 0.5460802"
[12,] "11 11 c c z 12.3000000"
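
A possible variation on the session above, offered only as a sketch: since gsub() is vectorized over its input, the inner for loop can be dropped, and each chunk can be written straight to an output connection rather than collected in a vector. The temporary file paths and the five-line sample are stand-ins for illustration, not names from this thread, and fixed = TRUE assumes a literal (non-regex) substitution is wanted.

```r
# Sketch only: in_path/out_path are temporary placeholder files, not real project files.
in_path  <- tempfile(fileext = ".txt")
out_path <- tempfile(fileext = ".txt")

# Write a small sample (first rows of the toy data) so the example is self-contained.
writeLines(c(" V1 V2 V3 V4 V5",
             "1 1 b b a -0.4990719",
             "2 2 b a a 1.5134101",
             "3 3 a b b 1.9375467",
             "4 4 a a b 0.3310612"), in_path)

con     <- file(in_path, "r")
outfile <- file(out_path, "w")

# Read 3 lines per pass; readLines() returns character(0) at end of input.
while (length(input <- readLines(con, n = 3)) > 0) {
  # gsub() handles the whole chunk at once, so no per-line loop is needed.
  converted <- gsub("b", "z", input, fixed = TRUE)
  # Write the chunk out immediately instead of accumulating it in memory.
  writeLines(converted, con = outfile)
}

close(con)
close(outfile)

# Read the result back to inspect it.
result <- readLines(out_path)
print(result)
```

For a file of several hundred thousand lines, n would normally be much larger than 3 (the 1000 in the quoted template below, for instance); 3 is kept here only to match the exercise.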

On Wed, Aug 18, 2010 at 10:51 PM, Juliet Hannah <juliet.han...@gmail.com> wrote:
> Hi Jim,
>
> I was trying to use your template without success. With the toy data
> below, could you explain how to use this template to change all "b"s
> to "z"s, just as an exercise, reading in 3 lines at a time? I need to
> use this strategy for a larger problem, but I haven't been able to get
> the basics working.
>
> Thanks,
>
> Juliet
>
> myData <- structure(list(V1 = 1:11, V2 = structure(c(2L, 2L, 1L, 1L, 1L,
> 1L, 2L, 1L, 1L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor"),
> V3 = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L,
> 3L), .Label = c("a", "b", "c"), class = "factor"), V4 = structure(c(1L,
> 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("a",
> "b"), class = "factor"), V5 = c(-0.499071939558026, 1.51341011554134,
> 1.93754671209923, 0.331061227463955, 0.280752001959284, 0.964635079229074,
> 0.624397908891502, -0.807600774484419, -1.76452730888732,
> 0.546080229326458, 12.3)), .Names = c("V1", "V2", "V3", "V4",
> "V5"), class = "data.frame", row.names = c(NA, -11L))
>
> On Sun, Aug 15, 2010 at 1:06 PM, jim holtman <jholt...@gmail.com> wrote:
>> For efficiency of processing, look at reading in several
>> hundred/thousand lines at a time. Reading and writing one line at a
>> time will probably spend most of the time in the system calls that do
>> the I/O and will take a long time. So do something like this:
>>
>> con <- file('yourInputFile', 'r')
>> outfile <- file('yourOutputFile', 'w')
>> while (length(input <- readLines(con, n=1000)) > 0){
>>     for (i in 1:length(input)){
>>         ......your one line at a time processing
>>     }
>>     writeLines(output, con=outfile)
>> }
>> close(con)
>> close(outfile)
>>
>> On Sun, Aug 15, 2010 at 7:58 AM, Data Analytics Corp.
>> <w...@dataanalyticscorp.com> wrote:
>>> Hi,
>>>
>>> I have an upcoming project that will involve a large text file. I want to
>>>
>>> 1. read the file into R one line at a time
>>> 2. do some string manipulations on the line
>>> 3. write the line to another text file.
>>>
>>> I can handle the last two parts. Scan and read.table seem to read the whole
>>> file in at once. Since this is a very large file (several hundred thousand
>>> lines), this is not practical. Hence the idea of reading one line at a
>>> time. The question is, can R read one line at a time? If so, how? Any
>>> suggestions are appreciated.
>>>
>>> Thanks,
>>>
>>> Walt
>>>
>>> ________________________
>>>
>>> Walter R. Paczkowski, Ph.D.
>>> Data Analytics Corp.
>>> 44 Hamilton Lane
>>> Plainsboro, NJ 08536
>>> ________________________
>>> (V) 609-936-8999
>>> (F) 609-936-3733
>>> w...@dataanalyticscorp.com
>>> www.dataanalyticscorp.com

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.