Hi Matt,
Though it's the last solution on your list, I would treat this as a
text editing problem: just find "\.[0-9]" and replace it with nothing,
then read in the result.

perl -pi -e 's/\.[0-9]//g' *test.txt

That will likely be done in seconds.
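
If you would rather not touch the file on disk, a minimal sketch of the same substitution without the -i switch is below. It writes the cleaned text to a new file and leaves the original alone. The file names sample_test.txt and sample_test_clean.txt are made up for illustration, and I am assuming one value per line with no other decimal points in the file that need to survive.

```shell
# Hypothetical sample file standing in for the real data (one value per line).
printf '18x.6\n12x.9\n302x.3\n' > sample_test.txt

# Without -i, perl prints to stdout, so the original file is left unchanged;
# the redirect collects the cleaned lines in a separate file.
perl -pe 's/\.[0-9]//g' sample_test.txt > sample_test_clean.txt

# Show the cleaned values.
cat sample_test_clean.txt
```

The cleaned copy can then be read into R in place of the original.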

But other R solutions seem to be coming in fairly quickly too.

t

On 30 May 2011, at 10:10, Matthew Keller wrote:
Here's the problem:
x <- c('18x.6','12x.9','302x.3')

I want to get a vector that is c('18x','12x','302x')

This is easily done using this code:

unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1]))

So far so good. The problem is that x is a vector of length 132e6.
When I run the above code, it runs for more than 30 minutes and takes
more than 23 GB of RAM (no kidding!).

Does anyone have ideas about how to speed up the code above and (more
importantly) reduce the RAM footprint? I'd prefer not to change the
file on disk using, e.g., awk, but I will do that as a last resort.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
