That's a good solution, but if you're certain that every timestamp is in the
format you gave, it's noticeably faster to use substr and paste (roughly twice
as fast in the timings below), because no searching in the string is needed.
HTH
Rex
> x = rep("09:30:00.000.633",1000000)
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
user system elapsed
0.87 0.00 0.88
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
user system elapsed
1.65 0.00 1.65
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
user system elapsed
1.65 0.00 1.66
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
user system elapsed
0.88 0.00 0.89
>
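The substr/paste approach silently produces wrong output if a timestamp is not
exactly in HH:MM:SS.MMM.UUU form, so here is a minimal sketch of one way to
guard it. The helper name strip_last_dot and the particular sanity check
(16 characters with a "." in position 13) are my own assumptions, not something
from the thread; rows that fail the check fall back to the regex. The check
adds some cost, and I have not timed this version.

```r
# Hypothetical helper (not from the thread): remove the second "." by position
# where the string has the expected fixed layout, otherwise fall back to sub().
strip_last_dot <- function(x) {
  # Cheap sanity check, no regex: 16 characters with a "." in position 13.
  ok <- !is.na(x) & nchar(x) == 16L & substr(x, 13, 13) == "."
  out <- x
  # Fast path: fixed positions, no searching in the string.
  out[ok] <- paste(substr(x[ok], 1, 12), substr(x[ok], 14, 16), sep = "")
  # Fallback: drop the last "." that precedes the trailing digits.
  out[!ok] <- sub("\\.(\\d+)$", "\\1", x[!ok])
  out
}

x <- c("09:30:00.000.633", "9:30:00.1.5")
y <- strip_last_dot(x)
print(y)
```

If every row is known to be well formed, the plain substr/paste line timed
above does the same job without the extra check.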
-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Henrique Dallazuanna
Sent: Friday, March 18, 2011 8:32 AM
To: rivercode
Cc: [email protected]
Subject: Re: [R] Replace split with regex for speed ?
Try this:
sub("\\.(\\d+)$", "\\1", ts)
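As a quick check of what that pattern does, the sketch below applies it to
three of the sample timestamps from the original post and compares the result
with the strsplit approach quoted further down. The variable names via_regex
and via_split are illustrative only.

```r
# Three sample values taken from the original post.
ts <- c("09:30:00.000.245", "09:30:00.000.256", "09:30:00.000.633")

# Regex: match the final "." plus trailing digits, keep only the digits.
via_regex <- sub("\\.(\\d+)$", "\\1", ts)

# Original approach: split on ".", then glue the pieces back together.
parts <- strsplit(ts, ".", fixed = TRUE)
via_split <- unlist(lapply(parts, function(p) paste(p[1], ".", p[2], p[3], sep = "")))

# Both approaches should agree on well-formed input.
stopifnot(identical(via_regex, via_split))
print(via_regex)
```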
On Thu, Mar 17, 2011 at 11:01 PM, rivercode <[email protected]> wrote:
>
> Have timestamp in format HH:MM:SS.MMM.UUU and need to remove the last "." so
> it is in format HH:MM:SS.MMMUUU.
>
> What is the fastest way to do this, since it has to be repeated on millions
> of rows. Should I use regex ?
>
> Currently doing it with a string split, which is slow:
>
> >head(ts)
> [1] 09:30:00.000.245 09:30:00.000.256 09:30:00.000.633 09:30:00.001.309
> 09:30:00.003.635 09:30:00.026.370
>
>
> ts = strsplit(ts, ".", fixed = TRUE)
> # Remove last . from timestamp, from HH:MM:SS.MMM.UUU to HH:MM:SS.MMMUUU
> ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )
> ts = unlist(ts)
>
> Thanks,
> Chris
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3386098.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O