[R] how to subset rows using regular expression patterns
Hi netters,

I have a data frame A with several columns (variables). The elements of column M are character strings, e.g. A$M = c("ab", "abc", "bcd", "ac", "abcd", "fg", ".fl"). I want to extract all the rows where A$M matches some regular expression pattern. As a simple example, let the pattern be just "ab": I want to subset the rows where A$M is "ab", "abc", "abcd" or "abXX".

I know I can write a loop that applies a regular expression function such as grep row by row, but when A is large that is inefficient. Could anyone give me a hint about faster code?

Thanks a lot!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to subset rows using regular expression patterns
zhihua li [EMAIL PROTECTED] writes:

> i wanna extract all the rows where A$M match some regular expression pattern. [...] i know i can write a loop,using some regular expression pattern functions like grep row by row. but when A's size is pretty large, it's inefficient. could anyone give me a hint about a faster code?

Notice that grep() is vectorized and returns an index vector, so no loop is needed:

A[grep(pattern, A$M), ]

should do it. subset() wants a logical condition rather than an index vector, so with subset() use grepl() instead:

subset(A, grepl(pattern, M))

--
   O__       Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])                             FAX: (+45) 35327907
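For anyone trying this out, here is a minimal sketch of the vectorized approach on the values from the question. The data frame layout and the extra column x are made up for illustration, and the strings are assumed to have been quoted in the original post. Note that subset() in current R versions requires a logical condition, which is why grepl() rather than grep() appears in that call.

```r
# Hypothetical data frame built from the values in the question;
# the column `x` is invented purely for illustration.
A <- data.frame(M = c("ab", "abc", "bcd", "ac", "abcd", "fg", ".fl"),
                x = 1:7,
                stringsAsFactors = FALSE)

pattern <- "ab"

# grep() scans the whole column in one call and returns the
# integer positions of the elements that match.
idx <- grep(pattern, A$M)

# grepl() returns a logical vector with one entry per row.
keep <- grepl(pattern, A$M)

# Two equivalent ways to pull out the matching rows.
by_index   <- A[idx, ]
by_logical <- subset(A, grepl(pattern, M))

print(by_index)
```

Both forms avoid the row-by-row loop, since the pattern matching runs once over the whole column.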
Re: [R] how to subset rows using regular expression patterns
Something like

A[grep('^ab', as.vector(A$M)),]

might work.

zhihua li wrote:

> i wanna extract all the rows where A$M match some regular expression pattern. for a simple example, let the pattern be just ab, i wanna subset the rows where A$M=ab or abc or abcd or abXX.
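A small sketch of what the ^ anchor changes, in case it helps. The value "cab" is not in the original post; it is added here only to show the difference, and M is assumed to be stored as a factor, which is one situation where converting with as.vector() first makes the intent explicit.

```r
# Hypothetical values: "cab" is added to the list from the question
# to show the effect of anchoring the pattern.
M <- factor(c("ab", "abc", "cab", "ac", "abcd", "fg", ".fl"))

# as.vector() turns the factor into a plain character vector.
M_chr <- as.vector(M)

# Unanchored: "ab" may occur anywhere in the string.
anywhere <- grep("ab", M_chr)

# Anchored: "ab" must occur at the start of the string.
at_start <- grep("^ab", M_chr)

print(M_chr[anywhere])
print(M_chr[at_start])
```

If the goal is "starts with ab", as in the ab / abc / abcd / abXX example, the anchored pattern is probably the one wanted.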