Jake,
You can use the plyr library or some form of apply.  If you are on a 64-bit 
system you can multithread, and it can go much faster.
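
For the multithreaded route, one possibility is the base parallel package. This is only a rough sketch and an assumption about what a parallel lookup could look like, not necessarily the setup meant above. The helper lookup.class and its arguments ref.taxa and ref.class are made-up names for illustration, and the cluster size of 2 is arbitrary.

```r
library(parallel)

df1 <- data.frame(Taxa = c('blue', 'red', NA, 'blue', 'red', NA),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Taxa = c('blue', 'red', NA), Class = c('Z', 'HI', 'A'),
                  stringsAsFactors = FALSE)

# Hypothetical helper: Class of the first df2 row whose Taxa contains the
# pattern; NA when the pattern is NA or nothing matches.
lookup.class <- function(pattern, ref.taxa, ref.class) {
  if (is.na(pattern)) return(NA_character_)
  hit <- grep(pattern, ref.taxa, fixed = TRUE)
  if (length(hit) == 0) NA_character_ else ref.class[hit[1]]
}

cl <- makeCluster(2)   # two worker processes; adjust to your machine
df1$Class <- parSapply(cl, df1$Taxa, lookup.class,
                       ref.taxa = df2$Taxa, ref.class = df2$Class,
                       USE.NAMES = FALSE)
stopCluster(cl)
df1
```

On a toy data set like this the cost of starting the workers outweighs any gain; it would only be worth trying on the large data frame.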

Something like this (for 32-bit):
require(plyr)
df1 <- data.frame(Taxa = c('blue', 'red', NA, 'blue', 'red', NA,
                           'blue', 'red', NA))
df2 <- data.frame(Taxa = c( 'blue', 'red', NA), Class = c('Z', 'HI', 'A'))

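# function to do the lookup; ddply passes one group (a single Taxa value)
# at a time, so only the first element is used as the grep pattern
find.class <- function(x) df2[grep(x[1], df2$Taxa), 'Class']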

ddply(.data=df1,
      .variables='Taxa',
      .fun=transform,
      Class=find.class(Taxa))
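
If you would rather stay in base R, the "some form of apply" option might look roughly like the following. It is a sketch under a couple of assumptions: names are compared as literal substrings (fixed = TRUE), only the first matching row of df2 is used, and the variable names taxa and first.hit are just for illustration.

```r
df1 <- data.frame(Taxa = c('blue', 'red', NA, 'blue', 'red', NA,
                           'blue', 'red', NA),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Taxa = c('blue', 'red', NA), Class = c('Z', 'HI', 'A'),
                  stringsAsFactors = FALSE)

# Run grep once per distinct name instead of once per row.
taxa <- unique(df1$Taxa)
first.hit <- vapply(taxa, function(p) {
  if (is.na(p)) return(NA_integer_)        # an NA name cannot be matched
  hit <- grep(p, df2$Taxa, fixed = TRUE)
  if (length(hit) == 0) NA_integer_ else hit[1]
}, integer(1), USE.NAMES = FALSE)

# Spread the per-name result back over every row of df1.
df1$Class <- as.character(df2$Class)[first.hit[match(df1$Taxa, taxa)]]
df1
```

Note that rows with an NA name get an NA Class here rather than the 'A' listed against NA in df2, since grep cannot match a missing pattern.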

Joel

From: Beaulieu, Jake
Sent: Thursday, September 12, 2013 12:06 PM
To: r-help@r-project.org
Cc: Wahman, David; Farrar, David; Allen, Joel; Green, Hyatt; McManus, Michael
Subject: grep(pattern = each element of a vector) ?

Hi,

I have a large dataframe that contains species names.  I have a second 
dataframe that contains species names and some additional info, called 'Class', 
about each species.  I would like to match the species name in the first data 
frame with the 'Class' information contained in the second.  Since the species 
names are often formatted differently between the data sets, merge doesn't work 
well.  grep does the trick, but the function needs to be called separately for 
each observation in the first data frame.  I put grep into a loop, but this is 
too slow.  Is there a way to run grep repeatedly without resorting to a loop?  
Possibly something in the apply family?

  df1 <- data.frame(Taxa = c('blue', 'red', NA))
  df2 <- data.frame(Taxa = c( 'blue', 'red', NA), Class = c('Z', 'HI', 'A'))

  index <- NULL
  for (i in seq_along(df1$Taxa)) {
    index[i] <- grep(df1$Taxa[i], df2$Taxa)
    }
  index

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: i386-w64-mingw32/i386 (32-bit)

==================================
Jake J. Beaulieu, PhD
US Environmental Protection Agency
National Risk Management Research Lab
26 W. Martin Luther King Drive
Cincinnati, OH 45268
USA
513-569-7842  (desk)
513-487-2511 (fax)
beaulieu.j...@epa.gov


