On Jul 14, 2011, at 6:15 PM, Tyler Rinker wrote:



Good Afternoon R Community,

I often work with very large databases and want to search for select cases by a particular word or numeric value. I created the following simple function to do just that. It searches a particular column for the phrase and returns a data frame with the rows that contain that phrase in that column.

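Search <- function(term, dataframe, column.name, variation = .02, ...) {
  # capture the unquoted term and column name as character strings
  te <- as.character(substitute(term))
  cn <- as.character(substitute(column.name))
  # row indices where the column approximately matches the term
  HUNT <- agrep(te, dataframe[, cn], ignore.case = TRUE,
                max.distance = variation, ...)
  ### dataframe[c(HUNT), ]

  # logical vector with one element per row of the data frame
  HUNTL <- (seq_len(NROW(dataframe)) %in% HUNT)
  HUNTL
}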


You would make life simpler by keeping your results as logical vectors the same length as your dataframe.

Then:

  logHunt <- sapply(dfrmname, Search, term = term)
  indexL <- rowSums(logHunt) >= 1
  dfrmname[indexL, ]

Untested in absence of test data.
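
For readers who want something runnable, here is a minimal sketch of the logical-vector idea. It is not the code from this thread: search_col and search_all are hypothetical names, the term is passed as a quoted string rather than captured with substitute(), and the helper takes a single column (a vector) so that lapply() can apply it to each column in turn. The toy data frame is made up for illustration.

```r
# Hypothetical helper (not from the thread): fuzzy-match `term` against one
# column and return a logical vector with one element per row.
search_col <- function(x, term, variation = 0.02, ...) {
  hits <- agrep(term, as.character(x), ignore.case = TRUE,
                max.distance = variation, ...)
  seq_along(x) %in% hits
}

# Hypothetical wrapper: keep each row where at least one column matches.
search_all <- function(term, dataframe, variation = 0.02, ...) {
  # lapply() hands each column to search_col(); the named arguments after
  # the function name are passed along unchanged on every call
  hits_by_col <- lapply(dataframe, search_col, term = term,
                        variation = variation, ...)
  # combine the per-column logical vectors with OR
  keep <- Reduce(`|`, hits_by_col, rep(FALSE, nrow(dataframe)))
  dataframe[keep, , drop = FALSE]
}

# Made-up test data
fruit <- data.frame(name = c("apple", "banana", "cherry"),
                    note = c("red", "APPLE pie", "dark"),
                    stringsAsFactors = FALSE)
search_all("apple", fruit)   # rows 1 and 2 contain "apple" in some column
```

Because the rows are selected with a single logical vector, a row that matches in several columns is still returned only once, so a separate unique() step should not be needed.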

--
David.


I would like to modify this to search all columns for the phrase, keep only the unique rows, and return a data frame of the rows (minus repeated rows) in which any column contains the phrase.

I assumed this would be an easy task for me using sapply() and unique() or union(). Because this function takes more than one argument (the vector {column} is not the only argument), I don’t know how to set it up. Could someone tell me how to apply this function to multiple columns and return one data frame with all the agrep matches? (I’ll figure out how to deal with duplicates after that; that’s the easy part.)

Thank you in advance for your help,
Tyler Rinker

PS: If your idea is a for loop, please explain it well or provide the code, because I do not have a programming background and for loops are very difficult to wrap my head around.

Running Windows 7
R version 2.14.0 (beta)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
