Hi,

The problem is that I first connect to the Access database with odbcConnectAccess and then select the data frame with sqlQuery. In your solution you type the data in by hand, but my databases contain approximately 60,000 records.
Maybe you have another solution?

Thanks in advance.

Regards,

Priya

On 11/7/06, Christoph Buser <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> Maybe this example can help you to find your solution:
>
> dat1 <- data.frame(CUSTOMER_ID = c("1000786BR", "1002047BR", "10127BR",
>                        "1004166834BR", "1004310897BR", "1006180BR",
>                        "10064798BR", "1007311BR", "1007621BR",
>                        "1008195BR", "10126BR", "95323994BR"),
>                    CUSTOMER_RR = c("5+", "4", "5+", "2", "X", "4", "4",
>                        "5+", "4", "4-", "5+", "4"))
>
> dat2 <- data.frame(CUSTOMER_ID = c("1200786BR", "1802047BR", "1027BR",
>                        "10166834BR", "107BR", "100BR", "164798BR",
>                        "1008195BR", "10126BR"),
>                    CUSTOMER_RR = c("6+", "4", "1+", "2", "X", "4", "4",
>                        "4", "5+"))
>
> ## Merge, but only by "CUSTOMER_ID"
> datM <- merge(dat1, dat2, by = "CUSTOMER_ID")
> datM
> ## Select only cases that have the same "CUSTOMER_RR" in both data frames
> ## (row-wise comparison; as.character() avoids comparing factors)
> datM1 <- datM[as.character(datM[, "CUSTOMER_RR.x"]) ==
>               as.character(datM[, "CUSTOMER_RR.y"]), ]
> datM1
>
> Regards,
>
> Christoph
>
> --------------------------------------------------------------
> Credit and Surety PML study: visit our web page www.cs-pml.org
> --------------------------------------------------------------
> Christoph Buser <[EMAIL PROTECTED]>
> Seminar fuer Statistik, LEO C13
> ETH Zurich 8092 Zurich SWITZERLAND
> phone: x-41-44-632-4673 fax: 632-1228
> http://stat.ethz.ch/~buser/
> --------------------------------------------------------------
>
>
> Priya Kanhai writes:
> > Hi,
> >
> > I've a question about comparing 2 dataframes: RRC_db1 and RRC_db2 of
> > different length.
> >
> > For example:
> >
> > RRC_db1:
> >
> >     CUSTOMER_ID CUSTOMER_RR
> > 1     1000786BR          5+
> > 2     1002047BR           4
> > 3       10127BR          5+
> > 4  1004166834BR           2
> > 5  1004310897BR           X
> > 6     1006180BR           4
> > 7    10064798BR           4
> > 8     1007311BR          5+
> > 9     1007621BR           4
> > 10    1008195BR          4-
> > 11      10126BR          5+
> > 12   95323994BR           4
> >
> > RRC_db2:
> >
> >    CUSTOMER_ID CUSTOMER_RR
> > 1    1200786BR          6+
> > 2    1802047BR           4
> > 3       1027BR          1+
> > 4   10166834BR           2
> > 5        107BR           X
> > 6        100BR           4
> > 7     164798BR           4
> > 8    1008195BR          4-
> > 9      10126BR          5+
> >
> >
> > I want to pick the CUSTOMER_ID of RRC_db1 which also exist in RRC_db2:
> > third <- merge(RRC_db1, RRC_db2) or
> > third <- subset(RRC_db1, CUSTOMER_ID %in% RRC_db2$CUSTOMER_ID)
> >
> > But I also want to check if the CUSTOMER_RR is correct. I had tried this:
> >
> > > test <- function(RRC_db1,RRC_db2)
> > + {
> > +   noteq <- c()
> > +   for( i in 1:length(RRC_db1$CUSTOMER_ID)){
> > +     for( j in 1:length(RRC_db2$CUSTOMER_ID)){
> > +       if(RRC_db1$CUSTOMER_ID[i] == RRC_db2$CUSTOMER_ID[j]){
> > +         if(RRC_db1$CUSTOMER_RR[i] != RRC_db2$CUSTOMER_RR[j]){
> > +           noteq <- c(noteq,RRC_db1$CUSTOMER_ID[i]);
> > +         }
> > +       }
> > +     }
> > +   }
> > +   noteq;
> > + }
> > >
> > > test(RRC_db1, RRC_db2)
> > Error in Ops.factor(RRC_db1$CUSTOMER_ID[i], RRC_db2$CUSTOMER_ID[j]) :
> >         level sets of factors are different
> >
> >
> > But then I got this error.
> >
> > I don't only want the CUSTOMER_ID to be the same but also the CUSTOMER_RR.
> >
> > Can you please help me?
> >
> > Thanks in advance.
> >
> > Regards,
> >
> > Priya
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
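
For what it is worth, the merge approach should not require typing any data in: sqlQuery returns an ordinary data frame, so the same steps can be applied to the query results directly. The following is only a minimal sketch under some assumptions. The helper name compare_ratings is made up for illustration, the Access file and table names in the comments are hypothetical, and the ratings are assumed to contain no missing values. Converting the columns with as.character() first is what avoids the "level sets of factors are different" error.

```r
## Sketch: compare two customer tables by ID and rating.
## `compare_ratings` is a hypothetical helper name, not part of any package.
compare_ratings <- function(db1, db2) {
  ## Convert (possibly factor) columns to character so that values coming
  ## from factors with different level sets can be compared with ==.
  for (col in c("CUSTOMER_ID", "CUSTOMER_RR")) {
    db1[[col]] <- as.character(db1[[col]])
    db2[[col]] <- as.character(db2[[col]])
  }
  ## Keep only the customers that are present in both tables.
  both <- merge(db1, db2, by = "CUSTOMER_ID", suffixes = c(".db1", ".db2"))
  ## Row-wise comparison of the two ratings (assumes no NA ratings).
  same <- both$CUSTOMER_RR.db1 == both$CUSTOMER_RR.db2
  list(equal = both[same, , drop = FALSE],
       noteq = both[!same, , drop = FALSE])
}

## With RODBC the inputs would come straight from the queries, for example
## (file and table names below are assumptions, not from the original post):
##   library(RODBC)
##   channel <- odbcConnectAccess("customers.mdb")
##   RRC_db1 <- sqlQuery(channel, "SELECT CUSTOMER_ID, CUSTOMER_RR FROM table1")
##   RRC_db2 <- sqlQuery(channel, "SELECT CUSTOMER_ID, CUSTOMER_RR FROM table2")
##   odbcClose(channel)

## Small stand-in data so that the sketch runs without a database.
RRC_db1 <- data.frame(CUSTOMER_ID = c("1008195BR", "10126BR", "1007621BR"),
                      CUSTOMER_RR = c("4-", "5+", "4"))
RRC_db2 <- data.frame(CUSTOMER_ID = c("1008195BR", "10126BR", "100BR"),
                      CUSTOMER_RR = c("4", "5+", "4"))

res <- compare_ratings(RRC_db1, RRC_db2)
res$equal   # customers whose rating agrees in both tables
res$noteq   # customers present in both tables with different ratings
```

Because merge() matches the rows by CUSTOMER_ID in one step, the nested loops over all pairs of records are not needed, which should matter with roughly 60,000 records per table.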