Hi,

Your question is not entirely clear. I assume VAR1 is a match between columns AB001A and AB0002A, VAR2 between AB001A and AB362, and VAR3 between AB0002A and AB362. I also assume that row 8, where all three columns are identical, would be scored as a match (1):

dat1 <- read.table(text="
S.No AB001A AB0002A AB362
1 -/- C/C A/A
2 C/C C/C A/A
3 C/C C/C A/A
4 C/C C/C A/A
5 C/C C/C A/A
6 C/C C/C A/A
7 C/C C/C A/A
8 -/- -/- -/-
9 C/C C/C A/A
10 C/C C/C A/A
11 -/- C/C A/A
12 C/C C/C A/A
13 C/C C/C A/A
14 C/C C/C A/A
16 C/C -/- A/A
17 -/- C/C A/A
18 C/C C/C A/A
19 C/C C/C A/A
", sep="", header=TRUE, stringsAsFactors=FALSE)

library(plyr)

# Pairwise column matches (1 = match, 0 = no match), their sum,
# the percentage match, and a rank (ties get the average rank by default).
res <- mutate(dat1,
              VAR1=1*(AB001A==AB0002A),
              VAR2=1*(AB001A==AB362),
              VAR3=1*(AB0002A==AB362),
              SUM=rowSums(cbind(VAR1,VAR2,VAR3)),
              MATCH=(SUM/3)*100,
              Rank=rank(MATCH))
head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/-     C/C   A/A    0    0    0   0  0.00000  2.5
#2    2    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#3    3    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#4    4    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#5    5    C/C     C/C   A/A    1    0    0   1 33.33333 11.0
#6    6    C/C     C/C   A/A    1    0    0   1 33.33333 11.0

#or, giving tied rows the smallest rank in their group
res <- mutate(dat1,
              VAR1=1*(AB001A==AB0002A),
              VAR2=1*(AB001A==AB362),
              VAR3=1*(AB0002A==AB362),
              SUM=rowSums(cbind(VAR1,VAR2,VAR3)),
              MATCH=(SUM/3)*100,
              Rank=rank(MATCH,ties.method="min"))
head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/-     C/C   A/A    0    0    0   0  0.00000    1
#2    2    C/C     C/C   A/A    1    0    0   1 33.33333    5
#3    3    C/C     C/C   A/A    1    0    0   1 33.33333    5
#4    4    C/C     C/C   A/A    1    0    0   1 33.33333    5
#5    5    C/C     C/C   A/A    1    0    0   1 33.33333    5
#6    6    C/C     C/C   A/A    1    0    0   1 33.33333    5

A.K.


>Hi to all bloggers,
>my data looks like this,
>
>S. No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM %Match Rank
> 1 -/- C/C A/A
> 2 C/C C/C A/A
> 3 C/C C/C A/A
> 4 C/C C/C A/A
> 5 C/C C/C A/A
> 6 C/C C/C A/A
> 7 C/C C/C A/A
> 8 -/- -/- -/-
> 9 C/C C/C A/A
> 10 C/C C/C A/A
> 11 -/- C/C A/A
> 12 C/C C/C A/A
> 13 C/C C/C A/A
> 14 C/C C/C A/A
> 16 C/C -/- A/A
> 17 -/- C/C A/A
> 18 C/C C/C A/A
> 19 C/C C/C A/A
>
>I want to match obs 3 with obs 2. If it matches exactly, the score will be 1, else 0. That will be stored in VAR1 for AB001A, in VAR2 for AB0002A and in
>VAR3 for AB362. I also want to calculate the sum of all the 1's, the observation match percent and the rank (top ten matchers). I did this successfully in
>Excel, but it took me a lot of time. I used an IF condition in Excel like =IF(A3=A$2,1,0), dragged it across all obs, and then computed the sum, the
>%match and the rank. My question is: how can I do this in R? Can I use the match package for this, or will other packages help me? My data is big, with
>5,15,567 obs. Can anyone guide me how to do this in SAS? I want to reduce the time it takes to analyze my data.
>
>Thanking you
>Regards,

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
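
Another possible reading of the quoted question, going by the Excel formula =IF(A3=A$2,1,0), is that every observation is compared with one reference observation, column by column, rather than columns being compared with each other. The following is only a minimal base-R sketch under that assumption. The choice of S.No 2 as the reference row, the small sample `dat`, and the names `cols`, `ref`, `hits` and `score` are illustrative assumptions, not something stated in the original post.

```r
# Small sample in the same layout as the posted data (illustrative subset).
dat <- data.frame(
  S.No    = c(1, 2, 3, 8, 16),
  AB001A  = c("-/-", "C/C", "C/C", "-/-", "C/C"),
  AB0002A = c("C/C", "C/C", "C/C", "-/-", "-/-"),
  AB362   = c("A/A", "A/A", "A/A", "-/-", "A/A"),
  stringsAsFactors = FALSE
)

cols <- c("AB001A", "AB0002A", "AB362")

# Assumption: the observation with S.No 2 is the reference row.
ref <- unlist(dat[dat$S.No == 2, cols])

# For each column, score 1 where the value equals the reference value, else 0.
hits <- sapply(cols, function(cn) as.integer(dat[[cn]] == ref[[cn]]))
colnames(hits) <- c("VAR1", "VAR2", "VAR3")

score <- cbind(dat, hits)
score$SUM   <- rowSums(hits)
score$MATCH <- 100 * score$SUM / length(cols)

# Negate MATCH so the best match gets rank 1; ties share the smallest rank.
score$Rank  <- rank(-score$MATCH, ties.method = "min")

# Best matches first; wrap in head(..., 10) to keep only the top ten.
score[order(score$Rank), ]
```

The comparisons are vectorised over whole columns, so the same approach should carry over to a much larger data frame, although that has not been tried on data of the size mentioned in the question.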