Gabor Grothendieck wrote:
> If you make the levels the same does that give what you want:
>
> levs <- c(LETTERS[1:6], "0")
> tmp1 <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
> tmp2 <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)
> merge(tmp2, tmp1, all = TRUE, sort = FALSE)
> merge(tmp1, tmp2, all = TRUE, sort = FALSE)

Gabor, thanks for this, but unfortunately the result is the same. I get the
following both ways (note that I use all.x or all.y = TRUE).

> merge(tmp2, tmp1, all.x = TRUE, sort = FALSE)
  col1 col2
1    C    1
2    C    1
3    A   NA
4    A   NA
5    0   NA
6    0   NA

But I want the order as it is in tmp1:

  col1
1    A
2    A
3    C
4    C
5    0
6    0

>>Hello!
>>
>>I am merging two datasets and I have encountered a problem with sort.
>>Can someone please point me to my error? Here is the example.
>>
>>## I have dataframes, first one with factor and second one with factor
>>## and integer
>>
>>>tmp1 <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0")))
>>>tmp2 <- data.frame(col1 = factor(c("C", "D", "E", "F")), col2 = 1:4)
>>>tmp1
>>
>>  col1
>>1    A
>>2    A
>>3    C
>>4    C
>>5    0
>>6    0
>>
>>>tmp2
>>
>>  col1 col2
>>1    C    1
>>2    D    2
>>3    E    3
>>4    F    4
>>
>>## Now merge them
>>
>>>(tmp12 <- merge(tmp1, tmp2, by.x = "col1", by.y = "col1",
>>                 all.x = TRUE, sort = FALSE))
>>  col1 col2
>>1    C    1
>>2    C    1
>>3    A   NA
>>4    A   NA
>>5    0   NA
>>6    0   NA
>>
>>## As you can see, sort was applied, since row order is not the same as
>>## in tmp1. Reading help page for ?merge did not reveal much about
>>## sorting. However I did try to see the result of "non-default" -
>>## help page says that order should be the same as in 'y'. So above
>>## makes sense
>>
>>## Now merge - but change x and y
>>
>>>(tmp21 <- merge(tmp2, tmp1, by.x = "col1", by.y = "col1",
>>                 all.y = TRUE, sort = FALSE))
>>  col1 col2
>>1    C    1
>>2    C    1
>>3    A   NA
>>4    A   NA
>>5    0   NA
>>6    0   NA
>>
>>## The result is the same. I am stumped here. But looking a bit at these
>>## objects I found something peculiar
>>
>>>str(tmp1)
>>
>>`data.frame': 6 obs. of 1 variable:
>> $ col1: Factor w/ 3 levels "0","A","C": 2 2 3 3 1 1
>>
>>>str(tmp2)
>>
>>`data.frame': 4 obs. of 2 variables:
>> $ col1: Factor w/ 4 levels "C","D","E","F": 1 2 3 4
>> $ col2: int 1 2 3 4
>>
>>>str(tmp12)
>>
>>`data.frame': 6 obs. of 2 variables:
>> $ col1: Factor w/ 3 levels "0","A","C": 3 3 2 2 1 1
>> $ col2: int 1 1 NA NA NA NA
>>
>>>str(tmp21)
>>
>>`data.frame': 6 obs. of 2 variables:
>> $ col1: Factor w/ 6 levels "C","D","E","F",..: 1 1 6 6 5 5
>> $ col2: int 1 1 NA NA NA NA
>>
>>## Is it OK that the internal representation of factors varies between
>>## different merges? Levels are also different: once only levels
>>## from the original data.frame are used, while in the second example
>>## all levels are propagated.
>>
>>## I have tried the same with characters
>>
>>>tmp1$col1 <- as.character(tmp1$col1)
>>>tmp2$col1 <- as.character(tmp2$col1)
>>>(tmp12c <- merge(tmp1, tmp2, by.x = "col1", by.y = "col1",
>>                  all.x = TRUE, sort = FALSE))
>>  col1 col2
>>1    C    1
>>2    C    1
>>3    A   NA
>>4    A   NA
>>5    0   NA
>>6    0   NA
>>
>>>(tmp21c <- merge(tmp2, tmp1, by.x = "col1", by.y = "col1",
>>                  all.y = TRUE, sort = FALSE))
>>  col1 col2
>>1    C    1
>>2    C    1
>>3    A   NA
>>4    A   NA
>>5    0   NA
>>6    0   NA
>>
>>## The same with characters. Is this a bug? It definitely does not agree
>>## with help page, since order is not the same as in 'y'. Can someone
>>## please check on newer versions?
>>
>>## Is there any other way to get the same order as in 'y' i.e. tmp1?
>>
>>>R.version
>>
>>         _
>>platform i486-pc-linux-gnu
>>arch     i486
>>os       linux-gnu
>>system   i486, linux-gnu
>>status
>>major    2
>>minor    2.0
>>year     2005
>>month    10
>>day      06
>>svn rev  35749
>>language R
>>
>>Thank you very much!
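
On the question of getting the rows back in the order of tmp1: one possible workaround, given only as a sketch and not confirmed anywhere in this thread, is to record the row position before merging and reorder afterwards. The column name row_id and the object names merged and looked_up are made up for illustration. The last lines show a lookup with match() as an alternative that avoids merge() altogether when only one column is needed.

```r
tmp1 <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0")))
tmp2 <- data.frame(col1 = factor(c("C", "D", "E", "F")), col2 = 1:4)

# Hypothetical helper column: remember where each row sat in tmp1
tmp1$row_id <- seq_len(nrow(tmp1))

merged <- merge(tmp1, tmp2, by = "col1", all.x = TRUE, sort = FALSE)

# Put the rows back in the original tmp1 order, then drop the helper column
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL
merged

# Alternative: look up col2 directly, which keeps the order of tmp1 as is
looked_up <- tmp2$col2[match(as.character(tmp1$col1), as.character(tmp2$col1))]
looked_up
```

The helper column costs one extra step but does not depend on how merge() arranges matched and unmatched rows internally.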

-- 
Lep pozdrav / With regards,
    Gregor Gorjanc

----------------------------------------------------------------------
University of Ljubljana     PhD student
Biotechnical Faculty
Zootechnical Department     URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3                   mail: gregor.gorjanc <at> bfro.uni-lj.si

SI-1230 Domzale             tel: +386 (0)1 72 17 861
Slovenia, Europe            fax: +386 (0)1 72 17 888

----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try." Sophocles ~ 450 B.C.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html