If you make the levels the same, does that give what you want?

# use one common set of levels for the key column in both data frames
levs <- c(LETTERS[1:6], "0")
tmp1 <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
tmp2 <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)
# merge in both directions, without sorting on the key
merge(tmp2, tmp1, all = TRUE, sort = FALSE)
merge(tmp1, tmp2, all = TRUE, sort = FALSE)

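If the aim is simply to get the rows back in the order of tmp1, one possible workaround is to carry an explicit row index through the merge and reorder on it afterwards. This is only a sketch: the column name row_id is made up for illustration, and character keys are used so that factor levels play no part.

```r
# Same data as in the question, with character keys.
tmp1 <- data.frame(col1 = c("A", "A", "C", "C", "0", "0"),
                   stringsAsFactors = FALSE)
tmp2 <- data.frame(col1 = c("C", "D", "E", "F"), col2 = 1:4,
                   stringsAsFactors = FALSE)

# 'row_id' is a hypothetical helper column recording the original row order.
tmp1$row_id <- seq_len(nrow(tmp1))

# Left join on col1; merge() may rearrange the rows even with sort = FALSE.
m <- merge(tmp1, tmp2, by = "col1", all.x = TRUE, sort = FALSE)

# Restore the order of tmp1 and drop the helper column.
m <- m[order(m$row_id), c("col1", "col2")]
rownames(m) <- NULL
m
```

Reordering on the index does not depend on how merge arranges matched and unmatched rows, so it should behave the same whichever data frame is passed as x.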
On 3/6/06, Gregor Gorjanc <[EMAIL PROTECTED]> wrote:
> Hello!
>
> I am merging two datasets and I have encountered a problem with sort.
> Can someone please point me to my error? Here is the example.
>
> ## I have dataframes, first one with factor and second one with factor
> ## and integer
> > tmp1 <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0")))
> > tmp2 <- data.frame(col1 = factor(c("C", "D", "E", "F")), col2 = 1:4)
> > tmp1
>   col1
> 1    A
> 2    A
> 3    C
> 4    C
> 5    0
> 6    0
> > tmp2
>   col1 col2
> 1    C    1
> 2    D    2
> 3    E    3
> 4    F    4
>
> ## Now merge them
> > (tmp12 <- merge(tmp1, tmp2, by.x = "col1", by.y = "col1",
>                  all.x = TRUE, sort = FALSE))
>   col1 col2
> 1    C    1
> 2    C    1
> 3    A   NA
> 4    A   NA
> 5    0   NA
> 6    0   NA
>
> ## As you can see, sort was applied, since row order is not the same as
> ## in tmp1. Reading the help page for ?merge did not reveal much about
> ## sorting. However I did try to see the result of "non-default" -
> ## help page says that order should be the same as in 'y'. So above
> ## makes sense
>
> ## Now merge - but change x and y
> > (tmp21 <- merge(tmp2, tmp1, by.x = "col1", by.y = "col1",
>                  all.y = TRUE, sort = FALSE))
>   col1 col2
> 1    C    1
> 2    C    1
> 3    A   NA
> 4    A   NA
> 5    0   NA
> 6    0   NA
>
> ## The result is the same. I am stumped here. But looking a bit at these
> ## objects I found something peculiar
>
> > str(tmp1)
> `data.frame':   6 obs. of  1 variable:
>  $ col1: Factor w/ 3 levels "0","A","C": 2 2 3 3 1 1
> > str(tmp2)
> `data.frame':   4 obs. of  2 variables:
>  $ col1: Factor w/ 4 levels "C","D","E","F": 1 2 3 4
>  $ col2: int  1 2 3 4
> > str(tmp12)
> `data.frame':   6 obs. of  2 variables:
>  $ col1: Factor w/ 3 levels "0","A","C": 3 3 2 2 1 1
>  $ col2: int  1 1 NA NA NA NA
> > str(tmp21)
> `data.frame':   6 obs. of  2 variables:
>  $ col1: Factor w/ 6 levels "C","D","E","F",..: 1 1 6 6 5 5
>  $ col2: int  1 1 NA NA NA NA
>
> ## Is it OK that the internal representation of factors varies between
> ## different merges?
> ## Levels are also different: in one case only the levels
> ## from the original data.frame are used, while in the second example all
> ## levels are propagated.
>
> ## I have tried the same with characters
> > tmp1$col1 <- as.character(tmp1$col1)
> > tmp2$col1 <- as.character(tmp2$col1)
> > (tmp12c <- merge(tmp1, tmp2, by.x = "col1", by.y = "col1",
>                   all.x = TRUE, sort = FALSE))
>   col1 col2
> 1    C    1
> 2    C    1
> 3    A   NA
> 4    A   NA
> 5    0   NA
> 6    0   NA
>
> > (tmp21c <- merge(tmp2, tmp1, by.x = "col1", by.y = "col1",
>                   all.y = TRUE, sort = FALSE))
>   col1 col2
> 1    C    1
> 2    C    1
> 3    A   NA
> 4    A   NA
> 5    0   NA
> 6    0   NA
>
> ## The same with characters. Is this a bug? It definitely does not agree
> ## with the help page, since the order is not the same as in 'y'. Can someone
> ## please check on newer versions?
>
> ## Is there any other way to get the same order as in 'y' i.e. tmp1?
>
> > R.version
>          _
> platform i486-pc-linux-gnu
> arch     i486
> os       linux-gnu
> system   i486, linux-gnu
> status
> major    2
> minor    2.0
> year     2005
> month    10
> day      06
> svn rev  35749
> language R
>
> Thank you very much!
>
> --
> Lep pozdrav / With regards,
>     Gregor Gorjanc
>
> ----------------------------------------------------------------------
> University of Ljubljana     PhD student
> Biotechnical Faculty
> Zootechnical Department     URI: http://www.bfro.uni-lj.si/MR/ggorjan
> Groblje 3                   mail: gregor.gorjanc <at> bfro.uni-lj.si
>
> SI-1230 Domzale             tel: +386 (0)1 72 17 861
> Slovenia, Europe            fax: +386 (0)1 72 17 888
>
> ----------------------------------------------------------------------
> "One must learn by doing the thing; for though you think you know it,
>  you have no certainty until you try." Sophocles ~ 450 B.C.
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html