Hi! I am working with a large network dataset consisting of source-target pairs stored in a tibble. I now need to transform this directed dataset into an undirected one. That is, I need to keep only one instance of each pair that has the same nodes, regardless of direction. In other words, if my data has one row with A (source) and B (target) and another with B (source) and A (target), only one A-B pair should be kept.
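To make the requirement concrete, here is a minimal sketch on a toy edge list. The data and the column names `node1`/`node2` are made up for illustration, and ordering the endpoints with `pmin()`/`pmax()` is just one possible way to get a direction-independent key, not necessarily the best one:

```r
library(dplyr)
library(tibble)

# Hypothetical toy edge list: the A-B pair appears in both directions.
edges <- tibble(Source = c("A", "B", "A", "C"),
                Target = c("B", "A", "C", "C"))

# Put the "smaller" endpoint first so that A->B and B->A get the same key,
# then keep one row per (node1, node2) combination.
undirected <- edges %>%
  mutate(node1 = pmin(Source, Target),
         node2 = pmax(Source, Target)) %>%
  distinct(node1, node2)

undirected
```

Swapping `distinct(node1, node2)` for `count(node1, node2)` should give the number of rows per undirected pair instead.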
Here is an example of how I have solved this problem so far:

--- snip ---
library(dplyr)
library(tibble)

# Create some data
x <- tibble(Source = rep(1:3, 4),
            Target = c(rep(1, 3), rep(2, 3), rep(3, 3), rep(4, 3)))
x # print original data

# Remove "undirected" duplicates
x <- x %>%
  # Build a direction-independent key, e.g. "1-2" for both 1->2 and 2->1
  mutate(pair = mapply(function(x, y) paste0(sort(c(x, y)), collapse = "-"),
                       Source, Target)) %>%
  distinct(pair, .keep_all = TRUE) %>%
  # Split the key back into its two nodes (they come back as character)
  mutate(Source = sapply(pair, function(x) unlist(strsplit(x, split = "-"))[1]),
         Target = sapply(pair, function(x) unlist(strsplit(x, split = "-"))[2])) %>%
  select(-pair)
x # print cleaned data
--- snip ---

The good thing about my own solution is that it allows the creation of weighted pairs as well. One just needs to replace 'distinct(pair, .keep_all = TRUE)' with 'count(pair)'.

I have done a lot of searching but have not found any function providing this functionality. Does someone know an alternative, maybe a more efficient function/solution?

Best,
Kimmo Elo

--
Dr. Kimmo Elo
Senior researcher in European Studies
=====================================================
University of Turku
Centre for Parliamentary Studies
Finland
E-mail: kimmo....@utu.fi
=====================================================

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.