Well, I hadn't ever seen RBGL before, so that's great. I've been using igraph and sna mainly, but there are a few points lacking between these two. RBGL solves a lot of problems for me!
But I'm not sure it will solve this specific problem. Are you suggesting I use RBGL to do a depth-first search of all the subgraphs? For this particular depth-first search I'm not searching every subgraph, but just those that are constructed from a minimal cutset of the parent subgraph. At each level of the search, I have to compute graph cohesion (vertex connectivity), which can take considerable time. A lot of computation time is saved by only searching subgraphs obtained through cutsets. So a complete search of all the subgraphs won't work, but the redundancy I come across is I think unavoidable. The particular algorithm I'm trying to implement is Moody and White's cohesive blocking, in which the end result is a nested set of all subgraphs with a higher cohesion (connectivity) than their parents. (see http://www.santafe.edu/research/publications/workingpapers/ 00-08-049.pdf ) On Mar 16, 2007, at 11:00 AM, Robert Gentleman wrote: > why not just use the graph package and RBGL at www.bioconductor.org > > > Peter McMahan wrote: > >> Hello, >> I'm running an algorithm for graph structural cohesion that >> requires a depth-first search of subgraphs of a rather large >> network. The algorithm will necessarily be redundant in the >> subgraphs it recurses to, so to speed up the process I >> implemented a check at each subgraph to see if it's been searched >> already. >> This algorithm is very slow, and takes days to complete on a >> graph with about 200 nodes. It seems that a significant portion >> of the computation time is spent looking up the current subgraph >> in the list of searched subgraphs to see if it is redundant, and >> I'm wondering if there's a faster way to do this. >> Currently, the list of searched subgraphs is a list >> (`theseSearchedBranches`) of unnamed numeric vectors. I'm >> checking against the list using the following command: >> if(list(v) %in% theseSearchedBranches){ >> cat(" Branch already searched: passing.\n\n") >> return(NULL) >> } >> v is a sorted numeric, with length anywhere between 3 and 200. >> Because theseSearchedBranches gets quite long as the algorithm >> progresses, the %in% command is taking maybe one or two seconds >> per lookup. >> So to reiterate, I have a list() that looks something like this: >> [[1]] >> [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 >> [18]56 72 75 76 85 95 105 110 134 158 159 165 186 >> [[2]] >> [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 >> [18]56 72 75 76 85 95 105 110 134 147 159 165 186 >> [[3]] >> [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 >> [18]50 56 72 75 85 95 105 110 134 147 158 159 165 186 >> ... >> and so on for tens of thousands of entries, and I am trying to >> find some sort of fast equivalent for %in% to search it. I'm also >> not adding the vectors to the list in any particular order, as I >> don't think %in% would know how to take advantage of that anyway. >> Is there a data structure other than list() that I can use that >> would be faster? Would it be better to just create a hashed env >> and add empty variables named "0.5.6.11.12..."? >> I know there are fast lookup algorithms out there that could take >> advantage of the fact that the items being searched are >> indiviually ordered numeric vectors, but I can't find any info >> about R implementations on the mailing lists or help. Is there >> any data type that implements a b-tree type of lookup scheme? >> Any help would be greatly appreciated. >> Thanks, >> Peter >> ______________________________________________ >> [email protected] mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Robert Gentleman, PhD > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M2-B876 > PO Box 19024 > Seattle, Washington 98109-1024 > 206-667-7700 > [EMAIL PROTECTED] > ______________________________________________ [email protected] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
