Note that these algorithms will *not* resolve the fundamentally exponential nature of the problem, they simply provide algorithms for doing it as efficiently as possible, but this still involves essentially visiting every node in the tree at least once until you hit a cycle in the tree or decide you have traveled too far from the start. Moreover, after quickly perusing (in other words I could have missed it) I did not see specific mention of setting a depth of search limit. it is critical that you set a limit or you will end up attempting to spider the entire internet in search of cycles. The examples shown in the references are (for good reason) constrained in terms of node count, but on the real web the pool is (obviously) much larger and the problem becomes fundamentally different.
Hank On Tue, Jul 15, 2008 at 9:43 AM, Dennis Kubes <[EMAIL PROTECTED]> wrote: > Excellent. Thanks was exactly what I needed. Thanks Andrzej. > > Dennis > > > Andrzej Bialecki wrote: > >> Dennis Kubes wrote: >> >>> Does anybody know how to efficiently (non-exponentially) walk a web graph >>> to detect cycles. This would be very useful in identifying spammy webpage >>> and tight knit communities. >>> >>> I have a program that I will be releasing soon that does the detection >>> through converting a webgraph into a tree and walking the tree nodes, but it >>> is exponential in terms of intermediate map reduce output and computation. >>> >> >> Perhaps this could be helpful: >> >> * >> http://www.cs.berkeley.edu/~kamil/teaching/sp03/041403.pdf<http://www.cs.berkeley.edu/%7Ekamil/teaching/sp03/041403.pdf>see >> the last section. >> >> * http://citeseer.ist.psu.edu/395003.html >> >> * http://en.wikipedia.org/wiki/Cycle_detection >> >> >> -- blog: whydoeseverythingsuck.com
