Note that these algorithms will *not* resolve the fundamentally exponential
nature of the problem, they simply provide algorithms for doing it as
efficiently as possible, but this still involves essentially visiting every
node in the tree at least once until you hit a cycle in the tree or decide
you have traveled too far from the start. Moreover, after quickly perusing
(in other words I could have missed it) I did not see specific mention of
setting a depth of search limit. it is critical that you set a limit or you
will end up attempting to spider the entire internet in search of cycles.
The examples shown in the references are (for good reason) constrained in
terms of node count, but on the real web the pool is (obviously) much larger
and the problem becomes fundamentally different.

Hank

On Tue, Jul 15, 2008 at 9:43 AM, Dennis Kubes <[EMAIL PROTECTED]> wrote:

> Excellent.  Thanks was exactly what I needed.  Thanks Andrzej.
>
> Dennis
>
>
> Andrzej Bialecki wrote:
>
>> Dennis Kubes wrote:
>>
>>> Does anybody know how to efficiently (non-exponentially) walk a web graph
>>> to detect cycles.  This would be very useful in identifying spammy webpage
>>> and tight knit communities.
>>>
>>> I have a program that I will be releasing soon that does the detection
>>> through converting a webgraph into a tree and walking the tree nodes, but it
>>> is exponential in terms of intermediate map reduce output and computation.
>>>
>>
>> Perhaps this could be helpful:
>>
>> * 
>> http://www.cs.berkeley.edu/~kamil/teaching/sp03/041403.pdf<http://www.cs.berkeley.edu/%7Ekamil/teaching/sp03/041403.pdf>see
>>  the last section.
>>
>> * http://citeseer.ist.psu.edu/395003.html
>>
>> * http://en.wikipedia.org/wiki/Cycle_detection
>>
>>
>>


-- 
blog: whydoeseverythingsuck.com

Reply via email to