On Tue, Jul 15, 2008 at 10:20 AM, brainstorm <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 15, 2008 at 4:12 PM, hank williams <[EMAIL PROTECTED]> wrote: > > Note that these algorithms will *not* resolve the fundamentally > exponential > > nature of the problem, > > > O(|V|+|E|), IIRC > True. The problem is not actually exponential. It is the *dataset* that is when you talk about the web graph. The problem is essentially linear with the size of the dataset. If your dataset is constrained (by setting a limit) then so is your execution. > > > > efficiently as possible, but this still involves essentially visiting > every > > node in the tree at least once until you hit a cycle in the tree or > decide > > you have traveled too far from the start. Moreover, after quickly > perusing > > (in other words I could have missed it) I did not see specific mention of > > setting a depth of search limit. it is critical that you set a limit or > you > > will end up attempting to spider the entire internet in search of cycles. > > The examples shown in the references are (for good reason) constrained in > > terms of node count, but on the real web the pool is (obviously) much > larger > > and the problem becomes fundamentally different. > > > > Hank > > > > On Tue, Jul 15, 2008 at 9:43 AM, Dennis Kubes <[EMAIL PROTECTED]> wrote: > > > >> Excellent. Thanks was exactly what I needed. Thanks Andrzej. > >> > >> Dennis > >> > >> > >> Andrzej Bialecki wrote: > >> > >>> Dennis Kubes wrote: > >>> > >>>> Does anybody know how to efficiently (non-exponentially) walk a web > graph > >>>> to detect cycles. This would be very useful in identifying spammy > webpage > >>>> and tight knit communities. > >>>> > >>>> I have a program that I will be releasing soon that does the detection > >>>> through converting a webgraph into a tree and walking the tree nodes, > but it > >>>> is exponential in terms of intermediate map reduce output and > computation. > >>>> > >>> > >>> Perhaps this could be helpful: > >>> > >>> * > >>> http://www.cs.berkeley.edu/~kamil/teaching/sp03/041403.pdf<http://www.cs.berkeley.edu/%7Ekamil/teaching/sp03/041403.pdf> > <http://www.cs.berkeley.edu/%7Ekamil/teaching/sp03/041403.pdf>see the last > section. > >>> > >>> * http://citeseer.ist.psu.edu/395003.html > >>> > >>> * http://en.wikipedia.org/wiki/Cycle_detection > >>> > >>> > >>> > > > > > > -- > > blog: whydoeseverythingsuck.com > > > -- blog: whydoeseverythingsuck.com
