AFAIK this is not a Nutch or Hadoop issue but a basic math problem, one that
is inherent in walking a graph as a tree. With roughly b outlinks per page,
walking d levels means on the order of b^d paths, so you have to figure out
how many levels of the graph you are willing to walk and live with the
mathematical consequences. The further out you walk, the more untenable the
math gets, but for a few levels it can be OK. It is nothing but a
brute-force problem.
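
To make the blow-up concrete, here is a rough Python sketch of a
depth-limited brute-force walk. It is not Dennis's program and not Nutch or
Hadoop code; the function name find_cycles_through, the adjacency-dict
representation, and the max_depth parameter are all made up for
illustration. It enumerates every simple path out of one start page and
reports the ones that link back to it, and it counts how many paths it had
to hold along the way.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: page -> list of pages it links to.
Graph = Dict[str, List[str]]


def find_cycles_through(
    graph: Graph, start: str, max_depth: int
) -> Tuple[List[List[str]], int]:
    """Brute-force search for cycles through `start`.

    Returns (cycles, paths_explored). Each cycle is a list of pages that
    begins and ends at `start` and uses at most `max_depth` links.
    """
    cycles: List[List[str]] = []
    paths_explored = 0
    stack: List[List[str]] = [[start]]  # every entry is a full path

    while stack:
        path = stack.pop()
        paths_explored += 1
        # A path with max_depth links cannot be extended any further.
        if len(path) - 1 >= max_depth:
            continue
        for nxt in graph.get(path[-1], []):
            if nxt == start:
                # The link closes a loop back to the start page.
                cycles.append(path + [nxt])
            elif nxt not in path:
                # Only extend simple paths, so the walk always terminates.
                stack.append(path + [nxt])

    return cycles, paths_explored


if __name__ == "__main__":
    # Tiny made-up graph, not real crawl data.
    demo = {"a": ["b", "c"], "b": ["c", "a"], "c": ["a"]}
    found, explored = find_cycles_through(demo, "a", max_depth=3)
    for cycle in found:
        print(" -> ".join(cycle))
    print("paths explored:", explored)
```

Every path is kept whole, which is why the intermediate output grows so
quickly: on a densely linked graph the number of paths multiplies by about
the outlink count with each extra level, so max_depth is the only real
control you have.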

Hank

On Mon, Jul 14, 2008 at 11:57 AM, Dennis Kubes <[EMAIL PROTECTED]> wrote:

> Does anybody know how to efficiently (non-exponentially) walk a web graph
> to detect cycles?  This would be very useful in identifying spammy webpages
> and tight-knit communities.
>
> I have a program that I will be releasing soon that does the detection
> by converting a web graph into a tree and walking the tree nodes, but it
> is exponential in terms of intermediate MapReduce output and computation.
>
> Dennis
>



-- 
blog: whydoeseverythingsuck.com
