You're correct, it isn't a Nutch or Hadoop problem, just related :). I
was hoping there was some mathematical trick I wasn't aware of that
might reduce the intermediate permutations. Guess not.
Thanks for the reply.
Dennis
hank williams wrote:
AFAIK this is not a Nutch or Hadoop issue, but a basic math problem, which
is inherent in walking trees. You have to figure out how many levels of the
graph you are willing to walk, and live with the mathematical consequences.
The further away you walk, the more untenable the math gets, but for a few
levels it can be OK. But it is nothing but a brute force problem.
Hank
On Mon, Jul 14, 2008 at 11:57 AM, Dennis Kubes <[EMAIL PROTECTED]> wrote:
Does anybody know how to efficiently (non-exponentially) walk a web graph
to detect cycles? This would be very useful in identifying spammy webpages
and tight-knit communities.
I have a program that I will be releasing soon that does the detection
by converting a web graph into a tree and walking the tree nodes, but it
is exponential in terms of intermediate MapReduce output and computation.
Dennis
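
For readers following the thread, here is a minimal single-machine sketch of
the kind of depth-limited walk discussed above. It is not Dennis's program and
does not use MapReduce; the names find_cycles, graph, and max_depth are
hypothetical and chosen only for illustration. It unrolls the graph into a tree
of simple paths from one start page, reports every path that links back to the
start, and counts how many paths had to be expanded, which is the quantity that
blows up as the depth limit grows.

```python
def find_cycles(graph, start, max_depth):
    """Find simple cycles through `start` with at most `max_depth` edges.

    `graph` is an assumed adjacency mapping: page -> list of linked pages.
    Returns (cycles, paths_expanded), where each cycle is a tuple of pages
    beginning and ending at `start`.
    """
    cycles = []
    paths_expanded = 0
    stack = [(start,)]  # each entry is one root-to-node path in the unrolled tree
    while stack:
        path = stack.pop()
        paths_expanded += 1
        for nxt in graph.get(path[-1], ()):
            if nxt == start:
                # Link back to the start page closes a cycle.
                cycles.append(path + (start,))
            elif nxt not in path and len(path) < max_depth:
                # Extend the path only while a closing edge could still fit.
                stack.append(path + (nxt,))
    return cycles, paths_expanded


def complete_graph(n):
    """Worst-case toy input: every page links to every other page."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


if __name__ == "__main__":
    dense = complete_graph(6)
    for depth in range(1, 5):
        found, expanded = find_cycles(dense, 0, depth)
        print(depth, len(found), expanded)
```

On the densely linked toy graph the number of expanded paths multiplies at
every extra level, which matches Hank's point: a few levels are tolerable, but
the cost is inherent in enumerating paths, not in the framework running it.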