Hi all,
I'm a newbie with Nutch 9,
and I've run into a problem while trying to change the Nutch indexer as follows.
I have Nutch crawl the web sites with a depth of 2,
and I want the indexer to handle each page according to the depth at which it was crawled,
that is:
----------------------------------------------------------------------
if depth == 1
    just store this page to a path without any change
if depth == 2
    index this page
----------------------------------------------------------------------
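
To make the rule above concrete, here is a rough sketch in plain Java. It is only an illustration of the decision I want the indexer to make: `DepthRouter`, `Action` and `actionFor` are made-up names and not part of the Nutch or Hadoop API, and I am assuming the crawl depth of each page is somehow available as an int.

```java
// Hypothetical helper, NOT Nutch API: decides what to do with a page
// based on the depth at which it was crawled.
class DepthRouter {

    // The possible outcomes for a fetched page.
    enum Action { STORE_RAW, INDEX, SKIP }

    // depth 1 -> store the page unchanged, depth 2 -> index it.
    // Any other depth is skipped (an assumption; my crawl only goes to depth 2).
    static Action actionFor(int depth) {
        switch (depth) {
            case 1:  return Action.STORE_RAW;
            case 2:  return Action.INDEX;
            default: return Action.SKIP;
        }
    }

    public static void main(String[] args) {
        // Show the decision for a few example depths.
        for (int depth = 1; depth <= 3; depth++) {
            System.out.println("depth " + depth + " -> " + actionFor(depth));
        }
    }
}
```

The open question is where such a check could be hooked into the real indexing job.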
The relation between pages of depth 1 and depth 2 is as follows:
when we query, the search should run against the index of the depth-2 pages,
but the result returned first should be the depth-1 page that links to the matching depth-2 page.
Did I make that clear?
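
In case it helps, the query behaviour I have in mind could be sketched like this in plain Java. Again, this is only an illustration under my own assumptions: `ParentResolver`, `resolve` and the `parentOf` map (depth-2 URL to the depth-1 URL that links to it) are made-up names, not Nutch API, and the URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT Nutch API: turns hits on depth-2 pages
// into the depth-1 pages that link to them.
class ParentResolver {

    // depth2Hits: URLs of depth-2 pages that matched the query, best first.
    // parentOf:   depth-2 URL -> depth-1 URL that links to it (assumed to be
    //             recorded during the crawl).
    // Returns the depth-1 URLs in hit order, without duplicates.
    static List<String> resolve(List<String> depth2Hits, Map<String, String> parentOf) {
        Set<String> parents = new LinkedHashSet<>();
        for (String hit : depth2Hits) {
            String parent = parentOf.get(hit);
            if (parent != null) {
                parents.add(parent);
            }
        }
        return new ArrayList<>(parents);
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("http://example.com/a/1", "http://example.com/a");
        parentOf.put("http://example.com/a/2", "http://example.com/a");
        parentOf.put("http://example.com/b/1", "http://example.com/b");

        List<String> hits = Arrays.asList(
                "http://example.com/b/1",
                "http://example.com/a/2",
                "http://example.com/a/1");

        // The user sees the depth-1 pages, ordered by their best depth-2 hit.
        System.out.println(resolve(hits, parentOf));
    }
}
```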
I've tried to rewrite the indexer, but in vain: as far as I can tell, all the key
operations are implemented in Hadoop, whose source code I can't see.
Any advice will be greatly appreciated.
