I have followed the tutorial at media-style.com and now have a mapred installation of Nutch working. Thanks Stefan :)

My question now is what the correct steps are to continuously fetch and index. I have read some people talking about mergesegs and updatedb, but Stefan's tutorial doesn't list these as steps. If you want to keep fetching more and more levels from your crawldb and update your index accordingly, what is the correct method for doing so?

Currently I am doing this, in order:

generate
fetch
invertlinks
index
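To make the question concrete, here is a rough sketch of one round of that cycle as a small Python helper. It only builds the command lines and does not run anything. The `bin/nutch` launcher path, the `crawl/...` directories and the `SEGMENT` name are placeholders I made up for illustration, and the argument order is how I understand the command line, so please correct me if it differs in your version. The optional `updatedb` step is not something I run today; I have put it after `fetch` because that is where I assume it belongs from what others have described.

```python
from typing import List

NUTCH = "bin/nutch"  # assumed launcher path, relative to the Nutch install dir


def crawl_round(crawldb: str, segments_dir: str, segment: str,
                linkdb: str, indexes: str,
                with_updatedb: bool = False) -> List[List[str]]:
    """Build (but do not run) the command lines for one crawl round.

    `segment` is the directory that `generate` creates under
    `segments_dir`; here the caller passes it in as a placeholder.
    """
    cmds = [
        [NUTCH, "generate", crawldb, segments_dir],
        [NUTCH, "fetch", segment],
    ]
    if with_updatedb:
        # Assumption: updatedb feeds the fetched segment back into the
        # crawldb so the next generate can pick up newly found links.
        cmds.append([NUTCH, "updatedb", crawldb, segment])
    cmds.append([NUTCH, "invertlinks", linkdb, segment])
    cmds.append([NUTCH, "index", indexes, crawldb, linkdb, segment])
    return cmds


if __name__ == "__main__":
    # Print the round so it can be compared against what others run.
    for cmd in crawl_round("crawl/crawldb", "crawl/segments",
                           "crawl/segments/SEGMENT", "crawl/linkdb",
                           "crawl/indexes", with_updatedb=True):
        print(" ".join(cmd))
```

With `with_updatedb=False` this is exactly the four steps I listed; with it set to `True` it is the variant I suspect I should be running.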
The only problem I am having is that I can't seem to get any pages beyond the index pages of the root domains I injected. I feel like I am missing some important steps. Any input is appreciated.

Mike
