Hi Doug,
The slides from my talk yesterday at OSCON give some hints on how to get started. We need a MapReduce tutorial. http://wiki.apache.org/nutch/Presentations
Can you explain what this means on page 20: "Scheduling is bottleneck, not disk, network or CPU"?
Thanks.
Stefan