Ted Dunning wrote:
A fairly standard approach in a problem like this is for each of the crawlers to write multiple items to separate files (one per crawler). This can be done using a map-reduce structure by having the map input be items to crawl and then using the reduce mechanism to gather like items together. This structure can give you the desired output structure, especially if the reduces are careful about naming their outputs well.
Yes. That's what I was trying to say in my previous message, but perhaps I was not as clear.
Doug
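
For illustration, here is a minimal sketch of the structure Ted describes, written as a plain-Python simulation rather than against any real map-reduce framework. All names in it (fetch, map_crawl, shuffle, reduce_write, run) are hypothetical, the fetch is a stub that does no network access, and grouping "like items" by host is only an assumption about what the grouping key might be.

```python
import os
import tempfile
from collections import defaultdict
from urllib.parse import urlparse


def fetch(url):
    # Hypothetical stand-in for a real crawler fetch; no network access.
    return f"content of {url}"


def map_crawl(url):
    """Map step: crawl one input item and emit (key, value) pairs.

    Keying by host is an assumption; any key that identifies
    "like items" would work the same way.
    """
    host = urlparse(url).netloc
    yield host, (url, fetch(url))


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_write(key, values, out_dir):
    """Reduce step: write every item for one key to a file named after that key."""
    path = os.path.join(out_dir, f"{key}.txt")
    with open(path, "w", encoding="utf-8") as out:
        for url, content in sorted(values):
            out.write(f"{url}\t{content}\n")
    return path


def run(urls, out_dir=None):
    """Drive map, shuffle, and reduce in-process; return the output file paths."""
    if out_dir is None:
        out_dir = tempfile.mkdtemp()
    pairs = [pair for url in urls for pair in map_crawl(url)]
    groups = shuffle(pairs)
    return sorted(reduce_write(key, values, out_dir) for key, values in groups.items())
```

Calling run(["http://a.example/1", "http://b.example/x", "http://a.example/2"]) would write one file per host, with the two a.example items gathered into the same file. The point of the sketch is only that the output layout is decided by the reduce key and by how each reduce names its file.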
