Interesting, looks very much like a service built here in the Netherlands.

On Tuesday 28 September 2010 13:56:27 Julien Nioche wrote:
> Dear Nutch Users,
>
> FYI I've blogged yesterday about an interesting use case of Nutch. We've
> helped the guys at SimilarPages to use Nutch on EC2 for a super large crawl
> (3 billion docs parsed), which they've then used with a bit of MapReduce
> magic to find similarities between web pages.
>
> I will probably add a Use Case section on the Wiki and write a short
> description of the project but in the meantime you can find more details on
> http://digitalpebble.blogspot.com/2010/09/similarpages-is-out.html and of
> course http://www.similarpages.com/ itself.
>
> Best,
>
> Julien Nioche
>

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

