Interesting!
> Excellent Julien. Excuse me for not picking this up from your blog.
>
> I took your comment a few weeks ago regarding 'large crawls' a bit too
> lightheartedly ;0)
>
> This puts a big smile on my face.
> Ta for now
> Lewis
>
> On Thu, Nov 17, 2011 at 5:39 PM, Julien Nioche <[email protected]> wrote:
> > We (DigitalPebble) managed the crawl for them and wrote the custom bits
> > they required. The problems they mentioned were more related to EC2 than
> > Hadoop as such. More on
> > http://digitalpebble.blogspot.com/2010/09/similarpages-is-out.html
> >
> > Jul
> >
> > On 17 November 2011 16:57, Lewis John Mcgibbney <[email protected]> wrote:
> >> Hi,
> >>
> >> Some more positives here.
> >>
> >> Lewis
> >>
> >> ---------- Forwarded message ----------
> >> From: Pietro Borradori <[email protected]>
> >> Date: Thu, Nov 17, 2011 at 4:46 PM
> >> Subject: Fw: Lewis John McGibbney sent a message via SimilarPages – A
> >> web discovery and search add-on
> >> To: "[email protected]" <[email protected]>
> >> Cc: Marco Laurita <[email protected]>
> >>
> >>
> >> Hi Lewis,
> >>
> >> Thanks for your email... I'm sorry to reply to you late...
> >> Nutch is a fundamental piece of SimilarPages architecture, because of
> >> its crawling features and for the solid base on which it is built that
> >> is Hadoop. Hadoop allows us to make all the computations on the crawled
> >> data, it is really a fantastic project! Hadoop gives us some headache
> >> sometimes when we need large clusters to perform the computation on the
> >> crawled data, especially when there are some instances with hardware
> >> failures where Hadoop is supposed to overcome such situations without
> >> problems. Marco co-founder/CTO of SimilarPages is at your disposal for
> >> any deeper insight re Nutch/Hadoop implementation if helpful.
> >>
> >> Here is the page of our site re Nutch/Hadoop
> >>
> >> http://www.similarpages.com/web/index.php?option=com_content&view=article&id=8&Itemid=20
> >>
> >> We liked Nutch/Hadoop projects in our 2 official FB pages:
> >> http://www.facebook.com/pages/SimilarPagescom/303352486359786?sk=wall
> >>
> >> http://www.facebook.com/pages/SimilarPages-A-web-discovery-and-search-addon/149182788451193
> >>
> >> A take-a-tour video here...
> >>
> >> http://www.similarpages.com/web/index.php?option=com_content&view=article&id=15&Itemid=4
> >>
> >> You can follow me on twitter @MrCappuccini
> >>
> >> We've finally released the beta of the SimilarPages search engine!!
> >> Check it out at www.similarpages.com and let us know what you think!!
> >>
> >> my best
> >> Pietro
> >>
> >> Pietro Borradori
> >> Founder & CEO
> >>
> >> [image: http://www.similarpages.com/images/Loghetto_posta.jpg]
> >>
> >> ------------------------------
> >
> > --
> > Open Source Solutions for Text Engineering
> >
> > http://digitalpebble.blogspot.com/
> > http://www.digitalpebble.com

