We (DigitalPebble) managed the crawl for them and wrote the custom bits
they required. The problems they mentioned were more related to EC2 than
Hadoop as such. More on
http://digitalpebble.blogspot.com/2010/09/similarpages-is-out.html

Jul

On 17 November 2011 16:57, Lewis John Mcgibbney
<[email protected]> wrote:

> Hi,
>
> Some more positives here.
>
> Lewis
>
> ---------- Forwarded message ----------
> From: Pietro Borradori <[email protected]>
> Date: Thu, Nov 17, 2011 at 4:46 PM
> Subject: Fw: Lewis John McGibbney sent a message via SimilarPages – A web
> discovery and search add-on
> To: "[email protected]" <[email protected]>
> Cc: Marco Laurita <[email protected]>
>
>
> Hi Lewis,
>
> Thanks for your email... I'm sorry for the late reply...
> Nutch is a fundamental piece of the SimilarPages architecture, because of its
> crawling features and the solid base on which it is built, namely
> Hadoop. Hadoop allows us to run all the computations on the crawled data;
> it is really a fantastic project!  Hadoop sometimes gives us headaches
> when we need large clusters to perform the computation on the crawled data,
> especially when some instances have hardware failures, situations that
> Hadoop is supposed to overcome without problems. Marco,
> co-founder/CTO of SimilarPages, is at your disposal for any deeper insight re
> Nutch/Hadoop implementation if helpful.
>
> Here is the page of our site re Nutch/Hadoop
>
> http://www.similarpages.com/web/index.php?option=com_content&view=article&id=8&Itemid=20
>
> We liked the Nutch/Hadoop projects on our 2 official FB pages:
> http://www.facebook.com/pages/SimilarPagescom/303352486359786?sk=wall
>
> http://www.facebook.com/pages/SimilarPages-A-web-discovery-and-search-addon/149182788451193
>
> A take-a-tour video is here...
>
> http://www.similarpages.com/web/index.php?option=com_content&view=article&id=15&Itemid=4
>
> You can follow me on twitter @MrCappuccini
>
> We've finally released the beta of the SimilarPages search engine!! Check
> it out at www.similarpages.com and let us know what you think!!
>
> my best
> Pietro
>
> Pietro Borradori
> Founder & CEO
>


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
