Re: Nutch and EC2

2010-04-12 Thread Kevin Conor
My experience on EC2 has been that the RAM and disk space are overkill, while the computing speed is lacking. I had been running my crawler on a 1GB Slicehost slice, and when I moved it over to a medium high-CPU instance on EC2 (~2x the cost), the generate and update steps took 50% longer. Right

Re: Nutch and EC2

2010-04-12 Thread Stefano Cherchi
Hi Yves, I'm going to start some tests of Nutch+Solr on EC2 in a couple of days, so I will be able to give you some feedback on it soon. I'm actually a little concerned about computing speed, rather than RAM or disk space, because I've experienced a consistent lack of performance in cpu-intens

Re: Nutch and EC2

2010-04-10 Thread Ken Krugler
Hi Yves, On Apr 9, 2010, at 7:49am, Yves Petinot wrote: Hi, I'm currently contemplating migrating my crawler cluster to EC2, and while this appears very tempting (infinite number of nodes), I've read about some potential limitations in terms of the number of map/red tasks that can effecti