Hi, I have been running Nutch in local mode, and so far I have a good understanding of how it all works.
I would now like to start distributed crawling using a public cloud provider, and I wanted to know if fellow users have any experience in setting up Nutch for distributed crawling. From the Nutch wiki I have some idea of what the hardware requirements should be. What I would like to know is which public cloud providers (IaaS or PaaS) are good for setting up Hadoop clusters: basically, ones on which the cluster is easy to set up and manage, and which are easy on the budget.

Please let me know if you folks have any insights based on your experiences.

Thanks and Regards,
Sachin