Hi Michael,

The concerns are related to Gora, as discussed here:
https://www.quora.com/Compared-to-Nutch-2-x-why-does-Nutch-1-x-have-a-better-performance

I think you have also seen the comparison of Nutch 1.7 and Nutch 2.2.1:
http://digitalpebble.blogspot.com.tr/2013/09/nutch-fight-17-vs-221.html

However, Gora is getting better. For example, the problem mentioned in that blog post has been solved:
https://issues.apache.org/jira/browse/GORA-119

I have used Nutch 2.x for large-scale crawling and everything was fine. However, the servers had much more than 2 GB of memory. Since your memory is very limited, I think you should run a test and try it yourself.

Kind Regards,
Furkan KAMACI

On Sun, Oct 30, 2016 at 7:19 PM, Michael Coffey <[email protected]> wrote:

> Newbie question: I am trying to decide between Nutch 1.x or 2.x. The
> application is to crawl a large portion of the www using a massive number
> (thousands) of small machines (<= 2GB RAM each). I like the idea of the
> simpler architecture and pluggable storage backend of 2.x. However, I am
> concerned about things I've read about 2.x being less stable and possibly
> less efficient than 1.x. Are these concerns valid at this time?

