Hi all,

After our discussion about which Hadoop release to use for the upcoming Nutch release, I decided to ask around on the Hadoop mailing list. The message was clear that we should go with 0.12.1 - see below:
Owen O'Malley wrote:
>
> On Mar 10, 2007, at 12:32 AM, Andrzej Bialecki wrote:
>
>>> I think the experience on big clusters at Yahoo! is that 0.12.1
>>> should be more stable than 0.11.2, but others can confirm that.
>>
>> Hm.. That's not the impression I have from JIRA and the mailing list.
>> My impression is that even though 0.12.1 is more robust in some
>> situations, the significant changes (checksum filesystem, speculative
>> execution, in memory sorting, improved map output handling, etc, etc)
>> made between these releases introduced many subtle bugs which only
>> now start coming into light.
>
> We never upgraded our main clusters to 11.2 because it never
> stabilized to our satisfaction, which is why I was proposing an 11.3.
> However, 12.1 is looking pretty good with the exception of a couple
> of bugs and we decided to hold out for 12.1. At this point, if I was
> going to 11, I'd want a lot of the fixes that have been done in between.

The 0.12.x release has speculative execution turned on by default, but I remember that there were places in Nutch that would break when using PhasedFileSystem (which is what Hadoop uses when run in that mode). I'm afraid there might be other issues here as well - no one has tested Nutch with 0.12 to be sure that it works OK.

On the other hand, I only tested 0.11.2 in a limited production environment, so the bugs that Owen referred to may still be lurking there and show up when you run larger jobs (or different jobs).

What do you think?

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com