Hi all,

After our discussion about which Hadoop release to use for the upcoming 
Nutch release, I decided to ask around on the Hadoop mailing list. The 
answer was clear: we should go with 0.12.1. See below:

Owen O'Malley wrote:
>
> On Mar 10, 2007, at 12:32 AM, Andrzej Bialecki wrote:
>
>>> I think the experience on big clusters at Yahoo! is that 0.12.1 
>>> should be more stable than 0.11.2, but others can confirm that.
>>
>> Hm.. That's not the impression I have from JIRA and the mailing list. 
>> My impression is that even though 0.12.1 is more robust in some 
>> situations, the significant changes (checksum filesystem, speculative 
>> execution, in memory sorting, improved map output handling, etc, etc) 
>> made between these releases introduced many subtle bugs which only 
>> now start coming into light.
>
> We never upgraded our main clusters to 11.2 because it never 
> stabilized to our satisfaction, which is why I was proposing an 11.3. 
> However, 12.1 is looking pretty good with the exception  of a couple 
> of bugs and we decided to hold out for 12.1. At this point, if I was 
> going to 11, I'd want a lot of the fixes that have been done in between.

The 0.12.x release has speculative execution turned on by default, but I 
remember that there were places in Nutch that would break when using 
PhasedFileSystem (which is what Hadoop uses when run in that mode). I'm 
afraid there might be other issues here as well - no one has tested Nutch 
with 0.12 to be sure that it works OK.

On the other hand, I only tested 0.11.2 in a limited production 
environment, so the bugs that Owen referred to may still be lurking there 
and only show up when you run larger (or different) jobs.

What do you think?

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


