On Thursday, May 22, 2014 10:17:42 PM Sylvain Gault wrote:
> Hello,
> 
> I'm new to this mailing list, so forgive me if I don't do everything
> right.
> 
> I didn't know whether I should ask on this mailing list or on
> mapreduce-dev or on yarn-dev. So I'll just start here. ^^
> 
> Short story: I'm looking for some paper(s) studying the scalability
> of Hadoop MapReduce, and I found them extremely difficult to find on
> Google Scholar. Do you have something worth citing in a PhD thesis?
> 
> Long story: I'm writing my PhD thesis about MapReduce and when I talk
> about Hadoop I'd like to say "how much it scales". Two years ago I heard
> some people say that "Yahoo! got it to scale up to 4000 nodes and plans
> to try on 6000 nodes" or something like that. I also heard that
> YARN/MRv2 should scale better, but I don't plan to talk much about
> YARN/MRv2. So I'd take anything I could cite as a reference in my
> manuscript. :)
Hello, Sylvain.
One of the reasons the Hadoop dev team began working on YARN was precisely
to get a more scalable and resource-efficient Hadoop system, so if you
actually want to talk about Hadoop scalability, you should talk about YARN
and MR2.

The announcement (a Yahoo! developer blog post) is here:
https://developer.yahoo.com/blogs/hadoop/next-generation-apache-hadoop-mapreduce-3061.html

and the related JIRA issues here:
https://issues.apache.org/jira/browse/MAPREDUCE-278
https://issues.apache.org/jira/browse/MAPREDUCE-279

You should talk with Arun C Murthy, Chief Architect at Hortonworks, about
all these topics. He could help you much more than I could.

-- 
Marcos Ortiz[1] (@marcosluis2186[2])
http://about.me/marcosortiz[3] 
> 
> 
> Best regards,
> Sylvain Gault

--------
[1] http://www.linkedin.com/in/mlortiz
[2] http://twitter.com/marcosluis2186
[3] http://about.me/marcosortiz
