Re: [ANNOUNCE] New Hadoop Committer - Chris Trezzo
Congratulations, Chris. On Mon, Apr 24, 2017 at 12:42 PM, Sangjin Lee wrote: > It is my pleasure to announce that Chris Trezzo has been elected as a > committer on the Apache Hadoop project. We appreciate his contributions to > Hadoop thus far, and look forward to more. > > Please join me in congratulating Chris! > > Regards, > Sangjin on behalf of the Apache Hadoop PMC >
Re: [ANNOUNCE] New Hadoop Committer - Varun Saxena
Congratulations Varun! On Fri, Jun 24, 2016 at 11:57 AM, Li Lu wrote: > Congrats Varun! > > Li Lu > > > On Jun 24, 2016, at 11:51, 俊平堵 wrote: > > > > On behalf of the Apache Hadoop PMC, I am pleased to announce that Varun > > Saxena > > has been elected a committer on the Apache Hadoop project. We appreciate > > all > > of Varun's hard work thus far, and we look forward to his continued > > contributions. > > > > Congratulations, Varun! > > > > > > Cheers, > > > > Junping >
Re: [ANNOUNCE] New Hadoop Committer - Larry McCay
Congrats, Larry. > On Apr 12, 2016, at 11:33 PM, John Zhuge wrote: > > Congratulations Larry! > > John Zhuge > Software Engineer, Cloudera > >> On Tue, Apr 12, 2016 at 11:31 PM, Xiao Chen wrote: >> >> Congrats Larry! >> >> -Xiao >> >>> On Tue, Apr 12, 2016 at 11:30 PM, Lei Xu wrote: >>> >>> Congratulations! >>> On Tue, Apr 12, 2016 at 11:28 PM, Gera Shegalov wrote: Congrats and Welcome, Larry! On Wed, Apr 13, 2016 at 7:57 AM Chris Nauroth < >> cnaur...@hortonworks.com> wrote: > On behalf of the Apache Hadoop PMC, I am pleased to announce that >> Larry > McCay has been elected a committer on the Apache Hadoop project. We > appreciate all of Larry's hard work thus far, and we look > forward to his continued contributions. > > Congratulations, Larry! > > > --Chris Nauroth >>> >>> >>> >>> -- >>> Lei (Eddy) Xu >>> Software Engineer, Cloudera >>
Re: [ANNOUNCE] New Apache Hadoop Committer : Kai Zheng
Congrats, Kai. > On Apr 13, 2016, at 12:35 AM, Rakesh Radhakrishnan wrote: > > Congratulations, Kai! > > Rakesh > >> On Wed, Apr 13, 2016 at 1:00 PM, Zhihai Xu wrote: >> >> Congrats Kai! >> >> zhihai >> >> On Wed, Apr 13, 2016 at 12:19 AM, Varun Saxena >> wrote: >> >>> Congrats Kai ! >>> On Wed, 13 Apr 2016 at 12:45, Uma gangumalla >>> wrote: Hi All, On behalf of the Apache Hadoop PMC, I am pleased to announce that Kai Zheng has been elected as a committer in the Apache Hadoop project. We appreciate all the work Kai Zheng has put into the project so far, and look forward to his future contributions. Welcome aboard, Kai Zheng. Congratulations! Regards, Uma Maheswara Rao Gangumalla On behalf of the Apache Hadoop PMC >>
Re: [ANNOUNCE] New Hadoop PMC Member - Sangjin Lee
Congratulations, Sangjin. > On Apr 12, 2016, at 11:10 PM, Zhihai Xu wrote: > > Congrats Sangjin! > > zhihai > > On Tue, Apr 12, 2016 at 11:05 PM, Rohith Sharma K S < > rohithsharm...@huawei.com> wrote: > >> Congrats Sangjin :-) >> >> >> -Original Message- >> From: Chris Nauroth [mailto:cnaur...@hortonworks.com] >> Sent: 13 April 2016 11:34 >> To: general@hadoop.apache.org >> Subject: [ANNOUNCE] New Hadoop PMC Member - Sangjin Lee >> >> On behalf of the Apache Hadoop PMC, I am very pleased to announce that >> Sangjin Lee has been elected as a PMC Member on the Apache Hadoop project, >> recognizing his continued contributions to the project so far. >> >> Please join me in congratulating Sangjin! >> >> --Chris Nauroth >> >>
Re: [ANNOUNCE] New Hadoop Committer - Naganarasimha Garla
Congratulations, Naga. On Thu, Apr 7, 2016 at 11:00 AM, Varun Saxena wrote: > Congrats Naga ! > > - Varun. > > On Thu, Apr 7, 2016 at 11:29 PM, Wangda Tan wrote: > > > On behalf of the Apache Hadoop PMC, I am pleased to announce > > that Naganarasimha Garla has been elected a committer on the Apache > Hadoop > > project. We appreciate all of Naga's hard work thus far, and we look > > forward to his continued contributions. > > > > Welcome onboard and congratulations, Naga! > > > > Thanks, > > Wangda Tan > > >
Re: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC
Congratulations Yongjun. > On Feb 29, 2016, at 12:17 AM, Naganarasimha G R (Naga) wrote: > > Congrats Yongjun ! > > + Naga > > From: Tsuyoshi Ozawa [oz...@apache.org] > Sent: Monday, February 29, 2016 13:37 > To: general@hadoop.apache.org > Cc: Yongjun Zhang > Subject: Re: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC > > Congrats, Yongjun! > > - Tsuyoshi > >> On Mon, Feb 29, 2016 at 4:55 PM, Xiao Chen wrote: >> Congrats Yongjun! :) >> >> -Xiao >> >> On Sun, Feb 28, 2016 at 11:52 PM, Brahma Reddy Battula < >> brahmareddy.batt...@huawei.com> wrote: >> >>> Congrats Yongjun!! >>> >>> -Original Message- >>> From: a...@cloudera.com [mailto:a...@cloudera.com] On Behalf Of Aaron T. >>> Myers >>> Sent: 29 February 2016 15:50 >>> To: general@hadoop.apache.org; Yongjun Zhang >>> Subject: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC >>> >>> On behalf of the Apache Hadoop PMC, I am very pleased to announce that >>> Yongjun Zhang has been elected as a PMC member on the Apache Hadoop >>> project. This is in recognition of Yongjun's sustained and significant >>> contributions to the project, and we look forward to even more >>> contributions from him in the future. >>> >>> Please join me in congratulating Yongjun on this accomplishment. >>> >>> Great work, Yongjun. >>> >>> Best, >>> Aaron, on behalf of the Apache Hadoop PMC >>>
Re: [ANNOUNCE] New Hadoop Committer - Eric Payne
Congratulations. On Thu, Feb 11, 2016 at 7:55 AM, Rohith Sharma K S < rohithsharm...@apache.org> wrote: > Congratulations Eric!! :-) > On 11 Feb 2016 21:23, "Jason Lowe" wrote: > > > On behalf of the Apache Hadoop PMC, I am pleased to announce that Eric > > Payne has been elected a committer on the Apache Hadoop project. We > > appreciate all of Eric's hard work thus far, and we look forward to his > > continued contributions. > > > > Welcome and congratulations, Eric! > > > > Jason > > >
Re: [ANNOUNCE] New Hadoop Committer - Masatake Iwasaki
Congratulations. On Wed, Jan 27, 2016 at 6:43 PM, Xiao Chen wrote: > Congrats Masatake Iwasaki! Well deserved. > > -Xiao > > On Wed, Jan 27, 2016 at 6:38 PM, Yongjun Zhang > wrote: > > > Congratulations Masatake! > > > > --Yongjun > > > > On Wed, Jan 27, 2016 at 6:37 PM, Tsuyoshi OZAWA < > > ozawa.tsuyo...@lab.ntt.co.jp> wrote: > > > > > On behalf of the Apache Hadoop PMC, I am pleased to announce that > > > Masatake Iwasaki has been elected as a committer on the Apache Hadoop > > > project. We appreciate all of Masatake's hard work thus far, and we > > > look forward to his continued contributions. > > > > > > Welcome Masatake! > > > > > > Regards, > > > - Tsuyoshi > > > > > >
Re: [ANNOUNCE] Additions to Apache Hadoop PMC - Akira, Robert, Tsuyoshi and Wangda
Congratulations! > On Jan 12, 2016, at 5:08 PM, Karthik Kambatla wrote: > > On behalf of the Apache Hadoop PMC, I am very pleased to announce that the > following folks have been elected as PMC members on the Apache Hadoop > project, recognizing their sustained and significant contributions to the > project: > > - Akira Ajisaka > - Robert Kanter > - Tsuyoshi Ozawa > - Wangda Tan > > Please join me in congratulating them. > > Cheers! > Karthik > (on behalf of the Hadoop PMC)
Re: [ANNOUNCE] New Hadoop Committer - Walter Su
Congrats, Walter. > On Oct 28, 2015, at 5:16 PM, Vinayakumar B wrote: > > Congrats Walter! > > Welcome aboard. > > Regards, > Vinay > >> On Thu, Oct 29, 2015 at 5:45 AM, Haohui Mai wrote: >> >> On behalf of the Apache Hadoop PMC, I am very pleased to announce that >> Walter Su has been elected a committer on the Apache Hadoop project, >> recognizing his continued contributions to the project. >> >> Walter is already a branch committer on HDFS-7285 and can hit the ground >> running :) >> >> Welcome Walter! >> >> Cheers! >>
Re: [ANNOUNCE] New Apache Hadoop committer Zhe Zhang
Congrats, Zhe > On Oct 19, 2015, at 1:56 PM, Andrew Wang wrote: > > Hi all, > > It is my pleasure to welcome Zhe Zhang as an Apache Hadoop committer. Zhe > has been working on the project for over a year now, notably on the HDFS > erasure coding feature (HDFS-7285) as well as other bug fixes and > improvements. > > Please join me in congratulating Zhe! > > Best, > Andrew
Re: [ANNOUNCE] New Hadoop Committer - Anubhav Dhoot
Congrats Anubhav > On Sep 20, 2015, at 9:56 PM, Karthik Kambatla wrote: > > On behalf of the Apache Hadoop PMC, I am pleased to announce that Anubhav > Dhoot has been elected a committer on the Apache Hadoop project, recognizing > his contributions to the project. > > We appreciate Anubhav's hard work thus far, and look forward to his > contributions. > > Welcome Anubhav! > > Cheers!
Re: [ANNOUNCE] Welcoming Devaraj K to Apache Hadoop PMC
Congrats, Devaraj. On Wed, Aug 5, 2015 at 5:16 PM, Brahma Reddy Battula brahmareddy.batt...@hotmail.com wrote: Congratulations Devaraj! Date: Wed, 5 Aug 2015 16:56:47 -0700 Subject: [ANNOUNCE] Welcoming Devaraj K to Apache Hadoop PMC From: umamah...@apache.org To: general@hadoop.apache.org Hi, On behalf of the Apache Hadoop Project Management Committee (PMC), it gives me great pleasure to announce that Devaraj K has recently been welcomed as a member of the Apache Hadoop PMC. He has done a lot of good work for Hadoop, and we look forward to his greater achievements in the future! Please join me in welcoming Deva to the Hadoop PMC! Congratulations Deva! Regards, Uma (On behalf of the Apache Hadoop Project Management Committee)
Re: [ANNOUNCE] New Hadoop committer - Zhihai Xu
Congratulations, Zhihai. On Thu, Jul 30, 2015 at 12:49 PM, Wangda Tan wheele...@gmail.com wrote: Congrats! Regards, Wangda On Thu, Jul 30, 2015 at 12:36 PM, Karthik Kambatla ka...@cloudera.com wrote: On behalf of the Apache Hadoop PMC, I am pleased to announce that Zhihai Xu has been elected a committer on the Apache Hadoop project, recognizing his prolific work in the last year or so. We appreciate Zhihai's hard work thus far, and look forward to his contributions. Welcome Zhihai! Cheers! Karthik
Re: environment setup
Putting general@ to bcc. If you plan to ask about Apache Hadoop setup issues, consider using user@. If you are installing a distro from some vendor, please use the vendor-specific mailing list. Cheers 2015-07-26 13:13 GMT-07:00 Serkan Taş serkan@likyateknoloji.com: Hi, Which mailing list best fits questions about Hadoop dev environment setup problems? Thanks *Serkan Taş* Mobil : +90 532 250 07 71 Likya Bilgi Teknolojileri ve İletişim Hiz. Ltd. Şti. www.likyateknoloji.com
Re: [ANNOUNCE] Welcoming Vinayakumar B to Apache Hadoop PMC
Congrats, Vinay. On Wed, Jul 8, 2015 at 11:10 AM, Uma gangumalla umamah...@apache.org wrote: Hi, On behalf of the Apache Hadoop Project Management Committee (PMC), it gives me great pleasure to announce that Vinayakumar B has recently been welcomed as a member of the Apache Hadoop PMC. He has done a lot of good work for Hadoop, and we look forward to his greater achievements in the future! Please join me in welcoming Vinay to the Hadoop PMC! Congratulations Vinay! Regards, Uma (On behalf of the Apache Hadoop Project Management Committee)
Re: [ANNOUNCE] Welcoming Xuan Gong to Apache Hadoop PMC
Congrats, Xuan. On Tue, Jul 7, 2015 at 12:49 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, It gives me great pleasure to announce that Xuan Gong has been welcomed as a member of the Apache Hadoop PMC. We appreciate all his work in the project so far and look forward to greater achievements in the future! Please join me in welcoming Xuan to the Hadoop PMC! Congratulations Xuan! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Rohith Sharma K S
Congrats Rohith! On Tue, Jul 7, 2015 at 1:50 PM, Zhijie Shen zs...@hortonworks.com wrote: Congrats! From: Sangjin Lee sjl...@gmail.com Sent: Tuesday, July 07, 2015 1:48 PM To: general@hadoop.apache.org Subject: Re: [ANNOUNCE] New Hadoop Committer - Rohith Sharma K S Congratulations Rohith! On Tue, Jul 7, 2015 at 1:29 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Welcome, Rohith! --Chris Nauroth On 7/7/15, 1:24 PM, Jian He j...@hortonworks.com wrote: On behalf of the Apache Hadoop PMC, I am pleased to announce that Rohith Sharma K S has been elected as a committer on the Apache Hadoop project. We appreciate all of Rohith's hard work thus far, and we look forward to his continued contributions. Thanks, Jian He
Re: [ANNOUNCE] New Hadoop committer - Lei (Eddy) Xu
Congratulations, Eddy. On Jun 19, 2015, at 8:43 PM, Esteban Gutierrez este...@cloudera.com wrote: Congratulations Eddy! -- Cloudera, Inc. On Fri, Jun 19, 2015 at 3:26 AM, Joep Rottinghuis jrottingh...@gmail.com wrote: Congrats Eddy ! Joep Sent from my iPhone On Jun 18, 2015, at 9:20 PM, Rohith Sharma K S rohithsharm...@huawei.com wrote: Congratulations :-) Thanks Regards Rohith Sharma K S -Original Message- From: Andrew Wang [mailto:andrew.w...@cloudera.com] Sent: 16 June 2015 03:21 To: general@hadoop.apache.org Subject: [ANNOUNCE] New Hadoop committer - Lei (Eddy) Xu Hello all, It is my pleasure to announce that Lei Xu, also known as Eddy, has accepted the Apache Hadoop PMC's invitation to become a committer. We appreciate all of Eddy's hard work thus far, and look forward to his continued contributions. Welcome and congratulations, Eddy! Best, Andrew, on behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Ming Ma
Congratulations, Ming. On Jun 19, 2015, at 8:25 PM, Esteban Gutierrez este...@cloudera.com wrote: Congratulations Ming Ma! -- Cloudera, Inc. On Fri, Jun 19, 2015 at 2:07 PM, Masatake Iwasaki iwasak...@oss.nttdata.co.jp wrote: Congratulations, Ming Ma! On 6/18/15 12:55, Chris Nauroth wrote: On behalf of the Apache Hadoop PMC, I am pleased to announce that Ming Ma has been elected as a committer on the Apache Hadoop project. We appreciate all of Ming's hard work thus far, and we look forward to his continued contributions. Welcome, Ming! --Chris Nauroth
Re: [ANNOUNCE] New Hadoop committer - Varun Vasudev
Congratulations Varun On Jun 17, 2015, at 3:36 PM, Devaraj K deva...@apache.org wrote: Congrats Varun... Thanks Devaraj On Tue, Jun 16, 2015 at 10:24 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, It gives me great pleasure to announce that the Apache Hadoop PMC recently invited Varun Vasudev to become a committer in the project, to which he accepted. We deeply appreciate his efforts in the project so far, specifically in the areas of YARN and MapReduce. Here’s looking forward to his continued contributions going into the future! Welcome aboard and congratulations, Varun! +Vinod On behalf of the Apache Hadoop PMC -- Thanks Devaraj K
Re: [ANNOUNCE] New Hadoop committer - Arun Suresh
Congratulations, Arun. On Fri, Mar 27, 2015 at 9:36 AM, Sangjin Lee sjl...@gmail.com wrote: Congratulations! On Thu, Mar 26, 2015 at 6:29 PM, Sun, Dapeng dapeng@intel.com wrote: Congratulations Arun! Regards Dapeng -Original Message- From: Andrew Wang [mailto:andrew.w...@cloudera.com] Sent: Friday, March 27, 2015 6:39 AM To: general@hadoop.apache.org Subject: [ANNOUNCE] New Hadoop committer - Arun Suresh It is my pleasure to announce that Arun Suresh has accepted the Apache Hadoop PMC's invitation to become a committer on the Apache Hadoop project. We appreciate all of Arun's hard work thus far, and look forward to his continued contributions. Welcome, Arun!
Re: Patch review process
In some cases, the contributor responded to review comments and attached patches addressing them. Later on, there was simply no response to the latest patch - even with a follow-on ping. I wish this aspect could be improved. Cheers On Sun, Jan 25, 2015 at 6:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoopgene...@yahoo.com.invalid wrote: Hi contributors, I would like to (re)start a discussion regarding our patch review process. A similar discussion happened in the Hadoop private mailing list, which is inappropriate. Here is the problem: the Patch Available queues become longer and longer. It seems that we can never catch up. There are patches sitting in the queues for years. How could we speed up? Regards, Tsz-Wo
Re: [ANNOUNCE] Welcoming Jian He to Apache Hadoop PMC
Congratulations, Jian! On Mon, Jan 5, 2015 at 3:51 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, On behalf of the Apache Hadoop PMC, it gives me great pleasure to announce that Jian He has recently been welcomed as a member of the Apache Hadoop PMC. We appreciate all his work in the project so far and look forward to greater achievements in the future! Please join me in welcoming Jian to the Hadoop PMC! Congratulations Jian! Thanks, +Vinod
Re: [ANNOUNCE] Welcoming Zhijie Shen to Apache Hadoop PMC
Congratulations, Zhijie! On Jan 5, 2015, at 3:51 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, On behalf of the Apache Hadoop PMC, it gives me great pleasure to announce that Zhijie Shen has recently been welcomed as a member of the Apache Hadoop PMC. We appreciate all his work in the project so far and look forward to greater achievements in the future! Please join me in welcoming Zhijie to the Hadoop PMC! Congratulations Zhijie! Thanks, +Vinod
Re: [ANNOUNCE] New Hadoop Committer - Carlo Curino
Congratulations, Carlo! On Dec 5, 2014, at 12:43 PM, Chris Douglas cdoug...@apache.org wrote: On behalf of the Apache Hadoop PMC, I'm pleased to announce (albeit belatedly) that Carlo Curino has been elected as a committer on the project. Congratulations Carlo; thank you for all your contributions to Hadoop! -C
Re: [ANNOUNCE] New Hadoop Committers - Gera Shegalov and Robert Kanter
Congratulations, Gera and Robert! On Thu, Dec 4, 2014 at 1:43 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Congratulations! Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Dec 4, 2014 at 11:56 AM, Sandy Ryza sandy.r...@cloudera.com wrote: On behalf of the Apache Hadoop PMC, I am pleased to announce that Gera Shegalov and Robert Kanter have been elected as committers on the Apache Hadoop project. We appreciate all the work they have put into the project so far, and look forward to their future contributions. Welcome, Gera and Robert! -Sandy
Re: All mirrored download links from the Apache Hadoop site are broken
Looks like the following is working: http://apache.mesi.com.ar/hadoop/common/ FYI On Fri, Oct 31, 2014 at 10:55 AM, Andrew Purtell apurt...@apache.org wrote: http://hadoop.apache.org/releases.html#Download - http://www.apache.org/dyn/closer.cgi/hadoop/common/ Every single mirror link 404s. -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: Some questions for getting help
You can join the #hadoop chat room on IRC; there are many people on that channel. Cheers On May 18, 2014, at 11:31 PM, Pankti Majmudar pankti.majmu...@gmail.com wrote: Hi, I am a newbie to open source development and Hadoop. I was looking at the newbie issues on JIRA and would like to pick up one of the minor bugs. I referred to the 'How To Contribute' page on Apache and started with getting the source code and setting up the environment. Please let me know if there are any sources (IRC etc.) where I can ask for specific help and discuss fixing any issue/bug. Is the mailing list the only method for communicating or asking questions? Any pointers will be helpful. Thanks, Pankti
Re: I stopped receiving any email from this group!
See https://twitter.com/infrabot Cheers On Sat, May 10, 2014 at 5:19 PM, Peyman Mohajerian mohaj...@gmail.com wrote:
Re: [ANNOUNCE] New Hadoop Committer - Xuan Gong
Congratulations, Xuan. On Sun, Apr 13, 2014 at 5:49 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that the Apache Hadoop PMC voted in Xuan Gong as a committer in the Apache Hadoop project. His contributions to YARN and MapReduce have been outstanding! We appreciate all the work Xuan has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Xuan! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: hadoop eclipse plugin
I used the following commands in the root of the workspace: ant clean compile ant eclipse Eclipse project files are generated: -rw-r--r-- 1 tyu staff 7942 Jan 31 22:21 .classpath -rw-r--r-- 1 tyu staff 392 Jan 31 22:21 .project Cheers On Fri, Jan 31, 2014 at 9:43 PM, pavan pavan.danthul...@gmail.com wrote: Hi, I am very new to hadoop, exploring hadoop-1.2.1 on windows using (cygwin + eclipse europa). Can anyone guide/help me in finding the information on the eclipse plugin for the below configuration? OS : Windows (using Cygwin) Hadoop version : hadoop-1.2.1 Eclipse : Eclipse Europa 3.3.2 Thanks Pavan Kumar D
Re: [ANNOUNCE] New Hadoop Committer - Junping Du
Congrats, Junping. On Tue, Dec 10, 2013 at 3:07 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that Junping Du has been voted in as a committer in the Apache Hadoop project. His contributions to Hadoop have been outstanding! We appreciate all the work Junping has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Junping! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Omkar Vinit Joshi
Congratulations, Omkar. On Tue, Dec 10, 2013 at 3:08 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that Omkar Vinit Joshi has been voted in as a committer to the Apache Hadoop project. His contributions to YARN have been nothing short of outstanding! We appreciate all the work Omkar has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Omkar! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Mayank Bansal
Congrats, Mayank. On Tue, Dec 10, 2013 at 3:11 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that Mayank Bansal has been voted in as a committer in the Apache Hadoop project. A long-term Hadoop contributor, his contributions to YARN and MapReduce have been outstanding! We appreciate all the work Mayank has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Mayank! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Jian He
Congrats, Jian. On Tue, Dec 10, 2013 at 3:10 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that Jian He has been voted in as a committer to the Apache Hadoop project. His contributions to YARN have been nothing short of outstanding! We appreciate all the work Jian has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Jian! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Zhijie Shen
Congratulations, Zhijie. On Tue, Dec 10, 2013 at 3:10 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I am very pleased to announce that Zhijie Shen has been voted in as a committer in the Apache Hadoop project. His contributions to YARN and MapReduce have been outstanding! We appreciate all the work Zhijie has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Zhijie! Thanks, +Vinod On behalf of the Apache Hadoop PMC
Re: [ANNOUNCE] New Hadoop Committer - Roman Shaposhnik
Congratulations, Roman. On Oct 24, 2013, at 4:35 PM, Andrew Wang andrew.w...@cloudera.com wrote: Congrats Roman! On Thu, Oct 24, 2013 at 4:33 PM, Alejandro Abdelnur t...@cloudera.comwrote: On behalf of the Apache Hadoop PMC, I am pleased to announce that Roman Shaposhnik has been elected a committer in the Apache Hadoop project. We appreciate all the work Roman has put into the project so far, and look forward to his future contributions. Welcome, Roman! Cheers, -- Alejandro
Re: Unclear Hadoop 2.1X documentation
It might be easier if you start with some sandbox environment for your first setup. Search 'hadoop sandbox' in Google. Cheers On Sat, Sep 14, 2013 at 11:22 AM, Mahmoud Al-Ewiwi mew...@gmail.com wrote: Thanks Mr. Ted. What about 3 and 4 - where are hadoop-common and hadoop-hdfs? On Sat, Sep 14, 2013 at 9:09 PM, Ted Yu yuzhih...@gmail.com wrote: For #1, you can get the tar ball from http://www.apache.org/dyn/closer.cgi/hadoop/common/ e.g. http://www.motorlogy.com/apache/hadoop/common/hadoop-2.1.0-beta/ It is in maven too: http://mvnrepository.com/artifact/org.apache.hadoop/ For #2, see https://code.google.com/p/protobuf/ On Sat, Sep 14, 2013 at 10:54 AM, Mahmoud Al-Ewiwi mew...@gmail.com wrote: Hello, I'm new to Hadoop and I want to learn it in order to do a project. I've started reading the documentation at this site: http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-project-dist/hadoop-common/SingleCluster.html for setting up a single node, but I could not figure out a lot of things in this documentation. 1. "You should be able to obtain the MapReduce tarball from the release." I could not find this tarball; where is it? 2. "You will need protoc 2.5.0 installed." What is that? There is not even a link for it. 3. "Assuming you have installed hadoop-common/hadoop-hdfs." What are these, and why are you assuming that? I have just downloaded hadoop-2.1.0-beta http://ftp.itu.edu.tr/Mirror/Apache/hadoop/common/hadoop-2.1.0-beta/ and extracted it. 4. "and exported $HADOOP_COMMON_HOME/$HADOOP_HDFS_HOME" That is strange - where should these environment variables point? Lastly, as I see it, a first-step tutorial should give more details. Or am I looking in the wrong place?
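For questions 3 and 4: in 2.1.0-beta the common and HDFS modules ship inside the same tarball, so a minimal sketch of the exports could look like the following (the /opt path is an example, not from the thread):

    export HADOOP_COMMON_HOME=/opt/hadoop-2.1.0-beta
    export HADOOP_HDFS_HOME=/opt/hadoop-2.1.0-beta

Both variables can point at the same extracted directory.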
Re: Status of 2.0.4-alpha release
Good idea, Roman. On Tue, Apr 9, 2013 at 7:26 PM, Roman Shaposhnik r...@apache.org wrote: On Tue, Apr 9, 2013 at 7:03 PM, Arun C Murthy a...@hortonworks.com wrote: Thanks Xuan, Vinod and Cos. I'll spin the rc0 in the next 24hrs. Once the official RC is ready I'll pull the tarball into the Bigtop infra and will provide integration testing results. In fact, I think starting from 2.0.4-alpha we should probably get into the habit of integration testing the next branch every single night so that everybody can jump onto test failures, not just us Bigtop folks. Thanks, Roman.
Re: [DISCUSS] stabilizing Hadoop releases wrt. downstream
Thanks Bobby. HBase trunk can build upon 2.0 SNAPSHOT so that regressions can be detected early. On Tue, Mar 5, 2013 at 7:18 AM, Robert Evans ev...@yahoo-inc.com wrote: That is a great point. I have been meaning to set up the Jenkins build for branch-2 for a while, so I took the 10 mins and just did it. https://builds.apache.org/job/Hadoop-Common-2-Commit/ Don't let the name fool you, it publishes not just common, but HDFS, YARN, MR, and tools too. You should now have branch-2 SNAPSHOTS updated on each commit to branch-2. Feel free to bug me if you need more integration points. I am not an RE guy, but I can hack it to make things work :) --Bobby On 3/5/13 12:15 AM, Konstantin Boudnik c...@apache.org wrote: Arun, first of all, I don't think anyone is trying to put the blame on someone else. E.g. I had a similar experience with Oozie being broken because of certain released changes in the upstream. I am sure that most people in the BigTop community - especially those who share the committership privilege in BigTop and other upstream projects, including Hadoop - would be happy to help with the stabilization of the Hadoop base. The issue that a downstream integration project is likely to have is - for one - the absence of regularly published development artifacts. In the spirit of "it didn't happen if there's no picture", here are a couple of examples: - 2.0.2-SNAPSHOT artifacts weren't published at all; only release 2.0.2-alpha artifacts were - 2.0.3-SNAPSHOT artifacts weren't published until Feb 29, 2013 (it happened just once) So, technically speaking, unless an integration project is willing to build and maintain its own artifacts, it is impossible to do any preventive validation. Which brings me to my next question: how do you guys address "Integration is high on the list of *every* release"? Again, please don't get me wrong - I am not looking to lay blame on or corner anyone - I am really curious and would appreciate the input. Vinod: As you yourself noted later, the pain is part of the 'alpha' status of the release. We are targeting one of the immediate future releases to be a beta, so these troubles are really only short term. I don't really want to get into the discussion of what constitutes the alpha and how it has delayed the adoption of the Hadoop2 line. However, I want to point out that it is especially important for an alpha platform to work nicely with downstream consumers of said platform. For quite obvious reasons, I believe. I think there is a fundamental problem with the interaction of Bigtop with the downstream projects, if nothing else because BigTop is as downstream as it can get: BigTop essentially consumes all other component releases in order to produce a viable stack. Technicalities aside... We never formalized the process: will BigTop step in after an RC is up for vote, or before? As I see it, it's happening after the vote is up, so no wonder we are in this state. Bigtop essentially can give any component, including Hadoop - and better yet, the set of components - certain guarantees about compatibility and dependencies being included. Case in point: the commons libraries missing in the 1.0.1 release essentially prevented HBase from working properly. Shall we have a pre-notice to Bigtop so that it can step in before? The above is in contradiction with the earlier statement that "Integration is high on the list of *every* release". If BigTop isn't used for integration testing, then how is said integration testing performed?
Is it some sort of test-patch process, as Luke referred to earlier? And why does it leave room for integration issues to go uncaught? Again, I am genuinely interested to know. "...these short term pains. I'd rather have us swim through these now instead of supporting broken APIs and features in our beta, having seen this very thing happen with 1.*." I think you're mixing the point of integration with downstream and being in an alpha phase of development. The former isn't about supporting broken APIs - it is about being consistent and avoiding breaking the downstream applications without letting said applications accommodate the platform changes first. Changes in the API, after all, can be relatively easily traced by integration validation - this is the whole point of integration testing. And BigTop does the job better than anything around, simply because there's nothing else around to do it. If you stay in a shape-shifting alpha that doesn't integrate well for a very long time, you risk losing downstream customers' interest, because they might get tired of waiting until the next stable API is ready for them. Let's fix the way the release-related communication is happening across our projects so that we can all work together and make 2.X a success. This is a very good point indeed! Let's start a separate discussion thread
Re: [VOTE] Release hadoop-0.23.2-rc0
What are the issues fixed / features added in 0.23.2 compared to 0.23.1 ? Thanks On Thu, Mar 29, 2012 at 3:45 PM, Arun C Murthy a...@hortonworks.com wrote: I've created a release candidate for hadoop-0.23.2 that I would like to release. It is available at: http://people.apache.org/~acmurthy/hadoop-0.23.2-rc0/ The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Rename hadoop branches post hadoop-1.x
1, 2, 5 (non-binding) On Mon, Mar 19, 2012 at 5:13 PM, Arun C Murthy a...@hortonworks.com wrote: We've discussed several options: (1) Rename branch-0.22 to branch-2, rename branch-0.23 to branch-3. (2) Rename branch-0.23 to branch-3, keep branch-0.22 as-is i.e. leave a hole. (3) Rename branch-0.23 to branch-2, keep branch-0.22 as-is. (4) If security is fixed in branch-0.22 within a short time-frame i.e. 2 months then we get option 1, else we get option 2. Effectively postpone discussion by 2 months, start a timer now. (5) Do nothing, keep branch-0.22 and branch-0.23 as-is. Let's do a STV [1] to get reach consensus. Please vote by listing the options above in order of your preferences. thanks, Arun [1] http://en.wikipedia.org/wiki/Single_transferable_vote
HBase trunk on 0.23 Was: Update on hadoop-0.23
Minor correction to Todd's report: currently HBase TRUNK doesn't compile against 0.23 ( https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/42/console ): [ERROR] /home/hudson/hudson-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java:[35,38] cannot find symbol [ERROR] symbol : class FSConstants [ERROR] location: package org.apache.hadoop.hdfs.protocol [ERROR] Cheers On Tue, Sep 27, 2011 at 12:56 PM, Todd Lipcon t...@cloudera.com wrote: Hi all, Just an update from the HBase side: I've run some cluster tests on HDFS 0.23 (as of about a month ago) and it generally works well. Performance for some workloads is ~2x better due to HDFS-941, and can be improved a bit more if I finish HDFS-2080 in time. I did not do extensive failure testing (to stress the new append/sync code) but I do plan to do that in the coming months. HBase trunk can compile against 0.23 by using -Dhadoop23 on the maven build. Currently some 15 or so tests are failing - the following HBase JIRA tracks those issues: https://issues.apache.org/jira/browse/HBASE-4254 (these may be indicative of HDFS-side bugs) Any help there from the community would be appreciated! -Todd On Tue, Sep 27, 2011 at 12:24 PM, Roman Shaposhnik r...@apache.org wrote: Hi Arun! Thanks for the quick reply! I'm sorry if I had too many questions in my original email, but I can't find an answer to my integration tests question. Could you, please, share a URL with us where I can find out more about them? On Mon, Sep 26, 2011 at 11:20 PM, Arun C Murthy a...@hortonworks.com wrote: # We made changes to Pig - rather we got help from the Pig team, particularly Daniel. So, we plan to work through the rest of the stack - Hive, Oozie etc. very soon and we'll depend on updated releases from the individual projects. Do we have any kind of commitment from downstream projects as far as those updates are concerned? Are they targeting these changes as part of a point (patch) release of an already released version (like Pig 0.9.X for example) or will it be part of a brand new major release? Thanks, Roman. -- Todd Lipcon Software Engineer, Cloudera
Re: Welcoming Alejandro Abdelnur as a Hadoop Committer
Alejandro has been making contributions to HBase as well. Congratulations Alejandro ! On Mon, Sep 26, 2011 at 9:21 AM, Tom White t...@cloudera.com wrote: On behalf of the PMC, I am pleased to announce that Alejandro Abdelnur has been elected a committer in the Apache Hadoop Common, HDFS, and MapReduce projects. We appreciate all the work Alejandro has put into the project so far, and look forward to his future contributions. Welcome, Alejandro! Cheers, Tom
Re: Welcoming Harsh J as a Hadoop committer
Harsh definitely deserves this honor. Cheers On Thu, Sep 15, 2011 at 11:36 PM, Aaron T. Myers a...@cloudera.com wrote: Congratulations, Harsh! Very well-deserved. -- Aaron T. Myers Software Engineer, Cloudera
Re: Hadoop's Internationalization
Marcos: Which hadoop version(s) do you plan to work on? "Where can I find the sources of the docs?" In the latest TRUNK, you would find these directories: ./common/src/docs ./hdfs/src/c++/libhdfs/docs ./hdfs/src/docs ./mapreduce/src/docs Cheers On Sat, Jun 25, 2011 at 8:17 AM, Marcos Ortiz mlor...@uci.cu wrote: OK, Owen, where can I find the sources of the docs? Which is the format for the docs? DocBook, ReST, etc? On 6/25/2011 4:38 AM, Owen O'Malley wrote: On Fri, Jun 24, 2011 at 7:55 PM, Marcos Ortiz mlor...@uci.cu wrote: Regards to all the list. I'm looking for a proper way to work on the internationalization of Hadoop, but I don't know if this is a good project or if this is useful for the community. At least, I think that it would be very useful for many people who want to see the messages of the project in another language, for example, in Spanish. I think it would be very useful, but a very time-consuming project. I'd suggest that you start by translating the documentation for the release into another language first. -- Owen -- Marcos Luís Ortíz Valmaseda Software Engineer (UCI) http://marcosluis2186.posterous.com http://twitter.com/marcosluis2186
Re: hbase
MapReduce isn't required. For your query, Hive would be a better fit. On Sun, Apr 10, 2011 at 5:36 AM, Mag Gam magaw...@gmail.com wrote: Just curious, does hbase require mapreduce? Basically, I have several terabytes of data and I would like to query it in a SQL-like fashion. Was wondering if mapreduce was required.
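A sketch of the SQL-style access Hive provides over data already sitting in HDFS (the table, columns, and path are hypothetical):

    -- Define a table over existing HDFS files, then query it like SQL
    CREATE EXTERNAL TABLE events (ts STRING, user_id STRING, amount DOUBLE)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/data/events';

    SELECT user_id, SUM(amount) AS total FROM events GROUP BY user_id;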
Re: How to pass a parameter to map ?
"Can I pass a parameter to map when I configure a job?" You can utilize the Hadoop Configuration (JobConf). On Thu, Mar 17, 2011 at 1:15 PM, Alessandro Binhara binh...@gmail.com wrote: I need to select data in map... Can I pass a parameter to map when I configure a job? Other questions: I need to sort data and save it in many files, where the name of each file is a sort key? thanks ..
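A minimal sketch of that suggestion against the old (0.20) API; the property name is hypothetical:

    // At job submission time:
    JobConf conf = new JobConf(MyJob.class);
    conf.set("my.map.param", "someValue");

    // In the Mapper - MapReduceBase provides the configure() hook:
    public class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {
      private String param;

      public void configure(JobConf job) {
        param = job.get("my.map.param", "default"); // read the parameter once per task
      }
      // ... map() can now use param ...
    }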
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?
That's why I think we should go to 0.22 ASAP and get companies to build their new features on trunk against that. There was a thread in Nov - 'Caution using Hadoop 0.21'. It would be helpful to see the response to 0.22. Thanks for getting the discussion off the ground, St.Ack
Re: WARNING : There are about 1 missing blocks. Please check the log or run fsck.
Run fsck. On Thu, Aug 26, 2010 at 5:56 AM, vaibhav negi sssena...@gmail.com wrote: Hi, I am using hadoop version 0.20.2. I am getting this error message: WARNING : There are about 1 missing blocks. Please check the log or run fsck. What to do? Is there any problem in the cluster? Thanks and Best Regards Vaibhav Negi
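For reference, a sketch of the fsck invocations that apply here (0.20-era flags; the path is an example):

    hadoop fsck /                            # overall health report, including missing/corrupt block counts
    hadoop fsck / -files -blocks -locations  # per-file detail to find which file lost the block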
Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
This would imply a hadoop-0.20-security-append or hadoop-0.20-append-security release be created which contains both the security and append features. On Thu, Aug 26, 2010 at 4:22 PM, Arun C Murthy a...@yahoo-inc.com wrote: On Aug 26, 2010, at 12:08 PM, Stack wrote: On Mon, Aug 23, 2010 at 5:27 PM, Arun C Murthy a...@yahoo-inc.com wrote: In the interim I'd like to propose we push a hadoop-0.20-security release off the Yahoo! patchset (http://github.com/yahoo/hadoop-common). This will ensure the community benefits from all the work done at Yahoo! for over 12 months *now*, and ensures that we do not have to wait until hadoop-0.22 which has all of these patches. Sounds good to me. What will this release be called? hadoop-0.20.3-security? hadoop-0.20-security. I want hadoop-0.20-security to be a separate line, so as to not confuse people. Conceivably, one could imagine a Hadoop Security + Append release soon after. Well, it'd probably be better if we just did an append release first? A good few of us have been banging on the 0.20-append branch for a while now and it's for sure doing append better than 0.20 did (smile). I think these are orthogonal and both can run their own course. Arun
Re: Child processes on datanodes/task trackers
Use jps to find out the pid of the Child. Then use this to find out which job the Child belongs to: ps aux | grep <pid> On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote: Hi, I wanted to know why I see running Child processes on my datanodes even though there is no job running at that time. Are these left over from failed attempts? Is there anything I can do to keep these clean? Thanks, Deepika
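A sketch of those two steps (<pid> stands for the process id jps prints):

    jps                   # lists local JVMs with pids; task JVMs appear as "Child"
    ps aux | grep <pid>   # the command line shows which job the Child belongs to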
Re: Child processes on datanodes/task trackers
After you obtain the pid, you can use jstack to see what the Child process is doing. What hadoop version are you using? On Wed, Aug 25, 2010 at 7:28 PM, C J c.josh...@yahoo.com wrote: Thanks for your reply. Some of these child tasks belong to successful jobs. I am wondering why they are still hanging around for long-finished jobs. From: Ted Yu yuzhih...@gmail.com To: general@hadoop.apache.org Sent: Wed, August 25, 2010 4:17:38 PM Subject: Re: Child processes on datanodes/task trackers Use jps to find out the pid of the Child. Then use this to find out which job the Child belongs to: ps aux | grep <pid> On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote: Hi, I wanted to know why I see running Child processes on my datanodes even though there is no job running at that time. Are these left over from failed attempts? Is there anything I can do to keep these clean? Thanks, Deepika
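For example (the pid is illustrative):

    jstack 12345 > child-threads.txt   # capture a thread dump of the lingering Child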
Re: Child processes on datanodes/task trackers
I don't use ehcache. Did you forget to close the CacheManager at the end of your job, by any chance? On Wed, Aug 25, 2010 at 7:59 PM, C J c.josh...@yahoo.com wrote: Thanks Ted! I did a jstack and it seems there is an issue with ehcache that I am using in the mapper task. net.sf.ehcache.cachemana...@57ac3379 daemon prio=10 tid=0x59180800 nid=0x379e in Object.wait() [0x41506000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x2aaabb0b89a8 (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked 0x2aaabb0b89a8 (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) Locked ownable synchronizers: - None . . . The hadoop version I am using is 0.20.2. Thanks. From: Ted Yu yuzhih...@gmail.com To: general@hadoop.apache.org Sent: Wed, August 25, 2010 7:34:35 PM Subject: Re: Child processes on datanodes/task trackers After you obtain the pid, you can use jstack to see what the Child process is doing. What hadoop version are you using? On Wed, Aug 25, 2010 at 7:28 PM, C J c.josh...@yahoo.com wrote: Thanks for your reply. Some of these child tasks belong to successful jobs. I am wondering why they are still hanging around for long-finished jobs. From: Ted Yu yuzhih...@gmail.com To: general@hadoop.apache.org Sent: Wed, August 25, 2010 4:17:38 PM Subject: Re: Child processes on datanodes/task trackers Use jps to find out the pid of the Child. Then use this to find out which job the Child belongs to: ps aux | grep <pid> On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote: Hi, I wanted to know why I see running Child processes on my datanodes even though there is no job running at that time. Are these left over from failed attempts? Is there anything I can do to keep these clean? Thanks, Deepika
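The timer thread in that dump belongs to ehcache's CacheManager. A sketch of the suggested fix, assuming the old-API task holds the manager in a field named cacheManager (shutdown() is ehcache's call for releasing its background threads):

    public void close() throws IOException {
      // MapReduceBase's close() hook runs when the task finishes;
      // shutting down the CacheManager stops its non-daemon timer threads
      if (cacheManager != null) {
        cacheManager.shutdown();
      }
    }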
Re: Lazy initialization of Reducers
I don't find such a parameter in 0.20.2. Please create such a flag in your own class. On Wed, Jul 21, 2010 at 10:15 AM, Syed Wasti mdwa...@hotmail.com wrote: Hi, I read about this Reducer lazy initialization in a document found at the URL below. http://www.scribd.com/doc/23046928/Hadoop-Performance-Tuning It says: "In an M/R job, Reducers are initialized with Mappers at job initialization, but the reduce method is called in the reduce phase when all the maps have finished. So in large jobs where a Reducer loads data (100 MB for business logic) in-memory on initialization, performance can be increased by lazily initializing Reducers, i.e. loading data in the reduce method controlled by an initialize flag variable which assures that it is loaded only once. By lazily initializing Reducers which require memory (for business logic) on initialization, the number of maps can be increased." But I did not find any other resource which talks about Reducer lazy initialization. Does anyone have experience with this? If yes, how and where can I set this parameter to get it working? Thanks for the support. Regards Syed Wasti
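A minimal sketch of the flag-guarded pattern the document describes, against the old (0.20) API; loadLookupData() is a hypothetical stand-in for the business-logic load:

    public class LazyInitReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {
      private boolean initialized = false;
      private Map<String, String> lookup; // the large in-memory data

      public void reduce(Text key, Iterator<Text> values,
          OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        if (!initialized) {
          lookup = loadLookupData(); // hypothetical; runs once, on the first reduce() call
          initialized = true;
        }
        while (values.hasNext()) {
          output.collect(key, values.next()); // ... business logic using lookup ...
        }
      }

      private Map<String, String> loadLookupData() {
        return new HashMap<String, String>(); // hypothetical loader
      }
    }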
Re: Hadoop and XML
Interesting. The String class is able to handle this scenario: public String(byte[] data, String encoding) throws UnsupportedEncodingException { this(data, 0, data.length, encoding); } On Tue, Jul 20, 2010 at 6:01 AM, Jeff Bean jwfb...@cloudera.com wrote: I think the problem is here: String valueString = new String(valueText.getBytes(), "UTF-8"); Javadoc for Text says: getBytes() returns the raw bytes; however, only data up to getLength() is valid. So try getting the length, truncating the byte array at the value returned by getLength() and THEN converting it to a String. Jeff On Mon, Jul 19, 2010 at 9:08 AM, Ted Yu yuzhih...@gmail.com wrote: For your initial question on Text.set(): Text.setCapacity() allocates a new byte array. Since keepData is false, old data wouldn't be copied over. On Mon, Jul 19, 2010 at 8:01 AM, Peter Minearo peter.mine...@reardencommerce.com wrote: I am already using XmlInputFormat. The input into the Map phase is not the problem. The problem lies in between the Map and Reduce phases. BTW - the article is correct. DO NOT USE StreamXmlRecordReader. XmlInputFormat is a lot faster. From my testing, StreamXmlRecordReader took 8 minutes to read a 1 GB XML document; whereas XmlInputFormat was under 2 minutes. (Using 2-core, 8GB machines) -Original Message- From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Friday, July 16, 2010 9:44 PM To: general@hadoop.apache.org Subject: Re: Hadoop and XML From an earlier post: http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo peter.mine...@reardencommerce.com wrote: Moving the variable to a local variable did not seem to work: public void map(Object key, Object value, OutputCollector output, Reporter reporter) throws IOException { Text valueText = (Text)value; String valueString = new String(valueText.getBytes(), "UTF-8"); String keyString = getXmlKey(valueString); Text returnKeyText = new Text(); Text returnValueText = new Text(); returnKeyText.set(keyString); returnValueText.set(valueString); output.collect(returnKeyText, returnValueText); } -Original Message- From: Peter Minearo [mailto:peter.mine...@reardencommerce.com] Sent: Fri 7/16/2010 2:51 PM To: general@hadoop.apache.org Subject: RE: Hadoop and XML Whoops... right after I sent it, someone else made a suggestion; I realized what question 2 was about. I can try that, but wouldn't that cause Object bloat? During the Hadoop training I went through, it was mentioned to reuse the returned Key and Value objects to keep the number of Objects created down to a minimum. Is this not really a valid point? -Original Message- From: Peter Minearo [mailto:peter.mine...@reardencommerce.com] Sent: Friday, July 16, 2010 2:44 PM To: general@hadoop.apache.org Subject: RE: Hadoop and XML I am not using multi-threaded Map tasks. Also, if I understand your second question correctly: "Also can you try creating the output key and values in the map method (method local)?" In the first code snippet I am doing exactly that. Below is the class that runs the Job.
public class HadoopJobClient { private static final Log LOGGER = LogFactory.getLog(Prds.class.getName()); public static void main(String[] args) { JobConf conf = new JobConf(Prds.class); conf.set("xmlinput.start", "<PrivateRateSet>"); conf.set("xmlinput.end", "</PrivateRateSet>"); conf.setJobName("PRDS Parse"); conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(Text.class); conf.setMapperClass(PrdsMapper.class); conf.setReducerClass(PrdsReducer.class); conf.setInputFormat(XmlInputFormat.class); conf.setOutputFormat(TextOutputFormat.class); FileInputFormat.setInputPaths(conf, new Path(args[0])); FileOutputFormat.setOutputPath(conf, new Path(args[1])); // Run the job try { JobClient.runJob(conf); } catch (IOException e) { LOGGER.error(e.getMessage(), e
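A compact illustration of the fix Jeff describes above, using the thread's own variable names:

    // Wrong: Text reuses its backing buffer, so bytes past getLength() may be stale
    String bad = new String(valueText.getBytes(), "UTF-8");
    // Right: only the first getLength() bytes are valid
    String good = new String(valueText.getBytes(), 0, valueText.getLength(), "UTF-8");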
Re: Hadoop and XML
I also added Peter's comment to the JIRA I logged: https://issues.apache.org/jira/browse/HADOOP-6868 On Tue, Jul 20, 2010 at 9:38 AM, Ted Yu yuzhih...@gmail.com wrote: So the correct call should be: String valueString = new String(valueText.getBytes(), 0, valueText.getLength(), "UTF-8"); Cheers On Tue, Jul 20, 2010 at 9:23 AM, Jeff Bean jwfb...@cloudera.com wrote: data.length is the length of the byte array. Text.getLength() most likely returns a different value than getBytes().length. Hadoop reuses box class objects like Text, so what it's probably doing is writing over the byte array, lengthening it as necessary, and just updating a separate length attribute. Jeff On Tue, Jul 20, 2010 at 8:56 AM, Ted Yu yuzhih...@gmail.com wrote: Interesting. The String class is able to handle this scenario: public String(byte[] data, String encoding) throws UnsupportedEncodingException { this(data, 0, data.length, encoding); } On Tue, Jul 20, 2010 at 6:01 AM, Jeff Bean jwfb...@cloudera.com wrote: I think the problem is here: String valueString = new String(valueText.getBytes(), "UTF-8"); Javadoc for Text says: getBytes() returns the raw bytes; however, only data up to getLength() is valid. So try getting the length, truncating the byte array at the value returned by getLength() and THEN converting it to a String. Jeff On Mon, Jul 19, 2010 at 9:08 AM, Ted Yu yuzhih...@gmail.com wrote: For your initial question on Text.set(): Text.setCapacity() allocates a new byte array. Since keepData is false, old data wouldn't be copied over. On Mon, Jul 19, 2010 at 8:01 AM, Peter Minearo peter.mine...@reardencommerce.com wrote: I am already using XmlInputFormat. The input into the Map phase is not the problem. The problem lies in between the Map and Reduce phases. BTW - the article is correct. DO NOT USE StreamXmlRecordReader. XmlInputFormat is a lot faster. From my testing, StreamXmlRecordReader took 8 minutes to read a 1 GB XML document; whereas XmlInputFormat was under 2 minutes. (Using 2-core, 8GB machines) -Original Message- From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Friday, July 16, 2010 9:44 PM To: general@hadoop.apache.org Subject: Re: Hadoop and XML From an earlier post: http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo peter.mine...@reardencommerce.com wrote: Moving the variable to a local variable did not seem to work: public void map(Object key, Object value, OutputCollector output, Reporter reporter) throws IOException { Text valueText = (Text)value; String valueString = new String(valueText.getBytes(), "UTF-8"); String keyString = getXmlKey(valueString); Text returnKeyText = new Text(); Text returnValueText = new Text(); returnKeyText.set(keyString); returnValueText.set(valueString); output.collect(returnKeyText, returnValueText); } -Original Message- From: Peter Minearo [mailto:peter.mine...@reardencommerce.com] Sent: Fri 7/16/2010 2:51 PM To: general@hadoop.apache.org Subject: RE: Hadoop and XML Whoops... right after I sent it, someone else made a suggestion; I realized what question 2 was about. I can try that, but wouldn't that cause Object bloat? During the Hadoop training I went through, it was mentioned to reuse the returned Key and Value objects to keep the number of Objects created down to a minimum.
Is this not really a valid point? -Original Message- From: Peter Minearo [mailto:peter.mine...@reardencommerce.com] Sent: Friday, July 16, 2010 2:44 PM To: general@hadoop.apache.org Subject: RE: Hadoop and XML I am not using multi-threaded Map tasks. Also, if I understand your second question correctly: "Also can you try creating the output key and values in the map method (method local)?" In the first code snippet I am doing exactly that. Below is the class that runs the Job. public class HadoopJobClient { private static final Log LOGGER = LogFactory.getLog(Prds.class.getName()); public static void
Re: Exception in thread main java.lang.ClassNotFoundException in Hadoop
I assume you have used the jar command to confirm that org.myorg.WordCount is in wordcount.jar.

On Mon, Jul 12, 2010 at 10:59 PM, james isaac jamesisaac.d...@gmail.com wrote:

This is the command I used, and the prompt is pointing to the current directory where wordcount.jar resides. I have also set the path for HADOOP_HOME and added the bin directory of hadoop to the classpath.

current directory = /home/user/demo/wordcount
HADOOP_HOME=/home/user/hadoop_sws/hadoop-0.20.2

$ hadoop jar wordcount.jar org.myorg.WordCount /hdfs/data/input /hdfs/data/output

On Mon, Jul 12, 2010 at 9:14 PM, Ted Yu yuzhih...@gmail.com wrote:

Please give the command line you used. You should have specified the jar containing the WordCount class.

On Mon, Jul 12, 2010 at 5:15 AM, james isaac jamesisaac.d...@gmail.com wrote:

Hi,

I have just started my career with Hadoop. I tried to execute the example given in http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html as explained in the tutorial. I was not able to execute the program as described; I am getting the error shown below. I tried executing some other examples, and they show the same "ClassNotFoundException". I tried to find the solution on various websites but I could not find it. Please help me find the problem.

Exception in thread "main" java.lang.ClassNotFoundException: org.myorg.WordCount
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
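The check Ted describes is quick to run; assuming the directory layout above, something like:

$ cd /home/user/demo/wordcount
$ jar tf wordcount.jar | grep WordCount

If org/myorg/WordCount.class is not listed, the class was compiled into the wrong package directory or left out when the jar was packaged, which would produce exactly this ClassNotFoundException.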
problem starting cdh3b2 jobtracker
We installed cdh3b2 0.20.2+320 and saw a strange error in the jobtracker log:

2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001
2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2010-07-02 01:49:31,988 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: java.io.IOException: Cannot create toBeDeleted in /data1/mapred/local
    at org.apache.hadoop.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:85)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1688)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)
2010-07-02 01:49:32,990 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2010-07-02 01:49:32,991 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to sjc1-hadoop0.sjc1.ciq.com/10.201.8.204:9001: Address already in use
    at org.apache.hadoop.ipc.Server.bind(Server.java:198)
    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:261)
    at org.apache.hadoop.ipc.Server.<init>(Server.java:1043)
    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:492)
    at org.apache.hadoop.ipc.RPC.getServer(RPC.java:454)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1628)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
    at org.apache.hadoop.ipc.Server.bind(Server.java:196)
    ... 8 more
2010-07-02 01:49:32,992 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:

But 9001 wasn't in use:

[sjc1-hadoop0.sjc1:hadoop 25618] netstat -nta | grep 9001
[sjc1-hadoop0.sjc1:hadoop 25619] netstat -nta | grep 9000
tcp    0    0 10.201.8.204:9000    0.0.0.0:*             LISTEN
tcp    0    0 10.201.8.204:9000    10.201.8.214:4223     ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.212:49074    ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.206:11910    ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.210:62611    ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.213:1299     ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.205:9756     ESTABLISHED
tcp    0    0 10.201.8.204:9000    10.201.8.207:59207    ESTABLISHED

Here is the output from ifconfig:

bond0  Link encap:Ethernet  HWaddr 00:30:48:60:53:94
       inet addr:10.201.8.204  Bcast:10.201.8.255  Mask:255.255.255.0
       UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
       RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
       TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:0
       RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7 GiB)

eth0   Link encap:Ethernet  HWaddr 00:30:48:60:53:94
       UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
       RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
       TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7 GiB)
       Interrupt:161

eth1   Link encap:Ethernet  HWaddr 00:30:48:60:53:94
       UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
       Interrupt:169

Has anyone encountered a similar issue?
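A quick way to separate an OS-level port conflict from something inside the JobTracker is to attempt the bind directly; a minimal standalone sketch, using the host/port from the log above (adjust for your setup):

import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindCheck {
    public static void main(String[] args) throws Exception {
        // Attempt to bind the address the JobTracker reports as "Address already in use".
        ServerSocket ss = new ServerSocket();
        ss.bind(new InetSocketAddress("10.201.8.204", 9001));
        System.out.println("Bind succeeded: " + ss.getLocalSocketAddress());
        ss.close();
    }
}

If this bind succeeds while the JobTracker still fails, one possibility is a conflict inside the JobTracker process itself, e.g. a retry binding the RPC port a second time after the earlier "Cannot create toBeDeleted" failure.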
checksum error
Hi,

We currently use hadoop 0.20.1. I found the following trace in our log:

2010-05-17 16:19:55,389 INFO [FSInputChecker] Found checksum error: b[0, 512]=53455106246f72672e6170616368652e6861646f6f702e696f2e426f6f6c65616e5772697461626c652b636f6d2e6361727269657269712e6d326d2e636f6d6d6f6e732e4f746155706c6f61645772697461626c6501002a6f72672e6170616368652e6861646f6f702e696f2e636f6d70726573732e44656661756c74436f646563aa555ae318451e8e76c0230f56f74d1f1466000101789cdddb09785355da00e0ef26699bae49ba2f94de2ed436857ab3364194b414294a355296e29aa449079025968ae82f78cb22051c088202e2125040d9acb88bc36410441d9dbf50dc1e06c57143749c8e0e0c7bff73ee92a6e9a7f23f33cef33f7f1f0839eff9ee3df77ef7dcedeb03130facc763b437197d76abd96237983c6e8399731b3d564395d9e2b6797c6e60b9268fdb5d55d56c31180c6693c56bb21bc9327683d5e036dadd162fb056afc76ae17cd62a8bd1e7717b0d3e6f95d5e2357b4c55be260fd76406d6e76b32992c9e2aafdb62b659bccd9cdb63b134f92c4d4d1e5b95d5e401d66430d8dc06b7d76e68f61a9acd469bdb6a37703ea3c9e33556594c1c19a5b9c964f671568bc96b34b83d5eb2850693db6d767306b3d10a6c1367367bb9660f89b47a7d469fd96771db7d06aec9cbd9c9c69351ec5683917ce38c16bbcd62b17bcc4dcd96263347f6c4d2ecf536937d31588c16a3bdb9
org.apache.hadoop.fs.ChecksumException: Checksum error: file:/incoming_ub/ciq_dca_uploads/2010/03/26/100813/17_00.ub at 0
    at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
    at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
    at com.ciq.m2m.platform.util.hadoop.ReportingSequenceFileRecordReader.<init>(ReportingSequenceFileRecordReader.java:54)
    at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.assureReader(MultiFileRecordReader.java:128)
    at com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.readNextFromAnyReader(MultiFileRecordReader.java:146)
    at com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.readNextFromAnyReader(MultiFileRecordReader.java:181)
    at com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.next(MultiFileRecordReader.java:122)
    at com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.next(MultiFileRecordReader.java:20)
    at com.ciq.m2m.platform.mmp2.input.BaseOtaUploadRecordReader.readNextUpload(BaseOtaUploadRecordReader.java:125)
    at com.ciq.m2m.platform.mmp2.input.OtaUploadToBinaryPackageRecordReader.next(OtaUploadToBinaryPackageRecordReader.java:176)
    at com.ciq.m2m.platform.mmp2.input.OtaUploadToBinaryPackageRecordReader.next(OtaUploadToBinaryPackageRecordReader.java:56)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)

If you have seen a similar trace before, please share your comments. Thanks
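Not a fix, but a way to tell whether only the local .crc sidecar file is stale: re-read the file with checksum verification switched off and see whether the underlying bytes are readable. A minimal sketch against the 0.20-era local filesystem API, using the path from the trace:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;

public class RawReadCheck {
    public static void main(String[] args) throws Exception {
        LocalFileSystem fs = FileSystem.getLocal(new Configuration());
        // Skip .crc verification so a stale checksum file does not abort the read.
        fs.setVerifyChecksum(false);
        FSDataInputStream in = fs.open(
                new Path("/incoming_ub/ciq_dca_uploads/2010/03/26/100813/17_00.ub"));
        byte[] buf = new byte[512];
        int n = in.read(buf);
        System.out.println("read " + n + " bytes without checksum verification");
        in.close();
    }
}

If the data reads cleanly this way, regenerating or deleting the stale .crc file is an option; if it does not, the file itself is corrupt.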
Re: datanode goes down, maybe due to Unexpected problem in creating temporary file
That blk doesn't appear in the NameNode log. For the datanode:

2010-05-15 00:09:31,023 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:49172 dest: /10.32.56.171:50010
2010-05-15 00:09:31,024 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620. File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
2010-05-15 00:09:31,024 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-5814095875968936685_2910 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_-5814095875968936685_2910. File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_-5814095875968936685 should not be present, but is.
2010-05-15 00:09:31,025 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620. File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:09:31,025 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
2010-05-15 00:19:28,334 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:36887 dest: /10.32.56.171:50010
2010-05-15 00:19:28,334 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620. File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
2010-05-15 00:19:28,334 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620. File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:29:25,635 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:34823 dest: /10.32.56.171:50010

On Mon, May 17, 2010 at 11:43 AM, Todd Lipcon t...@cloudera.com wrote:

Hi Ted,

Can you please grep your NN and DN logs for blk_926027507678171558 and pastebin the results?

-Todd

On Mon, May 17, 2010 at 9:57 AM, Ted Yu yuzhih...@gmail.com wrote:

Hi,

We use CDH2 hadoop-0.20.2+228, which crashed on datanode smsrv10.ciq.com. I found this in the datanode log:

2010-05-15 07:37:35,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:53378 dest: /10.32.56.171:50010
2010-05-15 07:37:35,956 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.
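For anyone following along, the grep Todd asks for would look roughly like this (log locations vary by install):

$ grep blk_926027507678171558 /var/log/hadoop/hadoop-*-namenode-*.log* /var/log/hadoop/hadoop-*-datanode-*.log*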
Re: MultipleInputs in 0.20
Please refer to MAPREDUCE-1743. The other option is to duplicate the MultipleInputs and DelegatingInputFormat classes and slightly modify TaggedInputSplit (as I suggested earlier). This way you use your own (functional) version :-)

On Sun, May 9, 2010 at 2:08 PM, Oded Rosen o...@legolas-media.com wrote:

From what I've learned from different sites around the web (the hadoop wiki, cloudera http://www.cloudera.com/blog/2009/05/what%E2%80%99s-new-in-hadoop-core-020/, the mail archive, etc.), the MultipleInputs class that was available in the 0.18-0.19 versions of hadoop was not moved to the 0.20 new API (neither was MultipleOutputs, but that's another story).

I wanted to know if there is a way around this - to use two different paths with two different input formats (sequence file, text file) as sources for the same job, with a special mapper for each input type - using the hadoop 0.20 API. I think that writing a new job using the 0.19 API only means more trouble later, when it's officially deprecated.

I saw there is a jira (MAPREDUCE-1170, https://issues.apache.org/jira/browse/MAPREDUCE-1170) open for this issue, with a patch marked as "Won't fix". If someone out there can help me with this, I will be most thankful.

Cheers,
-- Oded
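For readers who land here: the class being discussed still exists in 0.20 under the old API (org.apache.hadoop.mapred.lib); it is only the new (mapreduce) API that lacks it. A minimal sketch of the old-API usage being asked for, with illustrative paths and IdentityMapper standing in for the per-input mappers:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class MultiInputSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(MultiInputSketch.class);
        conf.setJobName("multi-input sketch");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        // Each input path gets its own InputFormat and its own Mapper.
        MultipleInputs.addInputPath(conf, new Path("/data/seq"),
                SequenceFileInputFormat.class, IdentityMapper.class);
        MultipleInputs.addInputPath(conf, new Path("/data/text"),
                TextInputFormat.class, IdentityMapper.class);
    }
}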
Re: Compilation failed when compile hadoop common release-0.20.2
Did you first try 'ant jar' from under /hadoop_src?

On Fri, Mar 5, 2010 at 3:47 PM, Gary Yang garyya...@yahoo.com wrote:

Hi,

I am trying to compile hadoop common from the 0.20.2 release. Below are the error messages and the java and ant versions I am using. Please tell me what I missed.

..
[javadoc] Standard Doclet version 1.6.0_18
[javadoc] Building tree for all the packages and classes...
[javadoc] Building index for all the packages and classes...
[javadoc] Building index for all classes...

java5.check:

BUILD FAILED
/hadoop_src/common/build.xml:908: 'java5.home' is not defined. Forrest requires Java 5. Please pass -Djava5.home=<base of Java 5 distribution> to Ant on the command-line.

$ echo $JAVA_HOME
/usr/java/latest
$ which java
/usr/java/latest/bin/java
$ /usr/java/latest/bin/java -version
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Server VM (build 16.0-b13, mixed mode)
$ ant -version
Apache Ant version 1.7.1 compiled on June 27 2008

Thanks,
Gary
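The failing target is the Forrest documentation build, which needs a JDK 5 alongside the JDK 6 used for compilation. Two usual ways out, sketched here (the JDK 5 path is illustrative):

$ ant jar
$ ant -Djava5.home=/usr/java/jdk1.5.0_22 tar

'ant jar' builds the core jar without touching Forrest at all; passing java5.home, as the error message itself asks, is only needed for the targets that generate documentation.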
Re: splittable encrypted compressed files
A bit more background: the encryption is required by a third party. Data coming from the third party may arrive in encrypted .gz format.

Cheers

On Sun, Jan 24, 2010 at 2:34 PM, Gautam Singaraju gautam.singar...@gmail.com wrote:

An encryption/decryption mechanism requires the secret key to be securely distributed. As far as moving data on the wire is concerned, SSH is already being used. I am wondering about the performance hit when encryption and then decryption are applied to the data, and what value that might add.

---
Gautam

On Tue, Jan 19, 2010 at 2:38 PM, Ted Yu yuzhih...@gmail.com wrote:

Hi,

I experimented with hadoop-lzo and liked it. Our requirement is changing and we may need to encrypt data files. Does anyone know of a splittable encrypted compression scheme that a map task can use?

Thanks
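Worth noting: gzip is not splittable to begin with, so an encrypted .gz file will be handled by a single mapper either way; the encryption layer only adds a decrypt step in front of the decompression. A minimal standalone sketch of that read path, assuming the third party encrypts the gzipped file with AES/CTR (cipher choice, key, and IV handling here are illustrative, not anything specified in the thread):

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class DecryptGzRead {
    public static void main(String[] args) throws Exception {
        // Placeholder key/IV; the real values come from the third party's key distribution.
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));

        // The file was gzip-compressed first and then encrypted,
        // so reading reverses the order: decrypt, then decompress.
        InputStream in = new GZIPInputStream(
                new CipherInputStream(new FileInputStream(args[0]), cipher));
        byte[] buf = new byte[8192];
        long total = 0;
        for (int n; (n = in.read(buf)) > 0; ) {
            total += n;
        }
        in.close();
        System.out.println("decompressed bytes: " + total);
    }
}

CTR mode is seekable, so in principle a custom record reader could decrypt from an arbitrary offset, but the gzip layer underneath still forces whole-file reads; putting the encryption over a splittable container (for example a block-compressed SequenceFile, or indexed hadoop-lzo) is what would actually restore splits.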