Re: [ANNOUNCE] New Hadoop Committer - Chris Trezzo

2017-04-24 Thread Ted Yu
Congratulations, Chris

On Mon, Apr 24, 2017 at 12:42 PM, Sangjin Lee  wrote:

> It is my pleasure to announce that Chris Trezzo has been elected as a
> committer on the Apache Hadoop project. We appreciate his contributions to
> Hadoop thus far, and look forward to more.
>
> Please join me in congratulating Chris!
>
> Regards,
> Sangjin on behalf of the Apache Hadoop PMC
>


Re: [ANNOUNCE] New Hadoop Committer - Varun Saxena

2016-06-24 Thread Ted Yu
Congratulations Varun!

On Fri, Jun 24, 2016 at 11:57 AM, Li Lu  wrote:

> Congrats Varun!
>
> Li Lu
>
> > On Jun 24, 2016, at 11:51, 俊平堵  wrote:
> >
> > On behalf of the Apache Hadoop PMC, I am pleased to announce that Varun
> > Saxena
> > has been elected a committer on the Apache Hadoop project.  We appreciate
> > all
> > of Varun's hard work thus far, and we look forward to his continued
> > contributions.
> >
> > Congratulations, Varun!
> >
> >
> > Cheers,
> >
> > Junping
>
>
> -
> To unsubscribe, e-mail: general-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: general-h...@hadoop.apache.org
>


Re: [ANNOUNCE] New Hadoop Committer - Larry McCay

2016-04-13 Thread Ted Yu
Congrats, Larry. 

> On Apr 12, 2016, at 11:33 PM, John Zhuge  wrote:
> 
> Congratulations Larry!
> 
> John Zhuge
> Software Engineer, Cloudera
> 
>> On Tue, Apr 12, 2016 at 11:31 PM, Xiao Chen  wrote:
>> 
>> Congrats Larry!
>> 
>> -Xiao
>> 
>>> On Tue, Apr 12, 2016 at 11:30 PM, Lei Xu  wrote:
>>> 
>>> Congratulation!
>>> 
 On Tue, Apr 12, 2016 at 11:28 PM, Gera Shegalov  wrote:
 Congrats and Welcome, Larry!
 
 On Wed, Apr 13, 2016 at 7:57 AM Chris Nauroth <
>> cnaur...@hortonworks.com>
 wrote:
 
> On behalf of the Apache Hadoop PMC, I am pleased to announce that
>> Larry
> McCay has been elected a committer on the Apache Hadoop project.  We
> appreciate all of Larry's hard work thus far, and we look
> forward to his continued contributions.
> 
> Congratulations, Larry!
> 
> 
> --Chris Nauroth
>>> 
>>> 
>>> 
>>> --
>>> Lei (Eddy) Xu
>>> Software Engineer, Cloudera
>> 


Re: [ANNOUNCE] New Apache Hadoop Committer : Kai Zheng

2016-04-13 Thread Ted Yu
Congrats, Kai. 

> On Apr 13, 2016, at 12:35 AM, Rakesh Radhakrishnan  
> wrote:
> 
> Congratulations, Kai!
> 
> Rakesh
> 
>> On Wed, Apr 13, 2016 at 1:00 PM, Zhihai Xu  wrote:
>> 
>> Congrats Kai!
>> 
>> zhihai
>> 
>> On Wed, Apr 13, 2016 at 12:19 AM, Varun Saxena 
>> wrote:
>> 
>>> Congrats Kai !
>>> 
 On Wed, 13 Apr 2016 at 12:45, Uma gangumalla 
>>> wrote:
>>> 
 Hi All,
 
  On behalf of the Apache Hadoop PMC, I am pleased to announce that
 Kai Zheng has been elected as a committer in the Apache Hadoop
 project.
 
 We appreciate all the work Kai Zheng has put into the project so far,
 and look forward to his future contributions.
 
 Welcome aboard, Kai Zheng.
 
 Congratulations!
 
 Regards,
 Uma Maheswara Rao Gangumalla
 On behalf of the Apache Hadoop PMC
>> 


Re: [ANNOUNCE] New Hadoop PMC Member - Sangjin Lee

2016-04-13 Thread Ted Yu
Congratulations, Sangjin. 

> On Apr 12, 2016, at 11:10 PM, Zhihai Xu  wrote:
> 
> Congrats Sangjin!
> 
> zhihai
> 
> On Tue, Apr 12, 2016 at 11:05 PM, Rohith Sharma K S <
> rohithsharm...@huawei.com> wrote:
> 
>> Congrats Sangjin :-)
>> 
>> 
>> -Original Message-
>> From: Chris Nauroth [mailto:cnaur...@hortonworks.com]
>> Sent: 13 April 2016 11:34
>> To: general@hadoop.apache.org
>> Subject: [ANNOUNCE] New Hadoop PMC Member - Sangjin Lee
>> 
>> On behalf of the Apache Hadoop PMC, I am very pleased to announce that
>> Sangjin Lee has been elected as a PMC Member on the Apache Hadoop project,
>> recognizing his continued contributions to the project so far.
>> 
>> Please join me in congratulating Sangjin!
>> 
>> --Chris Nauroth
>> 
>> 


Re: [ANNOUNCE] New Hadoop Committer - Naganarasimha Garla

2016-04-07 Thread Ted Yu
Congratulations, Naga.

On Thu, Apr 7, 2016 at 11:00 AM, Varun Saxena 
wrote:

> Congrats Naga !
>
> - Varun.
>
> On Thu, Apr 7, 2016 at 11:29 PM, Wangda Tan  wrote:
>
> > On behalf of the Apache Hadoop PMC, I am pleased to announce
> > that Naganarasimha Garla has been elected a committer on the Apache
> Hadoop
> > project.  We appreciate all of Naga's hard work thus far, and we look
> > forward to his continued contributions.
> >
> > Welcome onboard and congratulations, Naga!
> >
> > Thanks,
> > Wangda Tan
> >
>


Re: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC

2016-02-29 Thread Ted Yu
Congratulations Yongjun. 

> On Feb 29, 2016, at 12:17 AM, Naganarasimha G R (Naga) 
>  wrote:
> 
> Congrats Yongjun !
> 
> + Naga
> 
> From: Tsuyoshi Ozawa [oz...@apache.org]
> Sent: Monday, February 29, 2016 13:37
> To: general@hadoop.apache.org
> Cc: Yongjun Zhang
> Subject: Re: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC
> 
> Congrats, Yongjun!
> 
> - Tsuyoshi
> 
>> On Mon, Feb 29, 2016 at 4:55 PM, Xiao Chen  wrote:
>> Congrats Yongjun! :)
>> 
>> -Xiao
>> 
>> On Sun, Feb 28, 2016 at 11:52 PM, Brahma Reddy Battula <
>> brahmareddy.batt...@huawei.com> wrote:
>> 
>>> Congrats Yongjun!!
>>> 
>>> -Original Message-
>>> From: a...@cloudera.com [mailto:a...@cloudera.com] On Behalf Of Aaron T.
>>> Myers
>>> Sent: 29 February 2016 15:50
>>> To: general@hadoop.apache.org; Yongjun Zhang
>>> Subject: [ANNOUNCE] Yongjun Zhang added to the Apache Hadoop PMC
>>> 
>>> On behalf of the Apache Hadoop PMC, I am very pleased to announce that
>>> Yongjun Zhang has been elected as a PMC member on the Apache Hadoop
>>> project. This is in recognition of Yongjun's sustained and significant
>>> contributions to the project, and we look forward to even more
>>> contributions from him in the future.
>>> 
>>> Please join me in congratulating Yongjun on this accomplishment.
>>> 
>>> Great work, Yongjun.
>>> 
>>> Best,
>>> Aaron, on behalf of the Apache Hadoop PMC
>>> 


Re: [ANNOUNCE] New Hadoop Committer - Eric Payne

2016-02-11 Thread Ted Yu
Congratulations.

On Thu, Feb 11, 2016 at 7:55 AM, Rohith Sharma K S <
rohithsharm...@apache.org> wrote:

> Congratulations Eric!! :-)
> On 11 Feb 2016 21:23, "Jason Lowe"  wrote:
>
> > On behalf of the Apache Hadoop PMC, I am pleased to announce that Eric
> > Payne has been elected a committer on the Apache Hadoop project.  We
> > appreciate all of Eric's hard work thus far, and we look forward to his
> > continued contributions.
> >
> > Welcome and congratulations, Eric!
> >
> > Jason
> >
>


Re: [ANNOUNCE] New Hadoop Committer - Masatake Iwasaki

2016-01-27 Thread Ted Yu
Congratulations.

On Wed, Jan 27, 2016 at 6:43 PM, Xiao Chen  wrote:

> Congrats Masatake Iwasaki! Well deserved.
>
> -Xiao
>
> On Wed, Jan 27, 2016 at 6:38 PM, Yongjun Zhang 
> wrote:
>
> > Congratulations Masatake!
> >
> > --Yongjun
> >
> > On Wed, Jan 27, 2016 at 6:37 PM, Tsuyoshi OZAWA <
> > ozawa.tsuyo...@lab.ntt.co.jp> wrote:
> >
> > > On behalf of the Apache Hadoop PMC, I am pleased to announce that
> > > Masatake Iwasaki has been elected as a committer on the Apache Hadoop
> > > project. We appreciate all of Masatake's hard work thus far, and we
> > > look forward to his continued contributions.
> > >
> > > Welcome Masatake!
> > >
> > > Regards,
> > > - Tsuyoshi
> > >
> >
>


Re: [ANNOUNCE] Additions to Apache Hadoop PMC - Akira, Robert, Tusyoshi and Wangda

2016-01-12 Thread Ted Yu
Congratulations!

> On Jan 12, 2016, at 5:08 PM, Karthik Kambatla  wrote:
> 
> On behalf of the Apache Hadoop PMC, I am very pleased to announce the
> following folks have been elected as a PMC member on the Apache Hadoop
> project recognizing their sustained and significant contributions to the
> project:
> 
>   - Akira Ajisaka
>   - Robert Kanter
>   - Tsuyoshi Ozawa
>   - Wangda Tan
> 
> Please join me congratulating them.
> 
> Cheers!
> Karthik
> (on behalf of the Hadoop PMC)


Re: [ANNOUNCE] New Hadoop Committer - Walter Su

2015-10-28 Thread Ted Yu
Congrats, Walter. 

> On Oct 28, 2015, at 5:16 PM, Vinayakumar B  wrote:
> 
> Congrats Walter!
> 
> Welcome aboard.
> 
> Regards,
> Vinay
> 
>> On Thu, Oct 29, 2015 at 5:45 AM, Haohui Mai  wrote:
>> 
>> On behalf of the Apache Hadoop PMC, I am very pleased to announce that
>> Walter Su has been elected a committer on the Apache Hadoop project
>> recognizing his continued contributions to the project.
>> 
>> Walter is already a branch committer on HDFS-7285 and can hit the ground
>> running :)
>> 
>> Welcome Walter!
>> 
>> Cheers!
>> 


Re: [ANNOUNCE] New Apache Hadoop committer Zhe Zhang

2015-10-19 Thread Ted Yu
Congrats, Zhe

> On Oct 19, 2015, at 1:56 PM, Andrew Wang  wrote:
> 
> Hi all,
> 
> It is my pleasure to welcome Zhe Zhang as an Apache Hadoop committer. Zhe
> has been working on the project for over a year now, notably on the HDFS
> erasure coding feature (HDFS-7285) as well as other bug fixes and
> improvements.
> 
> Please join me in congratulating Zhe!
> 
> Best,
> Andrew


Re: [ANNOUNCE] New Hadoop Committer - Anubhav dhoot

2015-09-20 Thread Ted Yu
Congrats Anubhav

> On Sep 20, 2015, at 9:56 PM, Karthik Kambatla  wrote:
> 
> On behalf of the Apache Hadoop PMC, I am pleased to announce that Anubhav
> Dhoot has been elected a committer on the Apache Hadoop project recognizing
> his contributions to the project.
> 
> We appreciate Anubhav's hard work thus far, and look forward to his
> contributions.
> 
> Welcome Anubhav!
> 
> Cheers!


Re: [ANNOUNCE] Welcoming Devaraj K to Apache Hadoop PMC

2015-08-05 Thread Ted Yu
Congrats, Devaraj.

On Wed, Aug 5, 2015 at 5:16 PM, Brahma Reddy Battula 
brahmareddy.batt...@hotmail.com wrote:

 Congratulations Devaraj!

  Date: Wed, 5 Aug 2015 16:56:47 -0700
  Subject: [ANNOUNCE] Welcoming Devaraj K to Apache Hadoop PMC
  From: umamah...@apache.org
  To: general@hadoop.apache.org
 
  Hi,
 
   On behalf of the Apache Hadoop Project Management Committee (PMC), it
  gives me great pleasure to announce that Devaraj K is recently welcomed to
  join as a member of the Apache Hadoop PMC. He has done a lot of good work for
  Hadoop, and we look forward to his greater achievements in the future!
 
  Please join me in welcoming Deva to Hadoop PMC!
 
  Congratulations Deva!
 
 
  Regards,
  Uma  (On behalf of the Apache Hadoop Project Management Committee)




Re: [ANNOUNCE] New Hadoop committer - Zhihai Xu

2015-07-30 Thread Ted Yu
Congratulations, Zhihai.

On Thu, Jul 30, 2015 at 12:49 PM, Wangda Tan wheele...@gmail.com wrote:

 Congrats!

 Regards,
 Wangda

 On Thu, Jul 30, 2015 at 12:36 PM, Karthik Kambatla ka...@cloudera.com
 wrote:

  On behalf of the Apache Hadoop PMC, I am pleased to announce that Zhihai
 Xu
  has been elected a committer on the Apache Hadoop project recognizing his
  prolific work in the last year or so.
 
  We appreciate Zhihai's hard work thus far, and look forward to his
  contributions.
 
  Welcome Zhihai!
 
  Cheers!
  Karthik
 



Re: environment setup

2015-07-26 Thread Ted Yu
Putting general@ to bcc.

If you plan to ask about Apache Hadoop setup issue, consider using user@

If you are installing distro from some vendor, please use vendor-specific
mailing list.

Cheers

2015-07-26 13:13 GMT-07:00 Serkan Taş serkan@likyateknoloji.com:

 Hi,

 Which mail list best fits for the questions of  hadoop dev environment
 setup problems ?

 Thanx


 *Serkan Taş*
 Mobil : +90 532 250 07 71
 Likya Bilgi Teknolojileri
 ve İletişim Hiz. Ltd. Şti.
 www.likyateknoloji.com

 --
 This electronic mail and any files transmitted with it are intended for
 the private use of  the persons named above. If you received this message
 in error, forwarding, copying or use of any of the information is strictly
 prohibited. Please immediately notify the sender and delete it from your
 system. Likya Bilgi Teknolojileri ve İletişim Hiz. Ltd. Şti. does not
 accept legal responsibility for the contents of this message.
 --







 Please consider your environmental responsibility before printing this
 e-mail.





Re: [ANNOUNCE] Welcoming Vinayakumar B to Apache Hadoop PMC

2015-07-08 Thread Ted Yu
Congrats, Vinay.

On Wed, Jul 8, 2015 at 11:10 AM, Uma gangumalla umamah...@apache.org
wrote:

 Hi,

  On behalf of the Apache Hadoop Project Management Committee (PMC), it
 gives me great pleasure to announce that Vinayakumar B is recently welcomed
 to join as a member of the Apache Hadoop PMC. He has done a lot of good work
 for Hadoop, and we look forward to his greater achievements in the future!

 Please join me in welcoming Vinay to Hadoop PMC!

 Congratulations Vinay!


 Regards,
 Uma  (On behalf of the Apache Hadoop Project Management Committee)



Re: [ANNOUNCE] Welcoming Xuan Gong to Apache Hadoop PMC

2015-07-07 Thread Ted Yu
Congrats, Xuan.

On Tue, Jul 7, 2015 at 12:49 PM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 It gives me great pleasure to announce that Xuan Gong is welcomed to join
 as a member of Apache Hadoop PMC. Here's appreciating all his work so far
 into the project and looking forward to greater achievements in the future!

 Please join me in welcoming Xuan to Hadoop PMC!

 Congratulations Xuan!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC



Re: [ANNOUNCE] New Hadoop Committer - Rohith Sharma K S

2015-07-07 Thread Ted Yu
Congrats Rohith!

On Tue, Jul 7, 2015 at 1:50 PM, Zhijie Shen zs...@hortonworks.com wrote:

 Congrats!
 
 From: Sangjin Lee sjl...@gmail.com
 Sent: Tuesday, July 07, 2015 1:48 PM
 To: general@hadoop.apache.org
 Subject: Re: [ANNOUNCE] New Hadoop Committer - Rohith Sharma K S

 Congratulations Rohith!

 On Tue, Jul 7, 2015 at 1:29 PM, Chris Nauroth cnaur...@hortonworks.com
 wrote:

  Welcome, Rohith!
 
  --Chris Nauroth
 
 
 
 
  On 7/7/15, 1:24 PM, Jian He j...@hortonworks.com wrote:
 
  On behalf of the Apache Hadoop PMC, I am pleased to announce that Rohith
  Sharma K S
  has been elected as a committer on the Apache Hadoop project.  We
   appreciate all of Rohith's hard work thus far, and we look forward to
 his
  continued contributions.
  
  Thanks,
  Jian He
 
 



Re: [ANNOUNCE] New Hadoop committer - Lei (Eddy) Xu

2015-06-22 Thread Ted Yu
Congratulations, Eddy. 


 On Jun 19, 2015, at 8:43 PM, Esteban Gutierrez este...@cloudera.com wrote:
 
 Congratulations Eddy!
 
 --
 Cloudera, Inc.
 
 
 On Fri, Jun 19, 2015 at 3:26 AM, Joep Rottinghuis jrottingh...@gmail.com
 wrote:
 
 Congrats Eddy !
 
 Joep
 
 Sent from my iPhone
 
 On Jun 18, 2015, at 9:20 PM, Rohith Sharma K S 
 rohithsharm...@huawei.com wrote:
 
 Congratulations :-)
 
  Thanks & Regards
 Rohith Sharma K S
 
 -Original Message-
 From: Andrew Wang [mailto:andrew.w...@cloudera.com]
 Sent: 16 June 2015 03:21
 To: general@hadoop.apache.org
 Subject: [ANNOUNCE] New Hadoop committer - Lei (Eddy) Xu
 
 Hello all,
 
 It is my pleasure to announce that Lei Xu, also known as Eddy, has
 accepted the Apache Hadoop PMC's invitation to become a committer. We
 appreciate all of Eddy's hard work thus far, and look forward to his
 continued contributions.
 
 Welcome and congratulations, Eddy!
 
 Best,
 Andrew, on behalf of the Apache Hadoop PMC
 


Re: [ANNOUNCE] New Hadoop Committer - Ming Ma

2015-06-21 Thread Ted Yu
Congratulations, Ming. 



 On Jun 19, 2015, at 8:25 PM, Esteban Gutierrez este...@cloudera.com wrote:
 
 Congratulations Ming Ma!
 
 
 
 --
 Cloudera, Inc.
 
 
 On Fri, Jun 19, 2015 at 2:07 PM, Masatake Iwasaki 
 iwasak...@oss.nttdata.co.jp wrote:
 
 Congratulations, Ming Ma!
 
 
 
 On 6/18/15 12:55, Chris Nauroth wrote:
 
 On behalf of the Apache Hadoop PMC, I am pleased to announce that Ming Ma
 has been elected as a committer on the Apache Hadoop project.  We
 appreciate all of Ming's hard work thus far, and we look forward to his
 continued contributions.
 
 Welcome, Ming!
 
 --Chris Nauroth
 


Re: [ANNOUNCE] New Hadoop committer - Varun Vasudev

2015-06-17 Thread Ted Yu
Congratulations Varun 



 On Jun 17, 2015, at 3:36 PM, Devaraj K deva...@apache.org wrote:
 
 Congrats Varun...
 
 Thanks
 Devaraj
 
 On Tue, Jun 16, 2015 at 10:24 PM, Vinod Kumar Vavilapalli 
 vino...@apache.org wrote:
 
 Hi all,
 
 
 It gives me great pleasure to announce that the Apache Hadoop PMC recently
 invited Varun Vasudev to become a committer in the project, to which he
 accepted.
 
 
 We deeply appreciate his efforts in the project so far, specifically in the
 areas of YARN and MapReduce. Here’s looking forward to his
 continued contributions going into the future!
 
 
 Welcome aboard and congratulations, Varun!
 
 
 +Vinod
 
 On behalf of the Apache Hadoop PMC
 
 
 
 -- 
 
 
 Thanks
 Devaraj K


Re: [ANNOUNCE] New Hadoop committer - Arun Suresh

2015-03-28 Thread Ted Yu
Congratulations, Arun.

On Fri, Mar 27, 2015 at 9:36 AM, Sangjin Lee sjl...@gmail.com wrote:

 Congratulations!

 On Thu, Mar 26, 2015 at 6:29 PM, Sun, Dapeng dapeng@intel.com wrote:

  Congratulations Arun!
 
  Regards
  Dapeng
 
  -Original Message-
  From: Andrew Wang [mailto:andrew.w...@cloudera.com]
  Sent: Friday, March 27, 2015 6:39 AM
  To: general@hadoop.apache.org
  Subject: [ANNOUNCE] New Hadoop committer - Arun Suresh
 
  It is my pleasure to announce that Arun Suresh has accepted the Apache
  Hadoop PMC's invitation to become a committer on the Apache Hadoop
 project.
  We appreciate all of Arun's hard work thus far, and look forward to his
  continued contributions.
 
  Welcome, Arun!
 



Re: Patch review process

2015-01-26 Thread Ted Yu
In some cases, a contributor responded to review comments and attached
patches addressing them.

Later on, there was simply no response to the latest patch - even with a
follow-on ping.

I wish this aspect could be improved.

Cheers

On Sun, Jan 25, 2015 at 6:03 PM, Tsz Wo (Nicholas), Sze 
s29752-hadoopgene...@yahoo.com.invalid wrote:

 Hi contributors,
 I would like to (re)start a discussion regarding our patch review
 process.  A similar discussion has happened in the hadoop private
 mailing list, which is inappropriate.

 Here is the problem: the patch-available queues become longer and longer.
 It seems that we can never catch up.  There are patches sitting in the
 queues for years.  How could we speed up?

 Regards,
 Tsz-Wo



Re: [ANNOUNCE] Welcoming Jian He to Apache Hadoop PMC

2015-01-05 Thread Ted Yu
Congratulations, Jian !

On Mon, Jan 5, 2015 at 3:51 PM, Vinod Kumar Vavilapalli vino...@apache.org
wrote:

 Hi all,

 On behalf of the Apache Hadoop PMC, it gives me great pleasure to announce
 that Jian He is recently welcomed to join as a member of Apache Hadoop PMC.
 Here's appreciating all his work so far into the project and looking
 forward to greater achievements in the future!

 Please join me in welcoming Jian to Hadoop PMC! Congratulations Jian!

 Thanks,
 +Vinod



Re: [ANNOUNCE] Welcoming Zhijie Shen to Apache Hadoop PMC

2015-01-05 Thread Ted Yu
Congratulations, Zhijie !



 On Jan 5, 2015, at 3:51 PM, Vinod Kumar Vavilapalli vino...@apache.org 
 wrote:
 
 Hi all,
 
 On behalf of the Apache Hadoop PMC, it gives me great pleasure to announce 
 that Zhijie Shen is recently welcomed to join as a member of Apache Hadoop 
 PMC. Here's appreciating all his work so far into the project and looking 
 forward to greater achievements in the future!
 
 Please join me in welcoming Zhijie to Hadoop PMC! Congratulations Zhijie!
 
 Thanks,
 +Vinod


Re: [ANNOUNCE] New Hadoop Committer - Carlo Curino

2014-12-05 Thread Ted Yu
Congratulations, Carlo !

On Dec 5, 2014, at 12:43 PM, Chris Douglas cdoug...@apache.org wrote:

 On behalf of the Apache Hadoop PMC, I'm pleased to announce (albeit
 belatedly) that Carlo Curino has been elected as a committer on the
 project. Congratulations Carlo; thank you for all your contributions
 to Hadoop! -C


Re: [ANNOUNCE] New Hadoop Committers - Gera Shegalov and Robert Kanter

2014-12-04 Thread Ted Yu
Congratulations, Gera and Robert!

On Thu, Dec 4, 2014 at 1:43 PM, Chris Nauroth cnaur...@hortonworks.com
wrote:

 Congratulations!

 Chris Nauroth
 Hortonworks
 http://hortonworks.com/


 On Thu, Dec 4, 2014 at 11:56 AM, Sandy Ryza sandy.r...@cloudera.com
 wrote:

  On behalf of the Apache Hadoop PMC, I am pleased to announce that Gera
  Shegalov and Robert Kanter have been elected as committers on the Apache
  Hadoop project. We appreciate all the work they have put into the project
  so far, and look forward to their future contributions.
 
  Welcome, Gera and Robert!
 
  -Sandy
 




Re: All mirrored download links from the Apache Hadoop site are broken

2014-10-31 Thread Ted Yu
Looks like the following is working:
http://apache.mesi.com.ar/hadoop/common/

FYI

On Fri, Oct 31, 2014 at 10:55 AM, Andrew Purtell apurt...@apache.org
wrote:

 http://hadoop.apache.org/releases.html#Download
- http://www.apache.org/dyn/closer.cgi/hadoop/common/

 Every single mirror link 404s.

 --
 Best regards,

- Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)



Re: Some questions for getting help

2014-05-19 Thread Ted Yu
You can connect to the #hadoop chat room on IRC.
There are many people on that channel.

Cheers

On May 18, 2014, at 11:31 PM, Pankti Majmudar pankti.majmu...@gmail.com wrote:

 Hi,
 I am a newbie to open source development and Hadoop.  I was looking at the
 newbie issues on JIRA and would like to pick up one of the minor bugs.
 
 I referred to the 'How To Contribute' page on Apache and started with
 getting the source code and setting the environment up.  Please let me know
 if there are any sources(IRCs etc) where I can ask for specific help and
 discuss about fixing any issue/bug.  Is the mailing list the only method
 for communicating or asking questions?
 
 Any pointers will be helpful.
 
 Thanks,
 Pankti


Re: I stopped receiving any email from this group!

2014-05-11 Thread Ted Yu
See https://twitter.com/infrabot

Cheers


On Sat, May 10, 2014 at 5:19 PM, Peyman Mohajerian mohaj...@gmail.com wrote:





Re: [ANNOUNCE] New Hadoop Committer - Xuan Gong

2014-04-13 Thread Ted Yu
Congratulations, Xuan.


On Sun, Apr 13, 2014 at 5:49 PM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that the Apache Hadoop PMC voted in Xuan Gong
 as a committer in the Apache Hadoop project. His contributions to YARN and
 MapReduce have been outstanding! We appreciate all the work Xuan has put
 into the project so far, and are looking forward to his future
 contributions.

 Welcome aboard, Xuan!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: hadoop eclipse plugin

2014-01-31 Thread Ted Yu
I used the following commands in the root of the workspace:
  ant clean compile
  ant eclipse

The Eclipse project files are generated:
-rw-r--r--  1 tyu  staff  7942 Jan 31 22:21 .classpath
-rw-r--r--  1 tyu  staff   392 Jan 31 22:21 .project

Cheers


On Fri, Jan 31, 2014 at 9:43 PM, pavan pavan.danthul...@gmail.com wrote:

 Hi,



 I am very new to hadoop, exploring hadoop-1.2.1 on windows using (cygwin +
 eclipse europa). Can anyone guide/help me in finding the information on
 eclipse plugin for the below configuration?



 OS : Windows(using Cygwin)

 Hadoop version : hadoop-1.2.1

 Eclipse : Ecipse Europa 3.3.2



 Thanks

 Pavan Kumar D








Re: [ANNOUNCE] New Hadoop Committer - Junping Du

2013-12-09 Thread Ted Yu
Congrats, Junping.


On Tue, Dec 10, 2013 at 3:07 AM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that Junping Du is voted in as a committer in
 the Apache Hadoop project. His contributions to Hadoop have been
 outstanding! We appreciate all the work Junping has put into the project so
 far, and are looking forward to his future contributions.

 Welcome aboard, Junping!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: [ANNOUNCE] New Hadoop Committer - Omkar Vinit Joshi

2013-12-09 Thread Ted Yu
Congratulations, Omkar.


On Tue, Dec 10, 2013 at 3:08 AM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that Omkar Vinit Joshi is voted in as a
 committer to the Apache Hadoop project. His contributions to YARN have been
 nothing short of outstanding! We appreciate all the work Omkar has put into
 the project so far, and are looking forward to his future contributions.

 Welcome aboard, Omkar!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: [ANNOUNCE] New Hadoop Committer - Mayank Bansal

2013-12-09 Thread Ted Yu
Congrats, Mayank.


On Tue, Dec 10, 2013 at 3:11 AM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that Mayank Bansal is voted in as a committer
 in the Apache Hadoop project. A long term Hadoop contributor, his
 contributions to YARN and MapReduce have been outstanding! We appreciate
 all the work Mayank has put into the project so far, and are looking
 forward
 to his future contributions.

 Welcome aboard, Mayank!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: [ANNOUNCE] New Hadoop Committer - Jian He

2013-12-09 Thread Ted Yu
Congrats, Jian.


On Tue, Dec 10, 2013 at 3:10 AM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that Jian He is voted in as a committer to
 the Apache Hadoop project. His contributions to YARN have been nothing
 short of outstanding! We appreciate all the work Jian has put into the
 project so far, and are looking forward to his future contributions.

 Welcome aboard, Jian!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: [ANNOUNCE] New Hadoop Committer - Zhijie Shen

2013-12-09 Thread Ted Yu
Congratulations, Zhijie.


On Tue, Dec 10, 2013 at 3:10 AM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Hi all,

 I am very pleased to announce that Zhijie Shen is voted in as a committer
 in the Apache Hadoop project. His contributions to YARN and MapReduce have
 been outstanding! We appreciate all the work Zhijie has put into the
 project so far, and are looking forward to his future contributions.

 Welcome aboard, Zhijie!

 Thanks,
 +Vinod
 On behalf of the Apache Hadoop PMC




Re: [ANNOUNCE] New Hadoop Committer - Roman Shaposhnik

2013-10-24 Thread Ted Yu
Congratulations, Roman. 

On Oct 24, 2013, at 4:35 PM, Andrew Wang andrew.w...@cloudera.com wrote:

 Congrats Roman!
 
 
 On Thu, Oct 24, 2013 at 4:33 PM, Alejandro Abdelnur t...@cloudera.com wrote:
 
 On behalf of the Apache Hadoop PMC, I am pleased to announce that Roman
 Shaposhnik has been elected a committer in the Apache Hadoop project. We
 appreciate all the work Roman has put into the project so far, and look
 forward to his future contributions.
 
 Welcome, Roman!
 
 Cheers,
 
 --
 Alejandro
 


Re: Unclear Hadoop 2.1X documentation

2013-09-14 Thread Ted Yu
It might be easier if you start with some sandbox environment for your
first setup.
Search 'hadoop sandbox' in google.

Cheers


On Sat, Sep 14, 2013 at 11:22 AM, Mahmoud Al-Ewiwi mew...@gmail.com wrote:

 Thanks Mr. Ted
 What about 3 and 4? Where are hadoop-common and hadoop-hdfs?



 On Sat, Sep 14, 2013 at 9:09 PM, Ted Yu yuzhih...@gmail.com wrote:

  For #1, you can get the tar ball from
  http://www.apache.org/dyn/closer.cgi/hadoop/common/
  e.g. http://www.motorlogy.com/apache/hadoop/common/hadoop-2.1.0-beta/
 
  It is in maven too: http://mvnrepository.com/artifact/org.apache.hadoop/
 
  For #2, see https://code.google.com/p/protobuf/
 
 
  On Sat, Sep 14, 2013 at 10:54 AM, Mahmoud Al-Ewiwi mew...@gmail.com
  wrote:
 
   Hello,
  
    I'm new to Hadoop and I want to learn it in order to do a project.
    I've started reading the documentation at this site:
  
  
  
 
 http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-project-dist/hadoop-common/SingleCluster.html
  
    for setting up a single node, but I could not figure out a lot of things
    in this documentation.
  
   1.You should be able to obtain the MapReduce tarball from the release
  
    I could not find this tarball; where is it?
  
   2.You will need protoc 2.5.0 installed
  
    What is that? There is not even a link for it, or an explanation of what it is.
  
   3.Assuming you have installed hadoop-common/hadoop-hdfs
  
    What are these, and why are you assuming that? I have just downloaded
    the hadoop-2.1.0-beta
   http://ftp.itu.edu.tr/Mirror/Apache/hadoop/common/hadoop-2.1.0-beta/
 and
   extracted it
  
   4. and exported *$HADOOP_COMMON_HOME*/*$HADOOP_HDFS_HOME*
  
    !! That is strange; where should these environment variables point?
  
    Lastly, as far as I know, the first-step tutorial should give more details.
    Or am I searching in the wrong place?
  
 



Re: Status of 2.0.4-alpha release

2013-04-09 Thread Ted Yu
Good idea, Roman.

On Tue, Apr 9, 2013 at 7:26 PM, Roman Shaposhnik r...@apache.org wrote:

 On Tue, Apr 9, 2013 at 7:03 PM, Arun C Murthy a...@hortonworks.com wrote:
  Thanks Xuan, Vinod and Cos.
 
  I'll spin the rc0 in the next 24hrs.

 Once the official RC is ready I'll pull the tarball into the Bigtop infra
 and will provide integration testing results.

 In fact, I think starting from 2.0.4-alpha we should probably get
 into the habit of integration testing the next branch every single
 night so that everybody can jump onto test failures, not just
 us Bigtop folks.

 Thanks,
 Roman.



Re: [DISCUSS] stabilizing Hadoop releases wrt. downstream

2013-03-07 Thread Ted Yu
Thanks Bobby.

HBase trunk can build upon 2.0 SNAPSHOT so that regressions can be detected
early.

On Tue, Mar 5, 2013 at 7:18 AM, Robert Evans ev...@yahoo-inc.com wrote:

 That is a great point.  I have been meaning to set up the Jenkins build
 for branch-2 for a while, so I took the 10 mins and just did it.

 https://builds.apache.org/job/Hadoop-Common-2-Commit/

 Don't let the name fool you, it publishes not just common, but HDFS, YARN,
 MR, and tools too.  You should now have branch-2 SNAPSHOTS updated on each
 commit to branch-2.  Feel free to bug me if you need more integration
 points.  I am not an RE guy, but I can hack it to make things work :)

 --Bobby

 On 3/5/13 12:15 AM, Konstantin Boudnik c...@apache.org wrote:

 Arun,
 
 first of all, I don't think anyone is trying to put a blame on someone
 else. E.g. I had similar experience with Oozie being broken because of
 certain released changes in the upstream.
 
 I am sure that most people in BigTop community - especially those who
 share the committer-ship privilege in BigTop and other upstream
 projects, including Hadoop, - would be happy to help with the
 stabilization of the Hadoop base. The issue that a downstream
  integration project is likely to have is - for one - the absence of
  regularly published development artifacts. In the spirit of "it didn't
  happen if there's no picture", here are a couple of examples:
 
   - 2.0.2-SNAPSHOT weren't published at all; only release 2.0.2-alpha
 artifacts were
   - 2.0.3-SNAPSHOT weren't published until Feb 29, 2013 (it happened just
 once)
 
 So, technically speaking, unless an integration project is willing to
 build and maintain its own artifacts, it is impossible to do any
 preventive validation.
 
 Which brings me to my next question: how do you guys address
 Integration is high on the list of *every* release. Again, please
 don't get me wrong - I am not looking to lay a blame on or corner
 anyone - I am really curious and would appreciate the input.
 
 
 Vinod:
 
  As you yourself noted later, the pain is part of the 'alpha' status
  of the release. We are targeting +one of the immediate future
  releases to be a beta and so these troubles are really only the
  short +term.
 
 I don't really want to get into the discussion about of what
 constitutes the alpha and how it has delayed the adoption of Hadoop2
 line. However, I want to point out that it is especially important for
 alpha platform to work nicely with downstream consumers of the said
 platform. For quite obvious reasons, I believe.
 
  I think there is a fundamental problem with the interaction of
  Bigtop with the downstream projects, if nothing else, with
 
 BigTop is as downstream as it can get, because BigTop essentially
 consumes all other component releases in order to produce a viable
 stack. Technicalities aside...
 
  Hadoop. We never formalized on the process, will BigTop step in
  after an RC is up for vote or before? As I see it, it's happening
 
  Bigtop essentially can give any component, including Hadoop, and
  better yet - the set of components - certain guarantees about
  compatibility and dependencies being included. A case in point is the
  commons libraries missing from the 1.0.1 release, which essentially
  prevented HBase from working properly.
 
  after the vote is up, so no wonder we are in this state. Shall we
  have a pre-notice to Bigtop so that it can step in before?
 
  The above is in contradiction with the earlier statement of "Integration
  is high on the list of *every* release". If BigTop isn't used for
  integration testing, then how is said integration testing performed?
  Is it some sort of test-patch process, as Luke referred to earlier?  And
  why does it leave room for integration issues to go uncaught?
  Again, I am genuinely interested to know.
 
  these short term pains. I'd rather like us swim through these now
  instead of support broken APIs and features in our beta, having seen
  this very thing happen with 1.*.
 
 I think you're mixing the point of integration with downstream and
 being in an alpha phase of the development. The former isn't about
  supporting broken APIs - it is about being consistent and avoiding
  breaking the downstream applications without letting said applications
  accommodate the platform changes first.
 
 Changes in the API, after all, can be relatively easy traced by
 integration validation - this is the whole point of integration
  testing. And BigTop does the job better than anything around, simply
 because there's nothing else around to do it.
 
 If you stay in shape-shifting alpha that doesn't integrate well for
 a very long time, you risk to lose downstream customers' interest,
 because they might get tired of waiting until a next stable API will
 be ready for them.
 
  Let's fix the way the release related communication is happening
  across our projects so that we can all work together and make 2.X a
  success.
 
 This is a very good point indeed! Let's start a separate discussion
 thread 

Re: [VOTE] Release hadoop-0.23.2-rc0

2012-03-29 Thread Ted Yu
What are the issues fixed / features added in 0.23.2 compared to 0.23.1 ?

Thanks

On Thu, Mar 29, 2012 at 3:45 PM, Arun C Murthy a...@hortonworks.com wrote:

 I've created a release candidate for hadoop-0.23.2 that I would like to
 release.

 It is available at: http://people.apache.org/~acmurthy/hadoop-0.23.2-rc0/

 The maven artifacts are available via repository.apache.org.

 Please try the release and vote; the vote will run for the usual 7 days.

 thanks,
 Arun

 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/





Re: [VOTE] Rename hadoop branches post hadoop-1.x

2012-03-22 Thread Ted Yu
1, 2, 5 (non-binding)

On Mon, Mar 19, 2012 at 5:13 PM, Arun C Murthy a...@hortonworks.com wrote:

 We've discussed several options:

 (1) Rename branch-0.22 to branch-2, rename branch-0.23 to branch-3.
 (2) Rename branch-0.23 to branch-3, keep branch-0.22 as-is i.e. leave a
 hole.
 (3) Rename branch-0.23 to branch-2, keep branch-0.22 as-is.
 (4) If security is fixed in branch-0.22 within a short time-frame i.e. 2
 months then we get option 1, else we get option 2. Effectively postpone
 discussion by 2 months, start a timer now.
 (5) Do nothing, keep branch-0.22 and branch-0.23 as-is.

 Let's do a STV [1] to get reach consensus.

 Please vote by listing the options above in order of your preferences.

 thanks,
 Arun

 [1] http://en.wikipedia.org/wiki/Single_transferable_vote




HBase trunk on 0.23 Was: Update on hadoop-0.23

2011-09-28 Thread Ted Yu
Minor correction to Todd's report: currently HBase TRUNK doesn't compile
against 0.23 (
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/42/console
):

[ERROR] 
/home/hudson/hudson-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java:[35,38]
cannot find symbol
[ERROR] symbol  : class FSConstants
[ERROR] location: package org.apache.hadoop.hdfs.protocol
[ERROR]

Cheers

On Tue, Sep 27, 2011 at 12:56 PM, Todd Lipcon t...@cloudera.com wrote:

 Hi all,

 Just an update from the HBase side: I've run some cluster tests on
 HDFS 0.23 (as of about a month ago) and it generally works well.
 Performance for some workloads is ~2x due to HDFS-941, and can be
 improved a bit more if I finish HDFS-2080 in time. I did not do
 extensive failure testing (to stress the new append/sync code) but I
 do plan to do that in the coming months.

 HBase trunk can compile against 0.23 by using -Dhadoop23 on the maven
 build. Currently some 15 or so tests are failing - the following HBase
 JIRA tracks those issues:
 https://issues.apache.org/jira/browse/HBASE-4254

 (these may be indicative of HDFS side bugs)

 Any help there from the community would be appreciated!

 -Todd

 On Tue, Sep 27, 2011 at 12:24 PM, Roman Shaposhnik r...@apache.org wrote:
  Hi Arun!
 
  Thanks for the quick reply!
 
  I'm sorry if I had too many questions in my original email, but I can't
 find
  an answer to my integration tests question. Could you, please, share
  a URL with us where I can find out more about them?
 
  On Mon, Sep 26, 2011 at 11:20 PM, Arun C Murthy a...@hortonworks.com
 wrote:
  # We made changes to Pig - rather we got help from the Pig team,
 particularly Daniel.
 
  So, we plan to work through the rest of the stack - Hive, Oozie etc.
 very soon and we'll
  depend on updated releases from the individual projects.
 
  Do we have any kinds of commitment from downstream projects as far as
 those
  updates are concerned? Are they targeting these changes as part of point
 (patch)
  release of an already released version (like Pig 0.9.X for example) or
  will it be
  part of a brand new major release?
 
  Thanks,
  Roman.
 



 --
 Todd Lipcon
 Software Engineer, Cloudera



Re: Welcoming Alejandro Abdelnur as a Hadoop Committer

2011-09-26 Thread Ted Yu
Alejandro has been making contributions to HBase as well.

Congratulations Alejandro !

On Mon, Sep 26, 2011 at 9:21 AM, Tom White t...@cloudera.com wrote:

 On behalf of the PMC, I am pleased to announce that Alejandro Abdelnur
 has been elected a committer in the Apache Hadoop Common, HDFS, and
 MapReduce projects. We appreciate all the work Alejandro has put into
 the project so far, and look forward to his future contributions.

 Welcome, Alejandro!

 Cheers,
 Tom



Re: Welcoming Harsh J as a Hadoop committer

2011-09-16 Thread Ted Yu
Harsh definitely deserves this honor.

Cheers

On Thu, Sep 15, 2011 at 11:36 PM, Aaron T. Myers a...@cloudera.com wrote:

 Congratulations, Harsh! Very well-deserved.

 --
 Aaron T. Myers
 Software Engineer, Cloudera



Re: Hadoop's Internationalization

2011-06-25 Thread Ted Yu
Marcos:
Which hadoop version(s) do you plan to work on ?

 Where I can find the sources of the docs?

In the latest TRUNK, you would find these directories:

./common/src/docs
./hdfs/src/c++/libhdfs/docs
./hdfs/src/docs
./mapreduce/src/docs

Cheers

On Sat, Jun 25, 2011 at 8:17 AM, Marcos Ortiz mlor...@uci.cu wrote:

  OK, Owen, where can I find the sources of the docs?
 Which is the format for the docs? DocBook, ReST, etc?

  On 6/25/2011 4:38 AM, Owen O'Malley wrote:

   On Fri, Jun 24, 2011 at 7:55 PM, Marcos Ortiz mlor...@uci.cu wrote:



 Regards to all the list.
 I'm looking for a proper way to work on the internationalization of Hadoop,
 but I don't know if this is a good project or if this is useful for the
 community. At least, I think that it would be very useful for many people
 that want to see the messages of the project in another language, for
 example, in Spanish.



  I think it would be very useful, but a very time-consuming project. I'd
  suggest that you start by translating the documentation for the release
  into another language first.

 -- Owen




 --
 Marcos Luís Ortíz Valmaseda
  Software Engineer (UCI)
   http://marcosluis2186.posterous.com
   http://twitter.com/marcosluis2186




Re: hbase

2011-04-10 Thread Ted Yu
Mapreduce isn't required.
For your query, Hive would be a better fit.

On Sun, Apr 10, 2011 at 5:36 AM, Mag Gam magaw...@gmail.com wrote:

 Just curious, does hbase require mapreduce? Basically, I have several
  terabytes of data and I would like to query it in a SQL-like fashion.
  I was wondering if MapReduce was required.



Re: How to pass a parameter to map ?

2011-03-17 Thread Ted Yu
 Can I pass a parameter to map when I configure a job?
You can utilize the Hadoop Configuration (JobConf).
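
A minimal sketch of that approach with the old mapred API (the property name
"my.select.param" and the filtering logic are illustrative assumptions, not
part of the original question):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Driver side: store the parameter in the job configuration, e.g.
//   JobConf conf = new JobConf(MyJob.class);
//   conf.set("my.select.param", "someValue");
public class ParamMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private String selectParam;

  public void configure(JobConf job) {
    // Each map task reads the parameter back from the job configuration.
    selectParam = job.get("my.select.param", "default");
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Use the parameter to decide what to emit (illustrative logic only).
    if (value.toString().contains(selectParam)) {
      output.collect(new Text(selectParam), value);
    }
  }
}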

On Thu, Mar 17, 2011 at 1:15 PM, Alessandro Binhara binh...@gmail.com wrote:

 
 
 
  I need to select data in map...
  Can I pass a parameter to map when I configure a job?

  Other questions:
  I need to sort data and save it in many files. The name of each file is a
  sort key...?

 thanks ..



Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?

2010-12-22 Thread Ted Yu
 That's why I think we should go to 0.22 ASAP and get companies to build
their new features on trunk against that.

There was a thread in Nov - 'Caution using Hadoop 0.21'
It would be helpful to see response to 0.22


 
  Thanks for getting the discussion off the ground,
  St.Ack




Re: WARNING : There are about 1 missing blocks. Please check the log or run fsck.

2010-08-26 Thread Ted Yu
Run fsck.

On Thu, Aug 26, 2010 at 5:56 AM, vaibhav negi sssena...@gmail.comwrote:

 Hi ,

 I am using hadoop version : 0.20.2 . I am getting this error message.

 WARNING : There are about 1 missing blocks. Please check the log or run
 fsck.

  What should I do? Is there any problem in the cluster?

 Thanks and Best Regards

 Vaibhav Negi



Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset

2010-08-26 Thread Ted Yu
This would imply hadoop-0.20-security-append or hadoop-0.20-append-security
release be created which contains security and append features.

On Thu, Aug 26, 2010 at 4:22 PM, Arun C Murthy a...@yahoo-inc.com wrote:


 On Aug 26, 2010, at 12:08 PM, Stack wrote:

  On Mon, Aug 23, 2010 at 5:27 PM, Arun C Murthy a...@yahoo-inc.com wrote:

 In the interim I'd like to propose we push a hadoop-0.20-security release
 off the Yahoo! patchset (http://github.com/yahoo/hadoop-common). This
 will
 ensure the community benefits from all the work done at Yahoo! for over
 12
 months *now*, and ensures that we do not have to wait until hadoop-0.22
 which has all of these patches.


 Sounds good to me.  What will this release be called?
  hadoop-0.20.3-security?


 hadoop-0.20-security. I want to ensure hadoop-0.20 be a separate line, so
 as to not confuse people.



   Conceivably, one could imagine a Hadoop Security + Append release soon
 after.


 Well, it'd probably be better if we just did an append release first?
 A good few of us have been banging on the 0.20-append branch w/ a
 while now and its for sure doing append better than 0.20 did (smile).


 I think these are orthogonal and both can run their own course.

 Arun



Re: Child processes on datanodes/task trackers

2010-08-25 Thread Ted Yu
Use jps to find out the pid of the Child process.
Then use this to find out which job the Child belongs to:
ps aux | grep <pid>

On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote:

 Hi,

 I wanted to know why I see running Child processes on my datanodes even
 though
 there is no job running at that time. Are these left over from failed
 attempts?

 Is there anything I can do to keep these clean?

 Thanks,
 Deepika





Re: Child processes on datanodes/task trackers

2010-08-25 Thread Ted Yu
After you obtain the pid, you can use jstack to see what the Child process was
doing.

What hadoop version are you using ?

On Wed, Aug 25, 2010 at 7:28 PM, C J c.josh...@yahoo.com wrote:

 Thanks for your reply.

 Some of these child tasks belong to successful jobs. I am wondering why
 they are
 still hanging there for long finished jobs.





 
 From: Ted Yu yuzhih...@gmail.com
 To: general@hadoop.apache.org
 Sent: Wed, August 25, 2010 4:17:38 PM
 Subject: Re: Child processes on datanodes/task trackers

 Use jps to find out pid of the Child.
 Then use this to find out which job the Child belongs to:
 ps aux | grep pid

 On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote:

  Hi,
 
  I wanted to know why I see running Child processes on my datanodes even
  though
  there is no job running at that time. Are these left over from failed
  attempts?
 
  Is there anything I can do to keep these clean?
 
  Thanks,
  Deepika
 
 
 







Re: Child processes on datanodes/task trackers

2010-08-25 Thread Ted Yu
I don't use ehcache.
Did you forget to close CacheManager at the end of your job by any chance ?

On Wed, Aug 25, 2010 at 7:59 PM, C J c.josh...@yahoo.com wrote:

 Thanks Ted!

 I did a jstack and it seems there is an issue with ehcache that I am using
 in
 the mapper task.


 net.sf.ehcache.cachemana...@57ac3379 daemon prio=10
 tid=0x59180800
 nid=0x379e in Object.wait() [0x41506000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x2aaabb0b89a8 (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Timer.java:509)
- locked 0x2aaabb0b89a8 (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:462)

   Locked ownable synchronizers:
- None
 .
 .
 .


 The hadoop version I am using is 0.20.2.

 Thanks.



 
 From: Ted Yu yuzhih...@gmail.com
 To: general@hadoop.apache.org
 Sent: Wed, August 25, 2010 7:34:35 PM
 Subject: Re: Child processes on datanodes/task trackers

 After you obtain pid, you can use jstack to see what the Child process was
 doing.

 What hadoop version are you using ?

 On Wed, Aug 25, 2010 at 7:28 PM, C J c.josh...@yahoo.com wrote:

  Thanks for your reply.
 
  Some of these child tasks belong to successful jobs. I am wondering why
  they are
  still hanging there for long finished jobs.
 
 
 
 
 
  
  From: Ted Yu yuzhih...@gmail.com
  To: general@hadoop.apache.org
  Sent: Wed, August 25, 2010 4:17:38 PM
  Subject: Re: Child processes on datanodes/task trackers
 
  Use jps to find out pid of the Child.
  Then use this to find out which job the Child belongs to:
  ps aux | grep pid
 
  On Wed, Aug 25, 2010 at 12:20 PM, C J c.josh...@yahoo.com wrote:
 
   Hi,
  
   I wanted to know why I see running Child processes on my datanodes even
   though
   there is no job running at that time. Are these left over from failed
   attempts?
  
   Is there anything I can do to keep these clean?
  
   Thanks,
   Deepika
  
  
  
 
 
 
 
 







Re: Lazy initialization of Reducers

2010-07-21 Thread Ted Yu
I don't find such a parameter in 0.20.2.

Please create such a flag in your own class.
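
A minimal sketch of that flag-based approach with the old 0.20 mapred API
(loadBusinessData() is a hypothetical stand-in for the expensive load
described in the quoted document):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class LazyInitReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private boolean initialized = false;
  private Object businessData;  // heavy in-memory state for the business logic

  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Load the data on the first reduce() call instead of in configure(),
    // so reduce tasks that never receive records pay nothing.
    if (!initialized) {
      businessData = loadBusinessData();
      initialized = true;
    }
    while (values.hasNext()) {
      output.collect(key, values.next());
    }
  }

  private Object loadBusinessData() {
    // hypothetical placeholder for loading the ~100 MB of lookup data
    return new Object();
  }
}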

On Wed, Jul 21, 2010 at 10:15 AM, Syed Wasti mdwa...@hotmail.com wrote:


 Hi,

 I read about this Reducer Lazy initialization in a document found in the
 below URL.

 http://www.scribd.com/doc/23046928/Hadoop-Performance-Tuning



  It says: “In M/R job Reducers are initialized with Mappers at the job
 initialization, but the reduce method is called in reduce phase when all the
 maps had been finished. So in large jobs where Reducer loads data (100 MB
 for business logic) in-memory on initialization, the performance can be
 increased by lazily initializing Reducers i.e. loading data in reduce method
 controlled by an initialize flag variable which assures that it is loaded
 only once. By lazily initializing Reducers which require memory (for
 business logic) on initialization, number of maps can be increased.”



 But I did not find any other resource which talks about Reducer Lazy
 initialization.

 Does anyone have experience on this ?

 If yes, how and where can I set this parameter to get it working.



 Thanks for the support.


 Regards
 Syed Wasti





Re: Hadoop and XML

2010-07-20 Thread Ted Yu
Interesting.
String class is able to handle this scenario:

  public String(byte[] data, String encoding) throws UnsupportedEncodingException {
      this(data, 0, data.length, encoding);
  }



On Tue, Jul 20, 2010 at 6:01 AM, Jeff Bean jwfb...@cloudera.com wrote:

 I think the problem is here:

 String valueString = new String(valueText.getBytes(), "UTF-8");

 Javadoc for Text says:

 getBytes()
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html#getBytes%28%29
 Returns the raw bytes; however, only data up to getLength()
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html#getLength%28%29
 is valid.

 So try getting the length, truncating the byte array at the value returned
 by getLength() and THEN converting it to a String.

 Jeff

 On Mon, Jul 19, 2010 at 9:08 AM, Ted Yu yuzhih...@gmail.com wrote:

  For your initial question on Text.set().
  Text.setCapacity() allocates new byte array. Since keepData is false, old
  data wouldn't be copied over.
 
  On Mon, Jul 19, 2010 at 8:01 AM, Peter Minearo 
  peter.mine...@reardencommerce.com wrote:
 
   I am already using XmlInputFormat.  The input into the Map phase is not
    the problem.  The problem lies in between the Map and Reduce phases.
  
   BTW - The article is correct.  DO NOT USE StreamXmlRecordReader.
   XmlInputFormat is a lot faster.  From my testing, StreamXmlRecordReader
   took 8 minutes to read a 1 GB XML document; where as, XmlInputFormat
 was
   under 2 minutes. (Using 2 Core, 8GB machines)
  
  
   -Original Message-
   From: Ted Yu [mailto:yuzhih...@gmail.com]
   Sent: Friday, July 16, 2010 9:44 PM
   To: general@hadoop.apache.org
   Subject: Re: Hadoop and XML
  
   From an earlier post:
   http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html
  
   On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo 
   peter.mine...@reardencommerce.com wrote:
  
Moving the variable to a local variable did not seem to work:
   
   
 </PrivateRateSet>vateRateSet
   
   
   
 public void map(Object key, Object value, OutputCollector output,
                 Reporter reporter) throws IOException {
     Text valueText = (Text) value;
     String valueString = new String(valueText.getBytes(), "UTF-8");
     String keyString = getXmlKey(valueString);
     Text returnKeyText = new Text();
     Text returnValueText = new Text();
     returnKeyText.set(keyString);
     returnValueText.set(valueString);
     output.collect(returnKeyText, returnValueText);
 }
   
-Original Message-
From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
Sent: Fri 7/16/2010 2:51 PM
To: general@hadoop.apache.org
Subject: RE: Hadoop and XML
   
 Whoops... right after I sent it and someone else made a suggestion; I
realized what question 2 was about.  I can try that, but wouldn't
 that
  
cause Object bloat?  During the Hadoop training I went through; it
 was
  
mentioned to reuse the returning Key and Value objects to keep the
number of Objects created down to a minimum.  Is this not really a
valid point?
   
   
   
-Original Message-
From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
Sent: Friday, July 16, 2010 2:44 PM
To: general@hadoop.apache.org
Subject: RE: Hadoop and XML
   
   
I am not using multi-threaded Map tasks.  Also, if I understand your
second question correctly:
Also can you try creating the output key and values in the map
method(method lacal) ?
In the first code snippet I am doing exactly that.
   
Below is the class that runs the Job.
   
public class HadoopJobClient {

    private static final Log LOGGER =
        LogFactory.getLog(Prds.class.getName());

    public static void main(String[] args) {
        JobConf conf = new JobConf(Prds.class);

        conf.set("xmlinput.start", "<PrivateRateSet>");
        conf.set("xmlinput.end", "</PrivateRateSet>");

        conf.setJobName("PRDS Parse");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        conf.setMapperClass(PrdsMapper.class);
        conf.setReducerClass(PrdsReducer.class);

        conf.setInputFormat(XmlInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Run the job
        try {
            JobClient.runJob(conf);
        } catch (IOException e) {
            LOGGER.error(e.getMessage(), e

Re: Hadoop and XML

2010-07-20 Thread Ted Yu
I also added Peter's comment to the JIRA I logged:
https://issues.apache.org/jira/browse/HADOOP-6868

On Tue, Jul 20, 2010 at 9:38 AM, Ted Yu yuzhih...@gmail.com wrote:

 So the correct call should be:
 String valueString = new String(valueText.getBytes(), 0,
  valueText.getLength(), "UTF-8");
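
 Folding that into a mapper that still reuses its output objects (a sketch
 only; the class name and getXmlKey() here are illustrative):

 import java.io.IOException;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapred.MapReduceBase;
 import org.apache.hadoop.mapred.Mapper;
 import org.apache.hadoop.mapred.OutputCollector;
 import org.apache.hadoop.mapred.Reporter;

 public class XmlKeyMapper extends MapReduceBase
         implements Mapper<Object, Text, Text, Text> {

     // Reused across map() calls; set() overwrites their contents.
     private final Text outKey = new Text();
     private final Text outValue = new Text();

     public void map(Object key, Text value, OutputCollector<Text, Text> output,
             Reporter reporter) throws IOException {
         // Decode only the first getLength() bytes of the reused value buffer.
         String xml = new String(value.getBytes(), 0, value.getLength(), "UTF-8");
         outKey.set(getXmlKey(xml));
         outValue.set(xml);
         output.collect(outKey, outValue);
     }

     private String getXmlKey(String xml) {
         // Placeholder: pull out whatever element the job sorts on.
         return xml;
     }
 }

 Reusing outKey and outValue is safe because collect() serializes them before
 returning, so this keeps the object count down without the truncation problem.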

 Cheers


 On Tue, Jul 20, 2010 at 9:23 AM, Jeff Bean jwfb...@cloudera.com wrote:

 data.length is the length of the byte array.

 Text.getLength() most likely returns a different value than
  getBytes().length.

 Hadoop reuses box class objects like Text, so what it's probably doing is
 writing over the byte array, lengthening it as necessary, and just
 updating
 a separate length attribute.
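
  A small, self-contained demo of that behaviour (the sample values are made
  up; compile it against the 0.20 hadoop-core jar):

  import org.apache.hadoop.io.Text;

  public class TextReuseDemo {
      public static void main(String[] args) throws Exception {
          Text t = new Text("<PrivateRateSet>a much longer payload</PrivateRateSet>");
          // Reuse the same Text for a shorter value, the way the framework
          // reuses the value object between map() calls:
          t.set("<PrivateRateSet/>".getBytes("UTF-8"));

          System.out.println(t.getLength());        // length of the short value only
          System.out.println(t.getBytes().length);  // still sized for the longer value
          System.out.println(new String(t.getBytes(), "UTF-8"));                   // trailing garbage
          System.out.println(new String(t.getBytes(), 0, t.getLength(), "UTF-8")); // clean
      }
  }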

 Jeff

 On Tue, Jul 20, 2010 at 8:56 AM, Ted Yu yuzhih...@gmail.com wrote:

  Interesting.
  String class is able to handle this scenario:
 
   348   public String(byte[] data, String encoding) throws
  UnsupportedEncodingException {
   349   this(data, 0, data.length, encoding);
   350   }
 
 
 
  On Tue, Jul 20, 2010 at 6:01 AM, Jeff Bean jwfb...@cloudera.com
 wrote:
 
   I think the problem is here:
  
    String valueString = new String(valueText.getBytes(), "UTF-8");
  
   Javadoc for Text says:
  
   *getBytes
  
 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html#getBytes%28%29
   
   *()
Returns the raw bytes; however, only data up to
   getLength()
  
 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html#getLength%28%29
   is
   valid.
  
   So try getting the length, truncating the byte array at the value
  returned
   by getLength() and THEN converting it to a String.
  
   Jeff
  
   On Mon, Jul 19, 2010 at 9:08 AM, Ted Yu yuzhih...@gmail.com wrote:
  
For your initial question on Text.set().
 Text.setCapacity() allocates a new byte array. Since keepData is false, old
 data wouldn't be copied over.
   
On Mon, Jul 19, 2010 at 8:01 AM, Peter Minearo 
peter.mine...@reardencommerce.com wrote:
   
 I am already using XmlInputFormat.  The input into the Map phase
 is
  not
  the problem.  The problem lies between the Map and Reduce
  phases.

 BTW - The article is correct.  DO NOT USE StreamXmlRecordReader.
 XmlInputFormat is a lot faster.  From my testing,
  StreamXmlRecordReader
  took 8 minutes to read a 1 GB XML document, whereas XmlInputFormat was
  under 2 minutes. (Using 2-core, 8GB machines)


 -Original Message-
 From: Ted Yu [mailto:yuzhih...@gmail.com]
 Sent: Friday, July 16, 2010 9:44 PM
 To: general@hadoop.apache.org
 Subject: Re: Hadoop and XML

 From an earlier post:

  http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html

 On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo 
 peter.mine...@reardencommerce.com wrote:

  Moving the variable to a local variable did not seem to work:
 
 
   </PrivateRateSet>vateRateSet
 
 
 
  public void map(Object key, Object value, OutputCollector
 output,
  Reporter
  reporter) throws IOException {
 Text valueText = (Text)value;
 String valueString = new
  String(valueText.getBytes(),
   "UTF-8");
 String keyString = getXmlKey(valueString);
  Text returnKeyText = new Text();
 Text returnValueText = new Text();
 returnKeyText.set(keyString);
 returnValueText.set(valueString);
 output.collect(returnKeyText, returnValueText); }
 
  -Original Message-
  From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
  Sent: Fri 7/16/2010 2:51 PM
  To: general@hadoop.apache.org
  Subject: RE: Hadoop and XML
 
  Whoops... right after I sent it and someone else made a suggestion; I
  realized what question 2 was about.  I can try that, but wouldn't that
  cause Object bloat?  During the Hadoop training I went through, it was
  mentioned to reuse the returning Key and Value objects to keep the
  number of Objects created down to a minimum.  Is this not really a
  valid point?
 
 
 
  -Original Message-
  From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
  Sent: Friday, July 16, 2010 2:44 PM
  To: general@hadoop.apache.org
  Subject: RE: Hadoop and XML
 
 
  I am not using multi-threaded Map tasks.  Also, if I understand
  your
  second question correctly:
  Also can you try creating the output key and values in the map
   method (method local)?
  In the first code snippet I am doing exactly that.
 
  Below is the class that runs the Job.
 
  public class HadoopJobClient {
 
 private static final Log LOGGER =
  LogFactory.getLog(Prds.class.getName());
 
 public static void

Re: Hadoop and XML

2010-07-19 Thread Ted Yu
For your initial question on Text.set().
Text.setCapacity() allocates a new byte array. Since keepData is false, old
data wouldn't be copied over.

On Mon, Jul 19, 2010 at 8:01 AM, Peter Minearo 
peter.mine...@reardencommerce.com wrote:

 I am already using XmlInputFormat.  The input into the Map phase is not
  the problem.  The problem lies between the Map and Reduce phases.

 BTW - The article is correct.  DO NOT USE StreamXmlRecordReader.
 XmlInputFormat is a lot faster.  From my testing, StreamXmlRecordReader
  took 8 minutes to read a 1 GB XML document, whereas XmlInputFormat was
  under 2 minutes. (Using 2-core, 8GB machines)


 -Original Message-
 From: Ted Yu [mailto:yuzhih...@gmail.com]
 Sent: Friday, July 16, 2010 9:44 PM
 To: general@hadoop.apache.org
 Subject: Re: Hadoop and XML

 From an earlier post:
 http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html

 On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo 
 peter.mine...@reardencommerce.com wrote:

  Moving the variable to a local variable did not seem to work:
 
 
   </PrivateRateSet>vateRateSet
 
 
 
  public void map(Object key, Object value, OutputCollector output,
  Reporter
  reporter) throws IOException {
 Text valueText = (Text)value;
 String valueString = new String(valueText.getBytes(),
   "UTF-8");
 String keyString = getXmlKey(valueString);
  Text returnKeyText = new Text();
 Text returnValueText = new Text();
 returnKeyText.set(keyString);
 returnValueText.set(valueString);
 output.collect(returnKeyText, returnValueText); }
 
  -Original Message-
  From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
  Sent: Fri 7/16/2010 2:51 PM
  To: general@hadoop.apache.org
  Subject: RE: Hadoop and XML
 
  Whoops... right after I sent it and someone else made a suggestion; I
  realized what question 2 was about.  I can try that, but wouldn't that
  cause Object bloat?  During the Hadoop training I went through, it was
  mentioned to reuse the returning Key and Value objects to keep the
  number of Objects created down to a minimum.  Is this not really a
  valid point?
 
 
 
  -Original Message-
  From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
  Sent: Friday, July 16, 2010 2:44 PM
  To: general@hadoop.apache.org
  Subject: RE: Hadoop and XML
 
 
  I am not using multi-threaded Map tasks.  Also, if I understand your
  second question correctly:
  Also can you try creating the output key and values in the map
   method (method local)?
  In the first code snippet I am doing exactly that.
 
  Below is the class that runs the Job.
 
  public class HadoopJobClient {
 
 private static final Log LOGGER =
  LogFactory.getLog(Prds.class.getName());
 
 public static void main(String[] args) {
 JobConf conf = new JobConf(Prds.class);
 
  conf.set("xmlinput.start", "<PrivateRateSet>");
  conf.set("xmlinput.end", "</PrivateRateSet>");

  conf.setJobName("PRDS Parse");
 
 conf.setOutputKeyClass(Text.class);
 conf.setOutputValueClass(Text.class);
 
 conf.setMapperClass(PrdsMapper.class);
 conf.setReducerClass(PrdsReducer.class);
 
 conf.setInputFormat(XmlInputFormat.class);
 conf.setOutputFormat(TextOutputFormat.class);
 
 FileInputFormat.setInputPaths(conf, new Path(args[0]));
 FileOutputFormat.setOutputPath(conf, new
  Path(args[1]));
 
 // Run the job
 try {
 JobClient.runJob(conf);
 } catch (IOException e) {
 LOGGER.error(e.getMessage(), e);
 }
 
 }
 
 
  }
 
 
 
 
  -Original Message-
  From: Soumya Banerjee [mailto:soumya.sbaner...@gmail.com]
  Sent: Fri 7/16/2010 2:29 PM
  To: general@hadoop.apache.org
  Subject: Re: Hadoop and XML
 
  Hi,
 
  Can you please share the code of the job submission client ?
 
  Also can you try creating the output key and values in the map
   method (method local)?
  Make sure you are not using multi threaded map task configuration.
 
  map()
  {
  private Text keyText = new Text();
   private Text valueText = new Text();
 
  //rest of the code
  }
 
  Soumya.
 
  On Sat, Jul 17, 2010 at 2:30 AM, Peter Minearo 
  peter.mine...@reardencommerce.com wrote:
 
   I have an XML file that has sparse data in it.  I am running a
   MapReduce Job that reads in an XML file, pulls out a Key from within

   the XML snippet and then hands back the Key and the XML snippet (as
   the Value) to the OutputCollector.  The reason is to sort the file
  back into order.
   Below is the snippet of code.
  
   public class XmlMapper extends MapReduceBase implements Mapper {
  
private Text keyText = new Text();
private Text valueText = new

Re: Hadoop and XML

2010-07-16 Thread Ted Yu
From an earlier post:
http://oobaloo.co.uk/articles/2010/1/20/processing-xml-in-hadoop.html

On Fri, Jul 16, 2010 at 3:07 PM, Peter Minearo 
peter.mine...@reardencommerce.com wrote:

 Moving the variable to a local variable did not seem to work:


  </PrivateRateSet>vateRateSet



 public void map(Object key, Object value, OutputCollector output, Reporter
 reporter) throws IOException {
Text valueText = (Text)value;
String valueString = new String(valueText.getBytes(),
  "UTF-8");
String keyString = getXmlKey(valueString);
 Text returnKeyText = new Text();
Text returnValueText = new Text();
returnKeyText.set(keyString);
returnValueText.set(valueString);
output.collect(returnKeyText, returnValueText);
 }

 -Original Message-
 From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
 Sent: Fri 7/16/2010 2:51 PM
 To: general@hadoop.apache.org
 Subject: RE: Hadoop and XML

  Whoops... right after I sent it and someone else made a suggestion; I
  realized what question 2 was about.  I can try that, but wouldn't that
  cause Object bloat?  During the Hadoop training I went through, it was
  mentioned to reuse the returning Key and Value objects to keep the
  number of Objects created down to a minimum.  Is this not really a valid
  point?



 -Original Message-
 From: Peter Minearo [mailto:peter.mine...@reardencommerce.com]
 Sent: Friday, July 16, 2010 2:44 PM
 To: general@hadoop.apache.org
 Subject: RE: Hadoop and XML


 I am not using multi-threaded Map tasks.  Also, if I understand your
 second question correctly:
 Also can you try creating the output key and values in the map
  method (method local)?
 In the first code snippet I am doing exactly that.

 Below is the class that runs the Job.

 public class HadoopJobClient {

private static final Log LOGGER =
 LogFactory.getLog(Prds.class.getName());

public static void main(String[] args) {
JobConf conf = new JobConf(Prds.class);

 conf.set("xmlinput.start", "<PrivateRateSet>");
 conf.set("xmlinput.end", "</PrivateRateSet>");

 conf.setJobName("PRDS Parse");

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);

conf.setMapperClass(PrdsMapper.class);
conf.setReducerClass(PrdsReducer.class);

conf.setInputFormat(XmlInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);

FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));

// Run the job
try {
JobClient.runJob(conf);
} catch (IOException e) {
LOGGER.error(e.getMessage(), e);
}

}


 }




 -Original Message-
 From: Soumya Banerjee [mailto:soumya.sbaner...@gmail.com]
 Sent: Fri 7/16/2010 2:29 PM
 To: general@hadoop.apache.org
 Subject: Re: Hadoop and XML

 Hi,

 Can you please share the code of the job submission client ?

 Also can you try creating the output key and values in the map
  method (method local)?
 Make sure you are not using multi threaded map task configuration.

 map()
 {
 private Text keyText = new Text();
  private Text valueText = new Text();

 //rest of the code
 }

 Soumya.

 On Sat, Jul 17, 2010 at 2:30 AM, Peter Minearo 
 peter.mine...@reardencommerce.com wrote:

  I have an XML file that has sparse data in it.  I am running a
  MapReduce Job that reads in an XML file, pulls out a Key from within
  the XML snippet and then hands back the Key and the XML snippet (as
  the Value) to the OutputCollector.  The reason is to sort the file
 back into order.
  Below is the snippet of code.
 
  public class XmlMapper extends MapReduceBase implements Mapper {
 
   private Text keyText = new Text();
   private Text valueText = new Text();
 
   @SuppressWarnings("unchecked")
   public void map(Object key, Object value, OutputCollector output,
   Reporter reporter) throws IOException {
       Text valueText = (Text)value;
       String valueString = new String(valueText.getBytes(), "UTF-8");
       String keyString = getXmlKey(valueString);
       getKeyText().set(keyString);
       getValueText().set(valueString);
       output.collect(getKeyText(), getValueText());
   }
 
 
   public Text getKeyText() {
   return keyText;
   }
 
 
   public void setKeyText(Text keyText) {  this.keyText = keyText;  }
 
 
   public Text getValueText() {
   return valueText;
   }
 
 
   public void setValueText(Text valueText) {  this.valueText =
  valueText;  }
 
 
   private String getXmlKey(String value) {
 // Get the Key from the XML in the value.
   }
 
  }
 
  The XML snippet from the Value is fine when it is passed into the
  map() method.  I am not changing any data either, just pulling out
  

Re: Exception in thread main java.lang.ClassNotFoundException in Hadoop

2010-07-14 Thread Ted Yu
I assume you have used the jar command to confirm that org.myorg.WordCount is in
wordcount.jar.
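
If not, a quick way to check (the output shown is what you would hope to see):

$ jar tf wordcount.jar | grep WordCount
org/myorg/WordCount.class

The class file has to sit under org/myorg/ inside the jar; if the classes were
packaged from the wrong directory, RunJar will not find org.myorg.WordCount
even though the jar itself is on the command line.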

On Mon, Jul 12, 2010 at 10:59 PM, james isaac jamesisaac.d...@gmail.com wrote:

 This is the command I used, and the prompt is in the current directory
 where wordcount.jar resides. I have also set the path for the HADOOP_HOME
 and added the bin directory of hadoop to the classpath.

 current directory = /home/user/demo/wordcount
 HADOOP_HOME=/home/user/hadoop_sws/hadoop-0.20.2
 $ hadoop jar wordcount.jar org.myorg.WordCount /hdfs/data/input
 /hdfs/data/output



 On Mon, Jul 12, 2010 at 9:14 PM, Ted Yu yuzhih...@gmail.com wrote:

  Please give the commandline you used.
  You should have specified the jar containing WordCount class.
 
  On Mon, Jul 12, 2010 at 5:15 AM, james isaac jamesisaac.d...@gmail.com
  wrote:
 
   Hi,
  
   I have just started my career with Hadoop. I tried to execute the
 example
   given in
    http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html as
   explained in the tutorial. I was not able to execute the program as
   described in the tutorial. I am getting the error as shown below. I
 tried
   executing with some other examples. They are also showing the same
   “ClassNotFoundException”. I tried to find the solution in various
  websites
   but i could not find it. Please help me to find the problem.
  
    Exception in thread "main"
   java.lang.ClassNotFoundException: org.myorg.WordCount
   at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
   at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:247)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
  
 



problem starting cdh3b2 jobtracker

2010-07-01 Thread Ted Yu
We installed cdh3b2 0.20.2+320 and saw a strange error in the jobtracker log:

2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker
up at: 9001
2010-07-02 01:49:31,977 INFO org.apache.hadoop.mapred.JobTracker: JobTracker
webserver: 50030
2010-07-02 01:49:31,988 WARN org.apache.hadoop.mapred.JobTracker: Error
starting tracker: java.io.IOException: Cannot create toBeDeleted in
/data1/mapred/local
at
org.apache.hadoop.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:85)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1688)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)

2010-07-02 01:49:32,990 INFO org.apache.hadoop.mapred.JobTracker: Scheduler
configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT,
limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2010-07-02 01:49:32,991 FATAL org.apache.hadoop.mapred.JobTracker:
java.net.BindException: Problem binding to
sjc1-hadoop0.sjc1.ciq.com/10.201.8.204:9001 :
Address already in use
at org.apache.hadoop.ipc.Server.bind(Server.java:198)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:261)
at org.apache.hadoop.ipc.Server.<init>(Server.java:1043)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:492)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:454)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1628)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:199)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:191)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3765)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.apache.hadoop.ipc.Server.bind(Server.java:196)
... 8 more

2010-07-02 01:49:32,992 INFO org.apache.hadoop.mapred.JobTracker:
SHUTDOWN_MSG:

But 9001 wasn't used:
[sjc1-hadoop0.sjc1:hadoop 25618]netstat -nta | grep 9001
[sjc1-hadoop0.sjc1:hadoop 25619]netstat -nta | grep 9000
tcp0  0 10.201.8.204:9000   0.0.0.0:*
LISTEN
tcp0  0 10.201.8.204:9000   10.201.8.214:4223
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.212:49074
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.206:11910
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.210:62611
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.213:1299
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.205:9756
ESTABLISHED
tcp0  0 10.201.8.204:9000   10.201.8.207:59207
ESTABLISHED

Here is output from ifconfig:
bond0 Link encap:Ethernet  HWaddr 00:30:48:60:53:94
  inet addr:10.201.8.204  Bcast:10.201.8.255  Mask:255.255.255.0
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
  RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
  TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7
GiB)

eth0  Link encap:Ethernet  HWaddr 00:30:48:60:53:94
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:351496605 errors:0 dropped:1015 overruns:0 frame:0
  TX packets:178144953 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:119420730164 (111.2 GiB)  TX bytes:120002123131 (111.7
GiB)
  Interrupt:161

eth1  Link encap:Ethernet  HWaddr 00:30:48:60:53:94
  UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  Interrupt:169

Has anyone encountered a similar issue?


checksum error

2010-05-24 Thread Ted Yu
Hi,
We currently use hadoop 0.20.1

I found the following trace in our log:

 2010-05-17 16:19:55,389 INFO  [FSInputChecker] Found checksum error: b[0,
512]=53455106246f72672e6170616368652e6861646f6f702e696f2e426f6f6c65616e5772697461626c652b636f6d2e6361727269657269712e6d326d2e636f6d6d6f6e732e4f746155706c6f61645772697461626c6501002a6f72672e6170616368652e6861646f6f702e696f2e636f6d70726573732e44656661756c74436f646563aa555ae318451e8e76c0230f56f74d1f1466000101789cdddb09785355da00e0ef26699bae49ba2f94de2ed436857ab3364194b414294a355296e29aa449079025968ae82f78cb22051c088202e2125040d9acb88bc36410441d9dbf50dc1e06c57143749c8e0e0c7bff73ee92a6e9a7f23f33cef33f7f1f0839eff9ee3df77ef7dcedeb03130facc763b437197d76abd96237983c6e8399731b3d564395d9e2b6797c6e60b9268fdb5d55d56c31180c6693c56bb21bc9327683d5e036dadd162fb056afc76ae17cd62a8bd1e7717b0d3e6f95d5e2357b4c55be260fd76406d6e76b32992c9e2aafdb62b659bccd9cdb63b134f92c4d4d1e5b95d5e401d66430d8dc06b7d76e68f61a9acd469bdb6a37703ea3c9e33556594c1c19a5b9c964f671568bc96b34b83d1eb2850693db6d767306b3d10a6c1367367bb9660f89b47a7d469fd96771db7d06aec9cbd9c9c69351ec5683917ce38c16bbcd62b17bcc4dcd96263347f6c4d2ecf536937d31588c16a3bdb9
org.apache.hadoop.fs.ChecksumException: Checksum error:
file:/incoming_ub/ciq_dca_uploads/2010/03/26/100813/17_00.ub at 0
at
org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
at
org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
at
org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readFully(DataInputStream.java:152)
at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428)
at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417)
at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412)
at
org.apache.hadoop.mapred.SequenceFileRecordReader.init(SequenceFileRecordReader.java:43)
at
com.ciq.m2m.platform.util.hadoop.ReportingSequenceFileRecordReader.init(ReportingSequenceFileRecordReader.java:54)
at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown
Source)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at
com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.assureReader(MultiFileRecordReader.java:128)
at
com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.readNextFromAnyReader(MultiFileRecordReader.java:146)
at
com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.readNextFromAnyReader(MultiFileRecordReader.java:181)
at
com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.next(MultiFileRecordReader.java:122)
at
com.ciq.m2m.platform.util.hadoop.MultiFileRecordReader.next(MultiFileRecordReader.java:20)
at
com.ciq.m2m.platform.mmp2.input.BaseOtaUploadRecordReader.readNextUpload(BaseOtaUploadRecordReader.java:125)
at
com.ciq.m2m.platform.mmp2.input.OtaUploadToBinaryPackageRecordReader.next(OtaUploadToBinaryPackageRecordReader.java:176)
at
com.ciq.m2m.platform.mmp2.input.OtaUploadToBinaryPackageRecordReader.next(OtaUploadToBinaryPackageRecordReader.java:56)
at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)

If you have seen a similar trace before, please share your comments.

Thanks


Re: datanode goes down, maybe due to Unexpected problem in creating temporary file

2010-05-17 Thread Ted Yu
That blk doesn't appear in NameNode log.

For datanode,
2010-05-15 00:09:31,023 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_926027507678171558_3620 src: /10.32.56.170:49172 dest: /
10.32.56.171:50010
2010-05-15 00:09:31,024 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
blk_926027507678171558_3620 received exception java.io.IOException:
Unexpected problem in creating temporary file for
blk_926027507678171558_3620.  File
/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558
should not be present, but is.
2010-05-15 00:09:31,024 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
blk_-5814095875968936685_2910 received exception java.io.IOException:
Unexpected problem in creating temporary file for
blk_-5814095875968936685_2910.  File
/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_-5814095875968936685
should not be present, but is.
2010-05-15 00:09:31,025 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.32.56.171:50010,
storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for
blk_926027507678171558_3620.  File
/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558
should not be present, but is.
at
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:09:31,025 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.32.56.171:50010,
storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075,
ipcPort=50020):DataXceiver

2010-05-15 00:19:28,334 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_926027507678171558_3620 src: /10.32.56.170:36887 dest: /
10.32.56.171:50010
2010-05-15 00:19:28,334 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
blk_926027507678171558_3620 received exception java.io.IOException:
Unexpected problem in creating temporary file for
blk_926027507678171558_3620.  File
/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558
should not be present, but is.
2010-05-15 00:19:28,334 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.32.56.171:50010,
storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for
blk_926027507678171558_3620.  File
/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558
should not be present, but is.
at
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:29:25,635 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_926027507678171558_3620 src: /10.32.56.170:34823 dest: /
10.32.56.171:50010

On Mon, May 17, 2010 at 11:43 AM, Todd Lipcon t...@cloudera.com wrote:

 Hi Ted,

 Can you please grep your NN and DN logs for blk_926027507678171558 and
 pastebin the results?

 -Todd

 On Mon, May 17, 2010 at 9:57 AM, Ted Yu yuzhih...@gmail.com wrote:

  Hi,
  We use CDH2 hadoop-0.20.2+228 which crashed on datanode smsrv10.ciq.com
 
  I found this in datanode log:
 
  2010-05-15 07:37:35,955 INFO
  org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
  blk_926027507678171558_3620 src: /10.32.56.170:53378 dest: /
  10.32.56.171:50010
  2010-05-15 07:37:35,956 INFO
  org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
  blk_926027507678171558_3620 received exception java.io.IOException:
  Unexpected

Re: MultipleInputs in 0.20

2010-05-09 Thread Ted Yu
Please refer to MAPREDUCE-1743.

Another option is to duplicate the MultipleInputs and DelegatingInputFormat classes
and slightly modify TaggedInputSplit (as I suggested earlier).
This way you use your own (functional) version :-)
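
For reference, the deprecated org.apache.hadoop.mapred API (still shipped in
0.20) supports this directly; a rough sketch, where the two mappers and the
reducer are placeholders you would supply:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class TwoSourceJob {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(TwoSourceJob.class);
        conf.setJobName("two-source job");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setReducerClass(JoinReducer.class);            // placeholder reducer

        // Each input path gets its own InputFormat and its own Mapper.
        MultipleInputs.addInputPath(conf, new Path(args[0]),
                SequenceFileInputFormat.class, SeqSideMapper.class);   // placeholder
        MultipleInputs.addInputPath(conf, new Path(args[1]),
                TextInputFormat.class, TextSideMapper.class);          // placeholder

        FileOutputFormat.setOutputPath(conf, new Path(args[2]));
        JobClient.runJob(conf);
    }
}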

On Sun, May 9, 2010 at 2:08 PM, Oded Rosen o...@legolas-media.com wrote:

 From what I've learned on different sites around the web (hadoop wiki,
 cloudera
 http://www.cloudera.com/blog/2009/05/what%E2%80%99s-new-in-hadoop-core-020/
 ,
 mail archive, etc),
 the MultipleInputs class that was available in 0.18-0.19 versions of
  hadoop
  was not moved to the new API in 0.20.
 (so does MultipleOutputs, but that's another story)

 I wanted to know if there is a way around this - to use two different paths
 with two different input format (sequence file, text file) as sources to
 the
 same job,
 with a special mapper for each input type - using hadoop 0.20 API. I think
 that writing a new job using 0.19 API only means more trouble later, when
 it's officially deprecated.

 I saw there is a jira (MAPREDUCE-1170,
 https://issues.apache.org/jira/browse/MAPREDUCE-1170) open
 for this issue, with a patch marked as Won't Fix.
 If someone out there can help me with this, I will be most thankful.

 Cheers,
 --
 Oded



Re: Compilation failed when compile hadoop common release-0.20.2

2010-03-05 Thread Ted Yu
Did you first try 'ant jar' from under /hadoop_src ?
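
'ant jar' builds just the core jar and skips the Forrest documentation targets
that trigger the java5.check failure below. If the doc/package targets really
are needed, the check wants a separate JDK 5 passed on the command line, for
example (path illustrative):

  ant -Djava5.home=/usr/java/jdk1.5.0_22 package

and Apache Forrest has to be installed as well; Java 6 on its own is not enough
for those targets.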

On Fri, Mar 5, 2010 at 3:47 PM, Gary Yang garyya...@yahoo.com wrote:

 Hi,

 I am trying to compile hadoop common from release 0.20.2. Below are the error
 messages and java and ant versions I am using. Please tell me what I missed.

 ..

  [javadoc] Standard Doclet version 1.6.0_18
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...

 java5.check:

 BUILD FAILED
 /hadoop_src/common/build.xml:908: 'java5.home' is not defined.  Forrest
 requires Java 5.  Please pass -Djava5.home=base of Java 5 distribution to
 Ant on the command-line.


 echo $JAVA_HOME
 /usr/java/latest

 which java
 /usr/java/latest/bin/java

 /usr/java/latest/bin/java -version
 java version 1.6.0_18
 Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
 Java HotSpot(TM) Server VM (build 16.0-b13, mixed mode)

 ant -version
 Apache Ant version 1.7.1 compiled on June 27 2008


 Thanks,


 Gary









Re: splittable encrypted compressed files

2010-01-24 Thread Ted Yu
A bit more background: the encryption is required by a third party. Data
coming from the third party may arrive in encrypted .gz format.

Cheers

On Sun, Jan 24, 2010 at 2:34 PM, Gautam Singaraju 
gautam.singar...@gmail.com wrote:

 An encryption/decryption mechanism requires the secret key to be
 securely distributed. As far as moving data on the wire is concerned,
 SSH is already being used. I am wondering about the performance hit
 when encryption and then again decryption is applied to the data and
 what value that might add.

 ---
 Gautam



 On Tue, Jan 19, 2010 at 2:38 PM, Ted Yu yuzhih...@gmail.com wrote:
  Hi,
  I experimented with hadoop-lzo and liked it.
 
  Our requirement is changing and we may need to encrypt data files. Does
  anyone know of a splittable encrypted compression scheme that map tasks can
  use?
 
  Thanks