Hi Dhruba,

Thanks for the pointer. I'm going to try to pull this code into our internal 20-ish distro. Would you object if I contributed the result, should it be successful?

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

>________________________________
>From: Dhruba Borthakur <dhr...@gmail.com>
>To: Andrew Purtell <apurt...@apache.org>
>Cc: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
>Sent: Tuesday, September 20, 2011 2:18 AM
>Subject: Re: Need help regarding HDFS-RAID
>
>Hi Andy,
>
>We do run a version of HDFS RAID that is backported from Apache trunk to a
>0.20-based release. Our code is in
>https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/raid
>But I do not have an elegant way to contribute this code to Apache 0.20.2xx.x.
>
>thanks,
>dhruba
>
>On Sat, Sep 17, 2011 at 9:16 AM, Andrew Purtell <apurt...@apache.org> wrote:
>
>>Hi Dhruba,
>>
>>Would you consider a contribution of this to branch-0.20-security
>>aka 0.20.2xx.x?
>>
>>If I am mistaken and you do not have a 0.22-ish HDFS RAID backported to a
>>0.20-ish platform, please disregard.
>>
>>Best regards,
>>
>>   - Andy
>>
>>>________________________________
>>>From: Dhruba Borthakur <dhr...@gmail.com>
>>>To: hdfs-user@hadoop.apache.org; Andrew Purtell <apurt...@apache.org>
>>>Sent: Thursday, September 15, 2011 10:14 AM
>>>Subject: Re: Need help regarding HDFS-RAID
>>>
>>>That's right, Andy. 0.22+. We are running an HDFS-RAID code base that is
>>>pretty close to what is available in Apache hdfs trunk.
>>>
>>>-dhruba
>>>
>>>On Thu, Sep 15, 2011 at 10:08 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>>
>>>>But that is the HDFS RAID effectively in 0.22+, not 0.21, right Dhruba?
>>>>
>>>>Best regards,
>>>>
>>>>   - Andy
>>>>
>>>>>________________________________
>>>>>From: Dhruba Borthakur <dhr...@gmail.com>
>>>>>To: hdfs-user@hadoop.apache.org
>>>>>Sent: Thursday, September 15, 2011 10:06 AM
>>>>>Subject: Re: Need help regarding HDFS-RAID
>>>>>
>>>>>We use HDFS RAID in a big way. Data older than 12 days is RAIDed using
>>>>>XOR encoding (effective replication of 2.5). Data older than a few months
>>>>>is RAIDed using Reed-Solomon (effective observed replication factor of
>>>>>1.5). This has been running on our 60 PB cluster for about a year now.
>>>>>
>>>>>thanks
>>>>>dhruba
>>>>>
>>>>>On Thu, Sep 15, 2011 at 5:31 AM, Ajit Ratnaparkhi
>>>>><ajit.ratnapar...@gmail.com> wrote:
>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>We were planning to use it for past data archival (instead of moving it
>>>>>>to an archival store).
>>>>>>Archiving it in HDFS gives the advantage of making it easily available
>>>>>>for processing whenever required.
>>>>>>
>>>>>>Is there any archival solution in the Hadoop ecosystem?
>>>>>>
>>>>>>thanks,
>>>>>>Ajit.
>>>>>>
>>>>>>On Thu, Sep 15, 2011 at 5:05 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>
>>>>>>>Hey Ajit,
>>>>>>>
>>>>>>>HDFS-RAID was never part of the 0.20 release. It made its debut in the
>>>>>>>0.21 release [1]. I know that Facebook uses it (and also developed
>>>>>>>it), but I am unsure of users beyond Facebook.
>>>>>>>
>>>>>>>While 0.21 overall is not entirely deemed production-usable yet
>>>>>>>(and is, in fact, possibly abandoned in favor of efforts on 0.22+), you
>>>>>>>can give that release a whirl on a test cluster and see for yourself if
>>>>>>>your need beats the stability.
>>>>>>>
>>>>>>>Just curious though - why are you looking to use this specifically?
>>>>>>>
>>>>>>>[1] -
>>>>>>>http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21/mapreduce/src/contrib/raid/
>>>>>>>
>>>>>>>On Thu, Sep 15, 2011 at 4:37 PM, Ajit Ratnaparkhi
>>>>>>><ajit.ratnapar...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> We want to use HDFS-RAID in our production cluster.
>>>>>>>> (http://wiki.apache.org/hadoop/HDFS-RAID)
>>>>>>>> I am not able to find source/binaries/configs for this in the official
>>>>>>>> Hadoop distribution from Apache Hadoop (checked in 0.20.1 and 0.20.2).
>>>>>>>> Can somebody please tell me where I can find that, and the
>>>>>>>> installation procedure?
>>>>>>>> Also, is the HDFS-RAID implementation stable enough to use in
>>>>>>>> production?
>>>>>>>> thanks,
>>>>>>>> Ajit.
>>>>>>>
>>>>>>>--
>>>>>>>Harsh J
>>>>>
>>>>>--
>>>>>Connect to me at http://www.facebook.com/dhruba