Hi Dhruba,

Thanks for the pointer. I'm going to try to pull this code into our internal 
0.20-ish distro. Would you object if I contribute the result if it is 
successful?


Best regards,


    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
Tom White)

>________________________________
>From: Dhruba Borthakur <dhr...@gmail.com>
>To: Andrew Purtell <apurt...@apache.org>
>Cc: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
>Sent: Tuesday, September 20, 2011 2:18 AM
>Subject: Re: Need help regarding HDFS-RAID
>
>
>Hi Andy,
>
>
>We do run a version of HDFS RAID that is backported from Apache trunk to a 
>0.20-based release. Our code is in
>https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/raid
>but I do not have an elegant way to contribute this code to Apache 0.20.2xx.x. 
>
>
>thanks,
>dhruba
>
>
>On Sat, Sep 17, 2011 at 9:16 AM, Andrew Purtell <apurt...@apache.org> wrote:
>
>Hi Dhruba,
>>
>>
>>Would you consider a contribution of this to branch-0.20-security 
>>aka 0.20.2xx.x?
>>
>>
>>If I am mistaken and you do not have a 0.22-ish HDFS RAID backported to a 
>>0.20-ish platform, please disregard.
>>
>>
>>Best regards,
>>
>>
>>    - Andy
>>
>>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
>>Tom White)
>>
>>
>>>________________________________
>>>From: Dhruba Borthakur <dhr...@gmail.com>
>>>To: hdfs-user@hadoop.apache.org; Andrew Purtell <apurt...@apache.org>
>>>Sent: Thursday, September 15, 2011 10:14 AM
>>>
>>>Subject: Re: Need help regarding HDFS-RAID
>>>
>>>
>>>
>>>That's right, Andy. 0.22+. We are running an HDFS-RAID code base that is 
>>>pretty close to what is available in Apache HDFS trunk.
>>>
>>>
>>>-dhruba
>>>
>>>
>>>On Thu, Sep 15, 2011 at 10:08 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>>
>>>But that is effectively the HDFS RAID in 0.22+, not 0.21, right, Dhruba?
>>>>
>>>> 
>>>>Best regards,
>>>>
>>>>
>>>>       - Andy
>>>>
>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein 
>>>>(via Tom White)
>>>>
>>>>
>>>>>________________________________
>>>>>From: Dhruba Borthakur <dhr...@gmail.com>
>>>>>To: hdfs-user@hadoop.apache.org
>>>>>Sent: Thursday, September 15, 2011 10:06 AM
>>>>>Subject: Re: Need help regarding HDFS-RAID
>>>>>
>>>>>
>>>>>
>>>>>We use HDFS RAID in a big way. Data older than 12 days is RAIDed using 
>>>>>XOR encoding (effective replication of 2.5). Data older than a few months 
>>>>>is RAIDed using Reed-Solomon (effective observed replication factor of 
>>>>>1.5). This has been running on our 60 PB cluster for about a year now.
>>>>>
>>>>>
>>>>>thanks
>>>>>dhruba
>>>>>
>>>>>
>>>>>
>>>>>On Thu, Sep 15, 2011 at 5:31 AM, Ajit Ratnaparkhi 
>>>>><ajit.ratnapar...@gmail.com> wrote:
>>>>>
>>>>>Hi,
>>>>>>
>>>>>>
>>>>>>We were planning to use it for archiving past data (instead of moving it 
>>>>>>to an archival store).
>>>>>>Archiving it in HDFS has the advantage of keeping it easily available for 
>>>>>>processing whenever required.
>>>>>>
>>>>>>
>>>>>>Is there any archival solution in the Hadoop ecosystem?
>>>>>>
>>>>>>
>>>>>>thanks,
>>>>>>Ajit.
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Thu, Sep 15, 2011 at 5:05 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>
>>>>>>Hey Ajit,
>>>>>>>
>>>>>>>HDFS-RAID was never part of the 0.20 release. It made its debut in the
>>>>>>>0.21 release [1]. I know that Facebook uses it (and also developed
>>>>>>>it), but I am unsure of users beyond Facebook.
>>>>>>>
>>>>>>>While 0.21 overall is not yet deemed entirely production-usable
>>>>>>>(and has in fact possibly been abandoned for efforts on 0.22+), you can
>>>>>>>give that release a whirl on a test cluster and see for yourself whether
>>>>>>>your need outweighs the stability concerns.
>>>>>>>
>>>>>>>Just curious though - why are you looking to use this specifically?
>>>>>>>
>>>>>>>[1] - 
>>>>>>>http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21/mapreduce/src/contrib/raid/
>>>>>>>
>>>>>>>
>>>>>>>On Thu, Sep 15, 2011 at 4:37 PM, Ajit Ratnaparkhi
>>>>>>><ajit.ratnapar...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> We want to use HDFS-RAID in our production cluster.
>>>>>>>> (http://wiki.apache.org/hadoop/HDFS-RAID)
>>>>>>>> I am not able to find source/binaries/configs for this in the official
>>>>>>>> Apache Hadoop distribution (checked in 0.20.1 and 0.20.2).
>>>>>>>> Can somebody please tell me where I can find them, and the installation
>>>>>>>> procedure?
>>>>>>>> Also, is the HDFS-RAID implementation stable enough to use in production?
>>>>>>>> thanks,
>>>>>>>> Ajit.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>--
>>>>>>>Harsh J
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>-- 
>>>>>Connect to me at http://www.facebook.com/dhruba
>>>>>
>>>>>
>>>>>
