[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2014-11-24 Thread Vincent.Wei (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224107#comment-14224107
 ] 

Vincent.Wei commented on HDFS-503:
--

Does anybody know how to build this patch on Hadoop v2.2.0?

 Implement erasure coding as a layer on HDFS
 ---

 Key: HDFS-503
 URL: https://issues.apache.org/jira/browse/HDFS-503
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: contrib/raid
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.21.0

 Attachments: raid1.txt, raid2.txt


 The goal of this JIRA is to discuss how the cost of raw storage for an HDFS
 file system can be reduced. Keeping three copies of the same data is very
 costly, especially when the size of storage is huge. One idea is to reduce
 the replication factor and do erasure coding of a set of blocks so that the
 overall probability of failure of a block remains the same as before.
 Many forms of error-correcting codes are available, see
 http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has
 described DiskReduce:
 https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
 I would like to discuss implementation strategies that are not part of base
 HDFS, but form a layer on top of HDFS.
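
As a rough illustration of the trade-off in the description above, the following Python sketch compares raw storage per logical byte under plain 3-way replication with a scheme that lowers the replication factor and adds parity blocks. The stripe length of 10, the replication factor of 2 for data and parity, and the function name are illustrative assumptions, not values taken from this issue.

```python
def raw_bytes_per_logical_byte(data_replication, stripe_length=None,
                               parity_blocks=0, parity_replication=1):
    """Raw storage consumed per byte of user data.

    stripe_length is the number of data blocks that share one group of
    parity blocks; None means no erasure coding at all.
    """
    overhead = float(data_replication)
    if stripe_length:
        # Each stripe of `stripe_length` data blocks adds `parity_blocks`
        # parity blocks, each stored `parity_replication` times.
        overhead += parity_blocks * parity_replication / stripe_length
    return overhead


# Baseline: three full copies of every block.
plain = raw_bytes_per_logical_byte(3)

# Hypothetical layout (assumption): 2 copies of the data, 1 XOR parity block
# per 10 data blocks, with the parity block itself stored twice.
coded = raw_bytes_per_logical_byte(2, stripe_length=10,
                                   parity_blocks=1, parity_replication=2)

print(f"replication only: {plain:.2f}x, with parity: {coded:.2f}x")
```

Under these assumed numbers the raw footprint drops from 3.0 to 2.2 bytes per logical byte. Whether the failure probability really stays comparable depends on the code and the stripe size, which this sketch does not model.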



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-11-23 Thread Hemanth Makkapati (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156434#comment-13156434
]

Hemanth Makkapati commented on HDFS-503:

Hey,
I am a beginner with Hadoop and only recently started delving into the code.
While trying to get RAID up and running, I observed the following exception
in the log:

ERROR org.apache.hadoop.raid.RaidNode: java.lang.NullPointerException
at org.apache.hadoop.raid.RaidNode.tmpHarPathForCode(RaidNode.java:1491)
at org.apache.hadoop.raid.RaidNode.doHar(RaidNode.java:1217)
at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:73)
at org.apache.hadoop.raid.RaidNode$HarMonitor.run(RaidNode.java:1371)
at java.lang.Thread.run(Thread.java:636)

The cause seems to be the absence of an 'erasurecode' tag in the raid
configuration file, which in my case is very similar to the sample
configuration file provided. Once I added the tag, which can take the value
XOR or RS, I no longer saw the exception. The README file doesn't mention
such a tag either.
Please confirm whether my observation is correct.
I thought I would post it here for the benefit of others.
BTW, I checked out the code from trunk.

Thank you.
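
The failure mode described above (a missing tag surfacing later as a NullPointerException) can be avoided by validating the setting when the configuration is read. The Python sketch below only illustrates that idea. The `<policy>`/`<erasurecode>` nesting, the function name, and the error messages are assumptions for illustration and are not taken from the RaidNode source. Only the tag name and the two values XOR and RS come from the comment.

```python
import xml.etree.ElementTree as ET

ALLOWED_CODES = {"XOR", "RS"}  # the two values reported in the comment above


def erasure_code_for(policy_xml):
    """Return the erasure code named in a policy, or fail with a clear message.

    The <policy>/<erasurecode> nesting is an assumed layout for illustration;
    check the real sample configuration for the actual structure.
    """
    policy = ET.fromstring(policy_xml)
    code = policy.findtext("erasurecode")
    if code is None:
        # Fail early and explicitly instead of passing None further down.
        raise ValueError("policy %r has no 'erasurecode' tag" % policy.get("name"))
    code = code.strip().upper()
    if code not in ALLOWED_CODES:
        raise ValueError("unsupported erasure code: %s" % code)
    return code


print(erasure_code_for('<policy name="demo"><erasurecode>xor</erasurecode></policy>'))

try:
    erasure_code_for('<policy name="demo"></policy>')
except ValueError as err:
    print("rejected:", err)
```

Rejecting the policy at load time makes the missing tag visible in the first error message rather than in a stack trace from a later code path.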


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-07-06 Thread sri (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060343#comment-13060343
]

sri commented on HDFS-503:
--

I have a couple of questions:

1) With RAID set up, I am not able to generate the DFSAdmin report (hadoop
dfsadmin -report). Why is that?

2) I am not able to reduce the targetReplicationFactor to 0. (I want to run
MapReduce jobs in which the BlockFixer retrieves the data from the raided
disks.) Is there any way to do this?

Thanks in advance.


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-07-06 Thread sri (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060450#comment-13060450
]

sri commented on HDFS-503:
--

I would like to know whether the stripes act only as a recovery option (when
other datanodes have failed), or whether they can also serve as input to
MapReduce jobs (to satisfy locality).


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-07-06 Thread dhruba borthakur (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060975#comment-13060975
]

dhruba borthakur commented on HDFS-503:
---

1. Raid has no impact on the dfsadmin -report command.

2. You won't be able to set the replication factor to 0. You would have to
manually pull the plug on (kill) a datanode to see how raid works.

3. Stripe locations do not contribute to the split locations of a block, so they
are not used for map-reduce locality.


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-06-12 Thread sri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048444#comment-13048444
 ] 

sri commented on HDFS-503:
--

Error in the name node:

ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.dfs.DistributedRaidFileSystem
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:866)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1304)
at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
at org.apache.hadoop.fs.Trash.<init>(Trash.java:62)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startTrashEmptier(NameNode.java:292)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:288)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.dfs.DistributedRaidFileSystem
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:819)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:864)
... 11 more

Can somebody help me out?



[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-06-12 Thread Robert Chansler (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048446#comment-13048446
 ] 

Robert Chansler commented on HDFS-503:
--

I'll look forward to reading your message when I return to the office Friday 17 
June.



[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-06-02 Thread Krishnaraj (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042930#comment-13042930
]

Krishnaraj commented on HDFS-503:
-

Is there a stable version of Hadoop erasure coding? Where can I download its
source code? I am not able to find it in the Hadoop trunk.


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-06-02 Thread dhruba borthakur (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043211#comment-13043211
 ] 

dhruba borthakur commented on HDFS-503:
---

The source code is in
http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/src/contrib/raid


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2011-06-02 Thread Krishnaraj (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043213#comment-13043213
]

Krishnaraj commented on HDFS-503:
-

I got this patch, took HDFS from
http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/, put it in contrib, and
built it. But I don't know how to use it from there (i.e. in place of the Hadoop
jar that we set up in the cluster; I did not get the jar mentioned in the
README). Is there a detailed tutorial?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-10-04 Thread Ramkumar Vadali (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917481#action_12917481
]

Ramkumar Vadali commented on HDFS-503:
--

@shravankumar, to get a basic idea of HDFS RAID, you can read up Dhruba's blog
post
http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html

If you need this for demo purposes, could you use the current hadoop trunk? I
am not sure about the exact date of the next release.
To use RAID, you need to create a configuration file and start the RAID daemon.
You can look for examples in the unit tests, say TestRaidNode.

For further communication, you can contact me directly.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-10-03 Thread shravankumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917286#action_12917286
 ] 

shravankumar commented on HDFS-503:
---

@Ramkumar Vadali, thank you, sir.
Where can I access the raid code (with the fixes)?



[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-09-30 Thread Ramkumar Vadali (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916677#action_12916677
]

Ramkumar Vadali commented on HDFS-503:
--

@shravankumar Quite a few bugs in raid have been fixed in trunk. This will be
part of the upcoming release hadoop-0.22. What do you mean by raid API?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-09-24 Thread shravankumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914757#action_12914757
 ] 

shravankumar commented on HDFS-503:
---

Can anyone help me? In which stable version of Hadoop does raid become a part,
and how can I access the API documents related to raid?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-07-01 Thread JIRA

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884405#action_12884405
]

Celina d´ Ávila Samogin commented on HDFS-503:
--

I intend to propose something about the implementation of erasure coding
techniques in HDFS, starting in July 2010. I will add a comment to say
what I'm doing, or to ask for hints, as soon as possible. For now, I have studied
the texts suggested in this issue and other papers. I have read about RS codes
and LDPC codes. I have not yet started the implementation or the tests.


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-07 Thread shravankumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876178#action_12876178
 ] 

shravankumar commented on HDFS-503:
---

Thank you sir.


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-07 Thread shravankumar (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876187#action_12876187
]

shravankumar commented on HDFS-503:
---

Hello sir,

1. What is the meaning of this?
srcPath prefix=hdfs://dfs1.xxx.com:8000/user/dhruba/

2. The ADMINISTRATION section mentions "RaidNode Software". What does that mean?

3. "In HADOOP_HOME, run ant package to build Hadoop and its contrib packages."
Is this available once we have installed Hadoop 0.20.1, or do we need to
download the ant package?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-02 Thread shravankumar (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874514#action_12874514
]

shravankumar commented on HDFS-503:
---

Hello sir,

1. What is the meaning of this?
srcPath prefix=hdfs://dfs1.xxx.com:8000/user/dhruba/

2. The ADMINISTRATION section mentions "RaidNode Software". What does that mean?

3. "In HADOOP_HOME, run ant package to build Hadoop and its contrib packages."
Is this available once we have installed Hadoop 0.20.1, or do we need to
download the ant package?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-02 Thread shravankumar (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874518#action_12874518
]

shravankumar commented on HDFS-503:
---

Are the tags (property, description) that are used normal HTML tags, or do they
have a different meaning?
Can you send me a document that explains the meaning of these tags?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-02 Thread Rodrigo Schmidt (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874902#action_12874902
]

Rodrigo Schmidt commented on HDFS-503:
--

The tags are XML.

There is no documentation for the tags, either.

In short, Raid is still being optimized and changes are constant. Any strong
documentation effort at this point would be meaningful for a very short period
of time.

The source code is the best and most precise documentation you can rely upon.
That's the good thing about open source projects. You can easily get around
stale documentation.
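
Since the tags are XML, any XML parser can read them. The Python sketch below parses a fragment in the usual Hadoop property layout (name, value, description) with the standard library. The property name and value are made up for illustration and are not real raid settings.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the usual Hadoop <property> layout; the property
# name and value below are made up for illustration.
SAMPLE = """
<configuration>
  <property>
    <name>example.raid.setting</name>
    <value>42</value>
    <description>Free-text note for human readers.</description>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return {name: (value, description)} for every <property> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # <description> is optional, so fall back to an empty string.
        description = prop.findtext("description", default="")
        result[name] = (value, description)
    return result


props = read_properties(SAMPLE)
print(props["example.raid.setting"][0])
```

The tag names carry no HTML meaning. Here `property` groups one setting, and `description` holds a note for the reader of the file.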


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-06-01 Thread shravankumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873913#action_12873913
 ] 

shravankumar commented on HDFS-503:
---

Thank you.


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-05-31 Thread shravankumar (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873571#action_12873571
]

shravankumar commented on HDFS-503:
---

Thank you, sir.
I have one more query: raid1.txt and raid2.txt look similar. What is the
difference between them?
For parity, does the implementation use a normal CRC or some other
mechanism, such as Reed-Solomon codes?

Shravan Kumar.


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-05-31 Thread shravankumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873572#action_12873572
 ] 

shravankumar commented on HDFS-503:
---

Are there any design diagrams, such as a class diagram, for raid1.txt and raid2.txt?


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-05-31 Thread Rodrigo Schmidt (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873589#action_12873589
]

Rodrigo Schmidt commented on HDFS-503:
--

raid1.txt and raid2.txt are different patches. The most recent was the one that
got committed.

Raid is implementing simple xor parity right now, but we have plans to extend
it in the future.

Sorry, no design diagrams that I'm aware of.
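
For readers unfamiliar with XOR parity, here is a minimal Python sketch of the general technique: one parity block per stripe, from which any single lost block can be rebuilt. It is not the contrib/raid code. The function names and the tiny 8-byte blocks are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_parity(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing data block of a stripe.

    XOR-ing the parity with every surviving block cancels them out and
    leaves the lost block. Only one lost block per stripe is recoverable.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


stripe = [b"block-A!", b"block-B!", b"block-C!"]
parity = encode_parity(stripe)

# Pretend the second block was lost together with its datanode.
rebuilt = recover_block([stripe[0], stripe[2]], parity)
assert rebuilt == stripe[1]
```

The limit of one recoverable block per stripe is the reason stronger codes such as Reed-Solomon are attractive as an extension.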


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-04-28 Thread shravankumar (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861740#action_12861740
]

shravankumar commented on HDFS-503:
---

Dear sir,
I have downloaded the code for "Implement erasure coding as a layer on HDFS",
but I was unable to execute it. Please guide me regarding this.


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2010-04-28 Thread Rodrigo Schmidt (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861924#action_12861924
]

Rodrigo Schmidt commented on HDFS-503:
--

Hi,

Can you provide more details on what you have done and what didn't work?

Did you follow the instructions in the README file? Which error did you see?

Cheers,
Rodrigo


[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-11-09 Thread dhruba borthakur (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775050#action_12775050
 ] 

dhruba borthakur commented on HDFS-503:
---

I am investigating

 Implement erasure coding as a layer on HDFS
 ---

 Key: HDFS-503
 URL: https://issues.apache.org/jira/browse/HDFS-503
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: contrib/raid
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.22.0

 Attachments: raid1.txt, raid2.txt



[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-10-06 Thread dhruba borthakur (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12762538#action_12762538
]

dhruba borthakur commented on HDFS-503:
---

{quote}

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 13 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

{quote}

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-09-13 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754683#action_12754683
]

Hadoop QA commented on HDFS-503:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12419417/raid2.txt
against trunk revision 814221.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 13 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 release audit. The applied patch generated 157 release audit warnings
(more than the trunk's current 154 warnings).

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/testReport/
Release audit warnings:
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results:
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/build/test/checkstyle-errors.html
Console output:
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/console

This message is automatically generated.

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-09-13 Thread Tsz Wo (Nicholas), SZE (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754787#action_12754787
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-503:
-

> I created a separate JIRA HDFS-600 to make the Parity generation algorithm
> pluggable.
Thanks, Dhruba.

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-09-09 Thread Raghu Angadi (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12753349#action_12753349
]

Raghu Angadi commented on HDFS-503:
---

This seems pretty useful. Since this is done outside HDFS, it is simpler for
users to start experimenting.

Say a file has 5 blocks with a replication of 3: 15 blocks in total.
With this tool, replication could be reduced to 2, with one block for parity:
10 + 2 blocks in total.
This is a space savings of 20%. Is this math correct?
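
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name is made up for illustration, and it assumes the single parity block is itself stored with replication 2, which is what the "+ 2" suggests.

```python
def raw_blocks(data_blocks, replication, parity_blocks=0, parity_replication=0):
    """Count the physical blocks stored for one file (hypothetical helper)."""
    return data_blocks * replication + parity_blocks * parity_replication


# 5 data blocks, 3 replicas each, no parity.
before = raw_blocks(5, 3)

# 5 data blocks at replication 2, plus 1 parity block assumed to be kept twice.
after = raw_blocks(5, 2, parity_blocks=1, parity_replication=2)

savings = (before - after) / before
print(before, after, f"{savings:.0%}")  # prints: 15 12 20%
```

Under that assumption the 20% figure holds. If the parity block were kept at a different replication, only the last two arguments would change.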

Detecting when to 'unRaid':
* The patch does this using a wrapper filesystem over HDFS.
** This requires the file to be read by the client.
** More often than not, HDFS knows about irrecoverable blocks well before
a client reads them.
** This is only semi-transparent to users, since they have to use the new
filesystem.

* Another, completely transparent alternative could be to make 'RaidNode' ping
NameNode for missing blocks.
** NameNode already knows about blocks that don't have any known good
replica, and fetching that list is cheap.
** RaidNode could check if the corrupt/missing block belongs to any of
its files.
** The rest of RaidNode pretty much remains the same as in this patch.
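
A rough sketch of the matching step in the second alternative, assuming the NameNode can hand back a list of file paths that have missing or corrupt blocks. The function name, the example paths, and the idea that RaidNode tracks its files as a set of raided directories are all assumptions made for illustration; none of this is taken from the patch.

```python
def files_needing_repair(corrupt_paths, raided_dirs):
    """Return the corrupt paths that live under a directory RaidNode manages.

    corrupt_paths: file paths reported as having missing/corrupt blocks
                   (hypothetical input; how it is fetched is not shown here).
    raided_dirs:   directories whose files have parity data (hypothetical).
    """
    # Normalize to "dir/" so "/raid" does not accidentally match "/raidother".
    prefixes = [d.rstrip("/") + "/" for d in raided_dirs]
    return sorted(
        path
        for path in set(corrupt_paths)
        if any(path.startswith(prefix) for prefix in prefixes)
    )


reported = ["/raid/logs/part-0001", "/tmp/scratch", "/raidother/part-0002"]
to_fix = files_needing_repair(reported, ["/raid"])
print(to_fix)
```

Only the paths returned here would need to be rebuilt from parity; everything else is left to normal HDFS re-replication.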

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-08-28 Thread Tsz Wo (Nicholas), SZE (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749047#action_12749047
]

Tsz Wo (Nicholas), SZE commented on HDFS-503:
-

Took a quick look at the patch. Very cool!

It seems that the parity is computed by XOR. If there is a clean API, we may
improve it with more advanced codes such as Reed-Solomon in the future.
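
For readers unfamiliar with XOR parity, here is a minimal sketch of the idea in Python. It is not the code from the patch; the function names are invented, and it assumes all blocks in a stripe have the same length (a real implementation would need to pad the last block).

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together into one parity block."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


stripe = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_parity(stripe)

# Pretend the middle block was lost and rebuild it.
rebuilt = recover([stripe[0], stripe[2]], parity)
print(rebuilt == stripe[1])
```

XOR can only repair one lost block per stripe, which is why codes such as Reed-Solomon, which tolerate several losses, are a natural next step.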

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-08-28 Thread dhruba borthakur (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749081#action_12749081
]

dhruba borthakur commented on HDFS-503:
---

Hi Nicholas, I agree with you completely. The current patch implements basic
xor. Once this patch is accepted by the community, I plan to make the algorithm
pluggable, so that people can plug in more advanced erasure codes into the
framework laid out by this patch.
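
One way a pluggable algorithm could look, sketched in Python rather than the Java of the patch. The class names, method names, and the registry are all hypothetical and only illustrate the shape of such an interface; blocks in a stripe are assumed to have equal length.

```python
from abc import ABC, abstractmethod


class ErasureCode(ABC):
    """Hypothetical plug-in interface for a parity generation algorithm."""

    @abstractmethod
    def encode(self, data_blocks):
        """Return a list of parity blocks for the given data blocks."""

    @abstractmethod
    def decode(self, blocks, parity):
        """Return the data blocks with any None entries rebuilt."""


class XorCode(ErasureCode):
    """Single-parity XOR code: tolerates exactly one lost block per stripe."""

    def encode(self, data_blocks):
        parity = bytearray(len(data_blocks[0]))
        for block in data_blocks:
            for i, byte in enumerate(block):
                parity[i] ^= byte
        return [bytes(parity)]

    def decode(self, blocks, parity):
        missing = [i for i, block in enumerate(blocks) if block is None]
        if len(missing) != 1:
            raise ValueError("XOR can repair exactly one missing block")
        survivors = [block for block in blocks if block is not None]
        repaired = list(blocks)
        # XOR of all survivors and the parity block yields the lost block.
        repaired[missing[0]] = self.encode(survivors + [parity[0]])[0]
        return repaired


# A registry keyed by name lets configuration choose the algorithm.
CODES = {"xor": XorCode}


def get_code(name):
    return CODES[name]()


code = get_code("xor")
parity = code.encode([b"ab", b"cd"])
print(code.decode([None, b"cd"], parity))
```

A more advanced code would be added by registering another `ErasureCode` subclass under a new name, leaving callers unchanged.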

If you have the time and energy, please review the patch and provide any
feedback you may have. Thanks.

[jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-07-24 Thread Hong Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12735011#action_12735011
 ] 

Hong Tang commented on HDFS-503:


As a reference, FAST 09 has a paper that benchmarks the performance of various 
open source erasure coding implementations: 
http://www.cs.utk.edu/~plank/plank/papers/FAST-2009.html.
