[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646287#comment-14646287
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2217 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2217/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java


 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 2.6.0

 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, 
 hadoop-24-datanode-dir.tgz


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.
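
 The mapping described above can be sketched roughly as follows. This is only
 an illustration, assuming two directory levels that each take 8 bits of the
 block ID; the class name, method names, "subdir" prefix and bit positions are
 assumptions made for the example and are not taken from the attached patches.

```java
import java.io.File;

// Illustrative sketch only: names and bit positions are assumptions,
// not the code from the HDFS-6482 patches.
final class BlockIdLayoutSketch {
    // Hypothetical prefix for each directory level.
    static final String SUBDIR_PREFIX = "subdir";

    // Derive a two-level relative path purely from the block ID.
    // Each level uses 8 bits of the ID, so each has at most 256 entries.
    static String relativePath(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF); // first-level directory index
        int d2 = (int) ((blockId >> 8) & 0xFF);  // second-level directory index
        return SUBDIR_PREFIX + d1 + "/" + SUBDIR_PREFIX + d2;
    }

    // Resolve the directory for a block under a given root directory.
    // Nothing is stored per replica: the location is recomputed from the ID.
    static File idToBlockDir(File root, long blockId) {
        return new File(root, relativePath(blockId));
    }

    public static void main(String[] args) {
        File root = new File("finalized"); // hypothetical root directory
        System.out.println(idToBlockDir(root, 0x12345678L));
    }
}
```

 Because the path is a pure function of the block ID, no directory ever needs
 to be split and a replica's location does not have to be tracked separately.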



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646257#comment-14646257
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #268 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/268/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645568#comment-14645568
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8236 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8236/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645941#comment-14645941
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1001 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1001/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645927#comment-14645927
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #271 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/271/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646134#comment-14646134
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #260 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/260/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646093#comment-14646093
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2198 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2198/])
HDFS-8834. TestReplication is not valid after HDFS-6482. (Contributed by Lei 
Xu) (lei: rev f4f1b8b267703b8bebab06e17e69a4a4de611592)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156320#comment-14156320
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #698 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/698/])
HDFS-6482. Fix CHANGES.txt in trunk (arp: rev 
be30c86cc9f71894dc649ed22983e5c42e9b6951)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156398#comment-14156398
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1889 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1889/])
HDFS-6482. Fix CHANGES.txt in trunk (arp: rev 
be30c86cc9f71894dc649ed22983e5c42e9b6951)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156514#comment-14156514
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1914 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1914/])
HDFS-6482. Fix CHANGES.txt in trunk (arp: rev 
be30c86cc9f71894dc649ed22983e5c42e9b6951)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-02 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157106#comment-14157106
 ] 

Harsh J commented on HDFS-6482:
---

Updated the FAQ that was covering DN block moves to reflect the new 
maintain-subdir requirement: 
https://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-02 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157521#comment-14157521
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Thanks, [~qwertymaniac].


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-10-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155014#comment-14155014
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6163 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6163/])
HDFS-6482. Fix CHANGES.txt in trunk (arp: rev 
be30c86cc9f71894dc649ed22983e5c42e9b6951)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128495#comment-14128495
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1867 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1867/])
HDFS-6482. Fix CHANGES.txt in trunk. (arp: rev 
0de563a18e9e09207e3ef5f1cad1d2e788af9503)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128916#comment-14128916
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1892 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1892/])
HDFS-6482. Fix CHANGES.txt in trunk. (arp: rev 
0de563a18e9e09207e3ef5f1cad1d2e788af9503)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-03 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120886#comment-14120886
 ] 

Arpit Agarwal commented on HDFS-6482:
-

I added a patch to HDFS-6981. Please review it.

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 3.0.0

 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, 
 hadoop-24-datanode-dir.tgz


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-02 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118513#comment-14118513
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Yeah, it would be great to have this in 2.6. Is HDFS-6981 blocking the merge 
of this to 2.6?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-02 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118554#comment-14118554
 ] 

Arpit Agarwal commented on HDFS-6482:
-

That is the known issue, yes.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-29 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115703#comment-14115703
 ] 

Arpit Agarwal commented on HDFS-6482:
-

Now that HDFS-6800 is in trunk to support DN layout changes with rolling 
upgrade, I'd like to include this improvement in 2.6.

I plan to test it with rolling upgrades over the next couple of weeks.

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 3.0.0

 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, 
 hadoop-24-datanode-dir.tgz


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-06 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088044#comment-14088044
 ] 

Arpit Agarwal commented on HDFS-6482:
-

Hi [~james.thomas], is the branch-2 merge blocked by HDFS-6800?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-06 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088059#comment-14088059
 ] 

James Thomas commented on HDFS-6482:


[~arpitagarwal], yep, we're waiting on discussion there.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-06 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088108#comment-14088108
 ] 

Arpit Agarwal commented on HDFS-6482:
-

It is too late to mention this now, but I am concerned about the delta between 
trunk and branch-2 while we wait for HDFS-6800 to be resolved. I will continue 
the discussion on HDFS-6800.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083524#comment-14083524
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #631 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/631/])
HDFS-6482. Use block ID-based block layout on datanodes (James Thomas via Colin 
Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615223)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DiskChecker.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/Block.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeLayoutVersion.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LDir.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSFinalize.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStorageStateRecovery.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgrade.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeLayoutUpgrade.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCorruption.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/UpgradeUtilities.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-24-datanode-dir.tgz
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-datanode-dir.txt



[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083569#comment-14083569
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1825 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1825/])
HDFS-6482. Use block ID-based block layout on datanodes (James Thomas via Colin 
Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615223)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083589#comment-14083589
 ] 

Hudson commented on HDFS-6482:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1850 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1850/])
HDFS-6482. Use block ID-based block layout on datanodes (James Thomas via Colin 
Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615223)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-01 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14082901#comment-14082901
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Thanks for your hard work on this, James.  Committed to trunk (but not 
branch-2).

Let's continue to discuss the rolling downgrade issues (need for additional 
rolling DN downgrade tests, general DN rolling downgrade strategy, etc.) on 
HDFS-6800.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14082945#comment-14082945
 ] 

Hudson commented on HDFS-6482:
--

FAILURE: Integrated in Hadoop-trunk-Commit #5999 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5999/])
HDFS-6482. Use block ID-based block layout on datanodes (James Thomas via Colin 
Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615223)

[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-01 Thread Chris Nauroth (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083032#comment-14083032
 ] 

Chris Nauroth commented on HDFS-6482:
-

Hi, [~james.thomas] and [~cmccabe].  This patch broke compilation on Windows.  
I filed HADOOP-10925 for it, and I expect to have a patch ready shortly.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-01 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083035#comment-14083035
 ] 

James Thomas commented on HDFS-6482:


[~cnauroth], apologies, thanks.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-08-01 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083113#comment-14083113
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Thanks, Chris.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread Suresh Srinivas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081365#comment-14081365
 ] 

Suresh Srinivas commented on HDFS-6482:
---

[~cmccabe], does this patch change the directory structure, and is the new 
directory structure retained during rollback? If so, this is a concern for me 
and I am -0 on this. We should at least make sure we have rollback tests that 
check that things work and that no undocumented, hidden assumption in the older 
software version (the one we are rolling back to) is broken. Given that rollback 
could be to any older release from which upgrade to this version is allowed, 
testing and ensuring that nothing is broken becomes that much harder.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081427#comment-14081427
 ] 

James Thomas commented on HDFS-6482:


[~sureshms] No, during rollback the directory structure in the previous 
directory on the DN is restored, and this is the old directory structure. I 
have run some tests on my computer that show that rollback works. It is not 
really possible to write a rollback test that checks this case because we 
cannot run an older version of the DN code in the test.
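
The restore step described above can be pictured with a minimal sketch, assuming the upgrade left the old tree in a sibling directory named `previous` next to the live `current` directory. The directory names, the `removed.tmp` staging name, and the class itself are assumptions for illustration and are not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only: directory names and the class are hypothetical.
final class RollbackSketch {
    // Swaps the preserved old-layout tree back into place and discards the new-layout tree.
    public static void rollback(Path storageDir) throws IOException {
        Path current = storageDir.resolve("current");
        Path previous = storageDir.resolve("previous");
        Path removed = storageDir.resolve("removed.tmp");
        if (!Files.isDirectory(previous)) {
            throw new IOException("No previous directory to roll back to: " + previous);
        }
        Files.move(current, removed);  // set the new-layout tree aside
        Files.move(previous, current); // the old-layout tree becomes live again
        deleteRecursively(removed);    // finally drop the new-layout tree
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dn-storage");
        Files.createDirectories(dir.resolve("current").resolve("subdir52"));
        Files.createDirectories(dir.resolve("previous").resolve("subdir0"));
        rollback(dir);
        System.out.println(Files.isDirectory(dir.resolve("current").resolve("subdir0")));
    }
}
```

In this picture the old software never sees the new layout, because the tree it reads after rollback is the one that was preserved before the upgrade touched anything.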


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081603#comment-14081603
 ] 

James Thomas commented on HDFS-6482:


[~sureshms] Is the documentation in the Rollback section at 
http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
 correct? Are you supposed to restart the DNs normally, without flags like 
-rollback or -rollingupgrade rollback? If you restart the DNs with 
-rollback, everything should work normally and the previous directory should 
be restored with the old layout. [~arpitagarwal], any thoughts on this?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081607#comment-14081607
 ] 

James Thomas commented on HDFS-6482:


Or [~kihwal]?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081618#comment-14081618
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Why don't we merge this to trunk and then open another JIRA to iron out any 
issues with rolling upgrades between different DN layout versions?  At minimum, 
we should decide whether we support rolling DN upgrades between different 
layout versions, and if we don't support it, give a clear failure message to 
admins.  But this patch is big enough that I don't think cramming all that in 
here is a good idea.  There also seem to be some issues with rolling DN 
downgrade now (for example, HDFS-6005 removed {{datanode \-rollingupgrade 
\-rollback}}, but not the usage text for it displayed in {{\-help}}).


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-31 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081739#comment-14081739
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Hey guys, I filed HDFS-6800 to have the rolling upgrade discussion.  I'm going 
to commit this to trunk (but *not* to any other branches) in a bit if nobody 
has any objections.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-30 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080201#comment-14080201
 ] 

Colin Patrick McCabe commented on HDFS-6482:


+1.  Thanks for your work on this, James.

I'm going to commit this to trunk today if there are no further comments.  We 
can decide about branch-2 later.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-30 Thread Suresh Srinivas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080275#comment-14080275
 ] 

Suresh Srinivas commented on HDFS-6482:
---

[~cmccabe], I have been traveling and have not kept up with this. I will try 
to get back by tomorrow. If not, please go ahead and commit by tomorrow evening.

Any comments I may have can be addressed before merging to branch-2.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-30 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080383#comment-14080383
 ] 

Colin Patrick McCabe commented on HDFS-6482:


OK, I will wait until tomorrow evening.  Thanks for looking at this, Suresh.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-21 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069541#comment-14069541
 ] 

James Thomas commented on HDFS-6482:


In response to [~cmccabe]'s comments:

One thread per storage directory doesn't make sense here since this is the 
number of threads to use for the hard link process for ONE storage directory. 
The hard link processes for the storage directories are currently not run in 
parallel.

We can create a separate JIRA to add the native code to the regular hard link 
path. I've created a separate code path in this change for the upgrade to the 
block ID-based layout, and I want to focus on optimizing that in this JIRA.

[~sureshms], any thoughts?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-21 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069631#comment-14069631
]

Colin Patrick McCabe commented on HDFS-6482:

bq. One thread per storage directory doesn't make sense here since this is the
number of threads to use for the hard link process for ONE storage directory.
The hard link processes for the storage directories are currently not run in
parallel.

Understood. It seems like we should be parallelizing the upgrade of different
storage directories, since clearly we'd like to keep all those disks busy if we
could. Anyway, this JIRA is big enough as-is, so let's not worry about it
right now.

James, given that you've gotten the upgrade times down to single-digit seconds
now, I am +1 on putting this change in 2.x. [~sureshms], [~atm], what are your
thoughts here?


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066942#comment-14066942
]

Hadoop QA commented on HDFS-6482:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12656582/HDFS-6482.8.patch
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7390//console

This message is automatically generated.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066976#comment-14066976
]

Hadoop QA commented on HDFS-6482:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12656582/HDFS-6482.8.patch
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7391//console

This message is automatically generated.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-18 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066988#comment-14066988
]

Colin Patrick McCabe commented on HDFS-6482:

1 second for 100k blocks is pretty good.

bq. Added a configuration parameter for users to specify the number of threads
to be used in the hard link process.

Perhaps one thread per storage directory would make sense? I'm not sure if a
configuration option is useful, if this upgrade is a one time event (and the
NameNodes that would be upgraded have already been deployed.)

bq. We use these optimizations for the hard link process only when upgrading to
the block ID-based layout, because otherwise the directory structures of the
old and new layouts should be the same and we can perform fast batch hard links
over directories – see HDFS-1445.

Why not always use the native path, if it's faster? It should be trivial to
implement the batch symlink API via the native path. You'd just write a
for loop in java that made some calls down into the JNI function you already
wrote. There is a new symlink API coming up in Java7, so we'll want to stop
using the shell thing eventually anyway.
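
The "for loop in Java" suggestion can be sketched as follows. This is a
minimal, hypothetical example: the names BatchLink, FileLinker and
linkDirectory are invented here, and the per-file primitive is left pluggable
so that either a JNI wrapper or the Java 7 call could sit behind it. The
sketch uses the hard-link call java.nio.file.Files.createLink, since the
upgrade discussed in this thread relies on hard links.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: a batch link built from a per-file link primitive. */
final class BatchLink {
    /** Stand-in for a per-file primitive, for example a JNI wrapper (hypothetical). */
    interface FileLinker {
        void link(Path existing, Path link) throws IOException;
    }

    /** Java 7 NIO variant; note that Files.createLink takes (link, existing). */
    static final FileLinker NIO = (existing, link) -> Files.createLink(link, existing);

    /** Hard-links every regular file directly under srcDir into dstDir. */
    static int linkDirectory(Path srcDir, Path dstDir, FileLinker linker) throws IOException {
        Files.createDirectories(dstDir);
        int count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcDir)) {
            for (Path src : entries) {
                if (Files.isRegularFile(src)) {
                    linker.link(src, dstDir.resolve(src.getFileName()));
                    count++;
                }
            }
        }
        return count;
    }
}
```

With this shape no external process is started per batch or per file; the cost
is one in-process call for each block or meta file.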


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-18 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066995#comment-14066995
 ] 

Colin Patrick McCabe commented on HDFS-6482:


So, when I try to apply your patch locally, I get "git binary diffs are not 
supported".  It looks like different versions of GNU patch behave differently 
here (presumably support is a new feature?) and we're playing Jenkins 
roulette.  I would say put the tar.gz file in a separate attachment for now, so 
we can get a Jenkins run.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-18 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14066997#comment-14066997
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Also, patch fails when you get "git binary diffs are not supported", 
hence the message you're seeing in the log output.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14067068#comment-14067068
]

Hadoop QA commented on HDFS-6482:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment

http://issues.apache.org/jira/secure/attachment/12656597/hadoop-24-datanode-dir.tgz
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7394//console

This message is automatically generated.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14067359#comment-14067359
 ] 

Hadoop QA commented on HDFS-6482:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12656608/HDFS-6482.9.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 15 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
  org.apache.hadoop.ipc.TestIPC
  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
  org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
  org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7396//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7396//console

This message is automatically generated.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-08 Thread James Thomas (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055173#comment-14055173
]

James Thomas commented on HDFS-6482:

[~sureshms] Thanks for the info. I don't understand your last comment -- could
you explain further? Also, I don't think it makes sense to support both the
LDir structure and this structure simultaneously. We would need to continue to
maintain information in the ReplicaMap about where each block was located
(since we wouldn't know whether it was stored with the old or new scheme), so
there would be no memory usage savings. I'm not sure we would ever reach a
point where all blocks stored with the old scheme would be gone and we could
officially stop using the location field in ReplicaInfo.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-08 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055419#comment-14055419
]

Colin Patrick McCabe commented on HDFS-6482:

bq. \[james wrote\]: Also, I don't think it makes sense to support both the
LDir structure and this structure simultaneously

The main reason to do this change is to save memory and simplify things by not
having to store the path to each replica. If we support the old layout, then
we no longer have this nice property. We could still get some of the gains by
setting the path to null in some of the various data structures... basically
assume that null means this replica is located at a place determined by its
block id. And non-null would mean using the old system. This might be a
possible solution. I would prefer not to go down this road due to the greater
code complexity, though.

bq. \[suresh wrote\]: I think creating hard links with new schema is an issue.
The main reason for hardlinks created as it is done today is to minimize the
impact of any bug in new software. The simplest thing was done where we
iterated over directories and created hardlinks. Rollback must ensure the
system goes back to previous state of the system.

I don't see why a rollback wouldn't work here. It's the same as going from the
old (pre hadoop-2.0) layout to the new block pool-based layout. We also used
hardlinks there to provide downgrade capability, and it also worked there.
We're not changing the contents of the old directory, just moving it out of the
way and hardlinking to the block and meta files within it.
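
A simplified sketch of that upgrade and rollback sequence, under stated
assumptions: the class LayoutUpgradeSketch is hypothetical, the directory
names "current" and "previous" follow the discussion above, and the new
layout is kept flat for brevity (a real upgrade would place each link
according to the new scheme and would handle intermediate states this sketch
ignores).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: upgrade by hard-linking, rollback by restoring "previous". */
final class LayoutUpgradeSketch {
    static void upgrade(Path storageDir) throws IOException {
        Path current = storageDir.resolve("current");
        Path previous = storageDir.resolve("previous");
        Files.move(current, previous);     // old tree is moved aside, contents untouched
        Files.createDirectories(current);  // fresh tree for the new layout
        for (Path old : listFiles(previous)) {
            // Both names now refer to the same data; nothing is copied.
            Files.createLink(current.resolve(old.getFileName()), old);
        }
    }

    static void rollback(Path storageDir) throws IOException {
        Path current = storageDir.resolve("current");
        Path previous = storageDir.resolve("previous");
        List<Path> doomed;
        try (Stream<Path> walk = Files.walk(current)) {
            // Reverse order deletes children before their parent directories.
            doomed = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : doomed) {
            Files.delete(p);               // removes links only; "previous" keeps the data
        }
        Files.move(previous, current);     // old layout is back in place
    }

    private static List<Path> listFiles(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }
}
```

Deleting a hard link in the new tree leaves the file reachable through the
old tree, which is why blocks that existed before the upgrade survive the
rollback; blocks written after the upgrade are discarded.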

bq. James Thomas, we did a bunch of improvements to cut down the time from 10s
of minutes to a couple of minutes. See HDFS-1445 for more details. Clearly
anything significantly above 60s (design goal of rolling upgrades) will result
in issues for rolling upgrades.

Yes. This is a very important consideration. James and I discussed a few ways
to optimize the hardlink process. I think that it's very possible for this to
be done in a second or two at most. If you assume 500,000 replicas spread over
10 drives, you have 50,000 hardlinks to make on each drive. This just isn't
going to take that long, since the operations you're doing are just altering
memory (we don't fsync after calling {{link}}). It's just a question of doing
it in a smart way that minimizes the number of {{exec}} calls we make (and
possibly obtains some parallelism).
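
To spell out the arithmetic: 500,000 replicas / 10 drives = 50,000 replicas
per drive (each replica typically has a block file and a meta file to link).
The sketch below shows one way to get the parallelism mentioned above without
any {{exec}} calls, by issuing the links in-process from a fixed thread pool.
The class ParallelLinker and its method are hypothetical names used only for
this example, and the thread count is a free parameter, not a recommendation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: creates many hard links from a small thread pool. */
final class ParallelLinker {
    /**
     * Creates one hard link per map entry (key = existing file, value = new link)
     * and returns the number of links created. Fails on the first link error.
     */
    static int linkAll(Map<Path, Path> existingToLink, int numThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (Map.Entry<Path, Path> entry : existingToLink.entrySet()) {
                Callable<Void> task = () -> {
                    // createDirectories tolerates another thread creating the same directory.
                    Files.createDirectories(entry.getValue().getParent());
                    Files.createLink(entry.getValue(), entry.getKey());
                    return null;
                };
                pending.add(pool.submit(task));
            }
            for (Future<Void> f : pending) {
                f.get(); // rethrows any failure wrapped in ExecutionException
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```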


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053895#comment-14053895
 ] 

Suresh Srinivas commented on HDFS-6482:
---

HDFS-5535 added support for rolling upgrade. At that time, given that the 
datanode layout rarely changes, rollback was not considered for datanodes. 
Given that this jira is changing the datanode layout version, the impact on 
rollback should be considered before this change can be committed.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-07 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053958#comment-14053958
 ] 

James Thomas commented on HDFS-6482:


I think DN rollback should work fine with this change. The previous directory 
will contain the blocks laid out with the old structure, and on rollback this 
structure will be restored. The relevant code is in DataStorage.java, 
particularly linkBlocks().


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-07 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054008#comment-14054008
]

Colin Patrick McCabe commented on HDFS-6482:

The new current directory will contain hardlinks to the block and metadata
files in the previous directory. So it seems like rollback should work fine
in this case.

It would be nice to add a unit test where we upgrade a DataNode from the
non-blockid-based version to a blockid-based version, and then do a rollback.
Can you add this, James? Since you already added
{{hadoop-24-datanode-dir.tgz}}, it shouldn't be too difficult to add a unit
test that rolls back to this version from the new version.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054044#comment-14054044
]

Suresh Srinivas commented on HDFS-6482:
---

bq. The new current directory will contain hardlinks to the block and
metadata files in the previous directory. So it seems like rollback should
work fine in this case.
Can a brief design doc be posted to this jira describing the new directory
structure and what happens during an upgrade to a release with this change? I
do not have time to review the jira to figure this out.

My read is, this will not work if during upgrade the previous layout is changed
to the new layout. During rolling upgrades hardlinks to all the blocks are
*not* created, only for the ones deleted post rolling upgrade. This is done to
keep the datanode upgrade time short to support quick restart. If rolling
upgrades cannot be supported, this code can only go into a major release.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-07 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054115#comment-14054115
]

Colin Patrick McCabe commented on HDFS-6482:

bq. My read is, this will not work if during upgrade the previous layout is
changed to the new layout. During rolling upgrades hardlinks to all the blocks
are not created, only for the ones deleted post rolling upgrade. This is done
to keep the datanode upgrade time short to support quick restart. If rolling
upgrades cannot be supported, this code can only go into a major release.

My understanding of this was that it was an optimization for the cases where
the datanode layout hadn't changed significantly (which was most upgrades). It
should not be interpreted as a hard limitation that prevents us from making
*any* changes for the datanode layout in the future.

James, it would be good to see some upgrade times for a DN with a few hundred
thousand blocks. It seems like this should be manageable, especially if we
parallelize it a bit.
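
One rough way to gather such numbers is sketched below. It assumes a flat
source directory and uses a parallel stream as a stand-in for whatever
parallelism the DataNode would really use; the class and method names are
hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ParallelLinkTimingSketch {
    /** Hard-links every regular file in srcDir into dstDir; returns elapsed ms. */
    static long linkAll(Path srcDir, Path dstDir) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(srcDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Files.createDirectories(dstDir);
        long start = System.nanoTime();
        // Each link is independent, so the work can be spread across threads.
        files.parallelStream().forEach(f -> {
            try {
                Files.createLink(dstDir.resolve(f.getFileName()), f);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Running this against a directory holding a few hundred thousand small files
would give a first estimate; real timings depend on the file system and disk.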


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

[
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054141#comment-14054141
]

Suresh Srinivas commented on HDFS-6482:
---

bq. My understanding of this was that it was an optimization for the cases
where the datanode layout hadn't changed significantly (which was most
upgrades).
One of the key requirements of rolling upgrades was to keep datanode upgrade
time as short as possible. Second, as I mentioned already, the current rolling
upgrade does not create hardlinks. Hence, if the assumption is that hardlinks
will be made, that needs to be factored in.

bq. It should not be interpreted as a hard limitation that prevents us from
making any changes for the datanode layout in the future.
Not all datanode layout changes need massive changes to the underlying
directory structure. One solution is to support both directory structures; as
blocks get deleted and re-added, they will naturally migrate to the new scheme.
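
A minimal sketch of the dual-layout idea, under stated assumptions: the
ID-to-directory mapping and the "blk_" file naming are illustrative, and the
directory walk stands in for however the old layout's locations are actually
tracked. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

final class DualLayoutLookupSketch {
    /** Hypothetical ID-based mapping: two levels, 8 bits of the ID each. */
    static Path idBasedDir(Path root, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return root.resolve("subdir" + d1).resolve("subdir" + d2);
    }

    /** Checks the new location first, then falls back to the old layout. */
    static Optional<Path> findBlock(Path root, long blockId) throws IOException {
        String name = "blk_" + blockId;
        Path candidate = idBasedDir(root, blockId).resolve(name);
        if (Files.exists(candidate)) {
            return Optional.of(candidate);
        }
        // Fallback: the block still sits wherever the old layout placed it.
        try (Stream<Path> walk = Files.walk(root)) {
            return walk
                .filter(p -> p.getFileName() != null
                    && p.getFileName().toString().equals(name))
                .findFirst();
        }
    }
}
```

New blocks would always be written to the ID-based directory, so the fallback
path is exercised less and less as old blocks are deleted.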


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-07-07 Thread James Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054273#comment-14054273
 ] 

James Thomas commented on HDFS-6482:


[~sureshms] Do you know if there are any benchmarks that demonstrate that 
creating hundreds of thousands of hard links using the regular upgrade 
procedure is slow?

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.




[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes


[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054508#comment-14054508
 ] 

Suresh Srinivas commented on HDFS-6482:
---

[~james.thomas], we made a number of improvements to cut the upgrade time down
from tens of minutes to a couple of minutes. See HDFS-1445 for more details.
Clearly, anything significantly above 60 seconds (the design goal for rolling
upgrades) will result in issues for rolling upgrades.


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes