[jira] [Updated] (HDFS-10810) Setreplication removing block from underconstrcution temporarily when batch IBR is enabled.

2016-10-02 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10810:

Attachment: HDFS-10810-branch-2.patch

Uploaded the branch-2 patch.

>  Setreplication removing block from underconstrcution temporarily when batch 
> IBR is enabled.
> 
>
> Key: HDFS-10810
> URL: https://issues.apache.org/jira/browse/HDFS-10810
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10810-002.patch, HDFS-10810-003.patch, 
> HDFS-10810-branch-2.patch, HDFS-10810.patch
>
>
> 1) Batch IBR is enabled with the number of committed blocks allowed = 1.
> 2) One block is written and the file is closed without waiting for the IBR.
> 3) setReplication is called immediately on the file.
> Until the finalized IBR is received, the block will not be added to 
> {{neededReconstruction}}, since the following check will be {{false}} while 
> the block is not marked as complete.
> {code}
> if (isNeededReconstruction(block, repl.liveReplicas())) {
>   neededReconstruction.update(block, repl.liveReplicas(),
>       repl.readOnlyReplicas(), repl.decommissionedAndDecommissioning(),
>       curExpectedReplicas, curReplicasDelta, expectedReplicasDelta);
> }
> {code}
> Hence the block will not be marked as under-replicated.
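
The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the BlockManager code: the class ReconstructionQueueSketch, its Block type, the two-state BlockState enum, and the three-argument isNeededReconstruction are all hypothetical names introduced here. It shows how a block that is still COMMITTED when setReplication runs is skipped by the check, and is only queued once it is COMPLETE.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: all names here are hypothetical, not the HDFS classes.
class ReconstructionQueueSketch {
  enum BlockState { COMMITTED, COMPLETE }

  static final class Block {
    final long id;
    BlockState state;

    Block(long id, BlockState state) {
      this.id = id;
      this.state = state;
    }
  }

  // Stand-in for the neededReconstruction queue named in the report.
  private final Set<Long> neededReconstruction = new HashSet<>();

  // Assumed simplification of the quoted check: a block qualifies only
  // when it is complete and has fewer live replicas than expected.
  boolean isNeededReconstruction(Block b, int liveReplicas, int expected) {
    return b.state == BlockState.COMPLETE && liveReplicas < expected;
  }

  // Raising the replication factor queues the block only if the check passes.
  void setReplication(Block b, int liveReplicas, int newExpected) {
    if (isNeededReconstruction(b, liveReplicas, newExpected)) {
      neededReconstruction.add(b.id);
    }
  }

  boolean isQueued(Block b) {
    return neededReconstruction.contains(b.id);
  }
}
```

In this model, calling setReplication(b, 1, 3) on a COMMITTED block leaves the queue empty; the same call after the block becomes COMPLETE adds it.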



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE

2016-10-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541527#comment-15541527
 ] 

Vinayakumar B commented on HDFS-10714:
--

hi [~yzhangal],[~kihwal] 
Would it selected approach looks fine to you?

> Issue in handling checksum errors in write pipeline when fault DN is 
> LAST_IN_PIPELINE
> -
>
> Key: HDFS-10714
> URL: https://issues.apache.org/jira/browse/HDFS-10714
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10714-01-draft.patch
>
>
> We came across an issue where a write failed even though 7 DNs were 
> available, due to a network fault at one datanode that was LAST_IN_PIPELINE. 
> It is similar to HDFS-6937.
> Scenario: DN3 has a network fault and min repl = 2.
> Write pipeline:
> DN1->DN2->DN3 => DN3 gives an ERROR_CHECKSUM ack, and so DN2 is marked as bad.
> DN1->DN4->DN3 => DN3 gives an ERROR_CHECKSUM ack, and so DN4 is marked as bad.
> ...
> And so on (every time DN3 is LAST_IN_PIPELINE), continuing until no more 
> datanodes are left to construct the pipeline.
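
The failure loop in the scenario above can be sketched as a short simulation. This is a hedged illustration, not the DFSClient recovery code: PipelineBlameSketch and simulate are hypothetical names, and the blame rule (the node immediately upstream of the one reporting ERROR_CHECKSUM is marked bad and replaced) is taken from the report's own description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of the scenario in the report.
class PipelineBlameSketch {
  /** Returns the datanodes marked bad before the client runs out of nodes. */
  static List<String> simulate(List<String> initialPipeline,
                               List<String> spares,
                               String faulty) {
    List<String> pipeline = new ArrayList<>(initialPipeline);
    Deque<String> remaining = new ArrayDeque<>(spares);
    List<String> markedBad = new ArrayList<>();
    while (true) {
      // The faulty node is the one that sends the ERROR_CHECKSUM ack.
      int reporter = pipeline.indexOf(faulty);
      if (reporter <= 0) {
        break; // faulty node absent or at the head: not modeled here
      }
      // Assumed rule from the report: the upstream neighbour is blamed.
      int blamed = reporter - 1;
      markedBad.add(pipeline.get(blamed));
      if (remaining.isEmpty()) {
        break; // no more datanodes to rebuild the pipeline
      }
      pipeline.set(blamed, remaining.poll());
    }
    return markedBad;
  }
}
```

With an initial pipeline of DN1, DN2, DN3, spares DN4 through DN7, and DN3 as the faulty node, every healthy middle node is marked bad in turn while DN3 itself is never blamed.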






[jira] [Comment Edited] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE

2016-10-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541527#comment-15541527
 ] 

Vinayakumar B edited comment on HDFS-10714 at 10/3/16 4:45 AM:
---

Hi [~yzhangal], [~kihwal],
Does the selected approach look fine to you?


was (Author: vinayrpet):
hi [~yzhangal],[~kihwal] 
Would it selected approach looks fine to you?







[jira] [Commented] (HDFS-10810) Setreplication removing block from underconstrcution temporarily when batch IBR is enabled.

2016-10-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541446#comment-15541446
 ] 

Mingliang Liu commented on HDFS-10810:
--

[~brahmareddy], can you confirm that branch-2 has the same problem, and that we 
need a separate branch-2 patch because of the conflict? Thanks!

>  Setreplication removing block from underconstrcution temporarily when batch 
> IBR is enabled.
> 
>
> Key: HDFS-10810
> URL: https://issues.apache.org/jira/browse/HDFS-10810
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10810-002.patch, HDFS-10810-003.patch, 
> HDFS-10810.patch
>
>






[jira] [Commented] (HDFS-10827) When there are unrecoverable ec block groups, Namenode Web UI shows "There are X missing blocks." but doesn't show the block names.

2016-10-02 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541441#comment-15541441
 ] 

Takanobu Asanuma commented on HDFS-10827:
-

The failed test passes on my local machine.

> When there are unrecoverable ec block groups, Namenode Web UI shows "There 
> are X missing blocks." but doesn't show the block names.
> ---
>
> Key: HDFS-10827
> URL: https://issues.apache.org/jira/browse/HDFS-10827
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
> Attachments: HDFS-10827.1.patch, HDFS-10827.2.patch, case_2.png, 
> case_3.png
>
>
> For RS-6-3, when there is one EC block group and
> 1) 0~3 out of 9 internal blocks are missing, the NN Web UI doesn't show any 
> warnings.
> 2) 4~8 out of 9 internal blocks are missing, the NN Web UI shows "There are 1 
> missing blocks." but doesn't show the block names. (Please see case_2.png.)
> 3) 9 out of 9 internal blocks are missing, the NN Web UI shows "There are 1 
> missing blocks." and also shows the block name. (Please see case_3.png.)
> We should fix case 2). I think the NN Web UI should show the block names, 
> since the EC block group is unrecoverable.
> The values come from JMX: "There are X missing blocks." is 
> {{NumberOfMissingBlocks}} and the block names are {{CorruptFiles}}.
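
The three cases above follow from the parity count of the policy: with 6 data and 3 parity units, a group can lose up to 3 internal blocks and still be reconstructed. A minimal sketch of that classification follows; EcGroupStatusSketch, Status, and classify are hypothetical names used only for illustration and are not part of the NameNode code.

```java
// Hypothetical helper mirroring cases 1) to 3) of the report.
class EcGroupStatusSketch {
  enum Status { RECOVERABLE, UNRECOVERABLE, ALL_MISSING }

  static Status classify(int dataUnits, int parityUnits, int missing) {
    int total = dataUnits + parityUnits;
    if (missing < 0 || missing > total) {
      throw new IllegalArgumentException("missing out of range: " + missing);
    }
    if (missing == total) {
      return Status.ALL_MISSING;      // case 3): 9 out of 9 for RS-6-3
    }
    if (missing > parityUnits) {
      return Status.UNRECOVERABLE;    // case 2): 4~8 out of 9 for RS-6-3
    }
    return Status.RECOVERABLE;        // case 1): 0~3 out of 9 for RS-6-3
  }
}
```

Under this reading, the report asks that the UI list block names for both UNRECOVERABLE and ALL_MISSING groups, not only the latter.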






[jira] [Commented] (HDFS-10827) When there are unrecoverable ec block groups, Namenode Web UI shows "There are X missing blocks." but doesn't show the block names.

2016-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541431#comment-15541431
 ] 

Hadoop QA commented on HDFS-10827:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10827 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12831262/HDFS-10827.2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 17a7e7d4d59f 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fe9ebe2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16971/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16971/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16971/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.



> When there are unrecoverable ec block groups, Namenode Web UI shows "There 
> are X missing blocks." but doesn't show the block names.
> ---
>
> Key: HDFS-10827
> URL: https://issues.apache.org/jira/browse/HDFS-10827
> Project: Hadoop HDFS

[jira] [Updated] (HDFS-10827) When there are unrecoverable ec block groups, Namenode Web UI shows "There are X missing blocks." but doesn't show the block names.

2016-10-02 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-10827:

Attachment: HDFS-10827.2.patch

I updated the patch with some improvements based on the discussion of 
HDFS-10826.







[jira] [Commented] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Fenghua Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541287#comment-15541287
 ] 

Fenghua Hu commented on HDFS-10690:
---

Failed case has nothing to do with the patch, and they also passed on my test 
bed. [~xyao]

> Optimize insertion/removal of replica in ShortCircuitCache.java
> ---
>
> Key: HDFS-10690
> URL: https://issues.apache.org/jira/browse/HDFS-10690
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0-alpha2
>Reporter: Fenghua Hu
>Assignee: Fenghua Hu
> Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch, 
> HDFS-10690.003.patch, HDFS-10690.004.patch, HDFS-10690.005.patch, 
> HDFS-10690.006.patch, HDFS-10690.007.patch, HDFS-10690.008.patch, 
> ShortCircuitCache_LinkedMap.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the 
> cached replicas:
> private final TreeMap evictable = new TreeMap<>();
> private final TreeMap evictableMmapped = new TreeMap<>();
> TreeMap uses a red-black tree for sorting. This isn't an issue when using a 
> traditional HDD, but when using high-performance SSD/PCIe flash, the cost of 
> inserting/removing an entry becomes considerable.
> To mitigate it, we designed a new list-based structure for replica tracking.
> The list is a doubly linked FIFO. The FIFO is time-ordered, so insertion is a 
> very low-cost operation. On the other hand, a list is not lookup-friendly. To 
> address this issue, we introduce two references into the ShortCircuitReplica 
> object:
> ShortCircuitReplica next = null;
> ShortCircuitReplica prev = null;
> In this way, no lookup is needed when removing a replica from the list. We 
> only need to modify its predecessor's and successor's references in the list.
> Our tests showed performance improvements of 15-50% when using PCIe flash 
> as the storage medium.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and 
> the patch will be posted soon.
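
The idea of embedding the links in the tracked object can be sketched as an intrusive doubly linked FIFO. This is a minimal illustration of the technique, not the patch itself: ReplicaFifoSketch and its Node type are hypothetical stand-ins (Node plays the role of ShortCircuitReplica with its prev and next fields), and remove assumes the node is currently in the list.

```java
// Hypothetical intrusive doubly linked FIFO; not the HDFS-10690 patch.
class ReplicaFifoSketch {
  // Stand-in for a replica that carries its own list links.
  static final class Node {
    final String key;
    Node prev;
    Node next;

    Node(String key) {
      this.key = key;
    }
  }

  private Node head;
  private Node tail;
  private int size;

  // O(1) append: the newest entry always goes to the tail.
  void addLast(Node n) {
    n.prev = tail;
    n.next = null;
    if (tail == null) {
      head = n;
    } else {
      tail.next = n;
    }
    tail = n;
    size++;
  }

  // O(1) removal without lookup: only the neighbours' links change.
  // Assumes n is currently a member of this list.
  void remove(Node n) {
    if (n.prev == null) {
      head = n.next;
    } else {
      n.prev.next = n.next;
    }
    if (n.next == null) {
      tail = n.prev;
    } else {
      n.next.prev = n.prev;
    }
    n.prev = null;
    n.next = null;
    size--;
  }

  // Evicts the oldest entry, or returns null when the list is empty.
  Node pollFirst() {
    Node n = head;
    if (n != null) {
      remove(n);
    }
    return n;
  }

  int size() {
    return size;
  }
}
```

Compared with a TreeMap keyed by insertion time, both append and removal here touch a constant number of references, at the cost of two extra fields per tracked object and no keyed lookup.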






[jira] [Comment Edited] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Fenghua Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541287#comment-15541287
 ] 

Fenghua Hu edited comment on HDFS-10690 at 10/3/16 1:27 AM:


The failed cases have nothing to do with the patch; they actually passed on my 
test bed. [~xyao]


was (Author: fenghua_hu):
Failed cases have nothing to do with the patch, and they also passed on my test 
bed. [~xyao]







[jira] [Comment Edited] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Fenghua Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541287#comment-15541287
 ] 

Fenghua Hu edited comment on HDFS-10690 at 10/3/16 1:26 AM:


Failed cases have nothing to do with the patch, and they also passed on my test 
bed. [~xyao]


was (Author: fenghua_hu):
Failed case has nothing to do with the patch, and they also passed on my test 
bed. [~xyao]







[jira] [Updated] (HDFS-7530) Allow renaming of encryption zone roots

2016-10-02 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-7530:
--
Fix Version/s: 2.6.5

> Allow renaming of encryption zone roots
> ---
>
> Key: HDFS-7530
> URL: https://issues.apache.org/jira/browse/HDFS-7530
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Fix For: 2.7.0, 2.6.5
>
> Attachments: HDFS-7530.001.patch, HDFS-7530.002.patch, 
> HDFS-7530.003.patch
>
>
> It should be possible to do
> hdfs dfs -mv /ezroot /newnameforezroot






[jira] [Commented] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15540714#comment-15540714
 ] 

Hadoop QA commented on HDFS-10690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 187 unchanged - 4 fixed = 187 total (was 191) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 33s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 95m 2s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSShell |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10690 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12831252/HDFS-10690.008.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 5b89cdb9bf4f 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fe9ebe2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16970/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16970/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
| Console output | 

[jira] [Updated] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Fenghua Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fenghua Hu updated HDFS-10690:
--
Attachment: HDFS-10690.008.patch







[jira] [Updated] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-10-02 Thread Fenghua Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fenghua Hu updated HDFS-10690:
--
Attachment: (was: HDFS-10690.008.patch)

> Optimize insertion/removal of replica in ShortCircuitCache.java
> ---
>
> Key: HDFS-10690
> URL: https://issues.apache.org/jira/browse/HDFS-10690
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0-alpha2
>Reporter: Fenghua Hu
>Assignee: Fenghua Hu
> Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch, 
> HDFS-10690.003.patch, HDFS-10690.004.patch, HDFS-10690.005.patch, 
> HDFS-10690.006.patch, HDFS-10690.007.patch, HDFS-10690.008.patch, 
> ShortCircuitCache_LinkedMap.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the 
> cached replicas.
> private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
> private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new 
> TreeMap<>();
> TreeMap employs a Red-Black tree for sorting. This isn't an issue when using 
> traditional HDD. But when using high-performance SSD/PCIe Flash, the cost of 
> inserting/removing an entry becomes considerable.
> To mitigate it, we designed a new list-based structure for replica tracking.
> The list is a doubly linked FIFO. FIFO is time-based, thus insertion is a 
> very low cost operation. On the other hand, a list is not lookup-friendly. To 
> address this issue, we introduce two references into the ShortCircuitReplica 
> object:
> ShortCircuitReplica next = null;
> ShortCircuitReplica prev = null;
> In this way, no lookup is needed when removing a replica from the list. We 
> only need to modify its predecessor's and successor's references in the list.
> Our tests showed a 15-50% performance improvement when using PCIe flash 
> as storage media.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and 
> the patch will be posted soon.
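
The list-based tracking described above can be illustrated with a minimal sketch. This is not the code from the attached patches: ReplicaNode and EvictionFifo are hypothetical names, and only the next/prev fields come from the description. It shows why storing the links inside the tracked object makes removal O(1) with no lookup.

```java
/** Hypothetical stand-in for ShortCircuitReplica; only next/prev come from the description. */
final class ReplicaNode {
  final long blockId;      // illustrative payload, an assumption
  ReplicaNode next = null; // successor in the eviction list
  ReplicaNode prev = null; // predecessor in the eviction list
  boolean linked = false;  // guards against double insert or double remove

  ReplicaNode(long blockId) {
    this.blockId = blockId;
  }
}

/** FIFO eviction list: oldest entry at the head, newest at the tail. */
final class EvictionFifo {
  private ReplicaNode head;
  private ReplicaNode tail;
  private int size;

  /** Appends at the tail in O(1); insertion order is time order. */
  void addLast(ReplicaNode r) {
    if (r.linked) {
      throw new IllegalStateException("replica is already in the list");
    }
    r.prev = tail;
    r.next = null;
    if (tail == null) {
      head = r;
    } else {
      tail.next = r;
    }
    tail = r;
    r.linked = true;
    size++;
  }

  /** Unlinks r in O(1) by patching its neighbours; no search is needed. */
  void remove(ReplicaNode r) {
    if (!r.linked) {
      throw new IllegalStateException("replica is not in the list");
    }
    if (r.prev == null) {
      head = r.next;
    } else {
      r.prev.next = r.next;
    }
    if (r.next == null) {
      tail = r.prev;
    } else {
      r.next.prev = r.prev;
    }
    r.prev = null;
    r.next = null;
    r.linked = false;
    size--;
  }

  /** Removes and returns the oldest entry, or null if the list is empty. */
  ReplicaNode pollFirst() {
    ReplicaNode r = head;
    if (r != null) {
      remove(r);
    }
    return r;
  }

  int size() {
    return size;
  }
}
```

Compared with a TreeMap keyed by eviction time, this trades sorted lookup for constant-time insert and remove, which fits a purely time-ordered eviction policy.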






[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15539978#comment-15539978
 ] 

Hadoop QA commented on HDFS-9820:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} hadoop-tools/hadoop-distcp: The patch generated 
44 new + 172 unchanged - 12 fixed = 216 total (was 184) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
55s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9820 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12831237/HDFS-9820.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 113c4e3bd32b 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fe9ebe2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16969/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16969/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16969/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>   

[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-10-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15539938#comment-15539938
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Per discussion in HDFS-10314, we are going to continue to work on HDFS-9820. 
Uploaded rev005.


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch, HDFS-9820.005.patch
>
>
> A common use scenario (scenario 1): 
> # create snapshot sx in clusterX, 
> # do some experiments in clusterX, which create some files, 
> # throw away the changed files and go back to sx.
> Another scenario (scenario 2): there is a production cluster and a backup 
> cluster, and we periodically sync the data from the production cluster to 
> the backup cluster with distcp. 
> The cluster in scenario 1 could be the backup cluster in scenario 2.
> For scenario 1:
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and it 
> involves some complexity and challenges. Until that jira is implemented, we 
> rely on distcp to copy from the snapshot to the current state. However, the 
> performance of this operation can be very bad because we have to go through 
> all files even if only a few files changed.
> For scenario 2:
> HDFS-7535 improved distcp performance by avoiding copying files whose names 
> changed since the last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from the source to the target cluster by copying only the files changed 
> since the last backup. It uses the snapshot diff to find all changed files 
> and copies only those.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira proposes a variation of HDFS-8828: find the files changed in the 
> target cluster since the last snapshot sx, and copy them from snapshot sx of 
> either the source or the target cluster, to restore the target cluster's 
> current state to sx. 
> Specifically,
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> Then run distcp with the copy list, copying from the source cluster's 
> corresponding snapshot.
> This could be a new command line switch -rdiff in distcp.
> As a native restore feature, HDFS-4167 would still be ideal to have. However, 
> HDFS-9820 would hopefully be easier to implement, before HDFS-4167 is in 
> place.
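
The restore steps listed in the description can be sketched roughly as follows. This is only an illustration of the three rules (rename back, delete, copy), not the distcp implementation: ChangeType, DiffEntry and RestorePlan are hypothetical names, and the diff is assumed to already be available as a list of entries describing what changed in the target since snapshot sx.

```java
import java.util.ArrayList;
import java.util.List;

/** Kinds of change covered by the description; other kinds are not modelled here. */
enum ChangeType { RENAMED, CREATED, MODIFIED }

/** One hypothetical entry of the diff between snapshot sx and the target's current state. */
final class DiffEntry {
  final ChangeType type;
  final String path;    // path in the target's current state
  final String oldPath; // for RENAMED: the path in snapshot sx; otherwise null

  DiffEntry(ChangeType type, String path, String oldPath) {
    this.type = type;
    this.path = path;
    this.oldPath = oldPath;
  }
}

/** Groups the work needed to bring the target back to snapshot sx. */
final class RestorePlan {
  final List<String[]> renamesBack = new ArrayList<>(); // {currentPath, pathInSx}
  final List<String> deletes = new ArrayList<>();       // created since sx
  final List<String> copyList = new ArrayList<>();      // modified since sx

  static RestorePlan from(List<DiffEntry> diff) {
    RestorePlan plan = new RestorePlan();
    for (DiffEntry e : diff) {
      switch (e.type) {
        case RENAMED:
          // renamed: rename it back to the name it had in sx
          plan.renamesBack.add(new String[] {e.path, e.oldPath});
          break;
        case CREATED:
          // created in the target after sx: delete it
          plan.deletes.add(e.path);
          break;
        case MODIFIED:
          // modified: copy it again from the corresponding snapshot
          plan.copyList.add(e.path);
          break;
        default:
          throw new IllegalArgumentException("unhandled change type: " + e.type);
      }
    }
    return plan;
  }
}
```

Only the entries on the copy list require data transfer, which is where the saving over a full copy from the snapshot would come from.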






[jira] [Updated] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-10-02 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9820:

Attachment: HDFS-9820.005.patch







[jira] [Comment Edited] (HDFS-10426) TestPendingInvalidateBlock failed in trunk

2016-10-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15539826#comment-15539826
 ] 

Masatake Iwasaki edited comment on HDFS-10426 at 10/2/16 6:04 AM:
--

Thanks for the comments, [~liuml07] and [~linyiqun].

I think the cause is that the {{delete}} and the assertion were called before 
the replication count reached 2. {{close}} after writing a file can return once 
at least 1 IBR from a datanode has reached the namenode. I had mentioned this 
scenario but failed to check it before committing. We can use 
{{DFSTestUtil#waitForReplication}} to wait for the IBRs. I will upload an 
addendum patch later.


was (Author: iwasakims):
Thanks for the comments, [~liuml07] and [~linyiqun].

I think the cause is that the assertion was called before the replication count 
reached 2. {{delete}} can return once at least 1 IBR from a datanode has reached 
the namenode. I had mentioned this scenario but failed to check it before 
committing. We can use {{DFSTestUtil#waitForReplication}} to wait for the IBRs. 
I will upload an addendum patch later.
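
The idea of waiting for the IBRs before deleting and asserting can be sketched as a generic polling helper. This is not {{DFSTestUtil#waitForReplication}} itself: ReplicationWait and its parameters are hypothetical, and the replica count is supplied by the caller, so the sketch only shows the wait-then-assert pattern.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper: poll a replica count until it reaches the expected value. */
final class ReplicationWait {
  private ReplicationWait() {
  }

  /**
   * Returns true once current.getAsInt() is at least expected,
   * or false if timeoutMs elapses or the thread is interrupted first.
   */
  static boolean waitForCount(IntSupplier current, int expected,
      long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (current.getAsInt() < expected) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out before enough reports arrived
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return true;
  }
}
```

In a test, the delete and the pending-deletion assertions would run only after such a wait returns true, so the expected count no longer depends on how many IBRs happened to arrive before close returned.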

> TestPendingInvalidateBlock failed in trunk
> --
>
> Key: HDFS-10426
> URL: https://issues.apache.org/jira/browse/HDFS-10426
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10426.001.patch, HDFS-10426.002.patch, 
> HDFS-10426.003.patch, HDFS-10426.004.patch, HDFS-10426.005.patch, 
> HDFS-10426.006.patch
>
>
> The test {{TestPendingInvalidateBlock}} fails intermittently. The stack trace:
> {code}
> org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
> testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)
>   Time elapsed: 7.703 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
> {code}
> It looks like the {{invalidateBlock}} has been removed before we do the check:
> {code}
> // restart NN
> cluster.restartNameNode(true);
> dfs.delete(foo, true);
> Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
> Assert.assertEquals(REPLICATION, cluster.getNamesystem()
> .getPendingDeletionBlocks());
> Assert.assertEquals(REPLICATION,
> dfs.getPendingDeletionBlocksCount());
> {code}
> I looked into the related configuration and found that the property 
> {{dfs.namenode.replication.interval}} is set to just 1 second in this test. 
> If the delete operation is slow and completes only after the delay of 
> {{dfs.namenode.startup.delay.block.deletion.sec}} has passed, this failure 
> occurs. The stack trace above shows the failed test took 7.7s, more than 
> 5+1 seconds.
> One way to improve this:
> * Increase the time of {{dfs.namenode.replication.interval}}






[jira] [Comment Edited] (HDFS-10426) TestPendingInvalidateBlock failed in trunk

2016-10-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15539826#comment-15539826
 ] 

Masatake Iwasaki edited comment on HDFS-10426 at 10/2/16 6:02 AM:
--

Thanks for the comments, [~liuml07] and [~linyiqun].

I think the cause is that the assertion was called before the replication count 
reached 2. {{delete}} can return once at least 1 IBR from a datanode has reached 
the namenode. I had mentioned this scenario but failed to check it before 
committing. We can use {{DFSTestUtil#waitForReplication}} to wait for the IBRs. 
I will upload an addendum patch later.


was (Author: iwasakims):
Thanks for the comments, [~liuml07] and [~linyiqun].

I think the cause is that {{delete}} was called before the replication count 
reached 2. {{delete}} can return once at least 1 IBR from a datanode has reached 
the namenode. I had mentioned this scenario but failed to check it before 
committing. We can use {{DFSTestUtil#waitForReplication}} to wait for the IBRs. 
I will upload an addendum patch later.







[jira] [Commented] (HDFS-10426) TestPendingInvalidateBlock failed in trunk

2016-10-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15539826#comment-15539826
 ] 

Masatake Iwasaki commented on HDFS-10426:
-

Thanks for the comments, [~liuml07] and [~linyiqun].

I think the cause is that {{delete}} was called before the replication count 
reached 2. {{delete}} can return once at least 1 IBR from a datanode has reached 
the namenode. I had mentioned this scenario but failed to check it before 
committing. We can use {{DFSTestUtil#waitForReplication}} to wait for the IBRs. 
I will upload an addendum patch later.



