[jira] [Updated] (HDFS-13124) hadoop-daemon.sh exits with 1 when running HDFS balancer on balanced cluster

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Kostrzewa updated HDFS-13124:
--
Affects Version/s: (was: 2.7.4)
   2.7.3

> hadoop-daemon.sh exits with 1 when running HDFS balancer on balanced cluster
> 
>
> Key: HDFS-13124
> URL: https://issues.apache.org/jira/browse/HDFS-13124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, scripts
>Affects Versions: 2.7.3
>Reporter: Zbigniew Kostrzewa
>Priority: Minor
>
> When running the HDFS balancer via the {{sbin/start-balancer.sh}} script on an 
> already balanced cluster, the script exits with 1 even though the CLI behind it 
> (i.e. {{hdfs balancer}}) exits with 0. This is probably caused by the following 
> piece of code in {{hadoop-daemon.sh}}:
> {code:java}
> sleep 3;
> if ! ps -p $! > /dev/null ; then
>   exit 1
> fi
> {code}
> It seems that, on a balanced cluster, the CLI command finishes so quickly 
> that the above {{ps}} check no longer finds the process.
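> A minimal sketch of one possible adjustment (not the actual fix; only the 
> surrounding {{hadoop-daemon.sh}} context is assumed) would be to consult the 
> child's real exit status before declaring failure:
> {code:bash}
> pid=$!
> sleep 3
> if ! ps -p "$pid" > /dev/null ; then
>   # The process is already gone. Instead of unconditionally failing,
>   # collect its exit status: on an already balanced cluster the balancer
>   # exits quickly but successfully.
>   if wait "$pid" ; then
>     exit 0
>   else
>     exit 1
>   fi
> fi
> {code}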



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13128) HDFS balancer in single node cluster fails with "Another Balancer is running.."

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Kostrzewa updated HDFS-13128:
--
Description: 
In a single node "cluster", HDFS balancer fails with:
{noformat}
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
java.io.IOException: Another Balancer is running.. Exiting ...
{noformat}
and the NameNode logs show:
{noformat}
2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 for /system/balancer.id
2018-02-09 07:23:21,739 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1126407107_1
2018-02-09 07:23:21,758 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.append: Failed to APPEND_FILE /system/balancer.id for 
DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 because this file lease is 
currently owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
10.9.4.184:49781 Call#12 Retry#0: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
APPEND_FILE /system/balancer.id for DFSClient_NONMAPREDUCE_1275100437_1 on 
10.9.4.184 because this file lease is currently owned by 
DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,773 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.9.4.184:30010 is added to 
blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 size 15
2018-02-09 07:23:21,776 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile: /system/balancer.id is closed by 
DFSClient_NONMAPREDUCE_-1126407107_1{noformat}

  was:
In a single node "cluster", HDFS balancer fails with:
{noformat}
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
java.io.IOException: Another Balancer is running.. Exiting ...
{noformat}
and the NameNode logs show:
{noformat}
2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]} for 
/system/balancer.id
2018-02-09 07:23:21,739 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1126407107_1
2018-02-09 07:23:21,758 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.append: Failed to APPEND_FILE /system/balancer.id for 
DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 because this file lease is 
currently owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
10.9.4.184:49781 Call#12 Retry#0: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
APPEND_FILE /system/balancer.id for DFSClient_NONMAPREDUCE_1275100437_1 on 
10.9.4.184 because this file lease is currently owned by 
DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,773 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.9.4.184:30010 is added to 
blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 size 15
2018-02-09 07:23:21,776 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile: /system/balancer.id is closed by 
DFSClient_NONMAPREDUCE_-1126407107_1{noformat}


> HDFS balancer in single node cluster fails with "Another Balancer is 
> running.."
> ---
>
> Key: HDFS-13128
> URL: https://issues.apache.org/jira/browse/HDFS-13128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.7.3
>Reporter: Zbigniew Kostrzewa
>Priority: Minor
>
> In a single node "cluster", HDFS balancer fails with:
> {noformat}
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> java.io.IOException: Another Balancer is running.. Exiting ...
> {noformat}
> and the NameNode logs show:
> {noformat}
> 2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
> primaryNodeIndex=-1, 
> 

[jira] [Created] (HDFS-13128) HDFS balancer in single node cluster fails with "Another Balancer is running.."

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)
Zbigniew Kostrzewa created HDFS-13128:
-

 Summary: HDFS balancer in single node cluster fails with "Another 
Balancer is running.."
 Key: HDFS-13128
 URL: https://issues.apache.org/jira/browse/HDFS-13128
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover, hdfs
Affects Versions: 2.7.3
Reporter: Zbigniew Kostrzewa


In a single node "cluster", HDFS balancer fails with:
{noformat}
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
java.io.IOException: Another Balancer is running.. Exiting ...
{noformat}
and the NameNode logs show:
{noformat}
2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]} for 
/system/balancer.id
2018-02-09 07:23:21,739 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1126407107_1
2018-02-09 07:23:21,758 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.append: Failed to APPEND_FILE /system/balancer.id for 
DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 because this file lease is 
currently owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 
10.9.4.184
2018-02-09 07:23:21,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
10.9.4.184:49781 Call#12 Retry#0: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to APPEND_FILE 
/system/balancer.id for DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 
because this file lease is currently 
owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,773 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.9.4.184:30010 is added to 
blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 size 15
2018-02-09 07:23:21,776 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile: /system/balancer.id is closed by 
DFSClient_NONMAPREDUCE_-1126407107_1{noformat}
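If a lingering lease on {{/system/balancer.id}} is what blocks the append (the 
lease-owner messages above point in that direction), a possible workaround (an 
assumption, not a confirmed fix) is to force lease recovery on the file, or 
remove it, before rerunning the balancer:
{code:bash}
# force lease recovery on the balancer id file
hdfs debug recoverLease -path /system/balancer.id -retries 3
# or, if the file is known to be stale, remove it and rerun the balancer
hdfs dfs -rm /system/balancer.id
hdfs balancer
{code}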



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13128) HDFS balancer in single node cluster fails with "Another Balancer is running.."

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Kostrzewa updated HDFS-13128:
--
Description: 
In a single node "cluster", HDFS balancer fails with:
{noformat}
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
java.io.IOException: Another Balancer is running.. Exiting ...
{noformat}
and the NameNode logs show:
{noformat}
2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]} for 
/system/balancer.id
2018-02-09 07:23:21,739 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1126407107_1
2018-02-09 07:23:21,758 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.append: Failed to APPEND_FILE /system/balancer.id for 
DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 because this file lease is 
currently owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
10.9.4.184:49781 Call#12 Retry#0: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
APPEND_FILE /system/balancer.id for DFSClient_NONMAPREDUCE_1275100437_1 on 
10.9.4.184 because this file lease is currently owned by 
DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,773 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.9.4.184:30010 is added to 
blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 size 15
2018-02-09 07:23:21,776 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile: /system/balancer.id is closed by 
DFSClient_NONMAPREDUCE_-1126407107_1{noformat}

  was:
In a single node "cluster", HDFS balancer fails with:
{noformat}
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
java.io.IOException: Another Balancer is running.. Exiting ...
{noformat}
and the NameNode logs show:
{noformat}
2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]} for 
/system/balancer.id
2018-02-09 07:23:21,739 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1126407107_1
2018-02-09 07:23:21,758 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.append: Failed to APPEND_FILE /system/balancer.id for 
DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 because this file lease is 
currently owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 
10.9.4.184
2018-02-09 07:23:21,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
10.9.4.184:49781 Call#12 Retry#0: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to APPEND_FILE 
/system/balancer.id for DFSClient_NONMAPREDUCE_1275100437_1 on 10.9.4.184 
because this file lease is currently 
owned by DFSClient_NONMAPREDUCE_-1126407107_1 on 10.9.4.184
2018-02-09 07:23:21,773 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.9.4.184:30010 is added to 
blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-dae233d3-5c71-498e-9a8b-669bff3fccdf:NORMAL:10.9.4.184:30010|RBW]]}
 size 15
2018-02-09 07:23:21,776 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile: /system/balancer.id is closed by 
DFSClient_NONMAPREDUCE_-1126407107_1{noformat}


> HDFS balancer in single node cluster fails with "Another Balancer is 
> running.."
> ---
>
> Key: HDFS-13128
> URL: https://issues.apache.org/jira/browse/HDFS-13128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.7.3
>Reporter: Zbigniew Kostrzewa
>Priority: Minor
>
> In a single node "cluster", HDFS balancer fails with:
> {noformat}
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> java.io.IOException: Another Balancer is running.. Exiting ...
> {noformat}
> and the NameNode logs show:
> {noformat}
> 2018-02-09 07:23:21,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocate blk_1073741865_1041{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
> primaryNodeIndex=-1, 

[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358033#comment-16358033
 ] 

genericqa commented on HDFS-10453:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}143m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestBlocksScheduledCounter |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-10453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909885/HDFS-10453-trunk.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 351fb731547f 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1bc03dd |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23000/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23000/testReport/ |
| Max. process+thread count | 

[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13120:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Fixed the unit test build issue and committed to branch-2.8.

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4, 2.7.6
>
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch, 
> HDFS-13120.branch-2.8.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This can lead to 
> an assertion failure upon later DeleteSnapshot and getSnapshotDiff operations. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post a unit test to reproduce this and a fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 
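> A hedged repro sketch (assumed {{MiniDFSCluster}} API usage, not the unit test 
> mentioned above) that exercises the same path as the stack trace:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.DFSTestUtil;
> import org.apache.hadoop.hdfs.DistributedFileSystem;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
>
> public class ConcatSnapshotDiffRepro {
>   public static void main(String[] args) throws Exception {
>     MiniDFSCluster cluster =
>         new MiniDFSCluster.Builder(new Configuration()).numDataNodes(1).build();
>     try {
>       DistributedFileSystem dfs = cluster.getFileSystem();
>       Path dir = new Path("/dir");
>       dfs.mkdirs(dir);
>       dfs.allowSnapshot(dir);
>       dfs.createSnapshot(dir, "s0");
>       // files created after s0 end up in the snapshot diff's CREATED list
>       Path target = new Path(dir, "0.txt");
>       Path src1 = new Path(dir, "1.txt");
>       Path src2 = new Path(dir, "2.txt");
>       DFSTestUtil.createFile(dfs, target, 1024, (short) 1, 0L);
>       DFSTestUtil.createFile(dfs, src1, 1024, (short) 1, 0L);
>       DFSTestUtil.createFile(dfs, src2, 1024, (short) 1, 0L);
>       dfs.concat(target, new Path[] {src1, src2});
>       dfs.createSnapshot(dir, "s1");
>       // if the concat left the diff of s0 only partially cleaned up,
>       // deleting s0 combines the diffs and can hit the AssertionError above
>       dfs.deleteSnapshot(dir, "s0");
>     } finally {
>       cluster.shutdown();
>     }
>   }
> }
> {code}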



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13099) RBF: Use the ZooKeeper as the default State Store

2018-02-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357971#comment-16357971
 ] 

Hudson commented on HDFS-13099:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13638 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13638/])
HDFS-13099. RBF: Use the ZooKeeper as the default State Store. (yqlin: rev 
543f3abbee79d7ec70353f0cdda6397ee001324e)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/federation/store/FederationStateStoreTestUtils.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSRouterFederation.md
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/federation/RouterConfigBuilder.java


> RBF: Use the ZooKeeper as the default State Store
> -
>
> Key: HDFS-13099
> URL: https://issues.apache.org/jira/browse/HDFS-13099
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
>  Labels: incompatible, incompatibleChange
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.2
>
> Attachments: HDFS-13099.001.patch, HDFS-13099.002.patch, 
> HDFS-13099.003.patch, HDFS-13099.004.patch, HDFS-13099.005.patch, 
> HDFS-13099.006.patch
>
>
> Currently, the State Store driver related settings are only defined in the 
> driver implementation classes.
> {noformat}
> public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl {
> ...
>   /** Configuration keys. */
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>   DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>   FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>   "/hdfs-federation";
> ..
> {noformat}
> They should instead be moved into the class {{DFSConfigKeys}} and documented in 
> {{hdfs-default.xml}}. This will help more users discover these settings and 
> learn how to use them.
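> A minimal sketch of the intended move (the {{FEDERATION_STORE_PREFIX}} value 
> below is an assumption; only the key names and default come from the snippet 
> above):
> {code:java}
> // Hedged sketch: hoist the ZK driver keys into DFSConfigKeys so that they sit
> // next to the other router State Store settings (illustrative, not the patch).
> public final class DFSConfigKeysSketch {
>   // assumed value of the existing DFSConfigKeys.FEDERATION_STORE_PREFIX
>   public static final String FEDERATION_STORE_PREFIX =
>       "dfs.federation.router.store.";
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>       FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>       FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>       "/hdfs-federation";
>
>   private DFSConfigKeysSketch() {
>   }
> }
> {code}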



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13099) RBF: Use the ZooKeeper as the default State Store

2018-02-08 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13099:
-
Fix Version/s: (was: 3.0.1)
   3.0.2
 Release Note: Change the default State Store from the local file implementation 
to ZooKeeper. This requires an additional ZooKeeper address to be configured. 
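A minimal configuration sketch for the new default (the property names below 
are assumptions based on the usual RBF and ZooKeeper settings, not taken from 
this patch):
{code:xml}
<!-- hdfs-site.xml: ZooKeeper is now the default State Store driver -->
<property>
  <name>dfs.federation.router.store.driver.class</name>
  <value>org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl</value>
</property>

<!-- core-site.xml: the additional ZooKeeper quorum address that is required -->
<property>
  <name>hadoop.zk.address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
{code}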

> RBF: Use the ZooKeeper as the default State Store
> -
>
> Key: HDFS-13099
> URL: https://issues.apache.org/jira/browse/HDFS-13099
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
>  Labels: incompatible, incompatibleChange
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.2
>
> Attachments: HDFS-13099.001.patch, HDFS-13099.002.patch, 
> HDFS-13099.003.patch, HDFS-13099.004.patch, HDFS-13099.005.patch, 
> HDFS-13099.006.patch
>
>
> Currently, the State Store driver related settings are only defined in the 
> driver implementation classes.
> {noformat}
> public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl {
> ...
>   /** Configuration keys. */
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>   DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>   FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>   "/hdfs-federation";
> ..
> {noformat}
> They should instead be moved into the class {{DFSConfigKeys}} and documented in 
> {{hdfs-default.xml}}. This will help more users discover these settings and 
> learn how to use them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13099) RBF: Use the ZooKeeper as the default State Store

2018-02-08 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13099:
-
   Resolution: Fixed
 Hadoop Flags: Incompatible change, Reviewed
Fix Version/s: 3.0.1
   2.9.1
   2.10.0
   3.1.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-3.0, branch-2 and branch-2.9.
Thanks [~elgoiri] for the review. Added a release note for this JIRA since this 
is an incompatible change.

> RBF: Use the ZooKeeper as the default State Store
> -
>
> Key: HDFS-13099
> URL: https://issues.apache.org/jira/browse/HDFS-13099
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
>  Labels: incompatible, incompatibleChange
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-13099.001.patch, HDFS-13099.002.patch, 
> HDFS-13099.003.patch, HDFS-13099.004.patch, HDFS-13099.005.patch, 
> HDFS-13099.006.patch
>
>
> Currently, the State Store driver related settings are only defined in the 
> driver implementation classes.
> {noformat}
> public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl {
> ...
>   /** Configuration keys. */
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>   DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>   FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>   "/hdfs-federation";
> ..
> {noformat}
> They should instead be moved into the class {{DFSConfigKeys}} and documented in 
> {{hdfs-default.xml}}. This will help more users discover these settings and 
> learn how to use them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-13001:
-
Attachment: HDFS-13001.002.patch

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
> Attachments: HDFS-13001.001.patch, HDFS-13001.002.patch
>
>
> Testcase improvement for the DFSAdmin commands. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are up (online)
> (2) NN1 is down (offline) and NN2 is up (online)
> (3) NN1 is up (online) and NN2 is down (offline)
> (4) Both NameNodes are down (offline)
> The testcases can be improved along the lines of the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}
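> A hypothetical call site (the command and the expected message prefix are 
> placeholders, not part of this patch) to illustrate how the helper would be 
> used:
> {code:java}
> @Test
> public void testRefreshNodesWithNN1Down() throws Exception {
>   // nnIndex 0 shuts down NN1 and makes NN2 active, so the command is still
>   // expected to succeed against the remaining active NameNode.
>   testExecuteDFSAdminCommand(0, new String[] {"-refreshNodes"},
>       "Refresh nodes successful for");
> }
> {code}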



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-13001:
-
Status: In Progress  (was: Patch Available)

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0, 2.9.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
> Attachments: HDFS-13001.001.patch, HDFS-13001.002.patch
>
>
> Testcase improvement for the DFSAdmin commands. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are up (online)
> (2) NN1 is down (offline) and NN2 is up (online)
> (3) NN1 is up (online) and NN2 is down (offline)
> (4) Both NameNodes are down (offline)
> The testcases can be improved along the lines of the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-13001:
-
Status: Patch Available  (was: In Progress)

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0, 2.9.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
> Attachments: HDFS-13001.001.patch, HDFS-13001.002.patch
>
>
> Testcase improvement for the DFSAdmin commands. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are up (online)
> (2) NN1 is down (offline) and NN2 is up (online)
> (3) NN1 is up (online) and NN2 is down (offline)
> (4) Both NameNodes are down (offline)
> The testcases can be improved along the lines of the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357925#comment-16357925
 ] 

Ajay Kumar commented on HDFS-10453:
---

[~hexiaoqiao], I was referring to moving the check {{if (bc == null || 
(bc.isUnderConstruction() && block.equals(bc.getLastBlock())))}} before 
{{if (targets == null || targets.length == 0)}}.

As [~xkrogen] mentioned earlier, "A deleted block should always be removed from 
the needingReplications list regardless of whether or not any targets were 
found for it, so it makes sense to perform this check before the check for an 
empty targets list." In the current scenario it is only removed in the next 
iteration of {{computeReplicationWorkForBlocks}}.
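A self-contained sketch of the ordering being discussed (simplified stand-in 
types, not the actual BlockManager code):
{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReplicationCheckOrderSketch {
  /** Hypothetical stand-in for a block queued in neededReplications. */
  static class QueuedBlock {
    boolean collectionDeleted;  // bc == null, i.e. the file was deleted
    boolean targetsFound;       // whether chooseTargets produced any nodes
  }

  public static void main(String[] args) {
    List<QueuedBlock> neededReplications = new ArrayList<>();
    QueuedBlock b = new QueuedBlock();
    b.collectionDeleted = true;   // deleted concurrently
    b.targetsFound = false;       // and no targets could be chosen
    neededReplications.add(b);

    for (Iterator<QueuedBlock> it = neededReplications.iterator(); it.hasNext();) {
      QueuedBlock blk = it.next();
      // 1) check the deleted/under-construction condition first and drop the
      //    block immediately, regardless of whether targets were found ...
      if (blk.collectionDeleted) {
        it.remove();
        continue;
      }
      // 2) ... and only then give up on blocks with no targets; in the opposite
      //    order a deleted block stays queued until the next iteration.
      if (!blk.targetsFound) {
        continue;
      }
      // schedule replication work here
    }
    System.out.println("still queued: " + neededReplications.size()); // prints 0
  }
}
{code}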

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453-branch-2.7.007.patch, HDFS-10453-branch-2.7.008.patch, 
> HDFS-10453-branch-2.8.001.patch, HDFS-10453-branch-2.9.001.patch, 
> HDFS-10453-branch-3.0.001.patch, HDFS-10453-trunk.001.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while the ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor gets stuck, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the 
> references in the blocksmap, needReplications, etc.; the block's NumBytes is set to 
> NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing the 
> whole cluster because no node satisfies the goodness criteria 
> (its remaining space would have to reach the required size Long.MAX_VALUE). 
> During stage #3 

[jira] [Updated] (HDFS-13127) Fix TestContainerStateManager and TestOzoneConfigurationFields

2018-02-08 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-13127:
-
Attachment: HDFS-13127-HDFS-7240.001.patch

> Fix TestContainerStateManager and TestOzoneConfigurationFields
> --
>
> Key: HDFS-13127
> URL: https://issues.apache.org/jira/browse/HDFS-13127
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-13127-HDFS-7240.001.patch
>
>
> TestContainerStateManager is failing because SCM is unable to find a 
> container with enough free space to allocate a new block in the container.
> TestOzoneConfigurationFields is failing because the configs "ozone.rest.servers" 
> and "ozone.rest.client.port" are added in ozone-default.xml; however, they 
> aren't declared as config keys anywhere in the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13127) Fix TestContainerStateManager and TestOzoneConfigurationFields

2018-02-08 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-13127:
-
Status: Patch Available  (was: Open)

> Fix TestContainerStateManager and TestOzoneConfigurationFields
> --
>
> Key: HDFS-13127
> URL: https://issues.apache.org/jira/browse/HDFS-13127
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-13127-HDFS-7240.001.patch
>
>
> TestContainerStateManager is failing because SCM is unable to find a 
> container with enough free space to allocate a new block in the container.
> TestOzoneConfigurationFields is failing because the configs "ozone.rest.servers" 
> and "ozone.rest.client.port" are added in ozone-default.xml; however, they 
> aren't declared as config keys anywhere in the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13127) Fix TestContainerStateManager and TestOzoneConfigurationFields

2018-02-08 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDFS-13127:


 Summary: Fix TestContainerStateManager and 
TestOzoneConfigurationFields
 Key: HDFS-13127
 URL: https://issues.apache.org/jira/browse/HDFS-13127
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: HDFS-7240


TestContainerStateManager is failing because SCM is unable to find a container 
with enough free space to allocate a new block in the container.

TestOzoneConfigurationFields is failing because the configs "ozone.rest.servers" 
and "ozone.rest.client.port" are added in ozone-default.xml; however, they aren't 
declared as config keys anywhere in the code.
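A minimal sketch of the kind of change implied (the class and field names are 
assumptions; only the key strings come from the description):
{code:java}
// Hedged sketch: declare the keys that ozone-default.xml documents so the
// configuration-fields test can match them against a config-keys class.
public final class OzoneRestConfigKeysSketch {
  public static final String OZONE_REST_SERVERS = "ozone.rest.servers";
  public static final String OZONE_REST_CLIENT_PORT = "ozone.rest.client.port";

  private OzoneRestConfigKeysSketch() {
  }
}
{code}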



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357897#comment-16357897
 ] 

He Xiaoqiao commented on HDFS-10453:


[~xkrogen], [~ajayydv]
Sorry for the late comments. I just updated the v008 patch 
[#HDFS-10453-branch-2.7.008.patch] for branch-2.7, and the patches for the other 
branches are also ready; they correct how the number of bytes of a block is 
obtained via {{getNumBytes()}}.

{quote}He Xiaoqiao, Patch v8 doesn't have changes from patch v7 in 
BlockManager#computeReplicationWorkForBlocks. Is that intentional?{quote}
[~ajayydv] Thanks for your comments. Since the new patch 
[#HDFS-10453-branch-2.7.008.patch] saves the block size within the constructor 
of ReplicationWork rather than calling block.getNumBytes() within 
chooseTargets(), it is no longer possible to choose targets for a block whose 
length is {{Long.MAX_VALUE}}. FYI.
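A self-contained sketch of that idea (hypothetical simplified types, not the 
actual ReplicationWork class):
{code:java}
public class ReplicationWorkSketch {
  /** Stand-in for a block whose length can change concurrently. */
  static class Block {
    private long numBytes;
    Block(long numBytes) { this.numBytes = numBytes; }
    long getNumBytes() { return numBytes; }
    void setNumBytes(long n) { numBytes = n; }
  }

  /** Stand-in for ReplicationWork: the block size is captured at construction. */
  static class ReplicationWork {
    private final long blockSize;
    ReplicationWork(Block block) {
      this.blockSize = block.getNumBytes();   // snapshot taken once, up front
    }
    boolean fitsOn(long nodeRemainingBytes) {
      // a chooseTargets-style capacity check uses the saved size, not the
      // (possibly NO_ACK = Long.MAX_VALUE) current value on the block
      return nodeRemainingBytes >= blockSize;
    }
  }

  public static void main(String[] args) {
    Block b = new Block(128L * 1024 * 1024);
    ReplicationWork work = new ReplicationWork(b);
    b.setNumBytes(Long.MAX_VALUE);             // concurrent delete marks NO_ACK
    System.out.println(work.fitsOn(1L << 40)); // true: the snapshot is still 128 MB
  }
}
{code}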

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453-branch-2.7.007.patch, HDFS-10453-branch-2.7.008.patch, 
> HDFS-10453-branch-2.8.001.patch, HDFS-10453-branch-2.9.001.patch, 
> HDFS-10453-branch-3.0.001.patch, HDFS-10453-trunk.001.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while the ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor gets stuck, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the 
> references in the blocksmap, needReplications, etc.; the block's NumBytes is set to 
> NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after 

[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Attachment: HDFS-10453-trunk.001.patch
HDFS-10453-branch-3.0.001.patch
HDFS-10453-branch-2.9.001.patch
HDFS-10453-branch-2.8.001.patch
HDFS-10453-branch-2.7.008.patch

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453-branch-2.7.007.patch, HDFS-10453-branch-2.7.008.patch, 
> HDFS-10453-branch-2.8.001.patch, HDFS-10453-branch-2.9.001.patch, 
> HDFS-10453-branch-3.0.001.patch, HDFS-10453-trunk.001.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while the ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor gets stuck, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the 
> references in the blocksmap, needReplications, etc.; the block's NumBytes is set to 
> NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing the 
> whole cluster because no node satisfies the goodness criteria 
> (its remaining space would have to reach the required size Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially in a large 
> cluster; invalidateBlocks and neededReplications keep growing and are not 
> consumed, which at worst can lead to data loss.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks 
> and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Attachment: (was: HDFS-10453-branch-2.7.008.patch)

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453-branch-2.7.007.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while the ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor gets stuck, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This is because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the 
> references in the blocksMap, neededReplications, etc.; the block's NumBytes 
> is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion 
> does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is ever selected after 
> traversing the whole cluster, because no candidate satisfies the goodness 
> criteria (no node's remaining space can reach the required size of 
> Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks and neededReplications keep growing and 
> are never consumed, and in the worst case data can be lost.
> This can mostly be avoided by skipping chooseTarget for blocks whose size is 
> BlockCommand.NO_ACK and removing them from neededReplications.
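
A minimal sketch of that skip (invented types stand in for the real BlockManager structures; this is not the committed patch):
{code:java}
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;

public class SkipDeletedBlocksSketch {
  // Stand-in for BlockCommand.NO_ACK in the real code base.
  static final long NO_ACK = Long.MAX_VALUE;

  static final class Block {
    final long id;
    volatile long numBytes;
    Block(long id, long numBytes) { this.id = id; this.numBytes = numBytes; }
  }

  /** Drop blocks invalidated by a concurrent delete before choosing targets. */
  static void pruneDeletedBlocks(Queue<Block> neededReplications) {
    for (Iterator<Block> it = neededReplications.iterator(); it.hasNext(); ) {
      if (it.next().numBytes == NO_ACK) {
        // The file was deleted while replication work was being computed;
        // no target can ever satisfy a required size of Long.MAX_VALUE,
        // so remove the block instead of scanning the whole cluster for it.
        it.remove();
      }
    }
  }

  public static void main(String[] args) {
    Queue<Block> needed = new LinkedList<>();
    needed.add(new Block(1, 128L << 20));   // ordinary 128 MB block
    needed.add(new Block(2, NO_ACK));       // deleted concurrently
    pruneDeletedBlocks(needed);
    System.out.println("still needing replication: " + needed.size()); // prints 1
  }
}
{code}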



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7959) WebHdfs logging is missing on Datanode

2018-02-08 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-7959:
--
Fix Version/s: 2.7.6

HDFS-13126 backported this to branch-2.7. Adding 2.7.6 as the fix version.

> WebHdfs logging is missing on Datanode
> --
>
> Key: HDFS-7959
> URL: https://issues.apache.org/jira/browse/HDFS-7959
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0, 3.0.0-alpha1, 2.7.6
>
> Attachments: HDFS-7959.1.branch-2.patch, HDFS-7959.1.trunk.patch, 
> HDFS-7959.2.branch-2.patch, HDFS-7959.2.trunk.patch, 
> HDFS-7959.3.branch-2.patch, HDFS-7959.3.trunk.patch, 
> HDFS-7959.branch-2.patch, HDFS-7959.patch, HDFS-7959.patch, HDFS-7959.patch, 
> HDFS-7959.trunk.patch
>
>
> After the conversion to netty, webhdfs requests are not logged on datanodes. 
> The existing jetty log only logs the non-webhdfs requests that come through 
> the internal proxy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-13126:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Just committed this to branch-2.7. Thank you [~xkrogen].

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357855#comment-16357855
 ] 

Konstantin Shvachko commented on HDFS-13126:


+1, clean backport.

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12975) Changes to the NameNode to support reads from standby

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357776#comment-16357776
 ] 

genericqa commented on HDFS-12975:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
29s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-hdfs-project in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
39s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 32s{color} 
| {color:red} hadoop-hdfs-project generated 431 new + 0 unchanged - 0 fixed = 
431 total (was 0) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 56s{color} | {color:orange} hadoop-hdfs-project: The patch generated 926 new 
+ 0 unchanged - 0 fixed = 926 total (was 0) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
33s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
2s{color} | {color:red} The patch generated 2 new + 10 unchanged - 0 fixed = 12 
total (was 10) {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
12s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
54s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 4 new + 1 
unchanged - 0 fixed = 5 total (was 1) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
25s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 33s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes 

[jira] [Updated] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-12345:
---
Description: 
Dynamometer has now been open sourced on our [GitHub 
page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog 
post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].

To encourage getting the tool into the open for others to use as quickly as 
possible, we went through our standard open sourcing process of releasing on 
GitHub. However, we are interested in the possibility of donating this to Apache 
as part of Hadoop itself and would appreciate feedback on whether or not this 
is something that would be supported by the community.

Also of note are the previous [discussions on the dev mailing 
lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e].

  was:Creating this JIRA to better track 
[discussions|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e]
 we've been having in dev mail lists around Dynamometer. Will fill in details 
soon.


> Scale testing HDFS NameNode with real metadata and workloads
> 
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>Priority: Major
>
> Dynamometer has now been open sourced on our [GitHub 
> page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog 
> post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].
> To encourage getting the tool into the open for others to use as quickly as 
> possible, we went through our standard open sourcing process of releasing on 
> GitHub. However, we are interested in the possibility of donating this to 
> Apache as part of Hadoop itself and would appreciate feedback on whether or 
> not this is something that would be supported by the community.
> Also of note are the previous [discussions on the dev mailing 
> lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads

2018-02-08 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang reassigned HDFS-12345:


Assignee: Erik Krogen  (was: Zhe Zhang)

> Scale testing HDFS NameNode with real metadata and workloads
> 
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>Priority: Major
>
> Creating this JIRA to better track 
> [discussions|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e]
>  we've been having in dev mail lists around Dynamometer. Will fill in details 
> soon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357742#comment-16357742
 ] 

genericqa commented on HDFS-13120:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
53s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
57s{color} | {color:red} The patch generated 494 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:30 |
| Failed junit tests | hadoop.hdfs.web.TestHttpsFileSystem |
|   | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.TestSetrepIncreasing |
|   | hadoop.hdfs.TestDatanodeRegistration |
|   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
|   | hadoop.hdfs.web.TestWebHDFSAcl |
|   | hadoop.hdfs.TestDatanodeReport |
|   | hadoop.hdfs.web.TestHftpFileSystem |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestBlocksScheduledCounter |
|   | org.apache.hadoop.hdfs.TestDFSClientFailover |
|   | org.apache.hadoop.hdfs.TestDFSClientRetries |
|   | org.apache.hadoop.hdfs.TestRead |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
|   | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | org.apache.hadoop.hdfs.TestFileAppendRestart |
|   | org.apache.hadoop.hdfs.security.TestDelegationToken |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestSeekBug |
|   | org.apache.hadoop.hdfs.TestDFSMkdirs |
|   | org.apache.hadoop.hdfs.TestDFSOutputStream |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | org.apache.hadoop.hdfs.TestDFSRollback |
|   | org.apache.hadoop.hdfs.TestMiniDFSCluster |
|   | org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestTrashWithEncryptionZones |
|   | 

[jira] [Commented] (HDFS-12975) Changes to the NameNode to support reads from standby

2018-02-08 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357734#comment-16357734
 ] 

Chao Sun commented on HDFS-12975:
-

Attached a working patch just to trigger Jenkins and check whether there are any 
test regressions. It's not ready yet.

> Changes to the NameNode to support reads from standby
> -
>
> Key: HDFS-12975
> URL: https://issues.apache.org/jira/browse/HDFS-12975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-12975.000.patch
>
>
> In order to support reads from the standby, the NameNode needs changes to add an 
> Observer role, turn off checkpointing, and so on.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12975) Changes to the NameNode to support reads from standby

2018-02-08 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-12975:

Status: Patch Available  (was: Open)

> Changes to the NameNode to support reads from standby
> -
>
> Key: HDFS-12975
> URL: https://issues.apache.org/jira/browse/HDFS-12975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-12975.000.patch
>
>
> In order to support reads from the standby, the NameNode needs changes to add an 
> Observer role, turn off checkpointing, and so on.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12975) Changes to the NameNode to support reads from standby

2018-02-08 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-12975:

Attachment: HDFS-12975.000.patch

> Changes to the NameNode to support reads from standby
> -
>
> Key: HDFS-12975
> URL: https://issues.apache.org/jira/browse/HDFS-12975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-12975.000.patch
>
>
> In order to support reads from the standby, the NameNode needs changes to add an 
> Observer role, turn off checkpointing, and so on.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13052) WebHDFS: Add support for snapshot diff

2018-02-08 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357729#comment-16357729
 ] 

Xiaoyu Yao commented on HDFS-13052:
---

Thanks [~ljain] for working on this. The patch looks good to me overall. Just 
have a few minor issues:

WebhdfsFileSystem.java
Line 1323: getSnapshotDiffReport should not increment the write op statistics.

SnapshotDiffReport.java
Line 196: I agree that getToSnapshot() would be a more accurate name, but let's 
keep the original getLaterSnapshotName() API as it is public. Keeping it 
unchanged avoids potential backward-compatibility issues.

JsonUtilClient.java
Line 706: debug output to System.err.println can be removed.
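
A small illustration of the first point, using only the public FileSystem.Statistics counters (an assumption about intent, not the actual WebhdfsFileSystem change): a read-only call such as getSnapshotDiffReport should bump the read-op counter, not the write-op counter.
{code:java}
import org.apache.hadoop.fs.FileSystem;

public class SnapshotDiffOpStatsSketch {
  public static void main(String[] args) {
    FileSystem.Statistics stats = new FileSystem.Statistics("webhdfs");
    // What a read-only snapshot-diff operation should do:
    stats.incrementReadOps(1);
    System.out.println("read ops = " + stats.getReadOps()
        + ", write ops = " + stats.getWriteOps()); // read ops = 1, write ops = 0
  }
}
{code}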




> WebHDFS: Add support for snapshot diff
> --
>
> Key: HDFS-13052
> URL: https://issues.apache.org/jira/browse/HDFS-13052
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: HDFS-13052.001.patch, HDFS-13052.002.patch, 
> HDFS-13052.003.patch
>
>
> This Jira aims to implement snapshot diff operation for webHdfs filesystem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357704#comment-16357704
 ] 

genericqa commented on HDFS-13126:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.7 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
34s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
30s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
19s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
53s{color} | {color:green} branch-2.7 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
38s{color} | {color:red} hadoop-common-project/hadoop-common in branch-2.7 has 
3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
43s{color} | {color:green} branch-2.7 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 20s{color} | {color:orange} root: The patch generated 1 new + 10 unchanged - 
0 fixed = 11 total (was 10) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 125 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m  6s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m  4s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
19s{color} | {color:red} The patch generated 281 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}102m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-common:2 |
|   | hadoop-hdfs:8 |
| Failed junit tests | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.util.bloom.TestBloomFilters |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
| Timed out junit tests | org.apache.hadoop.conf.TestConfiguration |
|   | org.apache.hadoop.http.TestHttpServer |
|   | org.apache.hadoop.hdfs.TestRead |
|   | org.apache.hadoop.security.TestPermission |
|   | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | org.apache.hadoop.hdfs.security.TestDelegationToken |
|   | org.apache.hadoop.security.TestPermissionSymlinks |
|   | org.apache.hadoop.security.TestRefreshUserMappings |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.TestDFSShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ea57d10 |
| JIRA Issue | HDFS-13126 |
| 

[jira] [Updated] (HDFS-12953) XORRawDecoder.doDecode throws NullPointerException

2018-02-08 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12953:
-
Target Version/s: 3.1.0, 3.0.2  (was: 3.1.0, 3.0.1)

> XORRawDecoder.doDecode throws NullPointerException
> --
>
> Key: HDFS-12953
> URL: https://issues.apache.org/jira/browse/HDFS-12953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Xiao Chen
>Priority: Major
>
> Thanks [~danielpol] for reporting this on HDFS-12860.
> {noformat}
> 17/11/30 04:19:55 INFO mapreduce.Job: map 0% reduce 0%
> 17/11/30 04:20:01 INFO mapreduce.Job: Task Id : 
> attempt_1512036058655_0003_m_02_0, Status : FAILED
> Error: java.lang.NullPointerException
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.XORRawDecoder.doDecode(XORRawDecoder.java:83)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:106)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
> at 
> org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:423)
> at 
> org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
> at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:382)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:318)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at 
> org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:563)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13029) /.reserved/raw/.reserved/.inodes/ is not resolvable.

2018-02-08 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-13029:
-
Target Version/s: 3.0.2  (was: 3.0.1)

> /.reserved/raw/.reserved/.inodes/ is not resolvable.
> --
>
> Key: HDFS-13029
> URL: https://issues.apache.org/jira/browse/HDFS-13029
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
>Priority: Major
>
> The NameNode cannot resolve the {{/.reserved/raw/.reserved/.inodes/}} path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13109) Support fully qualified hdfs path in "crypto" commands path argument

2018-02-08 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357695#comment-16357695
 ] 

Xiaoyu Yao commented on HDFS-13109:
---

Thanks [~hanishakoneru] for working on this. The patch looks good to me 
overall.
Can you add a unit test for the fully qualified path support?

> Support fully qualified hdfs path in "crypto" commands path argument
> 
>
> Key: HDFS-13109
> URL: https://issues.apache.org/jira/browse/HDFS-13109
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDFS-13109.001.patch
>
>
> When creating an Encryption Zone, if the fully qualified path is specified in 
> the path argument, it throws the following error.
> {code:java}
> ~$ hdfs crypto -createZone -keyName mykey1 -path hdfs://ns1/zone1
> IllegalArgumentException: hdfs://ns1/zone1 is not the root of an encryption 
> zone. Do you mean /zone1?
> ~$ hdfs crypto -createZone -keyName mykey1 -path "hdfs://namenode:9000/zone2" 
> IllegalArgumentException: hdfs://namenode:9000/zone2 is not the root of an 
> encryption zone. Do you mean /zone2?
> {code}
> The EZ creation succeeds as the path is resolved in 
> DFS#createEncryptionZone(). But while creating the Trash directory, the path 
> is not resolved and it throws the above error.
>  A fully qualified path should be supported by {{crypto}}.
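
A minimal illustration of why the zone-root check can ignore the scheme and authority (an assumption about the direction of the fix, not the actual patch): Path#toUri().getPath() keeps only the namespace-relative component, so hdfs://ns1/zone1 and /zone1 compare equal.
{code:java}
import org.apache.hadoop.fs.Path;

public class QualifiedZonePathSketch {
  public static void main(String[] args) {
    Path qualified = new Path("hdfs://ns1/zone1");
    Path relative = new Path("/zone1");
    // toUri().getPath() drops "hdfs://ns1" and keeps "/zone1" for both paths.
    boolean sameZoneRoot =
        qualified.toUri().getPath().equals(relative.toUri().getPath());
    System.out.println(sameZoneRoot); // true
  }
}
{code}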



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357638#comment-16357638
 ] 

Wei Yan commented on HDFS-13123:


{quote}hard linking across block pools as one option and even tiered storage
{quote}
Yes, I was also told this by other people, but I haven't got the details, so I just 
put "copy" instead of "distcp" in the doc ;)

> RBF: Add a balancer tool to move data across subclusters
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
> Attachments: HDFS Router-Based Federation Rebalancer.pdf
>
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for 
> building a rebalancer tool, used by router-based federation to move data 
> among subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-13122) Tailing edits should not update quota counts on ObserverNode

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen resolved HDFS-13122.

Resolution: Duplicate

> Tailing edits should not update quota counts on ObserverNode
> 
>
> Key: HDFS-13122
> URL: https://issues.apache.org/jira/browse/HDFS-13122
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Currently in {{FSImage#loadEdits()}}, after applying a set of edits, we call
> {code}
> updateCountForQuota(target.getBlockManager().getStoragePolicySuite(), 
> target.dir.rootDir);
> {code}
> to update the quota counts for the entire namespace, which can be very 
> expensive. This makes sense if we are about to become the ANN, since we need 
> valid quotas, but not on an ObserverNode which does not need to enforce 
> quotas.
> This is related to increasing the frequency with which the SbNN can tail 
> edits from the ANN to decrease the lag time for transactions to appear on the 
> Observer.
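
A minimal sketch of the idea, with hypothetical names standing in for the FSImage/FSNamesystem plumbing (not an actual patch; the issue was resolved as a duplicate):
{code:java}
public class QuotaUpdateSketch {
  interface QuotaUpdater { void updateCountForQuota(); }

  static void afterEditsLoaded(boolean mayBecomeActive, QuotaUpdater updater) {
    if (mayBecomeActive) {
      // Only an NN that may transition to active needs accurate quota counts.
      updater.updateCountForQuota();
    }
    // An Observer serving only reads can skip this expensive full-namespace walk.
  }

  public static void main(String[] args) {
    afterEditsLoaded(true,  () -> System.out.println("recomputing quota counts"));
    afterEditsLoaded(false, () -> System.out.println("recomputing quota counts"));
  }
}
{code}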



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13122) Tailing edits should not update quota counts on ObserverNode

2018-02-08 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357628#comment-16357628
 ] 

Erik Krogen commented on HDFS-13122:


Whoops, you're right, thanks [~csun]!

> Tailing edits should not update quota counts on ObserverNode
> 
>
> Key: HDFS-13122
> URL: https://issues.apache.org/jira/browse/HDFS-13122
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Currently in {{FSImage#loadEdits()}}, after applying a set of edits, we call
> {code}
> updateCountForQuota(target.getBlockManager().getStoragePolicySuite(), 
> target.dir.rootDir);
> {code}
> to update the quota counts for the entire namespace, which can be very 
> expensive. This makes sense if we are about to become the ANN, since we need 
> valid quotas, but not on an ObserverNode which does not need to enforce 
> quotas.
> This is related to increasing the frequency with which the SbNN can tail 
> edits from the ANN to decrease the lag time for transactions to appear on the 
> Observer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13122) Tailing edits should not update quota counts on ObserverNode

2018-02-08 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357625#comment-16357625
 ] 

Chao Sun commented on HDFS-13122:
-

[~xkrogen]: is this on 2.7.x? It seems to already be fixed by 
[HDFS-6763|https://issues-test.apache.org/jira/browse/HDFS-6763].

> Tailing edits should not update quota counts on ObserverNode
> 
>
> Key: HDFS-13122
> URL: https://issues.apache.org/jira/browse/HDFS-13122
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Currently in {{FSImage#loadEdits()}}, after applying a set of edits, we call
> {code}
> updateCountForQuota(target.getBlockManager().getStoragePolicySuite(), 
> target.dir.rootDir);
> {code}
> to update the quota counts for the entire namespace, which can be very 
> expensive. This makes sense if we are about to become the ANN, since we need 
> valid quotas, but not on an ObserverNode which does not need to enforce 
> quotas.
> This is related to increasing the frequency with which the SbNN can tail 
> edits from the ANN to decrease the lag time for transactions to appear on the 
> Observer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357611#comment-16357611
 ] 

Erik Krogen commented on HDFS-13126:


Attached the v000 patch with the backport; it is clean overall. The conflicts are 
due to the changes that use the DefaultHttpResponse field instead of a local 
variable, but there do not appear to be any logic conflicts.

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13126:
---
Attachment: HDFS-13126-branch-2.7.000.patch

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-13126 started by Erik Krogen.
--
> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work stopped] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-13126 stopped by Erik Krogen.
--
> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13126-branch-2.7.000.patch
>
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13120:
--
Attachment: HDFS-13120.branch-2.8.patch

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4, 2.7.6
>
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch, 
> HDFS-13120.branch-2.8.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This can later 
> lead to assertion failures during DeleteSnapshot and getSnapshotDiff 
> operations.
> For example, we have seen customers hit a stack trace similar to the one 
> below, but while loading a DeleteSnapshotOp edit entry. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat.
> I will post a unit test to reproduce this and a fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13126:
---
Issue Type: Bug  (was: Improvement)

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). This was fixed 
> in HDFS-7959 but not added to branch-2.7 where the original breakage occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13126) Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13126:
---
Summary: Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request 
logging for WebHDFS  (was: Re-enable HTTP Request Logging for WebHDFS)

> Backport [HDFS-7959] to branch-2.7 to re-enable HTTP request logging for 
> WebHDFS
> 
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). If the request 
> logging is enabled, we should add a Netty 
> [LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
>  to the ChannelPipeline for the http(s) servers used by the DataNode.
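
A minimal sketch of that suggestion, assuming a plain Netty HTTP pipeline rather than the DataNode's actual channel initializer:
{code:java}
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public class RequestLoggingInitializer extends ChannelInitializer<SocketChannel> {
  @Override
  protected void initChannel(SocketChannel ch) {
    ch.pipeline()
        .addLast(new LoggingHandler(LogLevel.INFO)) // logs every inbound/outbound event
        .addLast(new HttpServerCodec());            // HTTP decode/encode after logging
    // ... the WebHDFS application handlers would follow here ...
  }
}
{code}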



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357575#comment-16357575
 ] 

Íñigo Goiri commented on HDFS-13123:


Thanks [~ywskycn] for the doc. Right now we are leveraging DistCp for this but 
I remember there was some conversation about other options.
[~chris.douglas], I remember you mentioned doing hard linking across block 
pools as one option and even tiered storage; any thoughts?
In any case, I think we should start with DistCp but keep in mind the option to 
leverage other mechanisms.

> RBF: Add a balancer tool to move data across subclusters
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
> Attachments: HDFS Router-Based Federation Rebalancer.pdf
>
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for 
> building a rebalancer tool, used by router-based federation to move data 
> among subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated HDFS-13123:
---
Attachment: HDFS Router-Based Federation Rebalancer.pdf

> RBF: Add a balancer tool to move data across subclusters
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
> Attachments: HDFS Router-Based Federation Rebalancer.pdf
>
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for 
> building a rebalancer tool, used by router-based federation to move data 
> among subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-4843) testDeleteBlockPool has a bug which makes it fail occasionally

2018-02-08 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357558#comment-16357558
 ] 

Bharat Viswanadham commented on HDFS-4843:
--

Hi [~vincent cho]

This has been fixed as part of HDFS-5892.

Now MiniDFSNNTopology has a simpleFederatedTopology method that takes 
DFS_NAMESERVICES as input, so the dfs.nameservices value set in the test is 
used by the MiniDFSCluster.

> testDeleteBlockPool has a bug which makes it fail occasionally
> --
>
> Key: HDFS-4843
> URL: https://issues.apache.org/jira/browse/HDFS-4843
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 2.0.4-alpha
> Environment: normal enviroment
>Reporter: vincent cho
>Priority: Major
>
> In the test case "testDeleteBlockPool":
> First we set "DFSConfigKeys.DFS_NAMESERVICES" to 
> "namesServerId1,namesServerId2", but after the cluster is built the 
> nameservices are set to "ns1,ns2", because the code assigns them default 
> values when creating a new cluster.
> Then we refresh the NameNode with the nameservice set to "namesServerId2", so 
> a new nameservice named "namesServerId2" is added and the two old 
> nameservices are removed. Because "ns2" and "namesServerId2" have the same 
> bpid, when the add thread runs faster than the remove process, the 
> bpByBlockPoolId in BlockPoolManager can be empty, so creating a new file or 
> path fails.
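
A sketch of how the fixed test setup is typically wired, assuming the simpleFederatedTopology overload described in the comment above (not verified against a specific branch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.MiniDFSNNTopology;

public class FederatedTopologySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set(DFSConfigKeys.DFS_NAMESERVICES, "namesServerId1,namesServerId2");
    // Build the cluster from the same nameservice IDs the test configured,
    // instead of letting it fall back to the default "ns1,ns2" names.
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .nnTopology(MiniDFSNNTopology.simpleFederatedTopology(
            conf.get(DFSConfigKeys.DFS_NAMESERVICES)))
        .build();
    cluster.shutdown();
  }
}
{code}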



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13126) Re-enable HTTP Request Logging for WebHDFS

2018-02-08 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357551#comment-16357551
 ] 

Kihwal Lee commented on HDFS-13126:
---

See HDFS-7959

> Re-enable HTTP Request Logging for WebHDFS
> --
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). If the request 
> logging is enabled, we should add a Netty 
> [LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
>  to the ChannelPipeline for the http(s) servers used by the DataNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10419) Building HDFS on top of new storage layer (HDSL)

2018-02-08 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-10419:

Summary: Building HDFS on top of new storage layer (HDSL)  (was: Building 
HDFS on top of Ozone's storage containers)

> Building HDFS on top of new storage layer (HDSL)
> 
>
> Key: HDFS-10419
> URL: https://issues.apache.org/jira/browse/HDFS-10419
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Major
> Attachments: Evolving NN using new block-container layer.pdf
>
>
> In HDFS-7240, Ozone defines storage containers to store both the data and the 
> metadata. The storage container layer provides an object storage interface 
> and aims to manage data/metadata in a distributed manner. More details about 
> storage containers can be found in the design doc in HDFS-7240.
> HDFS can adopt the storage containers to store and manage blocks. The general 
> idea is:
> # Each block can be treated as an object and the block ID is the object's key.
> # Blocks will still be stored in DataNodes but as objects in storage 
> containers.
> # The block management work can be separated out of the NameNode and will be 
> handled by the storage container layer in a more distributed way. The 
> NameNode will only manage the namespace (i.e., files and directories).
> # For each file, the NameNode only needs to record a list of block IDs which 
> are used as keys to obtain real data from storage containers.
> # A new DFSClient implementation talks to both NameNode and the storage 
> container layer to read/write.
> HDFS, especially the NameNode, can get much better scalability from this 
> design. Currently the NameNode's heaviest workload comes from the block 
> management, which includes maintaining the block-DataNode mapping, receiving 
> full/incremental block reports, tracking block states (under-, over-, or 
> mis-replicated), and joining every write pipeline protocol to guarantee 
> data consistency. This work brings a high memory footprint and makes the NameNode 
> suffer from GC. HDFS-5477 already proposes to convert the BlockManager into a 
> service. If we can build HDFS on top of the storage container layer, we not 
> only separate out the BlockManager from the NameNode, but also replace it 
> with a new distributed management scheme.
> The storage container work is currently in progress in HDFS-7240, and the 
> work proposed here is still in an experimental/exploring stage. We can do 
> this experiment in a feature branch so that people with interests can be 
> involved.
> A design doc will be uploaded later explaining more details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13099) RBF: Use the ZooKeeper as the default State Store

2018-02-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357395#comment-16357395
 ] 

Íñigo Goiri commented on HDFS-13099:


Unit tests seem unrelated.
+1 on  [^HDFS-13099.006.patch].

> RBF: Use the ZooKeeper as the default State Store
> -
>
> Key: HDFS-13099
> URL: https://issues.apache.org/jira/browse/HDFS-13099
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
>  Labels: incompatible, incompatibleChange
> Attachments: HDFS-13099.001.patch, HDFS-13099.002.patch, 
> HDFS-13099.003.patch, HDFS-13099.004.patch, HDFS-13099.005.patch, 
> HDFS-13099.006.patch
>
>
> Currently the State Store Driver settings are only defined in its 
> implementation classes.
> {noformat}
> public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl {
> ...
>   /** Configuration keys. */
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>   DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>   FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>   "/hdfs-federation";
> ..
> {noformat}
> Actually, they should be moved into class {{DFSConfigKeys}} and documented in 
> file {{hdfs-default.xml}}. This will help more users discover these settings and 
> understand how to use them.
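
To make the proposal concrete, a rough sketch (assumed names and prefix value, not the actual patch) of how the keys could look once centralized in {{DFSConfigKeys}}, with matching entries added to {{hdfs-default.xml}}:
{code:java}
// Sketch of the constants moved into DFSConfigKeys; the prefix value below is
// an assumption for illustration. Each key would also get an entry with a
// default value and a description in hdfs-default.xml.
public static final String FEDERATION_STORE_PREFIX =
    "dfs.federation.router.store.";          // assumed prefix value
public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
    FEDERATION_STORE_PREFIX + "driver.zk.";
public static final String FEDERATION_STORE_ZK_PARENT_PATH =
    FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
    "/hdfs-federation";
{code}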



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13119) RBF: Manage unavailable clusters

2018-02-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357389#comment-16357389
 ] 

Íñigo Goiri edited comment on HDFS-13119 at 2/8/18 6:48 PM:


Thanks [~linyiqun] for taking this.

The example I was giving is {{RouterRpcServer#renewLease()}}.
This function calls {{rpcClient.invokeConcurrent(nss, method, false, false);}} 
with {{nss}} being all the subclusters.
{{RouterRpcClient#invokeConcurrent()}} goes and spawns a thread in the 
{{executorService}} for each subcluster, so for an unavailable subcluster we have 
a thread stuck here for 200 seconds in our case.
We also have a lot of threads from this thread pool, named {{RPC Router 
Client-XXX}}.
We actually have an option to set a timeout, which we use for some UI operations; 
I'm not sure this is OK for {{renewLease()}}, for example.
Does it make sense?

The current problem is that the thread factory in this {{executorService}} has 
no limit, and we should have one (preferably configurable).
However, this doesn't fix the real problem, which is checking forever for 
something we know is down.
I think your proposal for avoiding the retries could be the other part of this 
fix.


was (Author: elgoiri):
Thanks [~linyiqun] for taking this.

The example I was giving is {{RouterRpcServer#renewLease()}}.
This function calls {{rpcClient.invokeConcurrent(nss, method, false, false);}} 
with {{nss}} being all the namespaces.
{{RouterRpcClient#invokeConcurrent()}} goes and spawns a thread in the 
{{executorService}} for each subcluster, so for an unavailable subcluster we have 
a thread stuck here for 200 seconds in our case.
We also have a lot of threads from this thread pool, named {{RPC Router 
Client-XXX}}.
We actually have an option to set a timeout, which we use for some UI operations; 
I'm not sure this is OK for {{renewLease()}}, for example.
Does it make sense?

The current problem is that the thread factory in this {{executorService}} has 
no limit, and we should have one (preferably configurable).
However, this doesn't fix the real problem, which is checking forever for 
something we know is down.
I think your proposal for avoiding the retries could be the other part of this 
fix.

> RBF: Manage unavailable clusters
> 
>
> Key: HDFS-13119
> URL: https://issues.apache.org/jira/browse/HDFS-13119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
>
> When a federated cluster has one of the subclusters down, operations that run 
> in every subcluster ({{RouterRpcClient#invokeAll()}}) may take up all the RPC 
> connections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13119) RBF: Manage unavailable clusters

2018-02-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357389#comment-16357389
 ] 

Íñigo Goiri commented on HDFS-13119:


Thanks [~linyiqun] for taking this.

The example I was giving is {{RouterRpcServer#renewLease()}}.
This function calls {{rpcClient.invokeConcurrent(nss, method, false, false);}} 
with {{nss}} being all the namespaces.
{{RouterRpcClient#invokeConcurrent()}} goes and spawns a thread in the 
{{executorService}} for each subcluster, so for an unavailable subcluster we have 
a thread stuck here for 200 seconds in our case.
We also have a lot of threads from this thread pool, named {{RPC Router 
Client-XXX}}.
We actually have an option to set a timeout, which we use for some UI operations; 
I'm not sure this is OK for {{renewLease()}}, for example.
Does it make sense?

The current problem is that the thread factory in this {{executorService}} has 
no limit, and we should have one (preferably configurable); a sketch of such a 
bounded pool is shown below.
However, this doesn't fix the real problem, which is checking forever for 
something we know is down.
I think your proposal for avoiding the retries could be the other part of this 
fix.
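
A minimal sketch (not the Router code) of the kind of bounded, configurable pool being discussed; the configuration key name here is hypothetical:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.ThreadFactoryBuilder;
import org.apache.hadoop.conf.Configuration;

public class BoundedRouterClientPool {
  // Cap the number of "RPC Router Client" threads instead of letting the pool
  // grow without bound; extra work waits in the queue. The key name is hypothetical.
  public static ExecutorService create(Configuration conf) {
    int maxThreads = conf.getInt("dfs.federation.router.client.thread-size", 32);
    ThreadFactory threadFactory = new ThreadFactoryBuilder()
        .setNameFormat("RPC Router Client-%d")
        .build();
    return new ThreadPoolExecutor(
        maxThreads, maxThreads,
        60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(),
        threadFactory);
  }
}
{code}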

> RBF: Manage unavailable clusters
> 
>
> Key: HDFS-13119
> URL: https://issues.apache.org/jira/browse/HDFS-13119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
>
> When a federated cluster has one of the subclusters down, operations that run 
> in every subcluster ({{RouterRpcClient#invokeAll()}}) may take up all the RPC 
> connections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13120:
--
Fix Version/s: 2.7.6
   2.8.4
   3.0.1
   2.9.1
   2.10.0
   3.1.0

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4, 2.7.6
>
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead to 
> an AssertionError upon later DeleteSnapshot and getSnapshotDiff operations. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post the unit test to repro this and the fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13120:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks all for the reviews. I've committed the patch to trunk, branch-3.0, 
branch-3.0.1, branch-2, branch-2.9, branch-2.8 and branch-2.7 with the 
CHANGES.txt update.

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead to 
> an AssertionError upon later DeleteSnapshot and getSnapshotDiff operations. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post the unit test to repro this and the fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357376#comment-16357376
 ] 

Hudson commented on HDFS-13120:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13633 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13633/])
HDFS-13120. Snapshot diff could be corrupted after concat. Contributed (xyao: 
rev 8faf0b50d435039f69ea35f592856ca04d378809)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirConcatOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead to 
> an AssertionError upon later DeleteSnapshot and getSnapshotDiff operations. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post the unit test to repro this and the fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357371#comment-16357371
 ] 

Íñigo Goiri commented on HDFS-13123:


Just for reference, the document in HDFS-10467 already mentioned the 
Rebalancer in some places, so this will be the place to track this component in 
detail.

> RBF: Add a balancer tool to move data across subclusters 
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for 
> building a rebalancer tool, used by router-based federation to move data 
> among subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13109) Support fully qualified hdfs path in "crypto" commands path argument

2018-02-08 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357369#comment-16357369
 ] 

Hanisha Koneru commented on HDFS-13109:
---

Thanks for flagging this, [~shahrs87]. Are you planning on working on 
HDFS-12586? If not, I will close that Jira and continue here.

> Support fully qualified hdfs path in "crypto" commands path argument
> 
>
> Key: HDFS-13109
> URL: https://issues.apache.org/jira/browse/HDFS-13109
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDFS-13109.001.patch
>
>
> When creating an Encryption Zone, if the fully qualified path is specified in 
> the path argument, it throws the following error.
> {code:java}
> ~$ hdfs crypto -createZone -keyName mykey1 -path hdfs://ns1/zone1
> IllegalArgumentException: hdfs://ns1/zone1 is not the root of an encryption 
> zone. Do you mean /zone1?
> ~$ hdfs crypto -createZone -keyName mykey1 -path "hdfs://namenode:9000/zone2" 
> IllegalArgumentException: hdfs://namenode:9000/zone2 is not the root of an 
> encryption zone. Do you mean /zone2?
> {code}
> The EZ creation succeeds as the path is resolved in 
> DFS#createEncryptionZone(). But while creating the Trash directory, the path 
> is not resolved and it throws the above error.
>  A fully qualified path should be supported by {{crypto}}.
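
A minimal sketch of the kind of path normalization being asked for, independent of the actual patch; the variable names are illustrative:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QualifiedPathSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path arg = new Path("hdfs://ns1/zone1");     // what the user passed to -path
    FileSystem fs = arg.getFileSystem(conf);     // binds to hdfs://ns1
    Path qualified = fs.makeQualified(arg);      // fully qualified and normalized
    // When a server-relative path is needed (e.g. for the trash root),
    // strip the scheme and authority instead of comparing raw strings:
    Path serverRelative = new Path(qualified.toUri().getPath());  // "/zone1"
    System.out.println(serverRelative);
  }
}
{code}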



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13126) Re-enable HTTP Request Logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13126:
---
Description: Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request 
logs no longer include WebHDFS requests because the HTTP request logging is 
done internal to {{HttpServer2}}, which is no longer used (replaced by Netty). 
If the request logging is enabled, we should add a Netty 
[LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
 to the ChannelPipeline for the http(s) servers used by the DataNode.  (was: 
Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
include WebHDFS requests because the HTTP Request handling is done internal to 
{{HttpServer2}}, which is no longer used. If the request logging is enabled, we 
should add a Netty 
[LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
 to the ChannelPipeline for the http(s) servers used by the DataNode.)
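
To make the proposal concrete, a minimal sketch (not the DataNode's actual channel initializer) of inserting a Netty {{LoggingHandler}} into a pipeline; the logger name, log level, and handler ordering are illustrative:
{code:java}
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public class WebHdfsRequestLogSketch extends ChannelInitializer<SocketChannel> {
  @Override
  protected void initChannel(SocketChannel ch) {
    ch.pipeline()
        // Logs inbound/outbound events so WebHDFS requests show up in the
        // request log again, even though HttpServer2 is no longer in the path.
        .addLast("requestLog", new LoggingHandler("datanode.webhdfs", LogLevel.INFO))
        .addLast("httpCodec", new HttpServerCodec());
    // ... the remaining WebHDFS handlers would follow here ...
  }
}
{code}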

> Re-enable HTTP Request Logging for WebHDFS
> --
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP request logging is done internal to 
> {{HttpServer2}}, which is no longer used (replaced by Netty). If the request 
> logging is enabled, we should add a Netty 
> [LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
>  to the ChannelPipeline for the http(s) servers used by the DataNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-13126) Re-enable HTTP Request Logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-13126:
--

Assignee: Erik Krogen

> Re-enable HTTP Request Logging for WebHDFS
> --
>
> Key: HDFS-13126
> URL: https://issues.apache.org/jira/browse/HDFS-13126
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, webhdfs
>Affects Versions: 2.7.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
> include WebHDFS requests because the HTTP Request handling is done internal 
> to {{HttpServer2}}, which is no longer used. If the request logging is 
> enabled, we should add a Netty 
> [LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
>  to the ChannelPipeline for the http(s) servers used by the DataNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13126) Re-enable HTTP Request Logging for WebHDFS

2018-02-08 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-13126:
--

 Summary: Re-enable HTTP Request Logging for WebHDFS
 Key: HDFS-13126
 URL: https://issues.apache.org/jira/browse/HDFS-13126
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Affects Versions: 2.7.0
Reporter: Erik Krogen


Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer 
include WebHDFS requests because the HTTP Request handling is done internal to 
{{HttpServer2}}, which is no longer used. If the request logging is enabled, we 
should add a Netty 
[LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html]
 to the ChannelPipeline for the http(s) servers used by the DataNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could get stuck for a long time due to the race between replication and delete of same file in a large cluster.

2018-02-08 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357343#comment-16357343
 ] 

Ajay Kumar commented on HDFS-10453:
---

[~hexiaoqiao], Patch v8 doesn't have changes from patch v7 in 
{{BlockManager#computeReplicationWorkForBlocks}}. Is that intentional?

> ReplicationMonitor thread could get stuck for a long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453-branch-2.7.007.patch, HDFS-10453-branch-2.7.008.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the file's replication to 10;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor gets stuck, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This is because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the references in 
> the blocksmap, neededReplications, etc. The block's NumBytes is set to 
> NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing the 
> whole cluster because no node satisfies the goodness criteria 
> (remaining space must reach the required size Long.MAX_VALUE). 
> During stage (3) ReplicationMonitor is stuck for a long time, especially in a large 
> cluster. invalidateBlocks and neededReplications keep growing and are never 
> consumed; at worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks 
> and removing them from neededReplications; a sketch follows below.
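
A minimal sketch (not the committed patch) of the guard described in the last sentence above, as a fragment inside the replication-work loop; the surrounding variable names follow the description and are illustrative:
{code:java}
// Skip blocks whose size was set to BlockCommand.NO_ACK by a concurrent
// delete, and drop them from the pending-replication queue instead of
// scanning the whole cluster for targets that can never be found.
for (BlockInfo block : blocksToReplicate) {
  if (block.getNumBytes() == BlockCommand.NO_ACK) {
    neededReplications.remove(block, priority);  // file was deleted; give up on it
    continue;
  }
  chooseTargets(block);
}
{code}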



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Commented] (HDFS-6763) Initialize file system-wide quota once on transitioning to active

2018-02-08 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357290#comment-16357290
 ] 

Brahma Reddy Battula commented on HDFS-6763:


Good to have in branch-2.7 too?

> Initialize file system-wide quota once on transitioning to active
> -
>
> Key: HDFS-6763
> URL: https://issues.apache.org/jira/browse/HDFS-6763
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Daryn Sharp
>Assignee: Kihwal Lee
>Priority: Major
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HDFS-6763.patch, HDFS-6763.v2.patch, HDFS-6763.v3.patch
>
>
> {{FSImage#loadEdits}} calls {{updateCountForQuota}} to recalculate & verify 
> quotas for the entire namespace.  A standby NN using shared edits calls this 
> method every minute.  The standby may appear to "hang" for many seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357277#comment-16357277
 ] 

genericqa commented on HDFS-11187:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 25m 
54s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
22s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 51s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
14s{color} | {color:red} The patch generated 118 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:25 |
| Failed junit tests | hadoop.hdfs.TestLease |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestParallelShortCircuitReadUnCached |
|   | hadoop.hdfs.TestDataTransferProtocol |
|   | hadoop.hdfs.TestMiniDFSCluster |
| Timed out junit tests | org.apache.hadoop.hdfs.TestMaintenanceState |
|   | org.apache.hadoop.hdfs.TestFileAppend |
|   | org.apache.hadoop.hdfs.TestSafeMode |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeDowngrade |
|   | org.apache.hadoop.hdfs.TestFileCorruption |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestHDFSServerPorts |
|   | org.apache.hadoop.hdfs.TestDFSUpgrade |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.TestRenameWhileOpen |
|   | org.apache.hadoop.hdfs.TestPipelines |
|   | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | org.apache.hadoop.hdfs.TestFSOutputSummer |
|   | org.apache.hadoop.hdfs.TestExternalBlockReader |
|   | org.apache.hadoop.hdfs.TestHFlush |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestEncryptedTransfer |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.TestDFSRename |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-11187 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909807/HDFS-11187-branch-2.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  

[jira] [Commented] (HDFS-13046) consider load of datanodes when read blocks of file

2018-02-08 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357246#comment-16357246
 ] 

Xiao Chen commented on HDFS-13046:
--

+0. Mixed feelings, but given this is off by default it should be fine.

As Akira said, if the network is more performant it would be a good idea to spread 
the reads across more DNs, so we get an overall higher utilization. (Best fit for 
EC?)

On the other hand, it's also possible that we schedule more reads remotely 
and sacrifice locality; in the case when the network is slow, the overall read 
throughput may be lower.

> consider load of datanodes when read blocks of file
> ---
>
> Key: HDFS-13046
> URL: https://issues.apache.org/jira/browse/HDFS-13046
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: hu xiaodong
>Assignee: hu xiaodong
>Priority: Minor
> Attachments: 
> HDFS-13046-considerLoadAfterSortBydistance-001-sample.patch, 
> HDFS-13046-sample.patch
>
>
> When sorting block locations, we only consider the distance of datanodes. Can 
> we also consider the load of the datanodes? We could add a configuration such as 
> 'dfs.namenode.reading.considerLoad': if set to true, sort the 
> block locations by datanode load, otherwise sort by distance.
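
To illustrate, a minimal sketch of the proposed behavior, assuming the hypothetical dfs.namenode.reading.considerLoad key from the description; active xceiver count stands in for "load", and the distance-based ordering is left untouched:
{code:java}
import java.util.Arrays;
import java.util.Comparator;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ReadSortSketch {
  // When enabled, order replicas by current datanode load (active xceivers)
  // instead of purely by network distance from the reader.
  static void sortLocations(DatanodeInfo[] locations, boolean considerLoad) {
    if (considerLoad) {
      Arrays.sort(locations, new Comparator<DatanodeInfo>() {
        @Override
        public int compare(DatanodeInfo a, DatanodeInfo b) {
          return Integer.compare(a.getXceiverCount(), b.getXceiverCount());
        }
      });
    }
    // otherwise keep the existing distance-based ordering as-is
  }
}
{code}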



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357222#comment-16357222
 ] 

Ajay Kumar commented on HDFS-13120:
---

+1 (non binding)

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead to 
> an AssertionError upon later DeleteSnapshot and getSnapshotDiff operations. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post the unit test to repro this and the fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13125) Improve efficiency of JN -> Standby Pipeline Under Frequent Edit Tailing

2018-02-08 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-13125:
--

 Summary: Improve efficiency of JN -> Standby Pipeline Under 
Frequent Edit Tailing
 Key: HDFS-13125
 URL: https://issues.apache.org/jira/browse/HDFS-13125
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: journal-node, namenode
Reporter: Erik Krogen
Assignee: Erik Krogen


The current edit tailing pipeline is designed for
* High resiliency
* High throughput
and was _not_ designed for low latency.

It was designed under the assumption that each edit log segment would typically 
be read all at once, e.g. on startup or the SbNN tailing the entire thing after 
it is finalized. The ObserverNode should be reading constantly from the 
JournalNodes' in-progress edit logs with low latency, to reduce the lag time 
between when a transaction is committed on the ANN and when it is visible on the 
ObserverNode.

Due to the critical nature of this pipeline to the health of HDFS, it would be 
better not to redesign it altogether. Based on some experiments, it seems that if we 
mitigate the following issues, lag times are reduced to low levels (low 
hundreds of milliseconds even under very high write load):
* The overhead of creating a new HTTP connection for each time new edits are 
fetched. This makes sense when you're expecting to tail an entire segment; it 
does not when you may only be fetching a small number of edits. We can mitigate 
this by allowing edits to be tailed via an RPC call, or by adding a connection 
pool for the existing connections to the journal.
* The overhead of transmitting a whole file at once. Right now when an edit 
segment is requested, the JN sends the entire segment, and on the SbNN it will 
ignore edits up to the ones it wants. Solving this one may be trickier, 
but one suggestion would be to keep recently logged edits in memory, avoiding 
the need to serve them from file at all, allowing the JN to quickly serve only 
the required edits.

We can implement these as optimizations on top of the existing logic, with 
fallbacks to the current slow-but-resilient pipeline; a sketch of the in-memory 
recent-edits idea follows below.
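
As a rough illustration of the second idea (not part of any design doc or patch), a small bounded in-memory cache of recently logged edits that a JournalNode could consult before falling back to reading the segment file; all names are illustrative:
{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecentEditsCacheSketch {
  // txid -> serialized edit op; oldest entries are evicted first
  private final LinkedHashMap<Long, byte[]> cache;

  public RecentEditsCacheSketch(final int capacity) {
    this.cache = new LinkedHashMap<Long, byte[]>(capacity, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > capacity;  // bound memory use
      }
    };
  }

  public synchronized void put(long txid, byte[] editOp) {
    cache.put(txid, editOp);
  }

  /** Edits with txid >= sinceTxId, or null if they are not all cached. */
  public synchronized List<byte[]> getSince(long sinceTxId) {
    if (!cache.containsKey(sinceTxId)) {
      return null;  // caller falls back to reading the segment file
    }
    List<byte[]> result = new ArrayList<byte[]>();
    for (Map.Entry<Long, byte[]> e : cache.entrySet()) {
      if (e.getKey() >= sinceTxId) {
        result.add(e.getValue());
      }
    }
    return result;
  }
}
{code}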



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13112) Token expiration edits may cause log corruption or deadlock

2018-02-08 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-13112:
--
Target Version/s: 3.0.1, 2.7.6  (was: 2.7.6)

> Token expiration edits may cause log corruption or deadlock
> ---
>
> Key: HDFS-13112
> URL: https://issues.apache.org/jira/browse/HDFS-13112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 0.23.8
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-13112.patch
>
>
> HDFS-4477 specifically did not acquire the fsn lock during token cancellation 
> based on the belief that edit logs are thread-safe.  However, log rolling is 
> not thread-safe.  Failure to externally synchronize on the fsn lock during a 
> roll will cause problems.
> For sync edit logging, it may cause corruption by interspersing edits with 
> the end/start segment edits.  Async edit logging may encounter a deadlock if 
> the log queue overflows.  Luckily, losing the race is extremely rare.  In ~5 
> years, we've never encountered it.  However, HDFS-13051 lost the race with 
> async edits.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13112) Token expiration edits may cause log corruption or deadlock

2018-02-08 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357180#comment-16357180
 ] 

Kihwal Lee commented on HDFS-13112:
---

The handler for {{enterSafeMode}} has acquired a write lock. The secret manager 
is stuck waiting for read lock. 
{noformat}
"Thread[Thread-206,5,main]" #287 daemon prio=5 os_prio=0 tid=0x7f5ee8f59ec0 
nid=0x6348 waiting on condition [0x7f5eb8acd000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xd8b103c8> (a 
java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.readLock(FSNamesystemLock.java:142)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.readLock(FSNamesystem.java:1580)
at 
org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logUpdateMasterKey
(DelegationTokenSecretManager.java:374)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey
(AbstractDelegationTokenSecretManager.java:353)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey
(AbstractDelegationTokenSecretManager.java:376)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run
(AbstractDelegationTokenSecretManager.java:683)
at java.lang.Thread.run(Thread.java:748)
{noformat}

> Token expiration edits may cause log corruption or deadlock
> ---
>
> Key: HDFS-13112
> URL: https://issues.apache.org/jira/browse/HDFS-13112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 0.23.8
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-13112.patch
>
>
> HDFS-4477 specifically did not acquire the fsn lock during token cancellation 
> based on the belief that edit logs are thread-safe.  However, log rolling is 
> not thread-safe.  Failure to externally synchronize on the fsn lock during a 
> roll will cause problems.
> For sync edit logging, it may cause corruption by interspersing edits with 
> the end/start segment edits.  Async edit logging may encounter a deadlock if 
> the log queue overflows.  Luckily, losing the race is extremely rare.  In ~5 
> years, we've never encountered it.  However, HDFS-13051 lost the race with 
> async edits.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13081) Datanode#checkSecureConfig should check HTTPS and SASL encryption

2018-02-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357171#comment-16357171
 ] 

Daryn Sharp commented on HDFS-13081:


{quote}However a combination like privileged port for HTTP and SASL for RPC 
should also work. 
{quote}
100% agree.  It's absurd to restrict a "more secure" DN using privileged ports 
from becoming "even more secure" with SASL.
{quote}what are your thoughts on having privileged port for HTTP with SASL for 
RPC?
{quote}
Seems a bit counterintuitive, but I wouldn't object for the following reasons:
 # SSL-only is a design requirement to prevent rogues stealing a non-privileged 
port.
 # A rogue can't listen on the privileged port unless it's root.
 # If root is compromised, it's already game over regardless of whether SSL is 
used.
 # At this point, it's up to the admin to decide if SSL is strictly required.

I rescind my negative if this Jira morphs to address your valid observations.

> Datanode#checkSecureConfig should check HTTPS and SASL encryption
> -
>
> Key: HDFS-13081
> URL: https://issues.apache.org/jira/browse/HDFS-13081
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, security
>Affects Versions: 3.0.0
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13081.000.patch
>
>
> Datanode#checkSecureConfig currently checks the following to determine if 
> secure datanode is enabled: 
>  # The server has bound to privileged ports for RPC and HTTP via 
> SecureDataNodeStarter.
>  # The configuration enables SASL on DataTransferProtocol and HTTPS (no plain 
> HTTP) for the HTTP server. The SASL handshake guarantees authentication of 
> the RPC server before a client transmits a secret, such as a block access 
> token. Similarly, SSL guarantees authentication of the
>  HTTP server before a client transmits a secret, such as a delegation token.
> For the 2nd case, HTTPS_ONLY means all the traffic between the REST client and server 
> will be encrypted. However, checking only whether a SASL property resolver 
> is configured does not mean the server requires encrypted RPC. 
> This ticket is open to further check and ensure the datanode SASL property 
> resolver has a QoP that includes auth-conf (PRIVACY). Note that the SASL QoP 
> (Quality of Protection) negotiation may drop RPC protection level from 
> auth-conf(PRIVACY) to auth-int(integrity) or auth(authentication) only, which 
> should be fine by design.
>  
> cc: [~cnauroth] , [~daryn], [~jnpandey] for additional feedback.
>  
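For reference, a minimal sketch (not the attached patch) of the kind of check described above, assuming the standard {{dfs.data.transfer.protection}} property is what backs the SASL property resolver; "privacy" is the value that maps to the auth-conf QoP:
{code:java}
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;

class SaslQopCheckSketch {
  // True only if "privacy" (auth-conf) is among the configured protection levels.
  static boolean saslIncludesPrivacy(Configuration conf) {
    String protection = conf.getTrimmed("dfs.data.transfer.protection", "");
    return Arrays.stream(protection.split(","))
        .map(String::trim)
        .anyMatch(v -> v.equalsIgnoreCase("privacy"));
  }
}
{code}
As the description notes, the negotiated QoP may still drop below auth-conf at runtime, which is fine by design; a check like this only verifies the configured intent.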



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12636:
-
Status: Patch Available  (was: Open)

Retriggering Jenkins.

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch, HDFS-12636-HDFS-7240.005.patch, 
> HDFS-12636-HDFS-7240.006.patch, HDFS-12636-HDFS-7240.007.patch
>
>
> The OzoneClient library provides a way to invoke both RPC-based and REST-based 
> methods against Ozone. This API will help improve both the performance and the 
> interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.
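As background for the seek part of this work, a minimal, hypothetical sketch of what implementing seek means for an FSInputStream-based stream: track a position, validate the target, and serve reads from that offset. The byte[] backing store below is only a stand-in for reading the key's chunks; it is not the attached patch.
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSInputStream;

class SeekableKeyInputStreamSketch extends FSInputStream {
  private final byte[] key;   // stand-in for the key's data
  private long pos;

  SeekableKeyInputStreamSketch(byte[] key) {
    this.key = key;
  }

  @Override
  public synchronized void seek(long target) throws IOException {
    if (target < 0 || target > key.length) {
      throw new IOException("Cannot seek to " + target);
    }
    pos = target;   // a real stream would also locate the chunk holding 'target'
  }

  @Override
  public synchronized long getPos() {
    return pos;
  }

  @Override
  public boolean seekToNewSource(long target) {
    return false;   // single source in this sketch
  }

  @Override
  public synchronized int read() {
    return pos < key.length ? (key[(int) pos++] & 0xff) : -1;
  }
}
{code}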



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12636:
-
Status: Open  (was: Patch Available)

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch, HDFS-12636-HDFS-7240.005.patch, 
> HDFS-12636-HDFS-7240.006.patch, HDFS-12636-HDFS-7240.007.patch
>
>
> The OzoneClient library provides a way to invoke both RPC-based and REST-based 
> methods against Ozone. This API will help improve both the performance and the 
> interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread Mukul Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357136#comment-16357136
 ] 

Mukul Kumar Singh commented on HDFS-12636:
--

Thanks for updating the patch [~ljain].

+1 pending Jenkins; the patch looks good to me. 

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch, HDFS-12636-HDFS-7240.005.patch, 
> HDFS-12636-HDFS-7240.006.patch, HDFS-12636-HDFS-7240.007.patch
>
>
> The OzoneClient library provides a way to invoke both RPC-based and REST-based 
> methods against Ozone. This API will help improve both the performance and the 
> interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357105#comment-16357105
 ] 

genericqa commented on HDFS-12636:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
10s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
46s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
23s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 26m 
40s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
47s{color} | {color:red} hadoop-hdfs-client in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
51s{color} | {color:red} hadoop-hdfs in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
55s{color} | {color:red} hadoop-ozone in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
45s{color} | {color:red} hadoop-hdfs-client in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
50s{color} | {color:red} hadoop-hdfs in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
44s{color} | {color:red} hadoop-ozone in HDFS-7240 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
28s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 28s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
34s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 36 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
2s{color} | {color:red} The patch 384 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m  
1s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
29s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  

[jira] [Commented] (HDFS-13116) Ozone: Refactor Pipeline to have transport and container specific information

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357103#comment-16357103
 ] 

genericqa commented on HDFS-13116:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
13s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
6s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 59s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.TestOzoneConfigurationFields |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.ozone.scm.container.TestContainerStateManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d11161b |
| JIRA Issue | HDFS-13116 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909797/HDFS-13116-HDFS-7240.008.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 66b11b2825b5 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Commented] (HDFS-13112) Token expiration edits may cause log corruption or deadlock

2018-02-08 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357050#comment-16357050
 ] 

Kihwal Lee commented on HDFS-13112:
---

When I ran the failed tests, the following was reproduced. It seems related to 
the change. Please investigate.

{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
[ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
111.306 s <<< FAILURE! - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
[ERROR] 
testSecretManagerState(org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions)
  Time elapsed: 60.008 s  <<< ERROR!
java.lang.Exception: test timed out after 6 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
at java.lang.Thread.join(Thread.java:1326)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.stopThreads(AbstractDelegationTokenSecretManager.java:653)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopSecretManager(FSNamesystem.java:1143)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.enterSafeMode(FSNamesystem.java:4535)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeAdapter.enterSafeMode(NameNodeAdapter.java:100)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions.testSecretManagerState(TestHAStateTransitions.java:525)
{noformat}

This passes without the patch.
{noformat}
mvn test -Dtest=TestHAStateTransitions#testSecretManagerState
{noformat}

> Token expiration edits may cause log corruption or deadlock
> ---
>
> Key: HDFS-13112
> URL: https://issues.apache.org/jira/browse/HDFS-13112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 0.23.8
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-13112.patch
>
>
> HDFS-4477 specifically did not acquire the fsn lock during token cancellation 
> based on the belief that edit logs are thread-safe.  However, log rolling is 
> not thread-safe.  Failure to externally synchronize on the fsn lock during a 
> roll will cause problems.
> For sync edit logging, it may cause corruption by interspersing edits with 
> the end/start segment edits.  Async edit logging may encounter a deadlock if 
> the log queue overflows.  Luckily, losing the race is extremely rare.  In ~5 
> years, we've never encountered it.  However, HDFS-13051 lost the race with 
> async edits.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-02-08 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357022#comment-16357022
 ] 

Gabor Bota edited comment on HDFS-11187 at 2/8/18 3:14 PM:
---

Corrected the patch based on Xiao's comments.

[~xiaochen] thanks for the review, putting the implementation from 
FsVolumeImpl#addFinalizedBlock to FsDatasetImpl#finalizeReplica is a good idea, 
and I almost missed it.

Regards,
Gabor


was (Author: gabor.bota):
Corrected the patch based on Xiao's comments.

[~xiaochen] thanks for the review, putting the implementation from 
FsVolumeImpl#addFinalizedBlock to FsDatasetImpl#finalizeReplica is a good idea, 
and almost missed it.

Regards,
Gabor

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.1.0, 3.0.2
>
> Attachments: HDFS-11187-branch-2.001.patch, 
> HDFS-11187-branch-2.002.patch, HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial chunk checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.
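A minimal, illustrative sketch of the optimization direction (names are placeholders, not the committed patch): cache the last partial chunk checksum on the finalized replica at finalize time, so BlockSender can serve it from memory instead of re-reading the meta file under the dataset lock for every reader.
{code:java}
class FinalizedReplicaSketch {
  // null when the block ends exactly on a chunk boundary
  private volatile byte[] lastPartialChunkChecksum;

  // Set once when the replica is finalized (e.g. from FsDatasetImpl#finalizeReplica).
  void setLastPartialChunkChecksum(byte[] checksum) {
    this.lastPartialChunkChecksum = checksum;
  }

  // Served from memory; no disk access on the read path.
  byte[] getLastPartialChunkChecksum() {
    return lastPartialChunkChecksum;
  }
}
{code}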



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357034#comment-16357034
 ] 

genericqa commented on HDFS-13001:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 17 new + 4 unchanged - 1 fixed = 21 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13001 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909784/HDFS-13001.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 407d0ef9e976 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f491f71 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22993/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22993/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test 

[jira] [Updated] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-02-08 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-11187:
--
Status: Patch Available  (was: Open)

Corrected the patch based on Xiao's comments.

[~xiaochen] thanks for the review, putting the implementation from 
FsVolumeImpl#addFinalizedBlock to FsDatasetImpl#finalizeReplica is a good idea, 
and almost missed it.

Regards,
Gabor

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.1.0, 3.0.2
>
> Attachments: HDFS-11187-branch-2.001.patch, 
> HDFS-11187-branch-2.002.patch, HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial chunk checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-02-08 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-11187:
--
Attachment: HDFS-11187-branch-2.002.patch

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.1.0, 3.0.2
>
> Attachments: HDFS-11187-branch-2.001.patch, 
> HDFS-11187-branch-2.002.patch, HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial chunk checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-02-08 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-11187:
--
Status: Open  (was: Patch Available)

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.1.0, 3.0.2
>
> Attachments: HDFS-11187-branch-2.001.patch, 
> HDFS-11187-branch-2.002.patch, HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial chunk checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread Lokesh Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356974#comment-16356974
 ] 

Lokesh Jain commented on HDFS-12636:


[~msingh] Thanks for reviewing the patch! v7 patch addresses your comments.

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch, HDFS-12636-HDFS-7240.005.patch, 
> HDFS-12636-HDFS-7240.006.patch, HDFS-12636-HDFS-7240.007.patch
>
>
> The OzoneClient library provides a way to invoke both RPC-based and REST-based 
> methods against Ozone. This API will help improve both the performance and the 
> interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356972#comment-16356972
 ] 

genericqa commented on HDFS-13118:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-hdfs-project: The patch generated 47 new 
+ 191 unchanged - 6 fixed = 238 total (was 197) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
41s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new 
+ 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
25s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m  3s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}182m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  Null passed for non-null parameter of 
org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$SnapshotDiffReportListingEntryProto$Builder.setFileType(HdfsProtos$HdfsFileStatusProto$FileType)
 in 
org.apache.hadoop.hdfs.protocolPB.PBHelperClient.convert(SnapshotDiffReportListing$DiffReportListingEntry)
  Method invoked at PBHelperClient.java:of 
org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$SnapshotDiffReportListingEntryProto$Builder.setFileType(HdfsProtos$HdfsFileStatusProto$FileType)
 in 

[jira] [Updated] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-02-08 Thread Lokesh Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDFS-12636:
---
Attachment: HDFS-12636-HDFS-7240.007.patch

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch, HDFS-12636-HDFS-7240.005.patch, 
> HDFS-12636-HDFS-7240.006.patch, HDFS-12636-HDFS-7240.007.patch
>
>
> The OzoneClient library provides a way to invoke both RPC-based and REST-based 
> methods against Ozone. This API will help improve both the performance and the 
> interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356967#comment-16356967
 ] 

Shashikant Banerjee commented on HDFS-13120:


+1 the 002 patch looks good to me.

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead 
> to assertion failures upon DeleteSnapshot and getSnapshotDiff operations later. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post a unit test to reproduce this and a fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 
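One plausible reproduction sketch of the sequence described above (hedged: the exact ordering may differ from the unit test to be posted). It assumes an already initialized {{DistributedFileSystem}}; the point is that files created between snapshots land in a directory diff's created list, concat removes the source inodes without fully cleaning that diff, and a later snapshot deletion that combines diffs hits the assertion shown in the stack trace.
{code:java}
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

class ConcatSnapshotReproSketch {
  static void reproduce(DistributedFileSystem dfs) throws Exception {
    Path dir = new Path("/concatTest");
    dfs.mkdirs(dir);
    dfs.allowSnapshot(dir);
    dfs.createSnapshot(dir, "s0");
    // these files go into the created list of the diff against s0
    for (String name : new String[] {"0.txt", "1.txt", "2.txt"}) {
      try (FSDataOutputStream out = dfs.create(new Path(dir, name))) {
        out.write(name.getBytes("UTF-8"));
      }
    }
    dfs.createSnapshot(dir, "s1");
    // concat removes the source inodes; per the description the diff is not fully cleaned up
    dfs.concat(new Path(dir, "0.txt"),
        new Path[] {new Path(dir, "1.txt"), new Path(dir, "2.txt")});
    // combining diffs on snapshot deletion is where the AssertionError surfaces
    dfs.deleteSnapshot(dir, "s1");
  }
}
{code}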



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-02-08 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356962#comment-16356962
 ] 

Shashikant Banerjee commented on HDFS-13118:


Thanks [~ehiggs] for working on this. I had a quick look and the patch 
looks good to me overall. Some minor comments:

1. nit: SnapshotDiffListingInfo.java:90: extra space.

2. SnapshotDiffListingInfo.java:107
{code:java}
createdList.add(new DiffReportListingEntry(DIRECTORY, dirId,
created.getId(), path, created.isReference(), null));{code}
I am just curious to know why the INodeType field is being hardcoded to 
"DIRECTORY" here.

In the created list for a directory diff, we can have files/directories/symlinks. 
I think we should check what exactly the inode being added to the created list is 
(directory/file/symlink).

3. SnapshotDiffListingInfo.java:129:
{code:java}
final DiffReportListingEntry e = target != null ?
new DiffReportListingEntry(DIRECTORY, dirId, d.getId(), path,
true, target) :
new DiffReportListingEntry(DIRECTORY, dirId, d.getId(), path,
false, null);
deletedList.add(e);{code}
While adding to the deleted list as well, should we not check what the actual 
inode is instead of hardcoding it as "DIRECTORY"?

 

4. SnapshotDiffReportGenerator:237
{code:java}
private List<DiffReportEntry> generateReport(
DiffReportListingEntry modified) {
  List<DiffReportEntry> diffReportList = new ChunkedArrayList<>();
  ChildrenDiff list = dirDiffMap.get(modified.getDirId());
  for (DiffReportListingEntry created : list.getCreatedList()) {
RenameEntry entry = renameMap.get(created.getFileId());
if (entry == null || !entry.isRename()) {
  diffReportList.add(new DiffReportEntry(
  modified.getINodeType().toSnapshotDiffReportINodeType(),
  isFromEarlier ? DiffType.CREATE : DiffType.DELETE,
  created.getSourcePath()));
}
  }
  for (DiffReportListingEntry deleted : list.getDeletedList()) {
RenameEntry entry = renameMap.get(deleted.getFileId());
if (entry != null && entry.isRename()) {
  diffReportList.add(new DiffReportEntry(
  modified.getINodeType().toSnapshotDiffReportINodeType(),
  DiffType.RENAME,
  isFromEarlier ? entry.getSourcePath() : entry.getTargetPath(),
  isFromEarlier ? entry.getTargetPath() : entry.getSourcePath()));
} else {
  diffReportList.add(new DiffReportEntry(
  modified.getINodeType().toSnapshotDiffReportINodeType(),
  isFromEarlier ? DiffType.DELETE : DiffType.CREATE,
  deleted.getSourcePath()));
}
  }
  return diffReportList;
}

{code}
For each modified directory, we get the children list here and determine 
whether it is a rename, create, or delete op. But it seems that for every 
create/delete/rename entry here we are putting the INodeType as 
"modified.getINodeType()" while adding it to the diffReportList, which I think 
should be created.getINodeType(), entry.getINodeType(), and deleted.getINodeType() 
respectively.

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-02-08 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12935:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.1
   2.10.0
   Status: Resolved  (was: Patch Available)

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 3.0.2
>
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, 
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, 
> HDFS-12935.008.patch, HDFS-12935.009-branch-2.patch, 
> HDFS-12935.009-branch.2.patch, HDFS-12935.009.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions can still work. Consider 
> the following two cases:
>  (1) nn1 up and nn2 down
>  (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some of the DFSAdmin 
> commands will have ambiguous results. The commands can be sent successfully 
> to the namenode that is up, but they are only functionally useful when nn1 is 
> up, regardless of the exception (IOException when connecting to the down 
> namenode nn2). If only nn2 is up, the commands have no effect at all and only 
> the exception from connecting to nn1 can be seen.
> See the command "hdfs dfsadmin -setBalancerBandwidth", which aims to set the 
> balancer bandwidth value for datanodes, as an example. It works and all the 
> datanodes get the setting only when nn1 is up. If only nn2 is up, the command 
> throws an exception directly and no datanode gets the bandwidth setting. 
> Approximately ten DFSAdmin commands use a similar logical process and may be 
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 
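One way to make such commands unambiguous is to try every configured namenode and report per-namenode results instead of stopping at the first connection failure. A minimal, hypothetical sketch of that idea (the helper names are placeholders, not DFSAdmin's actual API):
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;

class SetBalancerBandwidthSketch {
  // Try the command against every namenode; report failures but keep going.
  int run(List<InetSocketAddress> namenodes, long bandwidth) {
    int failures = 0;
    for (InetSocketAddress nn : namenodes) {
      try {
        setBalancerBandwidthOn(nn, bandwidth);
        System.out.println("Balancer bandwidth is set to " + bandwidth + " on " + nn);
      } catch (IOException e) {
        failures++;
        System.err.println("setBalancerBandwidth failed on " + nn + ": " + e.getMessage());
      }
    }
    // exit 0 as long as at least one namenode accepted the command
    return failures < namenodes.size() ? 0 : -1;
  }

  // Placeholder for the per-namenode RPC (ClientProtocol#setBalancerBandwidth).
  void setBalancerBandwidthOn(InetSocketAddress nn, long bandwidth) throws IOException {
    throw new IOException("sketch only: issue the RPC to " + nn + " here");
  }
}
{code}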



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-02-08 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356928#comment-16356928
 ] 

Brahma Reddy Battula commented on HDFS-12935:
-

Yes, test failures are unrelated.

Committed to branch-2 and branch-2.9.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Fix For: 3.1.0, 3.0.1, 3.0.2
>
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, 
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, 
> HDFS-12935.008.patch, HDFS-12935.009-branch-2.patch, 
> HDFS-12935.009-branch.2.patch, HDFS-12935.009.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions can still work. Consider 
> the following two cases:
>  (1) nn1 up and nn2 down
>  (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some of the DFSAdmin 
> commands will have ambiguous results. The commands can be sent successfully 
> to the namenode that is up, but they are only functionally useful when nn1 is 
> up, regardless of the exception (IOException when connecting to the down 
> namenode nn2). If only nn2 is up, the commands have no effect at all and only 
> the exception from connecting to nn1 can be seen.
> See the command "hdfs dfsadmin -setBalancerBandwidth", which aims to set the 
> balancer bandwidth value for datanodes, as an example. It works and all the 
> datanodes get the setting only when nn1 is up. If only nn2 is up, the command 
> throws an exception directly and no datanode gets the bandwidth setting. 
> Approximately ten DFSAdmin commands use a similar logical process and may be 
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-13120:
---
Hadoop Flags: Reviewed

+1 the 002 patch looks good.

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead 
> to assertion failures upon DeleteSnapshot and getSnapshotDiff operations later. 
> For example, we have seen customers hit a stack trace similar to the one below, 
> but while loading the edit entry of a DeleteSnapshotOp. After investigation, 
> we found this is a regression caused by HDFS-3689, where the snapshot diff is 
> not fully cleaned up after concat. 
> I will post a unit test to reproduce this and a fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13116) Ozone: Refactor Pipeline to have transport and container specific information

2018-02-08 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-13116:
-
Attachment: HDFS-13116-HDFS-7240.008.patch

> Ozone: Refactor Pipeline to have transport and container specific information
> -
>
> Key: HDFS-13116
> URL: https://issues.apache.org/jira/browse/HDFS-13116
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-13116-HDFS-7240.001.patch, 
> HDFS-13116-HDFS-7240.002.patch, HDFS-13116-HDFS-7240.003.patch, 
> HDFS-13116-HDFS-7240.004.patch, HDFS-13116-HDFS-7240.005.patch, 
> HDFS-13116-HDFS-7240.006.patch, HDFS-13116-HDFS-7240.007.patch, 
> HDFS-13116-HDFS-7240.008.patch
>
>
> Currently the pipeline has information about both the container and the 
> Transport layer. This results in new pipeline (i.e. transport) information 
> being allocated for each container creation.
> This code can be refactored so that the Transport information is separated 
> from the container; the {{Transport}} can then be shared between multiple 
> pipelines/containers.
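
For illustration only, a minimal sketch of the intended separation; the class and field names below are hypothetical and do not correspond to the actual Ozone sources:
{code:java}
import java.util.List;

// Illustrative-only sketch of the refactoring described above: transport
// details live in one shared object that several per-container pipelines
// can reference. Names are hypothetical, not the real Ozone classes.
final class Transport {
  private final String leaderId;          // e.g. the Raft leader
  private final List<String> members;     // datanode addresses in the group

  Transport(String leaderId, List<String> members) {
    this.leaderId = leaderId;
    this.members = members;
  }

  String getLeaderId() { return leaderId; }
  List<String> getMembers() { return members; }
}

final class Pipeline {
  private final String containerName;     // container-specific part
  private final Transport transport;      // shared transport-specific part

  Pipeline(String containerName, Transport transport) {
    this.containerName = containerName;
    this.transport = transport;
  }

  String getContainerName() { return containerName; }
  Transport getTransport() { return transport; }
}
{code}
With this shape, creating a new container only allocates a new {{Pipeline}} wrapper, while the underlying {{Transport}} can be reused across containers.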



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13124) hadoop-daemon.sh exits with 1 when running HDFS balancer on balanced cluster

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Kostrzewa updated HDFS-13124:
--
Description: 
When running the HDFS balancer via the {{sbin/start-balancer.sh}} script on a 
balanced cluster, the script exits with 1 even though the CLI behind it (i.e. 
{{hdfs balancer}}) exits with 0. This is probably caused by the following piece 
of code in {{hadoop-daemon.sh}}:
{code:java}
sleep 3;
if ! ps -p $! > /dev/null ; then
  exit 1
fi
{code}
It seems the CLI command finishes so quickly in the case of a balanced cluster 
that the above {{ps}} does not find it.

  was:
When running the HDFS balancer via the {{sbin/start-balancer.sh}} script on a 
balanced cluster, the script exits with 1 even though the CLI behind it (i.e. 
{{hdfs balancer}}) exits with 0. This is probably caused by the following piece 
of code in {{hadoop-daemon.sh}}:
{code:java}
sleep 3;
if ! ps -p $! > /dev/null ; then
  exit 1
fi
{code}


> hadoop-daemon.sh exits with 1 when running HDFS balancer on balanced cluster
> 
>
> Key: HDFS-13124
> URL: https://issues.apache.org/jira/browse/HDFS-13124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, scripts
>Affects Versions: 2.7.4
>Reporter: Zbigniew Kostrzewa
>Priority: Minor
>
> When running the HDFS balancer via the {{sbin/start-balancer.sh}} script on a 
> balanced cluster, the script exits with 1 even though the CLI behind it (i.e. 
> {{hdfs balancer}}) exits with 0. This is probably caused by the following 
> piece of code in {{hadoop-daemon.sh}}:
> {code:java}
> sleep 3;
> if ! ps -p $! > /dev/null ; then
>   exit 1
> fi
> {code}
> It seems the CLI command finishes so quickly in the case of a balanced 
> cluster that the above {{ps}} does not find it.
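
For illustration only, a minimal sketch (plain Java, not Hadoop code, assuming a Unix-like system where the {{true}} command is available) of why a liveness check taken a few seconds after launch is a poor proxy for success: a command that exits immediately with status 0 looks exactly like one that never started.
{code:java}
import java.util.concurrent.TimeUnit;

// Minimal sketch of the race described above. The "true" command stands in for
// "hdfs balancer" on an already balanced cluster: it exits almost immediately
// with status 0, yet a later liveness check reports it as gone.
public class LivenessCheckRace {
  public static void main(String[] args) throws Exception {
    Process quickCommand = new ProcessBuilder("true").start();

    TimeUnit.SECONDS.sleep(3);   // mirrors the "sleep 3" in hadoop-daemon.sh

    // The equivalent of "ps -p $!": false by now, even though the command succeeded.
    System.out.println("still running: " + quickCommand.isAlive());
    System.out.println("exit status:   " + quickCommand.exitValue());  // prints 0
  }
}
{code}
A check based on the child's exit status, rather than on whether it is still running, would distinguish "finished successfully" from "failed to start".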



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13124) hadoop-daemon.sh exits with 1 when running HDFS balancer on balanced cluster

2018-02-08 Thread Zbigniew Kostrzewa (JIRA)
Zbigniew Kostrzewa created HDFS-13124:
-

 Summary: hadoop-daemon.sh exits with 1 when running HDFS balancer 
on balanced cluster
 Key: HDFS-13124
 URL: https://issues.apache.org/jira/browse/HDFS-13124
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover, scripts
Affects Versions: 2.7.4
Reporter: Zbigniew Kostrzewa


When running the HDFS balancer via the {{sbin/start-balancer.sh}} script on a 
balanced cluster, the script exits with 1 even though the CLI behind it (i.e. 
{{hdfs balancer}}) exits with 0. This is probably caused by the following piece 
of code in {{hadoop-daemon.sh}}:
{code:java}
sleep 3;
if ! ps -p $! > /dev/null ; then
  exit 1
fi
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13116) Ozone: Refactor Pipeline to have transport and container specific information

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356901#comment-16356901
 ] 

genericqa commented on HDFS-13116:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
13s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
26s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
27s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
20s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 50s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 
0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
50s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 46s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d11161b |
| JIRA Issue | HDFS-13116 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-11060) make DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED configurable

2018-02-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356861#comment-16356861
 ] 

genericqa commented on HDFS-11060:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 637 unchanged - 2 fixed = 642 total (was 639) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 31s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.TestErasureCodingPolicies |
|   | hadoop.hdfs.TestFileChecksum |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.TestErasureCodingExerciseAPIs |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.TestSetrepIncreasing |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-11060 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909759/HDFS-11060.2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 60aff183d819 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f491f71 |
| maven | version: Apache 

[jira] [Updated] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-13001:
-
Status: Patch Available  (was: In Progress)

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0, 2.9.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
> Attachments: HDFS-13001.001.patch
>
>
> Testcase improvement for the DFSAdmin command. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are online
> (2) NN1 is offline and NN2 is online
> (3) NN1 is online and NN2 is offline
> (4) Both NameNodes are offline
> The test cases can be improved as in the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}
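
For illustration only, a hedged example of how the helper sketched above might be invoked from an individual test in the same test class (it assumes the class's existing {{admin}}, {{err}} and {{newLine}} fields); the command and the expected message pattern are illustrative, not taken from the actual test suite:
{code:java}
// Hypothetical caller of the helper above: nnIndex 0 shuts down NN1 and makes
// NN2 active, so the success message is expected only from the live NameNode.
@Test
public void testRefreshNodesWithNN1Down() throws Exception {
  testExecuteDFSAdminCommand(0,
      new String[] {"-refreshNodes"},
      "Refresh nodes successful for.*");
}
{code}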



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang updated HDFS-13001:
-
Attachment: HDFS-13001.001.patch

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
> Attachments: HDFS-13001.001.patch
>
>
> Testcase improvement for the DFSAdmin command. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are online
> (2) NN1 is offline and NN2 is online
> (3) NN1 is online and NN2 is offline
> (4) Both NameNodes are offline
> The test cases can be improved as in the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-13001) Testcase improvement for DFSAdmin

2018-02-08 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-13001 started by Jianfei Jiang.

> Testcase improvement for DFSAdmin
> -
>
> Key: HDFS-13001
> URL: https://issues.apache.org/jira/browse/HDFS-13001
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Minor
>
> Testcase improvement for the DFSAdmin command. The commands should be tested 
> under the following environments:
> (1) Both NameNodes are online
> (2) NN1 is offline and NN2 is online
> (3) NN1 is online and NN2 is offline
> (4) Both NameNodes are offline
> The test cases can be improved as in the code below.
> {code:java}
>   private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
>   String message) throws Exception {
> setUpHaCluster(false);
> switch (nnIndex) {
>   case 0:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().transitionToActive(1);
> break;
>   case 1:
> cluster.getDfsCluster().shutdownNameNode(1);
> cluster.getDfsCluster().transitionToActive(0);
> break;
>   case 2:
> cluster.getDfsCluster().shutdownNameNode(0);
> cluster.getDfsCluster().shutdownNameNode(1);
> break;
>   default:
> }
> int exitCode = admin.run(command);
> if (nnIndex != 2) {
>   assertEquals(err.toString().trim(), 0, exitCode);
>   assertOutputMatches(message + newLine);
> } else {
>   assertNotEquals(err.toString().trim(), 0, exitCode);
>   assertOutputNotMatches(message + newLine);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13120) Snapshot diff could be corrupted after concat

2018-02-08 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13120:
--
Affects Version/s: 2.7.0

> Snapshot diff could be corrupted after concat
> -
>
> Key: HDFS-13120
> URL: https://issues.apache.org/jira/browse/HDFS-13120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, snapshots
>Affects Versions: 2.7.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> The snapshot diff can be corrupted after concatenating files. This could lead 
> to an assertion failure upon later DeleteSnapshot and getSnapshotDiff 
> operations. 
> For example, we have seen customers hit a stack trace similar to the one 
> below, but while loading an edit entry for DeleteSnapshotOp. After 
> investigation, we found this is a regression caused by HDFS-3689, where the 
> snapshot diff is not fully cleaned up after concat. 
> I will post a unit test to reproduce this and a fix for it shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element 
> already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
>   at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
>   at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
>   at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 
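
For illustration only, a minimal repro sketch under the assumption that the corruption is triggered by a concat performed between two snapshots; this is not the unit test referenced above, and the file names and sizes are arbitrary:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Hypothetical repro sketch of the scenario described above: take a snapshot,
// concat files inside the snapshottable directory, then delete the snapshot.
public class SnapshotConcatRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      cluster.waitActive();
      DistributedFileSystem dfs = cluster.getFileSystem();

      Path dir = new Path("/dir");
      dfs.mkdirs(dir);
      dfs.allowSnapshot(dir);

      Path target = new Path(dir, "0.txt");
      Path src1 = new Path(dir, "1.txt");
      Path src2 = new Path(dir, "2.txt");
      DFSTestUtil.createFile(dfs, target, 1024, (short) 1, 0L);
      DFSTestUtil.createFile(dfs, src1, 1024, (short) 1, 0L);
      DFSTestUtil.createFile(dfs, src2, 1024, (short) 1, 0L);

      dfs.createSnapshot(dir, "s0");                 // snapshot before the concat
      dfs.concat(target, new Path[] {src1, src2});   // merge 1.txt and 2.txt into 0.txt
      dfs.createSnapshot(dir, "s1");

      // With the incomplete diff cleanup described above, deleting the earlier
      // snapshot may hit the "Element already exists" assertion from the trace.
      dfs.deleteSnapshot(dir, "s0");
    } finally {
      cluster.shutdown();
    }
  }
}
{code}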



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-02-08 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
 Assignee: Ewan Higgs
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

Added a patch that adds the INodeType to the SnapshotDiffReport. This will 
make it easier to make programmatic comparisons between the two sides of a 
snapshot diff. The patch includes a new field in the protobuf, {{fileType}}, 
that reuses the existing enum from {{HdfsFileStatusProto}}. The field is 
optional so that the change stays backwards compatible.

This does not change the printed output of the {{snapshotDiff}} command line.
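
For illustration only, a sketch of how a client might consume the report once the field is exposed; {{getSnapshotDiffReport}}, {{getDiffList}}, {{getType}} and {{getSourcePath}} are existing APIs, while the commented-out {{getInodeType()}} accessor is a hypothetical name for what the patch would add:
{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;

public class SnapshotDiffTypes {
  // Prints each diff entry between snapshots s0 and s1; with the proposed
  // field, a caller could also read the inode type without opening the
  // underlying snapshot (the accessor name below is hypothetical).
  static void printDiff(DistributedFileSystem dfs, Path dir) throws Exception {
    SnapshotDiffReport report = dfs.getSnapshotDiffReport(dir, "s0", "s1");
    for (DiffReportEntry entry : report.getDiffList()) {
      String path = new String(entry.getSourcePath(), StandardCharsets.UTF_8);
      System.out.println(entry.getType() + " " + path
          /* + " " + entry.getInodeType() */);
    }
  }
}
{code}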

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> access the underlying snapshot, and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-02-08 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Attachment: HDFS-13118.001.patch

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> access the underlying snapshot, and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


